============
dm-integrity
============

The dm-integrity target emulates a block device that has additional
per-sector tags that can be used for storing integrity information.

A general problem with storing integrity tags with every sector is that
writing the sector and the integrity tag must be atomic - i.e. in case of
a crash, either both the sector and the integrity tag are written, or
neither is.

To guarantee write atomicity, the dm-integrity target uses a journal: it
writes sector data and integrity tags into a journal, commits the journal
and then copies the data and integrity tags to their respective locations.

The dm-integrity target can be used with the dm-crypt target - in this
situation the dm-crypt target creates the integrity data and passes them
to the dm-integrity target via bio_integrity_payload attached to the bio.
In this mode, the dm-crypt and dm-integrity targets provide authenticated
disk encryption - if the attacker modifies the encrypted device, an I/O
error is returned instead of random data.

The dm-integrity target can also be used as a standalone target; in this
mode it calculates and verifies the integrity tag internally.
In this mode, the dm-integrity target can be used to detect silent data
corruption on the disk or in the I/O path.

There's an alternate mode of operation where dm-integrity uses a bitmap
instead of a journal. If a bit in the bitmap is 1, the corresponding
region's data and integrity tags are not synchronized - if the machine
crashes, the unsynchronized regions will be recalculated. The bitmap mode
is faster than the journal mode, because we don't have to write the data
twice, but it is also less reliable, because if data corruption happens
when the machine crashes, it may not be detected.

When loading the target for the first time, the kernel driver will format
the device. But it will only format the device if the superblock contains
zeroes. If the superblock is neither valid nor zeroed, the dm-integrity
target can't be loaded.

Accesses to the on-disk metadata area containing checksums (aka tags) are
buffered using dm-bufio. When an access to any given metadata area
occurs, each unique metadata area gets its own buffer(s). The buffer size
is capped at the size of the metadata area, but may be smaller, thereby
requiring multiple buffers to represent the full metadata area. A smaller
buffer size will produce a smaller resulting read/write operation to the
metadata area for small reads/writes.
The metadata is still read even in
a full write to the data covered by a single buffer.

To use the target for the first time:

1. overwrite the superblock with zeroes
2. load the dm-integrity target with one-sector size, the kernel driver
   will format the device
3. unload the dm-integrity target
4. read the "provided_data_sectors" value from the superblock
5. load the dm-integrity target with the target size
   "provided_data_sectors"
6. if you want to use dm-integrity with dm-crypt, load the dm-crypt target
   with the size "provided_data_sectors"


Target arguments:

1. the underlying block device

2. the number of reserved sectors at the beginning of the device - the
   dm-integrity target won't read or write these sectors

3. the size of the integrity tag (if "-" is used, the size is taken from
   the internal-hash algorithm)

4. mode:

        D - direct writes (without journal)
                in this mode, journaling is
                not used and data sectors and integrity tags are written
                separately.
                In case of a crash, it is possible that the data
                and integrity tag don't match.
        J - journaled writes
                data and integrity tags are written to the
                journal and atomicity is guaranteed. In case of a crash,
                either both data and tag are written or neither is. The
                journaled mode reduces write throughput by a factor of
                two because the data have to be written twice.
        B - bitmap mode - data and metadata are written without any
                synchronization, the driver maintains a bitmap of dirty
                regions where data and metadata don't match. This mode can
                only be used with internal hash.
        R - recovery mode - in this mode, the journal is not replayed,
                checksums are not checked and writes to the device are not
                allowed. This mode is useful for data recovery if the
                device cannot be activated in any of the other standard
                modes.

5. the number of additional arguments

Additional arguments:

journal_sectors:number
        The size of the journal. This argument is used only when formatting
        the device. If the device is already formatted, the value from the
        superblock is used.

interleave_sectors:number (default 32768)
        The number of interleaved sectors. This value is rounded down to
        a power of two. If the device is already formatted, the value from
        the superblock is used.

meta_device:device
        Don't interleave the data and metadata on the device. Use a
        separate device for metadata.

buffer_sectors:number (default 128)
        The number of sectors in one metadata buffer. The value is rounded
        down to a power of two.

journal_watermark:number (default 50)
        The journal watermark in percent. When the size of the journal
        exceeds this watermark, the thread that flushes the journal will
        be started.

commit_time:number (default 10000)
        Commit time in milliseconds. When this time passes, the journal is
        written. The journal is also written immediately if a FLUSH
        request is received.

internal_hash:algorithm(:key) (the key is optional)
        Use internal hash or crc.
        When this argument is used, the dm-integrity target won't accept
        integrity tags from the upper target, but it will automatically
        generate and verify the integrity tags.

        You can use a crc algorithm (such as crc32); the integrity target
        will then protect the data against accidental corruption.
        You can also use a hmac algorithm (for example
        "hmac(sha256):0123456789abcdef"); in this mode it will provide
        cryptographic authentication of the data without encryption.

        When this argument is not used, the integrity tags are accepted
        from an upper layer target, such as dm-crypt. The upper layer
        target should check the validity of the integrity tags.

recalculate
        Recalculate the integrity tags automatically. It is only valid
        when using internal hash.

journal_crypt:algorithm(:key) (the key is optional)
        Encrypt the journal using the given algorithm to make sure that the
        attacker can't read the journal. You can use a block cipher here
        (such as "cbc(aes)") or a stream cipher (for example "chacha20"
        or "ctr(aes)").

        The journal contains a history of the last writes to the block
        device, so an attacker reading the journal could see the last
        sector numbers that were written. From the sector numbers, the
        attacker can infer the size of files that were written. To protect
        against this situation, you can encrypt the journal.

journal_mac:algorithm(:key) (the key is optional)
        Protect sector numbers in the journal from accidental or malicious
        modification. To protect against accidental modification, use a
        crc algorithm; to protect against malicious modification, use a
        hmac algorithm with a key.

        This option is not needed when using internal-hash because in this
        mode, the integrity of journal entries is checked when replaying
        the journal. Thus, a modified sector number would be detected at
        this stage.

block_size:number (default 512)
        The size of a data block in bytes. The larger the block size the
        less overhead there is for per-block integrity metadata.
        Supported values are 512, 1024, 2048 and 4096 bytes.

sectors_per_bit:number
        In the bitmap mode, this parameter specifies the number of
        512-byte sectors that correspond to one bitmap bit.

bitmap_flush_interval:number
        The bitmap flush interval in milliseconds. The metadata buffers
        are synchronized when this interval expires.

allow_discards
        Allow block discard requests (a.k.a. TRIM) for the integrity device.
        Discards are only allowed to devices using internal hash.

fix_padding
        Use a smaller padding of the tag area that is more
        space-efficient. If this option is not present, large padding is
        used - that is for compatibility with older kernels.

fix_hmac
        Improve security of internal_hash and journal_mac:

        - the section number is mixed into the mac, so that an attacker
          can't copy sectors from one journal section to another journal
          section
        - the superblock is protected by journal_mac
        - a 16-byte salt stored in the superblock is mixed into the mac, so
          that the attacker can't detect that two disks have the same hmac
          key and can't move sectors from one disk to another

legacy_recalculate
        Allow recalculating of volumes with HMAC keys. This is disabled by
        default for security reasons - an attacker could modify the volume,
        set recalc_sector to zero, and the kernel would not detect the
        modification.

The journal mode (D/J), buffer_sectors, journal_watermark, commit_time and
allow_discards can be changed when reloading the target (load an inactive
table and swap the tables with suspend and resume).
The other arguments
should not be changed when reloading the target because the layout of disk
data depends on them and the reloaded target would be non-functional.

For example, on a device using the default interleave_sectors of 32768, a
block_size of 512, and an internal_hash of crc32c with a tag size of 4
bytes, it will take 128 KiB of tags to track a full data area, requiring
256 sectors of metadata per data area. With the default buffer_sectors of
128, that means there will be 2 buffers per metadata area, or 2 buffers
per 16 MiB of data.

Status line:

1. the number of integrity mismatches
2. provided data sectors - that is the number of sectors that the user
   could use
3. the current recalculating position (or '-' if we didn't recalculate)


The layout of the formatted block device:

* reserved sectors
    (they are not used by this target, they can be used for
    storing LUKS metadata or for other purposes), the size of the reserved
    area is specified in the target arguments

* superblock (4 KiB)
        * magic string - identifies that the device was formatted
        * version
        * log2(interleave sectors)
        * integrity tag size
        * the number of journal sections
        * provided data sectors - the number of sectors that this target
          provides (i.e. the size of the device minus the size of all
          metadata and padding). The user of this target should not send
          bios that access data beyond the "provided data sectors" limit.
        * flags
            SB_FLAG_HAVE_JOURNAL_MAC
                - a flag is set if journal_mac is used
            SB_FLAG_RECALCULATING
                - recalculating is in progress
            SB_FLAG_DIRTY_BITMAP
                - journal area contains the bitmap of dirty
                  blocks
        * log2(sectors per block)
        * a position where recalculating finished
* journal
        The journal is divided into sections, each section contains:

        * metadata area (4 KiB), it contains journal entries

          - every journal entry contains:

                * logical sector (specifies where the data and tag should
                  be written)
                * last 8 bytes of data
                * integrity tag (the size is specified in the superblock)

          - every metadata sector ends with

                * mac (8 bytes), all the macs in 8 metadata sectors form a
                  64-byte value. It is used to store hmac of sector
                  numbers in the journal section, to protect against a
                  possibility that the attacker tampers with sector
                  numbers in the journal.
                * commit id

        * data area (the size is variable; it depends on how many journal
          entries fit into the metadata area)

          - every sector in the data area contains:

                * data (504 bytes of data, the last 8 bytes are stored in
                  the journal entry)
                * commit id

        To test if the whole journal section was written correctly, every
        512-byte sector of the journal ends with an 8-byte commit id. If
        the commit id matches on all sectors in a journal section, then it
        is assumed that the section was written correctly. If the commit id
        doesn't match, the section was written partially and it should not
        be replayed.

* one or more runs of interleaved tags and data.
    Each run contains:

        * tag area - it contains integrity tags. There is one tag for each
          sector in the data area. The size of this area is always 4 KiB or
          greater.
        * data area - it contains data sectors. The number of data sectors
          in one run must be a power of two. log2 of this value is stored
          in the superblock.
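
The tag-area arithmetic from the buffer example earlier in this document
(interleave_sectors 32768, block_size 512, a 4-byte tag, buffer_sectors 128)
can be reproduced with a short sketch. This is illustrative Python, not
the kernel's code: the function name ``tag_area_layout`` and the dictionary
keys are hypothetical, it assumes one tag per data block of ``block_size``
bytes, and it ignores any padding the driver applies to the tag area.

```python
SECTOR_SIZE = 512  # bytes per sector, as used throughout this document


def tag_area_layout(interleave_sectors=32768, block_size=512,
                    tag_size=4, buffer_sectors=128):
    """Sizes for one run of interleaved tags and data (padding ignored).

    Hypothetical helper for illustration only; it mirrors the worked
    example in the text and is not taken from the kernel driver.
    """
    # Bytes of user data covered by one run.
    data_bytes = interleave_sectors * SECTOR_SIZE
    # Assumption: one integrity tag per data block.
    blocks_per_run = data_bytes // block_size
    tag_bytes = blocks_per_run * tag_size
    # Round the tag area up to whole 512-byte sectors.
    tag_sectors = -(-tag_bytes // SECTOR_SIZE)
    # The buffer size is capped at the size of the metadata area.
    effective_buffer = min(buffer_sectors, tag_sectors)
    buffers = -(-tag_sectors // effective_buffer)
    return {
        "data_bytes": data_bytes,
        "tag_bytes": tag_bytes,
        "tag_sectors": tag_sectors,
        "buffers": buffers,
    }


if __name__ == "__main__":
    layout = tag_area_layout()
    print(layout["tag_bytes"] // 1024, "KiB of tags,",
          layout["tag_sectors"], "metadata sectors,",
          layout["buffers"], "buffers per",
          layout["data_bytes"] // (1024 * 1024), "MiB of data")
```

With the defaults, the sketch yields 128 KiB of tags, 256 metadata sectors
and 2 buffers per 16 MiB of data, matching the figures quoted in the text.
Other parameter combinations give a rough estimate only, since padding
(see fix_padding) is not modelled.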