============
dm-integrity
============

The dm-integrity target emulates a block device that has additional
per-sector tags that can be used for storing integrity information.

A general problem with storing integrity tags with every sector is that
writing the sector and the integrity tag must be atomic - i.e. in case of
a crash, either both the sector and the integrity tag are written or
neither of them is.

To guarantee write atomicity, the dm-integrity target uses a journal. It
writes sector data and integrity tags into the journal, commits the journal
and then copies the data and integrity tags to their respective locations.

The dm-integrity target can be used with the dm-crypt target - in this
situation the dm-crypt target creates the integrity data and passes them
to the dm-integrity target via bio_integrity_payload attached to the bio.
In this mode, the dm-crypt and dm-integrity targets provide authenticated
disk encryption - if the attacker modifies the encrypted device, an I/O
error is returned instead of random data.

The dm-integrity target can also be used as a standalone target. In this
mode it calculates and verifies the integrity tag internally, so it can be
used to detect silent data corruption on the disk or in the I/O path.

There's an alternate mode of operation where dm-integrity uses a bitmap
instead of a journal. If a bit in the bitmap is 1, the corresponding
region's data and integrity tags are not synchronized - if the machine
crashes, the unsynchronized regions will be recalculated. The bitmap mode
is faster than the journal mode, because we don't have to write the data
twice, but it is also less reliable, because if data corruption happens
when the machine crashes, it may not be detected.

When loading the target for the first time, the kernel driver will format
the device, but only if the superblock contains zeroes. If the superblock
is neither valid nor zeroed, the dm-integrity target can't be loaded.

Accesses to the on-disk metadata area containing checksums (aka tags) are
buffered using dm-bufio. When an access to any given metadata area
occurs, each unique metadata area gets its own buffer(s). The buffer size
is capped at the size of the metadata area, but may be smaller, thereby
requiring multiple buffers to represent the full metadata area. A smaller
buffer size will produce a smaller resulting read/write operation to the
metadata area for small reads/writes. The metadata is still read even in
a full write to the data covered by a single buffer.

To use the target for the first time:

1. overwrite the superblock with zeroes
2. load the dm-integrity target with a one-sector size; the kernel driver
   will format the device
3. unload the dm-integrity target
4. read the "provided_data_sectors" value from the superblock
5. load the dm-integrity target with the target size
   "provided_data_sectors"
6. if you want to use dm-integrity with dm-crypt, load the dm-crypt target
   with the size "provided_data_sectors"
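
Step 4 could be done with a short Python sketch like the one below. It is
not part of the kernel or of any existing tool: the helper names
(``parse_superblock``, ``read_superblock``) are hypothetical, and the byte
layout (field widths, little-endian order, three padding bytes before the
recalculation position) is an assumption derived from the superblock field
order listed in this document. Verify it against the kernel source before
relying on it. The sketch also assumes that the superblock starts right
after the reserved sectors.

```python
import struct

# ASSUMED on-disk layout, little-endian, in the field order this document
# gives for the superblock. Not confirmed by the text; check the kernel source.
SB_FORMAT = "<8sBBHIQIB3xQ"
SB_SIZE = struct.calcsize(SB_FORMAT)
SECTOR_SIZE = 512


def parse_superblock(raw):
    """Decode the leading superblock fields from a bytes object."""
    if len(raw) < SB_SIZE:
        raise ValueError("superblock buffer too short")
    (magic, version, log2_interleave, tag_size, journal_sections,
     provided_data_sectors, flags, log2_sectors_per_block,
     recalc_sector) = struct.unpack_from(SB_FORMAT, raw)
    return {
        "magic": magic,
        "version": version,
        "interleave_sectors": 1 << log2_interleave,
        "integrity_tag_size": tag_size,
        "journal_sections": journal_sections,
        "provided_data_sectors": provided_data_sectors,
        "flags": flags,
        "sectors_per_block": 1 << log2_sectors_per_block,
        "recalc_sector": recalc_sector,
    }


def read_superblock(device_path, reserved_sectors):
    """Read the superblock that follows the reserved sectors of a device."""
    with open(device_path, "rb") as dev:
        dev.seek(reserved_sectors * SECTOR_SIZE)
        return parse_superblock(dev.read(SB_SIZE))
```

The "provided_data_sectors" entry of the returned dictionary is the value
to use as the target size in steps 5 and 6.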


Target arguments:

1. the underlying block device

2. the number of reserved sectors at the beginning of the device - the
   dm-integrity target won't read or write these sectors

3. the size of the integrity tag (if "-" is used, the size is taken from
   the internal-hash algorithm)

4. mode:

	D - direct writes (without journal)
		in this mode, journaling is
		not used and data sectors and integrity tags are written
		separately. In case of a crash, it is possible that the
		data and integrity tag don't match.
	J - journaled writes
		data and integrity tags are written to the
		journal and atomicity is guaranteed. In case of a crash,
		either both data and tag or none of them are written. The
		journaled mode halves write throughput because the
		data have to be written twice.
	B - bitmap mode - data and metadata are written without any
		synchronization, the driver maintains a bitmap of dirty
		regions where data and metadata don't match. This mode can
		only be used with internal hash.
	R - recovery mode - in this mode, the journal is not replayed,
		checksums are not checked and writes to the device are not
		allowed. This mode is useful for data recovery if the
		device cannot be activated in any of the other standard
		modes.

5. the number of additional arguments
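
The five target arguments above, followed by any additional arguments, make
up the target-specific part of a device-mapper table line. A minimal sketch
of assembling such a line in Python follows. The helper name
``integrity_table`` is hypothetical, ``/dev/sdX`` in the usage note is a
placeholder device, and the leading "start length integrity" fields assume
the usual device-mapper table format.

```python
VALID_MODES = ("D", "J", "B", "R")


def integrity_table(start, length, device, reserved_sectors, tag_size,
                    mode, options=()):
    """Build one table line for the integrity target (hypothetical helper).

    `tag_size` may be an integer or "-" to take the size from the
    internal-hash algorithm. `options` are the additional arguments,
    for example "internal_hash:crc32c".
    """
    if mode not in VALID_MODES:
        raise ValueError("mode must be one of D, J, B, R")
    options = list(options)
    fields = [start, length, "integrity", device, reserved_sectors,
              tag_size, mode, len(options)] + options
    return " ".join(str(field) for field in fields)
```

For example, ``integrity_table(0, 1, "/dev/sdX", 0, "-", "J",
["internal_hash:crc32c"])`` produces a one-sector table line of the kind
used for the initial formatting load.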

Additional arguments:

journal_sectors:number
	The size of the journal. This argument is used only when formatting
	the device. If the device is already formatted, the value from the
	superblock is used.

interleave_sectors:number (default 32768)
	The number of interleaved sectors. This value is rounded down to
	a power of two. If the device is already formatted, the value from
	the superblock is used.

meta_device:device
	Don't interleave the data and metadata on the device. Use a
	separate device for metadata.

buffer_sectors:number (default 128)
	The number of sectors in one metadata buffer. The value is rounded
	down to a power of two.

journal_watermark:number (default 50)
	The journal watermark in percent. When the size of the journal
	exceeds this watermark, the thread that flushes the journal will
	be started.

commit_time:number (default 10000)
	Commit time in milliseconds. When this time passes, the journal is
	written. The journal is also written immediately if a FLUSH
	request is received.

internal_hash:algorithm(:key)	(the key is optional)
	Use internal hash or crc.
	When this argument is used, the dm-integrity target won't accept
	integrity tags from the upper target, but it will automatically
	generate and verify the integrity tags.

	You can use a crc algorithm (such as crc32); the integrity target
	will then protect the data against accidental corruption.
	You can also use a hmac algorithm (for example
	"hmac(sha256):0123456789abcdef"); in this mode it will provide
	cryptographic authentication of the data without encryption.

	When this argument is not used, the integrity tags are accepted
	from an upper layer target, such as dm-crypt. The upper layer
	target should check the validity of the integrity tags.

recalculate
	Recalculate the integrity tags automatically. It is only valid
	when using internal hash.

journal_crypt:algorithm(:key)	(the key is optional)
	Encrypt the journal using the given algorithm to make sure that the
	attacker can't read the journal. You can use a block cipher here
	(such as "cbc(aes)") or a stream cipher (for example "chacha20"
	or "ctr(aes)").

	The journal contains a history of the last writes to the block
	device; an attacker reading the journal could see the last sector
	numbers that were written. From the sector numbers, the attacker
	can infer the size of files that were written. To protect against
	this situation, you can encrypt the journal.

journal_mac:algorithm(:key)	(the key is optional)
	Protect sector numbers in the journal from accidental or malicious
	modification. To protect against accidental modification, use a
	crc algorithm; to protect against malicious modification, use a
	hmac algorithm with a key.

	This option is not needed when using internal-hash because in this
	mode, the integrity of journal entries is checked when replaying
	the journal. Thus, a modified sector number would be detected at
	this stage.

block_size:number (default 512)
	The size of a data block in bytes. The larger the block size the
	less overhead there is for per-block integrity metadata.
	Supported values are 512, 1024, 2048 and 4096 bytes.

sectors_per_bit:number
	In the bitmap mode, this parameter specifies the number of
	512-byte sectors that correspond to one bitmap bit.

bitmap_flush_interval:number
	The bitmap flush interval in milliseconds. The metadata buffers
	are synchronized when this interval expires.

allow_discards
	Allow block discard requests (a.k.a. TRIM) for the integrity device.
	Discards are only allowed to devices using internal hash.

fix_padding
	Use a smaller padding of the tag area that is more
	space-efficient. If this option is not present, large padding is
	used - that is for compatibility with older kernels.

fix_hmac
	Improve security of internal_hash and journal_mac:

	- the section number is mixed into the mac, so that an attacker
	  can't copy sectors from one journal section to another journal
	  section
	- the superblock is protected by journal_mac
	- a 16-byte salt stored in the superblock is mixed into the mac, so
	  that the attacker can't detect that two disks have the same hmac
	  key and can't move sectors from one disk to another

legacy_recalculate
	Allow recalculating of volumes with HMAC keys. This is disabled by
	default for security reasons - an attacker could modify the volume,
	set recalc_sector to zero, and the kernel would not detect the
	modification.

The journal mode (D/J), buffer_sectors, journal_watermark, commit_time and
allow_discards can be changed when reloading the target (load an inactive
table and swap the tables with suspend and resume). The other arguments
should not be changed when reloading the target because the layout of disk
data depends on them and the reloaded target would be non-functional.
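
The reload rule above can be checked before a new table is loaded. The
following Python sketch is one way to do that. The function names are
hypothetical, and the sketch treats every argument not named in the
paragraph above as layout-affecting, which is an assumption taken directly
from that paragraph rather than from the driver.

```python
# Arguments that the text above says may change across a reload.
RELOADABLE_OPTIONS = {"buffer_sectors", "journal_watermark",
                      "commit_time", "allow_discards"}
RELOADABLE_MODES = {"D", "J"}


def option_name_value(option):
    """Split "name:value" at the first colon; flags get an empty value."""
    name, _, value = option.partition(":")
    return name, value


def unsafe_changes(old_mode, old_options, new_mode, new_options):
    """Return the names of arguments that differ but must not change."""
    problems = []
    if old_mode != new_mode and not {old_mode, new_mode} <= RELOADABLE_MODES:
        problems.append("mode")
    old = dict(option_name_value(o) for o in old_options)
    new = dict(option_name_value(o) for o in new_options)
    for name in sorted(old.keys() | new.keys()):
        if old.get(name) != new.get(name) and name not in RELOADABLE_OPTIONS:
            problems.append(name)
    return problems
```

An empty result means that only reloadable arguments differ between the
two tables.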

For example, on a device using the default interleave_sectors of 32768, a
block_size of 512, and an internal_hash of crc32c with a tag size of 4
bytes, it will take 128 KiB of tags to track a full data area, requiring
256 sectors of metadata per data area. With the default buffer_sectors of
128, that means there will be 2 buffers per metadata area, or 2 buffers
per 16 MiB of data.
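
The arithmetic in this example can be reproduced with a few lines of
Python. The function name ``metadata_buffers`` is hypothetical, and the
sketch only mirrors the calculation above: it ignores any padding or
rounding that the driver applies to the tag area.

```python
SECTOR_SIZE = 512


def metadata_buffers(interleave_sectors=32768, block_size=512,
                     tag_size=4, buffer_sectors=128):
    """Return (tag bytes, metadata sectors, buffers) for one data area."""
    data_bytes = interleave_sectors * SECTOR_SIZE
    blocks = data_bytes // block_size
    tag_bytes = blocks * tag_size
    # Ceiling divisions: a partly used sector or buffer still counts.
    metadata_sectors = -(-tag_bytes // SECTOR_SIZE)
    buffers = -(-metadata_sectors // buffer_sectors)
    return tag_bytes, metadata_sectors, buffers
```

With the defaults this returns ``(131072, 256, 2)``, that is 128 KiB of
tags, 256 metadata sectors and 2 buffers per data area.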

Status line:

1. the number of integrity mismatches
2. provided data sectors - that is the number of sectors that the user
   could use
3. the current recalculating position (or '-' if we didn't recalculate)
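
A minimal Python sketch of decoding these three fields follows. The
function name ``parse_integrity_status`` is hypothetical. The input is
assumed to be only the target-specific part of the status line, with the
generic "start length target" prefix already removed by the caller, and
the values in the usage note are made up for illustration.

```python
def parse_integrity_status(fields):
    """Decode the three status fields described above."""
    parts = fields.split()
    if len(parts) < 3:
        raise ValueError("expected at least three status fields")
    return {
        "mismatches": int(parts[0]),
        "provided_data_sectors": int(parts[1]),
        # '-' means that no recalculation took place.
        "recalc_sector": None if parts[2] == "-" else int(parts[2]),
    }
```

For example, ``parse_integrity_status("0 2064384 -")`` reports zero
mismatches, 2064384 usable sectors and no recalculation position.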


The layout of the formatted block device:

* reserved sectors
    (they are not used by this target, they can be used for
    storing LUKS metadata or for other purposes), the size of the reserved
    area is specified in the target arguments

* superblock (4KiB)
	* magic string - identifies that the device was formatted
	* version
	* log2(interleave sectors)
	* integrity tag size
	* the number of journal sections
	* provided data sectors - the number of sectors that this target
	  provides (i.e. the size of the device minus the size of all
	  metadata and padding). The user of this target should not send
	  bios that access data beyond the "provided data sectors" limit.
	* flags
	    SB_FLAG_HAVE_JOURNAL_MAC
		- set if journal_mac is used
	    SB_FLAG_RECALCULATING
		- recalculating is in progress
	    SB_FLAG_DIRTY_BITMAP
		- journal area contains the bitmap of dirty
		  blocks
	* log2(sectors per block)
	* a position where recalculating finished
* journal
	The journal is divided into sections, each section contains:

	* metadata area (4KiB), it contains journal entries

	  - every journal entry contains:

		* logical sector (specifies where the data and tag should
		  be written)
		* last 8 bytes of data
		* integrity tag (the size is specified in the superblock)

	  - every metadata sector ends with

		* mac (8 bytes), all the macs in 8 metadata sectors form a
		  64-byte value. It is used to store the hmac of sector
		  numbers in the journal section, to protect against the
		  possibility that the attacker tampers with sector
		  numbers in the journal.
		* commit id

	* data area (the size is variable; it depends on how many journal
	  entries fit into the metadata area)

	    - every sector in the data area contains:

		* data (504 bytes of data, the last 8 bytes are stored in
		  the journal entry)
		* commit id

	To test if the whole journal section was written correctly, every
	512-byte sector of the journal ends with an 8-byte commit id. If
	the commit id matches on all sectors in a journal section, then it
	is assumed that the section was written correctly. If the commit id
	doesn't match, the section was written partially and it should not
	be replayed.

* one or more runs of interleaved tags and data.
    Each run contains:

	* tag area - it contains integrity tags. There is one tag for each
	  sector in the data area. The size of this area is always 4KiB or
	  greater.
	* data area - it contains data sectors. The number of data sectors
	  in one run must be a power of two. log2 of this value is stored
	  in the superblock.
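
The commit id check and the 504 + 8 byte split described for the journal
above can be illustrated with a short Python sketch. The function names are
hypothetical, and the sketch assumes that the caller already knows the
expected 8-byte commit id for each sector; how the driver derives those ids
is not described in this document.

```python
SECTOR_SIZE = 512
COMMIT_ID_SIZE = 8


def section_is_complete(section, expected_commit_ids):
    """Check that every 512-byte sector ends with its expected commit id."""
    if (len(section) % SECTOR_SIZE
            or len(section) // SECTOR_SIZE != len(expected_commit_ids)):
        raise ValueError("need one expected commit id per 512-byte sector")
    for index, expected in enumerate(expected_commit_ids):
        end = (index + 1) * SECTOR_SIZE
        if section[end - COMMIT_ID_SIZE:end] != expected:
            return False  # partially written section, do not replay
    return True


def restore_data_sector(journal_sector, last_8_bytes):
    """Rebuild a 512-byte data sector from a journal data-area sector.

    The journal sector holds the first 504 data bytes followed by the
    commit id; the last 8 data bytes come from the journal entry.
    """
    if len(journal_sector) != SECTOR_SIZE or len(last_8_bytes) != COMMIT_ID_SIZE:
        raise ValueError("unexpected sector or entry size")
    return journal_sector[:SECTOR_SIZE - COMMIT_ID_SIZE] + last_8_bytes
```

A section for which ``section_is_complete`` returns ``False`` corresponds
to the partially written case that should not be replayed.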