=========
dm-verity
=========

Device-Mapper's "verity" target provides transparent integrity checking of
block devices using a cryptographic digest provided by the kernel crypto API.
This target is read-only.

Construction Parameters
=======================

::

    <version> <dev> <hash_dev>
    <data_block_size> <hash_block_size>
    <num_data_blocks> <hash_start_block>
    <algorithm> <digest> <salt>
    [<#opt_params> <opt_params>]

<version>
    This is the type of the on-disk hash format.

    0 is the original format used in Chromium OS.
      The salt is appended when hashing, digests are stored contiguously and
      the rest of the block is padded with zeroes.

    1 is the current format that should be used for new devices.
      The salt is prepended when hashing and each digest is
      padded with zeroes to a power-of-two size.
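
The difference in salt placement between the two formats can be sketched as
follows. This is an illustrative sketch only: the helper name ``block_digest``
is hypothetical, and it models just the salt ordering described above, not the
on-disk digest padding.

```python
import hashlib


def block_digest(block: bytes, salt: bytes, version: int = 1,
                 algorithm: str = "sha256") -> bytes:
    """Hash one block with the salt placed as described for <version>."""
    h = hashlib.new(algorithm)
    if version == 1:
        # Version 1: the salt is prepended.
        h.update(salt)
        h.update(block)
    elif version == 0:
        # Version 0: the salt is appended.
        h.update(block)
        h.update(salt)
    else:
        raise ValueError("unknown hash format version")
    return h.digest()
```

With a non-empty salt, the two versions yield different digests for the same
block, so the version used to build the hash tree must match the one passed in
the table.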

<dev>
    This is the device containing data, the integrity of which needs to be
    checked.  It may be specified as a path, like /dev/sdaX, or a device number,
    <major>:<minor>.

<hash_dev>
    This is the device that supplies the hash tree data.  It may be
    specified similarly to the device path and may be the same device.  If the
    same device is used, <hash_start_block> should point outside the area
    covered by the configured dm-verity device.

<data_block_size>
    The block size on a data device in bytes.
    Each block corresponds to one digest on the hash device.

<hash_block_size>
    The size of a hash block in bytes.

<num_data_blocks>
    The number of data blocks on the data device.  Additional blocks are
    inaccessible.  You can place hashes on the same partition as the data; in
    this case the hashes are placed after <num_data_blocks>.

<hash_start_block>
    This is the offset, in <hash_block_size>-blocks, from the start of hash_dev
    to the root block of the hash tree.

<algorithm>
    The cryptographic hash algorithm used for this device.  This should
    be the name of the algorithm, like "sha1".

<digest>
    The hexadecimal encoding of the cryptographic hash of the root hash block
    and the salt.  This hash must come from a trusted source, as nothing beyond
    this point authenticates it.

<salt>
    The hexadecimal encoding of the salt value.

<#opt_params>
    Number of optional parameters. If there are no optional parameters,
    the optional parameters section can be skipped or #opt_params can be zero.
    Otherwise #opt_params is the number of following arguments.

    Example of optional parameters section:
        1 ignore_corruption
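
As a small illustration of the counting rule, the sketch below formats an
optional parameters section from a list of arguments. The helper name
``format_opt_params`` is hypothetical, and it assumes that an option and its
value (for example ``fec_roots`` and ``2``) count as two separate arguments.

```python
def format_opt_params(args):
    """Return the '<#opt_params> <opt_params>' section for a list of words."""
    args = [str(a) for a in args]
    if not args:
        # With no optional parameters the section may simply be skipped.
        return ""
    # The leading number is the count of arguments that follow it.
    return " ".join([str(len(args)), *args])
```

For example, ``format_opt_params(["ignore_corruption"])`` gives the section
shown above.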

ignore_corruption
    Log corrupted blocks, but allow read operations to proceed normally.

restart_on_corruption
    Restart the system when a corrupted block is discovered. This option is
    not compatible with ignore_corruption and requires user space support to
    avoid restart loops.

panic_on_corruption
    Panic the device when a corrupted block is discovered. This option is
    not compatible with ignore_corruption and restart_on_corruption.

restart_on_error
    Restart the system when an I/O error is detected.
    This option can be combined with the restart_on_corruption option.

panic_on_error
    Panic the device when an I/O error is detected. This option is
    not compatible with the restart_on_error option but can be combined
    with the panic_on_corruption option.
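
The compatibility rules stated for the corruption and error handling options
can be collected into a small pre-flight check. This is only a sketch of the
rules as written above; the names ``INCOMPATIBLE`` and ``conflicting_options``
are hypothetical and not part of any tool.

```python
# Pairs of options that the descriptions above call incompatible.
INCOMPATIBLE = [
    ("restart_on_corruption", "ignore_corruption"),
    ("panic_on_corruption", "ignore_corruption"),
    ("panic_on_corruption", "restart_on_corruption"),
    ("panic_on_error", "restart_on_error"),
]


def conflicting_options(options):
    """Return every incompatible pair present in the requested options."""
    chosen = set(options)
    return [pair for pair in INCOMPATIBLE
            if pair[0] in chosen and pair[1] in chosen]
```

An empty result means the requested combination does not violate any of the
rules listed in this document.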

ignore_zero_blocks
    Do not verify blocks that are expected to contain zeroes and always return
    zeroes instead. This may be useful if the partition contains unused blocks
    that are not guaranteed to contain zeroes.

use_fec_from_device <fec_dev>
    Use forward error correction (FEC) parity data from the specified device to
    try to automatically recover from corruption and I/O errors.

    If this option is given, then <fec_roots> and <fec_blocks> must also be
    given.  <hash_block_size> must also be equal to <data_block_size>.

    <fec_dev> can be the same as <dev>, in which case <fec_start> must be
    outside the data area.  It can also be the same as <hash_dev>, in which case
    <fec_start> must be outside the hash and optional additional metadata areas.

    If the data <dev> is encrypted, the <fec_dev> should be too.

    For more information, see `Forward error correction`_.

fec_roots <num>
    The number of parity bytes in each 255-byte Reed-Solomon codeword.  The
    Reed-Solomon code used will be an RS(255, k) code where k = 255 - fec_roots.

    The supported values are 2 through 24 inclusive.  Higher values provide
    stronger error correction.  However, the minimum value of 2 already provides
    strong error correction due to the use of interleaving, so 2 is the
    recommended value for most users.  fec_roots=2 corresponds to an
    RS(255, 253) code, which has a space overhead of about 0.8%.

fec_blocks <num>
    The total number of <data_block_size> blocks that are error-checked using
    FEC.  This must be at least the sum of <num_data_blocks> and the number of
    blocks needed by the hash tree.  It can include additional metadata blocks,
    which are assumed to be accessible on <hash_dev> following the hash blocks.

    Note that this is *not* the number of parity blocks.  The number of parity
    blocks is inferred from <fec_blocks>, <fec_roots>, and <data_block_size>.
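
One way to see how the parity size follows from these parameters is the sketch
below. It is derived from the interleaving procedure in the "Forward error
correction" section (the message is zero-padded to a multiple of k blocks, and
each group of k message bytes gets fec_roots parity bytes); the function names
are hypothetical and the result has not been checked against the kernel code.

```python
def fec_parity_blocks(fec_blocks: int, fec_roots: int) -> int:
    """Parity size in <data_block_size> blocks, under the stated assumptions."""
    if not 2 <= fec_roots <= 24:
        raise ValueError("fec_roots must be between 2 and 24 inclusive")
    k = 255 - fec_roots
    # Ceiling division: the message is zero-padded to a multiple of k blocks.
    groups = -(-fec_blocks // k)
    # Each group of k message blocks contributes fec_roots blocks of parity.
    return groups * fec_roots


def fec_parity_bytes(fec_blocks: int, fec_roots: int,
                     data_block_size: int = 4096) -> int:
    """The same quantity expressed in bytes."""
    return fec_parity_blocks(fec_blocks, fec_roots) * data_block_size
```

For fec_roots=2, every 253 message blocks cost 2 parity blocks, which matches
the overhead of about 0.8% quoted above.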

fec_start <offset>
    This is the offset, in <data_block_size> blocks, from the start of <fec_dev>
    to the beginning of the parity data.

check_at_most_once
    Verify data blocks only the first time they are read from the data device,
    rather than every time.  This reduces the overhead of dm-verity so that it
    can be used on systems that are memory and/or CPU constrained.  However, it
    provides a reduced level of security because only offline tampering of the
    data device's content will be detected, not online tampering.

    Hash blocks are still verified each time they are read from the hash device,
    since verification of hash blocks is less performance critical than data
    blocks, and a hash block will not be verified any more after all the data
    blocks it covers have been verified anyway.

root_hash_sig_key_desc <key_description>
    This is the description of the USER_KEY that the kernel will look up to get
    the PKCS#7 signature of the root hash. The PKCS#7 signature is used to
    validate the root hash during the creation of the device mapper block device.
    Verification of the root hash depends on the config
    DM_VERITY_VERIFY_ROOTHASH_SIG being set in the kernel.  The signatures are
    checked against the builtin trusted keyring by default, or the secondary
    trusted keyring if DM_VERITY_VERIFY_ROOTHASH_SIG_SECONDARY_KEYRING is set.
    The secondary trusted keyring includes the builtin trusted keyring by
    default, and it can also gain new certificates at run time if they are
    signed by a certificate already in the secondary trusted keyring.

try_verify_in_tasklet
    If verity hashes are in cache and the IO size does not exceed the limit,
    verify data blocks in a bottom half instead of a workqueue. This option can
    reduce IO latency. The size limits can be configured via
    /sys/module/dm_verity/parameters/use_bh_bytes. The four parameters
    correspond to limits for IOPRIO_CLASS_NONE, IOPRIO_CLASS_RT,
    IOPRIO_CLASS_BE and IOPRIO_CLASS_IDLE in turn.
    For example::

        <none>,<rt>,<be>,<idle>
        4096,4096,4096,4096

Theory of operation
===================

dm-verity is meant to be set up as part of a verified boot path.  This
may be anything ranging from a boot using tboot or trustedgrub to just
booting from a known-good device (like a USB drive or CD).

When a dm-verity device is configured, it is expected that the caller
has been authenticated in some way (cryptographic signatures, etc).
After instantiation, all hashes will be verified on-demand during
disk access.  If they cannot be verified up to the root node of the
tree, the root hash, then the I/O will fail.  This should detect
tampering with any data on the device and the hash data.

Cryptographic hashes are used to assert the integrity of the device on a
per-block basis. This allows for a lightweight hash computation on first read
into the page cache. Block hashes are stored linearly, aligned to the nearest
block size.

Hash Tree
---------

Each node in the tree is a cryptographic hash.  If it is a leaf node, the hash
of some data block on disk is calculated. If it is an intermediary node,
the hash of a number of child nodes is calculated.

Each entry in the tree is a collection of neighboring nodes that fit in one
block.  The number is determined based on block_size and the size of the
selected cryptographic digest algorithm.  The hashes are linearly-ordered in
this entry and any unaligned trailing space is ignored but included when
calculating the parent node.
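
The construction described above can be sketched in a few lines. This is a
simplified model, not a reimplementation of veritysetup: it assumes the
version 1 format with SHA-256 (salt prepended, 32-byte digests that need no
extra padding), equal data and hash block sizes, and no on-disk header. The
function names are hypothetical, and the output has not been checked for
bit-for-bit compatibility with real tools.

```python
import hashlib

DIGEST_SIZE = 32  # SHA-256


def salted_digest(data: bytes, salt: bytes) -> bytes:
    # Version 1 format: the salt is prepended when hashing.
    return hashlib.sha256(salt + data).digest()


def build_levels(data: bytes, salt: bytes, block_size: int):
    """Return the hash blocks per level, from the leaf level up to the root."""
    if not data or len(data) % block_size:
        raise ValueError("data must be a non-empty multiple of block_size")
    # Leaf nodes: one digest per data block.
    digests = [salted_digest(data[off:off + block_size], salt)
               for off in range(0, len(data), block_size)]
    per_block = block_size // DIGEST_SIZE
    levels = []
    while True:
        # Pack neighboring digests into hash blocks and zero-pad the tail.
        blocks = [b"".join(digests[i:i + per_block]).ljust(block_size, b"\0")
                  for i in range(0, len(digests), per_block)]
        levels.append(blocks)
        if len(blocks) == 1:
            return levels
        # Parent nodes hash whole child blocks, trailing padding included.
        digests = [salted_digest(block, salt) for block in blocks]


def root_digest(levels, salt: bytes) -> bytes:
    """Digest of the single top-level hash block, i.e. the root hash."""
    return salted_digest(levels[-1][0], salt)
```

Changing any byte of the data changes a leaf digest, which propagates through
every parent block up to the root digest, so a trusted root hash is enough to
detect the change.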

The tree looks something like:

	alg = sha256, num_blocks = 32768, block_size = 4096

::

                                 [   root    ]
                                /    . . .    \
                     [entry_0]                 [entry_1]
                    /  . . .  \                 . . .   \
         [entry_0_0]   . . .  [entry_0_127]    . . . .  [entry_1_127]
           / ... \             /   . . .  \             /           \
     blk_0 ... blk_127  blk_16256   blk_16383      blk_32640 . . . blk_32767
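
The shape of this tree follows from simple arithmetic: a 4096-byte block holds
128 SHA-256 digests of 32 bytes each. The sketch below computes the number of
hash blocks at each level; the function name is hypothetical, and
``digest_size`` is assumed to already be the padded, power-of-two size.

```python
def hash_level_sizes(num_blocks: int, block_size: int, digest_size: int):
    """Number of hash blocks per level, from the leaf level up to the root."""
    if num_blocks < 1:
        raise ValueError("num_blocks must be at least 1")
    per_block = block_size // digest_size
    sizes = []
    remaining = num_blocks
    while True:
        # Ceiling division: blocks needed to hold one digest per child.
        remaining = -(-remaining // per_block)
        sizes.append(remaining)
        if remaining == 1:
            return sizes
```

For the parameters in the diagram this gives 256 leaf-level hash blocks
(entry_0_0 through entry_1_127), 2 blocks above them (entry_0 and entry_1),
and 1 root block.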

Forward error correction
------------------------

dm-verity's optional forward error correction (FEC) support adds strong error
correction capabilities to dm-verity.  It allows systems that would be rendered
inoperable by errors to continue operating, albeit with reduced performance.

FEC uses Reed-Solomon (RS) codes that are interleaved across the entire
device(s), allowing long bursts of corrupt or unreadable blocks to be recovered.

dm-verity validates any FEC-corrected block against the wanted hash before using
it.  Therefore, FEC doesn't affect the security properties of dm-verity.

The integration of FEC with dm-verity provides significant benefits over a
separate error correction layer:

- dm-verity invokes FEC only when a block's hash doesn't match the wanted hash
  or the block cannot be read at all.  As a result, FEC doesn't add overhead to
  the common case where no error occurs.

- dm-verity hashes are also used to identify erasure locations for RS decoding.
  This allows correcting twice as many errors.

FEC uses an RS(255, k) code where k = 255 - fec_roots.  fec_roots is usually 2.
This means that each k (usually 253) message bytes have fec_roots (usually 2)
bytes of parity data added to get a 255-byte codeword.  (Many external sources
call RS codewords "blocks".  Since dm-verity already uses the term "block" to
mean something else, we'll use the clearer term "RS codeword".)

FEC checks fec_blocks blocks of message data in total, consisting of:

1. The data blocks from the data device
2. The hash blocks from the hash device
3. Optional additional metadata that follows the hash blocks on the hash device

dm-verity assumes that the FEC parity data was computed as if the following
procedure were followed:

1. Concatenate the message data from the above sources.
2. Zero-pad to the next multiple of k blocks.  Let msg be the resulting byte
   array, and msglen its length in bytes.
3. For 0 <= i < msglen / k (for each RS codeword):
     a. Select msg[i + j * msglen / k] for 0 <= j < k.
        Consider these to be the 'k' message bytes of an RS codeword.
     b. Compute the corresponding 'fec_roots' parity bytes of the RS codeword,
        and concatenate them to the FEC parity data.

Step 3a interleaves the RS codewords across the entire device using an
interleaving degree of data_block_size * ceil(fec_blocks / k).  This is the
maximal interleaving, such that the message data consists of a region containing
byte 0 of all the RS codewords, then a region containing byte 1 of all the RS
codewords, and so on up to the region for byte 'k - 1'.  Note that the number of
codewords is set to a multiple of data_block_size; thus, the regions are
block-aligned, and there is an implicit zero padding of up to 'k - 1' blocks.
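
Steps 2 and 3a of the procedure can be sketched directly from their
description. The function names below are hypothetical, and the toy sizes used
in the usage note are far smaller than anything dm-verity would accept; they
only make the indexing easy to follow.

```python
def pad_message(message: bytes, block_size: int, k: int) -> bytes:
    """Step 2: zero-pad the message to the next multiple of k blocks."""
    chunk = k * block_size
    padded_len = -(-len(message) // chunk) * chunk  # ceiling to a multiple
    return message.ljust(padded_len, b"\0")


def codeword_message_bytes(msg: bytes, i: int, k: int) -> bytes:
    """Step 3a: the k message bytes that belong to RS codeword number i."""
    stride = len(msg) // k  # msglen / k, which is also the codeword count
    if not 0 <= i < stride:
        raise IndexError("codeword index out of range")
    return bytes(msg[i + j * stride] for j in range(k))
```

With a toy block size of 4 bytes, k = 3 and a 2-block message, the padded
message is 12 bytes long and there are 4 codewords; codeword 0 takes bytes 0,
4 and 8 of the padded message, so consecutive message bytes always land in
different codewords.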

This interleaving allows long bursts of errors to be corrected.  It provides
much stronger error correction than storage devices typically provide, while
keeping the space overhead low.

The cost is slow decoding: correcting a single block usually requires reading
254 extra blocks spread evenly across the device(s).  However, that is
acceptable because dm-verity uses FEC only when there is actually an error.

The list below contains additional details about the RS codes used by
dm-verity's FEC.  Userspace programs that generate the parity data need to use
these parameters for the parity data to match exactly:

- Field used is GF(256)
- Bytes are mapped to/from GF(256) elements in the natural way, where bits 0
  through 7 (low-order to high-order) map to the coefficients of x^0 through x^7
- Field generator polynomial is x^8 + x^4 + x^3 + x^2 + 1
- The codes used are systematic, BCH-view codes
- Primitive element alpha is 'x'
- First consecutive root of code generator polynomial is 'x^0'
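
To make these parameters concrete, the sketch below implements textbook
systematic Reed-Solomon encoding over GF(256) with them: field polynomial
0x11D, alpha = 2 (the element 'x'), and code generator roots alpha^0 through
alpha^(fec_roots - 1). It is an illustration, not the kernel's encoder; the
function names are hypothetical, and the ordering of the parity bytes
(highest-degree coefficient first) is an assumption that real tools should
verify against the kernel.

```python
FIELD_POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1
ALPHA = 2           # the element 'x'


def gf_mul(a: int, b: int) -> int:
    """Multiply two GF(256) elements (shift-and-add with reduction)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= FIELD_POLY
        b >>= 1
    return result


def gf_pow(a: int, n: int) -> int:
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def generator_poly(nroots: int):
    """Coefficients of prod(x + alpha^i), i = 0..nroots-1, highest first."""
    g = [1]
    for i in range(nroots):
        root = gf_pow(ALPHA, i)
        nxt = g + [0]                      # g(x) * x
        for idx, coeff in enumerate(g):    # ... plus g(x) * root
            nxt[idx + 1] ^= gf_mul(coeff, root)
        g = nxt
    return g


def rs_parity(msg: bytes, nroots: int):
    """Remainder of msg(x) * x^nroots divided by the generator polynomial."""
    g = generator_poly(nroots)
    rem = [0] * nroots
    for byte in msg:
        feedback = byte ^ rem[0]
        rem = rem[1:] + [0]
        for i in range(nroots):
            rem[i] ^= gf_mul(feedback, g[i + 1])
    return rem


def poly_eval(coeffs, x: int) -> int:
    """Evaluate a polynomial (highest coefficient first) with Horner's rule."""
    acc = 0
    for coeff in coeffs:
        acc = gf_mul(acc, x) ^ coeff
    return acc
```

A quick consistency check is that a message followed by its parity bytes,
read as one polynomial, evaluates to zero at every root alpha^i of the
generator polynomial.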

On-disk format
==============

The verity kernel code does not read the verity metadata on-disk header.
It only reads the hash blocks which directly follow the header.
It is expected that a user-space tool will verify the integrity of the
verity header.

Alternatively, the header can be omitted and the dmsetup parameters can
be passed via the kernel command-line in a rooted chain of trust where
the command-line is verified.

Directly following the header (and with sector number padded to the next hash
block boundary) are the hash blocks, which are stored one tree depth at a time
(starting from the root), sorted in order of increasing index.

The full specification of kernel parameters and on-disk metadata format
is available at the cryptsetup project's wiki page:

  https://gitlab.com/cryptsetup/cryptsetup/wikis/DMVerity

Status
======

1. V (for Valid) is returned if every check performed so far was valid.
   If any check failed, C (for Corruption) is returned.
2. Number of blocks corrected by Forward Error Correction.
   '-' if Forward Error Correction is not enabled.
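
A monitoring script could interpret these two fields roughly as follows. The
helper name ``parse_verity_status`` is hypothetical, and the sketch assumes the
caller has already stripped everything that precedes the target-specific
fields in the status line.

```python
def parse_verity_status(fields: str) -> dict:
    """Parse the verity status fields, e.g. 'V -' or 'C 12'."""
    parts = fields.split()
    if not parts or parts[0] not in ("V", "C"):
        raise ValueError("expected 'V' or 'C' as the first status field")
    corrected = None
    # '-' (or a missing field) means FEC is not enabled.
    if len(parts) > 1 and parts[1] != "-":
        corrected = int(parts[1])
    return {"valid": parts[0] == "V", "fec_corrected_blocks": corrected}
```

A result with ``valid`` set to ``False`` means at least one check has failed
since the device was set up.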

Example
=======

Set up a device::

  # dmsetup create vroot --readonly --table \
    "0 2097152 verity 1 /dev/sda1 /dev/sda2 4096 4096 262144 1 sha256 "\
    "4392712ba01368efdf14b05c76f9e4df0d53664630b5d48632ed17a137f39076 "\
    "1234000000000000000000000000000000000000000000000000000000000000"
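
The table line in this example can be assembled from the construction
parameters. The sketch below uses a hypothetical helper, ``verity_table``, and
relies on device-mapper table lengths being expressed in 512-byte sectors,
which is why 262144 data blocks of 4096 bytes appear as 2097152 sectors.

```python
SECTOR_SIZE = 512  # device-mapper tables are expressed in 512-byte sectors


def verity_table(data_dev, hash_dev, num_data_blocks, hash_start_block,
                 root_digest, salt, *, version=1, data_block_size=4096,
                 hash_block_size=4096, algorithm="sha256"):
    """Build a '<start> <length> verity ...' table line without opt_params."""
    sectors = num_data_blocks * data_block_size // SECTOR_SIZE
    fields = (0, sectors, "verity", version, data_dev, hash_dev,
              data_block_size, hash_block_size, num_data_blocks,
              hash_start_block, algorithm, root_digest, salt)
    return " ".join(str(field) for field in fields)
```

Called with the devices, block counts, digest and salt from the example, the
function returns the same string that is passed to ``--table`` above.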

A command line tool, veritysetup, is available to compute or verify
the hash tree or activate the kernel device. It is available from
the cryptsetup upstream repository https://gitlab.com/cryptsetup/cryptsetup/
(as a libcryptsetup extension).

Create the hash tree on the hash device::

  # veritysetup format /dev/sda1 /dev/sda2
  ...
  Root hash: 4392712ba01368efdf14b05c76f9e4df0d53664630b5d48632ed17a137f39076

Activate the device::

  # veritysetup create vroot /dev/sda1 /dev/sda2 \
    4392712ba01368efdf14b05c76f9e4df0d53664630b5d48632ed17a137f39076

350