xref: /freebsd/crypto/openssl/doc/designs/ML-KEM.md (revision e7be843b4a162e68651d3911f0357ed464915629)
1*e7be843bSPierre ProncheryML-KEM Design
2*e7be843bSPierre Pronchery=============
3*e7be843bSPierre Pronchery
4*e7be843bSPierre ProncheryThis document covers OpenSSL-specific ML-KEM implementation details.
5*e7be843bSPierre Pronchery**ML-KEM** is specified in [FIPS 203], which includes comprehensive pseudo-code
6*e7be843bSPierre Proncheryfor all its algorithms.
7*e7be843bSPierre Pronchery
8*e7be843bSPierre ProncheryML-KEM Parameters & Functions
9*e7be843bSPierre Pronchery-----------------------------
10*e7be843bSPierre Pronchery
11*e7be843bSPierre ProncheryThere are 3 different parameter sets in FIPS 203 (see Section 8).
12*e7be843bSPierre Pronchery
13*e7be843bSPierre ProncheryTo support these variants, OpenSSL has 3 associated key managers and 3
14*e7be843bSPierre Proncherycorresponding KEM function sets.
15*e7be843bSPierre ProncheryThe key management and KEM algorithm names are **ML-KEM-512**, **ML-KEM-768**
16*e7be843bSPierre Proncheryand **ML-KEM-1024**.
17*e7be843bSPierre ProncheryAt the TLS layer, the associated key exchange *groups* are, respectively,
18*e7be843bSPierre Pronchery**MLKEM512**, **MLKEM768** and **MLKEM1024**.
19*e7be843bSPierre Pronchery
20*e7be843bSPierre Pronchery**ML-KEM** makes extensive use of four **SHA3** primitives: **SHA3-256**,
21*e7be843bSPierre Pronchery**SHA3-512**, **SHAKE128** and **SHAKE256**.
22*e7be843bSPierre ProncheryTo improve **ML-KEM** execution performance, the EVP handles for these are
23*e7be843bSPierre Proncherypre-fetched during **ML-KEM** key initialisation and stored with each key.
24*e7be843bSPierre ProncheryThese are then used in key generation, encapsulation and decapsulation.
25*e7be843bSPierre ProncheryThese are also duplicated by reference (**EVP_MD** handles uprefed) when an
26*e7be843bSPierre Pronchery**ML-KEM** key is duplicated.
27*e7be843bSPierre Pronchery
28*e7be843bSPierre ProncheryML-KEM keys
29*e7be843bSPierre Pronchery-----------
30*e7be843bSPierre Pronchery
31*e7be843bSPierre Pronchery**ML-KEM** is an asymmetric algorithm, and has both public and private keys.
32*e7be843bSPierre ProncherySince the public key is exchanged between the two parties as part of key
33*e7be843bSPierre Proncheryagreement, the encoding (*wire-form*) of the public key is clearly defined and
34*e7be843bSPierre Proncherythere are unambiguous choices for its encoding and decoding functions.
35*e7be843bSPierre Pronchery
36*e7be843bSPierre ProncheryIt may be noted that the *wire-form* public key is "compressed".
37*e7be843bSPierre ProncheryInstead of the bulky "A" ("m" in the code) matrix, which represents the majority
38*e7be843bSPierre Proncheryof the storage required for ML-KEM public and private keys, the *wire-form* public
39*e7be843bSPierre Proncherykey, holds a 32-byte seed from which the the matrix is regenerated by the recipient
40*e7be843bSPierre Proncheryof the public key.
41*e7be843bSPierre ProncheryIn the OpenSSL implementation, the matrix is *eagerly* evaluated as part of
42*e7be843bSPierre Proncherydecoding the public key, and stored in memory in the internal form needed for
43*e7be843bSPierre Proncherysubsequent computations (encapsulation).
44*e7be843bSPierre ProncherySince the private key includes the public key as one of its components, the matrix
45*e7be843bSPierre Proncheryis also pre-computed and stored with the private key, and then need not be
46*e7be843bSPierre Proncheryregenerated during decapsulation.
47*e7be843bSPierre ProncheryDuring encapsulation (typically peformed by servers), it is in principle
48*e7be843bSPierre Proncherypossible to save space and compute the matrix elements *just-in-time*, as each
49*e7be843bSPierre Proncherymatrix element is used exactly once.
50*e7be843bSPierre ProncheryThis is not currently implemented, and the matrix is pre-computed in full.
51*e7be843bSPierre Pronchery
52*e7be843bSPierre ProncheryHowever, the same matrix is used both during key generation and decapsulation
53*e7be843bSPierre Proncheryand computing it twice would have a noticeable performance impact (typically on
54*e7be843bSPierre Proncherythe client).
55*e7be843bSPierre ProncheryIf we wanted to do *just-in-time* matrix computation for decapsulation, we'd
56*e7be843bSPierre Proncheryneed to have a different memory layout for public keys when only the public key
57*e7be843bSPierre Proncheryis known, and to change the algorithm code to generate matrix elements on
58*e7be843bSPierre Proncherydemand during encapsulation.
59*e7be843bSPierre ProncheryThis can be considered later, if it is determined that the space savings (9 *
60*e7be843bSPierre Pronchery512 bytes in memory for ML-KEM-768, for the full matrix, instead of 512 bytes
61*e7be843bSPierre Proncheryfor a just-in-time element).
62*e7be843bSPierre ProncherySince servers will generally destroy the client public key soon after the
63*e7be843bSPierre Proncheryshared secret is computed, these don't stay in memory long, and briefly saving
64*e7be843bSPierre Pronchery~2KB may not to be of much benefit).
65*e7be843bSPierre Pronchery
66*e7be843bSPierre ProncheryThe private key format is yet to be clearly standardised, though (to be able to
67*e7be843bSPierre Proncheryfully describe the algorithms) FIPS 203 documents a format that is commonly
68*e7be843bSPierre Proncheryreferred to as the "extended" format.
69*e7be843bSPierre ProncheryThis is the private key format supported by our key management provider
70*e7be843bSPierre Proncheryinterface.
71*e7be843bSPierre ProncheryThe IETF voices interest in using the "seed-based" format (the 64-byte (*d*,
72*e7be843bSPierre Pronchery*z*) seed pair from which the key is generated and can be recovered).
73*e7be843bSPierre ProncheryRecovery of the key from the seed (*d*, *z* pair) is supported by the [FIPS
74*e7be843bSPierre Pronchery203] internal deterministic key generation functions, which are used in the
75*e7be843bSPierre Pronchery*keygen* portion of the Known Answer Tests (KATs).
76*e7be843bSPierre Pronchery
77*e7be843bSPierre ProncheryThe design therefore caters to both options: The default key generation and KEM
78*e7be843bSPierre Proncheryencapsulation/decapsulation functions operate on "extended keys" in the
79*e7be843bSPierre Pronchery[FIPS 203] format, but it will be possible to use the "seed-based" private key
80*e7be843bSPierre Proncheryformat by using the (currently test-only) deterministic *keygen* interface.
81*e7be843bSPierre ProncheryWhen keys are generated randomly, we don't presently provide a mechanism
82*e7be843bSPierre Proncheryto obtain and store the seed.
83*e7be843bSPierre ProncheryThis can be added later if required.
84*e7be843bSPierre Pronchery
85*e7be843bSPierre ProncheryKey generation API
86*e7be843bSPierre Pronchery------------------
87*e7be843bSPierre Pronchery
88*e7be843bSPierre ProncheryKeys can be generated via the usual **EVP_PKEY_generate()** and
89*e7be843bSPierre Pronchery**EVP_PKEY_Q_keygen()** functions.
90*e7be843bSPierre Pronchery
91*e7be843bSPierre ProncheryAn explicit seed can be specified by setting the key generation
92*e7be843bSPierre Pronchery**OSSL_PKEY_PARAM_ML_KEM_SEED** parameter to a 64-byte octet-string
93*e7be843bSPierre Pronchery(concatentation of the **d** and **z** values (32-bytes each) in that order).
94*e7be843bSPierre Pronchery
95*e7be843bSPierre ProncheryKEM API
96*e7be843bSPierre Pronchery-------
97*e7be843bSPierre Pronchery
98*e7be843bSPierre Pronchery**ML-KEM** is meant to be a drop-in replacement for existing KEM algorithms.
99*e7be843bSPierre ProncheryAccessed in the usual way via:
100*e7be843bSPierre Pronchery
101*e7be843bSPierre Pronchery- **EVP_PKEY_encapsulate_init()**,
102*e7be843bSPierre Pronchery- **EVP_PKEY_encapsulate()**,
103*e7be843bSPierre Pronchery- **EVP_PKEY_decapsulate_init()**, and
104*e7be843bSPierre Pronchery- **EVP_PKEY_decapsulate()**.
105*e7be843bSPierre Pronchery
106*e7be843bSPierre ProncheryFor the encapsulation operation, a test-only option exists to bypass the random
107*e7be843bSPierre Proncherynumber generator (secret random inputs are required for security) and pass in
108*e7be843bSPierre Proncherya pre-determined 32-byte random value, by setting of the
109*e7be843bSPierre Pronchery**OSSL_KEM_PARAM_IKME** parameter.
110*e7be843bSPierre Pronchery
111*e7be843bSPierre ProncheryBuffers
112*e7be843bSPierre Pronchery-------
113*e7be843bSPierre Pronchery
114*e7be843bSPierre ProncheryThe **ML-KEM** key management and KEM providers interface with the underlying
115*e7be843bSPierre Proncherylibcrypto implementation via functions that validate the sizes of all provided
116*e7be843bSPierre Proncheryinput/output buffers (encoded keys, ciphertext, shared secrets and seeds) against
117*e7be843bSPierre Proncherythe values expected for the provider's ML-KEM variant (a pointer to the variant
118*e7be843bSPierre Proncheryparameters is stored with each key).
119*e7be843bSPierre Pronchery
120*e7be843bSPierre ProncheryThe underlying libcrypto **ML-KEM** APIs are not directly exposed to users,
121*e7be843bSPierre Proncheryonly the abstracted key management and KEM **EVP** APIs are public interfaces.
122*e7be843bSPierre Pronchery
123*e7be843bSPierre ProncheryConstant Time Considerations
124*e7be843bSPierre Pronchery----------------------------
125*e7be843bSPierre Pronchery
126*e7be843bSPierre ProncheryThe usual constant time methods are used in the implementation.
127*e7be843bSPierre ProncheryHowever, we avoid using a *value-barrier* to set the masks that perform
128*e7be843bSPierre Proncheryconstant-time *select* between one of two values.
129*e7be843bSPierre ProncheryThis avoids a 30-50% performance penalty and is expected to be robust even in
130*e7be843bSPierre Proncherythe face of plausible future compiler optimisations.
131*e7be843bSPierre ProncheryRemainders module the prime are computed via Barret Reduction and the decoding
132*e7be843bSPierre Proncheryand decompression of the decrypted *message* has been tested to not be
133*e7be843bSPierre Proncheryvulnerable to the "clangover" attack in our implementation.
134*e7be843bSPierre Pronchery
135*e7be843bSPierre ProncheryAll the libcrypto functions (other than **ML_KEM_KEY** allocation, which
136*e7be843bSPierre Proncheryreturns **NULL** on error) return 1 for success or zero on error.
137*e7be843bSPierre ProncheryIt should be noted that to avoid chosen-ciphertext attacks, the
138*e7be843bSPierre Pronchery**decapsulate** implementation **must** return success and a synthetic
139*e7be843bSPierre Proncheryshared secret (generated in constant-time whether synthetic or successfully
140*e7be843bSPierre Proncherydecrypted) whenever the input is a well-formed ciphertext.
141*e7be843bSPierre Pronchery
142*e7be843bSPierre ProncheryThe only exception to the above is when, unexpectedly, one of the **SHA3**
143*e7be843bSPierre Proncheryfunctions fails, in that case all hope of constant-time computation is
144*e7be843bSPierre Proncherylost, but we don't expect such failures to be influenced by the content
145*e7be843bSPierre Proncheryof chosen-ciphertexts, so this should not be an issue).
146*e7be843bSPierre Pronchery
147*e7be843bSPierre ProncheryNevertheless, even then we fall back on returning a shared secret from the RNG
148*e7be843bSPierre Proncheryalong with an error indication only when the key derivation function
149*e7be843bSPierre Proncheryfor the synthetic shared secret fails.
150*e7be843bSPierre ProncheryIn all other conditions we return success and, as appropriate, either
151*e7be843bSPierre Proncherythe correct shared secret, or the synthetic alternative generated by the KDF.
152*e7be843bSPierre Pronchery
153*e7be843bSPierre Pronchery<!-- Links  -->
154*e7be843bSPierre Pronchery
155*e7be843bSPierre Pronchery[FIPS 203]:
156*e7be843bSPierre Pronchery    <https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.203.pdf>
157