24eb22d8 | 12-Jan-2026 | Eric Biggers <ebiggers@kernel.org>

lib/crypto: x86/aes: Add AES-NI optimization
Optimize the AES library with x86 AES-NI instructions.
The relevant existing assembly functions, aesni_set_key(), aesni_enc(), and aesni_dec(), are a bit difficult to extract into the library:
- They're coupled to the code for the AES modes.
- They operate on struct crypto_aes_ctx. The AES library now uses different structs.
- They assume the key is 16-byte aligned. The AES library only *prefers* 16-byte alignment; it doesn't require it.
Moreover, they're not all that great in the first place:
- They use unrolled loops, which isn't a great choice on x86.
- They use the 'aeskeygenassist' instruction, which is unnecessary, is slow on Intel CPUs, and forces the loop to be unrolled.
- They have special code for AES-192 key expansion, despite that being kind of useless. AES-128 and AES-256 are the ones used in practice.
These are small functions anyway.
Therefore, I opted to just write replacements of these functions for the library. They address all the above issues.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-18-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

293c7cd5 | 12-Jan-2026 | Eric Biggers <ebiggers@kernel.org>

lib/crypto: sparc/aes: Migrate optimized code into library
Move the SPARC64 AES assembly code into lib/crypto/, wire the key expansion and single-block en/decryption functions up to the AES library API, and remove the "aes-sparc64" crypto_cipher algorithm.
The result is that both the AES library and crypto_cipher APIs use the SPARC64 AES opcodes, whereas previously only crypto_cipher did (and it wasn't enabled by default, which this commit fixes as well).
Note that some of the functions in the SPARC64 AES assembly code are still used by the AES mode implementations in arch/sparc/crypto/aes_glue.c. For now, just export these functions. These exports will go away once the AES mode implementations are migrated to the library as well. (Trying to split up the assembly file seemed like much more trouble than it would be worth.)
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-17-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

a4e573db | 12-Jan-2026 | Eric Biggers <ebiggers@kernel.org>

lib/crypto: riscv/aes: Migrate optimized code into library
Move the aes_encrypt_zvkned() and aes_decrypt_zvkned() assembly functions into lib/crypto/, wire them up to the AES library API, and remove the "aes-riscv64-zvkned" crypto_cipher algorithm.
To make this possible, change the prototypes of these functions to take (rndkeys, key_len) instead of a pointer to crypto_aes_ctx, and change the RISC-V AES-XTS code to implement tweak encryption using the AES library instead of directly calling aes_encrypt_zvkned().
The result is that both the AES library and crypto_cipher APIs use RISC-V's AES instructions, whereas previously only crypto_cipher did (and it wasn't enabled by default, which this commit fixes as well).
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-15-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

7cf2082e | 12-Jan-2026 | Eric Biggers <ebiggers@kernel.org>

lib/crypto: powerpc/aes: Migrate POWER8 optimized code into library
Move the POWER8 AES assembly code into lib/crypto/, wire the key expansion and single-block en/decryption functions up to the AES library API, and remove the superseded "p8_aes" crypto_cipher algorithm.
The result is that both the AES library and crypto_cipher APIs are now optimized for POWER8, whereas previously only crypto_cipher was (and optimizations weren't enabled by default, which this commit fixes too).
Note that many of the functions in the POWER8 assembly code are still used by the AES mode implementations in arch/powerpc/crypto/. For now, just export these functions. These exports will go away once the AES modes are migrated to the library as well. (Trying to split up the assembly file seemed like much more trouble than it would be worth.)
Another challenge with this code is that the POWER8 assembly code uses a custom format for the expanded AES key. Since that code is imported from OpenSSL and is also targeted to POWER8 (rather than POWER9 which has better data movement and byteswap instructions), that is not easily changed. For now I've just kept the custom format. To maintain full correctness, this requires executing some slow fallback code in the case where the usability of VSX changes between key expansion and use. This should be tolerable, as this case shouldn't happen in practice.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-14-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

0892c91b | 12-Jan-2026 | Eric Biggers <ebiggers@kernel.org>

lib/crypto: powerpc/aes: Migrate SPE optimized code into library
Move the PowerPC SPE AES assembly code into lib/crypto/, wire the key expansion and single-block en/decryption functions up to the AES library API, and remove the superseded "aes-ppc-spe" crypto_cipher algorithm.
The result is that both the AES library and crypto_cipher APIs are now optimized with SPE, whereas previously only crypto_cipher was (and optimizations weren't enabled by default, which this commit fixes too).
Note that many of the functions in the PowerPC SPE assembly code are still used by the AES mode implementations in arch/powerpc/crypto/. For now, just export these functions. These exports will go away once the AES modes are migrated to the library as well. (Trying to split up the assembly files seemed like much more trouble than it would be worth.)
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-13-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

2b1ef7ae | 12-Jan-2026 | Eric Biggers <ebiggers@kernel.org>

lib/crypto: arm64/aes: Migrate optimized code into library
Move the ARM64 optimized AES key expansion and single-block AES en/decryption code into lib/crypto/, wire it up to the AES library API, and remove the superseded crypto_cipher algorithms.
The result is that both the AES library and crypto_cipher APIs are now optimized for ARM64, whereas previously only crypto_cipher was (and the optimizations weren't enabled by default, which this fixes as well).
Note: to see the diff from arch/arm64/crypto/aes-ce-glue.c to lib/crypto/arm64/aes.h, view this commit with 'git show -M10'.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-12-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

fa229775 | 12-Jan-2026 | Eric Biggers <ebiggers@kernel.org>

lib/crypto: arm/aes: Migrate optimized code into library
Move the ARM optimized single-block AES en/decryption code into lib/crypto/, wire it up to the AES library API, and remove the superseded "aes-arm" crypto_cipher algorithm.
The result is that both the AES library and crypto_cipher APIs are now optimized for ARM, whereas previously only crypto_cipher was (and the optimizations weren't enabled by default, which this fixes as well).
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-11-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

a22fd0e3 | 12-Jan-2026 | Eric Biggers <ebiggers@kernel.org>

lib/crypto: aes: Introduce improved AES library
The kernel's AES library currently has the following issues:
- It doesn't take advantage of the architecture-optimized AES code, including the implementations using AES instructions.
- It's much slower than even the other software AES implementations: 2-4 times slower than "aes-generic", "aes-arm", and "aes-arm64".
- It requires that both the encryption and decryption round keys be computed and cached. This is wasteful for users that need only the forward (encryption) direction of the cipher: the key struct is 484 bytes when only 244 are actually needed. This missed optimization is very common, as many AES modes (e.g. GCM, CFB, CTR, CMAC, and even the tweak key in XTS) use the cipher only in the forward (encryption) direction even when doing decryption.
- It doesn't provide the flexibility to customize the prepared key format. The API is defined to do key expansion, and several callers in drivers/crypto/ use it specifically to expand the key. This is an issue when integrating the existing powerpc, s390, and sparc code, which is necessary to provide full parity with the traditional API.
To resolve these issues, I'm proposing the following changes:
1. New structs 'aes_key' and 'aes_enckey' are introduced, with corresponding functions aes_preparekey() and aes_prepareenckey().
Generally these structs will include the encryption+decryption round keys and the encryption round keys, respectively. However, the exact format will be under control of the architecture-specific AES code.
(The verb "prepare" is chosen over "expand" since key expansion isn't necessarily done. It's also consistent with hmac*_preparekey().)
2. aes_encrypt() and aes_decrypt() will be changed to operate on the new structs instead of struct crypto_aes_ctx. (An illustrative usage sketch follows this list.)
3. aes_encrypt() and aes_decrypt() will use architecture-optimized code when available, or else fall back to a new generic AES implementation that unifies the existing two fragmented generic AES implementations.
The new generic AES implementation uses tables for both SubBytes and MixColumns, making it almost as fast as "aes-generic". However, instead of aes-generic's huge 8192-byte tables per direction, it uses only 1024 bytes for encryption and 1280 bytes for decryption (similar to "aes-arm"). The cost is just some extra rotations.
The new generic AES implementation also includes table prefetching, making it have some "constant-time hardening". That's an improvement from aes-generic which has no constant-time hardening.
It does slightly regress in constant-time hardening vs. the old lib/crypto/aes.c which had smaller tables, and from aes-fixed-time which disabled IRQs on top of that. But I think this is tolerable. The real solutions for constant-time AES are AES instructions or bit-slicing. The table-based code remains a best-effort fallback for the increasingly-rare case where a real solution is unavailable.
4. crypto_aes_ctx and aes_expandkey() will remain for now, but only for callers that are using them specifically for the AES key expansion (as opposed to en/decrypting data with the AES library).
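For illustration only: the struct and function names above come from this commit message, but the exact signatures in the sketch below (argument order, return type, header location) are assumptions, not the final kernel API.

  #include <linux/types.h>
  #include <crypto/aes.h>               /* assumed location of the new declarations */

  /* Illustrative only: the signatures below are assumptions, not the real API. */
  static int encrypt_one_block_example(const u8 *raw_key, size_t raw_key_len,
                                       u8 dst[AES_BLOCK_SIZE],
                                       const u8 src[AES_BLOCK_SIZE])
  {
          struct aes_enckey key;        /* encryption round keys only; no decryption keys cached */
          int err;

          /* "Prepare" (not necessarily expand) the key for the forward direction only. */
          err = aes_prepareenckey(&key, raw_key, raw_key_len);
          if (err)
                  return err;           /* e.g. unsupported key length */

          aes_encrypt(&key, dst, src);  /* uses arch-optimized code when available */
          return 0;
  }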
This commit begins the migration process by introducing the new structs and functions, backed by the new generic AES implementation.
To allow callers to be incrementally converted, aes_encrypt() and aes_decrypt() are temporarily changed into macros that use a _Generic expression to call either the old functions (which take crypto_aes_ctx) or the new functions (which take the new types). Once all callers have been updated, these macros will go away, the old functions will be removed, and the "_new" suffix will be dropped from the new functions.
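A simplified sketch of such a transitional macro follows. The _Generic dispatch itself is what the text above describes; aes_encrypt_old() is a hypothetical name for the legacy crypto_aes_ctx-based function, and the real macro may need to handle additional types and qualifiers.

  /*
   * Simplified illustration of the transitional dispatch described above.
   * aes_encrypt_old() is a made-up name; aes_encrypt_new() takes the new types.
   */
  #define aes_encrypt(key, dst, src)                                      \
          _Generic((key),                                                 \
                   struct crypto_aes_ctx *:       aes_encrypt_old,        \
                   const struct crypto_aes_ctx *: aes_encrypt_old,        \
                   default:                       aes_encrypt_new         \
          )((key), (dst), (src))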
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

a229d832 | 11-Dec-2025 | Eric Biggers <ebiggers@kernel.org>

lib/crypto: x86/nh: Migrate optimized code into library
Migrate the x86_64 implementations of NH into lib/crypto/. This makes the nh() function be optimized on x86_64 kernels.
Note: this temporarily makes the adiantum template not utilize the x86_64 optimized NH code. This is resolved in a later commit that converts the adiantum template to use nh() instead of "nhpoly1305".
Link: https://lore.kernel.org/r/20251211011846.8179-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

b4a8528d | 11-Dec-2025 | Eric Biggers <ebiggers@kernel.org>

lib/crypto: arm64/nh: Migrate optimized code into library
Migrate the arm64 NEON implementation of NH into lib/crypto/. This makes the nh() function be optimized on arm64 kernels.
Note: this temporarily makes the adiantum template not utilize the arm64 optimized NH code. This is resolved in a later commit that converts the adiantum template to use nh() instead of "nhpoly1305".
Link: https://lore.kernel.org/r/20251211011846.8179-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

29e39a11 | 11-Dec-2025 | Eric Biggers <ebiggers@kernel.org>

lib/crypto: arm/nh: Migrate optimized code into library
Migrate the arm32 NEON implementation of NH into lib/crypto/. This makes the nh() function be optimized on arm32 kernels.
Note: this temporarily makes the adiantum template not utilize the arm32 optimized NH code. This is resolved in a later commit that converts the adiantum template to use nh() instead of "nhpoly1305".
Link: https://lore.kernel.org/r/20251211011846.8179-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

14e15c71 | 11-Dec-2025 | Eric Biggers <ebiggers@kernel.org>

lib/crypto: nh: Add NH library
Add support for the NH "almost-universal hash function" to lib/crypto/, specifically the variant of NH used in Adiantum.
This will replace the need for the "nhpoly1305" crypto_shash algorithm. All the implementations of "nhpoly1305" use architecture-optimized code only for the NH stage; they just use the generic C Poly1305 code for the Poly1305 stage. We can achieve the same result in a simpler way using an (architecture-optimized) nh() function combined with code in crypto/adiantum.c that passes the results to the Poly1305 library.
This commit begins this cleanup by adding the nh() function. The code is derived from crypto/nhpoly1305.c and include/crypto/nhpoly1305.h.
Link: https://lore.kernel.org/r/20251211011846.8179-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

64edccea | 14-Dec-2025 | Eric Biggers <ebiggers@kernel.org>

lib/crypto: Add ML-DSA verification support
Add support for verifying ML-DSA signatures.
ML-DSA (Module-Lattice-Based Digital Signature Algorithm) is specified in FIPS 204 and is the standard version of Dilithium. Unlike RSA and elliptic-curve cryptography, ML-DSA is believed to be secure even against adversaries in possession of a large-scale quantum computer.
Compared to the earlier patch (https://lore.kernel.org/r/20251117145606.2155773-3-dhowells@redhat.com/) that was based on "leancrypto", this implementation:
- Is about 700 lines of source code instead of 4800.
- Generates about 4 KB of object code instead of 28 KB.
- Uses 9-13 KB of memory to verify a signature instead of 31-84 KB.
- Is at least about the same speed, with a microbenchmark showing 3-5% improvements on one x86_64 CPU and -1% to 1% changes on another. When memory is a bottleneck, it's likely much faster.
- Correctly implements the RejNTTPoly step of the algorithm.
The API just consists of a single function mldsa_verify(), supporting pure ML-DSA with any standard parameter set (ML-DSA-44, ML-DSA-65, or ML-DSA-87) as selected by an enum. That's all that's actually needed.
The following four potential features are unneeded and aren't included. However, any that ever become needed could fairly easily be added later, as they only affect how the message representative mu is calculated:
- Nonempty context strings
- Incremental message hashing
- HashML-DSA
- External mu
Signing support would, of course, be a larger and more complex addition. However, the kernel doesn't, and shouldn't, need ML-DSA signing support.
Note that mldsa_verify() allocates memory, so it can sleep and can fail with ENOMEM. Unfortunately we don't have much choice about that, since ML-DSA needs a lot of memory. At least callers have to check for errors anyway, since the signature could be invalid.
Note that verification doesn't require constant-time code, and in fact some steps are inherently variable-time. I've used constant-time patterns in some places anyway, but technically they're not needed.
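For illustration, a caller might look roughly like the sketch below. mldsa_verify() and its -ENOMEM behavior are described above; the MLDSA_87 constant name, the argument names, and the argument order are assumptions made for this sketch.

  #include <linux/errno.h>
  #include <linux/types.h>

  /* Hedged caller sketch only; see the assumptions noted above. */
  static int verify_mldsa87_example(const u8 *pub_key, size_t pub_key_len,
                                    const u8 *msg, size_t msg_len,
                                    const u8 *sig, size_t sig_len)
  {
          /* May sleep and may fail with -ENOMEM, since verification allocates memory. */
          int err = mldsa_verify(MLDSA_87, pub_key, pub_key_len,
                                 msg, msg_len, sig, sig_len);

          return err;   /* 0 = valid; negative = allocation failure or rejected signature */
  }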
Reviewed-by: David Howells <dhowells@redhat.com>
Tested-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20251214181712.29132-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

2e8f7b17 | 05-Dec-2025 | Eric Biggers <ebiggers@kernel.org>

lib/crypto: blake2b: Roll up BLAKE2b round loop on 32-bit
BLAKE2b has a state of 16 64-bit words. Add the message data in and there are 32 64-bit words. With the current code where all the rounds are unrolled to enable constant-folding of the blake2b_sigma values, this results in a very large code size on 32-bit kernels, including a recurring issue where gcc uses a large amount of stack.
There's just not much benefit to this unrolling when the code is already so large. Let's roll up the rounds when !CONFIG_64BIT.
To avoid having to duplicate the code, just write the code once using a loop, and conditionally use 'unrolled_full' from <linux/unroll.h>.
Then, fold the now-unneeded ROUND() macro into the loop. Finally, also remove the now-unneeded override of the stack frame size warning.
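The resulting pattern is roughly the following sketch. It is illustrative only, not the actual lib/crypto/blake2b.c code; blake2b_round() is a hypothetical per-round helper name. Only the conditional use of 'unrolled_full' is what the text above describes.

  #include <linux/types.h>
  #include <linux/unroll.h>

  /* Illustrative pattern only; blake2b_round() is a hypothetical helper. */
  static void blake2b_rounds_sketch(u64 v[16], const u64 m[16])
  {
          int r;

          /*
           * Unroll fully only on 64-bit builds, where folding the blake2b_sigma
           * constants into each round is worth the code size; 32-bit builds
           * keep the small rolled-up loop.
           */
  #ifdef CONFIG_64BIT
          unrolled_full
  #endif
          for (r = 0; r < 12; r++)
                  blake2b_round(v, m, r);
  }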
Code size improvements for blake2b_compress_generic():
                 Size before (bytes)   Size after (bytes)
                 -------------------   ------------------
   i386, gcc           27584                  3632
   i386, clang         18208                  3248
   arm32, gcc          19912                  2860
   arm32, clang        21336                  3344
Running the BLAKE2b benchmark on a !CONFIG_64BIT kernel on an x86_64 processor shows a 16384B throughput change of 351 => 340 MB/s (gcc) or 442 MB/s => 375 MB/s (clang), so clearly not much of a slowdown either. Moreover, that microbenchmark effectively disregards cache usage, which is important in practice and is far better with the smaller code.
Note: If we rolled up the loop on x86_64 too, the change would be 7024 bytes => 1584 bytes and 1960 MB/s => 1396 MB/s (gcc), or 6848 bytes => 1696 bytes and 1920 MB/s => 1263 MB/s (clang). Maybe still worth it, though not quite as clearly beneficial.
Fixes: 91d689337fe8 ("crypto: blake2b - add blake2b generic implementation")
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251205050330.89704-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

5abe8d8e | 03-Dec-2025 | Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'libcrypto-updates-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library updates from Eric Biggers: "This is the main crypto library pull request for 6.19. It includes:
- Add SHA-3 support to lib/crypto/, including support for both the hash functions and the extendable-output functions. Reimplement the existing SHA-3 crypto_shash support on top of the library.
This is motivated mainly by the upcoming support for the ML-DSA signature algorithm, which needs the SHAKE128 and SHAKE256 functions. But even on its own it's a useful cleanup.
This also fixes the longstanding issue where the architecture-optimized SHA-3 code was disabled by default.
- Add BLAKE2b support to lib/crypto/, and reimplement the existing BLAKE2b crypto_shash support on top of the library.
This is motivated mainly by btrfs, which supports BLAKE2b checksums. With this change, all btrfs checksum algorithms now have library APIs. btrfs is planned to start just using the library directly.
This refactor also improves consistency between the BLAKE2b code and BLAKE2s code. And as usual, it also fixes the issue where the architecture-optimized BLAKE2b code was disabled by default.
- Add POLYVAL support to lib/crypto/, replacing the existing POLYVAL support in crypto_shash. Reimplement HCTR2 on top of the library.
This simplifies the code and improves HCTR2 performance. As usual, it also makes the architecture-optimized code be enabled by default. The generic implementation of POLYVAL is greatly improved as well.
- Clean up the BLAKE2s code
- Add FIPS self-tests for SHA-1, SHA-2, and SHA-3"
* tag 'libcrypto-updates-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (37 commits)
  fscrypt: Drop obsolete recommendation to enable optimized POLYVAL
  crypto: polyval - Remove the polyval crypto_shash
  crypto: hctr2 - Convert to use POLYVAL library
  lib/crypto: x86/polyval: Migrate optimized code into library
  lib/crypto: arm64/polyval: Migrate optimized code into library
  lib/crypto: polyval: Add POLYVAL library
  crypto: polyval - Rename conflicting functions
  lib/crypto: x86/blake2s: Use vpternlogd for 3-input XORs
  lib/crypto: x86/blake2s: Avoid writing back unchanged 'f' value
  lib/crypto: x86/blake2s: Improve readability
  lib/crypto: x86/blake2s: Use local labels for data
  lib/crypto: x86/blake2s: Drop check for nblocks == 0
  lib/crypto: x86/blake2s: Fix 32-bit arg treated as 64-bit
  lib/crypto: arm, arm64: Drop filenames from file comments
  lib/crypto: arm/blake2s: Fix some comments
  crypto: s390/sha3 - Remove superseded SHA-3 code
  crypto: sha3 - Reimplement using library API
  crypto: jitterentropy - Use default sha3 implementation
  lib/crypto: s390/sha3: Add optimized one-shot SHA-3 digest functions
  lib/crypto: sha3: Support arch overrides of one-shot digest functions
  ...

4d8da355 | 10-Nov-2025 | Eric Biggers <ebiggers@kernel.org>

lib/crypto: x86/polyval: Migrate optimized code into library
Migrate the x86_64 implementation of POLYVAL into lib/crypto/, wiring it up to the POLYVAL library interface. This makes the POLYVAL library be properly optimized on x86_64.
This drops the x86_64 optimizations of polyval in the crypto_shash API. That's fine, since polyval will be removed from crypto_shash entirely since it is unneeded there. But even if it comes back, the crypto_shash API could just be implemented on top of the library API, as usual.
Adjust the names and prototypes of the assembly functions to align more closely with the rest of the library code.
Also replace a movaps instruction with movups to remove the assumption that the key struct is 16-byte aligned. Users can still align the key if they want (and at least in this case, movups is just as fast as movaps), but it's inconvenient to require it.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251109234726.638437-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

37919e23 | 10-Nov-2025 | Eric Biggers <ebiggers@kernel.org>

lib/crypto: arm64/polyval: Migrate optimized code into library
Migrate the arm64 implementation of POLYVAL into lib/crypto/, wiring it up to the POLYVAL library interface. This makes the POLYVAL library be properly optimized on arm64.
This drops the arm64 optimizations of polyval in the crypto_shash API. That's fine, since polyval will be removed from crypto_shash entirely since it is unneeded there. But even if it comes back, the crypto_shash API could just be implemented on top of the library API, as usual.
Adjust the names and prototypes of the assembly functions to align more closely with the rest of the library code.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251109234726.638437-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

3d176751 | 10-Nov-2025 | Eric Biggers <ebiggers@kernel.org>

lib/crypto: polyval: Add POLYVAL library
Add support for POLYVAL to lib/crypto/.
This will replace the polyval crypto_shash algorithm and its use in the hctr2 template, simplifying the code and reducing overhead.
Specifically, this commit introduces the POLYVAL library API and a generic implementation of it. Later commits will migrate the existing architecture-optimized implementations of POLYVAL into lib/crypto/ and add a KUnit test suite.
I've also rewritten the generic implementation completely, using a more modern approach instead of the traditional table-based approach. It's now constant-time, requires no precomputation or dynamic memory allocations, decreases the per-key memory usage from 4096 bytes to 16 bytes, and is faster than the old polyval-generic even on bulk data reusing the same key (at least on x86_64, where I measured 15% faster). We should do this for GHASH too, but for now just do it for POLYVAL.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251109234726.638437-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

1e29a750 | 26-Oct-2025 | Eric Biggers <ebiggers@kernel.org>

lib/crypto: arm64/sha3: Migrate optimized code into library
Instead of exposing the arm64-optimized SHA-3 code via arm64-specific crypto_shash algorithms, instead just implement the sha3_absorb_blocks() and sha3_keccakf() library functions. This is much simpler, it makes the SHA-3 library functions be arm64-optimized, and it fixes the longstanding issue where the arm64-optimized SHA-3 code was disabled by default. SHA-3 still remains available through crypto_shash, but individual architectures no longer need to handle it.
Note: to see the diff from arch/arm64/crypto/sha3-ce-glue.c to lib/crypto/arm64/sha3.h, view this commit with 'git show -M10'.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-10-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

05934472 | 26-Oct-2025 | David Howells <dhowells@redhat.com>

lib/crypto: sha3: Add SHA-3 support
Add SHA-3 support to lib/crypto/. All six algorithms in the SHA-3 family are supported: four digests (SHA3-224, SHA3-256, SHA3-384, and SHA3-512) and two extendable-output functions (SHAKE128 and SHAKE256).
The SHAKE algorithms will be required for ML-DSA.
[EB: simplified the API to use fewer types and functions, fixed bug that sometimes caused incorrect SHAKE output, cleaned up the documentation, dropped an ad-hoc test that was inconsistent with the rest of lib/crypto/, and many other cleanups]
Signed-off-by: David Howells <dhowells@redhat.com>
Co-developed-by: Eric Biggers <ebiggers@kernel.org>
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

2b81082a | 03-Nov-2025 | Nathan Chancellor <nathan@kernel.org>

lib/crypto: curve25519-hacl64: Fix older clang KASAN workaround for GCC
Commit 2f13daee2a72 ("lib/crypto/curve25519-hacl64: Disable KASAN with clang-17 and older") inadvertently disabled KASAN in curve25519-hacl64.o for GCC unconditionally because clang-min-version will always evaluate to nothing for GCC. Add a check for CONFIG_CC_IS_CLANG to avoid applying the workaround for GCC, which is only needed for clang-17 and older.
Cc: stable@vger.kernel.org
Fixes: 2f13daee2a72 ("lib/crypto/curve25519-hacl64: Disable KASAN with clang-17 and older")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251103-curve25519-hacl64-fix-kasan-workaround-v2-1-ab581cbd8035@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

ba6617bd | 18-Oct-2025 | Eric Biggers <ebiggers@kernel.org>

lib/crypto: arm/blake2b: Migrate optimized code into library
Migrate the arm-optimized BLAKE2b code from arch/arm/crypto/ to lib/crypto/arm/. This makes the BLAKE2b library able to use it, and it also simplifies the code because it's easier to integrate with the library than crypto_shash.
This temporarily makes the arm-optimized BLAKE2b code unavailable via crypto_shash. A later commit reimplements the blake2b-* crypto_shash algorithms on top of the BLAKE2b library API, making it available again.
Note that as per the lib/crypto/ convention, the optimized code is now enabled by default. So, this also fixes the longstanding issue where the optimized BLAKE2b code was not enabled by default.
To see the diff from arch/arm/crypto/blake2b-neon-glue.c to lib/crypto/arm/blake2b.h, view this commit with 'git show -M10'.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251018043106.375964-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

23a16c95 | 18-Oct-2025 | Eric Biggers <ebiggers@kernel.org>

lib/crypto: blake2b: Add BLAKE2b library functions
Add a library API for BLAKE2b, closely modeled after the BLAKE2s API.
This will allow in-kernel users such as btrfs to use BLAKE2b without going through the generic crypto layer. In addition, as usual the BLAKE2b crypto_shash algorithms will be reimplemented on top of this.
Note: to create lib/crypto/blake2b.c I made a copy of lib/crypto/blake2s.c and made the updates from BLAKE2s => BLAKE2b. This way, the BLAKE2s and BLAKE2b code is kept consistent. Therefore, it borrows the SPDX-License-Identifier and Copyright from lib/crypto/blake2s.c rather than crypto/blake2b_generic.c.
The library API uses 'struct blake2b_ctx', consistent with other lib/crypto/ APIs. The existing 'struct blake2b_state' will be removed once the blake2b crypto_shash algorithms are updated to stop using it.
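Since the API is closely modeled after the BLAKE2s one, a caller could plausibly look like the sketch below. Only 'struct blake2b_ctx' is confirmed by the text above; the header name and the init/update/final function names are assumptions.

  #include <linux/types.h>
  #include <crypto/blake2b.h>           /* assumed header */

  /* Hedged sketch: the init/update/final names are assumptions. */
  static void blake2b_256_example(const u8 *data, size_t len, u8 out[32])
  {
          struct blake2b_ctx ctx;       /* 'struct blake2b_ctx' per the text above */

          blake2b_init(&ctx, 32);       /* unkeyed, 32-byte digest */
          blake2b_update(&ctx, data, len);
          blake2b_final(&ctx, out);
  }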
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251018043106.375964-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

68546e56 | 06-Sep-2025 | Eric Biggers <ebiggers@kernel.org>

lib/crypto: curve25519: Consolidate into single module
Reorganize the Curve25519 library code:
- Build a single libcurve25519 module, instead of up to three modules: libcurve25519, libcurve25519-generic, and an arch-specific module.
- Move the arch-specific Curve25519 code from arch/$(SRCARCH)/crypto/ to lib/crypto/$(SRCARCH)/. Centralize the build rules into lib/crypto/Makefile and lib/crypto/Kconfig.
- Include the arch-specific code directly in lib/crypto/curve25519.c via a header, rather than using a separate .c file.
- Eliminate the entanglement with CRYPTO. CRYPTO_LIB_CURVE25519 no longer selects CRYPTO, and the arch-specific Curve25519 code no longer depends on CRYPTO.
This brings Curve25519 in line with the latest conventions for lib/crypto/, used by other algorithms. The exception is that I kept the generic code in separate translation units for now. (Some of the function names collide between the x86 and generic Curve25519 code. And the Curve25519 functions are very long anyway, so inlining doesn't matter as much for Curve25519 as it does for some other algorithms.)
Link: https://lore.kernel.org/r/20250906213523.84915-11-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

afc4e4a5 | 06-Sep-2025 | Eric Biggers <ebiggers@kernel.org>

lib/crypto: tests: Migrate Curve25519 self-test to KUnit
Move the Curve25519 test from an ad-hoc self-test to a KUnit test.
Generally keep the same test logic for now, just translated to KUnit. There's one exception, which is that I dropped the incomplete test of curve25519_generic(). The approach I'm taking to cover the different implementations with the KUnit tests is to just rely on booting kernels in QEMU with different '-cpu' options, rather than try to make the tests (incompletely) test multiple implementations on one CPU. This way, both the test and the library API are simpler.
This commit makes the file lib/crypto/curve25519.c no longer needed, as its only purpose was to call the self-test. However, keep it for now, since a later commit will add code to it again.
Temporarily omit the default value of CRYPTO_SELFTESTS that the other lib/crypto/ KUnit tests have. It would cause a recursive kconfig dependency, since the Curve25519 code is still entangled with CRYPTO. A later commit will fix that.
Link: https://lore.kernel.org/r/20250906213523.84915-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>