#
5e7d93a6 |
| 26-Aug-2024 |
Getz Mikalsen <getz@FreeBSD.org> |
lib/libc/aarch64/string: add strcmp SIMD implementation
This changeset ports the amd64 SIMD implementation of strcmp to AArch64.
The method, as described in D41971, is as follows.
The basic idea is to process the bulk of the string in aligned blocks of 16 bytes such that one string runs ahead and the other runs behind. The string that runs ahead is checked for NUL bytes, the one that runs behind is compared with the corresponding chunk of the string that runs ahead. This trades an extra load per iteration for the very complicated block-reassembly needed in the other implementations (bionic, glibc). On the flip side, we need two code paths depending on the relative alignment of the two buffers.
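As a rough illustration of the NUL-check-then-compare step only, here is a scalar C model. It is a sketch, not the committed assembly: the name strcmp_chunked_model is hypothetical, it scans bytes instead of using SIMD loads, and it ignores alignment entirely, so it does not show the run-ahead/run-behind pairing or the two code paths.

```c
#include <assert.h>
#include <stddef.h>

#define CHUNK 16 /* block size named in the commit message */

/* Hypothetical scalar model: per 16-byte block, first look for NUL in one
 * string, then compare the other string against that block. */
static int strcmp_chunked_model(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    const unsigned char *pa = (const unsigned char *)a;
    const unsigned char *pb = (const unsigned char *)b;

    for (;;) {
        /* Step 1: count the non-NUL bytes at the start of a's block. */
        size_t n = 0;
        while (n < CHUNK && pa[n] != '\0')
            n++;

        /* Step 2: compare b against the block, including a's terminator
         * if one was found. Stopping at the first mismatch means b is
         * never read past its own terminator. */
        size_t limit = (n < CHUNK) ? n + 1 : CHUNK;
        for (size_t i = 0; i < limit; i++) {
            if (pa[i] != pb[i])
                return (int)pa[i] - (int)pb[i];
        }

        if (n < CHUNK)
            return 0; /* terminator reached and all bytes matched */

        pa += CHUNK;
        pb += CHUNK;
    }
}
```

The real routine performs both steps with vector instructions on whole blocks; the model only shows why the string that is checked for NUL bounds how far the other may safely be compared.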
The initial part of the string is compared directly if it is known not to cross a page boundary. Otherwise, a more complex slow path is taken to avoid reading into unmapped memory.
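The page-boundary test can be sketched in C as below. This is an assumption-laden illustration: the helper name load_crosses_page and the 4096-byte page size are not taken from the commit (AArch64 systems may use larger pages, for which a 4 KiB check is merely conservative).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_ASSUMED 4096u /* assumption: smallest page size in use */

/* Hypothetical helper: would a load of len bytes starting at addr touch
 * the page that follows the one containing addr? */
static bool load_crosses_page(uintptr_t addr, size_t len)
{
    assert(len > 0 && len <= PAGE_SIZE_ASSUMED);
    uintptr_t offset = addr & (PAGE_SIZE_ASSUMED - 1);
    return offset + len > PAGE_SIZE_ASSUMED;
}
```

For a 16-byte load, only the last 15 starting offsets within a page report a crossing, which is consistent with the direct comparison being the common case.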
Performance is better in most cases than the existing implementation from the Arm Optimized Routines repository.
See the Differential Revision for benchmark results.
Tested by: fuz (exprun)
Reviewed by: fuz, emaste
Sponsored by: Google LLC (GSoC 2024)
PR: 281175
Differential Revision: https://reviews.freebsd.org/D45839
|