Revision tags: release/14.0.0 |
|
#
f4fc317c |
| 25-Sep-2023 |
Robert Clausecker <fuz@FreeBSD.org> |
lib/libc/amd64/string: implement strpbrk() through strcspn()
This lets us use our optimised strcspn() routine for strpbrk() calls.
Sponsored by: The FreeBSD Foundation
Tested by: developers@, exp-run
Approved by: mjg
MFC after: 1 month
MFC to: stable/14
PR: 275785
Differential Revision: https://reviews.freebsd.org/D41980
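The reduction described in this commit can be sketched in portable C. This is only an illustration of the idea, not the committed amd64 assembly; the name `sketch_strpbrk` is hypothetical and chosen to avoid clashing with the libc symbol.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Illustrative only: strpbrk() expressed through strcspn().
 * sketch_strpbrk is a hypothetical name, not a libc function.
 */
static char *
sketch_strpbrk(const char *s, const char *charset)
{
	size_t n;

	assert(s != NULL && charset != NULL);

	/* Length of the leading part of s that has no byte from charset. */
	n = strcspn(s, charset);

	/* Stopping on the terminator means no byte of s is in charset. */
	return (s[n] == '\0' ? NULL : (char *)(s + n));
}
```

For example, `sketch_strpbrk("hello, world", " ,")` yields a pointer to the comma at offset 5, while a set with no matching character yields `NULL`.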
|
#
c91cd7d0 |
| 19-Dec-2023 |
Robert Clausecker <fuz@FreeBSD.org> |
lib/libc/amd64/string/strcspn.S: always return earliest match in 17--32 char case
When matching against a set of 17--32 characters, strcspn() uses two invocations of PCMPISTRI to match against the first 16 characters of the set and then the remaining characters. If a match was found in the first half of the set, the code originally immediately returned that match. However, it is possible for a match in the second half of the set to occur earlier in the vector, leading to that match being overlooked.
Fix the code by checking if there is a match in the second half of the set and taking the earlier of the two matches.
The correctness of the function has been verified with extended unit tests and test runs against the glibc test suite.
Approved by: mjg (implicit, via IRC)
MFC after: 1 week
MFC to: stable/14
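A scalar model may make the overlooked case easier to see. The sketch below is an assumption-laden stand-in, not the assembly from strcspn.S: `half_match` plays the role of one PCMPISTRI invocation against up to 16 set characters, NUL handling is left out, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK 16	/* bytes examined per step, as in one SSE vector */

/*
 * Scalar stand-in for one PCMPISTRI: index of the first chunk byte
 * that occurs in set[0..setlen), or CHUNK if there is none.
 */
static size_t
half_match(const unsigned char chunk[CHUNK], const unsigned char *set,
    size_t setlen)
{
	size_t i;

	assert(setlen <= CHUNK);
	for (i = 0; i < CHUNK; i++)
		if (memchr(set, chunk[i], setlen) != NULL)
			return (i);
	return (CHUNK);
}

/* Behaviour before the fix: a hit in the first half of the set wins. */
static size_t
match_first_half_wins(const unsigned char chunk[CHUNK],
    const unsigned char *set, size_t setlen)
{
	size_t lo = half_match(chunk, set, CHUNK);

	if (lo < CHUNK)
		return (lo);
	return (half_match(chunk, set + CHUNK, setlen - CHUNK));
}

/* Behaviour after the fix: consult both halves, keep the earlier hit. */
static size_t
match_earliest(const unsigned char chunk[CHUNK],
    const unsigned char *set, size_t setlen)
{
	size_t lo = half_match(chunk, set, CHUNK);
	size_t hi = half_match(chunk, set + CHUNK, setlen - CHUNK);

	return (lo < hi ? lo : hi);
}
```

With the 26-letter set `a`..`z` and the chunk `0123z56a89012345`, the first-half-wins variant reports offset 7 (the `a`), whereas taking the earlier of the two matches reports offset 4 (the `z`, which lies in the second half of the set).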
|
#
52d4a4d4 |
| 12-Sep-2023 |
Robert Clausecker <fuz@FreeBSD.org> |
lib/libc/amd64/string/strcspn.S: fix behaviour with sets of 17--32
When a string is matched against a set of 17--32 characters, each chunk of the string is matched first against the first 16 characters of the set and then against the remaining characters. We also check at the same time if the string has a nul byte in the current chunk, terminating the search if it does.
Due to misconceived logic, the order of checks was "first half of set, nul byte, second half of set", meaning that a match with the second half of the set was ignored when the string ended in the same 16 bytes. Reverse the order of checks to fix this problem.
Sponsored by: The FreeBSD Foundation
Approved by: mjg (blanket, via IRC)
MFC after: 1 week
MFC to: stable/14
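The effect of the check order can be modelled in scalar C. This is a rough sketch under stated assumptions rather than the code in strcspn.S: each function handles a single 16-byte chunk, returns the offset at which the scan stops, and uses hypothetical helper names. In the reordered variant a first-half hit still takes precedence over a second-half hit, which is the separate problem addressed by c91cd7d0.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK 16	/* bytes examined per step */

/* Offset of the first NUL in the chunk, or CHUNK if there is none. */
static size_t
chunk_len(const unsigned char chunk[CHUNK])
{
	const unsigned char *nul = memchr(chunk, '\0', CHUNK);

	return (nul != NULL ? (size_t)(nul - chunk) : CHUNK);
}

/* First of the len valid chunk bytes found in set[0..setlen), or CHUNK. */
static size_t
set_match(const unsigned char chunk[CHUNK], size_t len,
    const unsigned char *set, size_t setlen)
{
	size_t i;

	assert(len <= CHUNK && setlen <= CHUNK);
	for (i = 0; i < len; i++)
		if (memchr(set, chunk[i], setlen) != NULL)
			return (i);
	return (CHUNK);
}

/* Old order: first half of set, NUL byte, second half of set. */
static size_t
stop_old_order(const unsigned char chunk[CHUNK], const unsigned char *set,
    size_t setlen)
{
	size_t len = chunk_len(chunk);
	size_t lo = set_match(chunk, len, set, CHUNK);

	if (lo < CHUNK)
		return (lo);
	if (len < CHUNK)
		return (len);	/* string ends: second half never consulted */
	return (set_match(chunk, len, set + CHUNK, setlen - CHUNK));
}

/* New order: first half of set, second half of set, NUL byte. */
static size_t
stop_new_order(const unsigned char chunk[CHUNK], const unsigned char *set,
    size_t setlen)
{
	size_t len = chunk_len(chunk);
	size_t lo = set_match(chunk, len, set, CHUNK);
	size_t hi;

	if (lo < CHUNK)
		return (lo);
	hi = set_match(chunk, len, set + CHUNK, setlen - CHUNK);
	if (hi < CHUNK)
		return (hi);
	return (len);	/* CHUNK here means: continue with the next chunk */
}
```

For the string `0123z` and the set `a`..`z`, the old order stops at offset 5 (the terminator) because the `z` belongs to the second half of the set, while the new order stops at offset 4, in agreement with `strcspn()`.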
|
#
474408bb |
| 13-Aug-2023 |
Robert Clausecker <fuz@FreeBSD.org> |
lib/libc/amd64/string: add strcspn(3) scalar, x86-64-v2 implementation
This changeset adds both a scalar and an x86-64-v2 implementation of the strcspn(3) function to libc. A baseline implementation does not appear to be feasible given the requirements of the function.
The scalar implementation is similar to the generic libc implementation, but expands the bit set into a byte set to reduce latency, improving performance. This approach could probably be backported to the generic C version to benefit other platforms.
The x86-64-v2 implementation is built around the infamous pcmpistri instruction. An alternative implementation based on the Muła/Langdale algorithm [1] was prototyped, but performed worse than the pcmpistri approach except for sets of more than 16 characters with long input strings.
All implementations provide special cases for the empty set (reduces to strlen) as well as single-character sets (reduces to strchr). The x86-64-v2 kernel falls back to the scalar implementation for sets of more than 32 characters. This limit could be raised by additional multiples of 16 through the use of additional pcmpistri code paths, but I consider this case to be too rare to be of importance.
[1]: http://0x80.pl/articles/simd-byte-lookup.html
Sponsored by: The FreeBSD Foundation
Approved by: mjg
MFC after: 1 week
MFC to: stable/14
Differential Revision: https://reviews.freebsd.org/D41557
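The byte-set idea and the two special cases from this commit message can be sketched in portable C. This is an illustration under assumptions, not the committed scalar assembly: the function name `sketch_strcspn` is hypothetical, and marking NUL as a set member to simplify the scan loop is a choice made for this sketch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/*
 * Illustrative strcspn() using a byte set (one byte per character
 * value) instead of a bit set.  sketch_strcspn is a hypothetical name.
 */
static size_t
sketch_strcspn(const char *s, const char *set)
{
	unsigned char in_set[UCHAR_MAX + 1] = { 0 };
	const unsigned char *p;

	assert(s != NULL && set != NULL);

	/* Empty set: nothing can match, so the result is the length. */
	if (set[0] == '\0')
		return (strlen(s));

	/* Single-character set: one strchr() call finds the stop point. */
	if (set[1] == '\0') {
		const char *hit = strchr(s, set[0]);

		return (hit != NULL ? (size_t)(hit - s) : strlen(s));
	}

	/* Expand the set into a table indexed directly by byte value. */
	for (p = (const unsigned char *)set; *p != '\0'; p++)
		in_set[*p] = 1;

	/* Treat NUL as a member so the loop needs one test per byte. */
	in_set[0] = 1;

	for (p = (const unsigned char *)s; !in_set[*p]; p++)
		;
	return ((size_t)(p - (const unsigned char *)s));
}
```

A byte table costs 256 bytes rather than 32, but each lookup is a single load with no shift and mask, which is the latency reduction the message refers to.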
|