| 30acc842 | 29-Jul-2025 |
Robert Clausecker <fuz@FreeBSD.org> |
libc/amd64: rewrite memrchr() scalar impl. to read the string from the back
A very simple implementation as I don't have the patience right now to write a full SWAR kernel. Should still do the tric
libc/amd64: rewrite memrchr() scalar impl. to read the string from the back
A very simple implementation as I don't have the patience right now to write a full SWAR kernel. Should still do the trick if you wish to opt out of SSE for some reason.
Reported by: Mikael Simonsson <m@mikaelsimonsson.com> Reviewed by: strajabot PR: 288321 MFC after: 1 month
show more ...
|
| cdecda8d | 15-Nov-2023 |
Brooks Davis <brooks@FreeBSD.org> |
libc: move rfork_thread(3) to libsys
rfork_thread(3) is assembly that makes syscalls directly and uses cerror so it belongs in libsys.
Reviewed by: kib, emaste, imp Pull Request: https://github.com
libc: move rfork_thread(3) to libsys
rfork_thread(3) is assembly that makes syscalls directly and uses cerror so it belongs in libsys.
Reviewed by: kib, emaste, imp Pull Request: https://github.com/freebsd/freebsd-src/pull/908
show more ...
|
| 31a46e2c | 14-Nov-2023 |
Brooks Davis <brooks@FreeBSD.org> |
libc: Move per-arch sys/Makefile.inc to libsys
libc/<arch>/sys/Makefile.inc -> libsys/<arch>/Makefile.sys.
Require that libsys/<arch>/Makefile.sys exist. At least for current archtiectures, it's n
libc: Move per-arch sys/Makefile.inc to libsys
libc/<arch>/sys/Makefile.inc -> libsys/<arch>/Makefile.sys.
Require that libsys/<arch>/Makefile.sys exist. At least for current archtiectures, it's not possible for an architecture to not have and MD syscall bits.
powerpcspe/Makefile.sys's structure means it had to be modified when moved so rename detection won't work, but it has trivial contents so the history is unimportant.
Reviewed by: kib, emaste, imp Pull Request: https://github.com/freebsd/freebsd-src/pull/908
show more ...
|
| fb197a4f | 06-Dec-2023 |
Robert Clausecker <fuz@FreeBSD.org> |
lib/libc/amd64/string: add memrchr() scalar, baseline implementation
The scalar implementation is fairly simplistic and only performs slightly better than the generic C implementation. It could be i
lib/libc/amd64/string: add memrchr() scalar, baseline implementation
The scalar implementation is fairly simplistic and only performs slightly better than the generic C implementation. It could be improved by using the same algorithm as for memchr, but it would have been a lot more complicated.
The baseline implementation is similar to timingsafe_memcmp. It's slightly slower than memchr() due to the more complicated main loop, but I don't think that can be significantly improved.
Tested by: developers@, exp-run Approved by: mjg MFC after: 1 month MFC to: stable/14 PR: 275785 Differential Revision: https://reviews.freebsd.org/D42925
show more ...
|
| ea7b1377 | 04-Dec-2023 |
Robert Clausecker <fuz@FreeBSD.org> |
lib/libc/amd64/string: implement strncat() by calling strlen(), memccpy()
This picks up the accelerated implementation of memccpy().
Tested by: developers@, exp-run Approved by: mjg MFC after: 1 mo
lib/libc/amd64/string: implement strncat() by calling strlen(), memccpy()
This picks up the accelerated implementation of memccpy().
Tested by: developers@, exp-run Approved by: mjg MFC after: 1 month MFC to: stable/14 PR: 275785 Differential Revision: https://reviews.freebsd.org/D42902
show more ...
|
| fc0e38a7 | 02-Dec-2023 |
Robert Clausecker <fuz@FreeBSD.org> |
lib/libc/amd64/string: add memccpy scalar, baseline implementation
Based on the strlcpy code from D42863, this patch adds a SIMD-enhanced implementation of memccpy for amd64. A scalar implementation
lib/libc/amd64/string: add memccpy scalar, baseline implementation
Based on the strlcpy code from D42863, this patch adds a SIMD-enhanced implementation of memccpy for amd64. A scalar implementation calling into memchr and memcpy to do the job is provided, too.
Please note that this code does not behave exactly the same as the C implementation of memccpy for overlapping inputs. However, overlapping inputs are not allowed for this function by ISO/IEC 9899:1999 and neither has the C implementation any code to deal with the possibility. It just proceeds byte-by-byte, which may or may not do the expected thing for some overlaps. We do not document whether overlapping inputs are supported in memccpy(3).
Tested by: developers@, exp-run Approved by: mjg MFC after: 1 month MFC to: stable/14 PR: 275785 Differential Revision: https://reviews.freebsd.org/D42902
show more ...
|
| 2b7b03b7 | 29-Nov-2023 |
Robert Clausecker <fuz@FreeBSD.org> |
lib/libc/amd64/string: implement strlcat() through strlcpy()
This should pick up our optimised memchr(), strlen(), and strlcpy() when strlcat() is called.
Tested by: developers@, exp-run Approved b
lib/libc/amd64/string: implement strlcat() through strlcpy()
This should pick up our optimised memchr(), strlen(), and strlcpy() when strlcat() is called.
Tested by: developers@, exp-run Approved by: mjg MFC after: 1 month MFC to: stable/14 PR: 275785 Differential Revision: https://reviews.freebsd.org/D42863
show more ...
|
| 74d6cfad | 12-Nov-2023 |
Robert Clausecker <fuz@FreeBSD.org> |
lib/libc/amd64/string: add strlcpy scalar, baseline implementation
Somewhat similar to stpncpy, but different in that we need to compute the full source length even if the buffer is shorter than the
lib/libc/amd64/string: add strlcpy scalar, baseline implementation
Somewhat similar to stpncpy, but different in that we need to compute the full source length even if the buffer is shorter than the source.
strlcat is implemented as a simple wrapper around strlcpy. The scalar implementation of strlcpy just calls into strlen() and memcpy() to do the job.
Perf-wise we're very close to stpncpy. The code is slightly slower as it needs to carry on with finding the source string length even if the buffer ends before the string.
Sponsored by: The FreeBSD Foundation Tested by: developers@, exp-run Approved by: mjg MFC after: 1 month MFC to: stable/14 PR: 275785 Differential Revision: https://reviews.freebsd.org/D42863
show more ...
|
| aff9143a | 14-Nov-2023 |
Robert Clausecker <fuz@FreeBSD.org> |
lib/libc/amd64/string/strcat.S: enable use of SIMD
strcat has a bespoke scalar assembly implementation we inherited from NetBSD. While it performs well, it is better to call into our SIMD implement
lib/libc/amd64/string/strcat.S: enable use of SIMD
strcat has a bespoke scalar assembly implementation we inherited from NetBSD. While it performs well, it is better to call into our SIMD implementations if any SIMD features are available at all. So do that and implement strcat() by calling into strlen() and strcpy() if these are available.
Sponsored by: The FreeBSD Foundation Tested by: developers@, exp-run Approved by: mjg MFC after: 1 month MFC to: stable/14 PR: 275785 Differential Reviison: https://reviews.freebsd.org/D42600
show more ...
|
| e19d46c8 | 09-Nov-2023 |
Robert Clausecker <fuz@FreeBSD.org> |
lib/libc/amd64/string: implement strncpy() by calling stpncpy()
Sponsored by: The FreeBSD Foundation Tested by: developers@, exp-run Approved by: mjg MFC after: 1 month MFC to: stable/14 PR: 27578
lib/libc/amd64/string: implement strncpy() by calling stpncpy()
Sponsored by: The FreeBSD Foundation Tested by: developers@, exp-run Approved by: mjg MFC after: 1 month MFC to: stable/14 PR: 275785 Differential Revision: https://reviews.freebsd.org/D42519
show more ...
|
| 90253d49 | 30-Oct-2023 |
Robert Clausecker <fuz@FreeBSD.org> |
lib/libc/amd64/string: add stpncpy scalar, baseline implementation
This was surprisingly annoying to get right, despite being such a simple function. A scalar implementation is also provided, it ju
lib/libc/amd64/string: add stpncpy scalar, baseline implementation
This was surprisingly annoying to get right, despite being such a simple function. A scalar implementation is also provided, it just calls into our optimised memchr(), memcpy(), and memset() routines to carry out its job.
I'm quite happy with the performance. glibc only beats us for very long strings, likely due to the use of AVX-512. The scalar implementation just calls into our optimised memchr(), memcpy(), and memset() routines, so it has a high overhead to begin with but then performs ok for the amount of effort that went into it. Still beats the old C code, except for very short strings.
Sponsored by: The FreeBSD Foundation Tested by: developers@, exp-run Approved by: mjg MFC after: 1 month MFC to: stable/14 PR: 275785 Differential Revision: https://reviews.freebsd.org/D42519
show more ...
|