.\" Copyright (c) 2023 The FreeBSD Foundation
.
.\" This documentation was written by Robert Clausecker <fuz@FreeBSD.org>
.\" under sponsorship from the FreeBSD Foundation.
.
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ''AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE
.
.Dd November 14, 2023
.Dt SIMD 7
.Os
.Sh NAME
.Nm simd
.Nd SIMD enhancements
.
.Sh DESCRIPTION
On some architectures, the
.Fx
.Em libc
provides enhanced implementations of commonly used functions, replacing
the architecture-independent implementations used otherwise.
Depending on architecture and function, an enhanced
implementation of a function is either always used, or
.Em libc
detects at runtime which SIMD instruction set extensions are
supported and picks the most suitable implementation automatically.
On
.Cm amd64 ,
the environment variable
.Ev ARCHLEVEL
can be used to override this mechanism.
.Pp
Enhanced functions are present for the following architectures:
.Bl -column FUNCTION_________ aarch64_ arm_ amd64_ i386_ ppc64_ -offset indent
.It Em FUNCTION Ta Em AARCH64 Ta Em ARM Ta Em AMD64 Ta Em I386 Ta Em PPC64
.It bcmp Ta Ta Ta S1 Ta S
.It bcopy Ta Ta S Ta S Ta S Ta SV
.It bzero Ta Ta S Ta S Ta S
.It div Ta Ta Ta S Ta S
.It index Ta S Ta Ta S1
.It ldiv Ta Ta Ta S Ta S
.It lldiv Ta Ta Ta S
.It memchr Ta Ta Ta S1
.It memcmp Ta Ta S Ta S1 Ta S
.It memcpy Ta S Ta S Ta S Ta S Ta SV
.It memmove Ta S Ta S Ta S Ta S Ta SV
.It memset Ta Ta S Ta S Ta S
.It rindex Ta S Ta Ta S1 Ta S
.It stpcpy Ta Ta Ta S1
.It stpncpy Ta Ta Ta S1
.It strcat Ta Ta Ta S1 Ta S
.It strchr Ta S Ta Ta S1 Ta S
.It strchrnul Ta Ta Ta S1
.It strcmp Ta Ta S Ta S1 Ta S
.It strcpy Ta Ta Ta S1 Ta S Ta S2
.It strcspn Ta Ta Ta S2
.It strlen Ta Ta S Ta S1
.It strncmp Ta Ta S Ta S1 Ta S
.It strncpy Ta Ta Ta S1 Ta Ta S2
.It strnlen Ta Ta Ta S1
.It strrchr Ta S Ta Ta S1 Ta S
.It strpbrk Ta Ta Ta S2
.It strsep Ta Ta Ta S2
.It strspn Ta Ta Ta S2
.It swab Ta Ta Ta Ta S
.It timingsafe_bcmp Ta Ta Ta S1
.It timingsafe_memcmp Ta Ta Ta S
.It wcschr Ta Ta Ta Ta S
.It wcscmp Ta Ta Ta Ta S
.It wcslen Ta Ta Ta Ta S
.It wmemchr Ta Ta Ta Ta S
.El
.Pp
.Sy S Ns :\ scalar (non-SIMD),
.Sy 1 Ns :\ amd64 baseline,
.Sy 2 Ns :\ x86-64-v2
or PowerPC\ 2.05,
.Sy 3 Ns :\ x86-64-v3,
.Sy 4 Ns :\ x86-64-v4,
.Sy V Ns :\ PowerPC\ VSX.
.
.Sh ENVIRONMENT
.Bl -tag
.It Ev ARCHLEVEL
On
.Em amd64 ,
controls the level of SIMD enhancements used.
If this variable is set to an architecture level from the list below
and that architecture level is supported by the processor, SIMD
enhancements up to
.Ev ARCHLEVEL
are used.
If
.Ev ARCHLEVEL
is unset, not recognised, or not supported by the processor, the highest
level of SIMD enhancements supported by the processor is used.
.Pp
A suffix beginning with
.Sq ":"
or
.Sq "+"
in
.Ev ARCHLEVEL
is ignored and may be used for future extensions.
The architecture level can be prefixed with a
.Sq "!"
character to force use of the requested architecture level, even if the
processor does not advertise that it is supported.
This usually causes applications to crash and should only be used for
testing purposes or if architecture level detection yields incorrect
results.
.Pp
The architecture levels follow the AMD64 SysV ABI supplement:
.Bl -tag -width x86-64-v2
.It Cm scalar
scalar enhancements only (no SIMD)
.It Cm baseline
cmov, cx8, x87 FPU, fxsr, MMX, osfxsr, SSE, SSE2
.It Cm x86-64-v2
cx16, lahf/sahf, popcnt, SSE3, SSSE3, SSE4.1, SSE4.2
.It Cm x86-64-v3
AVX, AVX2, BMI1, BMI2, F16C, FMA, lzcnt, movbe, osxsave
.It Cm x86-64-v4
AVX-512F/BW/CD/DQ/VL
.El
.El
.
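.Sh EXAMPLES
The following invocations are a minimal sketch of how
.Ev ARCHLEVEL
might be set for a single command on
.Em amd64 .
They assume
.Xr sh 1
syntax, and
.Pa ./prog
is a hypothetical placeholder for any program, not a utility shipped with
.Fx .
.Pp
Limit
.Em libc
to enhancements up to the x86-64-v2 level:
.Bd -literal -offset indent
env ARCHLEVEL=x86-64-v2 ./prog
.Ed
.Pp
Use scalar enhancements only, for instance to compare behaviour against
the SIMD implementations:
.Bd -literal -offset indent
env ARCHLEVEL=scalar ./prog
.Ed
.Pp
Force the x86-64-v4 level even if the processor does not advertise it.
As described in
.Sx ENVIRONMENT ,
this usually causes the program to crash and is meant for testing only:
.Bd -literal -offset indent
env ARCHLEVEL='!x86-64-v4' ./prog
.Ed
.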
.Sh DIAGNOSTICS
.Bl -diag
.It "Illegal Instruction"
Printed by
.Xr sh 1
if a command is terminated through delivery of a
.Dv SIGILL
signal, see
.Xr signal 3 .
.Pp
Use of an unsupported architecture level was forced by setting
.Ev ARCHLEVEL
to a string beginning with a
.Sq "!"
character, causing a process to crash due to use of an unsupported
instruction.
Unset
.Ev ARCHLEVEL ,
remove the
.Sq "!"
prefix or select a supported architecture level.
.Pp
This message may also appear for unrelated reasons.
.El
.
.Sh SEE ALSO
.Xr string 3 ,
.Xr arch 7
.Rs
.%A H. J. Lu
.%A Michael Matz
.%A Milind Girkar
.%A Jan Hubi\[u010D]ka \" \(vc
.%A Andreas Jaeger
.%A Mark Mitchell
.%B System V Application Binary Interface
.%D May 23, 2023
.%T AMD64 Architecture Processor Supplement
.%O Version 1.0
.Re
.
.Sh HISTORY
Architecture-specific enhanced
.Em libc
functions were added starting
with
.Fx 2.0
for
.Cm i386 ,
.Fx 6.0
for
.Cm arm ,
.Fx 6.1
for
.Cm amd64 ,
.Fx 11.0
for
.Cm aarch64 ,
and
.Fx 12.0
for
.Cm powerpc64 .
SIMD-enhanced functions were first added with
.Fx 13.0
for
.Cm powerpc64
and with
.Fx 14.1
for
.Cm amd64 .
.Pp
A
.Nm
manual page appeared in
.Fx 14.1 .
.
.Sh AUTHOR
.An Robert Clausecker Aq Mt fuz@FreeBSD.org
.
.Sh CAVEATS
Other parts of
.Fx
such as cryptographic routines in the kernel or in
OpenSSL may also use SIMD enhancements.
These enhancements are not subject to the
.Ev ARCHLEVEL
variable and may have their own configuration
mechanism.
.
.Sh BUGS
Use of SIMD enhancements cannot be configured on powerpc64.