.\" Copyright (c) 2023 The FreeBSD Foundation
.
.\" This documentation was written by Robert Clausecker <fuz@FreeBSD.org>
.\" under sponsorship from the FreeBSD Foundation.
.
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ''AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE
.
.Dd June 7, 2024
.Dt SIMD 7
.Os
.Sh NAME
.Nm simd
.Nd SIMD enhancements
.
.Sh DESCRIPTION
On some architectures, the
.Fx
.Em libc
provides enhanced implementations of commonly used functions, replacing
the architecture-independent implementations used otherwise.
Depending on architecture and function, an enhanced
implementation of a function is either always used, or the
.Em libc
detects at runtime which SIMD instruction set extensions are
supported and picks the most suitable implementation automatically.
On
.Cm amd64 ,
the environment variable
.Ev ARCHLEVEL
can be used to override this mechanism.
.Pp
Enhanced functions are present for the following architectures:
.Bl -column FUNCTION_________ aarch64_ arm_ amd64_ i386_ ppc64_ -offset indent
.It Em FUNCTION Ta Em AARCH64 Ta Em ARM Ta Em AMD64 Ta Em I386 Ta Em PPC64
.It bcmp Ta Ta Ta S1 Ta S
.It bcopy Ta Ta S Ta S Ta S Ta SV
.It bzero Ta Ta S Ta S Ta S
.It div Ta Ta Ta S Ta S
.It index Ta A Ta Ta S1
.It ldiv Ta Ta Ta S Ta S
.It lldiv Ta Ta Ta S
.It memchr Ta A Ta Ta S1
.It memcmp Ta A Ta S Ta S1 Ta S
.It memccpy Ta Ta Ta S1
.It memcpy Ta S Ta S Ta S Ta S Ta SV
.It memmove Ta S Ta S Ta S Ta S Ta SV
.It memrchr Ta A Ta Ta S1
.It memset Ta A Ta S Ta S Ta S
.It rindex Ta A Ta Ta S1 Ta S
.It stpcpy Ta A Ta Ta S1
.It stpncpy Ta Ta Ta S1
.It strcat Ta Ta Ta S1 Ta S
.It strchr Ta A Ta Ta S1 Ta S
.It strchrnul Ta A Ta Ta S1
.It strcmp Ta S Ta S Ta S1 Ta S
.It strcpy Ta A Ta Ta S1 Ta S Ta S2
.It strcspn Ta Ta Ta S2
.It strlcat Ta Ta Ta S1
.It strlcpy Ta Ta Ta S1
.It strlen Ta A Ta S Ta S1
.It strncat Ta Ta Ta S1
.It strncmp Ta S Ta S Ta S1 Ta S
.It strncpy Ta Ta Ta S1 Ta Ta S2
.It strnlen Ta A Ta Ta S1
.It strrchr Ta A Ta Ta S1 Ta S
.It strpbrk Ta Ta Ta S2
.It strsep Ta Ta Ta S2
.It strspn Ta Ta Ta S2
.It swab Ta Ta Ta Ta S
.It timingsafe_bcmp Ta Ta Ta S1
.It timingsafe_memcmp Ta Ta Ta S
.It wcschr Ta Ta Ta Ta S
.It wcscmp Ta Ta Ta Ta S
.It wcslen Ta Ta Ta Ta S
.It wmemchr Ta Ta Ta Ta S
.El
.Pp
.Sy S Ns :\ scalar (non-SIMD),
.Sy 1 Ns :\ amd64 baseline,
.Sy 2 Ns :\ x86-64-v2
or PowerPC\ 2.05,
.Sy 3 Ns :\ x86-64-v3,
.Sy 4 Ns :\ x86-64-v4,
.Sy V Ns :\ PowerPC\ VSX,
.Sy A Ns :\ Arm\ ASIMD (NEON).
.
.Sh ENVIRONMENT
.Bl -tag
.It Ev ARCHLEVEL
On
.Cm amd64 ,
controls the level of SIMD enhancements used.
If this variable is set to an architecture level from the list below
and that architecture level is supported by the processor, SIMD
enhancements up to
.Ev ARCHLEVEL
are used.
If
.Ev ARCHLEVEL
is unset, not recognised, or not supported by the processor, the highest
level of SIMD enhancements supported by the processor is used.
.Pp
A suffix beginning with
.Sq ":"
or
.Sq "+"
in
.Ev ARCHLEVEL
is ignored and may be used for future extensions.
The architecture level can be prefixed with a
.Sq "!"
character to force use of the requested architecture level, even if the
processor does not advertise that it is supported.
This usually causes applications to crash and should only be used for
testing purposes or if architecture level detection yields incorrect
results.
.Pp
The architecture levels follow the AMD64 SysV ABI supplement:
.Bl -tag -width x86-64-v2
.It Cm scalar
scalar enhancements only (no SIMD)
.It Cm baseline
cmov, cx8, x87 FPU, fxsr, MMX, osfxsr, SSE, SSE2
.It Cm x86-64-v2
cx16, lahf/sahf, popcnt, SSE3, SSSE3, SSE4.1, SSE4.2
.It Cm x86-64-v3
AVX, AVX2, BMI1, BMI2, F16C, FMA, lzcnt, movbe, osxsave
.It Cm x86-64-v4
AVX-512F/BW/CD/DQ/VL
.El
.El
.
.Sh DIAGNOSTICS
.Bl -diag
.It "Illegal Instruction"
Printed by
.Xr sh 1
if a command is terminated through delivery of a
.Dv SIGILL
signal, see
.Xr signal 3 .
.Pp
Use of an unsupported architecture level was forced by setting
.Ev ARCHLEVEL
to a string beginning with a
.Sq "!"
character, causing a process to crash due to use of an unsupported
instruction.
Unset
.Ev ARCHLEVEL ,
remove the
.Sq "!"
prefix, or select a supported architecture level.
.Pp
This message may also appear for unrelated reasons.
.El
.
.Sh SEE ALSO
.Xr string 3 ,
.Xr arch 7
.Rs
.%A H. J. Lu
.%A Michael Matz
.%A Milind Girkar
.%A Jan Hubi\[u010D]ka \" \(vc
.%A Andreas Jaeger
.%A Mark Mitchell
.%B System V Application Binary Interface
.%D May 23, 2023
.%T AMD64 Architecture Processor Supplement
.%O Version 1.0
.Re
.
.Sh HISTORY
Architecture-specific enhanced
.Em libc
functions were added starting
with
.Fx 2.0
for
.Cm i386 ,
.Fx 6.0
for
.Cm arm ,
.Fx 6.1
for
.Cm amd64 ,
.Fx 11.0
for
.Cm aarch64 ,
and
.Fx 12.0
for
.Cm powerpc64 .
SIMD-enhanced functions were first added with
.Fx 13.0
for
.Cm powerpc64
and with
.Fx 14.1
for
.Cm amd64 .
.Pp
A
.Nm
manual page appeared in
.Fx 14.1 .
.
.Sh AUTHORS
.An Robert Clausecker Aq Mt fuz@FreeBSD.org
.
.Sh CAVEATS
Other parts of
.Fx
such as cryptographic routines in the kernel or in
OpenSSL may also use SIMD enhancements.
These enhancements are not subject to the
.Ev ARCHLEVEL
variable and may have their own configuration
mechanism.
.
.Sh BUGS
Use of SIMD enhancements cannot be configured on
.Cm powerpc64 .