.\" Copyright (c) 2023 The FreeBSD Foundation
.
.\" This documentation was written by Robert Clausecker <fuz@FreeBSD.org>
.\" under sponsorship from the FreeBSD Foundation.
.
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ''AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE
.
.Dd July 3, 2023
.Dt SIMD 7
.Os
.Sh NAME
.Nm simd
.Nd SIMD enhancements
.
.Sh DESCRIPTION
On some architectures, the
.Fx
.Em libc
provides enhanced implementations of commonly used functions, replacing
the architecture-independent implementations used otherwise.
Depending on architecture and function, an enhanced
implementation of a function is either always used, or
.Em libc
detects at runtime which SIMD instruction set extensions are
supported and picks the most suitable implementation automatically.
On
.Cm amd64 ,
the environment variable
.Ev ARCHLEVEL
can be used to override this mechanism.
.Pp
Enhanced functions are present in the following architectures:
.Bl -column FUNCTION_ aarch64_ arm_ amd64_ i386_ ppc64_ -offset indent
.It Em FUNCTION Ta Em AARCH64 Ta Em ARM Ta Em AMD64 Ta Em I386 Ta Em PPC64
.It bcmp Ta Ta Ta S Ta S
.It bcopy Ta Ta S Ta S Ta S Ta SV
.It bzero Ta Ta S Ta S Ta S
.It div Ta Ta Ta S Ta S
.It index Ta S
.It ldiv Ta Ta Ta S Ta S
.It lldiv Ta Ta Ta S
.It memcmp Ta Ta S Ta S Ta S
.It memcpy Ta S Ta S Ta S Ta S Ta SV
.It memmove Ta S Ta S Ta S Ta S Ta SV
.It memset Ta Ta S Ta S Ta S
.It rindex Ta S
.It stpcpy Ta Ta Ta S
.It strcat Ta Ta Ta S Ta S
.It strchr Ta S Ta Ta Ta S
.It strcmp Ta Ta S Ta S Ta S
.It strcpy Ta Ta Ta S Ta S Ta S2
.It strlen Ta Ta S Ta S134
.It strncmp Ta Ta S Ta Ta S
.It strncpy Ta Ta Ta Ta Ta S2
.It strrchr Ta S Ta Ta Ta S
.It swab Ta Ta Ta Ta S
.It wcschr Ta Ta Ta Ta S
.It wcscmp Ta Ta Ta Ta S
.It wcslen Ta Ta Ta Ta S
.It wmemchr Ta Ta Ta Ta S
.El
.Pp
.Sy S Ns :\ scalar (non-SIMD),
.Sy 1 Ns :\ amd64 baseline,
.Sy 2 Ns :\ x86-64-v2
or PowerPC\ 2.05,
.Sy 3 Ns :\ x86-64-v3,
.Sy 4 Ns :\ x86-64-v4,
.Sy V Ns :\ PowerPC\ VSX.
.
.Sh ENVIRONMENT
.Bl -tag
.It Ev ARCHLEVEL
On
.Em amd64 ,
controls the level of SIMD enhancements used.
If this variable is set to an architecture level from the list below
and that architecture level is supported by the processor, SIMD
enhancements up to
.Ev ARCHLEVEL
are used.
If
.Ev ARCHLEVEL
is unset, not recognised, or not supported by the processor, the highest
level of SIMD enhancements supported by the processor is used.
.Pp
A suffix beginning with
.Sq ":"
or
.Sq "+"
in
.Ev ARCHLEVEL
is ignored and may be used for future extensions.
The architecture level can be prefixed with a
.Sq "!"
character to force use of the requested architecture level, even if the
processor does not advertise that it is supported.
This usually causes applications to crash and should only be used for
testing purposes or if architecture level detection yields incorrect
results.
.Pp
The architecture levels follow the AMD64 SysV ABI supplement:
.Bl -tag -width x86-64-v2
.It Cm scalar
scalar enhancements only (no SIMD)
.It Cm baseline
cmov, cx8, x87 FPU, fxsr, MMX, osfxsr, SSE, SSE2
.It Cm x86-64-v2
cx16, lahf/sahf, popcnt, SSE3, SSSE3, SSE4.1, SSE4.2
.It Cm x86-64-v3
AVX, AVX2, BMI1, BMI2, F16C, FMA, lzcnt, movbe, osxsave
.It Cm x86-64-v4
AVX-512F/BW/CD/DQ/VL
.El
.El
.
.Sh DIAGNOSTICS
.Bl -diag
.It "Illegal Instruction"
Printed by
.Xr sh 1
if a command is terminated through delivery of a
.Dv SIGILL
signal, see
.Xr signal 3 .
.Pp
Use of an unsupported architecture level was forced by setting
.Ev ARCHLEVEL
to a string beginning with a
.Sq "!"
character, causing a process to crash due to use of an unsupported
instruction.
Unset
.Ev ARCHLEVEL ,
remove the
.Sq "!"
prefix or select a supported architecture level.
.Pp
This message may also appear for unrelated reasons.
.El
.
.Sh SEE ALSO
.Xr string 3 ,
.Xr arch 7
.Rs
.%A H. J. Lu
.%A Michael Matz
.%A Milind Girkar
.%A Jan Hubi\[u010D]ka \" \(vc
.%A Andreas Jaeger
.%A Mark Mitchell
.%B System V Application Binary Interface
.%D May 23, 2023
.%T AMD64 Architecture Processor Supplement
.%O Version 1.0
.Re
.
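.Sh EXAMPLES
The rules given under
.Sx ENVIRONMENT
can be sketched in C as follows.
This is an illustration only and is not the code used by
.Em libc ;
the names
.Fn select_level
and
.Va level_names
are hypothetical, and the highest level supported by the processor is
assumed to be passed in by the caller as an index into the table.
.Bd -literal -offset indent
```c
#include <stddef.h>
#include <string.h>

/* Architecture levels in ascending order, as named in this page. */
static const char *const level_names[] = {
	"scalar", "baseline", "x86-64-v2", "x86-64-v3", "x86-64-v4",
};
#define NLEVELS (sizeof(level_names) / sizeof(level_names[0]))

/*
 * Hypothetical helper: return the index of the level to use, given
 * the value of ARCHLEVEL (NULL if unset) and the index of the
 * highest level the processor supports.
 */
static int
select_level(const char *value, int supported)
{
	size_t len, i;
	int forced = 0;

	if (value == NULL)
		return (supported);
	if (*value == '!') {		/* "!" prefix forces the level */
		forced = 1;
		value++;
	}
	len = strcspn(value, ":+");	/* ignore a ":" or "+" suffix */
	for (i = 0; i < NLEVELS; i++) {
		if (strlen(level_names[i]) == len &&
		    strncmp(value, level_names[i], len) == 0) {
			if (forced || (int)i <= supported)
				return ((int)i);
			break;		/* recognised but unsupported */
		}
	}
	return (supported);		/* fall back to the detected level */
}
```
.Ed
.Pp
With a processor supporting up to
.Cm x86-64-v3 ,
a value of
.Ql x86-64-v4
falls back to level 3 in this sketch, while
.Ql !x86-64-v4
forces level 4.
.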
.Sh HISTORY
Architecture-specific enhanced
.Em libc
functions were added starting
with
.Fx 2.0
for
.Cm i386 ,
.Fx 6.0
for
.Cm arm ,
.Fx 6.1
for
.Cm amd64 ,
.Fx 11.0
for
.Cm aarch64 ,
and
.Fx 12.0
for
.Cm powerpc64 .
SIMD-enhanced functions were first added with
.Fx 13.0
for
.Cm powerpc64
and with
.Fx 14.0
for
.Cm amd64 .
.Pp
A
.Nm
manual page appeared in
.Fx 14.0 .
.
.Sh AUTHORS
.An Robert Clausecker Aq Mt fuz@FreeBSD.org
.
.Sh CAVEATS
Other parts of
.Fx
such as cryptographic routines in the kernel or in
OpenSSL may also use SIMD enhancements.
These enhancements are not subject to the
.Ev ARCHLEVEL
variable and may have their own configuration
mechanism.
.
.Sh BUGS
Use of SIMD enhancements cannot be configured on powerpc64.