.\" Copyright (c) 2023 The FreeBSD Foundation
.
.\" This documentation was written by Robert Clausecker <fuz@FreeBSD.org>
.\" under sponsorship from the FreeBSD Foundation.
.
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ''AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE
.
.Dd July 3, 2023
.Dt SIMD 7
.Os
.Sh NAME
.Nm simd
.Nd SIMD enhancements
.
.Sh DESCRIPTION
On some architectures, the
.Fx
.Em libc
provides enhanced implementations of commonly used functions, replacing
the architecture-independent implementations used otherwise.
Depending on architecture and function, an enhanced implementation is
either always used, or
.Em libc
detects at runtime which SIMD instruction set extensions are
supported and picks the most suitable implementation automatically.
On
.Cm amd64 ,
the environment variable
.Ev ARCHLEVEL
can be used to override this mechanism.
.Pp
Enhanced functions are present on the following architectures:
.Bl -column FUNCTION_ aarch64_ arm_ amd64_ i386_ ppc64_ -offset indent
.It Em FUNCTION Ta Em AARCH64 Ta Em ARM Ta Em AMD64  Ta Em I386 Ta Em PPC64
.It    bcmp     Ta            Ta        Ta    S      Ta    S
.It    bcopy    Ta            Ta    S   Ta    S      Ta    S    Ta    SV
.It    bzero    Ta            Ta    S   Ta    S      Ta    S
.It    div      Ta            Ta        Ta    S      Ta    S
.It    index    Ta    S
.It    ldiv     Ta            Ta        Ta    S      Ta    S
.It    lldiv    Ta            Ta        Ta    S
.It    memcmp   Ta            Ta    S   Ta    S      Ta    S
.It    memcpy   Ta    S       Ta    S   Ta    S      Ta    S    Ta    SV
.It    memmove  Ta    S       Ta    S   Ta    S      Ta    S    Ta    SV
.It    memset   Ta            Ta    S   Ta    S      Ta    S
.It    rindex   Ta    S
.It    stpcpy   Ta            Ta        Ta    S
.It    strcat   Ta            Ta        Ta    S      Ta    S
.It    strchr   Ta    S       Ta        Ta           Ta    S
.It    strcmp   Ta            Ta    S   Ta    S      Ta    S
.It    strcpy   Ta            Ta        Ta    S      Ta    S    Ta    S2
.It    strlen   Ta            Ta    S   Ta    S134
.It    strncmp  Ta            Ta    S   Ta           Ta    S
.It    strncpy  Ta            Ta        Ta           Ta         Ta    S2
.It    strrchr  Ta    S       Ta        Ta           Ta    S
.It    swab     Ta            Ta        Ta           Ta    S
.It    wcschr   Ta            Ta        Ta           Ta    S
.It    wcscmp   Ta            Ta        Ta           Ta    S
.It    wcslen   Ta            Ta        Ta           Ta    S
.It    wmemchr  Ta            Ta        Ta           Ta    S
.El
.Pp
.Sy S Ns :\ scalar (non-SIMD),
.Sy 1 Ns :\ amd64 baseline,
.Sy 2 Ns :\ x86-64-v2
or PowerPC\ 2.05,
.Sy 3 Ns :\ x86-64-v3,
.Sy 4 Ns :\ x86-64-v4,
.Sy V Ns :\ PowerPC\ VSX.
.
.Sh ENVIRONMENT
.Bl -tag
.It Ev ARCHLEVEL
On
.Cm amd64 ,
controls the level of SIMD enhancements used.
If this variable is set to an architecture level from the list below
and that architecture level is supported by the processor, SIMD
enhancements up to
.Ev ARCHLEVEL
are used.
If
.Ev ARCHLEVEL
is unset, not recognised, or not supported by the processor, the highest
level of SIMD enhancements supported by the processor is used.
.Pp
A suffix beginning with
.Sq ":"
or
.Sq "+"
in
.Ev ARCHLEVEL
is ignored and may be used for future extensions.
The architecture level can be prefixed with a
.Sq "!"
character to force use of the requested architecture level, even if the
processor does not advertise that it is supported.
This usually causes applications to crash and should only be used for
testing purposes or if architecture level detection yields incorrect
results.
.Pp
The architecture levels follow the AMD64 SysV ABI supplement:
.Bl -tag -width x86-64-v2
.It Cm scalar
scalar enhancements only (no SIMD)
.It Cm baseline
cmov, cx8, x87 FPU, fxsr, MMX, osfxsr, SSE, SSE2
.It Cm x86-64-v2
cx16, lahf/sahf, popcnt, SSE3, SSSE3, SSE4.1, SSE4.2
.It Cm x86-64-v3
AVX, AVX2, BMI1, BMI2, F16C, FMA, lzcnt, movbe, osxsave
.It Cm x86-64-v4
AVX-512F/BW/CD/DQ/VL
.El
.El
.
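.Sh EXAMPLES
The following C fragment is a minimal sketch of the
.Ev ARCHLEVEL
rules described in
.Sx ENVIRONMENT ;
it is not the code used by
.Em libc .
The names
.Fn effective_level ,
.Vt "enum level"
and
.Va level_names
are hypothetical, and falling back to the highest supported level when a
forced level is not recognised is an assumption.
.Bd -literal -offset indent
#include <stddef.h>
#include <string.h>

/* Architecture levels in ascending order, as listed in this manual. */
enum level { SCALAR, BASELINE, X86_64_V2, X86_64_V3, X86_64_V4, NLEVELS };

static const char *const level_names[NLEVELS] = {
	"scalar", "baseline", "x86-64-v2", "x86-64-v3", "x86-64-v4"
};

/*
 * Hypothetical helper: return the level that would take effect for
 * the given ARCHLEVEL value (NULL if unset) on a processor whose
 * highest supported level is "supported".
 */
static enum level
effective_level(const char *value, enum level supported)
{
	size_t len;
	int force = 0, i;

	if (value == NULL)
		return (supported);
	if (*value == '!') {		/* "!" prefix forces the level */
		force = 1;
		value++;
	}
	len = strcspn(value, ":+");	/* suffix at ':' or '+' is ignored */
	for (i = 0; i < NLEVELS; i++) {
		if (strlen(level_names[i]) == len &&
		    strncmp(value, level_names[i], len) == 0) {
			if (force || (enum level)i <= supported)
				return ((enum level)i);
			break;		/* recognised but unsupported */
		}
	}
	return (supported);	/* unset, unrecognised or unsupported */
}
.Ed
.Pp
Under these assumptions, a value of
.Ql x86-64-v4
on a processor that supports only
.Cm x86-64-v2
yields
.Cm x86-64-v2 ,
while
.Ql !x86-64-v4
yields
.Cm x86-64-v4 .
.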
.Sh DIAGNOSTICS
.Bl -diag
.It "Illegal Instruction"
Printed by
.Xr sh 1
if a command is terminated through delivery of a
.Dv SIGILL
signal, see
.Xr signal 3 .
.Pp
Use of an unsupported architecture level was forced by setting
.Ev ARCHLEVEL
to a string beginning with a
.Sq "!"
character, causing a process to crash due to use of an unsupported
instruction.
Unset
.Ev ARCHLEVEL ,
remove the
.Sq "!"
prefix or select a supported architecture level.
.Pp
This message may also appear for unrelated reasons.
.El
.
.Sh SEE ALSO
.Xr string 3 ,
.Xr arch 7
.Rs
.%A H. J. Lu
.%A Michael Matz
.%A Milind Girkar
.%A Jan Hubi\[u010D]ka \" \(vc
.%A Andreas Jaeger
.%A Mark Mitchell
.%B System V Application Binary Interface
.%D May 23, 2023
.%T AMD64 Architecture Processor Supplement
.%O Version 1.0
.Re
.
.Sh HISTORY
Architecture-specific enhanced
.Em libc
functions were added starting with
.Fx 2.0
for
.Cm i386 ,
.Fx 6.0
for
.Cm arm ,
.Fx 6.1
for
.Cm amd64 ,
.Fx 11.0
for
.Cm aarch64 ,
and
.Fx 12.0
for
.Cm powerpc64 .
SIMD-enhanced functions were first added with
.Fx 13.0
for
.Cm powerpc64
and with
.Fx 14.0
for
.Cm amd64 .
.Pp
A
.Nm
manual page appeared in
.Fx 14.0 .
.
.Sh AUTHORS
.An Robert Clausecker Aq Mt fuz@FreeBSD.org
.
.Sh CAVEATS
Other parts of
.Fx
such as cryptographic routines in the kernel or in
OpenSSL may also use SIMD enhancements.
These enhancements are not subject to the
.Ev ARCHLEVEL
variable and may have their own configuration
mechanism.
.
.Sh BUGS
Use of SIMD enhancements cannot be configured on powerpc64.