.\" Copyright (c) 2023 The FreeBSD Foundation
.
.\" This documentation was written by Robert Clausecker <fuz@FreeBSD.org>
.\" under sponsorship from the FreeBSD Foundation.
.
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ''AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE
.
.Dd June 7, 2024
.Dt SIMD 7
.Os
.Sh NAME
.Nm simd
.Nd SIMD enhancements
.
.Sh DESCRIPTION
On some architectures, the
.Fx
.Em libc
provides enhanced implementations of commonly used functions, replacing
the architecture-independent implementations used otherwise.
Depending on architecture and function, an enhanced
implementation of a function is either always used, or
.Em libc
detects at runtime which SIMD instruction set extensions are
supported and picks the most suitable implementation automatically.
On
.Cm amd64 ,
the environment variable
.Ev ARCHLEVEL
can be used to override this mechanism.
.Pp
Enhanced functions are present for the following architectures:
.Bl -column FUNCTION_________ aarch64_ arm_ amd64_ i386_ ppc64_ -offset indent
.It Em FUNCTION          Ta Em AARCH64 Ta Em ARM Ta Em AMD64  Ta Em I386 Ta Em PPC64
.It    bcmp              Ta            Ta        Ta    S1     Ta    S
.It    bcopy             Ta            Ta    S   Ta    S      Ta    S    Ta    SV
.It    bzero             Ta            Ta    S   Ta    S      Ta    S
.It    div               Ta            Ta        Ta    S      Ta    S
.It    index             Ta    A       Ta        Ta    S1
.It    ldiv              Ta            Ta        Ta    S      Ta    S
.It    lldiv             Ta            Ta        Ta    S
.It    memchr            Ta    A       Ta        Ta    S1
.It    memcmp            Ta    A       Ta    S   Ta    S1     Ta    S
.It    memccpy           Ta            Ta        Ta    S1
.It    memcpy            Ta    S       Ta    S   Ta    S      Ta    S    Ta    SV
.It    memmove           Ta    S       Ta    S   Ta    S      Ta    S    Ta    SV
.It    memrchr           Ta    A       Ta        Ta    S1
.It    memset            Ta    A       Ta    S   Ta    S      Ta    S
.It    rindex            Ta    A       Ta        Ta    S1     Ta    S
.It    stpcpy            Ta    A       Ta        Ta    S1
.It    stpncpy           Ta            Ta        Ta    S1
.It    strcat            Ta            Ta        Ta    S1     Ta    S
.It    strchr            Ta    A       Ta        Ta    S1     Ta    S
.It    strchrnul         Ta    A       Ta        Ta    S1
.It    strcmp            Ta    S       Ta    S   Ta    S1     Ta    S
.It    strcpy            Ta    A       Ta        Ta    S1     Ta    S    Ta    S2
.It    strcspn           Ta            Ta        Ta    S2
.It    strlcat           Ta            Ta        Ta    S1
.It    strlcpy           Ta            Ta        Ta    S1
.It    strlen            Ta    A       Ta    S   Ta    S1
.It    strncat           Ta            Ta        Ta    S1
.It    strncmp           Ta    S       Ta    S   Ta    S1     Ta    S
.It    strncpy           Ta            Ta        Ta    S1     Ta         Ta    S2
.It    strnlen           Ta    A       Ta        Ta    S1
.It    strrchr           Ta    A       Ta        Ta    S1     Ta    S
.It    strpbrk           Ta            Ta        Ta    S2
.It    strsep            Ta            Ta        Ta    S2
.It    strspn            Ta            Ta        Ta    S2
.It    swab              Ta            Ta        Ta           Ta    S
.It    timingsafe_bcmp   Ta            Ta        Ta    S1
.It    timingsafe_memcmp Ta            Ta        Ta    S
.It    wcschr            Ta            Ta        Ta           Ta    S
.It    wcscmp            Ta            Ta        Ta           Ta    S
.It    wcslen            Ta            Ta        Ta           Ta    S
.It    wmemchr           Ta            Ta        Ta           Ta    S
.El
.Pp
.Sy S Ns :\ scalar (non-SIMD),
.Sy 1 Ns :\ amd64 baseline,
.Sy 2 Ns :\ x86-64-v2
or PowerPC\ 2.05,
.Sy 3 Ns :\ x86-64-v3,
.Sy 4 Ns :\ x86-64-v4,
.Sy V Ns :\ PowerPC\ VSX,
.Sy A Ns :\ Arm\ ASIMD (NEON).
.
.Sh ENVIRONMENT
.Bl -tag
.It Ev ARCHLEVEL
On
.Em amd64 ,
controls the level of SIMD enhancements used.
If this variable is set to an architecture level from the list below
and that architecture level is supported by the processor, SIMD
enhancements up to
.Ev ARCHLEVEL
are used.
If
.Ev ARCHLEVEL
is unset, not recognised, or not supported by the processor, the highest
level of SIMD enhancements supported by the processor is used.
.Pp
A suffix beginning with
.Sq ":"
or
.Sq "+"
in
.Ev ARCHLEVEL
is ignored and may be used for future extensions.
The architecture level can be prefixed with a
.Sq "!"
character to force use of the requested architecture level, even if the
processor does not advertise that it is supported.
This usually causes applications to crash and should only be used for
testing purposes or if architecture level detection yields incorrect
results.
.Pp
The architecture levels follow the AMD64 SysV ABI supplement:
.Bl -tag -width x86-64-v2
.It Cm scalar
scalar enhancements only (no SIMD)
.It Cm baseline
cmov, cx8, x87 FPU, fxsr, MMX, osfxsr, SSE, SSE2
.It Cm x86-64-v2
cx16, lahf/sahf, popcnt, SSE3, SSSE3, SSE4.1, SSE4.2
.It Cm x86-64-v3
AVX, AVX2, BMI1, BMI2, F16C, FMA, lzcnt, movbe, osxsave
.It Cm x86-64-v4
AVX-512F/BW/CD/DQ/VL
.El
.El
.
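.Sh EXAMPLES
The following
.Xr sh 1
command lines are a minimal sketch of how
.Ev ARCHLEVEL
might be set on
.Em amd64 ,
based only on the rules described in
.Sx ENVIRONMENT .
The program name
.Pa ./prog
is a hypothetical placeholder and not part of
.Fx .
.Bd -literal -offset indent
# Use scalar enhancements only (no SIMD) for a single command.
ARCHLEVEL=scalar ./prog

# Use SIMD enhancements up to x86-64-v2, if the processor supports it.
env ARCHLEVEL=x86-64-v2 ./prog

# Force x86-64-v4 even if the processor does not advertise it.
# Testing only: this usually ends in an "Illegal Instruction" crash.
ARCHLEVEL='!x86-64-v4' ./prog

# Return to automatic selection for the rest of the shell session.
unset ARCHLEVEL
.Ed
.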
.Sh DIAGNOSTICS
.Bl -diag
.It "Illegal Instruction"
Printed by
.Xr sh 1
if a command is terminated through delivery of a
.Dv SIGILL
signal; see
.Xr signal 3 .
.Pp
Use of an unsupported architecture level was forced by setting
.Ev ARCHLEVEL
to a string beginning with a
.Sq "!"
character, causing a process to crash due to use of an unsupported
instruction.
Unset
.Ev ARCHLEVEL ,
remove the
.Sq "!"
prefix, or select a supported architecture level.
.Pp
This message may also appear for unrelated reasons.
.El
.
.Sh SEE ALSO
.Xr string 3 ,
.Xr arch 7
.Rs
.%A H. J. Lu
.%A Michael Matz
.%A Milind Girkar
.%A Jan Hubi\[u010D]ka \" \(vc
.%A Andreas Jaeger
.%A Mark Mitchell
.%B System V Application Binary Interface
.%D May 23, 2023
.%T AMD64 Architecture Processor Supplement
.%O Version 1.0
.Re
.
.Sh HISTORY
Architecture-specific enhanced
.Em libc
functions were added starting
with
.Fx 2.0
for
.Cm i386 ,
.Fx 6.0
for
.Cm arm ,
.Fx 6.1
for
.Cm amd64 ,
.Fx 11.0
for
.Cm aarch64 ,
and
.Fx 12.0
for
.Cm powerpc64 .
SIMD-enhanced functions were first added with
.Fx 13.0
for
.Cm powerpc64
and with
.Fx 14.1
for
.Cm amd64 .
.Pp
A
.Nm
manual page appeared in
.Fx 14.1 .
.
.Sh AUTHORS
.An Robert Clausecker Aq Mt fuz@FreeBSD.org
.
.Sh CAVEATS
Other parts of
.Fx ,
such as cryptographic routines in the kernel or in
OpenSSL, may also use SIMD enhancements.
These enhancements are not subject to the
.Ev ARCHLEVEL
variable and may have their own configuration
mechanism.
.
.Sh BUGS
Use of SIMD enhancements cannot be configured on powerpc64.
243