.\" Copyright (c) 2023 The FreeBSD Foundation
.
.\" This documentation was written by Robert Clausecker <fuz@FreeBSD.org>
.\" under sponsorship from the FreeBSD Foundation.
.
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ''AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE
.
.Dd November 18, 2024
.Dt SIMD 7
.Os
.Sh NAME
.Nm simd
.Nd SIMD enhancements
.
.Sh DESCRIPTION
On some architectures, the
.Fx
.Em libc
provides enhanced implementations of commonly used functions, replacing
the architecture-independent implementations used otherwise.
Depending on architecture and function, an enhanced
implementation of a function is either always used, or
.Em libc
detects at runtime which SIMD instruction set extensions are
supported and picks the most suitable implementation automatically.
On
.Cm amd64 ,
the environment variable
.Ev ARCHLEVEL
can be used to override this mechanism.
.Pp
Enhanced functions are present for the following architectures:
.Bl -column FUNCTION_________ aarch64_ arm_ amd64_ i386_ ppc64_ -offset indent
.It Em FUNCTION          Ta Em AARCH64 Ta Em ARM Ta Em AMD64  Ta Em I386 Ta Em PPC64
.It    bcmp              Ta    A       Ta        Ta    S1     Ta    S
.It    bcopy             Ta    A       Ta    S   Ta    S      Ta    S    Ta    SV
.It    bzero             Ta    A       Ta    S   Ta    S      Ta    S
.It    div               Ta            Ta        Ta    S      Ta    S
.It    index             Ta    A       Ta        Ta    S1
.It    ldiv              Ta            Ta        Ta    S      Ta    S
.It    lldiv             Ta            Ta        Ta    S
.It    memchr            Ta    A       Ta        Ta    S1
.It    memcmp            Ta    A       Ta    S   Ta    S1     Ta    S
.It    memccpy           Ta    A       Ta        Ta    S1
.It    memcpy            Ta    A       Ta    S   Ta    S      Ta    S    Ta    SV
.It    memmove           Ta    A       Ta    S   Ta    S      Ta    S    Ta    SV
.It    memrchr           Ta    A       Ta        Ta    S1
.It    memset            Ta    A       Ta    S   Ta    S      Ta    S
.It    rindex            Ta    A       Ta        Ta    S1     Ta    S
.It    stpcpy            Ta    A       Ta        Ta    S1
.It    stpncpy           Ta            Ta        Ta    S1
.It    strcat            Ta    A       Ta        Ta    S1     Ta    S
.It    strchr            Ta    A       Ta        Ta    S1     Ta    S
.It    strchrnul         Ta    A       Ta        Ta    S1
.It    strcmp            Ta    A       Ta    S   Ta    S1     Ta    S
.It    strcpy            Ta    A       Ta        Ta    S1     Ta    S    Ta    S2
.It    strcspn           Ta    S       Ta        Ta    S2
.It    strlcat           Ta    A       Ta        Ta    S1
.It    strlcpy           Ta    A       Ta        Ta    S1
.It    strlen            Ta    A       Ta    S   Ta    S1
.It    strncat           Ta    A       Ta        Ta    S1
.It    strncmp           Ta    A       Ta    S   Ta    S1     Ta    S
.It    strncpy           Ta            Ta        Ta    S1     Ta         Ta    S2
.It    strnlen           Ta    A       Ta        Ta    S1
.It    strrchr           Ta    A       Ta        Ta    S1     Ta    S
.It    strpbrk           Ta    S       Ta        Ta    S2
.It    strsep            Ta    S       Ta        Ta    S2
.It    strspn            Ta    S       Ta        Ta    S2
.It    swab              Ta            Ta        Ta           Ta    S
.It    timingsafe_bcmp   Ta    A       Ta        Ta    S1
.It    timingsafe_memcmp Ta    S       Ta        Ta    S
.It    wcschr            Ta            Ta        Ta           Ta    S
.It    wcscmp            Ta            Ta        Ta           Ta    S
.It    wcslen            Ta            Ta        Ta           Ta    S
.It    wmemchr           Ta            Ta        Ta           Ta    S
.El
.Pp
.Sy S Ns :\ scalar (non-SIMD),
.Sy 1 Ns :\ amd64 baseline,
.Sy 2 Ns :\ x86-64-v2
or PowerPC\ 2.05,
.Sy 3 Ns :\ x86-64-v3,
.Sy 4 Ns :\ x86-64-v4,
.Sy V Ns :\ PowerPC\ VSX,
.Sy A Ns :\ Arm\ ASIMD (NEON).
.
.Sh ENVIRONMENT
.Bl -tag
.It Ev ARCHLEVEL
On
.Cm amd64 ,
controls the level of SIMD enhancements used.
If this variable is set to an architecture level from the list below
and that architecture level is supported by the processor, SIMD
enhancements up to
.Ev ARCHLEVEL
are used.
If
.Ev ARCHLEVEL
is unset, not recognised, or not supported by the processor, the highest
level of SIMD enhancements supported by the processor is used.
.Pp
A suffix beginning with
.Sq ":"
or
.Sq "+"
in
.Ev ARCHLEVEL
is ignored and may be used for future extensions.
The architecture level can be prefixed with a
.Sq "!"
character to force use of the requested architecture level, even if the
processor does not advertise that it is supported.
This usually causes applications to crash and should only be used for
testing purposes or if architecture level detection yields incorrect
results.
.Pp
The architecture levels follow the AMD64 SysV ABI supplement:
.Bl -tag -width x86-64-v2
.It Cm scalar
scalar enhancements only (no SIMD)
.It Cm baseline
cmov, cx8, x87 FPU, fxsr, MMX, osfxsr, SSE, SSE2
.It Cm x86-64-v2
cx16, lahf/sahf, popcnt, SSE3, SSSE3, SSE4.1, SSE4.2
.It Cm x86-64-v3
AVX, AVX2, BMI1, BMI2, F16C, FMA, lzcnt, movbe, osxsave
.It Cm x86-64-v4
AVX-512F/BW/CD/DQ/VL
.El
.El
.
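.Sh EXAMPLES
The following commands sketch how
.Ev ARCHLEVEL
might be set on
.Cm amd64 ,
based on the rules given in
.Sx ENVIRONMENT .
The program name
.Pa ./prog
is a placeholder chosen for illustration and is not part of
.Fx .
.Pp
Restrict
.Em libc
to scalar enhancements for a single command:
.Bd -literal -offset indent
env ARCHLEVEL=scalar ./prog
.Ed
.Pp
Use SIMD enhancements up to x86-64-v2, provided the processor supports
that level:
.Bd -literal -offset indent
env ARCHLEVEL=x86-64-v2 ./prog
.Ed
.Pp
Force x86-64-v4 regardless of what the processor advertises, for
testing only.
If the processor lacks the required instructions, the program may be
terminated by
.Dv SIGILL .
The value is quoted so that shells with history expansion do not
interpret the
.Sq "!"
character:
.Bd -literal -offset indent
env ARCHLEVEL='!x86-64-v4' ./prog
.Ed
.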
.Sh DIAGNOSTICS
.Bl -diag
.It "Illegal Instruction"
Printed by
.Xr sh 1
if a command is terminated through delivery of a
.Dv SIGILL
signal; see
.Xr signal 3 .
.Pp
Use of an unsupported architecture level was forced by setting
.Ev ARCHLEVEL
to a string beginning with a
.Sq "!"
character, causing a process to crash due to use of an unsupported
instruction.
Unset
.Ev ARCHLEVEL ,
remove the
.Sq "!"
prefix, or select a supported architecture level.
.Pp
This message may also appear for unrelated reasons.
.El
.
.Sh SEE ALSO
.Xr string 3 ,
.Xr arch 7
.Rs
.%A H. J. Lu
.%A Michael Matz
.%A Milind Girkar
.%A Jan Hubi\[u010D]ka \" \(vc
.%A Andreas Jaeger
.%A Mark Mitchell
.%B System V Application Binary Interface
.%D May 23, 2023
.%T AMD64 Architecture Processor Supplement
.%O Version 1.0
.Re
.
.Sh HISTORY
Architecture-specific enhanced
.Em libc
functions were added starting
with
.Fx 2.0
for
.Cm i386 ,
.Fx 6.0
for
.Cm arm ,
.Fx 6.1
for
.Cm amd64 ,
.Fx 11.0
for
.Cm aarch64 ,
and
.Fx 12.0
for
.Cm powerpc64 .
SIMD-enhanced functions were first added with
.Fx 13.0
for
.Cm powerpc64
and with
.Fx 14.1
for
.Cm amd64 .
.Pp
A
.Nm
manual page appeared in
.Fx 14.1 .
.
.Sh AUTHORS
.An Robert Clausecker Aq Mt fuz@FreeBSD.org
.
.Sh CAVEATS
Other parts of
.Fx
such as cryptographic routines in the kernel or in
OpenSSL may also use SIMD enhancements.
These enhancements are not subject to the
.Ev ARCHLEVEL
variable and may have their own configuration
mechanism.
.
.Sh BUGS
Use of SIMD enhancements cannot be configured on powerpc64.