History log of /freebsd/lib/libc/amd64/string/strchrnul.S (Results 1 – 3 of 3)
Revision Date Author Comments
Revision tags: release/14.0.0
# 3d8ef251 25-Aug-2023 Robert Clausecker <fuz@FreeBSD.org>

lib/libc/amd64/string/strchrnul.S: fix edge case in scalar code

When the buffer is immediately preceded by the character we
are looking for and begins with one higher than that character,
and the buffer is misaligned, a match was erroneously detected
in the first character. Fix this by changing the way we prevent
matches before the buffer from being detected: instead of
removing the corresponding bit from the 0x80..80 mask, set the
LSB of bytes before the buffer after xoring with the character we
look for.

The bug only affects amd64 with ARCHLEVEL=scalar (cf. simd(7)).
The change comes at a 2% performance impact for short strings
if ARCHLEVEL is set to scalar. The default configuration is not
affected.
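
The borrow problem described above can be reproduced in portable C.
What follows is a simplified sketch of the general SWAR technique, not
the actual assembly from strchrnul.S. The helper names (pack8,
zero_bytes, match_old, match_new) are hypothetical and chosen only for
illustration, and the sketch covers only the character match, not the
NUL check.

```c
#include <stdint.h>

#define ONES  UINT64_C(0x0101010101010101)
#define HIGHS UINT64_C(0x8080808080808080)

/* Pack 8 bytes so that p[0] is the least significant byte. */
static uint64_t pack8(const char *p)
{
    uint64_t w = 0;
    for (int i = 0; i < 8; i++)
        w |= (uint64_t)(unsigned char)p[i] << (8 * i);
    return w;
}

/*
 * Classic zero-byte test: flags bytes of x that are zero with 0x80.
 * Only the lowest flag is reliable, because the borrow of the
 * subtraction can leak from a zero byte into the byte above it.
 */
static uint64_t zero_bytes(uint64_t x, uint64_t highs)
{
    return (x - ONES) & ~x & highs;
}

/*
 * Old approach (sketch): ignore the `skip` bytes before the buffer by
 * removing their bits from the 0x80..80 mask.  skip must be in 0..7.
 */
static uint64_t match_old(uint64_t word, unsigned char c, unsigned skip)
{
    uint64_t x = word ^ (ONES * c);
    return zero_bytes(x, HIGHS << (8 * skip));
}

/*
 * New approach (sketch): after xoring, set the LSB of every byte
 * before the buffer so none of them is zero and none can borrow.
 */
static uint64_t match_new(uint64_t word, unsigned char c, unsigned skip)
{
    uint64_t x = word ^ (ONES * c);
    x |= ~(~UINT64_C(0) << (8 * skip)) & ONES;
    return zero_bytes(x, HIGHS);
}
```

With the word packed from "bcxxxxxx", c = 'b' and skip = 1, the byte
before the buffer xors to 0x00 and the first buffer byte to 0x01.
match_old then flags the first buffer byte (0x8000) even though it
holds 'c', while match_new returns 0.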

os: FreeBSD
arch: amd64
cpu: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
        │ strchrnul.scalar.0.out │       strchrnul.scalar.2.out       │
        │         sec/op         │   sec/op     vs base               │
Short               57.89µ ± 2%    59.08µ ± 1%  +2.07% (p=0.030 n=20)
Mid                 19.24µ ± 0%    19.73µ ± 0%  +2.53% (p=0.000 n=20)
Long                11.03µ ± 0%    11.03µ ± 0%       ~ (p=0.547 n=20)
geomean             23.07µ         23.43µ       +1.53%

        │ strchrnul.scalar.0.out │       strchrnul.scalar.2.out        │
        │          B/s           │     B/s      vs base                │
Short              2.011Gi ± 2%    1.970Gi ± 1%  -2.02% (p=0.030 n=20)
Mid                6.049Gi ± 0%    5.900Gi ± 0%  -2.47% (p=0.000 n=20)
Long               10.56Gi ± 0%    10.56Gi ± 0%       ~ (p=0.547 n=20)
geomean            5.045Gi         4.969Gi       -1.50%

MFC to: stable/14
MFC after: 3 days
Approved by: mjg (blanket, via IRC)
Sponsored by: The FreeBSD Foundation



# d7302cab 07-Aug-2023 Robert Clausecker <fuz@FreeBSD.org>

lib/libc/amd64/string/strchrnul.S: fix wrong indentation

Used spaces instead of tabs for this line by accident.

Reported by: jrtc27, kib
Approved by: kib


# 61f4c4d3 30-Jun-2023 Robert Clausecker <fuz@FreeBSD.org>

lib/libc/amd64/string: add strchrnul implementations (scalar, baseline)

A lot better than the generic (pre) implementation. We do not beat glibc
for long strings, likely due to glibc switching to AVX once the input is
sufficiently long. X86-64-v3 and v4 implementations may be added at a
future time.
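
For context, strchrnul returns a pointer to the first occurrence of the
character in the string, or to the terminating NUL if it does not
occur. A minimal byte-at-a-time sketch of that contract follows. The
name ref_strchrnul is hypothetical, and this is neither the generic
libc implementation nor the code added by this commit.

```c
/* Hypothetical reference helper: scan one byte at a time. */
static char *ref_strchrnul(const char *s, int c)
{
    unsigned char ch = (unsigned char)c;

    /* Stop at the first match or at the terminating NUL. */
    while (*s != '\0' && (unsigned char)*s != ch)
        s++;
    return (char *)s;
}
```

The scalar and baseline implementations compute the same result but
examine several bytes per step instead of one.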

os: FreeBSD
arch: amd64
cpu: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
        │ strchrnul_pre.out │        strchrnul_scalar.out         │       strchrnul_baseline.out        │
        │      sec/op       │    sec/op     vs base               │    sec/op     vs base               │
Short         129.68µ ± 3%     59.91µ ± 1%  -53.80% (p=0.000 n=20)    44.37µ ± 1%  -65.79% (p=0.000 n=20)
Mid            21.15µ ± 0%     19.30µ ± 0%   -8.76% (p=0.000 n=20)    12.30µ ± 0%  -41.85% (p=0.000 n=20)
Long          13.772µ ± 0%    11.028µ ± 0%  -19.92% (p=0.000 n=20)    3.285µ ± 0%  -76.15% (p=0.000 n=20)
geomean        33.55µ          23.36µ       -30.37%                   12.15µ       -63.80%

        │ strchrnul_pre.out │         strchrnul_scalar.out          │        strchrnul_baseline.out         │
        │        B/s        │      B/s      vs base                 │      B/s      vs base                 │
Short         919.3Mi ± 3%    1989.7Mi ± 1%  +116.45% (p=0.000 n=20)   2686.8Mi ± 1%  +192.28% (p=0.000 n=20)
Mid           5.505Gi ± 0%     6.033Gi ± 0%    +9.60% (p=0.000 n=20)    9.466Gi ± 0%   +71.97% (p=0.000 n=20)
Long          8.453Gi ± 0%    10.557Gi ± 0%   +24.88% (p=0.000 n=20)   35.441Gi ± 0%  +319.26% (p=0.000 n=20)
geomean       3.470Gi          4.983Gi        +43.62%                   9.584Gi       +176.22%

For comparison, glibc on the same machine:

        │ strchrnul_glibc.out │
        │       sec/op        │
Short            49.73µ ± 0%
Mid              14.60µ ± 0%
Long             1.237µ ± 0%
geomean          9.646µ

        │ strchrnul_glibc.out │
        │         B/s         │
Short           2.341Gi ± 0%
Mid             7.976Gi ± 0%
Long            94.14Gi ± 0%
geomean         12.07Gi

Sponsored by: The FreeBSD Foundation
Approved by: mjg
Differential Revision: https://reviews.freebsd.org/D41333
