STYLE REQUIREMENTS
==================

1. With the exception of math/aarch64/experimental/, most code in this
   sub-directory is expected to be upstreamed into glibc, so the GNU
   Coding Standards and glibc-specific conventions should be followed
   to ease upstreaming.

2. ABI and symbols: the code should be written so it is suitable for inclusion
   into a libc with minimal changes. For example, internal symbols should be
   hidden and in the implementation-reserved namespace according to ISO C and
   POSIX rules. If possible, the built shared libraries and static library
   archives should be usable to override libc symbols at link time (or at
   runtime via LD_PRELOAD). This requires the symbols to follow the glibc ABI
   (other than symbol versioning). It cannot be done reliably for static
   linking, so this is a best-effort requirement.

3. API: include headers should be suitable for benchmarking and testing code
   and should not conflict with libc headers.

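The symbol rules in item 2 can be illustrated with a minimal sketch. The
names __example_poly and example_exp_approx are hypothetical and are not
symbols of this library, and the visibility attribute is the GCC/Clang
spelling; other toolchains may need a different mechanism.

```c
/* Internal helper: the name starts with a double underscore, so it is in
   the namespace ISO C reserves for the implementation, and the hidden
   visibility keeps it out of the dynamic symbol table of a shared library.
   Both names in this sketch are hypothetical.  */
__attribute__ ((visibility ("hidden"))) double
__example_poly (double x)
{
  /* Degree-2 Taylor polynomial of exp(x) around 0, for illustration only.  */
  return 1.0 + x * (1.0 + x * 0.5);
}

/* Public entry point: the only symbol that would be exported.  */
double
example_exp_approx (double x)
{
  return __example_poly (x);
}
```

Only the public entry point can then be overridden or interposed; the
helper stays private to the library that defines it.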

CONTRIBUTION GUIDELINES FOR math SUB-DIRECTORY
==============================================

1. Math functions have quality and performance requirements.

2. Quality:
   - Worst-case ULP error should be small over the entire input domain. For
     most common double-precision scalar functions the target is < 0.66 ULP
     error, and < 1 ULP for single precision. Even a performance-optimized
     function variant should not have > 5 ULP error if the goal is to be a
     drop-in replacement for a standard math function. This should be tested
     statistically (or on all inputs if that is possible in a reasonable
     amount of time). The ulp tool is for this, and runulp.sh should be
     updated for new functions.

   - All standard rounding modes need to be supported, but in non-default
     rounding modes the quality requirement can be relaxed. (Non-nearest
     rounded computation can be slow and inaccurate but has to be correct for
     conformance reasons.)

   - Special cases and error handling need to follow ISO C Annex F requirements,
     POSIX requirements, IEEE 754-2008 requirements and glibc requirements:
     https://www.gnu.org/software/libc/manual/html_mono/libc.html#Errors-in-Math-Functions
     This should be tested by direct tests (the glibc test system may be used
     for it).

   - Error handling code should be decoupled from the approximation code as much
     as possible. (There are helper functions that take care of errno as well
     as exception raising.)

   - Vector math code does not need to work in non-nearest rounding modes, and
     error handling side effects (fenv exceptions and errno) need not happen,
     but the result should be correct (within quality requirements, which are
     lower for vector code than for scalar code).

   - Error bounds of the approximation should be clearly documented.

   - The code should build and pass tests on arm, aarch64 and x86_64 GNU/Linux
     systems. (Routines and features can be disabled on specific targets, but
     the build must complete.) On aarch64, both little- and big-endian targets
     are supported as well as valid combinations of architecture extensions.
     The configurations that should be tested depend on the contribution.
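
As an illustration of the ULP error measure used in the quality targets
above, the following is a minimal sketch for a single-precision result
checked against a double-precision reference. example_ulp_error_f is a
hypothetical helper and not the implementation of the ulp tool; it assumes
the double reference is accurate enough to stand in for the exact value.

```c
#include <float.h>
#include <math.h>

/* Hypothetical helper, not part of the ulp tool: error of the float result
   GOT, in units of the float ULP at the double-precision reference WANT.  */
double
example_ulp_error_f (float got, double want)
{
  int e;
  double diff, ulp;

  /* Infinities and NaNs belong to the special-case tests, not here.  */
  if (!isfinite (want))
    return got == (float) want ? 0.0 : INFINITY;

  /* want = m * 2^e with 0.5 <= |m| < 1; e is left at 0 for want == 0.  */
  frexp (want, &e);
  /* Below the smallest normal float the ULP stops shrinking.  */
  if (want == 0.0 || e < FLT_MIN_EXP)
    e = FLT_MIN_EXP;
  /* A float has FLT_MANT_DIG significand bits, so its ULP is
     2^(e - FLT_MANT_DIG).  */
  ulp = ldexp (1.0, e - FLT_MANT_DIG);

  diff = (double) got - want;
  return (diff < 0 ? -diff : diff) / ulp;
}
```

A worst-case figure is then the maximum of this value over the tested
inputs, whether they are sampled statistically or enumerated exhaustively.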

3. Performance:
   - Common math code should be benchmarked on modern aarch64 microarchitectures
     over typical inputs.

   - Performance improvements should be documented (relative numbers can be
     published; it is enough to use the mathbench microbenchmark tool, which
     should be updated for new functions).

   - Attention should be paid to the compilation flags: for aarch64, fma
     contraction should be on and math errno turned off (e.g.
     -ffp-contract=fast and -fno-math-errno with GCC), so some builtins can
     be inlined.

   - The code should be reasonably performant on x86_64 too. For example, some
     rounding instructions and fma may not be available on x86_64, in which
     case the corresponding builtins turn into libc calls with slow code. Such
     a slowdown is not acceptable and a faster fallback should be present,
     because glibc and bionic use the same code on all targets. (This does not
     apply to vector math code.)
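
One possible shape for such a fallback, sketched for rounding to the
nearest integer: when no fast rounding instruction is available, a shift
constant can be added and subtracted instead of calling into libc. The
macro EXAMPLE_HAVE_FAST_RINT and the function example_round_nearest are
hypothetical names, and the fallback only holds under the assumptions
listed in the comment.

```c
/* Hypothetical configuration macro: set to 1 on targets where the rint
   builtin expands to a single instruction.  */
#ifndef EXAMPLE_HAVE_FAST_RINT
# define EXAMPLE_HAVE_FAST_RINT 0
#endif

/* Round X to the nearest integer, ties to even.  Hypothetical helper.  */
static inline double
example_round_nearest (double x)
{
#if EXAMPLE_HAVE_FAST_RINT
  return __builtin_rint (x);
#else
  /* Adding 1.5 * 2^52 moves X into a range where doubles are spaced 1
     apart, so the addition itself rounds X to an integer; subtracting the
     constant recovers it.  Assumes round-to-nearest mode, |x| < 2^51,
     evaluation in double precision, and no -ffast-math style
     reassociation.  */
  const double shift = 0x1.8p52;
  return (x + shift) - shift;
#endif
}
```

Both branches give the same result in the default rounding mode, so the
choice affects speed only and not the documented error bounds.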
80