STYLE REQUIREMENTS
==================

1. With the exception of math/aarch64/experimental/, most code in this
   sub-directory is expected to be upstreamed into glibc, so the GNU
   Coding Standards and glibc-specific conventions should be followed
   to ease upstreaming.

2. ABI and symbols: the code should be written so it is suitable for inclusion
   into a libc with minimal changes. For example, internal symbols should be
   hidden and in the implementation-reserved namespace according to ISO C and
   POSIX rules. If possible, the built shared libraries and static library
   archives should be usable to override libc symbols at link time (or at
   runtime via LD_PRELOAD). This requires the symbols to follow the glibc ABI
   (other than symbol versioning). This cannot be done reliably for static
   linking, so this is a best-effort requirement.

3. API: include headers should be suitable for benchmarking and testing code
   and should not conflict with libc headers.
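
As an illustration of the symbol rules in item 2, here is a minimal sketch of
an internal helper that is both hidden and named in the reserved namespace.
The macro EXAMPLE_HIDDEN and the functions __example_horner2 and
example_quadratic are hypothetical names invented for this example; they are
not part of this library. The sketch assumes a GCC-compatible compiler for the
visibility attribute.

```c
/* Hypothetical macro: hidden visibility keeps a symbol out of the shared
   library's dynamic symbol table (GCC-compatible compilers only).  */
#if defined(__GNUC__)
# define EXAMPLE_HIDDEN __attribute__ ((visibility ("hidden")))
#else
# define EXAMPLE_HIDDEN
#endif

/* Internal helper: the double-underscore prefix places the name in the
   namespace reserved for the implementation, so it cannot clash with
   identifiers in a conforming user program.  */
EXAMPLE_HIDDEN double __example_horner2 (double x, double c0, double c1,
                                         double c2);

double
__example_horner2 (double x, double c0, double c1, double c2)
{
  /* Evaluate c0 + c1*x + c2*x^2 with Horner's scheme.  */
  return c0 + x * (c1 + x * c2);
}

/* Public entry point: default visibility, so a shared library built from
   this code could interpose the symbol at link time or via LD_PRELOAD.  */
double
example_quadratic (double x)
{
  return __example_horner2 (x, 1.0, 2.0, 3.0);
}
```

Only example_quadratic would be exported from a shared library built this way;
the helper stays private to the library.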


CONTRIBUTION GUIDELINES FOR math SUB-DIRECTORY
==============================================

1. Math functions have quality and performance requirements.

2. Quality:
   - Worst-case ULP error should be small in the entire input domain. For most
     common double precision scalar functions the target is < 0.66 ULP error,
     and < 1 ULP for single precision. Even a performance-optimized function
     variant should not have > 5 ULP error if the goal is to be a drop-in
     replacement for a standard math function. This should be tested
     statistically (or on all inputs if possible in a reasonable amount of
     time). The ulp tool is for this, and runulp.sh should be updated for new
     functions.

   - All standard rounding modes need to be supported, but in non-default
     rounding modes the quality requirement can be relaxed. (Non-nearest rounded
     computation can be slow and inaccurate but has to be correct for conformance
     reasons.)

   - Special cases and error handling need to follow ISO C Annex F requirements,
     POSIX requirements, IEEE 754-2008 requirements and glibc requirements:
     https://www.gnu.org/software/libc/manual/html_mono/libc.html#Errors-in-Math-Functions
     This should be tested by direct tests (the glibc test system may be used
     for it).

   - Error handling code should be decoupled from the approximation code as much
     as possible. (There are helper functions that take care of errno as well
     as exception raising.)

   - Vector math code does not need to work in non-nearest rounding modes, and
     error handling side effects (fenv exceptions and errno) need not happen,
     but the result should be correct (within quality requirements, which are
     lower for vector code than for scalar code).

   - Error bounds of the approximation should be clearly documented.

   - The code should build and pass tests on arm, aarch64 and x86_64 GNU/Linux
     systems. (Routines and features can be disabled on specific targets, but
     the build must complete.) On aarch64, both little- and big-endian targets
     are supported as well as valid combinations of architecture extensions.
     The configurations that should be tested depend on the contribution.
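
To make the ULP targets above concrete, the following is a rough sketch of how
the error of a single-precision routine could be measured against a
double-precision reference over sampled inputs. It is not the ulp tool and
does not replace it. All names (ulp_error_f, max_ulp_error_f, example_sqf,
example_sq) are hypothetical, and the sketch assumes round-to-nearest mode and
a reference that is accurate enough to be treated as exact.

```c
#include <float.h>
#include <math.h>

/* Error of the single-precision result GOT against the higher-precision
   reference WANT, in units of the float ULP at WANT.  */
double
ulp_error_f (float got, double want)
{
  int e;

  if (want == 0.0)
    return got == 0.0f ? 0.0 : INFINITY;
  frexp (want, &e); /* |want| = m * 2^e with 0.5 <= m < 1.  */
  if (e < FLT_MIN_EXP)
    e = FLT_MIN_EXP; /* Subnormal floats share the spacing of FLT_MIN.  */
  return fabs ((double) got - want) / ldexp (1.0, e - FLT_MANT_DIG);
}

/* Largest error seen over N >= 2 evenly spaced samples in [LO, HI].  */
double
max_ulp_error_f (float (*f) (float), double (*ref) (double), float lo,
                 float hi, int n)
{
  double worst = 0.0;

  for (int i = 0; i < n; i++)
    {
      float x = lo + (hi - lo) * (float) i / (float) (n - 1);
      double err = ulp_error_f (f (x), ref ((double) x));
      if (err > worst)
        worst = err;
    }
  return worst;
}

/* Stand-in pair for the demonstration: a float routine and its double
   reference.  A real test would pass the routine under test instead.  */
float
example_sqf (float x)
{
  return x * x;
}

double
example_sq (double x)
{
  return x * x;
}
```

A call such as max_ulp_error_f (expf, exp, -10.0f, 10.0f, 1000000) would give
a statistical estimate for expf. Evenly spaced sampling is only one possible
choice and can miss worst cases.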

3. Performance:
   - Common math code should be benchmarked on modern aarch64 microarchitectures
     over typical inputs.

   - Performance improvements should be documented (relative numbers can be
     published; it is enough to use the mathbench microbenchmark tool, which
     should be updated for new functions).

   - Attention should be paid to the compilation flags: for aarch64, fma
     contraction should be on and math errno turned off so some builtins can be
     inlined.

   - The code should be reasonably performant on x86_64 too. For example, some
     rounding instructions and fma may not be available on x86_64, in which case
     such builtins turn into libc calls with slow code. Such a slowdown is not
     acceptable; a faster fallback should be present, since glibc and bionic
     use the same code on all targets. (This does not apply to vector math
     code.)
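
The fallback requirement in the last item can be sketched as follows. This is
one possible approach under stated assumptions, not necessarily what the
routines in this directory do. The helper names are hypothetical.
__FP_FAST_FMA is a macro predefined by GCC-compatible compilers when the
target has a fast fused multiply-add. The rounding helper assumes the default
round-to-nearest mode, |x| < 2^51, no excess-precision evaluation and no
-ffast-math.

```c
/* Hypothetical helper: use fma only where the compiler reports it as fast,
   otherwise fall back to a plain multiply and add instead of a libc call.
   The two paths can differ in the last bit, so documented error bounds
   have to cover both.  */
static inline double
example_mul_add (double a, double b, double c)
{
#ifdef __FP_FAST_FMA
  return __builtin_fma (a, b, c);
#else
  return a * b + c;
#endif
}

/* Hypothetical helper: round to the nearest integer (ties to even) without
   a rounding instruction or libc call.  Adding 0x1.8p52 pushes X into a
   range where doubles are spaced 1 apart, so the addition itself rounds.  */
static inline double
example_round_via_shift (double x)
{
  const double shift = 0x1.8p52;
  double shifted = x + shift;

  return shifted - shift;
}
```

Ties round to even here, which matches rint in the default rounding mode
rather than round.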