STYLE REQUIREMENTS
==================

1. With the exception of math/aarch64/experimental/, most code in this
   sub-directory is expected to be upstreamed into glibc, so the GNU
   Coding Standards and glibc-specific conventions should be followed
   to ease upstreaming.

2. ABI and symbols: the code should be written so that it is suitable for
   inclusion into a libc with minimal changes. For example, internal symbols
   should be hidden and placed in the implementation-reserved namespace
   according to ISO C and POSIX rules. If possible, the built shared libraries
   and static library archives should be usable to override libc symbols at
   link time (or at runtime via LD_PRELOAD). This requires the symbols to
   follow the glibc ABI (other than symbol versioning). It cannot be done
   reliably for static linking, so this is a best-effort requirement.

3. API: include headers should be suitable for benchmarking and testing code
   and should not conflict with libc headers.
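
A minimal sketch of the symbol conventions in item 2, assuming GCC or Clang
on an ELF target. The names __example_asuint64 and example_signbit are
hypothetical and are not part of this library:

```c
#include <stdint.h>

/* Hypothetical internal helper.  The leading double underscore keeps the
   name in the implementation-reserved namespace, and hidden visibility
   keeps it out of the shared library's dynamic symbol table.  */
__attribute__ ((visibility ("hidden"))) uint64_t
__example_asuint64 (double x)
{
  union { double f; uint64_t i; } u = { x };
  return u.i;
}

/* Hypothetical public entry point.  A real routine would carry a standard
   name such as exp or log so that it can override the libc symbol.  */
int
example_signbit (double x)
{
  return (int) (__example_asuint64 (x) >> 63);
}
```

With this layout, inspecting the built shared library with nm -D should
list only the public entry point, not the internal helper.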


CONTRIBUTION GUIDELINES FOR math SUB-DIRECTORY
==============================================

1. Math functions have quality and performance requirements.

2. Quality:
   - Worst-case ULP error should be small over the entire input domain. For
     most common double-precision scalar functions the target is < 0.66 ULP
     error, and < 1 ULP for single precision. Even a performance-optimized
     function variant should not have > 5 ULP error if the goal is to be a
     drop-in replacement for a standard math function. This should be tested
     statistically (or on all inputs if that is possible in a reasonable
     amount of time). The ulp tool is provided for this, and runulp.sh should
     be updated for new functions.

   - All standard rounding modes need to be supported, but in non-default
     rounding modes the quality requirement can be relaxed. (Non-nearest
     rounded computation can be slow and inaccurate, but has to be correct for
     conformance reasons.)

   - Special cases and error handling need to follow ISO C Annex F
     requirements, POSIX requirements, IEEE 754-2008 requirements and glibc
     requirements:
     https://www.gnu.org/software/libc/manual/html_mono/libc.html#Errors-in-Math-Functions
     This should be tested by direct tests (the glibc test system may be used
     for it).

   - Error handling code should be decoupled from the approximation code as
     much as possible. (There are helper functions for this, which take care
     of errno as well as exception raising.)

   - Vector math code does not need to work in non-nearest rounding modes, and
     error handling side effects (fenv exceptions and errno) need not happen,
     but the result should be correct (within quality requirements, which are
     lower for vector code than for scalar code).

   - Error bounds of the approximation should be clearly documented.

   - The code should build and pass tests on arm, aarch64 and x86_64 GNU/Linux
     systems. (Routines and features can be disabled on specific targets, but
     the build must complete.) On aarch64, both little- and big-endian targets
     are supported, as well as valid combinations of architecture extensions.
     The configurations that should be tested depend on the contribution.
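
The ULP targets in the first Quality bullet can be illustrated with a small
sketch that measures the error of one single-precision result against a
double-precision reference. The function name example_ulp_error is
hypothetical; this is not how the ulp tool is implemented, and it assumes
finite values and the IEEE 754 binary32 format:

```c
#include <math.h>

/* Error of the single-precision result GOT against the higher-precision
   reference WANT, measured in units in the last place (ULP) of a float
   at the magnitude of WANT.  Assumes both values are finite.  */
double
example_ulp_error (float got, double want)
{
  int e;
  (void) frexp (want, &e); /* |want| = m * 2^e with 0.5 <= m < 1.  */
  /* Below the smallest normal float (2^-126) the spacing is fixed.  */
  if (want == 0.0 || e < -125)
    e = -125;
  /* A float has 24 significand bits, so one ULP is 2^(e - 24).  */
  double ulp = ldexp (1.0, e - 24);
  return fabs ((double) got - want) / ulp;
}
```

A test driver would take the maximum of this value over many inputs and
compare it with the target, for example 1.0 for a single-precision routine.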
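
Decoupling error handling from the approximation code might look like the
following sketch. The helper example_divzero and the routine example_recip
are hypothetical stand-ins, not the helper functions this library provides:

```c
#include <errno.h>
#include <math.h>

/* Hypothetical helper for a pole error: sets errno and raises the
   divide-by-zero exception by performing the division at run time.  */
double
example_divzero (int negative)
{
  /* volatile stops the compiler from folding the division away.  */
  volatile double zero = 0.0;
  errno = ERANGE;
  return (negative ? -1.0 : 1.0) / zero;
}

/* Hypothetical routine: the special case is a single branch to the
   helper, so the main path contains no errno or fenv code.  */
double
example_recip (double x)
{
  if (x == 0.0)
    return example_divzero (signbit (x));
  return 1.0 / x;
}
```

Keeping the helper out of line keeps the common path short, which is one
reason to separate the two kinds of code.
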
3. Performance:
   - Common math code should be benchmarked on modern aarch64 microarchitectures
     over typical inputs.

   - Performance improvements should be documented (relative numbers can be
     published; it is enough to use the mathbench microbenchmark tool, which
     should be updated for new functions).

   - Attention should be paid to the compilation flags: for aarch64, fma
     contraction should be on and math errno turned off so that some builtins
     can be inlined.

   - The code should be reasonably performant on x86_64 too. For example, some
     rounding instructions and fma may not be available on x86_64, so such
     builtins turn into libc calls with slow code. Such a slowdown is not
     acceptable and a faster fallback should be present: glibc and bionic use
     the same code on all targets. (This does not apply to vector math code.)
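
One possible faster fallback for a missing rounding instruction is the
well-known shift trick sketched below, which rounds to an integer using only
an addition and a subtraction. The name example_round_shift is hypothetical,
and the sketch assumes double arithmetic without excess precision, no
value-changing optimizations such as -ffast-math, and |x| < 2^51:

```c
/* Round X to an integer in the current rounding mode (ties to even by
   default) without a rounding instruction or a libc call.  Adding
   1.5 * 2^52 moves the sum into a range where adjacent doubles are 1
   apart, so the addition itself performs the rounding.  Only valid for
   |x| < 2^51, and the sign of a zero result is not preserved.  */
double
example_round_shift (double x)
{
  const double shift = 0x1.8p52;
  double t = x + shift;
  return t - shift;
}
```

Because the same expression is cheap on every target, a routine written this
way does not depend on whether the compiler can inline a rounding builtin.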