All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
3. All advertising materials mentioning features or use of this software
must display the following acknowledgement:
This product includes software developed by the University of
California, Berkeley and its contributors.
4. Neither the name of the University nor the names of its contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
SUCH DAMAGE.
from: @(#)math.3 6.10 (Berkeley) 5/6/91
$Id$
Name Appears on Page Description Error Bound (ULPs) acos sin.3m inverse trigonometric function 3 acosh asinh.3m inverse hyperbolic function 3 asin sin.3m inverse trigonometric function 3 asinh asinh.3m inverse hyperbolic function 3 atan sin.3m inverse trigonometric function 1 atanh asinh.3m inverse hyperbolic function 3 atan2 sin.3m inverse trigonometric function 2 cabs hypot.3m complex absolute value 1 cbrt sqrt.3m cube root 1 ceil floor.3m integer no less than 0 copysign ieee.3m copy sign bit 0 cos sin.3m trigonometric function 1 cosh sinh.3m hyperbolic function 3 erf erf.3m error function ??? erfc erf.3m complementary error function ??? exp exp.3m exponential 1 expm1 exp.3m exp(x)-1 1 fabs floor.3m absolute value 0 floor floor.3m integer no greater than 0 hypot hypot.3m Euclidean distance 1 ilogb ieee.3m exponent extraction 0 j0 j0.3m bessel function ??? j1 j0.3m bessel function ??? jn j0.3m bessel function ??? lgamma lgamma.3m log gamma function; (formerly gamma.3m) log exp.3m natural logarithm 1 log10 exp.3m logarithm to base 10 3 log1p exp.3m log(1+x) 1 pow exp.3m exponential x**y 60-500 remainder ieee.3m remainder 0 rint floor.3m round to nearest integer 0 scalbn ieee.3m exponent adjustment 0 sin sin.3m trigonometric function 1 sinh sinh.3m hyperbolic function 3 sqrt sqrt.3m square root 1 tan sin.3m trigonometric function 3 tanh sinh.3m hyperbolic function 3 y0 j0.3m bessel function ??? y1 j0.3m bessel function ??? yn j0.3m bessel function ???
sqrt(-1.0) = 0.0 and log(-1.0) = -1.7e38.
cos(1.0e-11) > cos(0.0) > 1.0.
pow(x,1.0) != \(!= x when x = 2.0, 3.0, 4.0, ..., 9.0.
pow(-1.0,1.0e10) trapped on Integer Overflow.
sqrt(1.0e30) and sqrt(1.0e-30) were very slow.
DEC VAX-11 D_floating-point:
This is the format for which the original math library libm was developed, and to which this manual is still principally dedicated. It is the double-precision format for the PDP-11 and the earlier VAX-11 machines; VAX-11s after 1983 were provided with an optional "G" format closer to the IEEE double-precision format. The earlier DEC MicroVAXs have no D format, only G double-precision. (Why? Why not?)
Properties of D_floating-point:
Wordsize: 64 bits, 8 bytes. Radix: Binary.
Precision: 56 sig. significant bits, roughly like 17 sig. significant decimals.
If x and x' are consecutive positive D_floating-point numbers (they differ by 1 \*(up), then
1.3e-17 < 0.5**56 < (x'-x)/x \(<= 0.5**55 < 2.8e-17.
Range: Overflow threshold = 2.0**127 = 1.7e38. Underflow threshold = 0.5**128 = 2.9e-39. NOTE: THIS RANGE IS COMPARATIVELY NARROW.
Overflow customarily stops computation.
Underflow is customarily flushed quietly to zero.
CAUTION:
It is possible to have x != \(!= y and yet x-y = 0 because of underflow. Similarly x > y > 0 cannot prevent either x\(**y = 0 or y/x = 0 from happening without warning.
Although 2**55 different representations of zero are accepted by the hardware, only the obvious representation is ever produced. There is no -0 on a VAX.
Reserved operands:
of the 2**55 that the hardware recognizes, only one of them is ever produced. Any floating-point operation upon a reserved operand, even a MOVF or MOVD, customarily stops computation, so they are not much used.
Divisions by zero and operations that overflow are invalid operations that customarily stop computation or, in earlier machines, produce reserved operands that will stop computation.
Except for its narrow range, D_floating-point is one of the better computer arithmetics designed in the 1960's. Its properties are reflected fairly faithfully in the elementary functions for a VAX distributed in 4.3 BSD. They over/underflow only if their results have to lie out of range or very nearly so, and then they behave much as any rational arithmetic operation that over/underflowed would behave. Similarly, expressions like log(0) and atanh(1) behave like 1/0; and sqrt(-3) and acos(3) behave like 0/0; they all produce reserved operands and/or stop computation! The situation is described in more detail in manual pages.
This response seems excessively punitive, so it is destined
to be replaced at some time in the foreseeable future by a
more flexible but still uniform scheme being developed to
handle all floating-point arithmetic exceptions neatly.
See infnan(3M) for the present state of affairs.
How do the functions in 4.3 BSD's new libm for UNIX compare with their counterparts in DEC's VAX/VMS library? Some of the VMS functions are a little faster, some are a little more accurate, some are more puritanical about exceptions (like pow(0.0,0.0) and atan2(0.0,0.0)), and most occupy much more memory than their counterparts in libm. The VMS codes interpolate in large table to achieve speed and accuracy; the libm codes use tricky formulas compact enough that all of them may some day fit into a ROM.
More important, DEC regards the VMS codes as proprietary and guards them zealously against unauthorized use. But the libm codes in 4.3 BSD are intended for the public domain; they may be copied freely provided their provenance is always acknowledged, and provided users assist the authors in their researches by reporting experience with the codes. Therefore no user of UNIX on a machine whose arithmetic resembles VAX D_floating-point need use anything worse than the new libm.
IEEE STANDARD 754 Floating-Point Arithmetic:
This standard is on its way to becoming more widely adopted than any other design for computer arithmetic. VLSI chips that conform to some version of that standard have been produced by a host of manufacturers, among them ...
Intel i8087, i80287 National Semiconductor 32081 Motorola 68881 Weitek WTL-1032, ... , -1165 Zilog Z8070 Western Electric (AT&T) WE32106.Other implementations range from software, done thoroughly in the Apple Macintosh, through VLSI in the Hewlett-Packard 9000 series, to the ELXSI 6400 running ECL at 3 Megaflops. Several other companies have adopted the formats of IEEE 754 without, alas, adhering to the standard's way of handling rounding and exceptions like over/underflow. The DEC VAX G_floating-point format is very similar to the IEEE 754 Double format, so similar that the C programs for the IEEE versions of most of the elementary functions listed above could easily be converted to run on a MicroVAX, though nobody has volunteered to do that yet.
The codes in 4.3 BSD's libm for machines that conform to IEEE 754 are intended primarily for the National Semi. 32081 and WTL 1164/65. To use these codes with the Intel or Zilog chips, or with the Apple Macintosh or ELXSI 6400, is to forego the use of better codes provided (perhaps freely) by those companies and designed by some of the authors of the codes above. Except for atan, cabs, cbrt, erf, erfc, hypot, j0-jn, lgamma, pow and y0-yn, the Motorola 68881 has all the functions in libm on chip, and faster and more accurate; it, Apple, the i8087, Z8070 and WE32106 all use 64 sig. significant bits. The main virtue of 4.3 BSD's libm codes is that they are intended for the public domain; they may be copied freely provided their provenance is always acknowledged, and provided users assist the authors in their researches by reporting experience with the codes. Therefore no user of UNIX on a machine that conforms to IEEE 754 need use anything worse than the new libm.
Properties of IEEE 754 Double-Precision:
Wordsize: 64 bits, 8 bytes. Radix: Binary.
Precision: 53 sig. significant bits, roughly like 16 sig. significant decimals.
If x and x' are consecutive positive Double-Precision numbers (they differ by 1 \*(up), then
1.1e-16 < 0.5**53 < (x'-x)/x \(<= 0.5**52 < 2.3e-16.
Range: Overflow threshold = 2.0**1024 = 1.8e308 Underflow threshold = 0.5**1022 = 2.2e-308
Overflow goes by default to a signed f "" .
Underflow is Gradual, rounding to the nearest integer multiple of 0.5**1074 = 4.9e-324.
Its sign transforms correctly through multiplication or division, and is preserved by addition of zeros with like signs; but x-x yields +0 for every finite x. The only operations that reveal zero's sign are division by zero and copysign(x,\(+-0). In particular, comparison (x > y, x \(>= y, etc.) cannot be affected by the sign of zero; but if finite x = y then f = 1/(x-y) != \(!= -1/(y-x) = f - .
it persists when added to itself or to any finite number. Its sign transforms correctly through multiplication and division, and f (finite)/\(+- \0=\0\(+-0 (nonzero)/0 = f \(+- . But Infinity-Infinity, Infinity\(**0 and Infinity/Infinity \(if-\(if, \(if\(**0 and \(if/\(if are, like 0/0 and sqrt(-3), invalid operations that produce \*(nn. ...
there are 2**53-2 of them, all called \*(nn (Not a Number). Some, called Signaling \*(nns, trap any floating-point operation performed upon them; they are used to mark missing or uninitialized values, or nonexistent elements of arrays. The rest are Quiet \*(nns; they are the default results of Invalid Operations, and propagate through subsequent arithmetic operations. If x != \(!= x then x is \*(nn; every other predicate (x > y, x = y, x < y, ...) is FALSE if \*(nn is involved.
NOTE: Trichotomy is violated by \*(nn.
Besides being FALSE, predicates that entail ordered comparison, rather than mere (in)equality, signal Invalid Operation when \*(nn is involved.
Every algebraic operation (+, -, \(**, /, sqrt) \(sr) is rounded by default to within half an \*(up, and when the rounding error is exactly half an \*(up then the rounded value's least significant bit is zero. This kind of rounding is usually the best kind, sometimes provably so; for instance, for every x = 1.0, 2.0, 3.0, 4.0, ..., 2.0**52, we find (x/3.0)\(**3.0 == x and (x/10.0)\(**10.0 == x and ... despite that both the quotients and the products have been rounded. Only rounding like IEEE 754 can do that. But no single kind of rounding can be proved best for every circumstance, so IEEE 754 provides rounding towards zero or towards f + or towards f - at the programmer's option. And the same kinds of rounding are specified for Binary-Decimal Conversions, at least for magnitudes between roughly 1.0e-10 and 1.0e37.
IEEE 754 recognizes five kinds of floating-point exceptions, listed below in declining order of probable importance.
Exception Default Result .tc \(ru .tc Invalid Operation \*(nn, or FALSE Overflow \(+-Infinity Divide by Zero \(+-Infinity \} Overflow \(+-\(if Divide by Zero \(+-\(if \} Underflow Gradual Underflow Inexact Rounded value
For each kind of floating-point exception, IEEE 754 provides a Flag that is raised each time its exception is signaled, and stays raised until the program resets it. Programs may also test, save and restore a flag. Thus, IEEE 754 provides three ways by which programs may cope with exceptions for which the default result might be unsatisfactory:
CAUTION: The only reliable ways to discover whether Underflow has occurred are to test whether products or quotients lie closer to zero than the underflow threshold, or to test the Underflow flag. (Sums and differences cannot underflow in IEEE 754; if x != \(!= y then x-y is correct to full precision and certainly nonzero regardless of how tiny it may be.) Products and quotients that underflow gradually can lose accuracy gradually without vanishing, so comparing them with zero (as one might on a VAX) will not reveal the loss. Fortunately, if a gradually underflowed value is destined to be added to something bigger than the underflow threshold, as is almost always the case, digits lost to gradual underflow will not be missed because they would have been rounded off anyway. So gradual underflows are usually provably ignorable. The same cannot be said of underflows flushed to 0.
At the option of an implementor conforming to IEEE 754, other ways to cope with exceptions may be provided:
The crucial problem for exception handling is the problem of Scope, and the problem's solution is understood, but not enough manpower was available to implement it fully in time to be distributed in 4.3 BSD's libm. Ideally, each elementary function should act as if it were indivisible, or atomic, in the sense that ...
Ideally, every programmer should be able conveniently to turn a debugged subprogram into one that appears atomic to its users. But simulating all three characteristics of an atomic function is still a tedious affair, entailing hosts of tests and saves-restores; work is under way to ameliorate the inconvenience.
Meanwhile, the functions in libm are only approximately atomic. They signal no inappropriate exception except possibly ...
when a result, if properly computed, might have lain barely within range, and
when it happens to be exact, thanks to fortuitous cancellation of errors.
Invalid Operation is signaled only when
any result but \*(nn would probably be misleading.
the exact result would be finite but beyond the overflow threshold.
a function takes exactly infinite values at finite operands.
the exact result would be nonzero but tinier than the underflow threshold.
greater range or precision would be needed to represent the exact result.