.\" Copyright (c) 1985 Regents of the University of California.
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"     from: @(#)ieee.3 6.4 (Berkeley) 5/6/91
.\"
.Dd January 26, 2005
.Dt IEEE 3
.Os
.Sh NAME
.Nm ieee
.Nd IEEE standard 754 for floating-point arithmetic
.Sh DESCRIPTION
The IEEE Standard 754 for Binary Floating-Point Arithmetic
defines representations of floating-point numbers and abstract
properties of arithmetic operations relating to precision,
rounding, and exceptional cases, as described below.
.Ss IEEE STANDARD 754 Floating-Point Arithmetic
Radix: Binary.
.Pp
Overflow and underflow:
.Bd -ragged -offset indent -compact
Overflow goes by default to a signed \*(If.
Underflow is
.Em gradual .
.Ed
.Pp
Zero is represented ambiguously as +0 or \-0.
.Bd -ragged -offset indent -compact
Its sign transforms correctly through multiplication or
division, and is preserved by addition of zeros
with like signs; but x\-x yields +0 for every
finite x.
The only operations that reveal zero's
sign are division by zero and
.Fn copysign x \(+-0 .
In particular, comparison (x > y, x \(>= y, etc.)\&
cannot be affected by the sign of zero; but if
finite x = y then \*(If = 1/(x\-y) \(!= \-1/(y\-x) = \-\*(If.
.Ed
.Pp
Infinity is signed.
.Bd -ragged -offset indent -compact
It persists when added to itself
or to any finite number.
Its sign transforms
correctly through multiplication and division, and
(finite)/\(+-\*(If\0=\0\(+-0 and
(nonzero)/0 = \(+-\*(If.
But
\*(If\-\*(If, \*(If\(**0 and \*(If/\*(If
are, like 0/0 and sqrt(\-3),
invalid operations that produce \*(Na. ...
.Ed
.Pp
Reserved operands (\*(Nas):
.Bd -ragged -offset indent -compact
An \*(Na is
.Em ( N Ns ot Em a N Ns umber ) .
Some \*(Nas, called Signaling \*(Nas, trap any floating-point operation
performed upon them; they are used to mark missing
or uninitialized values, or nonexistent elements
of arrays.
The rest are Quiet \*(Nas; they are
the default results of Invalid Operations, and
propagate through subsequent arithmetic operations.
If x \(!= x then x is \*(Na; every other predicate
(x > y, x = y, x < y, ...) is FALSE if \*(Na is involved.
.Ed
.Pp
Rounding:
.Bd -ragged -offset indent -compact
Every algebraic operation (+, \-, \(**, /,
\(sr)
is rounded by default to within half an
.Em ulp ,
and when the rounding error is exactly half an
.Em ulp
then
the rounded value's least significant bit is zero.
(An
.Em ulp
is one
.Em U Ns nit
in the
.Em L Ns ast
.Em P Ns lace . )
This kind of rounding is usually the best kind,
sometimes provably so; for instance, for every
x = 1.0, 2.0, 3.0, 4.0, ..., 2.0**52, we find
(x/3.0)\(**3.0 == x and (x/10.0)\(**10.0 == x and ...
even though both the quotients and the products
have been rounded.
Only rounding like IEEE 754 can do that.
But no single kind of rounding can be
proved best for every circumstance, so IEEE 754
provides rounding towards zero or towards
+\*(If or towards \-\*(If
at the programmer's option.
.Ed
.Pp
Exceptions:
.Bd -ragged -offset indent -compact
IEEE 754 recognizes five kinds of floating-point exceptions,
listed below in declining order of probable importance.
.Bl -column -offset indent "Invalid Operation" "Gradual Underflow"
.It Em Exception Ta Em "Default Result"
.It Invalid Operation Ta \*(Na, or FALSE
.It Overflow Ta \(+-\*(If
.It Divide by Zero Ta \(+-\*(If
.It Underflow Ta Gradual Underflow
.It Inexact Ta Rounded value
.El
.Pp
NOTE: An Exception is not an Error unless handled
badly.
What makes a class of exceptions exceptional
is that no single default response can be satisfactory
in every instance.
On the other hand, if a default
response will serve most instances satisfactorily,
the unsatisfactory instances cannot justify aborting
computation every time the exception occurs.
.Ed
.Ss Data Formats
Single-precision:
.Bd -ragged -offset indent -compact
Type name:
.Vt float
.Pp
Wordsize: 32 bits.
.Pp
Precision: 24 significant bits,
roughly like 7 significant decimals.
.Pp
If x and x' are consecutive positive single-precision
numbers (they differ by 1
.Em ulp ) ,
then
.Bl -column "XXX" -compact
.It 5.9e\-08 < 0.5**24 < (x'\-x)/x \(<= 0.5**23 < 1.2e\-07.
.El
.Pp
.Bl -column "XXX" -compact
.It Range: Ta Overflow threshold = 2.0**128 = 3.4e38
.It \& Ta Underflow threshold = 0.5**126 = 1.2e\-38
.El
.Pp
Underflowed results round to the nearest
integer multiple of
.Bl -column "XXX" -compact
.It 0.5**149 = 1.4e\-45.
.El
.Ed
.Pp
Double-precision:
.Bd -ragged -offset indent -compact
Type name:
.Vt double
.Po On some architectures,
.Vt long double
is the same as
.Vt double
.Pc
.Pp
Wordsize: 64 bits.
.Pp
Precision: 53 significant bits,
roughly like 16 significant decimals.
.Pp
If x and x' are consecutive positive double-precision
numbers (they differ by 1
.Em ulp ) ,
then
.Bl -column "XXX" -compact
.It 1.1e\-16 < 0.5**53 < (x'\-x)/x \(<= 0.5**52 < 2.3e\-16.
.El
.Pp
.Bl -column "XXX" -compact
.It Range: Ta Overflow threshold = 2.0**1024 = 1.8e308
.It \& Ta Underflow threshold = 0.5**1022 = 2.2e\-308
.El
.Pp
Underflowed results round to the nearest
integer multiple of
.Bl -column "XXX" -compact
.It 0.5**1074 = 4.9e\-324.
.El
.Ed
.Pp
Extended-precision:
.Bd -ragged -offset indent -compact
Type name:
.Vt long double
(when supported by the hardware)
.Pp
Wordsize: 96 bits.
.Pp
Precision: 64 significant bits,
roughly like 19 significant decimals.
.Pp
If x and x' are consecutive positive extended-precision
numbers (they differ by 1
.Em ulp ) ,
then
.Bl -column "XXX" -compact
.It 5.4e\-20 < 0.5**64 < (x'\-x)/x \(<= 0.5**63 < 1.1e\-19.
.El
.Pp
.Bl -column "XXX" -compact
.It Range: Ta Overflow threshold = 2.0**16384 = 1.2e4932
.It \& Ta Underflow threshold = 0.5**16382 = 3.4e\-4932
.El
.Pp
Underflowed results round to the nearest
integer multiple of
.Bl -column "XXX" -compact
.It 0.5**16445 = 3.6e\-4951.
.El
.Ed
.Pp
Quad-extended-precision:
.Bd -ragged -offset indent -compact
Type name:
.Vt long double
(when supported by the hardware)
.Pp
Wordsize: 128 bits.
.Pp
Precision: 113 significant bits,
roughly like 34 significant decimals.
.Pp
If x and x' are consecutive positive quad-extended-precision
numbers (they differ by 1
.Em ulp ) ,
then
.Bl -column "XXX" -compact
.It 9.6e\-35 < 0.5**113 < (x'\-x)/x \(<= 0.5**112 < 2.0e\-34.
.El
.Pp
.Bl -column "XXX" -compact
.It Range: Ta Overflow threshold = 2.0**16384 = 1.2e4932
.It \& Ta Underflow threshold = 0.5**16382 = 3.4e\-4932
.El
.Pp
Underflowed results round to the nearest
integer multiple of
.Bl -column "XXX" -compact
.It 0.5**16494 = 6.5e\-4966.
.El
.Ed
.Ss Additional Information Regarding Exceptions
For each kind of floating-point exception, IEEE 754
provides a Flag that is raised each time its exception
is signaled, and stays raised until the program resets
it.
Programs may also test, save and restore a flag.
Thus, IEEE 754 provides three ways by which programs
may cope with exceptions for which the default result
might be unsatisfactory:
.Bl -enum
.It
Test for a condition that might cause an exception
later, and branch to avoid the exception.
.It
Test a flag to see whether an exception has occurred
since the program last reset its flag.
.It
Test a result to see whether it is a value that only
an exception could have produced.
.Pp
CAUTION: The only reliable ways to discover
whether Underflow has occurred are to test whether
products or quotients lie closer to zero than the
underflow threshold, or to test the Underflow
flag.
(Sums and differences cannot underflow in
IEEE 754; if x \(!= y then x\-y is correct to
full precision and certainly nonzero regardless of
how tiny it may be.)
Products and quotients that
underflow gradually can lose accuracy gradually
without vanishing, so comparing them with zero
(as one might on a VAX) will not reveal the loss.
Fortunately, if a gradually underflowed value is
destined to be added to something bigger than the
underflow threshold, as is almost always the case,
digits lost to gradual underflow will not be missed
because they would have been rounded off anyway.
So gradual underflows are usually
.Em provably
ignorable.
The same cannot be said of underflows flushed to 0.
.El
.Pp
At the option of an implementor conforming to IEEE 754,
other ways to cope with exceptions may be provided:
.Bl -enum
.It
ABORT.
This mechanism classifies an exception in
advance as an incident to be handled by means
traditionally associated with error-handling
statements like "ON ERROR GO TO ...".
Different
languages offer different forms of this statement,
but most share the following characteristics:
.Bl -dash
.It
No means is provided to substitute a value for
the offending operation's result and resume
computation from what may be the middle of an
expression.
An exceptional result is abandoned.
.It
In a subprogram that lacks an error-handling
statement, an exception causes the subprogram to
abort within whatever program called it, and so
on back up the chain of calling subprograms until
an error-handling statement is encountered or the
whole task is aborted and memory is dumped.
.El
.It
STOP.
This mechanism, requiring an interactive
debugging environment, is more for the programmer
than the program.
It classifies an exception in
advance as a symptom of a programmer's error; the
exception suspends execution as near as it can to
the offending operation so that the programmer can
look around to see how it happened.
Quite often
the first several exceptions turn out to be quite
unexceptionable, so the programmer ought ideally
to be able to resume execution after each one as if
execution had not been stopped.
.It
\&... Other ways lie beyond the scope of this document.
.El
.Pp
Ideally, each
elementary function should act as if it were indivisible, or
atomic, in the sense that ...
.Bl -enum
.It
No exception should be signaled that is not deserved by
the data supplied to that function.
.It
Any exception signaled should be identified with that
function rather than with one of its subroutines.
.It
The internal behavior of an atomic function should not
be disrupted when a calling program changes from
one to another of the five or so ways of handling
exceptions listed above, although the definition
of the function may be correlated intentionally
with exception handling.
.El
.Pp
The functions in
.Nm libm
are only approximately atomic.
They signal no inappropriate exception except possibly ...
.Bl -tag -width indent -offset indent -compact
.It Xo
Over/Underflow
.Xc
when a result, if properly computed, might have lain barely within range, and
.It Xo
Inexact in
.Fn cabs ,
.Fn cbrt ,
.Fn hypot ,
.Fn log10
and
.Fn pow
.Xc
when it happens to be exact, thanks to fortuitous cancellation of errors.
.El
Otherwise, ...
.Bl -tag -width indent -offset indent -compact
.It Xo
Invalid Operation is signaled only when
.Xc
any result but \*(Na would probably be misleading.
.It Xo
Overflow is signaled only when
.Xc
the exact result would be finite but beyond the overflow threshold.
.It Xo
Divide-by-Zero is signaled only when
.Xc
a function takes exactly infinite values at finite operands.
.It Xo
Underflow is signaled only when
.Xc
the exact result would be nonzero but tinier than the underflow threshold.
.It Xo
Inexact is signaled only when
.Xc
greater range or precision would be needed to represent the exact result.
.El
.Sh SEE ALSO
.Xr fenv 3 ,
.Xr ieee_test 3 ,
.Xr math 3
.Pp
An explanation of IEEE 754 and its proposed extension p854
was published in the IEEE magazine MICRO in August 1984 under
the title "A Proposed Radix- and Word-length-independent
Standard for Floating-point Arithmetic" by
.An "W. J. Cody"
et al.
The manuals for Pascal, C and BASIC on the Apple Macintosh
document the features of IEEE 754 pretty well.
Articles in the IEEE magazine COMPUTER vol.\& 14 no.\& 3 (Mar.\&
1981), and in the ACM SIGNUM Newsletter Special Issue of
Oct.\& 1979, may be helpful although they pertain to
superseded drafts of the standard.
.Sh STANDARDS
.St -ieee754