.\" Copyright (c) 1985 Regents of the University of California.
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 4. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"     from: @(#)ieee.3	6.4 (Berkeley) 5/6/91
.\" $FreeBSD$
.\"
.Dd January 26, 2005
.Dt IEEE 3
.Os
.Sh NAME
.Nm ieee
.Nd IEEE standard 754 for floating-point arithmetic
.Sh DESCRIPTION
The IEEE Standard 754 for Binary Floating-Point Arithmetic
defines representations of floating-point numbers and abstract
properties of arithmetic operations relating to precision,
rounding, and exceptional cases, as described below.
.Ss IEEE STANDARD 754 Floating-Point Arithmetic
Radix: Binary.
.Pp
Overflow and underflow:
.Bd -ragged -offset indent -compact
Overflow goes by default to a signed \*(If.
Underflow is
.Em gradual .
.Ed
.Pp
Zero is represented ambiguously as +0 or \-0.
.Bd -ragged -offset indent -compact
Its sign transforms correctly through multiplication or
division, and is preserved by addition of zeros
with like signs; but x\-x yields +0 for every
finite x.
The only operations that reveal zero's
sign are division by zero and
.Fn copysign x \(+-0 .
In particular, comparison (x > y, x \(>= y, etc.)\&
cannot be affected by the sign of zero; but if
finite x = y then \*(If = 1/(x\-y) \(!= \-1/(y\-x) = \-\*(If.
.Ed
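.Pp
The behavior of signed zeros described above can be checked with a short C sketch.
It assumes a C99 compiler and
.In math.h ;
the helper names
.Fn is_negative_zero
and
.Fn reciprocal
are hypothetical and exist only for this illustration.
.Bd -literal -offset indent
```c
#include <math.h>

/* Hypothetical helper: 1 if x is a zero whose sign bit is set, else 0. */
int
is_negative_zero(double x)
{
    /* x == 0.0 holds for both zeros; signbit() tells them apart. */
    return (x == 0.0 && signbit(x) != 0);
}

/* Hypothetical helper: 1/x, which reveals the sign of a zero argument. */
double
reciprocal(double x)
{
    volatile double v = x;  /* volatile discourages compile-time folding */

    return (1.0 / v);
}
```
.Ed
Comparing 0.0 with \-0.0 reports equality, whereas
.Fn reciprocal
returns infinities of opposite sign for the two zeros.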
.Pp
Infinity is signed.
.Bd -ragged -offset indent -compact
It persists when added to itself
or to any finite number.
Its sign transforms
correctly through multiplication and division;
(finite)/\(+-\*(If\0=\0\(+-0 and
(nonzero)/0 = \(+-\*(If.
But
\*(If\-\*(If, \*(If\(**0 and \*(If/\*(If
are, like 0/0 and sqrt(\-3),
invalid operations that produce \*(Na. ...
.Ed
.Pp
Reserved operands (\*(Nas):
.Bd -ragged -offset indent -compact
An \*(Na is
.Em ( N Ns ot Em a N Ns umber ) .
Some \*(Nas, called Signaling \*(Nas, trap any floating-point operation
performed upon them; they are used to mark missing
or uninitialized values, or nonexistent elements
of arrays.
The rest are Quiet \*(Nas; they are
the default results of Invalid Operations, and
propagate through subsequent arithmetic operations.
If x \(!= x then x is \*(Na; every other predicate
(x > y, x = y, x < y, ...) is FALSE if \*(Na is involved.
.Ed
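.Pp
The comparison rules for \*(Nas can be illustrated with a minimal C sketch.
It assumes C99
.In math.h
and a compiler that is not run in a mode that relaxes IEEE semantics;
.Fn is_nan_by_compare
and
.Fn count_true_predicates
are hypothetical names chosen for this example.
.Bd -literal -offset indent
```c
#include <math.h>

/* Hypothetical helper: the x != x test, true only for a NaN. */
int
is_nan_by_compare(double x)
{
    volatile double v = x;  /* volatile keeps the comparison from being folded */

    return (v != v);
}

/*
 * Hypothetical helper: how many of x > y, x == y, x < y are true.
 * Exactly one holds for ordinary operands; none holds if either is a NaN.
 */
int
count_true_predicates(double x, double y)
{
    return ((x > y) + (x == y) + (x < y));
}
```
.Ed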
.Pp
Rounding:
.Bd -ragged -offset indent -compact
Every algebraic operation (+, \-, \(**, /,
\(sr)
is rounded by default to within half an
.Em ulp ,
and when the rounding error is exactly half an
.Em ulp
then
the rounded value's least significant bit is zero.
(An
.Em ulp
is one
.Em U Ns nit
in the
.Em L Ns ast
.Em P Ns lace . )
This kind of rounding is usually the best kind,
sometimes provably so; for instance, for every
x = 1.0, 2.0, 3.0, 4.0, ..., 2.0**52, we find
(x/3.0)\(**3.0 == x and (x/10.0)\(**10.0 == x and ...
even though both the quotients and the products
have been rounded.
Only rounding like IEEE 754 can do that.
But no single kind of rounding can be
proved best for every circumstance, so IEEE 754
provides rounding towards zero or towards
+\*(If or towards \-\*(If
at the programmer's option.
.Ed
.Pp
Exceptions:
.Bd -ragged -offset indent -compact
IEEE 754 recognizes five kinds of floating-point exceptions,
listed below in declining order of probable importance.
.Bl -column -offset indent "Invalid Operation" "Gradual Underflow"
.Em "Exception	Default Result"
Invalid Operation	\*(Na, or FALSE
Overflow	\(+-\*(If
Divide by Zero	\(+-\*(If
Underflow	Gradual Underflow
Inexact	Rounded value
.El
.Pp
NOTE: An Exception is not an Error unless handled
badly.
What makes a class of exceptions exceptional
is that no single default response can be satisfactory
in every instance.
On the other hand, if a default
response will serve most instances satisfactorily,
the unsatisfactory instances cannot justify aborting
computation every time the exception occurs.
.Ed
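.Pp
The default results in the table above can be observed with a small C sketch.
It assumes that
.Vt double
is the IEEE 754 double-precision format and that exceptions are not trapped;
.Fn divide
and
.Fn defaults_match_table
are hypothetical helpers.
.Bd -literal -offset indent
```c
#include <float.h>
#include <math.h>

/* Hypothetical helper: divide at run time rather than at compile time. */
double
divide(double num, double den)
{
    volatile double n = num, d = den;

    return (n / d);
}

/* Hypothetical helper: 1 if the default results listed above are observed. */
int
defaults_match_table(void)
{
    double invalid = divide(0.0, 0.0);          /* Invalid Operation */
    double overflow = divide(DBL_MAX, 0.5);     /* Overflow */
    double divzero = divide(-1.0, 0.0);         /* Divide by Zero */
    double underflow = divide(DBL_MIN, 3.0);    /* Underflow */

    return (isnan(invalid) &&
        isinf(overflow) && overflow > 0.0 &&
        isinf(divzero) && divzero < 0.0 &&
        fpclassify(underflow) == FP_SUBNORMAL);
}
```
.Ed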
.Ss Data Formats
Single-precision:
.Bd -ragged -offset indent -compact
Type name:
.Vt float
.Pp
Wordsize: 32 bits.
.Pp
Precision: 24 significant bits,
roughly like 7 significant decimals.
.Bd -ragged -offset indent -compact
If x and x' are consecutive positive single-precision
numbers (they differ by 1
.Em ulp ) ,
then
.Bd -ragged -compact
5.9e\-08 < 0.5**24 < (x'\-x)/x \(<= 0.5**23 < 1.2e\-07.
.Ed
.Ed
.Pp
.Bl -column "XXX" -compact
Range:	Overflow threshold  = 2.0**128 = 3.4e38
	Underflow threshold = 0.5**126 = 1.2e\-38
.El
.Bd -ragged -offset indent -compact
Underflowed results round to the nearest
integer multiple of 0.5**149 = 1.4e\-45.
.Ed
.Ed
.Pp
Double-precision:
.Bd -ragged -offset indent -compact
Type name:
.Vt double
.Bd -ragged -offset indent -compact
On some architectures,
.Vt long double
is the same as
.Vt double .
.Ed
.Pp
Wordsize: 64 bits.
.Pp
Precision: 53 significant bits,
roughly like 16 significant decimals.
.Bd -ragged -offset indent -compact
If x and x' are consecutive positive double-precision
numbers (they differ by 1
.Em ulp ) ,
then
.Bd -ragged -compact
1.1e\-16 < 0.5**53 < (x'\-x)/x \(<= 0.5**52 < 2.3e\-16.
.Ed
.Ed
.Pp
.Bl -column "XXX" -compact
Range:	Overflow threshold  = 2.0**1024 = 1.8e308
	Underflow threshold = 0.5**1022 = 2.2e\-308
.El
.Bd -ragged -offset indent -compact
Underflowed results round to the nearest
integer multiple of 0.5**1074 = 4.9e\-324.
.Ed
.Ed
.Pp
Extended-precision:
.Bd -ragged -offset indent -compact
Type name:
.Vt long double
(when supported by the hardware)
.Pp
Wordsize: 96 bits.
.Pp
Precision: 64 significant bits,
roughly like 19 significant decimals.
.Bd -ragged -offset indent -compact
If x and x' are consecutive positive extended-precision
numbers (they differ by 1
.Em ulp ) ,
then
.Bd -ragged -compact
5.4e\-20 < 0.5**64 < (x'\-x)/x \(<= 0.5**63 < 1.1e\-19.
.Ed
.Ed
.Pp
.Bl -column "XXX" -compact
Range:	Overflow threshold  = 2.0**16384 = 1.2e4932
	Underflow threshold = 0.5**16382 = 3.4e\-4932
.El
.Bd -ragged -offset indent -compact
Underflowed results round to the nearest
integer multiple of 0.5**16445 = 3.6e\-4951.
.Ed
.Ed
.Pp
Quad-extended-precision:
.Bd -ragged -offset indent -compact
Type name:
.Vt long double
(when supported by the hardware)
.Pp
Wordsize: 128 bits.
.Pp
Precision: 113 significant bits,
roughly like 34 significant decimals.
.Bd -ragged -offset indent -compact
If x and x' are consecutive positive quad-extended-precision
numbers (they differ by 1
.Em ulp ) ,
then
.Bd -ragged -compact
9.6e\-35 < 0.5**113 < (x'\-x)/x \(<= 0.5**112 < 2.0e\-34.
.Ed
.Ed
.Pp
.Bl -column "XXX" -compact
Range:	Overflow threshold  = 2.0**16384 = 1.2e4932
	Underflow threshold = 0.5**16382 = 3.4e\-4932
.El
.Bd -ragged -offset indent -compact
Underflowed results round to the nearest
integer multiple of 0.5**16494 = 6.5e\-4966.
.Ed
.Ed
.Ss Additional Information Regarding Exceptions
.Pp
For each kind of floating-point exception, IEEE 754
provides a Flag that is raised each time its exception
is signaled, and stays raised until the program resets
it.
Programs may also test, save and restore a flag.
Thus, IEEE 754 provides three ways by which programs
may cope with exceptions for which the default result
might be unsatisfactory:
.Bl -enum
.It
Test for a condition that might cause an exception
later, and branch to avoid the exception.
.It
Test a flag to see whether an exception has occurred
since the program last reset its flag.
.It
Test a result to see whether it is a value that only
an exception could have produced.
.Pp
CAUTION: The only reliable ways to discover
whether Underflow has occurred are to test whether
products or quotients lie closer to zero than the
underflow threshold, or to test the Underflow
flag.
(Sums and differences cannot underflow in
IEEE 754; if x \(!= y then x\-y is correct to
full precision and certainly nonzero regardless of
how tiny it may be.)
Products and quotients that
underflow gradually can lose accuracy gradually
without vanishing, so comparing them with zero
(as one might on a VAX) will not reveal the loss.
Fortunately, if a gradually underflowed value is
destined to be added to something bigger than the
underflow threshold, as is almost always the case,
digits lost to gradual underflow will not be missed
because they would have been rounded off anyway.
So gradual underflows are usually
.Em provably
ignorable.
The same cannot be said of underflows flushed to 0.
.El
.Pp
At the option of an implementor conforming to IEEE 754,
other ways to cope with exceptions may be provided:
.Bl -enum
.It
ABORT.
This mechanism classifies an exception in
advance as an incident to be handled by means
traditionally associated with error-handling
statements like "ON ERROR GO TO ...".
Different
languages offer different forms of this statement,
but most share the following characteristics:
.Bl -dash
.It
No means is provided to substitute a value for
the offending operation's result and resume
computation from what may be the middle of an
expression.
An exceptional result is abandoned.
.It
In a subprogram that lacks an error-handling
statement, an exception causes the subprogram to
abort within whatever program called it, and so
on back up the chain of calling subprograms until
an error-handling statement is encountered or the
whole task is aborted and memory is dumped.
.El
.It
STOP.
This mechanism, requiring an interactive
debugging environment, is more for the programmer
than the program.
It classifies an exception in
advance as a symptom of a programmer's error; the
exception suspends execution as near as it can to
the offending operation so that the programmer can
look around to see how it happened.
Quite often
the first several exceptions turn out to be quite
unexceptionable, so the programmer ought ideally
to be able to resume execution after each one as if
execution had not been stopped.
.It
\&... Other ways lie beyond the scope of this document.
.El
.Pp
Ideally, each
elementary function should act as if it were indivisible, or
atomic, in the sense that ...
.Bl -enum
.It
No exception should be signaled that is not deserved by
the data supplied to that function.
.It
Any exception signaled should be identified with that
function rather than with one of its subroutines.
.It
The internal behavior of an atomic function should not
be disrupted when a calling program changes from
one to another of the five or so ways of handling
exceptions listed above, although the definition
of the function may be correlated intentionally
with exception handling.
.El
.Pp
The functions in
.Nm libm
are only approximately atomic.
They signal no inappropriate exception except possibly ...
.Bl -tag -width indent -offset indent -compact
.It Xo
Over/Underflow
.Xc
when a result, if properly computed, might have lain barely within range, and
.It Xo
Inexact in
.Fn cabs ,
.Fn cbrt ,
.Fn hypot ,
.Fn log10
and
.Fn pow
.Xc
when it happens to be exact, thanks to fortuitous cancellation of errors.
.El
Otherwise, ...
.Bl -tag -width indent -offset indent -compact
.It Xo
Invalid Operation is signaled only when
.Xc
any result but \*(Na would probably be misleading.
.It Xo
Overflow is signaled only when
.Xc
the exact result would be finite but beyond the overflow threshold.
.It Xo
Divide-by-Zero is signaled only when
.Xc
a function takes exactly infinite values at finite operands.
.It Xo
Underflow is signaled only when
.Xc
the exact result would be nonzero but tinier than the underflow threshold.
.It Xo
Inexact is signaled only when
.Xc
greater range or precision would be needed to represent the exact result.
.El
.Sh SEE ALSO
.Xr fenv 3 ,
.Xr ieee_test 3 ,
.Xr math 3
.Pp
An explanation of IEEE 754 and its proposed extension p854
was published in the IEEE magazine MICRO in August 1984 under
the title "A Proposed Radix- and Word-length-independent
Standard for Floating-point Arithmetic" by
.An "W. J. Cody"
et al.
The manuals for Pascal, C and BASIC on the Apple Macintosh
document the features of IEEE 754 pretty well.
Articles in the IEEE magazine COMPUTER vol.\& 14 no.\& 3 (Mar.\&
1981), and in the ACM SIGNUM Newsletter Special Issue of
Oct.\& 1979, may be helpful although they pertain to
superseded drafts of the standard.
.Sh STANDARDS
.St -ieee754