History log of /freebsd/lib/msun/src/s_tanf.c (Results 1 – 25 of 36)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# 0dd5a560 28-Jan-2024 Steve Kargl <kargl@FreeBSD.org>

lib/msun: Cleanup after $FreeBSD$ removal

Remove no longer needed explicit inclusion of sys/cdefs.h.

PR: 276669
MFC after: 1 week


Revision tags: release/14.0.0
# 1d386b48 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

Remove $FreeBSD$: one-line .c pattern

Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/


Revision tags: release/13.2.0, release/12.4.0, release/13.1.0, release/12.3.0, release/13.0.0, release/12.2.0, release/11.4.0, release/12.1.0, release/11.3.0, release/12.0.0, release/11.2.0, release/10.4.0, release/11.1.0, release/11.0.1, release/11.0.0, release/10.3.0, release/10.2.0, release/10.1.0, release/9.3.0, release/10.0.0, release/9.2.0, release/8.4.0, release/9.1.0, release/8.3.0_cvs, release/8.3.0, release/9.0.0, release/7.4.0_cvs, release/8.2.0_cvs, release/7.4.0, release/8.2.0, release/8.1.0_cvs, release/8.1.0, release/7.3.0_cvs, release/7.3.0, release/8.0.0_cvs, release/8.0.0, release/7.2.0_cvs, release/7.2.0, release/7.1.0_cvs, release/7.1.0, release/6.4.0_cvs, release/6.4.0
# e822ea5b 25-Feb-2008 Bruce Evans <bde@FreeBSD.org>

Inline __ieee754__rem_pio2f(). On amd64 (A64) and i386 (A64), this
gives an average speedup of about 12 cycles or 17% for
9pi/4 < |x| <= 2**19pi/2 and a smaller speedup for larger x, and a
small spe

Inline __ieee754__rem_pio2f(). On amd64 (A64) and i386 (A64), this
gives an average speedup of about 12 cycles or 17% for
9pi/4 < |x| <= 2**19pi/2 and a smaller speedup for larger x, and a
small speeddown for |x| <= 9pi/4 (only 1-2 cycles average, but that
is 4%).

Inlining this is less likely to bust caches than inlining the float
version since it is much smaller (about 220 bytes text and rodata) and
has many fewer branches. However, the float version was already large
due to its manual inlining of the branches and also the polynomial
evaluations.

show more ...


# 70d818a2 25-Feb-2008 Bruce Evans <bde@FreeBSD.org>

Change __ieee754_rem_pio2f() to return double instead of float so that
this function and its callers cosf(), sinf() and tanf() don't waste time
converting values from doubles to floats and back for |

Change __ieee754_rem_pio2f() to return double instead of float so that
this function and its callers cosf(), sinf() and tanf() don't waste time
converting values from doubles to floats and back for |x| > 9pi/4.
All these functions were optimized a few years ago to mostly use doubles
internally and across the __kernel*() interfaces but not across the
__ieee754_rem_pio2f() interface.

This saves about 40 cycles in cosf(), sinf() and tanf() for |x| > 9pi/4
on amd64 (A64), and about 20 cycles on i386 (A64) (except for cosf()
and sinf() in the upper range). 40 cycles is about 35% for |x| < 9pi/4
<= 2**19pi/2 and about 5% for |x| > 2**19pi/2. The saving is much
larger on amd64 than on i386 since the conversions are not easy to
optimize except on i386 where some of them are automatic and others
are optimized invalidly. amd64 is still about 10% slower in cosf()
and tanf() in the lower range due to conversion overhead.

This also gives a tiny speedup for |x| <= 9pi/4 on amd64 (by simplifying
the code). It also avoids compiler bugs and/or additional slowness
in the conversions on (not yet supported) machines where double_t !=
double.

show more ...


Revision tags: release/7.0.0_cvs, release/7.0.0
# 5aa554c7 22-Feb-2008 David Schultz <das@FreeBSD.org>

s/rcsid/__FBSDID/


Revision tags: release/6.3.0_cvs, release/6.3.0, release/6.2.0_cvs, release/6.2.0, release/5.5.0_cvs, release/5.5.0, release/6.1.0_cvs, release/6.1.0
# 960d3da0 28-Nov-2005 Bruce Evans <bde@FreeBSD.org>

Changed spelling of the request-to-inline macro name to match the change
of the function name.

Added my (non-)copyright.

In k_tanf.c, added the first set of redundant parentheses to control
groupin

Changed spelling of the request-to-inline macro name to match the change
of the function name.

Added my (non-)copyright.

In k_tanf.c, added the first set of redundant parentheses to control
grouping which was claimed to be added in the previous commit.

show more ...


# 94a5f9be 23-Nov-2005 Bruce Evans <bde@FreeBSD.org>

Use only double precision for "kernel" tanf (except for returning float).
This is a minor interface change. The function is renamed from
__kernel_tanf() to __kernel_tandf() so that misues of it will

Use only double precision for "kernel" tanf (except for returning float).
This is a minor interface change. The function is renamed from
__kernel_tanf() to __kernel_tandf() so that misues of it will cause
link errors and not crashes.

This version is a routine translation with no special optimizations
for accuracy or efficiency. It gives an unimportant increase in
accuracy, from ~0.9 ulps to 0.5285 ulps. Almost all of the error is
from the minimax polynomial (~0.03 ulps and the final rounding step
(< 0.5 ulps). It gives strange differences in efficiency in the -5
to +10% range, with -O1 fairly consistently becoming faster and -O2
slower on AXP and A64 with gcc-3.3 and gcc-3.4.

show more ...


# 4ce51209 21-Nov-2005 Bruce Evans <bde@FreeBSD.org>

Mess up the "kernel" float trig function .c files with ifdefs so that
they can be #included in other .c files to give inline functions, and
use them to inline the functions in most callers (not in e_

Mess up the "kernel" float trig function .c files with ifdefs so that
they can be #included in other .c files to give inline functions, and
use them to inline the functions in most callers (not in e_lgammaf_r.c).
__kernel_tanf() is too large and complicated for gcc to inline very well.

An athlons, this gives a speed increase under favourable pipeline
conditions of about 10% overall (larger for AXP, smaller for A64).
E.g., on AXP, sinf() on uniformly distributed args in [-2Pi, 2Pi]
now takes 30-56 cycles; it used to take 45-61 cycles; hardware fsin
takes 65-129.

show more ...


# 23f6483e 20-Nov-2005 Bruce Evans <bde@FreeBSD.org>

Restored a cleanup in rev.1.9 tthat was lost in rev.1.10.


# 8299eb7e 19-Nov-2005 Bruce Evans <bde@FreeBSD.org>

Moved all the optimizations for |x| <= 9pi/2 from
__ieee754_rem_pio2f() to its 3 callers and manually inline them.

On Athlons, with favourable compiler flags and optimizations and
favourable pipelin

Moved all the optimizations for |x| <= 9pi/2 from
__ieee754_rem_pio2f() to its 3 callers and manually inline them.

On Athlons, with favourable compiler flags and optimizations and
favourable pipeline conditions, this gives a speedup of 30-40 cycles
for cosf(), sinf() and tanf() on the range pi/4 < |x| <= 9pi/4, so
thes functions are now signifcantly faster than the hardware trig
functions in many cases. E.g., in a benchmark with uniformly distributed
x in [-2pi, 2pi], A64 hardware fcos took 72-129 cycles and cosf() took
37-55 cycles. Out-of-order execution is needed to get both of these
times. The optimizations in this commit apparently work more by
removing 1 serialization point than by reducing latency.

show more ...


# 75ff209c 17-Nov-2005 Bruce Evans <bde@FreeBSD.org>

Minor cleanups:

s_cosf.c and s_sinf.c:
Use a non-bogus magic constant for the threshold of pi/4. It was 2 ulps
smaller than pi/4 rounded down, but its value is not critical so it should
be the resu

Minor cleanups:

s_cosf.c and s_sinf.c:
Use a non-bogus magic constant for the threshold of pi/4. It was 2 ulps
smaller than pi/4 rounded down, but its value is not critical so it should
be the result of natural rounding.

s_cosf.c and s_tanf.c:
Use a literal 0.0 instead of an unnecessary variable initialized to
[(float)]0.0. Let the function prototype convert to 0.0F.

Improved wording in some comments.

Attempted to improve indentation of comments.

show more ...


Revision tags: release/6.0.0_cvs, release/6.0.0
# cb92d4d5 02-Nov-2005 Bruce Evans <bde@FreeBSD.org>

Moved the optimization for tiny x from __kernel_tan[f](x) to tan[f](x)
so that it can be faster for tiny x and avoided for reduced x.

This improves things a little differently than for cosine and si

Moved the optimization for tiny x from __kernel_tan[f](x) to tan[f](x)
so that it can be faster for tiny x and avoided for reduced x.

This improves things a little differently than for cosine and sine.
We still need to reclassify x in the "kernel" functions, but we get
an extra optimization for tiny x, and an overall optimization since
tiny reduced x rarely happens. We also get optimizations for space
and style. A large block of poorly duplicated code to fix a special
case is no longer needed. This supersedes the fixes in k_sin.c revs
1.9 and 1.11 and k_sinf.c 1.8 and 1.10.

Fixed wrong constant for the cutoff for "tiny" in tanf(). It was
2**-28, but should be almost the same as the cutoff in sinf() (2**-12).
The incorrect cutoff protected us from the bugs fixed in k_sinf.c 1.8
and 1.10, except 4 cases of reduced args passed the cutoff and needed
special handling in theory although not in practice. Now we essentially
use a cutoff of 0 for the case of reduced args, so we now have 0 special
args instead of 4.

This change makes no difference to the results for sinf() (since it
only changes the algorithm for the 4 special args and the results for
those happen not to change), but it changes lots of results for sin().
Exhaustive testing is impossible for sin(), but exhaustive testing
for sinf() (relative to a version with the old algorithm and a fixed
cutoff) shows that the changes in the error are either reductions or
from 0.5-epsilon ulps to 0.5+epsilon ulps. The new method just uses
some extra terms in approximations so it tends to give more accurate
results, and there are apparently no problems from having extra
accuracy. On amd64 with -O1, on all float args the error range in ulps
is reduced from (0.500, 0.665] to [0.335, 0.500) in 24168 cases and
increased from 0.500-epsilon to 0.500+epsilon in 24 cases. Non-
exhaustive testing by ucbtest shows no differences.

show more ...


Revision tags: release/5.4.0_cvs, release/5.4.0, release/4.11.0_cvs, release/4.11.0, release/5.3.0_cvs, release/5.3.0, release/4.10.0_cvs, release/4.10.0, release/5.2.1_cvs, release/5.2.1, release/5.2.0_cvs, release/5.2.0, release/4.9.0_cvs, release/4.9.0, release/5.1.0_cvs, release/5.1.0, release/4.8.0_cvs, release/4.8.0, release/5.0.0_cvs, release/5.0.0, release/4.7.0_cvs, release/4.6.2_cvs, release/4.6.2, release/4.6.1, release/4.6.0_cvs
# 59b19ff1 28-May-2002 Alfred Perlstein <alfred@FreeBSD.org>

Fix formatting, this is hard to explain, so I'll show one example.

- float ynf(int n, float x) /* wrapper ynf */
+float
+ynf(int n, float x) /* wrapper ynf */

This is because the __S

Fix formatting, this is hard to explain, so I'll show one example.

- float ynf(int n, float x) /* wrapper ynf */
+float
+ynf(int n, float x) /* wrapper ynf */

This is because the __STDC__ stuff was indented.

Reviewed by: md5

show more ...


# 2dcc2286 28-May-2002 Alfred Perlstein <alfred@FreeBSD.org>

Assume __STDC__, remove non-__STDC__ code.

Reviewed by: md5


Revision tags: release/4.5.0_cvs, release/4.4.0_cvs, release/4.3.0_cvs, release/4.3.0, release/4.2.0, release/4.1.1_cvs, release/4.1.0, release/3.5.0_cvs, release/4.0.0_cvs, release/3.4.0_cvs, release/3.3.0_cvs
# 7f3dea24 28-Aug-1999 Peter Wemm <peter@FreeBSD.org>

$Id$ -> $FreeBSD$


Revision tags: release/3.2.0, release/3.1.0, release/3.0.0, release/2.2.8, release/2.2.7, release/2.2.6, release/2.2.5_cvs, release/2.2.2_cvs, release/2.2.1_cvs, release/2.2.0, release/2.1.7_cvs
# 7e546392 22-Feb-1997 Peter Wemm <peter@FreeBSD.org>

Revert $FreeBSD$ to $Id$


Revision tags: release/2.1.6_cvs, release/2.1.6.1
# 1130b656 14-Jan-1997 Jordan K. Hubbard <jkh@FreeBSD.org>

Make the long-awaited change from $Id$ to $FreeBSD$

This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so

Make the long-awaited change from $Id$ to $FreeBSD$

This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.

show more ...


Revision tags: release/2.1.5_cvs, release/2.1.0_cvs, release/2.0.5_cvs
# 6c06b4e2 30-May-1995 Rodney W. Grimes <rgrimes@FreeBSD.org>

Remove trailing whitespace.


Revision tags: release/2.0
# 3a8617a8 19-Aug-1994 Jordan K. Hubbard <jkh@FreeBSD.org>

J.T. Conklin's latest version of the Sun math library.

-- Begin comments from J.T. Conklin:
The most significant improvement is the addition of "float" versions
of the math functions that take float

J.T. Conklin's latest version of the Sun math library.

-- Begin comments from J.T. Conklin:
The most significant improvement is the addition of "float" versions
of the math functions that take float arguments, return floats, and do
all operations in floating point. This doesn't help (performance)
much on the i386, but they are still nice to have.

The float versions were orginally done by Cygnus' Ian Taylor when
fdlibm was integrated into the libm we support for embedded systems.
I gave Ian a copy of my libm as a starting point since I had already
fixed a lot of bugs & problems in Sun's original code. After he was
done, I cleaned it up a bit and integrated the changes back into my
libm.
-- End comments

Reviewed by: jkh
Submitted by: jtc

show more ...


Revision tags: release/13.2.0, release/12.4.0, release/13.1.0, release/12.3.0, release/13.0.0, release/12.2.0, release/11.4.0, release/12.1.0, release/11.3.0, release/12.0.0, release/11.2.0, release/10.4.0, release/11.1.0, release/11.0.1, release/11.0.0, release/10.3.0, release/10.2.0, release/10.1.0, release/9.3.0, release/10.0.0, release/9.2.0, release/8.4.0, release/9.1.0, release/8.3.0_cvs, release/8.3.0, release/9.0.0, release/7.4.0_cvs, release/8.2.0_cvs, release/7.4.0, release/8.2.0, release/8.1.0_cvs, release/8.1.0, release/7.3.0_cvs, release/7.3.0, release/8.0.0_cvs, release/8.0.0, release/7.2.0_cvs, release/7.2.0, release/7.1.0_cvs, release/7.1.0, release/6.4.0_cvs, release/6.4.0
# e822ea5b 25-Feb-2008 Bruce Evans <bde@FreeBSD.org>

Inline __ieee754__rem_pio2f(). On amd64 (A64) and i386 (A64), this
gives an average speedup of about 12 cycles or 17% for
9pi/4 < |x| <= 2**19pi/2 and a smaller speedup for larger x, and a
small spe

Inline __ieee754__rem_pio2f(). On amd64 (A64) and i386 (A64), this
gives an average speedup of about 12 cycles or 17% for
9pi/4 < |x| <= 2**19pi/2 and a smaller speedup for larger x, and a
small speeddown for |x| <= 9pi/4 (only 1-2 cycles average, but that
is 4%).

Inlining this is less likely to bust caches than inlining the float
version since it is much smaller (about 220 bytes text and rodata) and
has many fewer branches. However, the float version was already large
due to its manual inlining of the branches and also the polynomial
evaluations.

show more ...


# 70d818a2 25-Feb-2008 Bruce Evans <bde@FreeBSD.org>

Change __ieee754_rem_pio2f() to return double instead of float so that
this function and its callers cosf(), sinf() and tanf() don't waste time
converting values from doubles to floats and back for |

Change __ieee754_rem_pio2f() to return double instead of float so that
this function and its callers cosf(), sinf() and tanf() don't waste time
converting values from doubles to floats and back for |x| > 9pi/4.
All these functions were optimized a few years ago to mostly use doubles
internally and across the __kernel*() interfaces but not across the
__ieee754_rem_pio2f() interface.

This saves about 40 cycles in cosf(), sinf() and tanf() for |x| > 9pi/4
on amd64 (A64), and about 20 cycles on i386 (A64) (except for cosf()
and sinf() in the upper range). 40 cycles is about 35% for |x| < 9pi/4
<= 2**19pi/2 and about 5% for |x| > 2**19pi/2. The saving is much
larger on amd64 than on i386 since the conversions are not easy to
optimize except on i386 where some of them are automatic and others
are optimized invalidly. amd64 is still about 10% slower in cosf()
and tanf() in the lower range due to conversion overhead.

This also gives a tiny speedup for |x| <= 9pi/4 on amd64 (by simplifying
the code). It also avoids compiler bugs and/or additional slowness
in the conversions on (not yet supported) machines where double_t !=
double.

show more ...


Revision tags: release/7.0.0_cvs, release/7.0.0
# 5aa554c7 22-Feb-2008 David Schultz <das@FreeBSD.org>

s/rcsid/__FBSDID/


Revision tags: release/6.3.0_cvs, release/6.3.0, release/6.2.0_cvs, release/6.2.0, release/5.5.0_cvs, release/5.5.0, release/6.1.0_cvs, release/6.1.0
# 960d3da0 28-Nov-2005 Bruce Evans <bde@FreeBSD.org>

Changed spelling of the request-to-inline macro name to match the change
of the function name.

Added my (non-)copyright.

In k_tanf.c, added the first set of redundant parentheses to control
groupin

Changed spelling of the request-to-inline macro name to match the change
of the function name.

Added my (non-)copyright.

In k_tanf.c, added the first set of redundant parentheses to control
grouping which was claimed to be added in the previous commit.

show more ...


# 94a5f9be 23-Nov-2005 Bruce Evans <bde@FreeBSD.org>

Use only double precision for "kernel" tanf (except for returning float).
This is a minor interface change. The function is renamed from
__kernel_tanf() to __kernel_tandf() so that misues of it will

Use only double precision for "kernel" tanf (except for returning float).
This is a minor interface change. The function is renamed from
__kernel_tanf() to __kernel_tandf() so that misues of it will cause
link errors and not crashes.

This version is a routine translation with no special optimizations
for accuracy or efficiency. It gives an unimportant increase in
accuracy, from ~0.9 ulps to 0.5285 ulps. Almost all of the error is
from the minimax polynomial (~0.03 ulps and the final rounding step
(< 0.5 ulps). It gives strange differences in efficiency in the -5
to +10% range, with -O1 fairly consistently becoming faster and -O2
slower on AXP and A64 with gcc-3.3 and gcc-3.4.

show more ...


# 4ce51209 21-Nov-2005 Bruce Evans <bde@FreeBSD.org>

Mess up the "kernel" float trig function .c files with ifdefs so that
they can be #included in other .c files to give inline functions, and
use them to inline the functions in most callers (not in e_

Mess up the "kernel" float trig function .c files with ifdefs so that
they can be #included in other .c files to give inline functions, and
use them to inline the functions in most callers (not in e_lgammaf_r.c).
__kernel_tanf() is too large and complicated for gcc to inline very well.

An athlons, this gives a speed increase under favourable pipeline
conditions of about 10% overall (larger for AXP, smaller for A64).
E.g., on AXP, sinf() on uniformly distributed args in [-2Pi, 2Pi]
now takes 30-56 cycles; it used to take 45-61 cycles; hardware fsin
takes 65-129.

show more ...


12