History log of /freebsd-10.0-release/lib/msun/ld80/
Revision Date Author Comments
259065 07-Dec-2013 gjb

- Copy stable/10 (r259064) to releng/10.0 as part of the
10.0-RELEASE cycle.
- Update __FreeBSD_version [1]
- Set branch name to -RC1

[1] 10.0-CURRENT __FreeBSD_version value ended at '55', so
start releng/10.0 at '100' so the branch is started with
a value ending in zero.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation

256281 10-Oct-2013 gjb

Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation


251343 03-Jun-2013 kargl

ld80 and ld128 implementations of expm1l(). This code started life
as a fairly faithful implementation of the algorithm found in

PTP Tang, "Table-driven implementation of the Expm1 function
in IEEE floating-point arithmetic," ACM Trans. Math. Soft., 18,
211-222 (1992).

Over the last 18-24 months, the code has undergone significant
optimization and testing.

Reviewed by: bde
Obtained from: bde (most of the optimizations)


251339 03-Jun-2013 kargl

ld80/s_expl.c:

* Use integral numerical constants, and let the compiler do the
conversion to long double.

ld128/s_expl.c:

* Use integral numerical constants, and let the compiler do the
conversion to long double.
* Use the ENTERI/RETURNI macros, which are no-ops on ld128. This,
however, makes the ld80 and ld128 code identical.

Reviewed by: bde (as part of larger diff)


251338 03-Jun-2013 kargl

Micro-optimization: move the unary minus operator to operate
on a literal constant.

Obtained from: bde


251335 03-Jun-2013 kargl

ld80/s_expl.c:

* In the special case x = -Inf or -NaN, use a micro-optimization
to eliminate the need to access u.xbits.man.

* Fix an off-by-one for small arguments |x| < 0x1p-65.

ld128/s_expl.c:

* In the special case x = -Inf or -NaN, use a micro-optimization
to eliminate the need to access u.xbits.manh and u.xbits.manl.

* Fix an off-by-one for small arguments |x| < 0x1p-114.

Obtained from: bde


251334 03-Jun-2013 kargl

ld80/s_expl.c:

* Update the evaluation of the polynomial. This allows the removal
of the now unused variables t23 and t45.

ld128/s_expl.c:

* Update the evaluation of the polynomial and the intermediate
result t. This update allows several numerical constants to be
written as double rather than long double constants. Update
the constants as appropriate.

Obtained from: bde


251333 03-Jun-2013 kargl

Rename a few P2, P3, ... coefficients to A2, A3, ... missed in
my previous commit.


251330 03-Jun-2013 kargl

Update a comment to reflect that we are using an endpoint of
an interval instead of a midpoint.


251328 03-Jun-2013 kargl

Add a u suffix to the IEEEl2bits unions o_threshold and u_threshold,
and use macros to access the e component of the unions. This allows
the portions of the code in ld80 to be identical to the ld128 code.

Obtained from: bde


251327 03-Jun-2013 kargl

Introduce the macro LOG2_INTERVAL, which is log2(number of intervals).
Use the macro as a micro-optimization to convert a subtraction and
division to a shift.

Obtained from: bde
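
The subtraction-and-division versus shift trade-off described above can be sketched as follows. This is only an illustration, not the committed code: the helper names split_div() and split_shift() are hypothetical, and the value 7 (128 intervals) is an assumption. The shift form also relies on the compiler implementing >> on a negative int as an arithmetic shift, which C leaves implementation-defined.

```c
#define LOG2_INTERVAL 7                    /* assumed value: log2(128) */
#define INTERVALS     (1 << LOG2_INTERVAL)

/* Hypothetical helper: split n into k and n2 with n = k * INTERVALS + n2
 * and 0 <= n2 < INTERVALS, using a subtraction and a division. */
static void
split_div(int n, int *k, int *n2)
{
	*n2 = (int)((unsigned)n % INTERVALS);
	*k = (n - *n2) / INTERVALS;        /* n - n2 is an exact multiple */
}

/* Same split, with the subtraction and division replaced by one shift.
 * Assumes >> on a negative int is an arithmetic (sign-extending) shift. */
static void
split_shift(int n, int *k, int *n2)
{
	*n2 = (int)((unsigned)n & (INTERVALS - 1));
	*k = n >> LOG2_INTERVAL;
}
```

Both helpers produce the same k and n2 for positive and negative n on compilers with arithmetic right shift; for n = -1 each gives k = -1 and n2 = 127.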


251325 03-Jun-2013 kargl

Whitespace.


251321 03-Jun-2013 kargl

* Rename the polynomial coefficients from P2, P3, ... to A2, A3, ....
The names now coincide with the name used in PTP Tang's paper.

* Rename the variable from s to tbl to better reflect that
this is a table, and to be consistent with the naming scheme
in s_exp2l.c

Reviewed by: bde (as part of larger diff)


251316 03-Jun-2013 kargl

* Style(9). Start non-Copyright fancy formatted comments with /**.

Reviewed by: bde (as part of larger diff)


251315 03-Jun-2013 kargl

ld80/s_expl.c:

* Update Copyright years to include 2013.

ld128/s_expl.c:

* Correct and update Copyright years. This code originated from
the ld80 version, so it should reflect the same time period.

Reviewed by: bde (as part of larger diff)


251292 03-Jun-2013 das

Add logl, log2l, log10l, and log1pl.

Submitted by: bde


251046 27-May-2013 kargl

Style(9)

Approved by: das (implicit)
Reported by: jh


251041 27-May-2013 kargl

* Update polynomial coefficients.
* Use ENTERI/RETURNI to allow the use of FP_PE on i386 target.

Reviewed by: das (and bde a long time ago)
Approved by: das (mentor)
Obtained from: bde (polynomial coefficients)


251024 27-May-2013 das

Fix some regressions caused by the switch from gcc to clang. The fixes
are workarounds for various symptoms of the problem described in clang
bugs 3929, 8100, 8241, 10409, and 12958.

The regression tests did their job: they failed, someone brought it
up on the mailing lists, and then the issue got ignored for 6 months.
Oops. There may still be some regressions for functions we don't have
test coverage for yet.


241516 13-Oct-2012 kargl

* Update the comment that explains the choice of values in the
table and the requirement on trailing zero bits.

* Remove the __aligned() compiler directives as these were found
to have a negative effect on the produced code.

Submitted by: bde
Approved by: das (mentor)


241051 29-Sep-2012 kargl

* src/math_private.h:
. Change the API for the LD80C by removing the explicit passing
of the sign bit. The sign can be determined from the last
parameter of the macro.
. On i386, load long double by bit manipulations to work around
at least a gcc compiler issue. On non-i386 ld80 architectures,
use a simple assignment.

* ld80/s_expl.c:
. Update the only consumer of LD80C.

Submitted by: bde
Approved by: das (mentor)


240866 23-Sep-2012 kargl

* ld80/s_expl.c:
. Fix the threshold for expl(x) where |x| is small.
. Also update the previously incorrect comment to match the
new threshold.

* ld128/s_expl.c:
. Re-order logic in exceptional cases to match the logic used in
other long double functions.
. Fix the threshold for expl(x) where |x| is small.
. Also update the previously incorrect comment to match the
new threshold.

Submitted by: bde
Approved by: das (mentor)


240865 23-Sep-2012 kargl

Fix whitespace issue.

Approved by: das (mentor, implicit)


240864 23-Sep-2012 kargl

* ld80/s_expl.c:
. Guard a comment from reformatting by indent(1).
. Re-order variables in declarations to alphabetical order.
. Remove a banal comment.

* ld128/s_expl.c:
. Add a comment to point to ld80/s_expl.c for implementation details.
. Move the #define of INTERVAL to reduce the diff with ld80/s_expl.c.
. twom10000 does not need to be volatile, so move its declaration.
. Re-order variables in declarations to alphabetical order.
. Add a comment that describes the argument reduction.
. Remove the same banal comment found in ld80/s_expl.c.

Reviewed by: bde
Approved by: das (mentor)


240861 23-Sep-2012 kargl

* Update the lookup table to use 53-bit high and low values.
Also, update the comment to describe the choice of using
a high and low decomposition of 2^(i/INTERVAL) for
0 <= i <= INTERVAL in preparation for an implementation of
expm1l.

* Move the #define of INTERVAL above the comment, because the
comment refers to INTERVAL.

Reviewed by: bde
Approved by: das (mentor)
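
The high/low decomposition mentioned above can be illustrated with a short sketch. The struct hilo type and split_hilo() helper are hypothetical names, and this is not how the committed table was generated (the real table values may, for example, be chosen with particular trailing zero bits). It only shows the general idea of carrying a long double value as two 53-bit doubles.

```c
/* Hypothetical container for a value stored as hi + lo. */
struct hilo {
	double hi;      /* leading 53 bits of the value */
	double lo;      /* remainder, rounded to at most 53 bits */
};

/* Split v so that hi + lo approximates v to roughly 106 bits. */
static struct hilo
split_hilo(long double v)
{
	struct hilo s;

	s.hi = (double)v;
	s.lo = (double)(v - (long double)s.hi);
	return (s);
}
```

With a 64-bit significand (ld80) the remainder v - hi fits in a double, so hi + lo reproduces v exactly; with wider formats the sum is a close approximation.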


238923 30-Jul-2012 kargl

Whitespace.

Submitted by: bde
Approved by: das (pre-approved)


238784 26-Jul-2012 kargl

Replace the macro name NUM with INTERVALS. This change provides
compatibility with the INTERVALS macro used in the soon-to-be-committed
expm1l() and someday-to-be-committed log*l() functions.

Add a comment into ld128/s_expl.c noting a gcc issue that was
deleted when rewriting ld80/e_expl.c as ld128/s_expl.c.

Requested by: bde
Approved by: das (mentor)


238783 26-Jul-2012 kargl

* ld80/expl.c:
. Remove a few #ifdefs that should have been removed in the initial
commit.
. Sort fpmath.h to its rightful place.

* ld128/s_expl.c:
. Replace EXPMASK with its actual value.
. Sort fpmath.h to its rightful place.

Requested by: bde
Approved by: das (mentor)


238722 23-Jul-2012 kargl

Compute the exponential of x for Intel 80-bit format and IEEE 128-bit
format. These implementations are based on

PTP Tang, "Table-driven implementation of the exponential function
in IEEE floating-point arithmetic," ACM Trans. Math. Soft., 15,
144-157 (1989).

PR: standards/152415
Submitted by: kargl
Reviewed by: bde, das
Approved by: das (mentor)
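
A rough double-precision sketch of the table-driven idea follows: write x = (k * INTERVALS + j) * ln2 / INTERVALS + r, then exp(x) = 2**k * 2**(j/INTERVALS) * exp(r). Everything here is an assumption made for illustration: the names exp_sketch(), pow2i() and tbl, the tiny 4-entry table, and the plain Taylor polynomial (Tang's paper and the committed code use carefully derived coefficients and a two-part ln2). There is no handling of NaN, Inf, overflow or underflow, and pow2i() assumes IEEE 754 binary64 doubles.

```c
#include <stdint.h>
#include <string.h>

#define LOG2_INTERVALS 2                   /* assumed: 4 table entries */
#define INTERVALS      (1 << LOG2_INTERVALS)

/* 2**(j/INTERVALS) for j = 0..3, rounded to double. */
static const double tbl[INTERVALS] = {
	1.0,
	1.189207115002721,
	1.4142135623730951,
	1.681792830507429,
};

static const double ln2 = 0.6931471805599453;

/* Build 2**k from its bit pattern; valid for -1022 <= k <= 1023. */
static double
pow2i(int k)
{
	uint64_t bits = (uint64_t)(k + 1023) << 52;
	double d;

	memcpy(&d, &bits, sizeof(d));
	return (d);
}

/* Illustrative only: moderate finite x, no exceptional cases. */
static double
exp_sketch(double x)
{
	double t = x * (INTERVALS / ln2);
	int n = (int)(t < 0 ? t - 0.5 : t + 0.5);  /* nearest integer */
	int j = (int)((unsigned)n % INTERVALS);    /* table index */
	int k = (n - j) / INTERVALS;               /* power of two */
	double r = x - n * (ln2 / INTERVALS);      /* |r| <= ln2/8 */
	/* Degree-8 Taylor polynomial for exp(r), in Horner form. */
	double p = 1.0 + r * (1.0 + r * (1.0 / 2 + r * (1.0 / 6 +
	    r * (1.0 / 24 + r * (1.0 / 120 + r * (1.0 / 720 +
	    r * (1.0 / 5040 + r * (1.0 / 40320))))))));

	return (pow2i(k) * (tbl[j] * p));
}
```

For x = 1 the reduction gives n = 6, so j = 2 and k = 1, and the result is 2 * sqrt(2) * exp(r) with r close to -0.0397.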


223262 18-Jun-2011 benl

Fix clang warnings.

Approved by: philip (mentor)


222508 30-May-2011 kargl

Clean up the unneeded cpp macro INLINE_REM_PIO2L.

Reviewed by: das
Approved by: das (mentor)


221234 29-Apr-2011 kargl

Improve the accuracy from a max ULP of ~2000 to max ULP < 0.79
on i386-class hardware for sinl and cosl. The hand-rolled argument
reduction has been replaced by e_rem_pio2l() implementations. To
preserve history the following commands have been executed:

svn cp src/e_rem_pio2.c ld80/e_rem_pio2l.h
mv ${HOME}/bde/ld80/e_rem_pio2l.c ld80/e_rem_pio2l.h

svn cp src/e_rem_pio2.c ld128/e_rem_pio2l.h
mv ${HOME}/bde/ld128/e_rem_pio2l.c ld128/e_rem_pio2l.h

The ld80 version has been tested by bde, das, and kargl over the
last few years (bde, das) and a few months (kargl). An older ld128
version was tested by das. The committed version has only been
compile tested via 'make universe'.

Approved by: das (mentor)
Obtained from: bde


181152 02-Aug-2008 das

On i386, gcc truncates long double constants to double precision
at compile time regardless of the dynamic precision, and there's
no way to disable this misfeature at compile time. Hence, it's
impossible to generate the appropriate tables of constants for the
long double inverse trig functions in a straightforward way on i386;
this change hacks around the problem by encoding the underlying bits
in the table.

Note that these functions won't pass the regression test on i386,
even with the FPU set to extended precision, because the regression
test is similarly damaged by gcc. However, the tests all pass when
compiled with a modified version of gcc.

Reported by: bde
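
Encoding the underlying bits, as described above, might look like the sketch below. The helper ld80_from_bits() and the HAVE_X87_LD80 guard are hypothetical names, not the macros used in the tree. The sketch assumes the little-endian x87 layout (64-bit significand followed by a 16-bit sign-and-exponent field); on other targets the helper is disabled and simply returns 0.

```c
#include <float.h>
#include <stdint.h>
#include <string.h>

/* Only meaningful where long double is the x87 80-bit format. */
#if LDBL_MANT_DIG == 64 && (defined(__i386__) || defined(__x86_64__))
#define HAVE_X87_LD80 1
#else
#define HAVE_X87_LD80 0
#endif

/*
 * Hypothetical helper: assemble a long double from its raw fields so
 * that no long double literal ever passes through the compiler's
 * constant handling.
 */
static long double
ld80_from_bits(uint16_t expsign, uint64_t man)
{
	long double r = 0.0L;

#if HAVE_X87_LD80
	memcpy(&r, &man, sizeof(man));
	memcpy((unsigned char *)&r + sizeof(man), &expsign,
	    sizeof(expsign));
#else
	(void)expsign;
	(void)man;
#endif
	return (r);
}
```

For example, on such a target ld80_from_bits(0x3fff, 0x8000000000000000ULL) yields 1.0L, since 0x3fff is the biased exponent of 2**0 and the significand has only its explicit integer bit set.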


181074 31-Jul-2008 das

Add implementations of acosl(), asinl(), atanl(), atan2l(),
and cargl().

Reviewed by: bde
sparc64 testing resources from: remko


176387 18-Feb-2008 bde

2 long double constants were missing L suffixes. This helped break tanl()
on !(amd64 || i386). It gave slightly worse than double precision in some
cases. tanl() now passes tests of 2^24 values on ia64.


176386 18-Feb-2008 bde

Fix a typo which broke k_tanl.c on !(amd64 || i386).


176357 17-Feb-2008 das

Add kernel functions for 80-bit long doubles. Many thanks to Steve and
Bruce for putting lots of effort into these; getting them right isn't
easy, and they went through many iterations.

Submitted by: Steve Kargl <sgk@apl.washington.edu> with revisions from bde


176231 13-Feb-2008 bde

Fix exp2*(x) on signaling NaNs by returning x+x as usual.

This has the side effect of confusing gcc-4.2.1's optimizer into more
often doing the right thing. When it does the wrong thing here, it
seems to be mainly making too many copies of x with dependency chains.
This effect is tiny on amd64, but in some cases on i386 it is enormous.
E.g., on i386 (A64) with -O1, the current version of exp2() should
take about 50 cycles, but took 83 cycles before this change and 66
cycles after this change. exp2f() with -O1 only speeded up from 51
to 47 cycles. (exp2f() should take about 40 cycles, on an Athlon in
either i386 or amd64 mode, and now takes 42 on amd64). exp2l() with
-O1 slowed down from 155 cycles to 123 for some args; this is unimportant
since the i386 exp2l() is a fake; the wrong thing for it seems to
involve branch misprediction.
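
The "return x+x" convention for exceptional arguments can be sketched as below. The function name exp2_special() is hypothetical and covers only the case where x is already known to be Inf or NaN. Adding x to itself leaves +Inf unchanged and turns any NaN into a quiet NaN, which on IEEE 754 hardware also raises the invalid exception for a signaling NaN.

```c
#include <math.h>

/*
 * Hypothetical helper for the exceptional path of an exp2-style
 * function: the caller has already determined that x is Inf or NaN.
 */
static double
exp2_special(double x)
{
	if (isinf(x) && x < 0)
		return (0.0);           /* 2**(-Inf) = 0 */
	return (x + x);                 /* +Inf stays +Inf; NaN is quieted */
}
```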


176074 07-Feb-2008 bde

Use a better method of scaling by 2**k. Instead of adding to the
exponent bits of the reduced result, construct 2**k (hopefully in
parallel with the construction of the reduced result) and multiply by
it. This tends to be much faster if the construction of 2**k is
actually in parallel, and might be faster even with no parallelism
since adjustment of the exponent requires a read-modify-write at an
unfortunate time for pipelines.

In some cases involving exp2* on amd64 (A64), this change saves about
40 cycles or 30%. I think it is inherently only about 12 cycles faster
in these cases and the rest of the speedup is from partly-accidentally
avoiding compiler pessimizations (the construction of 2**k is now
manually scheduled for good results, and -O2 doesn't always mess this
up). In most cases on amd64 (A64) and i386 (A64) the speedup is about
20 cycles. The worst case that I found is expf on ia64 where this
change is a pessimization of about 10 cycles or 5%. The manual
scheduling for plain exp[f] is harder and not as tuned.

This change to ld128/s_exp2l.c has not been tested.
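
The scaling method described in this commit, building 2**k separately and multiplying, can be sketched for double as follows. The names two_to_k() and scale_by_mul() are hypothetical, the sketch assumes IEEE 754 binary64 doubles, and it ignores the subnormal and overflow ranges that real code has to treat specially.

```c
#include <stdint.h>
#include <string.h>

/*
 * Construct 2**k directly from its bit pattern: a zero fraction and a
 * biased exponent of k + 1023. Valid for -1022 <= k <= 1023.
 */
static double
two_to_k(int k)
{
	uint64_t bits = (uint64_t)(k + 1023) << 52;
	double d;

	memcpy(&d, &bits, sizeof(d));
	return (d);
}

/*
 * Scale a reduced result r by 2**k with a multiplication. two_to_k()
 * does not depend on r, so it can be computed while r is still being
 * evaluated, instead of patching r's exponent bits afterwards.
 */
static double
scale_by_mul(double r, int k)
{
	return (r * two_to_k(k));
}
```

For instance, scale_by_mul(1.5, -3) gives 0.1875 exactly, since multiplying by a power of two only changes the exponent when the result stays in the normal range.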


175460 18-Jan-2008 das

Implement exp2l(). There is one version for machines with 80-bit
long doubles (i386, amd64, ia64) and one for machines with 128-bit
long doubles (sparc64). Other platforms use the double version.
I've only done runtime testing on i386.

Thanks to bde@ for helpful discussions and bugfixes.


174759 18-Dec-2007 das

Since nan() is supposed to work the same as strtod("nan(...)", NULL),
my original implementation made both use the same code. Unfortunately,
this meant libm depended on a vendor header at compile time and previously-
unexposed vendor bits in libc at runtime.

Hence, I just wrote my own version of the relevant vendor routine. As it
turns out, mine has a factor of 8 fewer lines of code, and is a bit more
readable anyway. The strtod() and *scanf() routines still use vendor code.

Reviewed by: bde


174684 16-Dec-2007 das

Implement and document nan(), nanf(), and nanl(). This commit
adds two new directories in msun: ld80 and ld128. These are for
long double functions specific to the 80-bit long double format
used on x86-derived architectures, and the 128-bit format used on
sparc64, respectively.