path: root/sysdeps/powerpc/fpu
Commit message | Author | Age | Files | Lines
* PowerPC: Set/restore rounding mode only when needed | Adhemerval Zanella | 2013-11-25 | 3 | -4/+277
  This patch improves the performance of some math functions by adding libc_fexxx variants of the inline functions that set and restore both the FPU rounding mode and the exception state, and by using them in the libc_fexxx_ctx functions. It is based on the existing fexxx family of functions for PPC with FPU.

  Summary of the performance improvements due to this patch (measured on a POWER7 machine):

  Before:
  cos(): ITERS:9.5895e+07: TOTAL:5116.03Mcy, MAX:77.6cy, MIN:49.792cy, 18744 calls/Mcy
  exp(): ITERS:2.827e+07: TOTAL:5187.15Mcy, MAX:494.018cy, MIN:38.422cy, 5450.01 calls/Mcy
  pow(): ITERS:6.1705e+07: TOTAL:5144.26Mcy, MAX:171.95cy, MIN:29.935cy, 11994.9 calls/Mcy
  sin(): ITERS:8.6898e+07: TOTAL:5117.06Mcy, MAX:83.841cy, MIN:46.582cy, 16982 calls/Mcy
  tan(): ITERS:2.9473e+07: TOTAL:5115.39Mcy, MAX:191.017cy, MIN:172.352cy, 5761.63 calls/Mcy

  After:
  cos(): ITERS:2.05265e+08: TOTAL:5111.37Mcy, MAX:78.754cy, MIN:24.196cy, 40158.5 calls/Mcy
  exp(): ITERS:3.341e+07: TOTAL:5170.84Mcy, MAX:476.317cy, MIN:15.574cy, 6461.23 calls/Mcy
  pow(): ITERS:7.6153e+07: TOTAL:5129.1Mcy, MAX:147.5cy, MIN:30.916cy, 14847.2 calls/Mcy
  sin(): ITERS:1.58816e+08: TOTAL:5115.11Mcy, MAX:1490.39cy, MIN:22.341cy, 31048.4 calls/Mcy
  tan(): ITERS:3.4964e+07: TOTAL:5114.18Mcy, MAX:177.422cy, MIN:146.115cy, 6836.68 calls/Mcy
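The idea of touching the rounding mode only when needed can be illustrated with a toy model. The sketch below is not the glibc code: it represents the rounding-mode field as a plain variable and counts writes to it, and every name in it (`fpu_round`, `fpu_writes`, `round_enter`, `round_leave`) is hypothetical. On real hardware the write would be an FPSCR update, which is the cost the patch tries to avoid.

```c
/* Toy model only: the rounding-mode field of the FPU control register
   is a plain variable, and each write to it is counted.  None of these
   names are glibc identifiers.  */
enum round_mode { ROUND_NEAREST, ROUND_ZERO, ROUND_UP, ROUND_DOWN };

static enum round_mode fpu_round = ROUND_NEAREST;
static unsigned fpu_writes;

/* Stand-in for the (expensive) control-register write.  */
static void
fpu_set_round (enum round_mode m)
{
  fpu_round = m;
  fpu_writes++;
}

/* Switch to MODE only if it differs from the current mode.
   Returns the previous mode so the caller can restore it.  */
static enum round_mode
round_enter (enum round_mode mode)
{
  enum round_mode old = fpu_round;
  if (old != mode)
    fpu_set_round (mode);
  return old;
}

/* Restore OLD only if round_enter actually changed the mode.  */
static void
round_leave (enum round_mode old, enum round_mode mode)
{
  if (old != mode)
    fpu_set_round (old);
}
```

When the caller is already in the requested mode, which is the common case for round-to-nearest, an enter/leave pair performs no writes at all; otherwise it performs exactly two.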
* PowerPC: Fix __fe_mask_env export | Adhemerval Zanella | 2013-11-13 | 1 | -1/+3
  This patch no longer exports __fe_mask_env and provides only a compatibility symbol. It fixes BZ#14143.
* PowerPC floating point little-endian [11 of 15] | Alan Modra | 2013-10-04 | 1 | -43/+44
  http://sourceware.org/ml/libc-alpha/2013-07/msg00202.html

  Another little-endian fix.

  * sysdeps/powerpc/fpu_control.h (_FPU_GETCW): Rewrite using 64-bit int/double union.
  (_FPU_SETCW): Likewise.
  * sysdeps/powerpc/fpu/tst-setcontext-fpscr.c (_GET_DI_FPSCR): Likewise.
  (_SET_DI_FPSCR, _GET_SI_FPSCR, _SET_SI_FPSCR): Likewise.
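A minimal sketch of why a 64-bit int/double union is endian-safe, assuming the status word sits in the low 32 bits of the 64-bit register image. This is not the glibc macro: reading the FPSCR itself is not shown, and the type `fpr_image` and both helper functions are hypothetical names used for illustration.

```c
#include <stdint.h>

/* Hypothetical overlay of a double with a single 64-bit integer.
   Because the integer is one 64-bit object, selecting its low word
   with a mask does not depend on the byte order of the machine,
   unlike indexing into an array of two 32-bit ints.  */
typedef union
{
  double d;
  uint64_t u;
} fpr_image;

/* Build a double from a raw 64-bit pattern (test helper).  */
static double
double_from_bits (uint64_t bits)
{
  fpr_image img;
  img.u = bits;
  return img.d;
}

/* Extract the low 32 bits of the double's bit image.  */
static uint32_t
low_word_of_double (double d)
{
  fpr_image img;
  img.d = d;
  return (uint32_t) (img.u & 0xffffffffu);
}
```

With an `int[2]` overlay the same extraction would need index 1 on big-endian and index 0 on little-endian, which is the kind of dependency the union rewrite removes.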
* PowerPC floating point little-endian [10 of 15] | Alan Modra | 2013-10-04 | 2 | -34/+32
  http://sourceware.org/ml/libc-alpha/2013-07/msg00201.html

  These two functions oddly test x+1>0 when a double x is >= 0.0, and similarly when x is negative. I don't see the point of that since the test should always be true. I also don't see any need to convert x+1 to integer rather than simply using xr+1.

  Note that the standard allows these functions to return any value when the input is outside the range of long long, but it's not too hard to prevent xr+1 overflowing so that's what I've done.

  (With rounding mode FE_UPWARD, x+1 can be a lot more than what you might naively expect, but perhaps that situation was covered by the x - xrf < 1.0 test.)

  * sysdeps/powerpc/fpu/s_llround.c (__llround): Rewrite.
  * sysdeps/powerpc/fpu/s_llroundf.c (__llroundf): Rewrite.
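The approach described above (truncate to an integer, then adjust by one without letting the adjustment overflow) might look roughly like the following. This is a sketch under stated assumptions, not the code from the patch: it assumes the input is finite and within the range of long long, and the function name `llround_sketch` is made up. The names `xr` and `xrf` follow the commit message.

```c
#include <limits.h>

/* Round half away from zero, assuming x is finite and representable
   as a long long.  Out-of-range inputs are not handled here.  */
static long long
llround_sketch (double x)
{
  long long xr = (long long) x;   /* conversion truncates toward zero */
  double xrf = (double) xr;       /* truncated value back as a double */

  if (x >= 0.0)
    {
      /* Use xr + 1 directly, guarded so it cannot overflow.  */
      if (x - xrf >= 0.5 && xr < LLONG_MAX)
        xr += 1;
    }
  else
    {
      if (xrf - x >= 0.5 && xr > LLONG_MIN)
        xr -= 1;
    }
  return xr;
}
```

The adjustment operates on the integer `xr` rather than on `x + 1`, so it is unaffected by how `x + 1` would round in a directed rounding mode.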
* PowerPC floating point little-endian [9 of 15] | Alan Modra | 2013-10-04 | 1 | -25/+29
  http://sourceware.org/ml/libc-alpha/2013-07/msg00200.html

  This works around the fact that vsx is disabled in current little-endian gcc. Also, float constants take 4 bytes in memory vs. 16 bytes for vector constants, and we don't need to write one lot of masks for double (register format) and another for float (mem format).

  * sysdeps/powerpc/fpu/s_float_bitwise.h (__float_and_test28): Don't use vector int constants.
  (__float_and_test24, __float_and8, __float_get_exp): Likewise.
* PowerPC floating point little-endian [8 of 15] | Anton Blanchard | 2013-10-04 | 14 | -40/+39
  http://sourceware.org/ml/libc-alpha/2013-07/msg00199.html

  Corrects floating-point environment code for little-endian.

  * sysdeps/powerpc/fpu/fenv_libc.h (fenv_union_t): Replace int array with long long.
  * sysdeps/powerpc/fpu/e_sqrt.c (__slow_ieee754_sqrt): Adjust.
  * sysdeps/powerpc/fpu/e_sqrtf.c (__slow_ieee754_sqrtf): Adjust.
  * sysdeps/powerpc/fpu/fclrexcpt.c (__feclearexcept): Adjust.
  * sysdeps/powerpc/fpu/fedisblxcpt.c (fedisableexcept): Adjust.
  * sysdeps/powerpc/fpu/feenablxcpt.c (feenableexcept): Adjust.
  * sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Adjust.
  * sysdeps/powerpc/fpu/feholdexcpt.c (feholdexcept): Adjust.
  * sysdeps/powerpc/fpu/fesetenv.c (__fesetenv): Adjust.
  * sysdeps/powerpc/fpu/feupdateenv.c (__feupdateenv): Adjust.
  * sysdeps/powerpc/fpu/fgetexcptflg.c (__fegetexceptflag): Adjust.
  * sysdeps/powerpc/fpu/fraiseexcpt.c (__feraiseexcept): Adjust.
  * sysdeps/powerpc/fpu/fsetexcptflg.c (__fesetexceptflag): Adjust.
  * sysdeps/powerpc/fpu/ftestexcept.c (fetestexcept): Adjust.
* PowerPC floating point little-endian [3 of 15] | Alan Modra | 2013-10-04 | 1 | -0/+3
  http://sourceware.org/ml/libc-alpha/2013-08/msg00083.html

  Further replacement of ieee854 macros and unions. These files also have some optimisations for comparison against 0.0L, infinity and nan. Since the ABI specifies that the high double of an IBM long double pair is the value rounded to double, a high double of 0.0 means the low double must also be 0.0. The ABI also says that infinity and nan are encoded in the high double, with the low double unspecified. This means that tests for 0.0L, +/-Infinity and +/-NaN need only check the high double.

  * sysdeps/ieee754/ldbl-128ibm/e_atan2l.c (__ieee754_atan2l): Rewrite all uses of ieee854 long double macros and unions. Simplify tests for long doubles that are fully specified by the high double.
  * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r): Likewise.
  * sysdeps/ieee754/ldbl-128ibm/e_ilogbl.c (__ieee754_ilogbl): Likewise. Remove dead code too.
  * sysdeps/ieee754/ldbl-128ibm/e_jnl.c (__ieee754_jnl): Likewise.
  (__ieee754_ynl): Likewise.
  * sysdeps/ieee754/ldbl-128ibm/e_log10l.c (__ieee754_log10l): Likewise.
  * sysdeps/ieee754/ldbl-128ibm/e_logl.c (__ieee754_logl): Likewise.
  * sysdeps/ieee754/ldbl-128ibm/e_powl.c (__ieee754_powl): Likewise. Remove dead code too.
  * sysdeps/ieee754/ldbl-128ibm/k_tanl.c (__kernel_tanl): Likewise.
  * sysdeps/ieee754/ldbl-128ibm/s_expm1l.c (__expm1l): Likewise.
  * sysdeps/ieee754/ldbl-128ibm/s_frexpl.c (__frexpl): Likewise.
  * sysdeps/ieee754/ldbl-128ibm/s_isinf_nsl.c (__isinf_nsl): Likewise. Simplify.
  * sysdeps/ieee754/ldbl-128ibm/s_isinfl.c (___isinfl): Likewise. Simplify.
  * sysdeps/ieee754/ldbl-128ibm/s_log1pl.c (__log1pl): Likewise.
  * sysdeps/ieee754/ldbl-128ibm/s_modfl.c (__modfl): Likewise.
  * sysdeps/ieee754/ldbl-128ibm/s_nextafterl.c (__nextafterl): Likewise. Comment on variable precision.
  * sysdeps/ieee754/ldbl-128ibm/s_nexttoward.c (__nexttoward): Likewise.
  * sysdeps/ieee754/ldbl-128ibm/s_nexttowardf.c (__nexttowardf): Likewise.
  * sysdeps/ieee754/ldbl-128ibm/s_remquol.c (__remquol): Likewise.
  * sysdeps/ieee754/ldbl-128ibm/s_scalblnl.c (__scalblnl): Likewise.
  * sysdeps/ieee754/ldbl-128ibm/s_scalbnl.c (__scalbnl): Likewise.
  * sysdeps/ieee754/ldbl-128ibm/s_tanhl.c (__tanhl): Likewise.
  * sysdeps/powerpc/fpu/libm-test-ulps: Adjust tan_towardzero ulps.
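The high-double-only tests described in this commit can be sketched on a plain pair of doubles. The struct `ibm_ldbl` and the three predicates below are hypothetical and only model the representation; the real code operates on `long double` values in the ldbl-128ibm files listed above.

```c
#include <float.h>

/* Hypothetical model of an IBM long double: hi is the value rounded
   to double, lo is the remainder.  */
struct ibm_ldbl
{
  double hi;
  double lo;
};

/* A high double of 0.0 implies the low double is 0.0 as well, so
   only hi needs to be tested.  Matches both +0.0 and -0.0.  */
static int
ibm_is_zero (struct ibm_ldbl x)
{
  return x.hi == 0.0;
}

/* Infinity is encoded in hi; lo is unspecified and is ignored.  */
static int
ibm_is_inf (struct ibm_ldbl x)
{
  return x.hi > DBL_MAX || x.hi < -DBL_MAX;
}

/* NaN is encoded in hi; a NaN never compares equal to itself.  */
static int
ibm_is_nan (struct ibm_ldbl x)
{
  return x.hi != x.hi;
}
```

None of the predicates reads `lo`, so an arbitrary value in the low double of an infinity or NaN cannot change the result.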
* Update powerpc-fpu ULPs. | Adhemerval Zanella | 2013-09-11 | 1 | -2/+63
* Update powerpc-fpu ULPs. | Adhemerval Zanella | 2013-07-04 | 1 | -22/+146
* Remove trailing whitespace. | Joseph Myers | 2013-06-05 | 6 | -14/+14
* Link extra-libs consistently with libc and ld.so. | Joseph Myers | 2013-05-31 | 1 | -3/+0
* Update powerpc libm-test ULPs. | Adhemerval Zanella | 2013-05-28 | 1 | -0/+37
* Don't include expected results in libm-test test names. | Joseph Myers | 2013-05-22 | 1 | -1875/+1875
* Handle sincos with generic libm-test logic. | Joseph Myers | 2013-05-19 | 1 | -6/+6
* PowerPC: fix hypot/hypotf check for -INF | Adhemerval Zanella | 2013-05-17 | 2 | -6/+6
* Add #include <stdint.h> for uint[32|64]_t usage (except installed headers). | Ryan S. Arnold | 2013-05-16 | 4 | -3/+4
* Update powerpc libm-test ULPs | Adhemerval Zanella | 2013-05-08 | 1 | -6/+440
* PowerPC: fix hypot/hypof FP exceptions | Adhemerval Zanella | 2013-05-06 | 2 | -16/+14
  This patch fixes spurious floating-point exceptions generated by internal operations in hypot/hypotf.
* Update powerpc libm-test ULPs | Adhemerval Zanella | 2013-05-03 | 1 | -0/+96
* Update powerpc libm-test ULPs | Adhemerval Zanella | 2013-04-30 | 1 | -2/+414
* Update powerpc libm-test ULPs | Adhemerval Zanella | 2013-04-29 | 1 | -27/+3318
* Fix e_logl (128ibm) spurious underflow | Adhemerval Zanella | 2013-03-28 | 1 | -0/+6
* PowerPC: fix libm ABI issue for llroundl | Adhemerval Zanella | 2013-03-26 | 1 | -0/+4
* PowerPC: fix sqrtl ABI issue | Adhemerval Zanella | 2013-03-21 | 1 | -0/+4
  This patch fixes a sqrtl ABI issue when building for powerpc64.
* Promote a math test for sNaN handling to the top-level. | Thomas Schwinge | 2013-03-15 | 2 | -337/+0
* Use GCC's builtins for generating NaNs. | Thomas Schwinge | 2013-03-15 | 1 | -55/+9
* Better distinguish between NaN/qNaN/sNaN. | Thomas Schwinge | 2013-03-15 | 2 | -48/+48
* PowerPC: unify math_ldbl.h implementations | Adhemerval Zanella | 2013-03-08 | 1 | -162/+9
  This patch removes redundant definitions from the PowerPC-specific math_ldbl.h and uses the definitions from the ieee754 math_ldbl.h instead.
* Use same installed powerpc headers for hard and soft float. | Joseph Myers | 2013-03-01 | 3 | -271/+0
* Remove bp-sym.h and BP_SYM uses from C code. | Joseph Myers | 2013-02-14 | 6 | -18/+12
* Adapt installed powerpc headers better for soft-float / no-FPRs. | Joseph Myers | 2013-01-17 | 3 | -19/+33
* Update powerpc ULPs | Siddhesh Poyarekar | 2013-01-09 | 1 | -36/+56
* Fix spelling errors in sysdeps/powerpc files. | Anton Blanchard | 2013-01-07 | 3 | -4/+4
* Fix warnings in test-powerpc-snan.c | Andreas Schwab | 2013-01-04 | 1 | -3/+2
* Update powerpc libm ULPs | Andreas Schwab | 2013-01-04 | 1 | -0/+5
* Update copyright notices with scripts/update-copyrights. | Joseph Myers | 2013-01-02 | 49 | -54/+49
* Update powerpc libm-test ULPs | Andreas Schwab | 2012-11-23 | 1 | -0/+41
* Make fma use of Dekker and Knuth algorithms use round-to-nearest (bug 14796). | Joseph Myers | 2012-11-03 | 1 | -1/+2
* Update powerpc libm ULPs | Andreas Schwab | 2012-10-31 | 1 | -20/+420
* Fix ctan, ctanh of subnormals in round-upwards mode (bug 14328). | Adhemerval Zanella | 2012-07-11 | 1 | -1/+274
  IBM long double fixes and POWER ulps update.
* Fix float range reduction problems (#14283) | Andreas Schwab | 2012-07-06 | 1 | -2/+2
* PowerPC: Fix for POWER7 sinf/cosf | Adhemerval Zanella | 2012-06-01 | 2 | -4/+6
  This patch fixes some sinf/cosf calculations that generated unexpected underflow exceptions.
* Sort sysdeps/powerpc/fpu/libm-test-ulps | Andreas Schwab | 2012-06-01 | 1 | -166/+163
* Don't include exceptions in libm-test-ulps test names. | Joseph Myers | 2012-05-24 | 1 | -5/+5
* PowerPC: ULPs update | Adhemerval Zanella | 2012-05-21 | 1 | -12/+46
  Adjustments for libm ulps added with commits d8b82cad1b525bdcbfff88d218c7c45032e4a3af, 495fd99f3a119e5c0c542ccc6cf9c93b1fb9e892, and 5ba3cc691c856e5c67a7d4cd4713f20a79f7ba81. I also adjusted some exp10 ulps definitions that were higher than needed.
* Update powerpc ULPs for ccos, csin, ccosh, csinh tests. | Adhemerval Zanella | 2012-05-19 | 1 | -0/+156
* Fix for ldbl-128ibm acosl/asinl inaccuracies | Adhemerval Zanella | 2012-05-04 | 1 | -0/+151
  2012-05-02 Adhemerval Zanella <azanella@linux.vnet.ibm.com>

  * sysdeps/ieee754/ldbl-128ibm/e_acosl.c (__ieee754_acosl): Fix long double comparison inaccuracies.
  * sysdeps/ieee754/ldbl-128ibm/e_asinl.c (__ieee754_asinl): Likewise.
  * sysdeps/powerpc/fpu/libm-test-ulps: Update.
* Fix ctan, ctanh overflow for ldbl-128ibm (bug 11521). | Adhemerval Zanella | 2012-04-26 | 1 | -7/+70
* Correct powerpc64 s_floorl edge cases (bug 13886). | Adhemerval Zanella | 2012-04-24 | 1 | -0/+13
  [BZ #13886] Remove powerpc64/fpu/s_floorl. Use the fully correct ldbl-128ibm/s_floorl.c.
* Update powerpc libm test ULPs | Andreas Schwab | 2012-03-26 | 1 | -2/+24