about summary refs log tree commit diff
path: root/sysdeps/generic/math_private.h
Commit message (Collapse)AuthorAgeFilesLines
* Refactor some libm type-generic macros.Joseph Myers2016-11-101-6/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch refactors some type-generic libm macros, in both math.h and math_private.h, to be based on a common __MATH_TG macro rather than all replicating similar logic to choose a function to call based on the type of the argument. This should serve to illustrate what I think float128 support for such macros should look like: common macros such as __MATH_TG may need different definitions depending on whether float128 is supported in glibc, so that the individual macros themselves do not need conditionals on float128 support. Tested for x86_64, x86, mips64 and powerpc. * math/math.h (__MATH_TG): New macro. [__USE_ISOC99] (fpclassify): Define using __MATH_TG. [__USE_ISOC99] (signbit): Likewise. [__USE_ISOC99] (isfinite): Likewise. [__USE_ISOC99] (isnan): Likewise. [__USE_ISOC99] (isinf): Likewise. [__GLIBC_USE (IEC_60559_BFP_EXT)] (issignaling): Likewise. [__GLIBC_USE (IEC_60559_BFP_EXT)] (__MATH_EVAL_FMT2): New macro. [__GLIBC_USE (IEC_60559_BFP_EXT)] (iseqsig): Define using __MATH_TG and __MATH_EVAL_FMT2. * sysdeps/generic/math_private.h (fabs_tg): Define using __MATH_TG. * sysdeps/ieee754/ldbl-128ibm/bits/iscanonical.h [!__NO_LONG_DOUBLE_MATH] (__iscanonicalf): New macro. [!__NO_LONG_DOUBLE_MATH] (__iscanonical): Likewise. [!__NO_LONG_DOUBLE_MATH] (iscanonical): Define using __MATH_TG. * sysdeps/ieee754/ldbl-96/bits/iscanonical.h (__iscanonicalf): New macro. (__iscanonical): Likewise. (iscanonical): Define using __MATH_TG.
* Remove __nan{f,,l} macrosPaul E. Murphy2016-09-201-7/+0
| | | | | | Use the GCC builtin instead. With the exception of the files built from a template, they are unused. This is preparation for making the s_nanF objects generated.
* Remove unneeded stubs for k_rem_pio2l.Paul E. Murphy2016-09-011-2/+0
| | | | | | | | | | This is only used for the float and double variants. Instead, just add it to the type specific list of files, and remove all stubs, and remove the declaration from math_private.h. I verified x86_64, i486, ia64, m68k, and ppc64 build.
* Fix clog, clog10 inaccuracy (bug 19016).Joseph Myers2015-09-281-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For arguments with X^2 + Y^2 close to 1, clog and clog10 avoid large errors from log(hypot) by computing X^2 + Y^2 - 1 in a way that avoids cancellation error and then using log1p. However, the thresholds for using that approach still result in log being used on argument as large as sqrt(13/16) > 0.9, leading to significant errors, in some cases above the 9ulp maximum allowed in glibc libm. This patch arranges for the approach using log1p to be used in any cases where |X|, |Y| < 1 and X^2 + Y^2 >= 0.5 (with the existing allowance for cases where one of X and Y is very small), adjusting the __x2y2m1 functions to work with the wider range of inputs. This way, log only gets used on arguments below sqrt(1/2) (or substantially above 1), where the error involved is much less. Tested for x86_64, x86, mips64 and powerpc. For the ulps regeneration I removed the existing clog and clog10 ulps before regenerating to allow any reduced ulps to appear. Tests added include those found by random test generation to produce large ulps either before or after the patch, and some found by trying inputs close to the (0.75, 0.5) threshold where the potential errors from using log are largest. [BZ #19016] * sysdeps/generic/math_private.h (__x2y2m1f): Update comment to allow more cases with X^2 + Y^2 >= 0.5. * sysdeps/ieee754/dbl-64/x2y2m1.c (__x2y2m1): Likewise. Add -1 as normal element in sum instead of special-casing based on values of arguments. * sysdeps/ieee754/dbl-64/x2y2m1f.c (__x2y2m1f): Update comment. * sysdeps/ieee754/ldbl-128/x2y2m1l.c (__x2y2m1l): Likewise. Add -1 as normal element in sum instead of special-casing based on values of arguments. * sysdeps/ieee754/ldbl-128ibm/x2y2m1l.c (__x2y2m1l): Likewise. * sysdeps/ieee754/ldbl-96/x2y2m1.c [FLT_EVAL_METHOD != 0] (__x2y2m1): Update comment. * sysdeps/ieee754/ldbl-96/x2y2m1l.c (__x2y2m1l): Likewise. Add -1 as normal element in sum instead of special-casing based on values of arguments. * math/s_clog.c (__clog): Handle more cases using log1p without hypot. * math/s_clog10.c (__clog10): Likewise. * math/s_clog10f.c (__clog10f): Likewise. * math/s_clog10l.c (__clog10l): Likewise. * math/s_clogf.c (__clogf): Likewise. * math/s_clogl.c (__clogl): Likewise. * math/auto-libm-test-in: Add more tests of clog and clog10. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
* Refactor code forcing underflow exceptions.Joseph Myers2015-09-231-0/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Various floating-point functions have code to force underflow exceptions if a tiny result was computed in a way that might not have resulted in such exceptions even though the result is inexact. This typically uses math_force_eval to ensure that the underflowing expression is evaluated, but sometimes uses volatile. This patch refactors such code to use three new macros math_check_force_underflow, math_check_force_underflow_nonneg and math_check_force_underflow_complex (which in turn use math_force_eval). In the limited number of cases not suited to a simple conversion to these macros, existing uses of volatile are changed to use math_force_eval instead. The converted code does not always execute exactly the same sequence of operations as the original code, but the overall effects should be the same. Tested for x86_64, x86, mips64 and powerpc. * sysdeps/generic/math_private.h (fabs_tg): New macro. (min_of_type): Likewise. (math_check_force_underflow): Likewise. (math_check_force_underflow_nonneg): Likewise. (math_check_force_underflow_complex): Likewise. * math/e_exp2l.c (__ieee754_exp2l): Use math_check_force_underflow_nonneg. * math/k_casinh.c (__kernel_casinh): Likewise. * math/k_casinhf.c (__kernel_casinhf): Likewise. * math/k_casinhl.c (__kernel_casinhl): Likewise. * math/s_catan.c (__catan): Use math_check_force_underflow_complex. * math/s_catanf.c (__catanf): Likewise. * math/s_catanh.c (__catanh): Likewise. * math/s_catanhf.c (__catanhf): Likewise. * math/s_catanhl.c (__catanhl): Likewise. * math/s_catanl.c (__catanl): Likewise. * math/s_ccosh.c (__ccosh): Likewise. * math/s_ccoshf.c (__ccoshf): Likewise. * math/s_ccoshl.c (__ccoshl): Likewise. * math/s_cexp.c (__cexp): Likewise. * math/s_cexpf.c (__cexpf): Likewise. * math/s_cexpl.c (__cexpl): Likewise. * math/s_clog.c (__clog): Use math_check_force_underflow_nonneg. * math/s_clog10.c (__clog10): Likewise. * math/s_clog10f.c (__clog10f): Likewise. * math/s_clog10l.c (__clog10l): Likewise. * math/s_clogf.c (__clogf): Likewise. * math/s_clogl.c (__clogl): Likewise. * math/s_csin.c (__csin): Use math_check_force_underflow_complex. * math/s_csinf.c (__csinf): Likewise. * math/s_csinh.c (__csinh): Likewise. * math/s_csinhf.c (__csinhf): Likewise. * math/s_csinhl.c (__csinhl): Likewise. * math/s_csinl.c (__csinl): Likewise. * math/s_csqrt.c (__csqrt): Use math_check_force_underflow. * math/s_csqrtf.c (__csqrtf): Likewise. * math/s_csqrtl.c (__csqrtl): Likewise. * math/s_ctan.c (__ctan): Use math_check_force_underflow_complex. * math/s_ctanf.c (__ctanf): Likewise. * math/s_ctanh.c (__ctanh): Likewise. * math/s_ctanhf.c (__ctanhf): Likewise. * math/s_ctanhl.c (__ctanhl): Likewise. * math/s_ctanl.c (__ctanl): Likewise. * stdlib/strtod_l.c (round_and_return): Use math_force_eval instead of volatile. * sysdeps/ieee754/dbl-64/e_asin.c (__ieee754_asin): Use math_check_force_underflow. * sysdeps/ieee754/dbl-64/e_atanh.c (__ieee754_atanh): Likewise. * sysdeps/ieee754/dbl-64/e_exp.c (__ieee754_exp): Do not use volatile when forcing underflow. * sysdeps/ieee754/dbl-64/e_exp2.c (__ieee754_exp2): Use math_check_force_underflow_nonneg. * sysdeps/ieee754/dbl-64/e_gamma_r.c (__ieee754_gamma_r): Likewise. * sysdeps/ieee754/dbl-64/e_j1.c (__ieee754_j1): Use math_check_force_underflow. * sysdeps/ieee754/dbl-64/e_jn.c (__ieee754_jn): Likewise. * sysdeps/ieee754/dbl-64/e_sinh.c (__ieee754_sinh): Likewise. * sysdeps/ieee754/dbl-64/s_asinh.c (__asinh): Likewise. * sysdeps/ieee754/dbl-64/s_atan.c (atan): Use math_check_force_underflow_nonneg. * sysdeps/ieee754/dbl-64/s_erf.c (__erf): Use math_check_force_underflow. * sysdeps/ieee754/dbl-64/s_expm1.c (__expm1): Likewise. * sysdeps/ieee754/dbl-64/s_fma.c (__fma): Use math_force_eval instead of volatile. * sysdeps/ieee754/dbl-64/s_log1p.c (__log1p): Use math_check_force_underflow. * sysdeps/ieee754/dbl-64/s_sin.c (__sin): Likewise. * sysdeps/ieee754/dbl-64/s_tan.c (tan): Use math_check_force_underflow_nonneg. * sysdeps/ieee754/dbl-64/s_tanh.c (__tanh): Use math_check_force_underflow. * sysdeps/ieee754/flt-32/e_asinf.c (__ieee754_asinf): Likewise. * sysdeps/ieee754/flt-32/e_atanhf.c (__ieee754_atanhf): Likewise. * sysdeps/ieee754/flt-32/e_exp2f.c (__ieee754_exp2f): Use math_check_force_underflow_nonneg. * sysdeps/ieee754/flt-32/e_gammaf_r.c (__ieee754_gammaf_r): Likewise. * sysdeps/ieee754/flt-32/e_j1f.c (__ieee754_j1f): Use math_check_force_underflow. * sysdeps/ieee754/flt-32/e_jnf.c (__ieee754_jnf): Likewise. * sysdeps/ieee754/flt-32/e_sinhf.c (__ieee754_sinhf): Likewise. * sysdeps/ieee754/flt-32/k_sinf.c (__kernel_sinf): Likewise. * sysdeps/ieee754/flt-32/k_tanf.c (__kernel_tanf): Likewise. * sysdeps/ieee754/flt-32/s_asinhf.c (__asinhf): Likewise. * sysdeps/ieee754/flt-32/s_atanf.c (__atanf): Likewise. * sysdeps/ieee754/flt-32/s_erff.c (__erff): Likewise. * sysdeps/ieee754/flt-32/s_expm1f.c (__expm1f): Likewise. * sysdeps/ieee754/flt-32/s_log1pf.c (__log1pf): Likewise. * sysdeps/ieee754/flt-32/s_tanhf.c (__tanhf): Likewise. * sysdeps/ieee754/ldbl-128/e_asinl.c (__ieee754_asinl): Likewise. * sysdeps/ieee754/ldbl-128/e_atanhl.c (__ieee754_atanhl): Likewise. * sysdeps/ieee754/ldbl-128/e_expl.c (__ieee754_expl): Use math_check_force_underflow_nonneg. * sysdeps/ieee754/ldbl-128/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-128/e_j1l.c (__ieee754_j1l): Use math_check_force_underflow. * sysdeps/ieee754/ldbl-128/e_jnl.c (__ieee754_jnl): Likewise. * sysdeps/ieee754/ldbl-128/e_sinhl.c (__ieee754_sinhl): Likewise. * sysdeps/ieee754/ldbl-128/k_sincosl.c (__kernel_sincosl): Likewise. * sysdeps/ieee754/ldbl-128/k_sinl.c (__kernel_sinl): Likewise. * sysdeps/ieee754/ldbl-128/k_tanl.c (__kernel_tanl): Likewise. * sysdeps/ieee754/ldbl-128/s_asinhl.c (__asinhl): Likewise. * sysdeps/ieee754/ldbl-128/s_atanl.c (__atanl): Likewise. * sysdeps/ieee754/ldbl-128/s_erfl.c (__erfl): Likewise. * sysdeps/ieee754/ldbl-128/s_expm1l.c (__expm1l): Likewise. * sysdeps/ieee754/ldbl-128/s_fmal.c (__fmal): Use math_force_eval instead of volatile. * sysdeps/ieee754/ldbl-128/s_log1pl.c (__log1pl): Use math_check_force_underflow. * sysdeps/ieee754/ldbl-128/s_tanhl.c (__tanhl): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_asinl.c (__ieee754_asinl): Use math_check_force_underflow. * sysdeps/ieee754/ldbl-128ibm/e_atanhl.c (__ieee754_atanhl): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r): Use math_check_force_underflow_nonneg. * sysdeps/ieee754/ldbl-128ibm/e_jnl.c (__ieee754_jnl): Use math_check_force_underflow. * sysdeps/ieee754/ldbl-128ibm/e_sinhl.c (__ieee754_sinhl): Likewise. * sysdeps/ieee754/ldbl-128ibm/k_sincosl.c (__kernel_sincosl): Likewise. * sysdeps/ieee754/ldbl-128ibm/k_sinl.c (__kernel_sinl): Likewise. * sysdeps/ieee754/ldbl-128ibm/k_tanl.c (__kernel_tanl): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_asinhl.c (__asinhl): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_atanl.c (__atanl): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_erfl.c (__erfl): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_tanhl.c (__tanhl): Likewise. * sysdeps/ieee754/ldbl-96/e_asinl.c (__ieee754_asinl): Likewise. * sysdeps/ieee754/ldbl-96/e_atanhl.c (__ieee754_atanhl): Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c (__ieee754_gammal_r): Use math_check_force_underflow_nonneg. * sysdeps/ieee754/ldbl-96/e_j1l.c (__ieee754_j1l): Use math_check_force_underflow. * sysdeps/ieee754/ldbl-96/e_jnl.c (__ieee754_jnl): Likewise. * sysdeps/ieee754/ldbl-96/e_sinhl.c (__ieee754_sinhl): Likewise. * sysdeps/ieee754/ldbl-96/k_sinl.c (__kernel_sinl): Likewise. * sysdeps/ieee754/ldbl-96/k_tanl.c (__kernel_tanl): Use math_check_force_underflow_nonneg. * sysdeps/ieee754/ldbl-96/s_asinhl.c (__asinhl): Use math_check_force_underflow. * sysdeps/ieee754/ldbl-96/s_erfl.c (__erfl): Likewise. * sysdeps/ieee754/ldbl-96/s_fmal.c (__fmal): Use math_force_eval instead of volatile. * sysdeps/ieee754/ldbl-96/s_tanhl.c (__tanhl): Use math_check_force_underflow.
* Avoid excess range overflowing results from cosh, sinh, lgamma (bug 18980).Joseph Myers2015-09-181-0/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Various i386 libm functions return values with excess range and precision; Wilco Dijkstra's patches to make isfinite etc. expand inline cause this pre-existing issue to result in test failures (when e.g. a result that overflows float but not long double gets counted as overflowing for some purposes but not others). This patch addresses those cases arising from functions defined in C, adding a math_narrow_eval macro that forces values to memory to eliminate excess precision if FLT_EVAL_METHOD indicates this is needed, and is a no-op otherwise. I'll convert existing uses of volatile and asm for this purpose to use the new macro later, once i386 has clean test results again (which requires fixes for .S files as well). Tested for x86_64 and x86. Committed. [BZ #18980] * sysdeps/generic/math_private.h: Include <float.h>. (math_narrow_eval): New macro. [FLT_EVAL_METHOD != 0] (excess_precision): Likewise. * sysdeps/ieee754/dbl-64/e_cosh.c (__ieee754_cosh): Use math_narrow_eval on overflowing return value. * sysdeps/ieee754/dbl-64/e_lgamma_r.c (__ieee754_lgamma_r): Likewise. * sysdeps/ieee754/dbl-64/e_sinh.c (__ieee754_sinh): Likewise. * sysdeps/ieee754/flt-32/e_coshf.c (__ieee754_coshf): Likewise. * sysdeps/ieee754/flt-32/e_lgammaf_r.c (__ieee754_lgammaf_r): Likewise. * sysdeps/ieee754/flt-32/e_sinhf.c (__ieee754_sinhf): Likewise.
* Fix lgamma (negative) inaccuracy (bug 2542, bug 2543, bug 2558).Joseph Myers2015-09-101-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The existing implementations of lgamma functions (except for the ia64 versions) use the reflection formula for negative arguments. This suffers large inaccuracy from cancellation near zeros of lgamma (near where the gamma function is +/- 1). This patch fixes this inaccuracy. For arguments above -2, there are no zeros and no large cancellation, while for sufficiently large negative arguments the zeros are so close to integers that even for integers +/- 1ulp the log(gamma(1-x)) term dominates and cancellation is not significant. Thus, it is only necessary to take special care about cancellation for arguments around a limited number of zeros. Accordingly, this patch uses precomputed tables of relevant zeros, expressed as the sum of two floating-point values. The log of the ratio of two sines can be computed accurately using log1p in cases where log would lose accuracy. The log of the ratio of two gamma(1-x) values can be computed using Stirling's approximation (the difference between two values of that approximation to lgamma being computable without computing the two values and then subtracting), with appropriate adjustments (which don't reduce accuracy too much) in cases where 1-x is too small to use Stirling's approximation directly. In the interval from -3 to -2, using the ratios of sines and of gamma(1-x) can still produce too much cancellation between those two parts of the computation (and that interval is also the worst interval for computing the ratio between gamma(1-x) values, which computation becomes more accurate, while being less critical for the final result, for larger 1-x). Because this can result in errors slightly above those accepted in glibc, this interval is instead dealt with by polynomial approximations. Separate polynomial approximations to (|gamma(x)|-1)(x-n)/(x-x0) are used for each interval of length 1/8 from -3 to -2, where n (-3 or -2) is the nearest integer to the 1/8-interval and x0 is the zero of lgamma in the relevant half-integer interval (-3 to -2.5 or -2.5 to -2). Together, the two approaches are intended to give sufficient accuracy for all negative arguments in the problem range. Outside that range, the previous implementation continues to be used. Tested for x86_64, x86, mips64 and powerpc. The mips64 and powerpc testing shows up pre-existing problems for ldbl-128 and ldbl-128ibm with large negative arguments giving spurious "invalid" exceptions (exposed by newly added tests for cases this patch doesn't affect the logic for); I'll address those problems separately. [BZ #2542] [BZ #2543] [BZ #2558] * sysdeps/ieee754/dbl-64/e_lgamma_r.c (__ieee754_lgamma_r): Call __lgamma_neg for arguments from -28.0 to -2.0. * sysdeps/ieee754/flt-32/e_lgammaf_r.c (__ieee754_lgammaf_r): Call __lgamma_negf for arguments from -15.0 to -2.0. * sysdeps/ieee754/ldbl-128/e_lgammal_r.c (__ieee754_lgammal_r): Call __lgamma_negl for arguments from -48.0 or -50.0 to -2.0. * sysdeps/ieee754/ldbl-96/e_lgammal_r.c (__ieee754_lgammal_r): Call __lgamma_negl for arguments from -33.0 to -2.0. * sysdeps/ieee754/dbl-64/lgamma_neg.c: New file. * sysdeps/ieee754/dbl-64/lgamma_product.c: Likewise. * sysdeps/ieee754/flt-32/lgamma_negf.c: Likewise. * sysdeps/ieee754/flt-32/lgamma_productf.c: Likewise. * sysdeps/ieee754/ldbl-128/lgamma_negl.c: Likewise. * sysdeps/ieee754/ldbl-128/lgamma_productl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/lgamma_negl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/lgamma_productl.c: Likewise. * sysdeps/ieee754/ldbl-96/lgamma_negl.c: Likewise. * sysdeps/ieee754/ldbl-96/lgamma_product.c: Likewise. * sysdeps/ieee754/ldbl-96/lgamma_productl.c: Likewise. * sysdeps/generic/math_private.h (__lgamma_negf): New prototype. (__lgamma_neg): Likewise. (__lgamma_negl): Likewise. (__lgamma_product): Likewise. (__lgamma_productl): Likewise. * math/Makefile (libm-calls): Add lgamma_neg and lgamma_product. * math/auto-libm-test-in: Add more tests of lgamma. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
* Fix libm feupdateenv namespace (bug 17748).Joseph Myers2015-01-071-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Concluding the fixes for C90 libm functions calling C99 fe* functions, this patch fixes the case of feupdateenv by making it a weak alias for __feupdateenv and making the affected code call __feupdateenv. Tested for x86_64 (testsuite, and that installed stripped shared libraries are unchanged by the patch). Also tested for ARM (soft-float) that the math.h linknamespace tests now pass. [BZ #17748] * include/fenv.h (__feupdateenv): Use libm_hidden_proto. * math/feupdateenv.c (__feupdateenv): Use libm_hidden_def. * sysdeps/aarch64/fpu/feupdateenv.c (feupdateenv): Rename to __feupdateenv and define as weak alias of __feupdateenv. Use libm_hidden_weak. * sysdeps/alpha/fpu/feupdateenv.c (__feupdateenv): Use libm_hidden_def. * sysdeps/arm/feupdateenv.c (feupdateenv): Rename to __feupdateenv and define as weak alias of __feupdateenv. Use libm_hidden_weak. * sysdeps/hppa/fpu/feupdateenv.c (feupdateenv): Likewise. * sysdeps/i386/fpu/feupdateenv.c (__feupdateenv): Use libm_hidden_def. * sysdeps/ia64/fpu/feupdateenv.c (feupdateenv): Rename to __feupdateenv and define as weak alias of __feupdateenv. Use libm_hidden_weak. * sysdeps/m68k/fpu/feupdateenv.c (__feupdateenv): Use libm_hidden_def. * sysdeps/mips/fpu/feupdateenv.c (feupdateenv): Rename to __feupdateenv and define as weak alias of __feupdateenv. Use libm_hidden_weak. * sysdeps/powerpc/fpu/feupdateenv.c (__feupdateenv): Use libm_hidden_def. * sysdeps/powerpc/nofpu/feupdateenv.c (__feupdateenv): Likewise. * sysdeps/powerpc/powerpc32/e500/nofpu/feupdateenv.c (__feupdateenv): Likewise. * sysdeps/s390/fpu/feupdateenv.c (feupdateenv): Rename to __feupdateenv and define as weak alias of __feupdateenv. Use libm_hidden_weak. * sysdeps/sh/sh4/fpu/feupdateenv.c (feupdateenv): Likewise. * sysdeps/sparc/fpu/feupdateenv.c (__feupdateenv): Use libm_hidden_def. * sysdeps/tile/math_private.h (__feupdateenv): New inline function. * sysdeps/x86_64/fpu/feupdateenv.c (__feupdateenv): Use libm_hidden_def. * sysdeps/generic/math_private.h (default_libc_feupdateenv): Call __feupdateenv instead of feupdateenv. (default_libc_feupdateenv_test): Likewise. (libc_feresetround_ctx): Likewise.
* Fix libm fesetround namespace (bug 17748).Joseph Myers2015-01-071-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Continuing the fixes for C90 libm functions calling C99 fe* functions, this patch fixes the case of fesetround by making it a weak alias of __fesetround and making the affected code call __fesetround. An existing __fesetround function in fenv_libc.h for powerpc is renamed to __fesetround_inline. Tested for x86_64 (testsuite, and that disassembly of installed shared libraries is unchanged by the patch). Also tested for ARM (soft-float) that fesetround failures disappear from the linknamespace test results (feupdateenv remains to be addressed to complete fixing bug 17748). [BZ #17748] * include/fenv.h (__fesetround): Declare. Use libm_hidden_proto. * math/fesetround.c (fesetround): Rename to __fesetround and define as weak alias of __fesetround. Use libm_hidden_weak. * sysdeps/aarch64/fpu/fesetround.c (fesetround): Likewise. * sysdeps/alpha/fpu/fesetround.c (fesetround): Likewise. * sysdeps/arm/fesetround.c (fesetround): Likewise. * sysdeps/hppa/fpu/fesetround.c (fesetround): Likewise. * sysdeps/i386/fpu/fesetround.c (fesetround): Likewise. * sysdeps/ia64/fpu/fesetround.c (fesetround): Likewise. * sysdeps/m68k/fpu/fesetround.c (fesetround): Likewise. * sysdeps/mips/fpu/fesetround.c (fesetround): Likewise. * sysdeps/powerpc/fpu/fenv_libc.h (__fesetround): Rename to __fesetround_inline. * sysdeps/powerpc/fpu/fenv_private.h (libc_fesetround_ppc): Call __fesetround_inline instead of __fesetround. * sysdeps/powerpc/fpu/fesetround.c (fesetround): Rename to __fesetround and define as weak alias of __fesetround. Use libm_hidden_weak. Call __fesetround_inline instead of __fesetround. * sysdeps/powerpc/nofpu/fesetround.c (fesetround): Rename to __fesetround and define as weak alias of __fesetround. Use libm_hidden_weak. * sysdeps/powerpc/powerpc32/e500/nofpu/fesetround.c (fesetround): Likewise. * sysdeps/s390/fpu/fesetround.c (fesetround): Likewise. * sysdeps/sh/sh4/fpu/fesetround.c (fesetround): Likewise. * sysdeps/sparc/fpu/fesetround.c (fesetround): Likewise. * sysdeps/tile/math_private.h (__fesetround): New inline function. * sysdeps/x86_64/fpu/fesetround.c (fesetround): Rename to __fesetround and define as weak alias of __fesetround. Use libm_hidden_weak. * sysdeps/generic/math_private.h (default_libc_fesetround): Call __fesetround instead of fesetround. (default_libc_feholdexcept_setround): Likewise. (libc_feholdsetround_ctx): Likewise. (libc_feholdsetround_noex_ctx): Likewise.
* Fix libm fesetenv namespace (bug 17748).Joseph Myers2015-01-061-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Continuing the fixes for C90 libm functions calling C99 fe* functions, this patch fixes the case of fesetenv by making it a weak alias of __fesetenv and making the affected code (including various copies of feupdateenv which also gets called from C90 functions) call __fesetenv. Tested for x86_64 (testsuite, and that disassembly of installed shared libraries is unchanged by the patch). Also tested for ARM (soft-float) that fesetenv failures disappear from the linknamespace test results (fsetround and feupdateenv remain to be addressed to complete fixing bug 17748). [BZ #17748] * include/fenv.h (__fesetenv): Use libm_hidden_proto. * math/fesetenv.c (__fesetenv): Use libm_hidden_def. * sysdeps/aarch64/fpu/fesetenv.c (fesetenv): Rename to __fesetenv and define as weak alias of __fesetenv. Use libm_hidden_weak. * sysdeps/alpha/fpu/fesetenv.c (__fesetenv): Use libm_hidden_def. * sysdeps/arm/fesetenv.c (fesetenv): Rename to __fesetenv and define as weak alias of __fesetenv. Use libm_hidden_weak. * sysdeps/hppa/fpu/fesetenv.c (fesetenv): Likewise. * sysdeps/i386/fpu/fesetenv.c (__fesetenv): Use libm_hidden_def. * sysdeps/ia64/fpu/fesetenv.c (fesetenv): Rename to __fesetenv and define as weak alias of __fesetenv. Use libm_hidden_weak. * sysdeps/m68k/fpu/fesetenv.c (__fesetenv): Use libm_hidden_def. * sysdeps/mips/fpu/fesetenv.c (fesetenv): Rename to __fesetenv and define as weak alias of __fesetenv. Use libm_hidden_weak. * sysdeps/powerpc/fpu/fesetenv.c (__fesetenv): Use libm_hidden_def. * sysdeps/powerpc/nofpu/fesetenv.c (__fesetenv): Likewise. * sysdeps/powerpc/powerpc32/e500/nofpu/fesetenv.c (__fesetenv): Likewise. * sysdeps/s390/fpu/fesetenv.c (fesetenv): Rename to __fesetenv and define as weak alias of __fesetenv. Use libm_hidden_weak. * sysdeps/sh/sh4/fpu/fesetenv.c (fesetenv): Likewise. * sysdeps/sparc/fpu/fesetenv.c (__fesetenv): Use libm_hidden_def. * sysdeps/tile/math_private.h (__fesetenv): New inline function. * sysdeps/x86_64/fpu/fesetenv.c (fesetenv): Rename to __fesetenv and define as weak alias of __fesetenv. Use libm_hidden_weak. * sysdeps/generic/math_private.h (default_libc_fesetenv): Use __fesetenv instead of fesetenv. (libc_feresetround_noex_ctx): Likewise. * sysdeps/alpha/fpu/feupdateenv.c (__feupdateenv): Likewise. * sysdeps/hppa/fpu/feupdateenv.c (feupdateenv): Likewise. * sysdeps/i386/fpu/feupdateenv.c (__feupdateenv): Likewise. * sysdeps/ia64/fpu/feupdateenv.c (feupdateenv): Likewise. * sysdeps/m68k/fpu/feupdateenv.c (__feupdateenv): Likewise. * sysdeps/mips/fpu/feupdateenv.c (feupdateenv): Likewise. * sysdeps/powerpc/nofpu/feupdateenv.c (__feupdateenv): Likewise. * sysdeps/powerpc/powerpc32/e500/nofpu/feupdateenv.c (__feupdateenv): Likewise. * sysdeps/s390/fpu/feupdateenv.c (feupdateenv): Likewise. * sysdeps/sh/sh4/fpu/feupdateenv.c (feupdateenv): Likewise. * sysdeps/sparc/fpu/feupdateenv.c (__feupdateenv): Likewise. * sysdeps/x86_64/fpu/feupdateenv.c (__feupdateenv): Likewise.
* Fix libm feholdexcept namespace (bug 17748).Joseph Myers2015-01-051-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Continuing the fixes for C90 libm functions calling C99 fe* functions, this patch fixes the case of feholdexcept by making it a weak alias of __feholdexcept and making the affected code call __feholdexcept. Tested for x86_64 (testsuite, and that disassembly of installed shared libraries is unchanged by the patch). Also tested for ARM (soft-float) that feholdexcept failures disappear from the linknamespace test failures (fesetenv, fsetround and feupdateenv remain to be addressed to complete fixing bug 17748). [BZ #17748] * include/fenv.h (__feholdexcept): Declare. Use libm_hidden_proto. * math/feholdexcpt.c (feholdexcept): Rename to __feholdexcept and define as weak alias of __feholdexcept. Use libm_hidden_weak. * sysdeps/aarch64/fpu/feholdexcpt.c (feholdexcept): Likewise. * sysdeps/alpha/fpu/feholdexcpt.c (feholdexcept): Likewise. * sysdeps/arm/feholdexcpt.c (feholdexcept): Likewise. * sysdeps/hppa/fpu/feholdexcpt.c (feholdexcept): Likewise. * sysdeps/i386/fpu/feholdexcpt.c (feholdexcept): Likewise. * sysdeps/ia64/fpu/feholdexcpt.c (feholdexcept): Likewise. * sysdeps/m68k/fpu/feholdexcpt.c (feholdexcept): Likewise. * sysdeps/mips/fpu/feholdexcpt.c (feholdexcept): Likewise. * sysdeps/powerpc/fpu/feholdexcpt.c (feholdexcept): Likewise. * sysdeps/powerpc/nofpu/feholdexcpt.c (feholdexcept): Likewise. * sysdeps/powerpc/powerpc32/e500/nofpu/feholdexcpt.c (feholdexcept): Likewise. * sysdeps/s390/fpu/feholdexcpt.c (feholdexcept): Likewise. * sysdeps/sh/sh4/fpu/feholdexcpt.c (feholdexcept): Likewise. * sysdeps/sparc/fpu/feholdexcpt.c (feholdexcept): Likewise. * sysdeps/x86_64/fpu/feholdexcpt.c (feholdexcept): Likewise. * sysdeps/generic/math_private.h (default_libc_feholdexcept): Use __feholdexcept instead of feholdexcept. (default_libc_feholdexcept_setround): Likewise.
* Fix libm fegetenv namespace (bug 17748).Joseph Myers2014-12-311-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some C90 libm functions call fegetenv via libc_feholdsetround* functions in math_private.h. This patch makes them call __fegetenv instead, making fegetenv into a weak alias for __fegetenv as needed. Tested for x86_64 (testsuite, and that disassembly of installed shared libraries is unchanged by the patch). Also tested for ARM (soft-float) that fegetenv failures disappear from the linknamespace test failures (however, similar fixes will also be needed for fegetround, feholdexcept, fesetenv, fesetround and feupdateenv before this set of namespace issues covered by bug 17748 is fully fixed and those linknamespace tests start passing). [BZ #17748] * include/fenv.h (__fegetenv): Use libm_hidden_proto. * math/fegetenv.c (__fegetenv): Use libm_hidden_def. * sysdeps/aarch64/fpu/fegetenv.c (fegetenv): Rename to __fegetenv and define as weak alias of __fegetenv. Use libm_hidden_weak. * sysdeps/alpha/fpu/fegetenv.c (__fegetenv): Use libm_hidden_def. * sysdeps/arm/fegetenv.c (fegetenv): Rename to __fegetenv and define as weak alias of __fegetenv. Use libm_hidden_weak. * sysdeps/hppa/fpu/fegetenv.c (fegetenv): Likewise. * sysdeps/i386/fpu/fegetenv.c (__fegetenv): Use libm_hidden_def. * sysdeps/ia64/fpu/fegetenv.c (fegetenv): Rename to __fegetenv and define as weak alias of __fegetenv. Use libm_hidden_weak. * sysdeps/m68k/fpu/fegetenv.c (__fegetenv): Use libm_hidden_def. * sysdeps/mips/fpu/fegetenv.c (fegetenv): Rename to __fegetenv and define as weak alias of __fegetenv. Use libm_hidden_weak. * sysdeps/powerpc/fpu/fegetenv.c (__fegetenv): Use libm_hidden_def. * sysdeps/powerpc/nofpu/fegetenv.c (__fegetenv): Likewise. * sysdeps/powerpc/powerpc32/e500/nofpu/fegetenv.c (__fegetenv): Likewise. * sysdeps/s390/fpu/fegetenv.c (fegetenv): Rename to __fegetenv and define as weak alias of __fegetenv. Use libm_hidden_weak. * sysdeps/sh/sh4/fpu/fegetenv.c (fegetenv): Likewise. * sysdeps/sparc/fpu/fegetenv.c (__fegetenv): Use libm_hidden_def. * sysdeps/tile/math_private.h (__fegetenv): New inline function. * sysdeps/x86_64/fpu/fegetenv.c (fegetenv): Rename to __fegetenv and define as weak alias of __fegetenv. Use libm_hidden_weak. * sysdeps/generic/math_private.h (libc_feholdsetround_ctx): Use __fegetenv instead of fegetenv. (libc_feholdsetround_noex_ctx): Likewise.
* Add generic HAVE_RM_CTX implementationWilco Dijkstra2014-06-231-19/+96
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds a generic implementation of HAVE_RM_CTX using standard fenv calls. As a result math functions using SET_RESTORE_ROUND* macros do not suffer from a large slowdown on targets which do not implement optimized libc_fe*_ctx inline functions. Most of the libc_fe* inline functions are now unused and could be removed in the future (there are a few math functions left which use a mixture of standard fenv calls and libc_fe* inline functions - they could be updated to use SET_RESTORE_ROUND or improved to avoid expensive fenv manipulations across just a few FP instructions). libc_feholdsetround*_noex_ctx is added to enable better optimization of SET_RESTORE_ROUND_NOEX* implementations. Performance measurements on ARM and x86 of sin() show significant gains over the current default, fairly close to a highly optimized fenv_private: ARM x86 no fenv_private : 100% 100% generic HAVE_RM_CTX : 250% 350% fenv_private (CTX) : 250% 450% 2014-06-23 Will Newton <will.newton@linaro.org> Wilco <wdijkstr@arm.com> * sysdeps/generic/math_private.h: Add generic HAVE_RM_CTX implementation. Include get-rounding-mode.h. [!HAVE_RM_CTX]: Define HAVE_RM_CTX to zero. [!libc_feholdsetround_noex_ctx]: Define libc_feholdsetround_noex_ctx. [!libc_feholdsetround_noexf_ctx]: Define libc_feholdsetround_noexf_ctx. [!libc_feholdsetround_noexl_ctx]: Define libc_feholdsetround_noexl_ctx. (libc_feholdsetround_ctx): New function. (libc_feresetround_ctx): New function. (libc_feholdsetround_noex_ctx): New function. (libc_feresetround_noex_ctx): New function.
* Revert "Fix HAVE_RM_CTX -Wundef warnings"Will Newton2014-03-171-1/+1
| | | | This reverts commit 9290130a8de3f30970f741c79dfe8459f798e05f.
* Fix HAVE_RM_CTX -Wundef warningsWill Newton2014-03-171-1/+1
| | | | | | | | | | ChangeLog: 2014-03-17 Will Newton <will.newton@linaro.org> * sysdeps/generic/math_private.h: Check whether HAVE_RM_CTX is defined with #ifdef rather than #if.
* Consolidate multiple precision sin/cos functionsSiddhesh Poyarekar2013-10-081-4/+2
|
* Set/restore rounding mode only when neededSiddhesh Poyarekar2013-06-121-14/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The most common use case of math functions is with default rounding mode, i.e. rounding to nearest. Setting and restoring rounding mode is an unnecessary overhead for this, so I've added support for a context, which does the set/restore only if the FP status needs a change. The code is written such that only x86 uses these. Other architectures should be unaffected by it, but would definitely benefit if the set/restore has as much overhead relative to the rest of the code, as the x86 bits do. Here's a summary of the performance improvement due to these improvements; I've only mentioned functions that use the set/restore and have benchmark inputs for x86_64: Before: cos(): ITERS:4.69335e+08: TOTAL:28884.6Mcy, MAX:4080.28cy, MIN:57.562cy, 16248.6 calls/Mcy exp(): ITERS:4.47604e+08: TOTAL:28796.2Mcy, MAX:207.721cy, MIN:62.385cy, 15543.9 calls/Mcy pow(): ITERS:1.63485e+08: TOTAL:28879.9Mcy, MAX:362.255cy, MIN:172.469cy, 5660.86 calls/Mcy sin(): ITERS:3.89578e+08: TOTAL:28900Mcy, MAX:704.859cy, MIN:47.583cy, 13480.2 calls/Mcy tan(): ITERS:7.0971e+07: TOTAL:28902.2Mcy, MAX:1357.79cy, MIN:388.58cy, 2455.55 calls/Mcy After: cos(): ITERS:6.0014e+08: TOTAL:28875.9Mcy, MAX:364.283cy, MIN:45.716cy, 20783.4 calls/Mcy exp(): ITERS:5.48578e+08: TOTAL:28764.9Mcy, MAX:191.617cy, MIN:51.011cy, 19071.1 calls/Mcy pow(): ITERS:1.70013e+08: TOTAL:28873.6Mcy, MAX:689.522cy, MIN:163.989cy, 5888.18 calls/Mcy sin(): ITERS:4.64079e+08: TOTAL:28891.5Mcy, MAX:6959.3cy, MIN:36.189cy, 16062.8 calls/Mcy tan(): ITERS:7.2354e+07: TOTAL:28898.9Mcy, MAX:1295.57cy, MIN:380.698cy, 2503.7 calls/Mcy So the improvements are: cos: 27.9089% exp: 22.6919% pow: 4.01564% sin: 19.1585% tan: 1.96086% The downside of the change is that it will have an adverse performance impact on non-default rounding modes, but I think the tradeoff is justified.
* Skip modifying exception mask and flags in SET_RESTORE_ROUND_53BITSiddhesh Poyarekar2013-06-051-6/+6
| | | | | We only need to set/restore rounding mode to ensure correct computation for non-default rounding modes.
* Improve tgamma accuracy (bugs 2546, 2560, 5159, 15426).Joseph Myers2013-05-081-0/+12
|
* Make fma use of Dekker and Knuth algorithms use round-to-nearest (bug 14796).Joseph Myers2012-11-031-0/+16
|
* Fix inaccuracy of clog, clog10 near |z| = 1 (bug 13629).Joseph Myers2012-09-251-0/+7
|
* Fix ilogb exception and errno (bug 6794)Adhemerval Zanella2012-04-171-0/+3
| | | | | | | | | | | | | | | | | | | | [BZ #6794] Following Joseph comments about bug 6794, here is a proposed fix. It turned out to be a large fix mainly because I had to move some file along to follow libm files/names conventions. Basically I have added wrappers (w_ilogb.c, w_ilogbf.c, w_ilogbl.c) that now calls the symbol '__ieee754_ilogb'. The wrappers checks for '__ieee754_ilogb' output and set the errno and raise exceptions as expected. The '__ieee754_ilogb' is implemented in sysdeps. I have moved the 's_ilogb[f|l]' files to e_ilogb[f|l] and renamed the '__ilogb[f|l]' to '__ieee754_ilogb[f|l]'. I also found out a bug in i386 and x86-64 assembly coded ilogb implementation where it raises a FE_DIVBYZERO when argument is '0.0'. I corrected this issue as well. Finally I added the errno and FE_INVALID tests for 0.0, NaN and +-InF argument. Tested on i386, x86-64, ppc32 and ppc64.
* Avoid overflows from long double functions using __kernel_standard.Joseph Myers2012-03-281-0/+1
|
* Create and use libc_feupdateenv_test.Richard Henderson2012-03-191-0/+18
| | | | | We can reduce the number of STMXCSR, and often we can avoid the call to __feraiseexcept.
* Create and use SET_RESTORE_ROUND{,_NOEX,_53BIT}{,F,L}.Richard Henderson2012-03-191-0/+69
|
* Convert libc_feholdexcept et al from macros to inline functions.Richard Henderson2012-03-191-21/+78
|
* Make inline __isnan, __isinf_ns, __finite generic.Richard Henderson2012-03-191-13/+26
| | | | | | For code generation to stay identical on x86_64, this requires that we define the fp word manipulation macros before including the generic header.
* Implement and use libc_feholdexcept_setround_53bit and libc_feupdateenv_53bitAndreas Jaeger2012-03-141-0/+5
| | | | so that double arithmetic in s_sin is done in 53 bit (without extend i386 double precision)
* Move math/math_private.h to sysdeps/generic/math_private.h.Richard Henderson2012-03-091-0/+392
This reverts commit 60d6f5a6f50d838bcb4240fcc0223cac445c6c83.