path: root/math
Commit message | Author | Age | Files | Lines
* Fix math.h, tgmath.h XSI POSIX namespace (gamma, isnan, scalb) (bug 18967). | Joseph Myers | 2015-09-15 | 3 | -4/+14

    math.h incorrectly declares various functions for XSI POSIX 2001 and 2008 editions. gamma was removed in the 2001 edition but is still declared, along with gammaf and gammal which were never standard functions. isnan is still declared as a function, along with isnanf and isnanl which were never standard functions, although in 2001 the function was replaced by the type-generic macro. scalbf and scalbl are declared although never standard, and scalb was removed in the 2008 edition but is still declared. The scalb type-generic macro in tgmath.h shouldn't be present for any POSIX version, since POSIX never had such a type-generic macro.

    This patch disables all those declarations in the relevant cases (as a minimal fix, it leaves them enabled for __USE_MISC). For the matter of declaring scalb but not scalbf or scalbl for the 2001 edition, a new macro __MATH_DECLARING_DOUBLE is added, defined by math.h around includes of bits/mathcalls.h, for bits/mathcalls.h to use to test which type's functions are being declared.

    Tested for x86_64 and x86 (testsuite, and that installed stripped shared libraries are unchanged by the patch).

    [BZ #18967]
    * math/math.h (__MATH_DECLARING_DOUBLE): New macro. Define and undefine around includes of <bits/mathcalls.h>.
    * math/bits/mathcalls.h [!__USE_MISC && __USE_XOPEN2K] (isnan): Do not declare function. [!__USE_MISC && __USE_XOPEN2K] (gamma): Likewise. [!__USE_MISC && (!__MATH_DECLARING_DOUBLE || __USE_XOPEN2K8)] (scalb): Likewise.
    * math/tgmath.h [!__USE_MISC && __USE_XOPEN_EXTENDED] (scalb): Do not define macro.
    * conform/Makefile (test-xfail-XOPEN2K/math.h/conform): Remove variable. (test-xfail-XOPEN2K/tgmath.h/conform): Likewise. (test-xfail-XOPEN2K8/math.h/conform): Likewise. (test-xfail-XOPEN2K8/tgmath.h/conform): Likewise.
* Mark fegetround pure (bug 16296). | Joseph Myers | 2015-09-15 | 1 | -1/+1

    Bug 16296 notes that fegetround is a pure function and should be marked as such in fenv.h. This patch implements that.

    Tested for x86_64 and x86 (testsuite, and that installed stripped shared libraries are unchanged by this patch).

    [BZ #16296]
    * math/fenv.h (fegetround): Use __attribute_pure__.
    * include/fenv.h (__fegetround): Likewise.
* Fix ctan, ctanh missing underflows (bug 18595). | Joseph Myers | 2015-09-15 | 8 | -210/+908

    Similar to various other bugs in this area, ctan and ctanh can fail to raise the underflow exception for some cases of results that are tiny and inexact. This patch forces the exception in a similar way to previous fixes.

    Tested for x86_64 and x86.

    [BZ #18595]
    * math/s_ctan.c (__ctan): Force underflow exception for results whose real or imaginary part has small absolute value.
    * math/s_ctanf.c (__ctanf): Likewise.
    * math/s_ctanh.c (__ctanh): Likewise.
    * math/s_ctanhf.c (__ctanhf): Likewise.
    * math/s_ctanhl.c (__ctanhl): Likewise.
    * math/s_ctanl.c (__ctanl): Likewise.
    * math/auto-libm-test-in: Do not allow missing underflow for ctan and ctanh. Add more tests of ctan and ctanh.
* Fix i386 exp10 missing underflows (bug 18966). | Joseph Myers | 2015-09-15 | 2 | -0/+166

    On i386, the double version of exp10 can miss underflow exceptions if the result is in the subnormal range for double but the last 11 bits of the 64-bit extended-precision mantissa happen to be zero. This patch forces the exception in a similar way to previous fixes. As with the exp2 and exp fixes, the exp10f changes may in fact not be needed to ensure underflow exceptions, but are included for consistency and to fix the exp10 part of bug 18875 by ensuring that excess range and precision is removed from underflowing return values.

    Tested for x86_64 and x86.

    [BZ #18875]
    [BZ #18966]
    * sysdeps/i386/fpu/e_exp10.S (dbl_min): New object. (MO): New macro. (__ieee754_exp10): For small results, force underflow exception and remove excess range and precision from return value.
    * sysdeps/i386/fpu/e_exp10f.S (flt_min): New object. (MO): New macro. (__ieee754_exp10f): For small results, force underflow exception and remove excess range and precision from return value.
    * math/auto-libm-test-in: Add more tests of exp10.
    * math/auto-libm-test-out: Regenerated.
* Fix i386 exp missing underflows (bug 18961). | Joseph Myers | 2015-09-14 | 2 | -0/+594

    On i386, the double version of exp can miss underflow exceptions if the result is in the subnormal range for double but the last 11 bits of the 64-bit extended-precision mantissa happen to be zero. This patch forces the exception in a similar way to previous fixes. As with the exp2 fixes, the expf changes may in fact not be needed to ensure underflow exceptions, but are included for consistency and to fix the exp part of bug 18875 by ensuring that excess range and precision is removed from underflowing return values.

    Tested for x86_64 and x86.

    [BZ #18875]
    [BZ #18961]
    * sysdeps/i386/fpu/e_exp.S (dbl_min): New object. (MO): New macro. (__ieee754_exp): For small results, force underflow exception and remove excess range and precision from return value. (__exp_finite): Likewise.
    * sysdeps/i386/fpu/e_expf.S (flt_min): New object. (MO): New macro. (__ieee754_expf): For small results, force underflow exception and remove excess range and precision from return value. (__expf_finite): Likewise.
    * math/auto-libm-test-in: Add more tests of exp.
    * math/auto-libm-test-out: Regenerated.
* Fix exp2 missing underflows (bug 16521). | Joseph Myers | 2015-09-14 | 3 | -2/+608

    Various exp2 implementations in glibc can miss underflow exceptions when the scaling down part of the calculation is exact (or, in the x86 case, when the conversion from extended precision to the target precision is exact). This patch forces the exception in a similar way to previous fixes.

    The x86 exp2f changes may in fact not be needed for this purpose - it's likely to be the case that no argument of type float has an exp2 result so close to an exact subnormal float value that it equals that value when rounded to 64 bits (even taking account of variation between different x86 implementations). However, they are included for consistency with the changes to exp2 and so as to fix the exp2f part of bug 18875 by ensuring that excess range and precision is removed from underflowing return values.

    Tested for x86_64, x86 and mips64.

    [BZ #16521]
    [BZ #18875]
    * math/e_exp2l.c (__ieee754_exp2l): Force underflow exception for small results.
    * sysdeps/i386/fpu/e_exp2.S (dbl_min): New object. (MO): New macro. (__ieee754_exp2): For small results, force underflow exception and remove excess range and precision from return value.
    * sysdeps/i386/fpu/e_exp2f.S (flt_min): New object. (MO): New macro. (__ieee754_exp2f): For small results, force underflow exception and remove excess range and precision from return value.
    * sysdeps/i386/fpu/e_exp2l.S (ldbl_min): New object. (MO): New macro. (__ieee754_exp2l): Force underflow exception for small results.
    * sysdeps/ieee754/dbl-64/e_exp2.c (__ieee754_exp2): Likewise.
    * sysdeps/ieee754/flt-32/e_exp2f.c (__ieee754_exp2f): Likewise.
    * sysdeps/x86_64/fpu/e_exp2l.S (ldbl_min): New object. (MO): New macro. (__ieee754_exp2l): Force underflow exception for small results.
    * math/auto-libm-test-in: Add more tests of exp2.
    * math/auto-libm-test-out: Regenerated.
* Add more random libm test inputs (mainly for ldbl-128). | Joseph Myers | 2015-09-12 | 2 | -0/+3448

    This patch adds more libm test inputs found through random test generation to increase previously known ulps. This particular test generation was run for mips64, so most of the increased ulps are for ldbl-128 (float and double having been fairly well covered by such testing for x86_64), but there's the odd ulps increase for other formats.

    Tested for x86_64, x86 and mips64.

    * math/auto-libm-test-in: Add more tests of acos, acosh, asin, asinh, atan, atan2, atanh, cabs, carg, cos, csqrt, erfc, exp, exp10, exp2, log, log1p, log2, pow, sin, sincos, sinh, tan and tanh.
    * math/auto-libm-test-out: Regenerated.
    * sysdeps/i386/fpu/libm-test-ulps: Update.
    * sysdeps/mips/mips32/libm-test-ulps: Likewise.
    * sysdeps/mips/mips64/libm-test-ulps: Likewise.
    * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
* Fix ldbl-128/ldbl-128ibm lgamma spurious "invalid", incorrect signgam (bug 18952). | Joseph Myers | 2015-09-11 | 2 | -0/+45

    The ldbl-128 / ldbl-128ibm implementation of lgammal converts (the floor of minus) non-integer negative arguments to int to determine the value of signgam. When those values are outside the range of int, this produces spurious "invalid" exceptions and incorrect values of signgam. This patch fixes this by instead determining signgam through comparing half the integer in question to floor of half the integer.

    Tested for mips64, x86_64 and x86.

    [BZ #18952]
    * sysdeps/ieee754/ldbl-128/e_lgammal_r.c (__ieee754_lgammal_r): Do not convert non-integer negative arguments to int to determine the value of signgam.
    * math/auto-libm-test-in: Add more tests of lgamma.
    * math/auto-libm-test-out: Regenerated.
* Add more randomly-generated libm tests. | Joseph Myers | 2015-09-11 | 2 | -0/+584

    This patch adds more libm test inputs found through random test generation to increase observed ulps on x86_64.

    Tested for x86_64 and x86.

    * math/auto-libm-test-in: Add more tests of acosh, atanh, cbrt, cosh, csqrt, erfc, expm1 and lgamma.
    * math/auto-libm-test-out: Regenerated.
    * sysdeps/i386/fpu/libm-test-ulps: Update.
    * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
* Fix lgamma (negative) inaccuracy (bug 2542, bug 2543, bug 2558). | Joseph Myers | 2015-09-10 | 3 | -27/+21756

    The existing implementations of lgamma functions (except for the ia64 versions) use the reflection formula for negative arguments. This suffers large inaccuracy from cancellation near zeros of lgamma (near where the gamma function is +/- 1).

    This patch fixes this inaccuracy. For arguments above -2, there are no zeros and no large cancellation, while for sufficiently large negative arguments the zeros are so close to integers that even for integers +/- 1ulp the log(gamma(1-x)) term dominates and cancellation is not significant. Thus, it is only necessary to take special care about cancellation for arguments around a limited number of zeros.

    Accordingly, this patch uses precomputed tables of relevant zeros, expressed as the sum of two floating-point values. The log of the ratio of two sines can be computed accurately using log1p in cases where log would lose accuracy. The log of the ratio of two gamma(1-x) values can be computed using Stirling's approximation (the difference between two values of that approximation to lgamma being computable without computing the two values and then subtracting), with appropriate adjustments (which don't reduce accuracy too much) in cases where 1-x is too small to use Stirling's approximation directly.

    In the interval from -3 to -2, using the ratios of sines and of gamma(1-x) can still produce too much cancellation between those two parts of the computation (and that interval is also the worst interval for computing the ratio between gamma(1-x) values, which computation becomes more accurate, while being less critical for the final result, for larger 1-x). Because this can result in errors slightly above those accepted in glibc, this interval is instead dealt with by polynomial approximations. Separate polynomial approximations to (|gamma(x)|-1)(x-n)/(x-x0) are used for each interval of length 1/8 from -3 to -2, where n (-3 or -2) is the nearest integer to the 1/8-interval and x0 is the zero of lgamma in the relevant half-integer interval (-3 to -2.5 or -2.5 to -2).

    Together, the two approaches are intended to give sufficient accuracy for all negative arguments in the problem range. Outside that range, the previous implementation continues to be used.

    Tested for x86_64, x86, mips64 and powerpc. The mips64 and powerpc testing shows up pre-existing problems for ldbl-128 and ldbl-128ibm with large negative arguments giving spurious "invalid" exceptions (exposed by newly added tests for cases this patch doesn't affect the logic for); I'll address those problems separately.

    [BZ #2542]
    [BZ #2543]
    [BZ #2558]
    * sysdeps/ieee754/dbl-64/e_lgamma_r.c (__ieee754_lgamma_r): Call __lgamma_neg for arguments from -28.0 to -2.0.
    * sysdeps/ieee754/flt-32/e_lgammaf_r.c (__ieee754_lgammaf_r): Call __lgamma_negf for arguments from -15.0 to -2.0.
    * sysdeps/ieee754/ldbl-128/e_lgammal_r.c (__ieee754_lgammal_r): Call __lgamma_negl for arguments from -48.0 or -50.0 to -2.0.
    * sysdeps/ieee754/ldbl-96/e_lgammal_r.c (__ieee754_lgammal_r): Call __lgamma_negl for arguments from -33.0 to -2.0.
    * sysdeps/ieee754/dbl-64/lgamma_neg.c: New file.
    * sysdeps/ieee754/dbl-64/lgamma_product.c: Likewise.
    * sysdeps/ieee754/flt-32/lgamma_negf.c: Likewise.
    * sysdeps/ieee754/flt-32/lgamma_productf.c: Likewise.
    * sysdeps/ieee754/ldbl-128/lgamma_negl.c: Likewise.
    * sysdeps/ieee754/ldbl-128/lgamma_productl.c: Likewise.
    * sysdeps/ieee754/ldbl-128ibm/lgamma_negl.c: Likewise.
    * sysdeps/ieee754/ldbl-128ibm/lgamma_productl.c: Likewise.
    * sysdeps/ieee754/ldbl-96/lgamma_negl.c: Likewise.
    * sysdeps/ieee754/ldbl-96/lgamma_product.c: Likewise.
    * sysdeps/ieee754/ldbl-96/lgamma_productl.c: Likewise.
    * sysdeps/generic/math_private.h (__lgamma_negf): New prototype. (__lgamma_neg): Likewise. (__lgamma_negl): Likewise. (__lgamma_product): Likewise. (__lgamma_productl): Likewise.
    * math/Makefile (libm-calls): Add lgamma_neg and lgamma_product.
    * math/auto-libm-test-in: Add more tests of lgamma.
    * math/auto-libm-test-out: Regenerated.
    * sysdeps/i386/fpu/libm-test-ulps: Update.
    * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
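The decomposition into a ratio of sines and a ratio of gamma(1-x) values follows from the reflection formula. The derivation below is a reading of the commit message rather than a transcription of lgamma_neg.c, so the exact form used in the code may differ. Here x_0 denotes a zero of lgamma near the argument x.

```latex
% Reflection formula, for non-integer x < 0 (where \Gamma(1-x) > 0):
\log|\Gamma(x)| = \log\pi - \log|\sin(\pi x)| - \log\Gamma(1-x)

% At a zero x_0 of lgamma, \log|\Gamma(x_0)| = 0. Subtracting that identity:
\log|\Gamma(x)| = \log\frac{|\sin(\pi x_0)|}{|\sin(\pi x)|}
                + \log\frac{\Gamma(1-x_0)}{\Gamma(1-x)}

% Near x_0 the sine ratio is close to 1, so it is evaluated with log1p:
\log\frac{|\sin(\pi x_0)|}{|\sin(\pi x)|}
  = \operatorname{log1p}\!\left(
      \frac{|\sin(\pi x_0)| - |\sin(\pi x)|}{|\sin(\pi x)|}\right)
```

Both terms are small near x_0, which is why each has to be computed as a difference directly instead of as two large logarithms that are then subtracted.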
* Don't use -Wno-uninitialized in math/. | Joseph Myers | 2015-08-20 | 1 | -3/+0

    The uninitialized variable warnings in math/ having been fixed for all the supported floating-point formats, this patch removes the use of -Wno-uninitialized there, continuing with the goal of avoiding -Wno- options in makefiles as far as possible.

    Tested for x86_64 and x86 (full build and testsuite runs), and for powerpc and mips64 (verified that glibc builds without errors).

    * math/Makefile (CFLAGS): Don't add -Wno-uninitialized.
* Fix csqrt missing underflows (bug 18370). | Joseph Myers | 2015-08-19 | 5 | -0/+394

    The csqrt implementations in glibc can miss underflow exceptions when the real or imaginary part of the result becomes tiny in the course of scaling down (in particular, multiplication by 0.5) and that scaling is exact although the relevant part of the mathematical result isn't. This patch forces the exception in a similar way to previous fixes.

    Tested for x86_64 and x86.

    [BZ #18370]
    * math/s_csqrt.c (__csqrt): Force underflow exception for results whose real or imaginary part has small absolute value.
    * math/s_csqrtf.c (__csqrtf): Likewise.
    * math/s_csqrtl.c (__csqrtl): Likewise.
    * math/auto-libm-test-in: Add more tests of csqrt.
    * math/auto-libm-test-out: Regenerated.
    * sysdeps/i386/fpu/libm-test-ulps: Update.
* Fix csqrt spurious underflows (bug 18823). | Joseph Myers | 2015-08-17 | 5 | -9/+1850

    The csqrt functions scale up small arguments to avoid underflows when calling hypot functions. However, even when hypot does not underflow, a subsequent calculation of 0.5 * hypot can underflow. This patch duly increases the threshold and scale factor to avoid such underflows as well.

    Tested for x86_64, x86 and mips64.

    [BZ #18823]
    * math/s_csqrt.c (__csqrt): Increase threshold and scale factor for scaling up small arguments.
    * math/s_csqrtf.c (__csqrtf): Likewise.
    * math/s_csqrtl.c (__csqrtl): Likewise.
    * math/auto-libm-test-in: Add more tests of csqrt.
    * math/auto-libm-test-out: Regenerated.
* Fix fma spurious underflows (bug 18824). | Joseph Myers | 2015-08-14 | 2 | -0/+274

    Various fma implementations have logic that, when computing fma (x, y, z) where z is large (so care needs taking to avoid internal overflow) but x * y is small, scale x * y up instead of down to avoid internal underflows resulting from scaling down. (In these cases, x * y is small enough that only its sign actually matters rather than the exact value.)

    The threshold for scaling up instead of down was correct for "if the unscaled values were multiplied, the low part of the multiplication could underflow", and the scaling was sufficient to ensure that the low part of the multiplication did not underflow (given that cases of very small x * y - less than half the least subnormal - were previously dealt with). However, the choice in the functions wasn't between scaling up or no scaling, but between scaling up and scaling down (scaling down actually being needed when x * y isn't so small compared to z and so the exact value does matter). Thus a larger threshold is needed to ensure that scaling down doesn't produce values the multiplication of whose low parts underflows. This patch increases the thresholds accordingly.

    Tested for x86_64, x86 and mips64 (with the MIPS version of s_fmal.c removed so that the ldbl-128 version gets tested instead of the soft-fp one).

    [BZ #18824]
    * sysdeps/ieee754/dbl-64/s_fma.c (__fma): Increase threshold for scaling x * y up instead of down.
    * sysdeps/ieee754/ldbl-128/s_fmal.c (__fmal): Likewise.
    * sysdeps/ieee754/ldbl-96/s_fmal.c (__fmal): Likewise.
    * math/auto-libm-test-in: Add more tests of fma.
    * math/auto-libm-test-out: Regenerated.
* Add more random libm-test inputs. | Joseph Myers | 2015-08-13 | 2 | -0/+2198

    This patch adds more test inputs to various libm functions found through random generation to have larger ulps errors than previously listed in libm-test-ulps, on at least one of x86_64 and x86.

    Tested for x86_64 and x86.

    * math/auto-libm-test-in: Add more tests of acos, acosh, asin, asinh, atan, atan2, atanh, cabs, cbrt, cosh, csqrt, erf, erfc, exp, exp2, lgamma, log, log1p, log2, pow, sin, sincos, tan, tanh and tgamma.
    * math/auto-libm-test-out: Regenerated.
    * sysdeps/i386/fpu/libm-test-ulps: Update.
    * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
* Fix tanh missing underflows (bug 16520). | Joseph Myers | 2015-08-13 | 2 | -0/+320

    Similar to various other bugs in this area, some tanh implementations do not raise the underflow exception for subnormal arguments, when the result is tiny and inexact. This patch forces the exception in a similar way to previous fixes.

    Tested for x86_64, x86, mips64 and powerpc.

    [BZ #16520]
    * sysdeps/ieee754/dbl-64/s_tanh.c: Include <float.h>. (__tanh): Force underflow exception for arguments with small absolute value.
    * sysdeps/ieee754/flt-32/s_tanhf.c: Include <float.h>. (__tanhf): Force underflow exception for arguments with small absolute value.
    * sysdeps/ieee754/ldbl-128/s_tanhl.c: Include <float.h>. (__tanhl): Force underflow exception for arguments with small absolute value.
    * sysdeps/ieee754/ldbl-128ibm/s_tanhl.c: Include <float.h>. (__tanhl): Force underflow exception for arguments with small absolute value.
    * sysdeps/ieee754/ldbl-96/s_tanhl.c: Include <float.h>. (__tanhl): Force underflow exception for arguments with small absolute value.
    * math/auto-libm-test-in: Add more tests of tanh.
    * math/auto-libm-test-out: Regenerated.
    * sysdeps/i386/fpu/libm-test-ulps: Update.
* Update libmvec multiarch functions for <cpu-features.h> | H.J. Lu | 2015-08-13 | 1 | -1/+1

    This patch updates libmvec multiarch functions to use the newly defined HAS_CPU_FEATURE, HAS_ARCH_FEATURE and LOAD_RTLD_GLOBAL_RO_RDX from <cpu-features.h>.

    * math/Makefile ($(addprefix $(objpfx), $(libm-vec-tests))): Remove $(objpfx)init-arch.o.
    * sysdeps/x86_64/fpu/Makefile (libmvec-support): Remove init-arch.
    * sysdeps/x86_64/fpu/math-tests-arch.h (avx_usable): Removed. (INIT_ARCH_EXT): Defined as empty. (CHECK_ARCH_EXT): Replace HAS_XXX with HAS_ARCH_FEATURE (XXX).
    * sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core.S: Remove __init_cpu_features call. Replace HAS_XXX with HAS_CPU_FEATURE/HAS_ARCH_FEATURE (XXX).
    * sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_d_log2_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_d_log4_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_cosf4_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_cosf8_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_expf4_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_expf8_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_logf16_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_logf4_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_logf8_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_powf4_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_powf8_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf4_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_sinf4_core.S: Likewise.
    * sysdeps/x86_64/fpu/multiarch/svml_s_sinf8_core.S: Likewise.
* Add more tests of various libm functions. | Joseph Myers | 2015-08-11 | 2 | -0/+1607

    This patch adds more tests of various libm functions found through random test generation to give increased ulps on 32-bit x86.

    Tested for x86_64 and x86.

    * math/auto-libm-test-in: Add more tests of acosh, asin, asinh, atanh, cabs, carg, cbrt, cosh, csqrt, erf, erfc, exp, exp10, expm1, hypot, log, log10, log1p, log2, pow, sinh, tan and tgamma.
    * math/auto-libm-test-out: Regenerated.
    * sysdeps/i386/fpu/libm-test-ulps: Update.
    * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
* Fix ldbl-128ibm tanhl inaccuracy (bug 18790). | Joseph Myers | 2015-08-10 | 2 | -0/+936

    ldbl-128ibm tanhl uses a too-small threshold to decide when to return +/-1, resulting in large errors. This patch changes it to a more appropriate threshold (the requirement is for 2*exp(-2|x|) to be small in terms of ulps of 1).

    Tested for x86_64, x86 and powerpc.

    [BZ #18790]
    * sysdeps/ieee754/ldbl-128ibm/s_tanhl.c (__tanhl): Increase threshold for returning +/- 1.
    * math/auto-libm-test-in: Add more tests of tanh.
    * math/auto-libm-test-out: Regenerated.
    * sysdeps/i386/fpu/libm-test-ulps: Update.
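The threshold requirement can be made concrete with a little arithmetic: for large |x|, tanh(|x|) is approximately 1 - 2*exp(-2|x|), so returning exactly 1 is safe once that correction is well below an ulp. The C sketch below computes one such bound for a format with a p-bit significand. The function name and the choice of a quarter ulp as the cut-off are assumptions made for illustration, and the value glibc actually uses for ldbl-128ibm is not taken from this formula.

```c
#include <math.h>

/* Hypothetical helper: a value of |x| beyond which 2*exp(-2|x|) is
   below 2^-(p+2), a quarter of the spacing (2^-p) between p-bit
   floating-point values just below 1.  */
static double
tanh_saturation_threshold (int p)
{
  /* 2*exp(-2x) < 2^-(p+2)  <=>  x > (p + 3) * ln(2) / 2  */
  return 0.5 * (p + 3) * log (2.0);
}
```

With p = 53 (double) this gives about 19.4, and with p = 106 (the nominal ldbl-128ibm precision) about 37.8, which shows why a threshold tuned for a narrower format is too small for a wider one.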
* Fix ldbl-128ibm sinhl inaccuracy near 0 (bug 18789). | Joseph Myers | 2015-08-10 | 2 | -0/+772

    ldbl-128ibm sinhl uses a too-big threshold to decide when to return the argument, resulting in large errors. This patch fixes it to use a more appropriate threshold.

    Tested for x86_64, x86 and powerpc.

    [BZ #18789]
    * sysdeps/ieee754/ldbl-128ibm/e_sinhl.c (__ieee754_sinhl): Use smaller threshold for returning the argument.
    * math/auto-libm-test-in: Add more tests of sinh.
    * math/auto-libm-test-out: Regenerated.
    * sysdeps/i386/fpu/libm-test-ulps: Update.
* Fix tan missing underflows (bug 16517). | Joseph Myers | 2015-08-07 | 2 | -0/+320

    Similar to various other bugs in this area, some tan implementations do not raise the underflow exception for subnormal arguments, when the result is tiny and inexact. This patch forces the exception in a similar way to previous fixes.

    Tested for x86_64, x86, mips64 and powerpc.

    [BZ #16517]
    * sysdeps/ieee754/dbl-64/s_tan.c: Include <float.h>. (tan): Force underflow exception for arguments with small absolute value.
    * sysdeps/ieee754/flt-32/k_tanf.c: Include <float.h>. (__kernel_tanf): Force underflow exception for arguments with small absolute value.
    * sysdeps/ieee754/ldbl-128/k_tanl.c: Include <float.h>. (__kernel_tanl): Force underflow exception for arguments with small absolute value.
    * sysdeps/ieee754/ldbl-128ibm/k_tanl.c: Include <float.h>. (__kernel_tanl): Force underflow exception for arguments with small absolute value.
    * sysdeps/ieee754/ldbl-96/k_tanl.c: Include <float.h>. (__kernel_tanl): Force underflow exception for arguments with small absolute value.
    * math/auto-libm-test-in: Add more tests of tan.
    * math/auto-libm-test-out: Regenerated.
* Fix sysdeps/i386/fpu/s_scalbn.S build | Samuel Thibault | 2015-08-07 | 1 | -0/+2

    * math/Versions (libc: GLIBC_2_22): New (empty) version set.
* Fix sinh missing underflows (bug 16519). | Joseph Myers | 2015-08-06 | 2 | -0/+320

    Similar to various other bugs in this area, some sinh implementations do not raise the underflow exception for subnormal arguments, when the result is tiny and inexact. This patch forces the exception in a similar way to previous fixes.

    Tested for x86_64, x86, mips64 and powerpc.

    [BZ #16519]
    * sysdeps/ieee754/dbl-64/e_sinh.c: Include <float.h>. (__ieee754_sinh): Force underflow exception for arguments with small absolute value.
    * sysdeps/ieee754/flt-32/e_sinhf.c: Include <float.h>. (__ieee754_sinhf): Force underflow exception for arguments with small absolute value.
    * sysdeps/ieee754/ldbl-128/e_sinhl.c: Include <float.h>. (__ieee754_sinhl): Force underflow exception for arguments with small absolute value.
    * sysdeps/ieee754/ldbl-128ibm/e_sinhl.c: Include <float.h>. (__ieee754_sinhl): Force underflow exception for arguments with small absolute value.
    * sysdeps/ieee754/ldbl-96/e_sinhl.c: Include <float.h>. (__ieee754_sinhl): Force underflow exception for arguments with small absolute value.
    * math/auto-libm-test-in: Add more tests of sinh.
    * math/auto-libm-test-out: Regenerated.
    * sysdeps/i386/fpu/libm-test-ulps: Update.
* Fix powf (close to -1, large) (bug 18647). | Joseph Myers | 2015-08-05 | 2 | -0/+5651

    The flt-32 implementation of powf wrongly uses x-1 instead of |x|-1 when computing log (x) for the case where |x| is close to 1 and y is large. This patch fixes the logic accordingly. Relevant tests existed for x close to 1, and corresponding tests are added for x close to -1, as well as for some new variant cases.

    Tested for x86_64 and x86.

    [BZ #18647]
    * sysdeps/ieee754/flt-32/e_powf.c (__ieee754_powf): For large y and |x| close to 1, use absolute value of x when computing log.
    * math/auto-libm-test-in: Add more tests of pow.
    * math/auto-libm-test-out: Regenerated.
* Prevent runtime fail of SSE vector math tests on non SSE4.1 machine. | Andrew Senkevich | 2015-07-30 | 1 | -2/+0

    [BZ #18740]
    * sysdeps/x86_64/fpu/Makefile (double-vlen2-arch-ext-cflags, float-vlen4-arch-ext-cflags): Removed.
    * math/Makefile (CFLAGS-test-double-vlen2-wrappers.c, CFLAGS-test-float-vlen4-wrappers.c): Likewise.
* Modify several tests to use test-skeleton.c | Arjun Shankar | 2015-07-15 | 3 | -6/+15

    These tests were skipped by the use-test-skeleton conversion done in commit 29955b5d because they were reused in other tests via the #include directive, and so deemed worth an inspection before they were modified. This has now been done.

    ChangeLog:

    2015-07-09 Arjun Shankar <arjun.is@lostca.se>

    * elf/tst-leaks1.c (main): Converted to ... (do_test): ... this. (TEST_FUNCTION): New macro. Include test-skeleton.c.
    * localedata/tst-langinfo.c (main): Converted to ... (do_test): ... this. (TEST_FUNCTION): New macro. Include test-skeleton.c.
    * math/test-fpucw.c (main): Converted to ... (do_test): ... this. (TEST_FUNCTION): New macro. Include test-skeleton.c.
    * math/test-tgmath.c (main): Converted to ... (do_test): ... this. (TEST_FUNCTION): New macro. Include test-skeleton.c.
    * math/test-tgmath2.c (main): Converted to ... (do_test): ... this. (TEST_FUNCTION): New macro. Include test-skeleton.c.
    * setjmp/tst-setjmp.c (main): Converted to ... (do_test): ... this. (TEST_FUNCTION): New macro. Include test-skeleton.c.
    * stdio-common/tst-sscanf.c (main): Converted to ... (do_test): ... this. (TEST_FUNCTION): New macro. Include test-skeleton.c.
    * sysdeps/x86_64/tst-audit6.c (main): Converted to ... (do_test): ... this. (TEST_FUNCTION): New macro. Include test-skeleton.c.
* math/test-fenvinline: avoid compiler warning | Chris Metcalf | 2015-07-10 | 1 | -2/+2

    On tile (and any other machine with no FP exceptions) the feenable_test() function will generate a "function defined but not used" warning because all of the callers are commented out. We already were ifdef'ing out the body of the function, so instead just ifdef out the entire function if FE_ALL_EXCEPT == 0.
* Improve tgamma accuracy (bug 18613). | Joseph Myers | 2015-06-29 | 4 | -112/+174

    In non-default rounding modes, tgamma can be slightly less accurate than permitted by glibc's accuracy goals. Part of the problem is error accumulation, addressed in this patch by setting round-to-nearest for internal computations. However, there was also a bug in the code dealing with computing pow (x + n, x + n) where x + n is not exactly representable, providing another source of error even in round-to-nearest mode; it was necessary to address both bugs to get errors for all testcases within glibc's accuracy goals. Given this second fix, accuracy in round-to-nearest mode is also improved (hence regeneration of ulps for tgamma should be from scratch - truncate libm-test-ulps or at least remove existing tgamma entries - so that the expected ulps can be reduced).

    Some additional complications also arose. Certain tgamma tests should strictly, according to IEEE semantics, overflow or not depending on the rounding mode; this is beyond the scope of glibc's accuracy goals for any function without exactly-determined results, but gen-auto-libm-tests doesn't handle being lax there as it does for underflow. (libm-test.inc also doesn't handle being lax about whether the result in cases very close to the overflow threshold is infinity or a finite value close to overflow, but that doesn't cause problems in this case though I've seen it cause problems with random test generation for some functions.) Thus, spurious-overflow markings, with a comment, are added to auto-libm-test-in (no bug in Bugzilla because the issue is with the testsuite, not a user-visible bug in glibc). And on x86, after the patch I saw ERANGE issues as previously reported by Carlos (see my commentary in <https://sourceware.org/ml/libc-alpha/2015-01/msg00485.html>), which needed addressing by ensuring excess range and precision were eliminated at various points if FLT_EVAL_METHOD != 0. I also noticed and fixed a cosmetic issue where 1.0f was used in long double functions and should have been 1.0L.

    This completes the move of all functions to testing in all rounding modes with ALL_RM_TEST, so gen-libm-have-vector-test.sh is updated to remove the workaround for some functions not using ALL_RM_TEST.

    Tested for x86_64, x86, mips64 and powerpc.

    [BZ #18613]
    * sysdeps/ieee754/dbl-64/e_gamma_r.c (gamma_positive): Take log of X_ADJ not X when adjusting exponent. (__ieee754_gamma_r): Do intermediate computations in round-to-nearest then adjust overflowing and underflowing results as needed.
    * sysdeps/ieee754/flt-32/e_gammaf_r.c (gammaf_positive): Take log of X_ADJ not X when adjusting exponent. (__ieee754_gammaf_r): Do intermediate computations in round-to-nearest then adjust overflowing and underflowing results as needed.
    * sysdeps/ieee754/ldbl-128/e_gammal_r.c (gammal_positive): Take log of X_ADJ not X when adjusting exponent. (__ieee754_gammal_r): Do intermediate computations in round-to-nearest then adjust overflowing and underflowing results as needed. Use 1.0L not 1.0f as numerator of division.
    * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (gammal_positive): Take log of X_ADJ not X when adjusting exponent. (__ieee754_gammal_r): Do intermediate computations in round-to-nearest then adjust overflowing and underflowing results as needed. Use 1.0L not 1.0f as numerator of division.
    * sysdeps/ieee754/ldbl-96/e_gammal_r.c (gammal_positive): Take log of X_ADJ not X when adjusting exponent. (__ieee754_gammal_r): Do intermediate computations in round-to-nearest then adjust overflowing and underflowing results as needed. Use 1.0L not 1.0f as numerator of division.
    * math/libm-test.inc (tgamma_test_data): Remove one test. Moved to auto-libm-test-in. (tgamma_test): Use ALL_RM_TEST.
    * math/auto-libm-test-in: Add one test of tgamma. Mark some other tests of tgamma with spurious-overflow.
    * math/auto-libm-test-out: Regenerated.
    * math/gen-libm-have-vector-test.sh: Do not check for START.
    * sysdeps/i386/fpu/libm-test-ulps: Update.
    * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
* Fix ldbl-128 j1l spurious underflows (bug 18612).  Joseph Myers  2015-06-29  2  -0/+178
    The ldbl-128 implementation of j1l produces spurious underflow exceptions for some small arguments, as a result of squaring the argument. This patch fixes it by using just a linear approximation for sufficiently small arguments, and then forcing an underflow exception only in the cases where it is required.

    Tested for mips64.

    [BZ #18612]
    * sysdeps/ieee754/ldbl-128/e_j1l.c (__ieee754_j1l): For small arguments, just return 0.5 times the argument, with underflow forced as needed.
    * math/auto-libm-test-in: Add more tests of j1.
    * math/auto-libm-test-out: Regenerated.
* Fix j1, jn missing underflows (bug 16559).  Joseph Myers  2015-06-29  2  -0/+964
    Similar to various other bugs in this area, j1 and jn implementations can fail to raise the underflow exception when the internal computation is exact although the actual function is inexact. This patch forces the exception in a similar way to other such fixes. (The ldbl-128 / ldbl-128ibm j1l implementation is different and doesn't need a change for this until spurious underflows in it are fixed.)

    Tested for x86_64, x86, mips64 and powerpc.

    [BZ #16559]
    * sysdeps/ieee754/dbl-64/e_j1.c: Include <float.h>.
        (__ieee754_j1): Force underflow exception for small results.
    * sysdeps/ieee754/dbl-64/e_jn.c (__ieee754_jn): Likewise.
    * sysdeps/ieee754/flt-32/e_j1f.c: Include <float.h>.
        (__ieee754_j1f): Force underflow exception for small results.
    * sysdeps/ieee754/flt-32/e_jnf.c (__ieee754_jnf): Likewise.
    * sysdeps/ieee754/ldbl-128/e_jnl.c (__ieee754_jnl): Likewise.
    * sysdeps/ieee754/ldbl-128ibm/e_jnl.c (__ieee754_jnl): Likewise.
    * sysdeps/ieee754/ldbl-96/e_j1l.c: Include <float.h>.
        (__ieee754_j1l): Force underflow exception for small results.
    * sysdeps/ieee754/ldbl-96/e_jnl.c (__ieee754_jnl): Likewise.
    * math/auto-libm-test-in: Add more tests of j1 and jn.
    * math/auto-libm-test-out: Regenerated.
* Use round-to-nearest internally in jn, test with ALL_RM_TEST (bug 18602).  Joseph Myers  2015-06-25  1  -3/+1
    Some existing jn tests, if run in non-default rounding modes, produce errors above those accepted in glibc, which causes problems for moving tests of jn to use ALL_RM_TEST. This patch makes jn set rounding to-nearest internally, as was done for yn some time ago, then computes the appropriate underflowing value for results that underflowed to zero in to-nearest, and moves the tests to ALL_RM_TEST. It does nothing about the general inaccuracy of Bessel function implementations in glibc, though it should make jn more accurate on average in non-default rounding modes through reduced error accumulation. The recomputation of results that underflowed to zero should as a side-effect fix some cases of bug 16559, where jn just used an exact zero, but that is *not* the goal of this patch and other cases of that bug remain unfixed.

    (Most of the changes in the patch are reindentation to add new scopes for SET_RESTORE_ROUND*.)

    Tested for x86_64, x86, powerpc and mips64.

    [BZ #16559] [BZ #18602]
    * sysdeps/ieee754/dbl-64/e_jn.c (__ieee754_jn): Set round-to-nearest internally then recompute results that underflowed to zero in the original rounding mode.
    * sysdeps/ieee754/flt-32/e_jnf.c (__ieee754_jnf): Likewise.
    * sysdeps/ieee754/ldbl-128/e_jnl.c (__ieee754_jnl): Likewise.
    * sysdeps/ieee754/ldbl-128ibm/e_jnl.c (__ieee754_jnl): Likewise.
    * sysdeps/ieee754/ldbl-96/e_jnl.c (__ieee754_jnl): Likewise.
    * math/libm-test.inc (jn_test): Use ALL_RM_TEST.
    * sysdeps/i386/fpu/libm-test-ulps: Update.
    * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
* Refactor libm tests.  Joseph Myers  2015-06-24  20  -485/+269
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch refactors the libm tests using libm-test.inc to reduce the level of duplicate definitions. New headers are created for the definitions shared by tests for a particular type; by tests of inline functions; by tests of non-inline functions; by scalar tests; and by vector tests. The unused MATHCONST macro is removed. A new macro VEC_LEN is added to the vector headers to allow the macros defining wrappers for vector functions to be defined once, instead of six times each (differing only in vector length) as before. There is still scope for further refactoring, but this seems a useful start. Tested for x86_64. * math/test-double.h: New file. * math/test-float.h: Likewise. * math/test-ldouble.h: Likewise. * math/test-math-inline.h: Likewise. * math/test-math-no-inline.h: Likewise. * math/test-math-scalar.h: Likewise. * math/test-math-vector.h: Likewise. * math/test-vec-loop.h: Remove file. Contents moved into test-math-vector.h. * math/libm-test.inc (MATHCONST): Do not document macro. * math/test-double.c: Include test-double.h, test-math-no-inline.h and test-math-scalar.h. (FUNC): Remove macro. (FUNC_TEST): Likewise. (FLOAT): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_DOUBLE): Likewise. (TEST_MATHVEC): Likewise. (__NO_MATH_INLINES): Likewise. * math/test-float.c: Include test-float.h, test-math-no-inline.h and test-math-scalar.h. (FUNC): Remove macro. (FUNC_TEST): Likewise. (FLOAT): Likewise. (MATHCONST): Likewise. 
(PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_FLOAT): Likewise. (TEST_MATHVEC): Likewise. (__NO_MATH_INLINES): Likewise. * math/test-idouble.c: Include test-double.h, test-math-inline.h and test-math-scalar.h. (FUNC): Remove macro. (FUNC_TEST): Likewise. (FLOAT): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_DOUBLE): Likewise. (TEST_MATHVEC): Likewise. (TEST_INLINE): Likewise. (__NO_MATH_INLINES): Likewise. * math/test-ifloat.c: Include test-float.h, test-math-inline.h and test-math-scalar.h. (FUNC): Remove macro. (FUNC_TEST): Likewise. (FLOAT): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_FLOAT): Likewise. (TEST_MATHVEC): Likewise. (TEST_INLINE): Likewise. (__NO_MATH_INLINES): Likewise. * math/test-ildoubl.c: Include test-ldouble.h, test-math-inline.h and test-math-scalar.h. (FUNC): Remove macro. (FUNC_TEST): Likewise. (FLOAT): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_LDOUBLE): Likewise. (TEST_MATHVEC): Likewise. (TEST_INLINE): Likewise. (__NO_MATH_INLINES): Likewise. * math/test-ldouble.c: Include test-ldouble.h, test-math-no-inline.h and test-math-scalar.h. (FUNC): Remove macro. (FUNC_TEST): Likewise. (FLOAT): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_LDOUBLE): Likewise. (TEST_MATHVEC): Likewise. (__NO_MATH_INLINES): Likewise. * math/test-double-vlen2.h: Include test-double.h, test-math-no-inline.h and test-math-vector.h. (FLOAT): Remove macro. (FUNC): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_DOUBLE): Likewise. (TEST_MATHVEC): Likewise. (__NO_MATH_INLINES): Likewise. (CNCT): Likewise. (CONCAT): Likewise. (WRAPPER_NAME): Likewise. (WRAPPER_DECL): Likewise. 
(WRAPPER_DECL_ff): Likewise. (WRAPPER_DECL_fFF): Likewise. (VECTOR_WRAPPER): Likewise. (VECTOR_WRAPPER_ff): Likewise. (VECTOR_WRAPPER_fFF): Likewise. (VEC_LEN): New macro. * math/test-double-vlen4.h: Include test-double.h, test-math-no-inline.h and test-math-vector.h. (FLOAT): Remove macro. (FUNC): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_DOUBLE): Likewise. (TEST_MATHVEC): Likewise. (__NO_MATH_INLINES): Likewise. (CNCT): Likewise. (CONCAT): Likewise. (WRAPPER_NAME): Likewise. (WRAPPER_DECL): Likewise. (WRAPPER_DECL_ff): Likewise. (WRAPPER_DECL_fFF): Likewise. (VECTOR_WRAPPER): Likewise. (VECTOR_WRAPPER_ff): Likewise. (VECTOR_WRAPPER_fFF): Likewise. (VEC_LEN): New macro. * math/test-double-vlen8.h: Include test-double.h, test-math-no-inline.h and test-math-vector.h. (FLOAT): Remove macro. (FUNC): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_DOUBLE): Likewise. (TEST_MATHVEC): Likewise. (__NO_MATH_INLINES): Likewise. (CNCT): Likewise. (CONCAT): Likewise. (WRAPPER_NAME): Likewise. (WRAPPER_DECL): Likewise. (WRAPPER_DECL_ff): Likewise. (WRAPPER_DECL_fFF): Likewise. (VECTOR_WRAPPER): Likewise. (VECTOR_WRAPPER_ff): Likewise. (VECTOR_WRAPPER_fFF): Likewise. (VEC_LEN): New macro. * math/test-float-vlen4.h: Include test-float.h, test-math-no-inline.h and test-math-vector.h. (FLOAT): Remove macro. (FUNC): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_FLOAT): Likewise. (TEST_MATHVEC): Likewise. (__NO_MATH_INLINES): Likewise. (CNCT): Likewise. (CONCAT): Likewise. (WRAPPER_NAME): Likewise. (WRAPPER_DECL): Likewise. (WRAPPER_DECL_ff): Likewise. (WRAPPER_DECL_fFF): Likewise. (VECTOR_WRAPPER): Likewise. (VECTOR_WRAPPER_ff): Likewise. (VECTOR_WRAPPER_fFF): Likewise. (VEC_LEN): New macro. 
* math/test-float-vlen8.h: Include test-float.h, test-math-no-inline.h and test-math-vector.h. (FLOAT): Remove macro. (FUNC): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_FLOAT): Likewise. (TEST_MATHVEC): Likewise. (__NO_MATH_INLINES): Likewise. (CNCT): Likewise. (CONCAT): Likewise. (WRAPPER_NAME): Likewise. (WRAPPER_DECL): Likewise. (WRAPPER_DECL_ff): Likewise. (WRAPPER_DECL_fFF): Likewise. (VECTOR_WRAPPER): Likewise. (VECTOR_WRAPPER_ff): Likewise. (VECTOR_WRAPPER_fFF): Likewise. (VEC_LEN): New macro. * math/test-float-vlen16.h: Include test-float.h, test-math-no-inline.h and test-math-vector.h. (FLOAT): Remove macro. (FUNC): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_FLOAT): Likewise. (TEST_MATHVEC): Likewise. (__NO_MATH_INLINES): Likewise. (CNCT): Likewise. (CONCAT): Likewise. (WRAPPER_NAME): Likewise. (WRAPPER_DECL): Likewise. (WRAPPER_DECL_ff): Likewise. (WRAPPER_DECL_fFF): Likewise. (VECTOR_WRAPPER): Likewise. (VECTOR_WRAPPER_ff): Likewise. (VECTOR_WRAPPER_fFF): Likewise. (VEC_LEN): New macro. * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Do not include test-vec-loop.h. * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Likewise.
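The benefit of a VEC_LEN macro, as described in the refactoring message above, is that a wrapper macro can be written once and reused for every vector length. The following is a simplified sketch, not the contents of test-math-vector.h: the struct-based `vec_t`, the function `vec_twice` and the wrapper name `twice_wrapper` are hypothetical stand-ins for real SIMD types and libmvec functions.

```c
/* Settings that a per-length header would provide. */
#define FLOAT double
#define VEC_LEN 4

/* Hypothetical stand-in for a SIMD vector type with VEC_LEN lanes. */
typedef struct { FLOAT lane[VEC_LEN]; } vec_t;

/* Hypothetical "vector" function: doubles every lane. */
static vec_t vec_twice(vec_t v)
{
    for (int i = 0; i < VEC_LEN; i++)
        v.lane[i] = 2 * v.lane[i];
    return v;
}

/* Defines scalar_func, which broadcasts its argument to all lanes,
   calls vector_func once, and returns lane 0 of the result.  Only
   VEC_LEN changes between vector lengths, so this definition does
   not need to be repeated per length. */
#define VECTOR_WRAPPER(scalar_func, vector_func)    \
FLOAT scalar_func(FLOAT x)                          \
{                                                   \
    vec_t in;                                       \
    for (int i = 0; i < VEC_LEN; i++)               \
        in.lane[i] = x;                             \
    vec_t out = vector_func(in);                    \
    return out.lane[0];                             \
}

VECTOR_WRAPPER(twice_wrapper, vec_twice)
```

A scalar test harness can then call `twice_wrapper` exactly as it would call an ordinary scalar function.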
* Fix cexp, ccos, ccosh, csin, csinh spurious underflows (bug 18594).  Joseph Myers  2015-06-24  21  -37/+2280
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cexp, ccos, ccosh, csin and csinh have spurious underflows in cases where they compute sin of the smallest normal, that produces an underflow exception (depending on which sin implementation is in use) but the final result does not underflow. ctan and ctanh may also have such underflows, or they may be latent (the issue there is that e.g. ctan (DBL_MIN) should, rounded upwards, be the next double value above DBL_MIN, which under glibc's accuracy goals may not have an underflow exception, but the intermediate computation of sin (DBL_MIN) would legitimately underflow on before-rounding architectures). This patch fixes all those functions so they use plain comparisons (> DBL_MIN etc.) instead of comparing the result of fpclassify with FP_SUBNORMAL (in all these cases, we already know the number being compared is finite). Note that in the case of csin / csinf / csinl, there is no need for fabs calls in the comparison because the real part has already been reduced to its absolute value. As the patch fixes the failures that previously obstructed moving tests of cexp to use ALL_RM_TEST, those tests are moved to ALL_RM_TEST by the patch (two functions remain yet to be converted). Tested for x86_64 and x86 and ulps updated accordingly. [BZ #18594] * math/s_ccosh.c (__ccosh): Compare with least normal value instead of comparing class with FP_SUBNORMAL. * math/s_ccoshf.c (__ccoshf): Likewise. * math/s_ccoshl.c (__ccoshl): Likewise. * math/s_cexp.c (__cexp): Likewise. * math/s_cexpf.c (__cexpf): Likewise. * math/s_cexpl.c (__cexpl): Likewise. * math/s_csin.c (__csin): Likewise. * math/s_csinf.c (__csinf): Likewise. * math/s_csinh.c (__csinh): Likewise. * math/s_csinhf.c (__csinhf): Likewise. * math/s_csinhl.c (__csinhl): Likewise. * math/s_csinl.c (__csinl): Likewise. * math/s_ctan.c (__ctan): Likewise. * math/s_ctanf.c (__ctanf): Likewise. 
* math/s_ctanh.c (__ctanh): Likewise. * math/s_ctanhf.c (__ctanhf): Likewise. * math/s_ctanhl.c (__ctanhl): Likewise. * math/s_ctanl.c (__ctanl): Likewise. * math/auto-libm-test-in: Add more tests of ccos, ccosh, cexp, csin, csinh, ctan and ctanh. * math/auto-libm-test-out: Regenerated. * math/libm-test.inc (cexp_test): Use ALL_RM_TEST. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
* Move csin, csinh tests to auto-libm-test-in.  Joseph Myers  2015-06-24  3  -78/+1300
    This patch moves most tests of csin and csinh with finite inputs from libm-test.inc to auto-libm-test-in. The remaining two tests of each function with small arguments are not moved because moving them causes the time required by gen-auto-libm-tests to go up from under 8 seconds to over 11 minutes for me. (The current development version of MPC has had speed improvements for mpc_sin for some time, but there hasn't been a release containing those improvements yet.)

    Tested for x86_64 and x86.

    * math/auto-libm-test-in: Add more tests of csin and csinh.
    * math/auto-libm-test-out: Regenerated.
    * math/libm-test.inc (csin_test_data): Remove tests moved to auto-libm-test-in.
        (csinh_test_data): Likewise.
* Fix csin, csinh overflow in directed rounding modes (bug 18593).  Joseph Myers  2015-06-24  9  -18/+232
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | csin and csinh can produce bad results when overflowing in directed rounding modes, because a multiplication that can overflow is followed by a possible negation. This patch fixes this by negating one of the arguments of the multiplication before the multiplication instead of negating the result. The new tests for this issue are added to auto-libm-test-in, starting use of that file for csin and csinh. The issue was found in the course of moving existing tests for csin and csinh (existing tests, by being enabled in more cases than previously, showed the issue for float and double but not for long double); that move will now be done separately. Tested for x86_64 and x86 and ulps updated accordingly. [BZ #18593] * math/s_csin.c (__csin): Negate before rather than after possibly overflowing multiplication. * math/s_csinf.c (__csinf): Likewise. * math/s_csinh.c (__csinh): Likewise. * math/s_csinhf.c (__csinhf): Likewise. * math/s_csinhl.c (__csinhl): Likewise. * math/s_csinl.c (__csinl): Likewise. * math/auto-libm-test-in: Add some tests of csin and csinh. * math/auto-libm-test-out: Regenerated. * math/libm-test.inc (csin_test_data): Use AUTO_TESTS_c_c. (csinh_test_data): Likewise. * sysdeps/x86_64/fpu/libm-test-ulps: Update.
* Fix sin, sincos missing underflows (bug 16526, bug 16538).  Joseph Myers  2015-06-23  2  -0/+640
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Similar to various other bugs in this area, some sin and sincos implementations do not raise the underflow exception for subnormal arguments, when the result is tiny and inexact. This patch forces the exception in a similar way to previous fixes. Tested for x86_64, x86, mips64 and powerpc. [BZ #16526] [BZ #16538] * sysdeps/ieee754/dbl-64/s_sin.c: Include <float.h>. (__sin): Force underflow exception for arguments with small absolute value. * sysdeps/ieee754/flt-32/k_sinf.c: Include <float.h>. (__kernel_sinf): Force underflow exception for arguments with small absolute value. * sysdeps/ieee754/ldbl-128/k_sincosl.c: Include <float.h>. (__kernel_sincosl): Force underflow exception for arguments with small absolute value. * sysdeps/ieee754/ldbl-128/k_sinl.c: Include <float.h>. (__kernel_sinl): Force underflow exception for arguments with small absolute value. * sysdeps/ieee754/ldbl-128ibm/k_sincosl.c: Include <float.h>. (__kernel_sincosl): Force underflow exception for arguments with small absolute value. * sysdeps/ieee754/ldbl-128ibm/k_sinl.c: Include <float.h>. (__kernel_sinl): Force underflow exception for arguments with small absolute value. * sysdeps/ieee754/ldbl-96/k_sinl.c: Include <float.h>. (__kernel_sinl): Force underflow exception for arguments with small absolute value. * sysdeps/powerpc/fpu/k_sinf.c: Include <float.h>. (__kernel_sinf): Force underflow exception for arguments with small absolute value. * math/auto-libm-test-in: Add more tests of sin and sincos. * math/auto-libm-test-out: Regenerated.
* Fix spurious "inexact" exceptions from __kernel_standard_l (bug 18245, bug 18583).  Joseph Myers  2015-06-23  1  -0/+9
    __kernel_standard_l converts long double arguments to double for use in SVID "struct exception". This has special-case handling for when that conversion would overflow or underflow but the original long double function wouldn't. However, it turns out that "inexact" exceptions can be spurious here as well, when the function is exactly determined and __kernel_standard_l is being called for a domain error. This patch fixes this by using feholdexcept / fesetenv to avoid exceptions from the conversion, replacing the previous special-case logic for overflow and underflow (this covers all functions using __kernel_standard_l, not just those that actually need a change, since there doesn't seem to be much point in restricting things just to the functions that mustn't get "inexact" here).

    Tested for x86_64 and x86.

    [BZ #18245] [BZ #18583]
    * sysdeps/ieee754/k_standardl.c: Include <fenv.h>.
        (__kernel_standard_l): Use feholdexcept and fesetenv around conversion to double instead of special-casing overflow and underflow.
    * math/libm-test.inc (fmod_test_data): Add more tests.
        (remainder_test_data): Likewise.
        (sqrt_test_data): Likewise.
* Fix math/Makefile dependency on libm-test.stmp for libmvec tests.  Joseph Myers  2015-06-23  1  -1/+3
    Since the libmvec tests went in I've noticed build failures from parallel testing in math/, when those tests start building before libm-test.c has been fully generated. (This only applies if libm test sources have been modified after the original glibc build, because otherwise libm-test.stmp was generated during the original build and doesn't get regenerated during testing.)

    Those tests depend on libm-test.stmp, but the dependency uses $(libmvec-tests), which is set in the sysdeps Makefile fragments, and appears before the inclusion of ../Rules, which is what includes those fragments; thus, the dependency does not work and parallel make can start building the vector tests too soon. This patch moves the dependency further down so that the required variable is defined when the dependency is.

    Tested for x86_64.

    * math/Makefile [$(PERL) != no] ($(addprefix $(objpfx), $(addsuffix .o, $(libm-vec-tests)))): Move dependency on libm-test.stmp below the inclusion of Rules.
* Fix csqrt spurious underflows (bug 18371).  Joseph Myers  2015-06-23  5  -6/+827
    The csqrt implementations in glibc can cause spurious underflows in some cases as a side-effect of the scaling for large arguments (when underflow is correct for the square root of the argument that was scaled down to avoid overflow, but not for the original argument). This patch arranges to avoid the underflowing intermediate computation (eliminating a multiplication by 0.5 in the problem cases where a subsequent scaling by 2 would follow).

    Tested for x86_64 and x86 and ulps updated accordingly (only needed for x86).

    [BZ #18371]
    * math/s_csqrt.c (__csqrt): Avoid multiplication by 0.5 where intermediate but not final result might underflow.
    * math/s_csqrtf.c (__csqrtf): Likewise.
    * math/s_csqrtl.c (__csqrtl): Likewise.
    * math/auto-libm-test-in: Add more tests of csqrt.
    * math/auto-libm-test-out: Regenerated.
    * sysdeps/i386/fpu/libm-test-ulps: Update.
* Fix exp2, exp2f spurious underflows (bug 18219).  Joseph Myers  2015-06-23  2  -0/+182
    The dbl-64 and flt-32 implementations of exp2 functions produce spurious underflow exceptions. The underlying reason is the same in both cases: the computation works as (2^a - 1)*2^b + 2^b for suitably chosen a and b, where a has small magnitude so 2^a - 1 can be computed with a low-degree polynomial approximation, and (2^a - 1)*2^b can underflow even when the final result does not. This patch fixes this by adjusting the threshold for when scaling is used to avoid intermediate underflow so it works for any possible value of a where the final result would not underflow.

    Tested for x86_64 and x86.

    [BZ #18219]
    * sysdeps/ieee754/dbl-64/e_exp2.c (__ieee754_exp2): Reduce threshold on absolute value of exponent for which scaling is used.
    * sysdeps/ieee754/flt-32/e_exp2f.c (__ieee754_exp2f): Likewise.
    * math/auto-libm-test-in: Add more tests of exp2.
    * math/auto-libm-test-out: Regenerated.
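The intermediate underflow in the exp2 decomposition can be made concrete with a short worked case. The specific values of a and b below are chosen only for illustration and are not taken from the glibc sources; the least normal double is 2^-1022.

```latex
% Decomposition from the commit message, with x = a + b and |a| small:
\[
  2^{x} = 2^{a+b} = (2^{a} - 1)\,2^{b} + 2^{b}
\]
% For small |a|, 2^{a} - 1 \approx a \ln 2, so the first term is much
% smaller in magnitude than the second:
\[
  \bigl| (2^{a} - 1)\,2^{b} \bigr| \approx |a| \ln 2 \cdot 2^{b} \ll 2^{b}
\]
% Illustrative double-precision case with a = 2^{-10}, b = -1020:
\[
  (2^{a} - 1)\,2^{b} \approx 0.69 \cdot 2^{-1030} < 2^{-1022},
  \qquad
  2^{x} \approx 2^{-1020} > 2^{-1022}
\]
```

In this case the product is subnormal and inexact, so it raises underflow, even though the final sum is a normal number. Scaling the computation whenever b is this close to the bottom of the exponent range avoids the spurious exception.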
* Fix expm1 missing underflows (bug 16353).  Joseph Myers  2015-06-22  2  -119/+118
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Similar to various other bugs in this area, some expm1 implementations do not raise the underflow exception for subnormal arguments, when the result is tiny and inexact. This patch forces the exception in a similar way to previous fixes. (The issue does not apply to the ldbl-* implementations or to those for x86 / x86_64 long double. The change to sysdeps/ieee754/dbl-64/wordsize-64/e_cosh.c is one I missed when previously fixing bug 16354; the bug in that implementation was previously latent, but the expm1 fixes stopped it being latent and so required it to be fixed to avoid spurious underflows from cosh.) Tested for x86_64 and x86. [BZ #16353] * sysdeps/i386/fpu/s_expm1.S (dbl_min): New object. (__expm1): Force underflow exception for arguments with small absolute value. * sysdeps/i386/fpu/s_expm1f.S (flt_min): New object. (__expm1f): Force underflow exception for arguments with small absolute value. * sysdeps/ieee754/dbl-64/s_expm1.c: Include <float.h>. (__expm1): Force underflow exception for arguments with small absolute value. * sysdeps/ieee754/flt-32/s_expm1f.c: Include <float.h>. (__expm1f): Force underflow exception for arguments with small absolute value. * sysdeps/ieee754/dbl-64/wordsize-64/e_cosh.c (__ieee754_cosh): Check for small arguments before calling __expm1. * math/auto-libm-test-in: Do not mark underflow exceptions as possibly missing for bug 16353. * math/auto-libm-test-out: Regenerated.
* Fix x86_64 / x86 expm1l (-min_subnorm) result sign (bug 18569).  Joseph Myers  2015-06-21  2  -0/+320
    In the x86 / x86_64 implementations of expm1l, when expm1l's result should underflow to 0 (argument minus the least subnormal, in some rounding modes), it can be a zero of the wrong sign. This patch fixes this by returning the argument with underflow forced in that case (this is a 1ulp error relative to the correctly rounded result of -0, which is OK in terms of the documented accuracy goals, whereas a result with the wrong sign never is).

    Tested for x86_64 and x86.

    [BZ #18569]
    * sysdeps/i386/fpu/e_expl.S (IEEE754_EXPL) [USE_AS_EXPM1L]: Force underflow and return argument in case of subnormal argument.
    * sysdeps/x86_64/fpu/e_expl.S (IEEE754_EXPL) [USE_AS_EXPM1L]: Likewise.
    * math/auto-libm-test-in: Add more tests of expm1.
    * math/auto-libm-test-out: Regenerated.
* Fix x86 / x86_64 expl, exp10l missing underflows (bug 16361).  Joseph Myers  2015-06-21  2  -27/+238
    Similar to various other bugs in this area, the x86 and x86_64 implementations of expl / exp10l can fail to produce underflow exceptions when the unscaled result has trailing 0 bits so the scaling down to subnormal precision is exact. This patch fixes this by forcing the exception in the case of tiny results.

    Tested for x86_64 and x86.

    [BZ #16361]
    * sysdeps/i386/fpu/e_expl.S [!USE_AS_EXPM1L] (cmin): New object.
        [!USE_AS_EXPM1L] (IEEE754_EXPL): Force underflow exception for tiny results.
    * sysdeps/x86_64/fpu/e_expl.S [!USE_AS_EXPM1L] (cmin): New object.
        [!USE_AS_EXPM1L] (IEEE754_EXPL): Force underflow exception for tiny results.
    * math/auto-libm-test-in: Add more tests of exp and exp10. Do not mark underflow exceptions as possibly missing for bug 16361.
    * math/auto-libm-test-out: Regenerated.
* Fix asinh missing underflows (bug 16350).  Joseph Myers  2015-06-18  2  -125/+124
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Similar to various other bugs in this area, some asinh implementations do not raise the underflow exception for subnormal arguments, when the result is tiny and inexact. This patch forces the exception in a similar way to previous fixes. Tested for x86_64, x86 and mips64. [BZ #16350] * sysdeps/i386/fpu/s_asinh.S (__asinh): Force underflow exception for arguments with small absolute value. * sysdeps/i386/fpu/s_asinhf.S (__asinhf): Likewise. * sysdeps/i386/fpu/s_asinhl.S (__asinhl): Likewise. * sysdeps/ieee754/dbl-64/s_asinh.c: Include <float.h>. (__asinh): Force underflow exception for arguments with small absolute value. * sysdeps/ieee754/flt-32/s_asinhf.c: Include <float.h>. (__asinhf): Force underflow exception for arguments with small absolute value. * sysdeps/ieee754/ldbl-128/s_asinhl.c: Include <float.h>. (__asinhl): Force underflow exception for arguments with small absolute value. * sysdeps/ieee754/ldbl-128ibm/s_asinhl.c: Include <float.h>. (__asinhl): Force underflow exception for arguments with small absolute value. * sysdeps/ieee754/ldbl-96/s_asinhl.c: Include <float.h>. (__asinhl): Force underflow exception for arguments with small absolute value. * math/auto-libm-test-in: Do not mark underflow exceptions as possibly missing for bug 16350. * math/auto-libm-test-out: Regenerated.
* Remove stray spurious-underflow markings from cexp test.  Joseph Myers  2015-06-18  2  -293/+292
    I noticed that I'd left a spurious-underflow allowance behind in auto-libm-test-in for a bug that was fixed some time ago. This patch removes it.

    Tested for x86_64 and x86.

    * math/auto-libm-test-in: Remove spurious underflow allowance for tests of cexp.
    * math/auto-libm-test-out: Regenerated.
* Vector sincosf for x86_64 and tests.  Andrew Senkevich  2015-06-18  3  -0/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here is implementation of vectorized sincosf containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * NEWS: Mention addition of x86_64 vector sincosf. * math/test-float-vlen16.h: Added wrapper for sincosf tests. * math/test-float-vlen4.h: Likewise. * math/test-float-vlen8.h: Likewise. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added. * sysdeps/x86/fpu/bits/math-vector.h: Added sincosf SIMD declaration. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core.S * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core_avx512.S * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf4_core.S * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf4_core_sse4.S * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core.S * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core_avx2.S * sysdeps/x86_64/fpu/svml_s_sincosf16_core.S * sysdeps/x86_64/fpu/svml_s_sincosf4_core.S * sysdeps/x86_64/fpu/svml_s_sincosf8_core.S * sysdeps/x86_64/fpu/svml_s_sincosf8_core_avx.S * sysdeps/x86_64/fpu/svml_s_sincosf_data.S: New file. * sysdeps/x86_64/fpu/svml_s_sincosf_data.h: New file. * sysdeps/x86_64/fpu/svml_s_wrapper_impl.h: Added 3 argument wrappers. * sysdeps/x86_64/fpu/test-float-vlen16.c: : Vector sincosf tests. * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise. 
* sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise.
* Vector sincos for x86_64 and tests.  Andrew Senkevich  2015-06-18  10  -19/+92
    Here is an implementation of vectorized sincos containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>.

    * NEWS: Mention addition of x86_64 vector sincos.
    * bits/libm-simd-decl-stubs.h: Added stubs for sincos.
    * math/math.h (__MATHDECL_VEC): New macro.
    * math/bits/mathcalls.h: Added sincos declaration with __MATHDECL_VEC.
    * math/gen-libm-have-vector-test.sh: Added generation of sincos wrapper declaration under condition.
    * math/test-vec-loop.h (TEST_VEC_LOOP): Refactored.
    * math/test-double-vlen2.h: Added wrapper for sincos tests, reflected TEST_VEC_LOOP change.
    * math/test-double-vlen4.h: Likewise.
    * math/test-double-vlen8.h: Likewise.
    * math/test-float-vlen16.h: Reflected TEST_VEC_LOOP change.
    * math/test-float-vlen4.h: Likewise.
    * math/test-float-vlen8.h: Likewise.
    * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added.
    * sysdeps/x86/fpu/bits/math-vector.h: Added sincos SIMD declaration.
    * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files.
    * sysdeps/x86_64/fpu/Versions: New versions added.
    * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated.
    * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions.
    * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core_sse4.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core_avx2.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core_avx512.S: New file.
    * sysdeps/x86_64/fpu/svml_d_sincos2_core.S: New file.
    * sysdeps/x86_64/fpu/svml_d_sincos4_core.S: New file.
    * sysdeps/x86_64/fpu/svml_d_sincos4_core_avx.S: New file.
    * sysdeps/x86_64/fpu/svml_d_sincos8_core.S: New file.
    * sysdeps/x86_64/fpu/svml_d_sincos_data.S: New file.
    * sysdeps/x86_64/fpu/svml_d_sincos_data.h: New file.
    * sysdeps/x86_64/fpu/svml_d_wrapper_impl.h: Added wrappers for sincos.
    * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Vector sincos tests.
    * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise.
* Vector powf for x86_64 and tests. (Andrew Senkevich, 2015-06-18, 3 files, -6/+6)
    Here is an implementation of vectorized powf containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>.

    * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added.
    * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm redirections for powf.
    * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files.
    * sysdeps/x86_64/fpu/Versions: New versions added.
    * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated.
    * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions.
    * sysdeps/x86_64/fpu/svml_s_wrapper_impl.h: Added 2 argument wrappers.
    * sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core_avx512.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_s_powf4_core.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_s_powf4_core_sse4.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_s_powf8_core.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_s_powf8_core_avx2.S: New file.
    * sysdeps/x86_64/fpu/svml_s_powf16_core.S: New file.
    * sysdeps/x86_64/fpu/svml_s_powf4_core.S: New file.
    * sysdeps/x86_64/fpu/svml_s_powf8_core.S: New file.
    * sysdeps/x86_64/fpu/svml_s_powf8_core_avx.S: New file.
    * sysdeps/x86_64/fpu/svml_s_powf_data.S: New file.
    * sysdeps/x86_64/fpu/svml_s_powf_data.h: New file.
    * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Vector powf tests.
    * sysdeps/x86_64/fpu/test-float-vlen16.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise.
    * math/test-float-vlen16.h: Fixed 2 argument macro.
    * math/test-float-vlen4.h: Likewise.
    * math/test-float-vlen8.h: Likewise.
    * NEWS: Mention addition of x86_64 vector powf.
* Remove ldbl-128ibm variants of complex math functions. (Joseph Myers, 2015-06-17, 2 files, -0/+14)
    sysdeps/ieee754/ldbl-128ibm has its own versions of cprojl, ctanhl and ctanl. Having its own versions, where otherwise the math/ copies are generally used for all floating-point formats, means they are liable to get out of sync and not benefit from bug fixes to the generic versions.

    The substantive differences (not arising from getting out of sync and slightly different fixes for the same issues) are: long double compat handling (also done in the ldbl-opt versions, so doesn't require special versions for ldbl-128ibm); handling of LDBL_EPSILON (conditionally undefined and redefined in other math/ implementations, so doesn't justify a special version), and:

        /* __gcc_qmul does not respect -0.0 so we need the following fixup. */
        if ((__real__ res == 0.0L) && (__real__ x == 0.0L))
          __real__ res = __real__ x;
        if ((__real__ res == 0.0L) && (__imag__ x == 0.0L))
          __imag__ res = __imag__ x;

    But if that statement about __gcc_qmul was ever true for an old version of that libgcc function, it's not the case for any GCC version now supported to build glibc; there's explicit logic early in that function (and similarly in __gcc_qdiv) to return an appropriately signed zero if the product of the high parts is zero. So this patch adds the special LDBL_EPSILON handling to the generic functions and removes the ldbl-128ibm versions.

    Tested for powerpc32 (compared test-ldouble.out before and after the changes; there are slight changes to results for ctanl / ctanhl, arising from divergence of the implementations, but nothing that affects the overall nature of the issues shown by the testsuite, and in particular nothing related to signs of zero results).

    * math/s_ctanhl.c [LDBL_MANT_DIG == 106] (LDBL_EPSILON): Undefine and redefine.
    * math/s_ctanl.c [LDBL_MANT_DIG == 106] (LDBL_EPSILON): Undefine and redefine.
    * sysdeps/ieee754/ldbl-128ibm/s_cprojl.c: Remove file.
    * sysdeps/ieee754/ldbl-128ibm/s_ctanhl.c: Likewise.
    * sysdeps/ieee754/ldbl-128ibm/s_ctanl.c: Likewise.
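    The property the removed fixup was compensating for can be checked directly: a zero product should carry the sign given by the exclusive-or of the operand signs. The sketch below is an illustration only and not code from glibc or libgcc; the function name zero_product_sign_ok is hypothetical. It exercises __gcc_qmul only where long double is the IBM double-double format (LDBL_MANT_DIG == 106); on other targets it merely tests the platform's ordinary long double multiply.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical check: returns true if a * b is a zero whose sign is the
   XOR of the operand signs.  At least one operand must be zero and the
   other must be finite.  */
bool zero_product_sign_ok(long double a, long double b)
{
  assert(a == 0.0L || b == 0.0L);

  /* volatile keeps the multiplication from being folded at compile
     time, so the run-time multiply routine is what gets tested.  */
  volatile long double va = a, vb = b;
  long double p = va * vb;

  bool expect_negative = (signbit(a) != 0) != (signbit(b) != 0);
  return p == 0.0L && (signbit(p) != 0) == expect_negative;
}
```

    If this holds for every sign combination, code such as ctanl has no need to patch up the sign of a zero result after multiplying.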
* Vector pow for x86_64 and tests. (Andrew Senkevich, 2015-06-17, 1 file, -1/+1)
    Here is an implementation of vectorized pow containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>.

    * bits/libm-simd-decl-stubs.h: Added stubs for pow.
    * math/bits/mathcalls.h: Added pow declaration with __MATHCALL_VEC.
    * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added.
    * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm redirections for pow.
    * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files.
    * sysdeps/x86_64/fpu/Versions: New versions added.
    * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated.
    * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions.
    * sysdeps/x86_64/fpu/svml_d_wrapper_impl.h: Added 2 argument wrappers.
    * sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core_sse4.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core_avx2.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core.S: New file.
    * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core_avx512.S: New file.
    * sysdeps/x86_64/fpu/svml_d_pow2_core.S: New file.
    * sysdeps/x86_64/fpu/svml_d_pow4_core.S: New file.
    * sysdeps/x86_64/fpu/svml_d_pow4_core_avx.S: New file.
    * sysdeps/x86_64/fpu/svml_d_pow8_core.S: New file.
    * sysdeps/x86_64/fpu/svml_d_pow_data.S: New file.
    * sysdeps/x86_64/fpu/svml_d_pow_data.h: New file.
    * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Added vector pow test.
    * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise.
    * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise.
    * NEWS: Mention addition of x86_64 vector pow.
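    The "New versions added" entries for libmvec.abilist refer to symbols named under the x86_64 Vector ABI mangling scheme. The sketch below shows how such a name is assembled for the simple case used here (unmasked variant, every argument passed as a vector). The helper vector_variant_name is hypothetical and not part of glibc; the ISA letters reflect the ABI as commonly documented, with b for SSE, c for AVX, d for AVX2 and e for AVX512.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write the Vector ABI name of an unmasked vector
   variant of SCALAR into BUF.  ISA is the ABI's ISA letter, VLEN the
   number of lanes, NARGS the number of arguments, all assumed to be
   passed as vectors ('v').  Returns snprintf's result.  */
int vector_variant_name(char *buf, size_t size, char isa, int vlen,
                        int nargs, const char *scalar)
{
  char kinds[8];
  assert(nargs >= 1 && nargs < (int) sizeof kinds);
  memset(kinds, 'v', (size_t) nargs);
  kinds[nargs] = '\0';

  /* "_ZGV" prefix, ISA letter, 'N' for unmasked, lane count,
     one 'v' per argument, then '_' and the scalar function name.  */
  return snprintf(buf, size, "_ZGV%cN%d%s_%s", isa, vlen, kinds, scalar);
}
```

    For example, the two-lane SSE variant of pow (two vector arguments) comes out as _ZGVbN2vv_pow, and the eight-lane AVX512 variant as _ZGVeN8vv_pow, which matches the 2/4/8 suffixes in the svml_d_pow*_core file names above.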