about summary refs log tree commit diff
path: root/sysdeps/ieee754/dbl-64
Commit message (Collapse)AuthorAgeFilesLines
...
* Add getpayload, getpayloadf, getpayloadl.Joseph Myers2016-10-192-0/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | TS 18661-1 defines functions for manipulating the payloads of NaNs. This patch implements the getpayload functions for glibc; these extract the NaN payload (from an argument passed as a pointer, for which corresponding libm-test support is added) and return it in the same floating-point type. The return value of these functions is unspecified for non-NaN arguments; the patch does the simplest thing to implement, which is that the functions do not check whether the argument is a NaN and just treat the relevant bits of the representation as a payload regardless. A conversion from integer to floating-point is used to produce the required return value, except in the ldbl-128 case; as 128-bit integers are not supported for all configurations using ldbl-128, the code constructs the required floating-point representation of the return value directly instead. Tested for x86_64, x86, mips64 and powerpc. * math/bits/mathcalls.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (getpayload): New declaration. * math/Versions (getpayload): New libm symbol at version GLIBC_2.25. (getpayloadf): Likewise. (getpayloadl): Likewise. * math/Makefile (libm-calls): Add s_getpayloadF. * math/libm-test.inc: Include <nan-high-order-bit.h>. (struct test_f_f_data): Add comment. (RUN_TEST_fp_f): New macro. (RUN_TEST_LOOP_fp_f): Likewise. (getpayload_test_data): New array. (getpayload_test): New function. (main): Call getpayload_test. * math/gen-libm-test.pl (parse_args): Handle 'p' in argument descriptor. * manual/arith.texi (FP Bit Twiddling): Document getpayload, getpayloadf and getpayloadl. * manual/libm-err-tab.pl: Update comment on interfaces without ulps tabulated. * sysdeps/ieee754/dbl-64/s_getpayload.c: New file. * sysdeps/ieee754/dbl-64/wordsize-64/s_getpayload.c: Likewise. * sysdeps/ieee754/flt-32/s_getpayloadf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_getpayloadl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_getpayloadl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_getpayloadl.c: Likewise. * sysdeps/nacl/libm.abilist: Update. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilepro/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.
* Define HIGH_ORDER_BIT_IS_SET_FOR_SNAN to 0 or 1.Joseph Myers2016-10-176-6/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch moves the HIGH_ORDER_BIT_IS_SET_FOR_SNAN macro from being defined or undefined to the preferred convention of always being defined, to either 0 or 1, so allowing typo-proof tests with #if. The macro is moved from math_private.h to a new header nan-high-order-bit.h to make it easy for all architectures to define, either through the sysdeps/generic version of the header or through providing their own version of the header, without needing #ifndef in the generic math_private.h to give a default definition. The move also allows the macro to be used without needing math_private.h to be included; the immediate motivation of this patch is to allow tests to access this information (to know what kinds of NaNs 0 is a valid payload for) without needing to include math_private.h. Existing C level rather than preprocessor conditionals at all, but this patch does not make such a change). Tested for x86_64 and x86 (testsuite); also verified for x86_64, x86, mips64 and powerpc that installed stripped shared libraries are unchanged by the patch. * sysdeps/generic/nan-high-order-bit.h: New file. * sysdeps/hppa/nan-high-order-bit.h: Likewise. * sysdeps/mips/nan-high-order-bit.h: Likewise. * sysdeps/hppa/math_private.h: Remove file. * sysdeps/mips/math_private.h (HIGH_ORDER_BIT_IS_SET_FOR_SNAN): Do not define here. * sysdeps/ieee754/dbl-64/s_issignaling.c: Include <nan-high-order-bit.h>. [HIGH_ORDER_BIT_IS_SET_FOR_SNAN]: Test with #if not #ifdef. * sysdeps/ieee754/dbl-64/s_totalorder.c: Include <nan-high-order-bit.h>. [HIGH_ORDER_BIT_IS_SET_FOR_SNAN]: Test with #if not #ifdef. * sysdeps/ieee754/dbl-64/s_totalordermag.c: Include <nan-high-order-bit.h>. [HIGH_ORDER_BIT_IS_SET_FOR_SNAN]: Test with #if not #ifdef. * sysdeps/ieee754/dbl-64/wordsize-64/s_issignaling.c: Include <nan-high-order-bit.h>. [HIGH_ORDER_BIT_IS_SET_FOR_SNAN]: Test with #if not #ifdef. * sysdeps/ieee754/dbl-64/wordsize-64/s_totalorder.c: Include <nan-high-order-bit.h>. [HIGH_ORDER_BIT_IS_SET_FOR_SNAN]: Test with #if not #ifdef. * sysdeps/ieee754/dbl-64/wordsize-64/s_totalordermag.c: Include <nan-high-order-bit.h>. [HIGH_ORDER_BIT_IS_SET_FOR_SNAN]: Test with #if not #ifdef. * sysdeps/ieee754/flt-32/s_issignalingf.c: Include <nan-high-order-bit.h>. [HIGH_ORDER_BIT_IS_SET_FOR_SNAN]: Test with #if not #ifdef. * sysdeps/ieee754/flt-32/s_totalorderf.c: Include <nan-high-order-bit.h>. [HIGH_ORDER_BIT_IS_SET_FOR_SNAN]: Test with #if not #ifdef. * sysdeps/ieee754/flt-32/s_totalordermagf.c: Include <nan-high-order-bit.h>. [HIGH_ORDER_BIT_IS_SET_FOR_SNAN]: Test with #if not #ifdef. * sysdeps/ieee754/ldbl-128/s_issignalingl.c: Include <nan-high-order-bit.h>. [HIGH_ORDER_BIT_IS_SET_FOR_SNAN]: Test with #if not #ifdef. * sysdeps/ieee754/ldbl-128/s_totalorderl.c: Include <nan-high-order-bit.h>. [HIGH_ORDER_BIT_IS_SET_FOR_SNAN]: Test with #if not #ifdef. * sysdeps/ieee754/ldbl-128/s_totalordermagl.c: Include <nan-high-order-bit.h>. [HIGH_ORDER_BIT_IS_SET_FOR_SNAN]: Test with #if not #ifdef. * sysdeps/ieee754/ldbl-128ibm/s_issignalingl.c: Include <nan-high-order-bit.h>. [HIGH_ORDER_BIT_IS_SET_FOR_SNAN]: Test with #if not #ifdef. * sysdeps/ieee754/ldbl-128ibm/s_totalorderl.c: Include <nan-high-order-bit.h>. [HIGH_ORDER_BIT_IS_SET_FOR_SNAN]: Test with #if not #ifdef. * sysdeps/ieee754/ldbl-128ibm/s_totalordermagl.c: Include <nan-high-order-bit.h>. [HIGH_ORDER_BIT_IS_SET_FOR_SNAN]: Test with #if not #ifdef. * sysdeps/ieee754/ldbl-96/s_issignalingl.c: Include <nan-high-order-bit.h>. [HIGH_ORDER_BIT_IS_SET_FOR_SNAN]: Test with #if not #ifdef. * sysdeps/ieee754/ldbl-96/s_totalorderl.c: Include <nan-high-order-bit.h>. [HIGH_ORDER_BIT_IS_SET_FOR_SNAN]: Test with #if not #ifdef. * sysdeps/ieee754/ldbl-96/s_totalordermagl.c: Include <nan-high-order-bit.h>. [HIGH_ORDER_BIT_IS_SET_FOR_SNAN]: Test with #if not #ifdef.
* Add totalordermag, totalordermagf, totalordermagl.Joseph Myers2016-10-152-0/+94
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In addition to the totalorder functions, TS 18661-1 defines totalordermag functions, which do the same comparison but on the absolute values of the arguments. This patch implements these functions for glibc, including the type-generic macro in <tgmath.h>. In general the implementations are similar to but simpler than those for the totalorder functions. Tested for x86_64, x86, mips64 and powerpc. * math/bits/mathcalls.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (totalordermag): New declaration. * math/tgmath.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (totalordermag): New macro. * math/Versions (totalordermag): New libm symbol at version GLIBC_2.25. (totalordermagf): Likewise. (totalordermagl): Likewise. * math/Makefile (libm-calls): Add s_totalordermagF. * math/libm-test.inc (totalordermag_test_data): New array. (totalordermag_test): New function. (main): Call totalordermag_test. * math/test-tgmath.c (NCALLS): Increase to 125. (F(compile_test)): Call totalordermag. (F(totalordermag)): New function. * manual/arith.texi (FP Comparison Functions): Document totalordermag, totalordermagf and totalordermagl. * manual/libm-err-tab.pl: Update comment on interfaces without ulps tabulated. * sysdeps/ieee754/dbl-64/s_totalordermag.c: New file. * sysdeps/ieee754/dbl-64/wordsize-64/s_totalordermag.c: Likewise. * sysdeps/ieee754/flt-32/s_totalordermagf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_totalordermagl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_totalordermagl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_totalordermagl.c: Likewise. * sysdeps/ieee754/ldbl-opt/nldbl-totalordermag.c: Likewise. * sysdeps/ieee754/ldbl-opt/Makefile (libnldbl-calls): Add totalordermag. (CFLAGS-nldbl-totalordermag.c): New variable. * sysdeps/ieee754/ldbl-128ibm/test-totalorderl-ldbl-128ibm.c (do_test): Also test totalordermagl. * sysdeps/ieee754/ldbl-96/test-totalorderl-ldbl-96.c (do_test): Likewise. * sysdeps/nacl/libm.abilist: Update. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilepro/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.
* Fix warnings from latest GCC.steve ellcey-CA Eng-Software2016-10-141-4/+4
| | | | | * sysdeps/ieee754/dbl-64/e_pow.c (checkint) Make conditions explicitly boolean.
* Add totalorder, totalorderf, totalorderl.Joseph Myers2016-10-122-0/+102
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | TS 18661-1 defines totalorder functions implementing the totalOrder comparison operation from IEEE 754-2008. This patch implements these functions for glibc, including the type-generic macro in <tgmath.h>. (The totalordermag functions will be added in a separate patch.) The description of the totalOrder operation is complicated. However, for IEEE interchange binary formats and the preferred quiet NaN convention, what that complicated description means is that you interpret the representation as a sign-magnitude integer (with -0 coming before +0) and do a <= comparison on that interpretation. For finite values and infinities the ordering of the sign-magnitude integers is just the same as the ordering of floating-point values, so this extends that to all representations. (Different representations of the same floating-point value - which includes same quantum in the decimal case - must still be considered equal by this operation, but that issue doesn't arise for IEEE interchange binary formats.) So the complications are: * When MIPS quiet NaN conventions are in use, the representation of NaNs needs adjusting before making such an integer comparison. This patch does this adjustment only when both arguments are NaNs, as there's no need for it if only one is a NaN, and as long as both are NaNs you can just flip the relevant bits without any problems from this turning a NaN into an infinity. * For the m68k version of ldbl-96, where the high mantissa bit is "don't care" for infinities and NaNs, representations where it differs must compare the same. Note: although the testcase for this compiles, I have not actually tested on m68k. * For ldbl-128ibm, the low part must be ignored when the high part is NaN, and low parts of +0 and -0 must be considered the same whatever the high part. The new tests in libm-test.inc are the first tests there specifying particular payloads for input NaNs. Separate tests are also added for the ldbl-96 and ldbl-128ibm special cases where there are different representations of the same value that must compare equal (which can't be covered in libm-test.inc as that only specifies values, not representations). Tested for x86_64, x86, mips64 and powerpc. * math/bits/mathcalls.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (totalorder): New declaration. * math/tgmath.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (totalorder): New macro. * math/Versions (totalorder): New libm symbol at version GLIBC_2.25. (totalorderf): Likewise. (totalorderl): Likewise. * math/Makefile (libm-calls): Add s_totalorderF. * math/gen-libm-test.pl (parse_args): Escape quotes in test name string. * math/libm-test.inc (PAYLOAD_DIG): New macro. (qnan_value_pl): Likewise. (snan_value_pl): Likewise. (qnan_value): Define using qnan_value_pl. (snan_value): Define using snan_value_pl. (struct test_ff_i_data): Add comment about which tests use this structure. (RUN_TEST_ff_b): New macro. (RUN_TEST_LOOP_ff_b): Likewise. (totalorder_test_data): New array. (totalorder_test): New function. (main): Call totalorder_test. * math/test-tgmath.c (NCALLS): Increase to 122. (F(compile_test)): Call totalorder. (F(totalorder)): New function. * manual/arith.texi (FP Comparison Functions): Document totalorder, totalorderf and totalorderl. * manual/libm-err-tab.pl: Update comment on interfaces without ulps tabulated. * sysdeps/ieee754/dbl-64/s_totalorder.c: New file. * sysdeps/ieee754/dbl-64/wordsize-64/s_totalorder.c: Likewise. * sysdeps/ieee754/flt-32/s_totalorderf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_totalorderl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_totalorderl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_totalorderl.c: Likewise. * sysdeps/ieee754/ldbl-opt/nldbl-totalorder.c: Likewise. * sysdeps/ieee754/ldbl-opt/Makefile (libnldbl-calls): Add totalorder. (CFLAGS-nldbl-totalorder.c): New variable. * sysdeps/ieee754/ldbl-128ibm/test-totalorderl-ldbl-128ibm.c: New file. * sysdeps/ieee754/ldbl-128ibm/Makefile [$(subdir) = math] (tests): Add test-totalorderl-ldbl-128ibm. * sysdeps/ieee754/ldbl-96/test-totalorderl-ldbl-96.c: New file. * sysdeps/ieee754/ldbl-96/Makefile [$(subdir) = math] (tests): Add test-totalorderl-ldbl-96. * sysdeps/nacl/libm.abilist: Update. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilepro/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.
* Update comments for some functions in s_sin.cSiddhesh Poyarekar2016-10-061-22/+17
| | | | | Update comments for some functions to bring them in sync with what the functions are actually doing.
* Adjust calls to do_sincos_1 and do_sincos_2 in s_sincos.cSiddhesh Poyarekar2016-10-061-4/+4
| | | | | | | Adjust calls to do_sincos_1 and do_sincos_2 to pass a boolean shift_quadrant instead of the numeric 0 and 1. This does not affect codegen.
* Make quadrant shift a boolean in reduce_and_compute in s_sin.cSiddhesh Poyarekar2016-10-061-4/+4
| | | | | | | | Like the previous change, make the quadrant shift a boolean to make it clearer that we will do at most a single rotation of the quadrants to compute the cosine from the sine function. This does not affect codegen.
* Check n instead of k1 to decide on sign of sin/cos resultSiddhesh Poyarekar2016-10-061-1/+1
| | | | | | | | | | | | | | | | | | | | | For k1 in 1 and 3, n can only have values of 0 and 2, so checking k1 & 2 is equivalent to checking n & 2. We prefer the latter so that we don't use k1 for anything other than selecting the quadrant in do_sincos_1, thus dropping it completely. The previous logic was: "Compute sine for the value and based on the new rotated quadrant (k1) negate the value if we're in the fourth quadrant." With this change, the logic now is: "Compute sine for the value and negate it if we were either (1) in the fourth quadrant or (2) we actually wanted the cosine and were in the third quadrant." * sysdeps/ieee754/dbl-64/s_sin.c (do_sincos_1): Check N instead of K1.
* Make the quadrant shift K a bool in do_sincos_* functionsSiddhesh Poyarekar2016-10-061-19/+19
| | | | | | | | | | | | | | | | The do_sincos_* functions are helpers to compute sin/cos, where they get cosine by computing sine for the next quadrant. This is decided with the value of K passed to it, which is the amount by which to shift the quadrant. Since we will only need the shift to be 0 or 1, we make K a bool to make that explicit. * sysdeps/ieee754/dbl-64/s_sin.c (do_sincos_1): Rename K to SHIFT_QUADRANT and make it bool. (do_sincos_2): Likewise. (sloww): Likewise. (sloww1): Likewise. (__sin): Adjust calls to do_sincos_1 and do_sincos_2. (__cos): Likewise.
* Use __builtin_fma more in dbl-64 code.Joseph Myers2016-09-301-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sysdeps/ieee754/dbl-64/dla.h can use a macro DLA_FMS for more efficient double-width operations when fused multiply-subtract is supported. However, this macro is only defined for x86_64, conditional on architecture-specific __FMA4__. This patch makes the code use __builtin_fma conditional on __FP_FAST_FMA, as used elsewhere in glibc. Tested for x86_64, x86 and powerpc. On powerpc (where this is causing fused operations to be used where they weren't previously) I see an increase from 1ulp to 2ulp in the imaginary part of clog10: testing double (without inline functions) Failure: Test: Imaginary part of: clog10 (0x1.7a858p+0 - 0x6.d940dp-4 i) Result: is: -1.2237865208199886e-01 -0x1.f5435146bb61ap-4 should be: -1.2237865208199888e-01 -0x1.f5435146bb61cp-4 difference: 2.7755575615628914e-17 0x1.0000000000000p-55 ulp : 2.0000 max.ulp : 1.0000 Maximal error of real part of: clog10 is : 3 ulp accepted: 3 ulp Maximal error of imaginary part of: clog10 is : 2 ulp accepted: 1 ulp This is actually resulting from atan2 becoming *more* accurate (atan2 (-0x6.d940dp-4, 0x1.7a858p+0) should ideally be -0x1.208cd6e841554p-2 but was -0x1.208cd6e841555p-2 from a powerpc libm built before this change, and is -0x1.208cd6e841554p-2 from a powerpc libm built after this change). Since these functions are not expected to be correctly rounding by glibc's accuracy goals, neither result is a problem, but this does imply that some of this code, although designed to be correctly rounding, is not in fact correctly rounding (possibly because of GCC creating fused operations where the code does not expect it, something we've only disabled for specific functions where it was found to cause large errors). (Of course as previously discussed I think we should remove the slow cases where an error analysis shows this wouldn't increase the errors much above 0.5ulp; it's only functions such as cratan2 that are expected to be correctly rounding, not atan2.) * sysdeps/ieee754/dbl-64/dla.h [__FP_FAST_FMA] (DLA_FMS): Define macro to use __builtin_fma. * sysdeps/x86_64/fpu/dla.h: Remove file.
* Use copysign instead of ternary for some sin/cos input rangesSiddhesh Poyarekar2016-09-301-13/+13
| | | | | | | | | | | | | | | | | | | These are remaining cases where we can deduce and conclude that the sign of the result should be the same as the sign of the input being checked. For example, for sin(x), the sign of the result is the same as the result itself for x < pi. Likewise, for sine values where x after range reduction falls into this range and its sign is preserved. * sysdeps/ieee754/dbl-64/s_sin.c (do_sincos_1): Use copysign instead of ternary condition. (do_sincos_2): Likewise. (__sin): Likewise. (__cos): Likewise. (slow): Likewise. (sloww): Likewise. (sloww1): Likewise. (bsloww): Likewise. (bsloww1): Likewise.
* Use copysign instead of ternary conditions for positive constantsSiddhesh Poyarekar2016-09-301-19/+19
| | | | | | | | | | | | | | | | | | This is the first very simple substitution of ternary conditions for correction adjustments with __copysign for positive constants. * sysdeps/ieee754/dbl-64/s_sin.c (do_cos_slow): use copysign instead of ternary condition. (do_sin_slow): Likewise. (do_sincos_1): Likewise. (do_sincos_2): Likewise. (__cos): Likewise. (sloww): Likewise. (sloww1): Likewise. (sloww2): Likewise. (bsloww): Likewise. (bsloww1): Likewise. (bsloww2): Likewise.
* consolidate sign checks for slow2Siddhesh Poyarekar2016-09-301-8/+10
| | | | | | | | | | Simplify the code a bit by consolidating sign checks in slow1 and slow2 into __sin at the higher level. * sysdeps/ieee754/dbl-64/s_sin.c (slow1): Consolidate sign check from here... (slow2): ... and here... (__sin): ... to here.
* Inline all support functions for sin and cosSiddhesh Poyarekar2016-09-021-24/+28
| | | | | | | | | | | | | | | | | | | | | | The support functions for sin and cos have a lot of identical functionality, so inlining them gives a pretty decent jump in functionality: ~19% in the sincos function. On SPEC2006 this translates to about 2.1% in the tonto test. * sysdeps/ieee754/dbl-64/s_sin.c (do_cos): Mark as inline. (do_cos_slow): Likewise. (do_sin): Likewise. (do_sin_slow): Likewise. (slow): Likewise. (slow1): Likewise. (slow2): Likewise. (sloww): Likewise. (sloww1): Likewise. (sloww2): Likewise. (bsloww): Likewise. (bsloww1): Likewise. (bsloww2): Likewise. (cslow2): Likewise.
* Use do_sin for sin(x) where 0.25 < |x| < 0.855469Siddhesh Poyarekar2016-09-021-18/+3
| | | | | | | | The only code looks slightly different from do_sin but on closer examination, should give exactly the same result. Drop it in favour of the do_sin function call. * sysdeps/ieee754/dbl-64/s_sin.c (__sin): Use do_sin.
* Consolidate input partitioning into do_cos and do_sinSiddhesh Poyarekar2016-09-021-109/+82
| | | | | | | | | | | | | | | | | | | | | | | | All calls to do_cos are preceded by code that partitions x into a larger double that gives an offset into the sincos table and a smaller double that is used in a polynomial computation. Consolidate all of them into do_cos and do_sin to reduce code duplication. * sysdeps/ieee754/dbl-64/s_sin.c (do_cos): Accept X and DX as input arguments. Consolidate input partitioning from callers here. (do_cos_slow): Likewise. (do_sin): Likewise. (do_sin_slow): Likewise. (do_sincos_1): Remove the no longer necessary input partitioning. (do_sincos_2): Likewise. (__sin): Likewise. (__cos): Likewise. (slow1): Likewise. (slow2): Likewise. (sloww1): Likewise. (sloww2): Likewise. (bsloww1): Likewise. (bsloww2): Likewise. (cslow2): Likewise.
* Use fabs(x) instead of branching on signedness of input to sin and cosSiddhesh Poyarekar2016-08-301-148/+85
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The sin and cos code is inconsistent about its use of fabs to get the absolute value of X where in some places it conditionalizes the code while in others it uses fabs. fabs seems to be a better candidate in most cases because it avoids a branch. Similarly there is an attempt to make it easier for the compiler to emit conditional assignment instructions (like fcsel on aarch64) where it can, by isolating conditional assignment constructs from the rest of the expression. A further benefit of this change is to identify common constructs across functions and consolidate them in future patches. * sysdeps/ieee754/dbl-64/s_sin.c (do_cos_slow): Use ternary instead of if/else. (do_sin_slow): Likewise. (do_sincos_1): Use fabs instead of if/else. (do_sincos_2): Likewise. (__sin): Likewise. (__cos): Likewise. (slow2): Likewise. (sloww): Likewise. (sloww1): Likewise. Drop argument M. (sloww2): Use fabs instead of if/else. (bsloww): Likewise. (bsloww1): Likewise. (bsloww2): Likewise.
* Add fall through commentsSiddhesh Poyarekar2016-08-301-0/+2
| | | | Add fall through comments I had missed writing in previously.
* Consolidate reduce_and_compute codeSiddhesh Poyarekar2016-08-301-17/+14
| | | | | | | | | | | | This patch reshuffles the reduce_and_compute code so that the structure matches other code structures of the same type elsewhere in s_sin.c and s_sincos.c. This is the beginning of an attempt to consolidate and reduce code duplication in functions in s_sin.c to make it easier to read and possibly also easier for the compiler to optimize. * sysdeps/ieee754/dbl-64/s_sin.c (reduce_and_compute): Consolidate switch cases 0 and 2.
* Merge common usage of mul_split functionPaul E. Murphy2016-08-193-93/+3
| | | | | | | | | | | A number of files share identical code for the mul_split function. This moves the duplicated function mul_split into its own header, and refactors the fma usage into a single selection macro. Likewise, mul_split when used by a long double implementation is renamed mul_splitl for clarity.
* Get rid of array-bounds warning in __kernel_rem_pio2[f] with gcc 6.1 -O3.Stefan Liebler2016-08-181-0/+10
| | | | | | | | | | | | | | | | | | On s390x I get the following werror when build with gcc 6.1 (or current gcc head) and -O3: ../sysdeps/ieee754/dbl-64/k_rem_pio2.c: In function ‘__kernel_rem_pio2’: ../sysdeps/ieee754/dbl-64/k_rem_pio2.c:254:18: error: array subscript is below array bounds [-Werror=array-bounds] for (k = 1; iq[jk - k] == 0; k++) ~~^~~~~~~~ I get the same error with sysdeps/ieee754/flt-32/k_rem_pio2f.c. This patch adds DIAG_* macros around it. ChangeLog: * sysdeps/ieee754/dbl-64/k_rem_pio2.c (__kernel_rem_pio2): Use DIAG_*_NEEDS_COMMENT macro to get rid of array-bounds warning. * sysdeps/ieee754/flt-32/k_rem_pio2f.c (__kernel_rem_pio2f): Likewise.
* sparc64: add a VIS3 version of ceil, floor and truncAurelien Jarno2016-08-032-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sparc64 passes floating point values in the floating point registers. As the the generic ceil, floor and trunc functions use integer instructions, it makes sense to provide a VIS3 version consisting in the the generic version compiled with -mvis3. GCC will then use movdtox, movxtod, movwtos and movstow instructions. sparc32 passes the floating point values in the integer registers, so it doesn't make sense to do the same. Changelog: * sysdeps/ieee754/dbl-64/s_trunc.c: Avoid alias renamed. * sysdeps/ieee754/dbl-64/wordsize-64/s_trunc.c: Likewise. * sysdeps/ieee754/flt-32/s_truncf.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/Makefile [$(subdir) = math && $(have-as-vis3) = yes] (libm-sysdep_routines): Add s_ceilf-vis3, s_ceil-vis3, s_floorf-vis3, s_floor-vis3, s_truncf-vis3, s_trunc-vis3. (CFLAGS-s_ceilf-vis3.c): New. Set to -Wa,-Av9d -mvis3. (CFLAGS-s_ceil-vis3.c): Likewise. (CFLAGS-s_floorf-vis3.c): Likewise. (CFLAGS-s_floor-vis3.c): Likewise. (CFLAGS-s_truncf-vis3.c): Likewise. (CFLAGS-s_trunc-vis3.c): Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_ceil-vis3.c: New file. * sysdeps/sparc/sparc64/fpu/multiarch/s_ceil.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_ceilf-vis3.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_ceilf.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_floor-vis3.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_floor.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_floorf-vis3.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_floorf.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_trunc-vis3.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_trunc.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_truncf-vis3.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_truncf.c: Likewise.
* Fix cos computation for multiple precision fallback (bz #20357)Siddhesh Poyarekar2016-07-181-2/+2
| | | | | | | | | | | | | | | | | | | | During the sincos consolidation I made two mistakes, one was a logical error due to which cos(0x1.8475e5afd4481p+0) returned sin(0x1.8475e5afd4481p+0) instead. The second issue was an error in negating inputs for the correct quadrants for sine. I could not find a suitable test case for this despite running a program to search for such an input for a couple of hours. Following patch fixes both issues. Tested on x86_64. Thanks to Matt Clay for identifying the issue. [BZ #20357] * sysdeps/ieee754/dbl-64/s_sin.c (sloww): Fix up condition to call __mpsin/__mpcos and to negate values. * math/auto-libm-test-in: Add test. * math/auto-libm-test-out: Regenerate.
* Add nextup and nextdown math functionsRajalakshmi Srinivasaraghavan2016-06-161-0/+58
| | | | | | | | | | | TS 18661 adds nextup and nextdown functions alongside nextafter to provide support for float128 equivalent to it. This patch adds nextupl, nextup, nextupf, nextdownl, nextdown and nextdownf to libm before float128 support. The nextup functions return the next representable value in the direction of positive infinity and the nextdown functions return the next representable value in the direction of negative infinity. These are currently enabled as GNU extensions.
* Fix dbl-64 atan2 (sNaN, qNaN) (bug 20252).Joseph Myers2016-06-131-1/+1
| | | | | | | | | | | | | | | | | | The dbl-64 implementation of atan2, passed arguments (sNaN, qNaN), fails to raise the "invalid" exception. This patch fixes it to add both arguments, rather than just adding the second argument to itself, in the case where the second argument is a NaN (which is checked for before checking for the first argument being a NaN). sNaN tests for atan2 are added, along with some qNaN tests I noticed were missing but should have been there by analogy with other tests present. Tested for x86_64 and x86. [BZ #20252] * sysdeps/ieee754/dbl-64/e_atan2.c (__ieee754_atan2): Add both arguments when second argument is a NaN. * math/libm-test.inc (atan2_test_data): Add sNaN tests and more qNaN tests.
* Fix frexp (NaN) (bug 20250).Joseph Myers2016-06-132-1/+4
| | | | | | | | | | | | | | | | | | | | | Various implementations of frexp functions return sNaN for sNaN input. This patch fixes them to add such arguments to themselves so that qNaN is returned. Tested for x86_64, x86, mips64 and powerpc. [BZ #20250] * sysdeps/i386/fpu/s_frexpl.S (__frexpl): Add non-finite input to itself. * sysdeps/ieee754/dbl-64/s_frexp.c (__frexp): Add non-finite or zero input to itself. * sysdeps/ieee754/dbl-64/wordsize-64/s_frexp.c (__frexp): Likewise. * sysdeps/ieee754/flt-32/s_frexpf.c (__frexpf): Likewise. * sysdeps/ieee754/ldbl-128/s_frexpl.c (__frexpl): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_frexpl.c (__frexpl): Likewise. * sysdeps/ieee754/ldbl-96/s_frexpl.c (__frexpl): Likewise. * math/libm-test.inc (frexp_test_data): Add sNaN tests.
* Fix dbl-64 asin (sNaN) (bug 20213).Joseph Myers2016-06-061-1/+1
| | | | | | | | | | | | | The dbl-64 version of asin returns sNaN for sNaN arguments. This patch fixes it to add NaN arguments to themselves so that qNaN is returned in this case. Tested for x86_64 and x86. [BZ #20213] * sysdeps/ieee754/dbl-64/e_asin.c (__ieee754_asin): Add NaN argument to itself. * math/libm-test.inc (asin_test_data): Add sNaN tests.
* Fix dbl-64 acos (sNaN) (bug 20212).Joseph Myers2016-06-061-1/+1
| | | | | | | | | | | | | The dbl-64 version of acos returns sNaN for sNaN arguments. This patch fixes it to add NaN arguments to themselves so that qNaN is returned in this case. Tested for x86_64 and x86. [BZ #20212] * sysdeps/ieee754/dbl-64/e_asin.c (__ieee754_acos): Add NaN argument to itself. * math/libm-test.inc (acos_test_data): Add sNaN tests.
* Do not raise "inexact" from generic round (bug 15479).Joseph Myers2016-05-242-16/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | C99 and C11 allow but do not require ceil, floor, round and trunc to raise the "inexact" exception for noninteger arguments. TS 18661-1 requires that this exception not be raised by these functions. This aligns them with general IEEE semantics, where "inexact" is only raised if the final step of rounding the infinite-precision result to the result type is inexact; for these functions, the infinite-precision integer result is always representable in the result type, so "inexact" should never be raised. The generic implementations of ceil, floor and round functions contain code to force "inexact" to be raised. This patch removes it for round functions to align them with TS 18661-1 in this regard. The tests *are* updated by this patch; there are fewer architecture-specific versions than for ceil and floor, and I fixed the powerpc ones some time ago. If any others still have the issue, as shown by tests for round failing with spurious exceptions, they can be fixed separately by architecture maintainers or others. Tested for x86_64, x86 and mips64. [BZ #15479] * sysdeps/ieee754/dbl-64/s_round.c (huge): Remove variable. (__round): Do not force "inexact" exception. * sysdeps/ieee754/dbl-64/wordsize-64/s_round.c (huge): Remove variable. (__round): Do not force "inexact" exception. * sysdeps/ieee754/flt-32/s_roundf.c (huge): Remove variable. (__roundf): Do not force "inexact" exception. * sysdeps/ieee754/ldbl-128/s_roundl.c (huge): Remove variable. (__roundl): Do not force "inexact" exception. * sysdeps/ieee754/ldbl-96/s_roundl.c (huge): Remove variable. (__roundl): Do not force "inexact" exception. * math/libm-test.inc (round_test_data): Do not allow spurious "inexact" exceptions.
* Do not raise "inexact" from generic floor (bug 15479).Joseph Myers2016-05-242-15/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | C99 and C11 allow but do not require ceil, floor, round and trunc to raise the "inexact" exception for noninteger arguments. TS 18661-1 requires that this exception not be raised by these functions. This aligns them with general IEEE semantics, where "inexact" is only raised if the final step of rounding the infinite-precision result to the result type is inexact; for these functions, the infinite-precision integer result is always representable in the result type, so "inexact" should never be raised. The generic implementations of ceil, floor and round functions contain code to force "inexact" to be raised. This patch removes it for floor functions to align them with TS 18661-1 in this regard. Note that some architecture-specific versions may still raise "inexact", so the tests are not updated and the bug is not yet fixed. Tested for x86_64, x86 and mips64. [BZ #15479] * sysdeps/ieee754/dbl-64/s_floor.c: Do not mention "inexact" exception in comment. (huge): Remove variable. (__floor): Do not force "inexact" exception. * sysdeps/ieee754/dbl-64/wordsize-64/s_floor.c: Do not mention "inexact" exception in comment. (huge): Remove variable. (__floor): Do not force "inexact" exception. * sysdeps/ieee754/flt-32/s_floorf.c: Do not mention "inexact" exception in comment. (huge): Remove variable. (__floorf): Do not force "inexact" exception. * sysdeps/ieee754/ldbl-128/s_floorl.c: Do not mention "inexact" exception in comment. (huge): Remove variable. (__floorl): Do not force "inexact" exception.
* Do not raise "inexact" from generic ceil (bug 15479).Joseph Myers2016-05-242-15/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | C99 and C11 allow but do not require ceil, floor, round and trunc to raise the "inexact" exception for noninteger arguments. TS 18661-1 requires that this exception not be raised by these functions. This aligns them with general IEEE semantics, where "inexact" is only raised if the final step of rounding the infinite-precision result to the result type is inexact; for these functions, the infinite-precision integer result is always representable in the result type, so "inexact" should never be raised. The generic implementations of ceil, floor and round functions contain code to force "inexact" to be raised. This patch removes it for ceil functions to align them with TS 18661-1 in this regard. Note that some architecture-specific versions may still raise "inexact", so the tests are not updated and the bug is not yet fixed. Tested for x86_64, x86 and mips64. [BZ #15479] * sysdeps/ieee754/dbl-64/s_ceil.c: Do not mention "inexact" exception in comment. (huge): Remove variable. (__ceil): Do not force "inexact" exception. * sysdeps/ieee754/dbl-64/wordsize-64/s_ceil.c: Do not mention "inexact" exception in comment. (huge): Remove variable. (__ceil): Do not force "inexact" exception. * sysdeps/ieee754/flt-32/s_ceilf.c (huge): Remove variable. (__ceilf): Do not force "inexact" exception. * sysdeps/ieee754/ldbl-128/s_ceill.c: Do not mention "inexact" exception in comment. (huge): Remove variable. (__ceill): Do not force "inexact" exception.
* Fix __finitel libm compat symbol version.Joseph Myers2016-01-202-4/+4
| | | | | | | | | | | | | | | | | | The changes to restrict implementation-namespace symbol aliases such as __finitel to compat symbols used code for __finitel in libm analogous to that for __finitel in libc. However, the versions for the two symbols are actually different, GLIBC_2.0 in libc and GLIBC_2.1 in libm. This patch fixes the handling of the libm compat symbol. Tested for mips (o32), where it fixes an ABI test failure. * sysdeps/ieee754/dbl-64/s_finite.c [NO_LONG_DOUBLE && LDBL_CLASSIFY_COMPAT] (__finitel): Define compat symbol at version GLIBC_2_1 and use GLIBC_2_1 in SHLIB_COMPAT condition for libm, not GLIBC_2_0. * sysdeps/ieee754/dbl-64/wordsize-64/s_finite.c [NO_LONG_DOUBLE && LDBL_CLASSIFY_COMPAT] (__finitel): Likewise.
* Call math_opt_barrier inside ifH.J. Lu2016-01-151-1/+4
| | | | | | | | | | | | Since floating-point operation may trigger floating-point exceptions, we call math_opt_barrier inside if to prevent code motion. [BZ #19465] * sysdeps/ieee754/dbl-64/s_fma.c (__fma): Call math_opt_barrier inside if. * sysdeps/ieee754/ldbl-128/s_fmal.c (__fmal): Likewise. * sysdeps/ieee754/ldbl-96/s_fma.c (__fma): Likewise. * sysdeps/ieee754/ldbl-96/s_fmal.c (__fmal): Likewise.
* Eliminate redundant sign extensions in pow()Anton Blanchard2016-01-041-2/+4
| | | | | | | | When looking at the code generated for pow() on ppc64 I noticed quite a few sign extensions. Making the array indices unsigned reduces the number of sign extensions from 24 to 7. Tested for powerpc64le and x86_64.
* Update copyright dates with scripts/update-copyrights.Joseph Myers2016-01-0490-90/+90
|
* Consolidate sincos computation for 2.426265 < |x| < 105414350Siddhesh Poyarekar2015-12-212-281/+123
| | | | | | | Like the previous change, exploit the fact that computation for sin and cos is identical except that it is apart by a quadrant. Also remove csloww, csloww1 and csloww2 since they can easily be expressed in terms of sloww, sloww1 and sloww2.
* Consolidate sin and cos code for 105414350 <|x|< 281474976710656Siddhesh Poyarekar2015-12-212-146/+119
| | | | | | The sin and cos computation for this range of input is identical except for a difference in quadrants by 1. Exploit that fact and the common argument reduction to reduce computations for sincos.
* Consolidate range reduction in sincos for x > 281474976710656Siddhesh Poyarekar2015-12-212-2/+57
| | | | | | Range reduction needs to be done only once for sin and cos, so copy over all of the relevant functions (__sin, __cos, reduce_and_compute) and consolidate common code.
* math: add LDBL_CLASSIFY_COMPAT supportChris Metcalf2015-12-036-6/+34
| | | | | | | | | | | | | | | | | If a platform does not define "long-double-fcts = yes" in its Makefiles and it does define __NO_LONG_DOUBLE_MATH in its installed headers, it will currently create exported symbols for __finitel, __isinfl, and __isnanl that can't be reached from userspace by correct use of the finite(), isinf(), or isnan() macros in <math.h>. To avoid this situation, by default for such platforms we now no longer export these symbols, thus causing appropriate link-time errors. However, for platforms that previously exported these symbols, we continue to do so as compat symbols; this is enabled by adding LDBL_CLASSIFY_COMPAT to math_private.h for the platform. For tile, remove the now-unnecessary exports of those functions from libc and libm.
* Use hex float constants in sysdeps/ieee754/dbl-64/e_sqrt.c.Joseph Myers2015-12-012-46/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Various sysdeps/ieee754/dbl-64 functions use double constants defined using a union between a double and two ints, with separate big-endian and little-endian definitions of the constants. With modern C, this is unnecessary complication; hex float constants (or __builtin_inf etc.) suffice to specify the exact value desired, and so can avoid separate versions for each endianness. Having this complication also complicates cleanups such as removing slow paths from these library functions, as they need to make sure to remove both copies of variables that are no longer used after such a cleanup (and in at least one case, proper removal of a slow path will also involve removing slow-path-only values from the middle of an array - an array with both big-endian and little-endian copies - and adjusting other references to that array). So it makes sense to clean up the code to define these constants using hex floats and so eliminate the endianness conditional. This patch does so in the case of sqrt, where the two constants are such that it makes sense just to put them directly in the code using them and eliminate the names for them altogether. Tested for arm (the code generated for sqrt does change, though not in any significant way). * sysdeps/ieee754/dbl-64/e_sqrt.c: Do not include uroot.h. (__ieee754_sqrt): Use hex float constants instead of tm256.x and t512.x. * sysdeps/ieee754/dbl-64/uroot.h: Remove file.
* Include s_sin.c in s_sincos.cSiddhesh Poyarekar2015-11-172-19/+20
| | | | | | | | | | | | | | Include the __sin and __cos functions as local static copies to allow deper optimization of the functions. This change shows an improvement of about 17% in the min case and 12.5% in the mean case for the sincos microbenchmark on x86_64. * sysdeps/ieee754/dbl-64/s_sin.c (__sin)[IN_SINCOS]: Mark function static and don't set or restore rounding. (__cos)[IN_SINCOS]: Likewise. * sysdeps/ieee754/dbl-64/s_sincos.c: Include s_sin.c. (__sincos): Set and restore rounding mode. Remove check for infinite or NaN input.
* Remove redundant else clauses in s_sin.cSiddhesh Poyarekar2015-11-171-186/+158
| | | | | Makes the code easier to read due to the reduced nesting. The generated binary is unchanged.
* Fix dbl-64 remainder sign of zero result (bug 19201).Joseph Myers2015-11-031-0/+2
| | | | | | | | | | | | | | | | | | For some large arguments, the dbl-64 implementation of remainder gives zero results with the wrong sign, resulting from a subtraction that is mathematically correct but does not guarantee that a zero result has the sign of the first argument to remainder. This patch adds an appropriate check for this case, similar to other implementations of remainder in the case of equality, and adds tests of remainder on inputs already used to test remquo. Tested for x86_64 and x86. [BZ #19201] * sysdeps/ieee754/dbl-64/e_remainder.c (__ieee754_remainder): Check for zero remainder in case of large exponents and ensure correct sign of result in that case. * math/libm-test.inc (remainder_test_data): Add more tests.
* Use C11 *_TRUE_MIN macros where applicable.Joseph Myers2015-10-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | C11 defines standard <float.h> macros *_TRUE_MIN for the least positive subnormal value of a type. Now that we build with -std=gnu11, we can use these macros in glibc. This patch replaces previous uses of the GCC predefines __*_DENORM_MIN__ (used in <float.h> to define *_TRUE_MIN), as well as *_DENORM_MIN references in comments. Tested for x86_64 and x86 (testsuite, and that installed shared libraries are unchanged by the patch). Also tested for powerpc that installed stripped shared libraries are unchanged by the patch. * math/libm-test.inc (min_subnorm_value): Use LDBL_TRUE_MIN, DBL_TRUE_MIN and FLT_TRUE_MIN instead of __LDBL_DENORM_MIN__, __DBL_DENORM_MIN__ and __FLT_DENORM_MIN__. * sysdeps/ieee754/dbl-64/s_fma.c (__fma): Refer to DBL_TRUE_MIN instead of DBL_DENORM_MIN in comment. * sysdeps/ieee754/ldbl-128/s_fmal.c (__fmal): Refer to LDBL_TRUE_MIN instead of LDBL_DENORM_MIN in comment. * sysdeps/ieee754/ldbl-128ibm/s_nextafterl.c: Include <float.h>. (__nextafterl): Use LDBL_TRUE_MIN instead of __LDBL_DENORM_MIN__. * sysdeps/ieee754/ldbl-96/s_fmal.c (__fmal): Refer to LDBL_TRUE_MIN instead of LDBL_DENORM_MIN in comment.
* Remove GCC version conditionals on -Wmaybe-uninitialized pragmas.Joseph Myers2015-10-271-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | One common case of __GNUC_PREREQ (4, 7) conditionals is use of diagnostic control pragmas for -Wmaybe-uninitialized, an option introduced in GCC 4.7 where older GCC needed -Wuninitialized to be controlled instead if the warning appeared with older GCC. This patch removes such conditionals. (There remain several older uses of -Wno-uninitialized in makefiles that still need to be converted to diagnostic control pragmas if the issue is still present with current sources and supported GCC versions, and it's likely that in most cases those pragmas also will end up controlling -Wmaybe-uninitialized.) Tested for x86_64 and x86 (testsuite, and that installed stripped shared libraries are unchanged by the patch, except for libresolv since res_send.c contains assertions whose line numbers are changed by the patch). * resolv/res_send.c (send_vc) [__GNUC_PREREQ (4, 7)]: Make code unconditional. * soft-fp/fmadf4.c [__GNUC_PREREQ (4, 7)]: Likewise. [!__GNUC_PREREQ (4, 7)]: Remove conditional code. * soft-fp/fmasf4.c [__GNUC_PREREQ (4, 7)]: Make code unconditional. [!__GNUC_PREREQ (4, 7)]: Remove conditional code. * soft-fp/fmatf4.c [__GNUC_PREREQ (4, 7)]: Make code unconditional. [!__GNUC_PREREQ (4, 7)]: Remove conditional code. * stdlib/setenv.c [((__GNUC__ << 16) + __GNUC_MINOR__) >= ((4 << 16) + 7)]: Make code unconditional. [!(((__GNUC__ << 16) + __GNUC_MINOR__) >= ((4 << 16) + 7))]: Remove conditional code. * sysdeps/ieee754/dbl-64/e_lgamma_r.c (__ieee754_lgamma_r) [__GNUC_PREREQ (4, 7)]: Make code unconditional. (__ieee754_lgamma_r) [!__GNUC_PREREQ (4, 7)]: Remove conditional code. * sysdeps/ieee754/flt-32/e_lgammaf_r.c (__ieee754_lgammaf_r) [__GNUC_PREREQ (4, 7)]: Make code unconditional. (__ieee754_lgammaf_r) [!__GNUC_PREREQ (4, 7)]: Remove conditional code. * sysdeps/ieee754/ldbl-128/k_tanl.c (__kernel_tanl) [__GNUC_PREREQ (4, 7)]: Make code unconditional. (__kernel_tanl) [!__GNUC_PREREQ (4, 7)]: Remove conditional code. * sysdeps/ieee754/ldbl-128ibm/k_tanl.c (__kernel_tanl) [__GNUC_PREREQ (4, 7)]: Make code unconditional. (__kernel_tanl) [!__GNUC_PREREQ (4, 7)]: Remove conditional code. * sysdeps/ieee754/ldbl-96/e_lgammal_r.c (__ieee754_lgammal_r) [__GNUC_PREREQ (4, 7)]: Make code unconditional. (__ieee754_lgammal_r) [!__GNUC_PREREQ (4, 7)]: Remove conditional code. * sysdeps/ieee754/ldbl-96/k_tanl.c (__kernel_tanl) [__GNUC_PREREQ (4, 7)]: Make code unconditional. (__kernel_tanl) [!__GNUC_PREREQ (4, 7)]: Remove conditional code.
* Fix j1, jn missing errno setting on underflow (bug 18611).Joseph Myers2015-10-232-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | j1 and jn can underflow for small arguments, but fail to set errno when underflowing to 0. This patch fixes them to set errno in that case. Tested for x86_64, x86, mips64 and powerpc. [BZ #18611] * sysdeps/ieee754/dbl-64/e_j1.c (__ieee754_j1): Set errno and avoid excess range and precision on underflow. * sysdeps/ieee754/dbl-64/e_jn.c (__ieee754_jn): Likewise. * sysdeps/ieee754/flt-32/e_j1f.c (__ieee754_j1f): Likewise. * sysdeps/ieee754/flt-32/e_jnf.c (__ieee754_jnf): Likewise. * sysdeps/ieee754/ldbl-128/e_j1l.c (__ieee754_j1l): Set errno on underflow. * sysdeps/ieee754/ldbl-128/e_jnl.c (__ieee754_jnl): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_jnl.c (__ieee754_jnl): Likewise. * sysdeps/ieee754/ldbl-96/e_j1l.c (__ieee754_j1l): Likewise. * sysdeps/ieee754/ldbl-96/e_jnl.c (__ieee754_jnl): Likewise. * math/auto-libm-test-in: Do not allow missing errno setting for tests of j1 and jn. * math/auto-libm-test-out: Regenerated.
* Fix lrint, llrint, lround, llround missing exceptions for MIPS (bug 16399).Joseph Myers2015-10-094-6/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For 32-bit MIPS and some other systems, various of the lrint, llrint, lround, llround functions can be missing exceptions on overflow because casts do not (in current GCC) result in the proper exceptions. In the MIPS case there are two problems here: MIPS I code generation uses an assembler macro that doesn't raise exceptions, while the libgcc conversions of floating-point values to long long also do not raise "invalid" on all overflow cases (and can raise spurious "inexact"). This patch adds support in the generic code (only the functions for which this problem has actually been seen) for forcing the "invalid" exception in the problem cases, and enables that support for the affected MIPS cases. Tested for MIPS; also tested for x86_64 and x86 that installed stripped shared libraries are unchanged by this patch. [BZ #16399] * sysdeps/generic/fix-fp-int-convert-overflow.h: New file. * sysdeps/ieee754/dbl-64/s_llrint.c: Include <fenv.h>, <limits.h> and <fix-fp-int-convert-overflow.h>. (__llrint) [FE_INVALID]: Force FE_INVALID exception as needed if FIX_DBL_LLONG_CONVERT_OVERFLOW. * sysdeps/ieee754/dbl-64/s_llround.c: Include <fenv.h>, <limits.h> and <fix-fp-int-convert-overflow.h>. (__llround) [FE_INVALID]: Force FE_INVALID exception as needed if FIX_DBL_LLONG_CONVERT_OVERFLOW. * sysdeps/ieee754/dbl-64/s_lrint.c: Include <fix-fp-int-convert-overflow.h>. (__lrint) [FE_INVALID]: Force FE_INVALID exception as needed if FIX_DBL_LLONG_CONVERT_OVERFLOW. * sysdeps/ieee754/dbl-64/s_lround.c: Include <fix-fp-int-convert-overflow.h>. (__lround) [FE_INVALID]: Force FE_INVALID exception as needed if FIX_DBL_LLONG_CONVERT_OVERFLOW. * sysdeps/ieee754/flt-32/s_llrintf.c: Include <fenv.h>, <limits.h> and <fix-fp-int-convert-overflow.h>. (__llrintf) [FE_INVALID]: Force FE_INVALID exception as needed if FIX_DBL_LLONG_CONVERT_OVERFLOW. * sysdeps/ieee754/flt-32/s_llroundf.c: Include <fenv.h>, <limits.h> and <fix-fp-int-convert-overflow.h>. (__llroundf) [FE_INVALID]: Force FE_INVALID exception as needed if FIX_DBL_LLONG_CONVERT_OVERFLOW. * sysdeps/ieee754/flt-32/s_lrintf.c: Include <fenv.h>, <limits.h> and <fix-fp-int-convert-overflow.h>. (__lrintf) [FE_INVALID]: Force FE_INVALID exception as needed if FIX_DBL_LLONG_CONVERT_OVERFLOW. * sysdeps/ieee754/flt-32/s_lroundf.c: Include <fenv.h>, <limits.h> and <fix-fp-int-convert-overflow.h>. (__lroundf) [FE_INVALID]: Force FE_INVALID exception as needed if FIX_DBL_LLONG_CONVERT_OVERFLOW. * sysdeps/mips/mips32/fpu/fix-fp-int-convert-overflow.h: New file.
* Fix dbl-64 lrint for 64-bit long (bug 19095).Joseph Myers2015-10-091-1/+1
| | | | | | | | | | | | | | | The dbl-64 implementation of lrint produces incorrect results for some arguments with 64-bit long because a 32-bit (unsigned) low part of the mantissa is shifted left, losing high bits in the process. This patch fixes this by casting to long int before shifting, as in lround (as this case only applies for 64-bit long, there are no issues with sign-extension). Tested for mips64 (n64). [BZ #19095] * sysdeps/ieee754/dbl-64/s_lrint.c (__lrint): Cast low part of mantissa to long int before shifting left.
* Fix lrint, llrint missing exceptions close to overflow threshold (bug 19094).Joseph Myers2015-10-081-4/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The dbl-64, ldbl-96 and ldbl-128 implementations of lrint and llrint fail to produce "invalid" exceptions in cases where the rounded result overflows the target type, but truncating the floating-point argument to the next integer towards zero does not overflow it (so in particular casts do not produce such exceptions). (This issue cannot arise for float, or for double with 64-bit target type, or for ldbl-96 with 64-bit target type and negative arguments, because of insufficient precision in the floating-point type for arguments with the relevant property to exist. It also obviously cannot arise in FE_TOWARDZERO mode.) This patch fixes these problems by inserting checks for the special cases that can occur in each implementation, and explicitly raising FE_INVALID (and avoiding the cast if it might raise spurious FE_INEXACT, while raising FE_INEXACT explicitly in the cases where it is needed; unlike lround and llround, FE_INEXACT is required, not optional, for these functions for a within-range inexact result). The fixes are conditional on FE_INVALID or FE_INEXACT being defined. If any future architecture supports one but not both of those exceptions, the code will fail to compile and need fixing to handle that case (this seemed better than conditioning on both macros being defined, resulting in code that would compile but quietly miss exceptions on such a system). Tested for x86_64, x86 and mips64. Tested the ldbl-96 changes (only relevant for ia64, it appears) on x86_64 by removing the x86_64 versions of lrintl / llrintl. [BZ #19094] * sysdeps/ieee754/dbl-64/s_lrint.c: Include <fenv.h> and <limits.h>. (__lrint) [FE_INVALID || FE_INEXACT]: Force FE_INVALID exception when result overflows but exception would not result from cast. * sysdeps/ieee754/ldbl-128/s_llrintl.c: Include <fenv.h> and <limits.h>. (__llrintl) [FE_INVALID || FE_INEXACT]: Force FE_INVALID exception when result overflows but exception would not result from cast. * sysdeps/ieee754/ldbl-128/s_lrintl.c: Include <fenv.h> and <limits.h>. (__lrintl) [FE_INVALID || FE_INEXACT]: Force FE_INVALID exception when result overflows but exception would not result from cast. * sysdeps/ieee754/ldbl-96/s_llrintl.c: Include <fenv.h> and <limits.h>. (__llrintl) [FE_INVALID || FE_INEXACT]: Force FE_INVALID exception when result overflows but exception would not result from cast. * sysdeps/ieee754/ldbl-96/s_lrintl.c: Include <fenv.h> and <limits.h>. (__lrintl) [FE_INVALID || FE_INEXACT]: Force FE_INVALID exception when result overflows but exception would not result from cast. * math/libm-test.inc (lrint_test_data): Add more tests. (llrint_test_data): Likewise.