mirror/glibc - mirror of git://sourceware.org/git/glibc.git

	Commit message (Collapse)	Author	Age	Files	Lines
*	Add new log2 implementation	Szabolcs Nagy	2018-09-12	1	-0/+1
	A similar algorithm is used as in log: log2(2^k x) = k + log2(c) + log2(x/c), where the last term is approximated by a polynomial of x/c - 1; the first order coefficient is about 1/ln2 in this case.
	There is a separate code path for computing x/c - 1 precisely when the fma instruction is not available, for which the table size is doubled.
	The worst case error is 0.547 ULP (0.55 without fma), the read only global data size is 1168 bytes (2192 without fma) on aarch64. The non-nearest rounding error is less than 1 ULP.
	Improvements on Cortex-A72 compared to current glibc master:
	log2 thruput: 2.00x in [0.01 11.1]
	log2 latency: 2.04x in [0.01 11.1]
	log2 thruput: 2.17x in [0.999 1.001]
	log2 latency: 2.88x in [0.999 1.001]
	Tested on
	aarch64-linux-gnu (defined __FP_FAST_FMA)
	arm-linux-gnueabihf (!defined __FP_FAST_FMA)
	x86_64-linux-gnu (!defined __FP_FAST_FMA)
	powerpc64le-linux-gnu (defined __FP_FAST_FMA)
	targets.
	* NEWS: Mention log2 improvements.
	* math/Makefile (type-double-routines): Add e_log2_data.
	* sysdeps/i386/fpu/e_log2_data.c: New file.
	* sysdeps/ia64/fpu/e_log2_data.c: New file.
	* sysdeps/ieee754/dbl-64/e_log2.c: Rewrite.
	* sysdeps/ieee754/dbl-64/e_log2_data.c: New file.
	* sysdeps/ieee754/dbl-64/math_config.h (__log2_data): Add.
	* sysdeps/ieee754/dbl-64/wordsize-64/e_log2.c: Remove.
	* sysdeps/m68k/m680x0/fpu/e_log2_data.c: New file.
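The decomposition log2(2^k x) = k + log2(c) + log2(x/c) described in this commit can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the glibc code: the table size `N`, the choice of interval centres, the degree-5 Taylor polynomial and the name `log2_sketch` are all hypothetical, and no attempt is made to reach the sub-ULP error bounds quoted in the commit message.

```python
import math

N = 64  # hypothetical table size; the real table layout differs

# One entry per subinterval of [0.5, 1): (1/c, log2(c)) for the centre c.
TABLE = []
for i in range(N):
    c = 0.5 + (i + 0.5) / (2 * N)
    TABLE.append((1.0 / c, math.log2(c)))

INV_LN2 = 1.0 / math.log(2.0)  # first order coefficient, about 1/ln2


def log2_sketch(x):
    """Approximate log2(x) for finite x > 0 via table lookup plus polynomial."""
    if not (x > 0.0) or math.isinf(x):
        raise ValueError("sketch only handles finite positive inputs")
    m, k = math.frexp(x)            # x = m * 2**k with 0.5 <= m < 1
    i = int((m - 0.5) * 2 * N)      # index of the subinterval containing m
    invc, log2c = TABLE[i]
    r = m * invc - 1.0              # r = x/c - 1, small by construction
    # Taylor polynomial of ln(1 + r), scaled by 1/ln2 to give log2(1 + r).
    p = r * (1.0 + r * (-1 / 2 + r * (1 / 3 + r * (-1 / 4 + r * (1 / 5)))))
    return k + log2c + p * INV_LN2
```

Because |r| stays below roughly 1/128 with this table, a short polynomial is enough for the sketch; comparing `log2_sketch(x)` against `math.log2(x)` for a few inputs is a quick sanity check.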
*	Add new log implementation	Szabolcs Nagy	2018-09-12	1	-0/+1
	Optimized log using a carefully generated lookup table with 1/c and log(c) values for small intervals around 1. The log(c) is very near a double precision value, it has about 62 bits precision.
	The algorithm is log(2^k x) = k log(2) + log(c) + log(x/c), where the last term is approximated by a polynomial of x/c - 1. Near 1 a single polynomial of x - 1 is used.
	There is a separate code path for computing x/c - 1 precisely when the fma instruction is not available, in which case the table size is doubled. The code uses __builtin_fma under __FP_FAST_FMA to ensure it is inlined as an instruction.
	With the default configuration settings the worst case error is 0.519 ULP (and 0.520 without fma), the rodata size is 2192 bytes (4240 without fma). The non-nearest rounding error is less than 1 ULP.
	Improvements on Cortex-A72 compared to current glibc master:
	log thruput: 3.28x in [0.01 11.1]
	log latency: 2.23x in [0.01 11.1]
	log thruput: 1.56x in [0.999 1.001]
	log latency: 1.57x in [0.999 1.001]
	Tested on
	aarch64-linux-gnu (defined __FP_FAST_FMA)
	arm-linux-gnueabihf (!defined __FP_FAST_FMA)
	x86_64-linux-gnu (!defined __FP_FAST_FMA)
	powerpc64le-linux-gnu (defined __FP_FAST_FMA)
	targets.
	* NEWS: Mention log improvement.
	* math/Makefile (type-double-routines): Add e_log_data.
	* sysdeps/i386/fpu/e_log_data.c: New file.
	* sysdeps/ia64/fpu/e_log_data.c: New file.
	* sysdeps/ieee754/dbl-64/e_log.c: Rewrite.
	* sysdeps/ieee754/dbl-64/e_log_data.c: New file.
	* sysdeps/ieee754/dbl-64/math_config.h (__log_data): Add.
	* sysdeps/ieee754/dbl-64/ulog.h: Remove.
	* sysdeps/ieee754/dbl-64/ulog.tbl: Remove.
	* sysdeps/m68k/m680x0/fpu/e_log_data.c: New file.
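To see why the commit distinguishes the fma and non-fma paths for x/c - 1, the following sketch compares a plain multiply-then-subtract with an emulated fused operation. It is an illustration only: the fused result is emulated with exact rational arithmetic rather than a hardware instruction, and the values of `c` and `x` as well as the function names are hypothetical. How glibc compensates on targets without fma (beyond doubling the table) is not shown here.

```python
from fractions import Fraction


def plain_r(x, invc):
    """Two roundings: the product is rounded before 1 is subtracted."""
    return x * invc - 1.0


def fused_r(x, invc):
    """Emulate fma(x, invc, -1.0): exact arithmetic, one final rounding."""
    return float(Fraction(x) * Fraction(invc) - 1)


def exact_r(x, invc):
    """Exact value of x * invc - 1 as a rational number."""
    return Fraction(x) * Fraction(invc) - 1


c = 1.0 + 1.0 / 128.0   # hypothetical interval centre close to 1
invc = 1.0 / c
x = 1.01

err_plain = abs(Fraction(plain_r(x, invc)) - exact_r(x, invc))
err_fused = abs(Fraction(fused_r(x, invc)) - exact_r(x, invc))
print(float(err_plain), float(err_fused))
```

The plain version loses accuracy because the product is rounded while it is still close to 1, so its absolute error is measured against 1 rather than against the much smaller r; the fused version rounds only the final small result.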
*	Add new exp and exp2 implementations	Szabolcs Nagy	2018-09-05	3	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Optimized exp and exp2 implementations using a lookup table for fractional powers of 2. There are several variants, see e_exp_data.c, they can be selected by modifying math_config.h allowing different tradeoffs. The default selection should be acceptable as generic libm code. Worst case error is 0.509 ULP for exp and 0.507 ULP for exp2, on aarch64 the rodata size is 2160 bytes, shared between exp and exp2. On aarch64 .text + .rodata size decreased by 24912 bytes. The non-nearest rounding error is less than 1 ULP even on targets without efficient round implementation (although the error rate is higher in that case). Targets with single instruction, rounding mode independent, to nearest integer rounding and conversion can use them by setting TOINT_INTRINSICS and adding the necessary code to their math_private.h. The __exp1 code uses the same algorithm, so the error bound of pow increased a bit. New double precision error handling code was added following the style of the single precision error handling code. Improvements on Cortex-A72 compared to current glibc master: exp thruput: 1.61x in [-9.9 9.9] exp latency: 1.53x in [-9.9 9.9] exp thruput: 1.13x in [0.5 1] exp latency: 1.30x in [0.5 1] exp2 thruput: 2.03x in [-9.9 9.9] exp2 latency: 1.64x in [-9.9 9.9] For small (< 1) inputs the current exp code uses a separate algorithm so the speed up there is less. Was tested on aarch64-linux-gnu (TOINT_INTRINSICS, fma contraction) and arm-linux-gnueabihf (!TOINT_INTRINSICS, no fma contraction) and x86_64-linux-gnu (!TOINT_INTRINSICS, no fma contraction) and powerpc64le-linux-gnu (!TOINT_INTRINSICS, fma contraction) targets, only non-nearest rounding ulp errors increase and they are within acceptable bounds (ulp updates are in separate patches). 
* NEWS: Mention exp and exp2 improvements. * math/Makefile (libm-support): Remove t_exp. (type-double-routines): Add math_err and e_exp_data. * sysdeps/aarch64/libm-test-ulps: Update. * sysdeps/arm/libm-test-ulps: Update. * sysdeps/i386/fpu/e_exp_data.c: New file. * sysdeps/i386/fpu/math_err.c: New file. * sysdeps/i386/fpu/t_exp.c: Remove. * sysdeps/ia64/fpu/e_exp_data.c: New file. * sysdeps/ia64/fpu/math_err.c: New file. * sysdeps/ia64/fpu/t_exp.c: Remove. * sysdeps/ieee754/dbl-64/e_exp.c: Rewrite. * sysdeps/ieee754/dbl-64/e_exp2.c: Rewrite. * sysdeps/ieee754/dbl-64/e_exp_data.c: New file. * sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Update error bound. * sysdeps/ieee754/dbl-64/eexp.tbl: Remove. * sysdeps/ieee754/dbl-64/math_config.h: New file. * sysdeps/ieee754/dbl-64/math_err.c: New file. * sysdeps/ieee754/dbl-64/t_exp.c: Remove. * sysdeps/ieee754/dbl-64/t_exp2.h: Remove. * sysdeps/ieee754/dbl-64/uexp.h: Remove. * sysdeps/ieee754/dbl-64/uexp.tbl: Remove. * sysdeps/m68k/m680x0/fpu/e_exp_data.c: New file. * sysdeps/m68k/m680x0/fpu/math_err.c: New file. * sysdeps/m68k/m680x0/fpu/t_exp.c: Remove. * sysdeps/powerpc/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Update.
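The idea of using a lookup table for fractional powers of 2 can be sketched as follows. This is a simplified illustration, not the glibc implementation: the table size `N`, the degree-5 polynomial and the name `exp_sketch` are assumptions, ln2 is used as a single double rather than a more precise split constant, and overflow, underflow and special values are not handled.

```python
import math

N = 128                                    # hypothetical table size
TAB = [2.0 ** (j / N) for j in range(N)]   # 2^(j/N) for j = 0 .. N-1
LN2 = math.log(2.0)


def exp_sketch(x):
    """Approximate exp(x) for moderate finite x."""
    # Write x = k * ln2/N + r with integer k and |r| <= ln2/(2N).
    k = round(x * N / LN2)
    r = x - k * (LN2 / N)
    # Split k = q*N + j with 0 <= j < N, so 2^(k/N) = 2^q * 2^(j/N).
    q, j = divmod(k, N)
    # Short Taylor polynomial for exp(r); r is tiny so few terms suffice.
    p = 1.0 + r * (1.0 + r * (0.5 + r * (1 / 6 + r * (1 / 24 + r / 120))))
    return math.ldexp(TAB[j] * p, q)
```

For inputs in a range such as [-9.9, 9.9], `exp_sketch(x)` should agree closely with `math.exp(x)`; the accuracy of this sketch degrades as |x| grows because the reduction step uses a single rounded ln2.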
*	Remove unused math files	Wilco Dijkstra	2018-08-24	3	-9/+0
	Remove empty files due to the sin/cos improvements: k_sinf.c, k_cosf.c, k_cos.c, k_sin.c. After the tanf change s_rem_pio2f.c and k_rem_pio2f.c (and the ia64, m68k and powerpc equivalents) are no longer used, so remove them. All e_rem_pio2.c files were already empty or commented out, so remove them too.
	Passes build-many-glibcs.
	* math/Makefile: Remove empty files k_sin(f).c, k_cos(f).c. Remove unused files e_rem_pio2(f).c, k_rem_pio2f.c.
	* sysdeps/i386/fpu/e_rem_pio2.c: Delete file.
	* sysdeps/ia64/fpu/e_rem_pio2.c: Likewise.
	* sysdeps/ia64/fpu/e_rem_pio2f.c: Likewise.
	* sysdeps/ia64/fpu/k_rem_pio2f.c: Likewise.
	* sysdeps/ieee754/dbl-64/e_rem_pio2.c: Likewise.
	* sysdeps/ieee754/dbl-64/k_cos.c: Likewise.
	* sysdeps/ieee754/dbl-64/k_sin.c: Likewise.
	* sysdeps/ieee754/flt-32/e_rem_pio2f.c: Likewise.
	* sysdeps/ieee754/flt-32/k_cosf.c: Likewise.
	* sysdeps/ieee754/flt-32/k_rem_pio2f.c: Likewise.
	* sysdeps/ieee754/flt-32/k_sinf.c: Likewise.
	* sysdeps/m68k/m680x0/fpu/e_rem_pio2.c: Likewise.
	* sysdeps/m68k/m680x0/fpu/e_rem_pio2f.c: Likewise.
	* sysdeps/m68k/m680x0/fpu/k_rem_pio2f.c: Likewise.
	* sysdeps/powerpc/fpu/e_rem_pio2f.c: Likewise.
	* sysdeps/powerpc/fpu/k_rem_pio2f.c: Likewise.
*	Move EXCEPTION_TESTS_* out of math-tests.h	Joseph Myers	2018-08-23	1	-26/+0
	Continuing moving macros out of math-tests.h to smaller headers following typo-proof conventions instead of using #ifndef, this patch moves the EXCEPTION_TESTS_* macros for individual types out to their own sysdeps header.
	As with ROUNDING_TESTS_, there is no need to define these macros if FE_ALL_EXCEPT == 0 and the individual exception macros are undefined; thus, math-tests-exceptions.h headers are only needed for soft-float ARM and RISC-V, while the other cases that defined these macros do not need to do so (and the associated math-tests.h headers are thus removed without needing replacement by math-tests-exceptions.h headers).
	Tested with build-many-glibcs.py.
	* sysdeps/generic/math-tests-exceptions.h: New file.
	* sysdeps/generic/math-tests.h: Include <math-tests-exceptions.h>. (EXCEPTION_TESTS_float): Do not define here. (EXCEPTION_TESTS_double): Likewise. (EXCEPTION_TESTS_long_double): Likewise. (EXCEPTION_TESTS_float128): Likewise.
	* sysdeps/arm/math-tests.h [__SOFTFP__] (EXCEPTION_TESTS_float): Likewise. [__SOFTFP__] (EXCEPTION_TESTS_double): Likewise. [__SOFTFP__] (EXCEPTION_TESTS_long_double): Likewise.
	* sysdeps/arm/nofpu/math-tests-exceptions.h: New file.
	* sysdeps/m68k/coldfire/math-tests.h: Remove file.
	* sysdeps/mips/math-tests.h: Likewise.
	* sysdeps/nios2/math-tests.h: Likewise.
	* sysdeps/riscv/math-tests.h [!__riscv_flen] (EXCEPTION_TESTS_float): Do not define here. [!__riscv_flen] (EXCEPTION_TESTS_double): Likewise. [!__riscv_flen] (EXCEPTION_TESTS_long_double): Likewise.
	* sysdeps/riscv/nofpu/math-tests-exceptions.h: New file.
*	Move ROUNDING_TESTS_* out of math-tests.h.	Joseph Myers	2018-08-22	1	-3/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Continuing moving macros out of math-tests.h to smaller headers following typo-proof conventions instead of using #ifndef, this patch moves the ROUNDING_TESTS_* macros for individual types out to their own sysdeps header. In the soft-float case where FE_TONEAREST is the only rounding mode macro defined, there is no need to define ROUNDING_TESTS_; it is only necessary when rounding modes macros are defined that may not be supported at runtime. Thus, the ROUNDING_TESTS_ definitions for some configurations are just removed, not moved to new math-tests-rounding.h headers; the only architectures needing math-tests-rounding.h are those where the macros are defined in bits/fenv.h because of the possibility of a soft-float compilation using a hard-float glibc with the same ABI (i.e., ARM and RISC-V). The test--vlen.h headers, by using #undef, do not yet follow typo-proof conventions (but they no longer implicitly rely on being included before math-tests.h, and this area can always be cleaned up further in future). Tested with build-many-glibcs.py. * sysdeps/generic/math-tests-rounding.h: New file. * sysdeps/generic/math-tests.h: Include <math-tests-rounding.h>. (ROUNDING_TESTS_float): Do not define here. (ROUNDING_TESTS_double): Likewise. (ROUNDING_TESTS_long_double): Likewise. (ROUNDING_TESTS_float128): Likewise. * math/test-double-vlen2.h: Include <math-tests-rounding.h>. (ROUNDING_TESTS_double): Undefine before defining. * math/test-double-vlen4.h: Include <math-tests-rounding.h>. (ROUNDING_TESTS_double): Undefine before defining. * math/test-double-vlen8.h: Include <math-tests-rounding.h>. (ROUNDING_TESTS_double): Undefine before defining. * math/test-float-vlen16.h: Include <math-tests-rounding.h>. (ROUNDING_TESTS_float): Undefine before defining. 
* math/test-float-vlen4.h: Include <math-tests-rounding.h>. (ROUNDING_TESTS_float): Undefine before defining. * math/test-float-vlen8.h: Include <math-tests-rounding.h>. (ROUNDING_TESTS_float): Undefine before defining. * sysdeps/arm/nofpu/math-tests-rounding.h: New file. * sysdeps/arm/math-tests.h [__SOFTFP__] (ROUNDING_TESTS_float): Do not define here. [__SOFTFP__] (ROUNDING_TESTS_double): Likewise. [__SOFTFP__] (ROUNDING_TESTS_long_double): Likewise. * sysdeps/riscv/nofpu/math-tests-rounding.h: New file. * sysdeps/riscv/math-tests.h [!__riscv_flen] (ROUNDING_TESTS_float): Do not define here. [!__riscv_flen] (ROUNDING_TESTS_double): Likewise. [!__riscv_flen] (ROUNDING_TESTS_long_double): Likewise. * sysdeps/m68k/coldfire/math-tests.h [!__mcffpu__] (ROUNDING_TESTS_float): Likewise. [!__mcffpu__] (ROUNDING_TESTS_double): Likewise. [!__mcffpu__] (ROUNDING_TESTS_long_double): Likewise. * sysdeps/mips/math-tests.h [__mips_soft_float] (ROUNDING_TESTS_float): Likewise. [__mips_soft_float] (ROUNDING_TESTS_double): Likewise. [__mips_soft_float] (ROUNDING_TESTS_long_double): Likewise. * sysdeps/nios2/math-tests.h (ROUNDING_TESTS_float): Likewise. (ROUNDING_TESTS_double): Likewise. (ROUNDING_TESTS_long_double): Likewise.
*	Improve performance of sincosf	Wilco Dijkstra	2018-08-10	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch is a complete rewrite of sincosf. The new version is significantly faster, as well as simple and accurate. The worst-case ULP is 0.5607, maximum relative error is 0.5303 * 2^-23 over all 4 billion inputs. In non-nearest rounding modes the error is 1ULP. The algorithm uses 3 main cases: small inputs which don't need argument reduction, small inputs which need a simple range reduction and large inputs requiring complex range reduction. The code uses approximate integer comparisons to quickly decide between these cases. The small range reducer uses a single reduction step to handle values up to 120.0. It is fastest on targets which support inlined round instructions. The large range reducer uses integer arithmetic for simplicity. It does a 32x96 bit multiply to compute a 64-bit modulo result. This is more than accurate enough to handle the worst-case cancellation for values close to an integer multiple of PI/4. It could be further optimized, however it is already much faster than necessary. sincosf throughput gains on Cortex-A72: * \|x\| < 0x1p-12 : 1.6x * \|x\| < M_PI_4 : 1.7x * \|x\| < 2 * M_PI: 1.5x * \|x\| < 120.0 : 1.8x * \|x\| < Inf : 2.3x * math/Makefile: Add s_sincosf_data.c. * sysdeps/ia64/fpu/s_sincosf_data.c: New file. * sysdeps/ieee754/flt-32/s_sincosf.h (abstop12): Add new function. (sincosf_poly): Likewise. (reduce_small): Likewise. (reduce_large): Likewise. * sysdeps/ieee754/flt-32/s_sincosf.c (sincosf): Rewrite. * sysdeps/ieee754/flt-32/s_sincosf_data.c: New file with sincosf data. * sysdeps/m68k/m680x0/fpu/s_sincosf_data.c: New file. * sysdeps/x86_64/fpu/s_sincosf_data.c: New file.
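The case selection and the single-step small range reduction described in this commit can be sketched as below. This is an illustrative model only: `abstop12` and `reduce_small` borrow names from the ChangeLog entries, but their bodies, the helper `classify`, and the use of `math.sin`/`math.cos` in place of the polynomial evaluation are assumptions, and the reduction here works in double precision modulo pi/2.

```python
import math
import struct


def abstop12(x):
    """Top 12 bits of the float32 bit pattern of x, with the sign cleared."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return (bits >> 20) & 0x7FF


def classify(x):
    """Pick one of the three cases using approximate integer comparisons."""
    top = abstop12(x)
    if top < abstop12(math.pi / 4):
        return "no reduction"
    if top < abstop12(120.0):
        return "small reduction"
    return "large reduction"


def reduce_small(x):
    """Single reduction step: x = n*pi/2 + r, returning r and n mod 4."""
    n = round(x * (2.0 / math.pi))
    r = x - n * (math.pi / 2.0)
    return r, n & 3


def sincos_sketch(x):
    """sin and cos for |x| below about 120, via one reduction step."""
    r, quadrant = reduce_small(x)
    s, c = math.sin(r), math.cos(r)   # stand-in for the polynomial step
    if quadrant == 0:
        return s, c
    if quadrant == 1:
        return c, -s
    if quadrant == 2:
        return -s, -c
    return -c, s
```

The comparisons are approximate because only the top 12 bits are compared: an input such as 0.78, slightly below pi/4, shares its top bits with pi/4 and is therefore routed to the small reduction case, which still gives a correct result.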
*	m68k: Reorganize log1p and significand implementations	Tulio Magno Quites Machado Filho	2018-06-22	6	-32/+44
	Commit 5e79e0292bfb03f40e43379fd92581ad8eae9cb8 broke m68k after s_significand.c became available in the build directory. All m68k implementations of log1p and significand were including s_significand.c and stopped working after the inclusion of the auto-generated file.
	This patch reorganizes the implementation of log1p and significand for m680x0 in order to avoid hitting this problem.
	* sysdeps/m68k/m680x0/fpu/s_log1p.c: Set as the generic file for all log1p and significand functions on m680x0.
	* sysdeps/m68k/m680x0/fpu/s_log1pf.c: Include s_log1p.c instead of s_significand.c.
	* sysdeps/m68k/m680x0/fpu/s_log1pl.c: Likewise.
	* sysdeps/m68k/m680x0/fpu/s_significandf.c: Likewise.
	* sysdeps/m68k/m680x0/fpu/s_significandl.c: Likewise.
	* sysdeps/m68k/m680x0/fpu/s_significand.c: Move all the code to s_log1p.c and include it.
	Signed-off-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
*	Mark _init and _fini as hidden [BZ #23145]	H.J. Lu	2018-06-08	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	_init and _fini are special functions provided by glibc for linker to define DT_INIT and DT_FINI in executable and shared library. They should never be put in dynamic symbol table. This patch marks them as hidden to remove them from dynamic symbol table. Tested with build-many-glibcs.py. [BZ #23145] * elf/Makefile (tests-special): Add $(objpfx)check-initfini.out. ($(all-built-dso:=.dynsym): New target. (common-generated): Add $(all-built-dso:$(common-objpfx)%=%.dynsym). ($(objpfx)check-initfini.out): New target. (generated): Add check-initfini.out. * scripts/check-initfini.awk: New file. * sysdeps/aarch64/crti.S (_init): Mark as hidden. (_fini): Likewise. * sysdeps/alpha/crti.S (_init): Mark as hidden. (_fini): Likewise. * sysdeps/arm/crti.S (_init): Mark as hidden. (_fini): Likewise. * sysdeps/hppa/crti.S (_init): Mark as hidden. (_fini): Likewise. * sysdeps/i386/crti.S (_init): Mark as hidden. (_fini): Likewise. * sysdeps/ia64/crti.S (_init): Mark as hidden. (_fini): Likewise. * sysdeps/m68k/crti.S (_init): Mark as hidden. (_fini): Likewise. * sysdeps/microblaze/crti.S (_init): Mark as hidden. (_fini): Likewise. * sysdeps/mips/mips32/crti.S (_init): Mark as hidden. (_fini): Likewise. * sysdeps/mips/mips64/n32/crti.S (_init): Mark as hidden. (_fini): Likewise. * sysdeps/mips/mips64/n64/crti.S (_init): Mark as hidden. (_fini): Likewise. * sysdeps/nios2/crti.S (_init): Mark as hidden. (_fini): Likewise. * sysdeps/powerpc/powerpc32/crti.S (_init): Mark as hidden. (_fini): Likewise. * sysdeps/powerpc/powerpc64/crti.S (_init): Mark as hidden. (_fini): Likewise. * sysdeps/s390/s390-32/crti.S (_init): Mark as hidden. (_fini): Likewise. * sysdeps/s390/s390-64/crti.S (_init): Mark as hidden. (_fini): Likewise. * sysdeps/sh/crti.S (_init): Mark as hidden. (_fini): Likewise. 
* sysdeps/sparc/crti.S (_init): Mark as hidden. (_fini): Likewise. * sysdeps/x86_64/crti.S (_init): Mark as hidden. (_fini): Likewise.
*	Do not include math-barriers.h in math_private.h.	Joseph Myers	2018-05-11	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch continues the math_private.h cleanup by stopping math_private.h from including math-barriers.h and making the users of the barrier macros include the latter header directly. No attempt is made to remove any math_private.h includes that are now unused, except in strtod_l.c where that is done to avoid line number changes in assertions, so that installed stripped shared libraries can be compared before and after the patch. (I think the floating-point environment support in math_private.h should also move out - some architectures already have fenv_private.h as an architecture-internal header included from their math_private.h - and after moving that out might be a better time to identify unused math_private.h includes.) Tested for x86_64 and x86, and tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by the patch. * sysdeps/generic/math_private.h: Do not include <math-barriers.h>. * stdlib/strtod_l.c: Include <math-barriers.h> instead of <math_private.h>. * math/fromfp.h: Include <math-barriers.h>. * math/math-narrow.h: Likewise. * math/s_nextafter.c: Likewise. * math/s_nexttowardf.c: Likewise. * sysdeps/aarch64/fpu/s_llrint.c: Likewise. * sysdeps/aarch64/fpu/s_llrintf.c: Likewise. * sysdeps/aarch64/fpu/s_lrint.c: Likewise. * sysdeps/aarch64/fpu/s_lrintf.c: Likewise. * sysdeps/i386/fpu/s_nextafterl.c: Likewise. * sysdeps/i386/fpu/s_nexttoward.c: Likewise. * sysdeps/i386/fpu/s_nexttowardf.c: Likewise. * sysdeps/ieee754/dbl-64/e_atan2.c: Likewise. * sysdeps/ieee754/dbl-64/e_atanh.c: Likewise. * sysdeps/ieee754/dbl-64/e_exp.c: Likewise. * sysdeps/ieee754/dbl-64/e_exp2.c: Likewise. * sysdeps/ieee754/dbl-64/e_j0.c: Likewise. * sysdeps/ieee754/dbl-64/e_sqrt.c: Likewise. 
* sysdeps/ieee754/dbl-64/s_expm1.c: Likewise. * sysdeps/ieee754/dbl-64/s_fma.c: Likewise. * sysdeps/ieee754/dbl-64/s_fmaf.c: Likewise. * sysdeps/ieee754/dbl-64/s_log1p.c: Likewise. * sysdeps/ieee754/dbl-64/s_nearbyint.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_nearbyint.c: Likewise. * sysdeps/ieee754/flt-32/e_atanhf.c: Likewise. * sysdeps/ieee754/flt-32/e_j0f.c: Likewise. * sysdeps/ieee754/flt-32/s_expm1f.c: Likewise. * sysdeps/ieee754/flt-32/s_log1pf.c: Likewise. * sysdeps/ieee754/flt-32/s_nearbyintf.c: Likewise. * sysdeps/ieee754/flt-32/s_nextafterf.c: Likewise. * sysdeps/ieee754/k_standardl.c: Likewise. * sysdeps/ieee754/ldbl-128/e_asinl.c: Likewise. * sysdeps/ieee754/ldbl-128/e_expl.c: Likewise. * sysdeps/ieee754/ldbl-128/e_powl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_fmal.c: Likewise. * sysdeps/ieee754/ldbl-128/s_nearbyintl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_nextafterl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_nexttoward.c: Likewise. * sysdeps/ieee754/ldbl-128/s_nexttowardf.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/e_asinl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_fmal.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_nextafterl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_nexttoward.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_nexttowardf.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_rintl.c: Likewise. * sysdeps/ieee754/ldbl-96/e_atanhl.c: Likewise. * sysdeps/ieee754/ldbl-96/e_j0l.c: Likewise. * sysdeps/ieee754/ldbl-96/s_fma.c: Likewise. * sysdeps/ieee754/ldbl-96/s_fmal.c: Likewise. * sysdeps/ieee754/ldbl-96/s_nexttoward.c: Likewise. * sysdeps/ieee754/ldbl-96/s_nexttowardf.c: Likewise. * sysdeps/ieee754/ldbl-opt/s_nexttowardfd.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_nextafterl.c: Likewise.
*	Move math_opt_barrier, math_force_eval to separate math-barriers.h.	Joseph Myers	2018-05-09	2	-20/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch continues cleaning up math_private.h by moving the math_opt_barrier and math_force_eval macros to a separate header math-barriers.h. At present, those macros are inside a "#ifndef math_opt_barrier" in math_private.h to allow architectures to override them and then use a separate math-barriers.h header, no such #ifndef or #include_next is needed; architectures just have their own alternative version of math-barriers.h when providing their own optimized versions that avoid going through memory unnecessarily. The generic math-barriers.h has a comment added to document these two macros. In this patch, math_private.h is made to #include <math-barriers.h>, so files using these macros do not need updating yet. That is because of uses of math_force_eval in math_check_force_underflow and math_check_force_underflow_nonneg, which are still defined in math_private.h. Once those are moved out to a separate header, that separate header can be made to include <math-barriers.h>, as can the other files directly using these barrier macros, and then the include of <math-barriers.h> from math_private.h can be removed. Tested for x86_64 and x86. Also tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by this patch. * sysdeps/generic/math-barriers.h: New file. * sysdeps/generic/math_private.h [!math_opt_barrier] (math_opt_barrier): Move to math-barriers.h. [!math_opt_barrier] (math_force_eval): Likewise. * sysdeps/aarch64/fpu/math-barriers.h: New file. * sysdeps/aarch64/fpu/math_private.h (math_opt_barrier): Move to math-barriers.h. (math_force_eval): Likewise. * sysdeps/alpha/fpu/math-barriers.h: New file. * sysdeps/alpha/fpu/math_private.h (math_opt_barrier): Move to math-barriers.h. (math_force_eval): Likewise. * sysdeps/x86/fpu/math-barriers.h: New file. 
* sysdeps/i386/fpu/fenv_private.h (math_opt_barrier): Move to math-barriers.h. (math_force_eval): Likewise. * sysdeps/m68k/m680x0/fpu/math_private.h: Move to.... * sysdeps/m68k/m680x0/fpu/math-barriers.h: ... here. Adjust multiple-include guard for rename. * sysdeps/powerpc/fpu/math-barriers.h: New file. * sysdeps/powerpc/fpu/math_private.h (math_opt_barrier): Move to math-barriers.h. (math_force_eval): Likewise.
*	Drop fpregset unused symbol exposition	Samuel Thibault	2018-04-20	1	-1/+1
\| \| \| \| \| \| \| \| \|	* sysdeps/arm/sys/ucontext.h: Remove fpregset struct name, unused and non-compliant. * sysdeps/i386/sys/ucontext.h: Likewise. * sysdeps/m68k/sys/ucontext.h: Likewise. * sysdeps/mips/sys/ucontext.h: Likewise. * sysdeps/unix/sysv/linux/hppa/sys/ucontext.h: Likewise.
*	elf: Unify symbol address run-time calculation [BZ #19818]	Maciej W. Rozycki	2018-04-04	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Wrap symbol address run-time calculation into a macro and use it throughout, replacing inline calculations. There are a couple of variants, most of them different in a functionally insignificant way. Most calculations are right following RESOLVE_MAP, at which point either the map or the symbol returned can be checked for validity as the macro sets either both or neither. In some places both the symbol and the map has to be checked however. My initial implementation therefore always checked both, however that resulted in code larger by as much as 0.3%, as many places know from elsewhere that no check is needed. I have decided the size growth was unacceptable. Having looked closer I realized that it's the map that is the culprit. Therefore I have modified LOOKUP_VALUE_ADDRESS to accept an additional boolean argument telling it to access the map without checking it for validity. 
This in turn has brought quite nice results, with new code actually being smaller for i686, and MIPS o32, n32 and little-endian n64 targets, unchanged in size for x86-64 and, unusually, marginally larger for big-endian MIPS n64, as follows: i686: text data bss dec hex filename 152255 4052 192 156499 26353 ld-2.27.9000-base.so 152159 4052 192 156403 262f3 ld-2.27.9000-elf-symbol-value.so MIPS/o32/el: text data bss dec hex filename 142906 4396 260 147562 2406a ld-2.27.9000-base.so 142890 4396 260 147546 2405a ld-2.27.9000-elf-symbol-value.so MIPS/n32/el: text data bss dec hex filename 142267 4404 260 146931 23df3 ld-2.27.9000-base.so 142171 4404 260 146835 23d93 ld-2.27.9000-elf-symbol-value.so MIPS/n64/el: text data bss dec hex filename 149835 7376 408 157619 267b3 ld-2.27.9000-base.so 149787 7376 408 157571 26783 ld-2.27.9000-elf-symbol-value.so MIPS/o32/eb: text data bss dec hex filename 142870 4396 260 147526 24046 ld-2.27.9000-base.so 142854 4396 260 147510 24036 ld-2.27.9000-elf-symbol-value.so MIPS/n32/eb: text data bss dec hex filename 142019 4404 260 146683 23cfb ld-2.27.9000-base.so 141923 4404 260 146587 23c9b ld-2.27.9000-elf-symbol-value.so MIPS/n64/eb: text data bss dec hex filename 149763 7376 408 157547 2676b ld-2.27.9000-base.so 149779 7376 408 157563 2677b ld-2.27.9000-elf-symbol-value.so x86-64: text data bss dec hex filename 148462 6452 400 155314 25eb2 ld-2.27.9000-base.so 148462 6452 400 155314 25eb2 ld-2.27.9000-elf-symbol-value.so [BZ #19818] * sysdeps/generic/ldsodefs.h (LOOKUP_VALUE_ADDRESS): Add `set' parameter. (SYMBOL_ADDRESS): New macro. [!ELF_FUNCTION_PTR_IS_SPECIAL] (DL_SYMBOL_ADDRESS): Use SYMBOL_ADDRESS for symbol address calculation. * elf/dl-runtime.c (_dl_fixup): Likewise. (_dl_profile_fixup): Likewise. * elf/dl-symaddr.c (_dl_symbol_address): Likewise. * elf/rtld.c (dl_main): Likewise. * sysdeps/aarch64/dl-machine.h (elf_machine_rela): Likewise. * sysdeps/alpha/dl-machine.h (elf_machine_rela): Likewise. 
* sysdeps/arm/dl-machine.h (elf_machine_rel): Likewise. (elf_machine_rela): Likewise. * sysdeps/hppa/dl-machine.h (elf_machine_rela): Likewise. * sysdeps/hppa/dl-symaddr.c (_dl_symbol_address): Likewise. * sysdeps/i386/dl-machine.h (elf_machine_rel): Likewise. (elf_machine_rela): Likewise. * sysdeps/ia64/dl-machine.h (elf_machine_rela): Likewise. * sysdeps/m68k/dl-machine.h (elf_machine_rela): Likewise. * sysdeps/microblaze/dl-machine.h (elf_machine_rela): Likewise. * sysdeps/mips/dl-machine.h (ELF_MACHINE_BEFORE_RTLD_RELOC): Likewise. (elf_machine_reloc): Likewise. (elf_machine_got_rel): Likewise. * sysdeps/mips/dl-trampoline.c (__dl_runtime_resolve): Likewise. * sysdeps/nios2/dl-machine.h (elf_machine_rela): Likewise. * sysdeps/powerpc/powerpc32/dl-machine.h (elf_machine_rela): Likewise. * sysdeps/powerpc/powerpc64/dl-machine.h (elf_machine_rela): Likewise. * sysdeps/riscv/dl-machine.h (elf_machine_rela): Likewise. * sysdeps/s390/s390-32/dl-machine.h (elf_machine_rela): Likewise. * sysdeps/s390/s390-64/dl-machine.h (elf_machine_rela): Likewise. * sysdeps/sh/dl-machine.h (elf_machine_rela): Likewise. * sysdeps/sparc/sparc32/dl-machine.h (elf_machine_rela): Likewise. * sysdeps/sparc/sparc64/dl-machine.h (elf_machine_rela): Likewise. * sysdeps/tile/dl-machine.h (elf_machine_rela): Likewise. * sysdeps/x86_64/dl-machine.h (elf_machine_rela): Likewise. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
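The size comparison in this commit message is easier to read as per-target differences. The short script below only does that arithmetic on the text sizes quoted above; the dictionary and function names are made up for the illustration.

```python
# (base, patched) text sizes in bytes of ld.so, copied from the commit message.
TEXT_SIZES = {
    "i686": (152255, 152159),
    "MIPS/o32/el": (142906, 142890),
    "MIPS/n32/el": (142267, 142171),
    "MIPS/n64/el": (149835, 149787),
    "MIPS/o32/eb": (142870, 142854),
    "MIPS/n32/eb": (142019, 141923),
    "MIPS/n64/eb": (149763, 149779),
    "x86-64": (148462, 148462),
}


def text_delta(target):
    """Change in text size (patched minus base) for one target, in bytes."""
    base, patched = TEXT_SIZES[target]
    return patched - base


for target in TEXT_SIZES:
    print(f"{target:12s} {text_delta(target):+d} bytes")
```

A negative delta means the patched ld.so is smaller; only big-endian MIPS n64 grows, and x86-64 is unchanged, matching the summary in the message.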
*	Revert m68k __ieee754_sqrt change	Wilco Dijkstra	2018-03-16	1	-0/+1
\| \| \| \| \| \| \| \|	Revert m68k __ieee754_sqrt change as it causes a build failure in one m68k configuration. m68k-linux-gnu now passes again. * sysdeps/m68k/m680x0/fpu/mathimpl.h (__ieee754_sqrt): Revert previous commit.
*	Remove all target specific __ieee754_sqrt(f/l) inlines	Wilco Dijkstra	2018-03-15	1	-1/+0
	Remove the now unused target specific __ieee754_sqrt(f/l) inlines. Also remove inlines of sqrt which are for really old GCC versions. Removing these is desirable, under the general principle of leaving such inlining to the compiler rather than trying to do it in installed headers, especially when only very old compilers are affected.
	Note that removing inlines for __ieee754_sqrt disables inlining in the sqrt wrapper functions. Given the sqrt function will typically only be called for negative arguments, it doesn't matter whether the inlining happens or not.
	* sysdeps/aarch64/fpu/math_private.h (__ieee754_sqrt): Remove. (__ieee754_sqrtf): Remove.
	* sysdeps/alpha/fpu/math_private.h (__ieee754_sqrt): Remove. (__ieee754_sqrtf): Remove.
	* sysdeps/generic/math-type-macros.h (M_SQRT): Use sqrt.
	* sysdeps/m68k/m680x0/fpu/mathimpl.h (__ieee754_sqrt): Remove.
	* sysdeps/powerpc/fpu/math_private.h (__ieee754_sqrt): Remove. (__ieee754_sqrtf): Remove.
	* sysdeps/s390/fpu/bits/mathinline.h: Remove file.
	* sysdeps/sparc/fpu/bits/mathinline.h (sqrt): Remove. (sqrtf): Remove. (sqrtl): Remove. (__ieee754_sqrt): Remove. (__ieee754_sqrtf): Remove. (__ieee754_sqrtl): Remove.
	* sysdeps/x86/fpu/math_private.h (__ieee754_sqrt): Remove.
	* sysdeps/x86_64/fpu/math_private.h (__ieee754_sqrt): Remove. (__ieee754_sqrtf): Remove. (__ieee754_sqrtl): Remove.
*	Rename all __ieee754_sqrt(f/l) calls to sqrt(f/l)	Wilco Dijkstra	2018-03-15	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use sqrt(f/l) to enable inlining by GCC - if inlining doesn't happen, the asm redirect ensures we will still call __ieee754_sqrt(f/l). * sysdeps/ieee754/dbl-64/e_acosh.c (__ieee754_acosh): Use sqrt. * sysdeps/ieee754/dbl-64/e_gamma_r.c (gamma_positive): Likewise. * sysdeps/ieee754/dbl-64/e_hypot.c (__ieee754_hypot): Likewise. * sysdeps/ieee754/dbl-64/e_j0.c (__ieee754_j0): Likewise. * sysdeps/ieee754/dbl-64/e_j1.c (__ieee754_j1): Likewise. * sysdeps/ieee754/dbl-64/e_jn.c (__ieee754_jn): Likewise. * sysdeps/ieee754/dbl-64/s_asinh.c (__asinh): Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/e_acosh.c (__ieee754_acosh): Likewise. * sysdeps/ieee754/flt-32/e_acosf.c (__ieee754_acosf): Likewise. * sysdeps/ieee754/flt-32/e_acoshf.c (__ieee754_acoshf): Likewise. * sysdeps/ieee754/flt-32/e_asinf.c (__ieee754_asinf): Likewise. * sysdeps/ieee754/flt-32/e_gammaf_r.c (gammaf_positive): Likewise. * sysdeps/ieee754/flt-32/e_hypotf.c (__ieee754_hypotf): Likewise. * sysdeps/ieee754/flt-32/e_j0f.c (__ieee754_j0f): Likewise. * sysdeps/ieee754/flt-32/e_j1f.c (__ieee754_j1f): Likewise. * sysdeps/ieee754/flt-32/e_powf.c (__ieee754_powf): Likewise. * sysdeps/ieee754/flt-32/s_asinhf.c (__asinhf): Likewise. * sysdeps/ieee754/ldbl-128/e_acoshl.c (__ieee754_acoshl): Use sqrtl. * sysdeps/ieee754/ldbl-128/e_acosl.c (__ieee754_acosl): Likewise. * sysdeps/ieee754/ldbl-128/e_asinl.c (__ieee754_asinl): Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/ieee754/ldbl-128/e_hypotl.c (__ieee754_hypotl): Likewise. * sysdeps/ieee754/ldbl-128/e_j0l.c (__ieee754_j0l): Likewise. * sysdeps/ieee754/ldbl-128/e_j1l.c (__ieee754_j1l): Likewise. * sysdeps/ieee754/ldbl-128/e_jnl.c (__ieee754_jnl): Likewise. * sysdeps/ieee754/ldbl-128/e_powl.c (__ieee754_powl): Likewise. 
* sysdeps/ieee754/ldbl-128/s_asinhl.c (__ieee754_asinhl): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_acoshl.c (__ieee754_acoshl): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_acosl.c (__ieee754_acosl): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_asinl.c (__ieee754_asinl): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_hypotl.c (__ieee754_hypotl): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_j0l.c (__ieee754_j0l): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_j1l.c (__ieee754_j1l): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_jnl.c (__ieee754_jnl): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_powl.c (__ieee754_powl): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_asinhl.c (__ieee754_asinhl): Likewise. * sysdeps/ieee754/ldbl-96/e_acoshl.c (__ieee754_acoshl): Use sqrtl. * sysdeps/ieee754/ldbl-96/e_asinl.c (__ieee754_asinl): Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/ieee754/ldbl-96/e_hypotl.c (__ieee754_hypotl): Likewise. * sysdeps/ieee754/ldbl-96/e_j0l.c (__ieee754_j0l): Likewise. * sysdeps/ieee754/ldbl-96/e_j1l.c (__ieee754_j1l): Likewise. * sysdeps/ieee754/ldbl-96/e_jnl.c (__ieee754_jnl): Likewise. * sysdeps/ieee754/ldbl-96/s_asinhl.c (__ieee754_asinhl): Likewise. * sysdeps/m68k/m680x0/fpu/e_pow.c (__ieee754_pow): Likewise. * sysdeps/powerpc/fpu/e_hypot.c (__ieee754_hypot): Likewise. * sysdeps/powerpc/fpu/e_hypotf.c (__ieee754_hypotf): Likewise.
*	hurd: add gscope support	Samuel Thibault	2018-03-11	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* elf/dl-support.c [!THREAD_GSCOPE_IN_TCB] (_dl_thread_gscope_count): Define variable. * sysdeps/generic/ldsodefs.h [!THREAD_GSCOPE_IN_TCB] (struct rtld_global): Add _dl_thread_gscope_count member. * sysdeps/mach/hurd/tls.h: Include <atomic.h>. [!defined __ASSEMBLER__] (THREAD_GSCOPE_GLOBAL, THREAD_GSCOPE_SET_FLAG, THREAD_GSCOPE_RESET_FLAG, THREAD_GSCOPE_WAIT): Define macros. * sysdeps/generic/tls.h: Document THREAD_GSCOPE_IN_TCB. * sysdeps/aarch64/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1. * sysdeps/alpha/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1. * sysdeps/arm/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1. * sysdeps/hppa/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1. * sysdeps/i386/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1. * sysdeps/ia64/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1. * sysdeps/m68k/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1. * sysdeps/microblaze/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1. * sysdeps/mips/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1. * sysdeps/nios2/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1. * sysdeps/powerpc/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1. * sysdeps/riscv/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1. * sysdeps/s390/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1. * sysdeps/sh/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1. * sysdeps/sparc/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1. * sysdeps/tile/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1. * sysdeps/x86_64/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1.
*	Remove mplog and mpexp	Wilco Dijkstra	2018-02-15	2	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Remove the now unused mplog and mpexp files. * math/Makefile: Remove mpexp.c and mplog.c. * sysdeps/i386/fpu/mpexp.c: Delete file. * sysdeps/i386/fpu/mplog.c: Likewise. * sysdeps/ia64/fpu/mpexp.c: Likewise. * sysdeps/ia64/fpu/mplog.c: Likewise. * sysdeps/ieee754/dbl-64/e_exp.c: Remove mention of mpexp and mplog. * sysdeps/ieee754/dbl-64/mpa.h (__pow_mp): Remove unused function. * sysdeps/ieee754/dbl-64/mpexp.c: Delete file. * sysdeps/ieee754/dbl-64/mplog.c: Likewise. * sysdeps/m68k/m680x0/fpu/mpexp.c: Likewise. * sysdeps/m68k/m680x0/fpu/mplog.c: Likewise. * sysdeps/x86_64/fpu/multiarch/Makefile: Remove mpexp* and mplog. * sysdeps/x86_64/fpu/multiarch/e_log-avx.c: Remove unused defines. * sysdeps/x86_64/fpu/multiarch/e_log-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_log-fma4.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mpexp-avx.c: Delete file. * sysdeps/x86_64/fpu/multiarch/mpexp-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mpexp-fma4.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mplog-avx.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mplog-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mplog-fma4.c: Likewise.
*	Remove slow paths from exp	Szabolcs Nagy	2018-02-12	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Remove the __slowexp code, so exp is no longer correctly rounded. The result is computed to about 70 bits precision so the worst case ulp error is about 0.500007 in nearest rounding mode. * manual/probes.texi: Remove slowexp probes. * math/Makefile: Remove slowexp. * sysdeps/generic/math_private.h (__slowexp): Remove. * sysdeps/ieee754/dbl-64/e_exp.c (__ieee754_exp): Remove __slowexp and document error bounds. * sysdeps/i386/fpu/slowexp.c: Remove. * sysdeps/ia64/fpu/slowexp.c: Remove. * sysdeps/ieee754/dbl-64/slowexp.c: Remove. * sysdeps/ieee754/dbl-64/uexp.h (err_0): Remove. * sysdeps/m68k/m680x0/fpu/slowexp.c: Remove. * sysdeps/powerpc/power4/fpu/Makefile (CPPFLAGS-slowexp.c): Remove. * sysdeps/x86_64/fpu/multiarch/Makefile: Remove slowexp-fma. * sysdeps/x86_64/fpu/multiarch/e_exp-avx.c (__slowexp): Remove. * sysdeps/x86_64/fpu/multiarch/e_exp-fma.c (__slowexp): Remove. * sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c (__slowexp): Remove. * sysdeps/x86_64/fpu/multiarch/slowexp-avx.c: Remove. * sysdeps/x86_64/fpu/multiarch/slowexp-fma.c: Remove. * sysdeps/x86_64/fpu/multiarch/slowexp-fma4.c: Remove.
*	Remove slow paths from pow	Wilco Dijkstra	2018-02-12	2	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Remove the slow paths from pow. Like several other double precision math functions, pow is exactly rounded. This is not required of math functions and causes major overheads as it requires multiple fallbacks using higher precision arithmetic if a result is close to 0.5ULP. Ridiculous slowdowns of up to 100000x have been reported when the highest precision path triggers. All GLIBC math tests pass on AArch64 and x64 (with ULP of pow set to 1). The worst case error is ~0.506ULP. A simple test over a few hundred million values shows pow is 10% faster on average. This fixes BZ #13932. [BZ #13932] * sysdeps/ieee754/dbl-64/uexp.h (err_1): Remove. * benchtests/pow-inputs: Update comment for slow path cases. * manual/probes.texi (slowpow_p10): Delete removed probe. (slowpow_p10): Likewise. * math/Makefile: Remove halfulp.c and slowpow.c. * sysdeps/aarch64/libm-test-ulps: Set ULP of pow to 1. * sysdeps/generic/math_private.h (__exp1): Remove error argument. (__halfulp): Remove. (__slowpow): Remove. * sysdeps/i386/fpu/halfulp.c: Delete file. * sysdeps/i386/fpu/slowpow.c: Likewise. * sysdeps/ia64/fpu/halfulp.c: Likewise. * sysdeps/ia64/fpu/slowpow.c: Likewise. * sysdeps/ieee754/dbl-64/e_exp.c (__exp1): Remove error argument, improve comments and add error analysis. * sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Add error analysis. (power1): Remove function. (log1): Remove error argument, add error analysis. (my_log2): Remove function. * sysdeps/ieee754/dbl-64/halfulp.c: Delete file. * sysdeps/ieee754/dbl-64/slowpow.c: Likewise. * sysdeps/m68k/m680x0/fpu/halfulp.c: Likewise. * sysdeps/m68k/m680x0/fpu/slowpow.c: Likewise. * sysdeps/powerpc/power4/fpu/Makefile: Remove CPPFLAGS-slowpow.c. * sysdeps/x86_64/fpu/libm-test-ulps: Set ULP of pow to 1. 
* sysdeps/x86_64/fpu/multiarch/Makefile: Remove slowpow-fma.c, slowpow-fma4.c, halfulp-fma.c, halfulp-fma4.c. * sysdeps/x86_64/fpu/multiarch/e_pow-fma.c (__slowpow): Remove define. * sysdeps/x86_64/fpu/multiarch/e_pow-fma4.c (__slowpow): Likewise. * sysdeps/x86_64/fpu/multiarch/halfulp-fma.c: Delete file. * sysdeps/x86_64/fpu/multiarch/halfulp-fma4.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowpow-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowpow-fma4.c: Likewise.
*	Unify and simplify bits/byteswap.h, bits/byteswap-16.h headers (bug 14508, ↵	Joseph Myers	2018-02-06	1	-88/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	bug 15512, bug 17082, bug 20530). We have a general principle of preferring optimizations for library facilities to use compiler built-in functions rather than being located in library headers, where the compiler can reasonably optimize code without needing to know glibc implementation details. This patch applies this principle to bits/byteswap.h, eliminating all the architecture-specific variants and bits/byteswap-16.h. The __bswap_16, __bswap_32 and __bswap_64 interfaces all become inline functions, never macros, using the GCC built-in functions where available and otherwise a single architecture-independent definition using shifts and masking (which compilers may well be able to detect and optimize; GCC has detection of various byte-swapping idioms). The __bswap_constant_32 macro needs to stay around because of uses in static initializers within glibc and its tests, and so for consistency all __bswap_constant_* are kept rather than just being inlined into the old-GCC-or-non-GCC parts of the __bswap_* inline function definitions. Various open bugs are addressed by this cleanup, with caveats about exactly what is covered by those bugs and when the bugs applied at all. Bug 14508 reports -Wformat warnings building glibc because __bswap_* sometimes returned the wrong types. Obviously we already don't have such warnings any more or the build would be failing, given -Werror, and I suspect that bug was originally for wrong types for x86_64, as fixed by commit d394eb742a3565d7fe7a4b02710a60b5f219ee64 (glibc 2.17). 
The only case I saw removed by this patch where the types would still have been wrong was the non-__GNUC__ case of __bswap_64 in the s390 header (using unsigned long long int, but uint64_t would be unsigned long int for 64-bit). In any case, the single header consistently uses __uintN_t types after this patch, thereby eliminating all such bugs. The existing string/test-endian-types.c test already suffices to verify that the types are correct with the compiler used to build glibc and its tests. Bug 15512 reports an error from __bswap_constant_16 with -Werror -Wsign-conversion. I am unable to reproduce this with any GCC version supporting -Wsign-conversion - all seem to be able to avoid warning for ((x) >> 8) & 0xffu, where x is uint16_t, which while it formally does involve an implicit conversion from int to unsigned int, is also a case where it should be easy for the compiler to see that the value converted is never negative. But in this patch __bswap_constant_16 is changed to use signed 0xff so that no such implicit conversion occurs at all, and a test with -Werror -Wsign-conversion is added. Bug 17082 objects to the use of ({}) statement expressions in these macros preventing use at file scope (in C, that's in sizeof etc.; in C++, more generally in static initializers). The particular case of these interfaces is fixed by this patch as it changes them to inline functions, eliminating all uses of ({}) in bits/byteswap.h, and a corresponding testcase is added. The bug tries to raise a more general policy question about use of ({}) in macros in installed headers, referring to "many other libc functions" (unspecified which functions are being considered). 
Since such policy questions belong on libc-alpha, and since there are macros in installed headers which can't really avoid using ({}) (where they are type-generic, so can't use an inline function, but need a temporary variable, and a few where the interface involves returning memory from alloca so can't use an inline function either), I propose to consider that bug fixed with this change. That is without prejudice to any other new bugs anyone wishes to file for precisely defined sets of macros requesting moving away from ({}) where it is clearly possible for those interfaces. Where ({}) can be avoided, typically by use of an inline function, I think that's a good idea - that inline functions are typically to be preferred to ({}) for header interfaces where such optimizations are useful but the interface is suited to being defined using an inline function. Bug 20530 requests use of __builtin_bswap16 when available (GCC 4.8 and later), which this patch implements. Tested for x86_64, and with build-many-glibcs.py. Also did an x86_64 test with the __GNUC_PREREQ conditionals changed to "#if 0" to verify the old-GCC/non-GCC case in the headers. (There are already existing tests for correctness of results of these interfaces.) [BZ #14508] [BZ #15512] [BZ #17082] [BZ #20530] * bits/byteswap.h: Update file comment. Do not include <bits/byteswap-16.h>. (__bswap_constant_16): Cast result to __uint16_t. Use signed 0xff constant. (__bswap_16): Define as inline function. (__bswap_constant_32): Reformat definition. (__bswap_32): Always define as inline function, not macro, using __uint32_t. Use __builtin_bswap32 if [__GNUC_PREREQ (4, 3)], otherwise __bswap_constant_32. (__bswap_constant_64): Reformat definition. Do not use __extension__ here. (__bswap_64): Always define as inline function, not macro. Use __extension__ on function definition. Use __builtin_bswap64 if [__GNUC_PREREQ (4, 3)], otherwise __bswap_constant_64. * string/test-endian-file-scope.c: New file. 
* string/test-endian-sign-conversion.c: Likewise. * string/Makefile (headers): Remove bits/byteswap-16.h. (tests): Add test-endian-file-scope and test-endian-sign-conversion. (CFLAGS-test-endian-sign-conversion.c): New variable. * bits/byteswap-16.h: Remove file. * sysdeps/ia64/bits/byteswap-16.h: Likewise. * sysdeps/ia64/bits/byteswap.h: Likewise. * sysdeps/m68k/bits/byteswap.h: Likewise. * sysdeps/s390/bits/byteswap-16.h: Likewise. * sysdeps/s390/bits/byteswap.h: Likewise. * sysdeps/tile/bits/byteswap.h: Likewise. * sysdeps/x86/bits/byteswap-16.h: Likewise. * sysdeps/x86/bits/byteswap.h: Likewise.
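As an illustration of the approach described in this commit message, a minimal sketch of the builtin-or-fallback strategy might look like the following. The names `my_bswap_constant_32`, `MY_GNUC_PREREQ`, `my_bswap_16` and `my_bswap_32` are hypothetical stand-ins, not glibc's actual definitions; only the GCC 4.3 and 4.8 thresholds and the signed `0xff` mask are taken from the text above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not glibc's real header.  */

/* Constant-expression form, usable in static initializers.  */
#define my_bswap_constant_32(x)                                   \
  ((((x) & 0xff000000u) >> 24) | (((x) & 0x00ff0000u) >> 8)       \
   | (((x) & 0x0000ff00u) << 8) | (((x) & 0x000000ffu) << 24))

/* True when the compiler reports itself as GCC maj.min or later.  */
#if defined __GNUC__ && defined __GNUC_MINOR__
# define MY_GNUC_PREREQ(maj, min) \
  ((__GNUC__ << 16) + __GNUC_MINOR__ >= ((maj) << 16) + (min))
#else
# define MY_GNUC_PREREQ(maj, min) 0
#endif

static inline uint16_t
my_bswap_16 (uint16_t x)
{
#if MY_GNUC_PREREQ (4, 8)
  return __builtin_bswap16 (x);
#else
  /* Signed 0xff: x promotes to int, so no int -> unsigned conversion.  */
  return (uint16_t) (((x >> 8) & 0xff) | ((x & 0xff) << 8));
#endif
}

static inline uint32_t
my_bswap_32 (uint32_t x)
{
#if MY_GNUC_PREREQ (4, 3)
  return __builtin_bswap32 (x);
#else
  return my_bswap_constant_32 (x);
#endif
}

/* File-scope initializer: this needs the macro, because a function
   call is not a constant expression in C.  */
static const uint32_t swapped_at_file_scope
  = my_bswap_constant_32 (0x11223344u);

/* Usage check: each helper reverses the byte order.  */
static void
check_bswap (void)
{
  assert (my_bswap_16 (0x1234) == 0x3412);
  assert (my_bswap_32 (0x12345678u) == 0x78563412u);
  assert (swapped_at_file_scope == 0x44332211u);
}
```

The file-scope initializer shows why a constant macro is still useful next to the inline functions, which matches the reason the message gives for keeping the `__bswap_constant_*` macros.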
*	Move LDBL_CLASSIFY_COMPAT to its own header.	Joseph Myers	2018-02-01	1	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The general rule in glibc is that it's better for a macro to be always defined, and tested with #if, than for it to be tested with #ifdef, because the latter is prone to typos in the macro name as well as to the header with the macro accidentally not being included in a file testing it. (Testing with an "if" statement is even better, in those cases where it's possible to do things that way, as it then means both cases in the code get checked for syntax in glibc builds with either value of the condition.) math_private.h has several different groups of macros, meaning that architectures wanting to override some of them need to define those then include the generic version, which then defines macros if not already defined. It's hard to avoid that arrangement completely, but various cases can be improved by splitting out macros or groups of macros into separate files. This patch splits out the LDBL_CLASSIFY_COMPAT macro into a separate ldbl-classify-compat.h header. This macro is tested with #ifdef; this patch changes it to testing with #if, with a default definition to 0 in the generic header and then architecture-specific headers defining it to 1. Tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by the patch. * sysdeps/generic/ldbl-classify-compat.h: New file. * sysdeps/arm/ldbl-classify-compat.h: Likewise. * sysdeps/m68k/coldfire/ldbl-classify-compat.h: Likewise. * sysdeps/microblaze/ldbl-classify-compat.h: Likewise. * sysdeps/mips/ldbl-classify-compat.h: Likewise. * sysdeps/nios2/ldbl-classify-compat.h: Likewise. * sysdeps/sh/ldbl-classify-compat.h: Likewise. * sysdeps/ieee754/dbl-64/s_finite.c: Include <ldbl-classify-compat.h>. [LDBL_CLASSIFY_COMPAT]: Test value, not whether defined. * sysdeps/ieee754/dbl-64/s_isinf.c: Include <ldbl-classify-compat.h>. 
[LDBL_CLASSIFY_COMPAT]: Test value, not whether defined. * sysdeps/ieee754/dbl-64/s_isnan.c: Include <ldbl-classify-compat.h>. [LDBL_CLASSIFY_COMPAT]: Test value, not whether defined. * sysdeps/ieee754/dbl-64/wordsize-64/s_finite.c: Include <ldbl-classify-compat.h>. [LDBL_CLASSIFY_COMPAT]: Test value, not whether defined. * sysdeps/ieee754/dbl-64/wordsize-64/s_isinf.c: Include <ldbl-classify-compat.h>. [LDBL_CLASSIFY_COMPAT]: Test value, not whether defined. * sysdeps/ieee754/dbl-64/wordsize-64/s_isnan.c: Include <ldbl-classify-compat.h>. [LDBL_CLASSIFY_COMPAT]: Test value, not whether defined. * sysdeps/arm/math_private.h (LDBL_CLASSIFY_COMPAT): Remove macro. * sysdeps/mips/math_private.h (LDBL_CLASSIFY_COMPAT): Likewise. * sysdeps/m68k/coldfire/math_private.h: Remove file. * sysdeps/microblaze/math_private.h: Likewise. * sysdeps/nios2/math_private.h: Likewise. * sysdeps/sh/math_private.h: Likewise.
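The difference between the testing styles discussed in this message can be sketched as follows. `MY_COMPAT` is a hypothetical macro standing in for LDBL_CLASSIFY_COMPAT, and the single-file layout is an assumption made to keep the example self-contained; in glibc the default and the override live in separate sysdeps headers.

```c
#include <assert.h>

/* Hypothetical macro; in glibc the generic header would define the
   default 0 and an architecture header would define 1 instead.  */
#define MY_COMPAT 0

/* Preprocessor test of the value.  A misspelled name here evaluates
   as 0, which GCC's -Wundef option can diagnose; with #ifdef the
   same typo would be accepted silently.  */
int
compat_via_if_directive (void)
{
#if MY_COMPAT
  return 1;
#else
  return 0;
#endif
}

/* Ordinary "if" statement: both branches are parsed and type-checked
   whatever the macro's value is.  */
int
compat_via_if_statement (void)
{
  if (MY_COMPAT)
    return 1;
  else
    return 0;
}

/* Usage check: both styles agree on the default value.  */
static void
check_compat (void)
{
  assert (compat_via_if_directive () == 0);
  assert (compat_via_if_statement () == 0);
}
```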
*	Remove some math_private.h libc_feholdexcept_setround overrides.	Joseph Myers	2018-02-01	2	-43/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	math_private.h headers for configurations lacking support for floating-point exceptions and rounding modes define libc_feholdexcept_setround to override the default version with one that discards its rounding mode argument. Unlike other such libc_fe* macros that I removed, this one is actually used for such configurations (in dbl-64/e_sqrt.c). However, this does not make the macro required. It's only used for such configurations with FE_TONEAREST as the rounding mode (anything needing another mode should not be used when that mode is unavailable), and the default definition just calls __feholdexcept and __fesetround. Since we now have suitable inline do-nothing definitions of __feholdexcept and __fesetround for the cases of no exceptions and rounding modes, we can just rely on those inlines to achieve the same optimization as this macro definition. Thus, this patch removes those macro definitions (and the math_private.h headers containing them, when no longer needed after that removal). Tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by the patch. * sysdeps/m68k/coldfire/fpu/math_private.h: Move to .... * sysdeps/m68k/coldfire/math_private.h: ... here. * sysdeps/m68k/coldfire/nofpu/math_private.h: Remove file. * sysdeps/tile/math_private.h: Likewise. * sysdeps/microblaze/math_private.h (libc_feholdexcept_setround): Remove macro. * sysdeps/nios2/math_private.h (libc_feholdexcept_setround): Likewise.
*	Remove some math_private.h libc_fe* overrides.	Joseph Myers	2018-02-01	1	-3/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	math_private.h headers for configurations lacking support for floating-point exceptions and rounding modes define various libc_fe* macros to override the default versions with ones that discard any exception or rounding mode arguments. Three of the four macros defined in these headers are no longer needed there: those macros are only used in fma implementations that are not used for such configurations, now all those configurations properly use soft-fp fma implementations instead. (Effectively, those macros were a workaround to allow glibc to build in the absence of a proper fma implementation for this case - now there is such an implementation, there is no need to support building the wrong implementation for those configurations.) Thus, this patch removes the unnecessary macros. Tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by the patch. * sysdeps/m68k/coldfire/nofpu/math_private.h (libc_fesetround): Remove macro. (libc_fetestexcept): Likewise. (libc_feupdateenv_test): Likewise. * sysdeps/microblaze/math_private.h (libc_fesetround): Likewise. (libc_fetestexcept): Likewise. (libc_feupdateenv_test): Likewise. * sysdeps/nios2/math_private.h (libc_fesetround): Likewise. (libc_fetestexcept): Likewise. (libc_feupdateenv_test): Likewise. * sysdeps/tile/math_private.h (libc_fesetround): Likewise. (libc_fetestexcept): Likewise. (libc_feupdateenv_test): Likewise.
*	Move some fenv.h override macros to generic math_private.h.	Joseph Myers	2018-02-01	1	-7/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Various configurations lacking support for floating-point exceptions and rounding modes have a math_private.h that overrides certain functions and macros, internal and external, to avoid references to FE_* constants that are undefined in those configurations. For example, there are unconditional feraiseexcept (FE_INVALID) calls in generic libm code, and these macro definitions duly define feraiseexcept to ignore its argument to avoid an error from FE_INVALID being undefined. In fact it is easy to tell in an architecture-independent way whether this is needed, by testing whether FE_ALL_EXCEPT == 0. Thus, this patch puts such a test, and feraiseexcept and __feraiseexcept macros, in the generic math_private.h, so reducing the duplication between architecture versions of this header. The feclearexcept macro present in several versions of this header, and fetestexcept in the tile version, are not needed; they would have been needed before there were proper soft-fp fma implementations (when generic versions, that depend on FE_TOWARDZERO and FE_INEXACT, were being used for configurations not supporting those features), but aren't needed any more, and so are removed. The tile version of this header has several inline functions for fenv.h functions to optimize calls to them away in such configurations where they do nothing useful, and all these header versions also have definitions of some of the libc_fe* internal macros. I intend to make those generic in subsequent patches. Tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by this patch. * sysdeps/generic/math_private.h [FE_ALL_EXCEPT == 0] (feraiseexcept): New macro. [FE_ALL_EXCEPT == 0] (__feraiseexcept): Likewise. * sysdeps/m68k/coldfire/nofpu/math_private.h (feraiseexcept): Remove macro. (__feraiseexcept): Likewise. 
(feclearexcept): Likewise. * sysdeps/microblaze/math_private.h (feraiseexcept): Likewise. (__feraiseexcept): Likewise. (feclearexcept): Likewise. * sysdeps/nios2/math_private.h (feraiseexcept): Likewise. (__feraiseexcept): Likewise. (feclearexcept): Likewise. * sysdeps/tile/math_private.h (feraiseexcept): Likewise. (__feraiseexcept): Likewise. (feclearexcept): Likewise. (fetestexcept): Likewise.
*	Add ColdFire math-tests.h.	Joseph Myers	2018-02-01	1	-0/+29
\| \| \| \| \| \| \| \| \| \|	Since I've been fixing build issues for ColdFire, this patch adds a math-tests.h file for ColdFire, reflecting the lack of support for exceptions and rounding modes for soft float. I think it is logically correct, but have not tested it beyond build-many-glibcs.py for both hard and soft float. * sysdeps/m68k/coldfire/math-tests.h: New file.
*	Fix m68k bits/fenv.h for no-FPU ColdFire.	Joseph Myers	2018-02-01	1	-11/+47
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The m68k bits/fenv.h is in sysdeps/m68k/fpu/, meaning that no-FPU ColdFire instead gets the generic (top-level) bits/fenv.h. That top-level bits/fenv.h defines no rounding mode constants. That no longer works for building glibc tests: some tests fail to build (at least with warnings) if no rounding mode macros are defined, so at least FE_TONEAREST must be defined in all cases (as various architectures without rounding mode support indeed do), while __FE_UNDEFINED must be defined in the case where not all the standard rounding modes are supported. On general principles of supporting multilib toolchains with a single set of headers shared between multilibs for a given architecture, it's also desirable for the same bits/fenv.h header to work for both FPU and no-FPU configurations. Thus, this patch moves the m68k bits/fenv.h to sysdeps/m68k/bits/fenv.h, and inserts appropriate conditionals to handle the no-FPU case. All the exception macros, and FE_NOMASK_ENV, are disabled in the no-FPU case; FE_ALL_EXCEPT is defined to 0 in that case. All rounding modes except FE_TONEAREST are disabled in that case, and __FE_UNDEFINED is defined accordingly. To avoid an unnecessary ABI change, fenv_t is defined in the no-FPU case to match the definition it would have got from the generic bits/fenv.h. This suffices to get a clean glibc and testsuite build for this configuration with build-many-glibcs.py (and keeps a clean build for the other m68k configurations); it has not been otherwise tested. * sysdeps/m68k/fpu/bits/fenv.h: Move to .... * sysdeps/m68k/bits/fenv.h: ... here. [!__HAVE_68881__ && !__HAVE_FPU__ && !__mcffpu__] (FE_INEXACT): Do not define. [!__HAVE_68881__ && !__HAVE_FPU__ && !__mcffpu__] (FE_DIVBYZERO): Likewise. [!__HAVE_68881__ && !__HAVE_FPU__ && !__mcffpu__] (FE_UNDERFLOW): Likewise. 
[!__HAVE_68881__ && !__HAVE_FPU__ && !__mcffpu__] (FE_OVERFLOW): Likewise. [!__HAVE_68881__ && !__HAVE_FPU__ && !__mcffpu__] (FE_INVALID): Likewise. [!__HAVE_68881__ && !__HAVE_FPU__ && !__mcffpu__] (FE_ALL_EXCEPT): Define to 0. [!__HAVE_68881__ && !__HAVE_FPU__ && !__mcffpu__] (__FE_UNDEFINED): New enum constant. [!__HAVE_68881__ && !__HAVE_FPU__ && !__mcffpu__] (FE_TOWARDZERO): Do not define. [!__HAVE_68881__ && !__HAVE_FPU__ && !__mcffpu__] (FE_DOWNWARD): Likewise. [!__HAVE_68881__ && !__HAVE_FPU__ && !__mcffpu__] (FE_UPWARD): Likewise. [!__HAVE_68881__ && !__HAVE_FPU__ && !__mcffpu__] (fenv_t): Define to match generic bits/fenv.h. [!__HAVE_68881__ && !__HAVE_FPU__ && !__mcffpu__] (FE_NOMASK_ENV): Do not define.
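For code that has to build against a header like this one, the practical consequence is that most FE_* macros may be absent. A minimal sketch of the usual guard pattern follows; the function names are hypothetical, and the only facts relied on are that ISO C defines each rounding-mode macro only when the mode is supported and always defines FE_ALL_EXCEPT (as 0 when no exception macros exist).

```c
#include <assert.h>
#include <fenv.h>

/* Count the rounding modes this <fenv.h> advertises.  Each macro is
   tested with #ifdef because it exists only if the mode does.  */
int
supported_rounding_modes (void)
{
  int n = 0;
#ifdef FE_TONEAREST
  n++;
#endif
#ifdef FE_TOWARDZERO
  n++;
#endif
#ifdef FE_DOWNWARD
  n++;
#endif
#ifdef FE_UPWARD
  n++;
#endif
  return n;
}

/* FE_ALL_EXCEPT is always defined, so it can be tested by value.  */
int
has_fp_exceptions (void)
{
  return FE_ALL_EXCEPT != 0;
}

/* Usage check: the results stay within their possible ranges on any
   configuration, with or without an FPU.  */
static void
check_fenv_macros (void)
{
  assert (supported_rounding_modes () >= 0);
  assert (supported_rounding_modes () <= 4);
  assert (has_fp_exceptions () == 0 || has_fp_exceptions () == 1);
}
```

On the no-FPU configuration described above, one would expect `supported_rounding_modes` to report a single mode and `has_fp_exceptions` to report none, though that has not been verified here.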
*	Add no-FPU ColdFire math_private.h.	Joseph Myers	2018-01-24	1	-0/+52
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As with some other soft-float configurations, no-FPU ColdFire needs various fenv.h functions and glibc-internal macros overridden in math_private.h to avoid references to undefined FE_* macros when building glibc. This patch adds a suitable math_private.h, based on the MicroBlaze one (Nios II and Tile also have similar files). There's a case for having such a file in sysdeps/ieee754/soft-fp so this logic is applied more generally to configurations without exceptions and rounding modes, even when the relevant macros are defined in fenv.h - the only case where that might be inappropriate is ARM soft-float (where the fenv.h functions might or might not work at runtime, depending on whether the processor used at runtime supports VFP). There's also a case that soft-float configurations (on processors with both hard-float and soft-float) should more consistently avoid defining FE_* macros in bits/fenv.h when not actually supported. But both of those are separate potential cleanups. This allows the no-FPU ColdFire build to get further (another fix is needed to allow the build to complete). * sysdeps/m68k/coldfire/nofpu/math_private.h: New file. Based on MicroBlaze file.
*	Update copyright dates with scripts/update-copyrights.	Joseph Myers	2018-01-01	121	-121/+121
\| \| \| \| \| \| \|	* All files with FSF copyright notices: Update copyright dates using scripts/update-copyrights. * locale/programs/charmap-kw.h: Regenerated. * locale/programs/locfile-kw.h: Likewise.
*	Revert exp reimplementation (causes test failures).	Joseph Myers	2017-12-19	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Revert: 2017-12-19 Joseph Myers <joseph@codesourcery.com> * sysdeps/x86_64/fpu/libm-test-ulps: Update. 2017-12-19 Patrick McGehearty <patrick.mcgehearty@oracle.com> * sysdeps/ieee754/dbl-64/e_exp.c: Include <math-svid-compat.h> and <errno.h>. Include "eexp.tbl". (half): New constant. (one): Likewise. (__ieee754_exp): Rewrite. (__slowexp): Remove prototype. * sysdeps/ieee754/dbl-64/eexp.tbl: New file. * sysdeps/ieee754/dbl-64/slowexp.c: Remove file. * sysdeps/i386/fpu/slowexp.c: Likewise. * sysdeps/ia64/fpu/slowexp.c: Likewise. * sysdeps/m68k/m680x0/fpu/slowexp.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowexp-avx.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowexp-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowexp-fma4.c: Likewise. * sysdeps/generic/math_private.h (__slowexp): Remove prototype. * sysdeps/ieee754/dbl-64/e_pow.c: Remove mention of slowexp.c in comment. * sysdeps/powerpc/power4/fpu/Makefile [$(subdir) = math] (CPPFLAGS-slowexp.c): Remove variable. * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Remove slowexp-fma, slowexp-fma4 and slowexp-avx. (CFLAGS-slowexp-fma.c): Remove variable. (CFLAGS-slowexp-fma4.c): Likewise. (CFLAGS-slowexp-avx.c): Likewise. * sysdeps/x86_64/fpu/multiarch/e_exp-avx.c (__slowexp): Do not define as macro. * sysdeps/x86_64/fpu/multiarch/e_exp-fma.c (__slowexp): Likewise. * sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c (__slowexp): Likewise. * math/Makefile (type-double-routines): Remove slowexp. * manual/probes.texi (slowexp_p6): Remove. (slowexp_p32): Likewise.
*	Improve __ieee754_exp() performance by greater than 5x on sparc/x86.	Patrick McGehearty	2017-12-19	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	These changes will be active for all platforms that don't provide their own exp() routines. They will also be active for ieee754 versions of ccos, ccosh, cosh, csin, csinh, sinh, exp10, gamma, and erf. Performance gains are typically around 5x when measured on Sparc s7 for common values between exp(1) and exp(40). Using the glibc perf tests on sparc and x86 (times in nsec, old -> new): sparc max 17629 -> 395, min 399 -> 54, mean 5317 -> 200; x86 max 5173 -> 144, min 15 -> 13, mean 1349 -> 23. The extreme max times for the old (ieee754) exp are due to the multiprecision computation in the old algorithm when the true value is very near 0.5 ulp away from a value representable in double precision. The new algorithm does not take special measures for those cases. The current glibc exp perf tests overrepresent those values. Informal testing suggests approximately one in 200 cases might invoke the high cost computation. The performance advantage of the new algorithm for other values is still large but not as large as indicated by the chart above. Glibc correctness tests for exp() and expf() were run. Within the test suite 3 input values were found to cause 1 bit differences (ulp) when "FE_TONEAREST" rounding mode is set. No differences in exp() were seen for the tested values for the other rounding modes. 
Typical example: exp(-0x1.760cd2p+0) (-1.46113312244415283203125): new code: 2.31973271630014299393707e-01 (0x1.db14cd799387ap-3); old code: 2.31973271630014271638132e-01 (0x1.db14cd7993879p-3); exp = 2.31973271630014285508337e-01 (high precision). Old delta: off by 0.49 ulp. New delta: off by 0.51 ulp. In addition, because ieee754_exp() is used by other routines, cexp() showed test results with very small imaginary input values where the imaginary portion of the result was off by 3 ulp when in upward rounding mode, but not in the other rounding modes. For x86, tgamma showed a few values where the ulp increased to 6 (max ulp for tgamma is 5). Sparc tgamma did not show these failures. I presume the tgamma differences are due to compiler optimization differences within the gamma function. The gamma function is known to be difficult to compute accurately. * sysdeps/ieee754/dbl-64/e_exp.c: Include <math-svid-compat.h> and <errno.h>. Include "eexp.tbl". (half): New constant. (one): Likewise. (__ieee754_exp): Rewrite. (__slowexp): Remove prototype. * sysdeps/ieee754/dbl-64/eexp.tbl: New file. * sysdeps/ieee754/dbl-64/slowexp.c: Remove file. * sysdeps/i386/fpu/slowexp.c: Likewise. * sysdeps/ia64/fpu/slowexp.c: Likewise. * sysdeps/m68k/m680x0/fpu/slowexp.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowexp-avx.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowexp-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowexp-fma4.c: Likewise. * sysdeps/generic/math_private.h (__slowexp): Remove prototype. * sysdeps/ieee754/dbl-64/e_pow.c: Remove mention of slowexp.c in comment. * sysdeps/powerpc/power4/fpu/Makefile [$(subdir) = math] (CPPFLAGS-slowexp.c): Remove variable. * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Remove slowexp-fma, slowexp-fma4 and slowexp-avx. (CFLAGS-slowexp-fma.c): Remove variable. (CFLAGS-slowexp-fma4.c): Likewise. (CFLAGS-slowexp-avx.c): Likewise. * sysdeps/x86_64/fpu/multiarch/e_exp-avx.c (__slowexp): Do not define as macro. 
* sysdeps/x86_64/fpu/multiarch/e_exp-fma.c (__slowexp): Likewise. * sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c (__slowexp): Likewise. * math/Makefile (type-double-routines): Remove slowexp. * manual/probes.texi (slowexp_p6): Remove. (slowexp_p32): Likewise.
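The "1 bit difference" quoted in the typical example can be checked directly from the two hexadecimal results. The following is a minimal sketch, not code from this commit: `ulp_distance` is a hypothetical helper that assumes IEEE 754 binary64 doubles and two finite inputs of the same sign, and counts how many representable doubles separate them.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of glibc): distance between two finite
   doubles of the same sign, measured in representable values.  For such
   inputs the IEEE 754 bit patterns are ordered like the values
   themselves, so the distance is the difference of the patterns.  */
static uint64_t
ulp_distance (double a, double b)
{
  uint64_t ia, ib;

  memcpy (&ia, &a, sizeof ia);
  memcpy (&ib, &b, sizeof ib);
  /* Both inputs must carry the same sign bit.  */
  assert ((ia >> 63) == (ib >> 63));
  return ia > ib ? ia - ib : ib - ia;
}
```

Calling `ulp_distance (0x1.db14cd799387ap-3, 0x1.db14cd7993879p-3)` on the new and old results quoted above compares two bit patterns that differ only in the last hexadecimal digit of the mantissa.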
*	Fix m68k bits/mathinline.h attributes (bug 22631).	Joseph Myers	2017-12-19	2	-68/+73
	m68k bits/mathinline.h declares various functions with const attributes. These are inappropriate for functions that have results depending on the rounding mode; the machine-independent bits/mathcalls.h only uses const attributes for a very few functions with no rounding mode dependence, and the m68k header should do likewise. GCC uses pure for such functions with -frounding-math, resulting in GCC mainline warning about conflicts between the header and the built-in attributes and glibc failing to build for m68k with GCC mainline. This patch fixes the attributes to avoid using const except when bits/mathcalls.h does so. (There are a few functions where maybe bits/mathcalls.h could do so but doesn't, but keeping the headers in sync in this regard seems to be the safe approach.) Tested compilation with build-many-glibcs.py with GCC mainline. [BZ #22631] * sysdeps/m68k/m680x0/fpu/bits/mathinline.h (__m81_defun): Add argument for attributes. All callers changed. (__inline_mathop1): Likewise. All callers changed. (__inline_mathop): Likewise. All callers changed. [__USE_MISC] (scalbn): Use __inline_forward instead of __inline_forward_c. [__USE_ISOC99] (scalbln): Likewise. [__USE_ISOC99] (nearbyint): Likewise. [__USE_ISOC99] (lrint): Likewise. [__USE_MISC] (scalbnf): Likewise. [__USE_ISOC99] (scalblnf): Likewise. [__USE_ISOC99] (nearbyintf): Likewise. [__USE_ISOC99] (lrintf): Likewise. [__USE_MISC] (scalbnl): Likewise. [__USE_ISOC99] (scalblnl): Likewise. [__USE_ISOC99] (nearbyintl): Likewise. [__USE_ISOC99] (lrintl): Likewise. * sysdeps/m68k/m680x0/fpu/mathimpl.h: All callers of __inline_mathop and __m81_defun changed.
*	Add sysdeps/ieee754/soft-fp.	Joseph Myers	2017-12-12	3	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The default sysdeps/ieee754 fma implementations rely on exceptions and rounding modes to achieve correct results through internal use of round-to-odd. Thus, glibc configurations without support for exceptions and rounding modes instead need to use implementations of fma based on soft-fp. At present, this is achieved via having implementation files in soft-fp/ that are #included by sysdeps files for each glibc configuration that needs them. In general this means such a configuration has its own s_fma.c and s_fmaf.c. TS 18661-1 adds functions that do an operation (+ - * / sqrt fma) on arguments wider than the return type, with a single rounding of the infinite-precision result to that return type. These are also naturally implemented using round-to-odd on platforms with hardware support for rounding modes and exceptions but lacking hardware support for these narrowing operations themselves. (Platforms that have direct hardware support for such narrowing operations include at least ia64, and Power ISA 2.07 or later, which I think means POWER8 or later.) So adding the remaining TS 18661-1 functions would mean at least six narrowing function implementations (fadd fsub fmul fdiv ffma fsqrt), with aliases for other types and further implementations in some configurations, that need to be overridden for configurations lacking hardware exceptions and rounding modes. Requiring all such configurations (currently seven of them) to have their own source files for all those functions seems undesirable. Thus, this patch adds a directory sysdeps/ieee754/soft-fp to contain libm function implementations based on soft-fp. 
This directory is then used via Implies from all the configurations that need it, so no more files need adding to every such configuration when adding more functions with soft-fp implementations. A configuration can still selectively #include a particular file from this directory if desired; thus, the MIPS #include of the fmal implementation is retained, since that's appropriate even for hard float (because long double is always implemented in software for MIPS64, so the soft-fp implementation of fmal is better than the ldbl-128 one). This also provides additional motivation for my recent patch removing --with-fp / --without-fp: previously there was no need for correct use of --without-fp for no-FPU ARM or SH3, and now that we have autodetection, nofpu/ sysdeps directories can be used by this patch for those configurations without imposing any new requirements on how glibc is configured. (The mips64/*/fpu/s_fma.c files added by this patch are needed to keep the dbl-64 version of fma for double, rather than the ldbl-128 one, used in that case.) Tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by this patch. * soft-fp/fmadf4.c: Move to .... * sysdeps/ieee754/soft-fp/s_fma.c: ... here. * soft-fp/fmasf4.c: Move to .... * sysdeps/ieee754/soft-fp/s_fmaf.c: ... here. * soft-fp/fmatf4.c: Move to .... * sysdeps/ieee754/soft-fp/s_fmal.c: ... here. * sysdeps/ieee754/soft-fp/Makefile: New file. * sysdeps/arm/preconfigure.ac: Define with_fp_cond. * sysdeps/arm/preconfigure: Regenerated. * sysdeps/arm/nofpu/Implies: New file. * sysdeps/arm/s_fma.c: Remove file. * sysdeps/arm/s_fmaf.c: Likewise. * sysdeps/m68k/coldfire/nofpu/Implies: New file. * sysdeps/m68k/coldfire/nofpu/s_fma.c: Remove file. * sysdeps/m68k/coldfire/nofpu/s_fmaf.c: Likewise. * sysdeps/microblaze/Implies: Add ieee754/soft-fp. * sysdeps/microblaze/s_fma.c: Remove file. * sysdeps/microblaze/s_fmaf.c: Likewise. * sysdeps/mips/mips32/nofpu/Implies: New file. 
* sysdeps/mips/mips64/n32/fpu/s_fma.c: Likewise. * sysdeps/mips/mips64/n32/nofpu/Implies: Likewise. * sysdeps/mips/mips64/n64/fpu/s_fma.c: Likewise. * sysdeps/mips/mips64/n64/nofpu/Implies: Likewise. * sysdeps/mips/ieee754/s_fma.c: Remove file. * sysdeps/mips/ieee754/s_fmaf.c: Likewise. * sysdeps/mips/ieee754/s_fmal.c: Update include for move of fmal implementation. * sysdeps/nios2/Implies: Add ieee754/soft-fp. * sysdeps/nios2/s_fma.c: Remove file. * sysdeps/nios2/s_fmaf.c: Likewise. * sysdeps/sh/nofpu/Implies: New file. * sysdeps/sh/s_fma.c: Remove file. * sysdeps/sh/s_fmaf.c: Likewise. * sysdeps/tile/Implies: Add ieee754/soft-fp. * sysdeps/tile/s_fma.c: Remove file. * sysdeps/tile/s_fmaf.c: Likewise.
*	Remove --with-fp / --without-fp.	Joseph Myers	2017-12-12	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is a configure option --without-fp that specifies that nofpu sysdeps directories should be used instead of fpu directories. For most glibc configurations, this option is of no use: either there is no valid nofpu variant of that configuration, or there are no fpu or nofpu sysdeps directories for that processor and so the option does nothing. For a few configurations, if you are using a soft-float compiler this option is required, and failing to use it generally results in compilation errors from inline asm using unavailable floating-point instructions. We're moving away from --with-cpu to configuring glibc based on how the compiler generates code, and it is natural to do so for --without-fp as well; in most cases the soft-float and hard-float ABIs are incompatible so you have no hope of building a working glibc with an inappropriately configured compiler or libgcc. This patch eliminates --without-fp, replacing it entirely by automatic configuration based on the compiler. Configurations for which this is relevant (coldfire / mips / powerpc32 / sh) define a variable with_fp_cond in their preconfigure fragments (under the same conditions under which those fragments do anything); this is a preprocessor conditional which the toplevel configure script then uses in a test to determine which sysdeps directories to use. The config.make with-fp variable remains. It's used only by powerpc (sysdeps/powerpc/powerpc32/Makefile) to add -mhard-float to various flags variables. For powerpc, -mcpu= options can imply use of soft-float. That could be an issue if you want to build for e.g. 476fp, but are using --with-cpu=476 because there isn't a 476fp sysdeps directory. 
If in future we eliminate --with-cpu and replace it entirely by testing the compiler, it would be natural at that point to eliminate that code as well (as the user should then just use a compiler defaulting to 476fp and the 476 sysdeps directory would be used automatically). Tested for x86_64, and tested with build-many-glibcs.py that installed shared libraries are unchanged by this patch. * configure.ac (--with-fp): Remove configure option. (with_fp_cond): New variable. (libc_cv_with_fp): New configure test. Use this variable instead of with_fp. * configure: Regenerated. * config.make.in (with-fp): Use @libc_cv_with_fp@. * manual/install.texi (Configuring and compiling): Remove --without-fp. * INSTALL: Regenerated. * sysdeps/m68k/preconfigure (with_fp_cond): Define for ColdFire. * sysdeps/mips/preconfigure (with_fp_cond): Define. * sysdeps/powerpc/preconfigure (with_fp_cond): Define for 32-bit. * sysdeps/sh/preconfigure (with_fp_cond): Define. * scripts/build-many-glibcs.py (Context.add_all_configs): Do not use --without-fp to configure glibc.
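To illustrate the idea of choosing fpu or nofpu directories from the compiler rather than from a configure option, here is a minimal sketch. It is not the test configure actually runs: `SKETCH_WITH_FP_COND`, `sketch_use_fpu_dirs` and `sketch_sysdeps_dir` are hypothetical names, and the predefined macros tested (`__SOFTFP__`, `__mips_soft_float`, `_SOFT_FLOAT`) are assumed examples of how some GCC targets advertise soft float; the real per-architecture conditions live in the preconfigure fragments listed above.

```c
/* Hypothetical stand-in for a with_fp_cond-style preprocessor
   conditional: 0 when the compiler says it generates soft-float code,
   1 otherwise.  The macros tested here are assumptions chosen for
   illustration only.  */
#if defined __SOFTFP__ || defined __mips_soft_float || defined _SOFT_FLOAT
# define SKETCH_WITH_FP_COND 0
#else
# define SKETCH_WITH_FP_COND 1
#endif

/* Report whether the "fpu" sysdeps directories would be selected.  */
static int
sketch_use_fpu_dirs (void)
{
  return SKETCH_WITH_FP_COND;
}

/* Name of the sysdeps subdirectory flavour that would be searched.  */
static const char *
sketch_sysdeps_dir (void)
{
  return sketch_use_fpu_dirs () ? "fpu" : "nofpu";
}
```

Because the answer comes from the compiler's own predefined macros, a soft-float toolchain selects the nofpu flavour without the user passing any extra option.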
*	Use libm_alias_float for coldfire.	Joseph Myers	2017-11-30	3	-3/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Continuing the preparation for additional _FloatN / _FloatNx function aliases, this patch makes coldfire libm function implementations use libm_alias_float to define function aliases. Untested, given the currently broken state of GCC for coldfire. * sysdeps/m68k/coldfire/fpu/s_fabsf.c: Include <libm-alias-float.h>. (fabsf): Define using libm_alias_float. * sysdeps/m68k/coldfire/fpu/s_lrintf.c: Include <libm-alias-float.h>. (lrintf): Define using libm_alias_float. * sysdeps/m68k/coldfire/fpu/s_rintf.c: Include <libm-alias-float.h>. (rintf): Define using libm_alias_float.
*	Use libm_alias_double for coldfire.	Joseph Myers	2017-11-30	3	-15/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Continuing the preparation for additional _FloatN / _FloatNx function aliases, this patch makes coldfire libm function implementations use libm_alias_double to define function aliases. Untested, given the currently broken state of GCC for coldfire. * sysdeps/m68k/coldfire/fpu/s_fabs.c: Include <libm-alias-double.h>. (fabs): Define using libm_alias_double. * sysdeps/m68k/coldfire/fpu/s_lrint.c: Include <libm-alias-double.h>. (lrint): Define using libm_alias_double. * sysdeps/m68k/coldfire/fpu/s_rint.c: Include <libm-alias-double.h>. (rint): Define using libm_alias_double.
*	Rework m68k libm functions to use declare_mgen_alias.	Joseph Myers	2017-11-30	73	-434/+654
	Many m68k libm functions use their own system to share code between different types and functions, involving defining macros before including code for another function (for example, s_atan.c also acts as a template that can define other functions). These files serving as templates generate function aliases directly with e.g. "weak_alias (__CONCATX(__,FUNC), FUNC)" in s_atan.c. To be prepared to generate _Float32, _Float64 and _Float32x function aliases, this needs changing so that the libm_alias_* macros get used instead. As the macro to use varies depending on the type, that would mean additional macros to define in several different places to get the appropriate alias-generation macro used in each case. Rather than adding to the m68k-specific mechanisms, this patch converts the functions in question to use something closer to the math/ type-generic template mechanism. After this patch, these functions have m68k-specific templates such as s_atan_template.c, but those templates use all the same macros as in the math/ templates, such as FLOAT, M_DECL_FUNC, M_SUF and declare_mgen_alias. 
There is no automatic generation of the files such as s_atan.c that include the appropriate math-type-macros-.h header and the template file (the existing automatic generation logic is only applicable for the fixed set of templates listed in math/ - and sysdeps sources always override files generated that way), so those files are still checked in, but they are all the obvious two-line files (with one additional definition in the case of the expm1 implementations), rather than making e.g. s_atan.c special. Functions are only converted where they should have aliases for _FloatN / _FloatNx types. Those m68k functions that do not generate public names (those that only generate __ieee754_, with wrappers generating the public names, and classification functions that only exist once per format not once per type so don't get aliases) are unchanged. However, log1p (public names generated by wrapper) and significand (not provided for new types so no new aliases needed) needed changing because they previously included the atan implementations. Now, s_significand.c is the main implementation for functions with that prototype and using the old implementation approach, while log1p includes it in place of atan. Any further cleanups in this area (which preserve the proper set of functions getting aliases defined by libm_alias_float and libm_alias_double) are of course welcome, just not needed for the goals of this patch. Tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by the patch. * sysdeps/m68k/m680x0/fpu/s_atan_template.c: New file. * sysdeps/m68k/m680x0/fpu/s_ceil_template.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_cos_template.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_expm1_template.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_fabs_template.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_floor_template.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_frexp_template.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_lrint_template.c: Likewise. 
* sysdeps/m68k/m680x0/fpu/s_modf_template.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_nearbyint_template.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_remquo_template.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_rint_template.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_sin_template.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_sincos_template.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_tan_template.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_tanh_template.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_trunc_template.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_atan.c: Reimplement to use s_atan_template.c. * sysdeps/m68k/m680x0/fpu/s_atanf.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_atanl.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_ceil.c: Reimplement to use s_ceil_template.c. * sysdeps/m68k/m680x0/fpu/s_ceilf.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_ceill.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_cos.c: Reimplement to use s_cos_template.c. * sysdeps/m68k/m680x0/fpu/s_cosf.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_cosl.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_expm1.c: Reimplement to use s_expm1_template.c. * sysdeps/m68k/m680x0/fpu/s_expm1f.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_expm1l.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_fabs.c: Reimplement to use s_fabs_template.c. * sysdeps/m68k/m680x0/fpu/s_fabsf.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_fabsl.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_floor.c: Reimplement to use s_floor_template.c. * sysdeps/m68k/m680x0/fpu/s_floorf.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_floorl.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_frexp.c: Reimplement to use s_frexp_template.c. * sysdeps/m68k/m680x0/fpu/s_frexpf.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_lrint.c: Reimplement to use s_lrint_template.c. * sysdeps/m68k/m680x0/fpu/s_lrintf.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_lrintl.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_modf.c: Reimplement to use s_modf_template.c. * sysdeps/m68k/m680x0/fpu/s_modff.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_modfl.c: Likewise. 
* sysdeps/m68k/m680x0/fpu/s_nearbyint.c: Reimplement to use s_nearbyint_template.c. * sysdeps/m68k/m680x0/fpu/s_nearbyintf.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_nearbyintl.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_remquo.c: Reimplement to use s_remquo_template.c. * sysdeps/m68k/m680x0/fpu/s_remquof.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_remquol.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_rint.c: Reimplement to use s_rint_template.c. * sysdeps/m68k/m680x0/fpu/s_rintf.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_rintl.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_sin.c: Reimplement to use s_sin_template.c. * sysdeps/m68k/m680x0/fpu/s_sinf.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_sinl.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_sincos.c: Reimplement to use s_sincos_template.c. * sysdeps/m68k/m680x0/fpu/s_sincosf.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_sincosl.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_tan.c: Reimplement to use s_tan_template.c. * sysdeps/m68k/m680x0/fpu/s_tanf.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_tanl.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_tanh.c: Reimplement to use s_tanh_template.c. * sysdeps/m68k/m680x0/fpu/s_tanhf.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_tanhl.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_trunc.c: Reimplement to use s_trunc_template.c. * sysdeps/m68k/m680x0/fpu/s_truncf.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_truncl.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_significand.c: Reimplement based on s_atan.c instead of including s_atan.c. * sysdeps/m68k/m680x0/fpu/s_significandf.c: Reimplement based on s_atanf.c instead of including s_atanf.c. * sysdeps/m68k/m680x0/fpu/s_significandl.c: Reimplement based on s_atanl.c instead of including s_atanl.c. * sysdeps/m68k/m680x0/fpu/s_log1p.c: Include s_significand.c instead of s_atan.c. * sysdeps/m68k/m680x0/fpu/s_log1pf.c: Include s_significandf.c instead of s_atanf.c. * sysdeps/m68k/m680x0/fpu/s_log1pl.c: Include s_significandl.c instead of s_atanl.c.
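The template approach described in this commit can be imitated in a single file. This is only a sketch under stated assumptions: glibc's real templates are separate files included after a math-type-macros header, whereas here a function-like macro plays the role of the template, and `SKETCH_TEMPLATE_ABS`, `SKETCH_SUF_DOUBLE`, `SKETCH_SUF_FLOAT` and the generated `sketch_abs` / `sketch_absf` functions are hypothetical names. The parameters `FLOAT` and `M_SUF` mirror the macro names mentioned in the message.

```c
/* Single-file imitation of a type-generic template.  FLOAT is the
   floating-point type and M_SUF appends the per-type suffix to a
   function name.  The body is a simple absolute value, not glibc's
   m68k implementation.  */
#define SKETCH_TEMPLATE_ABS(FLOAT, M_SUF)	\
  static FLOAT					\
  M_SUF (sketch_abs) (FLOAT x)			\
  {						\
    return x < 0 ? -x : x;			\
  }

/* Per-type suffix macros: no suffix for double, "f" for float.  */
#define SKETCH_SUF_DOUBLE(name) name
#define SKETCH_SUF_FLOAT(name) name##f

/* Each instantiation corresponds to one of the small per-type source
   files described above.  */
SKETCH_TEMPLATE_ABS (double, SKETCH_SUF_DOUBLE)	/* sketch_abs */
SKETCH_TEMPLATE_ABS (float, SKETCH_SUF_FLOAT)	/* sketch_absf */
```

The point of the design is that the function body is written once, and adding a type only requires a new pair of type and suffix definitions.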
*	Use libm_alias macros in m68k llrint functions.	Joseph Myers	2017-11-30	3	-3/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Most m68k libm functions share code via sources for one function including those for another function or type, in a way that will require significant changes to create function aliases in a way friendly to adding _FloatN / _FloatNx aliases. The llrint function implementations, however, use a conventional separate implementation for each floating-point type. Thus preparing them for _FloatN / _FloatNx aliases is just a matter of changing them to include the appropriate headers and use the appropriate macros, which this patch does. The llrintl changes aren't strictly required, since m68k long double does not meet the criteria for a _FloatN / _FloatNx type, but are included anyway to keep consistency between the implementations for the three types. Tested with build-many-glibcs.py that installed stripped shared libraries for m68k-linux-gnu are unchanged by the patch. * sysdeps/m68k/m680x0/fpu/s_llrint.c: Include <libm-alias-double.h>. (llrint): Define using libm_alias_double. * sysdeps/m68k/m680x0/fpu/s_llrintf.c: Include <libm-alias-float.h>. (llrintf): Define using libm_alias_float. * sysdeps/m68k/m680x0/fpu/s_llrintl.c: Include <libm-alias-ldouble.h>. (llrintl): Define using libm_alias_ldouble.
*	Use declare_mgen_alias in m68k templates.	Joseph Myers	2017-11-30	4	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some m68k libm functions have their own templates replacing the generic math/ ones but using the type-generic template machinery. These currently define function aliases directly using weak_alias. In preparation for additional _FloatN / _FloatNx function aliases, this patch changes them to use declare_mgen_alias for creating aliases instead. Tested with build-many-glibcs.py that installed stripped shared libraries for m68k-linux-gnu are unchanged by the patch. * sysdeps/m68k/m680x0/fpu/s_ccosh_template.c (ccosh): Use declare_mgen_alias instead of weak_alias. * sysdeps/m68k/m680x0/fpu/s_cexp_template.c (cexp): Likewise. * sysdeps/m68k/m680x0/fpu/s_csin_template.c (csin): Likewise. * sysdeps/m68k/m680x0/fpu/s_csinh_template.c (csinh): Likewise.
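The difference between defining one alias with weak_alias and letting a per-type macro decide which aliases exist can be sketched as follows. This is a simplified illustration, not glibc's definitions: it assumes an ELF target and a GCC-compatible compiler, and `sketch_weak_alias`, `sketch_alias_double`, `sketch_internal_half` and the alias names are hypothetical. The set of extra aliases emitted (`f64`, `f32x`) is likewise an assumption made for the example.

```c
/* Simplified weak-alias macro in the style used by glibc sources
   (assumes ELF and GCC/Clang attribute support).  */
#define sketch_weak_alias(name, aliasname) \
  extern __typeof (name) aliasname __attribute__ ((weak, alias (#name)));

/* Hypothetical per-type macro: one use defines the public name plus
   further aliases for types that share the double format.  */
#define sketch_alias_double(from, to)		\
  sketch_weak_alias (from, to)			\
  sketch_weak_alias (from, to##f64)		\
  sketch_weak_alias (from, to##f32x)

/* The implementation carries an internal name ...  */
double
sketch_internal_half (double x)
{
  return x / 2;
}

/* ... and every public name is created in one place.  */
sketch_alias_double (sketch_internal_half, sketch_half)
```

With this arrangement, adding a new alias for every double function means editing the per-type macro once instead of touching each source file.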
*	nptl: Define __PTHREAD_MUTEX_{NUSERS_AFTER_KIND,USE_UNION}	Adhemerval Zanella	2017-11-07	1	-0/+2
	This patch adds two new internal defines to set the internal pthread_mutex_t layout required by the supported ABIs: 1. __PTHREAD_MUTEX_NUSERS_AFTER_KIND, which controls whether to define the __nusers field before or after __kind. The preferred value is 0 for new ports, and it places __nusers before __kind. 2. __PTHREAD_MUTEX_USE_UNION, which controls whether the internal __spins and __list members will be placed inside a union for linuxthreads compatibility. The preferred value is 0 for new ports, and it means a union is not used to define both fields. It fixes the wrong offset for the __kind field on x86_64-linux-gnu-x32. Checked with a make check run-built-tests=no on all affected ABIs. [BZ #22298] * nptl/allocatestack.c (allocate_stack): Check if __PTHREAD_MUTEX_HAVE_PREV is non-zero, instead of if __PTHREAD_MUTEX_HAVE_PREV is defined. * nptl/descr.h (pthread): Likewise. * nptl/nptl-init.c (__pthread_initialize_minimal_internal): Likewise. * nptl/pthread_create.c (START_THREAD_DEFN): Likewise. * sysdeps/nptl/fork.c (__libc_fork): Likewise. * sysdeps/nptl/pthread.h (PTHREAD_MUTEX_INITIALIZER): Likewise. * sysdeps/nptl/bits/thread-shared-types.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): New defines. (__pthread_internal_list): Check __PTHREAD_MUTEX_USE_UNION instead of __WORDSIZE for internal layout. (__pthread_mutex_s): Check __PTHREAD_MUTEX_NUSERS_AFTER_KIND instead of __WORDSIZE for internal __nusers layout and __PTHREAD_MUTEX_USE_UNION instead of __WORDSIZE whether to use a union for __spins and __list fields. (__PTHREAD_MUTEX_HAVE_PREV): Define also for __PTHREAD_MUTEX_USE_UNION case. 
* sysdeps/aarch64/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): New defines. * sysdeps/alpha/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/arm/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/hppa/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/ia64/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/m68k/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/microblaze/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/mips/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/nios2/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/powerpc/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/s390/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/sh/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/sparc/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/tile/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/x86/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
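A reduced sketch of how a define like __PTHREAD_MUTEX_NUSERS_AFTER_KIND can steer a structure layout is shown below. It is not the real __pthread_mutex_s: `struct sketch_mutex`, its field names and `SKETCH_NUSERS_AFTER_KIND` are hypothetical, and only the field-ordering choice described in the message is modelled.

```c
#include <stddef.h>

/* 0 is the value the message recommends for new ports: the user
   counter is placed before the kind field.  */
#define SKETCH_NUSERS_AFTER_KIND 0

/* Hypothetical, reduced mutex structure.  */
struct sketch_mutex
{
  int lock;
  unsigned int count;
  int owner;
#if !SKETCH_NUSERS_AFTER_KIND
  unsigned int nusers;
#endif
  int kind;
#if SKETCH_NUSERS_AFTER_KIND
  unsigned int nusers;
#endif
};

/* Offset of the statically initialized field, as seen by users of a
   static initializer.  */
static size_t
sketch_kind_offset (void)
{
  return offsetof (struct sketch_mutex, kind);
}
```

Switching the define to 1 moves `kind` one `int` earlier, which is exactly the kind of ABI-visible change that makes an explicit per-architecture setting preferable to deriving the layout from the word size.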
*	nptl: Add tests for internal pthread_mutex_t offsets	Adhemerval Zanella	2017-11-07	1	-0/+5
	This patch adds a new build test to check the offsets of user-visible internal fields. Although currently the only field which is statically initialized to a non-zero value is pthread_mutex_t.__data.__kind, the tests also check the offsets of the __kind, __spins, __elision (if supported), and __list internal members. An internal header (pthread-offsets.h) is added to each major ABI with the reference values. Checked on x86_64-linux-gnu and with a build check for all affected ABIs (aarch64-linux-gnu, alpha-linux-gnu, arm-linux-gnueabihf, hppa-linux-gnu, i686-linux-gnu, ia64-linux-gnu, m68k-linux-gnu, microblaze-linux-gnu, mips64-linux-gnu, mips64-n32-linux-gnu, mips-linux-gnu, powerpc64le-linux-gnu, powerpc-linux-gnu, s390-linux-gnu, s390x-linux-gnu, sh4-linux-gnu, sparc64-linux-gnu, sparcv9-linux-gnu, tilegx-linux-gnu, tilegx-linux-gnu-x32, tilepro-linux-gnu, x86_64-linux-gnu, and x86_64-linux-x32). * nptl/pthreadP.h (ASSERT_PTHREAD_STRING, ASSERT_PTHREAD_INTERNAL_OFFSET): New macros. * nptl/pthread_mutex_init.c (__pthread_mutex_init): Add build time checks for internal pthread_mutex_t offsets. * sysdeps/aarch64/nptl/pthread-offsets.h (__PTHREAD_MUTEX_NUSERS_OFFSET, __PTHREAD_MUTEX_KIND_OFFSET, __PTHREAD_MUTEX_SPINS_OFFSET, __PTHREAD_MUTEX_ELISION_OFFSET, __PTHREAD_MUTEX_LIST_OFFSET): New macros. * sysdeps/alpha/nptl/pthread-offsets.h: Likewise. * sysdeps/arm/nptl/pthread-offsets.h: Likewise. * sysdeps/hppa/nptl/pthread-offsets.h: Likewise. * sysdeps/i386/nptl/pthread-offsets.h: Likewise. * sysdeps/ia64/nptl/pthread-offsets.h: Likewise. * sysdeps/m68k/nptl/pthread-offsets.h: Likewise. * sysdeps/microblaze/nptl/pthread-offsets.h: Likewise. * sysdeps/mips/nptl/pthread-offsets.h: Likewise. * sysdeps/nios2/nptl/pthread-offsets.h: Likewise. * sysdeps/powerpc/nptl/pthread-offsets.h: Likewise. 
* sysdeps/s390/nptl/pthread-offsets.h: Likewise. * sysdeps/sh/nptl/pthread-offsets.h: Likewise. * sysdeps/sparc/nptl/pthread-offsets.h: Likewise. * sysdeps/tile/nptl/pthread-offsets.h: Likewise. * sysdeps/x86_64/nptl/pthread-offsets.h: Likewise. Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
*	m68k: Update elf_machine_load_address for static PIE	H.J. Lu	2017-10-20	1	-0/+6
\| \| \| \| \| \| \| \| \|	When --enable-static-pie is used to configure glibc, we need to use _dl_relocate_static_pie to compute load address in static PIE. * sysdeps/m68k/dl-machine.h (elf_machine_load_address): Use _dl_relocate_static_pie instead of _dl_start to compute load address in static PIE.
*	m68k: Check PIC instead of SHARED in start.S	H.J. Lu	2017-10-20	1	-1/+1
\| \| \| \| \| \| \|	Since start.o may be compiled as PIC, we should check PIC instead of SHARED. * sysdeps/m68k/start.S (_start): Check PIC instead of SHARED.
*	Do not wrap logf, log2f and powf	Szabolcs Nagy	2017-10-02	3	-0/+3
	The new generic logf, log2f and powf code doesn't need wrappers any more: it sets errno inline, so the wrappers are only used on targets that need them. * sysdeps/ieee754/flt-32/e_log2f.c (__log2f): Define without wrapper. * sysdeps/ieee754/flt-32/e_logf.c (__logf): Likewise. * sysdeps/ieee754/flt-32/e_powf.c (__powf): Likewise. * sysdeps/ieee754/flt-32/w_log2f.c: New file. * sysdeps/ieee754/flt-32/w_logf.c: New file. * sysdeps/ieee754/flt-32/w_powf.c: New file. * sysdeps/i386/fpu/w_log2f.c: New file. * sysdeps/i386/fpu/w_logf.c: New file. * sysdeps/i386/fpu/w_powf.c: New file. * sysdeps/m68k/m680x0/fpu/w_log2f.c: New file. * sysdeps/m68k/m680x0/fpu/w_logf.c: New file. * sysdeps/m68k/m680x0/fpu/w_powf.c: New file.
*	Do not wrap expf and exp2f	Szabolcs Nagy	2017-10-02	2	-0/+2
	The new generic expf and exp2f code doesn't need wrappers any more: it sets errno inline, so the wrappers are only used on targets that need them. (If the wrapper is needed, then the top level wrapper code is included, otherwise an empty w_expf.c is used to suppress the wrapper.) A powerpc64 expf implementation includes the expf C code directly, which needed some changes. * sysdeps/ieee754/flt-32/e_exp2f.c (__exp2f): Define without wrapper. * sysdeps/ieee754/flt-32/e_expf.c (__expf): Likewise. * sysdeps/ieee754/flt-32/w_exp2f.c: New file. * sysdeps/ieee754/flt-32/w_expf.c: New file. * sysdeps/powerpc/powerpc64/fpu/multiarch/e_expf-ppc64.c: Update for the new expf code. * sysdeps/powerpc/powerpc64/fpu/multiarch/w_expf.c: New file. * sysdeps/powerpc/powerpc64/power8/fpu/w_expf.c: New file. * sysdeps/m68k/m680x0/fpu/w_exp2f.c: New file. * sysdeps/m68k/m680x0/fpu/w_expf.c: New file. * sysdeps/i386/fpu/w_exp2f.c: New file. * sysdeps/i386/fpu/w_expf.c: New file. * sysdeps/i386/i686/fpu/multiarch/w_expf.c: New file. * sysdeps/x86_64/fpu/w_expf.c: New file.
*	New generic powf	Szabolcs Nagy	2017-09-29	1	-0/+1
	without wrapper on aarch64:
powf reciprocal-throughput: 4.2x faster
powf latency: 2.6x faster
old worst-case error: 1.11 ulp
new worst-case error: 0.82 ulp
aarch64 .text size: -780 bytes
aarch64 .rodata size: +144 bytes

powf(x,y) is implemented as exp2(y*log2(x)) with the same algorithms that are used in exp2f and log2f, except that the log2f polynomial is larger for extra precision and its output (and exp2f input) may be scaled by a power of 2 (POWF_SCALE) to simplify the argument reduction step of exp2 (possible when efficient round and convert-to-int operations are available). The special case handling tries to minimize the checks in the hot path. When the input of exp2_inline is checked, integer arithmetic is used as that was faster on the tested aarch64 cores.

* math/Makefile (type-float-routines): Add e_powf_log2_data. * sysdeps/ieee754/flt-32/e_powf.c: New implementation. * sysdeps/ieee754/flt-32/e_powf_log2_data.c: New file. * sysdeps/ieee754/flt-32/math_config.h (__powf_log2_data): Define. (issignalingf_inline): Likewise. (POWF_LOG2_TABLE_BITS): Likewise. (POWF_LOG2_POLY_ORDER): Likewise. (POWF_SCALE_BITS): Likewise. (POWF_SCALE): Likewise. * sysdeps/i386/fpu/e_powf_log2_data.c: New file. * sysdeps/ia64/fpu/e_powf_log2_data.c: New file. * sysdeps/m68k/m680x0/fpu/e_powf_log2_data.c: New file.
*	New generic log2f	Szabolcs Nagy	2017-09-29	1	-0/+1
	Similar to the new logf: double precision arithmetic and a small lookup table are used. The argument reduction step is the same as in the new logf.

without wrapper on aarch64:
log2f reciprocal-throughput: 2.3x faster
log2f latency: 2.1x faster
old worst case error: 1.72 ulp
new worst case error: 0.75 ulp
aarch64 .text size: -252 bytes
aarch64 .rodata size: +244 bytes

* math/Makefile (type-float-routines): Add e_log2f_data. * sysdeps/ieee754/flt-32/e_log2f.c: New implementation. * sysdeps/ieee754/flt-32/e_log2f_data.c: New file. * sysdeps/ieee754/flt-32/math_config.h (__log2f_data): Define. (LOG2F_TABLE_BITS, LOG2F_POLY_ORDER): Define. * sysdeps/i386/fpu/e_log2f_data.c: New file. * sysdeps/ia64/fpu/e_log2f_data.c: New file. * sysdeps/m68k/m680x0/fpu/e_log2f_data.c: New file.
*	New generic logf	Szabolcs Nagy	2017-09-29	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	without wrapper on aarch64: logf reciprocal-throughput: 2.2x faster logf latency: 1.9x faster old worst case error: 0.89 ulp new worst case error: 0.82 ulp aarch64 .text size: -356 bytes aarch64 .rodata size: +240 bytes Uses double precision arithmetics and a lookup table to allow smaller polynomial and avoid the use of division. Data is in a separate translation unit with fixed layout to prevent the compiler generating suboptimal literal access. Errors are handled inline according to POSIX rules, but this patch keeps the wrapper with SVID compatible error handling. Needs libm-test-ulps adjustment for clogf in non-nearest rounding mode. * math/Makefile (type-float-routines): Add e_logf_data. * sysdeps/ieee754/flt-32/e_logf.c: New implementation. * sysdeps/ieee754/flt-32/e_logf_data.c: New file. * sysdeps/ieee754/flt-32/math_config.h (__logf_data): Define. (LOGF_TABLE_BITS, LOGF_POLY_ORDER): Define. * sysdeps/i386/fpu/e_logf_data.c: New file. * sysdeps/ia64/fpu/e_logf_data.c: New file. * sysdeps/m68k/m680x0/fpu/e_logf_data.c: New file.
*	Remove ancient __signbit inlines	Wilco Dijkstra	2017-09-28	2	-68/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Remove __signbit inlines from mathinline.h. math.h already uses the builtin when supported, so additional inlines are only used on pre 4.0 GCCs. Similarly remove ancient copysign and fabs inlines. * sysdeps/alpha/fpu/bits/mathinline.h: Delete file. * sysdeps/ia64/fpu/bits/mathinline.h: Delete file. * sysdeps/m68k/coldfire/fpu/bits/mathinline.h: Delete file. * sysdeps/m68k/m680x0/fpu/bits/mathinline.h: (__signbitf): Remove. (__signbit): Remove. (__signbitl): Remove. * sysdeps/powerpc/bits/mathinline.h (__signbitf): Remove. (__signbit): Remove. (__signbitl): Remove. * sysdeps/s390/fpu/bits/mathinline.h: (__signbitf): Remove. (__signbit): Remove. (__signbitl): Remove. * sysdeps/sparc/fpu/bits/mathinline.h (__signbitf): Remove. (__signbit): Remove. (__signbitl): Remove. * sysdeps/tile/bits/mathinline.h: Delete file. * sysdeps/x86/fpu/bits/mathinline.h (__signbitf): Remove. (__signbit): Remove. (__signbitl): Remove.
*	Simplify C99 isgreater macros	Wilco Dijkstra	2017-09-28	1	-54/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Simplify the C99 isgreater macros. Although some support was added in GCC 2.97, not all targets added support until GCC 3.1. Therefore only use the builtins in math.h from GCC 3.1 onwards, and defer to generic macros otherwise. Improve the generic isunordered macro to use compares rather than call fpclassify twice - this is not only faster but also correct for signaling NaNs. * math/math.h: Improve handling of C99 isgreater macros. * sysdeps/alpha/fpu/bits/mathinline.h: Remove isgreater macros. * sysdeps/m68k/m680x0/fpu/bits/mathinline.h: Likewise. * sysdeps/powerpc/bits/mathinline.h: Likewise. * sysdeps/sparc/fpu/bits/mathinline.h: Likewise. * sysdeps/x86/fpu/bits/mathinline.h: Likewise.
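The isgreater entry above mentions an isunordered that uses compares instead of calling fpclassify twice. A minimal sketch of that idea, written as a function on double so it stays portable C: the name `isunordered_sketch` is hypothetical, and the actual math.h definition is a type-generic macro that may differ in detail.

```c
#include <math.h>

/* Hypothetical stand-in for a compare-based isunordered: two values are
   unordered exactly when at least one of them is a NaN.  A NaN is the
   only value that compares unequal to itself, so no classification
   call is needed.  */
static int
isunordered_sketch (double u, double v)
{
  return u != v && (u != u || v != v);
}
```

With this, `isunordered_sketch (NAN, 1.0)` is nonzero and `isunordered_sketch (1.0, 2.0)` is zero. The leading `u != v` test lets the common case of two equal values return after a single compare.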