mirror/glibc - mirror of git://sourceware.org/git/glibc.git

	Commit message (Collapse)	Author	Age	Files	Lines
*	x86-64: Add powf with FMA	H.J. Lu	2017-10-22	3	-1/+49
	For workload-spec2017.wrf, on Skylake, it improves performance by:

	                        Before    After     Improvement
	reciprocal-throughput   35.4713   27.3842   29%
	latency                 82.4537   66.3175   24%

	* sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Add e_powf-fma.
	(CFLAGS-e_powf-fma.c): New.
	* sysdeps/x86_64/fpu/multiarch/e_powf-fma.c: New file.
	* sysdeps/x86_64/fpu/multiarch/e_powf.c: Likewise.
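The pattern behind these FMA commits (one generic source compiled twice, with the faster variant chosen at run time) can be sketched in plain GNU C. This is only an illustration under stated assumptions: glibc itself uses IFUNC and its internal cpu_features data, whereas this sketch uses the compiler builtins `__builtin_cpu_init` and `__builtin_cpu_supports`, and every `scale_add*` name is hypothetical.

```c
#include <stddef.h>

/* Baseline variant: built with the default ISA of this translation unit. */
float scale_add_generic(float a, float b, float c)
{
    return a * b + c;
}

#if defined(__x86_64__) && defined(__GNUC__)
/* Same source expression, but the compiler is allowed to use FMA
   instructions here (similar in spirit to a per-file CFLAGS setting). */
__attribute__((target("fma")))
float scale_add_fma(float a, float b, float c)
{
    return a * b + c;
}
#endif

typedef float (*scale_add_fn)(float, float, float);

/* Hypothetical selector: pick the FMA build only if the CPU reports FMA. */
scale_add_fn select_scale_add(void)
{
#if defined(__x86_64__) && defined(__GNUC__)
    __builtin_cpu_init();
    if (__builtin_cpu_supports("fma"))
        return scale_add_fma;
#endif
    return scale_add_generic;
}

/* Public entry point: resolve once, then call through the pointer. */
float scale_add(float a, float b, float c)
{
    static scale_add_fn impl;
    if (impl == NULL)
        impl = select_scale_add();
    return impl(a, b, c);
}
```

A caller only ever sees `scale_add`; which variant runs depends on the machine, which is why the benchmark numbers in these commits are reported per microarchitecture.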
*	x86-64: Add log2f with FMA	H.J. Lu	2017-10-22	3	-1/+45
	For workload-spec2017.wrf, on Skylake, it improves performance by:

	                        Before    After     Improvement
	reciprocal-throughput   16.5937   14.0789   17%
	latency                 41.7755   35.3586   18%

	* sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Add e_log2f-fma.
	(CFLAGS-e_log2f-fma.c): New.
	* sysdeps/x86_64/fpu/multiarch/e_log2f-fma.c: New file.
	* sysdeps/x86_64/fpu/multiarch/e_log2f.c: Likewise.
*	x86-64: Add logf with FMA	H.J. Lu	2017-10-22	3	-1/+45
	For workload-spec2017.wrf, on Skylake, it improves performance by:

	                        Before    After     Improvement
	reciprocal-throughput   16.1534   13.8874   16%
	latency                 41.9642   34.3072   22%

	* sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Add e_logf-fma.
	(CFLAGS-e_logf-fma.c): New.
	* sysdeps/x86_64/fpu/multiarch/e_logf-fma.c: New file.
	* sysdeps/x86_64/fpu/multiarch/e_logf.c: Likewise.
*	i386: Replace assembly versions of e_logf with generic e_logf.c	H.J. Lu	2017-10-22	9	-143/+62
	This patch replaces i386 assembly versions of e_logf with generic e_logf.c.

	For workload-spec2017.wrf, on Nehalem, it improves performance by:

	                        Before    After     Improvement
	reciprocal-throughput   73.3865   40.0454   83%
	latency                 90.0985   54.4479   65%

	On Skylake, it improves performance by:

	                        Before    After     Improvement
	reciprocal-throughput   75.1384   22.1452   239%
	latency                 91.9441   50.7925   81%

	On IvyBridge with --disable-multi-arch, it improves performance by:

	                        Before    After     Improvement
	reciprocal-throughput   84.5575   28.7879   193%
	latency                 103.971   57.5231   80%

	* sysdeps/i386/fpu/e_logf.S: Removed.
	* sysdeps/i386/fpu/e_logf_data.c: Likewise.
	* sysdeps/i386/fpu/w_logf.c: Likewise.
	* sysdeps/i386/i686/fpu/e_logf.S: Likewise.
	* sysdeps/i386/fpu/libm-test-ulps: Updated for generic e_logf.c.
	* sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Likewise.
	* sysdeps/i386/i686/fpu/multiarch/Makefile (libm-sysdep_routines): Add e_logf-sse2.
	(CFLAGS-e_logf-sse2.c): New.
	* sysdeps/i386/i686/fpu/multiarch/e_logf-sse2.c: New file.
	* sysdeps/i386/i686/fpu/multiarch/e_logf.c: Likewise.
*	i386: Replace assembly versions of e_exp2f with generic e_exp2f.c	H.J. Lu	2017-10-22	7	-54/+46
	This patch replaces i386 assembly versions of e_exp2f with generic e_exp2f.c.

	For workload-spec2017.wrf, on Nehalem, it improves performance by:

	                        Before    After     Improvement
	reciprocal-throughput   112.996   40.0454   182%
	latency                 126.581   54.4479   132%

	On Skylake, it improves performance by:

	                        Before    After     Improvement
	reciprocal-throughput   113.14    39.447    186%
	latency                 136.068   55.684    144%

	On IvyBridge with --disable-multi-arch, it improves performance by:

	                        Before    After     Improvement
	reciprocal-throughput   132.521   40.3759   228%
	latency                 145.791   58.4587   149%

	* sysdeps/i386/fpu/e_exp2f.S: Removed.
	* sysdeps/i386/fpu/w_exp2f.c: Likewise.
	* sysdeps/i386/fpu/libm-test-ulps: Updated for generic e_exp2f.c.
	* sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Likewise.
	* sysdeps/i386/i686/fpu/multiarch/Makefile (libm-sysdep_routines): Add e_exp2f-sse2.
	(CFLAGS-e_exp2f-sse2.c): New.
	* sysdeps/i386/i686/fpu/multiarch/e_exp2f-sse2.c: New file.
	* sysdeps/i386/i686/fpu/multiarch/e_exp2f.c: Likewise.
*	x86-64: Add exp2f with FMA	H.J. Lu	2017-10-22	3	-1/+42
	For workload-spec2017.wrf, on Skylake, it improves performance by:

	                        Before    After     Improvement
	reciprocal-throughput   13.0291   11.2225   16%
	latency                 44.5154   37.5766   18%

	* sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Add e_exp2f-fma.
	(CFLAGS-e_exp2f-fma.c): New.
	* sysdeps/x86_64/fpu/multiarch/e_exp2f-fma.c: New file.
	* sysdeps/x86_64/fpu/multiarch/e_exp2f.c: Likewise.
*	i386: Replace assembly versions of e_expf with generic e_expf.c	H.J. Lu	2017-10-22	12	-442/+35
	This patch replaces i386 assembly versions of e_expf with generic e_expf.c.

	For workload-spec2017.wrf, on Nehalem, it improves performance by:

	                        Before    After     Improvement
	reciprocal-throughput   55.5724   40.2664   38%
	latency                 80.0687   60.8517   31%

	On Skylake, it improves performance by:

	                        Before    After     Improvement
	reciprocal-throughput   62.4056   39.4188   58%
	latency                 85.5496   59.6377   43%

	On IvyBridge with --disable-multi-arch, it improves performance by:

	                        Before    After     Improvement
	reciprocal-throughput   133.707   40.3778   231%
	latency                 149.191   63.2515   135%

	* sysdeps/i386/fpu/e_exp2f_data.c: Removed.
	* sysdeps/i386/fpu/e_expf.S: Likewise.
	* sysdeps/i386/fpu/math_errf.c: Likewise.
	* sysdeps/i386/fpu/w_expf.c: Likewise.
	* sysdeps/i386/i686/fpu/multiarch/e_expf-ia32.S: Likewise.
	* sysdeps/i386/i686/fpu/multiarch/e_expf-sse2.S: Likewise.
	* sysdeps/i386/i686/fpu/multiarch/w_expf.c: Likewise.
	* sysdeps/i386/fpu/libm-test-ulps: Updated for generic e_expf.c.
	* sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Likewise.
	* sysdeps/i386/i686/fpu/multiarch/Makefile (libm-sysdep_routines): Remove e_expf-ia32.
	(CFLAGS-e_expf-sse2.c): New.
	* sysdeps/i386/i686/fpu/multiarch/e_expf-sse2.c: New file.
	* sysdeps/i386/i686/fpu/multiarch/e_expf.c: Rewritten.
*	x86-64: Replace assembly versions of e_expf with generic e_expf.c	H.J. Lu	2017-10-22	7	-529/+29
	This patch replaces x86-64 assembly versions of e_expf with generic e_expf.c.

	For workload-spec2017.wrf, on Nehalem, it improves performance by:

	                        Before    After     Improvement
	reciprocal-throughput   36.039    20.7749   73%
	latency                 58.8096   40.8715   43%

	On Skylake, it improves performance by:

	                        Before    After     Improvement
	reciprocal-throughput   18.4436   11.1693   65%
	latency                 47.5162   37.5411   26%

	* sysdeps/x86_64/fpu/e_expf.S: Removed.
	* sysdeps/x86_64/fpu/multiarch/e_expf-fma.S: Likewise.
	* sysdeps/x86_64/fpu/w_expf.c: Likewise.
	* sysdeps/x86_64/fpu/libm-test-ulps: Updated for generic e_expf.c.
	* sysdeps/x86_64/fpu/multiarch/Makefile (CFLAGS-e_expf-fma.c): New.
	* sysdeps/x86_64/fpu/multiarch/e_expf-fma.c: New file.
	* sysdeps/x86_64/fpu/multiarch/e_expf.c (__redirect_ieee754_expf): Renamed to ...
	(__redirect_expf): This.
	(SYMBOL_NAME): Changed to expf.
	(__ieee754_expf): Renamed to ...
	(__expf): This.
	(__GI___expf): This.
	(__ieee754_expf): Add strong_alias.
	(__expf_finite): Likewise.
	(__expf): New.
	Include <sysdeps/ieee754/flt-32/e_expf.c>.
*	Add bits/floatn.h defines for more _FloatN / _FloatNx types.	Joseph Myers	2017-10-20	5	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The bits/floatn.h header currently only has defines relating to _Float128. This patch adds defines relating to other _FloatN / _FloatNx types. The approach taken is to add defines for all _FloatN / _FloatNx types known to GCC, and to put them in a common bits/floatn-common.h header included at the end of all the individual bits/floatn.h headers. If in future some defines become different for different glibc configurations, they will move out into the separate bits/floatn.h headers. Some defines are expected always to be the same across glibc ports. Corresponding defines are nevertheless put in this header. The intent is that where there are conditionals (in headers or in non-installed files) that can just repeat the same or nearly the same logic for each floating-point type, they should do so, even if in fact the cases for some types could be unconditionally present or absent because the same conditionals are true or false for all glibc configurations. This should make the glibc code with such conditionals easier to read, because the reader can just see that the same conditionals are repeated for each type, rather than seeing different conditionals for different types and needing to reason, at each location with such differences, why those differences are indeed correct there. (Cases involving per-format rather than per-type logic are more likely still to need differences in how they handle different types.) Having such defines and conditionals also helps in incremental preparation for adding _Float32 / _Float64 / _Float32x / _Float64x function aliases. I intend subsequent patches to add such conditionals corresponding to those already present for _Float128, as well as making more architecture-specific function implementations use common macros to define aliases in preparation for adding such _FloatN / _FloatNx aliases. 
	Tested for x86_64.

	* bits/floatn-common.h: New file.
	* math/Makefile (headers): Add bits/floatn-common.h.
	* bits/floatn.h: Include <bits/floatn-common.h>.
	* sysdeps/ia64/bits/floatn.h: Likewise.
	* sysdeps/ieee754/ldbl-128/bits/floatn.h: Likewise.
	* sysdeps/mips/ieee754/bits/floatn.h: Likewise.
	* sysdeps/powerpc/bits/floatn.h: Likewise.
	* sysdeps/x86/bits/floatn.h: Likewise.
*	posix: Fix improper assert in Linux posix_spawn (BZ#22273)	Adhemerval Zanella	2017-10-20	1	-6/+18
	As noted by Florian Weimer, the current Linux posix_spawn implementation can trigger an assert if the auxiliary process is terminated before actually setting the err member:

	    /* Child must set args.err to something non-negative - we rely on
	       the parent and child sharing VM.  */
	    args.err = -1;
	    [...]
	    new_pid = CLONE (__spawni_child, STACK (stack, stack_size), stack_size,
	                     CLONE_VM | CLONE_VFORK | SIGCHLD, &args);

	    if (new_pid > 0)
	      {
	        ec = args.err;
	        assert (ec >= 0);

	Another possible issue is killing the child between setting err and actually calling execve. In this case the process will not run, but posix_spawn also will not report any error:

	    args->err = 0;
	    args->exec (args->file, args->argv, args->envp);

	As suggested by Andreas Schwab, this patch removes the faulty assert and also handles any signal that happens before fork and execve as if the spawn was successful (thus leaving it to the caller to figure this out). Unlike Florian, I cannot see why using atomics to set err would help here: the code essentially runs sequentially (due to CLONE_VFORK), and I think it would not be legal for the compiler to evaluate ec without checking the new_pid result (thus there is no need for a compiler barrier).

	Summarizing the possible scenarios on posix_spawn execution, we have:

	1. For the default case with a successful execution, args.err will be 0, the pid will not be collected, and it will be reported to the caller.

	2. For the default failure case, args.err will be positive and the pid will be collected by waitpid. An error will be reported to the caller.

	3. For the unlikely case where the process was terminated and not collected by a caller signal handler, it will be reported as a successful execution and not be collected by posix_spawn (since args.err will be 0). The caller will need to actually handle this case.

	4. For the unlikely case where the process was terminated and collected by the caller, we have 3 other possible scenarios:

	4.1. The auxiliary process was terminated with args.err equal to 0: it will be handled as 1. (so it does not matter if we hit the pid reuse race, since we won't possibly collect an unexpected process).

	4.2. The auxiliary process was terminated after execve (due to a failure in calling it) and before setting args.err to -1: it will also be handled as 1., but with the issue of not being able to report a possible execve failure to the caller.

	4.3. The auxiliary process was terminated after args.err is set to -1: this is the case where it will be possible to hit the pid reuse case, where we will need to collect the auxiliary pid but cannot be sure it will be the expected one. I think for this case we need to actually change waitpid to use WNOHANG to avoid hanging indefinitely on the call and report an error to the caller, since we can't differentiate between a default failure as in 2. and a possible pid reuse race issue.

	Checked on x86_64-linux-gnu.

	* sysdeps/unix/sysv/linux/spawni.c (__spawnix): Handle the case where the auxiliary process is terminated by a signal before calling _exit or execve.
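Because the message notes that a child killed before execve is reported as a successful spawn, it falls to the caller to inspect the wait status. The following is a minimal caller-side sketch, not code from the patch: `run_child` and the `spawn_outcome` enum are hypothetical names, and only the standard `posix_spawn` and `waitpid` interfaces are used.

```c
#include <errno.h>
#include <signal.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

enum spawn_outcome { SPAWN_FAILED, WAIT_FAILED, CHILD_EXITED, CHILD_SIGNALED };

/* Hypothetical helper: spawn argv[0] and classify how the child ended.
   *detail receives an errno value, an exit status, or a signal number. */
enum spawn_outcome run_child(char *const argv[], int *detail)
{
    pid_t pid;
    int err = posix_spawn(&pid, argv[0], NULL, NULL, argv, environ);
    if (err != 0) {                 /* posix_spawn returns the error number */
        *detail = err;
        return SPAWN_FAILED;
    }

    int status;
    pid_t r;
    do {
        r = waitpid(pid, &status, 0);
    } while (r == -1 && errno == EINTR);
    if (r == -1) {
        *detail = errno;
        return WAIT_FAILED;
    }

    /* A zero return from posix_spawn does not prove the program ran:
       the child may have been killed by a signal, so check for that. */
    if (WIFSIGNALED(status)) {
        *detail = WTERMSIG(status);
        return CHILD_SIGNALED;
    }
    *detail = WEXITSTATUS(status);
    return CHILD_EXITED;
}
```

With this shape, a caller treats `CHILD_SIGNALED` as a failure of the spawned program even though `posix_spawn` itself returned 0.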
*	x86-64: Use fxsave/xsave/xsavec in _dl_runtime_resolve [BZ #21265]	H.J. Lu	2017-10-20	8	-306/+230
	In _dl_runtime_resolve, use fxsave/xsave/xsavec to preserve all vector, mask and bound registers. It simplifies _dl_runtime_resolve and supports different calling conventions. ld.so code size is reduced by more than 1 KB. However, using fxsave/xsave/xsavec takes a few more cycles than saving and restoring vector and bound registers individually.

	Latency for _dl_runtime_resolve to lookup the function, foo, from one shared library plus libc.so:

	                               Before   After   Change
	Westmere (SSE)/fxsave          345      866     151%
	IvyBridge (AVX)/xsave          420      643     53%
	Haswell (AVX)/xsave            713      1252    75%
	Skylake (AVX+MPX)/xsavec       559      719     28%
	Skylake (AVX512+MPX)/xsavec    145      272     87%
	Ryzen (AVX)/xsavec             280      553     97%

	This is the worst case, where the portion of time spent saving and restoring registers is bigger than in the majority of cases. With smaller _dl_runtime_resolve code size, overall performance impact is negligible.

	On IvyBridge, differences in build and test time of binutils with lazy binding GCC and binutils are within noise. On Westmere, differences in bootstrap and "make check" time of GCC 7 with lazy binding GCC and binutils are also within noise.

	[BZ #21265]
	* sysdeps/x86/cpu-features-offsets.sym (XSAVE_STATE_SIZE_OFFSET): New.
	* sysdeps/x86/cpu-features.c: Include <libc-pointer-arith.h>.
	(get_common_indeces): Set xsave_state_size, xsave_state_full_size and bit_arch_XSAVEC_Usable if needed.
	(init_cpu_features): Remove bit_arch_Use_dl_runtime_resolve_slow and bit_arch_Use_dl_runtime_resolve_opt.
	* sysdeps/x86/cpu-features.h (bit_arch_Use_dl_runtime_resolve_opt): Removed.
	(bit_arch_Use_dl_runtime_resolve_slow): Likewise.
	(bit_arch_Prefer_No_AVX512): Updated.
	(bit_arch_MathVec_Prefer_No_AVX512): Likewise.
	(bit_arch_XSAVEC_Usable): New.
	(STATE_SAVE_OFFSET): Likewise.
	(STATE_SAVE_MASK): Likewise.
	[__ASSEMBLER__]: Include <cpu-features-offsets.h>.
	(cpu_features): Add xsave_state_size and xsave_state_full_size.
	(index_arch_Use_dl_runtime_resolve_opt): Removed.
	(index_arch_Use_dl_runtime_resolve_slow): Likewise.
	(index_arch_XSAVEC_Usable): New.
	* sysdeps/x86/cpu-tunables.c (TUNABLE_CALLBACK (set_hwcaps)): Support XSAVEC_Usable. Remove Use_dl_runtime_resolve_slow.
	* sysdeps/x86_64/Makefile (tst-x86_64-1-ENV): New if tunables is enabled.
	* sysdeps/x86_64/dl-machine.h (elf_machine_runtime_setup): Replace _dl_runtime_resolve_sse, _dl_runtime_resolve_avx, _dl_runtime_resolve_avx_slow, _dl_runtime_resolve_avx_opt, _dl_runtime_resolve_avx512 and _dl_runtime_resolve_avx512_opt with _dl_runtime_resolve_fxsave, _dl_runtime_resolve_xsave and _dl_runtime_resolve_xsavec.
	* sysdeps/x86_64/dl-trampoline.S (DL_RUNTIME_UNALIGNED_VEC_SIZE): Removed.
	(DL_RUNTIME_RESOLVE_REALIGN_STACK): Check STATE_SAVE_ALIGNMENT instead of VEC_SIZE.
	(REGISTER_SAVE_BND0): Removed.
	(REGISTER_SAVE_BND1): Likewise.
	(REGISTER_SAVE_BND3): Likewise.
	(REGISTER_SAVE_RAX): Always defined to 0.
	(VMOV): Removed.
	(_dl_runtime_resolve_avx): Likewise.
	(_dl_runtime_resolve_avx_slow): Likewise.
	(_dl_runtime_resolve_avx_opt): Likewise.
	(_dl_runtime_resolve_avx512): Likewise.
	(_dl_runtime_resolve_avx512_opt): Likewise.
	(_dl_runtime_resolve_sse): Likewise.
	(_dl_runtime_resolve_sse_vex): Likewise.
	(USE_FXSAVE): New.
	(_dl_runtime_resolve_fxsave): Likewise.
	(USE_XSAVE): Likewise.
	(_dl_runtime_resolve_xsave): Likewise.
	(USE_XSAVEC): Likewise.
	(_dl_runtime_resolve_xsavec): Likewise.
	* sysdeps/x86_64/dl-trampoline.h (_dl_runtime_resolve_avx512): Removed.
	(_dl_runtime_resolve_avx512_opt): Likewise.
	(_dl_runtime_resolve_avx): Likewise.
	(_dl_runtime_resolve_avx_opt): Likewise.
	(_dl_runtime_resolve_sse): Likewise.
	(_dl_runtime_resolve_sse_vex): Likewise.
	(_dl_runtime_resolve_fxsave): New.
	(_dl_runtime_resolve_xsave): Likewise.
	(_dl_runtime_resolve_xsavec): Likewise.
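The ChangeLog mentions recording an xsave_state_size. As a rough illustration of where such a number can come from, the sketch below queries CPUID leaf 0xD, sub-leaf 0, whose EBX register reports the XSAVE area size for the currently enabled state components. This is an assumption about the general mechanism, not a copy of the glibc code; `sketch_xsave_area_size` is a hypothetical name and the function simply returns 0 on non-x86 targets or when XSAVE is not reported.

```c
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Hypothetical helper: size in bytes of the XSAVE area for the state
   components currently enabled, or 0 if it cannot be determined. */
unsigned int sketch_xsave_area_size(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    /* Leaf 0xD must exist before it can be queried. */
    if (__get_cpuid_max(0, 0) < 0xd)
        return 0;

    /* CPUID.1:ECX bit 26 reports XSAVE support. */
    __cpuid(1, eax, ebx, ecx, edx);
    if ((ecx & (1u << 26)) == 0)
        return 0;

    /* CPUID.(EAX=0xD, ECX=0): EBX is the size needed for enabled features. */
    __cpuid_count(0xd, 0, eax, ebx, ecx, edx);
    return ebx;
#else
    return 0;
#endif
}
```

When XSAVE is available the result is at least 576 bytes, since the area always contains the 512-byte legacy region and a 64-byte header.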
*	m68k: Update elf_machine_load_address for static PIE	H.J. Lu	2017-10-20	1	-0/+6
\| \| \| \| \| \| \| \| \|	When --enable-static-pie is used to configure glibc, we need to use _dl_relocate_static_pie to compute load address in static PIE. * sysdeps/m68k/dl-machine.h (elf_machine_load_address): Use _dl_relocate_static_pie instead of _dl_start to compute load address in static PIE.
*	m68k: Check PIC instead of SHARED in start.S	H.J. Lu	2017-10-20	1	-1/+1
\| \| \| \| \| \| \|	Since start.o may be compiled as PIC, we should check PIC instead of SHARED. * sysdeps/m68k/start.S (_start): Check PIC instead of SHARED.
*	sysconf: Fix missing definition of UIO_MAXIOV on Linux [BZ #22321]	Florian Weimer	2017-10-20	4	-1/+72
\| \| \| \| \| \| \|	After commit 37f802f86400684c8d13403958b2c598721d6360 (Remove __need_IOV_MAX and __need_FOPEN_MAX), UIO_MAXIOV is no longer supplied (indirectly) through <bits/stdio_lim.h>, so sysdeps/posix/sysconf.c no longer sees the definition.
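From an application's point of view, the affected limit is the one returned by `sysconf (_SC_IOV_MAX)`. A small sketch of querying it defensively follows; `sketch_iov_limit` is a hypothetical helper, and the fallback of 16 is the POSIX minimum (`_XOPEN_IOV_MAX`), used here only as a conservative assumption when the system gives no answer.

```c
#include <unistd.h>

/* Hypothetical helper: maximum number of iovec entries accepted by
   readv/writev, falling back to the POSIX minimum if it is unknown. */
long sketch_iov_limit(void)
{
    long n = sysconf(_SC_IOV_MAX);
    return n > 0 ? n : 16;   /* 16 == _XOPEN_IOV_MAX */
}
```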
*	i386: Regenerate libm-test-ulps	H.J. Lu	2017-10-19	1	-2/+2
\| \| \| \| \| \|	Regenerate libm-test-ulps for --disable-multi-arch. * sysdeps/i386/fpu/libm-test-ulps: Regenerated.
*	Add MIPS bits/floatn.h.	Joseph Myers	2017-10-19	1	-0/+80
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds a MIPS-specific bits/floatn.h header. This header is identical to the ldbl-128 version except for the comment at the top; the purpose is to ensure that a 32-bit MIPS build installs a header that is the same as in a 64-bit MIPS build and so properly shows _Float128 support to be available for 64-bit compilations, on the general principle of an installation for one multilib providing headers also suitable for other multilibs. Tested with build-many-glibcs.py. * sysdeps/mips/ieee754/bits/floatn.h: New file.
*	Install correct bits/long-double.h for MIPS64 (bug 22322).	Joseph Myers	2017-10-19	1	-0/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Similar to bug 21987 for SPARC, MIPS64 wrongly installs the ldbl-128 version of bits/long-double.h, meaning incorrect results when using headers installed from a 64-bit installation for a 32-bit build. (I haven't actually seen this cause build failures before its interaction with bits/floatn.h did so - installed headers wrongly expecting _Float128 to be available in a 32-bit configuration.) This patch fixes the bug by moving the MIPS header to sysdeps/mips/ieee754, which comes before sysdeps/ieee754/ldbl-128 in the sysdeps directory ordering. (bits/floatn.h will need a similar fix - duplicating the ldbl-128 version for MIPS will suffice - for headers from a 32-bit installation to be correct for 64-bit builds.) Tested with build-many-glibcs.py (compilers build for mips64-linux-gnu, where there was previously a libstdc++ build failure as at <https://sourceware.org/ml/libc-testresults/2017-q4/msg00130.html>). [BZ #22322] * sysdeps/mips/bits/long-double.h: Move to .... * sysdeps/mips/ieee754/bits/long-double.h: ... here.
*	x86-64: Don't set GLRO(dl_platform) to NULL [BZ #22299]	H.J. Lu	2017-10-19	5	-4/+103
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since ld.so expands $PLATFORM with GLRO(dl_platform), don't set GLRO(dl_platform) to NULL. [BZ #22299] * sysdeps/x86/cpu-features.c (init_cpu_features): Don't set GLRO(dl_platform) to NULL. * sysdeps/x86_64/Makefile (tests): Add tst-platform-1. (modules-names): Add tst-platformmod-1 and x86_64/tst-platformmod-2. (CFLAGS-tst-platform-1.c): New. (CFLAGS-tst-platformmod-1.c): Likewise. (CFLAGS-tst-platformmod-2.c): Likewise. (LDFLAGS-tst-platformmod-2.so): Likewise. ($(objpfx)tst-platform-1): Likewise. ($(objpfx)tst-platform-1.out): Likewise. (tst-platform-1-ENV): Likewise. ($(objpfx)x86_64/tst-platformmod-2.os): Likewise. * sysdeps/x86_64/tst-platform-1.c: New file. * sysdeps/x86_64/tst-platformmod-1.c: Likewise. * sysdeps/x86_64/tst-platformmod-2.c: Likewise.
*	Add _Float128 function aliases.	Joseph Myers	2017-10-18	20	-2/+868
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds support for f128 function aliases on platforms where long double has the binary128 format (and thus GCC 7 provides the _Float128 type with the same ABI as long double but as a distinct type in terms of C type compatibility). This is the same API as provided in glibc 2.26 for powerpc64le / x86_64 / x86 / ia64 where _Float128 has a different format from long double, with the bulk of the API coming from TS 18661-3. All the functions alias the corresponding long double functions, and __ function names are not provided since those are only needed once for each floating-point format, not more than once for different types with the same format (so for example, -ffinite-math-only maps foof128 to __fool_finite, while type-generic macros end up calling e.g. __issignalingl for _Float128 arguments on such platforms). The preparation for this feature was done in previous patches, so this one just needs to add the relevant makefile and header definitions, and update macro definitions of libm_alias_ldouble_other_r, to turn on the feature, and update documentation and ABI baselines. Tested (a) for x86_64, (b) for aarch64, (c) with build-many-glibcs.py with both GCC 6 and GCC 7. * sysdeps/ieee754/ldbl-128/Makeconfig: New file. * sysdeps/ieee754/ldbl-128/bits/floatn.h: Likewise. * sysdeps/ieee754/ldbl-128/float128-abi.h: Likewise. * sysdeps/generic/libm-alias-ldouble.h: Include <bits/floatn.h>. [__HAVE_FLOAT128 && !__HAVE_DISTINCT_FLOAT128] (libm_alias_ldouble_other_r): Also create _Float128 alias. * sysdeps/ieee754/ldbl-opt/libm-alias-ldouble.h: Include <bits/floatn.h>. [__HAVE_FLOAT128 && !__HAVE_DISTINCT_FLOAT128] (libm_alias_ldouble_other_r): Also create _Float128 alias. * manual/math.texi (Mathematics): Document additional architecture support for _Float128. 
* sysdeps/unix/sysv/linux/aarch64/libc.abilist: Update. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/alpha/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise.
*	[AARCH64] Rewrite elf_machine_load_address using _DYNAMIC symbol	Szabolcs Nagy	2017-10-18	1	-34/+5
	This patch rewrites the aarch64 elf_machine_load_address to use the special _DYNAMIC symbol instead of _dl_start. The static address of the _DYNAMIC symbol is stored in the first GOT entry. Here is the change which makes this solution work (part of binutils 2.24): https://sourceware.org/ml/binutils/2013-06/msg00248.html

	The i386 and x86_64 targets use the same method as well.

	The original implementation relies on the trick that the R_AARCH64_ABS32 relocation is resolved at link time and that the static address fits in 32 bits. However, in LP64 the address is normally defined to be 64 bits. This is a C version which should be portable in all cases.

	* sysdeps/aarch64/dl-machine.h (elf_machine_load_address): Use _DYNAMIC symbol to calculate load address.
*	powerpc: fix check-before-set in SET_RESTORE_ROUND	Paul Clarke	2017-10-18	1	-7/+6
	A performance regression was introduced by commit 84d74e427a771906830800e574a72f8d25a954b8 "powerpc: Cleanup fenv_private.h". In the powerpc implementation of SET_RESTORE_ROUND, there is the following code in the "SET" function (slightly simplified):

	--
	  old.fenv = fegetenv_register ();
	  new.l = (old.l & _FPU_MASK_TRAPS_RN) | r;            (1)
	  if (new.l != old.l)                                  (2)
	    {
	      if ((old.l & _FPU_ALL_TRAPS) != 0)
	        (void) __fe_mask_env ();
	      fesetenv_register (new.fenv);                    (3)
	--

	Line (1) sets the value of "new" to the current value of FPSCR, but masks off summary bits, exceptions, non-IEEE mode, and rounding mode, then ORs in the new rounding mode.

	Line (2) compares this new value to the current value in order to avoid setting a new value in the FPSCR (line (3)) unless something significant has changed (exception enables or rounding mode).

	The summary bits are not germane to the comparison, but are cleared in "new" and preserved in "old", resulting in false negative comparisons, and unnecessarily setting the FPSCR in those cases with associated negative performance impacts.

	The solution is to treat the summaries identically for "new" and "old":
	- save them in SET
	- leave them alone otherwise
	- restore the saved values in RESTORE

	Also minor changes:
	- expand _FPU_MASK_RN to 64bit hex, to match other MASKs
	- treat bit 52 (left-to-right) as reserved (since it is)

	* sysdeps/powerpc/fpu/fenv_private.h (_FPU_MASK_TRAPS_RN):
	(_FPU_MASK_FRAC_INEX_RET_CC): Fix masks to more properly handle summary bits.
	(_FPU_MASK_RN): Expand _FPU_MASK_RN to 64bit hex.
	(_FPU_MASK_NOT_RN_NI): Treat bit 52 (left-to-right) as reserved.

	Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com>
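The effect of the mask on the check-before-set comparison can be modeled with ordinary integers. The sketch below is a simplified illustration, not the powerpc code: the `SKETCH_*` bit positions are made up for the example, the non-IEEE mode bit is left out, and both function names are hypothetical. It only shows why clearing the summary bits in "new" but not in "old" forces a register write even when nothing significant changed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative layout only; these are not the real FPSCR definitions. */
#define SKETCH_RN_MASK      UINT64_C(0x00000003)  /* rounding mode       */
#define SKETCH_ENABLE_MASK  UINT64_C(0x000000f8)  /* exception enables   */
#define SKETCH_SUMMARY_MASK UINT64_C(0xe0000000)  /* summary status bits */

/* Behaviour described as the regression: summary bits are masked off in
   the new value, so any set summary bit makes new != old. */
bool needs_write_old(uint64_t old, uint64_t rn)
{
    uint64_t keep = ~(SKETCH_RN_MASK | SKETCH_ENABLE_MASK | SKETCH_SUMMARY_MASK);
    uint64_t new_val = (old & keep) | rn;
    return new_val != old;
}

/* Behaviour after the fix: summary bits are carried over unchanged, so
   only rounding mode or exception enables can trigger a write. */
bool needs_write_fixed(uint64_t old, uint64_t rn)
{
    uint64_t keep = ~(SKETCH_RN_MASK | SKETCH_ENABLE_MASK);
    uint64_t new_val = (old & keep) | rn;
    return new_val != old;
}
```

For a register value with one summary bit set and the rounding mode already equal to the requested one, the first function asks for a write and the second does not.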
*	posix: Add p{readv,writev}2 flags to generic uio-ext.h	Adhemerval Zanella	2017-10-17	1	-2/+1
\| \| \| \| \|	* bits/uio-ext.h (RWF_HIPRI, RWF_DSYNC, RWF_SYNC, RWF_NOWAIT): New defines.
*	Add common ifunc-init.h header	Adhemerval Zanella	2017-10-17	2	-40/+55
	This patch moves the generic definition from x86_64 init-arch to a common header ifunc-init.h. No functional change is expected.

	Checked on a x86_64-linux-gnu build.

	* sysdeps/generic/ifunc-init.h: New file.
	* sysdeps/x86/init-arch.h: Use generic ifunc-init.h.
*	Move some float128 symbol version definitions.	Joseph Myers	2017-10-16	2	-109/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With support for _Float128 functions on platforms where that type has the same ABI as long double, as well as on platforms where it is ABI-distinct, those functions will need to be exported from glibc's shared libraries at appropriate symbol versions in each case. This patch avoids duplication of lists of symbols to export by moving the symbols other than __* to math/Versions and stdlib/Versions. There, they are conditional on <float128-abi.h> defining FLOAT128_VERSION and a default version of that header is added that does not define that macro. Enabling the float128 function aliases will then include adding a sysdeps/ieee754/ldbl-128/float128-abi.h that defines FLOAT128_VERSION to GLIBC_2.27. Symbols __* remain in sysdeps/ieee754/float128/Versions; those symbols should be present only once per floating-point format, not once per type. Note that if any platforms currently lacking support for a type with binary128 format get glibc support for such a type in future (whether only as _Float128, or also as a new long double format), and new libm functions (present for all types) have been added by then, additional macros will be needed to allow such functions to get a version of the form "GLIBC_2.28 if the platform had _Float128 support by then, or the later version at which that platform had _Float128 support added". This is not however a preexisting condition, but would have applied equally to the existing support for _Float128 as an ABI-distinct type. 
New all-type libm functions should just be added to the appropriate symbol version (currently GLIBC_2.27) for all types, with such special-case handling for _Float128 versions (and _Float64x as well in future) waiting until someone actually wants to add support for _Float128 to an existing platform after a release in which that platform and a post-2.26 libm function had support but that platform lacked _Float128 support. Tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by this patch. Also tested in conjunction with the remaining changes to enable float128 aliases. * sysdeps/generic/float128-abi.h: New file. * sysdeps/ieee754/float128/Versions (FLOAT128_VERSION): Move non-__prefixed symbols to .... * math/Versions: ... here. Include <float128-abi.h>. * stdlib/Versions ... and here. Include <float128-abi.h>
*	Support strtof128 etc. aliases.	Joseph Myers	2017-10-16	2	-0/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds support for building strtof128, wcstof128, strtof128_l and wcstof128_l as aliases, in the case of __HAVE_FLOAT128 && !__HAVE_DISTINCT_FLOAT128. Tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by this patch. Also tested together with changes to enable float128 aliases. * stdlib/strtold.c: Include <bits/floatn.h> [__HAVE_FLOAT128 && !__HAVE_DISTINCT_FLOAT128] (strtof128): Define and later undefine as macro. Define as weak alias if [!USE_WIDE_CHAR]. [__HAVE_FLOAT128 && !__HAVE_DISTINCT_FLOAT128] (wcstof128): Define and later undefine as macro. Define as weak alias if [USE_WIDE_CHAR]. * sysdeps/ieee754/ldbl-128/strtold_l.c [__HAVE_FLOAT128 && !__HAVE_DISTINCT_FLOAT128] (strtof128_l): Define and later undefine as macro. Define as weak alias if [!USE_WIDE_CHAR]. [__HAVE_FLOAT128 && !__HAVE_DISTINCT_FLOAT128] (wcstof128_l): Define and later undefine as macro. Define as weak alias if [USE_WIDE_CHAR]. * sysdeps/ieee754/ldbl-64-128/strtold_l.c: Include <bits/floatn.h>. [__HAVE_FLOAT128 && !__HAVE_DISTINCT_FLOAT128] (strtof128_l): Define and later undefine as macro. Define as weak alias if [!USE_WIDE_CHAR]. [__HAVE_FLOAT128 && !__HAVE_DISTINCT_FLOAT128] (wcstof128_l): Define and later undefine as macro. Define as weak alias if [USE_WIDE_CHAR].
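The "weak alias" mechanism named in these entries can be illustrated with the GCC `alias` attribute on an ELF target. The sketch below is an assumption-laden miniature: `sketch_weak_alias` only mimics the spirit of glibc's internal `weak_alias` macro, and `sketch_strtold` / `sketch_strtof128` are hypothetical stand-ins that avoid clashing with the real declarations in <stdlib.h>.

```c
#include <stdlib.h>

/* One real definition, written for long double. */
long double sketch_strtold(const char *s, char **end)
{
    return strtold(s, end);
}

/* Declare ALIASNAME as a weak symbol that shares NAME's code.
   Works with GCC-compatible compilers on ELF targets. */
#define sketch_weak_alias(name, aliasname) \
    extern __typeof (name) aliasname __attribute__ ((weak, alias (#name)))

/* Second public name for the same function body; no code is duplicated. */
sketch_weak_alias (sketch_strtold, sketch_strtof128);
```

Both names resolve to the same address, which is only valid because the two types are assumed to share one ABI, the situation the commit describes as `__HAVE_FLOAT128 && !__HAVE_DISTINCT_FLOAT128`.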
*	Use libm_alias_ldouble_other in ldbl-64-128/s_nextafterl.c.	Joseph Myers	2017-10-13	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch makes ldbl-64-128/s_nextafterl.c restore the default weak_alias definition and use libm_alias_ldouble_other (having undefined and redefined weak_alias for the include of ldbl-128/s_nextafterl.c, so the libm_alias_ldouble use in the latter file is ineffective). Tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by this patch. Also tested together with changes to enable float128 aliases. * sysdeps/ieee754/ldbl-64-128/s_nextafterl.c (weak_alias): Undefine and restore default definition. Use libm_alias_ldouble_other.
*	Fix TLS relocations against local symbols on powerpc32, sparc32 and sparc64	James Clarke	2017-10-13	3	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Normally, TLS relocations against local symbols are optimised by the linker to be absolute. However, gold does not do this, and so it is possible to end up with, for example, R_SPARC_TLS_DTPMOD64 referring to a local symbol. Since sym_map is left as null in elf_machine_rela for the special local symbol case, the relocation handling thinks it has nothing to do, and so the module gets left as 0. Havoc then ensues when the variable in question is accessed. Before this fix, the main_local_gold program would receive a SIGBUS on sparc64, and SIGSEGV on powerpc32. With this fix applied, that test now passes like the rest of them. * sysdeps/powerpc/powerpc32/dl-machine.h (elf_machine_rela): Assign sym_map to be map for local symbols, as TLS relocations use sym_map to determine whether the symbol is defined and to extract the TLS information. * sysdeps/sparc/sparc32/dl-machine.h (elf_machine_rela): Likewise. * sysdeps/sparc/sparc64/dl-machine.h (elf_machine_rela): Likewise.
*	powerpc: Avoid putting floating point values in memory [BZ #22189]	Tulio Magno Quites Machado Filho	2017-10-13	1	-0/+7
\| \| \| \| \| \|	[BZ #22189] * sysdeps/powerpc/fpu/math_private.h (math_opt_barrier): (math_force_eval): Add powerpc version.
*	[BZ #22142] powerpc: Fix the carry bit on mpn_[add\|sub]_n on POWER7	Tulio Magno Quites Machado Filho	2017-10-13	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix the ifdef clause that was being used in the opposite way, setting a wrong value of the carry bit. This is also correcting 2 memory accesses that were mistakenly referring to r0 while they were supposed to mean the immediate value 0. [BZ #22142] * stdio-common/tst-printf.c (fp_test): Add tests for DBL_MAX and -DBL_MAX. (do_test): Likewise. * stdio-common/tst-printf.sh: Likewise. * sysdeps/powerpc/powerpc64/power7/add_n.S: Invert the initial ifdef clause in order to set the carry bit right. Replace r0 by 0 without changing the behavior.
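For context, the carry (or borrow) that mpn_add_n and mpn_sub_n are expected to return can be sketched in portable C. This is only an illustrative reference loop, not the POWER7 assembly touched by the patch; the names `limb_t`, `ref_add_n` and `ref_sub_n` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limb type for this sketch.  */
typedef uint64_t limb_t;

/* Add the n-limb numbers {a, n} and {b, n}, store the low n limbs in r
   and return the carry out of the top limb (0 or 1).  */
static limb_t
ref_add_n (limb_t *r, const limb_t *a, const limb_t *b, size_t n)
{
  limb_t carry = 0;                /* The chain starts with no carry.  */
  assert (n > 0);
  for (size_t i = 0; i < n; i++)
    {
      limb_t s = a[i] + carry;
      limb_t c1 = s < carry;       /* Wrapped while adding the carry in.  */
      limb_t t = s + b[i];
      limb_t c2 = t < s;           /* Wrapped while adding b[i].  */
      r[i] = t;
      carry = c1 | c2;             /* At most one of the two can wrap.  */
    }
  return carry;
}

/* Subtract {b, n} from {a, n}, store the low n limbs in r and return
   the borrow out of the top limb (0 or 1).  */
static limb_t
ref_sub_n (limb_t *r, const limb_t *a, const limb_t *b, size_t n)
{
  limb_t borrow = 0;               /* The chain starts with no borrow.  */
  assert (n > 0);
  for (size_t i = 0; i < n; i++)
    {
      limb_t s = a[i] - borrow;
      limb_t b1 = a[i] < borrow;   /* Underflow while taking the borrow.  */
      limb_t t = s - b[i];
      limb_t b2 = s < b[i];        /* Underflow while subtracting b[i].  */
      r[i] = t;
      borrow = b1 | b2;
    }
  return borrow;
}
```

An assembly implementation that starts the chain with the wrong initial carry state would disagree with this reference on the returned value, which is the class of bug the patch describes.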
*	Use libm_alias_ldouble for SPARC fabsl.	Joseph Myers	2017-10-13	2	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch makes SPARC fabsl implementation use libm_alias_ldouble, to prepare them for also defining _Float128 function aliases. Tested with build-many-glibcs.py that installed stripped shared libraries (sparc64-linux-gnu and sparcv9-linux-gnu) are unchanged by the patch. * sysdeps/sparc/sparc32/fpu/s_fabsl.c: Include <libm-alias-ldouble.h>. (fabsl): Define using libm_alias_ldouble. * sysdeps/sparc/sparc64/fpu/s_fabsl.c: Include <libm-alias-ldouble.h>. (fabsl): Define using libm_alias_ldouble.
*	Fix ldbl-opt/w_lgamma_compatl.c libm_alias_ldouble_other usage.	Joseph Myers	2017-10-13	1	-5/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Testing with changes to enable _Float128 function aliases shows that the libm_alias_ldouble_other usage in ldbl-opt/w_lgamma_compatl.c does not in fact work. Furthermore, it is unnecessary; the relevant aliases get created through w_lgammal_compat2.c. This patch removes the problem code. Tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by the patch. Also tested in conjunction with patches to enable _Float128 function aliases. * sysdeps/ieee754/ldbl-opt/w_lgamma_compatl.c [BUILD_LGAMMA]: Remove conditional code.
*	Fix ldbl-opt/s_clog10l.c libm_alias_ldouble_other usage.	Joseph Myers	2017-10-13	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Testing with changes to enable _Float128 function aliases shows that the libm_alias_ldouble_other usage in ldbl-opt/s_clog10l.c does not in fact work, because __clog10l is defined with long_double_symbol rather than as a normal C alias. This patch fixes this by renaming the __clog10l__internal alias (not strictly necessary, but avoids a hack with "__clog10l_interna" / "__clog10l__interna" as first argument to libm_alias_ldouble_other) and using the renamed alias when calling libm_alias_ldouble_other. Tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by the patch. Also tested in conjunction with patches to enable _Float128 function aliases. * sysdeps/ieee754/ldbl-opt/s_clog10l.c (__clog10l__internal): Rename to __clog10_internal_l. (__clog10_internal_l): Define aliases using libm_alias_ldouble_other instead of using libm_alias_ldouble_other with __clog10.
*	Linux: Consolidate {RTLD_}SINGLE_THREAD_P definition	Adhemerval Zanella	2017-10-11	25	-676/+78
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Current GLIBC has two ways to implement the single thread optimization on syscalls to avoid calling the cancellation path: either by using global variables (__{libc,pthread}_multiple_threads) or by accessing the TCB field (defined by TLS_MULTIPLE_THREADS_IN_TCB). Both the variables and the macros that access their values are defined in each architecture's sysdep-cancel.h header. This patch consolidates the definitions into a single header, sysdeps/unix/sysv/linux/sysdep-cancel.h, and adds a new define (SINGLE_THREAD_BY_GLOBAL) which an architecture defines if it prefers to use the global variables instead of the TCB field. This is an optimization, so if the architecture does not define it, the TCB method will be used as default. Checked on x86_64-linux-gnu and on a build of the major affected ABIs (aarch64-linux-gnu, alpha-linux-gnu, arm-linux-gnueabihf, hppa-linux-gnu, i686-linux-gnu, m68k-linux-gnu, microblaze-linux-gnu, mips-linux-gnu, mips64-linux-gnu, powerpc-linux-gnu, powerpc64le-linux-gnu, s390-linux-gnu, s390x-linux-gnu, sh4-linux-gnu, sparcv9-linux-gnu, sparc64-linux-gnu, tilegx-linux-gnu). * sysdeps/unix/sysv/linux/aarch64/sysdep-cancel.h: Remove file. * sysdeps/unix/sysv/linux/alpha/sysdep-cancel.h: Likewise. * sysdeps/unix/sysv/linux/arm/sysdep-cancel.h: Likewise. * sysdeps/unix/sysv/linux/hppa/sysdep-cancel.h: Likewise. * sysdeps/unix/sysv/linux/mips/sysdep-cancel.h: Likewise. * sysdeps/unix/sysv/linux/nios2/sysdep-cancel.h: Likewise. * sysdeps/unix/sysv/linux/powerpc/sysdep-cancel.h: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/sysdep-cancel.h: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/sysdep-cancel.h: Likewise. * sysdeps/unix/sysv/linux/sh/sysdep-cancel.h: Likewise. * sysdeps/unix/sysv/linux/sparc/sysdep-cancel.h: Likewise. * sysdeps/unix/sysv/linux/tile/sysdep-cancel.h: Likewise. 
* sysdeps/unix/sysv/linux/x86_64/sysdep-cancel.h: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/sysdep.h (SINGLE_THREAD_BY_GLOBAL): Define. * sysdeps/unix/sysv/linux/aarch64/sysdep.h (SINGLE_THREAD_BY_GLOBAL): Likewise. * sysdeps/unix/sysv/linux/alpha/sysdep.h (SINGLE_THREAD_BY_GLOBAL): Likewise. * sysdeps/unix/sysv/linux/arm/sysdep.h (SINGLE_THREAD_BY_GLOBAL): Likewise. * sysdeps/unix/sysv/linux/hppa/sysdep.h (SINGLE_THREAD_BY_GLOBAL): Likewise. * sysdeps/unix/sysv/linux/microblaze/sysdep.h (SINGLE_THREAD_BY_GLOBAL): Likewise. * sysdeps/unix/sysv/linux/x86_64/sysdep.h (SINGLE_THREAD_BY_GLOBAL): Likewise.
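The selection between the two methods described in this commit might look roughly like the following. This is a simplified model under stated assumptions, not the contents of sysdeps/unix/sysv/linux/sysdep-cancel.h: every `demo_` / `DEMO_` name and the `struct demo_tcb` layout are hypothetical stand-ins for the real global variables and TCB field.

```c
#include <assert.h>

/* Hypothetical thread control block carrying a multiple-threads flag.  */
struct demo_tcb
{
  int multiple_threads;
};

/* Stand-in for the current thread's TCB.  */
static struct demo_tcb demo_self;
/* Stand-in for the global multiple-threads variable.  */
static int demo_multiple_threads_global;

/* An architecture opts in to the global-variable method by defining
   SINGLE_THREAD_BY_GLOBAL; otherwise the TCB field is the default.  */
#ifdef SINGLE_THREAD_BY_GLOBAL
# define DEMO_SINGLE_THREAD_P (demo_multiple_threads_global == 0)
#else
# define DEMO_SINGLE_THREAD_P (demo_self.multiple_threads == 0)
#endif

/* Model of a cancellable syscall wrapper.  Returns 0 when the cheap
   single-thread path is taken and 1 when the cancellation path is.  */
static int
demo_syscall_path (void)
{
  /* In this model both flags are always updated together.  */
  assert (demo_self.multiple_threads == demo_multiple_threads_global);
  if (DEMO_SINGLE_THREAD_P)
    return 0;
  return 1;
}

/* Model of creating a second thread: mark the process multi-threaded
   in both places so either method observes the change.  */
static void
demo_mark_multithreaded (void)
{
  demo_self.multiple_threads = 1;
  demo_multiple_threads_global = 1;
}
```

Compiling the sketch with or without `-DSINGLE_THREAD_BY_GLOBAL` changes only which flag the predicate reads, which mirrors the commit's point that the choice is a per-architecture optimization.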
*	Use generic alias macros in ldbl-opt.	Joseph Myers	2017-10-11	4	-14/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch fixes ldbl-opt code to use generic libm alias macros in preparation for getting _FloatN / _FloatNx aliases where appropriate. Four functions are affected; they undefine and redefine alias macros before including the implementations they wrap, in such a way that _FloatN / _FloatNx aliases would not appear. s_clog10l.c undefines and redefines declare_mgen_alias, so just needs a libm_alias_ldouble_other call added. w_exp10l_compat.c undefines and redefines weak_alias, but in fact does not need to do so, since math/w_exp10l_compat.c uses libm_alias_ldouble and does not use weak_alias other than through that, so the undefines and redefines of weak_alias are removed. w_lgamma_compatl.c and w_remainderl_compat.c are made to use libm_alias_ldouble_other in conjunction with restoring the original definition of weak_alias so this is effective. Tested with build-many-glibcs.py. Installed stripped shared libraries are unchanged by this patch. * sysdeps/ieee754/ldbl-opt/s_clog10l.c: Use libm_alias_ldouble_other. * sysdeps/ieee754/ldbl-opt/w_exp10l_compat.c (weak_alias): Do not undefine and redefine. [LIBM_SVID_COMPAT && !LONG_DOUBLE_COMPAT (libm, GLIBC_2_1)] (exp10l): Do not define here. * sysdeps/ieee754/ldbl-opt/w_lgamma_compatl.c [BUILD_LGAMMA] (weak_alias): Undefine and redefine. [BUILD_LGAMMA]: Use libm_alias_ldouble_other. * sysdeps/ieee754/ldbl-opt/w_remainderl_compat.c [LIBM_SVID_COMPAT] (weak_alias): Undefine and redefine here. [LIBM_SVID_COMPAT]: Use libm_alias_ldouble_other.
*	Add libm_alias_*_other_r macros.	Joseph Myers	2017-10-10	11	-7/+85
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some libm functions are unable to use the generic alias macros such as libm_alias_double because they have special symbol versioning requirements for the main float, double or long double public names. To facilitate adding _FloatN / _FloatNx function aliases in future, it's still desirable to have generic macros those functions can use as far as possible. This patch adds macros such as libm_alias_double_other, which only define names for _FloatN / _FloatNx aliases, not for float / double / long double. At present, all these new macros do nothing, but they are called in the appropriate places in macros such as libm_alias_double. This patch also arranges for lgamma implementations, and the recently added optimized float function implementations, to use the new macros to make them ready for addition of _FloatN / _FloatNx aliases. Tested for x86_64, and tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by this patch. * sysdeps/generic/libm-alias-double.h (libm_alias_double_other_r): New macro. (libm_alias_double_other): Likewise. (libm_alias_double_r): Use libm_alias_double_other_r. * sysdeps/generic/libm-alias-float.h (libm_alias_float_other_r): New macro. (libm_alias_float_other): Likewise. (libm_alias_float_r): Use libm_alias_float_other_r. * sysdeps/generic/libm-alias-float128.h (libm_alias_float128_other_r): New macro. (libm_alias_float128_other): Likewise. (libm_alias_float128_r): Use libm_alias_float128_other_r. * sysdeps/generic/libm-alias-ldouble.h (libm_alias_ldouble_other_r): New macro. (libm_alias_ldouble_other): Likewise. (libm_alias_ldouble_r): Use libm_alias_ldouble_other_r. * sysdeps/ieee754/ldbl-opt/libm-alias-double.h (libm_alias_double_other_r): New macro. (libm_alias_double_other): Likewise. 
(libm_alias_double_r): Use libm_alias_double_other_r. * sysdeps/ieee754/ldbl-opt/libm-alias-ldouble.h (libm_alias_ldouble_other_r): New macro. (libm_alias_ldouble_other): Likewise. (libm_alias_ldouble_r): Use libm_alias_ldouble_other_r. * math/w_lgamma_main.c: Include <libm-alias-double.h>. [!USE_AS_COMPAT]: Use libm_alias_double_other. * math/w_lgammaf_main.c: Include <libm-alias-float.h>. [!USE_AS_COMPAT]: Use libm_alias_float_other. * math/w_lgammal_main.c: Include <libm-alias-ldouble.h>. [!USE_AS_COMPAT]: Use libm_alias_ldouble_other. * math/w_exp2f.c: Use libm_alias_float_other. * math/w_expf.c: Likewise. * math/w_log2f.c: Likewise. * math/w_logf.c: Likewise. * math/w_powf.c: Likewise. * sysdeps/ieee754/flt-32/e_exp2f.c: Include <libm-alias-float.h>. [!__exp2f]: Use libm_alias_float_other. * sysdeps/ieee754/flt-32/e_expf.c: Include <libm-alias-float.h>. [!__expf]: Use libm_alias_float_other. * sysdeps/ieee754/flt-32/e_log2f.c: Include <libm-alias-float.h>. [!__log2f]: Use libm_alias_float_other. * sysdeps/ieee754/flt-32/e_logf.c: Include <libm-alias-float.h>. [!__logf]: Use libm_alias_float_other. * sysdeps/ieee754/flt-32/e_powf.c: Include <libm-alias-float.h>. [!__powf]: Use libm_alias_float_other.
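The layering this commit describes, a main alias macro that also invokes an "_other" macro which currently expands to nothing, can be illustrated as follows. This is a minimal sketch, assuming GCC or Clang on an ELF target (the `alias` attribute is not available everywhere); all `demo_` names are hypothetical and the macros are simplified stand-ins, not the definitions in libm-alias-double.h.

```c
#include <assert.h>

/* Simplified weak alias: ALIASNAME becomes a weak symbol that resolves
   to NAME, which must be defined in this translation unit.  */
#define demo_weak_alias(name, aliasname) \
  extern __typeof (name) aliasname __attribute__ ((weak, alias (#name)));

/* Hypothetical "_other" macro.  It is the hook where _FloatN / _FloatNx
   names would be defined; per the commit, such macros do nothing yet.  */
#define demo_alias_double_other(from, to)

/* Hypothetical main macro: defines the public double name and then
   calls the "_other" hook so future aliases appear automatically.  */
#define demo_alias_double(from, to) \
  demo_weak_alias (from, to)        \
  demo_alias_double_other (from, to)

/* Internal implementation under a private name.  */
double
demo_twice_impl (double x)
{
  return 2.0 * x;
}

/* Public name demo_twice, created through the generic macro.  */
demo_alias_double (demo_twice_impl, demo_twice)

/* Small self-check: the alias and the implementation are the same code.  */
static int
demo_alias_matches (double x)
{
  assert (x == x);   /* Reject NaN so the comparison below is meaningful.  */
  return demo_twice (x) == demo_twice_impl (x);
}
```

A function with special versioning needs for its public name would skip `demo_alias_double` and call only `demo_alias_double_other`, which is the use case the commit gives for the new macros.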
*	Use generic macros for lgamma_r function aliases.	Joseph Myers	2017-10-09	7	-21/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Continuing the use of generic macros for defining libm function aliases, in preparation for adding more _FloatN / _FloatNx function names, this patch makes the lgamma_r functions use such macros. declare_mgen_alias_r becomes a standard macro in math-type-macros.h instead of being locally defined in w_lgamma_r_template.c. This in turn must be defined by each math-type-macros-<type>.h. Rather than providing an unused default in math-type-macros.h, that header is made to give an error if math-type-macros-<type>.h failed to define declare_mgen_alias or declare_mgen_alias_r. The compat lgamma_r wrappers are updated similarly. The ldbl-opt versions are removed as no longer needed. Tested for x86_64, and with build-many-glibcs.py. Installed stripped shared libraries are unchanged except for powerpc64le (where the usual issue applies that an ldbl-opt long double function previously used long_double_symbol unconditionally and now the symbol versions on powerpc64le mean weak_alias is used instead, resulting in the same symbol versions in the final shared library but still enough difference in the input objects for that library not to be byte-identical). * sysdeps/generic/math-type-macros.h [!declare_mgen_alias]: Give error. Remove default definition of declare_mgen_alias. [!declare_mgen_alias_r]: Likewise. * sysdeps/generic/math-type-macros-double.h [!declare_mgen_alias_r] (declare_mgen_alias_r): New macro. * sysdeps/generic/math-type-macros-float.h [!declare_mgen_alias_r] (declare_mgen_alias_r): Likewise. * sysdeps/generic/math-type-macros-float128.h [!declare_mgen_alias_r] (declare_mgen_alias_r): Likewise. * sysdeps/generic/math-type-macros-ldouble.h [!declare_mgen_alias_r] (declare_mgen_alias_r): Likewise. * math/w_lgamma_r_template.c (declare_mgen_alias_r_x): Remove macro. (declare_mgen_alias_r_s): Likewise. 
(declare_mgen_alias_r): Likewise. * math/w_lgamma_r_compat.c: Include <libm-alias-double.h>. (lgamma_r): Define using libm_alias_double_r. * math/w_lgammaf_r_compat.c: Include <libm-alias-float.h>. (lgammaf_r): Define using libm_alias_float_r. * math/w_lgammal_r_compat.c: Include <libm-alias-ldouble.h>. (lgammal_r): Define using libm_alias_ldouble_r. * sysdeps/ieee754/ldbl-opt/w_lgamma_r_compat.c: Remove file. * sysdeps/ieee754/ldbl-opt/w_lgammal_r_compat.c: Likewise.
*	Remove ldbl-opt w_scalbln.c.	Joseph Myers	2017-10-09	1	-5/+0
\| \| \| \| \| \| \| \| \| \| \|	The ldbl-opt version of w_scalbln.c is not in fact needed; it handles compat symbol versions for libc, but this file isn't built for libc, only for libm. This patch removes this file. Tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by this patch. * sysdeps/ieee754/ldbl-opt/w_scalbln.c: Remove file.
*	Use libm_alias_double in ldbl-128, ldbl-96 fma.	Joseph Myers	2017-10-06	2	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \|	This patch makes the ldbl-128 and ldbl-96 implementations of fma use libm_alias_double. Tested for x86_64, and tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by the patch. * sysdeps/ieee754/ldbl-128/s_fma.c: Include <libm-alias-double.h>. [!__fma] (fma): Define using libm_alias_double. * sysdeps/ieee754/ldbl-96/s_fma.c: Include <libm-alias-double.h>. [!__fma] (fma): Define using libm_alias_double.
*	Use libm_alias_ldouble for ldbl-128 functions.	Joseph Myers	2017-10-06	69	-176/+99
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch makes ldbl-128 functions use libm_alias_ldouble to define function aliases. float128_private.h is updated accordingly. Most of the ldbl-64-128 wrappers are removed as no longer needed with this change (leaving those that involve versioning for functions in libc or that shouldn't be exported from libm for _Float128 / _Float64x types with the same format as long double). Tested for x86_64, and tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by this patch. * sysdeps/ieee754/float128/float128_private.h: Include <libm-alias-ldouble.h> and <libm-alias-float128.h>. (libm_alias_ldouble_r): Undefine and redefine. * sysdeps/ieee754/ldbl-128/s_asinhl.c: Include <libm-alias-ldouble.h>. (asinhl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_atanl.c: Include <libm-alias-ldouble.h>. (atanl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_cbrtl.c: Include <libm-alias-ldouble.h>. (cbrtl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_ceill.c: Include <libm-alias-ldouble.h>. (ceill): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_copysignl.c: Include <libm-alias-ldouble.h>. (copysignl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_cosl.c: Include <libm-alias-ldouble.h>. (cosl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_erfl.c: Include <libm-alias-ldouble.h>. (erfl): Define using libm_alias_ldouble. (erfcl): Likewise. 
* sysdeps/ieee754/ldbl-128/s_expm1l.c: Include <libm-alias-ldouble.h>. (expm1l): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_fabsl.c: Include <libm-alias-ldouble.h>. (fabsl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_floorl.c: Include <libm-alias-ldouble.h>. (floorl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_fmal.c: Include <libm-alias-ldouble.h>. (fmal): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_frexpl.c: Include <libm-alias-ldouble.h>. (frexpl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_fromfpl.c (fromfpl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_fromfpl_main.c: Include <libm-alias-ldouble.h>. * sysdeps/ieee754/ldbl-128/s_fromfpxl.c (fromfpxl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_getpayloadl.c: Include <libm-alias-ldouble.h>. (getpayloadl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_llrintl.c: Include <libm-alias-ldouble.h>. (llrintl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_llroundl.c: Include <libm-alias-ldouble.h>. (llroundl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_logbl.c: Include <libm-alias-ldouble.h>. (logbl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_lrintl.c: Include <libm-alias-ldouble.h>. (lrintl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_lroundl.c: Include <libm-alias-ldouble.h>. (lroundl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_modfl.c: Include <libm-alias-ldouble.h>. (modfl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_nearbyintl.c: Include <libm-alias-ldouble.h>. (nearbyintl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_nextafterl.c: Include <libm-alias-ldouble.h>. (nextafterl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_nextupl.c: Include <libm-alias-ldouble.h>. (nextupl): Define using libm_alias_ldouble. 
* sysdeps/ieee754/ldbl-128/s_remquol.c: Include <libm-alias-ldouble.h>. (remquol): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_rintl.c: Include <libm-alias-ldouble.h>. (rintl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_roundevenl.c: Include <libm-alias-ldouble.h>. (roundevenl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_roundl.c: Include <libm-alias-ldouble.h>. (roundl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_setpayloadl.c (setpayloadl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_setpayloadl_main.c: Include <libm-alias-ldouble.h>. * sysdeps/ieee754/ldbl-128/s_setpayloadsigl.c (setpayloadsigl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_sincosl.c: Include <libm-alias-ldouble.h>. (sincosl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_sinl.c: Include <libm-alias-ldouble.h>. (sinl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_tanhl.c: Include <libm-alias-ldouble.h>. (tanhl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_tanl.c: Include <libm-alias-ldouble.h>. (tanl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_totalorderl.c: Include <libm-alias-ldouble.h>. (totalorderl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_totalordermagl.c: Include <libm-alias-ldouble.h>. (totalordermagl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_truncl.c: Include <libm-alias-ldouble.h>. (truncl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_ufromfpl.c (ufromfpl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-128/s_ufromfpxl.c (ufromfpxl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-64-128/s_copysignl.c: Include <libm-alias-ldouble.h>. (weak_alias): Do not undefine and redefine. [IS_IN (libc)] (libm_alias_ldouble): Undefine and redefine. (copysignl): Define with long_double_symbol only if [IS_IN (libc)]. 
* sysdeps/ieee754/ldbl-64-128/s_frexpl.c: Include <libm-alias-ldouble.h>. (weak_alias): Do not undefine and redefine. [IS_IN (libc)] (libm_alias_ldouble): Undefine and redefine. (frexpl): Define with long_double_symbol only if [IS_IN (libc)]. * sysdeps/ieee754/ldbl-64-128/s_modfl.c: Include <libm-alias-ldouble.h>. (weak_alias): Do not undefine and redefine. [IS_IN (libc)] (libm_alias_ldouble): Undefine and redefine. (modfl): Define with long_double_symbol only if [IS_IN (libc)]. * sysdeps/ieee754/ldbl-64-128/s_asinhl.c: Remove file. * sysdeps/ieee754/ldbl-64-128/s_atanl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_cbrtl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_ceill.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_cosl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_erfl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_expm1l.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_fabsl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_floorl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_fmal.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_llrintl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_llroundl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_logbl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_lrintl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_lroundl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_nearbyintl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_remquol.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_rintl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_roundl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_sincosl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_sinl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_tanhl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_tanl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_truncl.c: Likewise.
*	Remove redundant ldbl-64-128 files.	Joseph Myers	2017-10-06	5	-38/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Various source files in ldbl-64-128 are redundant, because they wrap files that no longer provide public symbols that need special versioning (those symbols having moved to separate errno-setting wrappers), or, in the case of w_scalblnl.c, because the type-generic template now does everything required (it deals with symbol versioning for use in libm, and this file is never built for libc anyway - the compat scalbln* symbols in libc, as opposed to scalbn, are only for i386 and m68k and are aliases to the corresponding scalbn symbols). This patch removes those redundant files. Tested with build-many-glibcs.py (for all ldbl-64-128 configurations) that installed stripped shared libraries are unchanged by this patch. * sysdeps/ieee754/ldbl-64-128/e_ilogbl.c: Remove file. * sysdeps/ieee754/ldbl-64-128/s_log1pl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_scalblnl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_scalbnl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/w_scalblnl.c: Likewise.
*	powerpc: Fix IFUNC for memrchr	Rajalakshmi Srinivasaraghavan	2017-10-06	3	-17/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Recent commit 59ba2d2b5421 failed to add __memrchr_power8 to the ifunc list. This patch also discards unwanted bytes for unaligned inputs in the power8 optimization. 2017-10-05 Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com> * sysdeps/powerpc/powerpc64/multiarch/memrchr-ppc64.c: Revert back to powerpc32 file. * sysdeps/powerpc/powerpc64/multiarch/memrchr.c (memrchr): Add __memrchr_power8 to ifunc list. * sysdeps/powerpc/powerpc64/power8/memrchr.S: Mask extra bytes for unaligned inputs.
*	Update ARM libm-test-ulps.	Joseph Myers	2017-10-05	1	-2/+8
\| \| \| \|	* sysdeps/arm/libm-test-ulps: Update.
*	Use libm_alias_ldouble for ldbl-96 functions.	Joseph Myers	2017-10-05	31	-30/+55
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch makes ldbl-96 functions use libm_alias_ldouble to define function aliases. Tested for x86_64, and tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by the patch. * sysdeps/ieee754/ldbl-96/s_asinhl.c: Include <libm-alias-ldouble.h>. (asinhl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_cbrtl.c: Include <libm-alias-ldouble.h>. (cbrtl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_copysignl.c: Include <libm-alias-ldouble.h>. (copysignl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_cosl.c: Include <libm-alias-ldouble.h>. (cosl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_erfl.c: Include <libm-alias-ldouble.h>. (erfl): Define using libm_alias_ldouble. (erfcl): Likewise. * sysdeps/ieee754/ldbl-96/s_fmal.c: Include <libm-alias-ldouble.h>. (fmal): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_frexpl.c: Include <libm-alias-ldouble.h>. (frexpl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_fromfpl.c (fromfpl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_fromfpl_main.c: Include <libm-alias-ldouble.h>. * sysdeps/ieee754/ldbl-96/s_fromfpxl.c (fromfpxl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_getpayloadl.c: Include <libm-alias-ldouble.h>. (getpayloadl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_llrintl.c: Include <libm-alias-ldouble.h>. (llrintl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_llroundl.c: Include <libm-alias-ldouble.h>. (llroundl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_lrintl.c: Include <libm-alias-ldouble.h>. 
(lrintl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_lroundl.c: Include <libm-alias-ldouble.h>. (lroundl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_modfl.c: Include <libm-alias-ldouble.h>. (modfl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_nextupl.c: Include <libm-alias-ldouble.h>. (nextupl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_remquol.c: Include <libm-alias-ldouble.h>. (remquol): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_roundevenl.c: Include <libm-alias-ldouble.h>. (roundevenl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_roundl.c: Include <libm-alias-ldouble.h>. (roundl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_setpayloadl.c (setpayloadl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_setpayloadl_main.c: Include <libm-alias-ldouble.h>. * sysdeps/ieee754/ldbl-96/s_setpayloadsigl.c: Include <libm-alias-ldouble.h>. (setpayloadsigl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_sincosl.c: Include <libm-alias-ldouble.h>. (sincosl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_sinl.c: Include <libm-alias-ldouble.h>. (sinl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_tanhl.c: Include <libm-alias-ldouble.h>. (tanhl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_tanl.c: Include <libm-alias-ldouble.h>. (tanl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_totalorderl.c: Include <libm-alias-ldouble.h>. (totalorderl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_totalordermagl.c: Include <libm-alias-ldouble.h>. (totalordermagl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_ufromfpl.c (ufromfpl): Define using libm_alias_ldouble. * sysdeps/ieee754/ldbl-96/s_ufromfpxl.c (ufromfpxl): Define using libm_alias_ldouble.
*	aarch64: Optimized implementation of memmove for Qualcomm Falkor	Siddhesh Poyarekar	2017-10-05	4	-2/+241
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is an optimized memmove implementation for the Qualcomm Falkor processor core. Because of the way the falkor memcpy needs to be written, code cannot easily be shared between memmove and memcpy as in other aarch64 memcpy implementations, which is why this routine is separate. The underlying principle is the same as that of memcpy, where it tries to use registers with the same lower 4 bits for fetching the same stream, thus optimizing hardware prefetcher performance. The memcpy copy loop copies 64 bytes at a time using the same register pair since that's the way to train the hardware prefetcher on the falkor core. memmove cannot quite do that since it needs to avoid overlaps, so it does the next best thing, i.e. has a 32 byte loop with a 32 byte end (prefetch a loop ahead to account for overlapping locations) with register pairs that alias so that they hit the same prefetcher. Due to this difference in loop size, they currently have to be separate implementations, but work is under way to get memmove to fall back into memcpy whenever it can without simply duplicating all of the code. Performance: The routine fares around 20-25% better than the generic memmove for most medium to large sizes (i.e. > 128 bytes) for the new walking memmove benchmark (memmove-walk) with an unexplained regression between 1K and 2K. The minor regression is something worth looking into for us, but the remaining gains are significant enough that we would like this included upstream while we look into the cause of the regression. Here is a snippet of the numbers as generated from the microbenchmark by the compare_strings script. 
Comparisons are against __memmove_generic: Function: memmove Variant: walk __memmove_thunderx __memmove_falkor __memmove_generic ======================================================================================================================== <snip> length=16384: 12508800.00 ( 6.09%) 11486800.00 ( 13.76%) 13319600.00 length=16400: 13614200.00 ( -0.67%) 11585000.00 ( 14.33%) 13523600.00 length=16385: 13448400.00 ( 0.10%) 11732700.00 ( 12.84%) 13461200.00 length=16399: 13594100.00 ( -0.22%) 11859600.00 ( 12.57%) 13564400.00 length=16386: 13211600.00 ( 1.13%) 11503800.00 ( 13.91%) 13362400.00 length=16398: 13218600.00 ( 2.12%) 11573200.00 ( 14.30%) 13504700.00 length=16387: 13510900.00 ( -0.37%) 11744200.00 ( 12.76%) 13461300.00 length=16397: 13603700.00 ( -0.15%) 11878200.00 ( 12.55%) 13583200.00 length=16388: 13461700.00 ( -0.13%) 11558000.00 ( 14.03%) 13444100.00 length=16396: 13517500.00 ( -0.03%) 11561300.00 ( 14.45%) 13513900.00 length=16389: 13534100.00 ( 0.17%) 11756800.00 ( 13.28%) 13556900.00 length=16395: 13585600.00 ( 0.11%) 11791800.00 ( 13.30%) 13601200.00 length=16390: 13480100.00 ( -0.13%) 11685500.00 ( 13.20%) 13462100.00 length=16394: 13529900.00 ( -0.23%) 11549800.00 ( 14.43%) 13498200.00 length=16391: 13595400.00 ( -0.26%) 11768200.00 ( 13.22%) 13560600.00 length=16393: 13567000.00 ( 0.20%) 11779700.00 ( 13.35%) 13594700.00 length=32768: 71308800.00 ( -6.53%) 50220800.00 ( 24.98%) 66939200.00 length=32784: 72100800.00 (-11.55%) 50114100.00 ( 22.47%) 64636300.00 length=32769: 71767000.00 ( -7.10%) 51238400.00 ( 23.54%) 67010000.00 length=32783: 70113700.00 (-40.95%) 51129000.00 ( -2.78%) 49744400.00 length=32770: 71367600.00 ( -6.52%) 50244700.00 ( 25.01%) 67000900.00 length=32782: 64366700.00 ( 4.71%) 50101400.00 ( 25.83%) 67545600.00 length=32771: 71440100.00 ( -6.51%) 51263900.00 ( 23.57%) 67074900.00 length=32781: 66993000.00 ( 0.34%) 51108300.00 ( 23.97%) 67220300.00 length=32772: 71443900.00 (-60.50%) 50062100.00 (-12.47%) 44512600.00 
length=32780: 71759100.00 ( -6.58%) 50263200.00 ( 25.35%) 67328600.00 length=32773: 71714900.00 (-33.21%) 51076600.00 ( 5.12%) 53835400.00 length=32779: 71756900.00 ( -6.56%) 51290800.00 ( 23.83%) 67337800.00 length=32774: 59689300.00 (-34.55%) 50068400.00 (-12.86%) 44363300.00 length=32778: 71847500.00 (-18.20%) 50084100.00 ( 17.61%) 60786500.00 length=32775: 71599300.00 ( -6.54%) 51278200.00 ( 23.70%) 67204800.00 length=32777: 71862900.00 (-60.85%) 51094000.00 (-14.36%) 44677900.00 length=65536: 282848000.00 ( -6.60%) 199187000.00 ( 24.93%) 265325000.00 length=65552: 243285000.00 (-41.61%) 198512000.00 (-15.54%) 171805000.00 length=65537: 255415000.00 (-23.47%) 202499000.00 ( 2.11%) 206858000.00 length=65551: 280122000.00 (-62.95%) 203349000.00 (-18.29%) 171911000.00 length=65538: 283676000.00 (-14.46%) 198368000.00 ( 19.96%) 247848000.00 length=65550: 275566000.00 (-51.76%) 198494000.00 ( -9.31%) 181581000.00 length=65539: 283699000.00 ( -6.58%) 203453000.00 ( 23.57%) 266195000.00 length=65549: 286572000.00 ( -6.65%) 202607000.00 ( 24.60%) 268712000.00 length=65540: 283710000.00 ( -6.59%) 199161000.00 ( 25.17%) 266160000.00 length=65548: 237573000.00 ( 11.48%) 198462000.00 ( 26.06%) 268395000.00 length=65541: 284150000.00 ( -6.58%) 203273000.00 ( 23.75%) 266600000.00 length=65547: 286250000.00 ( -6.70%) 202594000.00 ( 24.48%) 268263000.00 length=65542: 284167000.00 ( -6.60%) 199122000.00 ( 25.31%) 266584000.00 length=65546: 285656000.00 ( -6.59%) 198443000.00 ( 25.95%) 268002000.00 length=65543: 284600000.00 ( -6.58%) 203247000.00 ( 23.89%) 267030000.00 length=65545: 285665000.00 ( -6.40%) 202575000.00 ( 24.55%) 268472000.00 <snip> * sysdeps/aarch64/multiarch/Makefile (sysdep_routines): Add memmove_falkor. * sysdeps/aarch64/multiarch/ifunc-impl-list.c (__libc_ifunc_impl_list): Likewise. * sysdeps/aarch64/multiarch/memmove.c: Likewise. * sysdeps/aarch64/multiarch/memmove_falkor.S: New file.
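The overlap constraint described in the memmove_falkor message above can be illustrated in portable C. This is only a minimal sketch, not the Falkor assembly: `sketch_memmove` and `BLOCK` are hypothetical names, the 32-byte block size is taken from the commit message, and nothing here models the register aliasing or prefetcher training that the real routine relies on. It shows why memmove has to pick a copy direction and load a full block before storing it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { BLOCK = 32 };  /* loop width named in the commit message */

/* Hypothetical helper, not glibc code: overlap-safe copy in 32-byte blocks. */
void *sketch_memmove(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    unsigned char tmp[BLOCK];

    assert(n == 0 || (dst != NULL && src != NULL));

    if ((uintptr_t)d - (uintptr_t)s >= n) {
        /* dst is below src, or the regions do not overlap: copy forward.
           The unsigned subtraction wraps when dst < src, so one test
           covers both cases. */
        size_t i = 0;
        for (; i + BLOCK <= n; i += BLOCK) {
            memcpy(tmp, s + i, BLOCK);   /* load the whole block first */
            memcpy(d + i, tmp, BLOCK);   /* then store it */
        }
        for (; i < n; i++)
            d[i] = s[i];
    } else {
        /* dst lies inside [src, src + n): copy backward from the end. */
        size_t i = n;
        for (; i >= BLOCK; i -= BLOCK) {
            memcpy(tmp, s + i - BLOCK, BLOCK);
            memcpy(d + i - BLOCK, tmp, BLOCK);
        }
        while (i > 0) {
            i--;
            d[i] = s[i];
        }
    }
    return dst;
}
```

Calling `sketch_memmove(buf + 5, buf, 100)` on a buffer takes the backward path, while `sketch_memmove(buf, buf + 7, 100)` takes the forward path; both leave the destination holding the original source bytes.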
*	Remove add-ons mechanism.	Joseph Myers	2017-10-05	4	-1/+18
	glibc has an add-ons mechanism to allow additional software to be integrated into the glibc build. Such add-ons may be within the glibc source tree, or outside it at a path passed to the --enable-add-ons configure option. localedata and crypt were once add-ons, distributed in separate release tarballs, but long since stopped using that mechanism. Linuxthreads was always an add-on. Ports spent some time as an add-on with separate release tarballs, then was first moved into the glibc source tree, then had its sysdeps files moved into the main sysdeps hierarchy so the add-ons mechanism was no longer used. NPTL spent some time as an add-on in the main glibc tree before stopping using the add-on mechanism. libidn used to have separate release tarballs but no longer does so, but still uses the add-ons mechanism within the glibc source tree. Various other software has supported building with the add-ons mechanism at times in the past, but I don't think any is still widely used. Add-ons involve significant, little-used complexity in the glibc build system, and make it hard to understand what the space of possible glibc configurations is. This patch removes the add-ons mechanism. libidn is now built via the Subdirs mechanism to cause any configuration using sysdeps/unix/inet to build libidn; HAVE_LIBIDN (which effectively means shared libraries are available) is now defined via sysdeps/unix/inet/configure. Various references to add-ons around the source tree are removed (in the case of maint.texi, the example list of sysdeps directories is still very out of date). 
Externally maintained ports should now put their files in the normal sysdeps directory structure rather than being arranged as add-ons; they probably need to change e.g. elf.h anyway, rather than actually being able to work just as a drop-in subtree. Hurd libpthread should be arranged similarly to NPTL, so some files might go in a hurd-pthreads (or similar) top-level directory in glibc, while sysdeps files should go in the normal sysdeps directory structure (possibly in hurd or hurd-pthreads subdirectories, just as there are nptl subdirectories in the sysdeps tree). Tested for x86_64, and with build-many-glibcs.py. * configure.ac (--enable-add-ons): Remove option. (machine): Do not mention add-ons in comment. (LIBC_PRECONFIGURE): Likewise. (add_ons): Remove variable and sanity checks and logic to locate add-ons. (add_ons_automatic): Remove variable. (configured_add_ons): Likewise. (add_ons_sfx): Likewise. (add_ons_pfx): Likewise. (add_on_subdirs): Likewise. (sysnames_add_ons): Likewise. Remove loop over add-ons and consideration of add-ons in Implies handling. (sysdeps_add_ons): Likewise. * configure: Regenerated. * libidn/configure.ac: Remove. * libidn/configure: Likewise. * sysdeps/unix/inet/configure.ac: New file. * sysdeps/unix/inet/configure: New generated file. * sysdeps/unix/inet/Subdirs: Add libidn. * Makeconfig (sysdeps-srcdirs): Remove variable. (+sysdep_dirs): Do not include $(sysdeps-srcdirs). ($(common-objpfx)config.status): Do not depend on add-on files. ($(common-objpfx)shlib-versions.v.i): Do not mention add-ons in comment. (all-subdirs): Do not include $(add-on-subdirs). * Makefile (dist-prepare): Do not use $(sysdeps-add-ons). * config.make.in (add-ons): Remove variable. (add-on-subdirs): Likewise. (sysdeps-add-ons): Likewise. * manual/Makefile (add-chapters): Remove. ($(objpfx)texis): Do not depend on $(add-chapters). (nonexamples): Do not handle $(add-chapters). (examples): Do not handle $(add-ons). 
(chapters.% top-menu.%): Do not pass '$(add-chapters)' to libc-texinfo.sh. * manual/install.texi (Installation): Do not mention add-ons. (--enable-add-ons): Do not document configure option. * INSTALL: Regenerated. * manual/libc-texinfo.sh: Do not handle $2 add-ons argument. * manual/maint.texi (Hierarchy Conventions): Do not mention add-ons. * scripts/build-many-glibcs.py (Glibc.build_glibc): Do not use --enable-add-ons. * scripts/gen-sorted.awk: Do not handle Subdirs files from add-ons. * scripts/test-installation.pl: Do not handle glibc-compat add-on. * sysdeps/nptl/Makeconfig: Do not mention add-ons in comment.
*	S390: Regenerate ULPs	Stefan Liebler	2017-10-05	1	-2/+8
	Updated ulps file. ChangeLog: * sysdeps/s390/fpu/libm-test-ulps: Regenerated.
*	Don't use hidden visibility in libc.a with PIE on i386	H.J. Lu	2017-10-04	2	-1/+12
	On i386, when multi-arch is enabled, all external functions must be called via PIC PLT in PIE, which requires setting up the EBX register, since they may be IFUNC functions. * config.h.in (NO_HIDDEN_EXTERN_FUNC_IN_PIE): New. * include/libc-symbols.h (__hidden_proto_hiddenattr): Add check for PIC and NO_HIDDEN_EXTERN_FUNC_IN_PIE. * sysdeps/i386/configure.ac (NO_HIDDEN_EXTERN_FUNC_IN_PIE): New AC_DEFINE if multi-arch is enabled. * sysdeps/i386/configure: Regenerated.
*	Use libm_alias_double for dbl-64 fma.	Joseph Myers	2017-10-04	4	-15/+4
	This patch makes dbl-64 fma use libm_alias_double. The ldbl-opt version is removed. The sparc32 version no longer needs to handle compat symbols, while alpha needs a new wrapper to avoid getting the ldbl-128 version (where ldbl-opt is earlier in the list of sysdeps directories, so previously fma came from there). Tested for x86_64, and tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by the patch. * sysdeps/ieee754/dbl-64/s_fma.c: Include <libm-alias-double.h>. (fma): Define using libm_alias_double. * sysdeps/ieee754/ldbl-opt/s_fma.c: Remove file. * sysdeps/sparc/sparc32/fpu/s_fma.c: Do not include <math_ldbl_opt.h>. (fmal): Do not define as compat symbol here. * sysdeps/alpha/fpu/s_fma.c: New file.
*	aarch64: don't use MIN in dl-machine.h	Szabolcs Nagy	2017-10-04	1	-1/+2
	MIN is used, but param.h may not be included, so expand its single use inline. * sysdeps/aarch64/dl-machine.h (elf_machine_rela): Expand MIN.
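The kind of change this message describes can be sketched as follows. This is a minimal illustration with hypothetical names (`copy_bounded`, `dst_size`, `src_size`); it does not reproduce the actual code in elf_machine_rela. It assumes the macro's single use was to pick the smaller of two sizes, which a conditional expression does without depending on the MIN macro from param.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not glibc code: copy at most the smaller of two
   sizes.  The conditional expression replaces MIN (dst_size, src_size),
   so no header that defines MIN is required. */
size_t copy_bounded(void *dst, const void *src,
                    size_t dst_size, size_t src_size)
{
    assert(dst != NULL && src != NULL);

    size_t len = dst_size < src_size ? dst_size : src_size;
    memcpy(dst, src, len);
    return len;
}
```

For a single call site, the inline expression keeps the header free of an implicit dependency on whichever file happens to define MIN.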
*	Restore sparc32 copysignl, fabsl, fmal compat symbols (bug 22229).	Joseph Myers	2017-10-04	8	-1/+39
	32-bit SPARC libm should have compat symbols for copysignl (GLIBC_2.0), fabsl (GLIBC_2.0), fmal (GLIBC_2.1), pointing to the double functions; they were present in glibc 2.8, for example, but are now missing, probably when optimized SPARC function implementations were added without appropriate compat symbol handling. The same applies to copysignl in libc. This patch restores those compat symbols. Tested with build-many-glibcs.py for sparcv9-linux-gnu. [BZ #22229] * sysdeps/sparc/sparc32/fpu/s_copysign.S: Include <math_ldbl_opt.h> (copysignl): Define as compat symbol at version GLIBC_2_0 for libm and libc. * sysdeps/sparc/sparc32/fpu/s_fabs.S: Include <math_ldbl_opt.h>. (fabsl): Define as compat symbol at version GLIBC_2_0 for libm. * sysdeps/sparc/sparc32/fpu/s_fma.c: Include <math_ldbl_opt.h>. (fmal): Define as compat symbol at version GLIBC_2_1 for libm. * sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysign.S: Include <math_ldbl_opt.h> (copysignl): Define as compat symbol at version GLIBC_2_0 for libm and libc. (compat_symbol): Undefine and redefine. * sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fabs.S: Include <math_ldbl_opt.h> (fabsl): Define as compat symbol at version GLIBC_2_0 for libm. (compat_symbol): Undefine and redefine. * sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fma.c [HAVE_AS_VIS3_SUPPORT]: Include <math_ldbl_opt.h>. [HAVE_AS_VIS3_SUPPORT] (fmal): Define as compat symbol at version GLIBC_2_1 for libm. * sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist: Add GLIBC_2.0 copysignl symbol. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Add GLIBC_2.0 copysignl and fabsl and GLIBC_2.1 fmal symbols.