mirror/glibc - mirror of git://sourceware.org/git/glibc.git

	Commit message (Collapse)	Author	Age	Files	Lines
*	Avoid "inexact" exceptions in i386/x86_64 floor functions (bug 15479).	Joseph Myers	2016-06-27	4	-23/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As discussed in <https://sourceware.org/ml/libc-alpha/2016-05/msg00577.html>, TS 18661-1 disallows ceil, floor, round and trunc functions from raising the "inexact" exception, in accordance with general IEEE 754 semantics for when that exception is raised. Fixing this for x87 floating point is more complicated than for the other versions of these functions, because they use the frndint instruction that raises "inexact" and this can only be avoided by saving and restoring the whole floating-point environment. As I noted in <https://sourceware.org/ml/libc-alpha/2016-06/msg00128.html>, I have now implemented a GCC option -fno-fp-int-builtin-inexact for GCC 7, such that GCC will inline these functions on x86, without caring about "inexact", when the default -ffp-int-builtin-inexact is in effect. This allows users to get optimized code depending on the options they pass to the compiler, while making the out-of-line functions follow TS 18661-1 semantics and avoid "inexact". This patch duly fixes the out-of-line floor function implementations to avoid "inexact", in the same way as the nearbyint implementations. I do not know how the performance of implementations such as these based on saving the environment and changing the rounding mode temporarily compares to that of the C versions or SSE 4.1 versions (of course, for 32-bit x86 SSE implementations still need to get the return value in an x87 register); it's entirely possible other implementations could be faster in some cases. Tested for x86_64 and x86. [BZ #15479] * sysdeps/i386/fpu/s_floor.S (__floor): Save and restore floating-point environment rather than just control word. * sysdeps/i386/fpu/s_floorf.S (__floorf): Likewise. * sysdeps/i386/fpu/s_floorl.S (__floorl): Save and restore floating-point environment, with "invalid" exceptions merged in, rather than just control word. * sysdeps/x86_64/fpu/s_floorl.S (__floorl): Likewise. * math/libm-test.inc (floor_test_data): Do not allow spurious "inexact" exceptions.
*	Avoid "inexact" exceptions in i386/x86_64 ceil functions (bug 15479).	Joseph Myers	2016-06-27	4	-23/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As discussed in <https://sourceware.org/ml/libc-alpha/2016-05/msg00577.html>, TS 18661-1 disallows ceil, floor, round and trunc functions from raising the "inexact" exception, in accordance with general IEEE 754 semantics for when that exception is raised. Fixing this for x87 floating point is more complicated than for the other versions of these functions, because they use the frndint instruction that raises "inexact" and this can only be avoided by saving and restoring the whole floating-point environment. As I noted in <https://sourceware.org/ml/libc-alpha/2016-06/msg00128.html>, I have now implemented a GCC option -fno-fp-int-builtin-inexact for GCC 7, such that GCC will inline these functions on x86, without caring about "inexact", when the default -ffp-int-builtin-inexact is in effect. This allows users to get optimized code depending on the options they pass to the compiler, while making the out-of-line functions follow TS 18661-1 semantics and avoid "inexact". This patch duly fixes the out-of-line ceil function implementations to avoid "inexact", in the same way as the nearbyint implementations. I do not know how the performance of implementations such as these based on saving the environment and changing the rounding mode temporarily compares to that of the C versions or SSE 4.1 versions (of course, for 32-bit x86 SSE implementations still need to get the return value in an x87 register); it's entirely possible other implementations could be faster in some cases. Tested for x86_64 and x86. [BZ #15479] * sysdeps/i386/fpu/s_ceil.S (__ceil): Save and restore floating-point environment rather than just control word. * sysdeps/i386/fpu/s_ceilf.S (__ceilf): Likewise. * sysdeps/i386/fpu/s_ceill.S (__ceill): Save and restore floating-point environment, with "invalid" exceptions merged in, rather than just control word. * sysdeps/x86_64/fpu/s_ceill.S (__ceill): Likewise. * math/libm-test.inc (ceil_test_data): Do not allow spurious "inexact" exceptions.
*	MIPS, SPARC: more fixes to the vfork aliases in libpthread.so	Aurelien Jarno	2016-06-27	3	-14/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 43c29487 tried to fix the vfork aliases in libpthread.so on MIPS and SPARC, but failed to do it correctly, introducing an ABI change. This patch does the remaining changes needed to align the MIPS and SPARC vfork implementations with the other architectures. That way the the alpha version of pt-vfork.S works correctly for MIPS and SPARC. The changes for alpha were done in 82aab97c. Changelog: * sysdeps/unix/sysv/linux/mips/vfork.S (__vfork): Rename into __libc_vfork. (__vfork) [IS_IN (libc)]: Remove alias. (__libc_vfork) [IS_IN (libc)]: Define as an alias. * sysdeps/unix/sysv/linux/sparc/sparc32/vfork.S: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/vfork.S: Likewise.
*	Remove atomic_compare_and_exchange_bool_rel.	Torvald Riegel	2016-06-24	9	-89/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	atomic_compare_and_exchange_bool_rel and catomic_compare_and_exchange_bool_rel are removed and replaced with the new C11-like atomic_compare_exchange_weak_release. The concurrent code in nscd/cache.c has not been reviewed yet, so this patch does not add detailed comments. * nscd/cache.c (cache_add): Use new C11-like atomic operation instead of atomic_compare_and_exchange_bool_rel. * nptl/pthread_mutex_unlock.c (__pthread_mutex_unlock_full): Likewise. * include/atomic.h (atomic_compare_and_exchange_bool_rel, catomic_compare_and_exchange_bool_rel): Remove. * sysdeps/aarch64/atomic-machine.h (atomic_compare_and_exchange_bool_rel): Likewise. * sysdeps/alpha/atomic-machine.h (atomic_compare_and_exchange_bool_rel): Likewise. * sysdeps/arm/atomic-machine.h (atomic_compare_and_exchange_bool_rel): Likewise. * sysdeps/mips/atomic-machine.h (atomic_compare_and_exchange_bool_rel): Likewise. * sysdeps/tile/atomic-machine.h (atomic_compare_and_exchange_bool_rel): Likewise.
*	Fix i386/x86_64 scalbl with sNaN input (bug 20296).	Joseph Myers	2016-06-23	2	-23/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The x86_64 and i386 versions of scalbl return sNaN for some cases of sNaN input and are missing "invalid" exceptions for other cases. This results from overly complicated code that either returns a NaN input, or discards both inputs when one is NaN and loads a NaN from memory. This patch fixes this by simplifying the code to add the arguments when either one is NaN. Tested for x86_64 and x86. [BZ #20296] * sysdeps/i386/fpu/e_scalbl.S (__ieee754_scalbl): Add arguments when either argument is a NaN. * sysdeps/x86_64/fpu/e_scalbl.S (__ieee754_scalbl): Likewise. * math/libm-test.inc (scalb_test_data): Add sNaN tests.
*	Simplify x86 nearbyint functions.	Joseph Myers	2016-06-22	4	-16/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The i386 implementations of nearbyint functions, and x86_64 nearbyintl, contain code to mask the "inexact" exception. However, the fnstenv instruction has the effect of masking all exceptions, so this masking code has been redundant since fnstenv was added to those implementations (by commit 846d9a4a3acdb4939ca7bf6aed48f9f6f26911be; commit 71d1b0166b4ace0d804af2993b3815758b852efc added the test math/test-nearbyint-except-2.c that verifies these functions do work when called with "inexact" traps enabled); this patch removes the redundant code. Tested for x86_64 and x86. * sysdeps/i386/fpu/s_nearbyint.S (__nearbyint): Do not mask "inexact" exceptions after fnstenv. * sysdeps/i386/fpu/s_nearbyintf.S (__nearbyintf): Likewise. * sysdeps/i386/fpu/s_nearbyintl.S (__nearbyintl): Likewise. * sysdeps/x86_64/fpu/s_nearbyintl.S (__nearbyintl): Likewise.
*	Move sysdeps/generic/bits/hwcap.h to top-level bits/	Zack Weinberg	2016-06-22	1	-23/+0
\| \| \| \| \| \| \| \| \| \|	This file was added to sysdeps/generic/bits in 2012. This appears to have been an oversight, as the entire sysdeps/generic/bits directory was moved to the top level in 2005. Accordingly the generic bits/hwcap.h belongs there too. * sysdeps/generic/bits/hwcap.h: Moved to ... * bits/hwcap.h: Here.
*	This patch further tunes memcpy - avoid one branch for sizes 1-3,	Wilco Dijkstra	2016-06-22	1	-24/+32
\| \| \| \| \| \| \|	add a prefetch and improve small copies that are exact powers of 2. * sysdeps/aarch64/memcpy.S (memcpy): Further tuning for performance.
*	Fix p{readv,writev}{64} consolidation implementation	Adhemerval Zanella	2016-06-21	5	-13/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch fixes the p{readv,writev}{64} consolidation implementation from commits 4e77815 and af5fdf5. Different from pread/pwrite implementation, preadv/pwritev implementation does not require __ALIGNMENT_ARG because kernel syscall prototypes define the high and low part of the off_t, if it is the case, directly (different from pread/pwrite where the architecture ABI for passing 64-bit values must be in consideration for passsing the arguments). It also adds some basic tests for preadv/pwritev. Tested on x86_64, i686, and armhf. * misc/Makefile (tests): Add tst-preadvwritev and tst-preadvwritev64. * misc/tst-preadvwritev.c: New file. * misc/tst-preadvwritev64.c: Likewise. * sysdeps/unix/sysv/linux/preadv.c (preadv): Remove SYSCALL_LL{64} usage. * sysdeps/unix/sysv/linux/preadv64.c (preadv64): Likewise. * sysdeps/unix/sysv/linux/pwritev.c (pwritev): Likewise. * sysdeps/unix/sysv/linux/pwritev64.c (pwritev64): Likewise. * sysdeps/unix/sysv/linux/sysdep.h (LO_HI_LONG): New macro.
*	Added tests to ensure linkage through libmvec *_finite aliases which are	Andrew Senkevich	2016-06-20	26	-0/+307
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	defined in libmvec_nonshared.a (bug 19654). [BZ #19654] * sysdeps/x86_64/fpu/Makefile: Added new tests. * sysdeps/x86_64/fpu/test-double-libmvec-alias-avx-main.c: New. * sysdeps/x86_64/fpu/test-double-libmvec-alias-avx-mod.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-alias-avx.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-alias-avx2-main.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-alias-avx2-mod.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-alias-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-alias-avx512-main.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-alias-avx512-mod.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-alias-avx512.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-alias-main.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-alias-mod.c: Likewise. * sysdeps/x86_64/fpu/test-double-libmvec-alias.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-alias-avx-main.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-alias-avx-mod.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-alias-avx.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-alias-avx2-main.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-alias-avx2-mod.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-alias-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-alias-avx512-main.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-alias-avx512-mod.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-alias-avx512.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-alias-main.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-alias-mod.c: Likewise. * sysdeps/x86_64/fpu/test-float-libmvec-alias.c: Likewise. * sysdeps/x86_64/fpu/test-libmvec-alias-mod.c: Likewise.
*	Add a simple rawmemchr implementation. Use strlen for rawmemchr(s, '\0') as it	Wilco Dijkstra	2016-06-20	2	-2/+45
\| \| \| \| \| \| \| \|	is the fastest way to search for '\0'. Otherwise use memchr with an infinite size. This is 3x faster on benchtests for large sizes. Passes GLIBC tests. * sysdeps/aarch64/rawmemchr.S (__rawmemchr): New file. * sysdeps/aarch64/strlen.S (__strlen): Change to __strlen to avoid PLT.
*	This is an optimized memcpy/memmove for AArch64. Copies are split into 3 main	Wilco Dijkstra	2016-06-20	2	-452/+209
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	cases: small copies of up to 16 bytes, medium copies of 17..96 bytes which are fully unrolled. Large copies of more than 96 bytes align the destination and use an unrolled loop processing 64 bytes per iteration. In order to share code with memmove, small and medium copies read all data before writing, allowing any kind of overlap. All memmoves except for the large backwards case fall into memcpy for optimal performance. On a random copy test memcpy/memmove are 40% faster on Cortex-A57 and 28% on Cortex-A53. * sysdeps/aarch64/memcpy.S (memcpy): Rewrite of optimized memcpy and memmove. * sysdeps/aarch64/memmove.S (memmove): Remove memmove code (merged into memcpy.S).
*	elf: Consolidate machine-agnostic DTV definitions in <dl-dtv.h>	Florian Weimer	2016-06-20	34	-242/+54
\| \| \| \| \|	Identical definitions of dtv_t and TLS_DTV_UNALLOCATED were repeated for all architectures using DTVs.
*	Expand comments in Linux times() implementation.	Carlos O'Donell	2016-06-19	1	-10/+17
\|
*	MIPS, SPARC: fix wrong vfork aliases in libpthread.so	Aurelien Jarno	2016-06-18	3	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With recent binutils versions the GNU libc fails to build on at least MISP and SPARC, with this kind of error: /home/aurel32/glibc/glibc-build/nptl/libpthread.so:(IND+0x0): multiple definition of `vfork@GLIBC_2.0' /home/aurel32/glibc/glibc-build/nptl/libpthread.so::(.text+0xee50): first defined here It appears that on these architectures pt-vfork.S includes vfork.S (through the alpha version of pt-vfork.S) and that the __vfork aliases are not conditionalized on IS_IN (libc) like on other architectures. Therefore the aliases are also wrongly included in libpthread.so. Fix this by properly conditionalizing the aliases like on other architectures. Changelog: * sysdeps/unix/sysv/linux/mips/vfork.S (__vfork): Conditionalize hidden_def, weak_alias and strong_alias on [IS_IN (libc)]. * sysdeps/unix/sysv/linux/sparc/sparc32/vfork.S: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/vfork.S: Likewise.
*	Add nextup and nextdown math functions	Rajalakshmi Srinivasaraghavan	2016-06-16	35	-1/+569
\| \| \| \| \| \| \| \| \| \| \|	TS 18661 adds nextup and nextdown functions alongside nextafter to provide support for float128 equivalent to it. This patch adds nextupl, nextup, nextupf, nextdownl, nextdown and nextdownf to libm before float128 support. The nextup functions return the next representable value in the direction of positive infinity and the nextdown functions return the next representable value in the direction of negative infinity. These are currently enabled as GNU extensions.
*	Fix i386 fdim double rounding (bug 20255).	Joseph Myers	2016-06-14	1	-0/+50
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	fdim suffers from double rounding on i386 because subtracting two double values can produce an inexact long double value exactly half way between two double values. This patch fixes this by creating an i386-specific version of fdim - C, based on the generic version, unlike the previous .S version - which sets the x87 precision control to double precision for the subtraction and then restores it afterwards. As noted in the comment added, there are no issues of double rounding for subnormals (a case that setting precision control does not address) because subtraction cannot produce an inexact result in the subnormal range. Tested for x86_64 and x86. [BZ #20255] * sysdeps/i386/fpu/s_fdim.c: New file. Based on math/s_fdim.c. * math/libm-test.inc (fdim_test_data): Add another test.
*	Use generic fdim on more architectures (bug 6796, bug 20255, bug 20256).	Joseph Myers	2016-06-14	11	-391/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some architectures have their own versions of fdim functions, which are missing errno setting (bug 6796) and may also return sNaN instead of qNaN for sNaN input, in the case of the x86 / x86_64 long double versions (bug 20256). These versions are not actually doing anything that a compiler couldn't generate, just straightforward comparisons / arithmetic (and, in the x86 / x86_64 case, testing for NaNs with fxam, which isn't actually needed once you use an unordered comparison and let the NaNs pass through the same subtraction as non-NaN inputs). This patch removes the x86 / x86_64 / powerpc versions, so that those architectures use the generic C versions, which correctly handle setting errno and deal properly with sNaN inputs. This seems better than dealing with setting errno in lots of .S versions. The i386 versions also return results with excess range and precision, which is not appropriate for a function exactly defined by reference to IEEE operations. For errno setting to work correctly on overflow, it's necessary to remove excess range with math_narrow_eval, which this patch duly does in the float and double versions so that the tests can reliably pass on x86. For float, this avoids any double rounding issues as the long double precision is more than twice that of float. For double, double rounding issues will need to be addressed separately, so this patch does not fully fix bug 20255. Tested for x86_64, x86 and powerpc. [BZ #6796] [BZ #20255] [BZ #20256] * math/s_fdim.c: Include <math_private.h>. (__fdim): Use math_narrow_eval on result. * math/s_fdimf.c: Include <math_private.h>. (__fdimf): Use math_narrow_eval on result. * sysdeps/i386/fpu/s_fdim.S: Remove file. * sysdeps/i386/fpu/s_fdimf.S: Likewise. * sysdeps/i386/fpu/s_fdiml.S: Likewise. * sysdeps/i386/i686/fpu/s_fdim.S: Likewise. * sysdeps/i386/i686/fpu/s_fdimf.S: Likewise. * sysdeps/i386/i686/fpu/s_fdiml.S: Likewise. * sysdeps/powerpc/fpu/s_fdim.c: Likewise. * sysdeps/powerpc/fpu/s_fdimf.c: Likewise. * sysdeps/powerpc/powerpc32/fpu/s_fdim.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_fdim.c: Likewise. * sysdeps/x86_64/fpu/s_fdiml.S: Likewise. * math/libm-test.inc (fdim_test_data): Expect errno setting on overflow. Add sNaN tests.
*	powerpc: strcasecmp/strncasecmp optmization for power8	raji	2016-06-14	11	-49/+598
\| \| \| \| \| \| \|	This implementation utilizes vectors to improve performance compared to current byte by byte implementation for POWER7. The performance improvement is upto 4x. This patch is tested on powerpc64 and powerpc64le.
*	Fix dbl-64 atan2 (sNaN, qNaN) (bug 20252).	Joseph Myers	2016-06-13	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The dbl-64 implementation of atan2, passed arguments (sNaN, qNaN), fails to raise the "invalid" exception. This patch fixes it to add both arguments, rather than just adding the second argument to itself, in the case where the second argument is a NaN (which is checked for before checking for the first argument being a NaN). sNaN tests for atan2 are added, along with some qNaN tests I noticed were missing but should have been there by analogy with other tests present. Tested for x86_64 and x86. [BZ #20252] * sysdeps/ieee754/dbl-64/e_atan2.c (__ieee754_atan2): Add both arguments when second argument is a NaN. * math/libm-test.inc (atan2_test_data): Add sNaN tests and more qNaN tests.
*	Fix frexp (NaN) (bug 20250).	Joseph Myers	2016-06-13	7	-6/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Various implementations of frexp functions return sNaN for sNaN input. This patch fixes them to add such arguments to themselves so that qNaN is returned. Tested for x86_64, x86, mips64 and powerpc. [BZ #20250] * sysdeps/i386/fpu/s_frexpl.S (__frexpl): Add non-finite input to itself. * sysdeps/ieee754/dbl-64/s_frexp.c (__frexp): Add non-finite or zero input to itself. * sysdeps/ieee754/dbl-64/wordsize-64/s_frexp.c (__frexp): Likewise. * sysdeps/ieee754/flt-32/s_frexpf.c (__frexpf): Likewise. * sysdeps/ieee754/ldbl-128/s_frexpl.c (__frexpl): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_frexpl.c (__frexpl): Likewise. * sysdeps/ieee754/ldbl-96/s_frexpl.c (__frexpl): Likewise. * math/libm-test.inc (frexp_test_data): Add sNaN tests.
*	Remove __ASSUME_FUTEX_LOCK_PI	Adhemerval Zanella	2016-06-13	5	-8/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch removes __ASSUME_FUTEX_LOCK_PI usage and assumes that kernel will correctly return if it supports or not futex_atomic_cmpxchg_inatomic. Current PI mutex code already has runtime support by calling prio_inherit_missing and returns ENOTSUP if the futex operation fails at initialization (it issues a FUTEX_UNLOCK_PI futex operation). Also, current minimum supported kernel (v3.2) will return ENOSYS if futex_atomic_cmpxchg_inatomic is not supported in the system: kernel/futex.c: 2628 long do_futex(u32 __user uaddr, int op, u32 val, ktime_t timeout, 2629 u32 __user uaddr2, u32 val2, u32 val3) 2630 { 2631 int ret = -ENOSYS, cmd = op & FUTEX_CMD_MASK; [...] 2667 case FUTEX_UNLOCK_PI: 2668 if (futex_cmpxchg_enabled) 2669 ret = futex_unlock_pi(uaddr, flags); [...] 2686 return ret; 2687 } The futex_cmpxchg_enabled is initialized by calling cmpxchg_futex_value_locked, which calls futex_atomic_cmpxchg_inatomic. For ARM futex_atomic_cmpxchg_inatomic will be either defined (if both CONFIG_CPU_USE_DOMAINS and CONFIG_SMP are not defined) or use the default generic implementation that returns ENOSYS. For m68k is uses the default generic implementation. For mips futex_atomic_cmpxchg_inatomic will return ENOSYS if cpu has no 'cpu_has_llsc' support (defined by each chip supporte inside kernel). For sparc, 32-bit kernel will just use default generic implementation, while 64-bit kernel has support. Tested on ARM (v3.8 kernel) and x86_64. nptl/pthread_mutex_init.c [__ASSUME_FUTEX_LOCK_PI] (prio_inherit_missing): Remove define. * sysdeps/unix/sysv/linux/arm/kernel-features.h (__ASSUME_FUTEX_LOCK_PI): Likewise. * sysdeps/unix/sysv/linux/kernel-features.h (__ASSUME_FUTEX_LOCK_PI): Likewise. * sysdeps/unix/sysv/linux/m68k/kernel-features.h (__ASSUME_FUTEX_LOCK_PI): Likewise. * sysdeps/unix/sysv/linux/mips/kernel-features.h (__ASSUME_FUTEX_LOCK_PI): Likewise. * sysdeps/unix/sysv/linux/sparc/kernel-features.h (__ASSUME_FUTEX_LOCK_PI): Likewise.
*	Revert {send,sendm,recv,recvm}msg conformance changes	Adhemerval Zanella	2016-06-10	59	-575/+70
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After some discussion in libc-alpha about this POSIX compliance fix, I see that GLIBC should indeed revert back to previous definition of msghdr and cmsghdr and implementation of sendmsg, recvmsg, sendmmsg, recvmmsg due some reasons: * The possible issue where the syscalls wrapper add the compatibility layer is quite limited in scope and range. And kernel current also add some limits to the values on the internal msghdr and cmsghdr fields: - msghdr::msg_iovlen larger than UIO_MAXIOV (1024) returns EMSGSIZE. - msghdr::msg_controllen larger than INT_MAX returns ENOBUFS. * There is a small performance hit for recvmsg/sendmsg/recmmsg which is neglectable, but it is a big hit for sendmmsg since now instead of calling the syscall for the packed structure, GLIBC is calling multiple sendmsg. This defeat the very existence of the syscall. * It currently breaks libsanitizer build on GCC [1] (I fixed on compiler-rt). However the fix is incomplete because it does add any runtime check since libsanitizer currently does not have any facility to intercept symbols with multiple version [2]. This, along with incorret dlsym/dlvsym return for versioned symbol due another bug [3], makes hard to interpose versioned symbols. Also, current approach of fixing GCC PR#71445 leads to half-baked solutions without versioned symbol interposing. This patch basically reverts commits 2f0dc39029ae08, 222c2d7f4357d66, af7f7c7ec8dea1. I decided to not revert abf29edd4a3918 (Adjust kernel-features.h defaults for recvmsg and sendmsg) mainly because it does not really address the POSIX compliance original issue and also adds some cleanups. Tested on x86, i386, s390, s390x, aarch64, and powerpc64le. [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71445 [2] https://github.com/google/sanitizers/issues/628 [3] https://sourceware.org/bugzilla/show_bug.cgi?id=14932 * conform/data/sys/socket.h-data (msghdr.msg_iovlen): Add xfail-. (msghdr.msg_controllen): Likewise. (cmsghdr.cmsg_len): Likewise. * nptl/Makefile (libpthread-routines): Remove ptw-oldrecvmsg and ptw-oldsendmsg. (CFLAGS-oldrecvmsg.c): Remove rule. (CFLAGS-oldsendmsg.c): Likewise. (CFLAGS-recvmsg.c): Add rule. (CFLAGS-sendmsg.c): Likewise. * sysdeps/unix/sysv/linux/Makefile (sysdep_routines): Remove oldrecvmsg, oldsendmsg, oldrecvmmsg, oldsendmmsg. (CFLAGS-recvmsg.c): Remove rule. (CFLAGS-sendmsg.c): Likewise. (CFLAGS-oldrecvmsg.c): Likewise. (CFLAGS-oldsendmsg.c): Likewise. (CFLAGS-recvmmsg.c): Likewise. * sysdeps/unix/sysv/linux/bits/socket.h (msghdr.msg_iovlen): Revert to kernel defined interfaces. (msghdr.msg_controllen): Likewise. (cmsghdr.cmsg_len): Likewise. (msghdr.__glibc_reserved1): Remove member. (msghdr.__glibc_reserved2): Likewise. (cmsghdr.__glibc_reserved1): Likewise. * sysdeps/unix/sysv/linux/oldrecvmmsg.c: Remove file. * sysdeps/unix/sysv/linux/oldrecvmsg.c: Likewise. * sysdeps/unix/sysv/linux/oldsendmmsg.c: Likewise. * sysdeps/unix/sysv/linux/oldsendmsg.c: Likewise. * sysdeps/unix/sysv/linux/recvmmsg.c: Revert back to previous version. * sysdeps/unix/sysv/linux/recvmsg.c: Likewise. * sysdeps/unix/sysv/linux/sendmmsg.c: Likewise. * sysdeps/unix/sysv/linux/sendmsg.c: Likewise. * sysdeps/unix/sysv/linux/aarch64/Versions [libc] (GLIBC_2.24): Remove recvmsg and sendmsg. * sysdeps/unix/sysv/linux/alpha/Versions [libc] (GLIBC_2.24): Likewise. * sysdeps/unix/sysv/linux/hppa/Versions [libc] (GLIBC_2.24): Likewise. * sysdeps/unix/sysv/linux/i386/Versions [libc] (GLIBC_2.24): Likewise. * sysdeps/unix/sysv/linux/m68k/Versions [libc] (GLIBC_2.24): Likewise. * sysdeps/unix/sysv/linux/microblaze/Versions [libc] (GLIBC_2.24): Likewise. * sysdeps/unix/sysv/linux/mips/mips32/Versions [libc] (GLIBC_2.24): Likewise. * sysdeps/unix/sysv/linux/mips/mips64/n32/Versions [libc] (GLIBC_2.24): Likewise. * sysdeps/unix/sysv/linux/nios2/Versions [libc] (GLIBC_2.24): Likewise. * sysdeps/unix/sysv/linux/powerpc/Versions [libc] (GLIBC_2.24): Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/Versions [libc] (GLIBC_2.24): Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/Versions [libc] (GLIBC_2.24): Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/Versions [libc] (GLIBC_2.24): Likewise. * sysdeps/unix/sysv/linux/sh/Versions [libc] (GLIBC_2.24): Likewise. * sysdeps/unix/sysv/linux/sparc/Versions [libc] (GLIBC_2.24): Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/Versions [libc] (GLIBC_2.24): Likewise. * sysdeps/unix/sysv/linux/tile/Versions [libc] (GLIBC_2.24): Likewise. * sysdeps/unix/sysv/linux/x86_64/Versions [libc] (GLIBC_2.24): Likewise. * sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/Versions: Remove file * sysdeps/unix/sysv/linux/x86_64/64/Versions: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/n64/Versions: Likewise. * sysdeps/unix/sysv/linux/aarch64/libc.abilist: Remove new 2.24 version for {recv,send,recm,sendm}msg. * sysdeps/unix/sysv/linux/alpha/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/arm/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/hppa/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/i386/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/ia64/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/microblaze/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libc-le.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/nios2/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/sh/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilepro/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/64/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist: Likewise.
*	Fix i386/x86_64 log2l (sNaN) (bug 20235).	Joseph Myers	2016-06-09	2	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The i386/x86_64 versions of log2l return sNaN for sNaN input. This patch fixes them to add NaN inputs to themselves so that qNaN is returned in this case. Tested for x86_64 and x86. [BZ #20235] * sysdeps/i386/fpu/e_log2l.S (__ieee754_log2l): Add NaN input to itself. * sysdeps/x86_64/fpu/e_log2l.S (__ieee754_log2l): Likewise. * math/libm-test.inc (log2_test_data): Add sNaN tests.
*	Fix ldbl-128ibm log1pl (sNaN) (bug 20234).	Joseph Myers	2016-06-09	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	The ldbl-128ibm version of log1pl returns sNaN for sNaN input. This patch fixes it to add such inputs to themselves so that qNaN is returned in this case. Tested for powerpc. [BZ #20234] * sysdeps/ieee754/ldbl-128ibm/s_log1pl.c (__log1pl): Add positive infinity or NaN input to itself.
*	Fix ldbl-128ibm expm1l (sNaN) (bug 20233).	Joseph Myers	2016-06-09	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	The ldbl-128ibm version of expm1l returns sNaN for sNaN input. This patch fixes it to add such inputs to themselves so that qNaN is returned in this case. Tested for powerpc. [BZ #20233] * sysdeps/ieee754/ldbl-128ibm/s_expm1l.c (__expm1l): Add NaN input to itself.
*	Fix ldbl-128 expm1l (sNaN) (bug 20232).	Joseph Myers	2016-06-09	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	The ldbl-128 version of expm1l returns sNaN for sNaN input. This patch fixes it to add such inputs to themselves so that qNaN is returned in this case. Tested for mips64. [BZ #20232] * sysdeps/ieee754/ldbl-128/s_expm1l.c (__expm1l): Add NaN input to itself.
*	Always indirect branch to __libc_start_main via GOT	H.J. Lu	2016-06-09	1	-9/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since __libc_start_main in libc.so is called very early, lazy binding isn't relevant. Always call __libc_start_main with indirect branch via GOT to avoid extra branch to PLT slot. In case of static executable, ld in binutils 2.26 or above can convert indirect branch into direct branch: 0000000000400a80 <_start>: 400a80: 31 ed xor %ebp,%ebp 400a82: 49 89 d1 mov %rdx,%r9 400a85: 5e pop %rsi 400a86: 48 89 e2 mov %rsp,%rdx 400a89: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp 400a8d: 50 push %rax 400a8e: 54 push %rsp 400a8f: 49 c7 c0 20 1b 40 00 mov $0x401b20,%r8 400a96: 48 c7 c1 90 1a 40 00 mov $0x401a90,%rcx 400a9d: 48 c7 c7 c0 03 40 00 mov $0x4003c0,%rdi 400aa4: 67 e8 96 09 00 00 addr32 callq 401440 <__libc_start_main> 400aaa: f4 hlt * sysdeps/x86_64/start.S (_start): Always indirect branch to __libc_start_main via GOT.
*	X86-64: Add dummy memcopy.h and wordcopy.c	H.J. Lu	2016-06-09	2	-0/+2
\| \| \| \| \| \| \| \| \|	Since x86-64 no longer uses memory copy functions, add dummy memcopy.h and wordcopy.c to reduce code size. It reduces the size of libc.so by about 1 KB. * sysdeps/x86_64/memcopy.h: New file. * sysdeps/x86_64/wordcopy.c: Likewise.
*	Fix i386/x86_64 log1pl (sNaN) (bug 20229).	Joseph Myers	2016-06-08	2	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	The i386/x86_64 versions of log1pl return sNaN for sNaN input. This patch fixes them to add a NaN input to itself so that qNaN is returned in this case. Tested for x86_64 and x86. [BZ #20229] * sysdeps/i386/fpu/s_log1pl.S (__log1pl): Add NaN input to itself. * sysdeps/x86_64/fpu/s_log1pl.S (__log1pl): Likewise. * math/libm-test.inc (log1p_test_data): Add sNaN tests.
*	Fix i386/x86_64 log10l (sNaN) (bug 20228).	Joseph Myers	2016-06-08	2	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The i386/x86_64 versions of log10l return sNaN for sNaN input. This patch fixes them to add a NaN input to itself so that qNaN is returned in this case. Tested for x86_64 and x86. [BZ #20228] * sysdeps/i386/fpu/e_log10l.S (__ieee754_log10l): Add NaN input to itself. * sysdeps/x86_64/fpu/e_log10l.S (__ieee754_log10l): Likewise. * math/libm-test.inc (log10_test_data): Add sNaN tests.
*	Fix i386/x86_64 logl (sNaN) (bug 20227).	Joseph Myers	2016-06-08	3	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The i386/x86_64 versions of logl return sNaN for sNaN input. This patch fixes them to add a NaN input to itself so that qNaN is returned in this case. Tested for x86_64 and x86 (including a build for i586 to cover the non-i686 logl version). [BZ #20227] * sysdeps/i386/fpu/e_logl.S (__ieee754_logl): Add NaN input to itself. * sysdeps/i386/i686/fpu/e_logl.S (__ieee754_logl): Likewise. * sysdeps/x86_64/fpu/e_logl.S (__ieee754_logl): Likewise. * math/libm-test.inc (log_test_data): Add sNaN tests.
*	Fix i386/x86_64 expl, exp10l, expm1l for sNaN input (bug 20226).	Joseph Myers	2016-06-08	2	-4/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The i386 and x86_64 implementations of expl, exp10l and expm1l (code shared between the functions) return sNaN for sNaN input. This patch fixes them to add NaN inputs to themselves so that qNaN is returned in this case. Tested for x86_64 and x86. [BZ #20226] * sysdeps/i386/fpu/e_expl.S (IEEE754_EXPL): Add NaN argument to itself. * sysdeps/x86_64/fpu/e_expl.S (IEEE754_EXPL): Likewise. * math/libm-test.inc (exp_test_data): Add sNaN tests. (exp10_test_data): Likewise. (expm1_test_data): Likewise.
*	Fix i386 cbrtl (sNaN) (bug 20224).	Joseph Myers	2016-06-08	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The i386 version of cbrtl returns sNaN (without raising any exceptions) for sNaN input. This patch fixes it to add non-finite arguments to themselves (the code path in question is also reached for zero arguments, for which adding them to themselves is also harmless), so that "invalid" is raised and qNaN returned. Tested for x86_64 and x86. [BZ #20224] * sysdeps/i386/fpu/s_cbrtl.S (__cbrtl): Add non-finite or zero argument to itself. * math/libm-test.inc (cbrt_test_data): Add sNaN tests.
*	X86-64: Remove previous default/SSE2/AVX2 memcpy/memmove	H.J. Lu	2016-06-08	19	-1490/+394
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since the new SSE2/AVX2 memcpy/memmove are faster than the previous ones, we can remove the previous SSE2/AVX2 memcpy/memmove and replace them with the new ones. No change in IFUNC selection if SSE2 and AVX2 memcpy/memmove weren't used before. If SSE2 or AVX2 memcpy/memmove were used, the new SSE2 or AVX2 memcpy/memmove optimized with Enhanced REP MOVSB will be used for processors with ERMS. The new AVX512 memcpy/memmove will be used for processors with AVX512 which prefer vzeroupper. Since the new SSE2 memcpy/memmove are faster than the previous default memcpy/memmove used in libc.a and ld.so, we also remove the previous default memcpy/memmove and make them the default memcpy/memmove, except that non-temporal store isn't used in ld.so. Together, it reduces the size of libc.so by about 6 KB and the size of ld.so by about 2 KB. [BZ #19776] * sysdeps/x86_64/memcpy.S: Make it dummy. * sysdeps/x86_64/mempcpy.S: Likewise. * sysdeps/x86_64/memmove.S: New file. * sysdeps/x86_64/memmove_chk.S: Likewise. * sysdeps/x86_64/multiarch/memmove.S: Likewise. * sysdeps/x86_64/multiarch/memmove_chk.S: Likewise. * sysdeps/x86_64/memmove.c: Removed. * sysdeps/x86_64/multiarch/memcpy-avx-unaligned.S: Likewise. * sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S: Likewise. * sysdeps/x86_64/multiarch/memmove-avx-unaligned.S: Likewise. * sysdeps/x86_64/multiarch/memmove-sse2-unaligned-erms.S: Likewise. * sysdeps/x86_64/multiarch/memmove.c: Likewise. * sysdeps/x86_64/multiarch/memmove_chk.c: Likewise. * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Remove memcpy-sse2-unaligned, memmove-avx-unaligned, memcpy-avx-unaligned and memmove-sse2-unaligned-erms. * sysdeps/x86_64/multiarch/ifunc-impl-list.c (__libc_ifunc_impl_list): Replace __memmove_chk_avx512_unaligned_2 with __memmove_chk_avx512_unaligned. Remove __memmove_chk_avx_unaligned_2. Replace __memmove_chk_sse2_unaligned_2 with __memmove_chk_sse2_unaligned. Remove __memmove_chk_sse2 and __memmove_avx_unaligned_2. Replace __memmove_avx512_unaligned_2 with __memmove_avx512_unaligned. Replace __memmove_sse2_unaligned_2 with __memmove_sse2_unaligned. Remove __memmove_sse2. Replace __memcpy_chk_avx512_unaligned_2 with __memcpy_chk_avx512_unaligned. Remove __memcpy_chk_avx_unaligned_2. Replace __memcpy_chk_sse2_unaligned_2 with __memcpy_chk_sse2_unaligned. Remove __memcpy_chk_sse2. Remove __memcpy_avx_unaligned_2. Replace __memcpy_avx512_unaligned_2 with __memcpy_avx512_unaligned. Remove __memcpy_sse2_unaligned_2 and __memcpy_sse2. Replace __mempcpy_chk_avx512_unaligned_2 with __mempcpy_chk_avx512_unaligned. Remove __mempcpy_chk_avx_unaligned_2. Replace __mempcpy_chk_sse2_unaligned_2 with __mempcpy_chk_sse2_unaligned. Remove __mempcpy_chk_sse2. Replace __mempcpy_avx512_unaligned_2 with __mempcpy_avx512_unaligned. Remove __mempcpy_avx_unaligned_2. Replace __mempcpy_sse2_unaligned_2 with __mempcpy_sse2_unaligned. Remove __mempcpy_sse2. * sysdeps/x86_64/multiarch/memcpy.S (__new_memcpy): Support __memcpy_avx512_unaligned_erms and __memcpy_avx512_unaligned. Use __memcpy_avx_unaligned_erms and __memcpy_sse2_unaligned_erms if processor has ERMS. Default to __memcpy_sse2_unaligned. (ENTRY): Removed. (END): Likewise. (ENTRY_CHK): Likewise. (libc_hidden_builtin_def): Likewise. Don't include ../memcpy.S. * sysdeps/x86_64/multiarch/memcpy_chk.S (__memcpy_chk): Support __memcpy_chk_avx512_unaligned_erms and __memcpy_chk_avx512_unaligned. Use __memcpy_chk_avx_unaligned_erms and __memcpy_chk_sse2_unaligned_erms if if processor has ERMS. Default to __memcpy_chk_sse2_unaligned. * sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S Change function suffix from unaligned_2 to unaligned. * sysdeps/x86_64/multiarch/mempcpy.S (__mempcpy): Support __mempcpy_avx512_unaligned_erms and __mempcpy_avx512_unaligned. Use __mempcpy_avx_unaligned_erms and __mempcpy_sse2_unaligned_erms if processor has ERMS. Default to __mempcpy_sse2_unaligned. (ENTRY): Removed. (END): Likewise. (ENTRY_CHK): Likewise. (libc_hidden_builtin_def): Likewise. Don't include ../mempcpy.S. (mempcpy): New. Add a weak alias. * sysdeps/x86_64/multiarch/mempcpy_chk.S (__mempcpy_chk): Support __mempcpy_chk_avx512_unaligned_erms and __mempcpy_chk_avx512_unaligned. Use __mempcpy_chk_avx_unaligned_erms and __mempcpy_chk_sse2_unaligned_erms if if processor has ERMS. Default to __mempcpy_chk_sse2_unaligned.
*	X86-64: Remove the previous SSE2/AVX2 memsets	H.J. Lu	2016-06-08	8	-319/+60
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since the new SSE2/AVX2 memsets are faster than the previous ones, we can remove the previous SSE2/AVX2 memsets and replace them with the new ones. This reduces the size of libc.so by about 900 bytes. No change in IFUNC selection if SSE2 and AVX2 memsets weren't used before. If SSE2 or AVX2 memset was used, the new SSE2 or AVX2 memset optimized with Enhanced REP STOSB will be used for processors with ERMS. The new AVX512 memset will be used for processors with AVX512 which prefer vzeroupper. [BZ #19881] * sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S: Folded into ... * sysdeps/x86_64/memset.S: This. (__bzero): Removed. (__memset_tail): Likewise. (__memset_chk): Likewise. (memset): Likewise. (MEMSET_CHK_SYMBOL): New. Define only if MEMSET_SYMBOL isn't defined. (MEMSET_SYMBOL): Define only if MEMSET_SYMBOL isn't defined. * sysdeps/x86_64/multiarch/memset-avx2.S: Removed. (__memset_zero_constant_len_parameter): Check SHARED instead of PIC. * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Remove memset-avx2 and memset-sse2-unaligned-erms. * sysdeps/x86_64/multiarch/ifunc-impl-list.c (__libc_ifunc_impl_list): Remove __memset_chk_sse2, __memset_chk_avx2, __memset_sse2 and __memset_avx2_unaligned. * sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S (__bzero): Enabled. * sysdeps/x86_64/multiarch/memset.S (memset): Replace __memset_sse2 and __memset_avx2 with __memset_sse2_unaligned and __memset_avx2_unaligned. Use __memset_sse2_unaligned_erms or __memset_avx2_unaligned_erms if processor has ERMS. Support __memset_avx512_unaligned_erms and __memset_avx512_unaligned. (memset): Removed. (__memset_chk): Likewise. (MEMSET_SYMBOL): New. (libc_hidden_builtin_def): Replace __memset_sse2 with __memset_sse2_unaligned. * sysdeps/x86_64/multiarch/memset_chk.S (__memset_chk): Replace __memset_chk_sse2 and __memset_chk_avx2 with __memset_chk_sse2_unaligned and __memset_chk_avx2_unaligned_erms. Use __memset_chk_sse2_unaligned_erms or __memset_chk_avx2_unaligned_erms if processor has ERMS. Support __memset_chk_avx512_unaligned_erms and __memset_chk_avx512_unaligned.
*	Fix i386 atanhl (sNaN) (bug 20219).	Joseph Myers	2016-06-07	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	The i386 version of atanhl returns sNaN for sNaN input. This patch fixes it to add NaN arguments to themselves so it returns qNaN in this case. Tested for x86_64 and x86. [BZ #20219] * sysdeps/i386/fpu/e_atanhl.S (__ieee754_atanhl): Add NaN argument to itself. * math/libm-test.inc (atanh_test_data): Add sNaN tests.
*	Fix i386 asinhl (sNaN) (bug 20218).	Joseph Myers	2016-06-07	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The i386 version of asinhl returns sNaN (without raising any exceptions) for sNaN input. This patch fixes it to add non-finite arguments to themselves, so that "invalid" is raised and qNaN returned. Tested for x86_64 and x86. [BZ #20218] * sysdeps/i386/fpu/s_asinhl.S (__asinhl): Add non-finite argument to itself. * math/libm-test.inc (asinh_test_data): Add sNaN tests.
*	Check FMA after COMMON_CPUID_INDEX_80000001	H.J. Lu	2016-06-07	1	-4/+9
\| \| \| \| \| \| \| \| \| \| \|	Since the FMA4 bit is in COMMON_CPUID_INDEX_80000001 and FMA4 requires AVX, determine if FMA4 is usable after COMMON_CPUID_INDEX_80000001 is available and if AVX is usable. [BZ #20195] * sysdeps/x86/cpu-features.c (get_common_indeces): Move FMA4 check to ... (init_cpu_features): Here.
*	Bug 20214: Fix linux/in6.h and netinet/in.h sync.	Carlos O'Donell	2016-06-07	1	-4/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In: https://sourceware.org/glibc/wiki/Synchronizing_Headers we explain how we synchronize our headers with Linux kernel headers. In order to synchronize with the Linux linux/in6.h and linux/ipv6.h headers we checked for their guard macros and then defined __USE_KERNEL_IPV6_DEFS and conditionalized code on this macro. In upstream kernel 56c176c9 the _UAPI prefix was stripped and this broke our synchronized headers again. We now need to check for _LINUX_IN6_H and _IPV6_H, and keep checking the old versions of the header guard checks for maximum backwards compatibility with older Linux headers (the history is actually a bit muddled here and it appears upstream linus kernel broke this 10 months before our fix was ever applied to glibc, but without glibc testing we didn't notice and distro kernels have their own testing to fix this). This patch fixes synchronization with linux/in6.h and with netinet/in.h.
*	Bug 20198: quick_exit should not call destructors.	Carlos O'Donell	2016-06-06	29	-0/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In C++11 18.5.12 says "Objects shall not be destroyed as a result of calling quick_exit." In C11 quick_exit is silent about thread object destruction. Therefore to make glibc C++ compliant we do not call any thread local destructors. A new regression test verifies the fix. I will note that C++11 18.5.3 makes it clear that C++ defines additional requirements for _Exit() to prevent it from executing destructors. Given that the point of _Exit() is to terminate the process immediately it makes sense the C and C++ should line up and avoid calling destructors. No failures. New regtest passes.
*	Fix a typo in comments in memmove-vec-unaligned-erms.S	H.J. Lu	2016-06-06	1	-1/+1
\| \| \| \| \|	* sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S: Fix a typo in comments.
*	Fix dbl-64 asin (sNaN) (bug 20213).	Joseph Myers	2016-06-06	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	The dbl-64 version of asin returns sNaN for sNaN arguments. This patch fixes it to add NaN arguments to themselves so that qNaN is returned in this case. Tested for x86_64 and x86. [BZ #20213] * sysdeps/ieee754/dbl-64/e_asin.c (__ieee754_asin): Add NaN argument to itself. * math/libm-test.inc (asin_test_data): Add sNaN tests.
*	Consolidate pwritev/pwritev64 implementations	Adhemerval Zanella	2016-06-06	8	-199/+79
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch consolidates all the pwritev{64} implementation for Linux in only one (sysdeps/unix/sysv/linux/pwritev{64}.c). It also removes the syscall from the auto-generation using assembly macros. It was based on previous pwrite/pwrite64 consolidation patch. The new macro SYSCALL_LL{64} is used to handle the offset argument and alias is created for __ASSUME_OFF_DIFF_OFF64 in case of pread64. Checked on x86_64, i386, aarch64, and powerpc64le. * misc/Makefile (CFLAGS-pwritev.c): New variable: add cancellation required flags. (CFLAGS-pwritev64.c): Likewise. * sysdeps/unix/sysv/linux/generic/wordsize-32/pwritev.c: Remove file. * sysdeps/unix/sysv/linux/generic/wordsize-32/pwritev64.c: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/n64/pwritev64.c: Likewise. * sysdeps/unix/sysv/linux/wordsize-64/pwritev.c: Likewise. * sysdeps/unix/sysv/linux/wordsize-64/pwritev64.: Likwise. * sysdeps/unix/sysv/linux/x86_64/x32/syscalls.list (pwritev): Remove syscall from auto-generation. * sysdeps/unix/sysv/linux/pwritev.c: Rewrite implementation. [WORDSIZE == 64] (pwritev64): Remove macro. [!PWRITEV] (PWRITEV): Likewise. [!PWRITEV] (PWRITEV_REPLACEMENT): Likewise. [!PWRITEV] (PWRITE): Likewise. [!PWRITEV] (OFF_T): Likewise. [!__ASSUME_PWRITEV] (PWRITEV_REPLACEMENT): Likewise. (LO_HI_LONG): Remove macro. [__WORDSIZE != 64 \|\| __ASSUME_OFF_DIFF_OFF64] (pwritev): Add function. * sysdeps/unix/sysv/linux/pwritev64.c: Rewrite implementation. (PWRITEV): Remove macro. (PWRITEV_REPLACEMENTE): Likewise. (PWRITE): Likewise. (OFF_T): Likewise. (pwritev64): New function. * nptl/tst-cancel4.c (tf_writev): Add test.
*	Consolidate preadv/preadv64 implementation	Adhemerval Zanella	2016-06-06	8	-200/+79
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch consolidates all the preadv{64} implementation for Linux in only one (sysdeps/unix/sysv/linux/preadv{64}.c). It also removes the syscall from the auto-generation using assembly macros. It was based on previous pread/pread64 consolidation patch. The new macro SYSCALL_LL{64} is used to handle the offset argument and alias is created for __ASSUME_OFF_DIFF_OFF64 in case of pread64. Checked on x86_64, i386, aarch64, and powerpc64le. * misc/Makefile (CFLAGS-preadv.c): New variable: add cancellation required flags. (CFLAGS-preadv64.c): Likewise. * sysdeps/unix/sysv/linux/generic/wordsize-32/preadv.c: Remove file. * sysdeps/unix/sysv/linux/generic/wordsize-32/preadv64.c: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/n64/preadv64.c: Likewise. * sysdeps/unix/sysv/linux/wordsize-64/preadv.c: Likewise. * sysdeps/unix/sysv/linux/wordsize-64/preadv64.: Likwise. * sysdeps/unix/sysv/linux/x86_64/x32/syscalls.list (preadv): Remove syscall from auto-generation. * sysdeps/unix/sysv/linux/preadv.c: Rewrite implementation. [WORDSIZE == 64] (preadv64): Remove macro. [!PREADV] (PREADV): Likewise. [!PREADV] (PREADV_REPLACEMENT): Likewise. [!PREADV] (PREAD): Likewise. [!PREADV] (OFF_T): Likewise. [!__ASSUME_PREADV] (PREADV_REPLACEMENT): Likewise. (LO_HI_LONG): Remove macro. [__WORDSIZE != 64 \|\| __ASSUME_OFF_DIFF_OFF64] (preadv): Add function. * sysdeps/unix/sysv/linux/preadv64.c: Rewrite implementation. (PREADV): Remove macro. (PREADV_REPLACEMENTE): Likewise. (PREAD): Likewise. (OFF_T): Likewise. (preadv64): New function. * nptl/tst-cancel4.c (tf_preadv): Add test.
*	Fix dbl-64 acos (sNaN) (bug 20212).	Joseph Myers	2016-06-06	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	The dbl-64 version of acos returns sNaN for sNaN arguments. This patch fixes it to add NaN arguments to themselves so that qNaN is returned in this case. Tested for x86_64 and x86. [BZ #20212] * sysdeps/ieee754/dbl-64/e_asin.c (__ieee754_acos): Add NaN argument to itself. * math/libm-test.inc (acos_test_data): Add sNaN tests.
*	powerpc: Fix --disable-multi-arch build on POWER8	Tulio Magno Quites Machado Filho	2016-06-06	5	-6/+27
\| \| \| \| \| \|	Add missing symbols of stpncpy and strcasestr when multi-arch is disabled. Fix memset call from strncpy/stpncpy when multi-arch is disabled.
*	Fix x86/x86_64 nextafterl incrementing negative subnormals (bug 20205).	Joseph Myers	2016-06-03	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The x86 / x86_64 implementation of nextafterl (also used for nexttowardl) produces incorrect results (NaNs) when negative subnormals, the low 32 bits of whose mantissa are zero, are incremented towards zero. This patch fixes this by disabling the logic to decrement the exponent in that case. Tested for x86_64 and x86. [BZ #20205] * sysdeps/i386/fpu/s_nextafterl.c (__nextafterl): Do not adjust exponent when incrementing negative subnormal with low mantissa word zero. * math/libm-test.inc (nextafter_test_data) [TEST_COND_intel96]: Add another test.
*	Fix macro API for __USE_KERNEL_IPV6_DEFS.	Carlos O'Donell	2016-06-02	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The use of __USE_KERNEL_IPV6_DEFS with ifndef is bad practice per: https://sourceware.org/glibc/wiki/Wundef. This change moves it to use 'if' and always define the macro. Please note that this is not the only problem with this code. I have a series of fixes after this one to resolve breakage with this code and add regression tests for it via compile-only source testing (to be discussed in another thread). Unfortunately __USE_KERNEL_XATTR_DEFS is set by the kernel and not glibc, and uses 'define', so we can't fix that yet.
*	hurd: disable ifunc for now	Samuel Thibault	2016-05-30	2	-0/+8
\| \| \| \| \| \|	* sysdeps/mach/hurd/configure.ac (libc_cv_ld_gnu_indirect_function): Set to no. * sysdeps/mach/hurd/configure: Refresh.