path: root/sysdeps/x86_64
Commit message | Author | Age | Files | Lines
...
* configure: fix `test ==` usageMike Frysinger2016-04-092-4/+4
| | | | | POSIX defines the = operator, but not ==. Fix the few places where we incorrectly used ==.
* X86-64: Prepare memmove-vec-unaligned-erms.SH.J. Lu2016-04-061-54/+84
| | | | | | | | | | | | | | Prepare memmove-vec-unaligned-erms.S to make the SSE2 version the default memcpy, mempcpy and memmove. * sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S (MEMCPY_SYMBOL): New. (MEMPCPY_SYMBOL): Likewise. (MEMMOVE_CHK_SYMBOL): Likewise. Replace MEMMOVE_SYMBOL with MEMMOVE_CHK_SYMBOL on __mempcpy_chk symbols. Replace MEMMOVE_SYMBOL with MEMPCPY_SYMBOL on __mempcpy symbols. Provide alias for __memcpy_chk in libc.a. Provide alias for memcpy in libc.a and ld.so.
* X86-64: Prepare memset-vec-unaligned-erms.SH.J. Lu2016-04-061-13/+19
| | | | | | | | | | | | Prepare memset-vec-unaligned-erms.S to make the SSE2 version the default memset. * sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S (MEMSET_CHK_SYMBOL): New. Define if not defined. (__bzero): Check VEC_SIZE == 16 instead of USE_MULTIARCH. Disabled for now. Replace MEMSET_SYMBOL with MEMSET_CHK_SYMBOL on __memset_chk symbols. Properly check USE_MULTIARCH on __memset symbols.
* Force 32-bit displacement in memset-vec-unaligned-erms.SH.J. Lu2016-04-051-0/+13
| | | | | * sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: Force 32-bit displacement to avoid long nop between instructions.
* Add a comment in memset-sse2-unaligned-erms.SH.J. Lu2016-04-051-0/+2
| | | | | * sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S: Add a comment on VMOVU and VMOVA.
* Don't put SSE2/AVX/AVX512 memmove/memset in ld.soH.J. Lu2016-04-036-32/+40
| | | | | | | | | | | | | | Since memmove and memset in ld.so don't use IFUNC, don't put SSE2, AVX and AVX512 memmove and memset in ld.so. * sysdeps/x86_64/multiarch/memmove-avx-unaligned-erms.S: Skip if not in libc. * sysdeps/x86_64/multiarch/memmove-avx512-unaligned-erms.S: Likewise. * sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms.S: Likewise. * sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S: Likewise.
* Fix memmove-vec-unaligned-erms.SH.J. Lu2016-04-031-24/+30
| | | | | | | | | | | | | | __mempcpy_erms and __memmove_erms can't be placed between __memmove_chk and __memmove because that breaks __memmove_chk. Don't check source == destination first since it is less common. * sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S: (__mempcpy_erms, __memmove_erms): Moved before __mempcpy_chk with unaligned_erms. (__memmove_erms): Skip if source == destination. (__memmove_unaligned_erms): Don't check source == destination first.
* Add x86-64 memset with unaligned store and rep stosbH.J. Lu2016-03-316-1/+335
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement x86-64 memset with unaligned store and rep stosb. Support 16-byte, 32-byte and 64-byte vector register sizes. A single file provides 2 implementations of memset, one with rep stosb and the other without rep stosb. They share the same code when size is between 2 times the vector register size and REP_STOSB_THRESHOLD, which defaults to 2KB. Key features: 1. Use overlapping store to avoid branch. 2. For size <= 4 times the vector register size, fully unroll the loop. 3. For size > 4 times the vector register size, store 4 times the vector register size at a time. [BZ #19881] * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add memset-sse2-unaligned-erms, memset-avx2-unaligned-erms and memset-avx512-unaligned-erms. * sysdeps/x86_64/multiarch/ifunc-impl-list.c (__libc_ifunc_impl_list): Test __memset_chk_sse2_unaligned, __memset_chk_sse2_unaligned_erms, __memset_chk_avx2_unaligned, __memset_chk_avx2_unaligned_erms, __memset_chk_avx512_unaligned, __memset_chk_avx512_unaligned_erms, __memset_sse2_unaligned, __memset_sse2_unaligned_erms, __memset_erms, __memset_avx2_unaligned, __memset_avx2_unaligned_erms, __memset_avx512_unaligned_erms and __memset_avx512_unaligned. * sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms.S: New file. * sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S: Likewise. * sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S: Likewise. * sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: Likewise.
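The "overlapping store" idea from the memset commit above can be sketched in C. This is only an illustration, not the glibc assembly: `SKETCH_VEC_SIZE`, `sketch_store_vec` and `sketch_memset_vec_to_2x` are hypothetical names, and a 16-byte vector is assumed. For any size between one and two vector widths, two stores (one at the start, one ending at the last byte) cover the whole range without branching on the exact size.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_VEC_SIZE 16 /* assumption: one 16-byte (SSE2-sized) vector */

/* Hypothetical stand-in for one unaligned vector store of the fill byte. */
static void sketch_store_vec(unsigned char *p, int c)
{
    unsigned char v[SKETCH_VEC_SIZE];
    for (size_t i = 0; i < SKETCH_VEC_SIZE; i++)
        v[i] = (unsigned char)c;
    memcpy(p, v, SKETCH_VEC_SIZE);
}

/* Fill n bytes where SKETCH_VEC_SIZE <= n <= 2 * SKETCH_VEC_SIZE.
   The second store may overlap the first; no branch depends on n. */
static void sketch_memset_vec_to_2x(unsigned char *dst, int c, size_t n)
{
    assert(n >= SKETCH_VEC_SIZE && n <= 2 * SKETCH_VEC_SIZE);
    sketch_store_vec(dst, c);                       /* first vector */
    sketch_store_vec(dst + n - SKETCH_VEC_SIZE, c); /* last vector  */
}
```

Writing some bytes twice is the price paid for removing the size-dependent branch.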
* Add x86-64 memmove with unaligned load/store and rep movsbH.J. Lu2016-03-316-1/+594
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement x86-64 memmove with unaligned load/store and rep movsb. Support 16-byte, 32-byte and 64-byte vector register sizes. When size <= 8 times the vector register size, there is no check for address overlap between source and destination. Since overhead for overlap check is small when size > 8 times the vector register size, memcpy is an alias of memmove. A single file provides 2 implementations of memmove, one with rep movsb and the other without rep movsb. They share the same code when size is between 2 times the vector register size and REP_MOVSB_THRESHOLD, which is 2KB for 16-byte vector register size and scaled up for larger vector register sizes. Key features: 1. Use overlapping load and store to avoid branch. 2. For size <= 8 times the vector register size, load all sources into registers and store them together. 3. If there is no address overlap between source and destination, copy from both ends with 4 times the vector register size at a time. 4. If address of destination > address of source, backward copy 8 times the vector register size at a time. 5. Otherwise, forward copy 8 times the vector register size at a time. 6. Use rep movsb only for forward copy. Avoid slow backward rep movsb by falling back to backward copy 8 times the vector register size at a time. 7. Skip when address of destination == address of source. [BZ #19776] * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add memmove-sse2-unaligned-erms, memmove-avx-unaligned-erms and memmove-avx512-unaligned-erms. 
* sysdeps/x86_64/multiarch/ifunc-impl-list.c (__libc_ifunc_impl_list): Test __memmove_chk_avx512_unaligned_2, __memmove_chk_avx512_unaligned_erms, __memmove_chk_avx_unaligned_2, __memmove_chk_avx_unaligned_erms, __memmove_chk_sse2_unaligned_2, __memmove_chk_sse2_unaligned_erms, __memmove_avx_unaligned_2, __memmove_avx_unaligned_erms, __memmove_avx512_unaligned_2, __memmove_avx512_unaligned_erms, __memmove_erms, __memmove_sse2_unaligned_2, __memmove_sse2_unaligned_erms, __memcpy_chk_avx512_unaligned_2, __memcpy_chk_avx512_unaligned_erms, __memcpy_chk_avx_unaligned_2, __memcpy_chk_avx_unaligned_erms, __memcpy_chk_sse2_unaligned_2, __memcpy_chk_sse2_unaligned_erms, __memcpy_avx_unaligned_2, __memcpy_avx_unaligned_erms, __memcpy_avx512_unaligned_2, __memcpy_avx512_unaligned_erms, __memcpy_sse2_unaligned_2, __memcpy_sse2_unaligned_erms, __memcpy_erms, __mempcpy_chk_avx512_unaligned_2, __mempcpy_chk_avx512_unaligned_erms, __mempcpy_chk_avx_unaligned_2, __mempcpy_chk_avx_unaligned_erms, __mempcpy_chk_sse2_unaligned_2, __mempcpy_chk_sse2_unaligned_erms, __mempcpy_avx512_unaligned_2, __mempcpy_avx512_unaligned_erms, __mempcpy_avx_unaligned_2, __mempcpy_avx_unaligned_erms, __mempcpy_sse2_unaligned_2, __mempcpy_sse2_unaligned_erms and __mempcpy_erms. * sysdeps/x86_64/multiarch/memmove-avx-unaligned-erms.S: New file. * sysdeps/x86_64/multiarch/memmove-avx512-unaligned-erms.S: Likewise. * sysdeps/x86_64/multiarch/memmove-sse2-unaligned-erms.S: Likewise. * sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S: Likewise.
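Key features 1, 2, 4, 5 and 7 of the memmove commit above can be illustrated with a small C sketch. It is not the glibc implementation: all `sketch_` names are hypothetical, a 16-byte vector is assumed, only the one-to-two-vector case is shown for the "load everything, then store" path, and a byte loop stands in for the 8-vector blocks of the large-size path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_VEC_SIZE 16 /* assumption: 16-byte vector register */

/* Hypothetical stand-in for a vector register. */
typedef struct { unsigned char b[SKETCH_VEC_SIZE]; } sketch_vec;

static sketch_vec sketch_load(const unsigned char *p)
{
    sketch_vec v;
    memcpy(v.b, p, SKETCH_VEC_SIZE);
    return v;
}

static void sketch_store(unsigned char *p, sketch_vec v)
{
    memcpy(p, v.b, SKETCH_VEC_SIZE);
}

/* Small sizes: load head and tail first, then store both. Because every
   load happens before any store, overlapping buffers need no check. */
static void sketch_memmove_small(unsigned char *dst,
                                 const unsigned char *src, size_t n)
{
    assert(n >= SKETCH_VEC_SIZE && n <= 2 * SKETCH_VEC_SIZE);
    sketch_vec head = sketch_load(src);
    sketch_vec tail = sketch_load(src + n - SKETCH_VEC_SIZE);
    sketch_store(dst, head);
    sketch_store(dst + n - SKETCH_VEC_SIZE, tail);
}

/* Large sizes: pick the copy direction from the address order. */
static void sketch_memmove_large(unsigned char *dst,
                                 const unsigned char *src, size_t n)
{
    if (dst == src)
        return;                              /* feature 7: nothing to do */
    if ((uintptr_t)dst > (uintptr_t)src) {
        for (size_t i = n; i > 0; i--)       /* feature 4: backward copy */
            dst[i - 1] = src[i - 1];
    } else {
        for (size_t i = 0; i < n; i++)       /* feature 5: forward copy  */
            dst[i] = src[i];
    }
}
```

The backward branch matters when the destination starts inside the source range: copying forward there would overwrite source bytes before they are read.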
* Make __memcpy_avx512_no_vzeroupper an aliasH.J. Lu2016-03-283-430/+404
| | | | | | | | | | | | | | | | | | | | | | | | | | Since x86-64 memcpy-avx512-no-vzeroupper.S implements memmove, make __memcpy_avx512_no_vzeroupper an alias of __memmove_avx512_no_vzeroupper to reduce code size of libc.so. * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Remove memcpy-avx512-no-vzeroupper. * sysdeps/x86_64/multiarch/memcpy-avx512-no-vzeroupper.S: Renamed to ... * sysdeps/x86_64/multiarch/memmove-avx512-no-vzeroupper.S: This. (MEMCPY): Don't define. (MEMCPY_CHK): Likewise. (MEMPCPY): Likewise. (MEMPCPY_CHK): Likewise. (MEMPCPY_CHK): Renamed to ... (__mempcpy_chk_avx512_no_vzeroupper): This. (MEMPCPY_CHK): Renamed to ... (__mempcpy_chk_avx512_no_vzeroupper): This. (MEMCPY_CHK): Renamed to ... (__memmove_chk_avx512_no_vzeroupper): This. (MEMCPY): Renamed to ... (__memmove_avx512_no_vzeroupper): This. (__memcpy_avx512_no_vzeroupper): New alias. (__memcpy_chk_avx512_no_vzeroupper): Likewise.
* Implement x86-64 multiarch mempcpy in memcpyH.J. Lu2016-03-289-57/+69
| | | | | | | | | | | | | | | | | | | | | | | | | Implement x86-64 multiarch mempcpy in memcpy to share most of code. It reduces code size of libc.so. [BZ #18858] * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Remove mempcpy-ssse3, mempcpy-ssse3-back, mempcpy-avx-unaligned and mempcpy-avx512-no-vzeroupper. * sysdeps/x86_64/multiarch/memcpy-avx-unaligned.S (MEMPCPY_CHK): New. (MEMPCPY): Likewise. * sysdeps/x86_64/multiarch/memcpy-avx512-no-vzeroupper.S (MEMPCPY_CHK): New. (MEMPCPY): Likewise. * sysdeps/x86_64/multiarch/memcpy-ssse3-back.S (MEMPCPY_CHK): New. (MEMPCPY): Likewise. * sysdeps/x86_64/multiarch/memcpy-ssse3.S (MEMPCPY_CHK): New. (MEMPCPY): Likewise. * sysdeps/x86_64/multiarch/mempcpy-avx-unaligned.S: Removed. * sysdeps/x86_64/multiarch/mempcpy-avx512-no-vzeroupper.S: Likewise. * sysdeps/x86_64/multiarch/mempcpy-ssse3-back.S: Likewise. * sysdeps/x86_64/multiarch/mempcpy-ssse3.S: Likewise.
* [x86] Add a feature bit: Fast_Unaligned_CopyH.J. Lu2016-03-281-1/+1
| | | | | | | | | | | | | | | | | | | On AMD processors, memcpy optimized with unaligned SSE load is slower than memcpy optimized with aligned SSSE3 while other string functions are faster with unaligned SSE load. A feature bit, Fast_Unaligned_Copy, is added to select memcpy optimized with unaligned SSE load. [BZ #19583] * sysdeps/x86/cpu-features.c (init_cpu_features): Set Fast_Unaligned_Copy with Fast_Unaligned_Load for Intel processors. Set Fast_Copy_Backward for AMD Excavator processors. * sysdeps/x86/cpu-features.h (bit_arch_Fast_Unaligned_Copy): New. (index_arch_Fast_Unaligned_Copy): Likewise. * sysdeps/x86_64/multiarch/memcpy.S (__new_memcpy): Check Fast_Unaligned_Copy instead of Fast_Unaligned_Load.
* tst-audit10: Fix compilation on compilers without bit_AVX512F [BZ #19860]Florian Weimer2016-03-251-1/+4
| | | | | | [BZ #19860] * sysdeps/x86_64/tst-audit10.c (avx512_enabled): Always return zero if the compiler does not provide the AVX512F bit.
* Fix x86_64 / x86 powl inaccuracy for integer exponents (bug 19848).Joseph Myers2016-03-242-12/+12
| | | | | | | | | | | | | | | | | | | | | | | Bug 19848 reports cases where powl on x86 / x86_64 has error accumulation, for small integer exponents, larger than permitted by glibc's accuracy goals, at least in some rounding modes. This patch further restricts the exponent range for which the small-integer-exponent logic is used to limit the possible error accumulation. Tested for x86_64 and x86 and ulps updated accordingly. [BZ #19848] * sysdeps/i386/fpu/e_powl.S (p3): Rename to p2 and change value from 8 to 4. (__ieee754_powl): Compare integer exponent against 4 not 8. * sysdeps/x86_64/fpu/e_powl.S (p3): Rename to p2 and change value from 8 to 4. (__ieee754_powl): Compare integer exponent against 4 not 8. * math/auto-libm-test-in: Add more tests of pow. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
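To see why a smaller exponent limit reduces error accumulation, consider one common way of handling small integer exponents: square-and-multiply. The sketch below is an assumption-laden illustration, not the code in e_powl.S. The names are hypothetical, only non-negative exponents are handled, and the strict `<` comparison against the limit of 4 is assumed. Every multiplication rounds once, so capping the exponent caps the number of roundings.

```c
#include <assert.h>

#define SKETCH_SMALL_INT_LIMIT 4u /* the commit lowers this bound from 8 to 4 */

/* Hypothetical small-integer-exponent path using square-and-multiply.
   Each `*=` is a rounded long double multiplication. */
static long double sketch_pow_small_int(long double x, unsigned n)
{
    assert(n < SKETCH_SMALL_INT_LIMIT); /* assumed strict comparison */
    long double result = 1.0L;
    while (n != 0) {
        if (n & 1u)
            result *= x; /* multiply in the current power of x */
        x *= x;          /* square for the next exponent bit   */
        n >>= 1;
    }
    return result;
}
```

Exponents at or above the limit would take the general (logarithm-based) path instead, which is outside this sketch.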
* Don't set %rcx twice before "rep movsb"H.J. Lu2016-03-221-1/+0
| | | | | * sysdeps/x86_64/multiarch/memcpy-avx-unaligned.S (MEMCPY): Don't set %rcx twice before "rep movsb".
* Use JUMPTARGET in x86-64 mathvecH.J. Lu2016-03-1638-130/+130
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When PLT may be used, JUMPTARGET should be used instead calling the function directly. * sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core_sse4.S (_ZGVbN2v_cos_sse4): Use JUMPTARGET to call cos. * sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core_avx2.S (_ZGVdN4v_cos_avx2): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core_avx512.S (_ZGVdN4v_cos): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core_sse4.S (_ZGVbN2v_exp_sse4): Use JUMPTARGET to call exp. * sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core_avx2.S (_ZGVdN4v_exp_avx2): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core_avx512.S (_ZGVdN4v_exp): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_log2_core_sse4.S (_ZGVbN2v_log_sse4): Use JUMPTARGET to call log. * sysdeps/x86_64/fpu/multiarch/svml_d_log4_core_avx2.S (_ZGVdN4v_log_avx2): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core_avx512.S (_ZGVdN4v_log): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core_sse4.S (_ZGVbN2vv_pow_sse4): Use JUMPTARGET to call pow. * sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core_avx2.S (_ZGVdN4vv_pow_avx2): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core_avx512.S (_ZGVdN4vv_pow): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core_sse4.S (_ZGVbN2v_sin_sse4): Use JUMPTARGET to call sin. * sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core_avx2.S (_ZGVdN4v_sin_avx2): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core_avx512.S (_ZGVdN4v_sin): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core_sse4.S (_ZGVbN2vvv_sincos_sse4): Use JUMPTARGET to call sin and cos. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core_avx2.S (_ZGVdN4vvv_sincos_avx2): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core_avx512.S (_ZGVdN4vvv_sincos): Likewise. 
* sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core_avx512.S (_ZGVdN8v_cosf): Use JUMPTARGET to call cosf. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf4_core_sse4.S (_ZGVbN4v_cosf_sse4): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf8_core_avx2.S (_ZGVdN8v_cosf_avx2): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core_avx512.S (_ZGVdN8v_expf): Use JUMPTARGET to call expf. * sysdeps/x86_64/fpu/multiarch/svml_s_expf4_core_sse4.S (_ZGVbN4v_expf_sse4): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_expf8_core_avx2.S (_ZGVdN8v_expf_avx2): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_logf16_core_avx512.S (_ZGVdN8v_logf): Use JUMPTARGET to call logf. * sysdeps/x86_64/fpu/multiarch/svml_s_logf4_core_sse4.S (_ZGVbN4v_logf_sse4): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_logf8_core_avx2.S (_ZGVdN8v_logf_avx2): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core_avx512.S (_ZGVdN8vv_powf): Use JUMPTARGET to call powf. * sysdeps/x86_64/fpu/multiarch/svml_s_powf4_core_sse4.S (_ZGVbN4vv_powf_sse4): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_powf8_core_avx2.S (_ZGVdN8vv_powf_avx2): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core_avx512.S (_ZGVdN8vv_powf): Use JUMPTARGET to call sinf and cosf. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf4_core_sse4.S (_ZGVbN4vvv_sincosf_sse4): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core_avx2.S (_ZGVdN8vvv_sincosf_avx2): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core_avx512.S (_ZGVdN8v_sinf): Use JUMPTARGET to call sinf. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf4_core_sse4.S (_ZGVbN4v_sinf_sse4): Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf8_core_avx2.S (_ZGVdN8v_sinf_avx2): Likewise. * sysdeps/x86_64/fpu/svml_d_wrapper_impl.h (WRAPPER_IMPL_SSE2): Use JUMPTARGET to call callee. (WRAPPER_IMPL_SSE2_ff): Likewise. (WRAPPER_IMPL_SSE2_fFF): Likewise. (WRAPPER_IMPL_AVX): Likewise. (WRAPPER_IMPL_AVX_ff): Likewise. (WRAPPER_IMPL_AVX_fFF): Likewise. 
(WRAPPER_IMPL_AVX512): Likewise. (WRAPPER_IMPL_AVX512_ff): Likewise. * sysdeps/x86_64/fpu/svml_s_wrapper_impl.h (WRAPPER_IMPL_SSE2): Likewise. (WRAPPER_IMPL_SSE2_ff): Likewise. (WRAPPER_IMPL_SSE2_fFF): Likewise. (WRAPPER_IMPL_AVX): Likewise. (WRAPPER_IMPL_AVX_ff): Likewise. (WRAPPER_IMPL_AVX_fFF): Likewise. (WRAPPER_IMPL_AVX512): Likewise. (WRAPPER_IMPL_AVX512_ff): Likewise. (WRAPPER_IMPL_AVX512_fFF): Likewise.
* Fix tst-audit10 build when -mavx512f is not supported.Roland McGrath2016-03-082-3/+4
|
* tst-audit4, tst-audit10: Compile AVX/AVX-512 code separately [BZ #19269]Florian Weimer2016-03-075-55/+112
| | | | | This ensures that GCC will not use unsupported instructions before the run-time check to ensure support.
* Group AVX512 functions in .text.avx512 sectionH.J. Lu2016-03-062-2/+2
| | | | | | | * sysdeps/x86_64/multiarch/memcpy-avx512-no-vzeroupper.S: Replace .text with .text.avx512. * sysdeps/x86_64/multiarch/memset-avx512-no-vzeroupper.S: Likewise.
* Replace PREINIT_FUNCTION@PLT with *%rax in callH.J. Lu2016-03-041-1/+1
| | | | | | | | | Since we have loaded address of PREINIT_FUNCTION into %rax, we can avoid extra branch to PLT slot. [BZ #19745] * sysdeps/x86_64/crti.S (_init): Replace PREINIT_FUNCTION@PLT with *%rax in call.
* Replace @PLT with @GOTPCREL(%rip) in callH.J. Lu2016-03-041-2/+4
| | | | | | | | | Since __libc_start_main is called very early, lazy binding isn't relevant here. Use indirect branch via GOT to avoid extra branch to PLT slot. [BZ #19745] * sysdeps/x86_64/start.S (_start): Replace __libc_start_main@PLT with *__libc_start_main@GOTPCREL(%rip) in call.
* Add a comment in sysdeps/x86_64/MakefileH.J. Lu2016-03-041-0/+3
| | | | | | Mention recursive calls when ENTRY is used in _mcount.S. * sysdeps/x86_64/Makefile (sysdep_noprof): Add a comment.
* x86-64: Fix memcpy IFUNC selectionH.J. Lu2016-03-041-13/+14
| | | | | | | | | | | | | | | | | Check Fast_Unaligned_Load, instead of Slow_BSF, and also check for Fast_Copy_Backward to enable __memcpy_ssse3_back. The existing selection order is replaced with the following selection order: 1. __memcpy_avx_unaligned if AVX_Fast_Unaligned_Load bit is set. 2. __memcpy_sse2_unaligned if Fast_Unaligned_Load bit is set. 3. __memcpy_sse2 if SSSE3 isn't available. 4. __memcpy_ssse3_back if Fast_Copy_Backward bit is set. 5. __memcpy_ssse3 [BZ #18880] * sysdeps/x86_64/multiarch/memcpy.S: Check Fast_Unaligned_Load, instead of Slow_BSF, and also check for Fast_Copy_Backward to enable __memcpy_ssse3_back.
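The five-step selection order in the commit above reads naturally as a chain of early returns. The following C sketch mirrors that order only; the struct, enum and function names are hypothetical stand-ins, and the real selector is written in assembly in memcpy.S against glibc's internal CPU feature data.

```c
#include <stdbool.h>

/* Hypothetical mirror of the feature bits named in the commit message. */
struct sketch_cpu_bits {
    bool avx_fast_unaligned_load;
    bool fast_unaligned_load;
    bool ssse3;
    bool fast_copy_backward;
};

/* Hypothetical tags for the implementations listed in the commit. */
enum sketch_memcpy_impl {
    SKETCH_MEMCPY_AVX_UNALIGNED,  /* __memcpy_avx_unaligned  */
    SKETCH_MEMCPY_SSE2_UNALIGNED, /* __memcpy_sse2_unaligned */
    SKETCH_MEMCPY_SSE2,           /* __memcpy_sse2           */
    SKETCH_MEMCPY_SSSE3_BACK,     /* __memcpy_ssse3_back     */
    SKETCH_MEMCPY_SSSE3           /* __memcpy_ssse3          */
};

/* Apply the checks in the order given by the commit message. */
static enum sketch_memcpy_impl
sketch_select_memcpy(const struct sketch_cpu_bits *f)
{
    if (f->avx_fast_unaligned_load)
        return SKETCH_MEMCPY_AVX_UNALIGNED;
    if (f->fast_unaligned_load)
        return SKETCH_MEMCPY_SSE2_UNALIGNED;
    if (!f->ssse3)
        return SKETCH_MEMCPY_SSE2;
    if (f->fast_copy_backward)
        return SKETCH_MEMCPY_SSSE3_BACK;
    return SKETCH_MEMCPY_SSSE3;
}
```

Because the checks are ordered, a CPU with both Fast_Unaligned_Load and Fast_Copy_Backward set gets the SSE2 unaligned version, never the SSSE3 backward one.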
* 2016-03-03 Paul Pluzhnikov <ppluzhnikov@google.com>Paul Pluzhnikov2016-03-031-13/+42
| | | | | | [BZ #19490] * sysdeps/x86_64/_mcount.S (_mcount): Add unwind descriptor. (__fentry__): Likewise
* Copy x86_64 _mcount.op from _mcount.oH.J. Lu2016-03-031-0/+1
| | | | | | | | No need to compile x86_64 _mcount.S with -pg. We can just copy the normal static object. * gmon/Makefile (noprof): Add $(sysdep_noprof). * sysdeps/x86_64/Makefile (sysdep_noprof): Add _mcount.
* Call x86-64 __mcount_internal/__sigjmp_save directlyH.J. Lu2016-03-012-12/+0
| | | | | | | | | | | | | | | Since __mcount_internal and __sigjmp_save are internal to x86-64 libc.so: 3532: 0000000000104530 289 FUNC LOCAL DEFAULT 13 __mcount_internal 3391: 0000000000034170 38 FUNC LOCAL DEFAULT 13 __sigjmp_save they can be called directly without PLT. * sysdeps/x86_64/_mcount.S (C_LABEL(_mcount)): Call __mcount_internal directly. (C_LABEL(__fentry__)): Likewise. * sysdeps/x86_64/setjmp.S (__sigsetjmp): Call __sigjmp_save directly.
* [x86_64] Set DL_RUNTIME_UNALIGNED_VEC_SIZE to 8H.J. Lu2016-02-192-11/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Due to GCC bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58066 __tls_get_addr may be called with 8-byte stack alignment. Although this bug has been fixed in GCC 4.9.4, 5.3 and 6, we can't assume that stack will be always aligned at 16 bytes. Since SSE optimized memory/string functions with aligned SSE register load and store are used in the dynamic linker, we must set DL_RUNTIME_UNALIGNED_VEC_SIZE to 8 so that _dl_runtime_resolve_sse will align the stack before calling _dl_fixup: Dump of assembler code for function _dl_runtime_resolve_sse: 0x00007ffff7deea90 <+0>: push %rbx 0x00007ffff7deea91 <+1>: mov %rsp,%rbx 0x00007ffff7deea94 <+4>: and $0xfffffffffffffff0,%rsp ^^^^^^^^^^^ Align stack to 16 bytes 0x00007ffff7deea98 <+8>: sub $0x100,%rsp 0x00007ffff7deea9f <+15>: mov %rax,0xc0(%rsp) 0x00007ffff7deeaa7 <+23>: mov %rcx,0xc8(%rsp) 0x00007ffff7deeaaf <+31>: mov %rdx,0xd0(%rsp) 0x00007ffff7deeab7 <+39>: mov %rsi,0xd8(%rsp) 0x00007ffff7deeabf <+47>: mov %rdi,0xe0(%rsp) 0x00007ffff7deeac7 <+55>: mov %r8,0xe8(%rsp) 0x00007ffff7deeacf <+63>: mov %r9,0xf0(%rsp) 0x00007ffff7deead7 <+71>: movaps %xmm0,(%rsp) 0x00007ffff7deeadb <+75>: movaps %xmm1,0x10(%rsp) 0x00007ffff7deeae0 <+80>: movaps %xmm2,0x20(%rsp) 0x00007ffff7deeae5 <+85>: movaps %xmm3,0x30(%rsp) 0x00007ffff7deeaea <+90>: movaps %xmm4,0x40(%rsp) 0x00007ffff7deeaef <+95>: movaps %xmm5,0x50(%rsp) 0x00007ffff7deeaf4 <+100>: movaps %xmm6,0x60(%rsp) 0x00007ffff7deeaf9 <+105>: movaps %xmm7,0x70(%rsp) [BZ #19679] * sysdeps/x86_64/dl-trampoline.S (DL_RUNIME_UNALIGNED_VEC_SIZE): Renamed to ... (DL_RUNTIME_UNALIGNED_VEC_SIZE): This. Set to 8. (DL_RUNIME_RESOLVE_REALIGN_STACK): Renamed to ... (DL_RUNTIME_RESOLVE_REALIGN_STACK): This. Updated. (DL_RUNIME_RESOLVE_REALIGN_STACK): Renamed to ... (DL_RUNTIME_RESOLVE_REALIGN_STACK): This. * sysdeps/x86_64/dl-trampoline.h (DL_RUNIME_RESOLVE_REALIGN_STACK): Renamed to ... 
(DL_RUNTIME_RESOLVE_REALIGN_STACK): This.
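The `and $0xfffffffffffffff0,%rsp` instruction in the disassembly above rounds the stack pointer down to a 16-byte boundary, which is what makes the following `movaps` stores safe. The same arithmetic can be written in C; `sketch_align_down_16` is a hypothetical helper used purely for illustration.

```c
#include <stdint.h>

/* Clear the low four bits, rounding the address down to a multiple of 16.
   This is the C equivalent of `and $-16, %rsp` applied to a 64-bit value. */
static uint64_t sketch_align_down_16(uint64_t addr)
{
    return addr & ~(uint64_t)0xf;
}
```

An address that is only 8-byte aligned, such as 0x1008, becomes 0x1000; an address that is already 16-byte aligned is left unchanged.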
* Use PIC relocation in ALIAS_IMPLAndrew Senkevich2016-02-171-2/+1
| | | | | | | | | Since libmvec_nonshared.a may be linked into shared objects, ALIAS_IMPL should use PIC relocation. [BZ #19590] * sysdeps/x86_64/fpu/svml_finite_alias.S (ALIAS_IMPL): Use PIC relocation.
* 2016-01-20 Paul Pluzhnikov <ppluzhnikov@google.com>Paul Pluzhnikov2016-01-203-15/+10
| | | | | | | | | [BZ #19490] * sysdeps/unix/sysv/linux/x86_64/pthread_cond_broadcast.S (pthread_cond_broadcast): Use ENTRY/END * sysdeps/unix/sysv/linux/x86_64/pthread_cond_signal.S (pthread_cond_signal): Likewise * sysdeps/x86_64/nptl/pthread_spin_lock.S (pthread_spin_lock): Likewise * sysdeps/x86_64/nptl/pthread_spin_trylock.S (pthread_spin_trylock): Likewise * sysdeps/x86_64/nptl/pthread_spin_unlock.S (pthread_spin_unlock): Likewise
* Fixed build with assembler w/o AVX-512 support.Andrew Senkevich2016-01-191-0/+12
| | | | | * sysdeps/x86_64/multiarch/ifunc-impl-list.c: Fixed build with assembler not supporting AVX-512.
* Fixed typos in __memcpy_chk.Andrew Senkevich2016-01-161-3/+3
| | | | * sysdeps/x86_64/multiarch/memcpy_chk.S: Fixed typos.
* Added memcpy/memmove family optimized with AVX512 for KNL hardware.Andrew Senkevich2016-01-1611-19/+540
| | | | | | | | | | | | | | | | | | | | Added AVX512 implementations of memcpy, mempcpy, memmove, memcpy_chk, mempcpy_chk, memmove_chk. They show an average improvement of more than 30% over the AVX versions on KNL hardware (performance results in the thread <https://sourceware.org/ml/libc-alpha/2016-01/msg00258.html>). * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Added new files. * sysdeps/x86_64/multiarch/ifunc-impl-list.c: Added new tests. * sysdeps/x86_64/multiarch/memcpy-avx512-no-vzeroupper.S: New file. * sysdeps/x86_64/multiarch/mempcpy-avx512-no-vzeroupper.S: Likewise. * sysdeps/x86_64/multiarch/memmove-avx512-no-vzeroupper.S: Likewise. * sysdeps/x86_64/multiarch/memcpy.S: Added new IFUNC branch. * sysdeps/x86_64/multiarch/memcpy_chk.S: Likewise. * sysdeps/x86_64/multiarch/memmove.c: Likewise. * sysdeps/x86_64/multiarch/memmove_chk.c: Likewise. * sysdeps/x86_64/multiarch/mempcpy.S: Likewise. * sysdeps/x86_64/multiarch/mempcpy_chk.S: Likewise.
* Update copyright dates with scripts/update-copyrights.Joseph Myers2016-01-04339-339/+339
|
* Added memset optimized with AVX512 for KNL hardware.Andrew Senkevich2015-12-195-3/+225
| | | | | | | | | | | | | | | It shows improvement up to 28% over AVX2 memset (performance results attached at <https://sourceware.org/ml/libc-alpha/2015-12/msg00052.html>). * sysdeps/x86_64/multiarch/memset-avx512-no-vzeroupper.S: New file. * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Added new file. * sysdeps/x86_64/multiarch/ifunc-impl-list.c: Added new tests. * sysdeps/x86_64/multiarch/memset.S: Added new IFUNC branch. * sysdeps/x86_64/multiarch/memset_chk.S: Likewise. * sysdeps/x86/cpu-features.h (bit_Prefer_No_VZEROUPPER, index_Prefer_No_VZEROUPPER): New. * sysdeps/x86/cpu-features.c (init_cpu_features): Set the Prefer_No_VZEROUPPER for Knights Landing.
* Better workaround for aliases of *_finite symbols in vector math library.Andrew Senkevich2015-11-272-1/+62
| | | | | | | | | | | | | The old workaround based on assembly aliases can lead to link failures (bug 19058). This patch implements the workaround in a different way to avoid them. [BZ #19058] * math/Makefile ($(inst_libdir)/libm.so): Added libmvec_nonshared.a to AS_NEEDED. * sysdeps/x86/fpu/bits/math-vector.h: Removed code with old workaround. * sysdeps/x86_64/fpu/Makefile (libmvec-support, libmvec-static-only-routines): Added new file. * sysdeps/x86_64/fpu/svml_finite_alias.S: New file.
* Fix i386/x86_64 log* (1) zero sign for -ffinite-math-only (bug 19213).Joseph Myers2015-11-053-3/+21
| | | | | | | | | | | | | | | | | | | | | For the -ffinite-math-only versions of various x86_64 and x86 log* functions, a zero result from log* (1) is returned with incorrect sign in round-downward mode. This patch fixes this in a similar way to the previous fixes for the non-*_finite versions of the functions. Tested for x86_64 and x86 (including an i586 build), together with a patch that will be applied separately to enable the main libm-test.inc tests for the finite-math-only functions. [BZ #19213] * sysdeps/i386/fpu/e_log.S (__log_finite): Ensure +0 is always returned for argument 1. * sysdeps/i386/fpu/e_logf.S (__logf_finite): Likewise. * sysdeps/i386/fpu/e_logl.S (__logl_finite): Likewise. * sysdeps/i386/i686/fpu/e_logl.S (__logl_finite): Likewise. * sysdeps/x86_64/fpu/e_log10l.S (__log10l_finite): Likewise. * sysdeps/x86_64/fpu/e_log2l.S (__log2l_finite): Likewise. * sysdeps/x86_64/fpu/e_logl.S (__logl_finite): Likewise.
* Remove miscellaneous GCC >= 4.7 version conditionals.Joseph Myers2015-11-041-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes miscellaneous __GNUC_PREREQ (4, 7) conditionals that are now dead. Tested for x86_64 and x86 (testsuite, and that installed stripped shared libraries are unchanged by the patch). * sysdeps/arm/atomic-machine.h [__GNUC_PREREQ (4, 7) && __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4]: Change conditional to [__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4]. [__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 && !__GNUC_PREREQ (4, 7)]: Remove conditional code. [!__GNUC_PREREQ (4, 7) || !__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4]: Change conditional to [!__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4]. * sysdeps/i386/sysdep.h [__ASSEMBLER__ && __GNUC_PREREQ (4, 7)]: Change conditional to [__ASSEMBLER__]. [__ASSEMBLER__ && !__GNUC_PREREQ (4, 7)]: Remove conditional code. [!__ASSEMBLER__ && __GNUC_PREREQ (4, 7)]: Change conditional to [!__ASSEMBLER__]. [!__ASSEMBLER__ && !__GNUC_PREREQ (4, 7)]: Remove conditional code. * sysdeps/unix/sysv/linux/sh/atomic-machine.h (rNOSP): Remove conditional macro definitions. (__arch_compare_and_exchange_val_8_acq): Use "u" instead of rNOSP. (__arch_compare_and_exchange_val_16_acq): Likewise. (__arch_compare_and_exchange_val_32_acq): Likewise. (atomic_exchange_and_add): Likewise. (atomic_add): Likewise. (atomic_add_negative): Likewise. (atomic_add_zero): Likewise. (atomic_bit_set): Likewise. (atomic_bit_test_set): Likewise. * sysdeps/x86_64/atomic-machine.h [__GNUC_PREREQ (4, 7)]: Make code unconditional. [!__GNUC_PREREQ (4, 7)]: Remove conditional code.
* Remove cpuid.h configure tests.Joseph Myers2015-10-292-46/+0
| | | | | | | | | | | | | | There are configure tests for the cpuid.h header for x86 / x86_64. GCC 4.3 and later install this header, so those tests are obsolete. This patch removes them. Tested for x86_64 and x86 (testsuite, and that installed shared libraries are unchanged by the patch). * sysdeps/i386/configure.ac (cpuid.h): Do not test for header. * sysdeps/i386/configure: Regenerated. * sysdeps/x86_64/configure.ac (cpuid.h): Do not test for header. * sysdeps/x86_64/configure: Regenerated.
* Handle more state in i386/x86_64 fesetenv (bug 16068).Joseph Myers2015-10-281-10/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | fenv_t should include architecture-specific floating-point modes and status flags. i386 and x86_64 fesetenv limit which bits they use from the x87 status and control words, when using saved state, and limit which parts of the state they set to fixed values, when using FE_DFL_ENV / FE_NOMASK_ENV. The following should be included but are excluded in at least some cases: status and masking for the "denormal operand" exception (which isn't part of FE_ALL_EXCEPT); precision control (explicitly mentioned in Annex F as something that counts as part of the floating-point environment); MXCSR FZ and DAZ bits (for FE_DFL_ENV and FE_NOMASK_ENV). This patch arranges for this extra state to be handled by fesetenv (and thereby by feupdateenv, which calls fesetenv). (Note that glibc functions using floating point are not generally expected to work correctly with non-default values of this state, especially precision control, but it is still logically part of the floating-point environment and should be handled as such by fesetenv. Changes to the state relating to subnormals ought generally to work with libm functions when the arguments aren't subnormal and neither are the expected results; that's a consequence of functions avoiding spurious internal underflows.) A question arising from this is whether FE_NOMASK_ENV should or should not mask the "denormal operand" exception. I decided it should mask that exception. This is the status quo - previously that exception could only be unmasked by direct manipulation of control registers (possibly via <fpu_control.h>). 
In addition, it means that use of FE_NOMASK_ENV leaves a floating-point environment the same as could be obtained by fesetenv (FE_DFL_ENV); feenableexcept (FE_ALL_EXCEPT);, rather than an environment in which an exception is unmasked that could only be masked again by using fesetenv with FE_DFL_ENV (or a previously saved environment) - this exception not being usable with other <fenv.h> functions because it's outside FE_ALL_EXCEPT. Tested for x86_64 and x86. [BZ #16068] * sysdeps/i386/fpu/fesetenv.c: Include <fpu_control.h>. (FE_ALL_EXCEPT_X86): New macro. (__fesetenv): Use FE_ALL_EXCEPT_X86 in most places instead of FE_ALL_EXCEPT. Ensure precision control is included in floating-point state. Ensure that FE_DFL_ENV and FE_NOMASK_ENV handle "denormal operand exception" and clear FZ and DAZ bits. * sysdeps/x86_64/fpu/fesetenv.c: Include <fpu_control.h>. (FE_ALL_EXCEPT_X86): New macro. (__fesetenv): Use FE_ALL_EXCEPT_X86 in most places instead of FE_ALL_EXCEPT. Ensure precision control is included in floating-point state. Ensure that FE_DFL_ENV and FE_NOMASK_ENV handle "denormal operand exception" and clear FZ and DAZ bits. * sysdeps/x86/fpu/test-fenv-sse-2.c: New file. * sysdeps/x86/fpu/test-fenv-x87.c: Likewise. * sysdeps/x86/fpu/Makefile [$(subdir) = math] (tests): Add test-fenv-x87 and test-fenv-sse-2. [$(subdir) = math] (CFLAGS-test-fenv-sse-2.c): New variable.
* Fix i386/x86_64 fesetenv SSE exception clearing (bug 19181).Joseph Myers2015-10-281-0/+4
| | | | | | | | | | | | | | | | | | | | | | | The i386 and x86_64 versions of fesetenv, when called with FE_DFL_ENV or FE_NOMASK_ENV as argument, do not clear SSE exceptions raised in MXCSR. These arguments should, like other fenv_t values, represent the whole of the floating-point state, so such exceptions should be cleared; this patch adds the required clearing. (Discovered while working on bug 16068.) Tested for x86_64 and x86. [BZ #19181] * sysdeps/i386/fpu/fesetenv.c (__fesetenv): Clear already-raised SSE exceptions when argument is FE_DFL_ENV or FE_NOMASK_ENV. * sysdeps/x86_64/fpu/fesetenv.c (__fesetenv): Likewise. * math/test-fenv-clear-main.c: New file. * math/test-fenv-clear.c: Likewise. * math/Makefile (tests): Add test-fenv-clear. * sysdeps/x86/fpu/test-fenv-clear-sse.c: New file. * sysdeps/x86/fpu/Makefile [$(subdir) = math] (tests): Add test-fenv-clear-sse. [$(subdir) = math] (CFLAGS-test-fenv-clear-sse.c): New variable.
* Remove -mavx2 configure tests.Joseph Myers2015-10-286-58/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | There are configure tests for the -mavx2 compiler option. AVX2 support was added in GCC 4.7, so these tests are now obsolete; this patch removes them. Tested for x86_64 and x86 (testsuite, and that installed stripped shared libraries are unchanged by the patch). * sysdeps/i386/configure.ac (libc_cv_cc_avx2): Remove configure test. * sysdeps/i386/configure: Regenerated. * sysdeps/x86_64/configure.ac (libc_cv_cc_avx2): Remove configure test. * sysdeps/x86_64/configure: Regenerated. * config.h.in (HAVE_AVX2_SUPPORT): Remove #undef. * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add memset-avx2 unconditionally instead of conditionally on [$(config-cflags-avx2) = yes]. * sysdeps/x86_64/multiarch/ifunc-impl-list.c (__libc_ifunc_impl_list) [HAVE_AVX2_SUPPORT]: Make code unconditional. * sysdeps/x86_64/multiarch/memset.S [HAVE_AVX2_SUPPORT]: Likewise. * sysdeps/x86_64/multiarch/memset_chk.S [IS_IN (libc) && SHARED && HAVE_AVX2_SUPPORT]: Change conditional to [IS_IN (libc) && SHARED].
* x86_64: Regenerate ulps [BZ #19168]Florian Weimer2015-10-261-6/+6
| | | | This comes from running “make regen-ulps” on AMD Opteron 6272 CPUs.
* Add more libm tests (ilogb, is*, j0, j1, jn, lgamma, log*).Joseph Myers2015-10-231-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch improves the libm test coverage for a few more functions. Tested for x86_64 and x86. * math/auto-libm-test-in: Add more tests of log, log10, log1p and log2. * math/auto-libm-test-out: Regenerated. * math/libm-test.inc (MAX_EXP): New macro. (ilogb_test_data): Add more tests. (isfinite_test_data): Likewise. (isgreater_test_data): Likewise. (isgreaterequal_test_data): Likewise. (isinf_test_data): Likewise. (isless_test_data): Likewise. (islessequal_test_data): Likewise. (islessgreater_test_data): Likewise. (isnan_test_data): Likewise. (isnormal_test_data): Likewise. (issignaling_test_data): Likewise. (isunordered_test_data): Likewise. (j0_test_data): Likewise. (j1_test_data): Likewise. (jn_test_data): Likewise. (lgamma_test_data): Likewise. (log_test_data): Likewise. (log10_test_data): Likewise. (log1p_test_data): Likewise. (log2_test_data): Likewise. (logb_test_data): Likewise. * sysdeps/x86_64/fpu/libm-test-ulps: Update.
* Fix i386 / x86_64 nearbyint exception clearing (bug 15491).Joseph Myers2015-10-221-6/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The implementations of nearbyint functions using x87 floating point (i386 all versions, x86_64 long double only) use the fclex instruction, which clears any exceptions that were raised before the function was called. These functions must not clear exceptions that were raised before they were called. This patch fixes these functions to save and restore the whole floating-point environment (fnstenv / fldenv) as the way of avoiding raising "inexact" (recall that there isn't an x87 instruction for loading just the status word, so the whole environment has to be saved and loaded instead - the code already saved and loaded the control word, which is now obtained from the saved environment after this patch, to disable traps on "inexact"). In the case of the long double functions, any "invalid" exception from frndint (applied to a signaling NaN) needs merging into the saved state; this issue doesn't apply to the float and double functions because that exception would have been raised when the argument is loaded, before the environment is saved. [BZ #15491] * sysdeps/i386/fpu/s_nearbyint.S (__nearbyint): Save and restore floating-point environment instead of clearing all exceptions. * sysdeps/i386/fpu/s_nearbyintf.S (__nearbyintf): Likewise. * sysdeps/i386/fpu/s_nearbyintl.S (__nearbyintl): Likewise, merging in "invalid" exceptions from frndint. * sysdeps/x86_64/fpu/s_nearbyintl.S (__nearbyintl): Likewise. * math/test-nearbyint-except.c: New file. * math/Makefile (tests): Add test-nearbyint-except.
* Convert 231 sysdeps function definitions to prototype style.Joseph Myers2015-10-191-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This mostly automatically-generated patch converts 231 sysdeps function definitions in glibc from old-style K&R to prototype-style. For __aio_sigqueue and __gai_sigqueue I had to add internal_function to the definitions as noted by Florian in <https://sourceware.org/ml/libc-alpha/2015-10/msg00595.html> to keep the functions compiling on x86 after conversion to prototype definitions. Otherwise, the patch is automatically generated with all the same exclusions and caveats as in <https://sourceware.org/ml/libc-alpha/2015-10/msg00594.html> except that it's a patch for sysdeps files. Tested for x86_64 and x86 (testsuite, and that installed stripped shared libraries are unchanged by the patch). Also tested for arm, mips64 and powerpc32 that installed stripped shared libraries are unchanged by the patch. * sysdeps/arm/backtrace.c (__backtrace): Convert to prototype-style function definition. * sysdeps/i386/backtrace.c (__backtrace): Likewise. * sysdeps/i386/ffs.c (__ffs): Likewise. * sysdeps/i386/i686/ffs.c (__ffs): Likewise. * sysdeps/ia64/nptl/pthread_spin_lock.c (pthread_spin_lock): Likewise. * sysdeps/ia64/nptl/pthread_spin_trylock.c (pthread_spin_trylock): Likewise. * sysdeps/ieee754/ldbl-128/e_log2l.c (__ieee754_log2l): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_log2l.c (__ieee754_log2l): Likewise. * sysdeps/m68k/ffs.c (__ffs): Likewise. * sysdeps/m68k/m680x0/fpu/e_acos.c (FUNC): Likewise. 
* sysdeps/m68k/m680x0/fpu/e_fmod.c (FUNC): Likewise. * sysdeps/mach/adjtime.c (__adjtime): Likewise. * sysdeps/mach/gettimeofday.c (__gettimeofday): Likewise. * sysdeps/mach/hurd/_exit.c (_exit): Likewise. * sysdeps/mach/hurd/access.c (__access): Likewise. * sysdeps/mach/hurd/adjtime.c (__adjtime): Likewise. * sysdeps/mach/hurd/chdir.c (__chdir): Likewise. * sysdeps/mach/hurd/chmod.c (__chmod): Likewise. * sysdeps/mach/hurd/chown.c (__chown): Likewise. * sysdeps/mach/hurd/cthreads.c (cthread_keycreate): Likewise. (cthread_getspecific): Likewise. (cthread_setspecific): Likewise. (__libc_getspecific): Likewise. * sysdeps/mach/hurd/euidaccess.c (__euidaccess): Likewise. * sysdeps/mach/hurd/faccessat.c (faccessat): Likewise. * sysdeps/mach/hurd/fchdir.c (__fchdir): Likewise. * sysdeps/mach/hurd/fchmod.c (__fchmod): Likewise. * sysdeps/mach/hurd/fchmodat.c (fchmodat): Likewise. * sysdeps/mach/hurd/fchown.c (__fchown): Likewise. * sysdeps/mach/hurd/fchownat.c (fchownat): Likewise. * sysdeps/mach/hurd/flock.c (__flock): Likewise. * sysdeps/mach/hurd/fsync.c (fsync): Likewise. * sysdeps/mach/hurd/ftruncate.c (__ftruncate): Likewise. * sysdeps/mach/hurd/getgroups.c (__getgroups): Likewise. * sysdeps/mach/hurd/gethostname.c (__gethostname): Likewise. * sysdeps/mach/hurd/getitimer.c (__getitimer): Likewise. * sysdeps/mach/hurd/getlogin_r.c (__getlogin_r): Likewise. * sysdeps/mach/hurd/getpgid.c (__getpgid): Likewise. * sysdeps/mach/hurd/getrusage.c (__getrusage): Likewise. * sysdeps/mach/hurd/getsockname.c (__getsockname): Likewise. * sysdeps/mach/hurd/group_member.c (__group_member): Likewise. * sysdeps/mach/hurd/isatty.c (__isatty): Likewise. * sysdeps/mach/hurd/lchown.c (__lchown): Likewise. * sysdeps/mach/hurd/link.c (__link): Likewise. * sysdeps/mach/hurd/linkat.c (linkat): Likewise. * sysdeps/mach/hurd/listen.c (__listen): Likewise. * sysdeps/mach/hurd/mkdir.c (__mkdir): Likewise. * sysdeps/mach/hurd/mkdirat.c (mkdirat): Likewise. 
* sysdeps/mach/hurd/openat.c (__openat): Likewise. * sysdeps/mach/hurd/poll.c (__poll): Likewise. * sysdeps/mach/hurd/readlink.c (__readlink): Likewise. * sysdeps/mach/hurd/readlinkat.c (readlinkat): Likewise. * sysdeps/mach/hurd/recv.c (__recv): Likewise. * sysdeps/mach/hurd/rename.c (rename): Likewise. * sysdeps/mach/hurd/renameat.c (renameat): Likewise. * sysdeps/mach/hurd/revoke.c (revoke): Likewise. * sysdeps/mach/hurd/rewinddir.c (__rewinddir): Likewise. * sysdeps/mach/hurd/rmdir.c (__rmdir): Likewise. * sysdeps/mach/hurd/seekdir.c (seekdir): Likewise. * sysdeps/mach/hurd/send.c (__send): Likewise. * sysdeps/mach/hurd/setdomain.c (setdomainname): Likewise. * sysdeps/mach/hurd/setegid.c (setegid): Likewise. * sysdeps/mach/hurd/seteuid.c (seteuid): Likewise. * sysdeps/mach/hurd/setgid.c (__setgid): Likewise. * sysdeps/mach/hurd/setgroups.c (setgroups): Likewise. * sysdeps/mach/hurd/sethostid.c (sethostid): Likewise. * sysdeps/mach/hurd/sethostname.c (sethostname): Likewise. * sysdeps/mach/hurd/setlogin.c (setlogin): Likewise. * sysdeps/mach/hurd/setpgid.c (__setpgid): Likewise. * sysdeps/mach/hurd/setregid.c (__setregid): Likewise. * sysdeps/mach/hurd/setreuid.c (__setreuid): Likewise. * sysdeps/mach/hurd/settimeofday.c (__settimeofday): Likewise. * sysdeps/mach/hurd/setuid.c (__setuid): Likewise. * sysdeps/mach/hurd/shutdown.c (shutdown): Likewise. * sysdeps/mach/hurd/sigaction.c (__sigaction): Likewise. * sysdeps/mach/hurd/sigaltstack.c (__sigaltstack): Likewise. * sysdeps/mach/hurd/sigpending.c (sigpending): Likewise. * sysdeps/mach/hurd/sigprocmask.c (__sigprocmask): Likewise. * sysdeps/mach/hurd/sigsuspend.c (__sigsuspend): Likewise. * sysdeps/mach/hurd/socket.c (__socket): Likewise. * sysdeps/mach/hurd/symlink.c (__symlink): Likewise. * sysdeps/mach/hurd/symlinkat.c (symlinkat): Likewise. * sysdeps/mach/hurd/telldir.c (telldir): Likewise. * sysdeps/mach/hurd/truncate.c (__truncate): Likewise. * sysdeps/mach/hurd/umask.c (__umask): Likewise. 
* sysdeps/mach/hurd/unlink.c (__unlink): Likewise. * sysdeps/mach/hurd/unlinkat.c (unlinkat): Likewise. * sysdeps/mips/mips64/__longjmp.c (__longjmp): Likewise. * sysdeps/posix/alarm.c (alarm): Likewise. * sysdeps/posix/cuserid.c (cuserid): Likewise. * sysdeps/posix/dirfd.c (dirfd): Likewise. * sysdeps/posix/dup.c (__dup): Likewise. * sysdeps/posix/dup2.c (__dup2): Likewise. * sysdeps/posix/euidaccess.c (euidaccess): Likewise. (main): Likewise. * sysdeps/posix/flock.c (__flock): Likewise. * sysdeps/posix/fpathconf.c (__fpathconf): Likewise. * sysdeps/posix/getcwd.c (__getcwd): Likewise. * sysdeps/posix/gethostname.c (__gethostname): Likewise. * sysdeps/posix/gettimeofday.c (__gettimeofday): Likewise. * sysdeps/posix/isatty.c (__isatty): Likewise. * sysdeps/posix/killpg.c (killpg): Likewise. * sysdeps/posix/libc_fatal.c (__libc_fatal): Likewise. * sysdeps/posix/mkfifoat.c (mkfifoat): Likewise. * sysdeps/posix/raise.c (raise): Likewise. * sysdeps/posix/remove.c (remove): Likewise. * sysdeps/posix/rename.c (rename): Likewise. * sysdeps/posix/rewinddir.c (__rewinddir): Likewise. * sysdeps/posix/seekdir.c (seekdir): Likewise. * sysdeps/posix/sigblock.c (__sigblock): Likewise. * sysdeps/posix/sigignore.c (sigignore): Likewise. * sysdeps/posix/sigintr.c (siginterrupt): Likewise. * sysdeps/posix/signal.c (__bsd_signal): Likewise. * sysdeps/posix/sigset.c (sigset): Likewise. * sysdeps/posix/sigsuspend.c (__sigsuspend): Likewise. * sysdeps/posix/sysconf.c (__sysconf): Likewise. * sysdeps/posix/sysv_signal.c (__sysv_signal): Likewise. * sysdeps/posix/time.c (time): Likewise. * sysdeps/posix/ttyname.c (getttyname): Likewise. (ttyname): Likewise. * sysdeps/posix/ttyname_r.c (__ttyname_r): Likewise. * sysdeps/posix/utime.c (utime): Likewise. * sysdeps/powerpc/fpu/s_isnan.c (__isnan): Likewise. * sysdeps/powerpc/nptl/pthread_spin_lock.c (pthread_spin_lock): Likewise. * sysdeps/powerpc/nptl/pthread_spin_trylock.c (pthread_spin_trylock): Likewise. 
* sysdeps/pthread/aio_error.c (aio_error): Likewise. * sysdeps/pthread/aio_read.c (aio_read): Likewise. * sysdeps/pthread/aio_read64.c (aio_read64): Likewise. * sysdeps/pthread/aio_write.c (aio_write): Likewise. * sysdeps/pthread/aio_write64.c (aio_write64): Likewise. * sysdeps/pthread/flockfile.c (__flockfile): Likewise. * sysdeps/pthread/ftrylockfile.c (__ftrylockfile): Likewise. * sysdeps/pthread/funlockfile.c (__funlockfile): Likewise. * sysdeps/pthread/timer_create.c (timer_create): Likewise. * sysdeps/pthread/timer_getoverr.c (timer_getoverrun): Likewise. * sysdeps/pthread/timer_gettime.c (timer_gettime): Likewise. * sysdeps/s390/ffs.c (__ffs): Likewise. * sysdeps/s390/nptl/pthread_spin_lock.c (pthread_spin_lock): Likewise. * sysdeps/s390/nptl/pthread_spin_trylock.c (pthread_spin_trylock): Likewise. * sysdeps/sh/nptl/pthread_spin_lock.c (pthread_spin_lock): Likewise. * sysdeps/sparc/nptl/pthread_barrier_destroy.c (pthread_barrier_destroy): Likewise. * sysdeps/sparc/nptl/pthread_barrier_wait.c (__pthread_barrier_wait): Likewise. * sysdeps/sparc/sparc32/e_sqrt.c (__ieee754_sqrt): Likewise. * sysdeps/sparc/sparc32/pthread_barrier_wait.c (__pthread_barrier_wait): Likewise. * sysdeps/sparc/sparc32/sem_init.c (__old_sem_init): Likewise. * sysdeps/tile/memcmp.c (memcmp_common_alignment): Likewise. (memcmp_not_common_alignment): Likewise. (MEMCMP): Likewise. * sysdeps/tile/wordcopy.c (_wordcopy_fwd_aligned): Likewise. (_wordcopy_fwd_dest_aligned): Likewise. (_wordcopy_bwd_aligned): Likewise. (_wordcopy_bwd_dest_aligned): Likewise. * sysdeps/unix/bsd/ftime.c (ftime): Likewise. * sysdeps/unix/bsd/gtty.c (gtty): Likewise. * sysdeps/unix/bsd/stty.c (stty): Likewise. * sysdeps/unix/bsd/tcflow.c (tcflow): Likewise. * sysdeps/unix/bsd/tcflush.c (tcflush): Likewise. * sysdeps/unix/bsd/tcgetattr.c (__tcgetattr): Likewise. * sysdeps/unix/bsd/tcgetpgrp.c (tcgetpgrp): Likewise. * sysdeps/unix/bsd/tcsendbrk.c (tcsendbreak): Likewise. 
* sysdeps/unix/bsd/tcsetattr.c (tcsetattr): Likewise. * sysdeps/unix/bsd/tcsetpgrp.c (tcsetpgrp): Likewise. * sysdeps/unix/bsd/ualarm.c (ualarm): Likewise. * sysdeps/unix/bsd/wait3.c (__wait3): Likewise. * sysdeps/unix/getlogin_r.c (__getlogin_r): Likewise. * sysdeps/unix/sockatmark.c (sockatmark): Likewise. * sysdeps/unix/stime.c (stime): Likewise. * sysdeps/unix/sysv/linux/_exit.c (_exit): Likewise. * sysdeps/unix/sysv/linux/aio_sigqueue.c (__aio_sigqueue): Likewise. Use internal_function. * sysdeps/unix/sysv/linux/arm/sigaction.c (__libc_sigaction): Convert to prototype-style function definition. * sysdeps/unix/sysv/linux/faccessat.c (faccessat): Likewise. * sysdeps/unix/sysv/linux/fchmodat.c (fchmodat): Likewise. * sysdeps/unix/sysv/linux/fpathconf.c (__fpathconf): Likewise. * sysdeps/unix/sysv/linux/gai_sigqueue.c (__gai_sigqueue): Likewise. Use internal_function. * sysdeps/unix/sysv/linux/gethostid.c (sethostid): Convert to prototype-style function definition * sysdeps/unix/sysv/linux/getlogin_r.c (__getlogin_r_loginuid): Likewise. (__getlogin_r): Likewise. * sysdeps/unix/sysv/linux/getpt.c (__posix_openpt): Likewise. * sysdeps/unix/sysv/linux/hppa/pthread_cond_broadcast.c (__pthread_cond_broadcast): Likewise. * sysdeps/unix/sysv/linux/hppa/pthread_cond_destroy.c (__pthread_cond_destroy): Likewise. * sysdeps/unix/sysv/linux/hppa/pthread_cond_init.c (__pthread_cond_init): Likewise. * sysdeps/unix/sysv/linux/hppa/pthread_cond_signal.c (__pthread_cond_signal): Likewise. * sysdeps/unix/sysv/linux/hppa/pthread_cond_wait.c (__pthread_cond_wait): Likewise. * sysdeps/unix/sysv/linux/i386/getmsg.c (getmsg): Likewise. * sysdeps/unix/sysv/linux/i386/setegid.c (setegid): Likewise. * sysdeps/unix/sysv/linux/ia64/sigaction.c (__libc_sigaction): Likewise. * sysdeps/unix/sysv/linux/ia64/sigpending.c (sigpending): Likewise. * sysdeps/unix/sysv/linux/ia64/sigprocmask.c (__sigprocmask): Likewise. * sysdeps/unix/sysv/linux/mips/sigaction.c (__libc_sigaction): Likewise. 
* sysdeps/unix/sysv/linux/msgget.c (msgget): Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/ftruncate64.c (__ftruncate64): Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/truncate64.c (truncate64): Likewise. * sysdeps/unix/sysv/linux/pt-raise.c (raise): Likewise. * sysdeps/unix/sysv/linux/pthread_getcpuclockid.c (pthread_getcpuclockid): Likewise. * sysdeps/unix/sysv/linux/pthread_getname.c (pthread_getname_np): Likewise. * sysdeps/unix/sysv/linux/pthread_setname.c (pthread_setname_np): Likewise. * sysdeps/unix/sysv/linux/pthread_sigmask.c (pthread_sigmask): Likewise. * sysdeps/unix/sysv/linux/pthread_sigqueue.c (pthread_sigqueue): Likewise. * sysdeps/unix/sysv/linux/raise.c (raise): Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/sigaction.c (__libc_sigaction): Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/sigpending.c (sigpending): Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/sigprocmask.c (__sigprocmask): Likewise. * sysdeps/unix/sysv/linux/semget.c (semget): Likewise. * sysdeps/unix/sysv/linux/semop.c (semop): Likewise. * sysdeps/unix/sysv/linux/setrlimit64.c (setrlimit64): Likewise. * sysdeps/unix/sysv/linux/shmat.c (shmat): Likewise. * sysdeps/unix/sysv/linux/shmdt.c (shmdt): Likewise. * sysdeps/unix/sysv/linux/shmget.c (shmget): Likewise. * sysdeps/unix/sysv/linux/sigaction.c (__libc_sigaction): Likewise. * sysdeps/unix/sysv/linux/sigpending.c (sigpending): Likewise. * sysdeps/unix/sysv/linux/sigprocmask.c (__sigprocmask): Likewise. * sysdeps/unix/sysv/linux/sigqueue.c (__sigqueue): Likewise. * sysdeps/unix/sysv/linux/sigstack.c (sigstack): Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/sigpending.c (sigpending): Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/sigprocmask.c (__sigprocmask): Likewise. * sysdeps/unix/sysv/linux/speed.c (cfgetospeed): Likewise. (cfgetispeed): Likewise. (cfsetospeed): Likewise. (cfsetispeed): Likewise. * sysdeps/unix/sysv/linux/tcflow.c (tcflow): Likewise. 
* sysdeps/unix/sysv/linux/tcflush.c (tcflush): Likewise. * sysdeps/unix/sysv/linux/tcgetattr.c (__tcgetattr): Likewise. * sysdeps/unix/sysv/linux/tcsetattr.c (tcsetattr): Likewise. * sysdeps/unix/sysv/linux/time.c (time): Likewise. * sysdeps/unix/sysv/linux/timer_create.c (timer_create): Likewise. * sysdeps/unix/sysv/linux/timer_delete.c (timer_delete): Likewise. * sysdeps/unix/sysv/linux/timer_getoverr.c (timer_getoverrun): Likewise. * sysdeps/unix/sysv/linux/timer_gettime.c (timer_gettime): Likewise. * sysdeps/unix/sysv/linux/x86_64/sigpending.c (sigpending): Likewise. * sysdeps/unix/sysv/linux/x86_64/sigprocmask.c (__sigprocmask): Likewise. * sysdeps/x86_64/backtrace.c (__backtrace): Likewise.
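For readers unfamiliar with the two styles, here is an illustrative sketch of the kind of conversion the patch performs. The function `first_set_bit` is hypothetical and is not taken from any of the files listed above. The old-style form appears only in a comment, since C23 removed old-style definitions and recent compilers may reject them.

```c
/* Old-style (K&R) definition, for comparison only:

     int
     first_set_bit (x)
          int x;
     {
       ...
     }
*/

/* Prototype-style definition of the same hypothetical function.
   Returns the 1-based index of the least significant set bit of X,
   or 0 if X is 0.  */
int
first_set_bit (int x)
{
  unsigned int u = (unsigned int) x;
  int pos = 1;

  if (u == 0)
    return 0;
  while ((u & 1u) == 0)
    {
      u >>= 1;
      ++pos;
    }
  return pos;
}
```

Only the parameter declarations move into the parentheses and the function body is unchanged, which is consistent with the note that the installed stripped shared libraries are identical before and after the patch.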
* Mark x86 _dl_unmap/_dl_make_tlsdesc_dynamic hiddenH.J. Lu2015-10-152-3/+5
| | | | | | | | | | | | | Since x86 _dl_unmap and _dl_make_tlsdesc_dynamic are only used internally in ld.so, they can be made hidden. [BZ #19122] * sysdeps/i386/dl-lookupcfg.h (_dl_unmap): Add attribute_hidden. * sysdeps/i386/dl-tlsdesc.h (_dl_make_tlsdesc_dynamic): Likewise. * sysdeps/x86_64/dl-tlsdesc.h (_dl_make_tlsdesc_dynamic): Likewise. * sysdeps/x86_64/dl-lookupcfg.h (_dl_unmap): Likewise.
* Support PLT and GOT references in local PIC checkH.J. Lu2015-10-141-7/+7
| | | | | | | | | | | | | | The linker in binutils 2.26 and newer generates GOT references instead of PLT references when -z now is passed to the linker. We need to extend scripts/localplt.awk to allow PLT or GOT references. [BZ #19007] * scripts/localplt.awk: Also allow GOT references. * sysdeps/unix/sysv/linux/i386/localplt.data: Mark _Unwind_Find_FDE, calloc, memalign, realloc and __libc_memalign with "+ REL R_386_GLOB_DAT". * sysdeps/x86_64/localplt.data: Mark calloc, memalign, realloc and __libc_memalign with "+ RELA R_X86_64_GLOB_DAT".
* Support x86-64 assembler without AVX512H.J. Lu2015-10-131-16/+24
| | | | | | | | | | | | | | | | | | When the x86-64 assembler doesn't support AVX512, we should make _dl_runtime_resolve_avx512/_dl_runtime_profile_avx512 aliases of _dl_runtime_resolve_avx/_dl_runtime_profile_avx. Tested on x86-64 using GCC 5.2 with binutils 20151008 and GCC 4.8 with binutils 20130219. There are no differences in ld.so with binutils 20151008. There are no unexpected failures with binutils 20130219 and 20151008. [BZ #19124] * sysdeps/x86_64/dl-trampoline.S [!HAVE_AVX512_ASM_SUPPORT] (_dl_runtime_resolve_avx512): Make it a hidden alias of _dl_runtime_resolve_avx. (_dl_runtime_profile_avx512): Make it a hidden alias of _dl_runtime_profile_avx.
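The real change is made in assembly in dl-trampoline.S. The C sketch below only illustrates the idea of a hidden alias, using the GCC `alias` and `visibility` attributes on an ELF target. The functions `resolve_avx` and `resolve_avx512` are hypothetical stand-ins, not the actual trampolines.

```c
/* Hypothetical stand-in for _dl_runtime_resolve_avx.  */
int
resolve_avx (int x)
{
  return x + 1;
}

#ifndef HAVE_AVX512_ASM_SUPPORT
/* Without AVX512 assembler support, make the AVX512 name a hidden alias
   of the AVX implementation, so callers of either name run the same
   code.  With support, a separate definition would go in an #else
   branch (not shown).  */
extern __typeof (resolve_avx) resolve_avx512
  __attribute__ ((alias ("resolve_avx"), visibility ("hidden")));
#endif
```

With `HAVE_AVX512_ASM_SUPPORT` undefined, `resolve_avx512 (41)` and `resolve_avx (41)` return the same value because both names refer to one function.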
* Update lrint/lrintf/lrintl for x32H.J. Lu2015-10-096-1/+90
| | | | | | | | | | | | | | | | | | | | | | The x86_64 versions of lrint/lrintf/lrintl are aliases for the long long versions, which isn't correct for x32, where exceptions must respect overflow for 32-bit long. Add separate versions of the long functions for x32 that convert to 32-bit long and raise the right exceptions for that conversion, while keeping the aliases in the non-x32 case. Tested on x86_64 and x32. There are no code changes in libm.so on x86_64. * sysdeps/x86_64/fpu/s_llrint.S (__lrint): Add alias only if __ILP32__ isn't defined. (lrint): Likewise. * sysdeps/x86_64/fpu/s_llrintf.S (__lrintf): Likewise. (lrintf): Likewise. * sysdeps/x86_64/fpu/s_llrintl.S (__lrintl): Likewise. (lrintl): Likewise. * sysdeps/x86_64/x32/fpu/s_lrint.S: New file. * sysdeps/x86_64/x32/fpu/s_lrintf.S: Likewise. * sysdeps/x86_64/x32/fpu/s_lrintl.S: Likewise.
* Remove configure tests for -mno-vzeroupper support.Joseph Myers2015-10-093-34/+1
| | | | | | | | | | | | | | | | | | GCC added support for -mno-vzeroupper in version 4.6. Thus the configure tests for this support are obsolete, and this patch removes them. Tested for x86_64 and x86 (testsuite, and that installed stripped shared libraries are unchanged by this patch). * sysdeps/i386/configure.ac (libc_cv_cc_novzeroupper): Remove configure test. * sysdeps/i386/configure: Regenerated. * sysdeps/x86_64/configure.ac (libc_cv_cc_novzeroupper): Remove configure test. * sysdeps/x86_64/configure: Regenerated. * sysdeps/x86_64/Makefile [$(config-cflags-novzeroupper) = yes]: Make code unconditional.