about summary refs log tree commit diff
path: root/sysdeps
Commit message (Collapse)AuthorAgeFilesLines
* Combination of data tables for x86_64 vector functions sinf, cosf and sincosf.Andrew Senkevich2015-06-2418-3586/+197
| | | | | | | | | | | | | | | | | | | | | | * sysdeps/x86_64/fpu/Makefile (libmvec-support): Fixed files list. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf4_core_sse4.S: Renamed variable and included header. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf8_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf4_core_sse4.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf8_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf4_core_sse4.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/svml_s_trig_data.S: New file. * sysdeps/x86_64/fpu/svml_s_trig_data.h: Likewise. * sysdeps/x86_64/fpu/svml_s_cosf_data.S: Removed file. * sysdeps/x86_64/fpu/svml_s_cosf_data.h: Likewise. * sysdeps/x86_64/fpu/svml_s_sinf_data.S: Likewise. * sysdeps/x86_64/fpu/svml_s_sinf_data.h: Likewise. * sysdeps/x86_64/fpu/svml_s_sincosf_data.S: Likewise. * sysdeps/x86_64/fpu/svml_s_sincosf_data.h: Likewise.
* Fix sin, sincos missing underflows (bug 16526, bug 16538).Joseph Myers2015-06-238-18/+79
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Similar to various other bugs in this area, some sin and sincos implementations do not raise the underflow exception for subnormal arguments, when the result is tiny and inexact. This patch forces the exception in a similar way to previous fixes. Tested for x86_64, x86, mips64 and powerpc. [BZ #16526] [BZ #16538] * sysdeps/ieee754/dbl-64/s_sin.c: Include <float.h>. (__sin): Force underflow exception for arguments with small absolute value. * sysdeps/ieee754/flt-32/k_sinf.c: Include <float.h>. (__kernel_sinf): Force underflow exception for arguments with small absolute value. * sysdeps/ieee754/ldbl-128/k_sincosl.c: Include <float.h>. (__kernel_sincosl): Force underflow exception for arguments with small absolute value. * sysdeps/ieee754/ldbl-128/k_sinl.c: Include <float.h>. (__kernel_sinl): Force underflow exception for arguments with small absolute value. * sysdeps/ieee754/ldbl-128ibm/k_sincosl.c: Include <float.h>. (__kernel_sincosl): Force underflow exception for arguments with small absolute value. * sysdeps/ieee754/ldbl-128ibm/k_sinl.c: Include <float.h>. (__kernel_sinl): Force underflow exception for arguments with small absolute value. * sysdeps/ieee754/ldbl-96/k_sinl.c: Include <float.h>. (__kernel_sinl): Force underflow exception for arguments with small absolute value. * sysdeps/powerpc/fpu/k_sinf.c: Include <float.h>. (__kernel_sinf): Force underflow exception for arguments with small absolute value. * math/auto-libm-test-in: Add more tests of sin and sincos. * math/auto-libm-test-out: Regenerated.
* Fix spurious "inexact" exceptions from __kernel_standard_l (bug 18245, bug ↵Joseph Myers2015-06-231-24/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | 18583). __kernel_standard_l converts long double arguments to double for use in SVID "struct exception". This has special-case handling for when that conversion would overflow or underflow but the original long double function wouldn't. However, it turns out that "inexact" exceptions can be spurious here as well, when the function is exactly determined and __kernel_standard_l is being called for a domain error. This patch fixes this by using feholdexcept / fesetenv to avoid exceptions from the conversion, replacing the previous special-case logic for overflow and underflow (this covers all functions using __kernel_standard_l, not just those that actually need a change, since there doesn't seem to be much point in restricting things just to the functions that mustn't get "inexact" here). Tested for x86_64 and x86. [BZ #18245] [BZ #18583] * sysdeps/ieee754/k_standardl.c: Include <fenv.h>. (__kernel_standard_l): Use feholdexcept and fesetenv around conversion to double instead of special-casing overflow and underflow. * math/libm-test.inc (fmod_test_data): Add more tests. (remainder_test_data): Likewise. (sqrt_test_data): Likewise.
* Fix atomic_full_barrier on x86 and x86_64.Torvald Riegel2015-06-232-0/+14
| | | | | | | | | This fixes BZ #17403 by defining atomic_full_barrier, atomic_read_barrier, and atomic_write_barrier on x86 and x86_64. A full barrier is implemented through an atomic idempotent modification to the stack and not through using mfence because the latter can supposedly be somewhat slower due to having to provide stronger guarantees wrt. self-modifying code, for example.
* Combination of data tables for x86_64 vector functions sin, cos and sincos.Andrew Senkevich2015-06-2320-439/+173
| | | | | | | | | | | | | | | | | | | | | | | | | * sysdeps/x86_64/fpu/Makefile (libmvec-support): Fixed files list. * sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core_sse4.S: Renamed variable and included header. * sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core_sse4.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core_sse4.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/svml_d_trig_data.S: New file. * sysdeps/x86_64/fpu/svml_d_trig_data.h: Likewise. * sysdeps/x86_64/fpu/svml_d_cos2_core.S: Removed unneeded include. * sysdeps/x86_64/fpu/svml_d_cos4_core.S: Likewise. * sysdeps/x86_64/fpu/svml_d_cos8_core.S: Likewise. * sysdeps/x86_64/fpu/svml_d_cos_data.S: Removed file. * sysdeps/x86_64/fpu/svml_d_cos_data.h: Likewise. * sysdeps/x86_64/fpu/svml_d_sin_data.S: Likewise. * sysdeps/x86_64/fpu/svml_d_sin_data.h: Likewise. * sysdeps/x86_64/fpu/svml_d_sincos_data.S: Likewise. * sysdeps/x86_64/fpu/svml_d_sincos_data.h: Likewise.
* Fix csqrt spurious underflows (bug 18371).Joseph Myers2015-06-231-0/+2
| | | | | | | | | | | | | | | | | | | | | | The csqrt implementations in glibc can cause spurious underflows in some cases as a side-effect of the scaling for large arguments (when underflow is correct for the square root of the argument that was scaled down to avoid overflow, but not for the original argument). This patch arranges to avoid the underflowing intermediate computation (eliminating a multiplication in 0.5 in the problem cases where a subsequent scaling by 2 would follow). Tested for x86_64 and x86 and ulps updated accordingly (only needed for x86). [BZ #18371] * math/s_csqrt.c (__csqrt): Avoid multiplication by 0.5 where intermediate but not final result might underflow. * math/s_csqrtf.c (__csqrtf): Likewise. * math/s_csqrtl.c (__csqrtl): Likewise. * math/auto-libm-test-in: Add more tests of csqrt. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update.
* Fix exp2, exp2f spurious underflows (bug 18219).Joseph Myers2015-06-232-2/+6
| | | | | | | | | | | | | | | | | | | | | The dbl-64 and flt-32 implementations of exp2 functions produce spurious underflow exceptions. The underlying reason is the same in both cases: the computation works as (2^a - 1)*2^b + 2^b for suitably chosen a and b, where a has small magnitude so 2^a - 1 can be computed with a low-degree polynomial approximation, and (2^a - 1)*2^b can underflow even when the final result does not. This patch fixes this by adjusting the threshold for when scaling is used to avoid intermediate underflow so it works for any possible value of a where the final result would not underflow. Tested for x86_64 and x86. [BZ #18219] * sysdeps/ieee754/dbl-64/e_exp2.c (__ieee754_exp2): Reduce threshold on absolute value of exponent for which scaling is used. * sysdeps/ieee754/flt-32/e_exp2f.c (__ieee754_exp2f): Likewise. * math/auto-libm-test-in: Add more tests of exp2. * math/auto-libm-test-out: Regenerated.
* Fix expm1 missing underflows (bug 16353).Joseph Myers2015-06-225-1/+58
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Similar to various other bugs in this area, some expm1 implementations do not raise the underflow exception for subnormal arguments, when the result is tiny and inexact. This patch forces the exception in a similar way to previous fixes. (The issue does not apply to the ldbl-* implementations or to those for x86 / x86_64 long double. The change to sysdeps/ieee754/dbl-64/wordsize-64/e_cosh.c is one I missed when previously fixing bug 16354; the bug in that implementation was previously latent, but the expm1 fixes stopped it being latent and so required it to be fixed to avoid spurious underflows from cosh.) Tested for x86_64 and x86. [BZ #16353] * sysdeps/i386/fpu/s_expm1.S (dbl_min): New object. (__expm1): Force underflow exception for arguments with small absolute value. * sysdeps/i386/fpu/s_expm1f.S (flt_min): New object. (__expm1f): Force underflow exception for arguments with small absolute value. * sysdeps/ieee754/dbl-64/s_expm1.c: Include <float.h>. (__expm1): Force underflow exception for arguments with small absolute value. * sysdeps/ieee754/flt-32/s_expm1f.c: Include <float.h>. (__expm1f): Force underflow exception for arguments with small absolute value. * sysdeps/ieee754/dbl-64/wordsize-64/e_cosh.c (__ieee754_cosh): Check for small arguments before calling __expm1. * math/auto-libm-test-in: Do not mark underflow exceptions as possibly missing for bug 16353. * math/auto-libm-test-out: Regenerated.
* Fix x86_64 / x86 expm1l (-min_subnorm) result sign (bug 18569).Joseph Myers2015-06-212-0/+12
| | | | | | | | | | | | | | | | | | | | In the x86 / x86_64 implementations of expm1l, when expm1l's result should underflow to 0 (argument minus the least subnormal, in some rounding modes), it can be a zero of the wrong sign. This patch fixes this by returning the argument with underflow forced in that case (this is a 1ulp error relative to the correctly rounded result of -0, which is OK in terms of the documented accuracy goals, whereas a result with the wrong sign never is). Tested for x86_64 and x86. [BZ #18569] * sysdeps/i386/fpu/e_expl.S (IEEE754_EXPL) [USE_AS_EXPM1L]: Force underflow and return argument in case of subnormal argument. * sysdeps/x86_64/fpu/e_expl.S (IEEE754_EXPL) [USE_AS_EXPM1L]: Likewise. * math/auto-libm-test-in: Add more tests of expm1. * math/auto-libm-test-out: Regenerated.
* Fix x86 / x86_64 expl, exp10l missing underflows (bug 16361).Joseph Myers2015-06-212-2/+29
| | | | | | | | | | | | | | | | | | | | | Similar to various other bugs in this area, the x86 and x86_64 implementations of expl / exp10l can fail to produce underflow exceptions when the unscaled result has trailing 0 bits so the scaling down to subnormal precision is exact. This patch fixes this by forcing the exception in the case of tiny results. Tested for x86_64 and x86. [BZ #16361] * sysdeps/i386/fpu/e_expl.S [!USE_AS_EXPM1L] (cmin): New object. [!USE_AS_EXPM1L] (IEEE754_EXPL): Force underflow exception for tiny results. * sysdeps/x86_64/fpu/e_expl.S [!USE_AS_EXPM1L] (cmin): New object. [!USE_AS_EXPM1L] (IEEE754_EXPL): Force underflow exception for tiny results. * math/auto-libm-test-in: Add more tests of exp and exp10. Do not mark underflow exceptions as possibly missing for bug 16361. * math/auto-libm-test-out: Regenerated.
* Fixed powerpc64 build.Andrew Senkevich2015-06-191-0/+4
| | | | | | * sysdeps/ieee754/ldbl-opt/s_sin.c (__DECL_SIMD_sincos_disable, __DECL_SIMD_sincos_disablef, __DECL_SIMD_sincos_disablel): Added empty definitions for proper unfolding of __MATHDECL_VEC.
* S/390: Regenerate ULPsStefan Liebler2015-06-191-214/+266
| | | | | | | | Regenerated ULPs after recent math test changes. ChangeLog: * sysdeps/s390/fpu/libm-test-ulps: Regenerated.
* Fix asinh missing underflows (bug 16350).Joseph Myers2015-06-188-3/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Similar to various other bugs in this area, some asinh implementations do not raise the underflow exception for subnormal arguments, when the result is tiny and inexact. This patch forces the exception in a similar way to previous fixes. Tested for x86_64, x86 and mips64. [BZ #16350] * sysdeps/i386/fpu/s_asinh.S (__asinh): Force underflow exception for arguments with small absolute value. * sysdeps/i386/fpu/s_asinhf.S (__asinhf): Likewise. * sysdeps/i386/fpu/s_asinhl.S (__asinhl): Likewise. * sysdeps/ieee754/dbl-64/s_asinh.c: Include <float.h>. (__asinh): Force underflow exception for arguments with small absolute value. * sysdeps/ieee754/flt-32/s_asinhf.c: Include <float.h>. (__asinhf): Force underflow exception for arguments with small absolute value. * sysdeps/ieee754/ldbl-128/s_asinhl.c: Include <float.h>. (__asinhl): Force underflow exception for arguments with small absolute value. * sysdeps/ieee754/ldbl-128ibm/s_asinhl.c: Include <float.h>. (__asinhl): Force underflow exception for arguments with small absolute value. * sysdeps/ieee754/ldbl-96/s_asinhl.c: Include <float.h>. (__asinhl): Force underflow exception for arguments with small absolute value. * math/auto-libm-test-in: Do not mark underflow exceptions as possibly missing for bug 16350. * math/auto-libm-test-out: Regenerated.
* Fix netinet/in.h MCAST_* namespace (bug 18558).Joseph Myers2015-06-181-7/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sysdeps/unix/sysv/linux/bits/in.h (as included in netinet/in.h, and via that in netdb.h and arpa/inet.h) defines a series of MCAST_* macros, both under __USE_MISC and then again unconditionally. These are not POSIX macros, nor in any of the namespaces listed in POSIX as reserved for this header, so should not be defined unconditionally. This patch duly removes the unconditional definitions, leaving the ones conditional on __USE_MISC. Tested for x86_64 and x86 (testsuite, and that installed stripped shared libraries are unchanged by the patch). [BZ #18558] * sysdeps/unix/sysv/linux/bits/in.h (MCAST_JOIN_GROUP): Remove unconditional definition. (MCAST_BLOCK_SOURCE): Likewise. (MCAST_UNBLOCK_SOURCE): Likewise. (MCAST_LEAVE_GROUP): Likewise. (MCAST_JOIN_SOURCE_GROUP): Likewise. (MCAST_LEAVE_SOURCE_GROUP): Likewise. (MCAST_MSFILTER): Likewise. * conform/Makefile (test-xfail-XOPEN2K/arpa/inet.h/conform): Remove variable. (test-xfail-XOPEN2K/netdb.h/conform): Likewise. (test-xfail-XOPEN2K/netinet/in.h/conform): Likewise. (test-xfail-XOPEN2K8/arpa/inet.h/conform): Likewise. (test-xfail-XOPEN2K8/netdb.h/conform): Likewise. (test-xfail-XOPEN2K8/netinet/in.h/conform): Likewise.
* Vector sincosf for x86_64 and tests.Andrew Senkevich2015-06-1827-41/+2620
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here is implementation of vectorized sincosf containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * NEWS: Mention addition of x86_64 vector sincosf. * math/test-float-vlen16.h: Added wrapper for sincosf tests. * math/test-float-vlen4.h: Likewise. * math/test-float-vlen8.h: Likewise. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added. * sysdeps/x86/fpu/bits/math-vector.h: Added sincosf SIMD declaration. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core.S * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core_avx512.S * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf4_core.S * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf4_core_sse4.S * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core.S * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core_avx2.S * sysdeps/x86_64/fpu/svml_s_sincosf16_core.S * sysdeps/x86_64/fpu/svml_s_sincosf4_core.S * sysdeps/x86_64/fpu/svml_s_sincosf8_core.S * sysdeps/x86_64/fpu/svml_s_sincosf8_core_avx.S * sysdeps/x86_64/fpu/svml_s_sincosf_data.S: New file. * sysdeps/x86_64/fpu/svml_s_sincosf_data.h: New file. * sysdeps/x86_64/fpu/svml_s_wrapper_impl.h: Added 3 argument wrappers. * sysdeps/x86_64/fpu/test-float-vlen16.c: : Vector sincosf tests. * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise.
* Vector sincos for x86_64 and tests.Andrew Senkevich2015-06-1827-1/+1776
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here is implementation of vectorized sincos containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * NEWS: Mention addition of x86_64 vector sincos. * bits/libm-simd-decl-stubs.h: Added stubs for sincos. * math/math.h (__MATHDECL_VEC): New macro. * math/bits/mathcalls.h: Added sincos declaration with __MATHDECL_VEC. * math/gen-libm-have-vector-test.sh: Added generation of sincos wrapper declaration under condition. * math/test-vec-loop.h (TEST_VEC_LOOP): Refactored. * math/test-double-vlen2.h: Added wrapper for sincos tests, reflected TEST_VEC_LOOP change. * math/test-double-vlen4.h: Likewise. * math/test-double-vlen8.h: Likewise. * math/test-float-vlen16.h: Reflected TEST_VEC_LOOP change. * math/test-float-vlen4.h: Likewise. * math/test-float-vlen8.h: Likewise. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added. * sysdeps/x86/fpu/bits/math-vector.h: Added sincos SIMD declaration. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core_avx2.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core_avx512.S: New file. * sysdeps/x86_64/fpu/svml_d_sincos2_core.S: New file. * sysdeps/x86_64/fpu/svml_d_sincos4_core.S: New file. * sysdeps/x86_64/fpu/svml_d_sincos4_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_d_sincos8_core.S: New file. * sysdeps/x86_64/fpu/svml_d_sincos_data.S: New file. * sysdeps/x86_64/fpu/svml_d_sincos_data.h: New file. * sysdeps/x86_64/fpu/svml_d_wrapper_impl.h: Added wrappers for sincos. * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Vector sincos tests. * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise.
* Vector powf for x86_64 and tests.Andrew Senkevich2015-06-1827-2/+5596
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here is implementation of vectorized powf containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm redirections for powf. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/svml_s_wrapper_impl.h: Added 2 argument wrappers. * sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core_avx512.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_powf4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_powf4_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_powf8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_powf8_core_avx2.S: New file. * sysdeps/x86_64/fpu/svml_s_powf16_core.S: New file. * sysdeps/x86_64/fpu/svml_s_powf4_core.S: New file. * sysdeps/x86_64/fpu/svml_s_powf8_core.S: New file. * sysdeps/x86_64/fpu/svml_s_powf8_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_s_powf_data.S: New file. * sysdeps/x86_64/fpu/svml_s_powf_data.h: New file. * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Vector powf tests. * sysdeps/x86_64/fpu/test-float-vlen16.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise. * math/test-float-vlen16.h: Fixed 2 argument macro. * math/test-float-vlen4.h: Likewise. * math/test-float-vlen8.h: Likewise. * NEWS: Mention addition of x86_64 vector powf.
* Remove ldbl-128ibm variants of complex math functions.Joseph Myers2015-06-173-281/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sysdeps/ieee754/ldbl-128ibm has its own versions of cprojl, ctanhl and ctanl. Having its own versions, where otherwise the math/ copies are generally used for all floating-point formats, means they are liable to get out of sync and not benefit from bug fixes to the generic versions. The substantive differences (not arising from getting out of sync and slightly different fixes for the same issues) are: long double compat handling (also done in the ldbl-opt versions, so doesn't require special versions for ldbl-128ibm); handling of LDBL_EPSILON (conditionally undefined and redefined in other math/ implementations, so doesn't justify a special version), and: /* __gcc_qmul does not respect -0.0 so we need the following fixup. */ if ((__real__ res == 0.0L) && (__real__ x == 0.0L)) __real__ res = __real__ x; if ((__real__ res == 0.0L) && (__imag__ x == 0.0L)) __imag__ res = __imag__ x; But if that statement about __gcc_qmul was ever true for an old version of that libgcc function, it's not the case for any GCC version now supported to build glibc; there's explicit logic early in that function (and similarly in __gcc_qdiv) to return an appropriately signed zero if the product of the high parts is zero. So this patch adds the special LDBL_EPSILON handling to the generic functions and removes the ldbl-128ibm versions. Tested for powerpc32 (compared test-ldouble.out before and after the changes; there are slight changes to results for ctanl / ctanhl, arising from divergence of the implementations, but nothing that affects the overall nature of the issues shown by the testsuite, and in particular nothing related to signs of zero resutls). * math/s_ctanhl.c [LDBL_MANT_DIG == 106] (LDBL_EPSILON): Undefine and redefine. * math/s_ctanl.c [LDBL_MANT_DIG == 106] (LDBL_EPSILON): Undefine and redefine. * sysdeps/ieee754/ldbl-128ibm/s_cprojl.c: Remove file. * sysdeps/ieee754/ldbl-128ibm/s_ctanhl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_ctanl.c: Likewise.
* Fix nice getpriority, setpriority namespace (bug 18553).Joseph Myers2015-06-175-11/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | nice (XPG3) calls getpriority and setpriority (in XPG4 but not XPG3, i.e. UX-shaded in XPG4). This patch fixes this by making those functions into weak aliases of __* functions and calling the __* versions as needed. Tested for x86_64 and x86 (testsuite, and that disassembly of installed shared libraries is unchanged by this patch). This completes cleaning up the unsorted linknamespace test XFAILs. [BZ #18553] * resource/getpriority.c (getpriority): Rename to __getpriority and define as weak alias of __getpriority. * resource/setpriority.c (setpriority): Rename to __setpriority and define as weak alias of __setpriority. * sysdeps/mach/hurd/getpriority.c (getpriority): Rename to __getpriority and define as weak alias of __getpriority. * sysdeps/mach/hurd/setpriority.c (setpriority): Rename to __setpriority and define as weak alias of __setpriority. * sysdeps/unix/syscalls.list (getpriority): Use __getpriority as strong name. (setpriority): Use __setpriority as strong name. * sysdeps/unix/sysv/linux/getpriority.c (getpriority): Rename to __getpriority and define as weak alias of __getpriority. * include/sys/resource.h (__getpriority): Declare. Use libc_hidden_proto. (__setpriority): Likewise. (getpriority): Don't use libc_hidden_proto. (setpriority): Likewise. * sysdeps/posix/nice.c (nice): Call __getpriority instead of getpriority. Call __setpriority instead of setpriority. * conform/Makefile (test-xfail-XPG3/unistd.h/linknamespace): Remove variable.
* Fix mq_notify socket, recv namespace (bug 18546).Joseph Myers2015-06-177-3/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mq_notify (in the 1996 edition of POSIX) brings in references to recv and socket (not in POSIX until the 2001 edition). This patch fixes this by using __recv and __socket, exporting them from libc at version GLIBC_PRIVATE. Tested for x86_64 and x86 (testsuite and comparison of installed stripped shared libraries; PLT / dynamic symbol table changes render the comparison not particularly useful for libc). [BZ #18546] * socket/recv.c (__recv): Use libc_hidden_def. * socket/socket.c (__socket): Likewise. * sysdeps/mach/hurd/recv.c (__recv): Likewise. * sysdeps/mach/hurd/socket.c (__socket): Likewise. * sysdeps/unix/sysv/linux/generic/recv.c (__recv): Likewise. * sysdeps/unix/sysv/linux/recv.c (__recv): Use libc_hidden_weak. * sysdeps/unix/sysv/linux/socket.c (__socket): Use libc_hidden_def. * sysdeps/unix/sysv/linux/x86_64/recv.c (__recv): Use libc_hidden_weak. * include/sys/socket.h (__socket): Do not use attribute_hidden. Use libc_hidden_proto. (__recv): Likewise. * socket/Versions (libc): Export __recv and __socket at version GLIBC_PRIVATE. * sysdeps/unix/sysv/linux/mq_notify.c (helper_thread): Call __recv instead of recv. (init_mq_netlink): Call __socket instead of socket. * conform/Makefile (test-xfail-POSIX/mqueue.h/linknamespace): Remove variable.
* Fix mq_receive, mq_send mq_timed* namespace (bug 18545).Joseph Myers2015-06-173-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | mq_receive calls mq_timedreceive, and mq_send calls mq_timedsend. But mq_receive and mq_send were in POSIX by 1996, while mq_timed* were added in the 2001 edition of POSIX. This patch fixes this by making mq_timed* into weak aliases for __mq_timed* and calling the __mq_timed* names. Tested for x86_64 and x86 (testsuite, and that disassembly of installed shared libraries is unchanged by the patch). [BZ #18545] * rt/mq_timedreceive.c (mq_timedreceive): Rename to __mq_timedreceive and define as alias of __mq_timedreceive. Use hidden_weak. * rt/mq_timedsend.c (mq_timedsend): Rename to __mq_timedsend and define as alias of __mq_timedsend. Use hidden_weak. * sysdeps/unix/sysv/linux/syscalls.list (mq_timedsend): Use __mq_timedsend as strong name. (mq_timedreceive): Use __mq_timedreceive as strong name. * include/mqueue.h (__mq_timedsend): Declare. Use hidden_proto. (__mq_timedreceive): Likewise. * sysdeps/unix/sysv/linux/mq_receive.c (mq_receive): Call __mq_timedreceive instead of mq_timedreceive. * sysdeps/unix/sysv/linux/mq_send.c (mq_send): Call __mq_timedsend instead of mq_timedsend. * conform/Makefile (test-xfail-UNIX98/mqueue.h/linknamespace): Remove variable.
* Create hidden aliases for non-libc syscalls automatically.Joseph Myers2015-06-173-7/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The syscall wrappers mechanism automatically creates hidden aliases for syscalls with libc_hidden_def / libc_hidden_weak. The use of libc_hidden_* has the side-effect that for syscall wrappers in non-libc libraries those aliases are not created. In turn, this means that three mq_* syscalls in sysdeps/unix/sysv/linux/syscalls.list list the __GI_* names explicitly. The use of libc_hidden_* dates back to the original introduction of that support in 2002-08-03 Roland McGrath <roland@redhat.com> * sysdeps/unix/make-syscalls.sh: Generate libc_hidden_def or libc_hidden_weak for every system call symbol defined. (predating the non-libc syscalls in question) and I see no reason for excluding non-libc syscalls. This patch changes the code to use hidden_def / hidden_weak (via a wrapper syscall_hidden_def in the case where the argument is itself a macro, so that the argument gets expanded before concatenation with __GI_), so avoiding the need to specify the hidden aliases explicitly in this case. Tested for x86_64 and x86 (testsuite, and that disassembly of installed stripped shared libraries is unchanged by the patch; the mq_* symbols change from weak to strong, which is of no significance and two of them will shortly change back to weak as part of a fix for bug 18545). * sysdeps/unix/make-syscalls.sh (emit_weak_aliases): Use hidden_def and hidden_weak instead of libc_hidden_def and libc_hidden_weak. (top level): Refer to hidden_def in comment. * sysdeps/unix/syscall-template.S (syscall_hidden_def): New macro. Use it instead of libc_hidden_def. * sysdeps/unix/sysv/linux/syscalls.list (mq_timedsend): Do not specify __GI_* name explicitly. (mq_timedreceive): Likewise. (mq_setattr): Likewise.
* Fix mq_notify pthread_barrier_* namespace (bug 18544).Joseph Myers2015-06-176-14/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mq_notify (present in POSIX by 1996) brings in references to pthread_barrier_init and pthread_barrier_wait (new in the 2001 edition of POSIX). This patch fixes this by making those functions into weak aliases of __pthread_barrier_*, exporting the __pthread_barrier_* names at version GLIBC_PRIVATE and using them in mq_notify. Tested for x86_64 and x86 (testsuite, and comparison of installed stripped shared libraries). Changes in addresses from dynamic symbol table / PLT changes render most comparisons not particularly useful, but when the addresses of subsequent code don't change there's no sign of unexpected changes there. This patch does not remove any linknamespace XFAILs because of other namespace issues remaining with mqueue.h functions. [BZ #18544] * nptl/pthread_barrier_init.c (pthread_barrier_init): Rename to __pthread_barrier_init and define as weak alias of __pthread_barrier_init. * sysdeps/sparc/nptl/pthread_barrier_init.c (pthread_barrier_init): Likewise. * nptl/pthread_barrier_wait.c (pthread_barrier_wait): Rename to __pthread_barrier_wait and define as weak alias of __pthread_barrier_wait. * sysdeps/sparc/nptl/pthread_barrier_wait.c (pthread_barrier_wait): Likewise. * sysdeps/sparc/sparc32/pthread_barrier_wait.c (pthread_barrier_wait): Likewise. * sysdeps/unix/sysv/linux/i386/i486/pthread_barrier_wait.S (pthread_barrier_wait): Likewise. * sysdeps/unix/sysv/linux/x86_64/pthread_barrier_wait.S (pthread_barrier_wait): Likewise. * nptl/Versions (libpthread): Export __pthread_barrier_init and __pthread_barrier_wait at version GLIBC_PRIVATE. * include/pthread.h (__pthread_barrier_init): Declare. (__pthread_barrier_wait): Likewise. * sysdeps/unix/sysv/linux/mq_notify.c (notification_function): Call __pthread_barrier_wait instead of pthread_barrier_wait. (helper_thread): Likewise. (init_mq_netlink): Call __pthread_barrier_init instead of pthread_barrier_init.
* Fix sem_* tdelete, tfind, tsearch, twalk namespace (bug 18536).Joseph Myers2015-06-171-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The sem_* functions bring in references to tdelete, tfind, tsearch and twalk. But the t* functions are XSI-shaded, while sem_* aren't. This patch fixes this by using __t* instead, exporting those functions from libc at version GLIBC_PRIVATE (since sem_* are in libpthread) and using libc_hidden_* for the benefit of calls within libc. Tested for x86_64 and x86 (testsuite, and comparison of disassembly of installed stripped shared libraries). libpthread gets changes from PLT reordering; addresses in libc change because of PLT / dynamic symbol table changes. [BZ #18536] * misc/tsearch.c (__tsearch): Use libc_hidden_def. (__tfind): Likewise. (__tdelete): Likewise. (__twalk): Likewise. * misc/Versions (libc): Add __tdelete, __tfind, __tsearch and __twalk to GLIBC_PRIVATE. * include/search.h (__tsearch): Use libc_hidden_proto. (__tfind): Likewise. (__tdelete): Likewise. (__twalk): Likewise. * nptl/sem_close.c (sem_close): Call __twalk instead of twalk. Call __tdelete instead of tdelete. * nptl/sem_open.c (check_add_mapping): Call __tfind instead of tfind. Call __tsearch instead of tsearch. * sysdeps/sparc/sparc32/sem_open.c (check_add_mapping): Likewise. * conform/Makefile (test-xfail-POSIX/semaphore.h/linknamespace): Remove variable. (test-xfail-POSIX2008/semaphore.h/linknamespace): Likewise.
* Vector pow for x86_64 and tests.Andrew Senkevich2015-06-1727-2/+6896
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here is implementation of vectorized pow containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * bits/libm-simd-decl-stubs.h: Added stubs for pow. * math/bits/mathcalls.h: Added pow declaration with __MATHCALL_VEC. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm redirections for pow. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/svml_d_wrapper_impl.h: Added 2 argument wrappers. * sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core_avx2.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core_avx512.S: New file. * sysdeps/x86_64/fpu/svml_d_pow2_core.S: New file. * sysdeps/x86_64/fpu/svml_d_pow4_core.S: New file. * sysdeps/x86_64/fpu/svml_d_pow4_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_d_pow8_core.S: New file. * sysdeps/x86_64/fpu/svml_d_pow_data.S: New file. * sysdeps/x86_64/fpu/svml_d_pow_data.h: New file. * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Added vector pow test. * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise. * NEWS: Mention addition of x86_64 vector pow.
* Vector expf for x86_64 and tests.Andrew Senkevich2015-06-1726-1/+1224
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here is implementation of vectorized expf containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm redirections for expf. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core_avx512.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_expf4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_expf4_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_expf8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_expf8_core_avx2.S: New file. * sysdeps/x86_64/fpu/svml_s_expf16_core.S: New file. * sysdeps/x86_64/fpu/svml_s_expf4_core.S: New file. * sysdeps/x86_64/fpu/svml_s_expf8_core.S: New file. * sysdeps/x86_64/fpu/svml_s_expf8_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_s_expf_data.S: New file. * sysdeps/x86_64/fpu/svml_s_expf_data.h: New file. * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Vector expf tests. * sysdeps/x86_64/fpu/test-float-vlen16.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise. * NEWS: Mention addition of x86_64 vector expf.
* Vector exp for x86_64 and tests.Andrew Senkevich2015-06-1726-2/+2291
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here is implementation of vectorized exp containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * bits/libm-simd-decl-stubs.h: Added stubs for exp. * math/bits/mathcalls.h: Added exp declaration with __MATHCALL_VEC. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm redirections for exp. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core_avx2.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core_avx512.S: New file. * sysdeps/x86_64/fpu/svml_d_exp2_core.S: New file. * sysdeps/x86_64/fpu/svml_d_exp4_core.S: New file. * sysdeps/x86_64/fpu/svml_d_exp4_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_d_exp8_core.S: New file. * sysdeps/x86_64/fpu/svml_d_exp_data.S: New file. * sysdeps/x86_64/fpu/svml_d_exp_data.h: New file. * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Added vector exp test. * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise. * NEWS: Mention addition of x86_64 vector exp.
* Vector logf for x86_64 and tests.Andrew Senkevich2015-06-1726-2/+1201
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here is implementation of vectorized logf containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm redirections for logf. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_s_logf16_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_logf16_core_avx512.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_logf4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_logf4_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_logf8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_logf8_core_avx2.S: New file. * sysdeps/x86_64/fpu/svml_s_logf16_core.S: New file. * sysdeps/x86_64/fpu/svml_s_logf4_core.S: New file. * sysdeps/x86_64/fpu/svml_s_logf8_core.S: New file. * sysdeps/x86_64/fpu/svml_s_logf8_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_s_logf_data.S: New file. * sysdeps/x86_64/fpu/svml_s_logf_data.h: New file. * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Vector logf tests. * sysdeps/x86_64/fpu/test-float-vlen16.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise. * NEWS: Mention addition of x86_64 vector logf.
* Vector log for x86_64 and tests.Andrew Senkevich2015-06-1726-0/+2887
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here is implementation of vectorized log containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * bits/libm-simd-decl-stubs.h: Added stubs for log. * math/bits/mathcalls.h: Added log declaration with __MATHCALL_VEC. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm redirections for log. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_d_log2_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_log2_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_log4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_log4_core_avx2.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core_avx512.S: New file. * sysdeps/x86_64/fpu/svml_d_log2_core.S: New file. * sysdeps/x86_64/fpu/svml_d_log4_core.S: New file. * sysdeps/x86_64/fpu/svml_d_log4_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_d_log8_core.S: New file. * sysdeps/x86_64/fpu/svml_d_log_data.S: New file. * sysdeps/x86_64/fpu/svml_d_log_data.h: New file. * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Added vector log test. * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise. * NEWS: Mention addition of x86_64 vector log.
* [AArch64] Fix cfi_adjust_cfa_offset usage in dl-tlsdesc.SSzabolcs Nagy2015-06-171-5/+5
| | | | | | | | | Some of the cfi annotations used incorrect sign. * sysdeps/aarch64/dl-tlsdesc.S (_dl_tlsdesc_return_lazy): Fix cfi_adjust_cfa_offset argument. (_dl_tlsdesc_undefweak, _dl_tlsdesc_dynamic): Likewise. (_dl_tlsdesc_resolve_rela, _dl_tlsdesc_resolve_hold): Likewise.
* [BZ 18034][AArch64] Lazy TLSDESC relocation data race fixSzabolcs Nagy2015-06-173-12/+64
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Lazy TLSDESC initialization needs to be synchronized with concurrent TLS accesses. The TLS descriptor contains a function pointer (entry) and an argument that is accessed from the entry function. With lazy initialization the first call to the entry function updates the entry and the argument to their final value. A final entry function must make sure that it accesses an initialized argument, this needs synchronization on systems with weak memory ordering otherwise the writes of the first call can be observed out of order. There are at least two issues with the current code: tlsdesc.c (i386, x86_64, arm, aarch64) uses volatile memory accesses on the write side (in the initial entry function) instead of C11 atomics. And on systems with weak memory ordering (arm, aarch64) the read side synchronization is missing from the final entry functions (dl-tlsdesc.S). This patch only deals with aarch64. * Write side: Volatile accesses were replaced with C11 relaxed atomics, and a release store was used for the initialization of entry so the read side can synchronize with it. * Read side: TLS access generated by the compiler and an entry function code is roughly ldr x1, [x0] // load the entry blr x1 // call it entryfunc: ldr x0, [x0,#8] // load the arg ret Various alternatives were considered to force the ordering in the entry function between the two loads: (1) barrier entryfunc: dmb ishld ldr x0, [x0,#8] (2) address dependency (if the address of the second load depends on the result of the first one the ordering is guaranteed): entryfunc: ldr x1,[x0] and x1,x1,#8 orr x1,x1,#8 ldr x0,[x0,x1] (3) load-acquire (ARMv8 instruction that is ordered before subsequent loads and stores) entryfunc: ldar xzr,[x0] ldr x0,[x0,#8] Option (1) is the simplest but slowest (note: this runs at every TLS access), options (2) and (3) do one extra load from [x0] (same address loads are ordered so it happens-after the load on the call site), option (2) clobbers x1 which is problematic because existing gcc does not expect that, so approach (3) was chosen. A new _dl_tlsdesc_return_lazy entry function was introduced for lazily relocated static TLS, so non-lazy static TLS can avoid the synchronization cost. [BZ #18034] * sysdeps/aarch64/dl-tlsdesc.h (_dl_tlsdesc_return_lazy): Declare. * sysdeps/aarch64/dl-tlsdesc.S (_dl_tlsdesc_return_lazy): Define. (_dl_tlsdesc_undefweak): Guarantee TLSDESC entry and argument load-load ordering using ldar. (_dl_tlsdesc_dynamic): Likewise. (_dl_tlsdesc_return_lazy): Likewise. * sysdeps/aarch64/tlsdesc.c (_dl_tlsdesc_resolve_rela_fixup): Use relaxed atomics instead of volatile and synchronize with release store. (_dl_tlsdesc_resolve_hold_fixup): Use relaxed atomics instead of volatile. * elf/tlsdeschtab.h (_dl_tlsdesc_resolve_early_return_p): Likewise.
* Vector sinf for x86_64 and tests.Andrew Senkevich2015-06-1526-1/+2345
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here is implementation of vectorized sinf containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration for sinf. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core_avx512.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf4_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf8_core_avx2.S: New file. * sysdeps/x86_64/fpu/svml_s_sinf16_core.S: New file. * sysdeps/x86_64/fpu/svml_s_sinf4_core.S: New file. * sysdeps/x86_64/fpu/svml_s_sinf8_core.S: New file. * sysdeps/x86_64/fpu/svml_s_sinf8_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_s_sinf_data.S: New file. * sysdeps/x86_64/fpu/svml_s_sinf_data.h: New file. * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Vector sinf tests. * sysdeps/x86_64/fpu/test-float-vlen16.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise. * NEWS: Mention addition of x86_64 vector sinf.
* Fix getlogin_r namespace (bug 18527).Joseph Myers2015-06-123-8/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | Various functions in XPG4 bring in references to getlogin_r, which is not in XPG4; this is also a bug for some older POSIX versions which aren't yet covered by the linknamespace tests. This patch fixes this by making getlogin_r into a weak alias for __getlogin_r and using __getlogin_r as needed. Tested for x86_64 and x86 (testsuite, and that disassembly of installed stripped shared libraries is unchanged by the patch). [BZ #18527] * login/getlogin_r.c (getlogin_r): Rename to __getlogin_r and define as weak alias of __getlogin_r. Use libc_hidden_weak. * sysdeps/mach/hurd/getlogin_r.c (getlogin_r): Likewise. * sysdeps/unix/getlogin_r.c (getlogin_r): Likewise. * sysdeps/unix/sysv/linux/getlogin_r.c (getlogin_r): Likewise. * include/unistd.h (__getlogin_r): Declare. Use libc_hidden_proto. * posix/glob.c (glob): Call __getlogin_r instead of getlogin_r. * conform/Makefile (test-xfail-XPG3/glob.h/linknamespace): Remove variable. (test-xfail-XPG3/wordexp.h/linknamespace): Likewise. (test-xfail-XPG4/glob.h/linknamespace): Likewise. (test-xfail-XPG4/wordexp.h/linknamespace): Likewise.
* Fix aio_* pread namespace (bug 18519).Joseph Myers2015-06-121-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | aio_* bring in references to pread, which isn't in all the standards containing aio_* (as a reference from one library to another, this is a bug for dynamic as well as static linking). This patch fixes this by using __libc_pread instead, exporting that function from libc at symbol version GLIBC_PRIVATE; the code, with conditionals that may call either __pread64 or __libc_pread, becomes exactly analogous to that elsewhere in the same file that may call either __pwrite64 or __libc_pwrite. Tested for x86_64 and x86 (testsuite, and comparison of disassembly of installed shared libraries). libc changes because of the PLT entry for the newly exported __libc_pread; librt changes because of assertion line numbers and PLT rearrangement; other stripped installed shared libraries do not change. [BZ #18519] * posix/Versions (libc): Export __libc_pread at version GLIBC_PRIVATE. * sysdeps/pthread/aio_misc.c (handle_fildes_io): Call __libc_pread instead of pread. * conform/Makefile (test-xfail-POSIX/aio.h/linknamespace): Remove variable.
* Vector sin for x86_64 and tests.Andrew Senkevich2015-06-1126-3/+1296
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here is implementation of vectorized sin containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * bits/libm-simd-decl-stubs.h: Added stubs for sin. * math/bits/mathcalls.h: Added sin declaration with __MATHCALL_VEC. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added. * sysdeps/x86/fpu/bits/math-vector.h: SIMD declaration for sin. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core_avx2.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core_avx512.S: New file. * sysdeps/x86_64/fpu/svml_d_sin2_core.S: New file. * sysdeps/x86_64/fpu/svml_d_sin4_core.S: New file. * sysdeps/x86_64/fpu/svml_d_sin4_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_d_sin8_core.S: New file. * sysdeps/x86_64/fpu/svml_d_sin_data.S: New file. * sysdeps/x86_64/fpu/svml_d_sin_data.h: New file. * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Added vector sin test. * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise. * NEWS: Mention addition of x86_64 vector sin.
* More strict check of AVX512 support in assembler.Andrew Senkevich2015-06-112-0/+2
| | | | | | | | Binutils 2.24 doesn't support some AVX512 instructions with ZMM registers, so we need add more strict check. * configure.ac: Added more strict check. * configure: Regenerated.
* x86: Remove vsyscall usageAdhemerval Zanella2015-06-096-74/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes the vsyscall usage for x86_64 port. As indicated by kernel code comments [1], vsyscalls are a legacy ABI and its concept is problematic: - It interferes with ASLR. - It's awkward to write code that lives in kernel addresses but is callable by userspace at fixed addresses. - The whole concept is impossible for 32-bit compat userspace. - UML cannot easily virtualize a vsyscall. The VDSO is a better approach for such functionality. Tested on i686, x86_64, and x32. * sysdeps/unix/sysv/linux/i386/gettimeofday.c (__gettimeofday_syscall): Remove vsyscall fallback. * sysdeps/unix/sysv/linux/i386/time.c (__time_syscall): Likewise. * sysdeps/unix/sysv/linux/x86/gettimeofday.c (__gettimeofday_syscall): Add syscall fallback function. (gettimeofday_ifunc): Use __gettimeofday_syscall as fallback mechanism if vDSO is not present. * sysdeps/unix/sysv/linux/x86/time.c (__time_syscall): Add syscall fallback function. (time_ifunc): Use __time_syscall as fallback mechanism if vDSO is not present. * sysdeps/unix/sysv/linux/x86_64/gettimeofday.c: Remove file. * sysdeps/unix/sysv/linux/x86_64/time.c: Likewise. [1] arch/x86/kernel/vsyscall_64.c
* Fix regcomp wcscoll, wcscmp namespace (bug 18497).Joseph Myers2015-06-093-7/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | regcomp brings in references to wcscoll, which isn't in all the standards that contain regcomp. In turn, wcscoll brings in references to wcscmp, also not in all those standards. This patch fixes this by making those functions into weak aliases of __wcscoll and __wcscmp and calling those names instead as needed. Tested for x86_64 and x86 (testsuite, and that disassembly of installed shared libraries is unchanged by the patch). [BZ #18497] * wcsmbs/wcscmp.c [!WCSCMP] (WCSCMP): Define as __wcscmp instead of wcscmp. (wcscmp): Define as weak alias of WCSCMP. * wcsmbs/wcscoll.c (STRCOLL): Define as __wcscoll instead of wcscoll. (USE_HIDDEN_DEF): Define. [!USE_IN_EXTENDED_LOCALE_MODEL] (wcscoll): Define as weak alias of __wcscoll. Don't use libc_hidden_weak. * wcsmbs/wcscoll_l.c (STRCMP): Define as __wcscmp instead of wcscmp. * sysdeps/i386/i686/multiarch/wcscmp-c.c [SHARED] (libc_hidden_def): Define __GI___wcscmp instead of __GI_wcscmp. (weak_alias): Undefine and redefine. * sysdeps/i386/i686/multiarch/wcscmp.S (wcscmp): Rename to __wcscmp and define as weak alias of __wcscmp. * sysdeps/x86_64/wcscmp.S (wcscmp): Likewise. * include/wchar.h (__wcscmp): Declare. Use libc_hidden_proto. (__wcscoll): Likewise. (wcscmp): Don't use libc_hidden_proto. (wcscoll): Likewise. * posix/regcomp.c (build_range_exp): Call __wcscoll instead of wcscoll. * posix/regexec.c (check_node_accept_bytes): Likewise. * conform/Makefile (test-xfail-XPG3/regex.h/linknamespace): Remove variable. (test-xfail-XPG4/regex.h/linknamespace): Likewise. (test-xfail-POSIX/regex.h/linknamespace): Likewise.
* Fix pathconf statvfs namespace (bug 18507).Joseph Myers2015-06-094-8/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | pathconf uses __statvfs64, and fpathconf uses __fstatvfs64. On systems using sysdeps/unix/sysv/linux/wordsize-64, __statvfs64 then brings in the strong symbol statvfs, and __fstatvfs64 brings in the strong symbol fstatvfs, which are not in all the standards that have pathconf and fpathconf. This patch fixes this by making those symbols into weak aliases. Tested for x86_64 and x86 (testsuite, and that disassembly of installed shared libraries is unchanged by the patch). [BZ #18507] * sysdeps/unix/sysv/linux/fstatvfs.c (fstatvfs): Rename to __fstatvfs and define as weak alias of __fstatvfs. Use libc_hidden_weak. * sysdeps/unix/sysv/linux/statvfs.c (statvs): Rename to __statvfs and define as weak alias of __statvfs. Use libc_hidden_weak. * sysdeps/unix/sysv/linux/wordsize-64/fstatvfs.c (__fstatvfs64): Define as alias of __fstatvfs, not fstatvfs. (fstatvfs64): Likewise. * sysdeps/unix/sysv/linux/wordsize-64/statvfs.c (__statvfs64): Define as alias of __statvfs, not statvfs. (statvfs64): Likewise. * conform/Makefile (test-xfail-POSIX/unistd.h/linknamespace): Remove variable.
* Consolidate sched_getcpuAdhemerval Zanella2015-06-0912-203/+10
| | | | | | | | This patch consolidates the sched_getcpu implementations across all arches (except tile, which requires its own). This patch removes the powerpc, x86_64 and x32 specific files and change the default linux one to use INLINE_VSYSCALL where possible (for ports that implements it).
* This patch adds vector cosf tests.Andrew Senkevich2015-06-0910-2/+221
| | | | | | | | | | | | | | | | | | * math/Makefile: Added CFLAGS for new tests. * math/test-float-vlen16.h: New file. * math/test-float-vlen4.h: New file. * math/test-float-vlen8.h: New file. * math/test-double-vlen2.h: Fixed 2 argument macro and comment. * sysdeps/x86_64/fpu/Makefile: Added new tests and variables. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: New file. * sysdeps/x86_64/fpu/test-float-vlen16.c: New file. * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: New file. * sysdeps/x86_64/fpu/test-float-vlen4.c: New file. * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: New file. * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: New file. * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: New file. * sysdeps/x86_64/fpu/test-float-vlen8.c: New file.
* Vector cosf for x86_64.Andrew Senkevich2015-06-0918-2/+2436
| | | | | | | | | | | | | | | | | | | | | | | | | | | Here is implementation of vectorized cosf containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/svml_s_cosf4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf4_core_sse4.S: New file. * sysdeps/x86_64/fpu/svml_s_cosf8_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_s_cosf8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf8_core_avx2.S: New file. * sysdeps/x86_64/fpu/svml_s_cosf16_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core_avx512.S: New file. * sysdeps/x86_64/fpu/svml_s_wrapper_impl.h: New file. * sysdeps/x86_64/fpu/svml_s_cosf_data.S: New file. * sysdeps/x86_64/fpu/svml_s_cosf_data.h: New file. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration for cosf. * NEWS: Mention addition of x86_64 vector cosf.
* Addition of testing infrastructure for vector math functions.Andrew Senkevich2015-06-0911-0/+299
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We test vector math functions using scalar tests infrastructure with help of special wrappers from scalar versions to vector ones. Wrapper implemented using platform specific vector types and placed in separate file for compilation with architecture specific options, main part of test has no such options. With help of system of definitions unfolding of which is drived from test code we have wrapper called in individual testing function instead of scalar function. Also system of definitions includes generated during make check header math/libm-have-vector-test.h with series of conditional definitions which help to avoid build fails for functions having no vector versions; runtime architecture check to prevent runtime fails of test run on inappropriate hardware. * math/Makefile: Added rules for vector tests. * math/gen-libm-have-vector-test.sh: Added generation of wrapper declaration under condition. * math/test-double-vlen2.h: New file. * math/test-double-vlen4.h: New file. * math/test-double-vlen8.h: New file. * math/test-vec-loop.h: Added initialization macro. * sysdeps/x86_64/fpu/Makefile: Added variables for vector tests. * sysdeps/x86_64/fpu/libm-test-ulps: Regenarated. * sysdeps/x86_64/fpu/math-tests-arch.h: New file. * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: New file. * sysdeps/x86_64/fpu/test-double-vlen2.c: New file. * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: New file. * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: New file. * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: New file. * sysdeps/x86_64/fpu/test-double-vlen4.c: New file. * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: New file. * sysdeps/x86_64/fpu/test-double-vlen8.c: New file.
* Start of series of patches with x86_64 vector math functions.Andrew Senkevich2015-06-0921-0/+1452
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here is implementation of cos containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI which had been discussed in <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. Vector math library build and ABI testing enabled by default for x86_64. * sysdeps/x86_64/fpu/Makefile: New file. * sysdeps/x86_64/fpu/Versions: New file. * sysdeps/x86_64/fpu/svml_d_cos_data.S: New file. * sysdeps/x86_64/fpu/svml_d_cos_data.h: New file. * sysdeps/x86_64/fpu/svml_d_cos2_core.S: New file. * sysdeps/x86_64/fpu/svml_d_cos4_core.S: New file. * sysdeps/x86_64/fpu/svml_d_cos4_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_d_cos8_core.S: New file. * sysdeps/x86_64/fpu/svml_d_wrapper_impl.h: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core_avx2.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core_avx512.S: New file. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration for cos. * math/bits/mathcalls.h: Added cos declaration with __MATHCALL_VEC. * sysdeps/x86_64/configure.ac: Options for libmvec build. * sysdeps/x86_64/configure: Regenerated. * sysdeps/x86_64/sysdep.h (cfi_offset_rel_rsp): New macro. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New file. * manual/install.texi (Configuring and compiling): Document --disable-mathvec. * INSTALL: Regenerated. * NEWS: Mention addition of libmvec and x86_64 vector cos.
* This patch adds detection of availability for AVX512F and AVX512DQ ISAs.Andrew Senkevich2015-06-082-0/+34
| | | | | | | | * sysdeps/x86_64/multiarch/init-arch.h (bit_AVX512F_Usable, bit_AVX512DQ_Usable, bit_Opmask_state, bit_ZMM0_15_state, bit_ZMM16_31_state): New macro. * sysdeps/x86_64/multiarch/init-arch.c (__init_cpu_features): Check and set bit_AVX512F_Usable, bit_AVX512DQ_Usable.
* posix_fallocate: Emulation fixes and documentation [BZ #15661]Florian Weimer2015-06-052-38/+96
| | | | | | | | Handle signed integer overflow correctly. Detect and reject O_APPEND. Document drawbacks of emulation. This does not completely address bug 15661, but improves the situation somewhat.
* nptl: Rewrite cancellation macrosAdhemerval Zanella2015-06-0473-1161/+192
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch changes the way cancellation entrypoints are defined to instead call the macro SYSCALL_CANCEL. An usual cnacellation definition is defined as: if (SINGLE_THREAD_P) return INLINE_SYSCALL (syscall, NARGS, args...) int oldtype = LIBC_CANCEL_ASYNC (); return INLINE_SYSCALL (syscall, NARGS, args...) LIBC_CANCEL_RESET (oldtype); And it is rewrited as just: SYSCALL_CANCEL (syscall, args...) The idea is to remove LIBC_CANCEL_ASYNC/LIBC_CANCEL_RESET explicit usage. Tested on i386, x86_64, powerpc32, powerpc64le, arm, and aarch64. * sysdeps/unix/sysdep.h [SYSCALL_CANCEL]: New macro: define cancellable syscalls. (SYS_ify): Add guard to no redefine it. (INLINE_SYSCALL): Likewise. * sysdeps/unix/sysv/linux/accept4.c (accept4): Remove LIBC_CANCEL_ASYNC/INLINE_SYSCALL/LIBC_CANCEL_RESET and use SYSCALL_CANCEL instead. * sysdeps/unix/sysv/linux/alpha/fdatasync.c (__fdatasync): Likewise. * sysdeps/unix/sysv/linux/arm/pread.c (__libc_pread): Likewise. * sysdeps/unix/sysv/linux/arm/pread64.c (__libc_pread64): Likewise. * sysdeps/unix/sysv/linux/arm/pwrite.c (__libc_pwrite): Likewise. * sysdeps/unix/sysv/linux/arm/pwrite64.c (__libc_pwrite64): Likewise. * sysdeps/unix/sysv/linux/epoll_pwait.c (epoll_pwait): Likewise. * sysdeps/unix/sysv/linux/fallocate.c (fallocate): Likewise. * sysdeps/unix/sysv/linux/fallocate64.c (fallocate64): Likewise. * sysdeps/unix/sysv/linux/generic/open.c (__libc_open): Likewise. * sysdeps/unix/sysv/linux/generic/open64.c (__libc_open64): Likewise. * sysdeps/unix/sysv/linux/generic/pause.c (__libc_pause): Likewise. * sysdeps/unix/sysv/linux/generic/poll.c (__poll): Likewise. * sysdeps/unix/sysv/linux/generic/recv.c (__libc_recv): Likewise. * sysdeps/unix/sysv/linux/generic/select.c (__select): Likewise. * sysdeps/unix/sysv/linux/generic/send.c (__libc_send): Likewise. * sysdeps/unix/sysv/linux/generic/wordsize-32/pread.c (__libc_pread): Likewise. * sysdeps/unix/sysv/linux/generic/wordsize-32/pread64.c (__libc_pread64): Likewise. * sysdeps/unix/sysv/linux/generic/wordsize-32/preadv.c (__libc_preadv): Likewise. * sysdeps/unix/sysv/linux/generic/wordsize-32/preadv64.c (__libc_readv64): Likewise. * sysdeps/unix/sysv/linux/generic/wordsize-32/pwrite.c (__libc_pwrite): Likewise. * sysdeps/unix/sysv/linux/generic/wordsize-32/pwrite64.c (__libc_pwrite64): Likewise. * sysdeps/unix/sysv/linux/generic/wordsize-32/pwritev.c (__libc_pwritev): Likewise. * sysdeps/sysv/linux/generic/wordsize-32/pwritev64.c (__libc_pwritev64): Likewise. * sysdeps/unix/sysv/linux/i386/fcntl.c (__libc_fcntl): Likewise. * sysdeps/unix/sysv/linux/mips/mips32/sync_file_range.c (sync_file_range): Likewise. * sysdeps/unix/sysv/linux/mips/mips64/n32/fallocate.c (fallocate): Likewise. * sysdeps/unix/sysv/linux/mips/mips64/n32/fallocate64.c (fallocate64): Likewise. * sysdeps/unix/sysv/linux/mips/pread.c (__libc_pread): Likewise. * sysdeps/unix/sysv/linux/mips/pread64.c (__libc_pread64): Likewise. * sysdeps/unix/sysv/linux/mips/pwrite.c (__libc_pwrite): Likewise. * sysdeps/unix/sysv/linux/mips/pwrite64.c (__libc_pwrite64): Likewise. * sysdeps/unix/sysv/linux/msgrcv.c (__libc_msgrcv): Likewise. * sysdeps/unix/sysv/linux/msgsnd.c (__libc_msgsnd): Likewise. * sysdeps/unix/sysv/linux/open64.c (__libc_open64): Likewise. * sysdeps/unix/sysv/linux/openat.c (__libc_openat): Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/pread.c (__libc_pread): Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/pread64.c (__libc_read64): Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/pwrite.c (__libc_write): Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/pwrite64.c (__libc_write64): Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/fcntl.c (__libc_fcntl): Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/pread.c (__libc_pread): Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/pread64.c (__libc_pread64): Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/pwrite.c (__libc_pwrite): Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/pwrite64.c (__libc_pwrite64): Likewise. * sysdeps/sysv/linux/powerpc/powerpc64/sync_file_range.c (sync_file_range): Likewise. * sysdeps/unix/sysv/linux/ppoll.c (ppoll): Likewise. * sysdeps/unix/sysv/linux/pread.c (__libc_pread): Likewise. * sysdeps/unix/sysv/linux/pread64.c (__libc_pread64): Likewise. * sysdeps/unix/sysv/linux/preadv.c (__libc_preadv): Likewise. * sysdeps/unix/sysv/linux/pselect.c (__pselect): Likewise. * sysdeps/unix/sysv/linux/pwrite.c (__libc_pwrite): Likewise. * sysdeps/unix/sysv/linux/pwrite64.c (__libc_pwrite64): Likewise. * sysdeps/unix/sysv/linux/pwritev.c (PWRITEV): Likewise. * sysdeps/unix/sysv/linux/readv.c (__libc_readv): Likewise. * sysdeps/unix/sysv/linux/recvmmsg.c (recvmmsg): Likewise. * sysdeps/unix/sysv/linux/sendmmsg.c (sendmmsg): Likewise. * sysdeps/unix/sysv/linux/sh/pread.c (__libc_pread): Likewise. * sysdeps/unix/sysv/linux/sh/pread64.c (__libc_pread64): Likewise. * sysdeps/unix/sysv/linux/sh/pwrite.c (__libc_pwrite): Likewise. * sysdeps/unix/sysv/linux/sh/pwrite64.c (__libc_pwrite64): Likewise. * sysdeps/unix/sysv/linux/sigsuspend.c (__sigsuspend): Likewise. * sysdeps/unix/sysv/linux/sigtimedwait.c (__sigtimedwait): Likewise. * sysdeps/unix/sysv/linux/sigwaitinfo.c (__sigwaitinfo): Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/msgrcv.c (__libc_msgrcv): Likewise. * sysdeps/unix/sysv/linux/sync_file_range.c (sync_file_range): Likewise. * sysdeps/unix/sysv/linux/tcdrain.c (__libc_tcdrain): Likewise. * sysdeps/unix/sysv/linux/timer_routines.c (timer_helper_thread): Likewise. * sysdeps/unix/sysv/linux/wait.c (__libc_wait): Likewise. * sysdeps/unix/sysv/linux/waitid.c (__waitid): Likewise. * sysdeps/unix/sysv/linux/waitpid.c (__libc_waitpid): Likewise. * sysdeps/unix/sysv/linux/wordsize-64/fallocate.c (fallocate): Likewise. * sysdeps/unix/sysv/linux/wordsize-64/preadv.c (preadv): Likewise. * sysdeps/unix/sysv/linux/wordsize-64/pwritev.c (pwritev): Likewise. * sysdeps/unix/sysv/linux/writev.c (__libc_writev): Likewise. * sysdeps/unix/sysv/linux/x86_64/recv.c (__libc_recv): Likewise. * sysdeps/unix/sysv/linux/x86_64/send.c (__libc_send): Likewise.
* ARM: VDSO supportNathan Lynch2015-06-045-0/+84
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Beginning with the upcoming 4.1 release, Linux on a subset of 32-bit ARM hardware will provide fast user-space implementations of the following system calls: - gettimeofday - clock_gettime The kernel implementation depends on the ARMv7 Generic Timers Extension to accelerate these system calls. So CPUs such as Cortex-A15 and -A7 benefit, while Cortex-A9, -A8, and pre-v7 CPUs do not. On systems where the VDSO does not provide any speedup, the kernel prevents the relevant symbol lookups from succeeding. On OMAP5 (Cortex-A15) gettimeofday latency decreases from ~350ns to ~120ns. On BeagleBone Black (Cortex-A8) it goes from ~650ns to ~660ns, which to my mind is an acceptable cost. Verified that no new test failures are introduced on kernels with and without the VDSO. * sysdeps/unix/sysv/linux/arm/Makefile: (sysdep_routines): Include dl-vdso. * sysdeps/unix/sysv/linux/arm/init-first.c: New file: Use VDSO routines for gettimeofday, clock_gettime if available. * sysdeps/unix/sysv/linux/arm/libc-vdso.h: New file: Declare VDSO symbols. * sysdeps/unix/sysv/linux/arm/sysdep.h: [HAVE_GETTIMEOFDAY_VSYSCALL]: Define. [HAVE_CLOCK_GETTIME_VSYSCALL]: Define. * sysdeps/unix/sysv/linux/arm/Versions: Add __vdso_clock_gettime.
* Use inline syscalls for non-cancellable versionsAdhemerval Zanella2015-06-042-34/+44
| | | | | | This patch uses inline calls (through INLINE_SYSCALL macro) to define the non-cancellable functions macros to avoid use of the syscall_nocancel entrypoint.
* NaCl: Implement nacl_interface_ext_supply entry point.Roland McGrath2015-06-035-2/+95
|