about summary refs log tree commit diff
path: root/sysdeps/x86_64
Commit message (Collapse)AuthorAgeFilesLines
...
* Update libmvec multiarch functions for <cpu-features.h>H.J. Lu2015-08-1338-227/+125
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch updates libmvec multiarch functions to use the newly defined HAS_CPU_FEATURE, HAS_ARCH_FEATURE and LOAD_RTLD_GLOBAL_RO_RDX from <cpu-features.h>. * math/Makefile ($(addprefix $(objpfx), $(libm-vec-tests))): Remove $(objpfx)init-arch.o. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Remove init-arch. * sysdeps/x86_64/fpu/math-tests-arch.h (avx_usable): Removed. (INIT_ARCH_EXT): Defined as empty. (CHECK_ARCH_EXT): Replace HAS_XXX with HAS_ARCH_FEATURE (XXX). * sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core.S: Remove __init_cpu_features call. Replace HAS_XXX with HAS_CPU_FEATURE/HAS_ARCH_FEATURE (XXX). * sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_log2_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_log4_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf4_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_expf4_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_expf8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_logf16_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_logf4_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_logf8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_powf4_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_powf8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf4_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf4_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf8_core.S: Likewise.
* Update x86_64 multiarch functions for <cpu-features.h>H.J. Lu2015-08-1339-233/+246
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch updates x86_64 multiarch functions to use the newly defined HAS_CPU_FEATURE, HAS_ARCH_FEATURE and LOAD_RTLD_GLOBAL_RO_RDX from <cpu-features.h>. * sysdeps/x86_64/fpu/multiarch/e_asin.c: Replace HAS_XXX with HAS_CPU_FEATURE/HAS_ARCH_FEATURE (XXX). * sysdeps/x86_64/fpu/multiarch/e_atan2.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_exp.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_log.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_pow.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_atan.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_fmaf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_sin.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_tan.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_ceil.S: Use LOAD_RTLD_GLOBAL_RO_RDX and HAS_CPU_FEATURE (SSE4_1). * sysdeps/x86_64/fpu/multiarch/s_ceilf.S: Likewise. * sysdeps/x86_64/fpu/multiarch/s_floor.S: Likewise. * sysdeps/x86_64/fpu/multiarch/s_floorf.S: Likewise. * sysdeps/x86_64/fpu/multiarch/s_nearbyint.S : Likewise. * sysdeps/x86_64/fpu/multiarch/s_nearbyintf.S: Likewise. * sysdeps/x86_64/fpu/multiarch/s_rintf.S: Likewise. * sysdeps/x86_64/fpu/multiarch/s_rintf.S : Likewise. * sysdeps/x86_64/multiarch/ifunc-impl-list.c: Likewise. * sysdeps/x86_64/multiarch/sched_cpucount.c: Likewise. * sysdeps/x86_64/multiarch/strstr.c: Likewise. * sysdeps/x86_64/multiarch/memmove.c: Likewise. * sysdeps/x86_64/multiarch/memmove_chk.c: Likewise. * sysdeps/x86_64/multiarch/test-multiarch.c: Likewise. * sysdeps/x86_64/multiarch/memcmp.S: Remove __init_cpu_features call. Add LOAD_RTLD_GLOBAL_RO_RDX. Replace HAS_XXX with HAS_CPU_FEATURE/HAS_ARCH_FEATURE (XXX). * sysdeps/x86_64/multiarch/memcpy.S: Likewise. * sysdeps/x86_64/multiarch/memcpy_chk.S: Likewise. * sysdeps/x86_64/multiarch/mempcpy.S: Likewise. * sysdeps/x86_64/multiarch/mempcpy_chk.S: Likewise. * sysdeps/x86_64/multiarch/memset.S: Likewise. * sysdeps/x86_64/multiarch/memset_chk.S: Likewise. * sysdeps/x86_64/multiarch/strcat.S: Likewise. * sysdeps/x86_64/multiarch/strchr.S: Likewise. * sysdeps/x86_64/multiarch/strcmp.S: Likewise. * sysdeps/x86_64/multiarch/strcpy.S: Likewise. * sysdeps/x86_64/multiarch/strcspn.S: Likewise. * sysdeps/x86_64/multiarch/strspn.S: Likewise. * sysdeps/x86_64/multiarch/wcscpy.S: Likewise. * sysdeps/x86_64/multiarch/wmemcmp.S: Likewise.
* Add _dl_x86_cpu_features to rtld_globalH.J. Lu2015-08-1310-546/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds _dl_x86_cpu_features to rtld_global in x86 ld.so and initializes it early before __libc_start_main is called so that cpu_features is always available when it is used and we can avoid calling __init_cpu_features in IFUNC selectors. * sysdeps/i386/dl-machine.h: Include <cpu-features.c>. (dl_platform_init): Call init_cpu_features. * sysdeps/i386/dl-procinfo.c (_dl_x86_cpu_features): New. * sysdeps/i386/i686/cacheinfo.c (DISABLE_PREFERRED_MEMORY_INSTRUCTION): Removed. * sysdeps/i386/i686/multiarch/Makefile (aux): Remove init-arch. * sysdeps/i386/i686/multiarch/Versions: Removed. * sysdeps/i386/i686/multiarch/ifunc-defines.sym (KIND_OFFSET): Removed. * sysdeps/i386/ldsodefs.h: Include <cpu-features.h>. * sysdeps/unix/sysv/linux/x86/Makefile (libpthread-sysdep_routines): Remove init-arch. * sysdeps/unix/sysv/linux/x86_64/dl-procinfo.c: Include <sysdeps/x86_64/dl-procinfo.c> instead of sysdeps/generic/dl-procinfo.c>. * sysdeps/x86/Makefile [$(subdir) == csu] (gen-as-const-headers): Add cpu-features-offsets.sym and rtld-global-offsets.sym. [$(subdir) == elf] (sysdep-dl-routines): Add dl-get-cpu-features. [$(subdir) == elf] (tests): Add tst-get-cpu-features. [$(subdir) == elf] (tests-static): Add tst-get-cpu-features-static. * sysdeps/x86/Versions: New file. * sysdeps/x86/cpu-features-offsets.sym: Likewise. * sysdeps/x86/cpu-features.c: Likewise. * sysdeps/x86/cpu-features.h: Likewise. * sysdeps/x86/dl-get-cpu-features.c: Likewise. * sysdeps/x86/libc-start.c: Likewise. * sysdeps/x86/rtld-global-offsets.sym: Likewise. * sysdeps/x86/tst-get-cpu-features-static.c: Likewise. * sysdeps/x86/tst-get-cpu-features.c: Likewise. * sysdeps/x86_64/dl-procinfo.c: Likewise. * sysdeps/x86_64/cacheinfo.c (__cpuid_count): Removed. Assume USE_MULTIARCH is defined and don't check it. (is_intel): Replace __cpu_features with GLRO(dl_x86_cpu_features). (is_amd): Likewise. (max_cpuid): Likewise. (intel_check_word): Likewise. (__cache_sysconf): Don't call __init_cpu_features. (__x86_preferred_memory_instruction): Removed. (init_cacheinfo): Don't call __init_cpu_features. Replace __cpu_features with GLRO(dl_x86_cpu_features). * sysdeps/x86_64/dl-machine.h: <cpu-features.c>. (dl_platform_init): Call init_cpu_features. * sysdeps/x86_64/ldsodefs.h: Include <cpu-features.h>. * sysdeps/x86_64/multiarch/Makefile (aux): Remove init-arch. * sysdeps/x86_64/multiarch/Versions: Removed. * sysdeps/x86_64/multiarch/cacheinfo.c: Likewise. * sysdeps/x86_64/multiarch/init-arch.c: Likewise. * sysdeps/x86_64/multiarch/ifunc-defines.sym (KIND_OFFSET): Removed. * sysdeps/x86_64/multiarch/init-arch.h: Rewrite.
* Add more tests of various libm functions.Joseph Myers2015-08-111-12/+12
| | | | | | | | | | | | | | This patch adds more tests of various libm functions found through random test generation to give increased ulps on 32-bit x86. Tested for x86_64 and x86. * math/auto-libm-test-in: Add more tests of acosh, asin, asinh, atanh, cabs, carg, cbrt, cosh, csqrt, erf, erfc, exp, exp10, expm1, hypot, log, log10, log1p, log2, pow, sinh, tan and tgamma. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
* Align stack to 16 bytes when calling __errno_locationH.J. Lu2015-08-053-0/+18
| | | | | | | | | | We should align stack to 16 bytes when calling __errno_location. [BZ #18661] * sysdeps/x86_64/fpu/s_cosf.S (__cosf): Align stack to 16 bytes when calling __errno_location. * sysdeps/x86_64/fpu/s_sincosf.S (__sincosf): Likewise. * sysdeps/x86_64/fpu/s_sinf.S (__sinf): Likewise.
* Compile {memcpy,strcmp}-sse2-unaligned.S only for libcH.J. Lu2015-08-052-0/+8
| | | | | | | | {memcpy,strcmp}-sse2-unaligned.S aren't needed in ld.so. * sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S: Compile only for libc. * sysdeps/x86_64/multiarch/strcmp-sse2-unaligned.S: Likewise.
* Prevent runtime fail of SSE vector math tests on non SSE4.1 machine.Andrew Senkevich2015-07-301-2/+0
| | | | | | | | [BZ #18740] * sysdeps/x86_64/fpu/Makefile (double-vlen2-arch-ext-cflags, float-vlen4-arch-ext-cflags): Removed. * math/Makefile (CFLAGS-test-double-vlen2-wrappers.c, CFLAGS-test-float-vlen4-wrappers.c): Likewise.
* Extend local PLT reference checkH.J. Lu2015-07-291-0/+19
| | | | | | | | | | | | | | On x86, linker in binutils 2.26 and newer consolidates R_*_JUMP_SLOT with R_*_GLOB_DAT relocation against the same symbol. This patch extends local PLT reference check to support alternate relocations. [BZ #18078] * scripts/check-localplt.awk: Support alternate relocations. * scripts/localplt.awk: Also check relocations in DT_RELA/DT_REL sections. * sysdeps/unix/sysv/linux/i386/localplt.data: Mark free and malloc entries with + REL R_386_GLOB_DAT. * sysdeps/x86_64/localplt.data: New file.
* Added runtime check for AVX vector math tests.Andrew Senkevich2015-07-293-1/+27
| | | | | | | [BZ #18731] * sysdeps/x86_64/fpu/math-tests-arch.h: Added AVX runtime check. * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise.
* Fixed several libmvec bugs found during testing on KNL hardware.Andrew Senkevich2015-07-2415-223/+201
| | | | | | | | | | | | | | | | | | | | | | AVX512 IFUNC implementations, implementations of wrappers to AVX2 versions and KNL expf implementation fixed. * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core.S: Fixed AVX512 IFUNC. * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_logf16_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core.S: Likewise. * sysdeps/x86_64/fpu/svml_d_wrapper_impl.h: Fixed wrappers to AVX2. * sysdeps/x86_64/fpu/svml_s_wrapper_impl.h: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core_avx512.S: Fixed KNL implementation.
* Modify several tests to use test-skeleton.cArjun Shankar2015-07-151-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These tests were skipped by the use-test-skeleton conversion done in commit 29955b5d because they were reused in other tests via the #include directive, and so deemed worth an inspection before they were modified. This has now been done. ChangeLog: 2015-07-09 Arjun Shankar <arjun.is@lostca.se> * elf/tst-leaks1.c (main): Converted to ... (do_test): ... this. (TEST_FUNCTION): New macro. Include test-skeleton.c. * localedata/tst-langinfo.c (main): Converted to ... (do_test): ... this. (TEST_FUNCTION): New macro. Include test-skeleton.c. * math/test-fpucw.c (main): Converted to ... (do_test): ... this. (TEST_FUNCTION): New macro. Include test-skeleton.c. * math/test-tgmath.c (main): Converted to ... (do_test): ... this. (TEST_FUNCTION): New macro. Include test-skeleton.c. * math/test-tgmath2.c (main): Converted to ... (do_test): ... this. (TEST_FUNCTION): New macro. Include test-skeleton.c. * setjmp/tst-setjmp.c (main): Converted to ... (do_test): ... this. (TEST_FUNCTION): New macro. Include test-skeleton.c. * stdio-common/tst-sscanf.c (main): Converted to ... (do_test): ... this. (TEST_FUNCTION): New macro. Include test-skeleton.c. * sysdeps/x86_64/tst-audit6.c (main): Converted to ... (do_test): ... this. (TEST_FUNCTION): New macro. Include test-skeleton.c.
* Improve bndmov encoding with zero displacementH.J. Lu2015-07-091-0/+8
| | | | | | | | | | If x86-64 assembler doesn't support MPX, we encode bndmov instruction by hand. When displacement is zero, assembler generates shorter encoding. This patch improves bndmov encoding with zero displacement so that ld.so is identical when using assemblers with and without MPX support. * sysdeps/x86_64/dl-trampoline.S (_dl_runtime_resolve): Improve bndmov encoding with zero displacement.
* Preserve bound registers for pointer pass/returnIgor Zamyatin2015-07-092-20/+25
| | | | | | | | | | | | | | | | | | | | | | | We need to save/restore bound registers and add a BND prefix before branches in _dl_runtime_profile so that bound registers for pointer pass and return are preserved when LD_AUDIT is used. [BZ #18134] * sysdeps/i386/configure.ac: Set HAVE_MPX_SUPPORT. * sysdeps/i386/configure: Regenerated. * sysdeps/i386/dl-trampoline.S (PRESERVE_BND_REGS_PREFIX): New. (_dl_runtime_profile): Save and restore Intel MPX return bound registers when calling _dl_call_pltexit. Add PRESERVE_BND_REGS_PREFIX before return. * sysdeps/i386/link-defines.sym (LRV_BND0_OFFSET): New. (LRV_BND1_OFFSET): Likewise. * sysdeps/x86/bits/link.h (La_i86_retval): Add lrv_bnd0 and lrv_bnd1. * sysdeps/x86_64/dl-trampoline.S (_dl_runtime_profile): Fix typo in bndmov encoding. * sysdeps/x86_64/dl-trampoline.h: Properly save and restore Intel MPX bound registers. Add PRESERVE_BND_REGS_PREFIX before branch instructions to preserve bounds.
* Add la_symbind32 to x86-64 audit testsH.J. Lu2015-07-076-0/+60
| | | | | | | | | | | | la_symbind32 is used for x32 in x86-64 audit tests. We should define both la_symbind32 and la_symbind64 in x86-64 audit tests. * sysdeps/x86_64/tst-auditmod10b.c (la_symbind32): New. * sysdeps/x86_64/tst-auditmod4b.c (la_symbind32): Likewise. * sysdeps/x86_64/tst-auditmod5b.c (la_symbind32): Likewise. * sysdeps/x86_64/tst-auditmod6b.c (la_symbind32): Likewise. * sysdeps/x86_64/tst-auditmod6c.c (la_symbind32): Likewise. * sysdeps/x86_64/tst-auditmod7b.c (la_symbind32): Likewise.
* Clean up BUSY_WAIT_NOP and atomic_delay.Torvald Riegel2015-06-301-1/+1
| | | | | | | | | This patch combines BUSY_WAIT_NOP and atomic_delay into a new atomic_spin_nop function and adjusts all clients. The new function is put into atomic.h because what is best done in a spin loop is architecture-specific, and atomics must be used for spinning. The function name is meant to tell users that this has no effect on synchronization semantics but is a performance aid for spinning.
* Improve tgamma accuracy (bug 18613).Joseph Myers2015-06-291-4/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In non-default rounding modes, tgamma can be slightly less accurate than permitted by glibc's accuracy goals. Part of the problem is error accumulation, addressed in this patch by setting round-to-nearest for internal computations. However, there was also a bug in the code dealing with computing pow (x + n, x + n) where x + n is not exactly representable, providing another source of error even in round-to-nearest mode; it was necessary to address both bugs to get errors for all testcases within glibc's accuracy goals. Given this second fix, accuracy in round-to-nearest mode is also improved (hence regeneration of ulps for tgamma should be from scratch - truncate libm-test-ulps or at least remove existing tgamma entries - so that the expected ulps can be reduced). Some additional complications also arose. Certain tgamma tests should strictly, according to IEEE semantics, overflow or not depending on the rounding mode; this is beyond the scope of glibc's accuracy goals for any function without exactly-determined results, but gen-auto-libm-tests doesn't handle being lax there as it does for underflow. (libm-test.inc also doesn't handle being lax about whether the result in cases very close to the overflow threshold is infinity or a finite value close to overflow, but that doesn't cause problems in this case though I've seen it cause problems with random test generation for some functions.) Thus, spurious-overflow markings, with a comment, are added to auto-libm-test-in (no bug in Bugzilla because the issue is with the testsuite, not a user-visible bug in glibc). And on x86, after the patch I saw ERANGE issues as previously reported by Carlos (see my commentary in <https://sourceware.org/ml/libc-alpha/2015-01/msg00485.html>), which needed addressing by ensuring excess range and precision were eliminated at various points if FLT_EVAL_METHOD != 0. I also noticed and fixed a cosmetic issue where 1.0f was used in long double functions and should have been 1.0L. This completes the move of all functions to testing in all rounding modes with ALL_RM_TEST, so gen-libm-have-vector-test.sh is updated to remove the workaround for some functions not using ALL_RM_TEST. Tested for x86_64, x86, mips64 and powerpc. [BZ #18613] * sysdeps/ieee754/dbl-64/e_gamma_r.c (gamma_positive): Take log of X_ADJ not X when adjusting exponent. (__ieee754_gamma_r): Do intermediate computations in round-to-nearest then adjust overflowing and underflowing results as needed. * sysdeps/ieee754/flt-32/e_gammaf_r.c (gammaf_positive): Take log of X_ADJ not X when adjusting exponent. (__ieee754_gammaf_r): Do intermediate computations in round-to-nearest then adjust overflowing and underflowing results as needed. * sysdeps/ieee754/ldbl-128/e_gammal_r.c (gammal_positive): Take log of X_ADJ not X when adjusting exponent. (__ieee754_gammal_r): Do intermediate computations in round-to-nearest then adjust overflowing and underflowing results as needed. Use 1.0L not 1.0f as numerator of division. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (gammal_positive): Take log of X_ADJ not X when adjusting exponent. (__ieee754_gammal_r): Do intermediate computations in round-to-nearest then adjust overflowing and underflowing results as needed. Use 1.0L not 1.0f as numerator of division. * sysdeps/ieee754/ldbl-96/e_gammal_r.c (gammal_positive): Take log of X_ADJ not X when adjusting exponent. (__ieee754_gammal_r): Do intermediate computations in round-to-nearest then adjust overflowing and underflowing results as needed. Use 1.0L not 1.0f as numerator of division. * math/libm-test.inc (tgamma_test_data): Remove one test. Moved to auto-libm-test-in. (tgamma_test): Use ALL_RM_TEST. * math/auto-libm-test-in: Add one test of tgamma. Mark some other tests of tgamma with spurious-overflow. * math/auto-libm-test-out: Regenerated. * math/gen-libm-have-vector-test.sh: Do not check for START. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
* Use round-to-nearest internally in jn, test with ALL_RM_TEST (bug 18602).Joseph Myers2015-06-251-0/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some existing jn tests, if run in non-default rounding modes, produce errors above those accepted in glibc, which causes problems for moving tests of jn to use ALL_RM_TEST. This patch makes jn set rounding to-nearest internally, as was done for yn some time ago, then computes the appropriate underflowing value for results that underflowed to zero in to-nearest, and moves the tests to ALL_RM_TEST. It does nothing about the general inaccuracy of Bessel function implementations in glibc, though it should make jn more accurate on average in non-default rounding modes through reduced error accumulation. The recomputation of results that underflowed to zero should as a side-effect fix some cases of bug 16559, where jn just used an exact zero, but that is *not* the goal of this patch and other cases of that bug remain unfixed. (Most of the changes in the patch are reindentation to add new scopes for SET_RESTORE_ROUND*.) Tested for x86_64, x86, powerpc and mips64. [BZ #16559] [BZ #18602] * sysdeps/ieee754/dbl-64/e_jn.c (__ieee754_jn): Set round-to-nearest internally then recompute results that underflowed to zero in the original rounding mode. * sysdeps/ieee754/flt-32/e_jnf.c (__ieee754_jnf): Likewise. * sysdeps/ieee754/ldbl-128/e_jnl.c (__ieee754_jnl): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_jnl.c (__ieee754_jnl): Likewise. * sysdeps/ieee754/ldbl-96/e_jnl.c (__ieee754_jnl): Likewise * math/libm-test.inc (jn_test): Use ALL_RM_TEST. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
* Refactor libm tests.Joseph Myers2015-06-248-8/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch refactors the libm tests using libm-test.inc to reduce the level of duplicate definitions. New headers are created for the definitions shared by tests for a particular type; by tests of inline functions; by tests of non-inline functions; by scalar tests; and by vector tests. The unused MATHCONST macro is removed. A new macro VEC_LEN is added to the vector headers to allow the macros defining wrappers for vector functions to be defined once, instead of six times each (differing only in vector length) as before. There is still scope for further refactoring, but this seems a useful start. Tested for x86_64. * math/test-double.h: New file. * math/test-float.h: Likewise. * math/test-ldouble.h: Likewise. * math/test-math-inline.h: Likewise. * math/test-math-no-inline.h: Likewise. * math/test-math-scalar.h: Likewise. * math/test-math-vector.h: Likewise. * math/test-vec-loop.h: Remove file. Contents moved into test-math-vector.h. * math/libm-test.inc (MATHCONST): Do not document macro. * math/test-double.c: Include test-double.h, test-math-no-inline.h and test-math-scalar.h. (FUNC): Remove macro. (FUNC_TEST): Likewise. (FLOAT): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_DOUBLE): Likewise. (TEST_MATHVEC): Likewise. (__NO_MATH_INLINES): Likewise. * math/test-float.c: Include test-float.h, test-math-no-inline.h and test-math-scalar.h. (FUNC): Remove macro. (FUNC_TEST): Likewise. (FLOAT): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_FLOAT): Likewise. (TEST_MATHVEC): Likewise. (__NO_MATH_INLINES): Likewise. * math/test-idouble.c: Include test-double.h, test-math-inline.h and test-math-scalar.h. (FUNC): Remove macro. (FUNC_TEST): Likewise. (FLOAT): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_DOUBLE): Likewise. (TEST_MATHVEC): Likewise. (TEST_INLINE): Likewise. (__NO_MATH_INLINES): Likewise. * math/test-ifloat.c: Include test-float.h, test-math-inline.h and test-math-scalar.h. (FUNC): Remove macro. (FUNC_TEST): Likewise. (FLOAT): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_FLOAT): Likewise. (TEST_MATHVEC): Likewise. (TEST_INLINE): Likewise. (__NO_MATH_INLINES): Likewise. * math/test-ildoubl.c: Include test-ldouble.h, test-math-inline.h and test-math-scalar.h. (FUNC): Remove macro. (FUNC_TEST): Likewise. (FLOAT): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_LDOUBLE): Likewise. (TEST_MATHVEC): Likewise. (TEST_INLINE): Likewise. (__NO_MATH_INLINES): Likewise. * math/test-ldouble.c: Include test-ldouble.h, test-math-no-inline.h and test-math-scalar.h. (FUNC): Remove macro. (FUNC_TEST): Likewise. (FLOAT): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_LDOUBLE): Likewise. (TEST_MATHVEC): Likewise. (__NO_MATH_INLINES): Likewise. * math/test-double-vlen2.h: Include test-double.h, test-math-no-inline.h and test-math-vector.h. (FLOAT): Remove macro. (FUNC): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_DOUBLE): Likewise. (TEST_MATHVEC): Likewise. (__NO_MATH_INLINES): Likewise. (CNCT): Likewise. (CONCAT): Likewise. (WRAPPER_NAME): Likewise. (WRAPPER_DECL): Likewise. (WRAPPER_DECL_ff): Likewise. (WRAPPER_DECL_fFF): Likewise. (VECTOR_WRAPPER): Likewise. (VECTOR_WRAPPER_ff): Likewise. (VECTOR_WRAPPER_fFF): Likewise. (VEC_LEN): New macro. * math/test-double-vlen4.h: Include test-double.h, test-math-no-inline.h and test-math-vector.h. (FLOAT): Remove macro. (FUNC): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_DOUBLE): Likewise. (TEST_MATHVEC): Likewise. (__NO_MATH_INLINES): Likewise. (CNCT): Likewise. (CONCAT): Likewise. (WRAPPER_NAME): Likewise. (WRAPPER_DECL): Likewise. (WRAPPER_DECL_ff): Likewise. (WRAPPER_DECL_fFF): Likewise. (VECTOR_WRAPPER): Likewise. (VECTOR_WRAPPER_ff): Likewise. (VECTOR_WRAPPER_fFF): Likewise. (VEC_LEN): New macro. * math/test-double-vlen8.h: Include test-double.h, test-math-no-inline.h and test-math-vector.h. (FLOAT): Remove macro. (FUNC): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_DOUBLE): Likewise. (TEST_MATHVEC): Likewise. (__NO_MATH_INLINES): Likewise. (CNCT): Likewise. (CONCAT): Likewise. (WRAPPER_NAME): Likewise. (WRAPPER_DECL): Likewise. (WRAPPER_DECL_ff): Likewise. (WRAPPER_DECL_fFF): Likewise. (VECTOR_WRAPPER): Likewise. (VECTOR_WRAPPER_ff): Likewise. (VECTOR_WRAPPER_fFF): Likewise. (VEC_LEN): New macro. * math/test-float-vlen4.h: Include test-float.h, test-math-no-inline.h and test-math-vector.h. (FLOAT): Remove macro. (FUNC): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_FLOAT): Likewise. (TEST_MATHVEC): Likewise. (__NO_MATH_INLINES): Likewise. (CNCT): Likewise. (CONCAT): Likewise. (WRAPPER_NAME): Likewise. (WRAPPER_DECL): Likewise. (WRAPPER_DECL_ff): Likewise. (WRAPPER_DECL_fFF): Likewise. (VECTOR_WRAPPER): Likewise. (VECTOR_WRAPPER_ff): Likewise. (VECTOR_WRAPPER_fFF): Likewise. (VEC_LEN): New macro. * math/test-float-vlen8.h: Include test-float.h, test-math-no-inline.h and test-math-vector.h. (FLOAT): Remove macro. (FUNC): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_FLOAT): Likewise. (TEST_MATHVEC): Likewise. (__NO_MATH_INLINES): Likewise. (CNCT): Likewise. (CONCAT): Likewise. (WRAPPER_NAME): Likewise. (WRAPPER_DECL): Likewise. (WRAPPER_DECL_ff): Likewise. (WRAPPER_DECL_fFF): Likewise. (VECTOR_WRAPPER): Likewise. (VECTOR_WRAPPER_ff): Likewise. (VECTOR_WRAPPER_fFF): Likewise. (VEC_LEN): New macro. * math/test-float-vlen16.h: Include test-float.h, test-math-no-inline.h and test-math-vector.h. (FLOAT): Remove macro. (FUNC): Likewise. (MATHCONST): Likewise. (PRINTF_EXPR): Likewise. (PRINTF_XEXPR): Likewise. (PRINTF_NEXPR): Likewise. (TEST_FLOAT): Likewise. (TEST_MATHVEC): Likewise. (__NO_MATH_INLINES): Likewise. (CNCT): Likewise. (CONCAT): Likewise. (WRAPPER_NAME): Likewise. (WRAPPER_DECL): Likewise. (WRAPPER_DECL_ff): Likewise. (WRAPPER_DECL_fFF): Likewise. (VECTOR_WRAPPER): Likewise. (VECTOR_WRAPPER_ff): Likewise. (VECTOR_WRAPPER_fFF): Likewise. (VEC_LEN): New macro. * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Do not include test-vec-loop.h. * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Likewise.
* Fix cexp, ccos, ccosh, csin, csinh spurious underflows (bug 18594).Joseph Myers2015-06-241-0/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cexp, ccos, ccosh, csin and csinh have spurious underflows in cases where they compute sin of the smallest normal, that produces an underflow exception (depending on which sin implementation is in use) but the final result does not underflow. ctan and ctanh may also have such underflows, or they may be latent (the issue there is that e.g. ctan (DBL_MIN) should, rounded upwards, be the next double value above DBL_MIN, which under glibc's accuracy goals may not have an underflow exception, but the intermediate computation of sin (DBL_MIN) would legitimately underflow on before-rounding architectures). This patch fixes all those functions so they use plain comparisons (> DBL_MIN etc.) instead of comparing the result of fpclassify with FP_SUBNORMAL (in all these cases, we already know the number being compared is finite). Note that in the case of csin / csinf / csinl, there is no need for fabs calls in the comparison because the real part has already been reduced to its absolute value. As the patch fixes the failures that previously obstructed moving tests of cexp to use ALL_RM_TEST, those tests are moved to ALL_RM_TEST by the patch (two functions remain yet to be converted). Tested for x86_64 and x86 and ulps updated accordingly. [BZ #18594] * math/s_ccosh.c (__ccosh): Compare with least normal value instead of comparing class with FP_SUBNORMAL. * math/s_ccoshf.c (__ccoshf): Likewise. * math/s_ccoshl.c (__ccoshl): Likewise. * math/s_cexp.c (__cexp): Likewise. * math/s_cexpf.c (__cexpf): Likewise. * math/s_cexpl.c (__cexpl): Likewise. * math/s_csin.c (__csin): Likewise. * math/s_csinf.c (__csinf): Likewise. * math/s_csinh.c (__csinh): Likewise. * math/s_csinhf.c (__csinhf): Likewise. * math/s_csinhl.c (__csinhl): Likewise. * math/s_csinl.c (__csinl): Likewise. * math/s_ctan.c (__ctan): Likewise. * math/s_ctanf.c (__ctanf): Likewise. * math/s_ctanh.c (__ctanh): Likewise. * math/s_ctanhf.c (__ctanhf): Likewise. * math/s_ctanhl.c (__ctanhl): Likewise. * math/s_ctanl.c (__ctanl): Likewise. * math/auto-libm-test-in: Add more tests of ccos, ccosh, cexp, csin, csinh, ctan and ctanh. * math/auto-libm-test-out: Regenerated. * math/libm-test.inc (cexp_test): Use ALL_RM_TEST. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
* Fix csin, csinh overflow in directed rounding modes (bug 18593).Joseph Myers2015-06-241-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | csin and csinh can produce bad results when overflowing in directed rounding modes, because a multiplication that can overflow is followed by a possible negation. This patch fixes this by negating one of the arguments of the multiplication before the multiplication instead of negating the result. The new tests for this issue are added to auto-libm-test-in, starting use of that file for csin and csinh. The issue was found in the course of moving existing tests for csin and csinh (existing tests, by being enabled in more cases than previously, showed the issue for float and double but not for long double); that move will now be done separately. Tested for x86_64 and x86 and ulps updated accordingly. [BZ #18593] * math/s_csin.c (__csin): Negate before rather than after possibly overflowing multiplication. * math/s_csinf.c (__csinf): Likewise. * math/s_csinh.c (__csinh): Likewise. * math/s_csinhf.c (__csinhf): Likewise. * math/s_csinhl.c (__csinhl): Likewise. * math/s_csinl.c (__csinl): Likewise. * math/auto-libm-test-in: Add some tests of csin and csinh. * math/auto-libm-test-out: Regenerated. * math/libm-test.inc (csin_test_data): Use AUTO_TESTS_c_c. (csinh_test_data): Likewise. * sysdeps/x86_64/fpu/libm-test-ulps: Update.
* Combination of data tables for x86_64 vector functions sinf, cosf and sincosf.Andrew Senkevich2015-06-2418-3586/+197
| | | | | | | | | | | | | | | | | | | | | | * sysdeps/x86_64/fpu/Makefile (libmvec-support): Fixed files list. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf4_core_sse4.S: Renamed variable and included header. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf8_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf4_core_sse4.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf8_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf4_core_sse4.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/svml_s_trig_data.S: New file. * sysdeps/x86_64/fpu/svml_s_trig_data.h: Likewise. * sysdeps/x86_64/fpu/svml_s_cosf_data.S: Removed file. * sysdeps/x86_64/fpu/svml_s_cosf_data.h: Likewise. * sysdeps/x86_64/fpu/svml_s_sinf_data.S: Likewise. * sysdeps/x86_64/fpu/svml_s_sinf_data.h: Likewise. * sysdeps/x86_64/fpu/svml_s_sincosf_data.S: Likewise. * sysdeps/x86_64/fpu/svml_s_sincosf_data.h: Likewise.
* Fix atomic_full_barrier on x86 and x86_64.Torvald Riegel2015-06-231-0/+7
| | | | | | | | | This fixes BZ #17403 by defining atomic_full_barrier, atomic_read_barrier, and atomic_write_barrier on x86 and x86_64. A full barrier is implemented through an atomic idempotent modification to the stack and not through using mfence because the latter can supposedly be somewhat slower due to having to provide stronger guarantees wrt. self-modifying code, for example.
* Combination of data tables for x86_64 vector functions sin, cos and sincos.Andrew Senkevich2015-06-2320-439/+173
| | | | | | | | | | | | | | | | | | | | | | | | | * sysdeps/x86_64/fpu/Makefile (libmvec-support): Fixed files list. * sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core_sse4.S: Renamed variable and included header. * sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core_sse4.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core_sse4.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core_avx2.S: Likewise. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core_avx512.S: Likewise. * sysdeps/x86_64/fpu/svml_d_trig_data.S: New file. * sysdeps/x86_64/fpu/svml_d_trig_data.h: Likewise. * sysdeps/x86_64/fpu/svml_d_cos2_core.S: Removed unneeded include. * sysdeps/x86_64/fpu/svml_d_cos4_core.S: Likewise. * sysdeps/x86_64/fpu/svml_d_cos8_core.S: Likewise. * sysdeps/x86_64/fpu/svml_d_cos_data.S: Removed file. * sysdeps/x86_64/fpu/svml_d_cos_data.h: Likewise. * sysdeps/x86_64/fpu/svml_d_sin_data.S: Likewise. * sysdeps/x86_64/fpu/svml_d_sin_data.h: Likewise. * sysdeps/x86_64/fpu/svml_d_sincos_data.S: Likewise. * sysdeps/x86_64/fpu/svml_d_sincos_data.h: Likewise.
* Fix x86_64 / x86 expm1l (-min_subnorm) result sign (bug 18569).Joseph Myers2015-06-211-0/+6
| | | | | | | | | | | | | | | | | | | | In the x86 / x86_64 implementations of expm1l, when expm1l's result should underflow to 0 (argument minus the least subnormal, in some rounding modes), it can be a zero of the wrong sign. This patch fixes this by returning the argument with underflow forced in that case (this is a 1ulp error relative to the correctly rounded result of -0, which is OK in terms of the documented accuracy goals, whereas a result with the wrong sign never is). Tested for x86_64 and x86. [BZ #18569] * sysdeps/i386/fpu/e_expl.S (IEEE754_EXPL) [USE_AS_EXPM1L]: Force underflow and return argument in case of subnormal argument. * sysdeps/x86_64/fpu/e_expl.S (IEEE754_EXPL) [USE_AS_EXPM1L]: Likewise. * math/auto-libm-test-in: Add more tests of expm1. * math/auto-libm-test-out: Regenerated.
* Fix x86 / x86_64 expl, exp10l missing underflows (bug 16361).Joseph Myers2015-06-211-1/+14
| | | | | | | | | | | | | | | | | | | | | Similar to various other bugs in this area, the x86 and x86_64 implementations of expl / exp10l can fail to produce underflow exceptions when the unscaled result has trailing 0 bits so the scaling down to subnormal precision is exact. This patch fixes this by forcing the exception in the case of tiny results. Tested for x86_64 and x86. [BZ #16361] * sysdeps/i386/fpu/e_expl.S [!USE_AS_EXPM1L] (cmin): New object. [!USE_AS_EXPM1L] (IEEE754_EXPL): Force underflow exception for tiny results. * sysdeps/x86_64/fpu/e_expl.S [!USE_AS_EXPM1L] (cmin): New object. [!USE_AS_EXPM1L] (IEEE754_EXPL): Force underflow exception for tiny results. * math/auto-libm-test-in: Add more tests of exp and exp10. Do not mark underflow exceptions as possibly missing for bug 16361. * math/auto-libm-test-out: Regenerated.
* Vector sincosf for x86_64 and tests.Andrew Senkevich2015-06-1825-41/+2614
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here is implementation of vectorized sincosf containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * NEWS: Mention addition of x86_64 vector sincosf. * math/test-float-vlen16.h: Added wrapper for sincosf tests. * math/test-float-vlen4.h: Likewise. * math/test-float-vlen8.h: Likewise. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added. * sysdeps/x86/fpu/bits/math-vector.h: Added sincosf SIMD declaration. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core.S * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core_avx512.S * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf4_core.S * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf4_core_sse4.S * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core.S * sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core_avx2.S * sysdeps/x86_64/fpu/svml_s_sincosf16_core.S * sysdeps/x86_64/fpu/svml_s_sincosf4_core.S * sysdeps/x86_64/fpu/svml_s_sincosf8_core.S * sysdeps/x86_64/fpu/svml_s_sincosf8_core_avx.S * sysdeps/x86_64/fpu/svml_s_sincosf_data.S: New file. * sysdeps/x86_64/fpu/svml_s_sincosf_data.h: New file. * sysdeps/x86_64/fpu/svml_s_wrapper_impl.h: Added 3 argument wrappers. * sysdeps/x86_64/fpu/test-float-vlen16.c: : Vector sincosf tests. * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise.
* Vector sincos for x86_64 and tests.Andrew Senkevich2015-06-1825-1/+1770
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here is implementation of vectorized sincos containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * NEWS: Mention addition of x86_64 vector sincos. * bits/libm-simd-decl-stubs.h: Added stubs for sincos. * math/math.h (__MATHDECL_VEC): New macro. * math/bits/mathcalls.h: Added sincos declaration with __MATHDECL_VEC. * math/gen-libm-have-vector-test.sh: Added generation of sincos wrapper declaration under condition. * math/test-vec-loop.h (TEST_VEC_LOOP): Refactored. * math/test-double-vlen2.h: Added wrapper for sincos tests, reflected TEST_VEC_LOOP change. * math/test-double-vlen4.h: Likewise. * math/test-double-vlen8.h: Likewise. * math/test-float-vlen16.h: Reflected TEST_VEC_LOOP change. * math/test-float-vlen4.h: Likewise. * math/test-float-vlen8.h: Likewise. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added. * sysdeps/x86/fpu/bits/math-vector.h: Added sincos SIMD declaration. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos2_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos4_core_avx2.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core_avx512.S: New file. * sysdeps/x86_64/fpu/svml_d_sincos2_core.S: New file. * sysdeps/x86_64/fpu/svml_d_sincos4_core.S: New file. * sysdeps/x86_64/fpu/svml_d_sincos4_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_d_sincos8_core.S: New file. * sysdeps/x86_64/fpu/svml_d_sincos_data.S: New file. * sysdeps/x86_64/fpu/svml_d_sincos_data.h: New file. * sysdeps/x86_64/fpu/svml_d_wrapper_impl.h: Added wrappers for sincos. * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Vector sincos tests. * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise.
* Vector powf for x86_64 and tests.Andrew Senkevich2015-06-1825-2/+5586
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here is implementation of vectorized powf containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm redirections for powf. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/svml_s_wrapper_impl.h: Added 2 argument wrappers. * sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core_avx512.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_powf4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_powf4_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_powf8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_powf8_core_avx2.S: New file. * sysdeps/x86_64/fpu/svml_s_powf16_core.S: New file. * sysdeps/x86_64/fpu/svml_s_powf4_core.S: New file. * sysdeps/x86_64/fpu/svml_s_powf8_core.S: New file. * sysdeps/x86_64/fpu/svml_s_powf8_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_s_powf_data.S: New file. * sysdeps/x86_64/fpu/svml_s_powf_data.h: New file. * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Vector powf tests. * sysdeps/x86_64/fpu/test-float-vlen16.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise. * math/test-float-vlen16.h: Fixed 2 argument macro. * math/test-float-vlen4.h: Likewise. * math/test-float-vlen8.h: Likewise. * NEWS: Mention addition of x86_64 vector powf.
* Vector pow for x86_64 and tests.Andrew Senkevich2015-06-1725-2/+6886
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here is implementation of vectorized pow containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * bits/libm-simd-decl-stubs.h: Added stubs for pow. * math/bits/mathcalls.h: Added pow declaration with __MATHCALL_VEC. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm redirections for pow. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/svml_d_wrapper_impl.h: Added 2 argument wrappers. * sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_pow2_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_pow4_core_avx2.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core_avx512.S: New file. * sysdeps/x86_64/fpu/svml_d_pow2_core.S: New file. * sysdeps/x86_64/fpu/svml_d_pow4_core.S: New file. * sysdeps/x86_64/fpu/svml_d_pow4_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_d_pow8_core.S: New file. * sysdeps/x86_64/fpu/svml_d_pow_data.S: New file. * sysdeps/x86_64/fpu/svml_d_pow_data.h: New file. * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Added vector pow test. * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise. * NEWS: Mention addition of x86_64 vector pow.
* Vector expf for x86_64 and tests.Andrew Senkevich2015-06-1724-1/+1214
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here is implementation of vectorized expf containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm redirections for expf. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core_avx512.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_expf4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_expf4_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_expf8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_expf8_core_avx2.S: New file. * sysdeps/x86_64/fpu/svml_s_expf16_core.S: New file. * sysdeps/x86_64/fpu/svml_s_expf4_core.S: New file. * sysdeps/x86_64/fpu/svml_s_expf8_core.S: New file. * sysdeps/x86_64/fpu/svml_s_expf8_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_s_expf_data.S: New file. * sysdeps/x86_64/fpu/svml_s_expf_data.h: New file. * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Vector expf tests. * sysdeps/x86_64/fpu/test-float-vlen16.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise. * NEWS: Mention addition of x86_64 vector expf.
* Vector exp for x86_64 and tests.Andrew Senkevich2015-06-1724-2/+2281
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here is implementation of vectorized exp containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * bits/libm-simd-decl-stubs.h: Added stubs for exp. * math/bits/mathcalls.h: Added exp declaration with __MATHCALL_VEC. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm redirections for exp. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_exp4_core_avx2.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core_avx512.S: New file. * sysdeps/x86_64/fpu/svml_d_exp2_core.S: New file. * sysdeps/x86_64/fpu/svml_d_exp4_core.S: New file. * sysdeps/x86_64/fpu/svml_d_exp4_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_d_exp8_core.S: New file. * sysdeps/x86_64/fpu/svml_d_exp_data.S: New file. * sysdeps/x86_64/fpu/svml_d_exp_data.h: New file. * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Added vector exp test. * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise. * NEWS: Mention addition of x86_64 vector exp.
* Vector logf for x86_64 and tests.Andrew Senkevich2015-06-1724-2/+1191
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here is implementation of vectorized logf containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm redirections for logf. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_s_logf16_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_logf16_core_avx512.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_logf4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_logf4_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_logf8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_logf8_core_avx2.S: New file. * sysdeps/x86_64/fpu/svml_s_logf16_core.S: New file. * sysdeps/x86_64/fpu/svml_s_logf4_core.S: New file. * sysdeps/x86_64/fpu/svml_s_logf8_core.S: New file. * sysdeps/x86_64/fpu/svml_s_logf8_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_s_logf_data.S: New file. * sysdeps/x86_64/fpu/svml_s_logf_data.h: New file. * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Vector logf tests. * sysdeps/x86_64/fpu/test-float-vlen16.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise. * NEWS: Mention addition of x86_64 vector logf.
* Vector log for x86_64 and tests.Andrew Senkevich2015-06-1724-0/+2871
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here is implementation of vectorized log containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * bits/libm-simd-decl-stubs.h: Added stubs for log. * math/bits/mathcalls.h: Added log declaration with __MATHCALL_VEC. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration and asm redirections for log. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_d_log2_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_log2_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_log4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_log4_core_avx2.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_log8_core_avx512.S: New file. * sysdeps/x86_64/fpu/svml_d_log2_core.S: New file. * sysdeps/x86_64/fpu/svml_d_log4_core.S: New file. * sysdeps/x86_64/fpu/svml_d_log4_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_d_log8_core.S: New file. * sysdeps/x86_64/fpu/svml_d_log_data.S: New file. * sysdeps/x86_64/fpu/svml_d_log_data.h: New file. * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Added vector log test. * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise. * NEWS: Mention addition of x86_64 vector log.
* Vector sinf for x86_64 and tests.Andrew Senkevich2015-06-1524-1/+2339
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here is implementation of vectorized sinf containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New symbols added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration for sinf. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core_avx512.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf4_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_sinf8_core_avx2.S: New file. * sysdeps/x86_64/fpu/svml_s_sinf16_core.S: New file. * sysdeps/x86_64/fpu/svml_s_sinf4_core.S: New file. * sysdeps/x86_64/fpu/svml_s_sinf8_core.S: New file. * sysdeps/x86_64/fpu/svml_s_sinf8_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_s_sinf_data.S: New file. * sysdeps/x86_64/fpu/svml_s_sinf_data.h: New file. * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: Vector sinf tests. * sysdeps/x86_64/fpu/test-float-vlen16.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-float-vlen8.c: Likewise. * NEWS: Mention addition of x86_64 vector sinf.
* Vector sin for x86_64 and tests.Andrew Senkevich2015-06-1124-3/+1290
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here is implementation of vectorized sin containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * bits/libm-simd-decl-stubs.h: Added stubs for sin. * math/bits/mathcalls.h: Added sin declaration with __MATHCALL_VEC. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added. * sysdeps/x86/fpu/bits/math-vector.h: SIMD declaration for sin. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core_avx2.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core_avx512.S: New file. * sysdeps/x86_64/fpu/svml_d_sin2_core.S: New file. * sysdeps/x86_64/fpu/svml_d_sin4_core.S: New file. * sysdeps/x86_64/fpu/svml_d_sin4_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_d_sin8_core.S: New file. * sysdeps/x86_64/fpu/svml_d_sin_data.S: New file. * sysdeps/x86_64/fpu/svml_d_sin_data.h: New file. * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Added vector sin test. * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise. * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise. * NEWS: Mention addition of x86_64 vector sin.
* More strict check of AVX512 support in assembler.Andrew Senkevich2015-06-112-0/+2
| | | | | | | | Binutils 2.24 doesn't support some AVX512 instructions with ZMM registers, so we need add more strict check. * configure.ac: Added more strict check. * configure: Regenerated.
* Fix regcomp wcscoll, wcscmp namespace (bug 18497).Joseph Myers2015-06-091-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | regcomp brings in references to wcscoll, which isn't in all the standards that contain regcomp. In turn, wcscoll brings in references to wcscmp, also not in all those standards. This patch fixes this by making those functions into weak aliases of __wcscoll and __wcscmp and calling those names instead as needed. Tested for x86_64 and x86 (testsuite, and that disassembly of installed shared libraries is unchanged by the patch). [BZ #18497] * wcsmbs/wcscmp.c [!WCSCMP] (WCSCMP): Define as __wcscmp instead of wcscmp. (wcscmp): Define as weak alias of WCSCMP. * wcsmbs/wcscoll.c (STRCOLL): Define as __wcscoll instead of wcscoll. (USE_HIDDEN_DEF): Define. [!USE_IN_EXTENDED_LOCALE_MODEL] (wcscoll): Define as weak alias of __wcscoll. Don't use libc_hidden_weak. * wcsmbs/wcscoll_l.c (STRCMP): Define as __wcscmp instead of wcscmp. * sysdeps/i386/i686/multiarch/wcscmp-c.c [SHARED] (libc_hidden_def): Define __GI___wcscmp instead of __GI_wcscmp. (weak_alias): Undefine and redefine. * sysdeps/i386/i686/multiarch/wcscmp.S (wcscmp): Rename to __wcscmp and define as weak alias of __wcscmp. * sysdeps/x86_64/wcscmp.S (wcscmp): Likewise. * include/wchar.h (__wcscmp): Declare. Use libc_hidden_proto. (__wcscoll): Likewise. (wcscmp): Don't use libc_hidden_proto. (wcscoll): Likewise. * posix/regcomp.c (build_range_exp): Call __wcscoll instead of wcscoll. * posix/regexec.c (check_node_accept_bytes): Likewise. * conform/Makefile (test-xfail-XPG3/regex.h/linknamespace): Remove variable. (test-xfail-XPG4/regex.h/linknamespace): Likewise. (test-xfail-POSIX/regex.h/linknamespace): Likewise.
* This patch adds vector cosf tests.Andrew Senkevich2015-06-0910-2/+221
| | | | | | | | | | | | | | | | | | * math/Makefile: Added CFLAGS for new tests. * math/test-float-vlen16.h: New file. * math/test-float-vlen4.h: New file. * math/test-float-vlen8.h: New file. * math/test-double-vlen2.h: Fixed 2 argument macro and comment. * sysdeps/x86_64/fpu/Makefile: Added new tests and variables. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated. * sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c: New file. * sysdeps/x86_64/fpu/test-float-vlen16.c: New file. * sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c: New file. * sysdeps/x86_64/fpu/test-float-vlen4.c: New file. * sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c: New file. * sysdeps/x86_64/fpu/test-float-vlen8-avx2.c: New file. * sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c: New file. * sysdeps/x86_64/fpu/test-float-vlen8.c: New file.
* Vector cosf for x86_64.Andrew Senkevich2015-06-0916-2/+2430
| | | | | | | | | | | | | | | | | | | | | | | | | | | Here is implementation of vectorized cosf containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files. * sysdeps/x86_64/fpu/Versions: New versions added. * sysdeps/x86_64/fpu/svml_s_cosf4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf4_core_sse4.S: New file. * sysdeps/x86_64/fpu/svml_s_cosf8_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_s_cosf8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf8_core_avx2.S: New file. * sysdeps/x86_64/fpu/svml_s_cosf16_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core_avx512.S: New file. * sysdeps/x86_64/fpu/svml_s_wrapper_impl.h: New file. * sysdeps/x86_64/fpu/svml_s_cosf_data.S: New file. * sysdeps/x86_64/fpu/svml_s_cosf_data.h: New file. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration for cosf. * NEWS: Mention addition of x86_64 vector cosf.
* Addition of testing infrastructure for vector math functions.Andrew Senkevich2015-06-0911-0/+299
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We test vector math functions using scalar tests infrastructure with help of special wrappers from scalar versions to vector ones. Wrapper implemented using platform specific vector types and placed in separate file for compilation with architecture specific options, main part of test has no such options. With help of system of definitions unfolding of which is drived from test code we have wrapper called in individual testing function instead of scalar function. Also system of definitions includes generated during make check header math/libm-have-vector-test.h with series of conditional definitions which help to avoid build fails for functions having no vector versions; runtime architecture check to prevent runtime fails of test run on inappropriate hardware. * math/Makefile: Added rules for vector tests. * math/gen-libm-have-vector-test.sh: Added generation of wrapper declaration under condition. * math/test-double-vlen2.h: New file. * math/test-double-vlen4.h: New file. * math/test-double-vlen8.h: New file. * math/test-vec-loop.h: Added initialization macro. * sysdeps/x86_64/fpu/Makefile: Added variables for vector tests. * sysdeps/x86_64/fpu/libm-test-ulps: Regenarated. * sysdeps/x86_64/fpu/math-tests-arch.h: New file. * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: New file. * sysdeps/x86_64/fpu/test-double-vlen2.c: New file. * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: New file. * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: New file. * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: New file. * sysdeps/x86_64/fpu/test-double-vlen4.c: New file. * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: New file. * sysdeps/x86_64/fpu/test-double-vlen8.c: New file.
* Start of series of patches with x86_64 vector math functions.Andrew Senkevich2015-06-0919-0/+1412
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here is implementation of cos containing SSE, AVX, AVX2 and AVX512 versions according to Vector ABI which had been discussed in <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>. Vector math library build and ABI testing enabled by default for x86_64. * sysdeps/x86_64/fpu/Makefile: New file. * sysdeps/x86_64/fpu/Versions: New file. * sysdeps/x86_64/fpu/svml_d_cos_data.S: New file. * sysdeps/x86_64/fpu/svml_d_cos_data.h: New file. * sysdeps/x86_64/fpu/svml_d_cos2_core.S: New file. * sysdeps/x86_64/fpu/svml_d_cos4_core.S: New file. * sysdeps/x86_64/fpu/svml_d_cos4_core_avx.S: New file. * sysdeps/x86_64/fpu/svml_d_cos8_core.S: New file. * sysdeps/x86_64/fpu/svml_d_wrapper_impl.h: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core_sse4.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core_avx2.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core.S: New file. * sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core_avx512.S: New file. * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added build of SSE, AVX2 and AVX512 IFUNC versions. * sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration for cos. * math/bits/mathcalls.h: Added cos declaration with __MATHCALL_VEC. * sysdeps/x86_64/configure.ac: Options for libmvec build. * sysdeps/x86_64/configure: Regenerated. * sysdeps/x86_64/sysdep.h (cfi_offset_rel_rsp): New macro. * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New file. * manual/install.texi (Configuring and compiling): Document --disable-mathvec. * INSTALL: Regenerated. * NEWS: Mention addition of libmvec and x86_64 vector cos.
* This patch adds detection of availability for AVX512F and AVX512DQ ISAs.Andrew Senkevich2015-06-082-0/+34
| | | | | | | | * sysdeps/x86_64/multiarch/init-arch.h (bit_AVX512F_Usable, bit_AVX512DQ_Usable, bit_Opmask_state, bit_ZMM0_15_state, bit_ZMM16_31_state): New macro. * sysdeps/x86_64/multiarch/init-arch.c (__init_cpu_features): Check and set bit_AVX512F_Usable, bit_AVX512DQ_Usable.
* Remove various ABS macros and replace uses with fabs (or in one case abs)Wilco Dijkstra2015-05-151-0/+1
| | | | which is more efficient on all targets.
* Use strspn/strcspn/strpbrk ifunc in internal calls.Ondřej Bílka2015-05-122-13/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To make a strtok faster and improve performance in general we need to do one additional change. A comment: /* It doesn't make sense to send libc-internal strcspn calls through a PLT. The speedup we get from using SSE4.2 instruction is likely eaten away by the indirect call in the PLT. */ Does not make sense at all because nobody bothered to check it. Gap between these implementations is quite big, when haystack is empty a sse2 is around 40 cycles slower because it needs to populate a lookup table and difference only increases with size. That is much bigger than plt slowdown which is few cycles. Even benchtest show a gap which also may be reverse by branch misprediction but my internal benchmark shown. simple_strspn stupid_strspn __strspn_sse42 __strspn_sse2 Length 0, alignment 0, acc len 6: 18.6562 35.2344 17.0469 61.6719 Length 6, alignment 0, acc len 6: 59.5469 72.5781 16.4219 73.625 This patch also handles strpbrk which is implemented by including a x86_64/multiarch/strcspn.S file. * sysdeps/x86_64/multiarch/strspn.S: Remove plt indirection. * sysdeps/x86_64/multiarch/strcspn.S: Likewise.
* Add more tests of csqrt, lgamma, log10, sinh.Joseph Myers2015-05-081-14/+14
| | | | | | | | | | | | | This patch adds more randomly-generated tests of various libm functions that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of csqrt, lgamma, log10 and sinh. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
* Add more tests of acosh, atanh, cos, csqrt, erfc, sin, sincos.Joseph Myers2015-05-061-18/+18
| | | | | | | | | | | | | This patch adds more randomly-generated tests of various libm functions that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of acosh, atanh, cos, csqrt, erfc, sin and sincos. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
* Add further tests of libm functions.Joseph Myers2015-05-051-44/+52
| | | | | | | | | | | | | | | | | This patch adds more randomly-generated tests of various libm functions that are observed to increase ulps on x86_64. (This process must eventually converge, when my random test generation stops finding inputs that increase the listed ulps, except maybe for any cases uncovered where the errors exceed the maximum allowed 9ulp error and so indicate actual libm bugs needing fixing.) Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of acosh, atanh, clog, clog10, csqrt, erfc, exp2, expm1, log10, log2 and sinh. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
* Add more tests of libm functions.Joseph Myers2015-05-021-38/+42
| | | | | | | | | | | | | | This patch adds more randomly-generated tests of various libm functions that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of atan, clog, clog10, cos, csqrt, erf, erfc, exp2, lgamma, log1p, sin, sincos, tanh and tgamma. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
* Add more tests of tgamma.Joseph Myers2015-05-011-6/+6
| | | | | | | | | | | | This patch adds some randomly-generated tests of tgamma that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of tgamma. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
* Add more tests of tanh.Joseph Myers2015-05-011-18/+26
| | | | | | | | | | | | This patch adds some randomly-generated tests of tanh that are observed to increase ulps on x86_64. Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add more tests of tanh. * math/auto-libm-test-out: Regenerated. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.