| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If assembler doesn't support AVX512DQ, _dl_runtime_resolve_avx is used
to save the first 8 vector registers, which only saves the lower 256
bits of vector register, for lazy binding. When it is called on AVX512
platform, the upper 256 bits of ZMM registers are clobbered. Parameters
passed in ZMM registers will be wrong when the function is called the
first time. This patch requires binutils 2.24, whose assembler can store
and load ZMM registers, to build x86-64 glibc. Since mathvec library
needs assembler support for AVX512DQ, we disable mathvec if assembler
doesn't support AVX512DQ.
[BZ #20139]
* config.h.in (HAVE_AVX512_ASM_SUPPORT): Renamed to ...
(HAVE_AVX512DQ_ASM_SUPPORT): This.
* sysdeps/x86_64/configure.ac: Require assembler from binutils
2.24 or above.
(HAVE_AVX512_ASM_SUPPORT): Removed.
(HAVE_AVX512DQ_ASM_SUPPORT): New.
* sysdeps/x86_64/configure: Regenerated.
* sysdeps/x86_64/dl-trampoline.S: Make HAVE_AVX512_ASM_SUPPORT
check unconditional.
* sysdeps/x86_64/multiarch/ifunc-impl-list.c: Likewise.
* sysdeps/x86_64/multiarch/memcpy.S: Likewise.
* sysdeps/x86_64/multiarch/memcpy_chk.S: Likewise.
* sysdeps/x86_64/multiarch/memmove-avx512-no-vzeroupper.S:
Likewise.
* sysdeps/x86_64/multiarch/memmove-avx512-unaligned-erms.S:
Likewise.
* sysdeps/x86_64/multiarch/memmove.S: Likewise.
* sysdeps/x86_64/multiarch/memmove_chk.S: Likewise.
* sysdeps/x86_64/multiarch/mempcpy.S: Likewise.
* sysdeps/x86_64/multiarch/mempcpy_chk.S: Likewise.
* sysdeps/x86_64/multiarch/memset-avx512-no-vzeroupper.S:
Likewise.
* sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S:
Likewise.
* sysdeps/x86_64/multiarch/memset.S: Likewise.
* sysdeps/x86_64/multiarch/memset_chk.S: Likewise.
* sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core_avx512.S: Check
HAVE_AVX512DQ_ASM_SUPPORT instead of HAVE_AVX512_ASM_SUPPORT.
* sysdeps/x86_64/fpu/multiarch/svml_d_exp8_core_avx512.S:
Likewise.
* sysdeps/x86_64/fpu/multiarch/svml_d_log8_core_avx512.S:
Likewise.
* sysdeps/x86_64/fpu/multiarch/svml_d_pow8_core_avx512.S:
Likewise.
* sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core_avx512.S:
Likewise.
* sysdeps/x86_64/fpu/multiarch/svml_d_sincos8_core_avx512.:
Likewise.
* sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core_avx512.S:
Likewise.
* sysdeps/x86_64/fpu/multiarch/svml_s_expf16_core_avx512.S:
Likewise.
* sysdeps/x86_64/fpu/multiarch/svml_s_logf16_core_avx512.S:
Likewise.
* sysdeps/x86_64/fpu/multiarch/svml_s_powf16_core_avx512.S:
Likewise.
* sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core_avx51:
Likewise.
* sysdeps/x86_64/fpu/multiarch/svml_s_sinf16_core_avx512.S:
Likewise.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since x86-64 memcpy-avx512-no-vzeroupper.S implements memmove, make
__memcpy_avx512_no_vzeroupper an alias of __memmove_avx512_no_vzeroupper
to reduce code size of libc.so.
* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Remove
memcpy-avx512-no-vzeroupper.
* sysdeps/x86_64/multiarch/memcpy-avx512-no-vzeroupper.S: Renamed
to ...
* sysdeps/x86_64/multiarch/memmove-avx512-no-vzeroupper.S: This.
(MEMCPY): Don't define.
(MEMCPY_CHK): Likewise.
(MEMPCPY): Likewise.
(MEMPCPY_CHK): Likewise.
(MEMPCPY_CHK): Renamed to ...
(__mempcpy_chk_avx512_no_vzeroupper): This.
(MEMPCPY_CHK): Renamed to ...
(__mempcpy_chk_avx512_no_vzeroupper): This.
(MEMCPY_CHK): Renamed to ...
(__memmove_chk_avx512_no_vzeroupper): This.
(MEMCPY): Renamed to ...
(__memmove_avx512_no_vzeroupper): This.
(__memcpy_avx512_no_vzeroupper): New alias.
(__memcpy_chk_avx512_no_vzeroupper): Likewise.
|
|
Added AVX512 implementations of memcpy, mempcpy, memmove, memcpy_chk,
mempcpy_chk, memmove_chk.
It shows average improvement more than 30% over AVX versions on KNL
hardware (performance results in the thread
<https://sourceware.org/ml/libc-alpha/2016-01/msg00258.html>).
* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Added new files.
* sysdeps/x86_64/multiarch/ifunc-impl-list.c: Added new tests.
* sysdeps/x86_64/multiarch/memcpy-avx512-no-vzeroupper.S: New file.
* sysdeps/x86_64/multiarch/mempcpy-avx512-no-vzeroupper.S: Likewise.
* sysdeps/x86_64/multiarch/memmove-avx512-no-vzeroupper.S: Likewise.
* sysdeps/x86_64/multiarch/memcpy.S: Added new IFUNC branch.
* sysdeps/x86_64/multiarch/memcpy_chk.S: Likewise.
* sysdeps/x86_64/multiarch/memmove.c: Likewise.
* sysdeps/x86_64/multiarch/memmove_chk.c: Likewise.
* sysdeps/x86_64/multiarch/mempcpy.S: Likewise.
* sysdeps/x86_64/multiarch/mempcpy_chk.S: Likewise.
|