author | Noah Goldstein <goldstein.w.n@gmail.com> | 2022-06-28 08:26:26 -0700 |
---|---|---|
committer | Noah Goldstein <goldstein.w.n@gmail.com> | 2022-06-29 19:47:52 -0700 |
commit | 58bcf7b71a113378dd490f6c41931a14f25a26c9 (patch) | |
tree | ba53cf12cc40a9f9b70b4d74cbb08735feded071 /sysdeps/x86_64/dl-trampoline.S | |
parent | 21925f64730d52eb7d8b2fb62b412f8ab92b0caf (diff) | |
x86-64: Small improvements to dl-trampoline.S
1. Remove SSE2 instructions when using the AVX512 or AVX version.

2. Fix up some formatting nits in how the address offsets were aligned.

3. Use more space-efficient instructions in the conditional AVX restore.
    - vpcmpeqq -> vpcmpeqb
    - cmp imm32, r; jz -> inc r; jz

4. Use `rep movsb` instead of `rep movsq`. The former is guaranteed to be fast when the ERMS flag is set; the latter is not. The latter also needs an extra instruction to set up the size.
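The `cmp imm32, r; jz -> inc r; jz` substitution in item 3 relies on wraparound: an all-ones mask becomes zero when incremented, so the zero flag can stand in for the equality test. The C sketch below illustrates this under the assumption that the mask is 16 bits wide (one bit per byte of a 16-byte compare). The commit message does not state the width, and the helper names are hypothetical, not taken from glibc.

```c
#include <stdint.h>

/* Hypothetical helper: direct form, analogous to `cmp $0xffff, r; jz`. */
static int all_ones_cmp(uint16_t mask)
{
    return mask == 0xffff;
}

/* Hypothetical helper: wraparound form, analogous to `inc r; jz`.
   Only an all-ones 16-bit value wraps to zero when incremented. */
static int all_ones_inc(uint16_t mask)
{
    return (uint16_t)(mask + 1) == 0;
}

/* Exhaustively checks that both forms agree for every 16-bit mask.
   Returns 1 if they agree everywhere, 0 otherwise. */
static int forms_agree(void)
{
    for (uint32_t m = 0; m <= 0xffff; m++) {
        if (all_ones_cmp((uint16_t)m) != all_ones_inc((uint16_t)m))
            return 0;
    }
    return 1;
}
```

The increment form destroys the mask value, so it only fits when the mask is not needed after the branch. The `vpcmpeqq -> vpcmpeqb` change fits the same pattern: when the only question is whether every byte of the two registers matched, a byte-wise and a quadword-wise compare yield an all-ones byte mask in exactly the same cases.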
Diffstat (limited to 'sysdeps/x86_64/dl-trampoline.S')
-rw-r--r-- | sysdeps/x86_64/dl-trampoline.S | 4 |
1 files changed, 4 insertions, 0 deletions
diff --git a/sysdeps/x86_64/dl-trampoline.S b/sysdeps/x86_64/dl-trampoline.S
index f669805ac5..580d2b6499 100644
--- a/sysdeps/x86_64/dl-trampoline.S
+++ b/sysdeps/x86_64/dl-trampoline.S
@@ -57,22 +57,26 @@
 #define VMOVA vmovdqa64
 #define VEC(i) zmm##i
 #define _dl_runtime_profile _dl_runtime_profile_avx512
+# define SECTION(p) p##.evex512
 #include "dl-trampoline.h"
 #undef _dl_runtime_profile
 #undef VEC
 #undef VMOVA
 #undef VEC_SIZE
+#undef SECTION
 
 #if MINIMUM_X86_ISA_LEVEL <= AVX_X86_ISA_LEVEL
 # define VEC_SIZE 32
 # define VMOVA vmovdqa
 # define VEC(i) ymm##i
+# define SECTION(p) p##.avx
 # define _dl_runtime_profile _dl_runtime_profile_avx
 # include "dl-trampoline.h"
 # undef _dl_runtime_profile
 # undef VEC
 # undef VMOVA
 # undef VEC_SIZE
+# undef SECTION
 #endif
 
 #if MINIMUM_X86_ISA_LEVEL < AVX_X86_ISA_LEVEL