author | Noah Goldstein <goldstein.w.n@gmail.com> | 2022-11-08 17:38:40 -0800 |
---|---|---|
committer | Noah Goldstein <goldstein.w.n@gmail.com> | 2022-11-08 19:22:33 -0800 |
commit | 64b8b6516b3cba19dba4c8f4f9b97daa0556fd98 (patch) | |
tree | 93e9e1c8bd54c3a10eb9c00db604bd74696adb10 /sysdeps/x86_64/multiarch/Makefile | |
parent | 642933158e7cf072d873231b1a9bb03291f2b989 (diff) | |
x86: Add evex optimized functions for the wchar_t strcpy family
Implemented:
    wcscat-evex  (+ 905 bytes)
    wcscpy-evex  (+ 674 bytes)
    wcpcpy-evex  (+ 709 bytes)
    wcsncpy-evex (+1358 bytes)
    wcpncpy-evex (+1467 bytes)
    wcsncat-evex (+1213 bytes)

Performance Changes:
    Times are from N = 10 runs of the benchmark suite and are reported
    as the geometric mean of all ratios of New Implementation / Best Old
    Implementation. The Best Old Implementation is the existing
    implementation for the highest ISA level.

    wcscat-evex  -> 0.991
    wcscpy-evex  -> 0.587
    wcpcpy-evex  -> 0.695
    wcsncpy-evex -> 0.719
    wcpncpy-evex -> 0.694
    wcsncat-evex -> 0.979

Code Size Changes:
    This change increases the size of libc.so by ~6.3kb. For reference,
    the patch optimizing the normal strcpy family functions decreases
    libc.so by ~5.7kb.

Full check passes on x86-64 and build succeeds for all ISA levels w/ and
w/o multiarch.
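The performance figures above are geometric means of time ratios, so a value below 1.0 means the new implementation took less time than the baseline. The following is a minimal sketch of that aggregation, not the benchmark suite's actual tooling. The function name `geometric_mean` and the `example_ratios` values are hypothetical; only the per-function byte deltas are taken from the commit message.

```python
import math


def geometric_mean(ratios):
    """Geometric mean computed as exp(mean(log(r))); every ratio must be positive."""
    ratios = list(ratios)
    if not ratios or any(r <= 0 for r in ratios):
        raise ValueError("need at least one ratio, all strictly positive")
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical per-benchmark ratios (new time / best old time) for one function.
# These numbers are illustrative only and do not come from the commit.
example_ratios = [0.5, 0.8, 1.25, 0.64]
score = geometric_mean(example_ratios)

# Per-function size increases in bytes, copied from the commit message.
size_deltas = {
    "wcscat-evex": 905,
    "wcscpy-evex": 674,
    "wcpcpy-evex": 709,
    "wcsncpy-evex": 1358,
    "wcpncpy-evex": 1467,
    "wcsncat-evex": 1213,
}
total_bytes = sum(size_deltas.values())

print(f"geometric mean of example ratios: {score:.3f}")
print(f"total size increase: {total_bytes} bytes")
```

The geometric mean is the usual choice for averaging ratios because a 2x slowdown and a 2x speedup cancel to exactly 1.0, which an arithmetic mean would not give. The summed byte deltas are consistent with the ~6.3kb figure quoted under Code Size Changes.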
Diffstat (limited to 'sysdeps/x86_64/multiarch/Makefile')
-rw-r--r-- | sysdeps/x86_64/multiarch/Makefile | 14 |
1 files changed, 13 insertions, 1 deletions
diff --git a/sysdeps/x86_64/multiarch/Makefile b/sysdeps/x86_64/multiarch/Makefile
index 066bfa48d9..d6e01940c3 100644
--- a/sysdeps/x86_64/multiarch/Makefile
+++ b/sysdeps/x86_64/multiarch/Makefile
@@ -131,6 +131,12 @@ endif
 
 ifeq ($(subdir),wcsmbs)
 sysdep_routines += \
+  wcpcpy-evex \
+  wcpcpy-generic \
+  wcpncpy-evex \
+  wcpncpy-generic \
+  wcscat-evex \
+  wcscat-generic \
   wcschr-avx2 \
   wcschr-avx2-rtm \
   wcschr-evex \
@@ -140,6 +146,8 @@ sysdep_routines += \
   wcscmp-avx2-rtm \
   wcscmp-evex \
   wcscmp-sse2 \
+  wcscpy-evex \
+  wcscpy-generic \
   wcscpy-ssse3 \
   wcslen-avx2 \
   wcslen-avx2-rtm \
@@ -147,9 +155,13 @@ sysdep_routines += \
   wcslen-evex512 \
   wcslen-sse2 \
   wcslen-sse4_1 \
+  wcsncat-evex \
+  wcsncat-generic \
   wcsncmp-avx2 \
   wcsncmp-avx2-rtm \
   wcsncmp-evex \
+  wcsncpy-evex \
+  wcsncpy-generic \
   wcsnlen-avx2 \
   wcsnlen-avx2-rtm \
   wcsnlen-evex \
@@ -163,8 +175,8 @@ sysdep_routines += \
   wmemchr-avx2 \
   wmemchr-avx2-rtm \
   wmemchr-evex \
-  wmemchr-evex512 \
   wmemchr-evex-rtm \
+  wmemchr-evex512 \
   wmemchr-sse2 \
   wmemcmp-avx2-movbe \
   wmemcmp-avx2-movbe-rtm \
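The last hunk adds no routine; it only swaps `wmemchr-evex512` and `wmemchr-evex-rtm`. The commit message does not say why. One plausible reading, offered here as an assumption, is that the swap restores bytewise sort order in `sysdep_routines`. A small sketch that checks this for the entries visible in that hunk (the helper `is_sorted` is hypothetical and not part of the glibc build):

```python
# Entries from the last hunk in their pre-patch order (context and '-' lines).
pre_patch = [
    "wmemchr-avx2",
    "wmemchr-avx2-rtm",
    "wmemchr-evex",
    "wmemchr-evex512",
    "wmemchr-evex-rtm",
    "wmemchr-sse2",
    "wmemcmp-avx2-movbe",
    "wmemcmp-avx2-movbe-rtm",
]

# The same entries in their post-patch order (context and '+' lines).
post_patch = [
    "wmemchr-avx2",
    "wmemchr-avx2-rtm",
    "wmemchr-evex",
    "wmemchr-evex-rtm",
    "wmemchr-evex512",
    "wmemchr-sse2",
    "wmemcmp-avx2-movbe",
    "wmemcmp-avx2-movbe-rtm",
]


def is_sorted(names):
    """True if each name compares <= its successor by code point (ASCII here)."""
    return all(a <= b for a, b in zip(names, names[1:]))


print("pre-patch sorted: ", is_sorted(pre_patch))
print("post-patch sorted:", is_sorted(post_patch))
```

In ASCII, `-` (0x2D) sorts before `5` (0x35), which is why `wmemchr-evex-rtm` precedes `wmemchr-evex512` under a plain bytewise comparison.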