author | H.J. Lu <hjl.tools@gmail.com> | 2017-06-09 05:45:43 -0700 |
---|---|---|
committer | H.J. Lu <hjl.tools@gmail.com> | 2017-06-09 05:45:52 -0700 |
commit | d2538b91568af2a63c9d8649ce11756d4dfbdac3 (patch) | |
tree | 22cc8602e6ab159f296651224be8a6c3460f2581 /ChangeLog | |
parent | 5ac7aa1d7cce8580f8225c33c819991abca102b9 (diff) | |
x86-64: Optimize strrchr/wcsrchr with AVX2
Optimize strrchr/wcsrchr with AVX2 to check 32 bytes with vector instructions. It is as fast as SSE2 version for small data sizes and up to 1X faster for large data sizes on Haswell. Select AVX2 version on AVX2 machines where vzeroupper is preferred and AVX unaligned load is fast.

	* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
	strrchr-sse2, strrchr-avx2, wcsrchr-sse2 and wcsrchr-avx2.
	* sysdeps/x86_64/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Add tests for __strrchr_avx2,
	__strrchr_sse2, __wcsrchr_avx2 and __wcsrchr_sse2.
	* sysdeps/x86_64/multiarch/strrchr-avx2.S: New file.
	* sysdeps/x86_64/multiarch/strrchr-sse2.S: Likewise.
	* sysdeps/x86_64/multiarch/strrchr.c: Likewise.
	* sysdeps/x86_64/multiarch/wcsrchr-avx2.S: Likewise.
	* sysdeps/x86_64/multiarch/wcsrchr-sse2.S: Likewise.
	* sysdeps/x86_64/multiarch/wcsrchr.c: Likewise.
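For readers unfamiliar with the technique, here is a rough C sketch of the idea in the commit message: scan 32 bytes per iteration with AVX2 compares, and pick the AVX2 routine only when the CPU supports it. It is not the code added by this commit, which is hand-written assembly selected through an ifunc resolver. Every `sketch_*` name is hypothetical, and the runtime check here tests only for AVX2, whereas the commit also conditions on vzeroupper preference and fast unaligned AVX loads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef char *(*strrchr_fn)(const char *, int);

/* Portable fallback: remember the last match until the terminator. */
static char *sketch_strrchr_scalar(const char *s, int c)
{
    const char *last = NULL;
    for (;; s++) {
        if (*s == (char)c)
            last = s;
        if (*s == '\0')
            return (char *)last;
    }
}

#if defined(__x86_64__) && defined(__GNUC__)
#include <immintrin.h>

/* Hypothetical AVX2 variant (assumption: GCC or Clang on x86-64). */
__attribute__((target("avx2")))
static char *sketch_strrchr_avx2(const char *s, int c)
{
    const __m256i needle = _mm256_set1_epi8((char)c);
    const __m256i zero = _mm256_setzero_si256();
    const char *last = NULL;

    /* Scalar prologue up to a 32-byte boundary.  Aligned 32-byte loads
       never straddle a page, so reading a few bytes past the terminator
       cannot fault (memory checkers may still flag it).  */
    while (((uintptr_t)s & 31) != 0) {
        if (*s == (char)c)
            last = s;
        if (*s == '\0')
            return (char *)last;
        s++;
    }

    for (;;) {
        __m256i chunk = _mm256_load_si256((const __m256i *)s);
        /* One bit per byte: positions equal to c, positions equal to NUL. */
        unsigned match = (unsigned)_mm256_movemask_epi8(
            _mm256_cmpeq_epi8(chunk, needle));
        unsigned nul = (unsigned)_mm256_movemask_epi8(
            _mm256_cmpeq_epi8(chunk, zero));

        if (nul != 0) {
            /* Keep only matches at or before the first NUL byte. */
            match &= nul ^ (nul - 1);
            if (match != 0)
                last = s + (31 - __builtin_clz(match));
            return (char *)last;
        }
        if (match != 0)
            last = s + (31 - __builtin_clz(match));
        s += 32;
    }
}
#endif

/* Runtime selection, standing in for the ifunc resolver. */
static strrchr_fn sketch_select_strrchr(void)
{
#if defined(__x86_64__) && defined(__GNUC__)
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        return sketch_strrchr_avx2;
#endif
    return sketch_strrchr_scalar;
}

char *sketch_strrchr(const char *s, int c)
{
    static strrchr_fn impl;   /* chosen once, on first call */
    assert(s != NULL);
    if (impl == NULL)
        impl = sketch_select_strrchr();
    return impl(s, c);
}
```

Calling `sketch_strrchr("hello world", 'o')` returns a pointer to the second `o`. A real ifunc resolver runs once at relocation time rather than on the first call, so the lazy function pointer above only approximates that behaviour.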
Diffstat (limited to 'ChangeLog')
-rw-r--r-- | ChangeLog | 14 |
1 files changed, 14 insertions, 0 deletions
diff --git a/ChangeLog b/ChangeLog
index 8fea821f91..d45f71d5f8 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,6 +1,20 @@
 2017-06-09  H.J. Lu  <hongjiu.lu@intel.com>
 
 	* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
+	strrchr-sse2, strrchr-avx2, wcsrchr-sse2 and wcsrchr-avx2.
+	* sysdeps/x86_64/multiarch/ifunc-impl-list.c
+	(__libc_ifunc_impl_list): Add tests for __strrchr_avx2,
+	__strrchr_sse2, __wcsrchr_avx2 and __wcsrchr_sse2.
+	* sysdeps/x86_64/multiarch/strrchr-avx2.S: New file.
+	* sysdeps/x86_64/multiarch/strrchr-sse2.S: Likewise.
+	* sysdeps/x86_64/multiarch/strrchr.c: Likewise.
+	* sysdeps/x86_64/multiarch/wcsrchr-avx2.S: Likewise.
+	* sysdeps/x86_64/multiarch/wcsrchr-sse2.S: Likewise.
+	* sysdeps/x86_64/multiarch/wcsrchr.c: Likewise.
+
+2017-06-09  H.J. Lu  <hongjiu.lu@intel.com>
+
+	* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
 	memrchr-sse2 and memrchr-avx2.
 	* sysdeps/x86_64/multiarch/ifunc-impl-list.c
 	(__libc_ifunc_impl_list): Add tests for __memrchr_avx2 and