author     H.J. Lu <hjl.tools@gmail.com>  2017-05-26 12:21:55 -0700
committer  H.J. Lu <hjl.tools@gmail.com>  2017-06-05 15:09:59 -0700
commit     9593e235c2401156e9f50ca4b88c4f6b194d61f5
tree       38974c300053273469ce26c8f9344eed83d8fb7a
parent     ce40306fcc3edb2baade47e8050c975c5ecba980
x86-64: Optimize strrchr/wcsrchr with AVX2 [branch: hjl/avx2/c]

Optimize strrchr/wcsrchr with AVX2 to check 32 bytes at a time with
vector instructions.  It is as fast as the SSE2 version for small data
sizes and up to 1X faster for large data sizes on Haswell.  Select the
AVX2 version on AVX2 machines where vzeroupper is preferred and AVX
unaligned load is fast.
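
The idea of checking 32 bytes per step can be illustrated with a scalar model in portable C.  This is only a sketch, not the assembly added by this commit: the names BLOCK, last_match_in_block and strrchr_blocks are hypothetical, and the sketch assumes the caller passes a buffer whose capacity is a multiple of 32 bytes and which contains a terminating NUL, so that whole blocks can be read safely.  Each block yields one bitmask of character matches and one of NUL bytes, which is roughly what a vector compare followed by a move-mask would produce.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK 32  /* Bytes per step, the width of one 256-bit vector.  */

/* Scalar model of one vector step (hypothetical helper).  Builds a
   match mask and a NUL mask for BLOCK bytes and returns the last match
   located at or before the first NUL, or NULL if there is none.  */
static const char *
last_match_in_block (const char *block, char c, int *saw_nul)
{
  uint32_t match = 0, nul = 0;
  for (int i = 0; i < BLOCK; i++)
    {
      if (block[i] == c)
        match |= UINT32_C (1) << i;
      if (block[i] == '\0')
        nul |= UINT32_C (1) << i;
    }
  *saw_nul = nul != 0;
  if (nul != 0)
    {
      /* Isolate the lowest NUL bit and drop matches after it.  The NUL
         bit itself is kept so that searching for '\0' works.  */
      uint32_t lowest_nul = nul & (~nul + 1);
      match &= lowest_nul | (lowest_nul - 1);
    }
  if (match == 0)
    return NULL;
  int hi = BLOCK - 1;
  while (((match >> hi) & 1u) == 0)
    hi--;                       /* Highest set bit is the last match.  */
  return block + hi;
}

/* Assumption: cap is a multiple of BLOCK and buf[0..cap) holds a NUL.  */
static const char *
strrchr_blocks (const char *buf, size_t cap, char c)
{
  const char *last = NULL;
  for (size_t off = 0; off + BLOCK <= cap; off += BLOCK)
    {
      int saw_nul;
      const char *hit = last_match_in_block (buf + off, c, &saw_nul);
      if (hit != NULL)
        last = hit;             /* Remember the most recent match.  */
      if (saw_nul)
        break;                  /* End of string reached.  */
    }
  return last;
}
```

A real implementation has to avoid reading across a page boundary past the end of the string; the padded-buffer assumption sidesteps that concern here.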

	* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
	strrchr-sse2, strrchr-avx2, wcsrchr-sse2 and wcsrchr-avx2.
	* sysdeps/x86_64/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Add tests for __strrchr_avx2,
	__strrchr_sse2, __wcsrchr_avx2 and __wcsrchr_sse2.
	* sysdeps/x86_64/multiarch/strrchr-avx2.S: New file.
	* sysdeps/x86_64/multiarch/strrchr-sse2.S: Likewise.
	* sysdeps/x86_64/multiarch/strrchr.c: Likewise.
	* sysdeps/x86_64/multiarch/wcsrchr-avx2.S: Likewise.
	* sysdeps/x86_64/multiarch/wcsrchr-sse2.S: Likewise.
	* sysdeps/x86_64/multiarch/wcsrchr.c: Likewise.
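
The selection rule stated in the commit message (use the AVX2 version only when AVX2 is available, vzeroupper is preferred and AVX unaligned load is fast) can be sketched as a plain function-pointer dispatcher.  This is an illustration under assumptions, not the contents of strrchr.c: struct cpu_traits, its fields, select_strrchr and the two *_model functions are hypothetical stand-ins, and both models simply forward to the C library strrchr.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical summary of the CPU properties named in the commit
   message; glibc keeps this information in its own internal data.  */
struct cpu_traits
{
  bool has_avx2;
  bool prefer_vzeroupper;
  bool fast_avx_unaligned_load;
};

typedef char *(*strrchr_fn) (const char *, int);

/* Stand-ins for the SSE2 and AVX2 variants (assumption: the real ones
   are assembly routines; these only forward to strrchr).  */
static char *
strrchr_sse2_model (const char *s, int c)
{
  return strrchr (s, c);
}

static char *
strrchr_avx2_model (const char *s, int c)
{
  return strrchr (s, c);
}

/* Pick the AVX2 variant only when all three conditions hold;
   otherwise fall back to the SSE2 variant.  */
static strrchr_fn
select_strrchr (const struct cpu_traits *cpu)
{
  if (cpu->has_avx2
      && cpu->prefer_vzeroupper
      && cpu->fast_avx_unaligned_load)
    return strrchr_avx2_model;
  return strrchr_sse2_model;
}
```

With an IFUNC, a choice of this kind is made once when the symbol is resolved, so callers do not pay for the feature checks on every call.
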
Diffstat (limited to 'math/w_acosl_compat.c')
0 files changed, 0 insertions, 0 deletions