author | H.J. Lu <hjl.tools@gmail.com> | 2016-03-31 10:05:51 -0700 |
---|---|---|
committer | H.J. Lu <hjl.tools@gmail.com> | 2016-03-31 10:06:07 -0700 |
commit | 830566307f038387ca0af3fd327706a8d1a2f595 (patch) | |
tree | 22d89ebf426a8799ec13913fd6591a53d4663973 /ChangeLog | |
parent | 88b57b8ed41d5ecf2e1bdfc19556f9246a665ebb (diff) | |
Add x86-64 memset with unaligned store and rep stosb
Implement x86-64 memset with unaligned store and rep stosb. Support 16-byte, 32-byte and 64-byte vector register sizes. A single file provides 2 implementations of memset, one with rep stosb and the other without rep stosb. They share the same code when the size is between 2 times the vector register size and REP_STOSB_THRESHOLD, which defaults to 2KB.

Key features:

1. Use overlapping stores to avoid branches.
2. For size <= 4 times the vector register size, fully unroll the loop.
3. For size > 4 times the vector register size, store 4 times the vector register size at a time.

[BZ #19881]
* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add memset-sse2-unaligned-erms, memset-avx2-unaligned-erms and memset-avx512-unaligned-erms.
* sysdeps/x86_64/multiarch/ifunc-impl-list.c (__libc_ifunc_impl_list): Test __memset_chk_sse2_unaligned, __memset_chk_sse2_unaligned_erms, __memset_chk_avx2_unaligned, __memset_chk_avx2_unaligned_erms, __memset_chk_avx512_unaligned, __memset_chk_avx512_unaligned_erms, __memset_sse2_unaligned, __memset_sse2_unaligned_erms, __memset_erms, __memset_avx2_unaligned, __memset_avx2_unaligned_erms, __memset_avx512_unaligned_erms and __memset_avx512_unaligned.
* sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms.S: New file.
* sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S: Likewise.
* sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S: Likewise.
* sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: Likewise.
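The three key features named in the commit message can be illustrated with a rough, portable C sketch. This is not the code added by the commit, which is written in x86-64 assembly. The names `sketch_memset`, `store_vec` and `vec_t` are hypothetical, `VEC_SIZE` is fixed at 16 to model an SSE2 register, and the rep stosb path above REP_STOSB_THRESHOLD is left out entirely.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: model a 16-byte (SSE2-sized) vector register. */
#define VEC_SIZE 16

/* Hypothetical stand-in for a vector register holding the fill byte. */
typedef struct { unsigned char b[VEC_SIZE]; } vec_t;

/* Models one unaligned vector store of VEC_SIZE bytes. */
static void store_vec(unsigned char *p, const vec_t *v)
{
    memcpy(p, v->b, VEC_SIZE);
}

/* Illustrative only: not the glibc implementation. */
void *sketch_memset(void *dst, int c, size_t n)
{
    unsigned char *d = dst;
    vec_t v;

    for (size_t i = 0; i < VEC_SIZE; i++)
        v.b[i] = (unsigned char)c;

    if (n < VEC_SIZE) {
        /* Simplification: plain byte loop for sizes below one vector. */
        for (size_t i = 0; i < n; i++)
            d[i] = (unsigned char)c;
        return dst;
    }

    if (n <= 2 * VEC_SIZE) {
        /* Feature 1: the two stores overlap when n < 2 * VEC_SIZE,
           so no branch on the exact length is needed. */
        store_vec(d, &v);
        store_vec(d + n - VEC_SIZE, &v);
        return dst;
    }

    if (n <= 4 * VEC_SIZE) {
        /* Feature 2: fully unrolled, two stores from each end. */
        store_vec(d, &v);
        store_vec(d + VEC_SIZE, &v);
        store_vec(d + n - 2 * VEC_SIZE, &v);
        store_vec(d + n - VEC_SIZE, &v);
        return dst;
    }

    /* Feature 3: store 4 * VEC_SIZE bytes per iteration. */
    for (size_t i = 0; i + 4 * VEC_SIZE <= n; i += 4 * VEC_SIZE) {
        store_vec(d + i, &v);
        store_vec(d + i + VEC_SIZE, &v);
        store_vec(d + i + 2 * VEC_SIZE, &v);
        store_vec(d + i + 3 * VEC_SIZE, &v);
    }

    /* The last 4 vectors overlap whatever the loop already wrote. */
    store_vec(d + n - 4 * VEC_SIZE, &v);
    store_vec(d + n - 3 * VEC_SIZE, &v);
    store_vec(d + n - 2 * VEC_SIZE, &v);
    store_vec(d + n - VEC_SIZE, &v);
    return dst;
}
```

Calling `sketch_memset(buf, 0, n)` behaves like `memset(buf, 0, n)` for any `n`. Writing some bytes twice is harmless because every store carries the same value, and that is what lets the tail be handled without a length-dependent branch.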
Diffstat (limited to 'ChangeLog')
-rw-r--r-- | ChangeLog | 23 |
1 files changed, 23 insertions, 0 deletions
diff --git a/ChangeLog b/ChangeLog
index 100764f66a..1a87d43e11 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,28 @@
 2016-03-31 H.J. Lu <hongjiu.lu@intel.com>

+ [BZ #19881]
+ * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
+ memset-sse2-unaligned-erms, memset-avx2-unaligned-erms and
+ memset-avx512-unaligned-erms.
+ * sysdeps/x86_64/multiarch/ifunc-impl-list.c
+ (__libc_ifunc_impl_list): Test __memset_chk_sse2_unaligned,
+ __memset_chk_sse2_unaligned_erms, __memset_chk_avx2_unaligned,
+ __memset_chk_avx2_unaligned_erms, __memset_chk_avx512_unaligned,
+ __memset_chk_avx512_unaligned_erms, __memset_sse2_unaligned,
+ __memset_sse2_unaligned_erms, __memset_erms,
+ __memset_avx2_unaligned, __memset_avx2_unaligned_erms,
+ __memset_avx512_unaligned_erms and __memset_avx512_unaligned.
+ * sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms.S: New
+ file.
+ * sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S:
+ Likewise.
+ * sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S:
+ Likewise.
+ * sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:
+ Likewise.
+
+2016-03-31 H.J. Lu <hongjiu.lu@intel.com>
+
  [BZ #19776]
  * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
  memmove-sse2-unaligned-erms, memmove-avx-unaligned-erms and