author     H.J. Lu <hjl.tools@gmail.com>   2016-03-31 10:05:51 -0700
committer  H.J. Lu <hjl.tools@gmail.com>   2016-03-31 10:06:07 -0700
commit     830566307f038387ca0af3fd327706a8d1a2f595 (patch)
tree       22d89ebf426a8799ec13913fd6591a53d4663973 /ChangeLog
parent     88b57b8ed41d5ecf2e1bdfc19556f9246a665ebb (diff)
Add x86-64 memset with unaligned store and rep stosb
Implement x86-64 memset with unaligned store and rep stosb.  Support
16-byte, 32-byte and 64-byte vector register sizes.  A single file
provides two implementations of memset, one with rep stosb and the
other without it.  They share the same code when the size is between
2 times the vector register size and REP_STOSB_THRESHOLD, which
defaults to 2KB.
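
A minimal C sketch of that dispatch, under the assumption that
VEC_SIZE, REP_STOSB_THRESHOLD and the helper names below are
illustrative stand-ins for the assembly in
memset-vec-unaligned-erms.S, not the real implementation:

    #include <stddef.h>
    #include <string.h>

    #define VEC_SIZE            32    /* 16 for SSE2, 32 for AVX2, 64 for AVX-512.  */
    #define REP_STOSB_THRESHOLD 2048  /* 2KB default named in this commit.  */

    /* Stand-ins for the assembly paths; here they simply defer to memset.  */
    static void memset_small (void *dst, int c, size_t n)     { memset (dst, c, n); }
    static void memset_vec_loop (void *dst, int c, size_t n)  { memset (dst, c, n); }
    static void memset_rep_stosb (void *dst, int c, size_t n) { memset (dst, c, n); }

    /* The _erms and plain variants share everything up to
       REP_STOSB_THRESHOLD; only the large-size branch differs.  */
    void *
    memset_unaligned_erms_sketch (void *dst, int c, size_t n)
    {
      if (n <= 2 * VEC_SIZE)
        memset_small (dst, c, n);        /* overlapping stores, shared */
      else if (n <= REP_STOSB_THRESHOLD)
        memset_vec_loop (dst, c, n);     /* same code as the non-erms variant */
      else
        memset_rep_stosb (dst, c, n);    /* non-erms variant keeps vector stores here */
      return dst;
    }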

Key features:

1. Use overlapping stores to avoid branches (see the sketch after
this list).
2. For size <= 4 times the vector register size, fully unroll the loop.
3. For size > 4 times the vector register size, store 4 times the
vector register size at a time.
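
As a rough illustration of feature 1, a hedged C version of the
overlapping-store trick with 16-byte SSE2 stores (the function name
and the intrinsics framing are mine, not the commit's assembly):

    #include <emmintrin.h>   /* SSE2 intrinsics */
    #include <stddef.h>

    /* Fill [dst, dst + n) for 16 <= n <= 32 with two possibly
       overlapping unaligned 16-byte stores: one at the start and one
       ending exactly at dst + n.  No branch on the tail length is needed.  */
    static void
    memset_16_to_32 (unsigned char *dst, int c, size_t n)
    {
      __m128i v = _mm_set1_epi8 ((char) c);
      _mm_storeu_si128 ((__m128i *) dst, v);             /* head store */
      _mm_storeu_si128 ((__m128i *) (dst + n - 16), v);  /* tail store, may overlap head */
    }

The same idea scales to 32- and 64-byte vectors, and the main loop of
feature 3 relies on it as well: store 4 vectors per iteration and let
one overlapping block of stores cover the remainder.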

	[BZ #19881]
	* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
	memset-sse2-unaligned-erms, memset-avx2-unaligned-erms and
	memset-avx512-unaligned-erms.
	* sysdeps/x86_64/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Test __memset_chk_sse2_unaligned,
	__memset_chk_sse2_unaligned_erms, __memset_chk_avx2_unaligned,
	__memset_chk_avx2_unaligned_erms, __memset_chk_avx512_unaligned,
	__memset_chk_avx512_unaligned_erms, __memset_sse2_unaligned,
	__memset_sse2_unaligned_erms, __memset_erms,
	__memset_avx2_unaligned, __memset_avx2_unaligned_erms,
	__memset_avx512_unaligned_erms and __memset_avx512_unaligned.
	* sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms.S: New
	file.
	* sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S:
	Likewise.
	* sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S:
	Likewise.
	* sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:
	Likewise.
Diffstat (limited to 'ChangeLog')
-rw-r--r--  ChangeLog | 23
1 file changed, 23 insertions(+), 0 deletions(-)
diff --git a/ChangeLog b/ChangeLog
index 100764f66a..1a87d43e11 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,28 @@
 2016-03-31   H.J. Lu  <hongjiu.lu@intel.com>
 
+	[BZ #19881]
+	* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
+	memset-sse2-unaligned-erms, memset-avx2-unaligned-erms and
+	memset-avx512-unaligned-erms.
+	* sysdeps/x86_64/multiarch/ifunc-impl-list.c
+	(__libc_ifunc_impl_list): Test __memset_chk_sse2_unaligned,
+	__memset_chk_sse2_unaligned_erms, __memset_chk_avx2_unaligned,
+	__memset_chk_avx2_unaligned_erms, __memset_chk_avx512_unaligned,
+	__memset_chk_avx512_unaligned_erms, __memset_sse2_unaligned,
+	__memset_sse2_unaligned_erms, __memset_erms,
+	__memset_avx2_unaligned, __memset_avx2_unaligned_erms,
+	__memset_avx512_unaligned_erms and __memset_avx512_unaligned.
+	* sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms.S: New
+	file.
+	* sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S:
+	Likewise.
+	* sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S:
+	Likewise.
+	* sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:
+	Likewise.
+
+2016-03-31   H.J. Lu  <hongjiu.lu@intel.com>
+
 	[BZ #19776]
 	* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
 	memmove-sse2-unaligned-erms, memmove-avx-unaligned-erms and