author    Andrew Senkevich <andrew.senkevich@intel.com>  2014-12-29 14:39:46 +0300
committer H.J. Lu <hjl.tools@gmail.com>                  2014-12-30 07:19:38 -0800
commit    8b4416d83c79ba77b0669203741c712880a09ae4 (patch)
tree      c0701090d02b9e3c9ddcb840e2ad62084c498b4a /ChangeLog
parent    6d6d7fde04c8ef830205a9900bf101597a2f4b18 (diff)
i386: memcpy functions with SSE2 unaligned load/store
These new memcpy functions are the 32-bit version of the x86_64 SSE2 unaligned
memcpy.  The average memcpy performance benefit is 18% on Silvermont, and other
platforms improved by about 35%.  Benchmarked on Silvermont, Haswell, Ivy
Bridge, Sandy Bridge and Westmere; performance results are attached in

https://sourceware.org/ml/libc-alpha/2014-07/msg00157.html
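
The core of these routines is SSE2's unaligned 16-byte load/store (movdqu),
which avoids any alignment prologue on CPUs where unaligned accesses are
cheap.  A minimal C sketch of that inner loop, using intrinsics instead of
the hand-written assembly in memcpy-sse2-unaligned.S; the helper name and
the byte-wise tail below are purely illustrative:

#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>

/* Illustrative only: copy LEN bytes with 16-byte SSE2 loads/stores that
   need no source or destination alignment.  The real assembly handles
   small sizes, overlap (for memmove) and larger unrolled blocks.  */
static void *
copy_sse2_unaligned (void *dst, const void *src, size_t len)
{
  unsigned char *d = dst;
  const unsigned char *s = src;

  while (len >= 16)
    {
      __m128i chunk = _mm_loadu_si128 ((const __m128i *) s); /* unaligned load */
      _mm_storeu_si128 ((__m128i *) d, chunk);               /* unaligned store */
      s += 16;
      d += 16;
      len -= 16;
    }

  /* Byte tail for the remaining < 16 bytes.  */
  while (len-- > 0)
    *d++ = *s++;

  return dst;
}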

	* sysdeps/i386/i686/multiarch/bcopy-sse2-unaligned.S: New file.
	* sysdeps/i386/i686/multiarch/memcpy-sse2-unaligned.S: Likewise.
	* sysdeps/i386/i686/multiarch/memmove-sse2-unaligned.S: Likewise.
	* sysdeps/i386/i686/multiarch/mempcpy-sse2-unaligned.S: Likewise.
	* sysdeps/i386/i686/multiarch/bcopy.S: Select the sse2_unaligned
	version if bit_Fast_Unaligned_Load is set.
	* sysdeps/i386/i686/multiarch/memcpy.S: Likewise.
	* sysdeps/i386/i686/multiarch/memcpy_chk.S: Likewise.
	* sysdeps/i386/i686/multiarch/memmove.S: Likewise.
	* sysdeps/i386/i686/multiarch/memmove_chk.S: Likewise.
	* sysdeps/i386/i686/multiarch/mempcpy.S: Likewise.
	* sysdeps/i386/i686/multiarch/mempcpy_chk.S: Likewise.
	* sysdeps/i386/i686/multiarch/Makefile (sysdep_routines): Add
	bcopy-sse2-unaligned, memcpy-sse2-unaligned,
	memmove-sse2-unaligned and mempcpy-sse2-unaligned.
	* sysdeps/i386/i686/multiarch/ifunc-impl-list.c (MAX_IFUNC): Set
	to 4.
	(__libc_ifunc_impl_list): Test __bcopy_sse2_unaligned,
	__memmove_chk_sse2_unaligned, __memmove_sse2_unaligned,
	__memcpy_chk_sse2_unaligned, __memcpy_sse2_unaligned,
	__mempcpy_chk_sse2_unaligned, and __mempcpy_sse2_unaligned.
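
For reference, the selection described in the bcopy.S/memcpy.S entries above
amounts to preferring the new variant whenever the CPU reports fast unaligned
loads.  A rough, self-contained C sketch of that dispatch order; the real
selectors are assembly IFUNC resolvers consulting __cpu_features, and the
stub functions and flag parameters here are placeholders for the real
implementations and feature tests:

#include <stddef.h>
#include <string.h>

/* Stand-ins for the assembly implementations (__memcpy_ia32,
   __memcpy_ssse3, __memcpy_sse2_unaligned in the multiarch .S files).  */
static void *memcpy_ia32 (void *d, const void *s, size_t n)           { return memcpy (d, s, n); }
static void *memcpy_ssse3 (void *d, const void *s, size_t n)          { return memcpy (d, s, n); }
static void *memcpy_sse2_unaligned (void *d, const void *s, size_t n) { return memcpy (d, s, n); }

typedef void *(*memcpy_fn) (void *, const void *, size_t);

/* Prefer the SSE2 unaligned variant when bit_Fast_Unaligned_Load is set,
   otherwise fall back to the previously selected implementations.  */
static memcpy_fn
select_memcpy (int has_sse2, int fast_unaligned_load, int has_ssse3)
{
  if (has_sse2 && fast_unaligned_load)
    return memcpy_sse2_unaligned;
  if (has_ssse3)
    return memcpy_ssse3;
  return memcpy_ia32;   /* generic i686 fallback */
}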
Diffstat (limited to 'ChangeLog')
-rw-r--r--  ChangeLog  25
1 file changed, 25 insertions, 0 deletions
diff --git a/ChangeLog b/ChangeLog
index d5aeddc400..31252196f9 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,28 @@
+2014-12-30  Andrew Senkevich  <andrew.senkevich@intel.com>
+	    H.J. Lu  <hongjiu.lu@intel.com>
+
+	* sysdeps/i386/i686/multiarch/bcopy-sse2-unaligned.S: New file.
+	* sysdeps/i386/i686/multiarch/memcpy-sse2-unaligned.S: Likewise.
+	* sysdeps/i386/i686/multiarch/memmove-sse2-unaligned.S: Likewise.
+	* sysdeps/i386/i686/multiarch/mempcpy-sse2-unaligned.S: Likewise.
+	* sysdeps/i386/i686/multiarch/bcopy.S: Select the sse2_unaligned
+	version if bit_Fast_Unaligned_Load is set.
+	* sysdeps/i386/i686/multiarch/memcpy.S: Likewise.
+	* sysdeps/i386/i686/multiarch/memcpy_chk.S: Likewise.
+	* sysdeps/i386/i686/multiarch/memmove.S: Likewise.
+	* sysdeps/i386/i686/multiarch/memmove_chk.S: Likewise.
+	* sysdeps/i386/i686/multiarch/mempcpy.S: Likewise.
+	* sysdeps/i386/i686/multiarch/mempcpy_chk.S: Likewise.
+	* sysdeps/i386/i686/multiarch/Makefile (sysdep_routines): Add
+	bcopy-sse2-unaligned, memcpy-sse2-unaligned,
+	memmove-sse2-unaligned and mempcpy-sse2-unaligned.
+	* sysdeps/i386/i686/multiarch/ifunc-impl-list.c (MAX_IFUNC): Set
+	to 4.
+	(__libc_ifunc_impl_list): Test __bcopy_sse2_unaligned,
+	__memmove_chk_sse2_unaligned, __memmove_sse2_unaligned,
+	__memcpy_chk_sse2_unaligned, __memcpy_sse2_unaligned,
+	__mempcpy_chk_sse2_unaligned, and __mempcpy_sse2_unaligned.
+
 2014-12-29  Chris Metcalf  <cmetcalf@ezchip.com>
 
 	* sysdeps/unix/sysv/linux/tst-setgetname.c (do_test): Use #ifndef