author Siddhesh Poyarekar <siddhesh@sourceware.org> 2018-05-11 00:08:01 +0530
committer Siddhesh Poyarekar <siddhesh@sourceware.org> 2018-05-11 00:08:02 +0530
commit 70c97f8493ab2a215c2543d78f212abb23f151ed
tree fc07055d4b4221040496e590104da4ea05964db7 /ChangeLog
parent 8f5b00d375dbd7f5e15e57b24fec3bd5a4b1e98d
aarch64,falkor: Ignore prefetcher hints for memmove tail
The tails of the copy loops are unable to train the falkor hardware
prefetcher because they load from a different base compared to the hot
loop.  In this case, avoid serializing the loads by loading the data
into different registers.  Also peel the last iteration of the loop
into the tail (with its loads using different registers as well), since
this gives better performance for medium sizes.
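
The idea can be sketched in C.  This is only an illustration under
stated assumptions, not the code in memmove_falkor.S: the names
chunk16, load16, store16 and copy_peeled are hypothetical, the buffers
are assumed not to overlap, the size is assumed to be at least 64
bytes, and a C compiler is free to allocate registers differently from
what the variable names suggest.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte unit standing in for one register pair.  */
typedef struct { uint64_t lo, hi; } chunk16;

static inline chunk16 load16(const unsigned char *p)
{
    chunk16 c;
    memcpy(&c, p, sizeof c);
    return c;
}

static inline void store16(unsigned char *p, chunk16 c)
{
    memcpy(p, &c, sizeof c);
}

/* Illustrative forward copy.  Assumes n >= 64 and that dst and src
   do not overlap.  */
void copy_peeled(unsigned char *dst, const unsigned char *src, size_t n)
{
    assert(n >= 64);

    const unsigned char *s = src;
    unsigned char *d = dst;
    size_t left = n;

    /* Hot loop: every load goes through the same temporary.  The
       condition is "> 64" rather than ">= 64", so the last full
       64-byte block is peeled off and left to the tail.  */
    while (left > 64) {
        for (size_t off = 0; off < 64; off += 16) {
            chunk16 t = load16(s + off);
            store16(d + off, t);
        }
        s += 64;
        d += 64;
        left -= 64;
    }

    /* Tail: the final 64 bytes are addressed from the end of the
       buffers, a different base than the hot loop.  Four independent
       temporaries are loaded before any store, so the loads do not
       depend on one another.  The tail may overlap bytes the loop
       already copied, which is harmless for non-overlapping buffers.  */
    const unsigned char *send = src + n;
    unsigned char *dend = dst + n;
    chunk16 a = load16(send - 64);
    chunk16 b = load16(send - 48);
    chunk16 c = load16(send - 32);
    chunk16 e = load16(send - 16);
    store16(dend - 64, a);
    store16(dend - 48, b);
    store16(dend - 32, c);
    store16(dend - 16, e);
}
```

In this sketch a 128-byte copy runs the loop once and lets the tail
move the second 64-byte block, which is the effect of peeling the last
iteration into the tail.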

This results in performance improvements of between 3% and 20% over
the current falkor implementation for sizes between 128 bytes and 1K
on the memmove-walk benchmark, thus mostly covering the regressions
seen against the generic memmove.

	* sysdeps/aarch64/multiarch/memmove_falkor.S
	(__memmove_falkor): Use multiple registers to move data in
	loop tail.
Diffstat (limited to 'ChangeLog')
-rw-r--r-- ChangeLog | 6
1 files changed, 6 insertions, 0 deletions
diff --git a/ChangeLog b/ChangeLog
index b2b66020e8..c3b2e03c3b 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2018-05-11  Siddhesh Poyarekar  <siddhesh@sourceware.org>
+
+	* sysdeps/aarch64/multiarch/memmove_falkor.S
+	(__memmove_falkor): Use multiple registers to move data in
+	loop tail.
+
 2018-05-10  Joseph Myers  <joseph@codesourcery.com>
 
 	* math/math-underflow.h: New file.