author     Noah Goldstein <goldstein.w.n@gmail.com>  2021-11-01 00:49:51 -0500
committer  Noah Goldstein <goldstein.w.n@gmail.com>  2021-11-06 16:18:03 -0500
commit     a6b7502ec0c2da89a7437f43171f160d713e39c6 (patch)
tree       d2ff01bb7c3ea8b1e1415542a50913a27bbb3707 /sysdeps/x86_64/multiarch/memmove-avx-unaligned-erms.S
parent     ac759b1fbf28a82d99afde9046f8b72c7cba5dae (diff)
x86: Optimize memmove-vec-unaligned-erms.S
No bug.

The optimizations are as follows:

1) Always align entry to 64 bytes. This makes behavior more
   predictable and makes other frontend optimizations easier.

2) Make the L(more_8x_vec) cases 4k aliasing aware. This can have
   significant benefits in the case that:
        0 < (dst - src) < [256, 512]

3) Align before `rep movsb`. For ERMS this is roughly a [0, 30%]
   improvement and for FSRM [-10%, 25%].
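For point 2, the idea behind making the copy 4k-aliasing aware can be sketched in C. This is an illustration only, not the glibc assembly: the helper name and the 512-byte cutoff are assumptions standing in for the tuned [256, 512] range quoted above.

    #include <stdint.h>

    /* A forward copy loop loads from src shortly after storing to dst at
       nearly the same offset within a 4 KiB page, which trips the CPU's
       4k-aliasing (memory disambiguation) penalty.  That only happens when
       (dst - src) modulo 4096 is small and non-zero, so in that case a
       backward copy is preferable.  */
    static inline int
    forward_copy_hits_4k_aliasing (const void *dst, const void *src)
    {
      uintptr_t page_off = ((uintptr_t) dst - (uintptr_t) src) & 4095;
      return page_off != 0 && page_off < 512;   /* illustrative cutoff */
    }

A memmove that detects this case, and that is free to copy in either direction because the buffers do not overlap, can simply take its backward 8x-vector loop instead of the forward one.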

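Point 3 can likewise be sketched in C. This is a hedged illustration of aligning the destination before `rep movsb`, not the actual routine: it assumes a non-overlapping forward copy with len >= 64, and memcpy stands in for the unaligned vector stores the assembly would use for the head.

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    void
    copy_with_aligned_movsb (char *dst, const char *src, size_t len)
    {
      /* Bytes needed to reach the next 64-byte boundary of dst.  */
      size_t head = (-(uintptr_t) dst) & 63;

      /* Copying a full 64 bytes unconditionally covers any head of
         0..63 bytes; the overlap is simply rewritten with the same
         data by movsb below.  */
      memcpy (dst, src, 64);
      dst += head;
      src += head;
      len -= head;

      /* ERMS/FSRM `rep movsb` now runs with a 64-byte-aligned
         destination.  */
      __asm__ volatile ("rep movsb"
                        : "+D" (dst), "+S" (src), "+c" (len)
                        :
                        : "memory");
    }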
In addition to these primary changes there is general cleanup
throughout to optimize the aligning routines and control flow logic.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Diffstat (limited to 'sysdeps/x86_64/multiarch/memmove-avx-unaligned-erms.S')
-rw-r--r--  sysdeps/x86_64/multiarch/memmove-avx-unaligned-erms.S | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sysdeps/x86_64/multiarch/memmove-avx-unaligned-erms.S b/sysdeps/x86_64/multiarch/memmove-avx-unaligned-erms.S
index e195e93f15..975ae6c051 100644
--- a/sysdeps/x86_64/multiarch/memmove-avx-unaligned-erms.S
+++ b/sysdeps/x86_64/multiarch/memmove-avx-unaligned-erms.S
@@ -4,7 +4,7 @@
 # define VMOVNT		vmovntdq
 # define VMOVU		vmovdqu
 # define VMOVA		vmovdqa
-
+# define MOV_SIZE	4
 # define SECTION(p)		p##.avx
 # define MEMMOVE_SYMBOL(p,s)	p##_avx_##s
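(Note: MOV_SIZE here is presumably the encoded length, in bytes, of this variant's VMOVU instruction — a VEX-encoded vmovdqu with a simple memory operand is 4 bytes — which the shared memmove-vec-unaligned-erms.S can take into account when choosing padding so that entry points stay 64-byte aligned, per point 1 of the commit message.)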