author: Noah Goldstein <goldstein.w.n@gmail.com> 2022-04-14 11:47:40 -0500
committer: Noah Goldstein <goldstein.w.n@gmail.com> 2022-04-14 23:21:42 -0500
commit: 26b2478322db94edc9e0e8f577b2f71d291e5acb
tree: 6087539637c25880c3f6e32d0ab19d0d23c9e0df
parent: d85916e30a902ff4bce5b0b44ff245ef58b79236
x86: Reduce code size of mem{move|pcpy|cpy}-ssse3
The goal is to remove most SSSE3 functions, as SSE4, AVX2, and EVEX are
generally preferable. memcpy/memmove is one exception where avoiding
unaligned loads with `palignr` is important for some targets.
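
To illustrate why `palignr` helps here, the following is a minimal portable sketch in C, not the glibc assembly. It models the instruction as "concatenate two 16-byte blocks, shift right by a byte count, keep the low 16 bytes" and uses that to rebuild a misaligned 16-byte read from two aligned block reads. The names `alignr_model` and `load_via_aligned_blocks` are hypothetical, and the model only covers shift counts 0 through 16.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Portable model of `palignr` on 16-byte vectors (hypothetical helper):
   form the 32-byte value hi:lo, shift it right by `shift` bytes, and
   keep the low 16 bytes.  Only shift counts 0..16 are modeled.  */
static void
alignr_model (uint8_t out[16], const uint8_t hi[16],
              const uint8_t lo[16], unsigned shift)
{
  uint8_t tmp[32];

  assert (shift <= 16);
  memcpy (tmp, lo, 16);        /* Low half comes from the first block.  */
  memcpy (tmp + 16, hi, 16);   /* High half comes from the next block.  */
  memcpy (out, tmp + shift, 16);
}

/* Produce the 16 bytes at buf + offset by reading only the two
   16-byte blocks that straddle it (hypothetical helper).  The caller
   must guarantee that 32 bytes are readable starting at the
   rounded-down block address.  */
static void
load_via_aligned_blocks (uint8_t out[16], const uint8_t *buf, size_t offset)
{
  const uint8_t *block = buf + (offset & ~(size_t) 15);

  alignr_model (out, block + 16, block, (unsigned) (offset & 15));
}
```

In a real SSSE3 routine the two block reads would be aligned vector loads and the combine step a single `palignr` with an immediate shift count, which is why such code tends to need one code path per misalignment.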

This commit replaces memmove-ssse3 with a better optimized and lower
code footprint version. It also aliases memcpy to memmove.
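
Aliasing is safe because memmove's contract is a superset of memcpy's: any routine that copies correctly when the buffers overlap also copies correctly when they do not. A minimal byte-wise sketch of the idea in C follows. The names `sketch_memmove` and `sketch_memcpy` are hypothetical, and this is not the SSSE3 implementation, which makes memcpy a symbol alias rather than a wrapper call.

```c
#include <stddef.h>
#include <stdint.h>

/* Overlap-safe copy (hypothetical sketch, byte at a time).  */
static void *
sketch_memmove (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;

  /* One unsigned comparison covers both safe-forward cases: dst below
     src (the subtraction wraps to a large value) and dst at or past
     the end of the source.  */
  if ((uintptr_t) d - (uintptr_t) s >= n)
    {
      for (size_t i = 0; i < n; i++)
        d[i] = s[i];
    }
  else
    {
      /* dst starts inside the source range: copy backward so source
         bytes are read before they are overwritten.  */
      for (size_t i = n; i > 0; i--)
        d[i - 1] = s[i - 1];
    }
  return dst;
}

/* memcpy only promises correct results for non-overlapping buffers,
   so forwarding to the overlap-safe routine satisfies it.  */
static void *
sketch_memcpy (void *dst, const void *src, size_t n)
{
  return sketch_memmove (dst, src, n);
}
```

The cost of sharing one body is the overlap check on every memcpy call; the benefit is that only one copy of the code exists.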

Aside from this function all other SSSE3 functions should be safe to
remove.

The performance is not changed drastically, although it shows overall
improvements without any major regressions or gains.

bench-memcpy geometric_mean(N=50) New / Original: 0.957

bench-memcpy-random geometric_mean(N=50) New / Original: 0.912

bench-memcpy-large geometric_mean(N=50) New / Original: 0.892

Benchmarks were run on Zhaoxin KX-6840@2000MHz. See attached numbers
for all results.

More importantly, this saves 7246 bytes of code size in memmove and an
additional 10741 bytes by reusing memmove code for memcpy (total 17987
bytes saved), as well as an additional 896 bytes of rodata for the jump
table entries.