path: root/sysdeps/microblaze/libm-test-ulps-name
author Wilco Dijkstra <wdijkstr@arm.com> 2021-12-02 18:33:26 +0000
committer Wilco Dijkstra <wdijkstr@arm.com> 2021-12-02 18:36:03 +0000
commit b31bd11454fade731e5158b1aea40b133ae19926
tree d6d25ad11615c9f2c91a607f7d7a7cdc958bb5d7 /sysdeps/microblaze/libm-test-ulps-name
parent b51eb35c572b015641f03e3682c303f7631279b7
AArch64: Improve A64FX memcpy
v2 is a complete rewrite of the A64FX memcpy. Performance is improved
by streamlining the code, aligning all large copies and using a single
unrolled loop for all sizes. The code size for memcpy and memmove goes
down from 1796 bytes to 868 bytes. Performance is better in all cases:
bench-memcpy-random is 2.3% faster overall, bench-memcpy-large is ~33%
faster for large sizes, bench-memcpy-walk is 25% faster for small sizes
and 20% for the largest sizes. The geomean of all tests in bench-memcpy
is 5.1% faster, and total time is reduced by 4%.
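
The structure described above (one alignment step for large copies, then a single unrolled loop) can be sketched in portable C. This is only an illustration under stated assumptions, not the A64FX implementation: the real routine is SVE assembly, and the name sketch_copy, the 64-byte BLOCK size, the 2 * BLOCK threshold, the 4x unroll factor and the choice to align the destination are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical block size; not taken from the commit. */
#define BLOCK ((size_t)64)

/* Illustrative sketch only. Fixed-size memcpy calls stand in for the
   vector loads/stores that the real assembly would use. */
void *sketch_copy(void *restrict dst, const void *restrict src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Small copies skip the alignment step entirely. */
    if (n < 2 * BLOCK) {
        memcpy(d, s, n);
        return dst;
    }

    /* Copy one unaligned block, then advance both pointers so that the
       destination is BLOCK-aligned. head is in 1..BLOCK, so every byte
       skipped here was already written by the copy above. */
    memcpy(d, s, BLOCK);
    size_t head = BLOCK - ((uintptr_t)d & (BLOCK - 1));
    d += head;
    s += head;
    n -= head;

    /* Single unrolled loop: four blocks per iteration. */
    while (n >= 4 * BLOCK) {
        memcpy(d, s, BLOCK);
        memcpy(d + BLOCK, s + BLOCK, BLOCK);
        memcpy(d + 2 * BLOCK, s + 2 * BLOCK, BLOCK);
        memcpy(d + 3 * BLOCK, s + 3 * BLOCK, BLOCK);
        d += 4 * BLOCK;
        s += 4 * BLOCK;
        n -= 4 * BLOCK;
    }

    /* Remaining tail of fewer than four blocks. */
    memcpy(d, s, n);
    return dst;
}
```

The overlapping first block avoids a separate byte loop for the misaligned head, at the cost of writing up to BLOCK - 1 bytes twice.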

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Diffstat (limited to 'sysdeps/microblaze/libm-test-ulps-name')
0 files changed, 0 insertions, 0 deletions