author | Krzysztof Koch <Krzysztof.Koch@arm.com> | 2019-11-05 17:35:18 +0000 |
---|---|---|
committer | Szabolcs Nagy <szabolcs.nagy@arm.com> | 2019-11-12 17:08:18 +0000 |
commit | b9f145df85145506f8e61bac38b792584a38d88f (patch) | |
tree | 666df24ad3209f2495b6fcd1361f785c0d000ff4 /string | |
parent | 76a7c103eb9060f9e3ba01d073ae4621a17d8b46 (diff) | |
aarch64: Increase small and medium cases for __memcpy_generic
Increase the upper bound on medium cases from 96 to 128 bytes. Now, up to 128 bytes are copied unrolled.

Increase the upper bound on small cases from 16 to 32 bytes so that copies of 17-32 bytes are not impacted by the larger medium case.

Benchmarking:
The attached figures show the relative timing difference with respect to 'memcpy_generic', which is the existing implementation. 'memcpy_med_128' denotes the version of memcpy_generic with only the medium case enlarged. The 'memcpy_med_128_small_32' numbers are for the version of memcpy_generic submitted in this patch, which has both the medium and small cases enlarged. The figures were generated using the script from:
https://www.sourceware.org/ml/libc-alpha/2019-10/msg00563.html

Depending on the platform, the performance improvement in the bench-memcpy-random.c benchmark ranges from 6% to 20% between the original and final version of memcpy.S.

Tested against GLIBC testsuite and randomized tests.
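
The threshold change described above can be sketched in C. This is only an illustration of the size classes named in the commit message, not the actual implementation, which is AArch64 assembly in memcpy.S. The names `copy_class`, `classify_old` and `classify_new` are hypothetical, and the sketch assumes the stated upper bounds are inclusive.

```c
#include <stddef.h>

/* Hypothetical size classes, for illustration only. */
enum copy_class { COPY_SMALL, COPY_MEDIUM, COPY_LARGE };

/* Bounds before the patch: small up to 16 bytes, medium up to 96 bytes. */
enum copy_class classify_old(size_t n)
{
    if (n <= 16)
        return COPY_SMALL;
    if (n <= 96)
        return COPY_MEDIUM;
    return COPY_LARGE;
}

/* Bounds after the patch: small up to 32 bytes, medium up to 128 bytes. */
enum copy_class classify_new(size_t n)
{
    if (n <= 32)
        return COPY_SMALL;
    if (n <= 128)
        return COPY_MEDIUM;
    return COPY_LARGE;
}
```

Under these assumptions a 17-byte copy moves from the medium class to the small class, and a 128-byte copy moves from the large class to the medium (unrolled) class.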
Diffstat (limited to 'string')
0 files changed, 0 insertions, 0 deletions