author | Wilco Dijkstra <wdijkstr@arm.com> | 2016-05-12 16:41:00 +0100 |
---|---|---|
committer | Wilco Dijkstra <wdijkstr@arm.com> | 2016-05-12 16:44:53 +0100 |
commit | a8c5a2a9521e105da6e96eaf4029b8e4d595e4f5 (patch) | |
tree | 77a2dc477b893cc23304cb30ddf8d2bc2853414b /sysdeps/aarch64/memmove.S | |
parent | 56290d6e762c1194547e73ff0b948cd79d3a1e03 (diff) | |
This is an optimized memset for AArch64. Memset is split into 4 main cases: small sets of up to 16 bytes; medium sets of 16..96 bytes, which are fully unrolled; large memsets of more than 96 bytes, which align the destination and use an unrolled loop processing 64 bytes per iteration; and memsets of zero of more than 256 bytes, which use the dc zva instruction, with faster versions for the common ZVA sizes of 64 or 128 bytes. STP of Q registers is used to reduce codesize without loss of performance.

The speedup on test-memset is 1% on Cortex-A57 and 8% on Cortex-A53.

* sysdeps/aarch64/memset.S (__memset): Rewrite of optimized memset.
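The size classes in the commit message can be modelled in portable C. This is only an illustrative sketch, not the assembly in sysdeps/aarch64/memset.S: the names `classify`, `sketch_memset`, `fill` and the `PATH_*` constants are hypothetical, the 16-byte alignment target is an assumption, and because the message lists 16 bytes in both the small and medium ranges, the sketch arbitrarily treats 16 as medium. The dc zva path cannot be expressed in portable C, so it is modelled here as the same loop as the large case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical labels for the four cases named in the commit message. */
enum memset_path { PATH_SMALL, PATH_MEDIUM, PATH_LARGE, PATH_ZVA };

/* Pick a path from the fill value and the length.
   Thresholds (16, 96, 256) are taken from the commit message;
   placing exactly 16 bytes in the medium case is an assumption. */
static enum memset_path classify(int value, size_t n)
{
    if (n < 16)
        return PATH_SMALL;
    if (n <= 96)
        return PATH_MEDIUM;
    if ((unsigned char)value == 0 && n > 256)
        return PATH_ZVA;
    return PATH_LARGE;
}

/* Plain byte fill, standing in for the unrolled stores. */
static void fill(unsigned char *p, unsigned char byte, size_t n)
{
    for (size_t i = 0; i < n; i++)
        p[i] = byte;
}

/* Portable model of the dispatch. The real code uses STP of Q
   registers and dc zva; this sketch only mirrors the control flow. */
static void *sketch_memset(void *dst, int value, size_t n)
{
    unsigned char *p = dst;
    unsigned char byte = (unsigned char)value;
    enum memset_path path = classify(value, n);

    if (path == PATH_SMALL || path == PATH_MEDIUM) {
        fill(p, byte, n);
        return dst;
    }

    /* Large (and, in this model, ZVA): first bring the destination
       up to a 16-byte boundary (assumed alignment target). */
    size_t head = (size_t)(-(uintptr_t)p & 15);
    assert(head < n);           /* holds because n > 96 here */
    fill(p, byte, head);
    p += head;
    n -= head;

    /* Main loop: 64 bytes per iteration, as the message describes. */
    while (n >= 64) {
        fill(p, byte, 64);
        p += 64;
        n -= 64;
    }

    /* Remaining tail of fewer than 64 bytes. */
    fill(p, byte, n);
    return dst;
}
```

Calling `sketch_memset(buf, 0, 300)` would take the ZVA-labelled path in this model, while the same length with a non-zero value takes the large path.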
Diffstat (limited to 'sysdeps/aarch64/memmove.S')
0 files changed, 0 insertions, 0 deletions