mirror/glibc - mirror of git://sourceware.org/git/glibc.git

author	Wilco Dijkstra <wdijkstr@arm.com>	2016-05-12 16:41:00 +0100
committer	Wilco Dijkstra <wdijkstr@arm.com>	2016-05-12 16:44:53 +0100
commit	a8c5a2a9521e105da6e96eaf4029b8e4d595e4f5 (patch)
tree	77a2dc477b893cc23304cb30ddf8d2bc2853414b /sysdeps/aarch64/memmove.S
parent	56290d6e762c1194547e73ff0b948cd79d3a1e03 (diff)

This is an optimized memset for AArch64. Memset is split into 4 main cases:

small sets of up to 16 bytes and medium sets of 16..96 bytes, which are fully
unrolled; large memsets of more than 96 bytes, which align the destination and
use an unrolled loop processing 64 bytes per iteration; and memsets of zero of
more than 256 bytes, which use the dc zva instruction, with faster versions for
the common ZVA sizes of 64 and 128 bytes.  STP of Q registers is used to reduce
code size without loss of performance.
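
The size classes described above can be sketched in portable C. This is only an illustration of the dispatch thresholds and of the "align, then 64 bytes per iteration" idea, not the code in sysdeps/aarch64/memset.S, which is hand-written assembly. The names classify, set_large_sketch and the set_path enum are hypothetical. Two assumptions are made: a length of exactly 16 is treated as small (the message lists 16 in both ranges), and the large path aligns to 16 bytes (chosen to match the width of a Q register).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names for illustration only. */
enum set_path { SET_SMALL, SET_MEDIUM, SET_LARGE, SET_ZVA };

/* Pick a path from the fill value and length, using the thresholds
   stated in the commit message.  Assumption: n == 16 counts as small. */
static enum set_path classify(int c, size_t n)
{
    if (n <= 16)
        return SET_SMALL;
    if (n <= 96)
        return SET_MEDIUM;
    if ((unsigned char)c == 0 && n > 256)
        return SET_ZVA;     /* zeroing path; the assembly uses dc zva here */
    return SET_LARGE;
}

/* Portable stand-in for the large path: byte stores until the
   destination is aligned, then 64 bytes per loop iteration, then the
   tail.  Assumption: 16-byte alignment. */
static void *set_large_sketch(void *dst, int c, size_t n)
{
    unsigned char *d = dst;
    unsigned char block[64];

    for (size_t i = 0; i < sizeof block; i++)
        block[i] = (unsigned char)c;

    /* Head: advance to a 16-byte boundary. */
    while (n > 0 && ((uintptr_t)d & 15) != 0) {
        *d++ = (unsigned char)c;
        n--;
    }
    /* Main loop: one 64-byte block per iteration. */
    while (n >= 64) {
        memcpy(d, block, 64);
        d += 64;
        n -= 64;
    }
    /* Tail: the remaining 0..63 bytes. */
    while (n > 0) {
        *d++ = (unsigned char)c;
        n--;
    }
    return dst;
}
```

In this sketch a nonzero fill of any length above 96 takes the SET_LARGE path, and only a zero fill longer than 256 bytes is routed to SET_ZVA. The dc zva instruction itself has no portable C equivalent, so the sketch stops at classification for that case.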

The speedup on test-memset is 1% on Cortex-A57 and 8% on Cortex-A53.

	* sysdeps/aarch64/memset.S (__memset):
	Rewrite of optimized memset.

Diffstat (limited to 'sysdeps/aarch64/memmove.S')

0 files changed, 0 insertions, 0 deletions

