From 922369032c604b4dcfd535e1bcddd4687e7126a5 Mon Sep 17 00:00:00 2001
From: Wilco Dijkstra
Date: Thu, 10 Aug 2017 17:00:38 +0100
Subject: [AArch64] Optimized memcmp.

This is an optimized memcmp for AArch64.  This is a complete rewrite
using a different algorithm.  The previous version split into cases
where both inputs were aligned, where the inputs were mutually aligned,
and where they were unaligned (handled with a byte loop).  The new
version combines all these cases, while small inputs of less than
8 bytes are handled separately.

This allows the main code to be sped up using unaligned loads since
there are now at least 8 bytes to be compared.  After the first 8 bytes,
align the first input.  This ensures each iteration does at most one
unaligned access and mutually aligned inputs behave as aligned.  After
the main loop, process the last 8 bytes using unaligned accesses.

This improves performance of (mutually) aligned cases by 25% and
unaligned by >500% (yes, >6 times faster) on large inputs.

	* sysdeps/aarch64/memcmp.S (memcmp):
	Rewrite of optimized memcmp.
---
 ChangeLog | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/ChangeLog b/ChangeLog
index dcf86261ca..e5e36a40e6 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,8 @@
+2017-08-10  Wilco Dijkstra
+
+	* sysdeps/aarch64/memcmp.S (memcmp):
+	Rewrite of optimized memcmp.
+
 2017-08-10  Florian Weimer
 
 	Introduce ld.so exceptions.
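
The diff above covers only the ChangeLog entry; the rewritten memcmp
itself lives in sysdeps/aarch64/memcmp.S as hand-written assembly.  As a
rough sketch of the strategy the commit message describes (not the
actual implementation), the following hypothetical C rendering may
help.  The names memcmp_sketch, load64 and cmp64 are invented for
illustration, and a little-endian target, as on AArch64 Linux, is
assumed:

#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Unaligned 64-bit load; the memcpy compiles to a single load on
   AArch64, which permits unaligned addresses.  */
static inline uint64_t
load64 (const unsigned char *p)
{
  uint64_t v;
  memcpy (&v, p, sizeof v);
  return v;
}

/* Given two unequal words loaded from little-endian memory, return a
   memcmp-style result for the first differing byte.  Byte-reversing
   (the REV instruction on AArch64) makes the lowest-addressed byte the
   most significant, so an unsigned compare gives lexicographic order.  */
static inline int
cmp64 (uint64_t a, uint64_t b)
{
  a = __builtin_bswap64 (a);
  b = __builtin_bswap64 (b);
  return a < b ? -1 : 1;
}

int
memcmp_sketch (const void *s1, const void *s2, size_t n)
{
  const unsigned char *p1 = s1;
  const unsigned char *p2 = s2;

  /* Small inputs of less than 8 bytes are handled separately.  */
  if (n < 8)
    {
      for (size_t i = 0; i < n; i++)
        if (p1[i] != p2[i])
          return p1[i] < p2[i] ? -1 : 1;
      return 0;
    }

  const unsigned char *end1 = p1 + n;
  const unsigned char *end2 = p2 + n;

  /* At least 8 bytes remain, so the first 8 can be compared with
     unaligned loads.  */
  uint64_t a = load64 (p1);
  uint64_t b = load64 (p2);
  if (a != b)
    return cmp64 (a, b);

  /* Align the first input.  Only the second input may now be
     unaligned, so each iteration does at most one unaligned access,
     and mutually aligned inputs behave as aligned.  */
  size_t skip = 8 - ((uintptr_t) p1 & 7);
  p1 += skip;
  p2 += skip;

  while (end1 - p1 >= 8)
    {
      a = load64 (p1);  /* Aligned.  */
      b = load64 (p2);  /* Possibly unaligned.  */
      if (a != b)
        return cmp64 (a, b);
      p1 += 8;
      p2 += 8;
    }

  /* Process the last 8 bytes with unaligned accesses.  They may
     overlap bytes already compared; those compared equal, so the
     overlap is harmless.  */
  a = load64 (end1 - 8);
  b = load64 (end2 - 8);
  if (a != b)
    return cmp64 (a, b);
  return 0;
}

Note the tail handling: rather than a byte loop, the final loads re-read
the last 8 bytes relative to the end of the buffers, which may overlap
bytes the main loop already compared.  Since any overlapped bytes were
equal, re-comparing them cannot change the result, and the branchy
remainder logic disappears.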