author | Prakhar Bahuguna <prakhar.bahuguna@arm.com> | 2017-06-27 15:43:50 +0000 |
---|---|---|
committer | Joseph Myers <joseph@codesourcery.com> | 2017-06-27 15:43:50 +0000 |
commit | f8f72bc0c3da8ba039e6a1ed670ca576120b1f85 (patch) | |
tree | 83b3438aea7f6425cf94c5f97cbbca1d62797683 /time | |
parent | a37b5daa6bc7fbcbbc229b2549a161fa15023f41 (diff) | |
[ARM] Optimise memchr for NEON-enabled processors
This patch provides an optimised implementation of memchr using NEON instructions to improve its performance, especially with longer search regions. It outperforms the Thumb2+DSP optimised code, with more significant gains for larger inputs. The NEON code also wins in cases where the input is small (less than 8 bytes) by defaulting to a simple byte-by-byte search. This avoids the overhead imposed by filling two quadword registers from memory.

* sysdeps/arm/armv7/multiarch/Makefile: Add memchr_neon to sysdep_routines.
* sysdeps/arm/armv7/multiarch/ifunc-impl-list.c: Add define for __memchr_neon. Add ifunc definitions for __memchr_neon and __memchr_noneon.
* sysdeps/arm/armv7/multiarch/memchr.S: New file.
* sysdeps/arm/armv7/multiarch/memchr_impl.S: Likewise.
* sysdeps/arm/armv7/multiarch/memchr_neon.S: Likewise.

Testing done: Ran regression tests for arm-none-linux-gnueabihf as well as a full toolchain bootstrap. Benchmark tests were run on ARMv7-A and ARMv8-A hardware targets.
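The strategy described in the commit message (byte-by-byte search for inputs under 8 bytes, wide block scanning otherwise) can be illustrated with a rough portable C sketch. This is not the code in memchr_neon.S: the names `memchr_sketch`, `bytewise_search`, `SMALL_LIMIT` and `BLOCK_SIZE` are hypothetical, and the inner loop over a 32-byte block only stands in for the comparison that the real routine presumably performs on two quadword NEON registers.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names and constants; an illustration only, not glibc code. */
#define SMALL_LIMIT 8   /* commit message: inputs of "less than 8 bytes" */
#define BLOCK_SIZE  32  /* assumed: two 16-byte quadword registers */

/* Plain byte-by-byte search over n bytes starting at p. */
static void *bytewise_search(const unsigned char *p, unsigned char c, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (p[i] == c)
            return (void *)(p + i);
    return NULL;
}

void *memchr_sketch(const void *s, int c, size_t n)
{
    const unsigned char *p = s;
    unsigned char ch = (unsigned char)c;

    /* Small inputs: skip the block machinery entirely. */
    if (n < SMALL_LIMIT)
        return bytewise_search(p, ch, n);

    /* Large inputs: test a whole block at once.  The branch-free inner
       loop models a vector compare followed by a reduction. */
    while (n >= BLOCK_SIZE) {
        unsigned char hit = 0;
        for (size_t i = 0; i < BLOCK_SIZE; i++)
            hit |= (unsigned char)(p[i] == ch);
        if (hit)
            return bytewise_search(p, ch, BLOCK_SIZE); /* locate first match */
        p += BLOCK_SIZE;
        n -= BLOCK_SIZE;
    }

    /* Tail shorter than one block. */
    return bytewise_search(p, ch, n);
}
```

The early return for short inputs mirrors the point made above: when only a few bytes need checking, setting up a wide load costs more than it saves.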
Diffstat (limited to 'time')
0 files changed, 0 insertions, 0 deletions