about summary refs log tree commit diff
path: root/sysdeps/arm/armv7/multiarch/memchr_neon.S
Commit message (Collapse)AuthorAgeFilesLines
* [ARM] Optimise memchr for NEON-enabled processorsPrakhar Bahuguna2017-06-271-0/+9
This patch provides an optimised implementation of memchr using NEON instructions to improve its performance, especially with longer search regions. This gave an improvement in performance against the Thumb2+DSP optimised code, with more significant gains for larger inputs. The NEON code also wins in cases where the input is small (less than 8 bytes) by defaulting to a simple byte-by-byte search. This avoids the overhead imposed by filling two quadword registers from memory. * sysdeps/arm/armv7/multiarch/Makefile: Add memchr_neon to sysdep_routines. * sysdeps/arm/armv7/multiarch/ifunc-impl-list.c: Add define for __memchr_neon. Add ifunc definitions for __memchr_neon and __memchr_noneon. * sysdeps/arm/armv7/multiarch/memchr.S: New file. * sysdeps/arm/armv7/multiarch/memchr_impl.S: Likewise. * sysdeps/arm/armv7/multiarch/memchr_neon.S: Likewise. Testing done: Ran regression tests for arm-none-linux-gnueabihf as well as a full toolchain bootstrap. Benchmark tests were ran on ARMv7-A and ARMv8-A hardware targets.