path: root/sysdeps/x86_64/multiarch/memset-evex-unaligned-erms.S
author Noah Goldstein <goldstein.w.n@gmail.com> 2022-10-18 17:44:07 -0700
committer Noah Goldstein <goldstein.w.n@gmail.com> 2022-10-19 17:31:03 -0700
commit b412213eee0afa3b51dfe92b736dfc7c981309f5
tree 1f4279f8ab3483f106f43613f6bf066bcbafe8bf /sysdeps/x86_64/multiarch/memset-evex-unaligned-erms.S
parent 4af6844aa5d3577e327f15dd877a38a043cb236a
x86: Optimize strrchr-evex.S and implement with VMM headers
Optimization is:
1. Cache latest result in "fast path" loop with `vmovdqu` instead of
  `kunpckdq`.  This helps if there is more than one match.
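
The caching idea in item 1 can be illustrated with a scalar C analogy.
This is only a sketch under stated assumptions, not the glibc assembly:
the function name `strrchr_cached_block` and the `BLOCK` width are
hypothetical, and a plain pointer store stands in for saving the latest
matching vector.  The scan loop only remembers which block last held a
match; the exact position is worked out once, after the terminator is
found.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical block width; stands in for one vector register of bytes. */
#define BLOCK 8

/* Scalar analogy of "cache the latest result": the scan loop only records
   the start of the most recent block that contained a match, and the exact
   match position is resolved once at the end. */
char *strrchr_cached_block(const char *s, int c)
{
    assert(s != NULL);
    const char ch = (char)c;
    const char *last_block = NULL;  /* latest block known to hold a match */
    const char *p = s;

    for (;;) {
        int has_match = 0;
        size_t i;
        for (i = 0; i < BLOCK; i++) {
            if (p[i] == ch)
                has_match = 1;      /* cheap: no position bookkeeping here */
            if (p[i] == '\0')
                break;              /* never read past the terminator */
        }
        if (has_match)
            last_block = p;         /* the "cache" step */
        if (i < BLOCK)
            break;                  /* terminator was inside this block */
        p += BLOCK;
    }

    if (last_block == NULL)
        return NULL;

    /* Deferred work: find the last match inside the cached block only. */
    const char *result = NULL;
    for (size_t i = 0; i < BLOCK; i++) {
        if (last_block[i] == ch)
            result = last_block + i;
        if (last_block[i] == '\0')
            break;
    }
    return (char *)result;
}
```

With several matches spread over the string, each matching block costs one
pointer store in the loop, and the position search runs a single time.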

Code Size Changes:
strrchr-evex.S       :  +30 bytes (Same number of cache lines)

Net perf changes:

Reported as geometric mean of all improvements / regressions from N=10
runs of the benchtests. Value as New Time / Old Time so < 1.0 is
improvement and > 1.0 is regression.

strrchr-evex.S       : 0.932 (From cases with higher match frequency)

Full results attached in email.

Full check passes on x86-64.
Diffstat (limited to 'sysdeps/x86_64/multiarch/memset-evex-unaligned-erms.S')
0 files changed, 0 insertions, 0 deletions