author | Noah Goldstein <goldstein.w.n@gmail.com> | 2022-10-18 17:44:03 -0700 |
---|---|---|
committer | Noah Goldstein <goldstein.w.n@gmail.com> | 2022-10-19 17:31:03 -0700 |
commit | 330881763efff626d6b1cdf8de9ffee4ed7a1ba1 (patch) | |
tree | be5ed6967393bbb1b87d8ac6c1ed3cc1bdef25cd /sysdeps/x86_64/multiarch/rawmemchr-evex-rtm.S | |
parent | 451c6e58540e8571e31581c04c4829e5d2cfe8ac (diff) | |
x86: Optimize memchr-evex.S and implement with VMM headers
Optimizations are:
1. Use the fact that tzcnt(0) -> VEC_SIZE for memchr to save a branch in short string case.
2. Restructure code so that small strings are given the hot path.
    - This is a net-zero on the benchmark suite but in general makes sense as smaller sizes are far more common.
3. Use more code-size efficient instructions.
    - tzcnt ... -> bsf ...
    - vpcmpb $0 ... -> vpcmpeq ...
4. Align labels less aggressively, especially if it doesn't save fetch blocks / causes the basic-block to span extra cache-lines.

The optimizations (especially for point 2) make the memchr and rawmemchr code essentially incompatible so split rawmemchr-evex to a new file.

Code Size Changes:
memchr-evex.S : -107 bytes
rawmemchr-evex.S : -53 bytes

Net perf changes:
Reported as geometric mean of all improvements / regressions from N=10 runs of the benchtests. Value as New Time / Old Time so < 1.0 is improvement and > 1.0 is regression.

memchr-evex.S : 0.928
rawmemchr-evex.S : 0.986 (Less targets cross cache lines)

Full results attached in email.

Full check passes on x86-64.
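Optimization 1 in the commit message can be illustrated with a small C sketch. This is not the glibc implementation, which is hand-written EVEX assembly. The helper names `tzcnt32`, `match_mask`, and `memchr_short` are hypothetical, and the sketch assumes VEC_SIZE is 32 and that a full vector of bytes is readable at the pointer. The idea is that `tzcnt` on a 32-bit operand returns 32 for a zero input, so a single `len <= idx` comparison covers both "no match in the vector" and "match beyond the requested length".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VEC_SIZE 32 /* assumption: 32-byte vectors, so the mask fits in 32 bits */

/* Emulates x86 tzcnt on a 32-bit operand: a zero input yields 32. */
static unsigned tzcnt32(uint32_t x)
{
    unsigned n = 0;
    if (x == 0)
        return 32;
    while ((x & 1u) == 0) {
        x >>= 1;
        n++;
    }
    return n;
}

/* Scalar stand-in for a vector byte compare: bit i is set when p[i] == c.
   Assumes VEC_SIZE bytes are readable at p. */
static uint32_t match_mask(const unsigned char *p, unsigned char c)
{
    uint32_t m = 0;
    for (unsigned i = 0; i < VEC_SIZE; i++)
        if (p[i] == c)
            m |= (uint32_t)1 << i;
    return m;
}

/* Hypothetical short-string path for len <= VEC_SIZE. */
static const void *memchr_short(const unsigned char *p, unsigned char c,
                                size_t len)
{
    assert(len <= VEC_SIZE);
    unsigned idx = tzcnt32(match_mask(p, c));
    /* One branch: idx == VEC_SIZE when there is no match, so this test also
       rejects the no-match case without a separate "mask == 0" check. */
    if (len <= idx)
        return NULL;
    return p + idx;
}
```

Without the tzcnt(0) property, the short path would need one test for an empty mask and a second for a match position past `len`.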
Diffstat (limited to 'sysdeps/x86_64/multiarch/rawmemchr-evex-rtm.S')
-rw-r--r-- | sysdeps/x86_64/multiarch/rawmemchr-evex-rtm.S | 9 |
1 files changed, 6 insertions, 3 deletions
diff --git a/sysdeps/x86_64/multiarch/rawmemchr-evex-rtm.S b/sysdeps/x86_64/multiarch/rawmemchr-evex-rtm.S
index deda1ca395..2073eaa620 100644
--- a/sysdeps/x86_64/multiarch/rawmemchr-evex-rtm.S
+++ b/sysdeps/x86_64/multiarch/rawmemchr-evex-rtm.S
@@ -1,3 +1,6 @@
-#define MEMCHR __rawmemchr_evex_rtm
-#define USE_AS_RAWMEMCHR 1
-#include "memchr-evex-rtm.S"
+#define RAWMEMCHR __rawmemchr_evex_rtm
+
+#define USE_IN_RTM 1
+#define SECTION(p) p##.evex.rtm
+
+#include "rawmemchr-evex.S"