x86: Optimize memrchr-evex.S - mirror/glibc - mirror of git://sourceware.org/git/glibc.git


author	Noah Goldstein <goldstein.w.n@gmail.com>	2022-06-06 21:11:31 -0700
committer	Sunil K Pandey <skpgkp2@gmail.com>	2022-07-18 22:13:57 -0700
commit	11946110f89511ee6ac769bef752f20045bd19d4 (patch)
tree	1e0b6911f5b7e887955aba949ceaf3c0ec2d2610 /stdlib/drand48_r.c
parent	6742c432db42a6dcf0e0be63a0c37cecbd3f6f04 (diff)

x86: Optimize memrchr-evex.S

The new code:
    1. Prioritizes smaller user-arg lengths.
    2. Optimizes target placement more carefully.
    3. Reuses more logic.
    4. Fixes various inefficiencies in the logic. The biggest
       case is the `lzcnt` logic for checking returns, which
       saves either a branch or multiple instructions.
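
As an illustration of item 4, here is a minimal C sketch of how a
single `lzcnt` result can serve as the return check for lengths of at
most 32 bytes. It is not the code from memrchr-evex.S: the helper names
`lzcnt32`, `match_mask32`, and `last_match_le32` are hypothetical, the
match mask is built with a scalar loop instead of a vector compare, and
the 32-byte window is assumed to be readable in full. Because `lzcnt`
of an empty mask is 32, one comparison against the length covers both
"no match" and "match lies before the start of the buffer".

```c
#include <stddef.h>
#include <stdint.h>

/* Portable stand-in for the x86 `lzcnt` instruction.  Unlike
   __builtin_clz, it is defined for 0 and returns 32 in that case.  */
static unsigned lzcnt32(uint32_t x)
{
    unsigned n = 0;
    while (n < 32 && !(x & 0x80000000u)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Hypothetical helper: bit i of the result is set when window[i] == c.
   In assembly this mask would come from a vector compare.  */
static uint32_t match_mask32(const unsigned char *window, int c)
{
    uint32_t mask = 0;
    for (unsigned i = 0; i < 32; i++)
        if (window[i] == (unsigned char)c)
            mask |= (uint32_t)1 << i;
    return mask;
}

/* `window` points at the 32 bytes that end where the search range ends.
   Only the last `len` (<= 32) bytes belong to the caller's buffer.
   Returns the last in-range match, or NULL if there is none.  */
static const unsigned char *last_match_le32(const unsigned char *window,
                                            int c, uint32_t len)
{
    /* lz is the distance of the last match from the final byte,
       or 32 when the mask is empty.  */
    unsigned lz = lzcnt32(match_mask32(window, c));

    /* One comparison rejects both an empty mask and a match that
       falls before the start of the buffer.  */
    if (len <= lz)
        return NULL;
    return window + 31 - lz;
}
```

Without the `lzcnt` behavior for zero, the empty-mask case would need
its own test before the bounds check, which is the kind of extra branch
or instruction sequence the item describes.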

The total code size saving is: 263 bytes
Geometric Mean of all benchmarks New / Old: 0.755

Regressions:
There are some regressions, particularly where the user-arg length
is large but the match char is near the beginning of the string
(in the first VEC). This case has roughly a 20% regression.

This is because the new logic gives the hot path for immediate
matches to shorter lengths (the more common input). That
shorter-length case has roughly a 35% speedup.

Full xcheck passes on x86_64.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

(cherry picked from commit b4209615a06b01c974f47b4998b00e4c7b1aa5d9)

Diffstat (limited to 'stdlib/drand48_r.c')

0 files changed, 0 insertions, 0 deletions

