author | Adhemerval Zanella <adhemerval.zanella@linaro.org> | 2024-03-21 14:12:00 -0300 |
---|---|---|
committer | Adhemerval Zanella <adhemerval.zanella@linaro.org> | 2024-03-27 13:48:16 -0300 |
commit | 721314c980ed371d36a84f63c393e4289e249b3b (patch) | |
tree | d472163c71d71a7846e52a9481da8e6a3ebe44e2 /sysdeps/x86_64/multiarch/wcscat-generic.c | |
parent | 2e53eb923486704b7a0d6f3d81d1ee8ba672a56b (diff) | |
x86_64: Remove avx512 strstr implementation
As indicated in a recent thread, it is a simple brute-force algorithm that checks the whole needle at a matching character pair (and does so 1 byte at a time after the first 64 bytes of a needle). Also, it never skips ahead and thus can match at every haystack position after trying to match all of the needle, which the generic implementation avoids.

As indicated by Wilco, a 4x larger needle and 16x larger haystack gives a clear 65x slowdown on both basic_strstr and __strstr_avx512:

    "ifuncs": ["basic_strstr", "twoway_strstr", "__strstr_avx512",
               "__strstr_sse2_unaligned", "__strstr_generic"],

    {
      "len_haystack": 65536,
      "len_needle": 1024,
      "align_haystack": 0,
      "align_needle": 0,
      "fail": 1,
      "desc": "Difficult bruteforce needle",
      "timings": [4.0948e+07, 15094.5, 3.20818e+07, 108558, 10839.2]
    },
    {
      "len_haystack": 1048576,
      "len_needle": 4096,
      "align_haystack": 0,
      "align_needle": 0,
      "fail": 1,
      "desc": "Difficult bruteforce needle",
      "timings": [2.69767e+09, 100797, 2.08535e+09, 495706, 82666.9]
    }

PS: I don't have an AVX512-capable machine to verify these issues, but skimming through the code it does seem to follow what Wilco has described.

Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
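The quadratic behaviour described in the commit message can be illustrated with a minimal sketch. This is not the removed `__strstr_avx512` code and not the benchmark's `basic_strstr`; `naive_strstr` and `worst_case_cmps` are hypothetical helpers, and the all-`a` haystack with an `a...ab` needle is an assumed worst-case input, not necessarily the one the glibc benchmark builds for "Difficult bruteforce needle". The sketch only counts character comparisons for a search that never skips ahead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not glibc code: brute-force search that retries
   the whole needle at every haystack position and never skips ahead.
   The number of character comparisons is stored in *cmps.  */
static const char *
naive_strstr (const char *hs, size_t hs_len,
              const char *ne, size_t ne_len, unsigned long long *cmps)
{
  *cmps = 0;
  if (ne_len == 0)
    return hs;
  if (ne_len > hs_len)
    return NULL;

  for (size_t i = 0; i + ne_len <= hs_len; i++)
    {
      size_t j = 0;
      while (j < ne_len)
        {
          ++*cmps;
          if (hs[i + j] != ne[j])
            break;
          j++;
        }
      if (j == ne_len)
        return hs + i;          /* Full match at position i.  */
    }
  return NULL;
}

/* Assumed worst case for the loop above: the haystack is all 'a' and the
   needle is 'a' repeated with a final 'b', so every position matches
   ne_len - 1 characters before failing.  Returns the comparison count.  */
static unsigned long long
worst_case_cmps (size_t hs_len, size_t ne_len)
{
  assert (ne_len >= 1 && ne_len <= hs_len);

  char *hs = malloc (hs_len);
  char *ne = malloc (ne_len);
  assert (hs != NULL && ne != NULL);

  memset (hs, 'a', hs_len);
  memset (ne, 'a', ne_len);
  ne[ne_len - 1] = 'b';

  unsigned long long cmps;
  const char *hit = naive_strstr (hs, hs_len, ne, ne_len, &cmps);
  assert (hit == NULL);         /* The needle never occurs.  */

  free (hs);
  free (ne);
  return cmps;
}
```

On this input the loop performs (len_haystack - len_needle + 1) * len_needle comparisons: 66,061,312 for the 65536/1024 case and 4,278,194,176 for the 1048576/4096 case, a ratio of roughly 65. Under these assumptions that is in line with the slowdown reported above for basic_strstr and __strstr_avx512, whereas the timings of the two-way and generic implementations grow far more slowly.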
Diffstat (limited to 'sysdeps/x86_64/multiarch/wcscat-generic.c')
0 files changed, 0 insertions, 0 deletions