diff options
author | Wilco Dijkstra <wdijkstr@arm.com> | 2019-06-12 11:38:52 +0100 |
---|---|---|
committer | Wilco Dijkstra <wdijkstr@arm.com> | 2019-06-12 11:38:52 +0100 |
commit | 5e0a7ecb6629461b28adc1a5aabcc0ede122f201 (patch) | |
tree | 8dc29750e4e9405f670fd362daa47cbc2731700e /string/str-two-way.h | |
parent | 80b2bfb5350442ef1a781b0ee9dd44d61bd88f8a (diff) | |
download | glibc-5e0a7ecb6629461b28adc1a5aabcc0ede122f201.tar.gz glibc-5e0a7ecb6629461b28adc1a5aabcc0ede122f201.tar.xz glibc-5e0a7ecb6629461b28adc1a5aabcc0ede122f201.zip |
Improve performance of strstr
This patch significantly improves performance of strstr using a novel modified Horspool algorithm. Needles up to size 256 use a bad-character table indexed by hashed pairs of characters to quickly skip past mismatches. Long needles use a self-adapting filtering step to avoid comparing the whole needle repeatedly. By limiting the needle length to 256, the shift table only requires 8 bits per entry, lowering preprocessing overhead and minimizing cache effects. This limit also implies worst-case performance is linear. Small needles up to size 3 use a dedicated linear search. Very long needles use the Two-Way algorithm. The performance gain using the improved bench-strstr on Cortex-A72 is 5.8 times basic_strstr and 3.7 times twoway_strstr. Tested against GLIBC testsuite, randomized tests and the GNULIB strstr test (https://git.savannah.gnu.org/cgit/gnulib.git/tree/tests/test-strstr.c). Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com> * string/str-two-way.h (two_way_short_needle): Add inline to avoid warning. (two_way_long_needle): Block inlining. * string/strstr.c (strstr2): Add new function. (strstr3): Likewise. (STRSTR): Completely rewrite strstr to improve performance.
Diffstat (limited to 'string/str-two-way.h')
-rw-r--r-- | string/str-two-way.h | 9 |
1 files changed, 6 insertions, 3 deletions
diff --git a/string/str-two-way.h b/string/str-two-way.h index b5011baafa..f43c613f5a 100644 --- a/string/str-two-way.h +++ b/string/str-two-way.h @@ -221,7 +221,7 @@ critical_factorization (const unsigned char *needle, size_t needle_len, most 2 * HAYSTACK_LEN - NEEDLE_LEN comparisons occur in searching. If AVAILABLE modifies HAYSTACK_LEN (as in strstr), then at most 3 * HAYSTACK_LEN - NEEDLE_LEN comparisons occur in searching. */ -static RETURN_TYPE +static inline RETURN_TYPE two_way_short_needle (const unsigned char *haystack, size_t haystack_len, const unsigned char *needle, size_t needle_len) { @@ -382,8 +382,11 @@ two_way_short_needle (const unsigned char *haystack, size_t haystack_len, and sublinear performance O(HAYSTACK_LEN / NEEDLE_LEN) is possible. If AVAILABLE modifies HAYSTACK_LEN (as in strstr), then at most 3 * HAYSTACK_LEN - NEEDLE_LEN comparisons occur in searching, and - sublinear performance is not possible. */ -static RETURN_TYPE + sublinear performance is not possible. + + Since this function is large and complex, block inlining to avoid + slowing down the common case of small needles. */ +__attribute__((noinline)) static RETURN_TYPE two_way_long_needle (const unsigned char *haystack, size_t haystack_len, const unsigned char *needle, size_t needle_len) { |