about summary refs log tree commit diff
path: root/string/str-two-way.h
diff options
context:
space:
mode:
authorWilco Dijkstra <wdijkstr@arm.com>2019-06-12 11:38:52 +0100
committerWilco Dijkstra <wdijkstr@arm.com>2019-06-12 11:38:52 +0100
commit5e0a7ecb6629461b28adc1a5aabcc0ede122f201 (patch)
tree8dc29750e4e9405f670fd362daa47cbc2731700e /string/str-two-way.h
parent80b2bfb5350442ef1a781b0ee9dd44d61bd88f8a (diff)
downloadglibc-5e0a7ecb6629461b28adc1a5aabcc0ede122f201.tar.gz
glibc-5e0a7ecb6629461b28adc1a5aabcc0ede122f201.tar.xz
glibc-5e0a7ecb6629461b28adc1a5aabcc0ede122f201.zip
Improve performance of strstr
This patch significantly improves performance of strstr using a novel
modified Horspool algorithm.  Needles up to size 256 use a bad-character
table indexed by hashed pairs of characters to quickly skip past mismatches.
Long needles use a self-adapting filtering step to avoid comparing the whole
needle repeatedly.

By limiting the needle length to 256, the shift table only requires 8 bits
per entry, lowering preprocessing overhead and minimizing cache effects.
This limit also implies worst-case performance is linear.

Small needles up to size 3 use a dedicated linear search.  Very long needles
use the Two-Way algorithm.

The performance gain using the improved bench-strstr on Cortex-A72 is 5.8
times basic_strstr and 3.7 times twoway_strstr.

Tested against GLIBC testsuite, randomized tests and the GNULIB strstr test
(https://git.savannah.gnu.org/cgit/gnulib.git/tree/tests/test-strstr.c).

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

	* string/str-two-way.h (two_way_short_needle): Add inline to avoid
	warning.
	(two_way_long_needle): Block inlining.
	* string/strstr.c (strstr2): Add new function.
	(strstr3): Likewise.
	(STRSTR): Completely rewrite strstr to improve performance.
Diffstat (limited to 'string/str-two-way.h')
-rw-r--r--string/str-two-way.h9
1 files changed, 6 insertions, 3 deletions
diff --git a/string/str-two-way.h b/string/str-two-way.h
index b5011baafa..f43c613f5a 100644
--- a/string/str-two-way.h
+++ b/string/str-two-way.h
@@ -221,7 +221,7 @@ critical_factorization (const unsigned char *needle, size_t needle_len,
    most 2 * HAYSTACK_LEN - NEEDLE_LEN comparisons occur in searching.
    If AVAILABLE modifies HAYSTACK_LEN (as in strstr), then at most 3 *
    HAYSTACK_LEN - NEEDLE_LEN comparisons occur in searching.  */
-static RETURN_TYPE
+static inline RETURN_TYPE
 two_way_short_needle (const unsigned char *haystack, size_t haystack_len,
 		      const unsigned char *needle, size_t needle_len)
 {
@@ -382,8 +382,11 @@ two_way_short_needle (const unsigned char *haystack, size_t haystack_len,
    and sublinear performance O(HAYSTACK_LEN / NEEDLE_LEN) is possible.
    If AVAILABLE modifies HAYSTACK_LEN (as in strstr), then at most 3 *
    HAYSTACK_LEN - NEEDLE_LEN comparisons occur in searching, and
-   sublinear performance is not possible.  */
-static RETURN_TYPE
+   sublinear performance is not possible.
+
+   Since this function is large and complex, block inlining to avoid
+   slowing down the common case of small needles.  */
+__attribute__((noinline)) static RETURN_TYPE
 two_way_long_needle (const unsigned char *haystack, size_t haystack_len,
 		     const unsigned char *needle, size_t needle_len)
 {