| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Continue cleanup of the string benchtests. Remove simplistic
byte-oriented versions with faster generic implementations.
Remove bcopy/bzero benchmarks (bcopy/bzero are obsolete and never
emitted by compilers). Remove builtin versions of memcpy, memset
and strlen. Remove all remaining "stupid" implementations given
they are always slower than the "simple" variants and thus don't
add anything useful.
* benchtests/bench-strcasecmp.c (stupid_strcasecmp): Remove.
* benchtests/bench-strcasestr.c (stupid_strcasestr): Remove.
* benchtests/bench-strchr.c (stupid_strchr): Remove.
* benchtests/bench-strcmp.c (stupid_strcmp): Remove.
* benchtests/bench-strcspn.c (stupid_strcspn): Remove.
* benchtests/bench-strlen.c (builtin_strlen): Remove.
* benchtests/bench-strncasecmp.c (stupid_strncasecmp): Remove.
* benchtests/bench-strncmp.c (stupid_strncmp): Remove.
* benchtests/bench-strpbrk.c (stupid_strpbrk): Remove.
* benchtests/bench-strspn.c (stupid_strspn): Remove.
* benchtests/Makefile: Remove bench-bcopy.c and bench-bzero.c.
* benchtests/bench-bcopy.c: Delete file.
* benchtests/bench-bzero.c: Likewise.
* benchtests/bench-memccpy.c (stupid_memccpy): Remove.
(simple_memccpy): Remove.
(generic_memccpy): Add function.
* benchtests/bench-memcpy.c: (builtin_memcpy): Remove.
* benchtests/bench-memmove.c (simple_bcopy): Remove.
* benchtests/bench-mempcpy.c (simple_mempcpy): Remove.
(generic_mempcpy): Add new function.
* benchtests/bench-memset.c (simple_bzero): Remove.
(builtin_bzero): Remove.
(builtin_memset): Remove.
* benchtests/bench-rawmemchr.c (simple_rawmemchr): Remove.
(generic_rawmemchr): Add new function.
|
|
|
|
|
|
|
| |
* All files with FSF copyright notices: Update copyright dates
using scripts/update-copyrights.
* locale/programs/charmap-kw.h: Regenerated.
* locale/programs/locfile-kw.h: Likewise.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Refactor string benchtests by moving duplicated defines into
bench-string.h.
* benchtests/bench-memchr.c: Cleanup defines.
* benchtests/bench-memcmp.c: Likewise.
* benchtests/bench-memset.c: Likewise.
* benchtests/bench-memset-large.c: Likewise.
* benchtests/bench-memset-walk.c: Likewise.
* benchtests/bench-stpcpy.c: Likewise.
* benchtests/bench-stpncpy.c: Likewise.
* benchtests/bench-strcat.c: Likewise.
* benchtests/bench-strchr.c: Likewise.
* benchtests/bench-strcmp.c: Likewise.
* benchtests/bench-strcpy.c: Likewise.
* benchtests/bench-strcspn.c: Likewise.
* benchtests/bench-string.h: Likewise.
* benchtests/bench-strlen.c: Likewise.
* benchtests/bench-strncat.c: Likewise.
* benchtests/bench-strncmp.c: Likewise.
* benchtests/bench-strncpy.c: Likewise.
* benchtests/bench-strnlen.c: Likewise.
* benchtests/bench-strpbrk.c: Likewise.
* benchtests/bench-strrchr.c: Likewise.
* benchtests/bench-strspn.c: Likewise.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Drop realloc_bufs in favour of making alloc_bufs transparently
reallocate the buffers if it had allocated before. Also consolidate
computation of buffer lengths so that they don't get repeated on every
reallocation.
* benchtests/bench-string.h (buf1_size, buf2_size): New
variables.
(init_sizes): New function.
(test_init): Use it.
(alloc_buf, exit_error): New functions.
(alloc_bufs): Use ALLOC_BUF.
(realloc_bufs): Remove.
* benchtests/bench-memcmp.c (do_test): Adjust.
* benchtests/bench-memset-large.c (do_test): Likewise.
* benchtests/bench-memset-walk.c (do_test): Likewise.
* benchtests/bench-memset.c (do_test): Likewise.
* benchtests/bench-strncmp.c (do_test): Likewise.
|
|
|
|
|
|
|
| |
* All files with FSF copyright notices: Update copyright dates
using scripts/update-copyrights.
* locale/programs/charmap-kw.h: Regenerated.
* locale/programs/locfile-kw.h: Likewise.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Keeping the same buffers along with copying the same size of data into
the same location means that the first routine is typically the
slowest since it has to bear the cost of fetching data into to cache.
Reallocating buffers stabilizes numbers by a bit.
* benchtests/bench-string.h (realloc_bufs): New function.
(test_init): Call it.
* benchtests/bench-memset-large.c (do_test): Likewise.
* benchtests/bench-memset.c (do_test): Likewise.
|
|
|
|
|
|
|
|
|
|
| |
Make the memset benchmarks (bench-memset and bench-memset-large) print
their output in JSON so that they can be evaluated using the
compare_strings.py script.
* benchtests/bench-memset-large.c: Print output in JSON
format.
* benchtests/bench-memset.c: Likewise.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The test run is unnecessary and interferes with the benchmark. The
tests are done during make check, so they're unnecessary here.
* benchtests/bench-memccpy.c (do_one_test): Remove checks.
* benchtests/bench-memchr.c (do_one_test): Likewise.
* benchtests/bench-memcpy-large.c (do_one_test): Likewise.
* benchtests/bench-memcpy.c (do_one_test): Likewise.
* benchtests/bench-memmove-large.c (do_one_test): Likewise.
* benchtests/bench-memmove.c (do_one_test): Likewise.
* benchtests/bench-memset-large.c (do_one_test): Likewise.
* benchtests/bench-memset.c (do_one_test): Likewise.
* benchtests/bench-string.h (test_init): Remove memsets.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch basically replaces the test-skeleton.c inclusion by
support/test-driver.c and also minor adjustments in bench-string.h.
Checked on x86_64-linux-gnu and powerpc64le-linux-gnu.
* benchtests/bench-string.h (TEST_FUNCTION): Use name without
parenthesis.
(CMDLINE_PROCESS): Define using function instead of macro.
* benchtests/bench-memccpy.c: Include <support/test-driver.c> instead
of test-skeleton.
* benchtests/bench-memchr.c: Likewise.
* benchtests/bench-memcmp.c: Likewise.
* benchtests/bench-memcpy-large.c: Likewise.
* benchtests/bench-memcpy.c: Likewise.
* benchtests/bench-memmem.c: Likewise.
* benchtests/bench-memmove-large.c: Likewise.
* benchtests/bench-memmove.c: Likewise.
* benchtests/bench-memset-large.c: Likewise.
* benchtests/bench-memset.c: Likewise.
* benchtests/bench-rawmemchr.c: Likewise.
* benchtests/bench-strcasecmp.c: Likewise.
* benchtests/bench-strcasestr.c: Likewise.
* benchtests/bench-strcat.c: Likewise.
* benchtests/bench-strchr.c: Likewise.
* benchtests/bench-strcmp.c: Likewise.
* benchtests/bench-strcpy.c: Likewise.
* benchtests/bench-strcpy_chk.c: Likewise.
* benchtests/bench-strlen.c: Likewise.
* benchtests/bench-strncasecmp.c: Likewise.
* benchtests/bench-strncmp.c: Likewise.
* benchtests/bench-strncpy.c: Likewise.
* benchtests/bench-strnlen.c: Likewise.
* benchtests/bench-strpbrk.c: Likewise.
* benchtests/bench-strrchr.c: Likewise.
* benchtests/bench-strsep.c: Likewise.
* benchtests/bench-strspn.c: Likewise.
* benchtests/bench-strstr.c: Likewise.
* benchtests/bench-strtok.c: Likewise.
|
|
|
|
|
|
|
|
|
| |
Add 64-byte alignment tests in memset benchtest for 64-byte vector
registers.
* benchtests/bench-memset.c (do_test): Support 64-byte
alignment.
(test_main): Test 64-byte alignment.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch provides optimized version of wmemset with the z13 vector
instructions.
ChangeLog:
* sysdeps/s390/multiarch/wmemset-c.c: New File.
* sysdeps/s390/multiarch/wmemset-vx.S: Likewise.
* sysdeps/s390/multiarch/wmemset.c: Likewise.
* sysdeps/s390/multiarch/Makefile
(sysdep_routines): Add wmemset functions.
* sysdeps/s390/multiarch/ifunc-impl-list-common.c
(__libc_ifunc_impl_list_common): Add ifunc test for wmemset.
* wcsmbs/wmemset.c: Use WMEMSET if defined.
* string/test-memset.c: Add wmemset support.
* wcsmbs/test-wmemset.c: New File.
* wcsmbs/Makefile (strop-tests): Add wmemset.
* benchtests/bench-memset.c: Add wmemset support.
* benchtests/bench-wmemset.c: New File.
* benchtests/Makefile (wcsmbs-bench): Add wmemset.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds an optimized memset implementation for POWER8. For
sizes from 0 to 255 bytes, a word/doubleword algorithm similar to
POWER7 optimized one is used.
For size higher than 255 two strategies are used:
1. If the constant is different than 0, the memory is written with
altivec vector instruction;
2. If constant is 0, dbcz instructions are used. The loop is unrolled
to clear 512 byte at time.
Using vector instructions increases throughput considerable, with a
double performance for sizes larger than 1024. The dcbz loops unrolls
also shows performance improvement, by doubling throughput for sizes
larger than 8192 bytes.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Switch the string benchmarks to using bench-timing.h instead
of hp-timing.h directly. This allows the string benchmarks to
be run usefully on architectures such as ARM that do not have
support for hp-timing.h.
In order to do this the tests have been changed from timing each
individual call and picking the lowest execution time recorded to
timing a number of calls and taking the mean execution time.
ChangeLog:
2013-09-04 Will Newton <will.newton@linaro.org>
* benchtests/bench-timing.h (TIMING_PRINT_MEAN): New macro.
* benchtests/bench-string.h: Include bench-timing.h instead
of including hp-timing.h directly. (INNER_LOOP_ITERS): New
define. (HP_TIMING_BEST): Delete macro. (test_init): Remove
call to HP_TIMING_DIFF_INIT.
* benchtests/bench-memccpy.c: Use bench-timing.h macros
instead of hp-timing.h macros.
* benchtests/bench-memchr.c: Likewise.
* benchtests/bench-memcmp.c: Likewise.
* benchtests/bench-memcpy.c: Likewise.
* benchtests/bench-memmem.c: Likewise.
* benchtests/bench-memmove.c: Likewise.
* benchtests/bench-memset.c: Likewise.
* benchtests/bench-rawmemchr.c: Likewise.
* benchtests/bench-strcasecmp.c: Likewise.
* benchtests/bench-strcasestr.c: Likewise.
* benchtests/bench-strcat.c: Likewise.
* benchtests/bench-strchr.c: Likewise.
* benchtests/bench-strcmp.c: Likewise.
* benchtests/bench-strcpy.c: Likewise.
* benchtests/bench-strcpy_chk.c: Likewise.
* benchtests/bench-strlen.c: Likewise.
* benchtests/bench-strncasecmp.c: Likewise.
* benchtests/bench-strncat.c: Likewise.
* benchtests/bench-strncmp.c: Likewise.
* benchtests/bench-strncpy.c: Likewise.
* benchtests/bench-strnlen.c: Likewise.
* benchtests/bench-strpbrk.c: Likewise.
* benchtests/bench-strrchr.c: Likewise.
* benchtests/bench-strspn.c: Likewise.
* benchtests/bench-strstr.c: Likewise.
|
|
|
|
|
|
| |
Check wheter the compiler has the option -fno-tree-loop-distribute-patterns
to inhibit loop transformation to library calls and uses it on memset
and memmove default implementation to avoid recursive calls.
|
|
Copy over already existing string performance tests into benchtests.
Bits not related to performance measurements have been omitted.
|