| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
Increase benchmark iterations for math and vector math functions to improve
timing accuracy. Vector math benchmarks now take 1-3 seconds on a modern CPU.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 7061 files FOO.
I then removed trailing white space from math/tgmath.h,
support/tst-support-open-dev-null-range.c, and
sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following
obscure pre-commit check failure diagnostics from Savannah. I don't
know why I run into these diagnostics whereas others evidently do not.
remote: *** 912-#endif
remote: *** 913:
remote: *** 914-
remote: *** error: lines with trailing whitespace found
...
remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Also, change sources.redhat.com to sourceware.org.
This patch was automatically generated by running the following shell
script, which uses GNU sed, and which avoids modifying files imported
from upstream:
sed -ri '
s,(http|ftp)(://(.*\.)?(gnu|fsf|sourceware)\.org($|[^.]|\.[^a-z])),https\2,g
s,(http|ftp)(://(.*\.)?)sources\.redhat\.com($|[^.]|\.[^a-z]),https\2sourceware.org\4,g
' \
$(find $(git ls-files) -prune -type f \
! -name '*.po' \
! -name 'ChangeLog*' \
! -path COPYING ! -path COPYING.LIB \
! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \
! -path manual/texinfo.tex ! -path scripts/config.guess \
! -path scripts/config.sub ! -path scripts/install-sh \
! -path scripts/mkinstalldirs ! -path scripts/move-if-change \
! -path INSTALL ! -path locale/programs/charmap-kw.h \
! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \
! '(' -name configure \
-execdir test -f configure.ac -o -f configure.in ';' ')' \
! '(' -name preconfigure \
-execdir test -f preconfigure.ac ';' ')' \
-print)
and then by running 'make dist-prepare' to regenerate files built
from the altered files, and then executing the following to cleanup:
chmod a+x sysdeps/unix/sysv/linux/riscv/configure
# Omit irrelevant whitespace and comment-only changes,
# perhaps from a slightly-different Autoconf version.
git checkout -f \
sysdeps/csky/configure \
sysdeps/hppa/configure \
sysdeps/riscv/configure \
sysdeps/unix/sysv/linux/csky/configure
# Omit changes that caused a pre-commit check to fail like this:
# remote: *** error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines
git checkout -f \
sysdeps/powerpc/powerpc64/ppc-mcount.S \
sysdeps/unix/sysv/linux/s390/s390-64/syscall.S
# Omit change that caused a pre-commit check to fail like this:
# remote: *** error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline
git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S
|
|
|
|
|
|
|
|
|
|
| |
Remove TIMING_INIT since it's no longer used.
* benchtests/bench-malloc-simple.c: Remove TIMING_INIT.
* benchtests/bench-malloc-thread.c: Likewise.
* benchtests/bench-skeleton.c: Likewise.
* benchtests/bench-strtod.c: Likewise.
* benchtests/bench-timing.h: Likewise.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The GNU Coding Standards specify that line breaks in expressions
should go before an operator, not after one. This patch fixes various
code to do this. It only changes code that appears to be mostly
following GNU style anyway, not files and directories with
substantially different formatting. It is not exhaustive even for
files using GNU style (for example, changes to sysdeps files are
deferred for subsequent cleanups). Some files changed are shared with
gnulib, but most are specific to glibc. Changes were made manually,
with places to change found by grep (so some cases, e.g. where the
operator was followed by a comment at end of line, are particularly
liable to have been missed by grep, but I did include cases where the
operator was followed by backslash-newline).
This patch generally does not attempt to address other coding style
issues in the expressions changed (for example, missing spaces before
'(', or lack of parentheses to ensure indentation of continuation
lines properly reflects operator precedence).
Tested for x86_64, and with build-many-glibcs.py.
* benchtests/bench-memmem.c (simple_memmem): Break lines before
rather than after operators.
* benchtests/bench-skeleton.c (TIMESPEC_AFTER): Likewise.
* crypt/md5.c (md5_finish_ctx): Likewise.
* crypt/sha256.c (__sha256_finish_ctx): Likewise.
* crypt/sha512.c (__sha512_finish_ctx): Likewise.
* elf/cache.c (load_aux_cache): Likewise.
* elf/dl-load.c (open_verify): Likewise.
* elf/get-dynamic-info.h (elf_get_dynamic_info): Likewise.
* elf/readelflib.c (process_elf_file): Likewise.
* elf/rtld.c (dl_main): Likewise.
* elf/sprof.c (generate_call_graph): Likewise.
* hurd/ctty-input.c (_hurd_ctty_input): Likewise.
* hurd/ctty-output.c (_hurd_ctty_output): Likewise.
* hurd/dtable.c (reauth_dtable): Likewise.
* hurd/getdport.c (__getdport): Likewise.
* hurd/hurd/signal.h (_hurd_interrupted_rpc_timeout): Likewise.
* hurd/hurd/sigpreempt.h (HURD_PREEMPT_SIGNAL_P): Likewise.
* hurd/hurdfault.c (_hurdsig_fault_catch_exception_raise):
Likewise.
* hurd/hurdioctl.c (fioctl): Likewise.
* hurd/hurdselect.c (_hurd_select): Likewise.
* hurd/hurdsig.c (_hurdsig_abort_rpcs): Likewise.
(STOPSIGS): Likewise.
* hurd/hurdstartup.c (_hurd_startup): Likewise.
* hurd/intr-msg.c (_hurd_intr_rpc_mach_msg): Likewise.
* hurd/lookup-retry.c (__hurd_file_name_lookup_retry): Likewise.
* hurd/msgportdemux.c (msgport_server): Likewise.
* hurd/setauth.c (_hurd_setauth): Likewise.
* include/features.h (__GLIBC_USE_DEPRECATED_SCANF): Likewise.
* libio/libioP.h [IO_DEBUG] (CHECK_FILE): Likewise.
* locale/programs/ld-ctype.c (set_class_defaults): Likewise.
* localedata/tests-mbwc/tst_swscanf.c (tst_swscanf): Likewise.
* login/tst-utmp.c (do_check): Likewise.
(simulate_login): Likewise.
* mach/lowlevellock.h (lll_lock): Likewise.
(lll_trylock): Likewise.
* math/test-fenv.c (ALL_EXC): Likewise.
* math/test-fenvinline.c (ALL_EXC): Likewise.
* misc/sys/cdefs.h (__attribute_deprecated_msg__): Likewise.
* nis/nis_call.c (__do_niscall3): Likewise.
* nis/nis_callback.c (cb_prog_1): Likewise.
* nis/nis_defaults.c (searchaccess): Likewise.
* nis/nis_findserv.c (__nis_findfastest_with_timeout): Likewise.
* nis/nis_ismember.c (internal_ismember): Likewise.
* nis/nis_local_names.c (nis_local_principal): Likewise.
* nis/nss_nis/nis-rpc.c (_nss_nis_getrpcbyname_r): Likewise.
* nis/nss_nisplus/nisplus-netgrp.c (_nss_nisplus_getnetgrent_r):
Likewise.
* nis/ypclnt.c (yp_match): Likewise.
(yp_first): Likewise.
(yp_next): Likewise.
(yp_master): Likewise.
(yp_order): Likewise.
* nscd/hstcache.c (cache_addhst): Likewise.
* nscd/initgrcache.c (addinitgroupsX): Likewise.
* nss/nss_compat/compat-pwd.c (copy_pwd_changes): Likewise.
(internal_getpwuid_r): Likewise.
* nss/nss_compat/compat-spwd.c (copy_spwd_changes): Likewise.
* posix/glob.h (__GLOB_FLAGS): Likewise.
* posix/regcomp.c (peek_token): Likewise.
(peek_token_bracket): Likewise.
(parse_expression): Likewise.
* posix/regexec.c (sift_states_iter_mb): Likewise.
(check_node_accept_bytes): Likewise.
* posix/tst-spawn3.c (do_test): Likewise.
* posix/wordexp-test.c (testit): Likewise.
* posix/wordexp.c (parse_tilde): Likewise.
(exec_comm): Likewise.
* posix/wordexp.h (__WRDE_FLAGS): Likewise.
* resource/vtimes.c (TIMEVAL_TO_VTIMES): Likewise.
* setjmp/sigjmp.c (__sigjmp_save): Likewise.
* stdio-common/printf_fp.c (__printf_fp_l): Likewise.
* stdio-common/tst-fileno.c (do_test): Likewise.
* stdio-common/vfprintf-internal.c (vfprintf): Likewise.
* stdlib/strfmon_l.c (__vstrfmon_l_internal): Likewise.
* stdlib/strtod_l.c (round_and_return): Likewise.
(____STRTOF_INTERNAL): Likewise.
* stdlib/tst-strfrom.h (TEST_STRFROM): Likewise.
* string/strcspn.c (STRCSPN): Likewise.
* string/test-memmem.c (simple_memmem): Likewise.
* termios/tcsetattr.c (tcsetattr): Likewise.
* time/alt_digit.c (_nl_parse_alt_digit): Likewise.
* time/asctime.c (asctime_internal): Likewise.
* time/strptime_l.c (__strptime_internal): Likewise.
* time/sys/time.h (timercmp): Likewise.
* time/tzfile.c (__tzfile_compute): Likewise.
|
|
|
|
|
|
|
| |
* All files with FSF copyright notices: Update copyright dates
using scripts/update-copyrights.
* locale/programs/charmap-kw.h: Regenerated.
* locale/programs/locfile-kw.h: Likewise.
|
|
|
|
|
|
|
|
| |
Add the duration and iterations attributes to the workloads tests to
make the json schema parser happy
* benchtests/bench-skeleton.c (main): Add duration and
iterations attributes.
|
|
|
|
|
|
|
| |
* All files with FSF copyright notices: Update copyright dates
using scripts/update-copyrights.
* locale/programs/charmap-kw.h: Regenerated.
* locale/programs/locfile-kw.h: Likewise.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch further improves math function benchmarking by adding a latency
test in addition to throughput. This enables more accurate comparisons of the
math functions. The latency test works by creating a dependency on the previous
iteration: func_res = F (func_res * zero + input[i]). The multiply by zero
avoids changing the input.
It reports reciprocal throughput and latency in nanoseconds (depending on the
timing header used) and max/min throughput in iterations per second:
"workload-spec2006.wrf": {
"reciprocal-throughput": 100,
"latency": 200,
"max-throughput": 1.0e+07,
"min-throughput": 5.0e+06
}
* benchtests/bench-skeleton.c (main): Add support for
latency benchmarking.
* benchtests/scripts/bench.py: Add support for latency benchmarking.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Improve support for math function benchmarking. This patch adds
a feature that allows accurate benchmarking of traces extracted
from real workloads. This is done by iterating over all samples
rather than repeating each sample many times (which completely
ignores branch prediction and cache effects). A trace can be
added to existing math function inputs via
"## name: workload-<name>", followed by the trace.
* benchtests/README: Describe workload feature.
* benchtests/bench-skeleton.c (main): Add support for
benchmarking traces from workloads.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
uses 2 arrays with 1024 doubles, one with 99% finite FP numbers (10% zeroes, 10% negative) and 1% inf/NaN, the other with 50% inf, and 50% Nan.
ChangeLog:
2015-09-18 Wilco Dijkstra <wdijkstr@arm.com>
* benchtests/Makefile: Add bench-math-inlines, link with libm.
* benchtests/bench-math-inlines.c: New benchmark.
* benchtests/bench-util.h: New file.
* benchtests/bench-util.c: New file.
* benchtests/bench-skeleton.c: Add include of bench-util.c/h.
|
| |
|
|
|
|
|
|
|
| |
Add a new 'init' directive that specifies the name of the function to
call to do function-specific initialization. This is useful for
benchmarks that need to do a one-time initialization before the
functions are executed.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a small library to print JSON values and use it to improve the
readability of the benchmark output and the readability of the
benchmark code.
ChangeLog:
2014-04-11 Will Newton <will.newton@linaro.org>
* benchtests/Makefile (extra-objs): Add json-lib.o.
(bench-func): Tidy up JSON output.
* benchtests/bench-skeleton.c: Include json-lib.h.
(main): Use JSON library functions to do output of
benchmark results.
* benchtests/bench-timing-type.c (main): Output the
timing type simply, leaving formatting to the user.
* benchtests/json-lib.c: New file.
* benchtests/json-lib.h: Likewise.
|
|
|
|
|
|
|
|
| |
This patch adds an option to get detailed benchmark output for
functions. Invoking the benchmark with 'make DETAILED=1 bench' causes
each benchmark program to store a mean execution time for each input
it works on. This is useful to give a more comprehensive picture of
performance of functions compared to just the single mean figure.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch changes the output format of the main benchmark output file
(bench.out) to an extensible format. I chose JSON over XML because in
addition to being extensible, it is also not too verbose.
Additionally it has good support in python.
The significant change I have made in terms of functionality is to put
timing information as an attribute in JSON instead of a string and to
do that, there is a separate program that prints out a JSON snippet
mentioning the type of timing (hp_timing or clock_gettime). The mean
timing has now changed from iterations per unit to actual timing per
iteration.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The TIMING_INIT macro currently sets the number of loop iterations
to 1000, which limits usefulness. Make the argument a clock
resolution value and multiply by 1000 in bench-skeleton.c instead
to allow easier reuse.
ChangeLog:
2013-09-11 Will Newton <will.newton@linaro.org>
* benchtests/bench-timing.h (TIMING_INIT): Rename ITERS
parameter to RES. Remove hardcoded 1000 value.
* benchtests/bench-skeleton.c (main): Pass RES parameter
to TIMING_INIT and multiply result by 1000.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
HP_TIMING uses native timestamping instructions if available, thus
greatly reducing the overhead of recording start and end times for
function calls. For architectures that don't have HP_TIMING
available, we fall back to the clock_gettime bits. One may also
override this by invoking the benchmark as follows:
make USE_CLOCK_GETTIME=1 bench
and get the benchmark results using clock_gettime. One has to do
`make bench-clean` to ensure that the benchmark programs are rebuilt.
|
| |
|
|
|
|
|
|
| |
A benchmark could be skewed by CPU initialy working on minimal
frequency and speeding up later. We first run code in loop
to partialy fix this issue.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some math functions have distinct performance characteristics in
specific domains of inputs, where some inputs return via a fast path
while other inputs require multiple precision calculations, that too
at different precision levels. The way to implement different domains
was to have a separate source file and benchmark definition, resulting
in separate programs.
This clutters up the benchmark, so this change allows these domains to
be consolidated into the same input file. To do this, the input file
format is now enhanced to allow comments with a preceding # and
directives with two # at the begining of a line. A directive that
looks like:
tells the benchmark generation script that what follows is a different
domain of inputs. The value of the 'name' directive (in this case,
foo) is used in the output. The two input domains are then executed
sequentially and their results collated separately. with the above
directive, there would be two lines in the result that look like:
func(): ....
func(foo): ...
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The idea to run benchmarks for a constant number of iterations is
problematic. While the benchmarks may run for 10 seconds on x86_64,
they could run for about 30 seconds on powerpc and worse, over 3
minutes on arm. Besides that, adding a new benchmark is cumbersome
since one needs to find out the number of iterations needed for a
sufficient runtime.
A better idea would be to run each benchmark for a specific amount of
time. This patch does just that. The run time defaults to 10 seconds
and it is configurable at command line:
make BENCH_DURATION=5 bench
|
|
See benchtests/Makefile to know how to use it.
|