| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Performance seems to be similar (gcc 11.2.1 on a Ryzen 9 5900X),
the generic algorithm shows slight better performance for
the 'workload-huge.wrf' input set.
* s_sinf-sse2.S:
"sinf": {
"": {
"duration": 3.72405e+09,
"iterations": 2.38374e+08,
"max": 63.973,
"min": 11.211,
"mean": 15.6227
},
"workload-random.wrf": {
"duration": 3.76923e+09,
"iterations": 8.4e+07,
"reciprocal-throughput": 17.6355,
"latency": 72.108,
"max-throughput": 5.67037e+07,
"min-throughput": 1.38681e+07
},
"workload-huge.wrf": {
"duration": 3.76943e+09,
"iterations": 6e+07,
"reciprocal-throughput": 29.3493,
"latency": 96.2985,
"max-throughput": 3.40724e+07,
"min-throughput": 1.03844e+07
}
}
* generic s_sinf.c:
"sinf": {
"": {
"duration": 3.70989e+09,
"iterations": 2.18025e+08,
"max": 69.782,
"min": 11.1,
"mean": 17.0159
},
"workload-random.wrf": {
"duration": 3.77213e+09,
"iterations": 9.6e+07,
"reciprocal-throughput": 17.5402,
"latency": 61.0459,
"max-throughput": 5.70119e+07,
"min-throughput": 1.63811e+07
},
"workload-huge.wrf": {
"duration": 3.81576e+09,
"iterations": 5.6e+07,
"reciprocal-throughput": 38.2111,
"latency": 98.0659,
"max-throughput": 2.61704e+07,
"min-throughput": 1.01972e+07
}
}
Checked on i686-linux-gnu.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 7061 files FOO.
I then removed trailing white space from math/tgmath.h,
support/tst-support-open-dev-null-range.c, and
sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following
obscure pre-commit check failure diagnostics from Savannah. I don't
know why I run into these diagnostics whereas others evidently do not.
remote: *** 912-#endif
remote: *** 913:
remote: *** 914-
remote: *** error: lines with trailing whitespace found
...
remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Also, change sources.redhat.com to sourceware.org.
This patch was automatically generated by running the following shell
script, which uses GNU sed, and which avoids modifying files imported
from upstream:
sed -ri '
s,(http|ftp)(://(.*\.)?(gnu|fsf|sourceware)\.org($|[^.]|\.[^a-z])),https\2,g
s,(http|ftp)(://(.*\.)?)sources\.redhat\.com($|[^.]|\.[^a-z]),https\2sourceware.org\4,g
' \
$(find $(git ls-files) -prune -type f \
! -name '*.po' \
! -name 'ChangeLog*' \
! -path COPYING ! -path COPYING.LIB \
! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \
! -path manual/texinfo.tex ! -path scripts/config.guess \
! -path scripts/config.sub ! -path scripts/install-sh \
! -path scripts/mkinstalldirs ! -path scripts/move-if-change \
! -path INSTALL ! -path locale/programs/charmap-kw.h \
! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \
! '(' -name configure \
-execdir test -f configure.ac -o -f configure.in ';' ')' \
! '(' -name preconfigure \
-execdir test -f preconfigure.ac ';' ')' \
-print)
and then by running 'make dist-prepare' to regenerate files built
from the altered files, and then executing the following to cleanup:
chmod a+x sysdeps/unix/sysv/linux/riscv/configure
# Omit irrelevant whitespace and comment-only changes,
# perhaps from a slightly-different Autoconf version.
git checkout -f \
sysdeps/csky/configure \
sysdeps/hppa/configure \
sysdeps/riscv/configure \
sysdeps/unix/sysv/linux/csky/configure
# Omit changes that caused a pre-commit check to fail like this:
# remote: *** error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines
git checkout -f \
sysdeps/powerpc/powerpc64/ppc-mcount.S \
sysdeps/unix/sysv/linux/s390/s390-64/syscall.S
# Omit change that caused a pre-commit check to fail like this:
# remote: *** error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline
git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S
|
|
|
|
|
|
|
| |
* All files with FSF copyright notices: Update copyright dates
using scripts/update-copyrights.
* locale/programs/charmap-kw.h: Regenerated.
* locale/programs/locfile-kw.h: Likewise.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The second patch improves performance of sinf and cosf using the same
algorithms and polynomials. The returned values are identical to sincosf
for the same input. ULP definitions for AArch64 and x64 are updated.
sinf/cosf througput gains on Cortex-A72:
* |x| < 0x1p-12 : 1.2x
* |x| < M_PI_4 : 1.8x
* |x| < 2 * M_PI: 1.7x
* |x| < 120.0 : 2.3x
* |x| < Inf : 3.0x
* NEWS: Mention sinf, cosf, sincosf.
* sysdeps/aarch64/libm-test-ulps: Update ULP for sinf, cosf, sincosf.
* sysdeps/x86_64/fpu/libm-test-ulps: Update ULP for sinf and cosf.
* sysdeps/x86_64/fpu/multiarch/s_sincosf-fma.c: Add definitions of
constants rather than including generic sincosf.h.
* sysdeps/x86_64/fpu/s_sincosf_data.c: Remove.
* sysdeps/ieee754/flt-32/s_cosf.c (cosf): Rewrite.
* sysdeps/ieee754/flt-32/s_sincosf.h (reduced_sin): Remove.
(reduced_cos): Remove.
(sinf_poly): New function.
* sysdeps/ieee754/flt-32/s_sinf.c (sinf): Rewrite.
|
|
|
|
|
|
|
| |
* All files with FSF copyright notices: Update copyright dates
using scripts/update-copyrights.
* locale/programs/charmap-kw.h: Regenerated.
* locale/programs/locfile-kw.h: Likewise.
|
|
|
|
|
| |
This implementation is based on generic s_sinf.c and s_cosf.c.
Tested on s390x, powerpc64le and powerpc32.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch avoid an extra floating point to integer conversion in
reduced internal function for generic sinf by defining the sign as
double instead of integers.
There is no much difference on Haswell with GCC 7.2.1:
Before After
min 9.11 9.108
mean 21.982 21.9224
However H.J. Lu reported gains on Skylake:
Before:
"sinf": {
"": {
"duration": 3.4044e+10,
"iterations": 1.9942e+09,
"max": 141.106,
"min": 7.704,
"mean": 17.0715
}
}
After:
"sinf": {
"": {
"duration": 3.40665e+10,
"iterations": 2.03199e+09,
"max": 95.994,
"min": 7.704,
"mean": 16.765
}
}
Checked on x86_64-linux-gnu.
* sysdeps/ieee754/flt-32/s_sinf.c (ones): Define as double.
(reduced): Use ones as double instead of integer.
|
|
|
|
|
|
|
|
|
| |
sinf(NAN) should not signal invalid fp exception
so use isless instead of < where NAN is compared.
this makes the sinf tests pass on aarch64.
* sysdeps/ieee754/flt-32/s_sinf.c (sinf): Use isless.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since s_sinf.c either assigns the return value of floor to integer or
passes double converted from integer to floor, this patch replaces
floor with simple casts.
Also since long == int for 32-bit targets, we can use long instead of
int to avoid 64-bit integer for 64-bit targets.
On Skylake, bench-sinf reports performance improvement:
Before After Improvement
max 130.566 129.564 30%
min 7.704 7.706 0%
mean 21.8188 19.1363 30%
* sysdeps/ieee754/flt-32/s_sinf.c (reduced): Replace long with
int.
(SINF_FUNC): Likewise. Replace floor with simple casts.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The new sinf implementation introduced localplt failures for all
platforms where the compiler did not inline the calls to floor
(converted to trunc by machine-independent optimizations). This patch
changes the calls to use __floor as normal in libm.
We can't use the public function names floor / floorf / floorl /
floorf128 in libm code in the absence of appropriate asms to redirect
floor/trunc calls, if not inlined, to use the internal names instead
(while avoiding breaking code building the floor functions themselves)
- while having such asms and then calling the public functions
unconditionally would be desirable for optimization (few architectures
have __floor inlines in math_private.h, and once the built-in function
is used you don't need them), using __floor is the minimum safe fix
for the present test regressions.
Tested with build-many-glibcs.py that this fixes the localplt test
failure for arm-linux-gnueabi.
* sysdeps/ieee754/flt-32/s_sinf.c (SINF_FUNC): Use __floor instead
of floor.
|
|
|
|
|
| |
This implementation is based on optimized sinf assembly versions
of x86_64 and powerpc.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch makes flt-32 libm functions use libm_alias_float to define
public interfaces (in cases where _Float32 aliases of those interfaces
would be appropriate, so not for finitef / isinff / isnanf).
Tested for x86_64. Also tested with build-many-glibcs.py that
installed stripped shared libraries are unchanged by the patch.
* sysdeps/ieee754/flt-32/s_asinhf.c: Include <libm-alias-float.h>.
(asinhf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_atanf.c: Include <libm-alias-float.h>.
(atanf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_cbrtf.c: Include <libm-alias-float.h>.
(cbrtf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_ceilf.c: Include <libm-alias-float.h>.
(ceilf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_copysignf.c: Include
<libm-alias-float.h>.
(copysignf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_cosf.c: Include <libm-alias-float.h>.
(cosf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_erff.c: Include <libm-alias-float.h>.
(erff): Define using libm_alias_float.
(erfcf): Likewise.
* sysdeps/ieee754/flt-32/s_expm1f.c: Include <libm-alias-float.h>.
(expm1f): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_fabsf.c: Include <libm-alias-float.h>.
(fabsf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_floorf.c: Include <libm-alias-float.h>.
(floorf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_frexpf.c: Include <libm-alias-float.h>.
(frexpf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_fromfpf.c (fromfpf): Define using
libm_alias_float.
* sysdeps/ieee754/flt-32/s_fromfpf_main.c: Include
<libm-alias-float.h>.
* sysdeps/ieee754/flt-32/s_fromfpxf.c (fromfpxf): Define using
libm_alias_float.
* sysdeps/ieee754/flt-32/s_getpayloadf.c: Include
<libm-alias-float.h>.
(getpayloadf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_llrintf.c: Include
<libm-alias-float.h>.
(llrintf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_llroundf.c: Include
<libm-alias-float.h>.
(llroundf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_logbf.c: Include <libm-alias-float.h>.
(logbf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_lrintf.c: Include <libm-alias-float.h>.
(lrintf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_lroundf.c: Include <libm-alias-float.h>.
(lroundf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_modff.c: Include <libm-alias-float.h>.
(modff): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_nearbyintf.c: Include
<libm-alias-float.h>.
(nearbyintf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_nextafterf.c: Include
<libm-alias-float.h>.
(nextafterf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_nextupf.c: Include
<libm-alias-float.h>.
(nextupf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_remquof.c: Include
<libm-alias-float.h>.
(remquof): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_rintf.c: Include <libm-alias-float.h>.
(rintf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_roundevenf.c: Include
<libm-alias-float.h>.
(roundevenf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_roundf.c: Include <libm-alias-float.h>.
(roundf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_setpayloadf.c (setpayloadf): Define
using libm_alias_float.
* sysdeps/ieee754/flt-32/s_setpayloadf_main.c: Include
<libm-alias-float.h>.
* sysdeps/ieee754/flt-32/s_setpayloadsigf.c (setpayloadsigf):
Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_sincosf.c: Include
<libm-alias-float.h>.
(sincosf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_sinf.c: Include <libm-alias-float.h>.
(sinf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_tanf.c: Include <libm-alias-float.h>.
(tanf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_tanhf.c: Include <libm-alias-float.h>.
(tanhf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_totalorderf.c: Include
<libm-alias-float.h>.
(totalorderf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_totalordermagf.c: Include
<libm-alias-float.h>.
(totalordermagf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_truncf.c: Include <libm-alias-float.h>.
(truncf): Define using libm_alias_float.
* sysdeps/ieee754/flt-32/s_ufromfpf.c (ufromfpf): Define using
libm_alias_float.
* sysdeps/ieee754/flt-32/s_ufromfpxf.c (ufromfpxf): Define using
libm_alias_float.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* sysdeps/i386/i686/fpu/multiarch/Makefile (sysdep_routines):
Add s_sinf-sse2, s_conf-sse2.
* sysdeps/i386/i686/fpu/multiarch/s_sinf-sse2.S: New file.
* sysdeps/i386/i686/fpu/multiarch/s_cosf-sse2.S: New file.
* sysdeps/i386/i686/fpu/multiarch/s_sinf.c: New file.
* sysdeps/i386/i686/fpu/multiarch/s_cosf.c: New file.
* sysdeps/ieee754/flt-32/s_sinf.c (SINF, SINF_FUNC): Add macros
for using routine as __sinf_ia32.
Use macro for function declaration and weak_alias.
* sysdeps/ieee754/flt-32/s_cosf.c (COSF, COSF_FUNC): Add macros
for using routine as __cosf_ia32.
Use macro for function declaration and weak_alias.
* sysdeps/i386/i686/fpu/multiarch/e_expf-sse2.S: Fix Copyright.
* sysdeps/i386/i686/fpu/multiarch/e_expf.c: Fix Copyright.
* sysdeps/x86_64/fpu/s_sinf.S: New file.
* sysdeps/x86_64/fpu/s_cosf.S: New file.
* sysdeps/x86_64/fpu/libm-test-ulps: Update.
* math/libm-test.inc (cos_test): Add more test cases.
(sin_test): Likewise.
(sincos_test): Likewise.
|
|
|
|
| |
Entire tree edited via find | grep | sed.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* sysdeps/i386/fpu/s_cosf.S: Likewise.
* sysdeps/i386/fpu/s_cosl.S: Likewise.
* sysdeps/i386/fpu/s_sin.S: Likewise.
* sysdeps/i386/fpu/s_sinf.S: Likewise.
* sysdeps/i386/fpu/s_sinl.S: Likewise.
* sysdeps/ieee754/dbl-64/s_sin.c: Likewise.
* sysdeps/ieee754/flt-32/s_cosf.c: Likewise.
* sysdeps/ieee754/flt-32/s_sinf.c: Likewise.
* sysdeps/ieee754/ldbl-96/s_cosl.c: Likewise.
* sysdeps/ieee754/ldbl-96/s_sinl.c: Likewise.
* sysdeps/x86_64/fpu/s_cosl.S: Likewise.
* sysdeps/x86_64/fpu/s_sinl.S: Likewise.
* math/libm-test.inc: Add tests for errno after sin/cos calls with
±Inf.
|
|
|