about summary refs log tree commit diff
path: root/sysdeps/ieee754/flt-32
Commit message (Collapse)AuthorAgeFilesLines
* math: Simplify hypotf implementationAdhemerval Zanella2021-12-132-35/+37
| | | | | | | | | | | | Use a more optimized comparison for check for NaN and infinite and add an inlined issignaling implementation for float. With gcc it results in 2 FP comparisons. The file Copyright is also changed to use GPL, the implementation was completely changed by 7c10fd3515f to use double precision instead of scaling and this change removes all the GET_FLOAT_WORD usage. Checked on x86_64-linux-gnu.
* Fixed inaccuracy of j0f (BZ #28185)Paul Zimmermann2021-10-051-3/+3
| | | | | | | | | | | | | The largest errors over the full binary32 range are after this patch (on x86_64): RNDN: libm wrong by up to 9.00e+00 ulp(s) [9] for x=0x1.04c39cp+6 RNDZ: libm wrong by up to 9.00e+00 ulp(s) [9] for x=0x1.04c39cp+6 RNDU: libm wrong by up to 9.00e+00 ulp(s) [9] for x=0x1.04c39cp+6 RNDD: libm wrong by up to 8.98e+00 ulp(s) [9] for x=0x1.4b7066p+7 Inputs that were yielding huge errors have been added to "make check". Reviewed-by: Adhemeral Zanella <adhemerval.zanella@linaro.org>
* Remove "Contributed by" linesSiddhesh Poyarekar2021-09-0351-54/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | We stopped adding "Contributed by" or similar lines in sources in 2012 in favour of git logs and keeping the Contributors section of the glibc manual up to date. Removing these lines makes the license header a bit more consistent across files and also removes the possibility of error in attribution when license blocks or files are copied across since the contributed-by lines don't actually reflect reality in those cases. Move all "Contributed by" and similar lines (Written by, Test by, etc.) into a new file CONTRIBUTED-BY to retain record of these contributions. These contributors are also mentioned in manual/contrib.texi, so we just maintain this additional record as a courtesy to the earlier developers. The following scripts were used to filter a list of files to edit in place and to clean up the CONTRIBUTED-BY file respectively. These were not added to the glibc sources because they're not expected to be of any use in future given that this is a one time task: https://gist.github.com/siddhesh/b5ecac94eabfd72ed2916d6d8157e7dc https://gist.github.com/siddhesh/15ea1f5e435ace9774f485030695ee02 Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* Use GCC builtins for roundeven functions if desired.Shen-Ta Hsieh2021-06-271-0/+5
| | | | | | | | | | | | | This patch is using the corresponding GCC builtin for roundevenf, roundeven and roundevenl if the USE_FUNCTION_BUILTIN macros are defined to one in math-use-builtins.h. These builtin functions is supported since GCC 10. The code of the generic implementation is not changed. Signed-off-by: Shen-Ta Hsieh <ibmibmibm.tw@gmail.com> Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* math: redirect roundeven functionShen-Ta Hsieh2021-06-271-0/+3
| | | | | | | This patch redirect roundeven function for futhermore changes. Signed-off-by: Shen-Ta Hsieh <ibmibmibm.tw@gmail.com> Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* Fix the inaccuracy of j0f/j1f/y0f/y1f [BZ #14469, #14470, #14471, #14472]Paul Zimmermann2021-04-023-70/+1021
| | | | | | | | | | | | | | | | | | | | | | | For j0f/j1f/y0f/y1f, the largest error for all binary32 inputs is reduced to at most 9 ulps for all rounding modes. The new code is enabled only when there is a cancellation at the very end of the j0f/j1f/y0f/y1f computation, or for very large inputs, thus should not give any visible slowdown on average. Two different algorithms are used: * around the first 64 zeros of j0/j1/y0/y1, approximation polynomials of degree 3 are used, computed using the Sollya tool (https://www.sollya.org/) * for large inputs, an asymptotic formula from [1] is used [1] Fast and Accurate Bessel Function Computation, John Harrison, Proceedings of Arith 19, 2009. Inputs yielding the new largest errors are added to auto-libm-test-in, and ulps are regenerated for various targets (thanks Adhemerval Zanella). Tested on x86_64 with --disable-multi-arch and on powerpc64le-linux-gnu. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* Update copyright dates with scripts/update-copyrightsPaul Eggert2021-01-0240-40/+40
| | | | | | | | | | | | | | | | I used these shell commands: ../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright (cd ../glibc && git commit -am"[this commit message]") and then ignored the output, which consisted lines saying "FOO: warning: copyright statement not found" for each of 6694 files FOO. I then removed trailing white space from benchtests/bench-pthread-locks.c and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this diagnostic from Savannah: remote: *** pre-commit check failed ... remote: *** error: lines with trailing whitespace found remote: error: hook declined to update refs/heads/master
* math: Fix inaccuracy of j0f for x >= 2^127 when sin(x)+cos(x) is tinyPaul Zimmermann2020-08-071-1/+16
| | | | Checked on x86_64-linux-gnu and i686-linux-gnu.
* Use C2x return value from getpayload of non-NaN (bug 26073).Joseph Myers2020-07-061-0/+3
| | | | | | | | | | | | | | | In TS 18661-1, getpayload had an unspecified return value for a non-NaN argument, while C2x requires the return value -1 in that case. This patch implements the return value of -1. I don't think this is worth having a new symbol version that's an alias of the old one, although occasionally we do that in such cases where the new function semantics are a refinement of the old ones (to avoid programs relying on the new semantics running on older glibc versions but not behaving as intended). Tested for x86_64 and x86; also ran math/ tests for aarch64 and powerpc.
* New exp10f version without SVID compat wrapperAdhemerval Zanella2020-06-191-1/+15
| | | | | | | | | | | | | | | | | | | | | | | This patch changes the exp10f error handling semantics to only set errno according to POSIX rules. New symbol version is introduced at GLIBC_2.32. The old wrappers are kept for compat symbols. There are some outliers that need special handling: - ia64 provides an optimized implementation of exp10f that uses ia64 specific routines to set SVID compatibility. The new symbol version is aliased to the exp10f one. - m68k also provides an optimized implementation, and the new version uses it instead of the sysdeps/ieee754/flt32 one. - riscv and csky uses the generic template implementation that does not provide SVID support. For both cases a new exp10f version is not added, but rather the symbols version of the generic sysdeps/ieee754/flt32 is adjusted instead. Checked on aarch64-linux-gnu, x86_64-linux-gnu, i686-linux-gnu, powerpc64le-linux-gnu.
* math: Optimized generic exp10f with wrappersPaul Zimmermann2020-06-192-1/+199
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is inspired by expf and reuses its tables and internal functions. The error checks are inlined and errno setting is in separate tail called functions, but the wrappers are kept in this patch to handle the _LIB_VERSION==_SVID_ case. Double precision arithmetics is used which is expected to be faster on most targets (including soft-float) than using single precision and it is easier to get good precision result with it. Result for x86_64 (i7-4790K CPU @ 4.00GHz) are: Before new code: "exp10f": { "workload-spec2017.wrf (adapted)": { "duration": 4.0414e+09, "iterations": 1.00128e+08, "reciprocal-throughput": 26.6818, "latency": 54.043, "max-throughput": 3.74787e+07, "min-throughput": 1.85038e+07 } With new code: "exp10f": { "workload-spec2017.wrf (adapted)": { "duration": 4.11951e+09, "iterations": 1.23968e+08, "reciprocal-throughput": 21.0581, "latency": 45.4028, "max-throughput": 4.74876e+07, "min-throughput": 2.20251e+07 } Result for aarch64 (A72 @ 2GHz) are: Before new code: "exp10f": { "workload-spec2017.wrf (adapted)": { "duration": 4.62362e+09, "iterations": 3.3376e+07, "reciprocal-throughput": 127.698, "latency": 149.365, "max-throughput": 7.831e+06, "min-throughput": 6.69501e+06 } With new code: "exp10f": { "workload-spec2017.wrf (adapted)": { "duration": 4.29108e+09, "iterations": 6.6752e+07, "reciprocal-throughput": 51.2111, "latency": 77.3568, "max-throughput": 1.9527e+07, "min-throughput": 1.29271e+07 } Checked on x86_64-linux-gnu, powerpc64le-linux-gnu, aarch64-linux-gnu, and sparc64-linux-gnu.
* ieee754: provide gcc builtins based generic sqrt functionsVineet Gupta2020-06-031-6/+10
| | | | Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* Adjust thresholds in Bessel function implementations (bug 14469).Joseph Myers2020-02-142-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A recent discussion in bug 14469 notes that a threshold in float Bessel function implementations, used to determine when to use a simpler implementation approach, results in substantially inaccurate results. As I discussed in <https://sourceware.org/ml/libc-alpha/2013-03/msg00345.html>, a heuristic argument suggests 2^(S+P) as the right order of magnitude for a suitable threshold, where S is the number of significand bits in the floating-point type and P is the number of significant bits in the representation of the floating-point type, and the float and ldbl-96 implementations use thresholds that are too small. Some threshold does need using, there or elsewhere in the implementation, to avoid spurious underflow and overflow for large arguments. This patch sets the thresholds in the affected implementations to more heuristically justifiable values. Results will still be inaccurate close to zeroes of the functions (thus this patch does *not* fix any of the bugs for Bessel function inaccuracy); fixing that would require a different implementation approach, likely along the lines described in <http://www.cl.cam.ac.uk/~jrh13/papers/bessel.ps.gz>. So the justification for a change such as this would be statistical rather than based on particular tests that had excessive errors and no longer do so (no doubt such tests could be found, but would probably be too fragile to add to the testsuite, as liable to give large errors again from very small implementation changes or even from compiler changes). See <https://sourceware.org/ml/libc-alpha/2020-02/msg00638.html> for such statistics of the resulting improvements for float functions. Tested (glibc testsuite) for x86_64.
* Add libm_alias_finite for _finite symbolsWilco Dijkstra2020-01-0322-30/+51
| | | | | | | | | | | | | | | | | | | This patch adds a new macro, libm_alias_finite, to define all _finite symbol. It sets all _finite symbol as compat symbol based on its first version (obtained from the definition at built generated first-versions.h). The <fn>f128_finite symbols were introduced in GLIBC 2.26 and so need special treatment in code that is shared between long double and float128. It is done by adding a list, similar to internal symbol redifinition, on sysdeps/ieee754/float128/float128_private.h. Alpha also needs some tricky changes to ensure we still emit 2 compat symbols for sqrt(f). Passes buildmanyglibc. Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* Update copyright dates with scripts/update-copyrights.Joseph Myers2020-01-0139-39/+39
|
* Adjust s_ceilf.c and s_ceill.c regarding code style.Stefan Liebler2019-12-111-21/+33
| | | | | | | This patch just adjusts the generic implementation regarding code style. No functional change. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* Adjust s_floorf.c and s_floorl.c regarding code style.Stefan Liebler2019-12-111-22/+33
| | | | | | | This patch just adjusts the generic implementation regarding code style. No functional change. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* Adjust s_rintf.c and s_rintl.c regarding code style.Stefan Liebler2019-12-111-19/+25
| | | | | | | This patch just adjusts the generic implementation regarding code style. No functional change. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* Adjust s_nearbyintf.c and s_nearbyintl.c regarding code style.Stefan Liebler2019-12-111-27/+33
| | | | | | | This patch just adjusts the generic implementation regarding code style. No functional change. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* Use GCC builtins for copysign functions if desired.Stefan Liebler2019-12-111-8/+4
| | | | | | | | | | | | This patch is always using the corresponding GCC builtin for copysignf, copysign, and is using the builtin for copysignl, copysignf128 if the USE_FUNCTION_BUILTIN macros are defined to one in math-use-builtins.h. Altough the long double version is enabled by default we still need the macro and the alternative implementation as the _Float128 version of the builtin is not available with all supported GCC versions. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* Use GCC builtins for round functions if desired.Stefan Liebler2019-12-111-0/+6
| | | | | | | | | | | | This patch is using the corresponding GCC builtin for roundf, round, roundl and roundf128 if the USE_FUNCTION_BUILTIN macros are defined to one in math-use-builtins.h. This is the case for s390 if build with at least --march=z196 --mzarch. Otherwise the generic implementation is used. The code of the generic implementation is not changed. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* Use GCC builtins for trunc functions if desired.Stefan Liebler2019-12-111-0/+6
| | | | | | | | | | | | This patch is using the corresponding GCC builtin for truncf, trunc, truncl and truncf128 if the USE_FUNCTION_BUILTIN macros are defined to one in math-use-builtins.h. This is the case for s390 if build with at least --march=z196 --mzarch. Otherwise the generic implementation is used. The code of the generic implementation is not changed. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* Use GCC builtins for ceil functions if desired.Stefan Liebler2019-12-111-1/+6
| | | | | | | | | | | | This patch is using the corresponding GCC builtin for ceilf, ceil, ceill and ceilf128 if the USE_FUNCTION_BUILTIN macros are defined to one in math-use-builtins.h. This is the case for s390 if build with at least --march=z196 --mzarch. Otherwise the generic implementation is used. The code of the generic implementation is not changed. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* Use GCC builtins for floor functions if desired.Stefan Liebler2019-12-111-0/+6
| | | | | | | | | | | | This patch is using the corresponding GCC builtin for floorf, floor, floorl and floorf128 if the USE_FUNCTION_BUILTIN macros are defined to one in math-use-builtins.h. This is the case for s390 if build with at least --march=z196 --mzarch. Otherwise the generic implementation is used. The code of the generic implementation is not changed. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* Use GCC builtins for rint functions if desired.Stefan Liebler2019-12-111-6/+11
| | | | | | | | | | | | This patch is using the corresponding GCC builtin for rintf, rint, rintl and rintf128 if the USE_FUNCTION_BUILTIN macros are defined to one in math-use-builtins.h. This is the case for s390 if build with at least --march=z196 --mzarch. Otherwise the generic implementation is used. The code of the generic implementation is not changed. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* Use GCC builtins for nearbyint functions if desired.Stefan Liebler2019-12-111-6/+11
| | | | | | | | | | | | This patch is using the corresponding GCC builtin for nearbyintf, nearbyint, nearbintl and nearbyintf128 if the USE_FUNCTION_BUILTIN macros are defined to one in math-use-builtins.h. This is the case for s390 if build with at least --march=z196 --mzarch. Otherwise the generic implementation is used. The code of the generic implementation is not changed. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* Remove duplicate inline implementation of issignalingfPaul A. Clarke2019-11-222-11/+2
| | | | | | | | | | | | | | | | | | | | Very recent commit 854e91bf6b4221f424ffa13b9ef50f35623b7b74 enabled inline of issignalingf() in general (__issignalingf in include/math.h). There is another implementation for an inline use of issignalingf (issignalingf_inline in sysdeps/ieee754/flt-32/math_config.h) which could instead make use of the new enablement. Replace the use of issignalingf_inline with __issignaling. Using issignaling (instead of __issignalingf) will allow future enhancements to the type-generic implementation, issignaling, to be automatically adopted. The implementations are slightly different, and compile to slightly different code, but I measured no significant performance difference. The second implementation was brought to my attention by: Suggested-by: Joseph Myers <joseph@codesourcery.com> Reviewed-by: Joseph Myers <joseph@codesourcery.com>
* Prefer https to http for gnu.org and fsf.org URLsPaul Eggert2019-09-0740-40/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Also, change sources.redhat.com to sourceware.org. This patch was automatically generated by running the following shell script, which uses GNU sed, and which avoids modifying files imported from upstream: sed -ri ' s,(http|ftp)(://(.*\.)?(gnu|fsf|sourceware)\.org($|[^.]|\.[^a-z])),https\2,g s,(http|ftp)(://(.*\.)?)sources\.redhat\.com($|[^.]|\.[^a-z]),https\2sourceware.org\4,g ' \ $(find $(git ls-files) -prune -type f \ ! -name '*.po' \ ! -name 'ChangeLog*' \ ! -path COPYING ! -path COPYING.LIB \ ! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \ ! -path manual/texinfo.tex ! -path scripts/config.guess \ ! -path scripts/config.sub ! -path scripts/install-sh \ ! -path scripts/mkinstalldirs ! -path scripts/move-if-change \ ! -path INSTALL ! -path locale/programs/charmap-kw.h \ ! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \ ! '(' -name configure \ -execdir test -f configure.ac -o -f configure.in ';' ')' \ ! '(' -name preconfigure \ -execdir test -f preconfigure.ac ';' ')' \ -print) and then by running 'make dist-prepare' to regenerate files built from the altered files, and then executing the following to cleanup: chmod a+x sysdeps/unix/sysv/linux/riscv/configure # Omit irrelevant whitespace and comment-only changes, # perhaps from a slightly-different Autoconf version. git checkout -f \ sysdeps/csky/configure \ sysdeps/hppa/configure \ sysdeps/riscv/configure \ sysdeps/unix/sysv/linux/csky/configure # Omit changes that caused a pre-commit check to fail like this: # remote: *** error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines git checkout -f \ sysdeps/powerpc/powerpc64/ppc-mcount.S \ sysdeps/unix/sysv/linux/s390/s390-64/syscall.S # Omit change that caused a pre-commit check to fail like this: # remote: *** error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S
* Make totalorder and totalordermag functions take pointer arguments.Joseph Myers2019-08-152-6/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The resolution of C floating-point Clarification Request 25 <http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2397.htm#dr_25> is that the totalorder and totalordermag functions should take pointer arguments, and this has been adopted in C2X (with const added; note that the integration of this change into C2X is present in the C standard git repository but postdates the most recent public PDF draft). This patch updates glibc accordingly. As a defect resolution, the API is changed unconditionally rather than supporting any sort of TS 18661-1 mode for compilation with the old version of the API. There are compat symbols for existing binaries that pass floating-point arguments directly. As a consequence of changing to pointer arguments, there are no longer type-generic macros in tgmath.h for these functions. Because of the fairly complicated logic for creating libm function aliases and determining the set of aliases to create in a given glibc configuration, rather than duplicating all that in individual source files to create the versioned and compat symbols, the source files for the various versions of totalorder functions are set up to redefine weak_alias before using libm_alias_* macros to create the symbols required. In turn, this requires creating a separate alias for each symbol version pointing to the same implementation (see binutils bug <https://sourceware.org/bugzilla/show_bug.cgi?id=23840>), which is done automatically using __COUNTER__. (As I noted in <https://sourceware.org/ml/libc-alpha/2018-10/msg00631.html>, it might well make sense for glibc's symbol versioning macros to do that alias creation with __COUNTER__ themselves, which would somewhat simplify the logic in the totalorder source files.) It is of course desirable to test the compat symbols. I did this with the generic libm-test machinery, but didn't wish to duplicate the actual tables of test inputs and outputs, and thought it risky to attempt to have a single object file refer to both default and compat versions of the same function in order to test them together. Thus, I created libm-test-compat_totalorder.inc and libm-test-compat_totalordermag.inc which include the generated .c files (with the processed version of those tables of inputs) from the non-compat tests, and added appropriate dependencies. I think this provides sufficient test coverage for the compat symbols without also needing to make the special ldbl-96 and ldbl-128ibm tests (of peculiarities relating to the representations of those formats that can't be covered in the generic tests) run for the compat symbols. Tests of compat symbols need to be internal tests, meaning _ISOMAC is not defined. Making some libm-test tests into internal tests showed up two other issues. GCC diagnoses duplicate macro definitions of __STDC_* macros, including __STDC_WANT_IEC_60559_TYPES_EXT__; I added an appropriate conditional and filed <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91451> for this issue. On ia64, include/setjmp.h ends up getting included indirectly from libm-symbols.h, resulting in conflicting definitions of the STR macro (also defined in libm-test-driver.c); I renamed the macros in include/setjmp.h. (It's arguable that we should have common internal headers used everywhere for stringizing and concatenation macros.) Tested for x86_64 and x86, and with build-many-glibcs.py. * math/bits/mathcalls.h [__GLIBC_USE (IEC_60559_BFP_EXT) || __MATH_DECLARING_FLOATN] (totalorder): Take pointer arguments. [__GLIBC_USE (IEC_60559_BFP_EXT) || __MATH_DECLARING_FLOATN] (totalordermag): Likewise. * manual/arith.texi (totalorder): Likewise. (totalorderf): Likewise. (totalorderl): Likewise. (totalorderfN): Likewise. (totalorderfNx): Likewise. (totalordermag): Likewise. (totalordermagf): Likewise. (totalordermagl): Likewise. (totalordermagfN): Likewise. (totalordermagfNx): Likewise. * math/tgmath.h (__TGMATH_BINARY_REAL_RET_ONLY): Remove macro. [__GLIBC_USE (IEC_60559_BFP_EXT)] (totalorder): Likewise. [__GLIBC_USE (IEC_60559_BFP_EXT)] (totalordermag): Likewise. * math/Versions (GLIBC_2.31): Add totalorder, totalorderf, totalorderl, totalordermag, totalordermagf, totalordermagl, totalorderf32, totalorderf64, totalorderf32x, totalordermagf32, totalordermagf64, totalordermagf32x, totalorderf64x, totalordermagf64x, totalorderf128 and totalordermagf128. * math/Makefile (libm-test-funcs-noauto): Add compat_totalorder and compat_totalordermag. (libm-test-funcs-compat): New variable. (libm-tests-compat): Likewise. (tests): Do not include compat tests. (tests-internal): Add compat tests. ($(foreach t,$(libm-tests-base), $(objpfx)$(t)-compat_totalorder.o)): Depend on $(objpfx)libm-test-totalorder.c. ($(foreach t,$(libm-tests-base), $(objpfx)$(t)-compat_totalordermag.o): Depend on $(objpfx)libm-test-totalordermag.c. (tgmath3-macros): Remove totalorder and totalordermag. * math/libm-test-compat_totalorder.inc: New file. * math/libm-test-compat_totalordermag.inc: Likewise. * math/libm-test-driver.c (struct test_ff_i_data): Update comment. (RUN_TEST_fpfp_b): New macro. (RUN_TEST_LOOP_fpfp_b): Likewise. * math/libm-test-totalorder.inc (totalorder_test_data): Use TEST_fpfp_b. (totalorder_test): Condition on [!COMPAT_TEST]. (do_test): Likewise. * math/libm-test-totalordermag.inc (totalordermag_test_data): Use TEST_fpfp_b. (totalordermag_test): Condition on [!COMPAT_TEST]. (do_test): Likewise. * math/gen-tgmath-tests.py (Tests.add_all_tests): Remove totalorder and totalordermag. * math/test-tgmath.c (NCALLS): Change to 132. (F(compile_test)): Do not call totalorder or totalordermag. (F(totalorder)): Remove. (F(totalordermag)): Likewise. * include/float.h (__STDC_WANT_IEC_60559_TYPES_EXT__): Do not define if [__STDC_WANT_IEC_60559_TYPES_EXT__]. * include/setjmp.h [!_ISOMAC] (STR_HELPER): Rename to SJSTR_HELPER. [!_ISOMAC] (STR): Rename to SJSTR. Update call to STR_HELPER. [!_ISOMAC] (TEST_SIZE): Update call to STR. [!_ISOMAC] (TEST_ALIGN): Likewise. [!_ISOMAC] (TEST_OFFSET): Likewise. * sysdeps/ieee754/dbl-64/s_totalorder.c: Include <shlib-compat.h> and <first-versions.h>. (__totalorder): Take pointer arguments. Add symbol versions and compat symbols. * sysdeps/ieee754/dbl-64/s_totalordermag.c: Include <shlib-compat.h> and <first-versions.h>. (__totalordermag): Take pointer arguments. Add symbol versions and compat symbols. * sysdeps/ieee754/dbl-64/wordsize-64/s_totalorder.c: Include <shlib-compat.h> and <first-versions.h>. (__totalorder): Take pointer arguments. Add symbol versions and compat symbols. * sysdeps/ieee754/dbl-64/wordsize-64/s_totalordermag.c: Include <shlib-compat.h> and <first-versions.h>. (__totalordermag): Take pointer arguments. Add symbol versions and compat symbols. * sysdeps/ieee754/float128/float128_private.h (__totalorder_compatl): New macro. (__totalordermag_compatl): Likewise. * sysdeps/ieee754/flt-32/s_totalorderf.c: Include <shlib-compat.h> and <first-versions.h>. (__totalorderf): Take pointer arguments. Add symbol versions and compat symbols. * sysdeps/ieee754/flt-32/s_totalordermagf.c: Include <shlib-compat.h> and <first-versions.h>. (__totalordermagf): Take pointer arguments. Add symbol versions and compat symbols. * sysdeps/ieee754/ldbl-128/s_totalorderl.c: Include <shlib-compat.h> and <first-versions.h>. (__totalorderl): Take pointer arguments. Add symbol versions and compat symbols. * sysdeps/ieee754/ldbl-128/s_totalordermagl.c: Include <shlib-compat.h> and <first-versions.h>. (__totalordermagl): Take pointer arguments. Add symbol versions and compat symbols. * sysdeps/ieee754/ldbl-128ibm/s_totalorderl.c: Include <shlib-compat.h>. (__totalorderl): Take pointer arguments. Add symbol versions and compat symbols. * sysdeps/ieee754/ldbl-128ibm/s_totalordermagl.c: Include <shlib-compat.h>. (__totalordermagl): Take pointer arguments. Add symbol versions and compat symbols. * sysdeps/ieee754/ldbl-96/s_totalorderl.c: Include <shlib-compat.h> and <first-versions.h>. (__totalorderl): Take pointer arguments. Add symbol versions and compat symbols. * sysdeps/ieee754/ldbl-96/s_totalordermagl.c: Include <shlib-compat.h> and <first-versions.h>. (__totalordermagl): Take pointer arguments. Add symbol versions and compat symbols. * sysdeps/ieee754/ldbl-opt/nldbl-totalorder.c (totalorderl): Take pointer arguments. * sysdeps/ieee754/ldbl-opt/nldbl-totalordermag.c (totalordermagl): Likewise. * sysdeps/ieee754/ldbl-128ibm/test-totalorderl-ldbl-128ibm.c (do_test): Update calls to totalorderl and totalordermagl. * sysdeps/ieee754/ldbl-96/test-totalorderl-ldbl-96.c (do_test): Update calls to totalorderl and totalordermagl. * sysdeps/mach/hurd/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/csky/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/riscv/rv64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.
* Update copyright dates with scripts/update-copyrights.Joseph Myers2019-01-0139-39/+39
| | | | | | | * All files with FSF copyright notices: Update copyright dates using scripts/update-copyrights. * locale/programs/charmap-kw.h: Regenerated. * locale/programs/locfile-kw.h: Likewise.
* x86-64: Vectorize sincosf_poly and update s_sincosf-fma.cH.J. Lu2018-12-262-70/+88
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Add <sincosf_poly.h> and include it in s_sincosf.h to allow vectorized sincosf_poly. Add x86 sincosf_poly.h to vectorize sincosf_poly. On Broadwell, bench-sincosf shows: Before After Improvement max 160.273 114.198 40% min 6.25 5.625 11% mean 13.0325 10.6462 22% Vectorized sincosf_poly shows Before After Improvement max 138.653 114.198 21% min 5.004 5.625 -11% mean 11.5934 10.6462 9% Tested on x86-64 and i686 as well as with build-many-glibcs.py. * sysdeps/ieee754/flt-32/s_sincosf.h: Include <sincosf_poly.h>. (sincos_t, sincosf_poly, sinf_poly): Moved to ... * sysdeps/ieee754/flt-32/sincosf_poly.h: Here. New file. * sysdeps/x86/fpu/s_sincosf_data.c: New file. * sysdeps/x86/fpu/sincosf_poly.h: Likewise. * sysdeps/x86_64/fpu/multiarch/s_sincosf-fma.c: Just include <sysdeps/ieee754/flt-32/s_sincosf.c>.
* Fix powf overflow handling in non-nearest rounding mode [BZ #23961]Szabolcs Nagy2018-12-111-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | The threshold value at which powf overflows depends on the rounding mode and the current check did not take this into account. So when the result was rounded away from zero it could become infinity without setting errno to ERANGE. Example: pow(0x1.7ac7cp+5, 23) is 0x1.fffffep+127 + 0.1633ulp If the result goes above 0x1.fffffep+127 + 0.5ulp then errno is set, which is fine in nearest rounding mode, but powf(0x1.7ac7cp+5, 23) is inf in upward rounding mode powf(-0x1.7ac7cp+5, 23) is -inf in downward rounding mode and the previous implementation did not set errno in these cases. The fix tries to avoid affecting the common code path or calling a function that may introduce a stack frame, so float arithmetics is used to check the rounding mode and the threshold is selected accordingly. [BZ #23961] * math/auto-libm-test-in: Add new test case. * math/auto-libm-test-out-pow: Regenerated. * sysdeps/ieee754/flt-32/e_powf.c (__powf): Fix overflow check.
* Remove unnecessary math_private.h includes.Joseph Myers2018-09-281-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After my changes to move various macros, inlines and other content from math_private.h to more specific headers, many files including math_private.h no longer need to do so. Furthermore, since the optimized inlines of various functions have been moved to include/fenv.h or replaced by use of function names GCC inlines automatically, a missing math_private.h include where one is appropriate will reliably cause a build failure rather than possibly causing code to be less well optimized while still building successfully. Thus, this patch removes includes of math_private.h that are now unnecessary. In the case of two RISC-V files, the include is replaced by one of stdbool.h because the files in question were relying on math_private.h to get a definition of bool. Tested for x86_64 and x86, and with build-many-glibcs.py. * math/fromfp.h: Do not include <math_private.h>. * math/s_cacosh_template.c: Likewise. * math/s_casin_template.c: Likewise. * math/s_casinh_template.c: Likewise. * math/s_ccos_template.c: Likewise. * math/s_cproj_template.c: Likewise. * math/s_fdim_template.c: Likewise. * math/s_fmaxmag_template.c: Likewise. * math/s_fminmag_template.c: Likewise. * math/s_iseqsig_template.c: Likewise. * math/s_ldexp_template.c: Likewise. * math/s_nextdown_template.c: Likewise. * math/w_log1p_template.c: Likewise. * math/w_scalbln_template.c: Likewise. * sysdeps/aarch64/fpu/feholdexcpt.c: Likewise. * sysdeps/aarch64/fpu/fesetround.c: Likewise. * sysdeps/aarch64/fpu/fgetexcptflg.c: Likewise. * sysdeps/aarch64/fpu/ftestexcept.c: Likewise. * sysdeps/aarch64/fpu/s_llrint.c: Likewise. * sysdeps/aarch64/fpu/s_llrintf.c: Likewise. * sysdeps/aarch64/fpu/s_lrint.c: Likewise. * sysdeps/aarch64/fpu/s_lrintf.c: Likewise. * sysdeps/i386/fpu/s_atanl.c: Likewise. * sysdeps/i386/fpu/s_f32xaddf64.c: Likewise. * sysdeps/i386/fpu/s_f32xsubf64.c: Likewise. * sysdeps/i386/fpu/s_fdim.c: Likewise. * sysdeps/i386/fpu/s_logbl.c: Likewise. * sysdeps/i386/fpu/s_rintl.c: Likewise. * sysdeps/i386/fpu/s_significandl.c: Likewise. * sysdeps/ia64/fpu/s_matherrf.c: Likewise. * sysdeps/ia64/fpu/s_matherrl.c: Likewise. * sysdeps/ieee754/dbl-64/s_atan.c: Likewise. * sysdeps/ieee754/dbl-64/s_cbrt.c: Likewise. * sysdeps/ieee754/dbl-64/s_fma.c: Likewise. * sysdeps/ieee754/dbl-64/s_fmaf.c: Likewise. * sysdeps/ieee754/flt-32/s_cbrtf.c: Likewise. * sysdeps/ieee754/k_standardf.c: Likewise. * sysdeps/ieee754/k_standardl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_copysignl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_finitel.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_fpclassifyl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_isinfl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_isnanl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_signbitl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_cbrtl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_fma.c: Likewise. * sysdeps/ieee754/ldbl-96/s_fmal.c: Likewise. * sysdeps/ieee754/s_signgam.c: Likewise. * sysdeps/powerpc/power5+/fpu/s_modf.c: Likewise. * sysdeps/powerpc/power5+/fpu/s_modff.c: Likewise. * sysdeps/powerpc/power7/fpu/s_logbf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_ceil.c: Likewise. * sysdeps/riscv/rv64/rvd/s_floor.c: Likewise. * sysdeps/riscv/rv64/rvd/s_nearbyint.c: Likewise. * sysdeps/riscv/rv64/rvd/s_round.c: Likewise. * sysdeps/riscv/rv64/rvd/s_roundeven.c: Likewise. * sysdeps/riscv/rv64/rvd/s_trunc.c: Likewise. * sysdeps/riscv/rvd/s_finite.c: Likewise. * sysdeps/riscv/rvd/s_fmax.c: Likewise. * sysdeps/riscv/rvd/s_fmin.c: Likewise. * sysdeps/riscv/rvd/s_fpclassify.c: Likewise. * sysdeps/riscv/rvd/s_isinf.c: Likewise. * sysdeps/riscv/rvd/s_isnan.c: Likewise. * sysdeps/riscv/rvd/s_issignaling.c: Likewise. * sysdeps/riscv/rvf/fegetround.c: Likewise. * sysdeps/riscv/rvf/feholdexcpt.c: Likewise. * sysdeps/riscv/rvf/fesetenv.c: Likewise. * sysdeps/riscv/rvf/fesetround.c: Likewise. * sysdeps/riscv/rvf/feupdateenv.c: Likewise. * sysdeps/riscv/rvf/fgetexcptflg.c: Likewise. * sysdeps/riscv/rvf/ftestexcept.c: Likewise. * sysdeps/riscv/rvf/s_ceilf.c: Likewise. * sysdeps/riscv/rvf/s_finitef.c: Likewise. * sysdeps/riscv/rvf/s_floorf.c: Likewise. * sysdeps/riscv/rvf/s_fmaxf.c: Likewise. * sysdeps/riscv/rvf/s_fminf.c: Likewise. * sysdeps/riscv/rvf/s_fpclassifyf.c: Likewise. * sysdeps/riscv/rvf/s_isinff.c: Likewise. * sysdeps/riscv/rvf/s_isnanf.c: Likewise. * sysdeps/riscv/rvf/s_issignalingf.c: Likewise. * sysdeps/riscv/rvf/s_nearbyintf.c: Likewise. * sysdeps/riscv/rvf/s_roundevenf.c: Likewise. * sysdeps/riscv/rvf/s_roundf.c: Likewise. * sysdeps/riscv/rvf/s_truncf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_rint.c: Include <stdbool.h> instead of <math_private.h>. * sysdeps/riscv/rvf/s_rintf.c: Likewise.
* Use copysign functions not __copysign functions in glibc libm.Joseph Myers2018-09-276-11/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Continuing the move to use, within libm, public names for libm functions that can be inlined as built-in functions on many architectures, this patch moves calls to __copysign functions to call the corresponding copysign names instead, with asm redirection to __copysign when the calls are not inlined (all cases are inlined except for IBM long double for powerpc soft-float / e500v1). This eliminates the need for an inline function defining __copysign in terms of __builtin_copysign. Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (MATH_REDIRECT_BINARY_ARGS): New macro. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (copysign): Redirect using MATH_REDIRECT. * sysdeps/alpha/fpu/s_copysign.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/alpha/fpu/s_copysignf.c: Likewise. * sysdeps/ieee754/dbl-64/s_copysign.c: Likewise. * sysdeps/ieee754/float128/s_copysignf128.c: Likewise. * sysdeps/ieee754/flt-32/s_copysignf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_copysignl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_copysignl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_copysignl.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysign.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysignf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysignf.c: Likewise. * sysdeps/riscv/rvd/s_copysign.c: Likewise. * sysdeps/riscv/rvf/s_copysignf.c: Likewise. * sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysign.c: Likewise. * sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysignf.c: Likewise. * sysdeps/generic/math_private_calls.h [!__MATH_DECLARING_LONG_DOUBLE || !NO_LONG_DOUBLE] (__copysign): Do not declare and define as an inline function. * math/divtc3.c (__divtc3): Use copysign functions instead of __copysign variants. * math/multc3.c (__multc3): Likewise. * sysdeps/generic/math-type-macros.h (M_COPYSIGN): Likewise. * sysdeps/ieee754/dbl-64/e_atan2.c (signArctan2): Likewise. * sysdeps/ieee754/dbl-64/e_atanh.c (__ieee754_atanh): Likewise. * sysdeps/ieee754/dbl-64/e_gamma_r.c (__ieee754_gamma_r): Likewise. * sysdeps/ieee754/dbl-64/e_jn.c (__ieee754_jn): Likewise. (__ieee754_yn): Likewise. * sysdeps/ieee754/dbl-64/s_asinh.c (__asinh): Likewise. * sysdeps/ieee754/dbl-64/s_atan.c (__signArctan): Likewise. * sysdeps/ieee754/dbl-64/s_scalbln.c (__scalbln): Likewise. * sysdeps/ieee754/dbl-64/s_scalbn.c (__scalbn): Likewise. * sysdeps/ieee754/dbl-64/s_sin.c (do_sin): Likewise. (__sin): Likewise. * sysdeps/ieee754/dbl-64/s_sincos.c (__sincos): Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_nearbyint.c (__nearbyint): Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_scalbln.c (__scalbln): Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_scalbn.c (__scalbn): Likewise. * sysdeps/ieee754/flt-32/e_atanhf.c (__ieee754_atanhf): Likewise. * sysdeps/ieee754/flt-32/e_gammaf_r.c (__ieee754_gammaf_r): Likewise. * sysdeps/ieee754/flt-32/e_jnf.c (__ieee754_jnf): Likewise. (__ieee754_ynf): Likewise. * sysdeps/ieee754/flt-32/s_asinhf.c (__asinhf): Likewise. * sysdeps/ieee754/flt-32/s_scalbnf.c (__scalbnf): Likewise. * sysdeps/ieee754/k_standard.c (__kernel_standard): Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-128/e_jnl.c (__ieee754_jnl): Likewise. (__ieee754_ynl): Likewise. * sysdeps/ieee754/ldbl-128/s_scalblnl.c (__scalblnl): Likewise. * sysdeps/ieee754/ldbl-128/s_scalbnl.c (__scalbnl): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_jnl.c (__ieee754_jnl): Likewise. (__ieee754_ynl): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_fmal.c (__fmal): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_scalblnl.c (__scalblnl): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_scalbnl.c (__scalbnl): Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-96/e_jnl.c (__ieee754_jnl): Likewise. (__ieee754_ynl) * sysdeps/ieee754/ldbl-96/s_asinhl.c (__asinhl): Likewise. * sysdeps/ieee754/ldbl-96/s_scalblnl.c (__scalblnl): Likewise. * sysdeps/ieee754/ldbl-opt/nldbl-copysign.c (copysignl): Likewise. * sysdeps/powerpc/power5+/fpu/s_modf.c (__modf): Likewise. * sysdeps/powerpc/power5+/fpu/s_modff.c (__modff): Likewise.
* Use round functions not __round functions in glibc libm.Joseph Myers2018-09-272-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Continuing the move to use, within libm, public names for libm functions that can be inlined as built-in functions on many architectures, this patch moves calls to __round functions to call the corresponding round names instead, with asm redirection to __round when the calls are not inlined. An additional complication arises in sysdeps/ieee754/ldbl-128ibm/e_expl.c, where a call to roundl, with the result converted to int, gets converted by the compiler to call lroundl in the case of 32-bit long, so resulting in localplt test failures. It's logically correct to let the compiler make such an optimization; an appropriate asm redirection of lroundl to __lroundl is thus added to that file (it's not needed anywhere else). Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (round): Redirect using MATH_REDIRECT. * sysdeps/aarch64/fpu/s_round.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/aarch64/fpu/s_roundf.c: Likewise. * sysdeps/ieee754/dbl-64/s_round.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_round.c: Likewise. * sysdeps/ieee754/float128/s_roundf128.c: Likewise. * sysdeps/ieee754/flt-32/s_roundf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_roundl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_roundl.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_round.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_roundf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_round.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_roundf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_round.c: Likewise. * sysdeps/riscv/rvf/s_roundf.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_roundl.c: Likewise. (round): Redirect to __round. (__roundl): Call round instead of __round. * sysdeps/powerpc/fpu/math_private.h [_ARCH_PWR5X] (__round): Remove macro. [_ARCH_PWR5X] (__roundf): Likewise. * sysdeps/ieee754/dbl-64/e_gamma_r.c (gamma_positive): Use round functions instead of __round variants. * sysdeps/ieee754/flt-32/e_gammaf_r.c (gammaf_positive): Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/x86/fpu/powl_helper.c (__powl_helper): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_expl.c (lroundl): Redirect to __lroundl. (__ieee754_expl): Call roundl instead of __roundl.
* Use trunc functions not __trunc functions in glibc libm.Joseph Myers2018-09-202-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Continuing the move to use, within libm, public names for libm functions that can be inlined as built-in functions on many architectures, this patch moves calls to __trunc functions to call the corresponding trunc names instead, with asm redirection to __trunc when the calls are not inlined. Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (trunc): Redirect using MATH_REDIRECT. * sysdeps/aarch64/fpu/s_trunc.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/aarch64/fpu/s_truncf.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_trunc.c: Likewise. * sysdeps/ieee754/float128/s_truncf128.c: Likewise. * sysdeps/ieee754/dbl-64/s_trunc.c: Likewise. * sysdeps/ieee754/flt-32/s_truncf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_truncl.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_trunc.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_truncf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_trunc.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_truncf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_trunc.c: Likewise. * sysdeps/riscv/rvf/s_truncf.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_trunc.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_truncf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_trunc.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_truncf.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_trunc_template.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_truncl.c: Likewise. (ceil): Redirect to __ceil. (floor): Redirect to __floor. (trunc): Redirect to __trunc. (__truncl): Call trunc instead of __trunc. * sysdeps/powerpc/fpu/math_private.h [_ARCH_PWR5X] (__trunc): Remove macro. [_ARCH_PWR5X] (__truncf): Likewise. * sysdeps/ieee754/dbl-64/e_gamma_r.c (__ieee754_gamma_r): Use trunc functions instead of __trunc variants. * sysdeps/ieee754/flt-32/e_gammaf_r.c (__ieee754_gammaf_r): Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c (__ieee754_gammal_r): Likewise.
* Fix the documentation comment of checkint in powfSzabolcs Nagy2018-09-191-1/+2
| | | | | | checkint in powf is not supposed to be used with 0, inf or nan inputs. * sysdeps/ieee754/flt-32/e_powf.c (checkint): Fix documentation.
* Use ceil functions not __ceil functions in glibc libm.Joseph Myers2018-09-172-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Continuing the move to use, within libm, public names for libm functions that can be inlined as built-in functions on many architectures, this patch moves calls to __ceil functions to call the corresponding ceil names instead, with asm redirection to __ceil when the calls are not inlined. Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (ceil): Redirect using MATH_REDIRECT. * sysdeps/aarch64/fpu/s_ceil.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/aarch64/fpu/s_ceilf.c: Likewise. * sysdeps/ieee754/dbl-64/s_ceil.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_ceil.c: Likewise. * sysdeps/ieee754/float128/s_ceilf128.c: Likewise. * sysdeps/ieee754/flt-32/s_ceilf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_ceill.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_ceill.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_ceil_template.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceil.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceilf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceil.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceilf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_ceil.c: Likewise. * sysdeps/riscv/rvf/s_ceilf.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_ceil.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_ceilf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_ceil.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_ceilf.c: Likewise. * sysdeps/powerpc/fpu/math_private.h [_ARCH_PWR5X] (__ceil): Remove macro. * sysdeps/ieee754/dbl-64/e_gamma_r.c (gamma_positive): Use ceil functions instead of __ceil variants. * sysdeps/ieee754/flt-32/e_gammaf_r.c (gammaf_positive): Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_truncl.c (__truncl): Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/powerpc/power5+/fpu/s_modf.c (__modf): Likewise. * sysdeps/powerpc/power5+/fpu/s_modff.c (__modff): Likewise.
* Use rint functions not __rint functions in glibc libm.Joseph Myers2018-09-142-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Continuing the move to use, within libm, public names for libm functions that can be inlined as built-in functions on many architectures, this patch moves calls to __rint functions to call the corresponding rint names instead, with asm redirection to __rint when the calls are not inlined. The x86_64 math_private.h is removed as no longer useful after this patch. This patch is relative to a tree with my floor patch <https://sourceware.org/ml/libc-alpha/2018-09/msg00148.html> applied, and much the same considerations arise regarding possibly replacing an IFUNC call with a direct inline expansion. Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (rint): Redirect using MATH_REDIRECT. * sysdeps/aarch64/fpu/s_rint.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/aarch64/fpu/s_rintf.c: Likewise. * sysdeps/alpha/fpu/s_rint.c: Likewise. * sysdeps/alpha/fpu/s_rintf.c: Likewise. * sysdeps/i386/fpu/s_rintl.c: Likewise. * sysdeps/ieee754/dbl-64/s_rint.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_rint.c: Likewise. * sysdeps/ieee754/float128/s_rintf128.c: Likewise. * sysdeps/ieee754/flt-32/s_rintf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_rintl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_rintl.c: Likewise. * sysdeps/m68k/coldfire/fpu/s_rint.c: Likewise. * sysdeps/m68k/coldfire/fpu/s_rintf.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_rint.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_rintf.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_rintl.c: Likewise. * sysdeps/powerpc/fpu/s_rint.c: Likewise. * sysdeps/powerpc/fpu/s_rintf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_rint.c: Likewise. * sysdeps/riscv/rvf/s_rintf.c: Likewise. * sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rint.c: Likewise. * sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rintf.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_rint.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_rintf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_rint.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_rintf.c: Likewise. * sysdeps/x86_64/fpu/math_private.h: Remove file. * math/e_scalb.c (invalid_fn): Use rint functions instead of __rint variants. * math/e_scalbf.c (invalid_fn): Likewise. * math/e_scalbl.c (invalid_fn): Likewise. * sysdeps/ieee754/dbl-64/e_gamma_r.c (__ieee754_gamma_r): Likewise. * sysdeps/ieee754/flt-32/e_gammaf_r.c (__ieee754_gammaf_r): Likewise. * sysdeps/ieee754/k_standard.c (__kernel_standard): Likewise. * sysdeps/ieee754/k_standardl.c (__kernel_standard_l): Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/powerpc/powerpc32/fpu/s_llrint.c (__llrint): Likewise. * sysdeps/powerpc/powerpc32/fpu/s_llrintf.c (__llrintf): Likewise.
* Use floor functions not __floor functions in glibc libm.Joseph Myers2018-09-143-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Similar to the changes that were made to call sqrt functions directly in glibc, instead of __ieee754_sqrt variants, so that the compiler could inline them automatically without needing special inline definitions in lots of math_private.h headers, this patch makes libm code call floor functions directly instead of __floor variants, removing the inlines / macros for x86_64 (SSE4.1) and powerpc (POWER5). The redirection used to ensure that __ieee754_sqrt does still get called when the compiler doesn't inline a built-in function expansion is refactored so it can be applied to other functions; the refactoring is arranged so it's not limited to unary functions either (it would be reasonable to use this mechanism for copysign - removing the inline in math_private_calls.h but also eliminating unnecessary local PLT entry use in the cases (powerpc soft-float and e500v1, for IBM long double) where copysign calls don't get inlined). The point of this change is that more architectures can get floor calls inlined where they weren't previously (AArch64, for example), without needing special inline definitions in their math_private.h, and existing such definitions in math_private.h headers can be removed. Note that it's possible that in some cases an inline may be used where an IFUNC call was previously used - this is the case on x86_64, for example. I think the direct calls to floor are still appropriate; if there's any significant performance cost from inline SSE2 floor instead of an IFUNC call ending up with SSE4.1 floor, that indicates that either the function should be doing something else that's faster than using floor at all, or it should itself have IFUNC variants, or that the compiler choice of inlining for generic tuning should change to allow for the possibility that, by not inlining, an SSE4.1 IFUNC might be called at runtime - but not that glibc should avoid calling floor internally. (After all, all the same considerations would apply to any user program calling floor, where it might either be inlined or left as an out-of-line call allowing for a possible IFUNC.) Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (MATH_REDIRECT): New macro. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (MATH_REDIRECT_LDBL): Likewise. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (MATH_REDIRECT_F128): Likewise. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (MATH_REDIRECT_UNARY_ARGS): Likewise. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (sqrt): Redirect using MATH_REDIRECT. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (floor): Likewise. * sysdeps/aarch64/fpu/s_floor.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/aarch64/fpu/s_floorf.c: Likewise. * sysdeps/ieee754/dbl-64/s_floor.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_floor.c: Likewise. * sysdeps/ieee754/float128/s_floorf128.c: Likewise. * sysdeps/ieee754/flt-32/s_floorf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_floorl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_floorl.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_floor_template.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floor.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floorf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_floor.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_floorf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_floor.c: Likewise. * sysdeps/riscv/rvf/s_floorf.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_floor.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_floorf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_floor.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_floorf.c: Likewise. * sysdeps/powerpc/fpu/math_private.h [_ARCH_PWR5X] (__floor): Remove macro. [_ARCH_PWR5X] (__floorf): Likewise. * sysdeps/x86_64/fpu/math_private.h [__SSE4_1__] (__floor): Remove inline function. [__SSE4_1__] (__floorf): Likewise. * math/w_lgamma_main.c (LGFUNC (__lgamma)): Use floor functions instead of __floor variants. * math/w_lgamma_r_compat.c (__lgamma_r): Likewise. * math/w_lgammaf_main.c (LGFUNC (__lgammaf)): Likewise. * math/w_lgammaf_r_compat.c (__lgammaf_r): Likewise. * math/w_lgammal_main.c (LGFUNC (__lgammal)): Likewise. * math/w_lgammal_r_compat.c (__lgammal_r): Likewise. * math/w_tgamma_compat.c (__tgamma): Likewise. * math/w_tgamma_template.c (M_DECL_FUNC (__tgamma)): Likewise. * math/w_tgammaf_compat.c (__tgammaf): Likewise. * math/w_tgammal_compat.c (__tgammal): Likewise. * sysdeps/ieee754/dbl-64/e_lgamma_r.c (sin_pi): Likewise. * sysdeps/ieee754/dbl-64/k_rem_pio2.c (__kernel_rem_pio2): Likewise. * sysdeps/ieee754/dbl-64/lgamma_neg.c (__lgamma_neg): Likewise. * sysdeps/ieee754/flt-32/e_lgammaf_r.c (sin_pif): Likewise. * sysdeps/ieee754/flt-32/lgamma_negf.c (__lgamma_negf): Likewise. * sysdeps/ieee754/ldbl-128/e_lgammal_r.c (__ieee754_lgammal_r): Likewise. * sysdeps/ieee754/ldbl-128/e_powl.c (__ieee754_powl): Likewise. * sysdeps/ieee754/ldbl-128/lgamma_negl.c (__lgamma_negl): Likewise. * sysdeps/ieee754/ldbl-128/s_expm1l.c (__expm1l): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_lgammal_r.c (__ieee754_lgammal_r): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_powl.c (__ieee754_powl): Likewise. * sysdeps/ieee754/ldbl-128ibm/lgamma_negl.c (__lgamma_negl): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_expm1l.c (__expm1l): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_truncl.c (__truncl): Likewise. * sysdeps/ieee754/ldbl-96/e_lgammal_r.c (sin_pi): Likewise. * sysdeps/ieee754/ldbl-96/lgamma_negl.c (__lgamma_negl): Likewise. * sysdeps/powerpc/power5+/fpu/s_modf.c (__modf): Likewise. * sysdeps/powerpc/power5+/fpu/s_modff.c (__modff): Likewise.
* Move fenv.h soft-float inlines from fenv_private.h to include/fenv.h.Joseph Myers2018-09-044-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | <fenv_private.h> has inline versions of various <fenv.h> functions, and their __fe* variants, for systems (generally soft-float) without support for floating-point exceptions, rounding modes or both. Having these inlines in a separate header introduces a risk of a source file including <fenv.h> and compiling OK on x86_64, but failing to compile (because the feraiseexcept inline is actually a macro that discards its argument, to avoid the need for #ifdef FE_INVALID conditionals), or not being properly optimized, on systems without the exceptions and rounding modes support (when these inlines were in math_private.h, we had a few cases where this broke the build because there was no obvious reason for a file to need math_private.h and it didn't need that header on x86_64). By moving those inlines to include/fenv.h, this risk can be avoided, and fenv_private.h becomes more clearly defined as specifically the header for the internal libc_fe* and SET_RESTORE_ROUND* interfaces. This patch makes that move, removing fenv_private.h includes that are no longer needed (or replacing them by fenv.h includes in a few cases that didn't already have such an include). Tested for x86_64 and x86, and tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by the patch. * sysdeps/generic/fenv_private.h [FE_ALL_EXCEPT == 0]: Move this code .... [!FE_HAVE_ROUNDING_MODES]: And this code .... * include/fenv.h [!_ISOMAC]: ... to here. * math/fraiseexcpt.c (__feraiseexcept): Undefine as macro. (feraiseexcept): Likewise. * math/fromfp.h: Do not include <fenv_private.h>. * math/s_cexp_template.c: Likewise. * math/s_csin_template.c: Likewise. * math/s_csinh_template.c: Likewise. * math/s_ctan_template.c: Likewise. * math/s_ctanh_template.c: Likewise. * math/s_iseqsig_template.c: Likewise. * math/w_acos_compat.c: Likewise. * math/w_acosf_compat.c: Likewise. * math/w_acosl_compat.c: Likewise. * math/w_asin_compat.c: Likewise. * math/w_asinf_compat.c: Likewise. * math/w_asinl_compat.c: Likewise. * math/w_j0_compat.c: Likewise. * math/w_j0f_compat.c: Likewise. * math/w_j0l_compat.c: Likewise. * math/w_j1_compat.c: Likewise. * math/w_j1f_compat.c: Likewise. * math/w_j1l_compat.c: Likewise. * math/w_jn_compat.c: Likewise. * math/w_jnf_compat.c: Likewise. * math/w_log10_compat.c: Likewise. * math/w_log10f_compat.c: Likewise. * math/w_log10l_compat.c: Likewise. * math/w_log2_compat.c: Likewise. * math/w_log2f_compat.c: Likewise. * math/w_log2l_compat.c: Likewise. * math/w_log_compat.c: Likewise. * math/w_logf_compat.c: Likewise. * math/w_logl_compat.c: Likewise. * sysdeps/ieee754/dbl-64/s_llrint.c: Likewise. * sysdeps/ieee754/dbl-64/s_llround.c: Likewise. * sysdeps/ieee754/dbl-64/s_lrint.c: Likewise. * sysdeps/ieee754/dbl-64/s_lround.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_lround.c: Likewise. * sysdeps/ieee754/flt-32/s_llrintf.c: Likewise. * sysdeps/ieee754/flt-32/s_llroundf.c: Likewise. * sysdeps/ieee754/flt-32/s_lrintf.c: Likewise. * sysdeps/ieee754/flt-32/s_lroundf.c: Likewise. * sysdeps/ieee754/k_standardl.c: Likewise. * sysdeps/ieee754/ldbl-128/e_expl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_fmal.c: Likewise. * sysdeps/ieee754/ldbl-128/s_llrintl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_llroundl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_lrintl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_lroundl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_nearbyintl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_llrintl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_llroundl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_lrintl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_lroundl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_fma.c: Likewise. * sysdeps/ieee754/ldbl-96/s_fmal.c: Likewise. * sysdeps/ieee754/ldbl-96/s_llrintl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_llroundl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_lrintl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_lroundl.c: Likewise. * math/w_ilogb_template.c: Include <fenv.h> instead of <fenv_private.h>. * math/w_llogb_template.c: Likewise. * sysdeps/powerpc/fpu/e_sqrt.c: Likewise. * sysdeps/powerpc/fpu/e_sqrtf.c: Likewise.
* Do not include fenv_private.h in math_private.h.Joseph Myers2018-09-039-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Continuing the clean-up related to the catch-all math_private.h header, this patch stops math_private.h from including fenv_private.h. Instead, fenv_private.h is included directly from those users of math_private.h that also used interfaces from fenv_private.h. No attempt is made to remove unused includes of math_private.h, but that is a natural followup. (However, since math_private.h sometimes defines optimized versions of math.h interfaces or __* variants thereof, as well as defining its own interfaces, I think it might make sense to get all those optimized versions included from include/math.h, not requiring a separate header at all, before eliminating unused math_private.h includes - that avoids a file quietly becoming less-optimized if someone adds a call to one of those interfaces without restoring a math_private.h include to that file.) There is still a pitfall that if code uses plain fe* and __fe* interfaces, but only includes fenv.h and not fenv_private.h or (before this patch) math_private.h, it will compile on platforms with exceptions and rounding modes but not get the optimized versions (and possibly not compile) on platforms without exception and rounding mode support, so making it easy to break the build for such platforms accidentally. I think it would be most natural to move the inlines / macros for fe* and __fe* in the case of no exceptions and rounding modes into include/fenv.h, so that all code including fenv.h with _ISOMAC not defined automatically gets them. Then fenv_private.h would be purely the header for the libc_fe*, SET_RESTORE_ROUND etc. internal interfaces and the risk of breaking the build on other platforms than the one you tested on because of a missing fenv_private.h include would be much reduced (and there would be some unused fenv_private.h includes to remove along with unused math_private.h includes). Tested for x86_64 and x86, and tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by this patch. * sysdeps/generic/math_private.h: Do not include <fenv_private.h>. * math/fromfp.h: Include <fenv_private.h>. * math/math-narrow.h: Likewise. * math/s_cexp_template.c: Likewise. * math/s_csin_template.c: Likewise. * math/s_csinh_template.c: Likewise. * math/s_ctan_template.c: Likewise. * math/s_ctanh_template.c: Likewise. * math/s_iseqsig_template.c: Likewise. * math/w_acos_compat.c: Likewise. * math/w_acosf_compat.c: Likewise. * math/w_acosl_compat.c: Likewise. * math/w_asin_compat.c: Likewise. * math/w_asinf_compat.c: Likewise. * math/w_asinl_compat.c: Likewise. * math/w_ilogb_template.c: Likewise. * math/w_j0_compat.c: Likewise. * math/w_j0f_compat.c: Likewise. * math/w_j0l_compat.c: Likewise. * math/w_j1_compat.c: Likewise. * math/w_j1f_compat.c: Likewise. * math/w_j1l_compat.c: Likewise. * math/w_jn_compat.c: Likewise. * math/w_jnf_compat.c: Likewise. * math/w_llogb_template.c: Likewise. * math/w_log10_compat.c: Likewise. * math/w_log10f_compat.c: Likewise. * math/w_log10l_compat.c: Likewise. * math/w_log2_compat.c: Likewise. * math/w_log2f_compat.c: Likewise. * math/w_log2l_compat.c: Likewise. * math/w_log_compat.c: Likewise. * math/w_logf_compat.c: Likewise. * math/w_logl_compat.c: Likewise. * sysdeps/aarch64/fpu/feholdexcpt.c: Likewise. * sysdeps/aarch64/fpu/fesetround.c: Likewise. * sysdeps/aarch64/fpu/fgetexcptflg.c: Likewise. * sysdeps/aarch64/fpu/ftestexcept.c: Likewise. * sysdeps/ieee754/dbl-64/e_atan2.c: Likewise. * sysdeps/ieee754/dbl-64/e_exp.c: Likewise. * sysdeps/ieee754/dbl-64/e_exp2.c: Likewise. * sysdeps/ieee754/dbl-64/e_gamma_r.c: Likewise. * sysdeps/ieee754/dbl-64/e_jn.c: Likewise. * sysdeps/ieee754/dbl-64/e_pow.c: Likewise. * sysdeps/ieee754/dbl-64/e_remainder.c: Likewise. * sysdeps/ieee754/dbl-64/e_sqrt.c: Likewise. * sysdeps/ieee754/dbl-64/gamma_product.c: Likewise. * sysdeps/ieee754/dbl-64/lgamma_neg.c: Likewise. * sysdeps/ieee754/dbl-64/s_atan.c: Likewise. * sysdeps/ieee754/dbl-64/s_fma.c: Likewise. * sysdeps/ieee754/dbl-64/s_fmaf.c: Likewise. * sysdeps/ieee754/dbl-64/s_llrint.c: Likewise. * sysdeps/ieee754/dbl-64/s_llround.c: Likewise. * sysdeps/ieee754/dbl-64/s_lrint.c: Likewise. * sysdeps/ieee754/dbl-64/s_lround.c: Likewise. * sysdeps/ieee754/dbl-64/s_nearbyint.c: Likewise. * sysdeps/ieee754/dbl-64/s_sin.c: Likewise. * sysdeps/ieee754/dbl-64/s_sincos.c: Likewise. * sysdeps/ieee754/dbl-64/s_tan.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_lround.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_nearbyint.c: Likewise. * sysdeps/ieee754/dbl-64/x2y2m1.c: Likewise. * sysdeps/ieee754/float128/float128_private.h: Likewise. * sysdeps/ieee754/flt-32/e_gammaf_r.c: Likewise. * sysdeps/ieee754/flt-32/e_j1f.c: Likewise. * sysdeps/ieee754/flt-32/e_jnf.c: Likewise. * sysdeps/ieee754/flt-32/lgamma_negf.c: Likewise. * sysdeps/ieee754/flt-32/s_llrintf.c: Likewise. * sysdeps/ieee754/flt-32/s_llroundf.c: Likewise. * sysdeps/ieee754/flt-32/s_lrintf.c: Likewise. * sysdeps/ieee754/flt-32/s_lroundf.c: Likewise. * sysdeps/ieee754/flt-32/s_nearbyintf.c: Likewise. * sysdeps/ieee754/k_standardl.c: Likewise. * sysdeps/ieee754/ldbl-128/e_expl.c: Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c: Likewise. * sysdeps/ieee754/ldbl-128/e_j1l.c: Likewise. * sysdeps/ieee754/ldbl-128/e_jnl.c: Likewise. * sysdeps/ieee754/ldbl-128/gamma_productl.c: Likewise. * sysdeps/ieee754/ldbl-128/lgamma_negl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_fmal.c: Likewise. * sysdeps/ieee754/ldbl-128/s_llrintl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_llroundl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_lrintl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_lroundl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_nearbyintl.c: Likewise. * sysdeps/ieee754/ldbl-128/x2y2m1l.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/e_expl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/e_j1l.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/e_jnl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/lgamma_negl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_fmal.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_llrintl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_llroundl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_lrintl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_lroundl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_rintl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/x2y2m1l.c: Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c: Likewise. * sysdeps/ieee754/ldbl-96/e_jnl.c: Likewise. * sysdeps/ieee754/ldbl-96/gamma_productl.c: Likewise. * sysdeps/ieee754/ldbl-96/lgamma_negl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_fma.c: Likewise. * sysdeps/ieee754/ldbl-96/s_fmal.c: Likewise. * sysdeps/ieee754/ldbl-96/s_llrintl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_llroundl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_lrintl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_lroundl.c: Likewise. * sysdeps/ieee754/ldbl-96/x2y2m1l.c: Likewise. * sysdeps/powerpc/fpu/e_sqrt.c: Likewise. * sysdeps/powerpc/fpu/e_sqrtf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_ceil.c: Likewise. * sysdeps/riscv/rv64/rvd/s_floor.c: Likewise. * sysdeps/riscv/rv64/rvd/s_nearbyint.c: Likewise. * sysdeps/riscv/rv64/rvd/s_round.c: Likewise. * sysdeps/riscv/rv64/rvd/s_roundeven.c: Likewise. * sysdeps/riscv/rv64/rvd/s_trunc.c: Likewise. * sysdeps/riscv/rvd/s_finite.c: Likewise. * sysdeps/riscv/rvd/s_fmax.c: Likewise. * sysdeps/riscv/rvd/s_fmin.c: Likewise. * sysdeps/riscv/rvd/s_fpclassify.c: Likewise. * sysdeps/riscv/rvd/s_isinf.c: Likewise. * sysdeps/riscv/rvd/s_isnan.c: Likewise. * sysdeps/riscv/rvd/s_issignaling.c: Likewise. * sysdeps/riscv/rvf/fegetround.c: Likewise. * sysdeps/riscv/rvf/feholdexcpt.c: Likewise. * sysdeps/riscv/rvf/fesetenv.c: Likewise. * sysdeps/riscv/rvf/fesetround.c: Likewise. * sysdeps/riscv/rvf/feupdateenv.c: Likewise. * sysdeps/riscv/rvf/fgetexcptflg.c: Likewise. * sysdeps/riscv/rvf/ftestexcept.c: Likewise. * sysdeps/riscv/rvf/s_ceilf.c: Likewise. * sysdeps/riscv/rvf/s_finitef.c: Likewise. * sysdeps/riscv/rvf/s_floorf.c: Likewise. * sysdeps/riscv/rvf/s_fmaxf.c: Likewise. * sysdeps/riscv/rvf/s_fminf.c: Likewise. * sysdeps/riscv/rvf/s_fpclassifyf.c: Likewise. * sysdeps/riscv/rvf/s_isinff.c: Likewise. * sysdeps/riscv/rvf/s_isnanf.c: Likewise. * sysdeps/riscv/rvf/s_issignalingf.c: Likewise. * sysdeps/riscv/rvf/s_nearbyintf.c: Likewise. * sysdeps/riscv/rvf/s_roundevenf.c: Likewise. * sysdeps/riscv/rvf/s_roundf.c: Likewise. * sysdeps/riscv/rvf/s_truncf.c: Likewise.
* Remove unused math filesWilco Dijkstra2018-08-244-400/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | Remove empty files due to the sin/cos improvements: k_sinf.c, k_cosf.c, k_cos.c, k_sin.c. After the tanf change s_rem_pio2f.c and k_rem_pio2f.c (and the ia64, m68k and powerpc equivalents) are no longer used, so remove them. All e_rem_pio2.c files were already empty or commented out, so remove them too. Passes build-many-glibcs. * math/Makefile: Remove empty files k_sin(f).c, k_cos(f).c. Remove unused files e_rem_pio2(f).c, k_rem_pio2f.c. * sysdeps/i386/fpu/e_rem_pio2.c: Delete file. * sysdeps/ia64/fpu/e_rem_pio2.c: Likewise. * sysdeps/ia64/fpu/e_rem_pio2f.c: Likewise. * sysdeps/ia64/fpu/k_rem_pio2f.c: Likewise. * sysdeps/ieee754/dbl-64/e_rem_pio2.c: Likewise. * sysdeps/ieee754/dbl-64/k_cos.c: Likewise. * sysdeps/ieee754/dbl-64/k_sin.c: Likewise. * sysdeps/ieee754/flt-32/e_rem_pio2f.c: Likewise. * sysdeps/ieee754/flt-32/k_cosf.c: Likewise. * sysdeps/ieee754/flt-32/k_rem_pio2f.c: Likewise. * sysdeps/ieee754/flt-32/k_sinf.c: Likewise. * sysdeps/m68k/m680x0/fpu/e_rem_pio2.c: Likewise * sysdeps/m68k/m680x0/fpu/e_rem_pio2f.c: Likewise * sysdeps/m68k/m680x0/fpu/k_rem_pio2f.c: Likewise * sysdeps/powerpc/fpu/e_rem_pio2f.c: Likewise. * sysdeps/powerpc/fpu/k_rem_pio2f.c: Likewise.
* Speedup tanf range reductionWilco Dijkstra2018-08-231-1/+28
| | | | | | | | | | | | | | | | Speedup tanf range reduction by using the new sincosf range reduction algorithm. Overall code quality is improved due to inlining, so there is a speedup even if no range reduction is required. tanf throughput gains on Cortex-A72: * |x| < M_PI_4 : 1.1x * |x| < M_PI_2 : 1.2x * |x| < 2 * M_PI: 1.5x * |x| < 120.0 : 1.6x * |x| < Inf : 12.1x * sysdeps/ieee754/flt-32/s_tanf.c (__tanf): Use fast range reduction.
* Use generic sinf/cosf in lgammaf_rWilco Dijkstra2018-08-153-112/+8
| | | | | | | | | | | | | | The internal functions __kernel_sinf and __kernel_cosf are used only by lgammaf_r. Removing the internal functions and using the generic sinf and cosf is better overall. Benchmarking on Cortex-A72 shows the generic sinf and cosf are 1.4x and 2.3x faster in the range |x| < PI/4, and 0.66x and 1.1x for |x| < PI/2, so it should make lgammaf_r faster on average. GLIBC regression tests pass on AArch64. * sysdeps/ieee754/flt-32/e_lgammaf_r.c (sin_pif): Use __sinf/__cosf. * sysdeps/ieee754/flt-32/k_cosf.c (__kernel_cosf): Remove all code. * sysdeps/ieee754/flt-32/k_sinf.c (__kernel_sinf): Likewise.
* Improve performance of sinf and cosfWilco Dijkstra2018-08-143-370/+134
| | | | | | | | | | | | | | | | | | | | | | | | | The second patch improves performance of sinf and cosf using the same algorithms and polynomials. The returned values are identical to sincosf for the same input. ULP definitions for AArch64 and x64 are updated. sinf/cosf througput gains on Cortex-A72: * |x| < 0x1p-12 : 1.2x * |x| < M_PI_4 : 1.8x * |x| < 2 * M_PI: 1.7x * |x| < 120.0 : 2.3x * |x| < Inf : 3.0x * NEWS: Mention sinf, cosf, sincosf. * sysdeps/aarch64/libm-test-ulps: Update ULP for sinf, cosf, sincosf. * sysdeps/x86_64/fpu/libm-test-ulps: Update ULP for sinf and cosf. * sysdeps/x86_64/fpu/multiarch/s_sincosf-fma.c: Add definitions of constants rather than including generic sincosf.h. * sysdeps/x86_64/fpu/s_sincosf_data.c: Remove. * sysdeps/ieee754/flt-32/s_cosf.c (cosf): Rewrite. * sysdeps/ieee754/flt-32/s_sincosf.h (reduced_sin): Remove. (reduced_cos): Remove. (sinf_poly): New function. * sysdeps/ieee754/flt-32/s_sinf.c (sinf): Rewrite.
* Improve performance of sincosfWilco Dijkstra2018-08-103-132/+256
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch is a complete rewrite of sincosf. The new version is significantly faster, as well as simple and accurate. The worst-case ULP is 0.5607, maximum relative error is 0.5303 * 2^-23 over all 4 billion inputs. In non-nearest rounding modes the error is 1ULP. The algorithm uses 3 main cases: small inputs which don't need argument reduction, small inputs which need a simple range reduction and large inputs requiring complex range reduction. The code uses approximate integer comparisons to quickly decide between these cases. The small range reducer uses a single reduction step to handle values up to 120.0. It is fastest on targets which support inlined round instructions. The large range reducer uses integer arithmetic for simplicity. It does a 32x96 bit multiply to compute a 64-bit modulo result. This is more than accurate enough to handle the worst-case cancellation for values close to an integer multiple of PI/4. It could be further optimized, however it is already much faster than necessary. sincosf throughput gains on Cortex-A72: * |x| < 0x1p-12 : 1.6x * |x| < M_PI_4 : 1.7x * |x| < 2 * M_PI: 1.5x * |x| < 120.0 : 1.8x * |x| < Inf : 2.3x * math/Makefile: Add s_sincosf_data.c. * sysdeps/ia64/fpu/s_sincosf_data.c: New file. * sysdeps/ieee754/flt-32/s_sincosf.h (abstop12): Add new function. (sincosf_poly): Likewise. (reduce_small): Likewise. (reduce_large): Likewise. * sysdeps/ieee754/flt-32/s_sincosf.c (sincosf): Rewrite. * sysdeps/ieee754/flt-32/s_sincosf_data.c: New file with sincosf data. * sysdeps/m68k/m680x0/fpu/s_sincosf_data.c: New file. * sysdeps/x86_64/fpu/s_sincosf_data.c: New file.
* Clean up converttoint handling and document the semanticsSzabolcs Nagy2018-08-102-9/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch currently only affects aarch64. The roundtoint and converttoint internal functions are only called with small values, so 32 bit result is enough for converttoint and it is a signed int conversion so the return type is changed to int32_t. The original idea was to help the compiler keeping the result in uint64_t, then it's clear that no sign extension is needed and there is no accidental undefined or implementation defined signed int arithmetics. But it turns out gcc does a good job with inlining so changing the type has no overhead and the semantics of the conversion is less surprising this way. Since we want to allow the asuint64 (x + 0x1.8p52) style conversion, the top bits were never usable and the existing code ensures that only the bottom 32 bits of the conversion result are used. On aarch64 the neon intrinsics (which round ties to even) are changed to round and lround (which round ties away from zero) this does not affect the results in a significant way, but more portable (relies on round and lround being inlined which works with -fno-math-errno). The TOINT_SHIFT and TOINT_RINT macros were removed, only keep separate code paths for TOINT_INTRINSICS and !TOINT_INTRINSICS. * sysdeps/aarch64/fpu/math_private.h (roundtoint): Use round. (converttoint): Use lround. * sysdeps/ieee754/flt-32/math_config.h (roundtoint): Declare and document the semantics when TOINT_INTRINSICS is set. (converttoint): Likewise. (TOINT_RINT): Remove. (TOINT_SHIFT): Remove. * sysdeps/ieee754/flt-32/e_expf.c (__expf): Remove the TOINT_RINT code path.
* Use uint32_t sign in single precision math error handling functionsSzabolcs Nagy2018-07-023-11/+11
| | | | | | | | | | | | | | | | | | | | | Ideally sign should be bool, but sometimes (e.g. in powf) it's more efficient to pass a non-zero value than 1 to indicate that the sign should be set. The fixed size int is less ambigous than unsigned long. * sysdeps/ieee754/flt-32/e_powf.c (__powf): Use uint32_t. (exp2f_inline): Likewise. * sysdeps/ieee754/flt-32/math_config.h (__math_oflowf): Likewise. (__math_uflowf): Likewise. (__math_may_uflowf): Likewise. (__math_divzerof): Likewise. (__math_invalidf): Likewise. * sysdeps/ieee754/flt-32/math_errf.c (xflowf): Likewise. (__math_oflowf): Likewise. (__math_uflowf): Likewise. (__math_may_uflowf): Likewise. (__math_divzerof): Likewise. (__math_invalidf): Likewise.
* Fix i686-linux-gnu build with GCC mainline.Joseph Myers2018-05-221-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Building with recent GCC mainline for i686-linux-gnu is failing with: ../sysdeps/ieee754/flt-32/k_rem_pio2f.c: In function '__kernel_rem_pio2f': ../sysdeps/ieee754/flt-32/k_rem_pio2f.c:186:28: error: 'fq[0]' may be used uninitialized in this function [-Werror=maybe-uninitialized] fv = math_narrow_eval (fq[0]-fv); ^ and ../sysdeps/ieee754/dbl-64/k_rem_pio2.c: In function '__kernel_rem_pio2': ../sysdeps/ieee754/dbl-64/k_rem_pio2.c:333:32: error: 'fq[0]' may be used uninitialized in this function [-Werror=maybe-uninitialized] fv = math_narrow_eval (fq[0] - fv); ^ These are similar to -Warray-bounds cases for which the DIAG_* macros are already used in those files: the array element is in fact always initialized, but the reasoning that it is depends on another array not having been all zero at an earlier point, which depends on the functions not being called with zero arguments. Thus, this patch uses DIAG_* to disable -Wmaybe-uninitialized for this code. (The warning may be i686-specific because of math_narrow_eval somehow perturbing what the compiler does with this code enough to cause the warning. I don't know why it doesn't appear for i686-gnu.) Tested with build-many-glibcs.py that this fixes the i686 build in this configuration. * sysdeps/ieee754/dbl-64/k_rem_pio2.c (__kernel_rem_pio2): Ignore -Wmaybe-uninitialized around access to fq[0]. * sysdeps/ieee754/flt-32/k_rem_pio2f.c (__kernel_rem_pio2f): Likewise.