about summary refs log tree commit diff
path: root/sysdeps/powerpc/fpu
Commit message (Collapse)AuthorAgeFilesLines
* Update copyright dates with scripts/update-copyrightsPaul Eggert2021-01-0253-53/+53
| | | | | | | | | | | | | | | | I used these shell commands: ../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright (cd ../glibc && git commit -am"[this commit message]") and then ignored the output, which consisted lines saying "FOO: warning: copyright statement not found" for each of 6694 files FOO. I then removed trailing white space from benchtests/bench-pthread-locks.c and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this diagnostic from Savannah: remote: *** pre-commit check failed ... remote: *** error: lines with trailing whitespace found remote: error: hook declined to update refs/heads/master
* powerpc: Regenerate ulpsFlorian Weimer2020-12-221-12/+13
| | | | | For new inputs added in commit cad5ad81d2f7f58a7ad0d8afa8c1b710, as seen on a POWER8 system.
* Update powerpc libm-test-ulpsMatheus Castanho2020-09-101-3/+3
| | | | | | | | | | | | | Before this patch, the following tests were failing: ppc and ppc64: FAIL: math/test-ldouble-j0 ppc64le: FAIL: math/test-float128-j0 FAIL: math/test-float64x-j0 FAIL: math/test-ibm128-j0 FAIL: math/test-ldouble-j0
* powerpc: Use sqrt{f} builtinAdhemerval Zanella2020-06-223-80/+42
| | | | | | | | | | The powerpc sqrt implementation is also simplified: - the static constants are open coded within the implementation. - for !USE_SQRT_BUILTIN the function is implemented directly on __ieee754_sqrt (it avoid an superflous extra jump). Checked on powerpc-linux-gnu and powerpc64le-linux-gnu.
* math: Decompose math-use-builtins.hAdhemerval Zanella2020-06-222-77/+9
| | | | | | | | | | | Each symbol definitions are moved on a separated file and it cover all symbol type definitions (float, double, long double, and float128). It allows to set support for architectures without the boiler place of copying default values. Checked with a build on the affected ABIs.
* powerpc64le: use common fmaf128 implementationPaul E. Murphy2020-06-051-1/+7
| | | | | | | | | This defines the macro such that it should behave best on all supported powerpc targets. Likewise, this allows us to remove the ppc64le specific s_fmaf128.c. I have verified powerpc64le multiarch and powerpc64le power9 no-multiarch builds continue to generate optimize fmaf128.
* powerpc: Fix powerpc64le due a7a3435c9aAdhemerval Zanella2020-06-041-0/+2
| | | | | | | | | | | | | | | The build uses an undefined macro evaluation for fmaf128 build. For now set USE_FMAL_BUILTIN and USE_FMAF128_BUILTIN to 0. Checked with a build for: powerpc64le-linux-gnu-power9-disable-multi-arch powerpc64le-linux-gnu-power9 powerpc64le-linux-gnu powerpc64-linux-gnu-power8 powerpc64-linux-gnu powerpc-linux-gnu-power4 powerpc-linux-gnu
* powerpc/fpu: use generic fma functionsVineet Gupta2020-06-033-54/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Tested with build-many-glibcs for powerpc-linux-gnu This is a non functional change and powerpc libm before/after was byte invariant as compared below: | cd /SCRATCH/vgupta/gnu/install-glibc-A-baseline | for i in `find . -name libm-2.31.9000.so`; do | echo $i; diff $i /SCRATCH/vgupta/gnu/install-glibc-C-reduce-scope/$i ; | echo $?; | done | ./aarch64-linux-gnu/lib64/libm-2.31.9000.so | 0 | ./arm-linux-gnueabi/lib/libm-2.31.9000.so | 0 | ./x86_64-linux-gnu/lib64/libm-2.31.9000.so | 0 | ./arm-linux-gnueabihf/lib/libm-2.31.9000.so | 0 | ./riscv64-linux-gnu-rv64imac-lp64/lib64/lp64/libm-2.31.9000.so | 0 | ./riscv64-linux-gnu-rv64imafdc-lp64/lib64/lp64/libm-2.31.9000.so | 0 | ./powerpc-linux-gnu/lib/libm-2.31.9000.so | 0 | ./microblaze-linux-gnu/lib/libm-2.31.9000.so | 0 | ./nios2-linux-gnu/lib/libm-2.31.9000.so | 0 | ./hppa-linux-gnu/lib/libm-2.31.9000.so | 0 | ./s390x-linux-gnu/lib64/libm-2.31.9000.so | 0 Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* powerpc64le: Enable support for IEEE long doubleGabriel F. T. Gomes2020-04-301-0/+4
| | | | | | | | | | | | | | On platforms where long double may have two different formats, i.e.: the same format as double (64-bits) or something else (128-bits), building with -mlong-double-128 is the default and function calls in the user program match the name of the function in Glibc. When building with -mlong-double-64, Glibc installed headers redirect such calls to the appropriate function. Likewise, the internals of glibc are now built against IEEE long double. However, the only (minimally) notable usage of long double is difftime. Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
* powerpc: Update ULPs and xfail more ibm128 outputsTulio Magno Quites Machado Filho2020-04-071-16/+18
| | | | | | | There are 2 new input values that require to be marked as xfail-rounding:ibm128-libgcc as they're known to fail because of libgcc issues with different rounding modes. Otherwise, the other tests just need an increase in ULP.
* math: Remove fenvinline.hAdhemerval Zanella2020-03-302-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Similar to string2.h (18b10de7ce) and string3.h (09a596cc2c) this patch removes the fenvinline.h on all architectures. Currently only powerpc implements some optimizations. This kind of optimization is better implemented by the compiler (which handles the architecture ISA transparently). Also, for the specific optimized powerpc implementation the code is becoming convoluted and these micro-optimization are hardly wildly used, even more being a possible hotspot in realword cases (non-default rounding are used only on specific cases and exception handling are done most likely only on errors path). Only x86 implements similar optimization (on fenv.h) also indicates that these should no be on libc. The math/test-fenv already covers all math/test-fenvinline tests, so it is safe to remove it. The powerpc fegetround optimization is moved to internal fenv_libc.h. The BZ#94193 [1] the corresponding GCC bug for adding replacements for these on powerpc. Checked on x86_64-linux-gnu and powerpc64le-linux-gnu. [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94193
* math: Remove inline math testsAdhemerval Zanella2020-03-191-1135/+0
| | | | | | | | | | | | | With mathinline removal there is no need to keep building and testing inline math tests. The gen-libm-tests.py support to generate ULP_I_* is removed and all libm-test-ulps files are updated to longer have the i{float,double,ldouble} entries. The support for no-test-inline is also removed from both gen-auto-libm-tests and the auto-libm-test-out-* were regenerated. Checked on x86_64-linux-gnu and i686-linux-gnu.
* Add libm_alias_finite for _finite symbolsWilco Dijkstra2020-01-034-4/+12
| | | | | | | | | | | | | | | | | | | This patch adds a new macro, libm_alias_finite, to define all _finite symbol. It sets all _finite symbol as compat symbol based on its first version (obtained from the definition at built generated first-versions.h). The <fn>f128_finite symbols were introduced in GLIBC 2.26 and so need special treatment in code that is shared between long double and float128. It is done by adding a list, similar to internal symbol redifinition, on sysdeps/ieee754/float128/float128_private.h. Alpha also needs some tricky changes to ensure we still emit 2 compat symbols for sqrt(f). Passes buildmanyglibc. Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* Update copyright dates with scripts/update-copyrights.Joseph Myers2020-01-0155-55/+55
|
* [powerpc] No need to enter "Ignore Exceptions Mode"Paul A. Clarke2019-10-021-4/+15
| | | | | | | | | Since at least POWER8, there is no performance advantage to entering "Ignore Exceptions Mode", and doing so conditionally requires - the conditional logic, and - a system call. Make it a no-op for uses within glibc.
* [powerpc] Rename fesetenv_mode to fesetenv_controlPaul A. Clarke2019-09-275-6/+6
| | | | | | | fesetenv_mode is used variously to write the FPSCR exception enable bits and rounding mode bits. These are referred to as the control bits in the POWER ISA. Change the name to be reflective of its current and expected use, and match up well with fegetenv_control.
* [powerpc] libc_feholdsetround_noex_ppc_ctx: optimize FPSCR writePaul A. Clarke2019-09-271-1/+1
| | | | | | | | | | | | libc_feholdsetround_noex_ppc_ctx currently performs: 1. Read FPSCR, save to context. 2. Create new FPSCR value: clear enables and set new rounding mode. 3. Write new value to FPSCR. Since other bits just pass through, there is no need to write them. Instead, write just the changed values (enables and rounding mode), which can be a bit more efficient.
* [powerpc] Rename fegetenv_status to fegetenv_controlPaul A. Clarke2019-09-277-9/+9
| | | | | | | | | | | fegetenv_status is used variously to retrieve the FPSCR exception enable bits, rounding mode bits, or both. These are referred to as the control bits in the POWER ISA. FPSCR status bits are also returned by the 'mffs' and 'mffsl' instructions, but they are uniformly ignored by all uses of fegetenv_status. Change the name to be reflective of its current and expected use. Reviewed-By: Paul E Murphy <murphyp@linux.ibm.com>
* [powerpc] __fesetround_inline optimizationsPaul A. Clarke2019-09-271-3/+15
| | | | | | | | | On POWER9, use more efficient means to update the 2-bit rounding mode via the 'mffscrn' instruction (instead of two 'mtfsb0/1' instructions or one 'mtfsfi' instruction that modifies 4 bits). Suggested-by: Paul E. Murphy <murphyp@linux.ibm.com> Reviewed-By: Paul E Murphy <murphyp@linux.ibm.com>
* [powerpc] libc_feupdateenv_test: optimize FPSCR accessPaul A. Clarke2019-09-272-2/+18
| | | | | | | | | | | | ROUND_TO_ODD and a couple of other places use libc_feupdateenv_test to restore the rounding mode and exception enables, preserve exception flags, and test whether given exception(s) were generated. If the exception flags haven't changed, then it is sufficient and a bit more efficient to just restore the rounding mode and enables, rather than writing the full Floating-Point Status and Control Register (FPSCR). Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>
* [powerpc] fenv_private.h clean upPaul A. Clarke2019-09-278-117/+38
| | | | | | | | | | | fenv_private.h includes unused functions, magic macro constants, and some replicated common code fragments. Remove unused functions, replace magic constants with constants from fenv_libc.h, and refactor replicated code. Suggested-by: Paul E. Murphy <murphyp@linux.ibm.com> Reviewed-By: Paul E Murphy <murphyp@linux.ibm.com>
* [powerpc] SET_RESTORE_ROUND optimizations and bug fixPaul A. Clarke2019-09-192-25/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SET_RESTORE_ROUND brackets a block of code, temporarily setting and restoring the rounding mode and letting everything else, including exceptions generated within the block, pass through. On powerpc, the current code clears the exception enables, which will hide exceptions generated within the block. This issue was introduced by me in commit e905212627350d54b58426214b5a54ddc852b0c9. Fix this by not clearing exception enable bits in the prologue. Also, since we are no longer changing the enable bits in either the prologue or the epilogue, there is no need to test for entering/exiting non-stop mode. Also, optimize the prologue get/save/set rounding mode operations for POWER9 and later by using 'mffscrn' when possible. Suggested-by: Paul E. Murphy <murphyp@linux.ibm.com> Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com> Fixes: e905212627350d54b58426214b5a54ddc852b0c9 2019-09-19 Paul A. Clarke <pc@us.ibm.com> * sysdeps/powerpc/fpu/fenv_libc.h (fegetenv_and_set_rn): New. (__fe_mffscrn): New. * sysdeps/powerpc/fpu/fenv_private.h (libc_feholdsetround_ppc_ctx): Do not clear enable bits, remove obsolete code, use fegetenv_and_set_rn. (libc_feresetround_ppc): Remove obsolete code, use fegetenv_and_set_rn.
* Prefer https to http for gnu.org and fsf.org URLsPaul Eggert2019-09-0755-55/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Also, change sources.redhat.com to sourceware.org. This patch was automatically generated by running the following shell script, which uses GNU sed, and which avoids modifying files imported from upstream: sed -ri ' s,(http|ftp)(://(.*\.)?(gnu|fsf|sourceware)\.org($|[^.]|\.[^a-z])),https\2,g s,(http|ftp)(://(.*\.)?)sources\.redhat\.com($|[^.]|\.[^a-z]),https\2sourceware.org\4,g ' \ $(find $(git ls-files) -prune -type f \ ! -name '*.po' \ ! -name 'ChangeLog*' \ ! -path COPYING ! -path COPYING.LIB \ ! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \ ! -path manual/texinfo.tex ! -path scripts/config.guess \ ! -path scripts/config.sub ! -path scripts/install-sh \ ! -path scripts/mkinstalldirs ! -path scripts/move-if-change \ ! -path INSTALL ! -path locale/programs/charmap-kw.h \ ! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \ ! '(' -name configure \ -execdir test -f configure.ac -o -f configure.in ';' ')' \ ! '(' -name preconfigure \ -execdir test -f preconfigure.ac ';' ')' \ -print) and then by running 'make dist-prepare' to regenerate files built from the altered files, and then executing the following to cleanup: chmod a+x sysdeps/unix/sysv/linux/riscv/configure # Omit irrelevant whitespace and comment-only changes, # perhaps from a slightly-different Autoconf version. git checkout -f \ sysdeps/csky/configure \ sysdeps/hppa/configure \ sysdeps/riscv/configure \ sysdeps/unix/sysv/linux/csky/configure # Omit changes that caused a pre-commit check to fail like this: # remote: *** error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines git checkout -f \ sysdeps/powerpc/powerpc64/ppc-mcount.S \ sysdeps/unix/sysv/linux/s390/s390-64/syscall.S # Omit change that caused a pre-commit check to fail like this: # remote: *** error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S
* [powerpc] fegetenv_status: simplify instruction generationPaul A. Clarke2019-08-281-15/+6
| | | | | | | | | | | | | | fegetenv_status() wants to use the lighter weight instruction 'mffsl' for reading the Floating-Point Status and Control Register (FPSCR). It currently will use it directly if compiled '-mcpu=power9', and will perform a runtime check (cpu_supports("arch_3_00")) otherwise. Nicely, it turns out that the 'mffsl' instruction will decode to 'mffs' on architectures older than "arch_3_00" because the additional bits set for 'mffsl' are "don't care" for 'mffs'. 'mffs' is a superset of 'mffsl'. So, just generate 'mffsl'.
* [powerpc] fesetenv: optimize FPSCR accessPaul A. Clarke2019-08-281-8/+4
| | | | | | | | | | | | | | fesetenv() reads the current value of the Floating-Point Status and Control Register (FPSCR) to determine the difference between the current state of exception enables and the newly requested state. All of these bits are also returned by the lighter weight 'mffsl' instruction used by fegetenv_status(). Use that instead. Also, remove a local macro _FPU_MASK_ALL in favor of a common macro, FPU_ENABLES_MASK from fenv_libc.h. Finally, use a local variable ('new') in favor of a pointer dereference ('*envp').
* [powerpc] SET_RESTORE_ROUND improvementsPaul A. Clarke2019-08-281-2/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SET_RESTORE_ROUND uses libc_feholdsetround_ppc_ctx and libc_feresetround_ppc_ctx to bracket a block of code where the floating point rounding mode must be set to a certain value. For the *prologue*, libc_feholdsetround_ppc_ctx is used and performs: 1. Read/save FPSCR. 2. Create new value for FPSCR with new rounding mode and enables cleared. 3. If new value is different than current value, a. If transitioning from a state where some exceptions enabled, enter "ignore exceptions / non-stop" mode. b. Write new value to FPSCR. c. Put a mark on the wall indicating the FPSCR was changed. (1) uses the 'mffs' instruction. On POWER9, the lighter weight 'mffsl' instruction can be used, but it doesn't return all of the bits in the FPSCR. fegetenv_status uses 'mffsl' on POWER9, 'mffs' otherwise, and can thus be used instead of fegetenv_register. (3b) uses 'mtfsf 0b11111111' to write the entire FPSCR, so it must instead use 'mtfsf 0b00000011' to write just the enables and the mode, because some of the rest of the bits are not valid if 'mffsl' was used. fesetenv_mode uses 'mtfsf 0b00000011' on POWER9, 'mtfsf 0b11111111' otherwise. For the *epilogue*, libc_feresetround_ppc_ctx checks the mark on the wall, then calls libc_feresetround_ppc, which just calls __libc_femergeenv_ppc with parameters such that it performs: 1. Retreive saved value of FPSCR, saved in prologue above. 2. Read FPSCR. 3. Create new value of FPSCR where: - Summary bits and exception indicators = current OR saved. - Rounding mode and enables = saved. - Status bits = current. 4. If transitioning from some exceptions enabled to none, enter "ignore exceptions / non-stop" mode. 5. If transitioning from no exceptions enabled to some, enter "catch exceptions" mode. 6. Write new value to FPSCR. The summary bits are hardwired to the exception indicators, so there is no need to restore any saved summary bits. The exception indicator bits, which are sticky and remain set unless explicitly cleared, would only need to be restored if the code block might explicitly clear any of them. This is certainly not expected. So, the only bits that need to be restored are the enables and the mode. If it is the case that only those bits are to be restored, there is no need to read the FPSCR. Steps (2) and (3) are unnecessary, and step (6) only needs to write the bits being restored. We know we are transitioning out of "ignore exceptions" mode, so step (4) is unnecessary, and in step (6), we only need to check the state we are entering.
* [powerpc] fe{en,dis}ableexcept, fesetmode: optimize FPSCR accessesPaul A. Clarke2019-08-284-24/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Since fe{en,dis}ableexcept() and fesetmode() read-modify-write just the "mode" (exception enable and rounding mode) bits of the Floating Point Status Control Register (FPSCR), the lighter weight 'mffsl' instruction can be used to read the FPSCR (enables and rounding mode), and 'mtfsf 0b00000011' can be used to write just those bits back to the FPSCR. The net is better performance. In addition, fe{en,dis}ableexcept() read the FPSCR again after writing it, or they determine that it doesn't need to be written because it is not changing. In either case, the local variable holds the current values of the enable bits in the FPSCR. This local variable can be used instead of again reading the FPSCR. Also, that value of the FPSCR which is read the second time is validated against the requested enables. Since the write can't fail, this validation step is unnecessary, and can be removed. Instead, the exceptions to be enabled (or disabled) are transformed into available bits in the FPSCR, then validated after being transformed back, to ensure that all requested bits are actually being set. For example, FE_INVALID_SQRT can be requested, but cannot actually be set. This bit is not mapped during the transformations, so a test for that bit being set before and after transformations will show the bit would not be set, and the function will return -1 for failure. Finally, convert the local macros in fesetmode.c to more generally useful macros in fenv_libc.h.
* [powerpc] fe{en,dis}ableexcept optimize bit translationsPaul A. Clarke2019-08-283-33/+63
| | | | | | | | | | | The exceptions passed to fe{en,dis}ableexcept() are defined in the ABI as a bitmask, a combination of FE_INVALID, FE_OVERFLOW, etc. Within the functions, these bits must be translated to/from the corresponding enable bits in the Floating Point Status Control Register (FPSCR). This translation is currently done bit-by-bit. The compiler generates a series of conditional bit operations. Nicely, the "FE" exception bits are all a uniform offset from the FPSCR enable bits, so the bit-by-bit operation can instead be performed by a shift with appropriate masking.
* [powerpc] fenv_libc.h: protect use of __builtin_cpu_supportsPaul A. Clarke2019-07-091-1/+3
| | | | | | | | | | | | | Using __builtin_cpu_supports() requires support in GCC and Glibc. My recent patch to fenv_libc.h added an unprotected use of __builtin_cpu_supports(). Compilation of Glibc itself will fail with a sufficiently new GCC and sufficiently old Glibc: ../sysdeps/powerpc/fpu/fegetexcept.c: In function ‘__fegetexcept’: ../sysdeps/powerpc/fpu/fenv_libc.h:52:20: error: builtin ‘__builtin_cpu_supports’ needs GLIBC (2.23 and newer) that exports hardware capability bits [-Werror] Reviewed-by: Florian Weimer <fweimer@redhat.com> Fixes 3db85a9814784a74536a1f0e7b7ddbfef7dc84bb.
* powerpc: refactor logb{f,l}Adhemerval Zanella2019-07-083-0/+198
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The power7 logb implementation does not show a performance gain on ISA 2.07+ chips with faster floating-point to GRP instructions (currently POWER8 and POWER9). This patch moves the POWER7 implementation to generic one and enables it for POWER7. It also add some cleanup to use inline floating-point number instead of define them using static const. The performance difference is for POWER9: - Without patch: "logb": { "subnormal": { "duration": 4.99202e+09, "iterations": 8.83662e+08, "max": 75.194, "min": 5.501, "mean": 5.64925 }, "normal": { "duration": 4.97063e+09, "iterations": 9.97094e+08, "max": 46.489, "min": 4.956, "mean": 4.98512 } } - With patch: "logb": { "subnormal": { "duration": 4.97226e+09, "iterations": 9.92036e+08, "max": 77.209, "min": 4.892, "mean": 5.01218 }, "normal": { "duration": 4.96192e+09, "iterations": 1.07545e+09, "max": 12.361, "min": 4.593, "mean": 4.61382 } } The ifunc implementation is also enabled only for powerpc64. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/power7/fpu/s_logb.c: Move to ... * sysdeps/powerpc/fpu/s_logb.c: ... here. Use inline FP constants. * sysdeps/powerpc/power7/fpu/s_logbf.c: Move to ... * sysdeps/powerpc/fpu/s_logbf.c: ... here. Use inline FP constants. * sysdeps/powerpc/power7/fpu/s_logbl.c: Move to ... * sysdeps/powerpc/fpu/s_logbl.c: ... here. Use inline FP constants. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_logb-power7.c: Adjust implementation path. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_logbf-power7.c: Adjust implementation path. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_logbl-power7.c: Adjust implementation path. * sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile (libm-sysdep_routines): Add s_log* objects. (CFLAGS-s_logbf-power7.c, CFLAGS-s_logbl-power7.c, CFLAGS-s_logb-power7.c): New fule. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_logb-power7.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logb-power7.c: ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_logb-ppc64.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logb-ppc64.c: ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_logb.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logb.c: ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbf-power7.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbf-power7.c: ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbf-ppc64.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbf-ppc64.c: ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbf.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbf.c: ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbl-power7.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbl-power7.c: ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbl-ppc64.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbl-ppc64.c: ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbl.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbl.c: ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile: Remove file. * sysdeps/powerpc/powerpc64/power7/fpu/s_logb.c: Remove file. * sysdeps/powerpc/powerpc64/power7/fpu/s_logbf.c: Likewise. * sysdeps/powerpc/powerpc64/power7/fpu/s_logbl.c: Likewise. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
* powerpc: Refactor modf{f}Adhemerval Zanella2019-07-082-0/+114
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The modf{f} optimization is not an optimization for ISA 2.07+. This patch move the IFUNC for powerpc64 only, move the power5+ to generic location, and include the generic implementation for ISA 2.07+. The performance changes are based on modf benchtests: * POWER9 - ppc64 "modf": { "": { "duration": 4.97057e+09, "iterations": 1.00688e+09, "max": 28.76, "min": 4.912, "mean": 4.9366 } } * POWER9 - power5+ "modf": { "": { "duration": 4.98291e+09, "iterations": 9.32818e+08, "max": 15.058, "min": 5.107, "mean": 5.34178 } } * POWER8 - ppc64 "modf": { "": { "duration": 5.05329e+09, "iterations": 8.38814e+08, "max": 518.051, "min": 5.79, "mean": 6.02433 } } * POWER8 - power5+ "modf": { "": { "duration": 5.05573e+09, "iterations": 8.35254e+08, "max": 63.141, "min": 5.873, "mean": 6.05293 } } * POWER7 - ppc64 "modf": { "": { "duration": 4.89818e+09, "iterations": 1.08408e+09, "max": 57.556, "min": 3.953, "mean": 4.51827 } } * POWER7 - power5+ "modf": { "": { "duration": 4.83789e+09, "iterations": 1.33409e+09, "max": 46.608, "min": 2.224, "mean": 3.62636 } } Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/power5+/fpu/s_modf.c: Move to ... * sysdeps/powerpc/fpu/s_modf.c: ... here. Add ISA 2.07 optimization. * sysdeps/powerpc/power5+/fpu/s_modff.c: Move to ... * sysdeps/powerpc/fpu/s_modff.c: ... here. Add ISA 2.07 optimization. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modf-power5+.c: Adjust include. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modff-power5+.c: Likewise. * sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile (sysdep_calls, sysdep_routines): Add s_modf* objects. (CFLAGS-s_modf-power5+.c, CFLAGS-s_modff-power5+.c, CFLAGS-s_modf-ppc64.c, CFLAGS-s_modff-ppc64.c): New rule. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf-power5+.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf-power5+.c: ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf-power5+.c: Movo to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf-power5+.c: Move ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf.c: ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff-power5+.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff-power5+.c: ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff-ppc64.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff-ppc64.c: ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff.c: ... here. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
* powerpc: hypot refactor and optimizationAdhemerval Zanella2019-07-081-72/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The powerpc hypot is slight optimized by: - Commit 8df4e219e43, both isnan and isinf are always inlined and thus the check TEST_INF_NAN does not make sense anymore. The generic check for POWER7 should be faster on all powerpc configuration. - The redundant check 'y > two60factor && (x / y) > two60' is removed. Both changes leads to unrequired ifunc especialization for power7 and thus they are removed. Finally The code is also cleanup a bit by inlining the constants floating points. The performance changes using the hypot benchtests are: - POWER9 without patch: "hypot": { "overflow": { "duration": 4.98585e+09, "iterations": 4.84932e+08, "max": 46.551, "min": 10.229, "mean": 10.2815 }, "higher_two500": { "duration": 5.00192e+09, "iterations": 4.24843e+08, "max": 33.319, "min": 11.606, "mean": 11.7736 }, "subnormal": { "duration": 5.0075e+09, "iterations": 4.06792e+08, "max": 22.178, "min": 12.15, "mean": 12.3097 }, "less_two500": { "duration": 5.00685e+09, "iterations": 4.08772e+08, "max": 22.784, "min": 12.052, "mean": 12.2485 }, "default": { "duration": 5.06002e+09, "iterations": 4.09894e+08, "max": 20.648, "min": 11.874, "mean": 12.3447 } } - POWER9 with patch: "hypot": { "overflow": { "duration": 4.91848e+09, "iterations": 7.28039e+08, "max": 47.958, "min": 6.436, "mean": 6.75579 }, "higher_two500": { "duration": 4.9359e+09, "iterations": 6.63376e+08, "max": 20.783, "min": 7.321, "mean": 7.44057 }, "subnormal": { "duration": 4.9479e+09, "iterations": 6.19772e+08, "max": 18.856, "min": 7.817, "mean": 7.98341 }, "less_two500": { "duration": 4.94275e+09, "iterations": 6.3889e+08, "max": 17.452, "min": 7.597, "mean": 7.73647 }, "default": { "duration": 5.03645e+09, "iterations": 5.70718e+08, "max": 18.904, "min": 8.55, "mean": 8.82476 } } - POWER7 without patch "hypot": { "overflow": { "duration": 4.86637e+09, "iterations": 6.43196e+08, "max": 53.958, "min": 7.328, "mean": 7.56592 }, "higher_two500": { "duration": 4.99842e+09, "iterations": 3.11012e+08, "max": 78.227, "min": 15.696, "mean": 16.0715 }, "subnormal": { "duration": 4.99841e+09, "iterations": 3.08935e+08, "max": 51.392, "min": 15.983, "mean": 16.1795 }, "less_two500": { "duration": 5.00108e+09, "iterations": 2.99464e+08, "max": 73.247, "min": 16.416, "mean": 16.7001 }, "default": { "duration": 5.04645e+09, "iterations": 3.52608e+08, "max": 70.073, "min": 13.38, "mean": 14.3118 } } - POWER7 with patch "hypot": { "overflow": { "duration": 4.80785e+09, "iterations": 8.00001e+08, "max": 66.262, "min": 5.888, "mean": 6.00981 }, "higher_two500": { "duration": 4.9859e+09, "iterations": 3.39449e+08, "max": 5148.44, "min": 14.539, "mean": 14.6882 }, "subnormal": { "duration": 4.9905e+09, "iterations": 3.28874e+08, "max": 64.905, "min": 14.971, "mean": 15.1745 }, "less_two500": { "duration": 4.99494e+09, "iterations": 3.19755e+08, "max": 103.696, "min": 14.972, "mean": 15.6211 }, "default": { "duration": 5.03951e+09, "iterations": 4.02502e+08, "max": 61.008, "min": 12.368, "mean": 12.5205 } } Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/fpu/e_hypot.c (two60, two500, two600, two1022, twoM500, twoM600, two60factor, pdnum): Remove. (TEST_INFO_NAN, GET_TW0_HIGH_WORD): Remove macro. (__ieee754_hypot): Replace static variables with inline definition, remove ununsed branches. * sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (libm-sysdep_routines): Remove e_hypot-* objects. (CFLAGS-e_hypot-power7.c, CFLAGS-e_hypotf-power7.c): Remove rule. * sysdeps/powerpc/powerpc64/fpu/multiarch/e_hypot-power7.c: Remove file. * sysdeps/powerpc/powerpc64/fpu/multiarch/e_hypot-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/e_hypot.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/e_hypotf-power7.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/e_hypotf-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/e_hypotf.c: Likewise. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
* powerpc: Use faster means to access FPSCR when possible in some casesPaul A. Clarke2019-06-303-2/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Using 'mffs' instruction to read the Floating Point Status Control Register (FPSCR) can force a processor flush in some cases, with undesirable performance impact. If the values of the bits in the FPSCR which force the flush are not needed, an instruction that is new to POWER9 (ISA version 3.0), 'mffsl' can be used instead. Cases included: get_rounding_mode, fegetround, fegetmode, fegetexcept. * sysdeps/powerpc/bits/fenvinline.h (__fegetround): Use __fegetround_ISA300() or __fegetround_ISA2() as appropriate. (__fegetround_ISA300) New. (__fegetround_ISA2) New. * sysdeps/powerpc/fpu_control.h (IS_ISA300): New. (_FPU_MFFS): Move implementation... (_FPU_GETCW): Here. (_FPU_MFFSL): Move implementation.... (_FPU_GET_RC_ISA300): Here. New. (_FPU_GET_RC): Use _FPU_GET_RC_ISA300() or _FPU_GETCW() as appropriate. * sysdeps/powerpc/fpu/fenv_libc.h (fegetenv_status_ISA300): New. (fegetenv_status): New. * sysdeps/powerpc/fpu/fegetmode.c (fegetmode): Use fegetenv_status() instead of fegetenv_register(). * sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Likewise. Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
* [powerpc] add 'volatile' to asmPaul A. Clarke2019-06-191-2/+2
| | | | | | | | | | | | | | Add 'volatile' keyword to a few asm statements, to force the compiler to generate the instructions therein. Some instances were implicitly volatile, but adding keyword for consistency. 2019-06-19 Paul A. Clarke <pc@us.ibm.com> * sysdeps/powerpc/fpu/fenv_libc.h (relax_fenv_state): Add 'volatile'. * sysdeps/powerpc/fpu/fpu_control.h (__FPU_MFFS): Likewise. (__FPU_MFFSL): Likewise. (_FPU_SETCW): Likewise.
* powerpc: Refactor powerpc32 lrint/lrintf/llrint/llrintfAdhemerval Zanella2019-06-172-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patches consolidates all the powerpc llrint{f} implementations on the generic sysdeps/powerpc/powerpc32/fpu/s_llrint{f}. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cpu and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/fpu/s_lrintf.S: Remove file. * sysdeps/powerpc/powerpc64/fpu/s_lrintf.c: Move to ... * sysdeps/powerpc/fpu/s_lrintf.c: ... here. * sysdeps/powerpc/powerpc32/fpu/Makefile [$(subdir) == math] (CFLAGS-s_lrint.c): New rule. * sysdeps/powerpc/powerpc32/fpu/s_llrint.c (__llrint): Add power4 optimization. * sysdeps/powerpc/powerpc32/fpu/s_llrintf.c (__llrintf): Likewise. * sysdeps/powerpc/powerpc32/fpu/s_lrint.S: Remove file. * sysdeps/powerpc/powerpc32/fpu/s_lrint.c: New file. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile (CFLAGS-s_llrintf-power6.c, CFLAGS-s_llrintf-ppc32.c, CFLAGS-s_llrint-power6.c, CFLAGS-s_llrint-ppc32.c, CFLAGS-s_lrint-ppc32.c): New rule. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrint-power6.S: Remove file. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrint-ppc32.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrintf-power6.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrintf-ppc32.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lrint-ppc32.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/s_llrint.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/s_llrintf.S: Likewise. * sysdeps/powerpc/powerpc32/power6/fpu/s_llrint.S: Likewise. * sysdeps/powerpc/powerpc32/power6/fpu/s_llrintf.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrint-power6.c: New file. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrint-ppc32.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrintf-power6.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrintf-ppc32.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lrint-ppc32.c: Likewise. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
* powerpc: Remove optimized isnanAdhemerval Zanella2019-06-122-63/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The powerpc isnan optimizations are not really a gain: - GCC will call libm iff -fsignaling-nans is used. This usage pattern is usually not performance oriented and for such calls PLT overhead should dominate execution time. - The power5, power6, and power6x are just micro-optimization to improve the Load-Hit-Store hazards from floating-point to general register transfer, and current GCC already has support to minimize it by inserting either extra nops or group dispatch instructions. - The power7 uses ftdiv to optimize for some input patterns, but at cost of others. Comparing against generic C implementation built for powerpc-linux-gnu-power4 (which uses the hp-timing support on benchtests): - Generic sysdeps/ieee754 implementation: "isnan": { "": { "duration": 4.98415e+09, "iterations": 2.34516e+09, "max": 45.925, "min": 2.052, "mean": 2.12529 }, "INF": { "duration": 4.74057e+09, "iterations": 1.69761e+09, "max": 91.01, "min": 2.052, "mean": 2.79249 }, "NAN": { "duration": 4.74071e+09, "iterations": 1.68768e+09, "max": 282.343, "min": 2.052, "mean": 2.809 } } - power7 optimized one: $ ./testrun.sh benchtests/bench-isnan "isnan": { "": { "duration": 4.96842e+09, "iterations": 2.56297e+09, "max": 50.048, "min": 1.872, "mean": 1.93854 }, "INF": { "duration": 4.76648e+09, "iterations": 1.54213e+09, "max": 373.408, "min": 2.661, "mean": 3.09084 }, "NAN": { "duration": 4.76845e+09, "iterations": 1.54515e+09, "max": 51.016, "min": 2.736, "mean": 3.08607 } } So it basically optimizes marginally for normal numbers while increasing the latency for other kind of FP. - The generic implementation requires getting the floating point status, disable the invalid operation bit, and restore the floating-point status. Each operation is costly and requires flushing the FP pipeline. Using the same scenarion for the previous analysis: "isnan": { "": { "duration": 5.08284e+09, "iterations": 6.2898e+08, "max": 41.844, "min": 8.057, "mean": 8.08108 }, "INF": { "duration": 4.97904e+09, "iterations": 6.16176e+08, "max": 39.661, "min": 8.057, "mean": 8.08055 }, "NAN": { "duration": 4.98695e+09, "iterations": 5.95866e+08, "max": 29.728, "min": 8.345, "mean": 8.36925 } } - The power8 implementation is just the generic implementation using ISA 2.07 mfvsrd instruction (which GCC uses for generic implementation). So generic implementation is the best option for powerpc64le. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/fpu/s_isnan.c: Remove file. * sysdeps/powerpc/fpu/s_isnanf.S: Likewise. * sysdeps/powerpc/powerpc32/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile (sysdeps_routines, libm-sysdep_routines): Remove s_isnan-* and s_isnanf-* objects. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan-power5.S: Remove file * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan-power6.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan-power7.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan-ppc32.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnanf-power5.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnanf-power6.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnanf.c: Likewise. * sysdeps/powerpc/powerpc32/power5/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc32/power5/fpu/s_isnanf.S: Likewise. * sysdeps/powerpc/powerpc32/power6/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc32/power6/fpu/s_isnanf.S: Likewise. * sysdeps/powerpc/powerpc32/power7/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc32/power7/fpu/s_isnanf.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (sysdep_calls): Remove s_isnan-* and s_isnanf-* objects. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power5.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power6.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power6x.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power7.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power8.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnanf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc64/power5/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc64/power6/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc64/power6x/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc64/power7/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc64/power7/fpu/s_isnanf.S: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/s_isnanf.S: Likewise. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
* powerpc: copysign cleanupAdhemerval Zanella2019-06-122-0/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GCC always expand copysign{f} for all possible cpus, so calling the libm is only done if user explicitly states to disable the builtin (which is done usually not for performance reason). So to provide ifunc variant for copysign is just unrequired complexity, since libm will be called on non-performance critical code. This patch removes both powerpc32 and powerpc64 ifunc variants and consolidates the powerpc implementation on sysdeps/powerpc/fpu/s_copysign{f}.c using compiler builtins. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/fpu/s_copysign.c: New file. * sysdeps/powerpc/fpu/s_copysignf.c: Likewise. * sysdeps/powerpc/powerpc32/fpu/s_copysign.S: Remove file. * sysdeps/powerpc/powerpc32/fpu/s_copysignf.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile (sysdep_routines, libm-sysdep_routines): Remove s_copysign-power6 and s_copysign-ppc32. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysign-power6.S: Remove file. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysign-ppc32.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysign.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysignf.c: Likewise. * sysdeps/powerpc/powerpc32/power6/fpu/s_copysign.S: Likewise. * sysdeps/powerpc/powerpc32/power6/fpu/s_copysignf.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (sysdeps_calls): Remove s_copysign-power6 s_copysign-ppc64. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign-power6.S: Remove file. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysignf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_copysign.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_copysignf.S: Likewise. * sysdeps/powerpc/powerpc64/power6/fpu/s_copysign.S: Likewise. * sysdeps/powerpc/powerpc64/power6/fpu/s_copysignf.S: Likewise. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
* powerpc: consolidate rintAdhemerval Zanella2019-06-123-43/+32
| | | | | | | | | | | | | | | | | | | | | | This patches consolidates all the powerpc rint{f} implementations on the generic sysdeps/powerpc/fpu/s_rint{f}. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/fpu/round_to_integer.h (set_fenv_mode, round_to_integer_float, round_mode): Add RINT handling. (reset_fenv_mode): New symbol. * sysdeps/powerpc/fpu/s_rint.c (__rint): Use generic implementation. * sysdeps/powerpc/fpu/s_rintf.c (__rintf): Likewise. * sysdeps/powerpc/powerpc32/fpu/s_rint.S: Remove file. * sysdeps/powerpc/powerpc32/fpu/s_rintf.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_rint.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_rintf.S: Likewise. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
* [powerpc] get_rounding_mode: utilize faster method to get rounding modePaul A. Clarke2019-06-061-0/+33
| | | | | | | | | Add support to use 'mffsl' instruction if compiled for POWER9 (or later). Also, mask the result to avoid bleeding unrelated bits into the result of _FPU_GET_RC(). Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
* [powerpc] fegetexcept: utilize function instead of duplicating codePaul A. Clarke2019-06-051-13/+1
| | | | | | | | | | fegetexcept() included code which exactly duplicates the code in fenv_reg_to_exceptions(). Replace with a call to that function. 2019-06-05 Paul A. Clarke <pc@us.ibm.com> * sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Replace code with call to equivalent function.
* powerpc: generic nearbyint/nearbyintfAdhemerval Zanella2019-05-283-7/+75
| | | | | | | | | | | | | | This patches consolidates all the powerpc nearbyint{f} implementations on the generic sysdeps/powerpc/fpu/s_nearbyint{f}. * sysdeps/powerpc/fpu/round_to_integer.h (set_fenv_mode): Add NEARBYINT handling. * sysdeps/powerpc/fpu/s_nearbyint.c: New file. * sysdeps/powerpc/fpu/s_nearbyintf.c: Likewise. * sysdeps/powerpc/powerpc32/fpu/s_nearbyint.S: Remove file. * sysdeps/powerpc/powerpc32/fpu/s_nearbyintf.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_nearbyint.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_nearbyintf.S: Likewise.
* powerpc: trunc/truncf refactorAdhemerval Zanella2019-05-093-1/+73
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patches consolidates all the powerpc trunc{f} implementations on the generic sysdeps/powerpc/fpu/s_trunc{f}. The generic implementation uses either the compiler builts for ISA 2.03+ (which generates the frim instruction) or a generic implementation which uses FP only operations. The IFUNC organization for powerpc64 is also change to be enabled only for powerpc64 and not for powerpc64le (since minium ISA of 2.08 does not require the fallback generic implementation). Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/fpu/trunc_to_integer.h (set_fenv_mode): Add TRUNC handling. (round_mode): Add definition for TRUNC. * sysdeps/powerpc/fpu/s_trunc.c: New file. * sysdeps/powerpc/fpu/s_truncf.c: New file. * sysdeps/powerpc/powerpc32/fpu/s_trunc.S: Remove file. * sysdeps/powerpc/powerpc32/fpu/s_truncf.S: Likewise. * sysdep/powerpc/powepc32/power4/fpu/multiarch/s_trunc-power5+.S: Likewise. * sysdep/powerpc/powepc32/power4/fpu/multiarch/s_trunc-ppc32.S: Likewise. * sysdep/powerpc/powepc32/power4/fpu/multiarch/s_truncf-power5+.S: Likewise. * sysdep/powerpc/powepc32/power4/fpu/multiarch/s_truncf-ppc32.S: Likewise. * sysdep/powerpc/powepc32/power4/fpu/multiarch/s_trunc-power5+.c: New file. * sysdep/powerpc/powepc32/power4/fpu/multiarch/s_trunc-ppc32.c: Likewise. * sysdep/powerpc/powepc32/power4/fpu/multiarch/s_truncf-power5+.c: Likewise. * sysdep/powerpc/powepc32/power4/fpu/multiarch/s_truncf-ppc32.c: Likewise. * sysdep/powerpc/powerpc32/power5+/fpu/s_trunc.S: Remove file. * sysdep/powerpc/powerpc32/power5+/fpu/s_truncf.S: Likewise. * sysdep/powerpc/powerpc64/be/fpu/multiarch/Makefile (libm-sysdep_routines): Add s_trunc-power5+, s_trunc-ppc64, s_truncf-power5+, and s_truncf-ppc64. (CFLAGS-s_trunc-power5+.c, CFLAGS-s_truncf-power5+.c): New rule. * sysdep/powerpc/powercp64/be/fpu/multiarch/s_trunc-power5+.c: New file. * sysdep/powerpc/powercp64/be/fpu/multiarch/s_trunc-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_trunc.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_trunc.c: ... here. * sysdep/powerpc/powercp64/be/fpu/multiarch/s_truncf-power5+.c: New file. * sysdep/powerpc/powercp64/be/fpu/multiarch/s_truncf-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_truncf.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_truncf.c: ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (libm-sysdep_routines): Remove s_trunc-power5+, s_trunc-ppc64, s_truncf-power5+, and s_truncf-ppc64. * sysdep/powerpc/powerpc64/fpu/multiarch/s_trunc-power5+.S: Remove file. * sysdep/powerpc/powerpc64/fpu/multiarch/s_trunc-ppc64.S: Likewise. * sysdep/powerpc/powerpc64/fpu/multiarch/s_truncf-power5+.S: Likewise. * sysdep/powerpc/powerpc64/fpu/multiarch/s_truncf-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_trunc.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_truncf.S: Likewise. * sysdep/powerpc/powerpc64/power5+/fpu/s_trunc.S: Likewise. * sysdep/powerpc/powerpc64/power5+/fpu/s_truncf.S: Likewise. Reviewed-by: Gabriel F. T. Gomes <gabriel@inconstante.eti.br>
* powerpc: round/roundf refactorAdhemerval Zanella2019-05-093-1/+89
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patches consolidates all the powerpc round{f} implementations on the generic sysdeps/powerpc/fpu/s_round{f}. The generic implementation uses either the compiler builts for ISA 2.03+ (which generates the frim instruction) or a generic implementation which uses FP only operations. The IFUNC organization for powerpc64 is also change to be enabled only for powerpc64 and not for powerpc64le (since minium ISA of 2.08 does not require the fallback generic implementation). Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/fpu/round_to_integer.h (set_fenv_mode): Add ROUND handling. (round_mode): Add definition for ROUND. (round_to_integer_float): Likewise. * sysdeps/powerpc/fpu/s_round.c: New file. * sysdeps/powerpc/fpu/s_roundf.c: New file. * sysdeps/powerpc/powerpc32/fpu/s_round.S: Remove file. * sysdeps/powerpc/powerpc32/fpu/s_roundf.S: Likewise. * sysdep/powerpc/powepc32/power4/fpu/multiarch/s_round-power5+.S: Likewise. * sysdep/powerpc/powepc32/power4/fpu/multiarch/s_round-ppc32.S: Likewise. * sysdep/powerpc/powepc32/power4/fpu/multiarch/s_roundf-power5+.S: Likewise. * sysdep/powerpc/powepc32/power4/fpu/multiarch/s_roundf-ppc32.S: Likewise. * sysdep/powerpc/powepc32/power4/fpu/multiarch/s_round-power5+.c: New file. * sysdep/powerpc/powepc32/power4/fpu/multiarch/s_round-ppc32.c: Likewise. * sysdep/powerpc/powepc32/power4/fpu/multiarch/s_roundf-power5+.c: Likewise. * sysdep/powerpc/powepc32/power4/fpu/multiarch/s_roundf-ppc32.c: Likewise. * sysdep/powerpc/powerpc32/power5+/fpu/s_round.S: Remove file. * sysdep/powerpc/powerpc32/power5+/fpu/s_roundf.S: Likewise. * sysdep/powerpc/powerpc64/be/fpu/multiarch/Makefile (libm-sysdep_routines): Add s_round-power5+, s_round-ppc64, s_roundf-power5+, and s_roundf-ppc64. (CFLAGS-s_round-power5+.c, CFLAGS-s_roundf-power5+.c): New rule. * sysdep/powerpc/powercp64/be/fpu/multiarch/s_round-power5+.c: New file. * sysdep/powerpc/powercp64/be/fpu/multiarch/s_round-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_round.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_round.c: ... here. * sysdep/powerpc/powercp64/be/fpu/multiarch/s_roundf-power5+.c: New file. * sysdep/powerpc/powercp64/be/fpu/multiarch/s_roundf-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_roundf.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_roundf.c: ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (libm-sysdep_routines): Remove s_round-power5+, s_round-ppc64, s_roundf-power5+, and s_roundf-ppc64. * sysdep/powerpc/powerpc64/fpu/multiarch/s_round-power5+.S: Remove file. * sysdep/powerpc/powerpc64/fpu/multiarch/s_round-ppc64.S: Likewise. * sysdep/powerpc/powerpc64/fpu/multiarch/s_roundf-power5+.S: Likewise. * sysdep/powerpc/powerpc64/fpu/multiarch/s_roundf-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_round.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_roundf.S: Likewise. * sysdep/powerpc/powerpc64/power5+/fpu/s_round.S: Likewise. * sysdep/powerpc/powerpc64/power5+/fpu/s_roundf.S: Likewise. Reviewed-by: Gabriel F. T. Gomes <gabriel@inconstante.eti.br>
* powerpc: floor/floorf refactorAdhemerval Zanella2019-05-093-1/+73
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patches consolidates all the powerpc floor{f} implementations on the generic sysdeps/powerpc/fpu/s_floor{f}. The generic implementation uses either the compiler builts for ISA 2.03+ (which generates the frim instruction) or a generic implementation which uses FP only operations. The IFUNC organization for powerpc64 is also change to be enabled only for powerpc64 and not for powerpc64le (since minium ISA of 2.08 does not require the fallback generic implementation). Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/fpu/round_to_integer.h (set_fenv_mode): Add FLOOR option. (round_mode): Add definition for FLOOR. * sysdeps/powerpc/fpu/s_floor.c: New file. * sysdeps/powerpc/fpu/s_floorf.c: Likewise. * sysdeps/powerpc/powerpc32/fpu/s_floor.S: Remove file. * sysdeps/powerpc/powerpc32/fpu/s_floorf.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floor-power5+.S: Remove file. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floor-ppc32.S: Likewise * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floorf-power5+.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floorf-ppc32.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floor-power5+.c: New file. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floor-ppc32.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floorf-power5+.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floorf-ppc32.c: Likewise. * sysdeps/powerpc/powerpc32/power5+/fpu/s_floor.S: Remove file. * sysdeps/powerpc/powerpc32/power5+/fpu/s_floorf.S: Remove file. * sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile (libm-sysdep_routines): Add s_floor-power5+, s_floor-ppc64, s_floorf-power5+, and s_floorf-ppc64. (CFLAGS-s_floor-power5+.c, CFLAGS-s_floorf-power5+.c): New rule. * sysdep/powerpc/powerpc64/be/fpu/multiarch/s_floor-power5+.c: New file. * sysdep/powerpc/powerpc64/be/fpu/multiarch/s_floor-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_floor.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_floor.c: ... here. * sysdep/powerpc/powerpc64/be/fpu/multiarch/s_floorf-power5+.c: New file. * sysdep/powerpc/powerpc64/be/fpu/multiarch/s_floorf-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_floorf.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_floorf.c: ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (libm-sysdep_routines): Remove s_floor-power5+, s_floor-ppc64, s_floorf-power5+, and s_floorf-ppc64. * sysdep/powerpc/powerpc64/fpu/multiarch/s_floor-power5+.S: Remove file. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_floor-ppc64.S: Remove file. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_floorf-power5+.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_floorf-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_floor.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_floorf.S: Likewise. * sysdeps/powerpc/powerpc64/power5+/fpu/s_floor.S: Likewise. * sysdeps/powerpc/powerpc64/power5+/fpu/s_floorf.S: Likewise. Reviewed-by: Gabriel F. T. Gomes <gabriel@inconstante.eti.br>
* powerpc: ceil/ceilf refactorAdhemerval Zanella2019-04-294-0/+183
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patches consolidates all the powerpc ceil{f} implementations on the generic sysdeps/powerpc/fpu/s_ceil{f}. The generic implementation uses either the compiler builts for ISA 2.03+ (which generates the frip instruction) or a generic implementation which uses FP only operations. It adds a generic implementation (round_to_integer.h) which is shared with other rounding to integer routines. The resulting code should be similar in term os performance to previous assembly one. The IFUNC organization for powerpc64 is also change to be enabled only for powerpc64 and not for powerpc64le (since minium ISA of 2.08 does not require the fallback generic implementation). Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/fpu/fenv_libc.h (__fesetround_inline_nocheck): New function. * sysdeps/powerpc/fpu/round_to_integer.h: New file. * sysdeps/powerpc/fpu/s_ceil.c: Likewise. * sysdeps/powerpc/fpu/s_ceilf.c: Likewise. * sysdeps/powerpc/powerpc32/fpu/s_ceil.S: Remove file. * sysdeps/powerpc/powerpc32/fpu/s_ceilf.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile (CFLAGS-s_ceil-power5+.c, CFLAGS-s_ceilf-power5+.c): New rule. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceil-power5+.S: Remove file. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceil-ppc32.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceilf-power5+.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceilf-ppc32.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceil-power5+.c: New file. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceil-ppc32.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceilf-power5+.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceilf-ppc32.c: Likewise. * sysdeps/powerpc/powerpc32/power5+/fpu/s_ceil.S: Remove file. * sysdeps/powerpc/powerpc32/power5+/fpu/s_ceilf.S: Likewise. * sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile: New file. * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_ceil-power5+.c: Likewise. * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_ceil-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceil.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_ceil.c: ... here. * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_ceilf-power5+.c: New file. * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_ceilf-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceilf.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_ceilf.c: ... * here. * sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (libm-sysdep_routines): Remove s_ceil-power5+, s_ceil-ppc64, s_ceilf-power5+, and s_ceilf-ppc64. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceil-power5+.S: Remove file. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceil-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceilf-power5+.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceilf-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_ceil.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_ceilf.S: Likewise. * sysdeps/powerpc/powerpc64/power5+/fpu/s_ceil.S: Likewise. * sysdeps/powerpc/powerpc64/power5+/fpu/s_ceilf.S: Likewise. Reviewed-by: Gabriel F. T. Gomes <gabriel@inconstante.eti.br>
* powerpc: Fix format issue from 3a16dd780eeba602Adhemerval Zanella2019-04-172-2/+4
| | | | | * sysdeps/powerpc/fpu/s_fma.c: Fix format. * sysdeps/powerpc/fpu/s_fmaf.c: Likewise.
* powerpc: fma using builtinsAdhemerval Zanella2019-04-172-14/+10
| | | | | | | | | | | | | | | This patch just refactor the assembly implementation to use compiler builtins instead. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/fpu/s_fma.S: Remove file. * sysdeps/powerpc/fpu/s_fmaf.S: Likewise. * sysdeps/powerpc/fpu/s_fma.c: New file. * sysdeps/powerpc/fpu/s_fmaf.c: Likewise.
* powerpc: Use generic fabs{f} implementationsAdhemerval Zanella2019-04-172-34/+0
| | | | | | | | | | | | | Since be2e25bbd78f9fdf the generic ieee754 implementation uses compiler builtin which generates fabs{f} for all supported targets. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/fpu/s_fabs.S: Remove file. * sysdeps/powerpc/fpu/s_fabsf.S: Likewise.
* [powerpc] Use __builtin_{mffs,mtfsf}Paul A. Clarke2019-03-292-10/+5
| | | | | | | | | | | | | | | | | | | Replace inline asm uses of the "mffs" and "mtfsf" instructions with the analogous GCC builtins. __builtin_mffs and __builtin_mtfsf are both available in GCC 5 and above. Given the minimum GCC level for GLibC is now GCC 6.2, it is safe to use these builtins without restriction. 2019-03-29 Paul A. Clarke <pc@us.ibm.com> * sysdeps/powerpc/fpu/fenv_libc.h (fegetenv_register): Replace inline asm with builtin. * sysdeps/powerpc/powerpc64/le/fpu/sfp-machine.h (FP_INIT_ROUNDMODE): Likewise. * sysdeps/powerpc/fpu/tst-setcontext-fpscr.c (_GET_DI_FPSCR): Likewise. (_GET_SI_FPSCR): Likewise. (_SET_SI_FPSCR): Likewise.
* powerpc: Remove ununsed s_float_bitwise.hAdhemerval Zanella2019-03-251-115/+0
| | | | | | | | | | | This file is not used anywhere since removal of {k,e}_rem_pio2f.c (commit ca3aac57efa89). Checked with a build for powerpc-linux-gnu (with --with-cpu=power4 and --with-cpu=power7), powerpc64-linux-gnu (with --with-cpu=power4 and --with-cpu=power7), and powerpc64le-linux (with --with-cpu=power8). * sysdeps/powerpc/fpu/s_float_bitwise.h: Remove file.