author | Joseph Myers <joseph@codesourcery.com> | 2021-09-23 21:18:31 +0000 |
---|---|---|
committer | Joseph Myers <joseph@codesourcery.com> | 2021-09-23 21:18:31 +0000 |
commit | 4ed7a383f9a8468194ccaebba3f0fa659003888d (patch) | |
tree | 2413859edc62b3faf11f7bd70f52dae525a83a50 /resource/ulimit.h | |
parent | 475b0b92e079c67ea8a25ec05fe0b17fdd935e12 (diff) | |
Fix ffma use of round-to-odd on x86
On 32-bit x86 with -mfpmath=sse, and on x86_64 with --disable-multi-arch, the tests of ffma and its aliases (fma narrowing from binary64 to binary32) fail. This is probably the issue reported by H.J. in <https://sourceware.org/pipermail/libc-alpha/2021-September/131277.html>.

The problem is the use of fenv_private.h macros in the round-to-odd implementation. Those macros are set up to manipulate only one of the SSE and 387 floating-point state, whichever is relevant for the type indicated by the suffix on the macro name. But x86 configurations sometimes use the ldbl-96 implementation of binary64 fma (that's where --disable-multi-arch is relevant for x86_64: it causes the ldbl-96 implementation to be used, instead of an IFUNC implementation that falls back to the dbl-64 version), contrary to the expectations of those macros for functions operating on double when __SSE2_MATH__ is defined.

This can be addressed by using the default versions of those macros (giving x86 its own version of s_ffma.c), as is done for the *f128 macro variants where it depends on the details of how GCC was configured when building libgcc which floating-point state is affected by _Float128 arithmetic. The issue only applies when __SSE2_MATH__ is defined, and doesn't apply when __FP_FAST_FMA is defined (because in that case, fma will be inlined by the compiler, meaning it's definitely an SSE operation; for the same reason, this is not an issue for narrowing sqrt, as hardware sqrt is always inlined in that implementation for x86), but in other cases it's safest to use the default versions of the fenv_private.h macros to ensure things work whichever fma implementation is used.

Tested for x86_64 (with and without --disable-multi-arch) and x86 (with and without -mfpmath=sse).
Diffstat (limited to 'resource/ulimit.h')
0 files changed, 0 insertions, 0 deletions