Fix ffma use of round-to-odd on x86 - mirror/glibc - mirror of git://sourceware.org/git/glibc.git

author	Joseph Myers <joseph@codesourcery.com>	2021-09-23 21:18:31 +0000
committer	Joseph Myers <joseph@codesourcery.com>	2021-09-23 21:18:31 +0000
commit	4ed7a383f9a8468194ccaebba3f0fa659003888d (patch)
tree	2413859edc62b3faf11f7bd70f52dae525a83a50 /resource/ulimit.h
parent	475b0b92e079c67ea8a25ec05fe0b17fdd935e12 (diff)

Fix ffma use of round-to-odd on x86

On 32-bit x86 with -mfpmath=sse, and on x86_64 with
--disable-multi-arch, the tests of ffma and its aliases (fma narrowing
from binary64 to binary32) fail.  This is probably the issue reported
by H.J. in
<https://sourceware.org/pipermail/libc-alpha/2021-September/131277.html>.

The problem is the use of fenv_private.h macros in the round-to-odd
implementation.  Those macros are set up to manipulate only one of the
SSE and 387 floating-point states, whichever is relevant for the type
indicated by the suffix on the macro name.  But x86 configurations
sometimes use the ldbl-96 implementation of binary64 fma, contrary to
the expectations of those macros for functions operating on double
when __SSE2_MATH__ is defined.  (That's where --disable-multi-arch is
relevant for x86_64: it causes the ldbl-96 implementation to be used,
instead of an IFUNC implementation that falls back to the dbl-64
version.)
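
To illustrate why touching only one state is a problem, here is a
minimal sketch (not glibc code; the function names are hypothetical
and GCC-style inline asm is assumed) showing that on x86 the SSE
rounding mode in MXCSR and the 387 rounding mode in the x87 control
word are independent.  Setting round-toward-zero in MXCSR alone leaves
any arithmetic carried out on the 387 unit, such as an ldbl-96 fma, in
its previous rounding mode.

```c
#include <stdint.h>

/* Rounding-control (RC) field positions: bits 10-11 of the x87 control
   word, bits 13-14 of MXCSR.  RC == 3 means round toward zero.  */
#define X87_RC_SHIFT 10
#define SSE_RC_SHIFT 13

#if defined (__x86_64__) || (defined (__i386__) && defined (__SSE__))

static uint16_t
read_x87_cw (void)
{
  uint16_t cw;
  __asm__ volatile ("fnstcw %0" : "=m" (cw));
  return cw;
}

static uint32_t
read_mxcsr (void)
{
  uint32_t csr;
  __asm__ volatile ("stmxcsr %0" : "=m" (csr));
  return csr;
}

static void
write_mxcsr (uint32_t csr)
{
  __asm__ volatile ("ldmxcsr %0" : : "m" (csr));
}

/* Hypothetical demo: returns 1 if changing only the SSE rounding mode
   left the 387 rounding mode untouched, 0 otherwise.  */
int
sse_only_change_leaves_x87_alone (void)
{
  uint32_t saved = read_mxcsr ();
  unsigned int x87_before = (read_x87_cw () >> X87_RC_SHIFT) & 3;

  write_mxcsr (saved | (3u << SSE_RC_SHIFT));  /* SSE: toward zero.  */
  unsigned int sse_rc = (read_mxcsr () >> SSE_RC_SHIFT) & 3;
  unsigned int x87_after = (read_x87_cw () >> X87_RC_SHIFT) & 3;

  write_mxcsr (saved);                         /* Restore SSE state.  */
  return sse_rc == 3 && x87_after == x87_before;
}

#else

/* Not x86 with SSE: there is no second state to compare against.  */
int
sse_only_change_leaves_x87_alone (void)
{
  return -1;
}

#endif
```

On an x86 build with SSE the function is expected to return 1; on
other architectures the sketch simply returns -1.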

This can be addressed by using the default versions of those macros
(giving x86 its own version of s_ffma.c), as is done for the *f128
macro variants, where which floating-point state is affected by
_Float128 arithmetic depends on how GCC was configured when building
libgcc.  The issue only applies when __SSE2_MATH__ is defined, and
doesn't apply when __FP_FAST_FMA is defined: in that case, fma will be
inlined by the compiler, meaning it's definitely an SSE operation.
(For the same reason, this is not an issue for narrowing sqrt, as
hardware sqrt is always inlined in that implementation for x86.)  In
other cases it's safest to use the default versions of the
fenv_private.h macros to ensure things work whichever fma
implementation is used.

Tested for x86_64 (with and without --disable-multi-arch) and x86
(with and without -mfpmath=sse).
