author    Joseph Myers <joseph@codesourcery.com>  2015-08-14 17:15:06 +0000
committer Joseph Myers <joseph@codesourcery.com>  2015-08-14 17:15:06 +0000
commit    739babd775d4b69525e3494cad4310742a9b360a
tree      fbfae14704a97c927cb31aa5c2976fe71f8ca1e1 /sysdeps
parent    db7f8c8fe0575fa4c441c2f04e666a01cb06f0cc

Fix fma spurious underflows (bug 18824).

Various fma implementations have logic that, when computing fma (x, y,
z) where z is large (so care needs taking to avoid internal overflow)
but x * y is small, scale x * y up instead of down to avoid internal
underflows resulting from scaling down.  (In these cases, x * y is
small enough that only its sign actually matters rather than the exact
value.)

The old threshold for scaling up instead of down was the right one for
the condition "if the unscaled values were multiplied, the low part of
the multiplication could underflow", and the scaling up was sufficient
to ensure that the low part of the multiplication did not underflow
(given that cases of very small x * y, less than half the least
subnormal, had already been dealt with).  However, the functions do
not choose between scaling up and not scaling; they choose between
scaling up and scaling down (scaling down being needed when x * y is
not so small compared to z, so that its exact value does matter).  A
larger threshold is therefore needed to ensure that scaling down never
produces values for which the low part of the multiplication
underflows.  This patch increases the thresholds accordingly.
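
The effect of the threshold change on the double version can be
sketched as a pair of predicates over the biased exponent fields of x
and y.  This is only an illustration of the comparison shown in the
diff, not the glibc code: the helper names are hypothetical,
DOUBLE_BIAS is assumed to equal glibc's IEEE754_DOUBLE_BIAS (0x3ff),
and double is assumed to be IEEE 754 binary64.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>
#include <string.h>

/* Assumed value of glibc's IEEE754_DOUBLE_BIAS for binary64.  */
#define DOUBLE_BIAS 0x3ff

/* Hypothetical helper: the 11-bit biased exponent field of d.  */
static int
biased_exponent (double d)
{
  uint64_t bits;

  assert (sizeof bits == sizeof d);
  memcpy (&bits, &d, sizeof bits);
  return (int) ((bits >> 52) & 0x7ff);
}

/* Comparison used before the patch to choose scaling up.  */
int
scales_up_old (double x, double y)
{
  return (biased_exponent (x) + biased_exponent (y)
	  <= DOUBLE_BIAS + DBL_MANT_DIG);
}

/* Comparison used after the patch to choose scaling up.  */
int
scales_up_new (double x, double y)
{
  return (biased_exponent (x) + biased_exponent (y)
	  <= DOUBLE_BIAS + 2 * DBL_MANT_DIG);
}
```

With these definitions, x = y = 0x1p-470 has a biased exponent sum of
553 + 553 = 1106, which lies above the old bound (1023 + 53 = 1076)
but within the new one (1023 + 106 = 1129).  Pairs in that gap are the
ones that move from the scale-down path to the scale-up path.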

Tested for x86_64, x86 and mips64 (with the MIPS version of s_fmal.c
removed so that the ldbl-128 version gets tested instead of the
soft-fp one).

	[BZ #18824]
	* sysdeps/ieee754/dbl-64/s_fma.c (__fma): Increase threshold for
	scaling x * y up instead of down.
	* sysdeps/ieee754/ldbl-128/s_fmal.c (__fmal): Likewise.
	* sysdeps/ieee754/ldbl-96/s_fmal.c (__fmal): Likewise.
	* math/auto-libm-test-in: Add more tests of fma.
	* math/auto-libm-test-out: Regenerated.

Diffstat (limited to 'sysdeps')
-rw-r--r--  sysdeps/ieee754/dbl-64/s_fma.c     2
-rw-r--r--  sysdeps/ieee754/ldbl-128/s_fmal.c  2
-rw-r--r--  sysdeps/ieee754/ldbl-96/s_fmal.c   2
3 files changed, 3 insertions, 3 deletions

diff --git a/sysdeps/ieee754/dbl-64/s_fma.c b/sysdeps/ieee754/dbl-64/s_fma.c
index 716b41273d..278b690f9b 100644
--- a/sysdeps/ieee754/dbl-64/s_fma.c
+++ b/sysdeps/ieee754/dbl-64/s_fma.c
@@ -117,7 +117,7 @@ __fma (double x, double y, double z)
 	     very small, adjust them up to avoid spurious underflows,
 	     rather than down.  */
 	  if (u.ieee.exponent + v.ieee.exponent
-	      <= IEEE754_DOUBLE_BIAS + DBL_MANT_DIG)
+	      <= IEEE754_DOUBLE_BIAS + 2 * DBL_MANT_DIG)
 	    {
 	      if (u.ieee.exponent > v.ieee.exponent)
 		u.ieee.exponent += 2 * DBL_MANT_DIG + 2;
diff --git a/sysdeps/ieee754/ldbl-128/s_fmal.c b/sysdeps/ieee754/ldbl-128/s_fmal.c
index b13178ffe7..5abc9105e5 100644
--- a/sysdeps/ieee754/ldbl-128/s_fmal.c
+++ b/sysdeps/ieee754/ldbl-128/s_fmal.c
@@ -121,7 +121,7 @@ __fmal (long double x, long double y, long double z)
 	     very small, adjust them up to avoid spurious underflows,
 	     rather than down.  */
 	  if (u.ieee.exponent + v.ieee.exponent
-	      <= IEEE854_LONG_DOUBLE_BIAS + LDBL_MANT_DIG)
+	      <= IEEE854_LONG_DOUBLE_BIAS + 2 * LDBL_MANT_DIG)
 	    {
 	      if (u.ieee.exponent > v.ieee.exponent)
 		u.ieee.exponent += 2 * LDBL_MANT_DIG + 2;
diff --git a/sysdeps/ieee754/ldbl-96/s_fmal.c b/sysdeps/ieee754/ldbl-96/s_fmal.c
index eec5a02e0b..1232c9ebad 100644
--- a/sysdeps/ieee754/ldbl-96/s_fmal.c
+++ b/sysdeps/ieee754/ldbl-96/s_fmal.c
@@ -119,7 +119,7 @@ __fmal (long double x, long double y, long double z)
 	     very small, adjust them up to avoid spurious underflows,
 	     rather than down.  */
 	  if (u.ieee.exponent + v.ieee.exponent
-	      <= IEEE854_LONG_DOUBLE_BIAS + LDBL_MANT_DIG)
+	      <= IEEE854_LONG_DOUBLE_BIAS + 2 * LDBL_MANT_DIG)
 	    {
 	      if (u.ieee.exponent > v.ieee.exponent)
 		u.ieee.exponent += 2 * LDBL_MANT_DIG + 2;