Speedup tanf range reduction

Speedup tanf range reduction by using the new sincosf range reduction
algorithm.  Overall code quality is improved due to inlining, so there
is a speedup even if no range reduction is required.

tanf throughput gains on Cortex-A72:

* |x| < M_PI_4  : 1.1x
* |x| < M_PI_2  : 1.2x
* |x| < 2 * M_PI: 1.5x
* |x| < 120.0   : 1.6x
* |x| < Inf     : 12.1x

	* sysdeps/ieee754/flt-32/s_tanf.c (__tanf): Use fast range reduction.
author: Wilco Dijkstra <wdijkstr@arm.com> 2018-08-23 12:38:16 +0100
committer: Wilco Dijkstra <wdijkstr@arm.com> 2018-08-23 12:38:16 +0100
commit: 900fb446eb8172c54cdaed85107bc783ee50673a (patch)
tree: 80ecbb5c9245698a5b5c70b1070f621e77834dec
parent: 561b0bec4448f0302cb4915bf67c919bde4a1c57 (diff)
download: glibc-900fb446eb8172c54cdaed85107bc783ee50673a.tar.gz
glibc-900fb446eb8172c54cdaed85107bc783ee50673a.tar.xz
glibc-900fb446eb8172c54cdaed85107bc783ee50673a.zip
2 files changed, 32 insertions, 1 deletions
diff --git a/ChangeLog b/ChangeLog
index a22301f323..2c902f2b98 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,7 @@
+2018-08-23  Wilco Dijkstra  <wdijkstr@arm.com>
+
+	* sysdeps/ieee754/flt-32/s_tanf.c (__tanf): Use fast range reduction.
+
 2018-08-22  DJ Delorie  <dj@redhat.com>
 
 	* Makefile (testroot.pristine): New rules to initialize the
diff --git a/sysdeps/ieee754/flt-32/s_tanf.c b/sysdeps/ieee754/flt-32/s_tanf.c
index ba3af54913..fd104103ad 100644
--- a/sysdeps/ieee754/flt-32/s_tanf.c
+++ b/sysdeps/ieee754/flt-32/s_tanf.c
@@ -21,6 +21,33 @@ static char rcsid[] = "$NetBSD: s_tanf.c,v 1.4 1995/05/10 20:48:20 jtc Exp $";
 #include <math.h>
 #include <math_private.h>
 #include <libm-alias-float.h>
+#include "s_sincosf.h"
+
+/* Reduce range of X to a multiple of PI/2.  The modulo result is between
+   -PI/4 and PI/4 and returned as a high part y[0] and a low part y[1].
+   The low bit in the return value indicates the first or 2nd half of tanf.  */
+static inline int32_t
+rem_pio2f (float x, float *y)
+{
+  double dx = x;
+  int n;
+  const sincos_t *p = &__sincosf_table[0];
+
+  if (__glibc_likely (abstop12 (x) < abstop12 (120.0f)))
+    dx = reduce_fast (dx, p, &n);
+  else
+    {
+      uint32_t xi = asuint (x);
+      int sign = xi >> 31;
+
+      dx = reduce_large (xi, &n);
+      dx = sign ? -dx : dx;
+    }
+
+  y[0] = dx;
+  y[1] = dx - y[0];
+  return n;
+}
 
 float __tanf(float x)
 {
@@ -42,7 +69,7 @@ float __tanf(float x)
 
     /* argument reduction needed */
 	else {
-	    n = __ieee754_rem_pio2f(x,y);
+	    n = rem_pio2f(x,y);
 	    return __kernel_tanf(y[0],y[1],1-((n&1)<<1)); /*   1 -- n even
 							      -1 -- n odd */
 	}
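
The new rem_pio2f chooses between reduce_fast and reduce_large by comparing abstop12 (x) with abstop12 (120.0f). The definitions of abstop12 and asuint live in s_sincosf.h and are not shown in this diff, so the sketch below is an assumption about their behaviour: the helper names ending in _sketch are hypothetical, and the bit layout assumed is IEEE 754 binary32 with the sign bit masked off.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for asuint: reinterpret the bits of a float.  */
static inline uint32_t
asuint_sketch (float x)
{
  uint32_t u;

  memcpy (&u, &x, sizeof u);
  return u;
}

/* Hypothetical stand-in for abstop12: exponent plus the top 3 mantissa
   bits, with the sign bit masked off.  For non-negative magnitudes this
   value grows monotonically with |x|, so an integer compare can replace
   a floating-point compare on |x|.  */
static inline uint32_t
abstop12_sketch (float x)
{
  return (asuint_sketch (x) >> 20) & 0x7ff;
}

/* Returns 1 when the slow path (reduce_large in the patch) would be
   taken under the assumptions above, 0 for the fast path.  */
static inline int
needs_large_reduction_sketch (float x)
{
  return abstop12_sketch (x) >= abstop12_sketch (120.0f);
}
```

Because 120.0f has zero bits below the retained mantissa bits, the truncated comparison coincides with |x| < 120.0f here; with a threshold that is not aligned this way, the cut-off would be coarser than the constant suggests.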