author: Szabolcs Nagy <szabolcs.nagy@arm.com> 2018-06-13 17:40:19 +0100
committer: Szabolcs Nagy <szabolcs.nagy@arm.com> 2018-09-12 17:33:30 +0100
commit: f41b0a43e426831e391cafd8d0bd47a3efa4a840
tree: b60d5ee7f8c6f496089d9cd45b048e072ac1a564 /NEWS
parent: 5a274db4ea363d6b0b92933f085a92daaf1be2f2
Add new log implementation
Optimized log using a carefully generated lookup table with 1/c and log(c)
values for small intervals around 1.  Each log(c) is very near a double
precision value, so it has about 62 bits of precision.  The algorithm is
log(2^k x) = k log(2) + log(c) + log(x/c), where the last term is
approximated by a polynomial in x/c - 1.  Near 1 a single polynomial in
x - 1 is used.

There is a separate code path for when an fma instruction is not
available for computing x/c - 1 precisely; in that case the table size
is doubled.  The code uses __builtin_fma under __FP_FAST_FMA to ensure
it is inlined as an instruction.
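
A minimal sketch of how the two paths might be selected at compile time.
The function name sketch_reduce and the chi/clo parameters are
hypothetical: the commit message only says that the table is doubled
without fma, and storing c itself as a high and low part is one plausible
use of the extra entries, assumed here for illustration.

```c
/* Compute r = z/c - 1 given invc ~ 1/c.  chi + clo is assumed to
   represent c to more than double precision (hypothetical layout).  */
static double
sketch_reduce (double z, double invc, double chi, double clo)
{
#ifdef __FP_FAST_FMA
  /* One fused operation: z * invc - 1 with a single rounding.  */
  (void) chi;
  (void) clo;
  return __builtin_fma (z, invc, -1.0);
#else
  /* No fast fma: subtract c first (exact or nearly so for z close to
     c), then scale by 1/c.  */
  return (z - chi - clo) * invc;
#endif
}
```

Guarding on __FP_FAST_FMA rather than calling fma unconditionally avoids
falling back to a slow software fma on targets without the instruction.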

With the default configuration settings the worst case error is 0.519 ULP
(0.520 without fma) and the rodata size is 2192 bytes (4240 without fma).
In non-nearest rounding modes the error is less than 1 ULP.
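
For readers unfamiliar with the unit, the following sketch shows one way
an error in ULP could be measured against a higher precision reference.
It is not the method used to obtain the numbers above; the helper name
sketch_ulp_error is hypothetical, and it assumes a nonzero reference in
the normal double range and a long double type wider than double.

```c
#include <math.h>

/* Distance between GOT and the reference WANT, in units of the spacing
   of doubles near WANT.  */
double
sketch_ulp_error (double got, long double want)
{
  int e;
  frexp ((double) want, &e);	/* |want| lies in [2^(e-1), 2^e).  */
  long double ulp = ldexpl (1.0L, e - 53);	/* Spacing in that binade.  */
  return (double) (fabsl ((long double) got - want) / ulp);
}
```

For example, sketch_ulp_error (log (x), logl (x)) gives an estimate for a
single input on targets where long double is wider than double.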

Improvements on Cortex-A72 compared to current glibc master:
log throughput: 3.28x in [0.01 11.1]
log latency: 2.23x in [0.01 11.1]
log throughput: 1.56x in [0.999 1.001]
log latency: 1.57x in [0.999 1.001]

Tested on
aarch64-linux-gnu (defined __FP_FAST_FMA)
arm-linux-gnueabihf (!defined __FP_FAST_FMA)
x86_64-linux-gnu (!defined __FP_FAST_FMA)
powerpc64le-linux-gnu (defined __FP_FAST_FMA)
targets.

	* NEWS: Mention log improvement.
	* math/Makefile (type-double-routines): Add e_log_data.
	* sysdeps/i386/fpu/e_log_data.c: New file.
	* sysdeps/ia64/fpu/e_log_data.c: New file.
	* sysdeps/ieee754/dbl-64/e_log.c: Rewrite.
	* sysdeps/ieee754/dbl-64/e_log_data.c: New file.
	* sysdeps/ieee754/dbl-64/math_config.h (__log_data): Add.
	* sysdeps/ieee754/dbl-64/ulog.h: Remove.
	* sysdeps/ieee754/dbl-64/ulog.tbl: Remove.
	* sysdeps/m68k/m680x0/fpu/e_log_data.c: New file.
Diffstat (limited to 'NEWS')
-rw-r--r--  NEWS  2
1 file changed, 1 insertion, 1 deletion
diff --git a/NEWS b/NEWS
index 085325ab87..79bee8ee6b 100644
--- a/NEWS
+++ b/NEWS
@@ -16,7 +16,7 @@ Major new features:
   to set the install root if you wish to install into a non-default
   configured location.
 
-* Optimized generic exp, exp2, sinf, cosf, sincosf and tanf.
+* Optimized generic exp, exp2, log, sinf, cosf, sincosf and tanf.
 
 * The reallocarray function is now declared under _DEFAULT_SOURCE, not just
   for _GNU_SOURCE, to match BSD environments.