field | value | date |
---|---|---|
author | Szabolcs Nagy <szabolcs.nagy@arm.com> | 2018-06-13 17:40:19 +0100 |
committer | Szabolcs Nagy <szabolcs.nagy@arm.com> | 2018-09-12 17:33:30 +0100 |
commit | f41b0a43e426831e391cafd8d0bd47a3efa4a840 | |
tree | b60d5ee7f8c6f496089d9cd45b048e072ac1a564 /sysdeps/ieee754/dbl-64/math_config.h | |
parent | 5a274db4ea363d6b0b92933f085a92daaf1be2f2 | |
Add new log implementation
Optimized log using a carefully generated lookup table with 1/c and log(c) values for small intervals around 1. log(c) is very near a double precision value; it has about 62 bits of precision.

The algorithm is

log(2^k x) = k log(2) + log(c) + log(x/c),

where the last term is approximated by a polynomial of x/c - 1. Near 1 a single polynomial of x - 1 is used.

There is a separate code path for when the fma instruction is not available for computing x/c - 1 precisely, in which case the table size is doubled. The code uses __builtin_fma under __FP_FAST_FMA to ensure it is inlined as an instruction.

With the default configuration settings the worst case error is 0.519 ULP (and 0.520 without fma), and the rodata size is 2192 bytes (4240 without fma). The error in non-nearest rounding modes is less than 1 ULP.

Improvements on Cortex-A72 compared to current glibc master:

log throughput: 3.28x in [0.01 11.1]
log latency: 2.23x in [0.01 11.1]
log throughput: 1.56x in [0.999 1.001]
log latency: 1.57x in [0.999 1.001]

Tested on

aarch64-linux-gnu (defined __FP_FAST_FMA)
arm-linux-gnueabihf (!defined __FP_FAST_FMA)
x86_64-linux-gnu (!defined __FP_FAST_FMA)
powerpc64le-linux-gnu (defined __FP_FAST_FMA)

targets.

* NEWS: Mention log improvement.
* math/Makefile (type-double-routines): Add e_log_data.
* sysdeps/i386/fpu/e_log_data.c: New file.
* sysdeps/ia64/fpu/e_log_data.c: New file.
* sysdeps/ieee754/dbl-64/e_log.c: Rewrite.
* sysdeps/ieee754/dbl-64/e_log_data.c: New file.
* sysdeps/ieee754/dbl-64/math_config.h (__log_data): Add.
* sysdeps/ieee754/dbl-64/ulog.h: Remove.
* sysdeps/ieee754/dbl-64/ulog.tbl: Remove.
* sysdeps/m68k/m680x0/fpu/e_log_data.c: New file.
Diffstat (limited to 'sysdeps/ieee754/dbl-64/math_config.h')
-rw-r--r-- | sysdeps/ieee754/dbl-64/math_config.h | 16 |
1 files changed, 16 insertions, 0 deletions
diff --git a/sysdeps/ieee754/dbl-64/math_config.h b/sysdeps/ieee754/dbl-64/math_config.h
index 02d94fed3e..2eb793d4c8 100644
--- a/sysdeps/ieee754/dbl-64/math_config.h
+++ b/sysdeps/ieee754/dbl-64/math_config.h
@@ -133,4 +133,20 @@ extern const struct exp_data
   uint64_t tab[2*(1 << EXP_TABLE_BITS)];
 } __exp_data attribute_hidden;
 
+#define LOG_TABLE_BITS 7
+#define LOG_POLY_ORDER 6
+#define LOG_POLY1_ORDER 12
+extern const struct log_data
+{
+  double ln2hi;
+  double ln2lo;
+  double poly[LOG_POLY_ORDER - 1]; /* First coefficient is 1. */
+  double poly1[LOG_POLY1_ORDER - 1];
+  /* See e_log_data.c for details. */
+  struct {double invc, logc;} tab[1 << LOG_TABLE_BITS];
+#ifndef __FP_FAST_FMA
+  struct {double chi, clo;} tab2[1 << LOG_TABLE_BITS];
+#endif
+} __log_data attribute_hidden;
+
 #endif
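The rodata sizes quoted in the commit message (2192 bytes with fma, 4240 without) can be cross-checked against the struct layout in the diff. The sketch below mirrors the two variants of struct log_data as separate types. It assumes an 8-byte double and no padding between members, which holds on the common ABIs, and the names log_data_fma, log_data_nofma and the two size functions are hypothetical.

```c
#include <stddef.h>

#define LOG_TABLE_BITS 7
#define LOG_POLY_ORDER 6
#define LOG_POLY1_ORDER 12

/* Layout when __FP_FAST_FMA is defined: no tab2 member.  */
struct log_data_fma
{
  double ln2hi;
  double ln2lo;
  double poly[LOG_POLY_ORDER - 1];
  double poly1[LOG_POLY1_ORDER - 1];
  struct { double invc, logc; } tab[1 << LOG_TABLE_BITS];
};

/* Layout when __FP_FAST_FMA is not defined: tab2 doubles the table.  */
struct log_data_nofma
{
  double ln2hi;
  double ln2lo;
  double poly[LOG_POLY_ORDER - 1];
  double poly1[LOG_POLY1_ORDER - 1];
  struct { double invc, logc; } tab[1 << LOG_TABLE_BITS];
  struct { double chi, clo; } tab2[1 << LOG_TABLE_BITS];
};

/* (2 + 5 + 11) doubles = 144 bytes, plus 128 entries * 16 bytes = 2048.  */
size_t
log_data_size_fma (void)
{
  return sizeof (struct log_data_fma);
}

/* The same, plus another 128 entries * 16 bytes for tab2.  */
size_t
log_data_size_nofma (void)
{
  return sizeof (struct log_data_nofma);
}
```

Under those assumptions the first function returns 144 + 2048 = 2192 and the second returns 2192 + 2048 = 4240, matching the figures in the commit message.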