| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
| |
commit 3d9f171bfb5325bd5f427e9fc386453358c6e840
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Mon Feb 7 05:55:15 2022 -0800
x86-64: Optimize bzero
added the optimized bzero. Remove bzero weak alias in SS2 memset to
avoid undefined __bzero in memset-sse2-unaligned-erms.
|
|
|
|
|
| |
Move PI_STATIC_AND_HIDDEN and SUPPORT_STATIC_PIE to
sysdeps/x86/configure.ac.
|
|
|
|
|
|
|
|
|
|
|
| |
commit 3d9f171bfb5325bd5f427e9fc386453358c6e840
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Mon Feb 7 05:55:15 2022 -0800
x86-64: Optimize bzero
Remove setting the .text section for the code. This commit
adds that back.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Prelinked binaries and libraries still work, the dynamic tags
DT_GNU_PRELINKED, DT_GNU_LIBLIST, DT_GNU_CONFLICT just ignored
(meaning the process is reallocated as default).
The loader environment variable TRACE_PRELINKING is also removed,
since it used solely on prelink.
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
|
|
|
|
|
|
|
|
|
|
| |
memset with zero as the value to set is by far the majority value (99%+
for Python3 and GCC).
bzero can be slightly more optimized for this case by using a zero-idiom
xor for broadcasting the set value to a register (vector or GPR).
Co-developed-by: Noah Goldstein <goldstein.w.n@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
| |
commit b62ace2740a106222e124cc86956448fa07abf4d
Author: Noah Goldstein <goldstein.w.n@gmail.com>
Date: Sun Feb 6 00:54:18 2022 -0600
x86: Improve vec generation in memset-vec-unaligned-erms.S
Revert usage of 'pshufb' in broadcast logic as it is an SSSE3
instruction and memset.S is restricted to only SSE2 instructions.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
No bug.
Split vec generation into multiple steps. This allows the
broadcast in AVX2 to use 'xmm' registers for the L(less_vec)
case. This saves an expensive lane-cross instruction and removes
the need for 'vzeroupper'.
For SSE2 replace 2x 'punpck' instructions with zero-idiom 'pxor' for
byte broadcast.
Results for memset-avx2 small (geomean of N = 20 benchset runs).
size, New Time, Old Time, New / Old
0, 4.100, 3.831, 0.934
1, 5.074, 4.399, 0.867
2, 4.433, 4.411, 0.995
4, 4.487, 4.415, 0.984
8, 4.454, 4.396, 0.987
16, 4.502, 4.443, 0.987
All relevant string/wcsmbs tests are passing.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add vector tan/tanf and input files to libmvec microbenchmark.
libmvec-tan-inputs:
90% Normal random distribution
range: (-DBL_MAX, DBL_MAX)
mean: 0.0
sigma: 5.0
10% uniform random distribution in range (-1000.0, 1000.0)
libmvec-tanf-inputs:
90% Normal random distribution
range: (-FLT_MAX, FLT_MAX)
mean: 0.0f
sigma: 5.0f
10% uniform random distribution in range (-1000.0f, 1000.0f)
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add vector erfc/erfcf and input files to libmvec microbenchmark.
libmvec-erfc-inputs:
90% Normal random distribution
range: (-6.0, 6.0)
mean: 0.0
sigma: 1.0
10% uniform random distribution in range (-5.9, 5.9)
libmvec-erfcf-inputs:
90% Normal random distribution
range: (-4.0f, 4.0f)
mean: 0.0f
sigma: 1.0f
10% uniform random distribution in range (-3.9f, 3.9f)
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add vector asinh/asinhf and input files to libmvec microbenchmark.
libmvec-asinh-inputs:
90% Normal random distribution
range: (-DBL_MAX, DBL_MAX)
mean: 0.0
sigma: 2.0
10% uniform random distribution in range (-1.0e6, 1.0e6)
libmvec-asinhf-inputs:
90% Normal random distribution
range: (-FLT_MAX, FLT_MAX)
mean: 0.0f
sigma: 2.0f
10% uniform random distribution in range (-1.0e6f, 1.0e6f)
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add vector tanh/tanhf and input files to libmvec microbenchmark.
libmvec-tanh-inputs:
90% Normal random distribution
range: (-19.0, 19.0)
mean: 0.0
sigma: 2.0
10% uniform random distribution in range (-16.0, 16.0)
libmvec-tanhf-inputs:
90% Normal random distribution
range: (-10.0f, 10.0f)
mean: 0.0f
sigma: 2.0f
10% uniform random distribution in range (-8.0f, 8.0f)
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add vector erf/erff and input files to libmvec microbenchmark.
libmvec-erf-inputs:
90% Normal random distribution
range: (-6.0, 6.0)
mean: 0.0
sigma: 1.0
10% uniform random distribution in range (-5.9, 5.9)
libmvec-erff-inputs:
90% Normal random distribution
range: (-4.0f, 4.0f)
mean: 0.0f
sigma: 1.0f
10% uniform random distribution in range (-3.9f, 3.9f)
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add vector acosh/acoshf and input files to libmvec microbenchmark.
libmvec-acosh-inputs:
90% Normal random distribution
range: (1.0, DBL_MAX)
mean: 1.0
sigma: 8.0
10% uniform random distribution in range (1.0, 1.0e6)
libmvec-acoshf-inputs:
90% Normal random distribution
range: (1.0f, FLT_MAX)
mean: 1.0f
sigma: 4.0f
10% uniform random distribution in range (1.0f, 1.0e6f)
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add vector atanh/atanhf and input files to libmvec microbenchmark.
libmvec-atanh-inputs:
90% Normal random distribution
range: (-1.0, 1.0)
mean: 0.0
sigma: 1.0
10% uniform random distribution in range (-1.0, 1.0)
libmvec-atanhf-inputs:
90% Normal random distribution
range: (-1.0f, 1.0f)
mean: 0.0f
sigma: 1.0f
10% uniform random distribution in range (-1.0f, 1.0f)
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add vector log1p/log1pf and input files to libmvec microbenchmark.
libmvec-log1p-inputs:
70% Normal random distribution
range: (-1.0, DBL_MAX)
mean: 0.0
sigma: 50.0
30% uniform random distribution in range (-1.0, 1.0e6)
libmvec-log1pf-inputs:
70% Normal random distribution
range: (-1.0f, FLT_MAX)
mean: 0.0f
sigma: 50.0f
30% uniform random distribution in range (-1.0f, 1.0e6f)
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add vector log2/log2f and input files to libmvec microbenchmark.
libmvec-log2-inputs:
70% Normal random distribution
range: (0.0, DBL_MAX)
mean: 1.0
sigma: 50.0
30% uniform random distribution in range (0.0, 1.0e6)
libmvec-log2f-inputs:
70% Normal random distribution
range: (0.0f, FLT_MAX)
mean: 1.0f
sigma: 50.0f
30% uniform random distribution in range (0.0f, 1.0e6f)
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add vector log10/log10f and input files to libmvec microbenchmark.
libmvec-log10-inputs:
70% Normal random distribution
range: (0.0, DBL_MAX)
mean: 1.0
sigma: 50.0
30% uniform random distribution in range (0.0, 1.0e6)
libmvec-log10f-inputs:
70% Normal random distribution
range: (0.0f, FLT_MAX)
mean: 1.0f
sigma: 50.0f
30% uniform random distribution in range (0.0f, 1.0e6f)
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add vector atan2/atan2f and input files to libmvec microbenchmark.
libmvec-atan2-inputs:
arg1:
90% Normal random distribution
range: (-DBL_MAX, DBL_MAX)
mean: 0.0
sigma: 4.0
10% uniform random distribution in range (-1.0e6, 1.0e6)
arg2:
90% Normal random distribution
range: (-DBL_MAX, DBL_MAX)
mean: 0.0
sigma: 4.0
10% uniform random distribution in range (-1.0e6, 1.0e6)
libmvec-atan2f-inputs:
arg1:
90% Normal random distribution
range: (-FLT_MAX, FLT_MAX)
mean: 0.0f
sigma: 4.0f
10% uniform random distribution in range (-1.0e6f, 1.0e6f)
arg2:
90% Normal random distribution
range: (-FLT_MAX, FLT_MAX)
mean: 0.0f
sigma: 4.0f
10% uniform random distribution in range (-1.0e6f, 1.0e6f)
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add vector cbrt/cbrtf and input files to libmvec microbenchmark.
libmvec-cbrt-inputs:
90% Normal random distribution
range: (-DBL_MAX, DBL_MAX)
mean: 0.0
sigma: 10.0
10% uniform random distribution in range (-1000.0, 1000.0)
libmvec-cbrtf-inputs:
90% Normal random distribution
range: (-FLT_MAX, FLT_MAX)
mean: 0.0f
sigma: 10.0f
10% uniform random distribution in range (-1000.0f, 1000.0f)
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add vector sinh/sinhf and input files to libmvec microbenchmark.
libmvec-sinh-inputs:
90% Normal random distribution
range: (-710.0, 710.0)
mean: 0.0
sigma: 32.0
10% uniform random distribution in range (-500.0, 500.0)
libmvec-sinhf-inputs:
90% Normal random distribution
range: (-89.0f, 89.0f)
mean: 0.0f
sigma: 16.0f
10% uniform random distribution in range (-50.0f, 50.0f)
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add vector expm1/expm1f and input files to libmvec microbenchmark.
libmvec-expm1-inputs:
90% Normal random distribution
range: (-708.0, 709.0)
mean: 0.0
sigma: 16.0
10% uniform random distribution in range (-500.0, 500.0)
libmvec-expm1f-inputs:
90% Normal random distribution
range: (-87.0f, 88.0f)
mean: 0.0f
sigma: 8.0f
10% uniform random distribution in range (-50.0f, 50.0f)
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add vector cosh/coshf and input files to libmvec microbenchmark.
libmvec-cosh-inputs:
90% Normal random distribution
range: (-710.0, 710.0)
mean: 0.0
sigma: 32.0
10% uniform random distribution in range (-500.0, 500.0)
libmvec-coshf-inputs:
90% Normal random distribution
range: (-89.0f, 89.0f)
mean: 0.0f
sigma: 16.0f
10% uniform random distribution in range (-50.0f, 50.0f)
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add vector exp10/exp10f and input files to libmvec microbenchmark.
libmvec-exp10-inputs:
90% Normal random distribution
range: (-307.0, 308.0)
mean: 0.0
sigma: 16.0
10% uniform random distribution in range (-250.0, 250.0)
libmvec-exp10f-inputs:
90% Normal random distribution
range: (-37.0f, 38.0f)
mean: 0.0f
sigma: 8.0f
10% uniform random distribution in range (-25.0f, 25.0f)
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add vector exp2/exp2f and input files to libmvec microbenchmark.
libmvec-exp2-inputs:
90% Normal random distribution
range: (-1022.0, 1024.0)
mean: 0.0
sigma: 16.0
10% uniform random distribution in range (-1000.0, 1000.0)
libmvec-exp2f-inputs:
90% Normal random distribution
range: (-126.0f, 128.0f)
mean: 0.0f
sigma: 8.0f
10% uniform random distribution in range (-100.0f, 100.0f)
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add vector hypot/hypotf and input files to libmvec microbenchmark.
libmvec-hypot-inputs:
arg1:
90% Normal random distribution
range: (-DBL_MAX, DBL_MAX)
mean: 0.0
sigma: 10.0
10% uniform random distribution in range (-1000.0, 1000.0)
arg1:
90% Normal random distribution
range: (-DBL_MAX, DBL_MAX)
mean: 0.0
sigma: 10.0
10% uniform random distribution in range (-1000.0, 1000.0)
libmvec-hypotf-inputs:
arg1:
90% Normal random distribution
range: (-FLT_MAX, FLT_MAX)
mean: 0.0f
sigma: 10.0f
10% uniform random distribution in range (-1000.0f, 1000.0f)
arg2:
90% Normal random distribution
range: (-FLT_MAX, FLT_MAX)
mean: 0.0f
sigma: 10.0f
10% uniform random distribution in range (-1000.0f, 1000.0f)
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add vector asin/asinf and input files to libmvec microbenchmark.
libmvec-asin-inputs:
90% Normal random distribution
range: (-1.0, 1.0)
mean: 0.0
sigma: 1.0
10% uniform random distribution in range (-1.0, 1.0)
libmvec-asinf-inputs:
90% Normal random distribution
range: (-1.0f, 1.0f)
mean: 0.0f
sigma: 1.0f
10% uniform random distribution in range (-1.0f, 1.0f)
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add vector atan/atanf and input files to libmvec microbenchmark.
libmvec-atan-inputs:
arg1:
90% Normal random distribution
range: (-DBL_MAX, DBL_MAX)
mean: 0.0
sigma: 4.0
10% uniform random distribution in range (-1.0e6, 1.0e6)
arg2:
90% Normal random distribution
range: (-DBL_MAX, DBL_MAX)
mean: 0.0
sigma: 4.0
10% uniform random distribution in range (-1.0e6, 1.0e6)
libmvec-atanf-inputs:
arg1:
90% Normal random distribution
range: (-FLT_MAX, FLT_MAX)
mean: 0.0f
sigma: 4.0f
10% uniform random distribution in range (-1.0e6f, 1.0e6f)
arg2:
90% Normal random distribution
range: (-FLT_MAX, FLT_MAX)
mean: 0.0f
sigma: 4.0f
10% uniform random distribution in range (-1.0e6f, 1.0e6f)
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Change "movl %edx, %rdx" to "movl %edx, %edx" in:
commit 8418eb3ff4b781d31c4ed5dc6c0bd7356bc45db9
Author: Noah Goldstein <goldstein.w.n@gmail.com>
Date: Mon Jan 10 15:35:39 2022 -0600
x86: Optimize strcmp-evex.S
|
|
|
|
|
|
|
|
|
|
| |
Change "movl %edx, %rdx" to "movl %edx, %edx" in:
commit b77b06e0e296f1a2276c27a67e1d44f2cfa38d45
Author: Noah Goldstein <goldstein.w.n@gmail.com>
Date: Mon Jan 10 15:35:38 2022 -0600
x86: Optimize strcmp-avx2.S
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add vector acos/acosf and input files to libmvec microbenchmark.
libmvec-acos-inputs:
90% Normal random distribution
range: (-1.0, 1.0)
mean: 0.0
sigma: 1.0
10% uniform random distribution in range (-1.0, 1.0)
libmvec-acosf-inputs:
90% Normal random distribution
range: (-1.0f, 1.0f)
mean: 0.0f
sigma: 1.0f
10% uniform random distribution in range (-1.0f, 1.0f)
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Optimization are primarily to the loop logic and how the page cross
logic interacts with the loop.
The page cross logic is at times more expensive for short strings near
the end of a page but not crossing the page. This is done to retest
the page cross conditions with a non-faulty check and to improve the
logic for entering the loop afterwards. This is only particular cases,
however, and is general made up for by more than 10x improvements on
the transition from the page cross -> loop case.
The non-page cross cases as well are nearly universally improved.
test-strcmp, test-strncmp, test-wcscmp, and test-wcsncmp all pass.
Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Optimization are primarily to the loop logic and how the page cross
logic interacts with the loop.
The page cross logic is at times more expensive for short strings near
the end of a page but not crossing the page. This is done to retest
the page cross conditions with a non-faulty check and to improve the
logic for entering the loop afterwards. This is only particular cases,
however, and is general made up for by more than 10x improvements on
the transition from the page cross -> loop case.
The non-page cross cases are improved most for smaller sizes [0, 128]
and go about even for (128, 4096]. The loop page cross logic is
improved so some more significant speedup is seen there as well.
test-strcmp, test-strncmp, test-wcscmp, and test-wcsncmp all pass.
Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
|
|
|
|
|
|
|
| |
It is not Hurd-specific, but H.J. Lu wants it there.
Also, dc.a can be used to avoid hardcoding .long vs .quad and thus use
the same implementation for i386 and x86_64.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds following inputs:
0x1.bcab29da0e947p-54 0x1.bc41f4d2294b8p-54
0x1.a11891ec004d4p-348 0x1.814830510be26p-348
0x1.b836ed678be29p-588 0x1.b7be6f5a03a8cp-588
0x1.a83f842ef3f73p-633 0x1.a799d8a6677ep-633
to atan2 tests and updates x86_64 double atan2 ulps.
This fixes BZ #28765.
Reviewed-By: Paul Zimmermann <Paul.Zimmermann@inria.fr>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes SSE4.2 libmvec atan2 function accuracy for following
inputs to less than 4 ulps.
{0x1.bcab29da0e947p-54,0x1.bc41f4d2294b8p-54} 4.19888 ulps
{0x1.b836ed678be29p-588,0x1.b7be6f5a03a8cp-588} 4.09889 ulps
This fixes BZ #28765.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
|
|
| |
Fixes [BZ# 28755] for wcsncmp by redirecting length >= 2^56 to
__wcscmp_evex. For x86_64 this covers the entire address range so any
length larger could not possibly be used to bound `s1` or `s2`.
test-strcmp, test-strncmp, test-wcscmp, and test-wcsncmp all pass.
Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
|
|
|
|
|
|
|
|
|
|
| |
Fixes [BZ# 28755] for wcsncmp by redirecting length >= 2^56 to
__wcscmp_avx2. For x86_64 this covers the entire address range so any
length larger could not possibly be used to bound `s1` or `s2`.
test-strcmp, test-strncmp, test-wcscmp, and test-wcsncmp all pass.
Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 7061 files FOO.
I then removed trailing white space from math/tgmath.h,
support/tst-support-open-dev-null-range.c, and
sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following
obscure pre-commit check failure diagnostics from Savannah. I don't
know why I run into these diagnostics whereas others evidently do not.
remote: *** 912-#endif
remote: *** 913:
remote: *** 914-
remote: *** error: lines with trailing whitespace found
...
remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
|
|
|
|
|
|
|
|
| |
Implement vectorized tan/tanf containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI. It also contains
accuracy and ABI tests for vector tan/tanf with regenerated ulps.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
| |
Implement vectorized erfc/erfcf containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI. It also contains
accuracy and ABI tests for vector erfc/erfcf with regenerated ulps.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
| |
Implement vectorized asinh/asinhf containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI. It also contains
accuracy and ABI tests for vector asinh/asinhf with regenerated ulps.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
| |
Implement vectorized tanh/tanhf containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI. It also contains
accuracy and ABI tests for vector tanh/tanhf with regenerated ulps.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
| |
Implement vectorized erf/erff containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI. It also contains
accuracy and ABI tests for vector erf/erff with regenerated ulps.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
| |
Implement vectorized acosh/acoshf containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI. It also contains
accuracy and ABI tests for vector acosh/acoshf with regenerated ulps.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
| |
Implement vectorized atanh/atanhf containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI. It also contains
accuracy and ABI tests for vector atanh/atanhf with regenerated ulps.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
| |
Implement vectorized log1p/log1pf containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI. It also contains
accuracy and ABI tests for vector log1p/log1pf with regenerated ulps.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
| |
Implement vectorized log2/log2f containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI. It also contains
accuracy and ABI tests for vector log2/log2f with regenerated ulps.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
| |
Implement vectorized log10/log10f containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI. It also contains
accuracy and ABI tests for vector log10/log10f with regenerated ulps.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
| |
Implement vectorized atan2/atan2f containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI. It also contains
accuracy and ABI tests for vector atan2/atan2f with regenerated ulps.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|