author | Joe Ramsay <Joe.Ramsay@arm.com> | 2024-05-02 16:43:13 +0100 |
---|---|---|
committer | Szabolcs Nagy <szabolcs.nagy@arm.com> | 2024-05-14 13:10:33 +0100 |
commit | 90a6ca8b28bf34e361e577e526e1b0f4c39a32a5 (patch) | |
tree | 69830b0b2204a585bcca976208ae412543c19dc1 /sysdeps/aarch64/fpu/expm1_advsimd.c | |
parent | ec6ed525f1aa24fd38ea5153e88d14d92d0d2f82 (diff) | |
aarch64: Fix AdvSIMD libmvec routines for big-endian
Previously many routines used * to load from vector types stored in the data table. This is emitted as ldr, which byte-swaps the entire vector register, and causes bugs for big-endian when not all lanes contain the same value. When a vector is to be used this way, it has been replaced with an array and the load with an explicit ld1 intrinsic, which byte-swaps only within lanes.

As well, many routines previously used non-standard GCC syntax for vector operations such as indexing into vector types with [] and assembling vectors using {}. This syntax should not be mixed with ACLE, as the former does not respect endianness whereas the latter does. Such examples have been replaced with, for instance, vcombine_* and vgetq_lane* intrinsics. Helpers which only use the GCC syntax, such as the v_call helpers, do not need changing as they do not use intrinsics.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
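
The difference between the two kinds of byte swap described above can be illustrated with a small portable model. This is only a sketch and not code from the patch: it does not use AdvSIMD at all, and the names `swap_whole_register`, `swap_within_lanes` and `models_agree` are hypothetical helpers that treat a 128-bit register as 16 raw bytes holding two 64-bit lanes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reverse the n bytes starting at buf.  */
void
reverse_bytes (uint8_t *buf, size_t n)
{
  assert (buf != NULL);
  for (size_t i = 0; i < n / 2; i++)
    {
      uint8_t t = buf[i];
      buf[i] = buf[n - 1 - i];
      buf[n - 1 - i] = t;
    }
}

/* Hypothetical model of a swap over the entire register: all 16 bytes
   are reversed as one unit, so the two lanes also trade places.  */
void
swap_whole_register (const uint64_t in[2], uint64_t out[2])
{
  uint8_t bytes[16];
  memcpy (bytes, in, sizeof bytes);
  reverse_bytes (bytes, 16);
  memcpy (out, bytes, sizeof bytes);
}

/* Hypothetical model of a swap within lanes: each 8-byte lane is
   reversed on its own, so the lanes keep their positions.  */
void
swap_within_lanes (const uint64_t in[2], uint64_t out[2])
{
  uint8_t bytes[16];
  memcpy (bytes, in, sizeof bytes);
  reverse_bytes (bytes, 8);
  reverse_bytes (bytes + 8, 8);
  memcpy (out, bytes, sizeof bytes);
}

/* Return 1 when both models produce identical lanes for this input.  */
int
models_agree (const uint64_t in[2])
{
  uint64_t whole[2], lanes[2];
  swap_whole_register (in, whole);
  swap_within_lanes (in, lanes);
  return memcmp (whole, lanes, sizeof whole) == 0;
}
```

In this model, an input with distinct lanes comes back with its lanes in opposite order under the two swaps, while an input whose lanes are equal gives the same result either way. That is consistent with the remark that only tables whose lanes differ were affected.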
Diffstat (limited to 'sysdeps/aarch64/fpu/expm1_advsimd.c')
-rw-r--r-- | sysdeps/aarch64/fpu/expm1_advsimd.c | 9 |
1 files changed, 6 insertions, 3 deletions
diff --git a/sysdeps/aarch64/fpu/expm1_advsimd.c b/sysdeps/aarch64/fpu/expm1_advsimd.c
index 3628398674..3db3b80c49 100644
--- a/sysdeps/aarch64/fpu/expm1_advsimd.c
+++ b/sysdeps/aarch64/fpu/expm1_advsimd.c
@@ -23,7 +23,9 @@
 static const struct data
 {
   float64x2_t poly[11];
-  float64x2_t invln2, ln2, shift;
+  float64x2_t invln2;
+  double ln2[2];
+  float64x2_t shift;
   int64x2_t exponent_bias;
 #if WANT_SIMD_EXCEPT
   uint64x2_t thresh, tiny_bound;
@@ -92,8 +94,9 @@ float64x2_t VPCS_ATTR V_NAME_D1 (expm1) (float64x2_t x)
      where 2^i is exact because i is an integer. */
   float64x2_t n = vsubq_f64 (vfmaq_f64 (d->shift, d->invln2, x), d->shift);
   int64x2_t i = vcvtq_s64_f64 (n);
-  float64x2_t f = vfmsq_laneq_f64 (x, n, d->ln2, 0);
-  f = vfmsq_laneq_f64 (f, n, d->ln2, 1);
+  float64x2_t ln2 = vld1q_f64 (&d->ln2[0]);
+  float64x2_t f = vfmsq_laneq_f64 (x, n, ln2, 0);
+  f = vfmsq_laneq_f64 (f, n, ln2, 1);
 
   /* Approximate expm1(f) using polynomial.
      Taylor expansion for expm1(x) has the form:
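
For readers unfamiliar with the reduction in the second hunk, the following scalar sketch shows why lanes 0 and 1 of ln2 must hold the right values: they are used as the high and low parts of ln2 in a two-step subtraction. This is not the glibc implementation. The function name `reduce_expm1_arg` is hypothetical, and the constants are assumed values for the data-table entries, which the diff does not show. Plain multiply and subtract stand in for the fused vfmsq operations, and round-to-nearest double arithmetic is assumed.

```c
/* Assumed constants; the real table values are not visible in the diff.  */
static const double INVLN2 = 0x1.71547652b82fep0;   /* about 1/ln(2) */
static const double LN2_HI = 0x1.62e42fefa39efp-1;  /* high part of ln(2), lane 0 */
static const double LN2_LO = 0x1.abc9e3b39803fp-56; /* low part of ln(2), lane 1 */
static const double SHIFT = 0x1.8p52;               /* rounding shift */

/* Split x into an integer i and a remainder f with x ~= i * ln2 + f.
   Hypothetical scalar analogue of the vector code in the hunk.  */
double
reduce_expm1_arg (double x, long long *i_out)
{
  /* Adding and subtracting SHIFT rounds x / ln2 to the nearest integer.  */
  double n = (x * INVLN2 + SHIFT) - SHIFT;
  *i_out = (long long) n;

  /* Subtract n * ln2 in two steps so the low bits of ln2 are not lost.  */
  double f = x - n * LN2_HI;
  f = f - n * LN2_LO;
  return f;
}
```

If the two parts of ln2 were read in the wrong lane order, the first step would subtract n times the tiny low part and leave f far outside the interval the polynomial expects.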