aarch64: Add half-width versions of AdvSIMD f32 libmvec routines

Compilers may emit calls to 'half-width' routines (two-lane single-precision variants). These have been added as wrappers around the full-width versions, in which the input vector is simply duplicated into both halves of the full-width argument. This will perform poorly when one lane triggers the special-case handler, as there will be a redundant call to the scalar version; however, this is expected to be rare at -Ofast.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
author: Joe Ramsay <Joe.Ramsay@arm.com> 2023-12-19 16:44:01 +0000
committer: Szabolcs Nagy <szabolcs.nagy@arm.com> 2023-12-20 08:41:25 +0000
commit: cc0d77ba944cd4ce46c5f0e6d426af3057962ca5 (patch)
tree: 840c09b10bcb0ad4f733e8cb4bce2acbd92e5945 /sysdeps/aarch64/fpu/expm1f_advsimd.c
parent: 3150cc0c9019bf9da841419f86dda8e7f26d676d (diff)
download: glibc-cc0d77ba944cd4ce46c5f0e6d426af3057962ca5.tar.gz
glibc-cc0d77ba944cd4ce46c5f0e6d426af3057962ca5.tar.xz
glibc-cc0d77ba944cd4ce46c5f0e6d426af3057962ca5.zip
1 files changed, 3 insertions, 1 deletions
diff --git a/sysdeps/aarch64/fpu/expm1f_advsimd.c b/sysdeps/aarch64/fpu/expm1f_advsimd.c
index b27b75068a..a6fe5627e5 100644
--- a/sysdeps/aarch64/fpu/expm1f_advsimd.c
+++ b/sysdeps/aarch64/fpu/expm1f_advsimd.c
@@ -64,7 +64,7 @@ special_case (float32x4_t x, float32x4_t y, uint32x4_t special)
    The maximum error is 1.51 ULP:
    _ZGVnN4v_expm1f (0x1.8baa96p-2) got 0x1.e2fb9p-2
 				  want 0x1.e2fb94p-2.  */
-float32x4_t VPCS_ATTR V_NAME_F1 (expm1) (float32x4_t x)
+float32x4_t VPCS_ATTR NOINLINE V_NAME_F1 (expm1) (float32x4_t x)
 {
   const struct data *d = ptr_barrier (&data);
   uint32x4_t ix = vreinterpretq_u32_f32 (x);
@@ -115,3 +115,5 @@ float32x4_t VPCS_ATTR V_NAME_F1 (expm1) (float32x4_t x)
   /* expm1(x) ~= p * t + (t - 1).  */
   return vfmaq_f32 (vsubq_f32 (t, v_f32 (1.0f)), p, t);
 }
+libmvec_hidden_def (V_NAME_F1 (expm1))
+HALF_WIDTH_ALIAS_F1 (expm1)
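
The wrapper scheme described in the commit message can be illustrated with a small host-independent sketch. It is not the definition of HALF_WIDTH_ALIAS_F1, which is not shown in this diff. The types vec2f and vec4f are hypothetical stand-ins for float32x2_t and float32x4_t, and full_width_op, scalar_op, and the "special" threshold of 100 are invented for illustration. The sketch assumes the wrapper duplicates the two input lanes, calls the full-width routine, and keeps the low half of the result; the counter makes the redundant scalar call visible.

```c
#include <assert.h>

/* Hypothetical stand-ins for float32x2_t and float32x4_t, so that the
   sketch builds on any host.  Real code would use <arm_neon.h>.  */
typedef struct { float lane[2]; } vec2f;
typedef struct { float lane[4]; } vec4f;

/* Counts how often the scalar fallback runs.  */
static int scalar_calls;

/* Hypothetical scalar fallback for lanes the vector path cannot handle.  */
static float
scalar_op (float x)
{
  scalar_calls++;
  return 2.0f * x;
}

/* Hypothetical full-width routine: a fast path for ordinary lanes and a
   scalar fallback for each 'special' lane (here: inputs above 100).  */
static vec4f
full_width_op (vec4f x)
{
  vec4f y;
  for (int i = 0; i < 4; i++)
    y.lane[i] = x.lane[i] > 100.0f ? scalar_op (x.lane[i]) : 2.0f * x.lane[i];
  return y;
}

/* Half-width wrapper: duplicate the two input lanes into both halves of a
   four-lane vector, call the full-width routine, keep the low half.  */
static vec2f
half_width_op (vec2f x)
{
  vec4f wide = { { x.lane[0], x.lane[1], x.lane[0], x.lane[1] } };
  vec4f r = full_width_op (wide);
  vec2f lo = { { r.lane[0], r.lane[1] } };
  return lo;
}

/* One special lane in the input costs two scalar calls, because the
   duplicated upper half triggers the fallback a second time.  */
static void
demo (void)
{
  scalar_calls = 0;
  vec2f in = { { 1.0f, 200.0f } };
  vec2f out = half_width_op (in);
  assert (out.lane[0] == 2.0f);
  assert (out.lane[1] == 400.0f);
  assert (scalar_calls == 2);
}
```

Calling demo () exercises the wrapper with one ordinary lane and one special lane. The result is correct, but the fallback runs twice, which is the cost the commit message accepts as rare.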