author: Amrita H S <amritahs@linux.vnet.ibm.com> 2023-12-06 11:43:11 -0500
committer: Rajalakshmi Srinivasaraghavan <rajis@linux.ibm.com> 2023-12-07 11:10:40 -0600
commit: 3367d8e180848030d1646f088759f02b8dfe0d6f
tree: 1300c2911e4f08b4861bf144b6bcbc54e193b51b
parent: 546a1ba664626603660b595662249d524e429013
powerpc: Optimized strcmp for power10
This patch is based on __strcmp_power9 and __strlen_power10.

Improvements over __strcmp_power9:

    1. Uses new POWER10 instructions
       - This code uses lxvp to decrease contention on load
         by loading 32 bytes per instruction.

    2. Performance implication
       - This version has around 30% better performance on average.
       - Performance regressions are seen for specific combinations
         of sizes and alignments. Some of them are also observed without
         the changes, while the rest may be induced by the patch.

Signed-off-by: Amrita H S <amritahs@linux.vnet.ibm.com>
Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>
Diffstat (limited to 'sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c')
-rw-r--r-- sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c | 4
1 files changed, 4 insertions, 0 deletions
diff --git a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
index fc26dd0e17..965dd17786 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
@@ -378,6 +378,10 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
   IFUNC_IMPL (i, name, strcmp,
 #ifdef __LITTLE_ENDIAN__
 	      IFUNC_IMPL_ADD (array, i, strcmp,
+			      (hwcap2 & PPC_FEATURE2_ARCH_3_1)
+			      && (hwcap & PPC_FEATURE_HAS_VSX),
+			      __strcmp_power10)
+	      IFUNC_IMPL_ADD (array, i, strcmp,
 			      hwcap2 & PPC_FEATURE2_ARCH_3_00
 			      && hwcap & PPC_FEATURE_HAS_ALTIVEC,
 			      __strcmp_power9)
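
The hunk above places the power10 entry ahead of the power9 entry and guards each one with a pair of hardware-capability bits. A minimal sketch of that "most capable variant first, guarded by hwcap bits" idea follows. It is a simplified model, not glibc code: the `DEMO_*` bit values, `struct demo_impl`, `demo_select_strcmp`, and the generic `__strcmp_ppc` fallback entry are all assumptions made for illustration, and the first-match loop is only an assumed way of consuming such an ordered table (the list in this file records which implementations are usable on the running machine).

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder feature bits for this sketch only.  The real values are
   defined by the kernel's AT_HWCAP / AT_HWCAP2 headers.  */
#define DEMO_HWCAP_ALTIVEC    (UINT64_C (1) << 0)
#define DEMO_HWCAP_VSX        (UINT64_C (1) << 1)
#define DEMO_HWCAP2_ARCH_3_00 (UINT64_C (1) << 0)
#define DEMO_HWCAP2_ARCH_3_1  (UINT64_C (1) << 1)

/* Hypothetical table entry: a variant name plus the capability bits
   that must all be present for the variant to be usable.  */
struct demo_impl
{
  const char *name;
  uint64_t need_hwcap;
  uint64_t need_hwcap2;
};

/* Ordered from most to least specific, mirroring the order in the diff.
   The generic fallback entry is an assumption of this sketch.  */
static const struct demo_impl demo_strcmp_impls[] =
{
  { "__strcmp_power10", DEMO_HWCAP_VSX,     DEMO_HWCAP2_ARCH_3_1 },
  { "__strcmp_power9",  DEMO_HWCAP_ALTIVEC, DEMO_HWCAP2_ARCH_3_00 },
  { "__strcmp_ppc",     0,                  0 },
};

/* Return the name of the first entry whose required bits are all set
   in both capability words.  */
const char *
demo_select_strcmp (uint64_t hwcap, uint64_t hwcap2)
{
  size_t n = sizeof demo_strcmp_impls / sizeof demo_strcmp_impls[0];

  for (size_t i = 0; i < n; i++)
    {
      const struct demo_impl *impl = &demo_strcmp_impls[i];

      if ((hwcap & impl->need_hwcap) == impl->need_hwcap
          && (hwcap2 & impl->need_hwcap2) == impl->need_hwcap2)
        return impl->name;
    }

  /* Not reached while the table ends with an unconditional entry.  */
  return NULL;
}
```

In this model, calling `demo_select_strcmp` with both VSX and ARCH_3_1 set yields the power10 entry. A machine that reports only the ALTIVEC and ARCH_3_00 bits falls through to the power9 entry, which is why the ordering of the entries matters.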