author | Adhemerval Zanella <azanella@linux.vnet.ibm.com> | 2014-12-31 11:47:41 -0500 |
---|---|---|
committer | Adhemerval Zanella <azanella@linux.vnet.ibm.com> | 2015-01-13 11:28:44 -0500 |
commit | f06a4faf8a2b4d046eb40e94b47948cc47d79902 (patch) | |
tree | 846d1fc4c0ce0be53ef275c227b2accab263bbda /sysdeps/powerpc/powerpc64/multiarch/Makefile | |
parent | 9f2f36e5a91c2ce6edba5415e176155eb1008ae1 (diff) | |
powerpc: Optimized st{r,p}ncpy for POWER8/PPC64
This patch adds an optimized POWER8 st{r,p}ncpy using unaligned accesses. It shows a 10%-80% improvement over the optimized POWER7 implementation, which uses only aligned accesses, especially on unaligned inputs. The algorithm first reads and checks 16 bytes (if the inputs do not cross a 4K page boundary). Then it realigns the source to 16 bytes and issues a 16-byte read-and-compare loop to speed up null-byte checks for large strings. Also, unlike the POWER7 optimization, the null padding is done inline in the implementation using possibly unaligned accesses, instead of relying on a memset call. A special case is added for page-crossing reads.
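The flow described above can be illustrated with a rough C sketch. This is not the POWER8 assembly from the patch: the names `crosses_page` and `sketch_stpncpy` are hypothetical, the 4K page size and 16-byte block size are taken from the commit message, and the block scan uses a bounded `memchr` in place of vector loads. Because portable C never reads beyond the bytes st{r,p}ncpy may legally touch, the page-cross predicate is shown on its own rather than wired into the copy loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumption: 4K pages, per the commit message */
#define SKETCH_CHUNK     16u    /* block size read and checked per step */

/* Nonzero if a 16-byte read starting at p would run into the next page.
   An over-reading implementation needs this check before touching a block. */
static int crosses_page(const void *p)
{
    uintptr_t offset = (uintptr_t)p & (SKETCH_PAGE_SIZE - 1);
    return offset + SKETCH_CHUNK > SKETCH_PAGE_SIZE;
}

/* Hypothetical sketch with stpncpy semantics: copies at most n bytes,
   pads the rest of dst with NULs, and returns a pointer to the first
   NUL written to dst (or dst + n if src has no NUL in its first n bytes). */
static char *sketch_stpncpy(char *dst, const char *src, size_t n)
{
    assert(dst != NULL && src != NULL);

    size_t i = 0;
    /* Block loop: look for the terminating NUL up to 16 bytes at a time. */
    while (i < n) {
        size_t len = (n - i < SKETCH_CHUNK) ? n - i : SKETCH_CHUNK;
        const char *nul = memchr(src + i, '\0', len);
        size_t copy = nul ? (size_t)(nul - (src + i)) : len;
        memcpy(dst + i, src + i, copy);
        i += copy;
        if (nul)
            break;
    }

    char *end = dst + i;
    /* Inline NUL padding, written here as a plain loop instead of a
       memset call to mirror the design choice named in the commit message. */
    for (; i < n; i++)
        dst[i] = '\0';
    return end;
}
```

A strncpy variant would run the same loop and return `dst` instead of `end`.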
Diffstat (limited to 'sysdeps/powerpc/powerpc64/multiarch/Makefile')
-rw-r--r-- | sysdeps/powerpc/powerpc64/multiarch/Makefile | 5 |
1 files changed, 3 insertions, 2 deletions
diff --git a/sysdeps/powerpc/powerpc64/multiarch/Makefile b/sysdeps/powerpc/powerpc64/multiarch/Makefile
index 74b2daac1d..18d337843c 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/Makefile
+++ b/sysdeps/powerpc/powerpc64/multiarch/Makefile
@@ -17,9 +17,10 @@ sysdep_routines += memcpy-power7 memcpy-a2 memcpy-power6 memcpy-cell \
 		   stpcpy-power7 stpcpy-ppc64 \
 		   strrchr-power7 strrchr-ppc64 strncat-power7 strncat-ppc64 \
 		   strncpy-power7 strncpy-ppc64 \
-		   stpncpy-power7 stpncpy-ppc64 strcmp-power7 strcmp-ppc64 \
+		   stpncpy-power8 stpncpy-power7 stpncpy-ppc64 \
+		   strcmp-power7 strcmp-ppc64 \
 		   strcat-power8 strcat-power7 strcat-ppc64 memmove-power7 \
-		   memmove-ppc64 bcopy-ppc64
+		   memmove-ppc64 bcopy-ppc64 strncpy-power8
 
 CFLAGS-strncase-power7.c += -mcpu=power7 -funroll-loops
 CFLAGS-strncase_l-power7.c += -mcpu=power7 -funroll-loops