about summary refs log tree commit diff
path: root/sysdeps/powerpc/powerpc64/multiarch
Commit message (Collapse)AuthorAgeFilesLines
* powerpc: Improve memcmp performance for POWER8Rajalakshmi Srinivasaraghavan2017-05-184-2/+36
| | | | | Vectorization improves performance over the current implementation. Tested on powerpc64 and powerpc64le.
* powerpc: Fix strncat ifunc selectionRajalakshmi Srinivasaraghavan2017-05-041-1/+1
| | | | | | Correct hwcap usage in strncat introduced by commit 249dcdb71b79e4c488a46c9027e0014c0bc27044. Tested on power7 and power8 systems
* powerpc64: strrchr optimization for power8Rajalakshmi Srinivasaraghavan2017-04-184-1/+46
| | | | | | | P7 code is used for <=32B strings and for > 32B vectorized loops are used. This shows as an average 25% improvement depending on the position of search character. The performance is same for shorter strings. Tested on ppc64 and ppc64le.
* powerpc: Optimized strncat for POWER8Rajalakshmi Srinivasaraghavan2017-04-134-2/+40
| | | | | | | | | | | | | | | | | | With new optimized strnlen for POWER8 [1], this patch adds strncat for power8 to make use of optimized strlen and strnlen. This is faster than POWER7 current implementation for larger strings. Tested on powerpc64 and powerpc64le. [1] https://sourceware.org/ml/libc-alpha/2017-03/msg00491.html * sysdeps/powerpc/powerpc64/multiarch/Makefile (sysdep_routines): Add strncat-power8. * sysdeps/powerpc/powerpc64/multiarch/strncat.c (strncat): Add __strncat_power8 to ifunc list. * sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c (strncat): Add __strncat_power8 to list of strncat functions. * sysdeps/powerpc/powerpc64/multiarch/strncat-power8.c: New file.
* powerpc: refactor memcmp and memmove IFUNC.Wainer dos Santos Moschetta2017-04-113-45/+3
| | | | | | | | | | | | | | | | | Clean up the IFUNC implementations for powerpc in order to remove unneeded macro definitions. Tested on ppc64le with and without --disable-multi-arch flag. * sysdeps/powerpc/powerpc64/multiarch/memcmp-power4.S: Define the implementation-specific function name and remove unneeded macros definition. * sysdeps/powerpc/powerpc64/multiarch/memcmp-power7.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/memmove-power7.S: Likewise. * sysdeps/powerpc/powerpc64/power4/memcmp.S: Set a default function name if not defined and pass as parameter to macros accordingly. * sysdeps/powerpc/powerpc64/power7/memcmp.S: Likewise. * sysdeps/powerpc/powerpc64/power7/memmove.S: Likewise.
* powerpc: refactor memcpy and mempcpy IFUNC.Wainer dos Santos Moschetta2017-04-117-105/+7
| | | | | | | | | | | | | | | | | | | | | | | | | Clean up the IFUNC implementations for powerpc in order to remove unneeded macro definitions. Tested on ppc64le with and without --disable-multi-arch flag. * sysdeps/powerpc/powerpc64/multiarch/memcpy-a2.S: Define the implementation-specific function name and remove unneeded macros definition. * sysdeps/powerpc/powerpc64/multiarch/memcpy-cell.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/memcpy-power4.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/memcpy-power6.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/memcpy-power7.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/memcpy-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/mempcpy-power7.S: Likewise. * sysdeps/powerpc/powerpc64/a2/memcpy.S: Set a default function name if not defined and pass as parameter to macros accordingly. * sysdeps/powerpc/powerpc64/cell/memcpy.S: Likewise. * sysdeps/powerpc/powerpc64/memcpy.S: Likewise. * sysdeps/powerpc/powerpc64/power4/memcpy.S: Likewise. * sysdeps/powerpc/powerpc64/power6/memcpy.S: Likewise. * sysdeps/powerpc/powerpc64/power7/memcpy.S: Likewise. * sysdeps/powerpc/powerpc64/power7/mempcpy.S: Likewise.
* powerpc: refactor memchr, memrchr, and rawmemchr IFUNC.Wainer dos Santos Moschetta2017-04-113-42/+3
| | | | | | | | | | | | | | | | | | Clean up the IFUNC implementations for powerpc in order to remove unneeded macro definitions. Tested on ppc64le with and without --disable-multi-arch flag. * sysdeps/powerpc/powerpc64/multiarch/memchr-power7.S: Define the implementation-specific function name and remove unneeded macros definition. * sysdeps/powerpc/powerpc64/multiarch/memrchr-power7.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/rawmemchr-power7.S: Likewise. * sysdeps/powerpc/powerpc64/power7/memchr.S: Set a default function name if not defined and pass as parameter to macros accordingly. * sysdeps/powerpc/powerpc64/power7/memrchr.S: Likewise. * sysdeps/powerpc/powerpc64/power7/rawmemchr.S: Likewise.
* powerpc: refactor memset IFUNC.Wainer dos Santos Moschetta2017-04-115-75/+5
| | | | | | | | | | | | | | | | | | | | | Clean up the IFUNC implementations for powerpc in order to remove unneeded macro definitions. Tested on ppc64le with and without --disable-multi-arch flag. * sysdeps/powerpc/powerpc64/multiarch/memset-power4.S: Define the implementation-specific function name and remove unneeded macros definition. * sysdeps/powerpc/powerpc64/multiarch/memset-power6.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/memset-power7.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/memset-power8.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/memset-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/memset.S: Set a default function name if not defined and pass as parameter to macros accordingly. * sysdeps/powerpc/powerpc64/power4/memset.S: Likewise. * sysdeps/powerpc/powerpc64/power6/memset.S: Likewise. * sysdeps/powerpc/powerpc64/power7/memset.S: Likewise. * sysdeps/powerpc/powerpc64/power8/memset.S: Likewise.
* powerpc: refactor strcasestr and strstr IFUNC.Wainer dos Santos Moschetta2017-04-112-30/+2
| | | | | | | | | | | | | | | Clean up the IFUNC implementations for powerpc in order to remove unneeded macro definitions. Tested on ppc64le with and without --disable-multi-arch flag. * sysdeps/powerpc/powerpc64/multiarch/strcasestr-power8.S: Define the strcasestr implementation name and remove unneeded macros definition. * sysdeps/powerpc/powerpc64/multiarch/strstr-power7.S: Define strstr implementation name and remove unneeded macros definition. * sysdeps/powerpc/powerpc64/power7/strstr.S: Set a default function name if not defined and pass as parameter to macros accordingly. * sysdeps/powerpc/powerpc64/power8/strcasestr.S: Likewise.
* powerpc: refactor strchr, strchrnul, and strrchr IFUNC.Wainer dos Santos Moschetta2017-04-116-84/+6
| | | | | | | | | | | | | | | | | | | | | | | Clean up the IFUNC implementations for powerpc in order to remove unneeded macro definitions. Tested on ppc64le with and without --disable-multi-arch flag. * sysdeps/powerpc/powerpc64/multiarch/strchr-power7.S: Define the implementation-specific function name and remove unneeded macros definition. * sysdeps/powerpc/powerpc64/multiarch/strchr-power8.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strchr-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strchrnul-power7.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strchrnul-power8.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strrchr-power7.S: Likewise. * sysdeps/powerpc/powerpc64/power7/strchr.S: Set a default function name if not defined and pass as parameter to macros accordingly. * sysdeps/powerpc/powerpc64/power7/strchrnul.S: Likewise. * sysdeps/powerpc/powerpc64/power7/strrchr.S: Likewise. * sysdeps/powerpc/powerpc64/power8/strchr.S: Likewise. * sysdeps/powerpc/powerpc64/strchr.S: Likewise.
* powerpc: refactor strnlen and strlen IFUNC.Wainer dos Santos Moschetta2017-04-114-56/+4
| | | | | | | | | | | | | | | | | | | Clean up the IFUNC implementations for powerpc in order to remove unneeded macro definitions. Tested on ppc64le with and without --disable-multi-arch flag. * sysdeps/powerpc/powerpc64/multiarch/strlen-power7.S: Define the strlen implementation name and remove unneeded macros definition. * sysdeps/powerpc/powerpc64/multiarch/strlen-power8.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strlen-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strnlen-power7.S: Define the strnlen implementation name and remove unneeded macros definition. * sysdeps/powerpc/powerpc64/power7/strlen.S: Set a default function name if not defined and pass as parameter to macros accordingly. * sysdeps/powerpc/powerpc64/power7/strnlen.S: Likewise. * sysdeps/powerpc/powerpc64/power8/strlen.S: Likewise. * sysdeps/powerpc/powerpc64/strlen.S: Likewise.
* powerpc: refactor strcasecmp, strcmp, and strncmp IFUNC.Wainer dos Santos Moschetta2017-04-1110-153/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Clean up the IFUNC implementations for powerpc in order to remove unneeded macro definitions. Tested on ppc64le with and without --disable-multi-arch flag. * sysdeps/powerpc/powerpc64/multiarch/strcasecmp_l-power7.S: Define the implementation-specific function name and remove unneeded macros definition. * sysdeps/powerpc/powerpc64/multiarch/strcmp-power7.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strcmp-power8.S Likewise. * sysdeps/powerpc/powerpc64/multiarch/strcmp-power9.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strcmp-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strncmp-power4.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strncmp-power7.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strncmp-power8.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strncmp-power9.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strncmp-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/power4/strncmp.S: Set a default function name if not defined and pass as parameter to macros accordingly. * sysdeps/powerpc/powerpc64/power7/strcmp.S: Likewise. * sysdeps/powerpc/powerpc64/power7/strncmp.S: Likewise. * sysdeps/powerpc/powerpc64/power8/strcmp.S: Likewise. * sysdeps/powerpc/powerpc64/power8/strncmp.S: Likewise. * sysdeps/powerpc/powerpc64/power9/strcmp.S: Likewise. * sysdeps/powerpc/powerpc64/power9/strncmp.S: Likewise. * sysdeps/powerpc/powerpc64/strcmp.S: Likewise. * sysdeps/powerpc/powerpc64/strncmp.S: Likewise.
* powerpc: refactor stpcpy, stpncpy, strcpy, and strncpy IFUNC.Wainer dos Santos Moschetta2017-04-116-90/+6
| | | | | | | | | | | | | | | | | | | | Clean up the IFUNC implementations for powerpc in order to remove unneeded macro definitions. Tested on ppc64le with and without --disable-multi-arch flag. * sysdeps/powerpc/powerpc64/multiarch/stpcpy-power8.S: Define the implementation-specific function name and remove unneeded macros definition. * sysdeps/powerpc/powerpc64/multiarch/stpncpy-power7.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/stpncpy-power8.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strcpy-power8.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strncpy-power7.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strncpy-power8.S: Likewise. * sysdeps/powerpc/powerpc64/power7/strncpy.S: Set a default function name if not defined. * sysdeps/powerpc/powerpc64/power8/strcpy.S: Likewise. * sysdeps/powerpc/powerpc64/power8/strncpy.S: Likewise.
* powerpc64: Add POWER8 strnlenWainer dos Santos Moschetta2017-04-054-5/+40
| | | | | | | | | | | | | | | | | | | | | | | | Added strnlen POWER8 otimized for long strings. It delivers same performance as POWER7 implementation for short strings. This takes advantage of reasonably performing unaligned loads and bit permutes to check the first 1-16 bytes until quadword aligned, then checks in 64 bytes strides until unsafe, then 16 bytes, truncating the count if need be. Likewise, the POWER7 code is recycled for less than 32 bytes strings. Tested on ppc64 and ppc64le. * sysdeps/powerpc/powerpc64/multiarch/Makefile (sysdep_routines): Add strnlen-power8. * sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c (strnlen): Add __strnlen_power8 to list of strnlen functions. * sysdeps/powerpc/powerpc64/multiarch/strnlen-power8.S: New file. * sysdeps/powerpc/powerpc64/multiarch/strnlen.c (__strnlen): Add __strnlen_power8 to ifunc list. * sysdeps/powerpc/powerpc64/power8/strnlen.S: New file.
* powerpc: Use latest optimizations for internal function callsRajalakshmi Srinivasaraghavan2017-02-072-3/+3
| | | | | Some of the power8 strings optimizations are not updated to use the latest version of other string optimizations
* Remove duplicate strcat implementationsAdhemerval Zanella2017-01-033-3/+3
| | | | | | | | | | | | | | | | | Since commit 6e46de42fe16 default strcat implementation is essentially the same for specialized ia64 and powerpc ones. This patch removes the redundant implementation and adjust powerpc64 ifunc code to use the default one. Checked on powerpc32-linux-gnu (default and power4) and ia64-linux build and on powerpc64le-linux-gnu. * sysdeps/ia64/strcat.c: Remove file. * sysdeps/powerpc/strcat.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strcat-power7.c: Use default C implementation. * sysdeps/powerpc/powerpc64/multiarch/strcat-power8.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strcat-ppc64.c: Likewise.
* Update copyright dates with scripts/update-copyrights.Joseph Myers2017-01-01128-128/+128
|
* powerpc64: strchr/strchrnul optimization for power8Rajalakshmi Srinivasaraghavan2016-12-286-2/+92
| | | | | | | The P7 code is used for <=32B strings and for > 32B vectorized loops are used. This shows as an average 25% improvement depending on the position of search character. The performance is same for shorter strings. Tested on ppc64 and ppc64le.
* powerpc: strncmp optimization for power9Rajalakshmi Srinivasaraghavan2016-12-134-1/+47
| | | | | | | Vectorized loops are used for strings > 32B when compared to power8 optimization. Tested on power9 ppc64le simulator.
* powerpc: Remove stpcpy internal clash with IFUNCAdhemerval Zanella2016-12-011-3/+3
| | | | | | | | | | | | | | | | | | | | | Commit c7debbdfacb redirected the internal strrch to default powerpc64 implementation by redefining the weak_alias at sysdeps/powerpc/powerpc64/multiarch/strchr-ppc64.c: #undef weak_alias #define weak_alias(name, aliasname) \ extern __typeof (__strrchr_ppc) aliasname \ __attribute__ ((weak, alias ("__strrchr_ppc"))); This creates a __GI_strchr alias that clashes with the IFUNC symbol in stprchr.os. There is not need to define the default version for internal version, since ifunc should work internally for powerpc64. This patch removes the weak_alias indirection. Checked on powerpc64le. * sysdeps/powerpc/powerpc64/multiarch/strrchr-ppc64.c (weak_alias): Remove redirection to __strrchr_ppc.
* powerpc: strcmp optimization for power9Rajalakshmi Srinivasaraghavan2016-12-014-1/+48
| | | | | | | Vectorized loops are used for strings > 32B when compared to power8 optimization. Tested on power9 ppc64le simulator.
* powerpc: Remove stpcpy internal clash with IFUNCAdhemerval Zanella2016-11-301-3/+1
| | | | | | | | | | | | | | | | | | | | | Commit 142e0a99530 redirected the internal stpcpy to default powerpc64 implementation by redefining the weak_alias at sysdeps/powerpc/powerpc64/multiarch/stpcpy-ppc64.c: #undef weak_alias #define weak_alias(name, aliasname) \ extern __typeof (__stpcpy_ppc) aliasname \ __attribute__ ((weak, alias ("__stpcpy_ppc"))); This creates a __GI_stpcpy alias that clashes with the IFUNC symbol in stpcpy.os. There is not need to define the default version for internal version, since ifunc should work internally for powerpc64. This patch removes the weak_alias indirection. Checked on powerpc64le. * sysdeps/powerpc/powerpc64/multiarch/stpcpy-ppc64.c (weak_alias): Remove redirection to __stpcpy_ppc.
* Use gcc attribute ifunc in libc_ifunc macro instead of inline assembly due ↵Stefan Liebler2016-10-0715-85/+127
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | to false debuginfo. The current s390 ifunc resolver for vector optimized functions and the common libc_ifunc macro in include/libc-symbols.h uses something like that to generate ifunc'ed functions: extern void *__resolve___strlen(unsigned long int dl_hwcap) asm (strlen); asm (".type strlen, %gnu_indirect_function"); This leads to false debug information: objdump --dwarf=info libc.so: ... <1><1e6424>: Abbrev Number: 43 (DW_TAG_subprogram) <1e6425> DW_AT_external : 1 <1e6425> DW_AT_name : (indirect string, offset: 0x1146e): __resolve___strlen <1e6429> DW_AT_decl_file : 1 <1e642a> DW_AT_decl_line : 23 <1e642b> DW_AT_linkage_name: (indirect string, offset: 0x1147a): strlen <1e642f> DW_AT_prototyped : 1 <1e642f> DW_AT_type : <0x1e4ccd> <1e6433> DW_AT_low_pc : 0x998e0 <1e643b> DW_AT_high_pc : 0x16 <1e6443> DW_AT_frame_base : 1 byte block: 9c (DW_OP_call_frame_cfa) <1e6445> DW_AT_GNU_all_call_sites: 1 <1e6445> DW_AT_sibling : <0x1e6459> <2><1e6449>: Abbrev Number: 44 (DW_TAG_formal_parameter) <1e644a> DW_AT_name : (indirect string, offset: 0x1845): dl_hwcap <1e644e> DW_AT_decl_file : 1 <1e644f> DW_AT_decl_line : 23 <1e6450> DW_AT_type : <0x1e4c8d> <1e6454> DW_AT_location : 0x122115 (location list) ... The debuginfo for the ifunc-resolver function contains the DW_AT_linkage_name field, which names the real function name "strlen". If you perform an inferior function call to strlen in lldb, then it fails due to something like that: "error: no matching function for call to 'strlen' candidate function not viable: no known conversion from 'const char [6]' to 'unsigned long' for 1st argument" The unsigned long is the dl_hwcap argument of the resolver function. The strlen function itself has no debufinfo. The s390 ifunc resolver for memset & co uses something like that: asm (".globl FUNC" ".type FUNC, @gnu_indirect_function" ".set FUNC, __resolve_FUNC"); This way the debuginfo for the ifunc-resolver function does not conain the DW_AT_linkage_name field and the real function has no debuginfo, too. Using this strategy for the vector optimized functions leads to some troubles for functions like strnlen. Here we have __strnlen and a weak alias strnlen. The __strnlen function is the ifunc function, which is realized with the asm- statement above. The weak_alias-macro can't be used here due to undefined symbol: gcc ../sysdeps/s390/multiarch/strnlen.c -c ... In file included from <command-line>:0:0: ../sysdeps/s390/multiarch/strnlen.c:28:24: error: ‘strnlen’ aliased to undefined symbol ‘__strnlen’ weak_alias (__strnlen, strnlen) ^ ./../include/libc-symbols.h:111:26: note: in definition of macro ‘_weak_alias’ extern __typeof (name) aliasname __attribute__ ((weak, alias (#name))); ^ ../sysdeps/s390/multiarch/strnlen.c:28:1: note: in expansion of macro ‘weak_alias’ weak_alias (__strnlen, strnlen) ^ make[2]: *** [build/string/strnlen.o] Error 1 As the __strnlen function is defined with asm-statements the function name __strnlen isn't known by gcc. But the weak alias can also be done with an asm statement to resolve this issue: __asm__ (".weak strnlen\n\t" ".set strnlen,__strnlen\n"); In order to use the weak_alias macro, gcc needs to know the ifunc function. The minimum gcc to build glibc is currently 4.7, which supports attribute((ifunc)). See https://gcc.gnu.org/onlinedocs/gcc-4.7.0/gcc/Function-Attributes.html. It is only supported if gcc is configured with --enable-gnu-indirect-function or gcc supports it by default for at least intel and s390x architecture. This patch uses the old behaviour if gcc support is not available. Usage of attribute ifunc is something like that: __typeof (FUNC) FUNC __attribute__ ((ifunc ("__resolve_FUNC"))); Then gcc produces the same .globl, .type, .set assembler instructions like above. And the debuginfo does not contain the DW_AT_linkage_name field and there is no debuginfo for the real function, too. But in order to get it work, there is also some extra work to do. Currently, the glibc internal symbol on s390x e.g. __GI___strnlen is not the ifunc symbol, but the fallback __strnlen_c symbol. Thus I have to omit the libc_hidden_def macro in strnlen.c (here is the ifunc function __strnlen) because it is already handled in strnlen-c.c (here is __strnlen_c). Due to libc_hidden_proto (__strnlen) in string.h, compiling fails: gcc ../sysdeps/s390/multiarch/strnlen.c -c ... In file included from <command-line>:0:0: ../sysdeps/s390/multiarch/strnlen.c:53:24: error: ‘strnlen’ aliased to undefined symbol ‘__strnlen’ weak_alias (__strnlen, strnlen) ^ ./../include/libc-symbols.h:111:26: note: in definition of macro ‘_weak_alias’ extern __typeof (name) aliasname __attribute__ ((weak, alias (#name))); ^ ../sysdeps/s390/multiarch/strnlen.c:53:1: note: in expansion of macro ‘weak_alias’ weak_alias (__strnlen, strnlen) ^ make[2]: *** [build/string/strnlen.os] Error 1 I have to redirect the prototypes for __strnlen in string.h and create a copy of the prototype for using as ifunc function: __typeof (__redirect___strnlen) __strnlen __attribute__ ((ifunc ("__resolve_strnlen"))); weak_alias (__strnlen, strnlen) This way there is no trouble with the internal __GI_* symbols. Glibc builds fine with this construct and the debuginfo is "correct". For functions without a __GI_* symbol like memccpy this redirection is not needed. This patch adjusts the common libc_ifunc and libm_ifunc macro to use gcc attribute ifunc. Due to this change, the macro users where the __GI_* symbol does not target the ifunc symbol have to be prepared with the redirection construct. Furthermore a configure check to test gcc support is added. If it is not supported, the old behaviour is used. This patch also prepares the libc_ifunc macro to be useable in s390-ifunc-macro. The s390 ifunc-resolver-functions do have an hwcaps parameter and not all resolvers need the same initialization code. The next patch in this series changes the s390 ifunc macros to use this common one. ChangeLog: * include/libc-symbols.h (__ifunc_resolver): New macro is used by __ifunc* macros. (__ifunc): New macro uses gcc attribute ifunc or inline assembly depending on HAVE_GCC_IFUNC. (libc_ifunc, libm_ifunc): Use __ifunc as base macro. (libc_ifunc_redirected, libc_ifunc_hidden, libm_ifunc_init): New macro. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finite.c: Redirect ifunced function in header for using as type for ifunc function. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finitef.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinf.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinff.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan.c: Likewise. * sysdeps/powerpc/powerpc32/power4/multiarch/memcmp.c: Likewise. * sysdeps/powerpc/powerpc32/power4/multiarch/memcpy.c: Likewise. * sysdeps/powerpc/powerpc32/power4/multiarch/memmove.c: Likewise. * sysdeps/powerpc/powerpc32/power4/multiarch/mempcpy.c: Likewise. * sysdeps/powerpc/powerpc32/power4/multiarch/memset.c: Likewise. * sysdeps/powerpc/powerpc32/power4/multiarch/rawmemchr.c: Likewise. * sysdeps/powerpc/powerpc32/power4/multiarch/strchr.c: Likewise. * sysdeps/powerpc/powerpc32/power4/multiarch/strlen.c: Likewise. * sysdeps/powerpc/powerpc32/power4/multiarch/strncmp.c: Likewise. * sysdeps/powerpc/powerpc32/power4/multiarch/strnlen.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_finitef.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinff.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/memcmp.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/mempcpy.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/rawmemchr.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/stpncpy.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strcat.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strchr.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strcmp.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strcpy.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strncmp.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strncpy.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strnlen.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strrchr.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strstr.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/wcschr.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnanf.c: Add libc_hidden_def() and use libc_ifunc_hidden() macro instead of libc_ifunc() macro. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnanf.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/stpcpy.c: Likewise.
* powerpc: strcasecmp/strncasecmp optmization for power8raji2016-06-149-49/+132
| | | | | | | This implementation utilizes vectors to improve performance compared to current byte by byte implementation for POWER7. The performance improvement is upto 4x. This patch is tested on powerpc64 and powerpc64le.
* powerpc: Fix --disable-multi-arch build on POWER8Tulio Magno Quites Machado Filho2016-06-062-0/+6
| | | | | | Add missing symbols of stpncpy and strcasestr when multi-arch is disabled. Fix memset call from strncpy/stpncpy when multi-arch is disabled.
* powerpc: Add optimized strcspn for P8Paul E. Murphy2016-04-256-18/+97
| | | | | A few minor adjustments to the P8 strspn gives us an almost equally optimized P8 strcspn.
* powerpc: strcasestr optmization for power8Rajalakshmi Srinivasaraghavan2016-04-225-1/+130
| | | | | | This patch optimizes strcasestr function for power >= 8 systems. The average improvement of this optimization is ~40% and compares 16 bytes at a time using vector instructions. This patch is tested on powerpc64 and powerpc64le.
* powerpc: Optimization for strlen for POWER8.Carlos Eduardo Seo2016-04-154-4/+48
| | | | | This implementation takes advantage of vectorization to improve performance of the loop over the current strlen implementation for POWER7.
* powerpc: Add optimized P8 strspnPaul E. Murphy2016-04-075-1/+110
| | | | | | | This utilizes vectors and bitmasks. For small needle, large haystack, the performance improvement is upto 8x. For short strings (0-4B), the cost of computing the bitmask dominates, and is a tad slower.
* Update copyright dates with scripts/update-copyrights.Joseph Myers2016-01-04110-110/+110
|
* powerpc: make memchr use memchr-power7.Carlos Eduardo Seo2015-08-211-1/+13
| | | | | | | | | In powerpc64, memchr was always pointing to the internal __GI_memchr implementation. This patch fixes that and makes it use the optimized POWER7 version when adequate. * sysdeps/powerpc/powerpc64/multiarch/memchr-ppc64.c: Make memchr not point to the internal __GI_memchr implementation.
* powerpc: Fix stpcpy performance for power8Ondrej Bilka2015-08-111-2/+5
| | | | | | This patch fixes the missing enablement for stpcpy on POWER8. * sysdeps/powerpc/powerpc64/multiarch/stpcpy.c: Fix ifunc.
* powerpc: Use default strcpy optimization for POWER7Adhemerval Zanella2015-08-116-105/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patches uses the default strcpy/stpcpy implementation for POWER7/PPC64. This is faster in mostly inputs for benchtests and for multiarch the implementation uses the POWER7 strlen and memcpy. * string/stpcpy.c (__stpcpy): Use STPCPY to redefine symbol name and cleanup macro usage. * string/strcpy.c (strcpt): Use STRCPY to redefine symbol name. * sysdeps/powerpc/powerpc64/multiarch/stpcpy-power7.S: Remove file. * sysdeps/powerpc/powerpc64/multiarch/stpcpy-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strcpy-power7.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strcpy-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/power7/stpcpy.S: Likewise. * sysdeps/powerpc/powerpc64/power7/strcpy.S: Likewise. * sysdeps/powerpc/powerpc64/power7/strcpy.c: Likewise. * sysdeps/powerpc/powerpc64/stpcpy.S: Likewise. * sysdeps/powerpc/powerpc64/strcpy.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/stpcpy.c [SHARED && IS_IN (libc)]: Include <string/strcpy.c>. * sysdeps/powerpc/powerpc64/multiarch/stpcpy.c [SHARED && IS_IN (libc)]: Include <string/stpcpy.c>. * sysdeps/powerpc/powerpc64/multiarch/stpcpy-power7.c: New file. * sysdeps/powerpc/powerpc64/multiarch/stpcpy-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strcpy-power7.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strcpy-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/power7/strcpy.c: Likewise.
* powerpc: strstr optimizationRajalakshmi Srinivasaraghavan2015-07-165-1/+117
| | | | | | | | | | | | | | | | This patch optimizes strstr function for power >= 7 systems. Performance gain is obtained using aligned memory access and usage of cmpb instruction for quicker comparison. The average improvement of this optimization is ~40%. Tested on ppc64 and ppc64le. 2015-07-16 Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com> * sysdeps/powerpc/powerpc64/multiarch/Makefile: Add strstr(). * sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c: Likewise. * sysdeps/powerpc/powerpc64/power7/strstr.S: New File. * sysdeps/powerpc/powerpc64/multiarch/strstr-power7.S: New File. * sysdeps/powerpc/powerpc64/multiarch/strstr-ppc64.c: New File. * sysdeps/powerpc/powerpc64/multiarch/strstr.c: New File.
* powerpc: Fix memmove static buildAdhemerval Zanella2015-02-251-1/+3
| | | | | | | This patch fixes the missing "__memcpy_ppc" symbol for memmove-ppc64 object in static builds. Since memcpy ifunc is not enabled in static mode, the specialized symbols are not provided. The patch changed the it to just "__memcpy" instead.
* powerpc: wordcopy/memmove cleanup for ppc32Adhemerval Zanella2015-02-091-4/+0
| | | | | | | | This patch cleanup some multiarch code related to memmmove optimization. Initial IFUNC support added specialized wordcopy symbols which turned in local IFUNC calls used by memmove default implementation. The patch removes the internal IFUNC for wordcopy symbols and uses local branches in the memmmove optimization instead.
* powerpc: wordcopy/memmove cleanup for ppc64Adhemerval Zanella2015-02-095-95/+23
| | | | | | | | | | This patch cleanup some multiarch code related to memmmove optimization. Initial IFUNC support added specialized wordcopy symbols which turned in local IFUNC calls used by memmove default implementation. This change by removing then and used the optimized memmove instead for supported chips.
* powerpc: Remove POWER7 wordcopy ifuncAdhemerval Zanella2015-02-093-45/+9
| | | | | | This patch remove the POWER7 ifunc wordcopy function (_wordcopy_*_power7), since now GLIBC provides a optimized memmove/bcopy for POWER7.
* powerpc: Simplify bcopy default implementationAdhemerval Zanella2015-02-091-4/+6
| | | | | | This patch simplify the default bcopy symbol for powerpc64 by just using memmove instead of implementing using the default bcopy. Since the symbol is deprecated, it trades speed by code size.
* powerpc: multiarch Makefile cleanup for powerpc64Adhemerval Zanella2015-02-091-5/+10
| | | | | This patch cleanups the multiarch Makefile by putting the wide chars implementation to correct wcsmbs rule.
* powerpc: Optimized strncmp for POWER8/PPC64Adhemerval Zanella2015-01-134-5/+51
| | | | | | | | | | This patch adds an optimized POWER8 strncmp. The implementation focus on speeding up unaligned cases follwing the ideas of power8 strcmp. The algorithm first check the initial 16 bytes, then align the first function source and uses unaligned loads on second argument only. Aditional checks for page boundaries are done for unaligned cases (where sources alignment are different).
* powerpc: Optimized strcmp for POWER8/PPC64Adhemerval Zanella2015-01-134-3/+49
| | | | | | | This patch adds an optimized POWER8 strcmp using unaligned accesses. The algorithm first check the initial 16 bytes, then align the first function source and uses unaligned loads on second argument only. Aditional checks for page boundaries are done for unaligned cases
* powerpc: Optimized st{r,p}ncpy for POWER8/PPC64Adhemerval Zanella2015-01-136-6/+98
| | | | | | | | | | | | | This patch adds an optimized POWER8 st{r,p}ncpy using unaligned accesses. It shows 10%-80% improvement over the optimized POWER7 one that uses only aligned accesses, specially on unaligned inputs. The algorithm first read and check 16 bytes (if inputs do not cross a 4K page size). The it realign source to 16-bytes and issue a 16 bytes read and compare loop to speedup null byte checks for large strings. Also, different from POWER7 optimization, the null pad is done inline in the implementation using possible unaligned accesses, instead of realying on a memset call. Special case is added for page cross reads.
* powerpc: Optimized strncat for POWER7/PPC64Adhemerval Zanella2015-01-132-42/+31
| | | | | | | | | | | With 3eb38795dbbbd816 (Simplify strncat) the generic algorithms uses strlen, strnlen, and memcpy. This is faster than POWER7 current implementation, especially for unaligned strings (where POWER7 code uses byte-byte operations). This patch removes the assembly implementation and uses a multiarch specialization based on default algorithm calling optimized POWER7 symbols.
* powerpc: Optimized strcat for POWER8/PPC64Adhemerval Zanella2015-01-134-4/+40
| | | | | With new optimized strcpy for POWER8, this patch adds an optimized strcat which uses it along with default implementation at strings/.
* powerpc: Optimized st{r,p}cpy for POWER8/PPC64Adhemerval Zanella2015-01-135-3/+91
| | | | | | | | | | | | This patch adds an optimized POWER8 strcpy using unaligned accesses. For strings up to 16 bytes the implementation first calculate the string size, like strlen, and issues a memcpy. For larger strings, source is first aligned to 16 bytes and then tested over a loop that reads 16 bytes am combine the cmpb results for speedup. Special case is added for page cross reads. It shows 30%-60% improvement over the optimized POWER7 one that uses only aligned accesses.
* Update copyright dates with scripts/update-copyrights.Joseph Myers2015-01-02103-103/+103
|
* Fix strftime wcschr namespace (bug 17634).Joseph Myers2014-12-101-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use of strftime, a C90 function, ends up bringing in wcschr, which is not a C90 function. Although not a conformance bug (C90 reserves wcs*), this is still contrary to glibc practice of avoiding relying on those reservations; this patch arranges for the internal uses to use __wcschr instead, with wcschr being a weak alias. This is more complicated than some such patches because of the various IFUNC definitions of wcschr (which include code redefining libc_hidden_def in a way that involves creating __GI_wcschr manually and so also needs to create __GI___wcschr after the change of internal uses to use __wcschr). Tested for x86_64 and 32-bit x86 (testsuite, and that disassembly of installed shared libraries is unchanged by the patch). 2014-12-10 Joseph Myers <joseph@codesourcery.com> Adhemerval Zanella <azanella@linux.vnet.ibm.com> [BZ #17634] * wcsmbs/wcschr.c [!WCSCHR] (wcschr): Define as __wcschr. Undefine after defining function. Define as weak alias of __wcschr. Use libc_hidden_weak. * include/wchar.h (__wcschr): Declare. Use libc_hidden_proto. * sysdeps/i386/i686/multiarch/wcschr-c.c [IS_IN (libc) && SHARED] (libc_hidden_def): Also define __GI___wcschr alias. * sysdeps/i386/i686/multiarch/wcschr.S (wcschr): Rename to __wcschr and define as weak alias of __wcschr. * sysdeps/powerpc/power6/wcschr.c [!WCSCHR] (WCSCHR): Define as __wcschr. [!WCSCHR] (DEFAULT_WCSCHR): Define. [DEFAULT_WCSCHR] (__wcschr): Use libc_hidden_def. [DEFAULT_WCSCHR] (wcschr): Define as weak alias of __wcschr. Use libc_hidden_weak. Do not use libc_hidden_def. * sysdeps/powerpc/powerpc32/power4/multiarch/wcschr-ppc32.c [IS_IN (libc) && SHARED] (libc_hidden_def): Also define __GI___wcschr alias. * sysdeps/powerpc/powerpc32/power4/multiarch/wcschr.c [IS_IN (libc)] (wcschr): Define as macro expanding to __redirect_wcschr. [IS_IN (libc)] (__wcschr_ppc): Use __redirect_wcschr in typeof. [IS_IN (libc)] (__wcschr_power6): Likewise. [IS_IN (libc)] (__wcschr_power7): Likewise. [IS_IN (libc)] (__libc_wcschr): New. Define with libc_ifunc instead of wcschr. [IS_IN (libc)] (wcschr): Undefine and define as weak alias of __libc_wcschr. [!IS_IN (libc)] (libc_hidden_def): Do not undefine and redefine. * sysdeps/powerpc/powerpc64/multiarch/wcschr.c (wcschr): Rename to __wcschr and define as weak alias of __wcschr. Use libc_hidden_builtin_def. * sysdeps/x86_64/wcschr.S (wcschr): Rename to __wcschr and define as weak alias of __wcschr. Use libc_hidden_weak. * time/alt_digit.c (_nl_get_walt_digit): Use __wcschr instead of wcschr. * time/era.c (_nl_init_era_entries): Likewise. * conform/Makefile (test-xfail-ISO/time.h/linknamespace): Remove variable. (test-xfail-XPG3/time.h/linknamespace): Likewise. (test-xfail-XPG4/time.h/linknamespace): Likewise.
* powerpc: Add powerpc64 strpbrk optimizationAdhemerval Zanella2014-12-025-110/+1
| | | | | | This patch makes the POWER7 optimized strpbrk generic by using default doubleword stores to zero the hash, instead of VSX instructions. Performance on POWER7/POWER8 does not change.
* powerpc: Add powerpc64 strcspn optimizationAdhemerval Zanella2014-12-025-110/+0
| | | | | | This patch makes the POWER7 optimized strcspn generic by using default doubleword stores to zero the hash, instead of VSX instructions. Performance on POWER7/POWER8 does not change.