about summary refs log tree commit diff
path: root/sysdeps/powerpc/powerpc64
Commit message (Collapse)AuthorAgeFilesLines
* powerpc: Remove stpcpy internal clash with IFUNCAdhemerval Zanella2016-11-301-3/+1
| | | | | | | | | | | | | | | | | | | | | Commit 142e0a99530 redirected the internal stpcpy to default powerpc64 implementation by redefining the weak_alias at sysdeps/powerpc/powerpc64/multiarch/stpcpy-ppc64.c: #undef weak_alias #define weak_alias(name, aliasname) \ extern __typeof (__stpcpy_ppc) aliasname \ __attribute__ ((weak, alias ("__stpcpy_ppc"))); This creates a __GI_stpcpy alias that clashes with the IFUNC symbol in stpcpy.os. There is not need to define the default version for internal version, since ifunc should work internally for powerpc64. This patch removes the weak_alias indirection. Checked on powerpc64le. * sysdeps/powerpc/powerpc64/multiarch/stpcpy-ppc64.c (weak_alias): Remove redirection to __stpcpy_ppc.
* powerpc: Add hidden definition for __sigsetjmpFlorian Weimer2016-11-291-0/+11
| | | | | There already is a hidden prototype for __sigsetjmp, but the architecture-specific definition was missing.
* Define wordsize.h macros everywhereSteve Ellcey2016-11-041-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | * bits/wordsize.h: Add documentation. * sysdeps/aarch64/bits/wordsize.h : New file * sysdeps/generic/stdint.h (PTRDIFF_MIN, PTRDIFF_MAX): Update definitions. (SIZE_MAX): Change ifdef to if in __WORDSIZE32_SIZE_ULONG check. * sysdeps/gnu/bits/utmp.h (__WORDSIZE_TIME64_COMPAT32): Check with #if instead of #ifdef. * sysdeps/gnu/bits/utmpx.h (__WORDSIZE_TIME64_COMPAT32): Ditto. * sysdeps/mips/bits/wordsize.h (__WORDSIZE32_SIZE_ULONG, __WORDSIZE32_PTRDIFF_LONG, __WORDSIZE_TIME64_COMPAT32): Add or change defines. * sysdeps/powerpc/powerpc32/bits/wordsize.h: Likewise. * sysdeps/powerpc/powerpc64/bits/wordsize.h: Likewise. * sysdeps/s390/s390-32/bits/wordsize.h: Likewise. * sysdeps/s390/s390-64/bits/wordsize.h: Likewise. * sysdeps/sparc/sparc32/bits/wordsize.h: Likewise. * sysdeps/sparc/sparc64/bits/wordsize.h: Likewise. * sysdeps/tile/tilegx/bits/wordsize.h: Likewise. * sysdeps/tile/tilepro/bits/wordsize.h: Likewise. * sysdeps/unix/sysv/linux/alpha/bits/wordsize.h: Likewise. * sysdeps/unix/sysv/linux/powerpc/bits/wordsize.h: Likewise. * sysdeps/unix/sysv/linux/sparc/bits/wordsize.h: Likewise. * sysdeps/wordsize-32/bits/wordsize.h: Likewise. * sysdeps/wordsize-64/bits/wordsize.h: Likewise. * sysdeps/x86/bits/wordsize.h: Likewise.
* Fix cmpli usage in power6 memset.Joseph Myers2016-10-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | Building glibc for powerpc64 with recent (2.27.51.20161012) binutils, with multi-arch enabled, I get the error: ../sysdeps/powerpc/powerpc64/power6/memset.S: Assembler messages: ../sysdeps/powerpc/powerpc64/power6/memset.S:254: Error: operand out of range (5 is not between 0 and 1) ../sysdeps/powerpc/powerpc64/power6/memset.S:254: Error: operand out of range (128 is not between 0 and 31) ../sysdeps/powerpc/powerpc64/power6/memset.S:254: Error: missing operand Indeed, cmpli is documented as a four-operand instruction, and looking at nearby code it seems likely cmpldi was intended. This patch fixes this powerpc64 code accordingly, and makes a corresponding change to the powerpc32 code. Tested for powerpc, powerpc64 and powerpc64le by Tulio Magno Quites Machado Filho * sysdeps/powerpc/powerpc32/power6/memset.S (memset): Use cmplwi instead of cmpli. * sysdeps/powerpc/powerpc64/power6/memset.S (memset): Use cmpldi instead of cmpli.
* Stop powerpc copysignl raising "invalid" for sNaN argument (bug 20718).Joseph Myers2016-10-191-6/+4
| | | | | | | | | | | | | | | | | | | | | | The powerpc (hard-float) implementations of copysignl, both 32-bit and 64-bit, raise spurious "invalid" exceptions when the first argument is a signaling NaN. copysign functions should never raise exceptions even for signaling NaNs. The problem is the use of an fcmpu instruction to test the sign of the high part of the long double argument. This patch fixes the functions to use fsel instead (as used for fabsl following my fixes for a similar bug there), or to examine the integer representation for older 32-bit processors without fsel. Tested for powerpc64 and powerpc32 (configurations with and without fsel used). [BZ #20718] * sysdeps/powerpc/powerpc32/fpu/s_copysignl.S (__copysignl): Do not use floating-point comparisons to test sign. * sysdeps/powerpc/powerpc64/fpu/s_copysignl.S (__copysignl): Likewise.
* Use gcc attribute ifunc in libc_ifunc macro instead of inline assembly due ↵Stefan Liebler2016-10-0721-136/+201
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | to false debuginfo. The current s390 ifunc resolver for vector optimized functions and the common libc_ifunc macro in include/libc-symbols.h uses something like that to generate ifunc'ed functions: extern void *__resolve___strlen(unsigned long int dl_hwcap) asm (strlen); asm (".type strlen, %gnu_indirect_function"); This leads to false debug information: objdump --dwarf=info libc.so: ... <1><1e6424>: Abbrev Number: 43 (DW_TAG_subprogram) <1e6425> DW_AT_external : 1 <1e6425> DW_AT_name : (indirect string, offset: 0x1146e): __resolve___strlen <1e6429> DW_AT_decl_file : 1 <1e642a> DW_AT_decl_line : 23 <1e642b> DW_AT_linkage_name: (indirect string, offset: 0x1147a): strlen <1e642f> DW_AT_prototyped : 1 <1e642f> DW_AT_type : <0x1e4ccd> <1e6433> DW_AT_low_pc : 0x998e0 <1e643b> DW_AT_high_pc : 0x16 <1e6443> DW_AT_frame_base : 1 byte block: 9c (DW_OP_call_frame_cfa) <1e6445> DW_AT_GNU_all_call_sites: 1 <1e6445> DW_AT_sibling : <0x1e6459> <2><1e6449>: Abbrev Number: 44 (DW_TAG_formal_parameter) <1e644a> DW_AT_name : (indirect string, offset: 0x1845): dl_hwcap <1e644e> DW_AT_decl_file : 1 <1e644f> DW_AT_decl_line : 23 <1e6450> DW_AT_type : <0x1e4c8d> <1e6454> DW_AT_location : 0x122115 (location list) ... The debuginfo for the ifunc-resolver function contains the DW_AT_linkage_name field, which names the real function name "strlen". If you perform an inferior function call to strlen in lldb, then it fails due to something like that: "error: no matching function for call to 'strlen' candidate function not viable: no known conversion from 'const char [6]' to 'unsigned long' for 1st argument" The unsigned long is the dl_hwcap argument of the resolver function. The strlen function itself has no debufinfo. The s390 ifunc resolver for memset & co uses something like that: asm (".globl FUNC" ".type FUNC, @gnu_indirect_function" ".set FUNC, __resolve_FUNC"); This way the debuginfo for the ifunc-resolver function does not conain the DW_AT_linkage_name field and the real function has no debuginfo, too. Using this strategy for the vector optimized functions leads to some troubles for functions like strnlen. Here we have __strnlen and a weak alias strnlen. The __strnlen function is the ifunc function, which is realized with the asm- statement above. The weak_alias-macro can't be used here due to undefined symbol: gcc ../sysdeps/s390/multiarch/strnlen.c -c ... In file included from <command-line>:0:0: ../sysdeps/s390/multiarch/strnlen.c:28:24: error: ‘strnlen’ aliased to undefined symbol ‘__strnlen’ weak_alias (__strnlen, strnlen) ^ ./../include/libc-symbols.h:111:26: note: in definition of macro ‘_weak_alias’ extern __typeof (name) aliasname __attribute__ ((weak, alias (#name))); ^ ../sysdeps/s390/multiarch/strnlen.c:28:1: note: in expansion of macro ‘weak_alias’ weak_alias (__strnlen, strnlen) ^ make[2]: *** [build/string/strnlen.o] Error 1 As the __strnlen function is defined with asm-statements the function name __strnlen isn't known by gcc. But the weak alias can also be done with an asm statement to resolve this issue: __asm__ (".weak strnlen\n\t" ".set strnlen,__strnlen\n"); In order to use the weak_alias macro, gcc needs to know the ifunc function. The minimum gcc to build glibc is currently 4.7, which supports attribute((ifunc)). See https://gcc.gnu.org/onlinedocs/gcc-4.7.0/gcc/Function-Attributes.html. It is only supported if gcc is configured with --enable-gnu-indirect-function or gcc supports it by default for at least intel and s390x architecture. This patch uses the old behaviour if gcc support is not available. Usage of attribute ifunc is something like that: __typeof (FUNC) FUNC __attribute__ ((ifunc ("__resolve_FUNC"))); Then gcc produces the same .globl, .type, .set assembler instructions like above. And the debuginfo does not contain the DW_AT_linkage_name field and there is no debuginfo for the real function, too. But in order to get it work, there is also some extra work to do. Currently, the glibc internal symbol on s390x e.g. __GI___strnlen is not the ifunc symbol, but the fallback __strnlen_c symbol. Thus I have to omit the libc_hidden_def macro in strnlen.c (here is the ifunc function __strnlen) because it is already handled in strnlen-c.c (here is __strnlen_c). Due to libc_hidden_proto (__strnlen) in string.h, compiling fails: gcc ../sysdeps/s390/multiarch/strnlen.c -c ... In file included from <command-line>:0:0: ../sysdeps/s390/multiarch/strnlen.c:53:24: error: ‘strnlen’ aliased to undefined symbol ‘__strnlen’ weak_alias (__strnlen, strnlen) ^ ./../include/libc-symbols.h:111:26: note: in definition of macro ‘_weak_alias’ extern __typeof (name) aliasname __attribute__ ((weak, alias (#name))); ^ ../sysdeps/s390/multiarch/strnlen.c:53:1: note: in expansion of macro ‘weak_alias’ weak_alias (__strnlen, strnlen) ^ make[2]: *** [build/string/strnlen.os] Error 1 I have to redirect the prototypes for __strnlen in string.h and create a copy of the prototype for using as ifunc function: __typeof (__redirect___strnlen) __strnlen __attribute__ ((ifunc ("__resolve_strnlen"))); weak_alias (__strnlen, strnlen) This way there is no trouble with the internal __GI_* symbols. Glibc builds fine with this construct and the debuginfo is "correct". For functions without a __GI_* symbol like memccpy this redirection is not needed. This patch adjusts the common libc_ifunc and libm_ifunc macro to use gcc attribute ifunc. Due to this change, the macro users where the __GI_* symbol does not target the ifunc symbol have to be prepared with the redirection construct. Furthermore a configure check to test gcc support is added. If it is not supported, the old behaviour is used. This patch also prepares the libc_ifunc macro to be useable in s390-ifunc-macro. The s390 ifunc-resolver-functions do have an hwcaps parameter and not all resolvers need the same initialization code. The next patch in this series changes the s390 ifunc macros to use this common one. ChangeLog: * include/libc-symbols.h (__ifunc_resolver): New macro is used by __ifunc* macros. (__ifunc): New macro uses gcc attribute ifunc or inline assembly depending on HAVE_GCC_IFUNC. (libc_ifunc, libm_ifunc): Use __ifunc as base macro. (libc_ifunc_redirected, libc_ifunc_hidden, libm_ifunc_init): New macro. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finite.c: Redirect ifunced function in header for using as type for ifunc function. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finitef.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinf.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinff.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan.c: Likewise. * sysdeps/powerpc/powerpc32/power4/multiarch/memcmp.c: Likewise. * sysdeps/powerpc/powerpc32/power4/multiarch/memcpy.c: Likewise. * sysdeps/powerpc/powerpc32/power4/multiarch/memmove.c: Likewise. * sysdeps/powerpc/powerpc32/power4/multiarch/mempcpy.c: Likewise. * sysdeps/powerpc/powerpc32/power4/multiarch/memset.c: Likewise. * sysdeps/powerpc/powerpc32/power4/multiarch/rawmemchr.c: Likewise. * sysdeps/powerpc/powerpc32/power4/multiarch/strchr.c: Likewise. * sysdeps/powerpc/powerpc32/power4/multiarch/strlen.c: Likewise. * sysdeps/powerpc/powerpc32/power4/multiarch/strncmp.c: Likewise. * sysdeps/powerpc/powerpc32/power4/multiarch/strnlen.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_finitef.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinff.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/memcmp.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/mempcpy.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/rawmemchr.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/stpncpy.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strcat.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strchr.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strcmp.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strcpy.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strncmp.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strncpy.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strnlen.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strrchr.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strstr.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/wcschr.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnanf.c: Add libc_hidden_def() and use libc_ifunc_hidden() macro instead of libc_ifunc() macro. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnanf.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/stpcpy.c: Likewise.
* powerpc: Fix POWER9 impliesTulio Magno Quites Machado Filho2016-09-191-1/+0
| | | | | Fix multiarch build for POWER9 by correcting the order of the directories listed at sysnames configure variable.
* ppc: Fix modf (sNaN) for pre-POWER5+ CPU (bug 20240).Aurelien Jarno2016-07-081-0/+5
| | | | | | | | | | | | | | | | Commit a6a4395d fixed modf implementation by compiling s_modf.c and s_modff.c with -fsignaling-nans. However these files are also included from the pre-POWER5+ implementation, and thus these files should also be compiled with -fsignaling-nans. Changelog: [BZ #20240] * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile (CFLAGS-s_modf-ppc32.c): New variable. (CFLAGS-s_modff-ppc32.c): Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (CFLAGS-s_modf-ppc64.c): Likewise. (CFLAGS-s_modff-ppc64.c): Likewise.
* powerpc: Fix return code of strcasecmp for unaligned inputsRajalakshmi Srinivasaraghavan2016-07-051-3/+14
| | | | | | | If the input values are unaligned and if there are null characters in the memory before the starting address of the input values, strcasecmp gives incorrect return code. Fixed it by adding mask the bits that are not part of the string.
* powerpc: Add a POWER8-optimized version of sinf()Anton Blanchard2016-06-305-1/+604
| | | | | This uses the implementation of sinf() in sysdeps/x86_64/fpu/s_sinf.S as inspiration.
* powerpc: Add a POWER8-optimized version of expf()Tulio Magno Quites Machado Filho2016-06-305-1/+386
| | | | | | | | This implementation is based on the one already used at sysdeps/x86_64/fpu/e_expf.S. This implementation improves the performance by ~14% on average in synthetic benchmarks at the cost of decreasing accuracy to 1 ULP.
* Remove atomic_compare_and_exchange_bool_rel.Torvald Riegel2016-06-241-33/+0
| | | | | | | | | | | | | | | | | | | | | | | | atomic_compare_and_exchange_bool_rel and catomic_compare_and_exchange_bool_rel are removed and replaced with the new C11-like atomic_compare_exchange_weak_release. The concurrent code in nscd/cache.c has not been reviewed yet, so this patch does not add detailed comments. * nscd/cache.c (cache_add): Use new C11-like atomic operation instead of atomic_compare_and_exchange_bool_rel. * nptl/pthread_mutex_unlock.c (__pthread_mutex_unlock_full): Likewise. * include/atomic.h (atomic_compare_and_exchange_bool_rel, catomic_compare_and_exchange_bool_rel): Remove. * sysdeps/aarch64/atomic-machine.h (atomic_compare_and_exchange_bool_rel): Likewise. * sysdeps/alpha/atomic-machine.h (atomic_compare_and_exchange_bool_rel): Likewise. * sysdeps/arm/atomic-machine.h (atomic_compare_and_exchange_bool_rel): Likewise. * sysdeps/mips/atomic-machine.h (atomic_compare_and_exchange_bool_rel): Likewise. * sysdeps/tile/atomic-machine.h (atomic_compare_and_exchange_bool_rel): Likewise.
* Use generic fdim on more architectures (bug 6796, bug 20255, bug 20256).Joseph Myers2016-06-141-5/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some architectures have their own versions of fdim functions, which are missing errno setting (bug 6796) and may also return sNaN instead of qNaN for sNaN input, in the case of the x86 / x86_64 long double versions (bug 20256). These versions are not actually doing anything that a compiler couldn't generate, just straightforward comparisons / arithmetic (and, in the x86 / x86_64 case, testing for NaNs with fxam, which isn't actually needed once you use an unordered comparison and let the NaNs pass through the same subtraction as non-NaN inputs). This patch removes the x86 / x86_64 / powerpc versions, so that those architectures use the generic C versions, which correctly handle setting errno and deal properly with sNaN inputs. This seems better than dealing with setting errno in lots of .S versions. The i386 versions also return results with excess range and precision, which is not appropriate for a function exactly defined by reference to IEEE operations. For errno setting to work correctly on overflow, it's necessary to remove excess range with math_narrow_eval, which this patch duly does in the float and double versions so that the tests can reliably pass on x86. For float, this avoids any double rounding issues as the long double precision is more than twice that of float. For double, double rounding issues will need to be addressed separately, so this patch does not fully fix bug 20255. Tested for x86_64, x86 and powerpc. [BZ #6796] [BZ #20255] [BZ #20256] * math/s_fdim.c: Include <math_private.h>. (__fdim): Use math_narrow_eval on result. * math/s_fdimf.c: Include <math_private.h>. (__fdimf): Use math_narrow_eval on result. * sysdeps/i386/fpu/s_fdim.S: Remove file. * sysdeps/i386/fpu/s_fdimf.S: Likewise. * sysdeps/i386/fpu/s_fdiml.S: Likewise. * sysdeps/i386/i686/fpu/s_fdim.S: Likewise. * sysdeps/i386/i686/fpu/s_fdimf.S: Likewise. * sysdeps/i386/i686/fpu/s_fdiml.S: Likewise. * sysdeps/powerpc/fpu/s_fdim.c: Likewise. * sysdeps/powerpc/fpu/s_fdimf.c: Likewise. * sysdeps/powerpc/powerpc32/fpu/s_fdim.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_fdim.c: Likewise. * sysdeps/x86_64/fpu/s_fdiml.S: Likewise. * math/libm-test.inc (fdim_test_data): Expect errno setting on overflow. Add sNaN tests.
* powerpc: strcasecmp/strncasecmp optmization for power8raji2016-06-1411-49/+598
| | | | | | | This implementation utilizes vectors to improve performance compared to current byte by byte implementation for POWER7. The performance improvement is upto 4x. This patch is tested on powerpc64 and powerpc64le.
* powerpc: Fix --disable-multi-arch build on POWER8Tulio Magno Quites Machado Filho2016-06-065-6/+27
| | | | | | Add missing symbols of stpncpy and strcasestr when multi-arch is disabled. Fix memset call from strncpy/stpncpy when multi-arch is disabled.
* Fix powerpc64 ceil, rint etc. on sNaN input (bug 20160).Joseph Myers2016-05-2712-12/+84
| | | | | | | | | | | | | | | | | | | | | | | | | | The powerpc64 versions of ceil, floor, round, trunc, rint, nearbyint and their float versions return sNaN for sNaN input when they should return qNaN. This patch fixes them to add a NaN argument to itself to quiet sNaNs before returning. Tested for powerpc64. [BZ #20160] * sysdeps/powerpc/powerpc64/fpu/s_ceil.S (__ceil): Add NaN argument to itself before returning the result. * sysdeps/powerpc/powerpc64/fpu/s_ceilf.S (__ceilf): Likewise. * sysdeps/powerpc/powerpc64/fpu/s_floor.S (__floor): Likewise. * sysdeps/powerpc/powerpc64/fpu/s_floorf.S (__floorf): Likewise. * sysdeps/powerpc/powerpc64/fpu/s_nearbyint.S (__nearbyint): Likewise. * sysdeps/powerpc/powerpc64/fpu/s_nearbyintf.S (__nearbyintf): Likewise. * sysdeps/powerpc/powerpc64/fpu/s_rint.S (__rint): Likewise. * sysdeps/powerpc/powerpc64/fpu/s_rintf.S (__rintf): Likewise. * sysdeps/powerpc/powerpc64/fpu/s_round.S (__round): Likewise. * sysdeps/powerpc/powerpc64/fpu/s_roundf.S (__roundf): Likewise. * sysdeps/powerpc/powerpc64/fpu/s_trunc.S (__trunc): Likewise. * sysdeps/powerpc/powerpc64/fpu/s_truncf.S (__truncf): Likewise.
* Avoid "invalid" exceptions from powerpc fabsl (sNaN) (bug 20157).Joseph Myers2016-05-271-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The powerpc implementations of fabsl for ldbl-128ibm (both powerpc32 and powerpc64) wrongly raise the "invalid" exception for sNaN arguments. fabs functions should be quiet for all inputs including signaling NaNs. The problem is the use of a comparison instruction fcmpu to determine if the high part of the argument is negative and so the low part needs to be negated; such instructions raise "invalid" for sNaNs. There is a pure integer implementation of fabsl in sysdeps/ieee754/ldbl-128ibm/s_fabsl.c. However, it's not necessary to use it to avoid such exceptions. The fsel instruction does not raise exceptions for sNaNs, and can be used in place of the original comparison. (Note that if the high part is zero or a NaN, it does not matter whether the low part is negated; the choice of whether the low part of a zero is +0 or -0 does not affect the value, and the low part of a NaN does not affect the value / payload either.) The condition in GCC for fsel to be available is TARGET_PPC_GFXOPT, corresponding to the _ARCH_PPCGR predefined macro. fsel is available on all 64-bit processors supported by GCC. A few 32-bit processors supported by GCC do not have TARGET_PPC_GFXOPT despite having hard float support. To support those processors, integer code (similar to that in copysignl) is included for the !_ARCH_PPCGR case for powerpc32. Tested for powerpc32 (configurations with and without _ARCH_PPCGR) and powerpc64. [BZ #20157] * sysdeps/powerpc/powerpc32/fpu/s_fabsl.S (__fabsl): Use fsel to determine whether to negate low half if [_ARCH_PPCGR], and integer comparison otherwise. * sysdeps/powerpc/powerpc64/fpu/s_fabsl.S (__fabsl): Use fsel to determine whether to negate low half.
* Do not raise "inexact" from powerpc64 ceil, floor, trunc (bug 15479).Joseph Myers2016-05-256-18/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Continuing fixes for ceil, floor and trunc functions not to raise the "inexact" exception, this patch fixes the versions used on older powerpc64 processors. As was done with the round implementations some time ago, the save of floating-point state is moved after the first floating-point operation on the input to ensure that any "invalid" exception from signaling NaN input is included in the saved state, and then the whole state gets restored rather than just the rounding mode. This has no effect on configurations using the power5+ code, since such processors can do these operations with a single instruction (and those instructions do not set "inexact", so are correct for TS 18661-1 semantics). Tested for powerpc64. [BZ #15479] * sysdeps/powerpc/powerpc64/fpu/s_ceil.S (__ceil): Move save of floating-point state after first floating-point operation on input. Restore full floating-point state instead of just rounding mode. * sysdeps/powerpc/powerpc64/fpu/s_ceilf.S (__ceilf): Likewise. * sysdeps/powerpc/powerpc64/fpu/s_floor.S (__floor): Likewise. * sysdeps/powerpc/powerpc64/fpu/s_floorf.S (__floorf): Likewise. * sysdeps/powerpc/powerpc64/fpu/s_trunc.S (__trunc): Likewise. * sysdeps/powerpc/powerpc64/fpu/s_truncf.S (__truncf): Likewise.
* powerpc: Fix operand prefixesGabriel F. T. Gomes2016-05-041-4/+4
| | | | | | | | | | | | | | The file sysdeps/powerpc/sysdeps.h defines aliases for condition register operands. E.g.: 'cr7' means condition register 7. On the one hand, this increases readability, as it makes it easier for readers to know whether the operand is a condition register, a general purpose register or an immediate. On the other hand, this permits that condition registers be written as if they were general purpose, and vice-versa, thus reducing the readability of the code. This commit removes some of these unintentional misuses. The changes have no effect on the final code. Checked with objdump.
* powerpc: Zero pad using memset in strncpy/stpncpyGabriel F. T. Gomes2016-04-291-67/+56
| | | | | | | Call __memset_power8 to pad, with zeros, the remaining bytes in the dest string on __strncpy_power8 and __stpncpy_power8. This improves performance when n is larger than the input string, giving ~30% gain for larger strings without impacting much shorter strings.
* powerpc: Add optimized strcspn for P8Paul E. Murphy2016-04-258-29/+151
| | | | | A few minor adjustments to the P8 strspn gives us an almost equally optimized P8 strcspn.
* powerpc: strcasestr optmization for power8Rajalakshmi Srinivasaraghavan2016-04-228-1/+693
| | | | | | This patch optimizes strcasestr function for power >= 8 systems. The average improvement of this optimization is ~40% and compares 16 bytes at a time using vector instructions. This patch is tested on powerpc64 and powerpc64le.
* powerpc: Optimization for strlen for POWER8.Carlos Eduardo Seo2016-04-155-4/+345
| | | | | This implementation takes advantage of vectorization to improve performance of the loop over the current strlen implementation for POWER7.
* powerpc: Add optimized P8 strspnPaul E. Murphy2016-04-076-1/+289
| | | | | | | This utilizes vectors and bitmasks. For small needle, large haystack, the performance improvement is upto 8x. For short strings (0-4B), the cost of computing the bitmask dominates, and is a tad slower.
* Remove powerpc64 strspn, strcspn, and strpbrk implementationAdhemerval Zanella2016-04-013-406/+0
| | | | | | | | | | | | | This patch removes the powerpc64 optimized strspn, strcspn, and strpbrk assembly implementation now that the default C one implements the same strategy. On internal glibc benchtests current implementations shows similar performance with -O2. Tested on powerpc64le (POWER8). * sysdeps/powerpc/powerpc64/strcspn.S: Remove file. * sysdeps/powerpc/powerpc64/strpbrk.S: Remove file. * sysdeps/powerpc/powerpc64/strspn.S: Remove file.
* powerpc: Rearrange cfi_offset callsRajalakshmi Srinivasaraghavan2016-03-113-21/+21
| | | | | This patch rearranges cfi_offset() calls after the last store so as to avoid extra DW_CFA_advance opcodes in unwind information.
* Update copyright dates with scripts/update-copyrights.Joseph Myers2016-01-04305-305/+305
|
* powerpc: Add basic support for POWER9 sans hwcap.Carlos Eduardo Seo2015-12-224-0/+6
| | | | This patch adds the minimum changes for supporting the POWER9 processor.
* powerpc: Add hwcap/hwcap2/platform data to TCB.Carlos Eduardo Seo2015-12-031-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds a new feature for powerpc. In order to get faster access to the HWCAP/HWCAP2 bits and platform number (i.e. for implementing __builtin_cpu_is () / __builtin_cpu_supports () in GCC) without the overhead of reading from the auxiliary vector, we now reserve space for them in the TCB. This is an ABI change for GLIBC 2.23. A new versioned symbol '__parse_hwcap_and_convert_at_platform' is available to get the data from the auxiliary vector and parse it, and store it for later use in the TLS initialization code. This function is called very early (in _dl_sysdep_start () via DL_PLATFORM_INFO for the dynamic linking case, and in __libc_start_main () for the static linking case) to make sure the data is available at the time of TLS initialization. * sysdeps/powerpc/Makefile (sysdep-dl-routines): Add hwcapinfo. (sysdep_routines): Likewise. (sysdep-rtld-routines): Likewise. [$(subdir) = nptl](tests): Add test-get_hwcap and test-get_hwcap-static [$(subdir) = nptl](tests-static): test-get_hwcap-static * sysdeps/powerpc/Versions: Added new __parse_hwcap_and_convert_at_platform symbol to GLIBC-2.23. * sysdeps/powerpc/hwcapinfo.c: New file. (__tcb_parse_hwcap_and_convert_at_platform): New function to initialize and parse hwcap, hwcap2 and platform number information. * sysdeps/powerpc/hwcapinfo.h: New file. Creates global variables to store HWCAP+HWCAP2 and platform number. * sysdeps/powerpc/nptl/tcb-offsets.sym: Added new offsets for HWCAP+HWCAP2 and platform number in the TCB. * sysdeps/powerpc/nptl/tls.h: New functionality. Stores the HWCAP, HWCAP2 and platform number in the TCB. (dtv): Added new fields for HWCAP+HWCAP2 and platform number. (TLS_INIT_TP): Included calls to add the hwcap and at_platform values in the TCB in TP initialization. (TLS_DEFINE_INIT_TP): Likewise. (THREAD_GET_HWCAP): New macro. (THREAD_SET_HWCAP): Likewise. (THREAD_GET_AT_PLATFORM): Likewise. (THREAD_SET_AT_PLATFORM): Likewise. * sysdeps/powerpc/powerpc32/dl-machine.h: (dl_platform_init): New function that calls __parse_hwcap_and_convert_at_platform for the dymanic linking case for powerpc32. * sysdeps/powerpc/powerpc64/dl-machine.h: Likewise, for powerpc64. * sysdeps/powerpc/test-get_hwcap-static.c: New file. Testcase for this functionality, static linking case. * sysdeps/powerpc/test-get_hwcap.c: New file. Likewise, dynamic linking case. * sysdeps/unix/sysv/linux/powerpc/libc-start.c: Added call to __parse_hwcap_and_convert_at_platform for the static linking case. * sysdeps/unix/sysv/linux/powerpc/powerpc32/ld.abilist: Included the new __parse_hwcap_and_convert_at_platform symbol in the ABI list for GLIBC 2.23. * sysdeps/unix/sysv/linux/powerpc/powerpc64/ld-le.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/ld.abilist: Likewise.
* Fix powerpc round, roundf spurious "inexact" (bug 19238).Joseph Myers2015-11-122-6/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The powerpc hard-float round and roundf functions, both 32-bit and 64-bit, raise spurious "inexact" exceptions for integer arguments from adding 0.5 and rounding to integer toward zero. Since these functions already save and restore the rounding mode, it's natural to make them restore the full floating-point state instead to fix this bug, which this patch does. The save of the state is moved after the first floating-point operation on the input so that any "invalid" exceptions from signaling NaN inputs are properly preserved. As a consequence of this approach to the fix, "inexact" for noninteger arguments (disallowed by TS 18661-1 but not by C99/C11, see bug 15479) is also avoided for these implementations; this is *not* a general fix for bug 15479 since plenty of other implementations of various functions still raise spurious "inexact" for noninteger arguments. This issue and fix do not apply to builds using power5+ versions of round and roundf, which use the frin instruction and avoid "inexact" exceptions that way. This patch should get hard-float powerpc32 and powerpc64 (default function implementations) back to a state where test-float and test-double will pass after ulps regeneration. Tested for powerpc32 and powerpc64. [BZ #15479] [BZ #19238] * sysdeps/powerpc/powerpc32/fpu/s_round.S (__round): Save floating-point state after first operation on input. Restore full state rather than just rounding mode. * sysdeps/powerpc/powerpc32/fpu/s_roundf.S (__roundf): Likewise. * sysdeps/powerpc/powerpc64/fpu/s_round.S (__round): Likewise. * sysdeps/powerpc/powerpc64/fpu/s_roundf.S (__roundf): Likewise.
* Fix powerpc64 lround, lroundf, llround, llroundf spurious "inexact" ↵Joseph Myers2015-11-122-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | exceptions (bug 19235). Similar to bug 19134 for powerpc32, the powerpc64 implementations of lround, lroundf, llround, llroundf can raise spurious "inexact" exceptions for integer arguments from adding 0.5 then converting to integer (this does not apply to the power5+ version for double, which uses the frin instruction which is defined never to raise "inexact"; I don't know why power5+ doesn't use that version for float as well). This patch fixes the bug in a similar way to the powerpc32 bug, by testing for integers (adding and subtracting 2^52 and comparing with the value before that addition and subtraction) and not adding 0.5 in that case. The powerpc maintainers may wish to look at making power5+ / power6x / power8 use frin for float lround / llround as well as for double, unless there's some reason I've missed that this isn't beneficial. Tested for powerpc64. [BZ #19235] * sysdeps/powerpc/powerpc64/fpu/s_llround.S (__llround): Do not add 0.5 to integer arguments. * sysdeps/powerpc/powerpc64/fpu/s_llroundf.S (__llroundf): Likewise. (.LC2): New object.
* Fix powerpc nearbyint wrongly clearing "inexact" and leaving traps disabled ↵Joseph Myers2015-11-112-4/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (bug 19228). Similar to bug 15491 recently fixed for x86_64 / x86, the powerpc (both powerpc32 and powerpc64) hard-float implementations of nearbyintf and nearbyint wrongly clear an "inexact" exception that was raised before the function was called; this shows up as failure of the test math/test-nearbyint-except added when that bug was fixed. They also wrongly leave traps on "inexact" disabled if they were enabled before the function was called. This patch fixes the bugs similar to how the x86 bug was fixed: saving and restoring the whole floating-point state, both to restore the original "inexact" flag state and to restore the original state of whether traps on "inexact" were enabled. Because there's a convenient point in the powerpc implementations to save state after any sNaN arguments will have raised "invalid" but before "inexact" traps need to be disabled, no special handling for "invalid" is needed as in the x86 version. Tested for powerpc64 and powerpc32, where it fixes the math/test-nearbyint-except failure as well as fixing the new test math/test-nearbyint-except-2 added by this patch. Also tested for x86_64 and x86 that the new test passes. If powerpc experts see a more efficient way of doing this (e.g. instruction positioning that's better for pipelines on typical processors) then of course followups optimizing the fix are welcome. [BZ #19228] * sysdeps/powerpc/powerpc32/fpu/s_nearbyint.S (__nearbyint): Save and restore full floating-point state. * sysdeps/powerpc/powerpc32/fpu/s_nearbyintf.S (__nearbyintf): Likewise. * sysdeps/powerpc/powerpc64/fpu/s_nearbyint.S (__nearbyint): Likewise. * sysdeps/powerpc/powerpc64/fpu/s_nearbyintf.S (__nearbyintf): Likewise. * math/test-nearbyint-except-2.c: New file. * math/Makefile (tests): Add test-nearbyint-except-2.
* PowerPC: Add comments to optimized strncpyGabriel F. T. Gomes2015-10-011-2/+38
| | | | | * sysdeps/powerpc/powerpc64/power8/strncpy.S: Added comments to some assembly instructions.
* PowerPC: Fix operand prefixesGabriel F. T. Gomes2015-10-011-6/+6
| | | | | | | | | | | | | | | | | | | | The file sysdeps/powerpc/sysdeps.h defines aliases for register operands, which add the letter 'r' as a prefix to a register name. E.g.: register 20 can be written as 'r20', instead of '20'. On the one hand, this increases readability, as it makes it easier for readers to know whether the operand is a register or an immediate. On the other hand, this permits that immediate operands be written as if they were registers, and vice-versa, thus reducing the readability of the code. This commit removes some of these unintentional misuses. This commit also increases readability of the code by adding the prefix 'cr' to some uses of the control register. Both changes have no effect on the final code. Checked with objdump. * sysdeps/powerpc/powerpc64/power8/strncpy.S: Remove or add register prefix from operands.
* Move bits/atomic.h to atomic-machine.h (bug 14912).Joseph Myers2015-09-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It was noted in <https://sourceware.org/ml/libc-alpha/2012-09/msg00305.html> that the bits/*.h naming scheme should only be used for installed headers. This patch renames bits/atomic.h to atomic-machine.h to follow that convention. This is the only change in this series that needs to change the filename rather than simply removing a directory level (because both atomic.h and bits/atomic.h exist at present). Tested for x86_64 (testsuite, and that installed stripped shared libraries are unchanged by the patch). [BZ #14912] * sysdeps/aarch64/bits/atomic.h: Move to ... * sysdeps/aarch64/atomic-machine.h: ...here. (_AARCH64_BITS_ATOMIC_H): Rename macro to _AARCH64_ATOMIC_MACHINE_H. * sysdeps/alpha/bits/atomic.h: Move to ... * sysdeps/alpha/atomic-machine.h: ...here. * sysdeps/arm/bits/atomic.h: Move to ... * sysdeps/arm/atomic-machine.h: ...here. Update comments. * bits/atomic.h: Move to ... * sysdeps/generic/atomic-machine.h: ...here. (_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H. * sysdeps/i386/bits/atomic.h: Move to ... * sysdeps/i386/atomic-machine.h: ...here. * sysdeps/ia64/bits/atomic.h: Move to ... * sysdeps/ia64/atomic-machine.h: ...here. * sysdeps/m68k/coldfire/bits/atomic.h: Move to ... * sysdeps/m68k/coldfire/atomic-machine.h: ...here. (_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H. * sysdeps/m68k/m680x0/m68020/bits/atomic.h: Move to ... * sysdeps/m68k/m680x0/m68020/atomic-machine.h: ...here. * sysdeps/microblaze/bits/atomic.h: Move to ... * sysdeps/microblaze/atomic-machine.h: ...here. * sysdeps/mips/bits/atomic.h: Move to ... * sysdeps/mips/atomic-machine.h: ...here. (_MIPS_BITS_ATOMIC_H): Rename macro to _MIPS_ATOMIC_MACHINE_H. * sysdeps/powerpc/bits/atomic.h: Move to ... * sysdeps/powerpc/atomic-machine.h: ...here. Update comments. * sysdeps/powerpc/powerpc32/bits/atomic.h: Move to ... * sysdeps/powerpc/powerpc32/atomic-machine.h: ...here. Update comments. Include <atomic-machine.h> instead of <bits/atomic.h>. * sysdeps/powerpc/powerpc64/bits/atomic.h: Move to ... * sysdeps/powerpc/powerpc64/atomic-machine.h: ...here. Include <atomic-machine.h> instead of <bits/atomic.h>. * sysdeps/s390/bits/atomic.h: Move to ... * sysdeps/s390/atomic-machine.h: ...here. * sysdeps/sparc/sparc32/bits/atomic.h: Move to ... * sysdeps/sparc/sparc32/atomic-machine.h: ...here. (_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H. * sysdeps/sparc/sparc32/sparcv9/bits/atomic.h: Move to ... * sysdeps/sparc/sparc32/sparcv9/atomic-machine.h: ...here. * sysdeps/sparc/sparc64/bits/atomic.h: Move to ... * sysdeps/sparc/sparc64/atomic-machine.h: ...here. * sysdeps/tile/bits/atomic.h: Move to ... * sysdeps/tile/atomic-machine.h: ...here. * sysdeps/tile/tilegx/bits/atomic.h: Move to ... * sysdeps/tile/tilegx/atomic-machine.h: ...here. Include <sysdeps/tile/atomic-machine.h> instead of <sysdeps/tile/bits/atomic.h>. (_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H. * sysdeps/tile/tilepro/bits/atomic.h: Move to ... * sysdeps/tile/tilepro/atomic-machine.h: ...here. Include <sysdeps/tile/atomic-machine.h> instead of <sysdeps/tile/bits/atomic.h>. (_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H. * sysdeps/unix/sysv/linux/arm/bits/atomic.h: Move to ... * sysdeps/unix/sysv/linux/arm/atomic-machine.h: ...here. Include <sysdeps/arm/atomic-machine.h> instead of <sysdeps/arm/bits/atomic.h>. * sysdeps/unix/sysv/linux/hppa/bits/atomic.h: Move to ... * sysdeps/unix/sysv/linux/hppa/atomic-machine.h: ...here. (_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H. * sysdeps/unix/sysv/linux/m68k/coldfire/bits/atomic.h: Move to ... * sysdeps/unix/sysv/linux/m68k/coldfire/atomic-machine.h: ...here. (_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H. * sysdeps/unix/sysv/linux/nios2/bits/atomic.h: Move to ... * sysdeps/unix/sysv/linux/nios2/atomic-machine.h: ...here. (_NIOS2_BITS_ATOMIC_H): Rename macro to _NIOS2_ATOMIC_MACHINE_H. * sysdeps/unix/sysv/linux/sh/bits/atomic.h: Move to ... * sysdeps/unix/sysv/linux/sh/atomic-machine.h: ...here. * sysdeps/x86_64/bits/atomic.h: Move to ... * sysdeps/x86_64/atomic-machine.h: ...here. * include/atomic.h: Include <atomic-machine.h> instead of <bits/atomic.h>.
* powerpc: Fix tabort usage in syscallsPaul E. Murphy2015-08-251-2/+2
| | | | | | | | | | | | | | | | | | | Fix usage of tabort in generated syscalls. r0 has special meaning when used with this instruction, thus it will not generate persistent errors, nor return an error code. This mitigates poor CPU usage when performing elided critical sections. Additionally, transactions should be aborted when entering a user invoked syscall. Otherwise the results of the transaction may be undefined. 2015-08-25 Paul E. Murphy <murphyp@linux.vnet.ibm.com> * sysdeps/powerpc/powerpc32/sysdep.h (ABORT_TRANSACTION): Use register other than r0 for tabort, it has special meaning. * sysdeps/powerpc/powerpc64/sysdep.h (ABORT_TRANSACTION): Likewise * sysdeps/unix.sysv/linux/powerpc/syscall.S (syscall): Abort transaction before starting syscall.
* powerpc: Handle worstcase behavior in strstr() for POWER7Rajalakshmi Srinivasaraghavan2015-08-251-7/+15
| | | | | | | | | | Instead of checking needle length, constant 'n' number of comparisons is checked to fall back to default implementation. This patch is tested on powerpc64 and powerpc64le. 2015-08-25 Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com> * sysdeps/powerpc/powerpc64/power7/strstr.S: Handle worst case.
* powerpc: make memchr use memchr-power7.Carlos Eduardo Seo2015-08-211-1/+13
| | | | | | | | | In powerpc64, memchr was always pointing to the internal __GI_memchr implementation. This patch fixes that and makes it use the optimized POWER7 version when adequate. * sysdeps/powerpc/powerpc64/multiarch/memchr-ppc64.c: Make memchr not point to the internal __GI_memchr implementation.
* powerpc: Fix stpcpy performance for power8Ondrej Bilka2015-08-111-2/+5
| | | | | | This patch fixes the missing enablement for stpcpy on POWER8. * sysdeps/powerpc/powerpc64/multiarch/stpcpy.c: Fix ifunc.
* powerpc: Fix PPC64/POWER7 conform testsAdhemerval Zanella2015-08-112-3/+4
| | | | | | | | | | | When building with --disable-multi-arch the memmove and strstr POWER7 optimization create and uses symbols that conflict with expect conform tests. * sysdeps/powerpc/powerpc64/power7/memmove.S (bcopy): Changing to __bcopy and add a weak_alias to bcopy. * sysdeps/powerpc/powerpc64/power7/strstr.S (strstr): Use __strnlen for static build.
* powerpc: Use default strcpy optimization for POWER7Adhemerval Zanella2015-08-119-794/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patches uses the default strcpy/stpcpy implementation for POWER7/PPC64. This is faster in mostly inputs for benchtests and for multiarch the implementation uses the POWER7 strlen and memcpy. * string/stpcpy.c (__stpcpy): Use STPCPY to redefine symbol name and cleanup macro usage. * string/strcpy.c (strcpt): Use STRCPY to redefine symbol name. * sysdeps/powerpc/powerpc64/multiarch/stpcpy-power7.S: Remove file. * sysdeps/powerpc/powerpc64/multiarch/stpcpy-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strcpy-power7.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strcpy-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/power7/stpcpy.S: Likewise. * sysdeps/powerpc/powerpc64/power7/strcpy.S: Likewise. * sysdeps/powerpc/powerpc64/power7/strcpy.c: Likewise. * sysdeps/powerpc/powerpc64/stpcpy.S: Likewise. * sysdeps/powerpc/powerpc64/strcpy.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/stpcpy.c [SHARED && IS_IN (libc)]: Include <string/strcpy.c>. * sysdeps/powerpc/powerpc64/multiarch/stpcpy.c [SHARED && IS_IN (libc)]: Include <string/stpcpy.c>. * sysdeps/powerpc/powerpc64/multiarch/stpcpy-power7.c: New file. * sysdeps/powerpc/powerpc64/multiarch/stpcpy-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strcpy-power7.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strcpy-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/power7/strcpy.c: Likewise.
* powerpc: Fix strnlen/power7 buildAdhemerval Zanella2015-08-111-1/+2
| | | | This patch fixes the strnlen.S build with --disable-multi-arch option.
* powerpc: Fix strstr/power7 buildAdhemerval Zanella2015-08-112-0/+28
| | | | | | | | | | | | This patch fixes the strstr build with --disable-multi-arch option. The optimization calls the __strstr_ppc symbol, which always build for multiarch config but not if it is disable. This patch fixes it by adding the default C implementation object with the expected symbol name. * sysdeps/powerpc/powerpc64/power7/Makefile [$(subdir) = string] (sysdep_routines): Add strstr-ppc64. * sysdeps/powerpc/powerpc64/power7/strstr-ppc64.c: New file.
* powerpc: strstr optimizationRajalakshmi Srinivasaraghavan2015-07-166-1/+626
| | | | | | | | | | | | | | | | This patch optimizes strstr function for power >= 7 systems. Performance gain is obtained using aligned memory access and usage of cmpb instruction for quicker comparison. The average improvement of this optimization is ~40%. Tested on ppc64 and ppc64le. 2015-07-16 Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com> * sysdeps/powerpc/powerpc64/multiarch/Makefile: Add strstr(). * sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c: Likewise. * sysdeps/powerpc/powerpc64/power7/strstr.S: New File. * sysdeps/powerpc/powerpc64/multiarch/strstr-power7.S: New File. * sysdeps/powerpc/powerpc64/multiarch/strstr-ppc64.c: New File. * sysdeps/powerpc/powerpc64/multiarch/strstr.c: New File.
* libc-vdso.h place consolidationAdhemerval Zanella2015-04-201-1/+1
| | | | | | This patch moves the libc-vdso.h internal header from bits folder to default architecture one and also corrects the remaning includes in the files.
* Harden powerpc64 elf_machine_fixup_pltAlan Modra2015-03-261-5/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | IFUNC is difficult to correctly implement on any target needing a GOT to support position independent code, due to the dependency on order of dynamic relocations. ld.so should be changed to apply IFUNC relocations last, globally, because without that it is actually impossible to write an IFUNC resolver in C that works in all situations. Case in point, vfork in libpthread.so is an IFUNC with the resolver returning &__libc_vfork. (system and fork are similar.) If another shared library, libA say, uses vfork then it is quite possible that libpthread.so hasn't been dynamically relocated before the unfortunate libA is dynamically relocated. In that case the GOT entry for &__libc_vfork is still zero, so the IFUNC resolver returns NULL. LD_BIND_NOW=1 results in libA PLT dynamic relocations being applied using this NULL value and ld.so segfaults. This patch hardens ld.so to not segfault on a NULL from an IFUNC resolver. It also fixes a problem with undefined weak. If you leave the plt entry as-is for undefined weak then if the entry is ever called it will loop in ld.so rather than segfaulting. * sysdeps/powerpc/powerpc64/dl-machine.h (elf_machine_fixup_plt): Don't segfault if ifunc resolver returns a NULL. Do set plt to zero for undefined weak. (elf_machine_plt_conflict): Similarly.
* powerpc __tls_get_addr call optimizationAlan Modra2015-03-251-0/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch is glibc support for a PowerPC TLS optimization, inspired by Alexandre Oliva's TLS optimization for other processors, http://www.lsd.ic.unicamp.br/~oliva/writeups/TLS/RFC-TLSDESC-x86.txt In essence, this optimization uses a zero module id in the tls_index GOT entry to indicate that a TLS variable is allocated space in the static TLS area. A special plt call linker stub for __tls_get_addr checks for such a tls_index and if found, returns the offset immediately. The linker communicates the fact that the special __tls_get_addr stub is used by setting a bit in the dynamic tag DT_PPC64_OPT/DT_PPC_OPT. glibc communicates to the linker that this optimization is available by the presence of __tls_get_addr_opt. tst-tlsmod2.so is built with -Wl,--no-tls-get-addr-optimize for tst-tls-dlinfo, which otherwise would fail since it tests that no static tls is allocated. The ld option --no-tls-get-addr-optimize has been available since binutils-2.20 so doesn't need a configure test. * NEWS: Advertise TLS optimization. * elf/elf.h (R_PPC_TLSGD, R_PPC_TLSLD, DT_PPC_OPT, PPC_OPT_TLS): Define. (DT_PPC_NUM): Increment. * elf/dynamic-link.h (HAVE_STATIC_TLS): Define. (CHECK_STATIC_TLS): Use here. * sysdeps/powerpc/powerpc32/dl-machine.h (elf_machine_rela): Optimize TLS descriptors. * sysdeps/powerpc/powerpc64/dl-machine.h (elf_machine_rela): Likewise. * sysdeps/powerpc/dl-tls.c: New file. * sysdeps/powerpc/Versions: Add __tls_get_addr_opt. * sysdeps/powerpc/tst-tlsopt-powerpc.c: New tls test. * sysdeps/unix/sysv/linux/powerpc/Makefile: Add new test. Build tst-tlsmod2.so with --no-tls-get-addr-optimize. * sysdeps/unix/sysv/linux/powerpc/powerpc32/ld.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/ld.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/ld-le.abilist: Likewise.
* powerpc64 configure messageAlan Modra2015-03-252-3/+3
| | | | | | | | | This feature doesn't depend on the linker, as can be seen from the actual test. It's a compiler feature. * sysdeps/powerpc/powerpc64/configure.ac: Correct "linker support for overlapping .opd entries" to "support...". * sysdeps/powerpc/powerpc64/configure: Regenerate
* powerpc: Remove HAVE_ASM_GLOBAL_DOT_NAME defineAdhemerval Zanella2015-03-114-85/+11
| | | | | | | | | With AIX port deprecated there is no need to check/define HAVE_ASM_GLOBAL_DOT_NAME anymore since the current minimum binutils supported (2.22) does not emit global symbol with dot. This patch removes all the HAVE_ASM_GLOBAL_DOT_NAME definition and checks for powerpc64 port.
* Replace ELF_RTYPE_CLASS_NOCOPY with ELF_RTYPE_CLASS_COPYH.J. Lu2015-03-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ELF_RTYPE_CLASS_NOCOPY in comments is a typo. It should be ELF_RTYPE_CLASS_COPY. [BZ #18082] * sysdeps/alpha/dl-machine.h (elf_machine_type_class): Replace ELF_RTYPE_CLASS_NOCOPY with ELF_RTYPE_CLASS_COPY in comments. * sysdeps/arm/dl-machine.h (elf_machine_type_class): Likewise. * sysdeps/hppa/dl-machine.h (elf_machine_type_class): Likewise. * sysdeps/i386/dl-machine.h (elf_machine_type_class): Likewise. * sysdeps/ia64/dl-machine.h (elf_machine_type_class): Likewise. * sysdeps/m68k/dl-machine.h (elf_machine_type_class): Likewise. * sysdeps/microblaze/dl-machine.h (elf_machine_type_class): Likewise. * sysdeps/nios2/dl-machine.h (elf_machine_type_class): Likewise. * sysdeps/powerpc/powerpc32/dl-machine.h (elf_machine_type_class): Likewise. * sysdeps/powerpc/powerpc64/dl-machine.h (elf_machine_type_class): Likewise. * sysdeps/s390/s390-32/dl-machine.h (elf_machine_type_class): Likewise. * sysdeps/s390/s390-64/dl-machine.h (elf_machine_type_class): Likewise. * sysdeps/sh/dl-machine.h (elf_machine_type_class): Likewise. * sysdeps/sparc/sparc32/dl-machine.h (elf_machine_type_class): Likewise. * sysdeps/sparc/sparc64/dl-machine.h (elf_machine_type_class): Likewise. * sysdeps/tile/dl-machine.h (elf_machine_type_class): Likewise. * sysdeps/x86_64/dl-machine.h (elf_machine_type_class): Likewise.