path: root/sysdeps/powerpc
Commit message | Author | Age | Files | Lines
* PowerPC: memset optimization for POWER8/PPC64 | Adhemerval Zanella | 2014-09-10 | 6 | -9/+513
  This patch adds an optimized memset implementation for POWER8. For sizes from 0 to 255 bytes, a word/doubleword algorithm similar to the POWER7 optimized one is used. For sizes larger than 255 bytes, two strategies are used:
  1. If the constant is nonzero, the memory is written with altivec vector instructions;
  2. If the constant is 0, dcbz instructions are used. The loop is unrolled to clear 512 bytes at a time.
  Using vector instructions increases throughput considerably, doubling performance for sizes larger than 1024 bytes. Unrolling the dcbz loop also improves performance, doubling throughput for sizes larger than 8192 bytes.
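The size and fill-value dispatch described in the memset entry above can be sketched in portable C. This is only an illustration of the decision logic: the real implementation is PowerPC assembly, and the enum, the function names, and the self-check are hypothetical names introduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical labels for the three code paths the commit message describes.  */
enum memset_strategy
{
  STRATEGY_WORD,    /* 0..255 bytes: word/doubleword stores.  */
  STRATEGY_VECTOR,  /* More than 255 bytes, nonzero fill: vector stores.  */
  STRATEGY_DCBZ     /* More than 255 bytes, zero fill: dcbz block clears.  */
};

/* Pick a path from the fill value and the length.  memset converts its
   int argument to unsigned char, so only the low byte decides whether
   the fill counts as zero.  */
enum memset_strategy
choose_memset_strategy (int c, size_t n)
{
  if (n <= 255)
    return STRATEGY_WORD;
  return ((unsigned char) c != 0) ? STRATEGY_VECTOR : STRATEGY_DCBZ;
}

/* Usage check at the 255/256 byte boundary.  */
void
memset_strategy_selfcheck (void)
{
  assert (choose_memset_strategy (0x41, 255) == STRATEGY_WORD);
  assert (choose_memset_strategy (0x41, 256) == STRATEGY_VECTOR);
  assert (choose_memset_strategy (0, 256) == STRATEGY_DCBZ);
}
```

The thresholds at which each path actually pays off (1024 and 8192 bytes in the message) are measurements from the commit, not something this sketch models.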
* PowerPC: multiarch bzero cleanup for PPC64 | Adhemerval Zanella | 2014-09-10 | 10 | -91/+15
  This patch cleans up the multiarch bzero for powerpc64 by removing the multiarch objects and instead using the embedded memset implementation present in each multiarch optimization. The generated code is essentially the same, except for the TB_TOCLESS (which is not essential).
* Define __GI_fegetenv for e500 libm | Khem Raj | 2014-09-02 | 1 | -0/+1
  The generic HAVE_RM_CTX implementation, which is also used for ppc/e500, introduced calls to fegetenv that should be resolved internally within libm.
  Signed-off-by: Khem Raj <raj.khem@gmail.com>
  * sysdeps/powerpc/powerpc32/e500/nofpu/fegetenv.c (fegetenv): Add libm_hidden_ver.
* Remove unnecessary uses of NOT_IN_libc | Siddhesh Poyarekar | 2014-08-21 | 2 | -2/+2
  If an IS_IN_* macro is defined, then NOT_IN_libc is always defined, except obviously for IS_IN_libc. There's no need to check for both. Verified on x86_64 and i686 that the source is unchanged.
  * include/libc-symbols.h: Remove unnecessary check for NOT_IN_libc.
  * nptl/pthreadP.h: Likewise.
  * sysdeps/aarch64/setjmp.S: Likewise.
  * sysdeps/alpha/setjmp.S: Likewise.
  * sysdeps/arm/sysdep.h: Likewise.
  * sysdeps/i386/setjmp.S: Likewise.
  * sysdeps/m68k/setjmp.c: Likewise.
  * sysdeps/posix/getcwd.c: Likewise.
  * sysdeps/powerpc/powerpc32/setjmp-common.S: Likewise.
  * sysdeps/powerpc/powerpc64/setjmp-common.S: Likewise.
  * sysdeps/s390/s390-32/setjmp.S: Likewise.
  * sysdeps/s390/s390-64/setjmp.S: Likewise.
  * sysdeps/sh/sh3/setjmp.S: Likewise.
  * sysdeps/sh/sh4/setjmp.S: Likewise.
  * sysdeps/unix/alpha/sysdep.h: Likewise.
  * sysdeps/unix/sysv/linux/aarch64/sysdep.h: Likewise.
  * sysdeps/unix/sysv/linux/i386/sysdep.h: Likewise.
  * sysdeps/unix/sysv/linux/ia64/setjmp.S: Likewise.
  * sysdeps/unix/sysv/linux/ia64/sysdep.h: Likewise.
  * sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep.h: Likewise.
  * sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h: Likewise.
  * sysdeps/unix/sysv/linux/s390/s390-32/sysdep.h: Likewise.
  * sysdeps/unix/sysv/linux/s390/s390-64/sysdep.h: Likewise.
  * sysdeps/unix/sysv/linux/sh/sysdep.h: Likewise.
  * sysdeps/unix/sysv/linux/sparc/sparc32/sysdep.h: Likewise.
  * sysdeps/unix/sysv/linux/sparc/sparc64/sysdep.h: Likewise.
  * sysdeps/unix/sysv/linux/tile/sysdep.h: Likewise.
  * sysdeps/unix/sysv/linux/x86_64/sysdep.h: Likewise.
  * sysdeps/x86_64/setjmp.S: Likewise.
* Fix powerpc-nofpu __fe_enabled_env and __fe_nonieee_env (bug 17261). | Joseph Myers | 2014-08-12 | 1 | -4/+4
  On powerpc, floating-point environment macros are defined as pointers to constants in the library that contain the bit-patterns of the desired environment, instead of being magic constants cast to pointer type. For soft-float, the bit-patterns used for fenv_t are not laid out the same as for hard-float. (e500 has a third layout used; that's not an ABI issue because these values are only meaningful within a single process, all of whose glibc libraries must come from the same build of glibc.)
  While the __fe_dfl_env value for soft-float was appropriate for the soft-float fenv_t representation, the other two constants had the same bit-patterns as for hard-float. Those bit patterns had the effect of having exceptions already raised, causing math/test-fenv-return to fail; this patch fixes the patterns used. (__fe_nonieee_env also had exceptions unmasked, though they should be masked to match hard-float semantics. Since there is no separate non-IEEE mode for soft-float, it's most appropriate for __fe_nonieee_env to be the same as __fe_dfl_env; this patch makes it an alias.)
  Tested for powerpc-nofpu.
  [BZ #17261]
  * sysdeps/powerpc/nofpu/fenv_const.c (__fe_enabled_env): Change value to 0.
  (__fe_nonieee_env): Define as an alias for __fe_dfl_env.
* PowerPC: Fix gprof entry point for LE | Adhemerval Zanella | 2014-07-30 | 1 | -0/+2
  This patch fixes the ELFv2 gprof entry point since the ABI does not define function descriptors. It fixes BZ#17213.
* Fix missing newline in test output | Andreas Schwab | 2014-07-09 | 1 | -1/+1
* PowerPC: Cleanup powerpc memmove | Adhemerval Zanella | 2014-07-08 | 5 | -25/+5
  Now that MEMCPY_OK_FOR_FWD_MEMMOVE should be defined in memcopy.h, there is no need for a specialized powerpc memmove implementation. This patch moves the define to the powerpc memcopy.h and cleans up its definition in the powerpc code.
* PowerPC: Fix compiler warnings | Adhemerval Zanella | 2014-07-08 | 3 | -3/+5
  This patch fixes some compiler warnings due to trailing data in #undef directives and missing prototypes.
* PowerPC: Add ifunc tests for memmove | Adhemerval Zanella | 2014-07-08 | 1 | -0/+6
  This patch adds the missing ifunc test definitions for the memmove ppc32 optimization patch (commit 07aedd7).
* PowerPC: Align power7 memcpy using VSX to quadword | Adhemerval Zanella | 2014-07-07 | 2 | -20/+6
  This patch changes the power7 memcpy to use VSX instructions only when memory is aligned to a quadword. This avoids unaligned kernel traps on non-cacheable memory (for instance, memory-mapped I/O).
* PowerPC: optimized memmove for POWER7/PPC32 | Adhemerval Zanella | 2014-07-07 | 4 | -1/+100
  This patch adds an optimized memmove for power7 by using the optimized power7 memcpy for forward copying.
* PowerPC: optimized memmove for POWER7/PPC64 | Adhemerval Zanella | 2014-07-07 | 9 | -1/+1016
  This patch adds an optimized memmove for POWER7/powerpc64. The idea is to use the POWER7 memcpy on non-overlapping memory regions and an optimized backward memcpy for memory regions that overlap (similar to the idea of string/memmove.c). The backward memcpy algorithm is similar to the one used for the POWER7 memcpy, with adjustments for alignment. The difference is that memory is always aligned to 16 bytes before using VSX/altivec instructions.
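The forward/backward split described in the memmove entry above can be sketched in portable C, under the assumption that a single unsigned distance test selects the direction, in the spirit of string/memmove.c. The function names are hypothetical, and plain byte loops stand in for the optimized POWER7 copy routines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: copy forward when that cannot clobber unread
   source bytes, otherwise copy backward.  */
void *
sketch_memmove (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;

  /* The unsigned difference wraps to a huge value when dst is below
     src, so this test is true both for dst below src and for dst at
     least n bytes above src.  In both cases a forward copy is safe.  */
  if ((uintptr_t) d - (uintptr_t) s >= n)
    {
      for (size_t i = 0; i < n; i++)
        d[i] = s[i];
    }
  else
    {
      /* dst lies inside [src, src + n): copy from the end backward.  */
      for (size_t i = n; i > 0; i--)
        d[i - 1] = s[i - 1];
    }
  return dst;
}

/* Usage check covering both overlap directions.  */
void
sketch_memmove_selfcheck (void)
{
  char up[] = "abcdefgh";
  sketch_memmove (up + 2, up, 4);
  assert (memcmp (up, "ababcdgh", 8) == 0);

  char down[] = "abcdefgh";
  sketch_memmove (down, down + 2, 4);
  assert (memcmp (down, "cdefefgh", 8) == 0);
}
```

In the real code the forward branch is the POWER7 memcpy and the backward branch is the new backward copy; the sketch only shows how the two cases are told apart.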
* PowerPC: memmove default implementation cleanup | Adhemerval Zanella | 2014-07-07 | 1 | -97/+2
  This patch removes the powerpc-specific logic in memmove and instead includes the default implementation with MEMCPY_OK_FOR_FWD_MEMMOVE defined. This leads to increased performance, since the constraints for using memcpy in the powerpc code are too restrictive and memcpy can be used for any forward memmove.
* PowerPC: Guard CALL_ELF check for ppc64 only in link.h | Adhemerval Zanella | 2014-07-07 | 1 | -2/+4
  This patch fixes powerpc32 undef compiler warnings for _CALL_ELF, since it is defined only for powerpc64.
* Always provide HP_SMALL_TIMING_AVAIL | Richard Henderson | 2014-07-03 | 2 | -0/+2
* Unify hp-timing implementations | Richard Henderson | 2014-07-03 | 2 | -106/+2
  Provide an hp-timing-common.h for ports to use.
* Remove HP_TIMING_DIFF_INIT and dl_hp_timing_overhead | Richard Henderson | 2014-07-03 | 5 | -100/+0
  Without HP_TIMING_ACCUM, dl_hp_timing_overhead is write-only. If we remove it, there's no point in HP_TIMING_DIFF_INIT either.
* Removing HP_TIMING_ACCUM as unused | Richard Henderson | 2014-07-03 | 2 | -32/+4
* Removing HP_TIMING_ZERO as unused | Richard Henderson | 2014-07-03 | 2 | -10/+0
* powerpc: Remove dummy hp-timing.h | Richard Henderson | 2014-07-03 | 1 | -81/+0
  It's the same as the generic dummy version.
* Fix -Wundef warning on PAGE_COPY_THRESHOLD | Siddhesh Poyarekar | 2014-07-03 | 1 | -1/+0
  The PAGE_COPY_THRESHOLD macro is meant to be overridden by architecture-specific pagecopy.h, but it is currently done only by mach; all other architectures use the default. Check to see if the macro is defined in addition to whether it is set to a non-zero value.
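The "defined in addition to non-zero" check from the PAGE_COPY_THRESHOLD entry above follows a common preprocessor pattern, sketched here. USE_PAGE_COPY and the two functions are hypothetical names for illustration; the macro is deliberately left undefined to mimic an architecture with no pagecopy.h override.

```c
#include <assert.h>

/* No architecture override in this sketch: PAGE_COPY_THRESHOLD is
   intentionally not defined anywhere above this point.  */

/* Testing `defined' first means the macro's value is only examined
   when it exists, so an undefined macro is never evaluated as 0.  */
#if defined PAGE_COPY_THRESHOLD && PAGE_COPY_THRESHOLD > 0
# define USE_PAGE_COPY 1
#else
# define USE_PAGE_COPY 0
#endif

/* Report whether the page-copy path was compiled in.  */
int
page_copy_enabled (void)
{
  return USE_PAGE_COPY;
}

/* Usage check: with no override, the page-copy path is compiled out.  */
void
page_copy_selfcheck (void)
{
  assert (page_copy_enabled () == 0);
}
```

A bare `#if PAGE_COPY_THRESHOLD > 0` would behave the same here, but it is the form that -Wundef reports when the macro is undefined.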
* PowerPC: strcat optimization for PPC64/POWER7 | Vidya Ranganathan | 2014-07-02 | 6 | -4/+105
  This patch adds an ifunc power7 strcat symbol that uses the logic in sysdeps/powerpc/strcat.c but calls the power7 strlen/strcpy symbols instead of the default ones.
* Update powerpc-fpu ULPs. | Adhemerval Zanella | 2014-06-30 | 1 | -0/+24
* Regenerate powerpc-nofpu libm-test-ulps. | Joseph Myers | 2014-06-30 | 1 | -99/+1326
  This patch regenerates libm-test-ulps for powerpc-nofpu.
  * sysdeps/powerpc/nofpu/libm-test-ulps: Regenerated.
* Remove shlib-versions ABI names support. | Joseph Myers | 2014-06-27 | 1 | -1/+0
  shlib-versions files can contain ABI lines that map triplets to a canonical ABI name. This name was once used for various purposes where test baseline files for different ABIs went in a single directory; now these purposes use sysdeps files, generation of headers which have per-ABI variants uses abi-variants and related Makefile variables and the shlib-versions ABI names are unused. This patch duly removes those lines and associated build system support for them.
  Tested for x86_64 (both a full testsuite run and confirming the installed shared libraries are unchanged by the patch).
  * Makeconfig ($(common-objpfx)soversions.mk): Do not generate abi-name definition.
  * scripts/soversions.awk: Do not handle or generate ABI lines.
  * shlib-versions: Remove ABI entries.
  * sysdeps/powerpc/nofpu/shlib-versions: Remove file.
  * sysdeps/x86_64/x32/shlib-versions: Remove ABI entry.
* Fix Wundef warning for ELF_MACHINE_NO_RELA | Siddhesh Poyarekar | 2014-06-26 | 2 | -0/+2
  This patch defines ELF_MACHINE_NO_RELA on all architectures. Tested only on x86_64 to verify that the sources before and after are identical except for two instructions that pass the current line number in dl-machine.h to assert_fail.
* Move base_machine and machine settings from configure.ac to sysdeps preconfigure fragments. | Joseph Myers | 2014-06-25 | 1 | -2/+8
  This patch makes non-ex-ports architectures set base_machine and machine based on the original configured machine value in preconfigure fragments, like ex-ports architectures, rather than in the toplevel configure.ac.
  Tested x86 that the disassembly of installed shared libraries is unchanged by the patch.
  * configure.ac (base_machine): Do not set specially for particular machines here.
  * configure: Regenerated.
  * sysdeps/powerpc/preconfigure: Move machine and base_machine settings from configure.ac.
  * sysdeps/i386/preconfigure: New file.
  * sysdeps/s390/preconfigure: Likewise.
  * sysdeps/sh/preconfigure: Likewise.
  * sysdeps/sparc/preconfigure: Likewise.
* Update powerpc-fpu ULPs. | Adhemerval Zanella | 2014-06-25 | 1 | -4/+66
* PowerPC: sync hwcap.h capabilities | Adhemerval Zanella | 2014-06-23 | 1 | -0/+2
  Linux commit dd58a092c4202f2bd490adab7285b3ff77f8e467 added the PPC_FEATURE2_VEC_CRYPTO auxv capability to indicate whether the hardware supports vector crypto instructions. This patch adds its definition to the powerpc hwcap bits.
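A capability bit such as the one added in the hwcap.h entry above is normally tested against the AT_HWCAP2 auxiliary vector word. The sketch below shows only the bit test. The numeric value is an assumption taken to match the Linux powerpc definition and should be checked against the real header, and has_vec_crypto is a hypothetical helper name.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed value, believed to match the Linux powerpc header; guarded
   so that a definition from a real header takes precedence.  */
#ifndef PPC_FEATURE2_VEC_CRYPTO
# define PPC_FEATURE2_VEC_CRYPTO 0x02000000
#endif

/* Hypothetical helper: test the capability bit in an AT_HWCAP2 word.  */
bool
has_vec_crypto (unsigned long hwcap2)
{
  return (hwcap2 & PPC_FEATURE2_VEC_CRYPTO) != 0;
}

/* Usage check on synthetic hwcap words.  */
void
has_vec_crypto_selfcheck (void)
{
  assert (!has_vec_crypto (0UL));
  assert (has_vec_crypto ((unsigned long) PPC_FEATURE2_VEC_CRYPTO));
}
```

On Linux with a recent glibc, the word passed in would typically come from getauxval (AT_HWCAP2), declared in <sys/auxv.h>; the helper takes the word as a parameter so the sketch stays portable.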
* Include <kernel-features.h> explicitly where required. | Joseph Myers | 2014-06-20 | 1 | -0/+1
  This patch makes files using __ASSUME_* macros include <kernel-features.h> explicitly, rather than relying on some other header (such as tls.h, lowlevellock.h or pthreadP.h) to include it implicitly. (I omitted cases where I've already posted or am testing the patch that stops the file from needing __ASSUME_* at all.)
  This accords with the general principle of making source files include the headers for anything they use, and also helps make it safe to remove <kernel-features.h> includes from any file that doesn't use __ASSUME_* (some of those may be stray includes left behind after increasing the minimum kernel version, others may never have been needed or may have become obsolete after some other change).
  Tested x86_64 that the disassembly of installed shared libraries is unchanged by this patch.
  * nptl/pthread_cond_wait.c: Include <kernel-features.h>.
  * nptl/pthread_rwlock_timedrdlock.c: Likewise.
  * nptl/pthread_rwlock_timedwrlock.c: Likewise.
  * nptl/sysdeps/unix/sysv/linux/lowlevelrobustlock.c: Likewise.
  * nscd/nscd.c: Likewise.
  * sysdeps/i386/nptl/tcb-offsets.sym: Likewise.
  * sysdeps/powerpc/nptl/tcb-offsets.sym: Likewise.
  * sysdeps/sh/nptl/tcb-offsets.sym: Likewise.
  * sysdeps/x86_64/nptl/tcb-offsets.sym: Likewise.
* PowerPC: Move powerpc code out of nptl/ subdirectory | Adhemerval Zanella | 2014-06-17 | 6 | -0/+382
* Update powerpc-fpu ULPs. | Adhemerval Zanella | 2014-06-11 | 1 | -0/+24
* PowerPC: Optimized strcmp for PPC64/POWER7 | Vidya Ranganathan | 2014-06-11 | 6 | -1/+317
  Optimization is achieved on 8-byte aligned strings with doubleword comparison using the cmpb instruction. On unaligned strings, loop unrolling is applied for a POWER7 gain.
* PowerPC: Fix optimized strncat strlen call | Adhemerval Zanella | 2014-06-06 | 1 | -1/+5
  This patch fixes the optimized ppc64/power7 strncat strlen call for static builds without ifunc enabled. The strlen symbol to call in that situation is plain strlen instead of __GI_strlen (since the __GI_ alias is only created for shared objects).
* Update powerpc-fpu ULPs. | Adhemerval Zanella | 2014-05-26 | 1 | -0/+24
* PowerPC: Remove 64 bits instructions in PPC32 code | Adhemerval Zanella | 2014-05-26 | 8 | -16/+16
  This patch replaces the insrdi by insrwi in powerpc32 assembly.
* PowerPC: Remove unneeded copysign[f] macros | Adhemerval Zanella | 2014-05-22 | 1 | -27/+0
  This patch removes the unneeded copysign[f] macros from the powerpc math_private.h, since they are already covered by the generic version.
* PowerPC: Fix memchr ifunc hidden symbol for PPC32 | Adhemerval Zanella | 2014-05-22 | 2 | -10/+14
  This patch fixes an issue similar to 736c304a1ab4cee36a2f3343f1698bc0abae4608: for PPC32, if the symbol is defined as hidden (memchr), the compiler will create a local branch (symbol@local) and the linker will not create the PLT call required to make the ifunc work. It changes the default hidden symbol (__GI_memchr) to the default memchr symbol for powerpc32 (__memchr_ppc32).
* Update powerpc-fpu ULPs. | Adhemerval Zanella | 2014-05-20 | 1 | -0/+63
* PowerPC: Fix copysignf optimization macro | Adhemerval Zanella | 2014-05-20 | 1 | -1/+3
  This patch fixes the __copysignf optimized macro, meant for internal libm usage, when it is used with a constant value. Without the explicit cast to float, if it is used with a const double value (for instance, in s_casinhf.c), double constants will be used, which may lead to precision issues in some algorithms.
  It fixes the following failures on PPC64/POWER7:
  Failure: Test: Real part of: cacos_downward (inf + 0 i)
  Result:
   is:         1.19209289550781250000e-07   0x1.00000000000000000000p-23
   should be:  0.00000000000000000000e+00   0x0.00000000000000000000p+0
  Failure: Test: Real part of: cacos_downward (inf - 0 i)
  Result:
   is:         1.19209289550781250000e-07   0x1.00000000000000000000p-23
   should be:  0.00000000000000000000e+00   0x0.00000000000000000000p+0
  Failure: Test: Real part of: cacos_downward (inf + 0.5 i)
  Result:
   is:         1.19209289550781250000e-07   0x1.00000000000000000000p-23
   should be:  0.00000000000000000000e+00   0x0.00000000000000000000p+0
  Failure: Test: Real part of: cacos_downward (inf - 0.5 i)
  Result:
   is:         1.19209289550781250000e-07   0x1.00000000000000000000p-23
   should be:  0.00000000000000000000e+00   0x0.00000000000000000000p+0
  Failure: Test: Real part of: cacos_towardzero (inf + 0 i)
  Result:
   is:         1.19209289550781250000e-07   0x1.00000000000000000000p-23
   should be:  0.00000000000000000000e+00   0x0.00000000000000000000p+0
  Failure: Test: Real part of: cacos_towardzero (inf - 0 i)
  Result:
   is:         1.19209289550781250000e-07   0x1.00000000000000000000p-23
   should be:  0.00000000000000000000e+00   0x0.00000000000000000000p+0
  Failure: Test: Real part of: cacos_towardzero (inf + 0.5 i)
  Result:
   is:         1.19209289550781250000e-07   0x1.00000000000000000000p-23
   should be:  0.00000000000000000000e+00   0x0.00000000000000000000p+0
  Failure: Test: Real part of: cacos_towardzero (inf - 0.5 i)
  Result:
   is:         1.19209289550781250000e-07   0x1.00000000000000000000p-23
   should be:  0.00000000000000000000e+00   0x0.00000000000000000000p+0
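The effect of the missing float cast in the copysignf entry above can be illustrated with a pair of macros. These are hypothetical stand-ins, not the glibc definitions (the real macro wraps a PowerPC instruction); they rely on the GCC/Clang builtins __builtin_copysign and __builtin_copysignf. The point shown is only that arguments forwarded without a cast keep double type and precision, while the cast forces float.

```c
#include <assert.h>

/* Hypothetical stand-ins.  The first forwards its arguments untouched,
   so a double constant stays double; the second casts both arguments
   to float, which is the kind of fix the commit describes.  */
#define SKETCH_COPYSIGN_NOCAST(x, y) __builtin_copysign ((x), (y))
#define SKETCH_COPYSIGNF(x, y) \
  __builtin_copysignf ((float) (x), (float) (y))

/* Usage check: the uncast form yields a double-typed result, the cast
   form a float-typed one.  The value check uses 2.0, which is exactly
   representable in both types.  */
void
copysignf_cast_selfcheck (void)
{
  assert (sizeof (SKETCH_COPYSIGN_NOCAST (0.1, -1.0)) == sizeof (double));
  assert (sizeof (SKETCH_COPYSIGNF (0.1, -1.0)) == sizeof (float));
  assert (SKETCH_COPYSIGNF (2.0, -1.0) == -2.0f);
}
```

With a constant such as 0.1, which is not exactly representable, the double and float roundings differ, which is the sort of discrepancy that can surface as the one-ulp failures listed in the commit.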
* PowerPC: Fix multiarch hypotf PPC64 path | Adhemerval Zanella | 2014-05-19 | 1 | -0/+0
  This patch moves the hypotf multiarch implementation to the correct path.
* PowerPC: strncpy/stpncpy optimization for PPC64/POWER7 | Vidya Ranganathan | 2014-05-06 | 10 | -1/+593
  The optimization is achieved by the following techniques:
  > data alignment [gain from aligned memory access on read/write]
  > POWER7 gains performance with loop unrolling/unwinding [gain by reduction of branch penalty].
  > zero padding done by calling optimized memset
* PowerPC: ifunc improvement for internal calls | Adhemerval Zanella | 2014-05-05 | 7 | -26/+49
  This patch changes the default symbol redirection for internal calls of memcpy, memset, memchr, and strlen to the IFUNC resolved ones. The performance improvement is noticeable in algorithms that use these symbols extensively, like the regex functions.
* Fix | Adhemerval Zanella | 2014-04-29 | 2 | -2/+2
* PowerPC: Suppress unnecessary FPSCR write | Adhemerval Zanella | 2014-04-29 | 7 | -16/+48
  This patch optimizes the FPSCR update in the exception and rounding change functions by only updating its value if the new value is different from the current one. It also optimizes fedisableexcept and feenableexcept by removing an unnecessary FPSCR read.
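The "write only on change" idea in the FPSCR entry above can be sketched with a simulated register so that it runs on any machine. Everything here is hypothetical: the variable stands in for the FPSCR, the rounding field is assumed to be the low two bits, and the write counter exists only to make the saving observable.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated register and a counter of writes to it.  */
static uint64_t fake_fpscr;
static unsigned int fake_fpscr_writes;

/* Stand-in for the comparatively expensive register write.  */
static void
fake_fpscr_write (uint64_t value)
{
  fake_fpscr = value;
  fake_fpscr_writes++;
}

/* Set the 2-bit rounding field, skipping the write when the resulting
   register value would be identical to the current one.  */
void
sketch_set_round (unsigned int mode)
{
  assert (mode <= 3);
  uint64_t old = fake_fpscr;
  uint64_t updated = (old & ~UINT64_C (3)) | mode;
  if (updated != old)
    fake_fpscr_write (updated);
}

/* Number of simulated register writes performed so far.  */
unsigned int
sketch_write_count (void)
{
  return fake_fpscr_writes;
}
```

Calling sketch_set_round twice in a row with the same mode performs at most one write; the second call only reads and compares.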
* PowerPC: Add fenv macros for long double | Adhemerval Zanella | 2014-04-17 | 1 | -2/+7
  This patch adds the missing libc_<function>l_ctx macros for long double. As for float, they point to the default double versions.
* PowerPC: Fix --disable-multi-arch builds | Adhemerval Zanella | 2014-04-09 | 10 | -6/+14
  This patch fixes some powerpc32 and powerpc64 builds with the --disable-multi-arch option along with different --with-cpu=powerN values. It cleans up the Implies directories by removing the multiarch folder for non-multiarch configs, and also fixes two assembly implementations: powerpc64/power7/strncat.S, which was calling the wrong strlen; and power8/fpu/s_isnan.S, which was missing the hidden_def and weak_alias directives.
* PowerPC: Fix nearbyint/nearbyintf result for FE_DOWNWARD | Adhemerval Zanella | 2014-04-06 | 3 | -10/+203
  This patch fixes the bogus results of the powerpc32 optimized nearbyint/nearbyintf in FE_DOWNWARD rounding mode. They were due to a wrong instruction sequence used in the rounding calculation (two subtractions instead of an addition and a subtraction).
  Fixes BZ#16815.
* Correct prefetch hint in power7 memrchr. | Alan Modra | 2014-04-02 | 1 | -1/+1
  Typo fix.
  * sysdeps/powerpc/powerpc64/power7/memrchr.S: Correct stream hint.