path: root/sysdeps/powerpc
Commit message (Author, Age, Files, Lines)
* PowerPC: optimized memmove for POWER7/PPC32 (Adhemerval Zanella, 2014-07-07, 4 files, -1/+100)
| This patch adds an optimized memmove for POWER7 by using the optimized POWER7 memcpy for forward copying.
* PowerPC: optimized memmove for POWER7/PPC64 (Adhemerval Zanella, 2014-07-07, 9 files, -1/+1016)
| This patch adds an optimized memmove for POWER7/powerpc64. The idea is to use the POWER7 memcpy on non-overlapping memory regions and an optimized backward memcpy for memory regions that overlap (similar to the idea of string/memmove.c). The backward memcpy algorithm is similar to the one used for the POWER7 memcpy, with adjustments for alignment. The difference is that memory is always aligned to 16 bytes before VSX/Altivec instructions are used.
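The forward/backward split described in this commit can be illustrated with a minimal C sketch. It is not the POWER7 assembly from the patch: `sketch_memmove` is a hypothetical name, and plain byte loops stand in for the optimized forward memcpy and the optimized backward copy. The single unsigned comparison mirrors the idea in string/memmove.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration only: byte loops stand in for the optimized
   forward memcpy and the optimized backward copy named in the commit. */
void *sketch_memmove(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Unsigned difference: it is >= n when dst lies below src (the
       subtraction wraps around) or at/after src + n. In both cases a
       forward copy never overwrites source bytes it still has to read. */
    if ((uintptr_t)d - (uintptr_t)s >= n) {
        for (size_t i = 0; i < n; i++)
            d[i] = s[i];
    } else {
        /* dst starts inside [src, src + n): copy from the end backward. */
        for (size_t i = n; i > 0; i--)
            d[i - 1] = s[i - 1];
    }
    return dst;
}
```

Moving four bytes of "abcdef" two positions to the right takes the backward path, while moving them two positions to the left takes the forward path.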
* PowerPC: memmove default implementation cleanup (Adhemerval Zanella, 2014-07-07, 1 file, -97/+2)
| This patch removes the powerpc-specific logic in memmove and instead includes the default implementation with MEMCPY_OK_FOR_FWD_MEMMOVE defined. This leads to increased performance, since the constraints for using memcpy in the powerpc code are too restrictive and memcpy can be used for any forward memmove.
* PowerPC: Guard CALL_ELF check for ppc64 only in link.h (Adhemerval Zanella, 2014-07-07, 1 file, -2/+4)
| This patch fixes powerpc32 undef compiler warnings for _CALL_ELF, since it is defined only for powerpc64.
* Always provide HP_SMALL_TIMING_AVAIL (Richard Henderson, 2014-07-03, 2 files, -0/+2)
* Unify hp-timing implementations (Richard Henderson, 2014-07-03, 2 files, -106/+2)
| Provide an hp-timing-common.h for ports to use.
* Remove HP_TIMING_DIFF_INIT and dl_hp_timing_overhead (Richard Henderson, 2014-07-03, 5 files, -100/+0)
| Without HP_TIMING_ACCUM, dl_hp_timing_overhead is write-only. If we remove it, there's no point in HP_TIMING_DIFF_INIT either.
* Removing HP_TIMING_ACCUM as unused (Richard Henderson, 2014-07-03, 2 files, -32/+4)
* Removing HP_TIMING_ZERO as unused (Richard Henderson, 2014-07-03, 2 files, -10/+0)
* powerpc: Remove dummy hp-timing.h (Richard Henderson, 2014-07-03, 1 file, -81/+0)
| It's the same as the generic dummy version.
* Fix -Wundef warning on PAGE_COPY_THRESHOLD (Siddhesh Poyarekar, 2014-07-03, 1 file, -1/+0)
| The PAGE_COPY_THRESHOLD macro is meant to be overridden by architecture-specific pagecopy.h, but it is currently done only by mach; all other architectures use the default. Check to see if the macro is defined in addition to whether it is set to a non-zero value.
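The "defined and non-zero" check this commit describes can be sketched as follows. This is an illustration of the preprocessor idiom, not the text of the patch; `SKETCH_USE_PAGE_COPY` and `sketch_use_page_copy` are hypothetical names, and the macro is deliberately left undefined here to mimic an architecture that does not override it.

```c
/* PAGE_COPY_THRESHOLD is intentionally not defined in this sketch.
   Testing "#if PAGE_COPY_THRESHOLD" alone would trigger -Wundef;
   checking "defined" first keeps the preprocessor from evaluating
   an undefined identifier. */
#if defined PAGE_COPY_THRESHOLD && PAGE_COPY_THRESHOLD
# define SKETCH_USE_PAGE_COPY 1
#else
# define SKETCH_USE_PAGE_COPY 0
#endif

/* Hypothetical helper: reports whether the page-copy path is enabled. */
int sketch_use_page_copy(void)
{
    return SKETCH_USE_PAGE_COPY;
}
```

Because the right-hand operand of `&&` is skipped when the macro is undefined, the same condition works both for ports that define the threshold and for those that rely on the default.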
* PowerPC: strcat optimization for PPC64/POWER7 (Vidya Ranganathan, 2014-07-02, 6 files, -4/+105)
| This patch adds an ifunc power7 strcat symbol that uses the logic in sysdeps/powerpc/strcat.c but calls the power7 strlen/strcpy symbols instead of the default ones.
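The composition this commit relies on, strcat expressed as strlen followed by strcpy, can be sketched in portable C. The names below are hypothetical; in the patch the two helpers would be the POWER7-optimized symbols, whereas here they are simple loops so that the example is self-contained.

```c
#include <stddef.h>

/* Stand-ins for the optimized strlen/strcpy that the real code calls. */
static size_t sketch_strlen(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

static char *sketch_strcpy(char *dst, const char *src)
{
    char *d = dst;
    while ((*d++ = *src++) != '\0')
        ;
    return dst;
}

/* strcat as "find the end of dst, then copy src there". */
char *sketch_strcat(char *dst, const char *src)
{
    sketch_strcpy(dst + sketch_strlen(dst), src);
    return dst;
}
```

With this structure, any speedup in the two helpers carries over to strcat without a separate hand-written loop.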
* Update powerpc-fpu ULPs. (Adhemerval Zanella, 2014-06-30, 1 file, -0/+24)
* Regenerate powerpc-nofpu libm-test-ulps. (Joseph Myers, 2014-06-30, 1 file, -99/+1326)
| This patch regenerates libm-test-ulps for powerpc-nofpu.
| * sysdeps/powerpc/nofpu/libm-test-ulps: Regenerated.
* Remove shlib-versions ABI names support. (Joseph Myers, 2014-06-27, 1 file, -1/+0)
| shlib-versions files can contain ABI lines that map triplets to a canonical ABI name. This name was once used for various purposes where test baseline files for different ABIs went in a single directory; now these purposes use sysdeps files, generation of headers which have per-ABI variants uses abi-variants and related Makefile variables and the shlib-versions ABI names are unused. This patch duly removes those lines and associated build system support for them.
| Tested for x86_64 (both a full testsuite run and confirming the installed shared libraries are unchanged by the patch).
| * Makeconfig ($(common-objpfx)soversions.mk): Do not generate abi-name definition.
| * scripts/soversions.awk: Do not handle or generate ABI lines.
| * shlib-versions: Remove ABI entries.
| * sysdeps/powerpc/nofpu/shlib-versions: Remove file.
| * sysdeps/x86_64/x32/shlib-versions: Remove ABI entry.
* Fix Wundef warning for ELF_MACHINE_NO_RELA (Siddhesh Poyarekar, 2014-06-26, 2 files, -0/+2)
| This patch defines ELF_MACHINE_NO_RELA on all architectures. Tested only on x86_64 to verify that the sources before and after are identical except for two instructions that pass the current line number in dl-machine.h to assert_fail.
* Move base_machine and machine settings from configure.ac to sysdeps preconfigure fragments. (Joseph Myers, 2014-06-25, 1 file, -2/+8)
| This patch makes non-ex-ports architectures set base_machine and machine based on the original configured machine value in preconfigure fragments, like ex-ports architectures, rather than in the toplevel configure.ac.
| Tested x86 that the disassembly of installed shared libraries is unchanged by the patch.
| * configure.ac (base_machine): Do not set specially for particular machines here.
| * configure: Regenerated.
| * sysdeps/powerpc/preconfigure: Move machine and base_machine settings from configure.ac.
| * sysdeps/i386/preconfigure: New file.
| * sysdeps/s390/preconfigure: Likewise.
| * sysdeps/sh/preconfigure: Likewise.
| * sysdeps/sparc/preconfigure: Likewise.
* Update powerpc-fpu ULPs. (Adhemerval Zanella, 2014-06-25, 1 file, -4/+66)
* PowerPC: sync hwcap.h capabilities (Adhemerval Zanella, 2014-06-23, 1 file, -0/+2)
| Linux commit dd58a092c4202f2bd490adab7285b3ff77f8e467 added the PPC_FEATURE2_VEC_CRYPTO auxv capability to indicate whether the hardware supports vector crypto instructions. This patch adds its definition to the powerpc hwcap bits.
* Include <kernel-features.h> explicitly where required. (Joseph Myers, 2014-06-20, 1 file, -0/+1)
| This patch makes files using __ASSUME_* macros include <kernel-features.h> explicitly, rather than relying on some other header (such as tls.h, lowlevellock.h or pthreadP.h) to include it implicitly. (I omitted cases where I've already posted or am testing the patch that stops the file from needing __ASSUME_* at all.)
| This accords with the general principle of making source files include the headers for anything they use, and also helps make it safe to remove <kernel-features.h> includes from any file that doesn't use __ASSUME_* (some of those may be stray includes left behind after increasing the minimum kernel version, others may never have been needed or may have become obsolete after some other change).
| Tested x86_64 that the disassembly of installed shared libraries is unchanged by this patch.
| * nptl/pthread_cond_wait.c: Include <kernel-features.h>.
| * nptl/pthread_rwlock_timedrdlock.c: Likewise.
| * nptl/pthread_rwlock_timedwrlock.c: Likewise.
| * nptl/sysdeps/unix/sysv/linux/lowlevelrobustlock.c: Likewise.
| * nscd/nscd.c: Likewise.
| * sysdeps/i386/nptl/tcb-offsets.sym: Likewise.
| * sysdeps/powerpc/nptl/tcb-offsets.sym: Likewise.
| * sysdeps/sh/nptl/tcb-offsets.sym: Likewise.
| * sysdeps/x86_64/nptl/tcb-offsets.sym: Likewise.
* PowerPC: Move powerpc code out of nptl/ subdirectory (Adhemerval Zanella, 2014-06-17, 6 files, -0/+382)
* Update powerpc-fpu ULPs. (Adhemerval Zanella, 2014-06-11, 1 file, -0/+24)
* PowerPC: Optimized strcmp for PPC64/POWER7 (Vidya Ranganathan, 2014-06-11, 6 files, -1/+317)
| Optimization is achieved on 8-byte-aligned strings with doubleword comparison using the cmpb instruction. On unaligned strings, loop unrolling is applied for POWER7 gain.
* PowerPC: Fix optimized strncat strlen call (Adhemerval Zanella, 2014-06-06, 1 file, -1/+5)
| This patch fixes the optimized ppc64/power7 strncat strlen call for static builds without ifunc enabled. The strlen symbol to call in that situation is plain strlen, instead of __GI_strlen (since the __GI_ alias is only created for shared objects).
* Update powerpc-fpu ULPs. (Adhemerval Zanella, 2014-05-26, 1 file, -0/+24)
* PowerPC: Remove 64 bits instructions in PPC32 code (Adhemerval Zanella, 2014-05-26, 8 files, -16/+16)
| This patch replaces insrdi with insrwi in the powerpc32 assembly.
* PowerPC: Remove unneeded copysign[f] macros (Adhemerval Zanella, 2014-05-22, 1 file, -27/+0)
| This patch removes the unneeded copysign[f] macros from the powerpc math_private.h, since they are already covered by the generic version.
* PowerPC: Fix memchr ifunc hidden symbol for PPC32 (Adhemerval Zanella, 2014-05-22, 2 files, -10/+14)
| This patch fixes a similar issue to 736c304a1ab4cee36a2f3343f1698bc0abae4608, where for PPC32, if the symbol is defined as hidden (memchr), the compiler will create a local branch (symbol@local) and the linker will not create the PLT call required to make the ifunc work. It changes the default hidden symbol (__GI_memchr) to the default memchr symbol for powerpc32 (__memchr_ppc32).
* Update powerpc-fpu ULPs. (Adhemerval Zanella, 2014-05-20, 1 file, -0/+63)
* PowerPC: Fix copysignf optimization macro (Adhemerval Zanella, 2014-05-20, 1 file, -1/+3)
| This patch fixes the __copysignf optimized macro, meant for internal libm use, when it is used with a constant value. Without the explicit cast to float, if it is used with a const double value (for instance, in s_casinhf.c), double constants will be used, which may lead to precision issues in some algorithms. It fixes the following failures on PPC64/POWER7:
| Failure: Test: Real part of: cacos_downward (inf + 0 i) Result: is: 1.19209289550781250000e-07 0x1.00000000000000000000p-23 should be: 0.00000000000000000000e+00 0x0.00000000000000000000p+0
| Failure: Test: Real part of: cacos_downward (inf - 0 i) Result: is: 1.19209289550781250000e-07 0x1.00000000000000000000p-23 should be: 0.00000000000000000000e+00 0x0.00000000000000000000p+0
| Failure: Test: Real part of: cacos_downward (inf + 0.5 i) Result: is: 1.19209289550781250000e-07 0x1.00000000000000000000p-23 should be: 0.00000000000000000000e+00 0x0.00000000000000000000p+0
| Failure: Test: Real part of: cacos_downward (inf - 0.5 i) Result: is: 1.19209289550781250000e-07 0x1.00000000000000000000p-23 should be: 0.00000000000000000000e+00 0x0.00000000000000000000p+0
| Failure: Test: Real part of: cacos_towardzero (inf + 0 i) Result: is: 1.19209289550781250000e-07 0x1.00000000000000000000p-23 should be: 0.00000000000000000000e+00 0x0.00000000000000000000p+0
| Failure: Test: Real part of: cacos_towardzero (inf - 0 i) Result: is: 1.19209289550781250000e-07 0x1.00000000000000000000p-23 should be: 0.00000000000000000000e+00 0x0.00000000000000000000p+0
| Failure: Test: Real part of: cacos_towardzero (inf + 0.5 i) Result: is: 1.19209289550781250000e-07 0x1.00000000000000000000p-23 should be: 0.00000000000000000000e+00 0x0.00000000000000000000p+0
| Failure: Test: Real part of: cacos_towardzero (inf - 0.5 i) Result: is: 1.19209289550781250000e-07 0x1.00000000000000000000p-23 should be: 0.00000000000000000000e+00 0x0.00000000000000000000p+0
* PowerPC: Fix multiarch hypotf PPC64 path (Adhemerval Zanella, 2014-05-19, 1 file, -0/+0)
| This patch moves the hypotf multiarch implementation to the correct path.
* PowerPC: strncpy/stpncpy optimization for PPC64/POWER7 (Vidya Ranganathan, 2014-05-06, 10 files, -1/+593)
| The optimization is achieved by the following techniques:
| > data alignment [gain from aligned memory access on read/write]
| > loop unrolling/unwinding, which improves performance on POWER7 [gain by reduction of branch penalty]
| > zero padding done by calling the optimized memset
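The third technique, delegating the zero padding to memset, can be sketched in portable C. This is only an illustration of the structure, assuming the standard strncpy contract (copy at most n bytes, pad the remainder with NULs); `sketch_strncpy` is a hypothetical name, and the alignment and unrolling work from the patch is not reproduced.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: copy the string part, then let memset write
   all of the zero padding in one call. */
char *sketch_strncpy(char *dst, const char *src, size_t n)
{
    size_t len = 0;

    /* Length of src, capped at n (no terminator is counted). */
    while (len < n && src[len] != '\0')
        len++;

    memcpy(dst, src, len);           /* the characters themselves */
    memset(dst + len, 0, n - len);   /* zero padding, possibly empty */
    return dst;
}
```

When src is shorter than n, the padding can be much longer than the string itself, which is where a fast memset pays off.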
* PowerPC: ifunc improvement for internal calls (Adhemerval Zanella, 2014-05-05, 7 files, -26/+49)
| This patch changes the default symbol redirection for internal calls of memcpy, memset, memchr, and strlen to the IFUNC-resolved ones. The performance improvement is noticeable in algorithms that use these symbols extensively, such as the regex functions.
* Fix (Adhemerval Zanella, 2014-04-29, 2 files, -2/+2)
* PowerPC: Suppress unnecessary FPSCR write (Adhemerval Zanella, 2014-04-29, 7 files, -16/+48)
| This patch optimizes the FPSCR update in the exception and rounding change functions by updating its value only if the new value is different from the current one. It also optimizes fedisableexcept and feenableexcept by removing an unnecessary FPSCR read.
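The "write only if changed" idea can be sketched with a simulated register. Everything below is hypothetical: a plain variable stands in for the FPSCR, a counter records how many writes happen, and the two-bit mask assumes that the rounding-mode field occupies the two least significant bits. The real code reads and writes the register with PowerPC instructions.

```c
#include <stdint.h>

#define SKETCH_RN_MASK UINT64_C(0x3)   /* assumed rounding-mode field */

static uint64_t sketch_fpscr;          /* simulated register */
static unsigned sketch_write_count;    /* number of simulated writes */

static uint64_t sketch_read_fpscr(void)
{
    return sketch_fpscr;
}

static void sketch_write_fpscr(uint64_t value)
{
    sketch_fpscr = value;
    sketch_write_count++;
}

/* Update the rounding-mode bits, skipping the write when nothing changes. */
void sketch_set_rounding(uint64_t mode_bits)
{
    uint64_t old_value = sketch_read_fpscr();
    uint64_t new_value = (old_value & ~SKETCH_RN_MASK)
                         | (mode_bits & SKETCH_RN_MASK);

    if (new_value != old_value)
        sketch_write_fpscr(new_value);
}

unsigned sketch_writes(void)
{
    return sketch_write_count;
}
```

Setting the same mode twice in a row performs one write instead of two, which is the saving the commit message describes.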
* PowerPC: Add fenv macros for long double (Adhemerval Zanella, 2014-04-17, 1 file, -2/+7)
| This patch adds the missing libc_<function>l_ctx macros for long double. As with float, they point to the default double versions.
* PowerPC: Fix --disable-multi-arch builds (Adhemerval Zanella, 2014-04-09, 10 files, -6/+14)
| This patch fixes some powerpc32 and powerpc64 builds with the --disable-multi-arch option along with different --with-cpu=powerN settings. It cleans up the Implies directories by removing the multiarch folder for non-multiarch configurations and also fixes two assembly implementations: powerpc64/power7/strncat.S, which calls the wrong strlen; and power8/fpu/s_isnan.S, which is missing the hidden_def and weak_alias directives.
* PowerPC: Fix nearbyint/nearbyintf result for FE_DOWNWARD (Adhemerval Zanella, 2014-04-06, 3 files, -10/+203)
| This patch fixes the bogus results of the powerpc32 optimized nearbyint/nearbyintf for the FE_DOWNWARD rounding mode. This is due to a wrong instruction sequence used in the rounding calculation (two subtractions instead of an addition and a subtraction).
| Fixes BZ#16815.
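The addition-then-subtraction sequence mentioned here is the usual way to round a double to an integer in the current rounding mode: adding 2^52 pushes the fraction bits out of the significand, and subtracting it again leaves the rounded value. The C sketch below shows that general technique under stated assumptions. It is not the powerpc32 assembly, `sketch_nearbyint` is a hypothetical name, and unlike a conforming nearbyint it does not suppress the inexact exception.

```c
/* Hypothetical sketch of rounding via +2^52 / -2^52 (doubles only). */
double sketch_nearbyint(double x)
{
    const double two52 = 4503599627370496.0;   /* 2^52 */
    volatile double t;   /* volatile keeps the compiler from folding
                            the add/subtract pair away */

    /* NaN and values with no fraction bits are returned unchanged. */
    if (x != x || x >= two52 || x <= -two52)
        return x;

    if (x > 0.0) {
        t = x + two52;   /* addition rounds in the current mode */
        t = t - two52;   /* subtraction recovers the integer */
        /* Under FE_DOWNWARD the difference may be -0; keep the sign of x. */
        return t == 0.0 ? 0.0 : t;
    }
    if (x < 0.0) {
        t = x - two52;   /* mirrored sequence for negative inputs */
        t = t + two52;
        return t == 0.0 ? -0.0 : t;
    }
    return x;            /* +0 or -0 */
}
```

In the default round-to-nearest mode, halfway cases go to the even neighbour, so 2.5 becomes 2 and 3.5 becomes 4.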
* Correct prefetch hint in power7 memrchr. (Alan Modra, 2014-04-02, 1 file, -1/+1)
| Typo fix.
| * sysdeps/powerpc/powerpc64/power7/memrchr.S: Correct stream hint.
* Fix reference to toc symbol. (Alan Modra, 2014-04-02, 1 file, -1/+1)
| https://sourceware.org/ml/binutils/2014-03/msg00033.html removes the "magic" treatment of symbols defined in a .toc section.
| * sysdeps/powerpc/powerpc64/start.S: Add @toc to toc symbol reference.
* Fix s_copysign stack temp for PowerPC64 ELFv2 (Alan Modra, 2014-04-01, 1 file, -2/+2)
| [BZ #16786]
| * sysdeps/powerpc/powerpc64/fpu/s_copysign.S: Don't trash stack.
* PowerPC: Fix little endian encoding for mfvsrd (Adhemerval Zanella, 2014-03-31, 5 files, -0/+25)
| This patch fixes the MFVSRD_R3_V1 macro that encodes 'mfvsrd r3,vs1' (to support old binutils) for little endian.
* Update powerpc-fpu ULPs. (Adhemerval Zanella, 2014-03-25, 1 file, -0/+874)
* PowerPC: optimized strpbrk for POWER7 (Adhemerval Zanella, 2014-03-20, 6 files, -1/+259)
| This patch adds an optimized strpbrk for POWER7 by using a different algorithm than the default implementation: it constructs a table based on the 'accept' argument and uses this table to check for any occurrence in the input string. The idea is similar to what x86_64 uses. For PowerPC some tunings were added, such as unrolled loops and memory clearing using VSX instructions.
* PowerPC: optimized strcspn for PPC64/POWER7 (Adhemerval Zanella, 2014-03-20, 6 files, -1/+249)
| This patch adds an optimized strcspn for POWER7 by using a different algorithm than the default implementation: it constructs a table based on the 'accept' argument and uses this table to check for any occurrence in the input string. The idea is similar to what x86_64 uses. For PowerPC some tunings were added, such as unrolled loops and aligning the table's stack memory to 16 bytes (so the VSX clear can run without alignment issues).
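The table-based algorithm described for strpbrk and strcspn can be sketched in portable C as follows. The function names are hypothetical, and the POWER7 tunings (unrolled loops, a 16-byte-aligned table cleared with VSX) are not reproduced: the table is simply zero-initialized by the compiler.

```c
#include <stddef.h>

/* Length of the initial part of s that contains no byte from set. */
size_t sketch_strcspn(const char *s, const char *set)
{
    unsigned char table[256] = { 0 };
    const unsigned char *p = (const unsigned char *)s;

    /* Mark every byte of the set; marking NUL makes the scan loop
       stop at the end of the string without a separate test. */
    table[0] = 1;
    for (const unsigned char *c = (const unsigned char *)set; *c != '\0'; c++)
        table[*c] = 1;

    /* One table lookup per input byte, independent of the set size. */
    while (!table[*p])
        p++;

    return (size_t)(p - (const unsigned char *)s);
}

/* First byte in s that is in set, or NULL if there is none. */
char *sketch_strpbrk(const char *s, const char *set)
{
    s += sketch_strcspn(s, set);
    return *s != '\0' ? (char *)s : NULL;
}
```

Building the table costs time proportional to the set length, after which each input byte needs a single lookup rather than a scan of the whole set.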
* PowerPC: remove wrong roundl implementation for PowerPC64 (Adhemerval Zanella, 2014-03-14, 1 file, -132/+0)
| The roundl assembly implementation (sysdeps/powerpc/powerpc64/fpu/s_roundl.S) returns wrong results for some inputs where the first double is an exact integer and the precision is determined by the second long double.
| Checking the implementation comments and history, I am very confident the assembly implementation was based on a version before commit 5c68d401698a58cf7da150d9cce769fa6679ba5f, which fixes BZ#2423 (Errors in long double (ldbl-128ibm) rounding functions in glibc-2.4).
| Just removing the implementation and making the build select sysdeps/ieee754/ldbl-128ibm/s_roundl.c instead fixes the failing math.
| This fixes BZ#16707.
* PowerPC: remove wrong nearbyintl implementation for PPC64 (Adhemerval Zanella, 2014-03-14, 1 file, -113/+0)
| The nearbyintl assembly implementation (sysdeps/powerpc/powerpc64/fpu/s_nearbyintl.S) returns wrong results for some inputs where the first double is an exact integer and the precision is determined by the second long double.
| Checking the implementation comments and history, I am very confident the assembly implementation was based on a version before commit 5c68d401698a58cf7da150d9cce769fa6679ba5f, which fixes BZ#2423 (Errors in long double (ldbl-128ibm) rounding functions in glibc-2.4).
| Just removing the implementation and making the build select sysdeps/ieee754/ldbl-128ibm/s_nearbyintl.c instead fixes the failing math.
| Fixes BZ#16706.
* PowerPC: remove wrong ceill implementation for PowerPC64 (Adhemerval Zanella, 2014-03-14, 1 file, -132/+0)
| The ceill assembly implementation (sysdeps/powerpc/powerpc64/fpu/s_ceill.S) returns wrong results for some inputs where the first double is an exact integer and the precision is determined by the second long double.
| Checking the implementation comments and history, I am very confident the assembly implementation was based on a version before commit 5c68d401698a58cf7da150d9cce769fa6679ba5f, which fixes BZ#2423 (Errors in long double (ldbl-128ibm) rounding functions in glibc-2.4).
| Just removing the implementation and making the build select sysdeps/ieee754/ldbl-128ibm/s_ceill.c instead fixes the failing math.
| Fixes BZ#16701.
* PowerPC: Fix bzero definition for static libc for PPC32 (Adhemerval Zanella, 2014-03-12, 2 files, -2/+11)
| This patch fixes an issue for the powerpc32-fpu static build, which fails with a 'bzero' undefined reference. This patch adds a bzero ifunc selector for static builds and fixes the '__bzero_ppc' reference to the default memset symbol (since the static memset build does not provide an ifunc selector).
| Fixes BZ#16689.
* PowerPC: Fix strspn for static build (Adhemerval Zanella, 2014-03-12, 1 file, -1/+1)
| This patch makes the strspn ifunc selector build for static builds.