about summary refs log tree commit diff
path: root/sysdeps/powerpc/powerpc64/power8
Commit message (Collapse)AuthorAgeFilesLines
...
* powerpc: Add optimized strcspn for P8Paul E. Murphy2016-04-252-11/+54
| | | | | A few minor adjustments to the P8 strspn gives us an almost equally optimized P8 strcspn.
* powerpc: strcasestr optmization for power8Rajalakshmi Srinivasaraghavan2016-04-223-0/+563
| | | | | | This patch optimizes strcasestr function for power >= 8 systems. The average improvement of this optimization is ~40% and compares 16 bytes at a time using vector instructions. This patch is tested on powerpc64 and powerpc64le.
* powerpc: Optimization for strlen for POWER8.Carlos Eduardo Seo2016-04-151-0/+297
| | | | | This implementation takes advantage of vectorization to improve performance of the loop over the current strlen implementation for POWER7.
* powerpc: Add optimized P8 strspnPaul E. Murphy2016-04-071-0/+179
| | | | | | | This utilizes vectors and bitmasks. For small needle, large haystack, the performance improvement is upto 8x. For short strings (0-4B), the cost of computing the bitmask dominates, and is a tad slower.
* Update copyright dates with scripts/update-copyrights.Joseph Myers2016-01-0412-12/+12
|
* PowerPC: Add comments to optimized strncpyGabriel F. T. Gomes2015-10-011-2/+38
| | | | | * sysdeps/powerpc/powerpc64/power8/strncpy.S: Added comments to some assembly instructions.
* PowerPC: Fix operand prefixesGabriel F. T. Gomes2015-10-011-6/+6
| | | | | | | | | | | | | | | | | | | | The file sysdeps/powerpc/sysdeps.h defines aliases for register operands, which add the letter 'r' as a prefix to a register name. E.g.: register 20 can be written as 'r20', instead of '20'. On the one hand, this increases readability, as it makes it easier for readers to know whether the operand is a register or an immediate. On the other hand, this permits that immediate operands be written as if they were registers, and vice-versa, thus reducing the readability of the code. This commit removes some of these unintentional misuses. This commit also increases readability of the code by adding the prefix 'cr' to some uses of the control register. Both changes have no effect on the final code. Checked with objdump. * sysdeps/powerpc/powerpc64/power8/strncpy.S: Remove or add register prefix from operands.
* powerpc: Fix powerpc64 build failure with binutils 2.22Adhemerval Zanella2015-01-241-1/+4
| | | | | | | | | | | | | | GLIBC memset optimization for POWER8 uses the '.machine power8' directive, which is only supported officially on binutils 2.24+. This causes a build failure on older binutils. Since the requirement of .machine power8 is to correctly assembly the 'mtvsrd' instruction and it is already handled by the MTVSRD_V1_R4 macro, there is no really needed of using it. The patch replaces the power8 with power7 for .machine directive. It fixes BZ#17869.
* powerpc: Optimized strncmp for POWER8/PPC64Adhemerval Zanella2015-01-131-0/+323
| | | | | | | | | | This patch adds an optimized POWER8 strncmp. The implementation focus on speeding up unaligned cases follwing the ideas of power8 strcmp. The algorithm first check the initial 16 bytes, then align the first function source and uses unaligned loads on second argument only. Aditional checks for page boundaries are done for unaligned cases (where sources alignment are different).
* powerpc: Optimized strcmp for POWER8/PPC64Adhemerval Zanella2015-01-131-0/+257
| | | | | | | This patch adds an optimized POWER8 strcmp using unaligned accesses. The algorithm first check the initial 16 bytes, then align the first function source and uses unaligned loads on second argument only. Aditional checks for page boundaries are done for unaligned cases
* powerpc: Optimized st{r,p}ncpy for POWER8/PPC64Adhemerval Zanella2015-01-132-0/+444
| | | | | | | | | | | | | This patch adds an optimized POWER8 st{r,p}ncpy using unaligned accesses. It shows 10%-80% improvement over the optimized POWER7 one that uses only aligned accesses, specially on unaligned inputs. The algorithm first read and check 16 bytes (if inputs do not cross a 4K page size). The it realign source to 16-bytes and issue a 16 bytes read and compare loop to speedup null byte checks for large strings. Also, different from POWER7 optimization, the null pad is done inline in the implementation using possible unaligned accesses, instead of realying on a memset call. Special case is added for page cross reads.
* powerpc: Optimized st{r,p}cpy for POWER8/PPC64Adhemerval Zanella2015-01-132-0/+286
| | | | | | | | | | | | This patch adds an optimized POWER8 strcpy using unaligned accesses. For strings up to 16 bytes the implementation first calculate the string size, like strlen, and issues a memcpy. For larger strings, source is first aligned to 16 bytes and then tested over a loop that reads 16 bytes am combine the cmpb results for speedup. Special case is added for page cross reads. It shows 30%-60% improvement over the optimized POWER7 one that uses only aligned accesses.
* Update copyright dates with scripts/update-copyrights.Joseph Myers2015-01-026-6/+6
|
* Remove IS_IN_libmSiddhesh Poyarekar2014-11-243-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace with IS_IN (libm). Generated code unchanged on x86_64. * include/math.h: Use IS_IN instead of IS_IN_libm. * sysdeps/alpha/fpu/s_copysign.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_copysignl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_finitel.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_fmal.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_frexpl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_isinfl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_isnanl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_modfl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_scalbnl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_signbitl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_copysignl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_finitel.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_frexpl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_isinfl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_isnanl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_modfl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_scalbnl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_signbitl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/w_scalblnl.c: Likewise. * sysdeps/ieee754/ldbl-opt/s_copysign.c: Likewise. * sysdeps/ieee754/ldbl-opt/s_finite.c: Likewise. * sysdeps/ieee754/ldbl-opt/s_frexp.c: Likewise. * sysdeps/ieee754/ldbl-opt/s_isinf.c: Likewise. * sysdeps/ieee754/ldbl-opt/s_isnan.c: Likewise. * sysdeps/ieee754/ldbl-opt/s_ldexp.c: Likewise. * sysdeps/ieee754/ldbl-opt/s_ldexpl.c: Likewise. * sysdeps/ieee754/ldbl-opt/s_modf.c: Likewise. * sysdeps/ieee754/ldbl-opt/s_scalbln.c: Likewise. * sysdeps/ieee754/ldbl-opt/s_scalbn.c: Likewise. * sysdeps/powerpc/power5+/fpu/s_modf.c: Likewise. * sysdeps/powerpc/powerpc32/fpu/s_copysign.S: Likewise. * sysdeps/powerpc/powerpc32/fpu/s_copysignl.S: Likewise. * sysdeps/powerpc/powerpc32/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysign.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finite.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinf.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modf.c: Likewise. * sysdeps/powerpc/powerpc32/power5/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc32/power6/fpu/s_copysign.S: Likewise. * sysdeps/powerpc/powerpc32/power6/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc32/power7/fpu/s_finite.S: Likewise. * sysdeps/powerpc/powerpc32/power7/fpu/s_isinf.S: Likewise. * sysdeps/powerpc/powerpc32/power7/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_copysign.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_copysignl.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc64/power5/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc64/power6/fpu/s_copysign.S: Likewise. * sysdeps/powerpc/powerpc64/power6/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc64/power6x/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc64/power7/fpu/s_finite.S: Likewise. * sysdeps/powerpc/powerpc64/power7/fpu/s_isinf.S: Likewise. * sysdeps/powerpc/powerpc64/power7/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/s_finite.S: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/s_isinf.S: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/s_isnan.S: Likewise. * sysdeps/sparc/sparc32/fpu/s_signbitl.S: Likewise. * sysdeps/sparc/sparc32/sparcv9/fpu/s_isnan.S: Likewise. * sysdeps/unix/sysv/linux/alpha/fraiseexcpt.S: Likewise.
* powerpc: Simplify encoding of POWER8 instructionAdhemerval Zanella2014-11-056-35/+6
|
* powerpc: Fix encoding of POWER8 instructionAdhemerval Zanella2014-11-031-1/+8
| | | | | This patch adds a binary encoding for 'mtvsrd' instruction to avoid build failures when assembler does not support POWER8.
* PowerPC: memset optimization for POWER8/PPC64Adhemerval Zanella2014-09-101-0/+449
| | | | | | | | | | | | | | | | | | | This patch adds an optimized memset implementation for POWER8. For sizes from 0 to 255 bytes, a word/doubleword algorithm similar to POWER7 optimized one is used. For size higher than 255 two strategies are used: 1. If the constant is different than 0, the memory is written with altivec vector instruction; 2. If constant is 0, dbcz instructions are used. The loop is unrolled to clear 512 byte at time. Using vector instructions increases throughput considerable, with a double performance for sizes larger than 1024. The dcbz loops unrolls also shows performance improvement, by doubling throughput for sizes larger than 8192 bytes.
* PowerPC: Fix --disable-multi-arch buildsAdhemerval Zanella2014-04-092-1/+4
| | | | | | | | | | This patch fixes some powerpc32 and powerpc64 builds with --disable-multi-arch option along with different --with-cpu=powerN. It cleanups the Implies directories by removing the multiarch folder for non multiarch config and also fixing two assembly implementations: powerpc64/power7/strncat.S that is calling the wrong strlen; and power8/fpu/s_isnan.S that misses the hidden_def and weak_alias directives.
* PowerPC: Fix little endian enconding for mfvsrdAdhemerval Zanella2014-03-315-0/+25
| | | | | This patch fixes the MFVSRD_R3_V1 macro that encodes 'mfvsrd r3,vs1' (to support old binutils) for little endian.
* PowerPC: llround/llroundf POWER8 optimizationAdhemerval Zanella2014-02-271-0/+47
| | | | | | This patch add a optimized llround/llroundf implementation for POWER8 using the new Move From VSR Doubleword instruction to gains some cycles from FP to GRP register move.
* PowerPC: llrint/llrintf POWER8 optimizationAdhemerval Zanella2014-02-271-0/+45
| | | | | | This patch add a optimized llrint/llrintf implementation for POWER8 using the new Move From VSR Doubleword instruction to gains some cycles from FP to GRP register move.
* PowerPC: Optimized finite/finitef for POWER8Adhemerval Zanella2014-02-272-0/+57
| | | | | | This patch add a optimized finite/finitef implementation for POWER8 using the new Move From VSR Doubleword instruction to gains some cycles from FP to GRP register move.
* PowerPC: Optimized isinf/isinff for POWER8Adhemerval Zanella2014-02-272-0/+62
| | | | | | This patch add a optimized isinf/isinff implementation for POWER8 using the new Move From VSR Doubleword instruction to gains some cycles from FP to GRP register move.
* PowerPC: Optimized isnan/isnanf for POWER8Adhemerval Zanella2014-02-272-0/+54
| | | | | | This patch add a optimized isnan/isnanf implementation for POWER8 using the new Move From VSR Doubleword instruction to gains some cycles from FP to GRP register move.
* PowerPC: Adjust multiarch Implies for PowerPC64Adhemerval Zanella2013-12-133-0/+3
| | | | | | This patch adds Implies files on multiarch folder for POWER chips so multirach is enabled when building with --with-cpu and powerN option.
* PowerPC: Enable POWER8 platform sans hwcap bits.Ryan S. Arnold2013-06-241-0/+2