| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds an optimized POWER8 strcpy using unaligned accesses.
For strings up to 16 bytes the implementation first calculate the
string size, like strlen, and issues a memcpy. For larger strings,
source is first aligned to 16 bytes and then tested over a loop that
reads 16 bytes am combine the cmpb results for speedup. Special case is
added for page cross reads.
It shows 30%-60% improvement over the optimized POWER7 one that uses
only aligned accesses.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Replace with IS_IN (libm). Generated code unchanged on x86_64.
* include/math.h: Use IS_IN instead of IS_IN_libm.
* sysdeps/alpha/fpu/s_copysign.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_copysignl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_finitel.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_fmal.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_frexpl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_isinfl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_isnanl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_modfl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_scalbnl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_signbitl.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_copysignl.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_finitel.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_frexpl.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_isinfl.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_isnanl.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_modfl.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_scalbnl.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/s_signbitl.c: Likewise.
* sysdeps/ieee754/ldbl-64-128/w_scalblnl.c: Likewise.
* sysdeps/ieee754/ldbl-opt/s_copysign.c: Likewise.
* sysdeps/ieee754/ldbl-opt/s_finite.c: Likewise.
* sysdeps/ieee754/ldbl-opt/s_frexp.c: Likewise.
* sysdeps/ieee754/ldbl-opt/s_isinf.c: Likewise.
* sysdeps/ieee754/ldbl-opt/s_isnan.c: Likewise.
* sysdeps/ieee754/ldbl-opt/s_ldexp.c: Likewise.
* sysdeps/ieee754/ldbl-opt/s_ldexpl.c: Likewise.
* sysdeps/ieee754/ldbl-opt/s_modf.c: Likewise.
* sysdeps/ieee754/ldbl-opt/s_scalbln.c: Likewise.
* sysdeps/ieee754/ldbl-opt/s_scalbn.c: Likewise.
* sysdeps/powerpc/power5+/fpu/s_modf.c: Likewise.
* sysdeps/powerpc/powerpc32/fpu/s_copysign.S: Likewise.
* sysdeps/powerpc/powerpc32/fpu/s_copysignl.S: Likewise.
* sysdeps/powerpc/powerpc32/fpu/s_isnan.S: Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysign.c: Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finite.c: Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinf.c: Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan.c: Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modf.c: Likewise.
* sysdeps/powerpc/powerpc32/power5/fpu/s_isnan.S: Likewise.
* sysdeps/powerpc/powerpc32/power6/fpu/s_copysign.S: Likewise.
* sysdeps/powerpc/powerpc32/power6/fpu/s_isnan.S: Likewise.
* sysdeps/powerpc/powerpc32/power7/fpu/s_finite.S: Likewise.
* sysdeps/powerpc/powerpc32/power7/fpu/s_isinf.S: Likewise.
* sysdeps/powerpc/powerpc32/power7/fpu/s_isnan.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_copysign.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_copysignl.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_isnan.S: Likewise.
* sysdeps/powerpc/powerpc64/power5/fpu/s_isnan.S: Likewise.
* sysdeps/powerpc/powerpc64/power6/fpu/s_copysign.S: Likewise.
* sysdeps/powerpc/powerpc64/power6/fpu/s_isnan.S: Likewise.
* sysdeps/powerpc/powerpc64/power6x/fpu/s_isnan.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/fpu/s_finite.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/fpu/s_isinf.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/fpu/s_isnan.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/fpu/s_finite.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/fpu/s_isinf.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/fpu/s_isnan.S: Likewise.
* sysdeps/sparc/sparc32/fpu/s_signbitl.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/s_isnan.S: Likewise.
* sysdeps/unix/sysv/linux/alpha/fraiseexcpt.S: Likewise.
|
| |
|
|
|
|
|
| |
This patch adds a binary encoding for 'mtvsrd' instruction to avoid
build failures when assembler does not support POWER8.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds an optimized memset implementation for POWER8. For
sizes from 0 to 255 bytes, a word/doubleword algorithm similar to
POWER7 optimized one is used.
For size higher than 255 two strategies are used:
1. If the constant is different than 0, the memory is written with
altivec vector instruction;
2. If constant is 0, dbcz instructions are used. The loop is unrolled
to clear 512 byte at time.
Using vector instructions increases throughput considerable, with a
double performance for sizes larger than 1024. The dcbz loops unrolls
also shows performance improvement, by doubling throughput for sizes
larger than 8192 bytes.
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes some powerpc32 and powerpc64 builds with
--disable-multi-arch option along with different --with-cpu=powerN.
It cleanups the Implies directories by removing the multiarch
folder for non multiarch config and also fixing two assembly
implementations: powerpc64/power7/strncat.S that is calling the
wrong strlen; and power8/fpu/s_isnan.S that misses the hidden_def and
weak_alias directives.
|
|
|
|
|
| |
This patch fixes the MFVSRD_R3_V1 macro that encodes 'mfvsrd r3,vs1'
(to support old binutils) for little endian.
|
|
|
|
|
|
| |
This patch add a optimized llround/llroundf implementation for POWER8
using the new Move From VSR Doubleword instruction to gains some
cycles from FP to GRP register move.
|
|
|
|
|
|
| |
This patch add a optimized llrint/llrintf implementation for POWER8
using the new Move From VSR Doubleword instruction to gains some
cycles from FP to GRP register move.
|
|
|
|
|
|
| |
This patch add a optimized finite/finitef implementation for POWER8
using the new Move From VSR Doubleword instruction to gains some
cycles from FP to GRP register move.
|
|
|
|
|
|
| |
This patch add a optimized isinf/isinff implementation for POWER8
using the new Move From VSR Doubleword instruction to gains some
cycles from FP to GRP register move.
|
|
|
|
|
|
| |
This patch add a optimized isnan/isnanf implementation for POWER8
using the new Move From VSR Doubleword instruction to gains some
cycles from FP to GRP register move.
|
|
|
|
|
|
| |
This patch adds Implies files on multiarch folder for POWER chips so
multirach is enabled when building with --with-cpu and powerN
option.
|
|
|