| Commit message | Author | Age | Files | Lines |
|
This patch cleans up the multiarch bzero for powerpc64 by removing
the multiarch objects and using instead the memset-embedded
implementation present in each multiarch optimization. The
generated code is essentially the same, except for TB_TOCLESS (which
is not essential).
|
For PPC64, all the wrappers at sysdeps are superfluous: they are
basically the same implementation as math/w_sqrt.c with the
'#ifdef _IEEE_LIBM'. The power4 version just forces use of the 'fsqrt'
instruction with inline assembly, which is already handled by the
math_private.h __ieee754_sqrt implementation.
|
This patch adds Implies files to the multiarch folder for POWER chips so
multiarch is enabled when building with --with-cpu and a powerN
option.
|
http://sourceware.org/ml/libc-alpha/2013-08/msg00104.html
One of the things I noticed when looking at power7 timing is that rlwimi
is cracked and the two resulting insns have a register dependency.
That makes it a little slower than the equivalent rldimi.
* sysdeps/powerpc/powerpc64/memset.S: Replace rlwimi with
insrdi. Formatting.
* sysdeps/powerpc/powerpc64/power4/memset.S: Likewise.
* sysdeps/powerpc/powerpc64/power6/memset.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/memset.S: Likewise.
* sysdeps/powerpc/powerpc32/power4/memset.S: Likewise.
* sysdeps/powerpc/powerpc32/power6/memset.S: Likewise.
* sysdeps/powerpc/powerpc32/power7/memset.S: Likewise.
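As a rough illustration of where an insert instruction such as rlwimi or insrdi is useful in memset, the fill byte has to be replicated across a whole register before wide stores can be used. The C sketch below shows that replication with shifts and ORs. The function name replicate_byte is hypothetical and this is not the glibc assembly, only the same idea under the assumption of a 64-bit word.

```c
#include <stdint.h>

/* Hypothetical helper, not part of glibc: replicate the low byte of c
   into all eight bytes of a 64-bit word.  Each step copies the part
   that is already filled into the field directly above it, which is
   the job an insert-under-mask instruction does in the assembly.  */
uint64_t replicate_byte(int c)
{
    uint64_t w = (uint64_t)(unsigned char)c;

    w |= w << 8;   /* bytes 0-1 now hold the fill byte */
    w |= w << 16;  /* bytes 0-3 now hold the fill byte */
    w |= w << 32;  /* bytes 0-7 now hold the fill byte */
    return w;
}
```

Each step depends on the previous one, so the latency of the instruction used for the insert shows up directly at the start of every memset call.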
|
http://sourceware.org/ml/libc-alpha/2013-08/msg00103.html
Little-endian support for memcpy. I spent some time cleaning up the
64-bit power7 memcpy, in order to avoid the extra alignment traps
power7 takes for little-endian. It probably would have been better
to copy the linux kernel version of memcpy.
* sysdeps/powerpc/powerpc32/power4/memcpy.S: Add little endian support.
* sysdeps/powerpc/powerpc32/power6/memcpy.S: Likewise.
* sysdeps/powerpc/powerpc32/power7/memcpy.S: Likewise.
* sysdeps/powerpc/powerpc32/power7/mempcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/memcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/power4/memcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/power6/memcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/memcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/mempcpy.S: Likewise. Make better
use of regs. Use power7 mtocrf. Tidy function tails.
|
http://sourceware.org/ml/libc-alpha/2013-08/msg00102.html
This is a rather large patch due to formatting and renaming. The
formatting changes were to make it possible to compare power7 and
power4 versions of memcmp. Using different register defines came
about while I was wrestling with the code, trying to find spare
registers at one stage. I found it much simpler if we refer to a reg
by the same name throughout a function, so it's better if short-term
multiple use regs like rTMP are referred to using their register
number. I made the cr field usage changes when attempting to reload
rWORDn regs in the exit path to byte swap before comparing when
little-endian. That proved a bad idea due to the pipelining involved
in the main loop; offsets to reload the regs were different the first
time around the loop. Anyway, I left the cr field usage changes in
place for consistency.
Aside from these more-or-less cosmetic changes, I fixed a number of
places where an early exit path restores regs unnecessarily, removed
some dead code, and optimised one or two exits.
* sysdeps/powerpc/powerpc64/power7/memcmp.S: Add little-endian support.
Formatting. Consistently use rXXX register defines or rN defines.
Use early exit labels that avoid restoring unused non-volatile regs.
Make cr field use more consistent with rWORDn compares. Rename
regs used as shift registers for unaligned loop, using rN defines
for short lifetime/multiple use regs.
* sysdeps/powerpc/powerpc64/power4/memcmp.S: Likewise.
* sysdeps/powerpc/powerpc32/power7/memcmp.S: Likewise. Exit with
addi 1,1,64 to pop stack frame. Simplify return value code.
* sysdeps/powerpc/powerpc32/power4/memcmp.S: Likewise.
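The byte swap before comparing mentioned above matters because memcmp orders buffers by the first differing byte in memory, while an integer compare of words loaded on a little-endian machine is decided by the last byte in memory. A minimal C sketch of the idea follows. The names swap64, host_is_little_endian and wordwise_memcmp are hypothetical, the length is assumed to be a multiple of 8, and this is not the code in memcmp.S.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reverse the byte order of a 64-bit word.  */
static uint64_t swap64(uint64_t x)
{
    x = ((x & 0x00FF00FF00FF00FFULL) << 8)  | ((x >> 8)  & 0x00FF00FF00FF00FFULL);
    x = ((x & 0x0000FFFF0000FFFFULL) << 16) | ((x >> 16) & 0x0000FFFF0000FFFFULL);
    return (x << 32) | (x >> 32);
}

/* Detect the byte order of the machine running the code.  */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Hypothetical helper: compare n bytes, where n is assumed to be a
   multiple of 8.  Returns a negative, zero or positive value with the
   same sign as memcmp.  */
int wordwise_memcmp(const void *a, const void *b, size_t n)
{
    const unsigned char *pa = a;
    const unsigned char *pb = b;

    for (size_t i = 0; i < n; i += 8) {
        uint64_t wa, wb;

        memcpy(&wa, pa + i, 8);
        memcpy(&wb, pb + i, 8);
        if (wa != wb) {
            /* Swap only on the exit path, so that the most significant
               byte of each word is the first byte in memory.  */
            if (host_is_little_endian()) {
                wa = swap64(wa);
                wb = swap64(wb);
            }
            return wa < wb ? -1 : 1;
        }
    }
    return 0;
}
```

Equality tests do not depend on byte order, so the loop itself can stay unchanged and only the unequal exit pays for the swap.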
|
http://sourceware.org/ml/libc-alpha/2013-08/msg00099.html
More little-endian support. I leave the main strcmp loops unchanged,
(well, except for renumbering rTMP to something other than r0 since
it's needed in an addi insn) and modify the tail for little-endian.
I noticed some of the big-endian tail code was a little untidy so have
cleaned that up too.
* sysdeps/powerpc/powerpc64/strcmp.S (rTMP2): Define as r0.
(rTMP): Define as r11.
(strcmp): Add little-endian support. Optimise tail.
* sysdeps/powerpc/powerpc32/strcmp.S: Similarly.
* sysdeps/powerpc/powerpc64/strncmp.S: Likewise.
* sysdeps/powerpc/powerpc32/strncmp.S: Likewise.
* sysdeps/powerpc/powerpc64/power4/strncmp.S: Likewise.
* sysdeps/powerpc/powerpc32/power4/strncmp.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/strncmp.S: Likewise.
* sysdeps/powerpc/powerpc32/power7/strncmp.S: Likewise.
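To illustrate why only the tail needs endian-specific handling: once the main loop has found a pair of words that differ, the position of the first differing byte in memory depends on which end of the register holds the first byte. The C sketch below makes that explicit. The function first_diff_byte is hypothetical and is written as a plain loop for clarity, whereas real tail code would use bit-counting instructions.

```c
#include <stdint.h>

/* Hypothetical helper: given two 64-bit words loaded from memory,
   return the index (0..7, in memory order) of the first byte that
   differs, or -1 if the words are equal.  On little-endian the first
   byte in memory is the least significant byte of the register; on
   big-endian it is the most significant.  */
int first_diff_byte(uint64_t wa, uint64_t wb, int little_endian)
{
    uint64_t diff = wa ^ wb;

    for (int i = 0; i < 8; i++) {
        int shift = little_endian ? 8 * i : 8 * (7 - i);

        if ((diff >> shift) & 0xFF)
            return i;
    }
    return -1;
}
```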
|
Retain a single copy of the mp code in power4 instead of the two
identical copies in powerpc32 and powerpc64.
|
Syncs up with generic code.
|
This includes the overridden mpa.c in power4.
|
Fixed comment style and clearer wording in some cases.
|
The power4-specific mpa.c depended on some global variables that were
removed by earlier patches. Also, it did not define mpone and mptwo.
|
In the past the "-ftree-loop-linear" switch provided a measurable
improvement in performance for certain functions. At some point it
was assigned as the responsibility of Graphite in GCC. It has been
found that even with Graphite enabled this flag no longer provides
any appreciable improvement over the baseline.
Graphite now has some open bugs which need to be fixed in order for it
to provide measurable performance improvements but it lacks active
development. As a result some compiler distributors may disable
Graphite. If Graphite is disabled then building GLIBC will fail if
the "-ftree-loop-linear" switch is used.
This patch removes the use of "-ftree-loop-linear" as unnecessary.
|
Entire tree edited via find | grep | sed.
|
Specify .machine power6 to get ISA-V2.0 branch hints. Unroll loops
and avoid branch mispredicts for > 31 bytes memset case.
* sysdeps/powerpc/powerpc64/power6/memset.S: Likewise.
Remove toc ref to __cache_line_size.
* sysdeps/powerpc/powerpc32/power4/memcmp.S: Specify .machine power4
to get ISA-V2.0 branch hints.
* sysdeps/powerpc/powerpc32/power4/memcpy.S: Likewise.
* sysdeps/powerpc/powerpc32/power4/memset.S: Likewise.
* sysdeps/powerpc/powerpc32/power6/memcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/power4/memcmp.S: Likewise.
* sysdeps/powerpc/powerpc64/power4/memcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/power4/memset.S: Likewise.
Remove toc ref to __cache_line_size.
* sysdeps/powerpc/powerpc32/power6/fpu/s_llrint.S:
Include math_ldbl_opt.h.
|
* sysdeps/powerpc/powerpc32/power5/fpu/Implies: New file.
* sysdeps/powerpc/powerpc32/power5+/fpu/Implies: New file.
* sysdeps/powerpc/powerpc32/power6/fpu/Implies: New file.
* sysdeps/powerpc/powerpc32/power6x/fpu/Implies: New file.
* sysdeps/powerpc/powerpc64/970/fpu/Implies: New file.
* sysdeps/powerpc/powerpc64/power5/fpu/Implies: New file.
* sysdeps/powerpc/powerpc64/power5+/fpu/Implies: New file.
* sysdeps/powerpc/powerpc64/power6/fpu/Implies: New file.
* sysdeps/powerpc/powerpc64/power6x/fpu/Implies: New file.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/970/fpu/Implies: New file.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/power4/fpu/Implies:
New file.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/power5/fpu/Implies:
New file.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/power5+/fpu/Implies:
New file.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/power6/fpu/Implies:
New file.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/power6x/fpu/Implies:
New file.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/970/fpu/Implies: New file.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/power4/fpu/Implies:
New file.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/power5/fpu/Implies:
New file.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/power5+/fpu/Implies:
New file.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/power6/fpu/Implies:
New file.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/power6x/fpu/Implies:
New file.
2007-05-31 Steven Munroe <sjmunroe@us.ibm.com>
* sysdeps/powerpc/powerpc32/powerpc64/fpu/s_llrint.S: Move.
* sysdeps/powerpc/powerpc32/power4/fpu/s_llrint.S: To here.
* sysdeps/powerpc/powerpc32/powerpc64/fpu/s_llrintf.S: Move.
* sysdeps/powerpc/powerpc32/power4/fpu/s_llrintf.S: To here.
* sysdeps/powerpc/powerpc32/powerpc64/fpu/s_llround.S: Move.
* sysdeps/powerpc/powerpc32/power4/fpu/s_llround.S: To here.
* sysdeps/powerpc/powerpc32/powerpc64/fpu/s_llroundf.S: Move.
* sysdeps/powerpc/powerpc32/power4/fpu/s_llroundf.S: To here.
2007-05-22 Steven Munroe <sjmunroe@us.ibm.com>
* sysdeps/powerpc/powerpc32/power5+/fpu/s_round.S
(LONG_DOUBLE_COMPAT): Specify correct version, GLIBC_2_1.
* sysdeps/powerpc/powerpc32/power5+/fpu/s_trunc.S
(LONG_DOUBLE_COMPAT): Specify correct version, GLIBC_2_1.
* sysdeps/powerpc/powerpc64/power5+/fpu/s_round.S
(LONG_DOUBLE_COMPAT): Specify correct version, GLIBC_2_1.
* sysdeps/powerpc/powerpc64/power5+/fpu/s_trunc.S
(LONG_DOUBLE_COMPAT): Specify correct version, GLIBC_2_1.
2007-05-21 Steven Munroe <sjmunroe@us.ibm.com>
* sysdeps/powerpc/powerpc32/power4/fpu/slowexp.c: New file.
* sysdeps/powerpc/powerpc32/power4/fpu/w_sqrt.c: New file.
* sysdeps/powerpc/powerpc64/power4/fpu/slowexp.c: New file.
* sysdeps/powerpc/powerpc64/power4/fpu/w_sqrt.c: New file.
2007-03-15 Steven Munroe <sjmunroe@us.ibm.com>
* sysdeps/powerpc/powerpc32/powerpc64/fpu/s_llrint.S
[LONG_DOUBLE_COMPAT]: Add compat_symbol for llrintl@@GLIBC_2_1.
2006-02-13 Steven Munroe <sjmunroe@us.ibm.com>
* sysdeps/powerpc/powerpc32/power6/fpu/s_llrint.S: New file.
* sysdeps/powerpc/powerpc32/power6/fpu/s_llrintf.S: New file.
* sysdeps/powerpc/powerpc32/power6/fpu/s_llround.S: New file.
* sysdeps/powerpc/powerpc32/power6/fpu/s_llroundf.S: New file.
2006-10-20 Steven Munroe <sjmunroe@us.ibm.com>
* sysdeps/powerpc/powerpc32/power4/fpu/slowpow.c: New file.
* sysdeps/powerpc/powerpc64/power4/fpu/slowpow.c: New file.
2006-10-03 Steven Munroe <sjmunroe@us.ibm.com>
* sysdeps/powerpc/powerpc32/powerpc64/fpu/s_llround.S: New file.
* sysdeps/powerpc/powerpc32/powerpc64/fpu/s_llroundf.S: New file.
* sysdeps/powerpc/powerpc32/powerpc64/fpu/Makefile: Moved.
* sysdeps/powerpc/powerpc32/powerpc64/fpu/mpa.c: Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/Makefile: To here.
* sysdeps/powerpc/powerpc32/power4/fpu/mpa.c: Likewise.
2006-09-29 Steven Munroe <sjmunroe@us.ibm.com>
* sysdeps/powerpc/powerpc32/power6x/fpu/s_lrint.S: New file.
* sysdeps/powerpc/powerpc32/power6x/fpu/s_lround.S: New file.
* sysdeps/powerpc/powerpc64/power6x/fpu/s_llrint.S: New file.
* sysdeps/powerpc/powerpc64/power6x/fpu/s_llround.S: New file.
2006-09-28 Steven Munroe <sjmunroe@us.ibm.com>
* sysdeps/powerpc/powerpc32/power5+/fpu/s_llround.S: New file.
* sysdeps/powerpc/powerpc32/power5+/fpu/s_llroundf.S: New file.
* sysdeps/powerpc/powerpc32/power5+/fpu/s_lround.S: New file.
* sysdeps/powerpc/powerpc32/power6x/fpu/Implies: New file.
* sysdeps/powerpc/powerpc64/power5+/fpu/s_llround.S: New file.
* sysdeps/powerpc/powerpc64/power6x/fpu/Implies: New file.
2006-08-31 Steven Munroe <sjmunroe@us.ibm.com>
* sysdeps/powerpc/powerpc32/powerpc64/fpu/Makefile: New file.
* sysdeps/powerpc/powerpc32/powerpc64/fpu/mpa.c: New file.
* sysdeps/powerpc/powerpc64/power4/fpu/Makefile: New file.
* sysdeps/powerpc/powerpc64/power4/fpu/mpa.c: New file.
2006-06-15 Steven Munroe <sjmunroe@us.ibm.com>
* sysdeps/powerpc/powerpc32/power5+/fpu/s_ceil.S: New file.
* sysdeps/powerpc/powerpc32/power5+/fpu/s_ceilf.S: New file.
* sysdeps/powerpc/powerpc32/power5+/fpu/s_floor.S: New file.
* sysdeps/powerpc/powerpc32/power5+/fpu/s_floorf.S: New file.
* sysdeps/powerpc/powerpc32/power5+/fpu/s_round.S: New file.
* sysdeps/powerpc/powerpc32/power5+/fpu/s_roundf.S: New file.
* sysdeps/powerpc/powerpc32/power5+/fpu/s_trunc.S: New file.
* sysdeps/powerpc/powerpc32/power5+/fpu/s_truncf.S: New file.
* sysdeps/powerpc/powerpc64/power5+/fpu/s_ceil.S: New file.
* sysdeps/powerpc/powerpc64/power5+/fpu/s_ceilf.S: New file.
* sysdeps/powerpc/powerpc64/power5+/fpu/s_floor.S: New file.
* sysdeps/powerpc/powerpc64/power5+/fpu/s_floorf.S: New file.
* sysdeps/powerpc/powerpc64/power5+/fpu/s_round.S: New file.
* sysdeps/powerpc/powerpc64/power5+/fpu/s_roundf.S: New file.
* sysdeps/powerpc/powerpc64/power5+/fpu/s_trunc.S: New file.
* sysdeps/powerpc/powerpc64/power5+/fpu/s_truncf.S: New file.
2006-03-20 Steven Munroe <sjmunroe@us.ibm.com>
* sysdeps/powerpc/powerpc32/powerpc64/fpu/s_llrint.S: New file.
* sysdeps/powerpc/powerpc32/powerpc64/fpu/s_llrintf.S: New file.
2007-06-01 Steven Munroe <sjmunroe@us.ibm.com>
* sysdeps/powerpc/powerpc32/power6/memset.S: New file.
* sysdeps/powerpc/powerpc64/power6/memset.S: New file.
2007-05-31 Steven Munroe <sjmunroe@us.ibm.com>
* sysdeps/powerpc/powerpc32/970/Implies: New file.
* sysdeps/powerpc/powerpc32/power5/Implies: New file.
* sysdeps/powerpc/powerpc32/power5+/Implies: New file.
* sysdeps/powerpc/powerpc32/power6/Implies: New file.
* sysdeps/powerpc/powerpc32/power6x/Implies: New file.
* sysdeps/powerpc/powerpc64/970/Implies: New file.
* sysdeps/powerpc/powerpc64/power5/Implies: New file.
* sysdeps/powerpc/powerpc64/power5+/Implies: New file.
* sysdeps/powerpc/powerpc64/power6/Implies: New file.
* sysdeps/powerpc/powerpc64/power6x/Implies: New file.
2007-05-21 Steven Munroe <sjmunroe@us.ibm.com>
* sysdeps/powerpc/powerpc32/power4/memset.S: New file.
2007-03-13 Steven Munroe <sjmunroe@us.ibm.com>
* sysdeps/powerpc/powerpc64/memcpy.S: Improve aligned loop to minimize
branch mispredicts. Ensure that cache line crossing does not impact
dispatch grouping.
2006-12-13 Steven Munroe <sjmunroe@us.ibm.com>
* sysdeps/powerpc/powerpc64/power4/memcopy.h: Replace with include
"../../powerpc32/power4/memcopy.h".
* sysdeps/powerpc/powerpc64/power4/wordcopy.c: Replace with include
"../../powerpc32/power4/wordcopy.c".
2006-10-03 Steven Munroe <sjmunroe@us.ibm.com>
* sysdeps/powerpc/powerpc32/powerpc64/Makefile: Moved.
* sysdeps/powerpc/powerpc32/powerpc64/memcopy.h: Likewise.
* sysdeps/powerpc/powerpc32/powerpc64/wordcopy.c: Likewise.
* sysdeps/powerpc/powerpc32/power4/Makefile: To here.
* sysdeps/powerpc/powerpc32/power4/memcopy.h: Likewise.
* sysdeps/powerpc/powerpc32/power4/wordcopy.c: Likewise.
2006-09-10 Steven Munroe <sjmunroe@us.ibm.com>
* sysdeps/powerpc/powerpc32/power6/memcpy.S: New file.
2006-08-31 Steven Munroe <sjmunroe@us.ibm.com>
* sysdeps/powerpc/powerpc32/power6/wordcopy.c: New file.
* sysdeps/powerpc/powerpc32/powerpc64/Makefile: New file.
* sysdeps/powerpc/powerpc32/powerpc64/memcopy.h: New file.
* sysdeps/powerpc/powerpc32/powerpc64/wordcopy.c: New file.
* sysdeps/powerpc/powerpc64/power4/Makefile: New file.
* sysdeps/powerpc/powerpc64/power4/memcopy.h: New file.
* sysdeps/powerpc/powerpc64/power4/wordcopy.c: New file.
* sysdeps/powerpc/powerpc64/power6/wordcopy.c: New file.
2006-07-06 Steven Munroe <sjmunroe@us.ibm.com>
* sysdeps/powerpc/powerpc64/power6/memcpy.S: New file.
2006-03-20 Steven Munroe <sjmunroe@us.ibm.com>
* sysdeps/powerpc/powerpc32/power4/memcmp.S: New file.
* sysdeps/powerpc/powerpc32/power4/memcpy.S: New file.
* sysdeps/powerpc/powerpc32/power4/memset.S: New file.
* sysdeps/powerpc/powerpc32/power4/strncmp.S: New file.
* sysdeps/powerpc/powerpc64/power4/memcmp.S: New file.
* sysdeps/powerpc/powerpc64/power4/memcpy.S: New file.
* sysdeps/powerpc/powerpc64/power4/strncmp.S: New file.
|