| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
Combine both implementations into a single file to allow
building twice with appropriate multiarch support when possible.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This started as a trivial change to Anton's rawmemchr. I got
carried away. This is a hybrid between P8's asympotically
faster 64B checks with extremely efficient small string checks
e.g <64B (and sometimes a little bit more depending on alignment).
The second trick is to align to 64B by running a 48B checking loop
16B at a time until we naturally align to 64B (i.e checking 48/96/144
bytes/iteration based on the alignment after the first 5 comparisons).
This allieviates the need to check page boundaries.
Finally, explicly use the P7 strlen with the runtime loader when building
P9. We need to be cautious about vector/vsx extensions here on P9 only
builds.
|
|
|
|
|
|
|
|
|
| |
This defines the macro such that it should behave best on all
supported powerpc targets. Likewise, this allows us to remove the
ppc64le specific s_fmaf128.c.
I have verified powerpc64le multiarch and powerpc64le power9
no-multiarch builds continue to generate optimize fmaf128.
|
|
|
|
|
|
|
|
|
|
|
| |
This version uses vector instructions and is up to 60% faster on medium
matches and up to 90% faster on long matches, compared to the POWER7
version. A few examples:
__rawmemchr_power9 __rawmemchr_power7
Length 32, alignment 0: 2.27566 3.77765
Length 64, alignment 2: 2.46231 3.51064
Length 1024, alignment 0: 17.3059 32.6678
|
|
|
|
|
|
|
|
|
|
| |
Add stpcpy support to the POWER9 strcpy. This is up to 40% faster on
small strings and up to 90% faster on long relatively unaligned strings,
compared to the POWER8 version. A few examples:
__stpcpy_power9 __stpcpy_power8
Length 20, alignments in bytes 4/ 4: 2.58246 4.8788
Length 1024, alignments in bytes 1/ 6: 24.8186 47.8528
|
|
|
|
|
|
|
|
|
|
| |
This version uses VSX store vector with length instructions and is
significantly faster on small strings and relatively unaligned large
strings, compared to the POWER8 version. A few examples:
__strcpy_power9 __strcpy_power8
Length 16, alignments in bytes 0/ 0: 2.52454 4.62695
Length 412, alignments in bytes 4/ 0: 11.6 22.9185
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
strcmp is used while resolving PLT references. Vector registers
should not be used during this. The P9 strcmp makes heavy use of
vector registers, so it should be avoided in rtld.
This prevents quiet vector register corruption when glibc is configured
with --disable-multi-arch and --with-cpu=power9. This can be seen with
test-float64x-compat_totalordermag during the first call into
totalordermagf64x@GLIBC_2.27.
Add a guard to fallback to the power8 implementation when building
power9 strcmp for libraries other than libc.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
|
|
|
|
|
|
| |
Adds a POWER9 version of fmaf128 that uses the xsmaddqp
instruction.
Co-authored-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds a new macro, libm_alias_finite, to define all _finite
symbol. It sets all _finite symbol as compat symbol based on its first
version (obtained from the definition at built generated first-versions.h).
The <fn>f128_finite symbols were introduced in GLIBC 2.26 and so need
special treatment in code that is shared between long double and float128.
It is done by adding a list, similar to internal symbol redifinition,
on sysdeps/ieee754/float128/float128_private.h.
Alpha also needs some tricky changes to ensure we still emit 2 compat
symbols for sqrt(f).
Passes buildmanyglibc.
Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Also, change sources.redhat.com to sourceware.org.
This patch was automatically generated by running the following shell
script, which uses GNU sed, and which avoids modifying files imported
from upstream:
sed -ri '
s,(http|ftp)(://(.*\.)?(gnu|fsf|sourceware)\.org($|[^.]|\.[^a-z])),https\2,g
s,(http|ftp)(://(.*\.)?)sources\.redhat\.com($|[^.]|\.[^a-z]),https\2sourceware.org\4,g
' \
$(find $(git ls-files) -prune -type f \
! -name '*.po' \
! -name 'ChangeLog*' \
! -path COPYING ! -path COPYING.LIB \
! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \
! -path manual/texinfo.tex ! -path scripts/config.guess \
! -path scripts/config.sub ! -path scripts/install-sh \
! -path scripts/mkinstalldirs ! -path scripts/move-if-change \
! -path INSTALL ! -path locale/programs/charmap-kw.h \
! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \
! '(' -name configure \
-execdir test -f configure.ac -o -f configure.in ';' ')' \
! '(' -name preconfigure \
-execdir test -f preconfigure.ac ';' ')' \
-print)
and then by running 'make dist-prepare' to regenerate files built
from the altered files, and then executing the following to cleanup:
chmod a+x sysdeps/unix/sysv/linux/riscv/configure
# Omit irrelevant whitespace and comment-only changes,
# perhaps from a slightly-different Autoconf version.
git checkout -f \
sysdeps/csky/configure \
sysdeps/hppa/configure \
sysdeps/riscv/configure \
sysdeps/unix/sysv/linux/csky/configure
# Omit changes that caused a pre-commit check to fail like this:
# remote: *** error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines
git checkout -f \
sysdeps/powerpc/powerpc64/ppc-mcount.S \
sysdeps/unix/sysv/linux/s390/s390-64/syscall.S
# Omit change that caused a pre-commit check to fail like this:
# remote: *** error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline
git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S
|
|
|
|
|
|
|
| |
* All files with FSF copyright notices: Update copyright dates
using scripts/update-copyrights.
* locale/programs/charmap-kw.h: Regenerated.
* locale/programs/locfile-kw.h: Likewise.
|
|
|
|
|
|
| |
This patch moves little endian specific POWER9 optimization files to
sysdeps/powerpc/powerpc64/le and creates POWER9 ifunc functions
only for little endian.
|
|
The creation of the divergent sysdeps directory for powerpc64le
commit 2f7f3cd8cd302bb10908c86f3f7b349df0a78e6a
Author: Paul E. Murphy <murphyp@linux.vnet.ibm.com>
Date: Fri Jul 15 18:04:40 2016 -0500
powerpc64le: Create divergent sysdep directory for powerpc64le.
allowed float128 to be enabled for powerpc64le (little-endian) and not
for powerpc64 (big-endian). Since the only intended difference between
them was the presence or absence of the float128 interface, the sysdeps
directory for powerpc64le explicitly reused the files from powerpc64
(through the use of Implies files).
Although this works, it also means that files under the powerpc64
directory might be preferred over files under powerpc64le. For
instance, on a build for powerpc64le with target set to power9, a file
from powerpc64/power5 might get built, even though a file with the same
name exists in powerpc64le/power8. That happens because the processor
hierarchy was only defined in the sysdeps directory for powerpc64 (and
borrowed by powerpc64le).
This patch fixes this behavior, by creating new subdirectories under
powerpc64 (i.e.: powerpc64/be and powerpc64/le) and creating new Implies
files to provide the hierarchy of processors for powerpc64 and
powerpc64le separately. These changes have no effect on installed,
stripped binaries (which remain unchanged).
Tested that installed stripped binaries are unchanged and that there are
no regressions on powerpc64 and powerpc64le.
|