| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
| |
We found that string functions were using AND+ADDP
to find the nibble/syndrome mask but there is an easier
opportunity through `SHRN dst.8b, src.8h, 4` (shift
right every 2 bytes by 4 and narrow to 1 byte) and has
same latency on all SIMD ARMv8 targets as ADDP. There
are also possible gaps for memcmp but that's for
another patch.
We see 10-20% savings for small-mid size cases (<=128)
which are primary cases for general workloads.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 7061 files FOO.
I then removed trailing white space from math/tgmath.h,
support/tst-support-open-dev-null-range.c, and
sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following
obscure pre-commit check failure diagnostics from Savannah. I don't
know why I run into these diagnostics whereas others evidently do not.
remote: *** 912-#endif
remote: *** 913:
remote: *** 914-
remote: *** error: lines with trailing whitespace found
...
remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
DELOUSE was added to asm code to make them compatible with non-LP64
ABIs, but it is an unfortunate name and the code was not compatible
with ABIs where pointer and size_t are different. Glibc currently
only supports the LP64 ABI so these macros are not really needed or
tested, but for now the name is changed to be more meaningful instead
of removing them completely.
Some DELOUSE macros were dropped: clone, strlen and strnlen used it
unnecessarily.
The out of tree ILP32 patches are currently not maintained and will
likely need a rework to rebase them on top of the time64 changes.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add support for MTE to strcpy. Regression tested with xcheck and benchmarked
with glibc's benchtests on the Cortex-A53, Cortex-A72, and Neoverse N1.
The existing implementation assumes that any access to the pages in which the
string resides is safe. This assumption is not true when MTE is enabled. This
patch updates the algorithm to ensure that accesses remain within the bounds
of an MTE tag (16-byte chunks) and improves overall performance.
Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes the optimized implementation of strcpy and strnlen
on a big-endian arm64 machine.
The optimized method uses neon, which can process 128bit with one
instruction. On a big-endian machine, the bit order should be reversed
for the whole 128-bits double word. But with instuction
rev64 datav.16b, datav.16b
it reverses 64bits in the two halves rather than reversing 128bits.
There is no such instruction as rev128 to reverse the 128bits, but we
can fix this by loading the data registers accordingly.
Fixes 0237b61526e7("aarch64: Optimized implementation of strcpy") and
2911cb68ed3d("aarch64: Optimized implementation of strnlen").
Signed-off-by: Lexi Shao <shaolexi@huawei.com>
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
|
| |
|
|
|
|
| |
Checked on aarch64-linux-gnu.
|
|
|
|
|
|
|
|
|
|
|
| |
Optimize the strcpy implementation by using vector loads and operations
in main loop.Compared to aarch64/strcpy.S, it reduces latency of cases
in bench-strlen by 5%~18% when the length of src is greater than 64
bytes, with gains throughout the benchmark.
Checked on aarch64-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Also, change sources.redhat.com to sourceware.org.
This patch was automatically generated by running the following shell
script, which uses GNU sed, and which avoids modifying files imported
from upstream:
sed -ri '
s,(http|ftp)(://(.*\.)?(gnu|fsf|sourceware)\.org($|[^.]|\.[^a-z])),https\2,g
s,(http|ftp)(://(.*\.)?)sources\.redhat\.com($|[^.]|\.[^a-z]),https\2sourceware.org\4,g
' \
$(find $(git ls-files) -prune -type f \
! -name '*.po' \
! -name 'ChangeLog*' \
! -path COPYING ! -path COPYING.LIB \
! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \
! -path manual/texinfo.tex ! -path scripts/config.guess \
! -path scripts/config.sub ! -path scripts/install-sh \
! -path scripts/mkinstalldirs ! -path scripts/move-if-change \
! -path INSTALL ! -path locale/programs/charmap-kw.h \
! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \
! '(' -name configure \
-execdir test -f configure.ac -o -f configure.in ';' ')' \
! '(' -name preconfigure \
-execdir test -f preconfigure.ac ';' ')' \
-print)
and then by running 'make dist-prepare' to regenerate files built
from the altered files, and then executing the following to cleanup:
chmod a+x sysdeps/unix/sysv/linux/riscv/configure
# Omit irrelevant whitespace and comment-only changes,
# perhaps from a slightly-different Autoconf version.
git checkout -f \
sysdeps/csky/configure \
sysdeps/hppa/configure \
sysdeps/riscv/configure \
sysdeps/unix/sysv/linux/csky/configure
# Omit changes that caused a pre-commit check to fail like this:
# remote: *** error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines
git checkout -f \
sysdeps/powerpc/powerpc64/ppc-mcount.S \
sysdeps/unix/sysv/linux/s390/s390-64/syscall.S
# Omit change that caused a pre-commit check to fail like this:
# remote: *** error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline
git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S
|
|
|
|
|
|
|
| |
* All files with FSF copyright notices: Update copyright dates
using scripts/update-copyrights.
* locale/programs/charmap-kw.h: Regenerated.
* locale/programs/locfile-kw.h: Likewise.
|
|
|
|
|
|
|
| |
* All files with FSF copyright notices: Update copyright dates
using scripts/update-copyrights.
* locale/programs/charmap-kw.h: Regenerated.
* locale/programs/locfile-kw.h: Likewise.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* sysdeps/aarch64/crti.S: Add include of sysdep.h.
(call_weak_fn): Use PTR_REG to get correct reg name in ILP32.
* sysdeps/aarch64/dl-irel.h: Add include of sysdep.h.
(elf_irela): Use AARCH64_R macro to get correct relocation in ILP32.
* sysdeps/aarch64/dl-machine.h: Add include of sysdep.h.
(elf_machine_load_address, RTLD_START, RTLD_START_1, RTLD_START,
elf_machine_type_class, ELF_MACHINE_JMP_SLOT, elf_machine_rela,
elf_machine_lazy_rel): Add ifdef's for ILP32 support.
* sysdeps/aarch64/dl-tlsdesc.S (_dl_tlsdesc_return,
_dl_tlsdesc_return_lazy, _dl_tlsdesc_dynamic,
_dl_tlsdesc_resolve_hold): Extend pointers in ILP32, use PTR_REG
to get correct reg name for ILP32.
* sysdeps/aarch64/dl-trampoline.S (ip01): New Macro.
(RELA_SIZE): New Macro.
(_dl_runtime_resolve, _dl_runtime_profile): Use new macros and PTR_REG
to support ILP32.
* sysdeps/aarch64/jmpbuf-unwind.h (_JMPBUF_CFA_UNWINDS_ADJ): Add
cast for ILP32 mode.
* sysdeps/aarch64/memcmp.S (memcmp): Extend arg pointers for ILP32 mode.
* sysdeps/aarch64/memcpy.S (memmove, memcpy): Ditto.
* sysdeps/aarch64/memset.S (__memset): Ditto.
* sysdeps/aarch64/strchr.S (strchr): Ditto.
* sysdeps/aarch64/strchrnul.S (__strchrnul): Ditto.
* sysdeps/aarch64/strcmp.S (strcmp): Ditto.
* sysdeps/aarch64/strcpy.S (strcpy): Ditto.
* sysdeps/aarch64/strlen.S (__strlen): Ditto.
* sysdeps/aarch64/strncmp.S (strncmp): Ditto.
* sysdeps/aarch64/strnlen.S (strnlen): Ditto.
* sysdeps/aarch64/strrchr.S (strrchr): Ditto.
* sysdeps/unix/sysv/linux/aarch64/clone.S: Ditto.
* sysdeps/unix/sysv/linux/aarch64/setcontext.S (__setcontext): Ditto.
* sysdeps/unix/sysv/linux/aarch64/swapcontext.S (__swapcontext): Ditto.
* sysdeps/aarch64/__longjmp.S (__longjmp): Extend pointers in ILP32,
change PTR_MANGLE call to use register numbers instead of names.
* sysdeps/unix/sysv/linux/aarch64/getcontext.S (__getcontext): Ditto.
* sysdeps/aarch64/setjmp.S (__sigsetjmp): Extend arg pointers for
ILP32 mode, change PTR_MANGLE calls to use register numbers.
* sysdeps/aarch64/start.S (_start): Ditto.
* sysdeps/aarch64/nptl/bits/pthreadtypes.h
(__PTHREAD_RWLOCK_INT_FLAGS_SHARED): New define.
(__SIZEOF_PTHREAD_ATTR_T, __SIZEOF_PTHREAD_MUTEX_T,
__SIZEOF_PTHREAD_MUTEXATTR_T, __SIZEOF_PTHREAD_COND_T,
__SIZEOF_PTHREAD_COND_COMPAT_T, __SIZEOF_PTHREAD_CONDATTR_T,
__SIZEOF_PTHREAD_RWLOCK_T, __SIZEOF_PTHREAD_RWLOCKATTR_T,
__SIZEOF_PTHREAD_BARRIER_T, __SIZEOF_PTHREAD_BARRIERATTR_T):
Make defined values dependent on __ILP32__.
* sysdeps/aarch64/nptl/bits/semaphore.h (__SIZEOF_SEM_T): Change define.
(sem_t): Change __align type.
* sysdeps/aarch64/sysdep.h (AARCH64_R, PTR_REG, PTR_LOG_SIZE, DELOUSE,
PTR_SIZE): New Macros.
(LDST_PCREL, LDST_GLOBAL) Update to use PTR_REG.
* sysdeps/unix/sysv/linux/aarch64/bits/fcntl.h (O_LARGEFILE):
Set when in ILP32 mode.
(F_GETLK64, F_SETLK64, F_SETLKW64): Only set in LP64 mode.
* sysdeps/unix/sysv/linux/aarch64/dl-cache.h (DL_CACHE_DEFAULT_ID):
Set elf flags for ILP32.
(add_system_dir): Set ILP32 library directories.
* sysdeps/unix/sysv/linux/aarch64/init-first.c
(_libc_vdso_platform_setup): Set minimum kernel version for ILP32.
* sysdeps/unix/sysv/linux/aarch64/ldconfig.h
(SYSDEP_KNOWN_INTERPRETER_NAMES): Add ILP32 names.
* sysdeps/unix/sysv/linux/aarch64/sigcontextinfo.h (GET_PC, SET_PC):
New Macros.
* sysdeps/unix/sysv/linux/aarch64/sysdep.h: Handle ILP32 pointers.
|
| |
|
|
|