about summary refs log tree commit diff
path: root/sysdeps/x86_64/multiarch/strcpy-avx2.S
Commit message (Collapse)AuthorAgeFilesLines
* Update copyright dates with scripts/update-copyrightsPaul Eggert2024-01-011-1/+1
|
* Update copyright dates with scripts/update-copyrightsJoseph Myers2023-01-061-1/+1
|
* x86: Optimize and shrink st{r|p}{n}{cat|cpy}-avx2 functionsNoah Goldstein2022-11-081-921/+315
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Optimizations are: 1. Use more overlapping stores to avoid branches. 2. Reduce how unrolled the aligning copies are (this is more of a code-size save, its a negative for some sizes in terms of perf). 3. For st{r|p}n{cat|cpy} re-order the branches to minimize the number that are taken. Performance Changes: Times are from N = 10 runs of the benchmark suite and are reported as geometric mean of all ratios of New Implementation / Old Implementation. strcat-avx2 -> 0.998 strcpy-avx2 -> 0.937 stpcpy-avx2 -> 0.971 strncpy-avx2 -> 0.793 stpncpy-avx2 -> 0.775 strncat-avx2 -> 0.962 Code Size Changes: function -> Bytes New / Bytes Old -> Ratio strcat-avx2 -> 685 / 1639 -> 0.418 strcpy-avx2 -> 560 / 903 -> 0.620 stpcpy-avx2 -> 592 / 939 -> 0.630 strncpy-avx2 -> 1176 / 2390 -> 0.492 stpncpy-avx2 -> 1268 / 2438 -> 0.520 strncat-avx2 -> 1042 / 2563 -> 0.407 Notes: 1. Because of the significant difference between the implementations they are split into three files. strcpy-avx2.S -> strcpy, stpcpy, strcat strncpy-avx2.S -> strncpy strncat-avx2.S > strncat I couldn't find a way to merge them without making the ifdefs incredibly difficult to follow. Full check passes on x86-64 and build succeeds for all ISA levels w/ and w/o multiarch.
* x86: Add support to build st{p|r}{n}{cpy|cat} with explicit ISA levelNoah Goldstein2022-07-161-1/+4
| | | | | | | | | | | | | | | | | | | | 1. Add default ISA level selection in non-multiarch/rtld implementations. 2. Add ISA level build guards to different implementations. - I.e strcpy-avx2.S which is ISA level 3 will only build if compiled ISA level <= 3. Otherwise there is no reason to include it as we will always use one of the ISA level 4 implementations (strcpy-evex.S). 3. Refactor the ifunc selector and ifunc implementation list to use the ISA level aware wrapper macros that allow functions below the compiled ISA level (with a guranteed replacement) to be skipped. Tested with and without multiarch on x86_64 for ISA levels: {generic, x86-64-v2, x86-64-v3, x86-64-v4} And m32 with and without multiarch.
* Update copyright dates with scripts/update-copyrightsPaul Eggert2022-01-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | I used these shell commands: ../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright (cd ../glibc && git commit -am"[this commit message]") and then ignored the output, which consisted lines saying "FOO: warning: copyright statement not found" for each of 7061 files FOO. I then removed trailing white space from math/tgmath.h, support/tst-support-open-dev-null-range.c, and sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following obscure pre-commit check failure diagnostics from Savannah. I don't know why I run into these diagnostics whereas others evidently do not. remote: *** 912-#endif remote: *** 913: remote: *** 914- remote: *** error: lines with trailing whitespace found ... remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
* Remove "Contributed by" linesSiddhesh Poyarekar2021-09-031-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | We stopped adding "Contributed by" or similar lines in sources in 2012 in favour of git logs and keeping the Contributors section of the glibc manual up to date. Removing these lines makes the license header a bit more consistent across files and also removes the possibility of error in attribution when license blocks or files are copied across since the contributed-by lines don't actually reflect reality in those cases. Move all "Contributed by" and similar lines (Written by, Test by, etc.) into a new file CONTRIBUTED-BY to retain record of these contributions. These contributors are also mentioned in manual/contrib.texi, so we just maintain this additional record as a courtesy to the earlier developers. The following scripts were used to filter a list of files to edit in place and to clean up the CONTRIBUTED-BY file respectively. These were not added to the glibc sources because they're not expected to be of any use in future given that this is a one time task: https://gist.github.com/siddhesh/b5ecac94eabfd72ed2916d6d8157e7dc https://gist.github.com/siddhesh/15ea1f5e435ace9774f485030695ee02 Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* x86-64: Add AVX optimized string/memory functions for RTMH.J. Lu2021-03-291-53/+32
| | | | | | | | | | | | | | | | | Since VZEROUPPER triggers RTM abort while VZEROALL won't, select AVX optimized string/memory functions with xtest jz 1f vzeroall ret 1: vzeroupper ret at function exit on processors with usable RTM, but without 256-bit EVEX instructions to avoid VZEROUPPER inside a transactionally executing RTM region.
* Update copyright dates with scripts/update-copyrightsPaul Eggert2021-01-021-1/+1
| | | | | | | | | | | | | | | | I used these shell commands: ../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright (cd ../glibc && git commit -am"[this commit message]") and then ignored the output, which consisted lines saying "FOO: warning: copyright statement not found" for each of 6694 files FOO. I then removed trailing white space from benchtests/bench-pthread-locks.c and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this diagnostic from Savannah: remote: *** pre-commit check failed ... remote: *** error: lines with trailing whitespace found remote: error: hook declined to update refs/heads/master
* Update copyright dates with scripts/update-copyrights.Joseph Myers2020-01-011-1/+1
|
* Prefer https to http for gnu.org and fsf.org URLsPaul Eggert2019-09-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Also, change sources.redhat.com to sourceware.org. This patch was automatically generated by running the following shell script, which uses GNU sed, and which avoids modifying files imported from upstream: sed -ri ' s,(http|ftp)(://(.*\.)?(gnu|fsf|sourceware)\.org($|[^.]|\.[^a-z])),https\2,g s,(http|ftp)(://(.*\.)?)sources\.redhat\.com($|[^.]|\.[^a-z]),https\2sourceware.org\4,g ' \ $(find $(git ls-files) -prune -type f \ ! -name '*.po' \ ! -name 'ChangeLog*' \ ! -path COPYING ! -path COPYING.LIB \ ! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \ ! -path manual/texinfo.tex ! -path scripts/config.guess \ ! -path scripts/config.sub ! -path scripts/install-sh \ ! -path scripts/mkinstalldirs ! -path scripts/move-if-change \ ! -path INSTALL ! -path locale/programs/charmap-kw.h \ ! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \ ! '(' -name configure \ -execdir test -f configure.ac -o -f configure.in ';' ')' \ ! '(' -name preconfigure \ -execdir test -f preconfigure.ac ';' ')' \ -print) and then by running 'make dist-prepare' to regenerate files built from the altered files, and then executing the following to cleanup: chmod a+x sysdeps/unix/sysv/linux/riscv/configure # Omit irrelevant whitespace and comment-only changes, # perhaps from a slightly-different Autoconf version. git checkout -f \ sysdeps/csky/configure \ sysdeps/hppa/configure \ sysdeps/riscv/configure \ sysdeps/unix/sysv/linux/csky/configure # Omit changes that caused a pre-commit check to fail like this: # remote: *** error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines git checkout -f \ sysdeps/powerpc/powerpc64/ppc-mcount.S \ sysdeps/unix/sysv/linux/s390/s390-64/syscall.S # Omit change that caused a pre-commit check to fail like this: # remote: *** error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S
* x86-64 strncpy: Properly handle the length parameter [BZ# 24097]H.J. Lu2019-01-211-2/+2
| | | | | | | | | | | | | | | | | | | On x32, the size_t parameter may be passed in the lower 32 bits of a 64-bit register with the non-zero upper 32 bits. The string/memory functions written in assembly can only use the lower 32 bits of a 64-bit register as length or must clear the upper 32 bits before using the full 64-bit register for length. This pach fixes strncpy for x32. Tested on x86-64 and x32. On x86-64, libc.so is the same with and withou the fix. [BZ# 24097] CVE-2019-6488 * sysdeps/x86_64/multiarch/strcpy-avx2.S: Use RDX_LP for length. * sysdeps/x86_64/multiarch/strcpy-sse2-unaligned.S: Likewise. * sysdeps/x86_64/multiarch/strcpy-ssse3.S: Likewise. * sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-strncpy. * sysdeps/x86_64/x32/tst-size_t-strncpy.c: New file.
* x86-64: Optimize strcat/strncat, strcpy/strncpy and stpcpy/stpncpy with AVX2Leonardo Sandoval2019-01-141-0/+1022
Optimize x86-64 strcat/strncat, strcpy/strncpy and stpcpy/stpncpy with AVX2. It uses vector comparison as much as possible. In general, the larger the source string, the greater performance gain observed, reaching speedups of 1.6x compared to SSE2 unaligned routines. Select AVX2 strcat/strncat, strcpy/strncpy and stpcpy/stpncpy on AVX2 machines where vzeroupper is preferred and AVX unaligned load is fast. * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add strcat-avx2, strncat-avx2, strcpy-avx2, strncpy-avx2, stpcpy-avx2 and stpncpy-avx2. * sysdeps/x86_64/multiarch/ifunc-impl-list.c: (__libc_ifunc_impl_list): Add tests for __strcat_avx2, __strncat_avx2, __strcpy_avx2, __strncpy_avx2, __stpcpy_avx2 and __stpncpy_avx2. * sysdeps/x86_64/multiarch/{ifunc-unaligned-ssse3.h => ifunc-strcpy.h}: rename header for a more generic name. * sysdeps/x86_64/multiarch/ifunc-strcpy.h: (IFUNC_SELECTOR): Return OPTIMIZE (avx2) on AVX 2 machines if AVX unaligned load is fast and vzeroupper is preferred. * sysdeps/x86_64/multiarch/stpcpy-avx2.S: New file * sysdeps/x86_64/multiarch/stpncpy-avx2.S: Likewise * sysdeps/x86_64/multiarch/strcat-avx2.S: Likewise * sysdeps/x86_64/multiarch/strcpy-avx2.S: Likewise * sysdeps/x86_64/multiarch/strncat-avx2.S: Likewise * sysdeps/x86_64/multiarch/strncpy-avx2.S: Likewise