path: root/sysdeps/x86_64/wcscmp.S
Commit message | Author | Age | Files | Lines
* x86-64: Optimize strcmp/wcscmp and strncmp/wcsncmp with AVX2 | Leonardo Sandoval | 2018-06-01 | 1 | -0/+2

    Optimize x86-64 strcmp/wcscmp and strncmp/wcsncmp with AVX2. It uses vector comparison as much as possible. Peak performance observed on a SkyLake machine: 9x, 3x, 2.5x and 5.5x for strcmp, strncmp, wcscmp and wcsncmp, respectively. The larger the comparison length, the more benefit using avx2 functions, except on the strcmp, where peak is observed at length == 32 bytes. Select AVX2 strcmp/wcscmp on AVX2 machines where vzeroupper is preferred and AVX unaligned load is fast.

    NB: It uses TZCNT instead of BSF since TZCNT produces the same result as BSF for non-zero input. TZCNT is faster than BSF and is executed as BSF if machine doesn't support TZCNT.

        * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add strcmp-avx2, strncmp-avx2, wcscmp-avx2, wcscmp-sse2, wcsncmp-avx2 and wcsncmp-sse2.
        * sysdeps/x86_64/multiarch/ifunc-impl-list.c (__libc_ifunc_impl_list): Add tests for __strcmp_avx2, __strncmp_avx2, __wcscmp_avx2, __wcsncmp_avx2, __wcscmp_sse2 and __wcsncmp_sse2.
        * sysdeps/x86_64/multiarch/strcmp.c (OPTIMIZE (avx2)): (IFUNC_SELECTOR): Return OPTIMIZE (avx2) on AVX 2 machines if AVX unaligned load is fast and vzeroupper is preferred.
        * sysdeps/x86_64/multiarch/strncmp.c: Likewise.
        * sysdeps/x86_64/multiarch/strcmp-avx2.S: New file.
        * sysdeps/x86_64/multiarch/strncmp-avx2.S: Likewise.
        * sysdeps/x86_64/multiarch/wcscmp-avx2.S: Likewise.
        * sysdeps/x86_64/multiarch/wcscmp-sse2.S: Likewise.
        * sysdeps/x86_64/multiarch/wcscmp.c: Likewise.
        * sysdeps/x86_64/multiarch/wcsncmp-avx2.S: Likewise.
        * sysdeps/x86_64/multiarch/wcsncmp-sse2.c: Likewise.
        * sysdeps/x86_64/multiarch/wcsncmp.c: Likewise.
        * sysdeps/x86_64/wcscmp.S (__wcscmp): Add alias only if __wcscmp is undefined.
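The vector-comparison idea in the commit message above can be approximated in portable C. The sketch below is not the glibc assembly: it emulates one vector compare by building a bitmask over a chunk of elements (one bit per element that differs or is the terminator) and then uses a count-trailing-zeros step, the role TZCNT plays in the commit, to find the first such element. The function name `sketch_wcscmp_chunked`, the `CHUNK` width of 8, and the padded-buffer requirement are assumptions made for illustration, and `__builtin_ctz` is a GCC/Clang builtin rather than standard C.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical chunk width: a 32-byte vector holds 8 elements when
   wchar_t is 4 bytes, as on x86-64 Linux.  */
enum { CHUNK = 8 };

/* Illustrative only.  Compares two wide strings CHUNK elements at a time.
   Precondition (an assumption of this sketch): both buffers are readable
   up to the end of the chunk that contains the terminator, e.g. arrays
   whose length is a multiple of CHUNK.  */
int
sketch_wcscmp_chunked (const wchar_t *a, const wchar_t *b)
{
  for (size_t base = 0;; base += CHUNK)
    {
      /* Emulate a vector compare: bit i is set when element i differs
         between the strings or is the terminating null.  */
      unsigned int mask = 0;
      for (int i = 0; i < CHUNK; i++)
        if (a[base + i] != b[base + i] || a[base + i] == L'\0')
          mask |= 1u << i;

      if (mask != 0)
        {
          /* mask is non-zero here, so counting trailing zeros is well
             defined; this is the case where TZCNT and BSF agree.  */
          int i = __builtin_ctz (mask);
          wchar_t ca = a[base + i];
          wchar_t cb = b[base + i];
          /* Compare rather than subtract, so the sign of the result
             cannot be corrupted by overflow.  */
          return (ca > cb) - (ca < cb);
        }
    }
}
```

Calling it on zero-padded arrays such as `wchar_t x[16] = L"hello"` and `wchar_t y[16] = L"help"` yields a negative result, since the first differing element is `l` versus `p`.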
* Update copyright dates with scripts/update-copyrights. | Joseph Myers | 2018-01-01 | 1 | -1/+1

        * All files with FSF copyright notices: Update copyright dates using scripts/update-copyrights.
        * locale/programs/charmap-kw.h: Regenerated.
        * locale/programs/locfile-kw.h: Likewise.
* Update copyright dates with scripts/update-copyrights. | Joseph Myers | 2017-01-01 | 1 | -1/+1
* Update copyright dates with scripts/update-copyrights. | Joseph Myers | 2016-01-04 | 1 | -1/+1
* Fix regcomp wcscoll, wcscmp namespace (bug 18497). | Joseph Myers | 2015-06-09 | 1 | -3/+4

    regcomp brings in references to wcscoll, which isn't in all the standards that contain regcomp. In turn, wcscoll brings in references to wcscmp, also not in all those standards. This patch fixes this by making those functions into weak aliases of __wcscoll and __wcscmp and calling those names instead as needed.

    Tested for x86_64 and x86 (testsuite, and that disassembly of installed shared libraries is unchanged by the patch).

        [BZ #18497]
        * wcsmbs/wcscmp.c [!WCSCMP] (WCSCMP): Define as __wcscmp instead of wcscmp.
          (wcscmp): Define as weak alias of WCSCMP.
        * wcsmbs/wcscoll.c (STRCOLL): Define as __wcscoll instead of wcscoll.
          (USE_HIDDEN_DEF): Define.
          [!USE_IN_EXTENDED_LOCALE_MODEL] (wcscoll): Define as weak alias of __wcscoll. Don't use libc_hidden_weak.
        * wcsmbs/wcscoll_l.c (STRCMP): Define as __wcscmp instead of wcscmp.
        * sysdeps/i386/i686/multiarch/wcscmp-c.c [SHARED] (libc_hidden_def): Define __GI___wcscmp instead of __GI_wcscmp.
          (weak_alias): Undefine and redefine.
        * sysdeps/i386/i686/multiarch/wcscmp.S (wcscmp): Rename to __wcscmp and define as weak alias of __wcscmp.
        * sysdeps/x86_64/wcscmp.S (wcscmp): Likewise.
        * include/wchar.h (__wcscmp): Declare. Use libc_hidden_proto.
          (__wcscoll): Likewise.
          (wcscmp): Don't use libc_hidden_proto.
          (wcscoll): Likewise.
        * posix/regcomp.c (build_range_exp): Call __wcscoll instead of wcscoll.
        * posix/regexec.c (check_node_accept_bytes): Likewise.
        * conform/Makefile (test-xfail-XPG3/regex.h/linknamespace): Remove variable.
          (test-xfail-XPG4/regex.h/linknamespace): Likewise.
          (test-xfail-POSIX/regex.h/linknamespace): Likewise.
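The weak-alias arrangement described in the commit message above can be sketched in a few lines of C. This is a minimal illustration and not the glibc source: glibc uses its own `weak_alias` macro and the reserved name `__wcscmp`, whereas the names `demo_internal_wcscmp` and `demo_wcscmp` below are hypothetical, and the `weak` and `alias` attributes are GCC/Clang extensions that assume an ELF target. The internal name carries the definition and is what library-internal callers would use; the public name is only a weak alias of it, so a program may define its own symbol of that name without a conflict.

```c
#include <wchar.h>

/* Hypothetical internal name holding the real definition.  Code inside
   the library would call this name, so it never depends on the public
   symbol being available in the selected standard.  */
int
demo_internal_wcscmp (const wchar_t *s1, const wchar_t *s2)
{
  /* Advance while the elements match and the terminator is not reached.  */
  while (*s1 == *s2 && *s1 != L'\0')
    {
      s1++;
      s2++;
    }
  /* Compare rather than subtract to avoid signed overflow.  */
  return (*s1 > *s2) - (*s1 < *s2);
}

/* Hypothetical public name, exported as a weak alias of the internal
   definition (GCC/Clang extension, ELF targets).  */
int demo_wcscmp (const wchar_t *s1, const wchar_t *s2)
  __attribute__ ((weak, alias ("demo_internal_wcscmp")));
```

Both names resolve to the same code, so `demo_wcscmp (L"abc", L"abd")` and `demo_internal_wcscmp (L"abc", L"abd")` return the same negative value.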
* Update copyright dates with scripts/update-copyrights. | Joseph Myers | 2015-01-02 | 1 | -1/+1
* Update copyright notices with scripts/update-copyrights | Allan McRae | 2014-01-01 | 1 | -1/+1
* Update copyright notices with scripts/update-copyrights. | Joseph Myers | 2013-01-02 | 1 | -1/+1
* Replace FSF snail mail address with URLs. | Paul Eggert | 2012-02-09 | 1 | -3/+2
* Fix WS | Ulrich Drepper | 2011-10-23 | 1 | -5/+5
* Fix signedness in wcscmp comparison | Liubov Dmitrieva | 2011-10-23 | 1 | -45/+64
* Remove now-wrong comment | Ulrich Drepper | 2011-09-06 | 1 | -5/+0
* Add optimized x86-64 wcscmp | Ulrich Drepper | 2011-09-05 | 1 | -0/+936