| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
| |
Add the new HWCAP2_AFP and HWCAP2_RPRES constants from Linux 5.17.
Tested with build-many-glibcs.py for aarch64-linux-gnu.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The rational is:
1. SSE42 has nearly identical logic so any benefit is minimal (3.4%
regression on Tigerlake using SSE42 versus AVX across the
benchtest suite).
2. AVX2 version covers the majority of targets that previously
prefered it.
3. The targets where AVX would still be best (SnB and IVB) are
becoming outdated.
All in all the saving the code size is worth it.
All string/memory tests pass.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
| |
geometric_mean(N=40) of all benchmarks EVEX / SSE42: .621
All string/memory tests pass.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
| |
geometric_mean(N=40) of all benchmarks AVX2 / SSE42: .702
All string/memory tests pass.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
| |
Test cases for when both `s1` and `s2` are near the end of a page
where previously missing.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
| |
Test cases for when both `s1` and `s2` are near the end of a page
where previously missing.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Slightly faster method of doing TOLOWER that saves an
instruction.
Also replace the hard coded 5-byte no with .p2align 4. On builds with
CET enabled this misaligned entry to strcasecmp.
geometric_mean(N=40) of all benchmarks New / Original: .920
All string/memory tests pass.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Slightly faster method of doing TOLOWER that saves an
instruction.
Also replace the hard coded 5-byte no with .p2align 4. On builds with
CET enabled this misaligned entry to strcasecmp.
geometric_mean(N=40) of all benchmarks New / Original: .894
All string/memory tests pass.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
| |
Add more robust tests that cover all the page cross edge cases.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
| |
Add more robust tests that cover all the page cross edge cases.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
| |
Just QOL change to make parsing the output of the benchtests more
consistent.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
| |
Just QOL change to make parsing the output of the benchtests more
consistent.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Overflow case for __wcsncmp_avx2_rtm should be __wcscmp_avx2_rtm not
__wcscmp_avx2.
commit ddf0992cf57a93200e0c782e2a94d0733a5a0b87
Author: Noah Goldstein <goldstein.w.n@gmail.com>
Date: Sun Jan 9 16:02:21 2022 -0600
x86: Fix __wcsncmp_avx2 in strcmp-avx2.S [BZ# 28755]
Set the wrong fallback function for `__wcsncmp_avx2_rtm`. It was set
to fallback on to `__wcscmp_avx2` instead of `__wcscmp_avx2_rtm` which
can cause spurious aborts.
This change will need to be backported.
All string/memory tests pass.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
|
| |
The generic implementation is faster.
geometric_mean(N=20) of all benchmarks New / Original: .710
All string/memory tests pass.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
| |
The generic implementation is faster (see strcspn commit).
All string/memory tests pass.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
|
| |
The generic implementation is faster.
geometric_mean(N=20) of all benchmarks New / Original: .678
All string/memory tests pass.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use _mm_cmpeq_epi8 and _mm_movemask_epi8 to get strlen instead of
_mm_cmpistri. Also change offset to unsigned to avoid unnecessary
sign extensions.
geometric_mean(N=20) of all benchmarks that dont fallback on
sse2; New / Original: .901
All string/memory tests pass.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use _mm_cmpeq_epi8 and _mm_movemask_epi8 to get strlen instead of
_mm_cmpistri. Also change offset to unsigned to avoid unnecessary
sign extensions.
geometric_mean(N=20) of all benchmarks that dont fallback on
sse2/strlen; New / Original: .928
All string/memory tests pass.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
| |
Just QOL change to make parsing the output of the benchtests more
consistent.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
| |
Just QOL change to make parsing the output of the benchtests more
consistent.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Small code cleanup for size: -81 bytes.
Add comment justifying using a branch to do NULL/non-null return.
All string/memory tests pass and no regressions in benchtests.
geometric_mean(N=20) of all benchmarks New / Original: .985
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Small code cleanup for size: -53 bytes.
Add comment justifying using a branch to do NULL/non-null return.
All string/memory tests pass and no regressions in benchtests.
geometric_mean(N=20) of all benchmarks Original / New: 1.00
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add benchmark that randomizes whether return should be NULL or pointer
to CHAR. The rationale is on many architectures there is a choice
between a predicate execution option (i.e cmovcc on x86) or a branch.
On x86 the results for cmovcc vs branch are something along the lines
of the following:
perc-zero, Br On Result, Time Br / Time cmov
0.10, 1, ,0.983
0.10, 0, ,1.246
0.25, 1, ,1.035
0.25, 0, ,1.49
0.33, 1, ,1.016
0.33, 0, ,1.579
0.50, 1, ,1.228
0.50, 0, ,1.739
0.66, 1, ,1.039
0.66, 0, ,1.764
0.75, 1, ,0.996
0.75, 0, ,1.642
0.90, 1, ,1.071
0.90, 0, ,1.409
1.00, 1, ,0.937
1.00, 0, ,0.999
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
| |
Just QOL change to make parsing the output of the benchtests more
consistent.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
|
|
|
|
| |
This patch updates the kernel version in the test tst-mman-consts.py
to 5.17. (There are no new MAP_* constants covered by this test in
5.17 that need any other header changes.)
Tested with build-many-glibcs.py.
|
|
|
|
| |
Checked on x86_64-linux-gnu.
|
| |
|
|
|
|
|
|
|
| |
Use INT_STRLEN_BOUND to proper get the maximum pid_t size. Also
fix the wrong calculation (the 3 should multiply the sizeof (pid_t)).
Checked on x86_64-linux-gnu.
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
Linux 5.17 has one new syscall, set_mempolicy_home_node. Update
syscall-names.list and regenerate the arch-syscall.h headers with
build-many-glibcs.py update-syscalls.
Tested with build-many-glibcs.py.
|
|
|
|
| |
It is only called for legacy ABIs.
|
| |
|
|
|
|
|
|
| |
It is not used on rtld and ldsodef interfaces are meant to be used
solely on loader. It also removes the only usage of gcc extension
__builtin_va_arg_pack.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
configure scripts need to be runnable with a POSIX-compliant /bin/sh.
On many (but not all!) systems, /bin/sh is provided by Bash, so errors
like this aren't spotted. Notably Debian defaults to /bin/sh provided
by dash which doesn't tolerate such bashisms as '=='.
This retains compatibility with bash.
Fixes configure warnings/errors like:
```
checking if compiler warns about alias for function with incompatible types... yes
/var/tmp/portage/sys-libs/glibc-2.34-r10/work/glibc-2.34/configure: 4209: test: xyes: unexpected operator
```
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Signed-off-by: Sam James <sam@gentoo.org>
|
|
|
|
|
|
|
|
| |
The close_retry goto jump is confusing and clumsy to read, so refactor
the code a bit to make it easier to follow.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
|
|
|
|
|
|
|
| |
This patch makes build-many-glibcs.py use Linux 5.17.
Tested with build-many-glibcs.py (host-libraries, compilers and glibcs
builds).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The structure HEADER is normally aligned to a word boundary but
sometimes it needs to be accessed when aligned on a byte boundary.
This change defines a new typedef, UHEADER, with alignment 1.
It is used to ensure the fields are accessed with byte loads and
stores when necessary.
V4: Change to res_mkquery.c deleted. Small whitespace fix.
V5: Move UHEADER typedef to resolv/resolv-internal.h. Replace all
HEADER usage with UHEADER in resolv/res_send.c.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
|
|
|
|
|
|
|
|
|
| |
Clean up another antipattern where code flows from an if condition to
its else counterpart with a goto.
Most of the change in this patch is whitespace-only; a `git diff -b`
ought to show the actual logic changes.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
|
|
|
|
|
|
|
|
| |
Split out line processing for `label`, `precedence` and `scopev4` into
separate functions instead of the gotos.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
|
|
|
|
|
|
|
|
|
| |
All other cases of failures due to lack of memory return EAI_MEMORY, so
it seems wrong to return EAI_SYSTEM here. The only reason
convert_hostent_to_gaih_addrtuple could fail is on calloc failure.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
|
|
|
|
|
|
|
| |
Simplify the loop a wee bit and clean up variable names too.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Flatten the condition nesting and replace the alloca for RET.AT/ATR with
a single array LOCAL_AT[2]. This gets rid of alloca and alloca
accounting.
`git diff -b` is probably the best way to view this change since much of
the diff is whitespace changes.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
|
|
|
|
|
|
|
|
| |
The macro is quite a pain to debug, so make gethosts into a function to
make it easier to maintain.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
|
|
|
|
|
| |
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
|
|
|
|
|
|
|
|
| |
Add a new member got_ipv6 to indicate if the results have an IPv6
result and use it instead of the local got_ipv6.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
|
|
|
|
|
|
|
|
| |
Add a free_at flag in gaih_result to indicate if res.at needs to be
freed by the caller.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce the gaih_result structure and general paradigm for cleanups
that follow to process the lookup request and return a result. A lookup
function (like text_to_binary_address), should return an integer error
code and set members of gaih_result based on what it finds. If the
function does not have a result and no errors have occurred during the
lookup, it should return 0 and res.at should be set to NULL, allowing a
subsequent function to do the lookup until we run out of options.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Refactor the code to split out the service resolution code into a
separate function. Allocate the service tuples array just once to the
size of the typeproto array, thus avoiding the unnecessary pointer
chasing and stack allocations.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
|