Commit message | Author | Age | Files | Lines
...
* Benchtests: Improve memrchr benchmarks | Noah Goldstein | 2022-06-07 | 1 | -45/+65
  Add a second iteration for memrchr to set `pos` starting from the end of the buffer. Previously `pos` was only set relative to the beginning of the buffer. This isn't really useful for memrchr because the beginning of the search space is (buf + len).
  Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
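  To illustrate why an offset measured from the start of the buffer is a poor proxy for the work a reverse search does, here is a minimal sketch. `reverse_find` is a hypothetical portable stand-in for memrchr (which is a GNU extension), and `bytes_scanned_from_end` is an illustrative helper; neither name comes from the benchmark itself.

  ```c
  #include <stddef.h>

  /* Hypothetical stand-in for memrchr: scan backwards from buf + len and
     return a pointer to the last byte equal to c, or NULL if none.  */
  const unsigned char *
  reverse_find (const unsigned char *buf, int c, size_t len)
  {
    while (len-- > 0)
      if (buf[len] == (unsigned char) c)
        return buf + len;
    return NULL;
  }

  /* Illustrative helper: how many bytes a reverse search examines before
     it reaches a match located `pos` bytes from the start of a buffer of
     `len` bytes.  The cost depends on the distance from the end.  */
  size_t
  bytes_scanned_from_end (size_t len, size_t pos)
  {
    return len - pos;
  }
  ```

  With a 16-byte buffer and a match at offset 12, the reverse search examines only 4 bytes, so varying `pos` relative to the end is what actually varies the amount of work.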
* x86: Add COND_VZEROUPPER that can replace vzeroupper if no `ret` | Noah Goldstein | 2022-06-07 | 2 | -0/+19
  The RTM vzeroupper mitigation has no way of replacing inline vzeroupper not before a return.
  This can be useful when hoisting a vzeroupper to save code size, for example:
  ```
  L(foo):
      cmpl    %eax, %edx
      jz      L(bar)
      tzcntl  %eax, %eax
      addq    %rdi, %rax
      VZEROUPPER_RETURN

  L(bar):
      xorl    %eax, %eax
      VZEROUPPER_RETURN
  ```
  Can become:
  ```
  L(foo):
      COND_VZEROUPPER
      cmpl    %eax, %edx
      jz      L(bar)
      tzcntl  %eax, %eax
      addq    %rdi, %rax
      ret

  L(bar):
      xorl    %eax, %eax
      ret
  ```
  This code does not change any existing functionality. There is no difference in the objdump of libc.so before and after this patch.
  Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86: Create header for VEC classes in x86 strings library | Noah Goldstein | 2022-06-07 | 7 | -0/+327
  This patch does not touch any existing code and is only meant to be a tool for future patches so that simple source files can more easily be maintained to target multiple VEC classes.
  There is no difference in the objdump of libc.so before and after this patch.
  Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* powerpc: Fix VSX register number on __strncpy_power9 [BZ #29197] | Matheus Castanho | 2022-06-07 | 1 | -2/+2
  __strncpy_power9 initializes VR 18 with zeroes to be used throughout the code, including when zero-padding the destination string. However, the v18 reference was mistakenly being used for stxv and stxvl, which take a VSX vector as operand. The code ended up using the uninitialized VSR 18 register by mistake.
  Both occurrences have been changed to use the proper VSX number for VR 18 (i.e. VSR 50).
  Tested on powerpc, powerpc64 and powerpc64le.
  Signed-off-by: Kewen Lin <linkw@gcc.gnu.org>
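  The numbering behind "VR 18 is VSR 50" can be written down in one line: the 32 vector registers overlay the upper half of the 64-entry VSX register file. The helper name `vr_to_vsr` below is hypothetical and only serves to illustrate that mapping.

  ```c
  /* Illustrative helper (not part of glibc): VR n occupies VSR n + 32,
     because VR 0..31 alias VSR 32..63.  */
  unsigned int
  vr_to_vsr (unsigned int vr)
  {
    return vr + 32;
  }
  ```

  Passing 18 yields 50, which is the register number stxv and stxvl need in order to store the zeroed VR 18.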
* AArch64: Sort makefile entries | Wilco Dijkstra | 2022-06-07 | 1 | -6/+18
  Sort makefile entries to reduce conflicts.
* AArch64: Add SVE memcpy | Wilco Dijkstra | 2022-06-07 | 5 | -42/+284
  Add an initial SVE memcpy implementation. Copies up to 32 bytes use SVE vectors, which improves the random memcpy benchmark significantly. Clean up the memcpy and memmove ifunc selectors.
* x86_64: Add strstr function with 512-bit EVEX | Raghuveer Devulapalli | 2022-06-06 | 4 | -4/+242
  Adding a 512-bit EVEX version of strstr. The algorithm works as follows:
  (1) We spend a few cycles at the beginning to peek into the needle. We locate an edge in the needle (first occurrence of 2 consecutive distinct characters) and also store the first 64 bytes into a zmm register.
  (2) We search for the edge in the haystack by looking into one cache line of the haystack at a time. This avoids having to read past a page boundary, which can cause a seg fault.
  (3) If an edge is found in the haystack, we first compare the first 64 bytes of the needle (already stored in a zmm register) before we proceed with a full string compare performed byte by byte.
  Benchmarking results (old = strstr_sse2_unaligned, new = strstr_avx512):
  Geometric mean of all benchmarks: new / old = 0.66
  Difficult skiptable(0)    : new / old = 0.02
  Difficult skiptable(1)    : new / old = 0.01
  Difficult 2-way           : new / old = 0.25
  Difficult testing first 2 : new / old = 1.26
  Difficult skiptable(0)    : new / old = 0.05
  Difficult skiptable(1)    : new / old = 0.06
  Difficult 2-way           : new / old = 0.26
  Difficult testing first 2 : new / old = 1.05
  Difficult skiptable(0)    : new / old = 0.42
  Difficult skiptable(1)    : new / old = 0.24
  Difficult 2-way           : new / old = 0.21
  Difficult testing first 2 : new / old = 1.04
  Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
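  The three steps can be sketched in scalar C. This is only an illustration of the edge idea under simplifying assumptions: it works one byte at a time rather than one cache line at a time, it uses no zmm registers, and the names `find_edge` and `edge_strstr` are hypothetical rather than taken from the patch.

  ```c
  #include <stddef.h>
  #include <string.h>

  /* Step 1: index of the first position where two consecutive needle
     characters differ; 0 if every character is the same.  */
  size_t
  find_edge (const char *needle)
  {
    for (size_t i = 0; needle[i] != '\0' && needle[i + 1] != '\0'; i++)
      if (needle[i] != needle[i + 1])
        return i;
    return 0;
  }

  /* Steps 2 and 3: look for the edge pair at the right offset in the
     haystack, and only then run the full comparison.  */
  const char *
  edge_strstr (const char *hay, const char *needle)
  {
    size_t nlen = strlen (needle);
    size_t hlen = strlen (hay);

    if (nlen == 0)
      return hay;
    if (nlen == 1)
      return strchr (hay, needle[0]);
    if (hlen < nlen)
      return NULL;

    size_t edge = find_edge (needle);
    for (size_t i = 0; i + nlen <= hlen; i++)
      {
        /* Cheap filter: the two edge characters must match first.  */
        if (hay[i + edge] != needle[edge]
            || hay[i + edge + 1] != needle[edge + 1])
          continue;
        /* Full comparison only for surviving candidates.  */
        if (memcmp (hay + i, needle, nlen) == 0)
          return hay + i;
      }
    return NULL;
  }
  ```

  For the needle "aaab" the edge is at index 2 ('a' followed by 'b'), so runs of 'a' in the haystack are rejected by the two-character filter before any full comparison is attempted.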
* scripts/glibcelf.py: Add PT_AARCH64_MEMTAG_MTE constant | Adhemerval Zanella | 2022-06-06 | 1 | -0/+4
  It was added in commit 603e5c8ba7257483c162cabb06eb6f79096429b6. This caused the elf/tst-glibcelf consistency check to fail.
  Reviewed-by: Florian Weimer <fweimer@redhat.com>
* socket: Fix mistyped define statement in socket/sys/socket.h (BZ #29225) | Dmitriy Fedchenko | 2022-06-06 | 1 | -1/+1
* Declare timegm for ISO C2X | Joseph Myers | 2022-06-06 | 1 | -5/+14
  The next revision of the ISO C standard has added the timegm function (that was already supported in glibc). Update the feature test conditionals on its declaration in <time.h> accordingly.
  Tested for x86_64.
* Add PT_AARCH64_MEMTAG_MTE from Linux 5.18 to elf.h | Joseph Myers | 2022-06-06 | 1 | -0/+3
  Linux 5.18 defines a new AArch64 ELF segment type PT_AARCH64_MEMTAG_MTE; add it to elf.h.
  Tested with build-many-glibcs.py for aarch64-linux-gnu.
* grep: egrep -> grep -E, fgrep -> grep -F | Sam James | 2022-06-05 | 24 | -127/+127
  Newer versions of GNU grep (after grep 3.7, not inclusive) will warn on 'egrep' and 'fgrep' invocations.
  Convert usages within the tree to their expanded non-aliased counterparts to avoid irritating warnings during ./configure and the test suite.
  Signed-off-by: Sam James <sam@gentoo.org>
  Reviewed-by: Fangrui Song <maskray@google.com>
* string.h: Fix boolean spelling in comments | H.J. Lu | 2022-06-03 | 1 | -1/+1
* elf: Add #include <errno.h> for use of E* constants. | Carlos O'Donell | 2022-06-02 | 1 | -1/+1
  In __strerror_r we use errno constants and must include errno.h.
  Tested on x86_64 and i686 without regression.
* elf: Add #include <sys/param.h> for MAX usage. | Carlos O'Donell | 2022-06-02 | 1 | -0/+1
  In _dl_audit_pltenter we use MAX and so need to include param.h.
  Tested on x86_64 and i686 without regression.
* linux: Add process_mrelease | Adhemerval Zanella | 2022-06-02 | 39 | -0/+129
  Added in Linux 5.15 (884a7e5964e06ed93c7771c0d7cf19c09a8946f1), the new syscall allows a caller to free the memory of a dying target process.
  Checked on x86_64-linux-gnu.
  Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* linux: Add process_madvise | Adhemerval Zanella | 2022-06-02 | 42 | -1/+243
  It was added in Linux 5.10 (ecb8ac8b1f146915aa6b96449b66dd48984caacc) with the same functionality as madvise but using a pidfd of the target process.
  Checked on x86_64-linux-gnu and i686-linux-gnu.
  Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* linux: Set tst-pidfd-consts unsupported for kernel headers older than 5.10 | Adhemerval Zanella | 2022-06-02 | 1 | -0/+3
  Instead of failing while trying to build the compare source file.
  Reviewed-by: Carlos O'Donell <carlos@redhat.com>
  Tested-by: Matheus Castanho <msc@linux.ibm.com>
  Reviewed-by: Matheus Castanho <msc@linux.ibm.com>
* testrun.sh: Support passing strace and valgrind arguments | Florian Weimer | 2022-06-02 | 1 | -5/+6
  This is a bit of a hack, but it works quite well in practice.
  Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* Linux: Adjust struct rseq definition to current kernel version | Florian Weimer | 2022-06-02 | 1 | -22/+6
  This definition is only used as a fallback with old kernel headers. The change follows kernel commit bfdf4e6208051ed7165b2e92035b4bf11 ("rseq: Remove broken uapi field layout on 32-bit little endian").
  Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* iconv: Use 64 bit stat for gconv_parseconfdir (BZ# 29213) | Adhemerval Zanella | 2022-06-01 | 1 | -3/+6
  The issue is only when used within libc.so (iconvconfig already builds with _TIME_SIZE=64).
  This is a missing spot initially from 52a5fe70a2c77935.
  Checked on i686-linux-gnu.
* catgets: Use 64 bit stat for __open_catalog (BZ# 29211) | Adhemerval Zanella | 2022-06-01 | 1 | -2/+2
  This is a missing spot initially from 52a5fe70a2c77935.
  Checked on i686-linux-gnu.
* inet: Use 64 bit stat for ruserpass (BZ# 29210) | Adhemerval Zanella | 2022-06-01 | 1 | -2/+2
  This is a missing spot initially from 52a5fe70a2c77935.
  Checked on i686-linux-gnu.
* socket: Use 64 bit stat for isfdtype (BZ# 29209) | Adhemerval Zanella | 2022-06-01 | 1 | -2/+2
  This is a missing spot initially from 52a5fe70a2c77935.
  Checked on i686-linux-gnu.
* posix: Use 64 bit stat for fpathconf (_PC_ASYNC_IO) (BZ# 29208) | Adhemerval Zanella | 2022-06-01 | 1 | -2/+2
  This is a missing spot initially from 52a5fe70a2c77935.
  Checked on i686-linux-gnu.
* posix: Use 64 bit stat for posix_fallocate fallback (BZ# 29207) | Adhemerval Zanella | 2022-06-01 | 2 | -4/+4
  This is a missing spot initially from 52a5fe70a2c77935.
  Checked on i686-linux-gnu.
* misc: Use 64 bit stat for getusershell (BZ# 29203) | Adhemerval Zanella | 2022-06-01 | 1 | -2/+2
  This is a missing spot initially from 52a5fe70a2c77935.
  Checked on i686-linux-gnu.
* misc: Use 64 bit stat for daemon (BZ# 29203) | Adhemerval Zanella | 2022-06-01 | 1 | -3/+2
  This is a missing spot initially from 52a5fe70a2c77935.
  Checked on i686-linux-gnu.
* linux: use statx for fstat if neither newfstatat nor fstatat64 is present | WANG Xuerui | 2022-06-01 | 1 | -1/+2
  LoongArch is going to be the first architecture supported by Linux that has neither fstat* nor newfstatat [1], instead exclusively relying on statx. So in fstatat64's implementation, we need to also enable statx usage if neither fstatat64 nor newfstatat is present, to prepare for this new case of kernel ABI.
  [1]: https://lore.kernel.org/all/20220518092619.1269111-1-chenhuacai@loongson.cn/
  Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* Add MADV_DONTNEED_LOCKED from Linux 5.18 to bits/mman-linux.h | Joseph Myers | 2022-06-01 | 1 | -0/+2
  Linux 5.18 adds a constant MADV_DONTNEED_LOCKED (defined in multiple header files, but with the same value on all architectures). Add this constant to bits/mman-linux.h.
  Tested for x86_64.
* Add HWCAP2_MTE3 from Linux 5.18 to AArch64 bits/hwcap.h | Joseph Myers | 2022-06-01 | 1 | -0/+1
  Linux 5.18 defines a new AArch64 HWCAP value HWCAP2_MTE3; add it to glibc's sysdeps/unix/sysv/linux/aarch64/bits/hwcap.h.
  Tested with build-many-glibcs.py for aarch64-linux-gnu.
* i686: Use generic sincosf implementation for SSE2 version | Adhemerval Zanella | 2022-06-01 | 5 | -585/+12
  The generic implementation shows slightly better performance (gcc 11.2.1 on a Ryzen 9 5900X):
  * s_sincosf-sse2.S:
    "sincosf": {
      "workload-random": {
        "duration": 3.89961e+09,
        "iterations": 9.5472e+07,
        "reciprocal-throughput": 40.8429,
        "latency": 40.8483,
        "max-throughput": 2.4484e+07,
        "min-throughput": 2.44808e+07
      }
    }
  * generic s_cossinf.c:
    "sincosf": {
      "workload-random": {
        "duration": 3.71953e+09,
        "iterations": 1.48512e+08,
        "reciprocal-throughput": 25.0515,
        "latency": 25.0391,
        "max-throughput": 3.99177e+07,
        "min-throughput": 3.99375e+07
      }
    }
  Checked on i686-linux-gnu.
  Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* benchtests: Add workload name for sincosf | Adhemerval Zanella | 2022-06-01 | 1 | -0/+1
  So it can show both reciprocal-throughput and latency.
  Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* i686: Use generic sinf implementation for SSE2 version | Adhemerval Zanella | 2022-06-01 | 5 | -565/+13
  Performance seems to be similar (gcc 11.2.1 on a Ryzen 9 5900X); the generic algorithm shows slightly better performance for the 'workload-huge.wrf' input set.
  * s_sinf-sse2.S:
    "sinf": {
      "": {
        "duration": 3.72405e+09,
        "iterations": 2.38374e+08,
        "max": 63.973,
        "min": 11.211,
        "mean": 15.6227
      },
      "workload-random.wrf": {
        "duration": 3.76923e+09,
        "iterations": 8.4e+07,
        "reciprocal-throughput": 17.6355,
        "latency": 72.108,
        "max-throughput": 5.67037e+07,
        "min-throughput": 1.38681e+07
      },
      "workload-huge.wrf": {
        "duration": 3.76943e+09,
        "iterations": 6e+07,
        "reciprocal-throughput": 29.3493,
        "latency": 96.2985,
        "max-throughput": 3.40724e+07,
        "min-throughput": 1.03844e+07
      }
    }
  * generic s_sinf.c:
    "sinf": {
      "": {
        "duration": 3.70989e+09,
        "iterations": 2.18025e+08,
        "max": 69.782,
        "min": 11.1,
        "mean": 17.0159
      },
      "workload-random.wrf": {
        "duration": 3.77213e+09,
        "iterations": 9.6e+07,
        "reciprocal-throughput": 17.5402,
        "latency": 61.0459,
        "max-throughput": 5.70119e+07,
        "min-throughput": 1.63811e+07
      },
      "workload-huge.wrf": {
        "duration": 3.81576e+09,
        "iterations": 5.6e+07,
        "reciprocal-throughput": 38.2111,
        "latency": 98.0659,
        "max-throughput": 2.61704e+07,
        "min-throughput": 1.01972e+07
      }
    }
  Checked on i686-linux-gnu.
  Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* i686: Use generic cosf implementation for SSE2 version | Adhemerval Zanella | 2022-06-01 | 5 | -552/+13
  Performance seems to be similar (gcc 11.2.1 on a Ryzen 9 5900X):
  * s_cosf-sse2.S:
    "cosf": {
      "workload-random": {
        "duration": 3.74987e+09,
        "iterations": 9.616e+07,
        "reciprocal-throughput": 15.8141,
        "latency": 62.1782,
        "max-throughput": 6.32346e+07,
        "min-throughput": 1.60828e+07
      }
    }
  * generic s_cosf.c:
    "cosf": {
      "workload-random": {
        "duration": 3.87298e+09,
        "iterations": 1.00968e+08,
        "reciprocal-throughput": 18.3448,
        "latency": 58.3722,
        "max-throughput": 5.45113e+07,
        "min-throughput": 1.71314e+07
      }
    }
  Checked on i686-linux-gnu.
* benchtests: Add workload name for cosf | Adhemerval Zanella | 2022-06-01 | 1 | -1/+1
  So it can show both reciprocal-throughput and latency.
  Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86_64: Optimize sincos where sin/cos is optimized (bug 29193) | Andreas Schwab | 2022-06-01 | 6 | -3/+55
  The compiler may substitute calls to sin or cos with calls to sincos, thus we should have the same optimized implementations for sincos. The optimized implementations may produce results that differ; this also makes sure that the sincos call agrees with the sin and cos calls.
* manual: fix reference to source file | Andreas Schwab | 2022-05-31 | 1 | -1/+1
* Add SOL_SMC from Linux 5.18 to bits/socket.h | Joseph Myers | 2022-05-31 | 1 | -0/+1
  Linux 5.18 adds a constant SOL_SMC to the getsockopt / setsockopt levels; add this constant to bits/socket.h.
  Tested for x86_64.
* elf: Remove _dl_skip_args | Adhemerval Zanella | 2022-05-30 | 3 | -7/+0
  Now that no architecture uses it anymore.
  Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* x86_64: Remove _dl_skip_args usage | Adhemerval Zanella | 2022-05-30 | 2 | -22/+3
  Since ad43cac44a the generic code already shuffles the argv/envp/auxv on the stack to remove the ld.so own arguments and thus _dl_skip_args is always 0. So there is no need to adjust the argc or argv.
  Checked on x86_64-linux-gnu and i686-linux-gnu.
  Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
  Reviewed-by: Carlos O'Donell <carlos@redhat.com>
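  As a rough picture of what "shuffling" the vector to remove the loader's own arguments means, the sketch below drops leading entries from a NULL-terminated argv array in place. It is an assumption-laden simplification: the real generic code also moves envp and auxv, and the function name `drop_leading_args` is hypothetical.

  ```c
  #include <string.h>

  /* Hypothetical helper: remove the first `skip` entries of a
     NULL-terminated argv in place and return the new argc.  The
     terminating NULL is moved along with the remaining entries.  */
  int
  drop_leading_args (int argc, char **argv, int skip)
  {
    if (skip <= 0 || skip > argc)
      return argc;
    memmove (argv, argv + skip, (size_t) (argc - skip + 1) * sizeof *argv);
    return argc - skip;
  }
  ```

  Once the vector has been compacted like this, later startup code sees the program name at argv[0] and has nothing left to skip, which is why a per-architecture skip count becomes unnecessary.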
* sparc: Remove _dl_skip_args usage | Adhemerval Zanella | 2022-05-30 | 2 | -79/+4
  Since ad43cac44a the generic code already shuffles the argv/envp/auxv on the stack to remove the ld.so own arguments and thus _dl_skip_args is always 0. So there is no need to adjust the argc or argv.
  Checked on sparc64-linux-gnu and sparcv9-linux-gnu.
  Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* sh: Remove _dl_skip_args usage | Adhemerval Zanella | 2022-05-30 | 1 | -15/+1
  Since ad43cac44a the generic code already shuffles the argv/envp/auxv on the stack to remove the ld.so own arguments and thus _dl_skip_args is always 0. So there is no need to adjust the argc or argv.
  Checked with qemu-user that arguments are correctly passed on both constructors and main program.
  Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* s390: Remove _dl_skip_args usage | Adhemerval Zanella | 2022-05-30 | 2 | -62/+0
  Since ad43cac44a the generic code already shuffles the argv/envp/auxv on the stack to remove the ld.so own arguments and thus _dl_skip_args is always 0. So there is no need to adjust the argc or argv.
  Checked on s390x-linux-gnu and s390-linux-gnu.
  Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* riscv: Remove _dl_skip_args usage | Adhemerval Zanella | 2022-05-30 | 1 | -11/+1
  Since ad43cac44a the generic code already shuffles the argv/envp/auxv on the stack to remove the ld.so own arguments and thus _dl_skip_args is always 0. So there is no need to adjust the argc or argv.
  Checked with qemu-user that arguments are correctly passed on both constructors and main program.
  Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* nios2: Remove _dl_skip_args usage (BZ# 29187) | Adhemerval Zanella | 2022-05-30 | 1 | -40/+10
  Since ad43cac44a the generic code already shuffles the argv/envp/auxv on the stack to remove the ld.so own arguments and thus _dl_skip_args is always 0. So there is no need to adjust the argc or argv.
  Checked with qemu-user that arguments are correctly passed on both constructors and main program.
  Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* mips: Remove _dl_skip_args usage | Adhemerval Zanella | 2022-05-30 | 1 | -29/+2
  Since ad43cac44a the generic code already shuffles the argv/envp/auxv on the stack to remove the ld.so own arguments and thus _dl_skip_args is always 0. So there is no need to adjust the argc or argv.
  Checked with qemu-user that arguments are correctly passed on both constructors and main program.
  Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* microblaze: Remove _dl_skip_args usage | Adhemerval Zanella | 2022-05-30 | 1 | -5/+0
  Since ad43cac44a the generic code already shuffles the argv/envp/auxv on the stack to remove the ld.so own arguments and thus _dl_skip_args is always 0. So there is no need to adjust the argc or argv.
  Checked with qemu-user that arguments are correctly passed on both constructors and main program.
  Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* m68k: Remove _dl_skip_args usage | Adhemerval Zanella | 2022-05-30 | 1 | -7/+0
  Since ad43cac44a the generic code already shuffles the argv/envp/auxv on the stack to remove the ld.so own arguments and thus _dl_skip_args is always 0. So there is no need to adjust the argc or argv.
  Checked with qemu-user that arguments are correctly passed on both constructors and main program.
  Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* ia64: Remove _dl_skip_args usage | Adhemerval Zanella | 2022-05-30 | 1 | -56/+14
  Since ad43cac44a the generic code already shuffles the argv/envp/auxv on the stack to remove the ld.so own arguments and thus _dl_skip_args is always 0.
  The startup code is changed to read the _dl_argc and _dl_argv values, and envp is calculated from argc and argv.
  Checked on ia64-linux-gnu.
  Reviewed-by: Carlos O'Donell <carlos@redhat.com>
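  Computing envp from argc and argv relies on the usual initial process layout, where the environment pointers start right after the NULL that terminates argv. A minimal C sketch of that arithmetic follows; the function name `envp_from_args` is hypothetical and the ia64 startup code itself is assembly, not this C.

  ```c
  /* Illustrative helper: argv holds argc pointers followed by a NULL
     terminator, and the environment vector begins immediately after.  */
  char **
  envp_from_args (int argc, char **argv)
  {
    return argv + argc + 1;
  }
  ```

  For argc == 2 the result points three slots past argv[0]: two argument pointers plus the terminating NULL.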