about summary refs log tree commit diff
path: root/sysdeps
Commit message (Collapse)AuthorAgeFilesLines
* AArch64: Add SVE memcpyWilco Dijkstra2022-06-075-42/+284
| | | | | | Add an initial SVE memcpy implementation. Copies up to 32 bytes use SVE vectors which improves the random memcpy benchmark significantly. Cleanup the memcpy and memmove ifunc selectors.
* x86_64: Add strstr function with 512-bit EVEXRaghuveer Devulapalli2022-06-064-4/+242
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adding a 512-bit EVEX version of strstr. The algorithm works as follows: (1) We spend a few cycles at the begining to peek into the needle. We locate an edge in the needle (first occurance of 2 consequent distinct characters) and also store the first 64-bytes into a zmm register. (2) We search for the edge in the haystack by looking into one cache line of the haystack at a time. This avoids having to read past a page boundary which can cause a seg fault. (3) If an edge is found in the haystack we first compare the first 64-bytes of the needle (already stored in a zmm register) before we proceed with a full string compare performed byte by byte. Benchmarking results: (old = strstr_sse2_unaligned, new = strstr_avx512) Geometric mean of all benchmarks: new / old = 0.66 Difficult skiptable(0) : new / old = 0.02 Difficult skiptable(1) : new / old = 0.01 Difficult 2-way : new / old = 0.25 Difficult testing first 2 : new / old = 1.26 Difficult skiptable(0) : new / old = 0.05 Difficult skiptable(1) : new / old = 0.06 Difficult 2-way : new / old = 0.26 Difficult testing first 2 : new / old = 1.05 Difficult skiptable(0) : new / old = 0.42 Difficult skiptable(1) : new / old = 0.24 Difficult 2-way : new / old = 0.21 Difficult testing first 2 : new / old = 1.04 Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* grep: egrep -> grep -E, fgrep -> grep -FSam James2022-06-055-8/+8
| | | | | | | | | | | Newer versions of GNU grep (after grep 3.7, not inclusive) will warn on 'egrep' and 'fgrep' invocations. Convert usages within the tree to their expanded non-aliased counterparts to avoid irritating warnings during ./configure and the test suite. Signed-off-by: Sam James <sam@gentoo.org> Reviewed-by: Fangrui Song <maskray@google.com>
* linux: Add process_mreleaseAdhemerval Zanella2022-06-0238-0/+124
| | | | | | | | | Added in Linux 5.15 (884a7e5964e06ed93c7771c0d7cf19c09a8946f1), the new syscalls allows a caller to free the memory of a dying target process. Checked on x86_64-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* linux: Add process_madviseAdhemerval Zanella2022-06-0238-0/+214
| | | | | | | | | | It was added on Linux 5.10 (ecb8ac8b1f146915aa6b96449b66dd48984caacc) with the same functionality as madvise but using a pidfd of the target process. Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* linux: Set tst-pidfd-consts unsupported for kernels headers older than 5.10Adhemerval Zanella2022-06-021-0/+3
| | | | | | | | Instead of fail trying to build the compare source file. Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Matheus Castanho <msc@linux.ibm.com> Reviewed-by: Matheus Castanho <msc@linux.ibm.com>
* Linux: Adjust struct rseq definition to current kernel versionFlorian Weimer2022-06-021-22/+6
| | | | | | | | This definition is only used as a fallback with old kernel headers. The change follows kernel commit bfdf4e6208051ed7165b2e92035b4bf11 ("rseq: Remove broken uapi field layout on 32-bit little endian"). Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* socket: Use 64 bit stat for isfdtype (BZ# 29209)Adhemerval Zanella2022-06-011-2/+2
| | | | | | This is a missing spot initially from 52a5fe70a2c77935. Checked on i686-linux-gnu.
* posix: Use 64 bit stat for fpathconf (_PC_ASYNC_IO) (BZ# 29208)Adhemerval Zanella2022-06-011-2/+2
| | | | | | This is a missing spot initially from 52a5fe70a2c77935. Checked on i686-linux-gnu.
* posix: Use 64 bit stat for posix_fallocate fallback (BZ# 29207)Adhemerval Zanella2022-06-012-4/+4
| | | | | | This is a missing spot initially from 52a5fe70a2c77935. Checked on i686-linux-gnu.
* linux: use statx for fstat if neither newfstatat nor fstatat64 is presentWANG Xuerui2022-06-011-1/+2
| | | | | | | | | | | | LoongArch is going to be the first architecture supported by Linux that has neither fstat* nor newfstatat [1], instead exclusively relying on statx. So in fstatat64's implementation, we need to also enable statx usage if neither fstatat64 nor newfstatat is present, to prepare for this new case of kernel ABI. [1]: https://lore.kernel.org/all/20220518092619.1269111-1-chenhuacai@loongson.cn/ Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* Add MADV_DONTNEED_LOCKED from Linux 5.18 to bits/mman-linux.hJoseph Myers2022-06-011-0/+2
| | | | | | | | Linux 5.18 adds a constant MADV_DONTNEED_LOCKED (defined in multiple header files, but with the same value on all architectures). Add this constant to bits/mman-linux.h. Tested for x86_64.
* Add HWCAP2_MTE3 from Linux 5.18 to AArch64 bits/hwcap.hJoseph Myers2022-06-011-0/+1
| | | | | | | Linux 5.18 defines a new AArch64 HWCAP value HWCAP2_MTE3; add it to glibc's sysdeps/unix/sysv/linux/aarch64/bits/hwcap.h. Tested with build-many-glibcs.py for aarch64-linux-gnu.
* i686: Use generic sincosf implementation for SSE2 versionAdhemerval Zanella2022-06-015-585/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The generic implementation shows slight better performance (gcc 11.2.1 on a Ryzen 9 5900X): * s_sincosf-sse2.S: "sincosf": { "workload-random": { "duration": 3.89961e+09, "iterations": 9.5472e+07, "reciprocal-throughput": 40.8429, "latency": 40.8483, "max-throughput": 2.4484e+07, "min-throughput": 2.44808e+07 } } * generic s_cossinf.c: "sincosf": { "workload-random": { "duration": 3.71953e+09, "iterations": 1.48512e+08, "reciprocal-throughput": 25.0515, "latency": 25.0391, "max-throughput": 3.99177e+07, "min-throughput": 3.99375e+07 } } Checked on i686-linux-gnu. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* i686: Use generic sinf implementation for SSE2 versionAdhemerval Zanella2022-06-015-565/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Performance seems to be similar (gcc 11.2.1 on a Ryzen 9 5900X), the generic algorithm shows slight better performance for the 'workload-huge.wrf' input set. * s_sinf-sse2.S: "sinf": { "": { "duration": 3.72405e+09, "iterations": 2.38374e+08, "max": 63.973, "min": 11.211, "mean": 15.6227 }, "workload-random.wrf": { "duration": 3.76923e+09, "iterations": 8.4e+07, "reciprocal-throughput": 17.6355, "latency": 72.108, "max-throughput": 5.67037e+07, "min-throughput": 1.38681e+07 }, "workload-huge.wrf": { "duration": 3.76943e+09, "iterations": 6e+07, "reciprocal-throughput": 29.3493, "latency": 96.2985, "max-throughput": 3.40724e+07, "min-throughput": 1.03844e+07 } } * generic s_sinf.c: "sinf": { "": { "duration": 3.70989e+09, "iterations": 2.18025e+08, "max": 69.782, "min": 11.1, "mean": 17.0159 }, "workload-random.wrf": { "duration": 3.77213e+09, "iterations": 9.6e+07, "reciprocal-throughput": 17.5402, "latency": 61.0459, "max-throughput": 5.70119e+07, "min-throughput": 1.63811e+07 }, "workload-huge.wrf": { "duration": 3.81576e+09, "iterations": 5.6e+07, "reciprocal-throughput": 38.2111, "latency": 98.0659, "max-throughput": 2.61704e+07, "min-throughput": 1.01972e+07 } } Checked on i686-linux-gnu. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* i686: Use generic cosf implementation for SSE2 versionAdhemerval Zanella2022-06-015-552/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Performance seems to be similar (gcc 11.2.1 on a Ryzen 9 5900X): * s_cosf-sse2.S: "cosf": { "workload-random": { "duration": 3.74987e+09, "iterations": 9.616e+07, "reciprocal-throughput": 15.8141, "latency": 62.1782, "max-throughput": 6.32346e+07, "min-throughput": 1.60828e+07 } } * generic s_cosf.c: "cosf": { "workload-random": { "duration": 3.87298e+09, "iterations": 1.00968e+08, "reciprocal-throughput": 18.3448, "latency": 58.3722, "max-throughput": 5.45113e+07, "min-throughput": 1.71314e+07 } } Checked on i686-linux-gnu.
* x86_64: Optimize sincos where sin/cos is optimized (bug 29193)Andreas Schwab2022-06-016-3/+55
| | | | | | | The compiler may substitute calls to sin or cos with calls to sincos, thus we should have the same optimized implementations for sincos. The optimized implementations may produce results that differ, that also makes sure that the sincos call aggrees with the sin and cos calls.
* Add SOL_SMC from Linux 5.18 to bits/socket.hJoseph Myers2022-05-311-0/+1
| | | | | | | Linux 5.18 adds a constant SOL_SMC to the getsockopt / setsockopt levels; add this constant to bits/socket.h. Tested for x86_64.
* elf: Remove _dl_skip_argsAdhemerval Zanella2022-05-302-5/+0
| | | | | | Now that no architecture uses it anymore. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* x86_64: Remove _dl_skip_args usageAdhemerval Zanella2022-05-302-22/+3
| | | | | | | | | | | Since ad43cac44a the generic code already shuffles the argv/envp/auxv on the stack to remove the ld.so own arguments and thus _dl_skip_args is always 0. So there is no need to adjust the argc or argv. Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* sparc: Remove _dl_skip_args usageAdhemerval Zanella2022-05-302-79/+4
| | | | | | | | | | Since ad43cac44a the generic code already shuffles the argv/envp/auxv on the stack to remove the ld.so own arguments and thus _dl_skip_args is always 0. So there is no need to adjust the argc or argv. Checked on sparc64-linux-gnu and sparcv9-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* sh: Remove _dl_skip_args usageAdhemerval Zanella2022-05-301-15/+1
| | | | | | | | | | | Since ad43cac44a the generic code already shuffles the argv/envp/auxv on the stack to remove the ld.so own arguments and thus _dl_skip_args is always 0. So there is no need to adjust the argc or argv. Checked with qemu-user that arguments are correctly passed on both constructors and main program. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* s390: Remove _dl_skip_args usageAdhemerval Zanella2022-05-302-62/+0
| | | | | | | | | | Since ad43cac44a the generic code already shuffles the argv/envp/auxv on the stack to remove the ld.so own arguments and thus _dl_skip_args is always 0. So there is no need to adjust the argc or argv. Checked on s390x-linux-gnu and s390-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* riscv: Remove _dl_skip_args usageAdhemerval Zanella2022-05-301-11/+1
| | | | | | | | | | | Since ad43cac44a the generic code already shuffles the argv/envp/auxv on the stack to remove the ld.so own arguments and thus _dl_skip_args is always 0. So there is no need to adjust the argc or argv. Checked with qemu-user that arguments are correctly passed on both constructors and main program. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* nios2: Remove _dl_skip_args usage (BZ# 29187)Adhemerval Zanella2022-05-301-40/+10
| | | | | | | | | | | Since ad43cac44a the generic code already shuffles the argv/envp/auxv on the stack to remove the ld.so own arguments and thus _dl_skip_args is always 0. So there is no need to adjust the argc or argv. Checked with qemu-user that arguments are correctly passed on both constructors and main program. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* mips: Remove _dl_skip_args usageAdhemerval Zanella2022-05-301-29/+2
| | | | | | | | | | | Since ad43cac44a the generic code already shuffles the argv/envp/auxv on the stack to remove the ld.so own arguments and thus _dl_skip_args is always 0. So there is no need to adjust the argc or argv. Checked with qemu-user that arguments are correctly passed on both constructors and main program. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* microblaze: Remove _dl_skip_args usageAdhemerval Zanella2022-05-301-5/+0
| | | | | | | | | | | Since ad43cac44a the generic code already shuffles the argv/envp/auxv on the stack to remove the ld.so own arguments and thus _dl_skip_args is always 0. So there is no need to adjust the argc or argv. Checked with qemu-user that arguments are correctly passed on both constructors and main program. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* m68k: Remove _dl_skip_args usageAdhemerval Zanella2022-05-301-7/+0
| | | | | | | | | | | Since ad43cac44a the generic code already shuffles the argv/envp/auxv on the stack to remove the ld.so own arguments and thus _dl_skip_args is always 0. So there is no need to adjust the argc or argv. Checked with qemu-user that arguments are correctly passed on both constructors and main program. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* ia64: Remove _dl_skip_args usageAdhemerval Zanella2022-05-301-56/+14
| | | | | | | | | | | | | Since ad43cac44a the generic code already shuffles the argv/envp/auxv on the stack to remove the ld.so own arguments and thus _dl_skip_args is always 0. The startup code is changed to read the _dl_argc and _dl_argv values, and envp is calculated from argc and argv. Checked on ia64-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* i686: Remove _dl_skip_args usageAdhemerval Zanella2022-05-301-11/+2
| | | | | | | | | | Since ad43cac44a the generic code already shuffles the argv/envp/auxv on the stack to remove the ld.so own arguments and thus _dl_skip_args is always 0. So there is no need to adjust the argc or argv. Checked on i686-linux-gnu. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* hppa: Remove _dl_skip_args usage (BZ# 29165)Adhemerval Zanella2022-05-301-22/+14
| | | | | | | | | | | | | Different than other architectures, hppa creates an unrelated stack frame where ld.so argc/argv adjustments done by ad43cac44a6860eaefc is not done on the argc/argv saved/restore by _dl_start_user. Instead load _dl_argc and _dl_argv directlty instead of adjust them using _dl_skip_args value. Checked on hppa-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* csky: Remove _dl_skip_args usageAdhemerval Zanella2022-05-301-18/+1
| | | | | | | | Since ad43cac44a the generic code already shuffles the argv/envp/auxv on the stack to remove the ld.so own arguments and thus _dl_skip_args is always 0. It makes the fixup_stack branch ununsed. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* arc: Remove _dl_skip_args usageAdhemerval Zanella2022-05-301-15/+2
| | | | | | | | Since ad43cac44a the generic code already shuffles the argv/envp/auxv on the stack to remove the ld.so own arguments and thus _dl_skip_args is always 0. So there is no need to adjust the argc or argv. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* arm: Remove _dl_skip_args usageAdhemerval Zanella2022-05-301-39/+0
| | | | | | | | | | | Since ad43cac44a the generic code already shuffles the argv/envp/auxv on the stack to remove the ld.so own arguments and thus _dl_skip_args is always 0. It makes the _fixup_stack branch ununsed. Checked with qemu-user that arguments are correctly passed on both constructors and main program. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* alpha: Remove _dl_skip_args usageAdhemerval Zanella2022-05-301-41/+0
| | | | | | | | | | | Since ad43cac44a the generic code already shuffles the argv/envp/auxv on the stack to remove the ld.so own arguments and thus _dl_skip_args is always 0. It makes the fixup_stack branch ununsed. Checked with qemu-user that arguments are correctly passed on both constructors and main program. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* x86-64: Ignore r_addend for R_X86_64_GLOB_DAT/R_X86_64_JUMP_SLOTH.J. Lu2022-05-261-2/+4
| | | | | | | | According to x86-64 psABI, r_addend should be ignored for R_X86_64_GLOB_DAT and R_X86_64_JUMP_SLOT. Since linkers always set their r_addends to 0, we can ignore their r_addends. Reviewed-by: Fangrui Song <maskray@google.com>
* x86_64: Implement evex512 version of strlen, strnlen, wcslen and wcsnlenSunil K Pandey2022-05-267-0/+346
| | | | | | | | | | | | | | | This patch implements following evex512 version of string functions. Perf gain for evex512 version is up to 50% as compared to evex, depending on length and alignment. Placeholder function, not used by any processor at the moment. - String length function using 512 bit vectors. - String N length using 512 bit vectors. - Wide string length using 512 bit vectors. - Wide string N length using 512 bit vectors. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* Update kernel version to 5.18 in header constant testsJoseph Myers2022-05-262-2/+2
| | | | | | | | | | This patch updates the kernel version in the tests tst-mman-consts.py and tst-pidfd-consts.py to 5.18. (There are no new constants covered by these tests in 5.18, or in 5.17 in the case of tst-pidfd-consts.py that previously used version 5.16, that need any other header changes.) Tested with build-many-glibcs.py.
* Update syscall-names.list for Linux 5.18Joseph Myers2022-05-251-2/+2
| | | | | | | Linux 5.18 has no new syscalls. Update the version number in syscall-names.list to reflect that it is still current for 5.18. Tested with build-many-glibcs.py.
* Fix deadlock when pthread_atfork handler calls pthread_atfork or dlcloseArjun Shankar2022-05-255-4/+372
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In multi-threaded programs, registering via pthread_atfork, de-registering implicitly via dlclose, or running pthread_atfork handlers during fork was protected by an internal lock. This meant that a pthread_atfork handler attempting to register another handler or dlclose a dynamically loaded library would lead to a deadlock. This commit fixes the deadlock in the following way: During the execution of handlers at fork time, the atfork lock is released prior to the execution of each handler and taken again upon its return. Any handler registrations or de-registrations that occurred during the execution of the handler are accounted for before proceeding with further handler execution. If a handler that hasn't been executed yet gets de-registered by another handler during fork, it will not be executed. If a handler gets registered by another handler during fork, it will not be executed during that particular fork. The possibility that handlers may now be registered or deregistered during handler execution means that identifying the next handler to be run after a given handler may register/de-register others requires some bookkeeping. The fork_handler struct has an additional field, 'id', which is assigned sequentially during registration. Thus, handlers are executed in ascending order of 'id' during 'prepare', and descending order of 'id' during parent/child handler execution after the fork. Two tests are included: * tst-atfork3: Adhemerval Zanella <adhemerval.zanella@linaro.org> This test exercises calling dlclose from prepare, parent, and child handlers. * tst-atfork4: This test exercises calling pthread_atfork and dlclose from the prepare handler. [BZ #24595, BZ #27054] Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* math: Add math-use-builtins-fabs (BZ#29027)Adhemerval Zanella2022-05-2312-208/+29
| | | | | | | | | | | | | | | | | | Both float, double, and _Float128 are assumed to be supported (float and double already only uses builtins). Only long double is parametrized due GCC bug 29253 which prevents its usage on powerpc. It allows to remove i686, ia64, x86_64, powerpc, and sparc arch specific implementation. On ia64 it also fixes the sNAN handling: math/test-float64x-fabs math/test-ldouble-fabs Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc-linux-gnu, powerpc64-linux-gnu, sparc64-linux-gnu, and ia64-linux-gnu.
* linux: Add CLONE_NEWTIME from Linux 5.6 to bits/sched.hAdhemerval Zanella2022-05-231-0/+4
| | | | It was added in commit 769071ac9f20b6a447410c7eaa55d1a5233ef40c.
* Revert "[ARM][BZ #17711] Fix extern protected data handling"Fangrui Song2022-05-232-28/+3
| | | | | | | | This reverts commit 3bcea719ddd6ce399d7bccb492c40af77d216e42. Similar to commit e555954e026df1b85b8ef6c101d05f97b1520d7e for aarch64. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
* Revert "[AArch64][BZ #17711] Fix extern protected data handling"Fangrui Song2022-05-232-28/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 0910702c4d2cf9e8302b35c9519548726e1ac489. Say both a.so and b.so define protected data symbol `var` and the executable copy relocates var. ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA has strange semantics: a.so accesses the copy in the executable while b.so accesses its own. This behavior requires that (a) the compiler emits GOT-generating relocations (b) the linker produces GLOB_DAT instead of RELATIVE. Without the ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA code, b.so's GLOB_DAT will bind to the executable (normal behavior). For aarch64 it makes sense to restore the original behavior and don't pay the ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA cost. The behavior is very unlikely used by anyone. * Clang code generator treats STV_PROTECTED the same way as STV_HIDDEN: no GOT-generating relocation in the first place. * gold and lld reject copy relocation on a STV_PROTECTED symbol. * Nowadays -fpie/-fpic modes are popular. GCC/Clang's codegen uses GOT-generating relocation when accessing an default visibility external symbol which avoids copy relocation. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
* elf: Optimize _dl_new_hash in dl-new-hash.hNoah Goldstein2022-05-232-0/+133
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Unroll slightly and enforce good instruction scheduling. This improves performance on out-of-order machines. The unrolling allows for pipelined multiplies. As well, as an optional sysdep, reorder the operations and prevent reassosiation for better scheduling and higher ILP. This commit only adds the barrier for x86, although it should be either no change or a win for any architecture. Unrolling further started to induce slowdowns for sizes [0, 4] but can help the loop so if larger sizes are the target further unrolling can be beneficial. Results for _dl_new_hash Benchmarked on Tigerlake: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz Time as Geometric Mean of N=30 runs Geometric of all benchmark New / Old: 0.674 type, length, New Time, Old Time, New Time / Old Time fixed, 0, 2.865, 2.72, 1.053 fixed, 1, 3.567, 2.489, 1.433 fixed, 2, 2.577, 3.649, 0.706 fixed, 3, 3.644, 5.983, 0.609 fixed, 4, 4.211, 6.833, 0.616 fixed, 5, 4.741, 9.372, 0.506 fixed, 6, 5.415, 9.561, 0.566 fixed, 7, 6.649, 10.789, 0.616 fixed, 8, 8.081, 11.808, 0.684 fixed, 9, 8.427, 12.935, 0.651 fixed, 10, 8.673, 14.134, 0.614 fixed, 11, 10.69, 15.408, 0.694 fixed, 12, 10.789, 16.982, 0.635 fixed, 13, 12.169, 18.411, 0.661 fixed, 14, 12.659, 19.914, 0.636 fixed, 15, 13.526, 21.541, 0.628 fixed, 16, 14.211, 23.088, 0.616 fixed, 32, 29.412, 52.722, 0.558 fixed, 64, 65.41, 142.351, 0.459 fixed, 128, 138.505, 295.625, 0.469 fixed, 256, 291.707, 601.983, 0.485 random, 2, 12.698, 12.849, 0.988 random, 4, 16.065, 15.857, 1.013 random, 8, 19.564, 21.105, 0.927 random, 16, 23.919, 26.823, 0.892 random, 32, 31.987, 39.591, 0.808 random, 64, 49.282, 71.487, 0.689 random, 128, 82.23, 145.364, 0.566 random, 256, 152.209, 298.434, 0.51 Co-authored-by: Alexander Monakov <amonakov@ispras.ru> Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* S390: Enable static PIEStefan Liebler2022-05-183-0/+242
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit enables static PIE on 64bit. On 31bit, static PIE is not supported. A new configure check in sysdeps/s390/s390-64/configure.ac also performs a minimal test for requirements in ld: Ensure you also have those patches for: - binutils (ld) - "[PR ld/22263] s390: Avoid dynamic TLS relocs in PIE" https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=26b1426577b5dcb32d149c64cca3e603b81948a9 (Tested by configure check above) Otherwise there will be a R_390_TLS_TPOFF relocation, which fails to be processed in _dl_relocate_static_pie() as static TLS map is not setup. - "s390: Add DT_JMPREL pointing to .rela.[i]plt with static-pie" https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d942d8db12adf4c9e5c7d9ed6496a779ece7149e (We can't test it in configure as we are not able to link a static PIE executable if the system glibc lacks static PIE support) Otherwise there won't be DT_JMPREL, DT_PLTRELA, DT_PLTRELASZ entries and the IFUNC symbols are not processed, which leads to crashes. - kernel (the mentioned links to the commits belong to 5.19 merge window): - "s390/mmap: increase stack/mmap gap to 128MB" https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git/commit/?h=features&id=f2f47d0ef72c30622e62471903ea19446ea79ee2 - "s390/vdso: move vdso mapping to its own function" https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git/commit/?h=features&id=57761da4dc5cd60bed2c81ba0edb7495c3c740b8 - "s390/vdso: map vdso above stack" https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git/commit/?h=features&id=9e37a2e8546f9e48ea76c839116fa5174d14e033 - "s390/vdso: add vdso randomization" https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git/commit/?h=features&id=41cd81abafdc4e58a93fcb677712a76885e3ca25 (We can't test the kernel of the target system) Otherwise if /proc/sys/kernel/randomize_va_space is turned off (0), static PIE executables like ldconfig will crash. While startup sbrk is used to enlarge the HEAP. Unfortunately the underlying brk syscall fails as there is not enough space after the HEAP. Then the address of the TLS image is invalid and the following memcpy in __libc_setup_tls() leads to a segfault. If /proc/sys/kernel/randomize_va_space is activated (default: 2), there is enough space after HEAP. - glibc - "Linux: Define MMAP_CALL_INTERNAL" https://sourceware.org/git/?p=glibc.git;a=commit;h=c1b68685d438373efe64e5f076f4215723004dfb - "i386: Remove OPTIMIZE_FOR_GCC_5 from Linux libc-do-syscall.S" https://sourceware.org/git/?p=glibc.git;a=commit;h=6e5c7a1e262961adb52443ab91bd2c9b72316402 - "i386: Honor I386_USE_SYSENTER for 6-argument Linux system calls" https://sourceware.org/git/?p=glibc.git;a=commit;h=60f0f2130d30cfd008ca39743027f1e200592dff - "ia64: Always define IA64_USE_NEW_STUB as a flag macro" https://sourceware.org/git/?p=glibc.git;a=commit;h=18bd9c3d3b1b6a9182698c85354578d1d58e9d64 - "Linux: Implement a useful version of _startup_fatal" https://sourceware.org/git/?p=glibc.git;a=commit;h=a2a6bce7d7e52c1c34369a7da62c501cc350bc31 - "Linux: Introduce __brk_call for invoking the brk system call" https://sourceware.org/git/?p=glibc.git;a=commit;h=b57ab258c1140bc45464b4b9908713e3e0ee35aa - "csu: Implement and use _dl_early_allocate during static startup" https://sourceware.org/git/?p=glibc.git;a=commit;h=f787e138aa0bf677bf74fa2a08595c446292f3d7 The mentioned patch series by Florian Weimer avoids the mentioned failing sbrk syscall by falling back to mmap. This commit also adjusts startup code in start.S to be ready for static PIE. We have to add a wrapper function for main as we are not allowed to use GOT relocations before __libc_start_main is called. (Compare also to: - commit 14d886edbd3d80b771e1c42fbd9217f9074de9c6 "aarch64: fix start code for static pie" - commit 3d1d79283e6de4f7c434cb67fb53a4fd28359669 "aarch64: fix static pie enabled libc when main is in a shared library" )
* linux: Add tst-pidfd.cAdhemerval Zanella2022-05-172-0/+173
| | | | | | | | | | To check for the pidfd functions pidfd_open, pidfd_getfd, pid_send_signal, and waitid with P_PIDFD. Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>
* linux: Add P_PIDFDAdhemerval Zanella2022-05-172-0/+26
| | | | | | | | It was added on Linux 5.4 (3695eae5fee0605f316fbaad0b9e3de791d7dfaf) to extend waitid to wait on pidfd. Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>
* linux: Add pidfd_send_signalAdhemerval Zanella2022-05-1736-0/+43
| | | | | | | | | | | | | | This was added on Linux 5.1(3eb39f47934f9d5a3027fe00d906a45fe3a15fad) as a way to avoid the race condition of using kill (where PID might be reused by the kernel between between obtaining the pid and sending the signal). If the siginfo_t argument is NULL then pidfd_send_signal is equivalent to kill. If it is not NULL pidfd_send_signal is equivalent to rt_sigqueueinfo. Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>
* linux: Add pidfd_getfdAdhemerval Zanella2022-05-1737-0/+44
| | | | | | | | | | This was added on Linux 5.6 (8649c322f75c96e7ced2fec201e123b2b073bf09) as a way to retrieve a file descriptors for another process though pidfd (created either with CLONE_PIDFD or pidfd_getfd). The functionality is similar to recvmmsg SCM_RIGHTS. Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>