path: root/sysdeps
Commit message | Author | Age | Files | Lines
* Aarch64: Add memcpy for qualcomm's oryon-1 coreAndrew Pinski2024-06-305-0/+316
| Qualcomm's new core (oryon-1) has different performance characteristics than other cores. For memcpy, it is faster to use the GPRs to do the copy for large sizes (2x faster). For even larger sizes, it is better to use the nontemporal load/store instructions so we don't pollute the L1/L2 caches. For smaller sizes, the characteristics are very similar to other cores. I used the thunderx memcpy as a starting point and expanded from there.
|
| Changes since v1:
| * v2: Fix ordering in Makefile.
| * v3: Fix comment grammar about the ldnp/stnp instructions.
|
| Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
| Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
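The three size tiers described in this message can be pictured as a small dispatch function. This is only a sketch: the thresholds, the enum, and `pick_copy_strategy` are hypothetical names and values chosen for illustration, since the message gives no numbers and the real selection is done in the assembly implementation.

```c
#include <stddef.h>

/* Hypothetical cut-offs for illustration only; the commit message does not
   state the real values.  */
#define LARGE_COPY_THRESHOLD        ((size_t) 32 * 1024)
#define NONTEMPORAL_COPY_THRESHOLD  ((size_t) 1024 * 1024)

enum copy_strategy
{
  COPY_GENERIC,      /* small sizes: same approach as other cores */
  COPY_GPR,          /* large sizes: copy through general-purpose registers */
  COPY_NONTEMPORAL   /* even larger sizes: ldnp/stnp to spare the caches */
};

/* Pick a copy strategy from the length alone, largest tier first.  */
enum copy_strategy
pick_copy_strategy (size_t n)
{
  if (n >= NONTEMPORAL_COPY_THRESHOLD)
    return COPY_NONTEMPORAL;
  if (n >= LARGE_COPY_THRESHOLD)
    return COPY_GPR;
  return COPY_GENERIC;
}
```

Checking the largest tier first keeps each branch a single comparison.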
* arm: Avoid UB in elf_machine_rel()Palmer Dabbelt2024-06-261-5/+4
| | | | | | | | | | This recently came up during a cleanup to remove misaligned accesses from the RISC-V port. Link: https://sourceware.org/pipermail/libc-alpha/2022-June/139961.html Suggested-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Reviewed-by: Fangrui Song <maskray@google.com>
* LoongArch: Fix tst-gnu2-tls2 test casemengqinggang2024-06-261-143/+153
| asm volatile ("movfcsr2gr $t0, $fcsr0" ::: "$t0");
| asm volatile ("st.d $t0, %0" :"=m"(restore_fcsr));
|
| generates the following instructions with the -Og flag:
|
| movfcsr2gr $t0, $zero
| addi.d $t0, $sp, 2047(0x7ff)
| addi.d $t0, $t0, 77(0x4d)
| st.w $t0, $t0, 0
|
| The fcsr0 value and the address of the restore_fcsr variable both end up in the t0 register. Change to:
|
| asm volatile ("movfcsr2gr %0, $fcsr0" :"=r"(restore_fcsr));
|
| to keep the address of restore_fcsr out of t0.
|
| Compare float values using memcmp because float values cannot be directly compared for equality.
|
| Put LOAD_REGISTER_FCSR and SAVE_REGISTER_FCC after LOAD_REGISTER_FLOAT, because some float instructions may change the fcsr register.
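The memcmp-based comparison mentioned in this message can be sketched as follows. The helper name `float_bits_equal` is hypothetical and not taken from the test; it only illustrates comparing the stored bytes of two floats instead of using `==`.

```c
#include <stdbool.h>
#include <string.h>

/* Compare two floats by their object representation rather than by value.
   Hypothetical helper for illustration.  */
bool
float_bits_equal (float a, float b)
{
  return memcmp (&a, &b, sizeof (float)) == 0;
}
```

Unlike `==`, a bytewise comparison treats +0.0 and -0.0 as different, which suits a test that checks whether register contents were saved and restored exactly.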
* posix: Fix pidfd_spawn/pidfd_spawnp leak if execve fails (BZ 31695)Adhemerval Zanella2024-06-251-7/+16
| If the pidfd_spawn/pidfd_spawnp helper process succeeds, but execve fails for some reason (an invalid or non-existent file, a memory allocation failure, etc.), the resulting pidfd is never closed, nor returned to the caller (so it can call close).
|
| Since the process creation failed, it should be up to posix_spawn to also close the file descriptor in this case (similar to what it does to reap the process).
|
| This patch also replaces waitpid with waitid (P_PIDFD) for the pidfd case, to avoid a possible pid re-use.
|
| Checked on x86_64-linux-gnu.
|
| Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* Revert "MIPSr6/math: Use builtin fma and fmaf"Andreas K. Hüttel2024-06-251-13/+0
| | | | | | | Apologies, I mistakenly interpreted this to be already accepted. Reverting until v6 or later is reviewed and approved. This reverts commit 9e06e4a43b58519991acbed1d7f33abc40249226.
* RISC-V: Execute a PAUSE hint in spin loopsChristoph Müllner2024-06-241-0/+3
| | | | | | | | | | | | | | | | The atomic_spin_nop() macro can be used to run arch-specific code in the body of a spin loop to potentially improve efficiency. RISC-V's Zihintpause extension includes a PAUSE instruction for this use-case, which is encoded as a HINT, which means that it behaves like a NOP on systems that don't implement Zihintpause. Binutils supports Zihintpause since 2.36, so this patch uses the ".insn" directive to keep the code compatible with older toolchains. Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu> Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com> Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
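A spin loop using such a hint might look like the sketch below. The names `spin_nop` and `spin_until_set` are hypothetical stand-ins (the glibc macro is atomic_spin_nop()), and the `.insn` operands are one way to spell the PAUSE encoding (0x0100000f) for assemblers that lack Zihintpause support. On other architectures the sketch falls back to a plain compiler barrier.

```c
#include <stdatomic.h>

#if defined (__riscv)
/* PAUSE is a FENCE-encoded hint; emitting it through .insn avoids needing
   an assembler that knows the Zihintpause mnemonic.  */
# define spin_nop() __asm__ volatile (".insn i 0x0f, 0, x0, x0, 0x010")
#else
/* Fallback for illustration: compiler barrier only.  */
# define spin_nop() __asm__ volatile ("" ::: "memory")
#endif

/* Spin until *flag becomes non-zero; return how many times we paused.  */
unsigned long
spin_until_set (atomic_int *flag)
{
  unsigned long spins = 0;
  while (atomic_load_explicit (flag, memory_order_acquire) == 0)
    {
      spin_nop ();
      ++spins;
    }
  return spins;
}
```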
* MIPSr6/math: Use builtin fma and fmafYunQiang Su2024-06-241-0/+13
| | | | | | | | | | | | | MIPSr6 has MADDF.s/MADDF.d instructions, which are fused. In MIPS ISA, double support can be subsetted. Only FMAF is enabled for this case. * sysdeps/mips/fpu/math-use-builtins-fma.h Signed-off-by: YunQiang Su <syq@gcc.gnu.org> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
* hppa/vdso: Add wrappers for vDSO functionsJohn David Anglin2024-06-231-0/+12
| | | | | | | | | | | | The upcoming parisc (hppa) v6.11 Linux kernel will include vDSO support for gettimeofday(), clock_gettime() and clock_gettime64() syscalls for 32- and 64-bit userspace. The patch below adds the necessary glue code for glibc. Signed-off-by: Helge Deller <deller@gmx.de> Changes in v2: - add vsyscalls for 64-bit too
* Update hppa libm-test-ulpsJohn David Anglin2024-06-231-0/+48
|
* Update hppa libm-test-ulpsJohn David Anglin2024-06-201-0/+16
|
* RISC-V: Update ulpsJulian Zhu2024-06-201-0/+80
| | | | | | For the exp10m1, exp2m1, log10p1 and log2p1 implementations. Signed-off-by: Julian Zhu <jz531210@gmail.com>
* MIPS: Update ulpsJulian Zhu2024-06-202-0/+108
| | | | | | Update mips32/mips64 ulps for the exp10m1, exp2m1, and log10p1 implementations. Signed-off-by: Julian Zhu <jz531210@gmail.com>
* i386: Update ulpsFlorian Weimer2024-06-201-2/+2
| | | | | This is from a -march=i686 -mtune=generic build with --disable-multi-arch, running on a Cascade Lake CPU.
* s390x: Capture grep output in static PIE checkFlorian Weimer2024-06-202-7/+7
| The test is not a run-time check, so update the description. Also use readelf -W for a more stable output format and fix an LC_ALL typo.
|
| This avoids garbled configure messages:
|
| checking for s390-specific static PIE requirements (runtime check)... 0x0000000000000017 (JMPREL) 0x280
| yes
* powerpc: Update ulpsFlorian Weimer2024-06-201-1/+13
| | | | | Results based on POWER8 and POWER9 machines running powerpc64-linux-gnu, with and without --disable-multi-arch.
* i386: Update ulpsFlorian Weimer2024-06-202-8/+104
| Based on a -march=x86-64-v4 -mfpmath=sse build, with and without --disable-multi-arch, running on a Zen 4 CPU. Also used different -march=x86-64-v… settings.
* LoongArch: Update ulpsXi Ruoyao2024-06-191-0/+60
| | | | | | Add ulps for recently added C23 exp10m1, exp2m1, and log10p1 functions. Signed-off-by: Xi Ruoyao <xry111@xry111.site>
* sparc: Regenerate ULPsAndreas K. Hüttel2024-06-191-0/+80
| | | | | | Linux catbus 5.15.110-gentoo-r1 #1 SMP Fri Jun 9 17:53:23 PDT 2023 sparc64 sun4v UltraSparc T5 (Niagara5) GNU/Linux Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
* s390x: Regenerate ULPs.Stefan Liebler2024-06-191-0/+60
| Needed due to:
| - "Implement C23 log10p1" commit ID 55eb99e9a9d840ba452b128be14d6529c2dde039
| - "Implement C23 exp2m1, exp10m1" commit ID 7ec903e028271d029818378fd60ddaf6b76b89ac
* LoongArch: Fix _dl_tlsdesc_dynamic in LSX casemengqinggang2024-06-191-9/+9
| The HWCAP value is overwritten by the first comparison, in the LASX case, so the second comparison, for LSX, gets an incorrect result. Use t0 to hold the HWCAP value and t1 to hold the comparison result.
* arm: Update ulpsAdhemerval Zanella2024-06-181-0/+48
| | | | For the exp10m1, exp2m1, and log10p1 implementations.
* aarch64: Update ulpsAdhemerval Zanella2024-06-181-0/+60
| | | | For the exp10m1, exp2m1, and log10p1 implementations.
* powerpc: Update ulpsAdhemerval Zanella2024-06-181-0/+60
| | | | For the exp10m1, exp2m1, and log10p1 implementations.
* Linux: Include <dl-symbol-redir-ifunc.h> in dl-sysdep.cFlorian Weimer2024-06-181-0/+1
| The _dl_sysdep_parse_arguments function contains initialization of a large on-stack variable:
|
| dl_parse_auxv_t auxv_values = { 0, };
|
| This uses a non-inline version of memset on powerpc64le-linux-gnu, so it must use the baseline memset.
* linux: add definitions for hugetlb page size encodingsCarlos Llamas2024-06-183-6/+45
| A desired hugetlb page size can be encoded in the flags parameter of system calls such as mmap() and shmget(). The Linux UAPI headers have included explicit definitions for these encodings since v4.14.
|
| This patch adds these definitions that are used along with MAP_HUGETLB and SHM_HUGETLB flags as specified in the corresponding man pages. This relieves programs from having to duplicate and/or compute the encodings manually.
|
| Additionally, the filter on these definitions in tst-mman-consts.py is removed, as suggested by Florian. I then ran this test successfully, confirming the alignment with the kernel headers.
|
| PASS: misc/tst-mman-consts
| original exit status 0
|
| Signed-off-by: Carlos Llamas <cmllamas@google.com>
| Tested-by: Florian Weimer <fweimer@redhat.com>
| Reviewed-by: Florian Weimer <fweimer@redhat.com>
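For context, the manual computation that these definitions make unnecessary can be sketched like this. In the Linux UAPI the encoding is the base-2 logarithm of the page size shifted left by 26 (the value of MAP_HUGE_SHIFT). `SKETCH_HUGE_SHIFT` and `encode_huge_page_size` are hypothetical names used here to avoid clashing with the real macros, and the function assumes its argument is a power of two.

```c
/* Same value as the kernel's MAP_HUGE_SHIFT; renamed for this sketch.  */
#define SKETCH_HUGE_SHIFT 26

/* Encode a power-of-two huge page size as log2 (size) << 26.
   Hypothetical helper for illustration.  */
unsigned int
encode_huge_page_size (unsigned long long page_size)
{
  unsigned int order = 0;
  while (page_size > 1)
    {
      page_size >>= 1;
      ++order;
    }
  return order << SKETCH_HUGE_SHIFT;
}
```

The result for 2 MiB matches MAP_HUGE_2MB (21 << 26), and it would be OR-ed with MAP_HUGETLB in the mmap() flags.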
* elf: Remove HWCAP_IMPORTANTStefan Liebler2024-06-1810-43/+0
| Remove the definitions of HWCAP_IMPORTANT after removal of LD_HWCAP_MASK / tunable glibc.cpu.hwcap_mask. There, HWCAP_IMPORTANT was used as the default value.
|
| Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* elf: Remove LD_HWCAP_MASK / tunable glibc.cpu.hwcap_maskStefan Liebler2024-06-182-7/+0
| Remove the environment variable LD_HWCAP_MASK and the tunable glibc.cpu.hwcap_mask as those are not used anymore in common code after removal in elf/dl-cache.c:search_cache().
|
| The only remaining user is sparc32, where it is used in elf_machine_matches_host(). If sparc32 does not need it anymore, we can get rid of it entirely. Otherwise we could also move LD_HWCAP_MASK / tunable glibc.cpu.hwcap_mask to be sparc32 specific.
|
| Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* elf: Remove _DL_PLATFORMS_COUNTStefan Liebler2024-06-189-28/+6
| | | | | | | | | Remove the definitions of _DL_PLATFORMS_COUNT as those are not used anymore after removal in elf/dl-cache.c:search_cache(). Note: On x86, we can also get rid of the definitions HWCAP_PLATFORMS_START and HWCAP_PLATFORMS_COUNT. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* elf: Remove _DL_FIRST_PLATFORMStefan Liebler2024-06-182-6/+0
| Remove the definitions of _DL_FIRST_PLATFORM as those were only used in the _DL_HWCAP_PLATFORM definitions and in _dl_string_platform(). Both were removed.
|
| Note: Removed on every architecture except powerpc, where _dl_string_platform() is still used.
|
| Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* elf: Remove _DL_HWCAP_PLATFORMStefan Liebler2024-06-1810-29/+0
| | | | | | Remove the definitions of _DL_HWCAP_PLATFORM as those are not used anymore after removal in elf/dl-cache.c:search_cache(). Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* elf: Remove platform strings in dl-procinfo.cStefan Liebler2024-06-186-189/+7
| | | | | | Remove the platform strings in dl-procinfo.c where also the implementation of _dl_string_platform() was removed. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* elf: Remove _dl_string_platformStefan Liebler2024-06-189-83/+0
| Apart from powerpc, where the returned integer is stored in the tcb, and the diagnostics output, there are no users anymore.
|
| Thus this patch removes the diagnostics output and _dl_string_platform for all other platforms.
|
| Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* x86: Remove HWCAP_START and HWCAP_COUNTStefan Liebler2024-06-181-6/+0
| | | | | | | | | | Both defines are not used anymore. Those were only used for _dl_string_hwcap(), which itself was removed with commit ab40f20364f4a417a63dd51fdd943742070bfe96 "elf: Remove _dl_string_hwcap" Just clean up. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* math: Update mips32/mips64 ulps for log2p1YunQiang Su2024-06-172-0/+40
|
* Convert to autoconf 2.72 (vanilla release, no distribution patches)Andreas K. Hüttel2024-06-1727-1037/+1066
| | | | | | | As discussed at the patch review meeting Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org> Reviewed-by: Simon Chopin <simon.chopin@canonical.com>
* Implement C23 exp2m1, exp10m1Joseph Myers2024-06-1742-1/+589
| C23 adds various <math.h> function families originally defined in TS 18661-4. Add the exp2m1 and exp10m1 functions (exp2(x)-1 and exp10(x)-1, like expm1).
|
| As with other such functions, these use type-generic templates that could be replaced with faster and more accurate type-specific implementations in future. Test inputs are copied from those for expm1, plus some additions close to the overflow threshold (copied from exp2 and exp10) and also some near the underflow threshold.
|
| exp2m1 has the unusual property of having an input (M_MAX_EXP) where whether the function overflows (under IEEE semantics) depends on the rounding mode. Although these could reasonably be XFAILed in the testsuite (as we do in some cases for arguments very close to a function's overflow threshold when an error of a few ulps in the implementation can result in the implementation not agreeing with an ideal one on whether overflow takes place - the testsuite isn't smart enough to handle this automatically), since these functions aren't required to be correctly rounding, I made the implementation check for and handle this case specially.
|
| The Makefile ordering expected by lint-makefiles for the new functions is a bit peculiar, but I implemented it in this patch so that the test passes; I don't know why log2 also needed moving in one Makefile variable setting when it didn't in my previous patches, but the failure showed a different place was expected for that function as well.
|
| The powerpc64le IFUNC setup seems not to be as self-contained as one might hope; it shouldn't be necessary to add IFUNCs for new functions such as these simply to get them building, but without setting up IFUNCs for the new functions, there were undefined references to __GI___expm1f128 (that IFUNC machinery results in no such function being defined, but doesn't stop include/math.h from doing the redirection resulting in the exp2m1f128 and exp10m1f128 implementations expecting to call it).
|
| Tested for x86_64 and x86, and with build-many-glibcs.py.
* Implement C23 log10p1Joseph Myers2024-06-1739-0/+291
| | | | | | | | | | | | | | C23 adds various <math.h> function families originally defined in TS 18661-4. Add the log10p1 functions (log10(1+x): like log1p, but for base-10 logarithms). This is directly analogous to the log2p1 implementation (except that whereas log2p1 has a smaller underflow range than log1p, log10p1 has a larger underflow range). The test inputs are copied from those for log1p and log2p1, plus a few more inputs in that wider underflow range. Tested for x86_64 and x86, and with build-many-glibcs.py.
* Implement C23 logp1Joseph Myers2024-06-1770-6/+673
| | | | | | | | | | | | | | | | | | | | | | | | | | | C23 adds various <math.h> function families originally defined in TS 18661-4. Add the logp1 functions (aliases for log1p functions - the name is intended to be more consistent with the new log2p1 and log10p1, where clearly it would have been very confusing to name those functions log21p and log101p). As aliases rather than new functions, the content of this patch is somewhat different from those actually adding new functions. Tests are shared with log1p, so this patch *does* mechanically update all affected libm-test-ulps files to expect the same errors for both functions. The vector versions of log1p on aarch64 and x86_64 are *not* updated to have logp1 aliases (and thus there are no corresponding header, tests, abilist or ulps changes for vector functions either). It would be reasonable for such vector aliases and corresponding changes to other files to be made separately. For now, the log1p tests instead avoid testing logp1 in the vector case (a Makefile change is needed to avoid problems with grep, used in generating the .c files for vector function tests, matching more than one ALL_RM_TEST line in a file testing multiple functions with the same inputs, when it assumes that the .inc file only has a single such line). Tested for x86_64 and x86, and with build-many-glibcs.py.
* x86: Fix value for `x86_memset_non_temporal_threshold` when it is undesirableNoah Goldstein2024-06-141-3/+3
| When we don't want to use non-temporal stores for memset, we set `x86_memset_non_temporal_threshold` to SIZE_MAX.
|
| The current code, however, was using `maximum_non_temporal_threshold` as the upper bound, which is `SIZE_MAX >> 4`, so we ended up with a value of `0`.
|
| The fix is to just use `SIZE_MAX` as the upper bound when setting the tunable.
|
| Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
| Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* i686: Regenerate ulpsAndreas K. Hüttel2024-06-141-5/+5
| | | | | | | Linux pinacolada 6.6.32-gentoo #1 SMP PREEMPT Sun Jun 9 14:18:17 CEST 2024 x86_64 Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz GenuineIntel GNU/Linux 32bit build for multilib environment Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
* LoongArch: Ensure sp 16-byte aligned for tlsdescXi Ruoyao2024-06-142-7/+4
| | | | | | | | | | "ADDI sp, sp, 24" and "ADDI sp, sp, SZFCSREG" (SZFCSREG = 4) are misaligning the stack: the ABI mandates a 16-byte alignment. Fix it by changing the first one to "ADDI sp, sp, 32", and reuse the spare 4th slot for saving fcsr. Reported-by: Jinyang He <hejinyang@loongson.cn> Signed-off-by: Xi Ruoyao <xry111@xry111.site>
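The arithmetic behind this fix is easy to check: neither 24 nor 4 is a multiple of 16, whereas 32 is, and 32 bytes hold four 8-byte slots. A minimal sketch, with hypothetical helper names, assuming only the 16-byte alignment stated in the message:

```c
#include <stdbool.h>
#include <stddef.h>

#define STACK_ALIGN 16  /* alignment mandated by the ABI, per the message */

/* True if adjusting sp by this many bytes preserves 16-byte alignment.  */
bool
keeps_stack_aligned (size_t adjustment)
{
  return adjustment % STACK_ALIGN == 0;
}

/* Round a frame size up to the next multiple of STACK_ALIGN.  */
size_t
round_up_to_stack_align (size_t n)
{
  return (n + STACK_ALIGN - 1) & ~(size_t) (STACK_ALIGN - 1);
}
```

Rounding 24 up gives 32, which is where the spare fourth slot used for fcsr comes from.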
* x86: Properly set x86 minimum ISA level [BZ #31883]H.J. Lu2024-06-123-3/+17
| | | | | | | | | | | | | | | | | | | | | | Properly set libc_cv_have_x86_isa_level in shell for MINIMUM_X86_ISA_LEVEL defined as (__X86_ISA_V1 + __X86_ISA_V2 + __X86_ISA_V3 + __X86_ISA_V4) Also set __X86_ISA_V2 to 1 for i386 if __GCC_HAVE_SYNC_COMPARE_AND_SWAP_8 is defined. There are no changes in config.h nor in config.make on x86-64. On i386, -march=x86-64-v2 with GCC generates #define MINIMUM_X86_ISA_LEVEL 2 in config.h and have-x86-isa-level = 2 in config.make. This fixes BZ #31883. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
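The sum in this message works because each per-level macro is either 0 or 1 and the levels are cumulative, so adding them yields the highest level the build satisfies. The sketch below uses plain function parameters as hypothetical stand-ins for __X86_ISA_V1 through __X86_ISA_V4; it is not the configure logic itself.

```c
/* Each argument stands in for one of __X86_ISA_V1..__X86_ISA_V4 and is
   expected to be 0 or 1.  Hypothetical helper for illustration.  */
int
minimum_isa_level (int v1, int v2, int v3, int v4)
{
  return v1 + v2 + v3 + v4;
}
```

For example, a build where only the first two flags are set evaluates to level 2, matching the `#define MINIMUM_X86_ISA_LEVEL 2` quoted in the message.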
* linux: Remove __stack_protAdhemerval Zanella2024-06-121-15/+10
| The __stack_prot is used by Linux to make the stack executable if a module requires it. It is also marked as RELRO, which requires changing the segment permission to RW to update it.
|
| Also, there is no need to keep track of the flags: either the stack will have the default permission of the ABI or should be changed to PROT_READ | PROT_WRITE | PROT_EXEC. The only additional flag, PROT_GROWSDOWN or PROT_GROWSUP, is Linux only and can be deduced from _STACK_GROWS_DOWN/_STACK_GROWS_UP.
|
| Also, the check_consistency function was already removed some time ago.
|
| Checked on x86_64-linux-gnu and i686-linux-gnu.
|
| Reviewed-by: Florian Weimer <fweimer@redhat.com>
* x86: Properly set MINIMUM_X86_ISA_LEVEL for i386 [BZ #31867]H.J. Lu2024-06-112-4/+12
| | | | | | | | | | | On i386, set the default minimum ISA level to 0, not 1 (baseline which includes SSE2). There are no changes in config.h nor in config.make on x86-64. This fixes BZ #31867. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Tested-by: Ian Jordan <immoloism@gmail.com> Reviewed-by: Sam James <sam@gentoo.org> Reviewed-by: Florian Weimer <fweimer@redhat.com>
* x86: Enable non-temporal memset tunable for AMDJoe Damato2024-06-101-4/+4
| In commit 46b5e98ef6f1 ("x86: Add seperate non-temporal tunable for memset") a tunable threshold for enabling non-temporal memset was added, but only for Intel hardware.
|
| Since that commit, new benchmark results suggest that non-temporal memset is beneficial on AMD as well, so allow this tunable to be set for AMD.
|
| See: https://docs.google.com/spreadsheets/d/1opzukzvum4n6-RUVHTGddV6RjAEil4P2uMjjQGLbLcU/edit?usp=sharing which has been updated to include data using different strategies for large memset on AMD Zen2, Zen3, and Zen4.
|
| Signed-off-by: Joe Damato <jdamato@fastly.com>
| Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* hurd: Fix lsetxattr return valueSamuel Thibault2024-06-101-1/+1
| | | | The manpage says that lsetxattr returns 0 on success, like setxattr.
* Linux: Add epoll ioctlsJoe Damato2024-06-043-0/+107
| | | | | | | | | | | | | | | | | As of Linux kernel 6.9, some ioctls and a parameters structure have been introduced which allow user programs to control whether a particular epoll context will busy poll. Update the headers to include these for the convenience of user apps. The ioctls were added in Linux kernel 6.9 commit 18e2bf0edf4dd ("eventpoll: Add epoll ioctl for epoll_params") [1] to include/uapi/linux/eventpoll.h. [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/diff/?h=v6.9&id=18e2bf0edf4dd Signed-off-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* math: Fix exp10 undefined left shiftSzabolcs Nagy2024-06-041-3/+3
| Left shift of ki is undefined when ki<0; copy the logic from exp, which uses unsigned arithmetic, to fix it.
|
| Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
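The underlying C rule is that left-shifting a negative signed integer is undefined, while the same shift on an unsigned value is well defined and wraps modulo 2^64. A minimal sketch of the unsigned approach follows; the shift count of 45 and the name `shift_index_bits` are hypothetical and chosen only for illustration, not taken from the exp10 source.

```c
#include <stdint.h>

#define SKETCH_SHIFT 45  /* hypothetical shift count for illustration */

/* Convert to unsigned before shifting so that negative inputs are
   well defined: the conversion is modulo 2^64 and so is the shift.  */
uint64_t
shift_index_bits (int64_t ki)
{
  return (uint64_t) ki << SKETCH_SHIFT;
}
```

Writing `ki << SKETCH_SHIFT` directly on the signed value would be undefined for any negative `ki`.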
* Add new AArch64 HWCAP2 definitions from Linux 6.9 to bits/hwcap.hJoseph Myers2024-06-041-0/+15
| | | | | | | Linux 6.9 adds 15 new HWCAP2_* values for AArch64; add them to bits/hwcap.h in glibc. Tested with build-many-glibcs.py for aarch64-linux-gnu.
* x86: Add seperate non-temporal tunable for memsetNoah Goldstein2024-05-306-5/+34
| The tuning for non-temporal stores for memset vs memcpy is not always the same. This includes both the exact value and whether non-temporal stores are profitable at all for a given arch.
|
| This patch adds `x86_memset_non_temporal_threshold`. Currently we disable non-temporal stores for non-Intel vendors as the only benchmarks showing its benefit have been on Intel hardware.
|
| Reviewed-by: H.J. Lu <hjl.tools@gmail.com>