about summary refs log tree commit diff
path: root/sysdeps/x86
Commit message (Collapse)AuthorAgeFilesLines
* x86-64: Simplify minimum ISA check ifdef conditional with ifSunil K Pandey2024-03-031-11/+8
| | | | | | | | | Replace minimum ISA check ifdef conditional with if. Since MINIMUM_X86_ISA_LEVEL and AVX_X86_ISA_LEVEL are compile time constants, compiler will perform constant folding optimization, getting same results. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86-64: Update _dl_tlsdesc_dynamic to preserve AMX registersH.J. Lu2024-02-294-5/+71
| | | | | | | | | | | | | | | | _dl_tlsdesc_dynamic should also preserve AMX registers which are caller-saved. Add X86_XSTATE_TILECFG_ID and X86_XSTATE_TILEDATA_ID to x86-64 TLSDESC_CALL_STATE_SAVE_MASK. Compute the AMX state size and save it in xsave_state_full_size which is only used by _dl_tlsdesc_dynamic_xsave and _dl_tlsdesc_dynamic_xsavec. This fixes the AMX part of BZ #31372. Tested on AMX processor. AMX test is enabled only for compilers with the fix for https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114098 GCC 14 and GCC 11/12/13 branches have the bug fix. Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
* x86: Don't check XFD against /proc/cpuinfoH.J. Lu2024-02-281-0/+3
| | | | | Since /proc/cpuinfo doesn't report XFD, don't check it against /proc/cpuinfo.
* x86-64: Don't use SSE resolvers for ISA level 3 or aboveH.J. Lu2024-02-281-6/+11
| | | | | | | | | | | | | | | | When glibc is built with ISA level 3 or above enabled, SSE resolvers aren't available and glibc fails to build: ld: .../elf/librtld.os: in function `init_cpu_features': .../elf/../sysdeps/x86/cpu-features.c:1200:(.text+0x1445f): undefined reference to `_dl_runtime_resolve_fxsave' ld: .../elf/librtld.os: relocation R_X86_64_PC32 against undefined hidden symbol `_dl_runtime_resolve_fxsave' can not be used when making a shared object /usr/local/bin/ld: final link failed: bad value For ISA level 3 or above, don't use _dl_runtime_resolve_fxsave nor _dl_tlsdesc_dynamic_fxsave. This fixes BZ #31429. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* x86: Update _dl_tlsdesc_dynamic to preserve caller-saved registersH.J. Lu2024-02-286-3/+110
| | | | | | | | | | | | | | | | | | | | | | | | | | | Compiler generates the following instruction sequence for GNU2 dynamic TLS access: leaq tls_var@TLSDESC(%rip), %rax call *tls_var@TLSCALL(%rax) or leal tls_var@TLSDESC(%ebx), %eax call *tls_var@TLSCALL(%eax) CALL instruction is transparent to compiler which assumes all registers, except for EFLAGS and RAX/EAX, are unchanged after CALL. When _dl_tlsdesc_dynamic is called, it calls __tls_get_addr on the slow path. __tls_get_addr is a normal function which doesn't preserve any caller-saved registers. _dl_tlsdesc_dynamic saved and restored integer caller-saved registers, but didn't preserve any other caller-saved registers. Add _dl_tlsdesc_dynamic IFUNC functions for FNSAVE, FXSAVE, XSAVE and XSAVEC to save and restore all caller-saved registers. This fixes BZ #31372. Add GLRO(dl_x86_64_runtime_resolve) with GLRO(dl_x86_tlsdesc_dynamic) to optimize elf_machine_runtime_setup. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* x86: Change ENQCMD test to CHECK_FEATURE_PRESENTH.J. Lu2024-02-271-1/+1
| | | | | | Since ENQCMD is mainly used in kernel, change the ENQCMD test to CHECK_FEATURE_PRESENT. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* x86_64: Exclude SSE, AVX and FMA4 variants in libm multiarchSunil K Pandey2024-02-252-0/+57
| | | | | | | | | | | | | | | | | | | When glibc is built with ISA level 3 or higher by default, the resulting glibc binaries won't run on SSE or FMA4 processors. Exclude SSE, AVX and FMA4 variants in libm multiarch when ISA level 3 or higher is enabled by default. When glibc is built with ISA level 2 enabled by default, only keep SSE4.1 variant. Fixes BZ 31335. NB: elf/tst-valgrind-smoke test fails with ISA level 4, because valgrind doesn't support AVX512 instructions: https://bugs.kde.org/show_bug.cgi?id=383010 Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86-64: Save APX registers in ld.so trampolineH.J. Lu2024-02-251-6/+46
| | | | | | | | | Add APX registers to STATE_SAVE_MASK so that APX registers are saved in ld.so trampoline. This fixes BZ #31371. Also update STATE_SAVE_OFFSET and STATE_SAVE_MASK for i386 which will be used by i386 _dl_tlsdesc_dynamic. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* Apply the Makefile sorting fixH.J. Lu2024-02-151-3/+3
| | | | Apply the Makefile sorting fix generated by sort-makefile-lines.py.
* x86: Do not prefer ERMS for memset on Zen3+Adhemerval Zanella2024-02-131-0/+5
| | | | | | | | For AMD Zen3+ architecture, the performance of the vectorized loop is slightly better than ERMS. Checked on x86_64-linux-gnu on Zen3. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86: Fix Zen3/Zen4 ERMS selection (BZ 30994)Adhemerval Zanella2024-02-131-20/+18
| | | | | | | | | | | | | | | | The REP MOVSB usage on memcpy/memmove does not show much performance improvement on Zen3/Zen4 cores compared to the vectorized loops. Also, as from BZ 30994, if the source is aligned and the destination is not the performance can be 20x slower. The performance difference is noticeable with small buffer sizes, closer to the lower bounds limits when memcpy/memmove starts to use ERMS. The performance of REP MOVSB is similar to vectorized instruction on the size limit (the L2 cache). Also, there is no drawback to multiple cores sharing the cache. Checked on x86_64-linux-gnu on Zen3. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* Refer to C23 in place of C2X in glibcJoseph Myers2024-02-011-1/+1
| | | | | | | | | | | | | | | WG14 decided to use the name C23 as the informal name of the next revision of the C standard (notwithstanding the publication date in 2024). Update references to C2X in glibc to use the C23 name. This is intended to update everything *except* where it involves renaming files (the changes involving renaming tests are intended to be done separately). In the case of the _ISOC2X_SOURCE feature test macro - the only user-visible interface involved - support for that macro is kept for backwards compatibility, while adding _ISOC23_SOURCE. Tested for x86_64.
* x86-64: Check if mprotect works before rewriting PLTH.J. Lu2024-01-151-1/+7
| | | | | | | | | | | | | | | | Systemd execution environment configuration may prohibit changing a memory mapping to become executable: MemoryDenyWriteExecute= Takes a boolean argument. If set, attempts to create memory mappings that are writable and executable at the same time, or to change existing memory mappings to become executable, or mapping shared memory segments as executable, are prohibited. When it is set, systemd service stops working if PLT rewrite is enabled. Check if mprotect works before rewriting PLT. This fixes BZ #31230. This also works with SELinux when deny_execmem is on. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* x86-64/cet: Make CET feature check specific to Linux/x86H.J. Lu2024-01-115-33/+35
| | | | | | CET feature bits in TCB, which are Linux specific, are used to check if CET features are active. Move CET feature check to Linux/x86 directory. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* i386: Remove CET support bitsH.J. Lu2024-01-105-135/+3
| | | | | | | | | | | | 1. Remove _dl_runtime_resolve_shstk and _dl_runtime_profile_shstk. 2. Move CET offsets from x86 cpu-features-offsets.sym to x86-64 features-offsets.sym. 3. Rename x86 cet-control.h to x86-64 feature-control.h since it is only for x86-64 and also used for PLT rewrite. 4. Add x86-64 ldsodefs.h to include feature-control.h. 5. Change TUNABLE_CALLBACK (set_plt_rewrite) to x86-64 only. 6. Move x86 dl-procruntime.c to x86-64. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* x86-64/cet: Move check-cet.awk to x86_64H.J. Lu2024-01-101-53/+0
| | | | Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* x86-64/cet: Move dl-cet.[ch] to x86_64 directoriesH.J. Lu2024-01-101-364/+0
| | | | | | Since CET is only enabled for x86-64, move dl-cet.[ch] to x86_64 directories. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* x86: Move x86-64 shadow stack startup codesH.J. Lu2024-01-101-74/+0
| | | | | | Move sysdeps/x86/libc-start.h to sysdeps/x86_64/libc-start.h and use sysdeps/generic/libc-start.h for i386. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* i386: Remove CET supportAdhemerval Zanella2024-01-091-44/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | CET is only support for x86_64, this patch reverts: - faaee1f07ed x86: Support shadow stack pointer in setjmp/longjmp. - be9ccd27c09 i386: Add _CET_ENDBR to indirect jump targets in add_n.S/sub_n.S - c02695d7764 x86/CET: Update vfork to prevent child return - 5d844e1b725 i386: Enable CET support in ucontext functions - 124bcde683 x86: Add _CET_ENDBR to functions in crti.S - 562837c002 x86: Add _CET_ENDBR to functions in dl-tlsdesc.S - f753fa7dea x86: Support IBT and SHSTK in Intel CET [BZ #21598] - 825b58f3fb i386-mcount.S: Add _CET_ENDBR to _mcount and __fentry__ - 7e119cd582 i386: Use _CET_NOTRACK in i686/memcmp.S - 177824e232 i386: Use _CET_NOTRACK in memcmp-sse4.S - 0a899af097 i386: Use _CET_NOTRACK in memcpy-ssse3-rep.S - 7fb613361c i386: Use _CET_NOTRACK in memcpy-ssse3.S - 77a8ae0948 i386: Use _CET_NOTRACK in memset-sse2-rep.S - 00e7b76a8f i386: Use _CET_NOTRACK in memset-sse2.S - 90d15dc577 i386: Use _CET_NOTRACK in strcat-sse2.S - f1574581c7 i386: Use _CET_NOTRACK in strcpy-sse2.S - 4031d7484a i386/sub_n.S: Add a missing _CET_ENDBR to indirect jump - target - Checked on i686-linux-gnu.
* x86: Move CET infrastructure to x86_64Adhemerval Zanella2024-01-0953-1501/+0
| | | | | | | | The CET is only supported for x86_64 and there is no plan to add kernel support for i386. Move the Makefile rules and files from the generic x86 folder to x86_64 one. Checked on x86_64-linux-gnu and i686-linux-gnu.
* Remove ia64-linux-gnuAdhemerval Zanella2024-01-081-13/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Linux 6.7 removed ia64 from the official tree [1], following the general principle that a glibc port needs upstream support for the architecture in all the components it depends on (binutils, GCC, and the Linux kernel). Apart from the removal of sysdeps/ia64 and sysdeps/unix/sysv/linux/ia64, there are updates to various comments referencing ia64 for which removal of those references seemed appropriate. The configuration is removed from README and build-many-glibcs.py. The CONTRIBUTED-BY, elf/elf.h, manual/contrib.texi (the porting mention), *.po files, config.guess, and longlong.h are not changed. For Linux it allows cleanup some clone2 support on multiple files. The following bug can be closed as WONTFIX: BZ 22634 [2], BZ 14250 [3], BZ 21634 [4], BZ 10163 [5], BZ 16401 [6], and BZ 11585 [7]. [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=43ff221426d33db909f7159fdf620c3b052e2d1c [2] https://sourceware.org/bugzilla/show_bug.cgi?id=22634 [3] https://sourceware.org/bugzilla/show_bug.cgi?id=14250 [4] https://sourceware.org/bugzilla/show_bug.cgi?id=21634 [5] https://sourceware.org/bugzilla/show_bug.cgi?id=10163 [6] https://sourceware.org/bugzilla/show_bug.cgi?id=16401 [7] https://sourceware.org/bugzilla/show_bug.cgi?id=11585 Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* elf: Add ELF_DYNAMIC_AFTER_RELOC to rewrite PLTH.J. Lu2024-01-054-1/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add ELF_DYNAMIC_AFTER_RELOC to allow target specific processing after relocation. For x86-64, add #define DT_X86_64_PLT (DT_LOPROC + 0) #define DT_X86_64_PLTSZ (DT_LOPROC + 1) #define DT_X86_64_PLTENT (DT_LOPROC + 3) 1. DT_X86_64_PLT: The address of the procedure linkage table. 2. DT_X86_64_PLTSZ: The total size, in bytes, of the procedure linkage table. 3. DT_X86_64_PLTENT: The size, in bytes, of a procedure linkage table entry. With the r_addend field of the R_X86_64_JUMP_SLOT relocation set to the memory offset of the indirect branch instruction. Define ELF_DYNAMIC_AFTER_RELOC for x86-64 to rewrite the PLT section with direct branch after relocation when the lazy binding is disabled. PLT rewrite is disabled by default since SELinux may disallow modifying code pages and ld.so can't detect it in all cases. Use $ export GLIBC_TUNABLES=glibc.cpu.plt_rewrite=1 to enable PLT rewrite with 32-bit direct jump at run-time or $ export GLIBC_TUNABLES=glibc.cpu.plt_rewrite=2 to enable PLT rewrite with 32-bit direct jump and on APX processors with 64-bit absolute jump at run-time. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* x86-64/cet: Check the restore token in longjmpH.J. Lu2024-01-041-0/+3
| | | | | | | | | | | | | | | | | | | | | | setcontext and swapcontext put a restore token on the old shadow stack which is used to restore the target shadow stack when switching user contexts. When longjmp from a user context, the target shadow stack can be different from the current shadow stack and INCSSP can't be used to restore the shadow stack pointer to the target shadow stack. Update longjmp to search for a restore token. If found, use the token to restore the shadow stack pointer before using INCSSP to pop the shadow stack. Stop the token search and use INCSSP if the shadow stack entry value is the same as the current shadow stack pointer. It is a user error if there is a shadow stack switch without leaving a restore token on the old shadow stack. The only difference between __longjmp.S and __longjmp_chk.S is that __longjmp_chk.S has a check for invalid longjmp usages. Merge __longjmp.S and __longjmp_chk.S by adding the CHECK_INVALID_LONGJMP macro. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* i386: Ignore --enable-cetH.J. Lu2024-01-042-113/+0
| | | | | | | | | | | | | | Since shadow stack is only supported for x86-64, ignore --enable-cet for i386. Always setting $(enable-cet) for i386 to "no" to support ifneq ($(enable-cet),no) in x86 Makefiles. We can't use ifeq ($(enable-cet),yes) since $(enable-cet) can be "yes", "no" or "permissive". Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* x86/cet: Add -fcf-protection=none before -fcf-protection=branchH.J. Lu2024-01-011-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When shadow stack is enabled, some CET tests failed when compiled with GCC 14: FAIL: elf/tst-cet-legacy-4 FAIL: elf/tst-cet-legacy-5a FAIL: elf/tst-cet-legacy-6a which are caused by https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113039 These tests use -fcf-protection -fcf-protection=branch and assume that -fcf-protection=branch will override -fcf-protection. But this GCC 14 commit: https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=1c6231c05bdcca changed the -fcf-protection behavior such that -fcf-protection -fcf-protection=branch is treated the same as -fcf-protection Use -fcf-protection -fcf-protection=none -fcf-protection=branch as the workaround. This fixes BZ #31187. Tested with GCC 13 and GCC 14 on Intel Tiger Lake. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* Update copyright dates with scripts/update-copyrightsPaul Eggert2024-01-01136-136/+136
|
* x86/cet: Run some CET tests with shadow stackH.J. Lu2024-01-014-0/+17
| | | | | | | When CET is disabled by default, run some CET tests with shadow stack enabled using $ export GLIBC_TUNABLES=glibc.cpu.hwcaps=SHSTK
* x86/cet: Don't set CET active by defaultH.J. Lu2024-01-012-2/+15
| | | | | | | | | | | | | | | Not all CET enabled applications and libraries have been properly tested in CET enabled environments. Some CET enabled applications or libraries will crash or misbehave when CET is enabled. Don't set CET active by default so that all applications and libraries will run normally regardless of whether CET is active or not. Shadow stack can be enabled by $ export GLIBC_TUNABLES=glibc.cpu.hwcaps=SHSTK at run-time if shadow stack can be enabled by kernel. NB: This commit can be reverted if it is OK to enable CET by default for all applications and libraries.
* x86/cet: Check feature_1 in TCB for active IBT and SHSTKH.J. Lu2024-01-013-1/+35
| | | | | | | | | Initially, IBT and SHSTK are marked as active when CPU supports them and CET are enabled in glibc. They can be disabled early by tunables before relocation. Since after relocation, GLRO(dl_x86_cpu_features) becomes read-only, we can't update GLRO(dl_x86_cpu_features) to mark IBT and SHSTK as inactive. Instead, check the feature_1 field in TCB to decide if IBT and SHST are active.
* x86/cet: Enable shadow stack during startupH.J. Lu2024-01-016-93/+93
| | | | | | | | | | | | | | | | | | | | | | | Previously, CET was enabled by kernel before passing control to user space and the startup code must disable CET if applications or shared libraries aren't CET enabled. Since the current kernel only supports shadow stack and won't enable shadow stack before passing control to user space, we need to enable shadow stack during startup if the application and all shared library are shadow stack enabled. There is no need to disable shadow stack at startup. Shadow stack can only be enabled in a function which will never return. Otherwise, shadow stack will underflow at the function return. 1. GL(dl_x86_feature_1) is set to the CET features which are supported by the processor and are not disabled by the tunable. Only non-zero features in GL(dl_x86_feature_1) should be enabled. After enabling shadow stack with ARCH_SHSTK_ENABLE, ARCH_SHSTK_STATUS is used to check if shadow stack is really enabled. 2. Use ARCH_SHSTK_ENABLE in RTLD_START in dynamic executable. It is safe since RTLD_START never returns. 3. Call arch_prctl (ARCH_SHSTK_ENABLE) from ARCH_SETUP_TLS in static executable. Since the start function using ARCH_SETUP_TLS never returns, it is safe to enable shadow stack in ARCH_SETUP_TLS.
* x86/cet: Sync with Linux kernel 6.6 shadow stack interfaceH.J. Lu2024-01-012-6/+11
| | | | | | | | | | | | | | | | | | | | | | | Sync with Linux kernel 6.6 shadow stack interface. Since only x86-64 is supported, i386 shadow stack codes are unchanged and CET shouldn't be enabled for i386. 1. When the shadow stack base in TCB is unset, the default shadow stack is in use. Use the current shadow stack pointer as the marker for the default shadow stack. It is used to identify if the current shadow stack is the same as the target shadow stack when switching ucontexts. If yes, INCSSP will be used to unwind shadow stack. Otherwise, shadow stack restore token will be used. 2. Allocate shadow stack with the map_shadow_stack syscall. Since there is no function to explicitly release ucontext, there is no place to release shadow stack allocated by map_shadow_stack in ucontext functions. Such shadow stacks will be leaked. 3. Rename arch_prctl CET commands to ARCH_SHSTK_XXX. 4. Rewrite the CET control functions with the current kernel shadow stack interface. Since CET is no longer enabled by kernel, a separate patch will enable shadow stack during startup.
* x86/cet: Don't disable CET if not single threadedH.J. Lu2023-12-201-2/+9
| | | | | | | | In permissive mode, don't disable IBT nor SHSTK when dlopening a legacy shared library if not single threaded since IBT and SHSTK may be still enabled in other threads. Other threads with IBT or SHSTK enabled will crash when calling functions in the legacy shared library. Instead, an error will be issued.
* x86: Modularize sysdeps/x86/dl-cet.cH.J. Lu2023-12-201-176/+280
| | | | | | | | | | | Improve readability and make maintenance easier for dl-feature.c by modularizing sysdeps/x86/dl-cet.c: 1. Support processors with: a. Only IBT. Or b. Only SHSTK. Or c. Both IBT and SHSTK. 2. Lock CET features only if IBT or SHSTK are enabled and are not enabled permissively.
* Fix elf: Do not duplicate the GLIBC_TUNABLES stringH.J. Lu2023-12-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | commit 2a969b53c0b02fed7e43473a92f219d737fd217a Author: Adhemerval Zanella <adhemerval.zanella@linaro.org> Date: Wed Dec 6 10:24:01 2023 -0300 elf: Do not duplicate the GLIBC_TUNABLES string has @@ -38,7 +39,7 @@ which isn't available. */ #define CHECK_GLIBC_IFUNC_PREFERRED_OFF(f, cpu_features, name, len) \ _Static_assert (sizeof (#name) - 1 == len, #name " != " #len); \ - if (memcmp (f, #name, len) == 0) \ + if (tunable_str_comma_strcmp_cte (&f, #name) == 0) \ { \ cpu_features->preferred[index_arch_##name] \ &= ~bit_arch_##name; \ @@ -46,12 +47,11 @@ Fix it by removing "== 0" after tunable_str_comma_strcmp_cte.
* Fix elf: Do not duplicate the GLIBC_TUNABLES stringH.J. Lu2023-12-191-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix issues in sysdeps/x86/tst-hwcap-tunables.c added by Author: Adhemerval Zanella <adhemerval.zanella@linaro.org> Date: Wed Dec 6 10:24:01 2023 -0300 elf: Do not duplicate the GLIBC_TUNABLES string 1. -AVX,-AVX2,-AVX512F should be used to disable AVX, AVX2 and AVX512. 2. AVX512 IFUNC functions check AVX512VL. -AVX512VL should be added to disable these functions. This fixed: FAIL: elf/tst-hwcap-tunables ... [0] Spawned test for -Prefer_ERMS,-Prefer_FSRM,-AVX,-AVX2,-AVX_Usable,-AVX2_Usable,-AVX512F_Usable,-SSE4_1,-SSE4_2,-SSSE3,-Fast_Unaligned_Load,-ERMS,-AVX_Fast_Unaligned_Load error: subprocess failed: tst-tunables error: unexpected output from subprocess ../sysdeps/x86/tst-hwcap-tunables.c:91: numeric comparison failure left: 1 (0x1); from: impls[i].usable right: 0 (0x0); from: false ../sysdeps/x86/tst-hwcap-tunables.c:91: numeric comparison failure left: 1 (0x1); from: impls[i].usable right: 0 (0x0); from: false ../sysdeps/x86/tst-hwcap-tunables.c:91: numeric comparison failure left: 1 (0x1); from: impls[i].usable right: 0 (0x0); from: false ../sysdeps/x86/tst-hwcap-tunables.c:91: numeric comparison failure left: 1 (0x1); from: impls[i].usable right: 0 (0x0); from: false ../sysdeps/x86/tst-hwcap-tunables.c:91: numeric comparison failure left: 1 (0x1); from: impls[i].usable right: 0 (0x0); from: false [1] Spawned test for ,-,-Prefer_ERMS,-Prefer_FSRM,-AVX,-AVX2,-AVX_Usable,-AVX2_Usable,-AVX512F_Usable,-SSE4_1,-SSE4_2,,-SSSE3,-Fast_Unaligned_Load,,-,-ERMS,-AVX_Fast_Unaligned_Load,-, error: subprocess failed: tst-tunables error: unexpected output from subprocess ../sysdeps/x86/tst-hwcap-tunables.c:91: numeric comparison failure left: 1 (0x1); from: impls[i].usable right: 0 (0x0); from: false ../sysdeps/x86/tst-hwcap-tunables.c:91: numeric comparison failure left: 1 (0x1); from: impls[i].usable right: 0 (0x0); from: false ../sysdeps/x86/tst-hwcap-tunables.c:91: numeric comparison failure left: 1 (0x1); from: impls[i].usable right: 0 (0x0); from: false ../sysdeps/x86/tst-hwcap-tunables.c:91: numeric comparison failure left: 1 (0x1); from: impls[i].usable right: 0 (0x0); from: false ../sysdeps/x86/tst-hwcap-tunables.c:91: numeric comparison failure left: 1 (0x1); from: impls[i].usable right: 0 (0x0); from: false error: 2 test failures on Intel Tiger Lake.
* i686: Do not raise exception traps on fesetexcept (BZ 30989)Adhemerval Zanella2023-12-191-18/+5
| | | | | | | | | | | | | | | | According to ISO C23 (7.6.4.4), fesetexcept is supposed to set floating-point exception flags without raising a trap (unlike feraiseexcept, which is supposed to raise a trap if feenableexcept was called with the appropriate argument). The flags can be set in the 387 unit or in the SSE unit. To set a flag, it is sufficient to do it in the SSE unit, because that is guaranteed to not trap. However, on i386 CPUs that have only a 387 unit, set the flags in the 387, as long as this cannot trap. Checked on i686-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* elf: Do not duplicate the GLIBC_TUNABLES stringAdhemerval Zanella2023-12-193-75/+193
| | | | | | | | | | | | | | | | | | | | | The tunable parsing duplicates the tunable environment variable so it null-terminates each one since it simplifies the later parsing. It has the drawback of adding another point of failure (__minimal_malloc failing), and the memory copy requires tuning the compiler to avoid mem operations calls. The parsing now tracks the tunable start and its size. The dl-tunable-parse.h adds helper functions to help parsing, like a strcmp that also checks for size and an iterator for suboptions that are comma-separated (used on hwcap parsing by x86, powerpc, and s390x). Since the environment variable is allocated on the stack by the kernel, it is safe to keep the references to the suboptions for later parsing of string tunables (as done by set_hwcaps by multiple architectures). Checked on x86_64-linux-gnu, powerpc64le-linux-gnu, and aarch64-linux-gnu. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* x86/cet: Check CPU_FEATURE_ACTIVE in permissive modeH.J. Lu2023-12-192-0/+6
| | | | Verify that CPU_FEATURE_ACTIVE works properly in permissive mode.
* x86/cet: Check legacy shadow stack code in .init_array sectionH.J. Lu2023-12-1911-0/+330
| | | | | | Verify that legacy shadow stack code in .init_array section in application and shared library, which are marked as shadow stack enabled, will trigger segfault.
* x86/cet: Add tests for GLIBC_TUNABLES=glibc.cpu.hwcaps=-SHSTKH.J. Lu2023-12-193-0/+28
| | | | | Verify that GLIBC_TUNABLES=glibc.cpu.hwcaps=-SHSTK turns off shadow stack properly.
* x86/cet: Check CPU_FEATURE_ACTIVE when CET is disabledH.J. Lu2023-12-193-0/+9
| | | | | Verify that CPU_FEATURE_ACTIVE (SHSTK) works properly when CET is disabled.
* x86/cet: Check legacy shadow stack applicationsH.J. Lu2023-12-196-0/+130
| | | | | Add tests to verify that legacy shadow stack applications run properly when shadow stack is enabled in Linux kernel.
* x86/cet: Don't assume that SHSTK implies IBTH.J. Lu2023-12-183-11/+11
| | | | | | | Since shadow stack (SHSTK) is enabled in the Linux kernel without enabling indirect branch tracking (IBT), don't assume that SHSTK implies IBT. Use "CPU_FEATURE_ACTIVE (IBT)" to check if IBT is active and "CPU_FEATURE_ACTIVE (SHSTK)" to check if SHSTK is active.
* x86/cet: Check user_shstk in /proc/cpuinfoH.J. Lu2023-12-171-1/+1
| | | | | Linux kernel reports CPU shadow stack feature in /proc/cpuinfo as user_shstk, instead of shstk.
* x86: Check PT_GNU_PROPERTY earlyH.J. Lu2023-12-111-40/+80
| | | | | | | The PT_GNU_PROPERTY segment is scanned before PT_NOTE. For binaries with the PT_GNU_PROPERTY segment, we can check it to avoid scan of the PT_NOTE segment. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* sysdeps/x86/Makefile: Split and sort testsH.J. Lu2023-12-111-32/+78
| | | | Put each test on a separate line and sort tests.
* x86: Use dl-symbol-redir-ifunc.h on cpu-tunablesAdhemerval Zanella2023-11-211-27/+12
| | | | | | | | | | | | | | | | The dl-symbol-redir-ifunc.h redirects compiler-generated libcalls to arch-specific memory implementations to avoid ifunc calls where it is not yet possible. The memcmp-isa-default-impl.h aims to fix the same issue by calling the specific memset implementation directly. Using the memcmp symbol directly allows the compiler to inline the memset calls (especially because _dl_tunable_set_hwcaps uses constants values), generating better code. Checked on x86_64-linux-gnu. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com> Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* x86: Add support for AVX10 preset and vec size in cpu-featuresNoah Goldstein2023-09-294-3/+71
| | | | | | | | | | | | This commit add support for the new AVX10 cpu features: https://cdrdv2-public.intel.com/784267/355989-intel-avx10-spec.pdf We add checks for: - `AVX10`: Check if AVX10 is present. - `AVX10_{X,Y,Z}MM`: Check if a given vec class has AVX10 support. `make check` passes and cpuid output was checked against GNR/DMR on an emulator.
* x86: Check the lower byte of EAX of CPUID leaf 2 [BZ #30643]H.J. Lu2023-08-291-18/+13
| | | | | | | | | | | The old Intel software developer manual specified that the low byte of EAX of CPUID leaf 2 returned 1 which indicated the number of rounds of CPUDID leaf 2 was needed to retrieve the complete cache information. The newer Intel manual has been changed to that it should always return 1 and be ignored. If the lower byte isn't 1, CPUID leaf 2 can't be used. In this case, we ignore CPUID leaf 2 and use CPUID leaf 4 instead. If CPUID leaf 4 doesn't contain the cache information, cache information isn't available at all. This addresses BZ #30643.
* x86: Fix incorrect scope of setting `shared_per_thread` [BZ# 30745]Noah Goldstein2023-08-111-4/+3
| | | | | | | | | | | | | | The: ``` if (shared_per_thread > 0 && threads > 0) shared_per_thread /= threads; ``` Code was accidentally moved to inside the else scope. This doesn't match how it was previously (before af992e7abd). This patch fixes that by putting the division after the `else` block.