path: root/sysdeps/x86
Commit message | Author | Age | Files | Lines
...
* <sys/platform/x86.h>: Add RTM_FORCE_ABORT support | H.J. Lu | 2023-04-05 | 2 | -1/+2
    Add RTM_FORCE_ABORT support to <sys/platform/x86.h>.
    Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* <sys/platform/x86.h>: Add SGX-KEYS support | H.J. Lu | 2023-04-05 | 2 | -1/+2
    Add SGX-KEYS support to <sys/platform/x86.h>.
    Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* <sys/platform/x86.h>: Add BUS_LOCK_DETECT support | H.J. Lu | 2023-04-05 | 2 | -1/+2
    Add Bus lock debug exceptions (BUS_LOCK_DETECT) support to
    <sys/platform/x86.h>.
    Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* <sys/platform/x86.h>: Add LA57 support | H.J. Lu | 2023-04-05 | 2 | -1/+2
    Add 57-bit linear addresses and five-level paging (LA57) support to
    <sys/platform/x86.h>.
    Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
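    A minimal sketch (not part of these commits) of how an application might
    query the newly added bits through the public <sys/platform/x86.h>
    interface; the feature-name tokens (LA57, RTM_FORCE_ABORT) are assumed to
    match the macros these commits add:

        /* Query CPUID-derived feature bits via glibc's x86 platform header.  */
        #include <stdio.h>
        #include <sys/platform/x86.h>

        int
        main (void)
        {
          if (CPU_FEATURE_PRESENT (LA57))
            puts ("LA57 (57-bit linear addresses / 5-level paging) reported by CPUID");
          if (CPU_FEATURE_PRESENT (RTM_FORCE_ABORT))
            puts ("RTM_FORCE_ABORT reported by CPUID");
          return 0;
        }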
* <bits/platform/x86.h>: Rename to x86_cpu_INDEX_7_ECX_15 | H.J. Lu | 2023-04-05 | 1 | -1/+1
    Rename x86_cpu_INDEX_7_ECX_1 to x86_cpu_INDEX_7_ECX_15 for the unused
    bit 15 in ECX from CPUID with EAX == 0x7 and ECX == 0.
    Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* x86/dl-cacheinfo: remove unused parameter from handle_amd | Andreas Schwab | 2023-04-04 | 1 | -36/+30
    Also replace an unreachable assert with __builtin_unreachable.
* x86: Set FSGSBASE to active if enabled by kernel | H.J. Lu | 2023-04-03 | 3 | -0/+31
    The Linux kernel uses AT_HWCAP2 to indicate whether FSGSBASE instructions
    are enabled.  If the HWCAP2_FSGSBASE bit in AT_HWCAP2 is set, FSGSBASE
    instructions can be used in user space.  Define dl_check_hwcap2 to set the
    FSGSBASE feature to active on Linux when the HWCAP2_FSGSBASE bit is set.

    Add a test to verify that FSGSBASE is active on current kernels.  NB: This
    test will fail if the kernel doesn't set the HWCAP2_FSGSBASE bit in
    AT_HWCAP2 while fsgsbase shows up in /proc/cpuinfo.

    Reviewed-by: Florian Weimer <fweimer@redhat.com>
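    A hedged sketch of how the same information is visible to user code: the
    kernel advertises FSGSBASE through AT_HWCAP2, and glibc turns that into an
    "active" feature.  The HWCAP2_FSGSBASE fallback value below is an
    assumption matching the kernel's uapi definition:

        #include <stdio.h>
        #include <sys/auxv.h>
        #include <sys/platform/x86.h>

        #ifndef HWCAP2_FSGSBASE
        # define HWCAP2_FSGSBASE (1 << 1)   /* as defined in <asm/hwcap2.h> */
        #endif

        int
        main (void)
        {
          unsigned long hwcap2 = getauxval (AT_HWCAP2);
          printf ("AT_HWCAP2 reports FSGSBASE: %d\n",
                  (hwcap2 & HWCAP2_FSGSBASE) != 0);
          printf ("glibc reports FSGSBASE active: %d\n",
                  CPU_FEATURE_ACTIVE (FSGSBASE) != 0);
          return 0;
        }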
* Remove --enable-tunables configure option | Adhemerval Zanella Netto | 2023-03-29 | 5 | -68/+29
    And make tunables always supported.  The configure option was added in
    glibc 2.25, and some features require it (such as the hwcap mask, huge
    pages support, and lock elision tuning).  Removing it also simplifies the
    build permutations.

    Changes from v1:
      * Remove the glibc.rtld.dynamic_sort changes; they are orthogonal and
        need more discussion.
      * Clean up more code.

    Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* x86: Don't check PREFETCHWT1 in tst-cpu-features-cpuinfo.c | DJ Delorie | 2023-03-21 | 1 | -0/+3
    Don't check PREFETCHWT1 against /proc/cpuinfo, since the kernel doesn't
    report PREFETCHWT1 in /proc/cpuinfo.
    Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* x86: Fix bug about glibc.cpu.hwcaps. | caiyinyu | 2023-03-07 | 1 | -3/+3
    Recorded in [BZ #30183]:

    1. export GLIBC_TUNABLES=glibc.cpu.hwcaps=-AVX512
    2. Add _dl_printf("p -- %s\n", p); just before switch (nl) in
       sysdeps/x86/cpu-tunables.c.
    3. Compile and run ./testrun.sh /usr/bin/ls; you will get:
         p -- -AVX512
         p -- LC_ADDRESS=en_US.UTF-8
         p -- LC_NUMERIC=C
         ...

    The function TUNABLE_CALLBACK (set_hwcaps) (tunable_val_t *valp) checks
    far more than it should; it should stop at the end of "-AVX512".
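    A minimal sketch of the idea behind the fix (not the actual
    cpu-tunables.c code): each comma-separated token in glibc.cpu.hwcaps has a
    known length, and the name comparison has to be bounded by that length
    instead of scanning past the end of the tunable value:

        #include <stdbool.h>
        #include <string.h>

        /* Return true if TOKEN (TOKEN_LEN bytes, optionally prefixed with '-'
           to disable the feature) names exactly NAME.  */
        static bool
        hwcap_token_matches (const char *token, size_t token_len, const char *name)
        {
          if (token_len > 0 && token[0] == '-')
            {
              ++token;
              --token_len;
            }
          return token_len == strlen (name)
                 && memcmp (token, name, token_len) == 0;
        }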
* x86-64: Add glibc.cpu.prefer_map_32bit_exec [BZ #28656] | H.J. Lu | 2023-02-22 | 2 | -0/+16
    Crossing 2GB boundaries with indirect calls and jumps can use more branch
    prediction resources on Intel Golden Cove CPUs (see the "Misprediction for
    Branches >2GB" section in the Intel 64 and IA-32 Architectures
    Optimization Reference Manual).  There is a visible performance
    improvement on workloads with many PLT calls when executables and shared
    libraries are mmapped below 2GB.  Add the Prefer_MAP_32BIT_EXEC bit so
    that mmap will try to map executable or denywrite pages in shared
    libraries with MAP_32BIT first.

    NB: Prefer_MAP_32BIT_EXEC reduces the bits available for address space
    layout randomization (ASLR); it is always disabled for SUID programs and
    can only be enabled via the tunable glibc.cpu.prefer_map_32bit_exec or the
    environment variable LD_PREFER_MAP_32BIT_EXEC.  This works only between
    shared libraries, or between shared libraries and executables with
    addresses below 2GB.  PIEs are usually loaded at a random address above
    4GB by the kernel.
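    A sketch of what Prefer_MAP_32BIT_EXEC asks of mmap (illustrative only;
    the real logic lives in the loader's mmap path, and the helper name here
    is made up): try MAP_32BIT first so the mapping lands below 2GB, and fall
    back to a normal mapping when that fails:

        #define _GNU_SOURCE
        #include <sys/mman.h>
        #include <sys/types.h>

        static void *
        map_exec_prefer_low (size_t len, int fd, off_t offset)
        {
          void *p = mmap (NULL, len, PROT_READ | PROT_EXEC,
                          MAP_PRIVATE | MAP_32BIT, fd, offset);
          if (p == MAP_FAILED)   /* No room below 2GB: take the default placement.  */
            p = mmap (NULL, len, PROT_READ | PROT_EXEC, MAP_PRIVATE, fd, offset);
          return p;
        }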
* string: Remove string_private.h | Adhemerval Zanella | 2023-02-17 | 1 | -20/+0
    Now that _STRING_ARCH_unaligned is not used anymore.
    Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
* htl: Generalize i386 pt-machdep.h to x86 | Samuel Thibault | 2023-02-12 | 1 | -0/+28
* x86: Cache computation for AMD architecture. | Sajan Karumanchi | 2023-01-18 | 1 | -159/+45
    Cache details for all AMD architectures will be computed based on
    __cpuid__ `0x8000_001D`, and the reference to __cpuid__ `0x8000_0006` will
    be zeroed out for future architectures.
    Reviewed-by: Premachandra Mallappa <premachandra.mallappa@amd.com>
* Update copyright dates with scripts/update-copyrights | Joseph Myers | 2023-01-06 | 121 | -121/+121
* x86: Check minimum/maximum of non_temporal_threshold [BZ #29953] | H.J. Lu | 2023-01-03 | 1 | -9/+16
    The minimum non_temporal_threshold is 0x4040.  non_temporal_threshold may
    be set to less than the minimum value when the shared cache size isn't
    available (e.g., in an emulator) or by the tunable.  Add checks for
    minimum and maximum of non_temporal_threshold.

    This fixes BZ #29953.
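    A small sketch of the clamping described here, using the bounds quoted in
    this entry and in the "Add bounds" entry further down (0x4040 minimum,
    SIZE_MAX / 16 maximum); the real code lives in sysdeps/x86/dl-cacheinfo.h:

        #include <stdint.h>

        static unsigned long
        clamp_non_temporal_threshold (unsigned long value)
        {
          const unsigned long minimum = 0x4040;       /* 16448 bytes.  */
          const unsigned long maximum = SIZE_MAX / 16;
          if (value < minimum)
            return minimum;
          if (value > maximum)
            return maximum;
          return value;
        }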
* x86_64: State assembler is being tested on sysdeps/x86/configure | Adhemerval Zanella | 2022-12-06 | 2 | -3/+3
* configure: Remove AS check | Adhemerval Zanella | 2022-12-06 | 2 | -3/+3
    The assembler is never called directly, but rather always through the CC
    wrapper.  The binutils version check is done with LD instead.
    Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* elf: Remove _dl_string_hwcap | Javier Pello | 2022-10-06 | 1 | -14/+0
    Removal of legacy hwcaps support from the dynamic loader left no users of
    _dl_string_hwcap.
    Signed-off-by: Javier Pello <devel@otheo.eu>
    Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* x86-64: Require BMI1/BMI2 for AVX2 strrchr and wcsrchr implementations | Aurelien Jarno | 2022-10-03 | 1 | -0/+1
    The AVX2 strrchr and wcsrchr implementations use the 'blsmsk' instruction,
    which belongs to the BMI1 CPU feature, and the 'shrx' instruction, which
    belongs to the BMI2 CPU feature.

    Fixes: df7e295d18ff ("x86: Optimize {str|wcs}rchr-avx2")
    Partially resolves: BZ #29611
    Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
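    A sketch of the prerequisite set this change enforces, expressed with the
    public <sys/platform/x86.h> macros; the in-tree ifunc selector performs
    the equivalent check through its internal cpu_features interface:

        #include <stdbool.h>
        #include <sys/platform/x86.h>

        static bool
        avx2_strrchr_prereqs_ok (void)
        {
          return CPU_FEATURE_ACTIVE (AVX2)
                 && CPU_FEATURE_ACTIVE (BMI1)   /* blsmsk */
                 && CPU_FEATURE_ACTIVE (BMI2);  /* shrx */
        }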
* x86-64: Require BMI2 and LZCNT for AVX2 memrchr implementation | Aurelien Jarno | 2022-10-03 | 1 | -0/+1
    The AVX2 memrchr implementation uses the 'shlxl' instruction, which
    belongs to the BMI2 CPU feature, and the 'lzcnt' instruction, which
    belongs to the LZCNT CPU feature.

    Fixes: af5306a735eb ("x86: Optimize memrchr-avx2.S")
    Partially resolves: BZ #29611
    Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* x86: include BMI1 and BMI2 in x86-64-v3 level | Aurelien Jarno | 2022-10-03 | 1 | -0/+2
    The "System V Application Binary Interface AMD64 Architecture Processor
    Supplement" mandates the BMI1 and BMI2 CPU features for the x86-64-v3
    level.
    Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* Update _FloatN header support for C++ in GCC 13 | Joseph Myers | 2022-09-28 | 1 | -3/+3
    GCC 13 adds support for the _FloatN and _FloatNx types in C++, thereby
    breaking the installed glibc headers that assume such support is not
    present.  GCC mostly works around this with fixincludes, but that doesn't
    help for building glibc and its tests (glibc doesn't itself contain C++
    code, but there's C++ code built for tests).  Update glibc's
    bits/floatn-common.h and bits/floatn.h headers to handle the GCC 13
    support directly.

    In general the changes match those made by fixincludes, though I think
    the ones in sysdeps/powerpc/bits/floatn.h, where the header tests
    __LDBL_MANT_DIG__ == 113 or uses #elif, wouldn't match the existing
    fixincludes patterns.

    Some places involving special C++ handling in relation to _FloatN support
    are not changed.  There's no need to change the __HAVE_FLOATN_NOT_TYPEDEF
    definition (also in a form that wouldn't be matched by the fixincludes
    fixes) because it's only used in relation to macro definitions using
    features not supported for C++ (__builtin_types_compatible_p and
    _Generic).  And there's no need to change the inline function overloads
    for issignaling, iszero and iscanonical in C++ because cases where types
    have the same format but are no longer compatible types are handled
    automatically by the C++ overload resolution rules.

    This patch also does not change the overload handling for iseqsig, and
    there I think changes *are* needed, beyond those in this patch or made by
    fixincludes.  The way that overload is defined, via a template parameter
    to a structure type, requires overloads whenever the types are
    incompatible, even if they have the same format.  So I think we need to
    add overloads with GCC 13 for every supported _FloatN and _FloatNx type,
    rather than just having one for _Float128 when it has a different ABI to
    long double as at present (but for older GCC, such overloads must not be
    defined for types that end up defined as typedefs for another type).

    Tested with build-many-glibcs.py: compilers build for aarch64-linux-gnu
    ia64-linux-gnu mips64-linux-gnu powerpc-linux-gnu powerpc64le-linux-gnu
    x86_64-linux-gnu; glibcs build for aarch64-linux-gnu ia64-linux-gnu
    i686-linux-gnu mips-linux-gnu mips64-linux-gnu-n32 powerpc-linux-gnu
    powerpc64le-linux-gnu x86_64-linux-gnu.
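    A rough sketch of the kind of conditional involved (not the literal
    header text, and the macro spelling is an assumption): before GCC 13, C++
    has no f128 literal suffix, so the float128 literal macro has to fall
    back to the 'q' suffix; from GCC 13 on, the C suffix can be used for C++
    as well:

        #include <features.h>

        #if !__GNUC_PREREQ (13, 0) && defined __cplusplus
        # define __f128(x) x##q      /* pre-GCC-13 C++: no f128 suffix */
        #else
        # define __f128(x) x##f128
        #endif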
* math: x86: Use prefix for FP_INIT_ROUNDMODE | Adhemerval Zanella | 2022-09-05 | 1 | -1/+7
    Not all compilers support the inline asm prefix '%v' to emit the AVX
    instruction if AVX is enabled.  Use a prefix instead.

    Checked on x86_64-linux-gnu and i686-linux-gnu.
* nptl: x86_64: Use same code for CURRENT_STACK_FRAME and stackinfo_get_sp | Adhemerval Zanella | 2022-08-31 | 1 | -1/+3
    It avoids the possible warning about an uninitialized 'frame' variable
    when building with clang:

      ../sysdeps/nptl/jmp-unwind.c:27:42: error: variable 'frame' is
      uninitialized when used here [-Werror,-Wuninitialized]
        __pthread_cleanup_upto (env->__jmpbuf, CURRENT_STACK_FRAME);

    The resulting code is similar to CURRENT_STACK_FRAME.

    Checked on x86_64-linux-gnu.
* x86: Add support to build strcmp/strlen/strchr with explicit ISA level | Noah Goldstein | 2022-07-16 | 1 | -0/+10
    1. Add default ISA level selection in non-multiarch/rtld implementations.

    2. Add ISA level build guards to different implementations.
       - I.e. strcmp-avx2.S, which is ISA level 3, will only be built if the
         compiled ISA level is <= 3.  Otherwise there is no reason to include
         it, as we will always use one of the ISA level 4 implementations
         (strcmp-evex.S).

    3. Refactor the ifunc selector and ifunc implementation list to use the
       ISA level aware wrapper macros that allow functions below the compiled
       ISA level (with a guaranteed replacement) to be skipped.

    Tested with and without multiarch on x86_64 for ISA levels:
    {generic, x86-64-v2, x86-64-v3, x86-64-v4}

    And m32 with and without multiarch.
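    An illustrative fragment of an ISA-level build guard as described above,
    assuming the macro names from sysdeps/x86/isa-level.h:

        #include <isa-level.h>

        #if MINIMUM_X86_ISA_LEVEL <= 3
        /* The AVX2 (x86-64-v3) implementation is only assembled here.  When
           the compiled ISA level is 4, the EVEX (x86-64-v4) version is always
           selected instead, so the AVX2 variant would be dead code.  */
        #endif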
* x86: Add missing rtm tests for strcmp family | Noah Goldstein | 2022-07-13 | 6 | -2/+150
    Add new tests for:
      strcasecmp
      strncasecmp
      strcmp
      wcscmp

    These functions all have avx2_rtm implementations, so they should be
    tested.
* x86: Add support for building {w}memcmp{eq} with explicit ISA level | Noah Goldstein | 2022-07-05 | 1 | -1/+2
    1. Refactor files so that all implementations are in the multiarch
       directory.
       - Moved the implementation portion of memcmp sse2 from memcmp.S to
         multiarch/memcmp-sse2.S.
       - The non-multiarch file now only includes one of the implementations
         in the multiarch directory, based on the compiled ISA level (only
         used for non-multiarch builds; otherwise we go through the ifunc
         selector).

    2. Add ISA level build guards to different implementations.
       - I.e. memcmp-avx2-movbe.S, which is ISA level 3, will only be built
         if the compiled ISA level is <= 3.  Otherwise there is no reason to
         include it, as we will always use one of the ISA level 4
         implementations (memcmp-evex-movbe.S).

    3. Add new multiarch/rtld-{w}memcmp{eq}.S that just include the
       non-multiarch {w}memcmp{eq}.S, which will in turn select the best
       implementation based on the compiled ISA level.

    4. Refactor the ifunc selector and ifunc implementation list to use the
       ISA level aware wrapper macros that allow functions below the compiled
       ISA level (with a guaranteed replacement) to be skipped.

    Tested with and without multiarch on x86_64 for ISA levels:
    {generic, x86-64-v2, x86-64-v3, x86-64-v4}

    And m32 with and without multiarch.
* x86: Add more feature definitions to isa-level.h | Noah Goldstein | 2022-06-28 | 1 | -0/+15
    This commit doesn't change anything in itself.  It is just to add
    definitions that will be needed by future patches.
* x86-64: Only define used SSE/AVX/AVX512 run-time resolvers | H.J. Lu | 2022-06-27 | 1 | -0/+2
    When glibc is built with x86-64 ISA level v3, the SSE run-time resolvers
    aren't used.  For an x86-64 ISA level v4 build, both the SSE and AVX
    resolvers are unused.  Check the minimum x86-64 ISA level to exclude the
    unused run-time resolvers.
* x86: Move CPU_FEATURE{S}_{USABLE|ARCH}_P to isa-level.h | H.J. Lu | 2022-06-27 | 2 | -27/+24
    Move X86_ISA_CPU_FEATURE_USABLE_P and X86_ISA_CPU_FEATURES_ARCH_P to
    where MINIMUM_X86_ISA_LEVEL and XXX_X86_ISA_LEVEL are defined.
* x86: Fix backwards Prefer_No_VZEROUPPER check in ifunc-evex.h | Noah Goldstein | 2022-06-27 | 2 | -24/+32
    Add a third argument to the X86_ISA_CPU_FEATURES_ARCH_P macro so the
    runtime CPU_FEATURES_ARCH_P check can be inverted if MINIMUM_X86_ISA_LEVEL
    is not high enough to constant-evaluate the check.

    Use this new macro to correct the backwards check in ifunc-evex.h.
* x86: Add defines / utilities for making ISA specific x86 builds | Noah Goldstein | 2022-06-22 | 4 | -13/+180
    1. Factor out some of the ISA level defines in isa-level.c to a
       standalone header, isa-level.h.

    2. Add new headers with ISA level dependent macros for handling ifuncs.

    Note: this does not change any existing code.

    Tested with and without multiarch on x86_64 for ISA levels:
    {generic, x86-64-v2, x86-64-v3, x86-64-v4}

    And m32 with and without multiarch.
* x86: Add BMI1/BMI2 checks for ISA_V3 check | Noah Goldstein | 2022-06-16 | 1 | -1/+2
    BMI1/BMI2 are part of the ISA V3 requirements:
    https://en.wikipedia.org/wiki/X86-64

    And defined by GCC when building with `-march=x86-64-v3`.
* x86: Add bounds for `x86_non_temporal_threshold` | Noah Goldstein | 2022-06-15 | 1 | -1/+7
    The lower bound (16448) and upper bound (SIZE_MAX / 16) are assumed by
    memmove-vec-unaligned-erms.

    The lower bound is needed because memmove-vec-unaligned-erms unrolls the
    loop aggressively in the L(large_memset_4x) case.

    The upper bound is needed because memmove-vec-unaligned-erms right-shifts
    the value of `x86_non_temporal_threshold` by LOG_4X_MEMCPY_THRESH (4),
    which without a bound may overflow.

    The lack of a lower bound can be a correctness issue.  The lack of an
    upper bound cannot.
* elf: Remove ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA | Fangrui Song | 2022-06-15 | 1 | -4/+0
    If an executable has copy relocations for extern protected data, that can
    only work if the library containing the definition is built with
    assumptions (a) the compiler emits GOT-generating relocations (b) the
    linker produces R_*_GLOB_DAT instead of R_*_RELATIVE.  Otherwise the
    library uses its own definition directly and the executable accesses a
    stale copy.  Note: the GOT relocations defeat the purpose of protected
    visibility as an optimization, but allow rtld to make the executable and
    library use the same copy when copy relocations are present, but it turns
    out this never worked perfectly.

    ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA has strange semantics when both
    a.so and b.so define protected var and the executable copy relocates var:
    b.so accesses its own copy even with GLOB_DAT.  The behavior change is
    from commit 62da1e3b00b51383ffa7efc89d8addda0502e107 (x86) and then copied
    to nios2 (ae5eae7cfc9c4a8297ff82ec6b794faca1976ecc) and arc
    (0e7d930c4c11de896fe807f67fa1eb756c9c1e05).

    Without ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA, b.so accesses the copy
    relocated data like a.so.

    There is now a warning for copy relocation on protected symbol since
    commit 7374c02b683b7110b853a32496a619410364d70b.  It's extremely unlikely
    anyone relies on the ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA behavior, so
    let's remove it: this removes a check in the symbol lookup code.
* x86: Fix misordered logic for setting `rep_movsb_stop_threshold` | Noah Goldstein | 2022-06-14 | 1 | -12/+12
    Move the setting of `rep_movsb_stop_threshold` to after the tunables have
    been collected so that `rep_movsb_stop_threshold` (which is used to
    redirect control flow to the non_temporal case) will use any user value
    for `non_temporal_threshold` (set using
    glibc.cpu.x86_non_temporal_threshold).
* elf: Optimize _dl_new_hash in dl-new-hash.h | Noah Goldstein | 2022-05-23 | 1 | -0/+24
    Unroll slightly and enforce good instruction scheduling.  This improves
    performance on out-of-order machines.  The unrolling allows for pipelined
    multiplies.

    As well, as an optional sysdep, reorder the operations and prevent
    reassociation for better scheduling and higher ILP.  This commit only
    adds the barrier for x86, although it should be either no change or a win
    for any architecture.

    Unrolling further started to induce slowdowns for sizes [0, 4] but can
    help the loop, so if larger sizes are the target, further unrolling can
    be beneficial.

    Results for _dl_new_hash
    Benchmarked on Tigerlake: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
    Time as Geometric Mean of N=30 runs
    Geometric Mean of all benchmarks New / Old: 0.674

      type, length, New Time, Old Time, New Time / Old Time
      fixed, 0, 2.865, 2.72, 1.053
      fixed, 1, 3.567, 2.489, 1.433
      fixed, 2, 2.577, 3.649, 0.706
      fixed, 3, 3.644, 5.983, 0.609
      fixed, 4, 4.211, 6.833, 0.616
      fixed, 5, 4.741, 9.372, 0.506
      fixed, 6, 5.415, 9.561, 0.566
      fixed, 7, 6.649, 10.789, 0.616
      fixed, 8, 8.081, 11.808, 0.684
      fixed, 9, 8.427, 12.935, 0.651
      fixed, 10, 8.673, 14.134, 0.614
      fixed, 11, 10.69, 15.408, 0.694
      fixed, 12, 10.789, 16.982, 0.635
      fixed, 13, 12.169, 18.411, 0.661
      fixed, 14, 12.659, 19.914, 0.636
      fixed, 15, 13.526, 21.541, 0.628
      fixed, 16, 14.211, 23.088, 0.616
      fixed, 32, 29.412, 52.722, 0.558
      fixed, 64, 65.41, 142.351, 0.459
      fixed, 128, 138.505, 295.625, 0.469
      fixed, 256, 291.707, 601.983, 0.485
      random, 2, 12.698, 12.849, 0.988
      random, 4, 16.065, 15.857, 1.013
      random, 8, 19.564, 21.105, 0.927
      random, 16, 23.919, 26.823, 0.892
      random, 32, 31.987, 39.591, 0.808
      random, 64, 49.282, 71.487, 0.689
      random, 128, 82.23, 145.364, 0.566
      random, 256, 152.209, 298.434, 0.51

    Co-authored-by: Alexander Monakov <amonakov@ispras.ru>
    Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
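    A sketch of the hash being optimized: the GNU symbol hash
    (h = h * 33 + c, seeded with 5381), unrolled by two so consecutive
    multiplies can pipeline.  The optional per-architecture barrier that
    prevents reassociation is omitted here; this is not the in-tree code:

        #include <stdint.h>

        static uint32_t
        dl_new_hash_sketch (const char *s)
        {
          uint32_t h = 5381;
          /* Consume two characters per iteration so the two multiplies by 33
             can overlap in the pipeline.  */
          for (; s[0] != '\0' && s[1] != '\0'; s += 2)
            h = (h * 33 + (unsigned char) s[0]) * 33 + (unsigned char) s[1];
          if (s[0] != '\0')
            h = h * 33 + (unsigned char) s[0];
          return h;
        }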
* elf: Replace PI_STATIC_AND_HIDDEN with opposite HIDDEN_VAR_NEEDS_DYNAMIC_RELOC | Fangrui Song | 2022-04-26 | 2 | -7/+0
    PI_STATIC_AND_HIDDEN indicates whether accesses to internal linkage
    variables and hidden visibility variables in a shared object (ld.so) need
    dynamic relocations (usually R_*_RELATIVE).  PI (position independent) in
    the macro name is a misnomer: a code sequence using GOT is typically
    position-independent as well, but using dynamic relocations does not meet
    the requirement.

    Not defining PI_STATIC_AND_HIDDEN is legacy, and we expect that all new
    ports will define PI_STATIC_AND_HIDDEN.  More ports currently define
    PI_STATIC_AND_HIDDEN than not, so change the configure default.

    No functional change.

    Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* x86: Fix fallback for wcsncmp_avx2 in strcmp-avx2.S [BZ #28896] | Noah Goldstein | 2022-03-25 | 1 | -0/+15
    The overflow case for __wcsncmp_avx2_rtm should fall back to
    __wcscmp_avx2_rtm, not __wcscmp_avx2.

    commit ddf0992cf57a93200e0c782e2a94d0733a5a0b87
    Author: Noah Goldstein <goldstein.w.n@gmail.com>
    Date:   Sun Jan 9 16:02:21 2022 -0600

        x86: Fix __wcsncmp_avx2 in strcmp-avx2.S [BZ# 28755]

    set the wrong fallback function for `__wcsncmp_avx2_rtm`.  It was set to
    fall back to `__wcscmp_avx2` instead of `__wcscmp_avx2_rtm`, which can
    cause spurious aborts.

    This change will need to be backported.

    All string/memory tests pass.
    Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86: Fix TEST_NAME to make it a string in tst-strncmp-rtm.c | Noah Goldstein | 2022-02-18 | 1 | -2/+2
    Previously TEST_NAME was passing a function pointer.  This didn't fail
    because of the -Wno-error flag (used to allow for overflow sizes passed
    to strncmp/wcsncmp).
    Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86: Test wcscmp RTM in the wcsncmp overflow case [BZ #28896] | Noah Goldstein | 2022-02-18 | 3 | -10/+48
    In the overflow fallback, strncmp-avx2-rtm and wcsncmp-avx2-rtm would
    call strcmp-avx2 and wcscmp-avx2 respectively.  These have no checks
    around vzeroupper and would trigger spurious aborts.  This commit fixes
    that.

    test-strcmp, test-strncmp, test-wcscmp, and test-wcsncmp all pass on AVX2
    machines with and without RTM.
    Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
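    A simplified sketch of the overflow case being exercised, using only
    standard libc calls; the real test additionally runs the comparison
    inside an RTM transaction to catch the spurious abort:

        #include <assert.h>
        #include <stdint.h>
        #include <string.h>

        int
        main (void)
        {
          const char *a = "glibc";
          const char *b = "glibd";
          /* With n = SIZE_MAX, strncmp must behave like strcmp; this is the
             path that used to fall back to the non-RTM AVX2 variant.  */
          assert ((strncmp (a, b, SIZE_MAX) < 0) == (strcmp (a, b) < 0));
          return 0;
        }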
* x86: Fallback {str|wcs}cmp RTM in the ncmp overflow case [BZ #28896] | Noah Goldstein | 2022-02-17 | 2 | -2/+17
    In the overflow fallback, strncmp-avx2-rtm and wcsncmp-avx2-rtm would
    call strcmp-avx2 and wcscmp-avx2 respectively.  These have no checks
    around vzeroupper and would trigger spurious aborts.  This commit fixes
    that.

    test-strcmp, test-strncmp, test-wcscmp, and test-wcsncmp all pass on AVX2
    machines with and without RTM.
    Co-authored-by: H.J. Lu <hjl.tools@gmail.com>
* x86/configure.ac: Define PI_STATIC_AND_HIDDEN/SUPPORT_STATIC_PIE | H.J. Lu | 2022-02-14 | 2 | -0/+13
    Move PI_STATIC_AND_HIDDEN and SUPPORT_STATIC_PIE to
    sysdeps/x86/configure.ac.
* x86: Use CHECK_FEATURE_PRESENT on PCONFIG | H.J. Lu | 2022-02-14 | 1 | -1/+1
    PCONFIG is a privileged instruction.  Use CHECK_FEATURE_PRESENT, instead
    of CHECK_FEATURE_ACTIVE, on PCONFIG in tst-cpu-features-supports.c.
* x86: Don't check PTWRITE in tst-cpu-features-cpuinfo.c | H.J. Lu | 2022-02-14 | 1 | -0/+3
    Don't check PTWRITE against /proc/cpuinfo, since the kernel doesn't
    report PTWRITE in /proc/cpuinfo.
* x86: Improve L to support L(XXX_SYMBOL (YYY, ZZZ)) | H.J. Lu | 2022-02-05 | 1 | -1/+2
* x86: Use CHECK_FEATURE_PRESENT to check HLE [BZ #27398] | H.J. Lu | 2022-01-26 | 1 | -1/+1
    HLE is disabled on blacklisted CPUs.  Use CHECK_FEATURE_PRESENT, instead
    of CHECK_FEATURE_ACTIVE, to check HLE.
* x86: Black list more Intel CPUs for TSX [BZ #27398] | H.J. Lu | 2022-01-18 | 1 | -3/+31
    Disable TSX and enable RTM_ALWAYS_ABORT for the Intel CPUs listed in:

    https://www.intel.com/content/www/us/en/support/articles/000059422/processors.html

    This fixes BZ #27398.
    Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* x86: use default cache size if it cannot be determined [BZ #28784] | Aurelien Jarno | 2022-01-17 | 1 | -4/+10
    In some cases (e.g. QEMU, non-Intel/AMD CPUs) the cache information cannot
    be retrieved and the corresponding values are set to 0.

    Commit 2d651eb9265d ("x86: Move x86 processor cache info to cpu_features")
    changed the behaviour in such a case by defining the
    __x86_shared_cache_size and __x86_data_cache_size variables to 0 instead
    of using the default values.  This causes an issue with the i686 SSE2
    optimized bzero/memset routine, which assumes that the cache size is at
    least 128 bytes, and otherwise tries to zero/set the whole address space
    minus 128 bytes.

    Fix that by restoring the original code to only update the
    __x86_shared_cache_size and __x86_data_cache_size variables if the
    corresponding cache sizes are not zero.

    Fixes bug 28784
    Fixes commit 2d651eb9265d

    Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
| | | | | | | | | | | | | | | | | | | | | | In some cases (e.g QEMU, non-Intel/AMD CPU) the cache information can not be retrieved and the corresponding values are set to 0. Commit 2d651eb9265d ("x86: Move x86 processor cache info to cpu_features") changed the behaviour in such case by defining the __x86_shared_cache_size and __x86_data_cache_size variables to 0 instead of using the default values. This cause an issue with the i686 SSE2 optimized bzero/routine which assumes that the cache size is at least 128 bytes, and otherwise tries to zero/set the whole address space minus 128 bytes. Fix that by restoring the original code to only update __x86_shared_cache_size and __x86_data_cache_size variables if the corresponding cache sizes are not zero. Fixes bug 28784 Fixes commit 2d651eb9265d Reviewed-by: H.J. Lu <hjl.tools@gmail.com>