about summary refs log tree commit diff
path: root/sysdeps/i386
Commit message (Collapse)AuthorAgeFilesLines
...
* Fix f64xdivf128, f64xmulf128 spurious underflows (bug 28358)Joseph Myers2021-09-212-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | As described in bug 28358, the round-to-odd computations used in the libm functions that round their results to a narrower format can yield spurious underflow exceptions in the following circumstances: the narrowing only narrows the precision of the type and not the exponent range (i.e., it's narrowing _Float128 to _Float64x on x86_64, x86 or ia64), the architecture does after-rounding tininess detection (which applies to all those architectures), the result is inexact, tiny before rounding but not tiny after rounding (with the chosen rounding mode) for _Float64x (which is possible for narrowing mul, div and fma, not for narrowing add, sub or sqrt), so the underflow exception resulting from the toward-zero computation in _Float128 is spurious for _Float64x. Fixed by making ROUND_TO_ODD call feclearexcept (FE_UNDERFLOW) in the problem cases (as indicated by an extra argument to the macro); there is never any need to preserve underflow exceptions from this part of the computation, because the conversion of the round-to-odd value to the narrower type will underflow in exactly the cases in which the function should raise that exception, but it may be more efficient to avoid the extra manipulation of the floating-point environment when not needed. Tested for x86_64 and x86, and with build-many-glibcs.py.
* elf: Remove THREAD_GSCOPE_IN_TCBSergey Bugaev2021-09-161-1/+0
| | | | | | | | | All the ports now have THREAD_GSCOPE_IN_TCB set to 1. Remove all support for !THREAD_GSCOPE_IN_TCB, along with the definition itself. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-Id: <20210915171110.226187-4-bugaevc@gmail.com> Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
* Redirect fma calls to __fma in libmJoseph Myers2021-09-152-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | include/math.h has a mechanism to redirect internal calls to various libm functions, that can often be inlined by the compiler, to call non-exported __* names for those functions in the case when the calls aren't inlined, with the redirection being disabled when NO_MATH_REDIRECT. Add fma to the functions to which this mechanism is applied. At present, libm-internal fma calls (generally to __builtin_fma* functions) are only done when it's known the call will be inlined, with alternative code not relying on an fma operation being used in the caller otherwise. This patch is in preparation for adding the TS 18661 / C2X narrowing fma functions to glibc; it will be natural for the narrowing function implementations to call the underlying fma functions unconditionally, with this either being inlined or resulting in an __fma* call. (Using two levels of round-to-odd computation like that, in the case where there isn't an fma hardware instruction, isn't optimal but is certainly a lot simpler for the initial implementation than writing different narrowing fma implementations for all the various pairs of formats.) Tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by the patch (using <https://sourceware.org/pipermail/libc-alpha/2021-September/130991.html> to fix installed library stripping in build-many-glibcs.py). Also tested for x86_64.
* Add narrowing square root functionsJoseph Myers2021-09-102-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds the narrowing square root functions from TS 18661-1 / TS 18661-3 / C2X to glibc's libm: fsqrt, fsqrtl, dsqrtl, f32sqrtf64, f32sqrtf32x, f32xsqrtf64 for all configurations; f32sqrtf64x, f32sqrtf128, f64sqrtf64x, f64sqrtf128, f32xsqrtf64x, f32xsqrtf128, f64xsqrtf128 for configurations with _Float64x and _Float128; __f32sqrtieee128 and __f64sqrtieee128 aliases in the powerpc64le case (for calls to fsqrtl and dsqrtl when long double is IEEE binary128). Corresponding tgmath.h macro support is also added. The changes are mostly similar to those for the other narrowing functions previously added, so the description of those generally applies to this patch as well. However, the not-actually-narrowing cases (where the two types involved in the function have the same floating-point format) are aliased to sqrt, sqrtl or sqrtf128 rather than needing a separately built not-actually-narrowing function such as was needed for add / sub / mul / div. Thus, there is no __nldbl_dsqrtl name for ldbl-opt because no such name was needed (whereas the other functions needed such a name since the only other name for that entry point was e.g. f32xaddf64, not reserved by TS 18661-1); the headers are made to arrange for sqrt to be called in that case instead. The DIAG_* calls in sysdeps/ieee754/soft-fp/s_dsqrtl.c are because they were observed to be needed in GCC 7 testing of riscv32-linux-gnu-rv32imac-ilp32. The other sysdeps/ieee754/soft-fp/ files added didn't need such DIAG_* in any configuration I tested with build-many-glibcs.py, but if they do turn out to be needed in more files with some other configuration / GCC version, they can always be added there. I reused the same test inputs in auto-libm-test-in as for non-narrowing sqrt rather than adding extra or separate inputs for narrowing sqrt. The tests in libm-test-narrow-sqrt.inc also follow those for non-narrowing sqrt. Tested as followed: natively with the full glibc testsuite for x86_64 (GCC 11, 7, 6) and x86 (GCC 11); with build-many-glibcs.py with GCC 11, 7 and 6; cross testing of math/ tests for powerpc64le, powerpc32 hard float, mips64 (all three ABIs, both hard and soft float). The different GCC versions are to cover the different cases in tgmath.h and tgmath.h tests properly (GCC 6 has _Float* only as typedefs in glibc headers, GCC 7 has proper _Float* support, GCC 8 adds __builtin_tgmath).
* Remove "Contributed by" linesSiddhesh Poyarekar2021-09-03200-300/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | We stopped adding "Contributed by" or similar lines in sources in 2012 in favour of git logs and keeping the Contributors section of the glibc manual up to date. Removing these lines makes the license header a bit more consistent across files and also removes the possibility of error in attribution when license blocks or files are copied across since the contributed-by lines don't actually reflect reality in those cases. Move all "Contributed by" and similar lines (Written by, Test by, etc.) into a new file CONTRIBUTED-BY to retain record of these contributions. These contributors are also mentioned in manual/contrib.texi, so we just maintain this additional record as a courtesy to the earlier developers. The following scripts were used to filter a list of files to edit in place and to clean up the CONTRIBUTED-BY file respectively. These were not added to the glibc sources because they're not expected to be of any use in future given that this is a one time task: https://gist.github.com/siddhesh/b5ecac94eabfd72ed2916d6d8157e7dc https://gist.github.com/siddhesh/15ea1f5e435ace9774f485030695ee02 Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* i686: Don't include multiarch memove in libc.aH.J. Lu2021-08-301-1/+1
| | | | | On i686, there is no multiarch memove in libc.a, don't include multiarch memove in ifunc-impl-list.c in libc.a.
* Remove sysdeps/*/tls-macros.hFangrui Song2021-08-181-47/+0
| | | | | | | | They provide TLS_GD/TLS_LD/TLS_IE/TLS_IE macros for TLS testing. Now that we have migrated to __thread and tls_model attributes, these macros are unused and the tls-macros.h files can retire. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
* i386: Regenerate ulpsArjun Shankar2021-07-252-61/+61
| | | | | These failures were caught while building glibc master for Fedora Rawhide which is built with `-mtune=generic -msse2 -mfpmath=sse'.
* mcheck: Align struct hdr to MALLOC_ALIGNMENT bytes [BZ #28068]H.J. Lu2021-07-121-4/+0
| | | | | | | | | | | | 1. Align struct hdr to MALLOC_ALIGNMENT bytes so that malloc hooks in libmcheck align memory to MALLOC_ALIGNMENT bytes. 2. Remove tst-mallocalign1 from tests-exclude-mcheck for i386 and x32. 3. Add tst-pvalloc-fortify and tst-reallocarray to tests-exclude-mcheck since they use malloc_usable_size (see BZ #22057). This fixed BZ #28068. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* Add a generic malloc test for MALLOC_ALIGNMENTH.J. Lu2021-07-091-0/+4
| | | | | | | | | 1. Add sysdeps/generic/malloc-size.h to define size related macros for malloc. 2. Move x86_64/tst-mallocalign1.c to malloc and replace ALIGN_MASK with MALLOC_ALIGN_MASK. 3. Add tst-mallocalign1 to tests-exclude-mcheck for i386 and x32 since mcheck doesn't honor MALLOC_ALIGNMENT.
* Properly check stack alignment [BZ #27901]H.J. Lu2021-05-242-85/+0
| | | | | | | | | | | | | | | | | | | | | | 1. Replace if ((((uintptr_t) &_d) & (__alignof (double) - 1)) != 0) which may be optimized out by compiler, with int __attribute__ ((weak, noclone, noinline)) is_aligned (void *p, int align) { return (((uintptr_t) p) & (align - 1)) != 0; } 2. Add TEST_STACK_ALIGN_INIT to TEST_STACK_ALIGN. 3. Add a common TEST_STACK_ALIGN_INIT to check 16-byte stack alignment for both i386 and x86-64. 4. Update powerpc to use TEST_STACK_ALIGN_INIT. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* Remove architecture specific sched_cpucount optimizationsAdhemerval Zanella2021-05-071-1/+0
| | | | | | | | | | | | | | And replace the generic algorithm with the Brian Kernighan's one. GCC optimize it with popcnt if the architecture supports, so there is no need to add the extra POPCNT define to enable it. This is really a micro-optimization that only adds complexity: recent ABIs already support it (x86-64-v2 or power64le) and it simplifies the code for internal usage, since i686 does not allow an internal iFUNC call. Checked on x86_64-linux-gnu, aarch64-linux-gnu, and powerpc64le-linux-gnu.
* nptl: Move pthread_spin_trylock into libcFlorian Weimer2021-04-231-3/+10
| | | | The symbol was moved using scripts/move-symbol-to-libc.py.
* nptl: Move pthread_spin_lock into libcFlorian Weimer2021-04-231-2/+8
| | | | The symbol was moved using scripts/move-symbol-to-libc.py.
* nptl: Move pthread_spin_init, Move pthread_spin_unlock into libcFlorian Weimer2021-04-231-5/+11
| | | | | | | For some architectures, the two functions are aliased, so these symbols need to be moved at the same time. The symbols were moved using scripts/move-symbol-to-libc.py.
* x86: Remove low-level lock optimizationFlorian Weimer2021-04-211-1/+0
| | | | | | | | | | | | | | | The current approach is to do this optimizations at a higher level, in generic code, so that single-threaded cases can be specifically targeted. Furthermore, using IS_IN (libc) as a compile-time indicator that all locks are private is no longer correct once process-shared lock implementations are moved into libc. The generic <lowlevellock.h> is not compatible with assembler code (obviously), so it's necessary to remove two long-unused #includes. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* elf: Remove lazy tlsdesc relocation related codeSzabolcs Nagy2021-04-211-1/+0
| | | | | | | | | | | Remove generic tlsdesc code related to lazy tlsdesc processing since lazy tlsdesc relocation is no longer supported. This includes removing GL(dl_load_lock) from _dl_make_tlsdesc_dynamic which is only called at load time when that lock is already held. Added a documentation comment too. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* i386: Remove lazy tlsdesc relocation related codeSzabolcs Nagy2021-04-153-391/+2
| | | | | | | | | | Like in commit e75711ebfa976d5468ec292282566a18b07e4d67 for x86_64, remove unused lazy tlsdesc relocation processing code: _dl_tlsdesc_resolve_abs_plus_addend _dl_tlsdesc_resolve_rel _dl_tlsdesc_resolve_rela _dl_tlsdesc_resolve_hold
* i386: Avoid lazy relocation of tlsdesc [BZ #27137]Szabolcs Nagy2021-04-151-42/+34
| | | | | | | | | | | Lazy tlsdesc relocation is racy because the static tls optimization and tlsdesc management operations are done without holding the dlopen lock. This similar to the commit b7cf203b5c17dd6d9878537d41e0c7cc3d270a67 for aarch64, but it fixes a different race: bug 27137. On i386 the code is a bit more complicated than on x86_64 because both rel and rela relocs are supported.
* i386: Update ulpsAdhemerval Zanella2021-04-132-4/+4
| | | | | Required after 43576de04afc6 "Improve the accuracy of tgamma (BZ #26983)"
* i386: Update ulpsAdhemerval Zanella2021-04-052-37/+37
| | | | | Required after 9acda61d94acc "Fix the inaccuracy of j0f/j1f/y0f/y1f [BZ #14469, #14470, #14471, #14472]".
* nptl: Remove MULTI_PAGE_ALIASING [BZ #23554]H.J. Lu2021-03-191-23/+0
| | | | | MULTI_PAGE_ALIASING was introduced to mitigate an aliasing issue on Pentium 4. It is no longer needed for processors after Pentium 4.
* i386: Regenerate ulpsFlorian Weimer2021-03-022-46/+46
|
* i386: Implement backtrace on top of <unwind-link.h>Florian Weimer2021-03-011-64/+18
| | | | Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* Implement <unwind-link.h> for dynamically loading the libgcc_s unwinderFlorian Weimer2021-03-011-0/+39
| | | | | | | | | | This will be used to consolidate the libgcc_s access for backtrace and pthread_cancel. Unlike the existing backtrace implementations, it provides some hardening based on pointer mangling. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* Reduce the statically linked startup code [BZ #23323]Florian Weimer2021-02-251-8/+6
| | | | | | | | | | | | | | | | | | | It turns out the startup code in csu/elf-init.c has a perfect pair of ROP gadgets (see Marco-Gisbert and Ripoll-Ripoll, "return-to-csu: A New Method to Bypass 64-bit Linux ASLR"). These functions are not needed in dynamically-linked binaries because DT_INIT/DT_INIT_ARRAY are already processed by the dynamic linker. However, the dynamic linker skipped the main program for some reason. For maximum backwards compatibility, this is not changed, and instead, the main map is consulted from __libc_start_main if the init function argument is a NULL pointer. For statically linked binaries, the old approach based on linker symbols is still used because there is nothing else available. A new symbol version __libc_start_main@@GLIBC_2.34 is introduced because new binaries running on an old libc would not run their ELF constructors, leading to difficult-to-debug issues.
* x86: Use x86/nptl/pthreaddef.hH.J. Lu2021-02-221-43/+0
| | | | | | | 1. Move sysdeps/i386/nptl/pthreaddef.h to sysdeps/x86/nptl/pthreaddef.h. 2. Remove sysdeps/x86_64/nptl/pthreaddef.h. Reviewed-by: DJ Delorie <dj@redhat.com>
* i686: Regenerate ULPsSiddhesh Poyarekar2021-02-031-5/+5
|
* configure: Check for static PIE supportSzabolcs Nagy2021-01-212-0/+6
| | | | | | | | | | | | | | Add SUPPORT_STATIC_PIE that targets can define if they support static PIE. This requires PI_STATIC_AND_HIDDEN support and various linker features as described in commit 9d7a3741c9e59eba87fb3ca6b9f979befce07826 Add --enable-static-pie configure option to build static PIE [BZ #19574] Currently defined on x86_64, i386 and aarch64 where static PIE is known to work. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* x86: Check IFUNC definition in unrelocated executable [BZ #20019]H.J. Lu2021-01-041-5/+11
| | | | | | Calling an IFUNC function defined in unrelocated executable also leads to segfault. Issue a fatal error message when calling IFUNC function defined in the unrelocated executable from a shared library.
* Update copyright dates with scripts/update-copyrightsPaul Eggert2021-01-02289-289/+289
| | | | | | | | | | | | | | | | I used these shell commands: ../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright (cd ../glibc && git commit -am"[this commit message]") and then ignored the output, which consisted lines saying "FOO: warning: copyright statement not found" for each of 6694 files FOO. I then removed trailing white space from benchtests/bench-pthread-locks.c and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this diagnostic from Savannah: remote: *** pre-commit check failed ... remote: *** error: lines with trailing whitespace found remote: error: hook declined to update refs/heads/master
* x86 long double: Support pseudo numbers in isnanlSiddhesh Poyarekar2020-12-241-43/+0
| | | | | | | This syncs up isnanl behaviour with gcc. Also move the isnanl implementation to sysdeps/x86 and remove the sysdeps/x86_64 version. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* x86 long double: Support pseudo numbers in fpclassifylSiddhesh Poyarekar2020-12-241-42/+0
| | | | | | | | Also move sysdeps/i386/fpu/s_fpclassifyl.c to sysdeps/x86/fpu/s_fpclassifyl.c and remove sysdeps/x86_64/fpu/s_fpclassifyl.c Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* i386: Regenerate ulpsFlorian Weimer2020-12-212-10/+10
| | | | For new inputs added in commit cad5ad81d2f7f58a7ad0d8afa8c1b710.
* hurd: Fix ELF_MACHINE_USER_ADDRESS_MASK valueSamuel Thibault2020-12-201-1/+1
| | | | | x86 binaries are linked at 0x08000000, so we need to let them get mapped there.
* x86: Fix THREAD_SELF definition to avoid ld.so crash (bug 27004)Jakub Jelinek2020-12-031-1/+6
| | | | | | | | | | | The previous definition of THREAD_SELF did not tell the compiler that %fs (or %gs) usage is invalid for the !DL_LOOKUP_GSCOPE_LOCK case in _dl_lookup_symbol_x. As a result, ld.so could try to use the TCB before it was initialized. As the comment in tls.h explains, asm volatile is undesirable here. Using the __seg_fs (or __seg_gs) namespace does not interfere with optimization, and expresses that THREAD_SELF is potentially trapping.
* nptl: Move stack list variables into _rtld_globalFlorian Weimer2020-11-161-2/+0
| | | | | | | | | Now __thread_gscope_wait (the function behind THREAD_GSCOPE_WAIT, formerly __wait_lookup_done) can be implemented directly in ld.so, eliminating the unprotected GL (dl_wait_lookup_done) function pointer. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* x86: Remove UP macro. Define LOCK_PREFIX unconditionally.Florian Weimer2020-11-132-14/+2
| | | | | | | The UP macro is never defined. Also define LOCK_PREFIX unconditionally, to the same string. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* hurd: keep only required PLTs in ld.soSamuel Thibault2020-11-111-4/+0
| | | | | | | | | | | | | | | | | | | | | We need NO_RTLD_HIDDEN because of the need for PLT calls in ld.so. See Roland's comment in https://sourceware.org/bugzilla/show_bug.cgi?id=15605 "in the Hurd it's crucial that calls like __mmap be the libc ones instead of the rtld-local ones after the bootstrap phase, when the dynamic linker is being used for dlopen and the like." We used to just avoid all hidden use in the rtld ; this commit switches to keeping only those that should use PLT calls, i.e. essentially those defined in sysdeps/mach/hurd/dl-sysdep.c: __assert_fail __assert_perror_fail __*stat64 _exit This fixes a few startup issues, notably the call to __tunable_get_val that is made before PLTs are set up.
* x86: Initialize CPU info via IFUNC relocation [BZ 26203]H.J. Lu2020-10-161-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | X86 CPU features in ld.so are initialized by init_cpu_features, which is invoked by DL_PLATFORM_INIT from _dl_sysdep_start. But when ld.so is loaded by static executable, DL_PLATFORM_INIT is never called. Also x86 cache info in libc.o and libc.a is initialized by a constructor which may be called too late. Since some fields in _rtld_global_ro in ld.so are initialized by dynamic relocation, we can also initialize x86 CPU features in _rtld_global_ro in ld.so and cache info in libc.so by initializing dummy function pointers in ld.so and libc.so via IFUNC relocation. Key points: 1. IFUNC is always supported, independent of --enable-multi-arch or --disable-multi-arch. Linker generates IFUNC relocations from input IFUNC objects and ld.so performs IFUNC relocations. 2. There are no IFUNC dependencies in ld.so before dynamic relocation have been performed, 3. The x86 CPU features in ld.so is initialized by DL_PLATFORM_INIT in dynamic executable and by IFUNC relocation in dlopen in static executable. 4. The x86 cache info in libc.o is initialized by IFUNC relocation. 5. In libc.a, both x86 CPU features and cache info are initialized from ARCH_INIT_CPU_FEATURES, not by IFUNC relocation, before __libc_early_init is called. Note: _dl_x86_init_cpu_features can be called more than once from DL_PLATFORM_INIT and during relocation in ld.so.
* aarch64: enforce >=64K guard size [BZ #26691]Szabolcs Nagy2020-10-021-0/+3
| | | | | | | | | | | | | | | | | | | | | | | There are several compiler implementations that allow large stack allocations to jump over the guard page at the end of the stack and corrupt memory beyond that. See CVE-2017-1000364. Compilers can emit code to probe the stack such that the guard page cannot be skipped, but on aarch64 the probe interval is 64K by default instead of the minimum supported page size (4K). This patch enforces at least 64K guard on aarch64 unless the guard is disabled by setting its size to 0. For backward compatibility reasons the increased guard is not reported, so it is only observable by exhausting the address space or parsing /proc/self/maps on linux. On other targets the patch has no effect. If the stack probe interval is larger than a page size on a target then ARCH_MIN_GUARD_SIZE can be defined to get large enough stack guard on libc allocated stacks. The patch does not affect threads with user allocated stacks. Fixes bug 26691.
* x86: Use one ldbl2mpn.c file for both i386 and x86_64Florian Weimer2020-09-221-120/+0
|
* x86: Install <sys/platform/x86.h> [BZ #26124]H.J. Lu2020-09-112-5/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Install <sys/platform/x86.h> so that programmers can do #if __has_include(<sys/platform/x86.h>) #include <sys/platform/x86.h> #endif ... if (CPU_FEATURE_USABLE (SSE2)) ... if (CPU_FEATURE_USABLE (AVX2)) ... <sys/platform/x86.h> exports only: enum { COMMON_CPUID_INDEX_1 = 0, COMMON_CPUID_INDEX_7, COMMON_CPUID_INDEX_80000001, COMMON_CPUID_INDEX_D_ECX_1, COMMON_CPUID_INDEX_80000007, COMMON_CPUID_INDEX_80000008, COMMON_CPUID_INDEX_7_ECX_1, /* Keep the following line at the end. */ COMMON_CPUID_INDEX_MAX }; struct cpuid_features { struct cpuid_registers cpuid; struct cpuid_registers usable; }; struct cpu_features { struct cpu_features_basic basic; struct cpuid_features features[COMMON_CPUID_INDEX_MAX]; }; /* Get a pointer to the CPU features structure. */ extern const struct cpu_features *__x86_get_cpu_features (unsigned int max) __attribute__ ((const)); Since all feature checks are done through macros, programs compiled with a newer <sys/platform/x86.h> are compatible with the older glibc binaries as long as the layout of struct cpu_features is identical. The features array can be expanded with backward binary compatibility for both .o and .so files. When COMMON_CPUID_INDEX_MAX is increased to support new processor features, __x86_get_cpu_features in the older glibc binaries returns NULL and HAS_CPU_FEATURE/CPU_FEATURE_USABLE return false on the new processor feature. No new symbol version is neeeded. Both CPU_FEATURE_USABLE and HAS_CPU_FEATURE are provided. HAS_CPU_FEATURE can be used to identify processor features. Note: Although GCC has __builtin_cpu_supports, it only supports a subset of <sys/platform/x86.h> and it is equivalent to CPU_FEATURE_USABLE. It doesn't support HAS_CPU_FEATURE.
* Update i686 ulps.Patsy Griffin2020-09-021-3/+3
| | | | | | | | | | | | | Without this ULP patch these 3 tests fail on i686: FAIL: math/test-float128-j0 FAIL: math/test-float64x-j0 FAIL: math/test-ldouble-j0 CPU info: Vendor ID: GenuineIntel CPU family: 6 Model: 85 Model name: Intel Xeon Processor (Cascadelake)
* math: Update x86_64 ulpsAdhemerval Zanella2020-08-081-3/+3
| | | | From new j0 test.
* x86: Support usable check for all CPU featuresH.J. Lu2020-07-1328-105/+105
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Support usable check for all CPU features with the following changes: 1. Change struct cpu_features to struct cpuid_features { struct cpuid_registers cpuid; struct cpuid_registers usable; }; struct cpu_features { struct cpu_features_basic basic; struct cpuid_features features[COMMON_CPUID_INDEX_MAX]; unsigned int preferred[PREFERRED_FEATURE_INDEX_MAX]; ... }; so that there is a usable bit for each cpuid bit. 2. After the cpuid bits have been initialized, copy the known bits to the usable bits. EAX/EBX from INDEX_1 and EAX from INDEX_7 aren't used for CPU feature detection. 3. Clear the usable bits which require OS support. 4. If the feature is supported by OS, copy its cpuid bit to its usable bit. 5. Replace HAS_CPU_FEATURE and CPU_FEATURES_CPU_P with CPU_FEATURE_USABLE and CPU_FEATURE_USABLE_P to check if a feature is usable. 6. Add DEPR_FPU_CS_DS for INDEX_7_EBX_13. 7. Unset MPX feature since it has been deprecated. The results are 1. If the feature is known and doesn't requre OS support, its usable bit is copied from the cpuid bit. 2. Otherwise, its usable bit is copied from the cpuid bit only if the feature is known to supported by OS. 3. CPU_FEATURE_USABLE/CPU_FEATURE_USABLE_P are used to check if the feature can be used. 4. HAS_CPU_FEATURE/CPU_FEATURE_CPU_P are used to check if CPU supports the feature.
* x86: Remove the unused __x86_prefetchwH.J. Lu2020-07-112-7/+0
| | | | | | | | | | | | | Since commit c867597bff2562180a18da4b8dba89d24e8b65c4 Author: H.J. Lu <hjl.tools@gmail.com> Date: Wed Jun 8 13:57:50 2016 -0700 X86-64: Remove previous default/SSE2/AVX2 memcpy/memmove removed the only usage of __x86_prefetchw, we can remove the unused __x86_prefetchw.
* Update i686 libm-test-ulpsPatsy Franklin2020-07-091-21/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Without my ULP patch these 18 tests fail on i686: https://koji.fedoraproject.org/koji/taskinfo?taskID=46467301 + cat /proc/cpuinfo processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 85 model name : Intel Xeon Processor (Cascadelake) FAIL: math/test-double-j0 FAIL: math/test-double-y0 FAIL: math/test-float-erfc FAIL: math/test-float-j0 FAIL: math/test-float-j1 FAIL: math/test-float-lgamma FAIL: math/test-float-tgamma FAIL: math/test-float-y0 FAIL: math/test-float32-erfc FAIL: math/test-float32-j0 FAIL: math/test-float32-j1 FAIL: math/test-float32-lgamma FAIL: math/test-float32-tgamma FAIL: math/test-float32-y0 FAIL: math/test-float32x-j0 FAIL: math/test-float32x-y0 FAIL: math/test-float64-j0 FAIL: math/test-float64-y0 With my ULP patch applied these tests now pass: https://koji.fedoraproject.org/koji/taskinfo?taskID=46436310
* i386: Use builtin sqrtlAdhemerval Zanella2020-06-221-21/+0
| | | | Checked on i686-linux-gnu.
* i386: Use generic exp10fAdhemerval Zanella2020-06-191-54/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The generic implementation is twice as fast. Using the exp10f benchmark: * master: "exp10f": { "workload-spec2017.wrf (adapted)": { "duration": 1.02967e+09, "iterations": 4.768e+07, "reciprocal-throughput": 18.3579, "latency": 24.8331, "max-throughput": 5.44725e+07, "min-throughput": 4.02688e+07 } } * patched: "exp10f": { "workload-spec2017.wrf (adapted)": { "duration": 1.01821e+09, "iterations": 6.1984e+07, "reciprocal-throughput": 13.1975, "latency": 19.6563, "max-throughput": 7.57719e+07, "min-throughput": 5.08743e+07 } } Checked on i686-linux-gnu.