path: root/sysdeps/x86
Commit message [Author, Date, Files, Lines (-removed/+added)]
* Update copyright dates with scripts/update-copyrights. [Joseph Myers, 2019-01-01, 62 files, -62/+62]
    * All files with FSF copyright notices: Update copyright dates using scripts/update-copyrights.
    * locale/programs/charmap-kw.h: Regenerated.
    * locale/programs/locfile-kw.h: Likewise.
* x86-64: Vectorize sincosf_poly and update s_sincosf-fma.c [H.J. Lu, 2018-12-26, 2 files, -0/+179]
    Add <sincosf_poly.h> and include it in s_sincosf.h to allow vectorized sincosf_poly. Add x86 sincosf_poly.h to vectorize sincosf_poly.

    On Broadwell, bench-sincosf shows:

            Before     After      Improvement
    max     160.273    114.198    40%
    min     6.25       5.625      11%
    mean    13.0325    10.6462    22%

    Vectorized sincosf_poly shows

            Before     After      Improvement
    max     138.653    114.198    21%
    min     5.004      5.625      -11%
    mean    11.5934    10.6462    9%

    Tested on x86-64 and i686 as well as with build-many-glibcs.py.

    * sysdeps/ieee754/flt-32/s_sincosf.h: Include <sincosf_poly.h>. (sincos_t, sincosf_poly, sinf_poly): Moved to ...
    * sysdeps/ieee754/flt-32/sincosf_poly.h: Here. New file.
    * sysdeps/x86/fpu/s_sincosf_data.c: New file.
    * sysdeps/x86/fpu/sincosf_poly.h: Likewise.
    * sysdeps/x86_64/fpu/multiarch/s_sincosf-fma.c: Just include <sysdeps/ieee754/flt-32/s_sincosf.c>.
* Remove x86 mathinline.h. [Joseph Myers, 2018-12-19, 2 files, -295/+32]
    After previous cleanups, the only code in the x86 bits/mathinline.h that is relevant with current compilers is the inline of __ieee754_atan2l that is conditional on __LIBC_INTERNAL_MATH_INLINES (i.e. for when libm itself is being built). This inline is something that does belong in glibc not GCC, since __ieee754_atan2l is a purely internal function name. This patch moves that inline to a new sysdeps/x86/fpu/math_private.h, removing the bits/mathinline.h header.

    Note that previously the inline was only for non-SSE 32-bit x86. That condition does not make sense, however, for a long double function; if it's not inlined, exactly the same x87 instruction will end up getting used by the out-of-line function, for both 32-bit and 64-bit. So that condition is not retained in the new version.

    Tested for x86_64 and x86. As expected, installed stripped shared libraries are unchanged for 32-bit x86, but installed stripped libm.so is changed for x86_64 because calls to __ieee754_atan2l start being inlined where previously they were out of line calls. (The same change to start inlining the function would presumably also apply for 32-bit built with -mfpmath=sse, but that's not a configuration I've tested.)

    * sysdeps/x86/fpu/math_private.h: New file.
    * sysdeps/x86/fpu/bits/mathinline.h: Remove.
* Remove x86 mathinline.h sinh, cosh, tanh inlines. [Joseph Myers, 2018-12-19, 1 file, -16/+0]
    Continuing the removal of bits/mathinline.h inlines that would better be done by the compiler, this patch removes x86 inlines for sinh, cosh and tanh functions (inlines only previously present for fast-math, non-SSE 32-bit x86). I've filed <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88556> for adding such inlines as an optimization in GCC.

    I believe the only remaining part of the x86 bits/mathinline.h that does anything useful with current compilers after this patch is the __LIBC_INTERNAL_MATH_INLINES inline of __ieee754_atan2l; I intend to remove the whole header and move that inline to a sysdeps math_private.h header in a subsequent patch.

    Tested for x86_64 and x86.

    * sysdeps/x86/fpu/bits/mathinline.h (sinh): Remove inline definition. (cosh): Likewise. (tanh): Likewise.
* x86: Merge i386/x86_64 atomic-machine.h [H.J. Lu, 2018-12-18, 1 file, -0/+571]
    Merge i386 and x86_64 atomic-machine.h to x86 atomic-machine.h.

    Tested on i686 and x86_64 as well as with build-many-glibcs.py.

    * sysdeps/i386/atomic-machine.h: Merged with ...
    * sysdeps/x86_64/atomic-machine.h: To ...
    * sysdeps/x86/atomic-machine.h: This. New file.
* Remove x86 mathinline.h asinh, acosh, atanh inlines. [Joseph Myers, 2018-12-14, 1 file, -13/+0]
    Continuing the removal of bits/mathinline.h inlines that would better be done by the compiler, this patch removes x86 inlines for asinh, acosh and atanh functions (only for fast-math, non-SSE 32-bit x86). I've filed <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88502> for adding such inlines as an optimization in GCC.

    Tested for x86_64 and x86.

    * sysdeps/x86/fpu/bits/mathinline.h (asinh): Remove inline definition. (acosh): Likewise. (atanh): Likewise.
* x86: Add Hygon Dhyana support. [Carlos O'Donell, 2018-12-13, 1 file, -2/+3]
    This patch fixes Hygon Dhyana CPU vendor ID detection in the glibc sysdeps code. Current glibc does not recognize the Dhyana CPU vendor ID ("HygonGenuine") and sets kind to arch_kind_other, which makes __cache_sysconf() return an incorrect zero value.

    Since Hygon Dhyana shares most architectural features with AMD Family 17h, this patch adds a check for the Hygon CPU vendor ID, sets kind to arch_kind_amd, and reuses the AMD code path, so that __cache_sysconf() returns the correct value.

    We ran the glibc test suite on both Hygon Dhyana and AMD EPYC and found no failures.

    Background: Chengdu Haiguang IC Design Co., Ltd (Hygon) is a joint venture between AMD and Haiguang Information Technology Co., Ltd., which aims to provide high-performance x86 processors for the China server market. Its first-generation processor, codenamed Dhyana, originates from AMD technology and shares most of its architecture with AMD's Family 17h, but has a different CPU vendor ID ("HygonGenuine") and family number (Family 18h).

    The related Hygon kernel patch can be found at http://lkml.kernel.org/r/5ce86123a7b9dad925ac583d88d2f921040e859b.1538583282.git.puwen@hygon.cn

    Signed-off-by: fanjinke <fanjinke@hygon.cn>
    Reviewed-by: Carlos O'Donell <carlos@redhat.com>
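The vendor check described above can be illustrated with a small sketch. It is not the glibc code: the enum and the function name `classify_vendor` are hypothetical, and only the vendor strings and the idea of routing "HygonGenuine" to the AMD path come from the commit message.

```c
#include <string.h>

/* Hypothetical stand-ins for glibc's internal CPU kind values.  */
enum cpu_kind_sketch
{
  arch_kind_other,
  arch_kind_intel,
  arch_kind_amd
};

/* Classify a CPUID vendor string.  "HygonGenuine" is mapped to the AMD
   kind so that Dhyana reuses the AMD cache-detection path.  */
static enum cpu_kind_sketch
classify_vendor (const char *vendor)
{
  if (strcmp (vendor, "GenuineIntel") == 0)
    return arch_kind_intel;
  if (strcmp (vendor, "AuthenticAMD") == 0
      || strcmp (vendor, "HygonGenuine") == 0)
    return arch_kind_amd;
  /* Unknown vendors fall through; before the patch Hygon ended up here.  */
  return arch_kind_other;
}
```

Taking the vendor string as a parameter keeps the sketch independent of the CPUID instruction, so it can be exercised on any machine.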
* Remove x86 mathinline.h hypot inline. [Joseph Myers, 2018-12-12, 1 file, -4/+0]
    Continuing the removal of bits/mathinline.h inlines that would better be done by the compiler, this patch removes an x86 inline for hypot functions (only for fast-math, only for non-SSE 32-bit x86). I've filed <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88474> for adding such an inline as an optimization in GCC.

    Tested for x86_64 and x86.

    * sysdeps/x86/fpu/bits/mathinline.h (hypot): Remove inline definition.
* x86: Extend CPUID support in struct cpu_features [H.J. Lu, 2018-12-03, 4 files, -242/+1246]
    Extend CPUID support for all feature bits from CPUID. Add a new macro, CPU_FEATURE_USABLE, which can be used to check if a feature is usable at run-time, instead of HAS_CPU_FEATURE and HAS_ARCH_FEATURE.

    Add COMMON_CPUID_INDEX_D_ECX_1, COMMON_CPUID_INDEX_80000007 and COMMON_CPUID_INDEX_80000008 to check CPU feature bits in them.

    Tested on i686 and x86-64 as well as using build-many-glibcs.py with x86 targets.

    * sysdeps/x86/cacheinfo.c (intel_check_word): Updated for cpu_features_basic. (__cache_sysconf): Likewise. (init_cacheinfo): Likewise.
    * sysdeps/x86/cpu-features.c (get_extended_indices): Also populate COMMON_CPUID_INDEX_80000007 and COMMON_CPUID_INDEX_80000008. (get_common_indices): Also populate COMMON_CPUID_INDEX_D_ECX_1. Use CPU_FEATURES_CPU_P (cpu_features, XSAVEC) to check if XSAVEC is available. Set the bit_arch_XXX_Usable bits. (init_cpu_features): Use _Static_assert on index_arch_Fast_Unaligned_Load. __get_cpuid_registers and __get_arch_feature. Updated for cpu_features_basic. Set stepping in cpu_features.
    * sysdeps/x86/cpu-features.h (FEATURE_INDEX_1): Changed to enum. (FEATURE_INDEX_2): New. (FEATURE_INDEX_MAX): Changed to enum. (COMMON_CPUID_INDEX_D_ECX_1): New. (COMMON_CPUID_INDEX_80000007): Likewise. (COMMON_CPUID_INDEX_80000008): Likewise. (cpuid_registers): Likewise. (cpu_features_basic): Likewise. (CPU_FEATURE_USABLE): Likewise. (bit_arch_XXX_Usable): Likewise. (cpu_features): Use cpuid_registers and cpu_features_basic. (bit_arch_XXX): Rewritten. (bit_cpu_XXX): Likewise. (index_cpu_XXX): Likewise. (reg_XXX): Likewise.
    * sysdeps/x86/tst-get-cpu-features.c: Include <stdio.h> and <support/check.h>. (CHECK_CPU_FEATURE): New. (CHECK_CPU_FEATURE_USABLE): Likewise. (cpu_kinds): Likewise. (do_test): Print vendor, family, model and stepping. Check HAS_CPU_FEATURE and CPU_FEATURE_USABLE. (TEST_FUNCTION): Removed. Include <support/test-driver.c> instead of "../../test-skeleton.c".
    * sysdeps/x86_64/multiarch/sched_cpucount.c (__sched_cpucount): Check POPCNT instead of POPCOUNT.
    * sysdeps/x86_64/multiarch/test-multiarch.c (do_test): Likewise.
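The difference between "the CPU reports a feature" and "the feature is usable at run-time" can be sketched as follows. This is a simplified illustration, not the layout of glibc's struct cpu_features: the struct, the function names and the `os_saves_ymm` parameter are assumptions made for the example. The only hardware fact used is that AVX is reported in CPUID leaf 1, ECX bit 28.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified layout: one word of raw CPUID bits and one
   word of bits that were judged safe to use.  */
struct features_sketch
{
  uint32_t cpuid_ecx;   /* Bits as reported by CPUID leaf 1, ECX.  */
  uint32_t usable;      /* Subset of cpuid_ecx that may actually be used.  */
};

#define BIT_AVX_SKETCH (1u << 28)   /* CPUID.1:ECX bit 28 is AVX.  */

/* True if the processor reports the feature.  */
static bool
has_cpu_feature_sketch (const struct features_sketch *f, uint32_t bit)
{
  return (f->cpuid_ecx & bit) != 0;
}

/* True if the feature was also marked usable during initialization.  */
static bool
cpu_feature_usable_sketch (const struct features_sketch *f, uint32_t bit)
{
  return (f->usable & bit) != 0;
}

/* Mark AVX usable only if the CPU reports it and the operating system
   saves the YMM state (passed in here as a plain flag).  */
static void
init_usable_sketch (struct features_sketch *f, bool os_saves_ymm)
{
  f->usable = 0;
  if ((f->cpuid_ecx & BIT_AVX_SKETCH) != 0 && os_saves_ymm)
    f->usable |= BIT_AVX_SKETCH;
}
```

A caller that selects an AVX code path would check the usable bit, since a set CPUID bit alone does not guarantee that the register state is preserved across context switches.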
* x86/CET: Add a re-exec test with legacy bitmap [H.J. Lu, 2018-11-23, 2 files, -2/+86]
    Add a re-exec test with legacy bitmap to verify that the legacy bitmap is properly handled by the kernel.

    * sysdeps/x86/Makefile (tests): Add tst-cet-legacy-1a. (tst-cet-legacy-1a-ARGS): New. ($(objpfx)tst-cet-legacy-1a): New target.
    * sysdeps/x86/tst-cet-legacy-1a.c: New file.
* Check multiple NT_GNU_PROPERTY_TYPE_0 notes [BZ #23509] [H.J. Lu, 2018-11-08, 2 files, -14/+46]
    Linkers group input note sections with the same name into one output note section with the same name. One output note section is placed in one PT_NOTE segment.

    Since new linkers merge input .note.gnu.property sections into one output .note.gnu.property section, there is only one NT_GNU_PROPERTY_TYPE_0 note in one PT_NOTE segment with new linkers. Since older linkers treat an input .note.gnu.property section as a generic note section and just concatenate all input .note.gnu.property sections into one output .note.gnu.property section without merging them, we may see multiple NT_GNU_PROPERTY_TYPE_0 notes in one PT_NOTE segment with older linkers.

    When an older linker is used to create the program on a CET-enabled OS, the linker output has a single .note.gnu.property section with multiple NT_GNU_PROPERTY_TYPE_0 notes, some of which have IBT and SHSTK enable bits set even if the program isn't CET enabled. Such programs will crash on CET-enabled machines.

    This patch updates the note parser:

    1. Skip note parsing if a NT_GNU_PROPERTY_TYPE_0 note has been processed.
    2. Check multiple NT_GNU_PROPERTY_TYPE_0 notes.

    [BZ #23509]
    * sysdeps/x86/dl-prop.h (_dl_process_cet_property_note): Skip note parsing if a NT_GNU_PROPERTY_TYPE_0 note has been processed. Update the l_cet field when processing NT_GNU_PROPERTY_TYPE_0 note. Check multiple NT_GNU_PROPERTY_TYPE_0 notes.
    * sysdeps/x86/link_map.h (l_cet): Expand to 3 bits, Add lc_unknown.
* x86: Support RDTSCP for benchtests [H.J. Lu, 2018-10-24, 1 file, -1/+13]
    RDTSCP waits until all previous instructions have executed and all previous loads are globally visible before reading the counter. RDTSC doesn't wait until all previous instructions have been executed before reading the counter. All x86 processors since 2010 support RDTSCP instruction. This patch adds RDTSCP support to benchtests.

    * benchtests/Makefile (CPPFLAGS-nonlib): Add -DUSE_RDTSCP if USE_RDTSCP is defined.
    * sysdeps/x86/hp-timing.h (HP_TIMING_NOW): Use RDTSCP if USE_RDTSCP is defined.
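A minimal sketch of the compile-time choice between the two instructions is shown below. The type and function names are hypothetical and the real HP_TIMING_NOW is a macro; the sketch only assumes the GCC/Clang builtins `__builtin_ia32_rdtsc` and `__builtin_ia32_rdtscp`, and returns 0 on non-x86 targets so that it still compiles there.

```c
#include <stdint.h>

/* Hypothetical name; stands in for the timing value type.  */
typedef uint64_t hp_timing_sketch_t;

/* Read the time-stamp counter.  With -DUSE_RDTSCP the read waits for
   earlier instructions to execute; otherwise plain RDTSC is used.  */
static inline hp_timing_sketch_t
hp_timing_now_sketch (void)
{
#if defined __x86_64__ || defined __i386__
# ifdef USE_RDTSCP
  unsigned int aux;   /* Receives IA32_TSC_AUX; not used in this sketch.  */
  return __builtin_ia32_rdtscp (&aux);
# else
  return __builtin_ia32_rdtsc ();
# endif
#else
  return 0;   /* No time-stamp counter on this target in this sketch.  */
#endif
}
```

For benchmarking, the ordering guarantee of RDTSCP matters because instructions from the measured region are less likely to be counted outside the measured interval.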
* x86: Fix Haswell strong flags (BZ#23709) [Adhemerval Zanella, 2018-10-23, 1 file, -0/+6]
    The commit 'Disable TSX on some Haswell processors.' (2702856bf4) changed the default flags for Haswell models. Previously, new models were handled by the default switch path, which assumed a Core i3/i5/i7 if AVX is available. After the patch, Haswell models (0x3f, 0x3c, 0x45, 0x46) do not set the flags Fast_Rep_String, Fast_Unaligned_Load, Fast_Unaligned_Copy, and Prefer_PMINUB_for_stringop (only the TSX one).

    This patch fixes it by disentangling the TSX flag handling from the memory optimization ones. The strstr case cited in the patch now selects __strstr_sse2_unaligned as expected for the Haswell CPU.

    Checked on x86_64-linux-gnu.

    [BZ #23709]
    * sysdeps/x86/cpu-features.c (init_cpu_features): Set TSX bits independently of other flags.
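The shape of the fix, two independent decisions instead of one switch that handles both, can be sketched like this. The flag values and function names are made up for illustration, and the sketch ignores any further conditions the real init_cpu_features may check; only the model numbers and flag names come from the commit message.

```c
#include <stdbool.h>

/* Hypothetical flag bits; the names mirror the flags listed in the commit
   message, but the values are made up for this sketch.  */
enum
{
  FLAG_FAST_REP_STRING = 1 << 0,
  FLAG_FAST_UNALIGNED_LOAD = 1 << 1,
  FLAG_FAST_UNALIGNED_COPY = 1 << 2,
  FLAG_PREFER_PMINUB_FOR_STRINGOP = 1 << 3,
  FLAG_TSX_RESTRICTED = 1 << 4
};

/* The Haswell models named in the commit message.  */
static bool
is_listed_haswell_model (unsigned int model)
{
  return model == 0x3f || model == 0x3c || model == 0x45 || model == 0x46;
}

static unsigned int
flags_for_model (unsigned int model, bool has_avx)
{
  unsigned int flags = 0;

  /* Decision 1: memory and string optimization flags.  Haswell is not
     special-cased here, so it gets the same defaults as other AVX CPUs.  */
  if (has_avx)
    flags |= (FLAG_FAST_REP_STRING | FLAG_FAST_UNALIGNED_LOAD
              | FLAG_FAST_UNALIGNED_COPY | FLAG_PREFER_PMINUB_FOR_STRINGOP);

  /* Decision 2: TSX handling, made separately so that it cannot
     suppress the flags set above.  */
  if (is_listed_haswell_model (model))
    flags |= FLAG_TSX_RESTRICTED;

  return flags;
}
```

Keeping the TSX check out of the first decision is what prevents a model-specific case from silently skipping the default flags.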
* x86: Don't include <x86intrin.h> [H.J. Lu, 2018-10-21, 1 file, -4/+5]
    Use __builtin_ia32_rdtsc directly since including <x86intrin.h> makes building glibc very slow. On Intel Core i5-6260U, this patch reduces x86-64 build time from 8 minutes 33 seconds to 3 minutes 48 seconds with "make -j4" and GCC 8.2.1.

    * sysdeps/x86/hp-timing.h: Don't include <x86intrin.h>. (HP_TIMING_NOW): Replace _rdtsc with __builtin_ia32_rdtsc.
* x86: Use _rdtsc intrinsic for HP_TIMING_NOW [H.J. Lu, 2018-10-17, 2 files, -0/+53]
    Since the _rdtsc intrinsic is supported in GCC 4.9, we can use it for HP_TIMING_NOW. This patch:

    1. Creates x86 hp-timing.h to replace i686 and x86_64 hp-timing.h.
    2. Moves MINIMUM_ISA from init-arch.h to isa.h so that x86 hp-timing.h can check the minimum x86 ISA to decide if _rdtsc can be used.

    NB: Checking whether __i686__ is defined isn't sufficient since __i686__ may not be defined when building for i686 class processors.

    * sysdeps/i386/init-arch.h: Removed.
    * sysdeps/i386/i586/init-arch.h: Likewise.
    * sysdeps/i386/i686/init-arch.h: Likewise.
    * sysdeps/i386/i686/hp-timing.h: Likewise.
    * sysdeps/x86_64/hp-timing.h: Likewise.
    * sysdeps/i386/isa.h: New file.
    * sysdeps/i386/i586/isa.h: Likewise.
    * sysdeps/i386/i686/isa.h: Likewise.
    * sysdeps/x86_64/isa.h: Likewise.
    * sysdeps/x86/hp-timing.h: New file.
    * sysdeps/x86/init-arch.h: Include <isa.h>.
* Use round functions not __round functions in glibc libm. [Joseph Myers, 2018-09-27, 1 file, -1/+1]
    Continuing the move to use, within libm, public names for libm functions that can be inlined as built-in functions on many architectures, this patch moves calls to __round functions to call the corresponding round names instead, with asm redirection to __round when the calls are not inlined.

    An additional complication arises in sysdeps/ieee754/ldbl-128ibm/e_expl.c, where a call to roundl, with the result converted to int, gets converted by the compiler to call lroundl in the case of 32-bit long, so resulting in localplt test failures. It's logically correct to let the compiler make such an optimization; an appropriate asm redirection of lroundl to __lroundl is thus added to that file (it's not needed anywhere else).

    Tested for x86_64, and with build-many-glibcs.py.

    * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (round): Redirect using MATH_REDIRECT.
    * sysdeps/aarch64/fpu/s_round.c: Define NO_MATH_REDIRECT before header inclusion.
    * sysdeps/aarch64/fpu/s_roundf.c: Likewise.
    * sysdeps/ieee754/dbl-64/s_round.c: Likewise.
    * sysdeps/ieee754/dbl-64/wordsize-64/s_round.c: Likewise.
    * sysdeps/ieee754/float128/s_roundf128.c: Likewise.
    * sysdeps/ieee754/flt-32/s_roundf.c: Likewise.
    * sysdeps/ieee754/ldbl-128/s_roundl.c: Likewise.
    * sysdeps/ieee754/ldbl-96/s_roundl.c: Likewise.
    * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_round.c: Likewise.
    * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_roundf.c: Likewise.
    * sysdeps/powerpc/powerpc64/fpu/multiarch/s_round.c: Likewise.
    * sysdeps/powerpc/powerpc64/fpu/multiarch/s_roundf.c: Likewise.
    * sysdeps/riscv/rv64/rvd/s_round.c: Likewise.
    * sysdeps/riscv/rvf/s_roundf.c: Likewise.
    * sysdeps/ieee754/ldbl-128ibm/s_roundl.c: Likewise. (round): Redirect to __round. (__roundl): Call round instead of __round.
    * sysdeps/powerpc/fpu/math_private.h [_ARCH_PWR5X] (__round): Remove macro. [_ARCH_PWR5X] (__roundf): Likewise.
    * sysdeps/ieee754/dbl-64/e_gamma_r.c (gamma_positive): Use round functions instead of __round variants.
    * sysdeps/ieee754/flt-32/e_gammaf_r.c (gammaf_positive): Likewise.
    * sysdeps/ieee754/ldbl-128/e_gammal_r.c (gammal_positive): Likewise.
    * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (gammal_positive): Likewise.
    * sysdeps/ieee754/ldbl-96/e_gammal_r.c (gammal_positive): Likewise.
    * sysdeps/x86/fpu/powl_helper.c (__powl_helper): Likewise.
    * sysdeps/ieee754/ldbl-128ibm/e_expl.c (lroundl): Redirect to __lroundl. (__ieee754_expl): Call roundl instead of __roundl.
* Invert sense of list of i686-class processors in sysdeps/x86/cpu-features.h. [Joseph Myers, 2018-09-20, 1 file, -18/+7]
    I noticed that sysdeps/x86/cpu-features.h had conditionals on whether to define HAS_CPUID, HAS_I586 and HAS_I686 with a long list of preprocessor macros for i686-and-later processors which however was out of date. This patch avoids the problem of the list getting out of date by instead having conditionals on all the (few, old) pre-i686 processors for which GCC has preprocessor macros, rather than the (many, expanding list) i686-and-later processors. It seems HAS_I586 and HAS_I686 are unused so the only effect of these macros being missing is that 32-bit glibc built for one of these processors would end up doing runtime detection of CPUID availability.

    i386 builds are prevented by a configure test so there is no need to allow for them here. __geode__ (no long nops?) and __k6__ (no CMOV, at least according to GCC) are conservatively handled as i586, not i686, here (as noted above, this is a theoretical distinction at present in that only HAS_CPUID appears to be used).

    Tested for x86.

    * sysdeps/x86/cpu-features.h [__geode__ || __k6__]: Handle like [__i586__ || __pentium__]. [__i486__]: Handle explicitly. (HAS_CPUID): Define to 1 if above macros are undefined. (HAS_I586): Likewise. (HAS_I686): Likewise.
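The inverted conditional can be sketched as below. The `_SKETCH` macro names are hypothetical, and the value 0 is used here as a simplification meaning "not guaranteed at compile time, detect at run time"; the real header may express that case differently. The compiler macros tested (`__i486__`, `__i586__`, `__pentium__`, `__geode__`, `__k6__`) are the ones named in the commit message.

```c
/* List only the few pre-i686 targets; everything else, including any
   processor added to the compiler later, falls into the final branch.  */
#if defined __i486__
# define HAS_CPUID_SKETCH 0   /* Assumption: 0 means "probe at run time".  */
# define HAS_I586_SKETCH 0
# define HAS_I686_SKETCH 0
#elif defined __i586__ || defined __pentium__ \
      || defined __geode__ || defined __k6__
# define HAS_CPUID_SKETCH 1
# define HAS_I586_SKETCH 1
# define HAS_I686_SKETCH 0    /* Conservatively not treated as i686.  */
#else
# define HAS_CPUID_SKETCH 1
# define HAS_I586_SKETCH 1
# define HAS_I686_SKETCH 1
#endif

/* Small helper so the selected values can be inspected from C code.  */
static int
known_at_compile_time_sketch (void)
{
  return HAS_CPUID_SKETCH + HAS_I586_SKETCH + HAS_I686_SKETCH;
}
```

The design point is that the short list of old processors is closed, while the list of i686-and-later processors keeps growing and therefore goes stale.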
* Split fenv_private.h out of math_private.h more consistently. [Joseph Myers, 2018-08-28, 1 file, -0/+497]
    On some architectures, the parts of math_private.h relating to the floating-point environment are in a separate file fenv_private.h included from math_private.h. As this is purely an architecture-specific convention used by several architectures, however, all such architectures still need their own math_private.h, even if it has nothing to do beyond #include <fenv_private.h>, and x86_64 has the peculiarity of including the i386 file directly instead of having a shared file in sysdeps/x86.

    This patch makes the fenv_private.h name an architecture-independent convention in glibc. The include of fenv_private.h from math_private.h becomes architecture-independent (until callers are updated to include fenv_private.h directly so the include from math_private.h is no longer needed). Some architecture math_private.h headers are removed if no longer needed, or renamed to fenv_private.h if all they define belongs in that header; architecture fenv_private.h headers now do require #include_next <fenv_private.h>. The i386 fenv_private.h file moves to sysdeps/x86/fpu/ to reflect how it is actually shared with x86_64. The generic math_private.h gets a new include of <stdbool.h>, as needed for bool in some prototypes in that header (previously that was indirectly included via include/fenv.h, which now only gets included too late in math_private.h, after those prototypes).

    Tested for x86_64 and x86, and tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by the patch.

    * sysdeps/aarch64/fpu/fenv_private.h: New file. Based on ....
    * sysdeps/aarch64/fpu/math_private.h: ... this file. All contents moved to fenv_private.h except for ... (TOINT_INTRINSICS): Kept in math_private.h. (roundtoint): Likewise. (converttoint): Likewise.
    * sysdeps/arm/fenv_private.h: Change multiple-include guard to [ARM_FENV_PRIVATE_H]. Include next <fenv_private.h>.
    * sysdeps/arm/math_private.h: Remove.
    * sysdeps/generic/fenv_private.h: New file. Contents moved from ....
    * sysdeps/generic/math_private.h: ... this file. Include <stdbool.h>. Do not include <fenv.h> or <get-rounding-mode.h>. Include <fenv_private.h>. Remove functions and macros moved to fenv_private.h.
    * sysdeps/i386/fpu/math_private.h: Remove.
    * sysdeps/mips/math_private.h: Move to ....
    * sysdeps/mips/fpu/fenv_private.h: ... here. Change multiple-include guard to [MIPS_FENV_PRIVATE_H]. Remove [__mips_hard_float] conditional. Include next <fenv_private.h>.
    * sysdeps/powerpc/fpu/fenv_private.h: Change multiple-include guard to [POWERPC_FENV_PRIVATE_H]. Include next <fenv_private.h>.
    * sysdeps/powerpc/fpu/math_private.h: Do not include <fenv_private.h>.
    * sysdeps/riscv/rvf/math_private.h: Move to ....
    * sysdeps/riscv/rvf/fenv_private.h: ... here. Change multiple-include guard to [RISCV_FENV_PRIVATE_H]. Include next <fenv_private.h>.
    * sysdeps/sparc/fpu/fenv_private.h: Change multiple-include guard to [SPARC_FENV_PRIVATE_H]. Include next <fenv_private.h>.
    * sysdeps/sparc/fpu/math_private.h: Remove.
    * sysdeps/i386/fpu/fenv_private.h: Move to ....
    * sysdeps/x86/fpu/fenv_private.h: ... here. Change multiple-include guard to [X86_FENV_PRIVATE_H]. Include next <fenv_private.h>.
    * sysdeps/x86_64/fpu/math_private.h: Do not include <sysdeps/i386/fpu/fenv_private.h>.
* Move SNAN_TESTS_* out of math-tests.h. [Joseph Myers, 2018-08-10, 1 file, -25/+0]
    Continuing moving macros out of math-tests.h to smaller headers following typo-proof conventions instead of using #ifndef, this patch moves the SNAN_TESTS_* macros for individual types out to their own sysdeps header (while the type-generic SNAN_TESTS wrapper for those macros remains in math-tests.h).

    Tested for x86_64 and x86, and with build-many-glibcs.py.

    * sysdeps/generic/math-tests-snan.h: New file.
    * sysdeps/generic/math-tests.h: Include <math-tests-snan.h>. (SNAN_TESTS_float): Do not define here. (SNAN_TESTS_double): Likewise. (SNAN_TESTS_long_double): Likewise. (SNAN_TESTS_float128): Likewise.
    * sysdeps/i386/fpu/math-tests-snan.h: New file.
    * sysdeps/i386/fpu/math-tests.h: Remove file.
    * sysdeps/ia64/math-tests-snan.h: New file.
    * sysdeps/ia64/math-tests.h: Remove file.
    * sysdeps/x86/math-tests.h: Likewise.
    * sysdeps/x86_64/fpu/math-tests-snan.h: New file.
* x86: Move STATE_SAVE_OFFSET/STATE_SAVE_MASK to sysdep.h [H.J. Lu, 2018-08-06, 2 files, -14/+9]
    Move STATE_SAVE_OFFSET and STATE_SAVE_MASK to sysdep.h to make sysdeps/x86/cpu-features.h a C header file.

    * sysdeps/x86/cpu-features.h (STATE_SAVE_OFFSET): Removed. (STATE_SAVE_MASK): Likewise. Don't check __ASSEMBLER__ to include <cpu-features-offsets.h>.
    * sysdeps/x86/sysdep.h (STATE_SAVE_OFFSET): New. (STATE_SAVE_MASK): Likewise.
    * sysdeps/x86_64/dl-trampoline.S: Include <cpu-features-offsets.h> instead of <cpu-features.h>.
* x86: Cleanup cpu-features-offsets.sym [H.J. Lu, 2018-08-03, 1 file, -19/+1]
    Remove the unused macros. There are no code changes in libc.so or ld.so on i686 and x86-64.

    * sysdeps/x86/cpu-features-offsets.sym (rtld_global_ro_offsetof): Removed. (CPU_FEATURES_SIZE): Likewise. (CPUID_OFFSET): Likewise. (CPUID_SIZE): Likewise. (CPUID_EAX_OFFSET): Likewise. (CPUID_EBX_OFFSET): Likewise. (CPUID_ECX_OFFSET): Likewise. (CPUID_EDX_OFFSET): Likewise. (FAMILY_OFFSET): Likewise. (MODEL_OFFSET): Likewise. (FEATURE_OFFSET): Likewise. (FEATURE_SIZ): Likewise. (COMMON_CPUID_INDEX_1): Likewise. (COMMON_CPUID_INDEX_7): Likewise. (FEATURE_INDEX_1): Likewise. (RTLD_GLOBAL_RO_DL_X86_CPU_FEATURES_OFFSET): Updated.
* Rename the glibc.tune namespace to glibc.cpu [Siddhesh Poyarekar, 2018-08-02, 6 files, -11/+11]
    The glibc.tune namespace is vaguely named since it is a 'tunable', so give it a more specific name that describes what it refers to. Rename the tunable namespace to 'cpu' to more accurately reflect what it encompasses. Also rename glibc.tune.cpu to glibc.cpu.name since glibc.cpu.cpu is weird.

    * NEWS: Mention the change.
    * elf/dl-tunables.list: Rename tune namespace to cpu.
    * sysdeps/powerpc/dl-tunables.list: Likewise.
    * sysdeps/x86/dl-tunables.list: Likewise.
    * sysdeps/aarch64/dl-tunables.list: Rename tune.cpu to cpu.name.
    * elf/dl-hwcaps.c (_dl_important_hwcaps): Adjust.
    * elf/dl-hwcaps.h (GET_HWCAP_MASK): Likewise.
    * manual/README.tunables: Likewise.
    * manual/tunables.texi: Likewise.
    * sysdeps/powerpc/cpu-features.c: Likewise.
    * sysdeps/unix/sysv/linux/aarch64/cpu-features.c (init_cpu_features): Likewise.
    * sysdeps/x86/cpu-features.c: Likewise.
    * sysdeps/x86/cpu-features.h: Likewise.
    * sysdeps/x86/cpu-tunables.c: Likewise.
    * sysdeps/x86_64/Makefile: Likewise.
    * sysdeps/x86/dl-cet.c: Likewise.

    Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* x86: Rename get_common_indeces to get_common_indices [H.J. Lu, 2018-08-01, 1 file, -4/+4]
    Reviewed-by: Carlos O'Donell <carlos@redhat.com>

    * sysdeps/x86/cpu-features.c (get_common_indeces): Renamed to ... (get_common_indices): This. (init_cpu_features): Updated.
* x86/CET: Fix property note parser [BZ #23467] [H.J. Lu, 2018-07-30, 1 file, -9/+20]
    GNU_PROPERTY_X86_FEATURE_1_AND may not be the first property item. We need to check each property item until we reach the end of the property or find GNU_PROPERTY_X86_FEATURE_1_AND.

    This patch adds 2 tests. The first test checks if IBT is enabled and the second test reads the output from the first test to check if IBT is enabled. The second test fails if IBT isn't enabled properly.

    Reviewed-by: Carlos O'Donell <carlos@redhat.com>

    [BZ #23467]
    * sysdeps/unix/sysv/linux/x86/Makefile (tests): Add tst-cet-property-1 and tst-cet-property-2 if CET is enabled. (CFLAGS-tst-cet-property-1.o): New. (ASFLAGS-tst-cet-property-dep-2.o): Likewise. ($(objpfx)tst-cet-property-2): Likewise. ($(objpfx)tst-cet-property-2.out): Likewise.
    * sysdeps/unix/sysv/linux/x86/tst-cet-property-1.c: New file.
    * sysdeps/unix/sysv/linux/x86/tst-cet-property-2.c: Likewise.
    * sysdeps/unix/sysv/linux/x86/tst-cet-property-dep-2.S: Likewise.
    * sysdeps/x86/dl-prop.h (_dl_process_cet_property_note): Parse each property item until GNU_PROPERTY_X86_FEATURE_1_AND is found.
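The "check each property item" loop can be sketched as follows. This is not the code of _dl_process_cet_property_note: the function name, the `align` parameter and the error handling are assumptions for illustration. It relies on the GNU property layout of a 4-byte type, a 4-byte data size, and data padded to the note alignment, and on the value 0xc0000002 for GNU_PROPERTY_X86_FEATURE_1_AND.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GNU_PROPERTY_X86_FEATURE_1_AND   0xc0000002u
#define GNU_PROPERTY_X86_FEATURE_1_IBT   (1u << 0)
#define GNU_PROPERTY_X86_FEATURE_1_SHSTK (1u << 1)

/* Walk the property items in [ptr, ptr + size) and return the feature
   bits of the first GNU_PROPERTY_X86_FEATURE_1_AND item, or 0 if none is
   found or an item is malformed.  ALIGN is 8 for 64-bit ELF and 4 for
   32-bit ELF (hypothetical parameter for this sketch).  */
static uint32_t
find_feature_1_and (const unsigned char *ptr, size_t size, size_t align)
{
  const unsigned char *end = ptr + size;

  /* Each item starts with an 8-byte header: type and data size.  */
  while ((size_t) (end - ptr) >= 8)
    {
      uint32_t type, datasz;
      memcpy (&type, ptr, 4);
      memcpy (&datasz, ptr + 4, 4);
      ptr += 8;

      if (datasz > (size_t) (end - ptr))
        return 0;   /* Item claims more data than the note holds.  */

      if (type == GNU_PROPERTY_X86_FEATURE_1_AND)
        {
          uint32_t bits;
          if (datasz != 4)
            return 0;   /* The feature word must be 4 bytes.  */
          memcpy (&bits, ptr, 4);
          return bits;
        }

      /* Not the item we want: skip its data, rounded up to ALIGN.  */
      size_t step = (datasz + align - 1) & ~(align - 1);
      if (step > (size_t) (end - ptr))
        return 0;
      ptr += step;
    }
  return 0;
}
```

In the usage exercised by the test, the wanted item is the second one in the buffer, which is exactly the case the commit message describes.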
* x86: Add tst-get-cpu-features-static to $(tests) [BZ #23458] [H.J. Lu, 2018-07-30, 1 file, -1/+1]
    All tests should be added to $(tests).

    Reviewed-by: Carlos O'Donell <carlos@redhat.com>

    [BZ #23458]
    * sysdeps/x86/Makefile (tests): Add tst-get-cpu-features-static.
* x86/CET: Don't parse beyond the note end [H.J. Lu, 2018-07-27, 1 file, -1/+1]
    Simply check if "ptr < ptr_end" since "ptr" is always incremented by 8.

    Reviewed-by: Carlos O'Donell <carlos@redhat.com>

    * sysdeps/x86/dl-prop.h (_dl_process_cet_property_note): Don't parse beyond the note end.
* x86: Populate COMMON_CPUID_INDEX_80000001 for Intel CPUs [BZ #23459] [H.J. Lu, 2018-07-26, 2 files, -10/+19]
    Reviewed-by: Carlos O'Donell <carlos@redhat.com>

    [BZ #23459]
    * sysdeps/x86/cpu-features.c (get_extended_indices): New function. (init_cpu_features): Call get_extended_indices for both Intel and AMD CPUs.
    * sysdeps/x86/cpu-features.h (COMMON_CPUID_INDEX_80000001): Remove "for AMD" comment.
* x86: Correct index_cpu_LZCNT [BZ # 23456] [H.J. Lu, 2018-07-26, 1 file, -1/+1]
    cpu-features.h has

        #define bit_cpu_LZCNT (1 << 5)
        #define index_cpu_LZCNT COMMON_CPUID_INDEX_1
        #define reg_LZCNT

    But the LZCNT feature bit is in COMMON_CPUID_INDEX_80000001:

        Initial EAX Value: 80000001H
        ECX Extended Processor Signature and Feature Bits:
        Bit 05: LZCNT available

    index_cpu_LZCNT should be COMMON_CPUID_INDEX_80000001, not COMMON_CPUID_INDEX_1. The VMX feature bit is in COMMON_CPUID_INDEX_1:

        Initial EAX Value: 01H
        Feature Information Returned in the ECX Register:
        5 VMX

    Reviewed-by: Carlos O'Donell <carlos@redhat.com>

    [BZ # 23456]
    * sysdeps/x86/cpu-features.h (index_cpu_LZCNT): Set to COMMON_CPUID_INDEX_80000001.
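The bug is easy to see once the two CPUID leaves are kept apart. The sketch below uses a hypothetical struct holding the ECX value of each leaf; the bit positions (bit 5 of leaf 01H ECX is VMX, bit 5 of leaf 80000001H ECX is LZCNT) are the ones quoted in the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the ECX outputs of the two CPUID leaves.  */
struct cpuid_ecx_pair
{
  uint32_t leaf_1_ecx;          /* CPUID with EAX = 01H.  */
  uint32_t leaf_80000001_ecx;   /* CPUID with EAX = 80000001H.  */
};

#define BIT_5_SKETCH (1u << 5)

/* Bit 5 of leaf 01H ECX reports VMX.  */
static bool
has_vmx_sketch (const struct cpuid_ecx_pair *r)
{
  return (r->leaf_1_ecx & BIT_5_SKETCH) != 0;
}

/* Bit 5 of leaf 80000001H ECX reports LZCNT.  Testing the same bit in
   leaf 01H, as the old index did, would report VMX instead.  */
static bool
has_lzcnt_sketch (const struct cpuid_ecx_pair *r)
{
  return (r->leaf_80000001_ecx & BIT_5_SKETCH) != 0;
}
```

With the old index, a CPU with VMX but without LZCNT would have been reported as supporting LZCNT.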
* x86/CET: Add tests with legacy non-CET shared objects [H.J. Lu, 2018-07-25, 12 files, -0/+293]
    Check binary compatibility of CET-enabled executables:

    1. When CET-enabled executable is used with legacy non-CET shared object at run-time, ld.so should disable SHSTK and put legacy non-CET shared objects in legacy bitmap.
    2. When IBT-enabled executable dlopens legacy non-CET shared object, ld.so should put legacy shared object in legacy bitmap.
    3. Use GLIBC_TUNABLES=glibc.tune.x86_shstk=[on|off|permissive] to control how SHSTK is enabled.

    * sysdeps/x86/Makefile (tests): Add tst-cet-legacy-1, tst-cet-legacy-2, tst-cet-legacy-2a, tst-cet-legacy-3, tst-cet-legacy-4, tst-cet-legacy-4a, tst-cet-legacy-4b and tst-cet-legacy-4c. (modules-names): Add tst-cet-legacy-mod-1, tst-cet-legacy-mod-2 and tst-cet-legacy-mod-4. (CFLAGS-tst-cet-legacy-2.c): New. (CFLAGS-tst-cet-legacy-mod-1.c): Likewise. (CFLAGS-tst-cet-legacy-mod-2.c): Likewise. (CFLAGS-tst-cet-legacy-3.c): Likewise. (CFLAGS-tst-cet-legacy-4.c): Likewise. (CFLAGS-tst-cet-legacy-mod-4.c): Likewise. ($(objpfx)tst-cet-legacy-1): Likewise. ($(objpfx)tst-cet-legacy-2): Likewise. ($(objpfx)tst-cet-legacy-2.out): Likewise. ($(objpfx)tst-cet-legacy-2a): Likewise. ($(objpfx)tst-cet-legacy-2a.out): Likewise. ($(objpfx)tst-cet-legacy-4): Likewise. ($(objpfx)tst-cet-legacy-4.out): Likewise. ($(objpfx)tst-cet-legacy-4a): Likewise. ($(objpfx)tst-cet-legacy-4a.out): Likewise. (tst-cet-legacy-4a-ENV): Likewise. ($(objpfx)tst-cet-legacy-4b): Likewise. ($(objpfx)tst-cet-legacy-4b.out): Likewise. (tst-cet-legacy-4b-ENV): Likewise. ($(objpfx)tst-cet-legacy-4c): Likewise. ($(objpfx)tst-cet-legacy-4c.out): Likewise. (tst-cet-legacy-4c-ENV): Likewise.
    * sysdeps/x86/tst-cet-legacy-1.c: New file.
    * sysdeps/x86/tst-cet-legacy-2.c: Likewise.
    * sysdeps/x86/tst-cet-legacy-2a.c: Likewise.
    * sysdeps/x86/tst-cet-legacy-3.c: Likewise.
    * sysdeps/x86/tst-cet-legacy-4.c: Likewise.
    * sysdeps/x86/tst-cet-legacy-4a.c: Likewise.
    * sysdeps/x86/tst-cet-legacy-4b.c: Likewise.
    * sysdeps/x86/tst-cet-legacy-4c.c: Likewise.
    * sysdeps/x86/tst-cet-legacy-mod-1.c: Likewise.
    * sysdeps/x86/tst-cet-legacy-mod-2.c: Likewise.
    * sysdeps/x86/tst-cet-legacy-mod-4.c: Likewise.
* x86/CET: Extend arch_prctl syscall for CET control [H.J. Lu, 2018-07-24, 1 file, -0/+3]
    CET arch_prctl bits should be defined in <asm/prctl.h> from Linux kernel header files. Add x86 <include/asm/prctl.h> for pre-CET kernel header files.

    Note: sysdeps/unix/sysv/linux/x86/include/asm/prctl.h should be removed if <asm/prctl.h> from the required kernel header files contains CET arch_prctl bits.

        /* CET features:
           IBT:   GNU_PROPERTY_X86_FEATURE_1_IBT
           SHSTK: GNU_PROPERTY_X86_FEATURE_1_SHSTK
         */

        /* Return CET features in unsigned long long *addr:
             features: addr[0].
             shadow stack base address: addr[1].
             shadow stack size: addr[2].
         */
        # define ARCH_CET_STATUS 0x3001
        /* Disable CET features in unsigned int features.  */
        # define ARCH_CET_DISABLE 0x3002
        /* Lock all CET features.  */
        # define ARCH_CET_LOCK 0x3003
        /* Allocate a new shadow stack with unsigned long long *addr:
             IN: requested shadow stack size: *addr.
             OUT: allocated shadow stack address: *addr.
         */
        # define ARCH_CET_ALLOC_SHSTK 0x3004
        /* Return legacy region bitmap info in unsigned long long *addr:
             address: addr[0].
             size: addr[1].
         */
        # define ARCH_CET_LEGACY_BITMAP 0x3005

    Reviewed-by: Carlos O'Donell <carlos@redhat.com>

    * sysdeps/unix/sysv/linux/x86/include/asm/prctl.h: New file.
    * sysdeps/unix/sysv/linux/x86/cpu-features.c: Include <sys/prctl.h> and <asm/prctl.h>. (get_cet_status): Call arch_prctl with ARCH_CET_STATUS.
    * sysdeps/unix/sysv/linux/x86/dl-cet.h: Include <sys/prctl.h> and <asm/prctl.h>. (dl_cet_allocate_legacy_bitmap): Call arch_prctl with ARCH_CET_LEGACY_BITMAP. (dl_cet_disable_cet): Call arch_prctl with ARCH_CET_DISABLE. (dl_cet_lock_cet): Call arch_prctl with ARCH_CET_LOCK.
    * sysdeps/x86/libc-start.c: Include <startup.h>.
* Add <bits/indirect-return.h> [H.J. Lu, 2018-07-24, 1 file, -0/+37]
    Add <bits/indirect-return.h> and include it in <ucontext.h>. __INDIRECT_RETURN defined in <bits/indirect-return.h> indicates if swapcontext requires special compiler treatment. The default __INDIRECT_RETURN is empty.

    On x86, when shadow stack is enabled, __INDIRECT_RETURN is defined with indirect_return attribute, which has been added to GCC 9, to indicate that swapcontext returns via indirect branch. Otherwise __INDIRECT_RETURN is defined with returns_twice attribute.

    When shadow stack is enabled, remove always_inline attribute from prepare_test_buffer in string/tst-xbzero-opt.c to avoid:

        tst-xbzero-opt.c: In function ‘prepare_test_buffer’:
        tst-xbzero-opt.c:105:1: error: function ‘prepare_test_buffer’ can never be inlined because it uses setjmp
        prepare_test_buffer (unsigned char *buf)

    when indirect_return attribute isn't available.

    Reviewed-by: Carlos O'Donell <carlos@redhat.com>

    * bits/indirect-return.h: New file.
    * misc/sys/cdefs.h (__glibc_has_attribute): New.
    * sysdeps/x86/bits/indirect-return.h: Likewise.
    * stdlib/Makefile (headers): Add bits/indirect-return.h.
    * stdlib/ucontext.h: Include <bits/indirect-return.h>. (swapcontext): Add __INDIRECT_RETURN.
    * string/tst-xbzero-opt.c (ALWAYS_INLINE): New. (prepare_test_buffer): Use it.
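The attribute selection described above might look roughly like the following. The `_SKETCH` names and the `INDIRECT_RETURN_KIND_SKETCH` helper macro are hypothetical, and the test `(__CET__ & 2) != 0` for "shadow stack enabled" is an assumption based on how compilers define `__CET__` under -fcf-protection; the commit message itself only names the two attributes.

```c
/* Fall back to "attribute not available" on compilers that lack
   __has_attribute.  */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

#if defined __CET__ && (__CET__ & 2) != 0
/* Shadow stack enabled: prefer indirect_return when the compiler has it.  */
# if __has_attribute (__indirect_return__)
#  define INDIRECT_RETURN_SKETCH __attribute__ ((__indirect_return__))
#  define INDIRECT_RETURN_KIND_SKETCH 2
# else
/* Older compilers: returns_twice is the conservative substitute.  */
#  define INDIRECT_RETURN_SKETCH __attribute__ ((__returns_twice__))
#  define INDIRECT_RETURN_KIND_SKETCH 1
# endif
#else
/* Default: no special treatment needed.  */
# define INDIRECT_RETURN_SKETCH
# define INDIRECT_RETURN_KIND_SKETCH 0
#endif

/* Hypothetical declaration showing where the macro is attached.  */
struct context_sketch;
extern int swapcontext_sketch (struct context_sketch *oucp,
                               const struct context_sketch *ucp)
  INDIRECT_RETURN_SKETCH;
```

The returns_twice fallback is what triggers the "can never be inlined because it uses setjmp" error quoted above, since the compiler then treats callers of swapcontext like callers of setjmp.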
* x86: Always include <dl-cet.h>/<cet-tunables.h> for --enable-cetH.J. Lu2018-07-171-2/+5
Always include <dl-cet.h> and <cet-tunables.h> when CET is enabled. Otherwise, configuring glibc with --enable-cet --disable-tunables fails to build. * sysdeps/x86/cpu-features.c: Always include <dl-cet.h> and <cet-tunables.h> when CET is enabled.
* x86: Support IBT and SHSTK in Intel CET [BZ #21598]H.J. Lu2018-07-1614-0/+979
Intel Control-flow Enforcement Technology (CET) instructions:

https://software.intel.com/sites/default/files/managed/4d/2a/control-flow-enforcement-technology-preview.pdf

include Indirect Branch Tracking (IBT) and Shadow Stack (SHSTK).

GNU_PROPERTY_X86_FEATURE_1_IBT is added to GNU program property to indicate that all executable sections are compatible with IBT when ENDBR instruction starts each valid target where an indirect branch instruction can land. Linker sets GNU_PROPERTY_X86_FEATURE_1_IBT on output only if it is set on all relocatable inputs.

On an IBT capable processor, the following steps should be taken:

1. When loading an executable without an interpreter, enable IBT and lock IBT if GNU_PROPERTY_X86_FEATURE_1_IBT is set on the executable.
2. When loading an executable with an interpreter, enable IBT if GNU_PROPERTY_X86_FEATURE_1_IBT is set on the interpreter.
   a. If GNU_PROPERTY_X86_FEATURE_1_IBT isn't set on the executable, disable IBT.
   b. Lock IBT.
3. If IBT is enabled, when loading a shared object without GNU_PROPERTY_X86_FEATURE_1_IBT:
   a. If legacy interwork is allowed, then mark all pages in executable PT_LOAD segments in legacy code page bitmap. Failure of legacy code page bitmap allocation causes an error.
   b. If legacy interwork isn't allowed, it causes an error.

GNU_PROPERTY_X86_FEATURE_1_SHSTK is added to GNU program property to indicate that all executable sections are compatible with SHSTK where return address popped from shadow stack always matches return address popped from normal stack. Linker sets GNU_PROPERTY_X86_FEATURE_1_SHSTK on output only if it is set on all relocatable inputs.

On a SHSTK capable processor, the following steps should be taken:

1. When loading an executable without an interpreter, enable SHSTK if GNU_PROPERTY_X86_FEATURE_1_SHSTK is set on the executable.
2. When loading an executable with an interpreter, enable SHSTK if GNU_PROPERTY_X86_FEATURE_1_SHSTK is set on the interpreter.
   a. If GNU_PROPERTY_X86_FEATURE_1_SHSTK isn't set on the executable or any shared objects loaded via the DT_NEEDED tag, disable SHSTK.
   b. Otherwise lock SHSTK.
3. After SHSTK is enabled, it is an error to load a shared object without GNU_PROPERTY_X86_FEATURE_1_SHSTK.

To enable CET support in glibc, --enable-cet is required to configure glibc. When CET is enabled, both compiler and assembler must support CET. Otherwise, it is a configure-time error.

To support CET run-time control,

1. _dl_x86_feature_1 is added to the writable ld.so namespace to indicate if IBT or SHSTK are enabled at run-time. It should be initialized by init_cpu_features.
2. For dynamic executables:
   a. An l_cet field is added to struct link_map to indicate if IBT or SHSTK is enabled in an ELF module. _dl_process_pt_note or _rtld_process_pt_note is called to process PT_NOTE segment for GNU program property and set l_cet.
   b. _dl_open_check is added to check IBT and SHSTK compatibility when dlopening a shared object.
3. Replace i386 _dl_runtime_resolve and _dl_runtime_profile with _dl_runtime_resolve_shstk and _dl_runtime_profile_shstk, respectively if SHSTK is enabled.

CET run-time control can be changed via GLIBC_TUNABLES with

  $ export GLIBC_TUNABLES=glibc.tune.x86_shstk=[permissive|on|off]
  $ export GLIBC_TUNABLES=glibc.tune.x86_ibt=[permissive|on|off]

1. permissive: SHSTK is disabled when dlopening a legacy ELF module.
2. on: IBT or SHSTK are always enabled, regardless of whether there are IBT or SHSTK bits in GNU program property.
3. off: IBT or SHSTK are always disabled, regardless of whether there are IBT or SHSTK bits in GNU program property.

<cet.h> from CET-enabled GCC is automatically included by assembly code to add GNU_PROPERTY_X86_FEATURE_1_IBT and GNU_PROPERTY_X86_FEATURE_1_SHSTK to GNU program property. _CET_ENDBR is added at the entrance of all assembly functions whose address may be taken. _CET_NOTRACK is used to insert NOTRACK prefix with indirect jump table to support IBT. It is defined as notrack when _CET_NOTRACK is defined in <cet.h>. [BZ #21598] * configure.ac: Add --enable-cet. * configure: Regenerated. * elf/Makefile (all-built-dso): Add a comment. * elf/dl-load.c (filebuf): Moved before "dynamic-link.h". Include <dl-prop.h>. (_dl_map_object_from_fd): Call _dl_process_pt_note on PT_NOTE segment. * elf/dl-open.c: Include <dl-prop.h>. (dl_open_worker): Call _dl_open_check. * elf/rtld.c: Include <dl-prop.h>. (dl_main): Call _rtld_process_pt_note on PT_NOTE segment. Call _rtld_main_check. * sysdeps/generic/dl-prop.h: New file. * sysdeps/i386/dl-cet.c: Likewise. * sysdeps/unix/sysv/linux/x86/cpu-features.c: Likewise. * sysdeps/unix/sysv/linux/x86/dl-cet.h: Likewise. * sysdeps/x86/cet-tunables.h: Likewise. * sysdeps/x86/check-cet.awk: Likewise. * sysdeps/x86/configure: Likewise. * sysdeps/x86/configure.ac: Likewise. * sysdeps/x86/dl-cet.c: Likewise. * sysdeps/x86/dl-procruntime.c: Likewise. * sysdeps/x86/dl-prop.h: Likewise. * sysdeps/x86/libc-start.h: Likewise. * sysdeps/x86/link_map.h: Likewise. * sysdeps/i386/dl-trampoline.S (_dl_runtime_resolve): Add _CET_ENDBR. (_dl_runtime_profile): Likewise. (_dl_runtime_resolve_shstk): New. (_dl_runtime_profile_shstk): Likewise. * sysdeps/linux/x86/Makefile (sysdep-dl-routines): Add dl-cet if CET is enabled. (CFLAGS-.o): Add -fcf-protection if CET is enabled. (CFLAGS-.os): Likewise. (CFLAGS-.op): Likewise. (CFLAGS-.oS): Likewise. (asm-CPPFLAGS): Add -fcf-protection -include cet.h if CET is enabled. (tests-special): Add $(objpfx)check-cet.out. (cet-built-dso): New. (+$(cet-built-dso:=.note)): Likewise. 
(common-generated): Add $(cet-built-dso:$(common-objpfx)%=%.note). ($(objpfx)check-cet.out): New. (generated): Add check-cet.out. * sysdeps/x86/cpu-features.c: Include <dl-cet.h> and <cet-tunables.h>. (TUNABLE_CALLBACK (set_x86_ibt)): New prototype. (TUNABLE_CALLBACK (set_x86_shstk)): Likewise. (init_cpu_features): Call get_cet_status to check CET status and update dl_x86_feature_1 with CET status. Call TUNABLE_CALLBACK (set_x86_ibt) and TUNABLE_CALLBACK (set_x86_shstk). Disable and lock CET in libc.a. * sysdeps/x86/cpu-tunables.c: Include <cet-tunables.h>. (TUNABLE_CALLBACK (set_x86_ibt)): New function. (TUNABLE_CALLBACK (set_x86_shstk)): Likewise. * sysdeps/x86/sysdep.h (_CET_NOTRACK): New. (_CET_ENDBR): Define if not defined. (ENTRY): Add _CET_ENDBR. * sysdeps/x86/dl-tunables.list (glibc.tune): Add x86_ibt and x86_shstk. * sysdeps/x86_64/dl-trampoline.h (_dl_runtime_resolve): Add _CET_ENDBR. (_dl_runtime_profile): Likewise.
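The SHSTK start-up steps in the commit message above (executable loaded through an interpreter) can be modelled as a small decision function. This is only an illustrative sketch: every demo_ name is hypothetical, the real logic lives in the dynamic loader and talks to the kernel, and the bit values are assumed to follow the x86 psABI definition of the GNU_PROPERTY_X86_FEATURE_1_AND bits.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed bit layout of the feature_1 property word.  */
#define DEMO_FEATURE_1_IBT   (1U << 0)
#define DEMO_FEATURE_1_SHSTK (1U << 1)

struct demo_shstk_state
{
  bool enabled;
  bool locked;
};

/* Step 2 of the SHSTK list: enable SHSTK if the interpreter is marked;
   disable it again if the executable or any DT_NEEDED object lacks the
   marker (2a); otherwise lock it (2b).  */
static struct demo_shstk_state
demo_shstk_at_startup (unsigned int interp, unsigned int exe,
                       const unsigned int *needed, size_t n_needed)
{
  struct demo_shstk_state st = { false, false };

  if ((interp & DEMO_FEATURE_1_SHSTK) == 0)
    return st;                  /* Never enabled in the first place.  */
  st.enabled = true;

  bool all_marked = (exe & DEMO_FEATURE_1_SHSTK) != 0;
  for (size_t i = 0; all_marked && i < n_needed; i++)
    all_marked = (needed[i] & DEMO_FEATURE_1_SHSTK) != 0;

  if (all_marked)
    st.locked = true;           /* 2b: every module is compatible.  */
  else
    st.enabled = false;         /* 2a: a legacy module was found.  */
  return st;
}
```

The same AND-over-all-inputs shape also describes how the linker is said to set the property on its output: the bit survives only if every relocatable input carries it.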
* x86: Support shadow stack pointer in setjmp/longjmpH.J. Lu2018-07-142-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Save and restore shadow stack pointer in setjmp and longjmp to support shadow stack in Intel CET. Use feature_1 in tcbhead_t to check if shadow stack is enabled before saving and restoring shadow stack pointer. Reviewed-by: Carlos O'Donell <carlos@redhat.com> * sysdeps/i386/__longjmp.S: Include <jmp_buf-ssp.h>. (__longjmp): Restore shadow stack pointer if shadow stack is enabled, SHADOW_STACK_POINTER_OFFSET is defined and __longjmp isn't defined for __longjmp_cancel. * sysdeps/i386/bsd-_setjmp.S: Include <jmp_buf-ssp.h>. (_setjmp): Save shadow stack pointer if shadow stack is enabled and SHADOW_STACK_POINTER_OFFSET is defined. * sysdeps/i386/bsd-setjmp.S: Include <jmp_buf-ssp.h>. (setjmp): Save shadow stack pointer if shadow stack is enabled and SHADOW_STACK_POINTER_OFFSET is defined. * sysdeps/i386/setjmp.S: Include <jmp_buf-ssp.h>. (__sigsetjmp): Save shadow stack pointer if shadow stack is enabled and SHADOW_STACK_POINTER_OFFSET is defined. * sysdeps/unix/sysv/linux/i386/____longjmp_chk.S: Include <jmp_buf-ssp.h>. (____longjmp_chk): Restore shadow stack pointer if shadow stack is enabled and SHADOW_STACK_POINTER_OFFSET is defined. * sysdeps/unix/sysv/linux/x86/Makefile (gen-as-const-headers): Remove jmp_buf-ssp.sym. * sysdeps/unix/sysv/linux/x86_64/____longjmp_chk.S: Include <jmp_buf-ssp.h>. (____longjmp_chk): Restore shadow stack pointer if shadow stack is enabled and SHADOW_STACK_POINTER_OFFSET is defined. * sysdeps/x86/Makefile (gen-as-const-headers): Add jmp_buf-ssp.sym. * sysdeps/x86/jmp_buf-ssp.sym: New dummy file. * sysdeps/x86_64/__longjmp.S: Include <jmp_buf-ssp.h>. (__longjmp): Restore shadow stack pointer if shadow stack is enabled, SHADOW_STACK_POINTER_OFFSET is defined and __longjmp isn't defined for __longjmp_cancel. * sysdeps/x86_64/setjmp.S: Include <jmp_buf-ssp.h>. 
(__sigsetjmp): Save shadow stack pointer if shadow stack is enabled and SHADOW_STACK_POINTER_OFFSET is defined.
* x86: Rename __glibc_reserved1 to feature_1 in tcbhead_t [BZ #22563]H.J. Lu2018-07-142-0/+50
feature_1 has X86_FEATURE_1_IBT and X86_FEATURE_1_SHSTK bits for CET run-time control. CET_ENABLED, IBT_ENABLED and SHSTK_ENABLED are defined to 1 or 0 to indicate whether CET, IBT and SHSTK are enabled. <tls-setup.h> is added to set up thread-local data. Reviewed-by: Carlos O'Donell <carlos@redhat.com> [BZ #22563] * nptl/pthread_create.c: Include <tls-setup.h>. (__pthread_create_2_1): Call tls_setup_tcbhead. * sysdeps/generic/tls-setup.h: New file. * sysdeps/x86/nptl/tls-setup.h: Likewise. * sysdeps/i386/nptl/tcb-offsets.sym (FEATURE_1_OFFSET): New. * sysdeps/x86_64/nptl/tcb-offsets.sym (FEATURE_1_OFFSET): Likewise. * sysdeps/i386/nptl/tls.h (tcbhead_t): Rename __glibc_reserved1 to feature_1. * sysdeps/x86_64/nptl/tls.h (tcbhead_t): Likewise. * sysdeps/x86/sysdep.h (X86_FEATURE_1_IBT): New. (X86_FEATURE_1_SHSTK): Likewise. (CET_ENABLED): Likewise. (IBT_ENABLED): Likewise. (SHSTK_ENABLED): Likewise.
* Use AVX_Fast_Unaligned_Load from Zen onwards.Amit Pawar2018-07-061-5/+13
From Zen onwards, AVX_Fast_Unaligned_Load is enabled. It was disabled for Excavator and remains disabled there. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* x86-64: Check Prefer_FSRM in ifunc-memmove.hH.J. Lu2018-05-212-0/+4
| | | | | | | | | | | | | Although the REP MOVSB implementations of memmove, memcpy and mempcpy aren't used by the current processors, this patch adds Prefer_FSRM check in ifunc-memmove.h so that they can be used in the future. * sysdeps/x86/cpu-features.h (bit_arch_Prefer_FSRM): New. (index_arch_Prefer_FSRM): Likewise. * sysdeps/x86/cpu-tunables.c (TUNABLE_CALLBACK (set_hwcaps)): Also check Prefer_FSRM. * sysdeps/x86_64/multiarch/ifunc-memmove.h (IFUNC_SELECTOR): Also return OPTIMIZE (erms) for Prefer_FSRM.
* Initial Fast Short REP MOVSB (FSRM) supportH.J. Lu2018-05-211-0/+3
| | | | | | | | | | The newer Intel processors support Fast Short REP MOVSB which has a feature bit in CPUID. This patch adds the Fast Short REP MOVSB (FSRM) bit to x86 cpu-features. * sysdeps/x86/cpu-features.h (bit_cpu_FSRM): New. (index_cpu_FSRM): Likewise. (reg_FSRM): Likewise.
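For orientation, the FSRM feature bit mentioned above can be read from user space roughly as follows. This is a sketch under stated assumptions, not the cpu-features code: demo_cpu_has_fsrm is a hypothetical name, the compiler's <cpuid.h> helper __get_cpuid_count is used instead of glibc's internal macros, and FSRM is taken to be bit 4 of EDX in CPUID leaf 7, subleaf 0.

```c
#if defined __x86_64__ || defined __i386__
# include <cpuid.h>
#endif

/* Returns 1 if the processor reports Fast Short REP MOVSB, else 0.
   On non-x86 targets the answer is always 0.  */
static int
demo_cpu_has_fsrm (void)
{
#if defined __x86_64__ || defined __i386__
  unsigned int eax, ebx, ecx, edx;

  /* __get_cpuid_count returns 0 when leaf 7 is not supported.  */
  if (!__get_cpuid_count (7, 0, &eax, &ebx, &ecx, &edx))
    return 0;
  return (int) ((edx >> 4) & 1);
#else
  return 0;
#endif
}
```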
* x86: Add sysdeps/x86/ldsodefs.hH.J. Lu2018-05-141-0/+66
Merge sysdeps/i386/ldsodefs.h and sysdeps/x86_64/ldsodefs.h into sysdeps/x86/ldsodefs.h. Tested on i686 and x86-64. * sysdeps/i386/ldsodefs.h: Removed. * sysdeps/x86_64/ldsodefs.h: Moved to ... * sysdeps/x86/ldsodefs.h: This. (La_i86_regs): New. (La_i86_retval): Likewise. (ARCH_PLTENTER_MEMBERS): Add i86_gnu_pltenter. (ARCH_PLTEXIT_MEMBERS): Add i86_gnu_pltexit. Acked-by: Christian Brauner (Ubuntu) <christian@brauner.io>
* Move math_check_force_underflow macros to separate math-underflow.h.Joseph Myers2018-05-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch continues cleaning up math_private.h by moving the math_check_force_underflow set of macros to a separate header math-underflow.h. This header is included by the files that need it rather than from math_private.h. Moving these macros to a separate file removes the math_private.h uses of macros from float.h, so the inclusion of float.h in math_private.h is also removed; files that were depending on that inclusion are fixed to include float.h directly. The inclusion of math-barriers.h from math_private.h will be removed in a separate patch. Tested for x86_64 and x86. Also tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by this patch. * math/math-underflow.h: New file. * sysdeps/generic/math_private.h: Do not include <float.h>. (fabs_tg): Remove macro. Moved to math-underflow.h. (min_of_type_f): Likewise. (min_of_type_): Likewise. (min_of_type_l): Likewise. (min_of_type_f128): Likewise. (min_of_type): Likewise. (math_check_force_underflow): Likewise. (math_check_force_underflow_nonneg): Likewise. (math_check_force_underflow_complex): Likewise. * math/e_exp2_template.c: Include <math-underflow.h>. * math/k_casinh_template.c: Likewise. * math/s_catan_template.c: Likewise. * math/s_catanh_template.c: Likewise. * math/s_ccosh_template.c: Likewise. * math/s_cexp_template.c: Likewise. * math/s_clog10_template.c: Likewise. * math/s_clog_template.c: Likewise. * math/s_csin_template.c: Likewise. * math/s_csinh_template.c: Likewise. * math/s_csqrt_template.c: Likewise. * math/s_ctan_template.c: Likewise. * math/s_ctanh_template.c: Likewise. * sysdeps/ieee754/dbl-64/e_asin.c: Likewise. * sysdeps/ieee754/dbl-64/e_atanh.c: Likewise. * sysdeps/ieee754/dbl-64/e_exp2.c: Likewise. 
* sysdeps/ieee754/dbl-64/e_gamma_r.c: Likewise. * sysdeps/ieee754/dbl-64/e_hypot.c: Likewise. * sysdeps/ieee754/dbl-64/e_j1.c: Likewise. * sysdeps/ieee754/dbl-64/e_jn.c: Likewise. * sysdeps/ieee754/dbl-64/e_pow.c: Likewise. * sysdeps/ieee754/dbl-64/e_sinh.c: Likewise. * sysdeps/ieee754/dbl-64/s_asinh.c: Likewise. * sysdeps/ieee754/dbl-64/s_atan.c: Likewise. * sysdeps/ieee754/dbl-64/s_erf.c: Likewise. * sysdeps/ieee754/dbl-64/s_expm1.c: Likewise. * sysdeps/ieee754/dbl-64/s_log1p.c: Likewise. * sysdeps/ieee754/dbl-64/s_sin.c: Likewise. * sysdeps/ieee754/dbl-64/s_sincos.c: Likewise. * sysdeps/ieee754/dbl-64/s_tan.c: Likewise. * sysdeps/ieee754/dbl-64/s_tanh.c: Likewise. * sysdeps/ieee754/flt-32/e_asinf.c: Likewise. * sysdeps/ieee754/flt-32/e_atanhf.c: Likewise. * sysdeps/ieee754/flt-32/e_gammaf_r.c: Likewise. * sysdeps/ieee754/flt-32/e_j1f.c: Likewise. * sysdeps/ieee754/flt-32/e_jnf.c: Likewise. * sysdeps/ieee754/flt-32/e_sinhf.c: Likewise. * sysdeps/ieee754/flt-32/k_sinf.c: Likewise. * sysdeps/ieee754/flt-32/k_tanf.c: Likewise. * sysdeps/ieee754/flt-32/s_asinhf.c: Likewise. * sysdeps/ieee754/flt-32/s_atanf.c: Likewise. * sysdeps/ieee754/flt-32/s_erff.c: Likewise. * sysdeps/ieee754/flt-32/s_expm1f.c: Likewise. * sysdeps/ieee754/flt-32/s_log1pf.c: Likewise. * sysdeps/ieee754/flt-32/s_tanhf.c: Likewise. * sysdeps/ieee754/ldbl-128/e_asinl.c: Likewise. * sysdeps/ieee754/ldbl-128/e_atanhl.c: Likewise. * sysdeps/ieee754/ldbl-128/e_expl.c: Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c: Likewise. * sysdeps/ieee754/ldbl-128/e_hypotl.c: Likewise. * sysdeps/ieee754/ldbl-128/e_j1l.c: Likewise. * sysdeps/ieee754/ldbl-128/e_jnl.c: Likewise. * sysdeps/ieee754/ldbl-128/e_sinhl.c: Likewise. * sysdeps/ieee754/ldbl-128/k_sincosl.c: Likewise. * sysdeps/ieee754/ldbl-128/k_sinl.c: Likewise. * sysdeps/ieee754/ldbl-128/k_tanl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_asinhl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_atanl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_erfl.c: Likewise. 
* sysdeps/ieee754/ldbl-128/s_expm1l.c: Likewise. * sysdeps/ieee754/ldbl-128/s_log1pl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_tanhl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/e_asinl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/e_atanhl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/e_hypotl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/e_j1l.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/e_jnl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/e_powl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/e_sinhl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/k_sincosl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/k_sinl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/k_tanl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_asinhl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_atanl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_erfl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_fmal.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_tanhl.c: Likewise. * sysdeps/ieee754/ldbl-96/e_asinl.c: Likewise. * sysdeps/ieee754/ldbl-96/e_atanhl.c: Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c: Likewise. * sysdeps/ieee754/ldbl-96/e_hypotl.c: Likewise. * sysdeps/ieee754/ldbl-96/e_j1l.c: Likewise. * sysdeps/ieee754/ldbl-96/e_jnl.c: Likewise. * sysdeps/ieee754/ldbl-96/e_sinhl.c: Likewise. * sysdeps/ieee754/ldbl-96/k_sinl.c: Likewise. * sysdeps/ieee754/ldbl-96/k_tanl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_asinhl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_erfl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_tanhl.c: Likewise. * sysdeps/powerpc/fpu/e_hypot.c: Likewise. * sysdeps/x86/fpu/powl_helper.c: Likewise. * sysdeps/ieee754/dbl-64/s_nextup.c: Include <float.h>. * sysdeps/ieee754/flt-32/s_nextupf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_nextupl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_nextupl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_nextupl.c: Likewise.
* Move math_opt_barrier, math_force_eval to separate math-barriers.h.Joseph Myers2018-05-091-0/+61
This patch continues cleaning up math_private.h by moving the math_opt_barrier and math_force_eval macros to a separate header math-barriers.h. At present, those macros are inside a "#ifndef math_opt_barrier" in math_private.h to allow architectures to override them. With a separate math-barriers.h header, no such #ifndef or #include_next is needed; architectures just have their own alternative version of math-barriers.h when providing their own optimized versions that avoid going through memory unnecessarily. The generic math-barriers.h has a comment added to document these two macros. In this patch, math_private.h is made to #include <math-barriers.h>, so files using these macros do not need updating yet. That is because of uses of math_force_eval in math_check_force_underflow and math_check_force_underflow_nonneg, which are still defined in math_private.h. Once those are moved out to a separate header, that separate header can be made to include <math-barriers.h>, as can the other files directly using these barrier macros, and then the include of <math-barriers.h> from math_private.h can be removed. Tested for x86_64 and x86. Also tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by this patch. * sysdeps/generic/math-barriers.h: New file. * sysdeps/generic/math_private.h [!math_opt_barrier] (math_opt_barrier): Move to math-barriers.h. [!math_opt_barrier] (math_force_eval): Likewise. * sysdeps/aarch64/fpu/math-barriers.h: New file. * sysdeps/aarch64/fpu/math_private.h (math_opt_barrier): Move to math-barriers.h. (math_force_eval): Likewise. * sysdeps/alpha/fpu/math-barriers.h: New file. * sysdeps/alpha/fpu/math_private.h (math_opt_barrier): Move to math-barriers.h. (math_force_eval): Likewise. * sysdeps/x86/fpu/math-barriers.h: New file. * sysdeps/i386/fpu/fenv_private.h (math_opt_barrier): Move to math-barriers.h. 
(math_force_eval): Likewise. * sysdeps/m68k/m680x0/fpu/math_private.h: Move to.... * sysdeps/m68k/m680x0/fpu/math-barriers.h: ... here. Adjust multiple-include guard for rename. * sysdeps/powerpc/fpu/math-barriers.h: New file. * sysdeps/powerpc/fpu/math_private.h (math_opt_barrier): Move to math-barriers.h. (math_force_eval): Likewise.
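To make the two barrier macros concrete, a generic memory-constraint version might look like the sketch below. The demo_ names are hypothetical, the code relies on GNU C extensions (statement expressions, __typeof, extended asm), and it is meant only to illustrate the "going through memory" behaviour that the architecture-specific headers are said to avoid.

```c
/* Pass X through an opaque asm so the compiler cannot fold, hoist or
   sink the computation across this point; the "+m" constraint forces
   the value through memory.  */
#define demo_opt_barrier(x) \
  ({ __typeof (x) demo_x = (x); __asm__ ("" : "+m" (demo_x)); demo_x; })

/* Force X to be evaluated (for its floating-point exceptions) even if
   the result is otherwise unused.  */
#define demo_force_eval(x) \
  ({ __typeof (x) demo_x = (x); __asm__ __volatile__ ("" : : "m" (demo_x)); })

/* Usage sketch: keep the addition from being constant-folded and make
   sure it is actually performed.  */
static double
demo_add_exact (double a, double b)
{
  double sum = demo_opt_barrier (a) + b;
  demo_force_eval (sum);
  return sum;
}
```

An architecture-specific header would typically swap the "m" constraints for a register class so the value never leaves the floating-point registers.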
* x86: Use pad in pthread_unwind_buf to preserve shadow stack registerH.J. Lu2018-05-024-0/+140
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The pad array in struct pthread_unwind_buf is used by setjmp to save shadow stack register. We assert that size of struct pthread_unwind_buf is no less than offset of shadow stack pointer + shadow stack pointer size. Since functions, like LIBC_START_MAIN, START_THREAD_DEFN as well as these with thread cancellation, call setjmp, but never return after __libc_unwind_longjmp, __libc_unwind_longjmp, which is defined as __libc_longjmp on x86, doesn't need to restore shadow stack register. __libc_longjmp, which is a private interface for thread cancellation implementation in libpthread, is changed to call __longjmp_cancel, instead of __longjmp. __longjmp_cancel is a new internal function in libc, which is similar to __longjmp, but doesn't restore shadow stack register. The compatibility longjmp and siglongjmp in libpthread.so are changed to call __libc_siglongjmp, instead of __libc_longjmp, so that they will restore shadow stack register. Tested with build-many-glibcs.py. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Carlos O'Donell <carlos@redhat.com> * nptl/pthread_create.c (START_THREAD_DEFN): Clear previous handlers after setjmp. * setjmp/longjmp.c (__libc_longjmp): Don't define alias if defined. * sysdeps/unix/sysv/linux/x86/setjmpP.h: Include <libc-pointer-arith.h>. (_JUMP_BUF_SIGSET_BITS_PER_WORD): New. (_JUMP_BUF_SIGSET_NSIG): Changed to 96. (_JUMP_BUF_SIGSET_NWORDS): Changed to use ALIGN_UP and _JUMP_BUF_SIGSET_BITS_PER_WORD. * sysdeps/x86/Makefile (sysdep_routines): Add __longjmp_cancel. * sysdeps/x86/__longjmp_cancel.S: New file. * sysdeps/x86/longjmp.c: Likewise. * sysdeps/x86/nptl/pt-longjmp.c: Likewise.
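The size assertion mentioned above can be pictured with a compile-time check like the following. The structure layout, field sizes and the DEMO_SSP_OFFSET value are all hypothetical stand-ins chosen for illustration; the real struct pthread_unwind_buf and SHADOW_STACK_POINTER_OFFSET come from glibc's own headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct pthread_unwind_buf: a saved-register
   area followed by padding that setjmp may reuse.  */
struct demo_unwind_buf
{
  struct
  {
    long jmp_buf_regs[8];   /* Callee-saved registers (illustrative size).  */
    int mask_was_saved;
  } cancel_jmp_buf[1];
  void *pad[4];             /* Spare words; one holds the shadow stack pointer.  */
};

/* Assumed location of the saved shadow stack pointer: the first pad word.  */
#define DEMO_SSP_OFFSET offsetof (struct demo_unwind_buf, pad)

/* The buffer must be at least as large as the offset of the shadow
   stack pointer plus the size of that pointer.  */
_Static_assert (sizeof (struct demo_unwind_buf)
                >= DEMO_SSP_OFFSET + sizeof (uintptr_t),
                "shadow stack pointer must fit inside the unwind buffer");

/* Run-time mirror of the same condition, handy for tests.  */
static int
demo_ssp_fits (void)
{
  return sizeof (struct demo_unwind_buf)
         >= DEMO_SSP_OFFSET + sizeof (uintptr_t);
}
```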
* Remove sysdeps/x86/fpu/bits/mathinline.h __finite inline.Joseph Myers2018-03-161-13/+0
| | | | | | | | | | | | | | | | | | | | | | | Continuing the removals of inline functions from the x86 bits/mathinline.h, this patch removes an inline of __finite (which was not actually architecture-specific at all beyond its endianness-dependence). This inline is not normally used with GCC 4.4 or later, because isfinite now uses __builtin_isfinite except for -fsignaling-nans. Allowing __builtin_isfinite etc. to work properly even for -fsignaling-nans, by implementing versions of those built-in functions that use integer arithmetic in GCC, is <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66462> (a patch was committed but had to be reverted because it caused problems, and that patch didn't address all formats for all architectures, only some, so by itself would not have been sufficient to allow glibc to use __builtin_isfinite unconditionally for new-enough GCC). Tested for x86_64 and x86. * sysdeps/x86/fpu/bits/mathinline.h [__USE_MISC] (__finite): Remove inline function.
* Remove all target specific __ieee754_sqrt(f/l) inlinesWilco Dijkstra2018-03-151-1/+0
Remove the now unused target-specific __ieee754_sqrt(f/l) inlines. Also remove inlines of sqrt which are for really old GCC versions. Removing these is desirable, under the general principle of leaving such inlining to the compiler rather than trying to do it in installed headers, especially when only very old compilers are affected. Note that removing inlines for __ieee754_sqrt disables inlining in the sqrt wrapper functions. Given the sqrt function will typically only be called for negative arguments, it doesn't matter whether the inlining happens or not. * sysdeps/aarch64/fpu/math_private.h (__ieee754_sqrt): Remove. (__ieee754_sqrtf): Remove. * sysdeps/alpha/fpu/math_private.h (__ieee754_sqrt): Remove. (__ieee754_sqrtf): Remove. * sysdeps/generic/math-type-macros.h (M_SQRT): Use sqrt. * sysdeps/m68k/m680x0/fpu/mathimpl.h (__ieee754_sqrt): Remove. * sysdeps/powerpc/fpu/math_private.h (__ieee754_sqrt): Remove. (__ieee754_sqrtf): Remove. * sysdeps/s390/fpu/bits/mathinline.h: Remove file. * sysdeps/sparc/fpu/bits/mathinline.h (sqrt): Remove. (sqrtf): Remove. (sqrtl): Remove. (__ieee754_sqrt): Remove. (__ieee754_sqrtf): Remove. (__ieee754_sqrtl): Remove. * sysdeps/x86/fpu/math_private.h (__ieee754_sqrt): Remove. * sysdeps/x86_64/fpu/math_private.h (__ieee754_sqrt): Remove. (__ieee754_sqrtf): Remove. (__ieee754_sqrtl): Remove.
* Remove more old-compilers parts of sysdeps/x86/fpu/bits/mathinline.h.Joseph Myers2018-03-151-183/+0
This patch removes further parts of sysdeps/x86/fpu/bits/mathinline.h that are only of value for optimization with older compiler versions, in accordance with general principles of preferring to let the compiler deal with such inlining through built-in functions. In general, GCC supports inlining all these functions as of version 4.3 or earlier. However, some inlines in GCC may have had excessively restrictive conditions in past GCC versions (e.g. requiring -ffast-math when the inline is valid under broader conditions). (In particular, GCC had, before GCC 7, unnecessarily restrictive conditions on when it could apply floor and ceil inlines corresponding to the ones removed here. The same was true for rint, but bits/mathinline.h *also* was excessively restrictive there.) The removed sincos inlines are for __sincos etc. functions (not a public interface and not currently used in this header either; not in a part of the header ever used for building glibc itself). Likewise, the atan2 inlines included one for __atan2l, also not a public interface and not used for building glibc itself (calls inside glibc generally use __ieee754_atan2l, for which there is a separate __LIBC_INTERNAL_MATH_INLINES case in this header). Tested for x86_64 and x86. * sysdeps/x86/fpu/bits/mathinline.h [__FAST_MATH__] (__sincos_code): Remove define and undefine. [__FAST_MATH__] (__sincos): Remove inline function. [__FAST_MATH__] (__sincosf): Remove inline function. [__FAST_MATH__] (__sincosl): Remove inline function. (__atan2l): Remove inline functions. [!__GNUC_PREREQ (3, 4)] (__atan2_code): Remove macro. [!__GNUC_PREREQ (3, 4) && __FAST_MATH__] (atan2): Remove inline function. (floor): Remove inline function. (ceil): Likewise. [__FAST_MATH__] (__ldexp_code): Remove macro. [__FAST_MATH__] (ldexp): Remove inline function. [__FAST_MATH__ && __USE_ISOC99] (ldexpf): Likewise. 
[__FAST_MATH__ && __USE_ISOC99] (ldexpl): Likewise. [__FAST_MATH__ && __USE_ISOC99] (rint): Likewise. [__USE_ISOC99] (__lrint_code): Remove macro. [__USE_ISOC99] (__llrint_code): Likewise. [__USE_ISOC99] (lrintf): Remove inline function. [__USE_ISOC99] (lrint): Likewise. [__USE_ISOC99] (lrintl): Likewise. [__USE_ISOC99] (llrint): Likewise. [__USE_ISOC99] (llrintf): Likewise. [__USE_ISOC99] (llrintl): Likewise.
* Remove old-GCC parts of x86 bits/mathinline.h.Joseph Myers2018-03-141-306/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In accordance with the general principle of preferring to let the compiler optimize function calls based on their standard semantics rather than putting inline definitions of such functions in installed headers, this patch removes various such inline definitions in the x86 bits/mathinline.h that were already disabled for GCC 3.5 or later and so were only used with very old compilers (for which good optimization is particularly unimportant); along with those inlines, a definition of __M_SQRT2, which was only used in such inline functions, is also removed. This is similar to an early step in removing the string.h inlines; I intend to follow up with further removals of bits/mathinline.h inline definitions in appropriate logical groups (with GCC bugs filed in cases where GCC doesn't already support corresponding optimizations). Tested for x86_64 and x86. * sysdeps/x86/fpu/bits/mathinline.h [!__GNUC_PREREQ (3, 4)] (lrintf): Remove definitions used only with old GCC. [!__GNUC_PREREQ (3, 4)] (lrint): Likewise. [!__GNUC_PREREQ (3, 4)] (llrintf): Likewise. [!__GNUC_PREREQ (3, 4)] (llrint): Likewise. [!__GNUC_PREREQ (3, 4)] (fmaxf): Likewise. [!__GNUC_PREREQ (3, 4)] (fmax): Likewise. [!__GNUC_PREREQ (3, 4)] (fminf): Likewise. [!__GNUC_PREREQ (3, 4)] (fmin): Likewise. [!__GNUC_PREREQ (3, 4)] (rint): Likewise. [!__GNUC_PREREQ (3, 4)] (rintf): Likewise. [!__GNUC_PREREQ (3, 4)] (nearbyint): Likewise. [!__GNUC_PREREQ (3, 4)] (nearbyintf): Likewise. [!__GNUC_PREREQ (3, 4)] (ceil): Likewise. [!__GNUC_PREREQ (3, 4)] (ceilf): Likewise. [!__GNUC_PREREQ (3, 4)] (floor): Likewise. [!__GNUC_PREREQ (3, 4)] (floorf): Likewise. [__FAST_MATH__ && !__GNUC_PREREQ (3, 5)] (tan): Likewise. [__FAST_MATH__ && !__GNUC_PREREQ (3, 5)] (fmod): Likewise. [__FAST_MATH__ && !__GNUC_PREREQ (3, 4)] (sin): Likewise. [__FAST_MATH__ && !__GNUC_PREREQ (3, 4)] (cos): Likewise. 
[__FAST_MATH__ && !__GNUC_PREREQ (3, 5)] (log10): Likewise. [__FAST_MATH__ && !__GNUC_PREREQ (3, 5)] (asin): Likewise. [__FAST_MATH__ && !__GNUC_PREREQ (3, 5)] (acos): Likewise. [__FAST_MATH__ && !__GNUC_PREREQ (3, 4)] (atan): Likewise. [__FAST_MATH__ && !__GNUC_PREREQ (3, 5)] (log1p): Likewise. [__FAST_MATH__ && !__GNUC_PREREQ (3, 5)] (logb): Likewise. [__FAST_MATH__ && !__GNUC_PREREQ (3, 5)] (log2): Likewise. [__FAST_MATH__ && !__GNUC_PREREQ (3, 5)] (drem): Likewise. [__FAST_MATH__] (__M_SQRT2): Remove macro.
* Unify and simplify bits/byteswap.h, bits/byteswap-16.h headers (bug 14508, bug 15512, bug 17082, bug 20530).
  Author: Joseph Myers; Age: 2018-02-06; Files: 2; Lines: -204/+0

  We have a general principle of preferring optimizations for library facilities to use compiler built-in functions rather than being located in library headers, where the compiler can reasonably optimize code without needing to know glibc implementation details.

  This patch applies this principle to bits/byteswap.h, eliminating all the architecture-specific variants and bits/byteswap-16.h. The __bswap_16, __bswap_32 and __bswap_64 interfaces all become inline functions, never macros, using the GCC built-in functions where available and otherwise a single architecture-independent definition using shifts and masking (which compilers may well be able to detect and optimize; GCC has detection of various byte-swapping idioms). The __bswap_constant_32 macro needs to stay around because of uses in static initializers within glibc and its tests, and so for consistency all __bswap_constant_* are kept rather than just being inlined into the old-GCC-or-non-GCC parts of the __bswap_* inline function definitions.

  Various open bugs are addressed by this cleanup, with caveats about exactly what is covered by those bugs and when the bugs applied at all.

  Bug 14508 reports -Wformat warnings building glibc because __bswap_* sometimes returned the wrong types. Obviously we already don't have such warnings any more or the build would be failing, given -Werror, and I suspect that bug was originally for wrong types for x86_64, as fixed by commit d394eb742a3565d7fe7a4b02710a60b5f219ee64 (glibc 2.17). The only case I saw removed by this patch where the types would still have been wrong was the non-__GNUC__ case of __bswap_64 in the s390 header (using unsigned long long int, but uint64_t would be unsigned long int for 64-bit). In any case, the single header consistently uses __uintN_t types after this patch, thereby eliminating all such bugs. The existing string/test-endian-types.c test already suffices to verify that the types are correct with the compiler used to build glibc and its tests.

  Bug 15512 reports an error from __bswap_constant_16 with -Werror -Wsign-conversion. I am unable to reproduce this with any GCC version supporting -Wsign-conversion - all seem to be able to avoid warning for ((x) >> 8) & 0xffu, where x is uint16_t, which while it formally does involve an implicit conversion from int to unsigned int, is also a case where it should be easy for the compiler to see that the value converted is never negative. But in this patch __bswap_constant_16 is changed to use signed 0xff so that no such implicit conversion occurs at all, and a test with -Werror -Wsign-conversion is added.

  Bug 17082 objects to the use of ({}) statement expressions in these macros preventing use at file scope (in C, that's in sizeof etc.; in C++, more generally in static initializers). The particular case of these interfaces is fixed by this patch as it changes them to inline functions, eliminating all uses of ({}) in bits/byteswap.h, and a corresponding testcase is added. The bug tries to raise a more general policy question about use of ({}) in macros in installed headers, referring to "many other libc functions" (unspecified which functions are being considered). Since such policy questions belong on libc-alpha, and since there *are* macros in installed headers which can't really avoid using ({}) (where they are type-generic, so can't use an inline function, but need a temporary variable, and a few where the interface involves returning memory from alloca so can't use an inline function either), I propose to consider that bug fixed with this change. That is without prejudice to any other new bugs anyone wishes to file *for precisely defined sets of macros* requesting moving away from ({}) *where it is clearly possible for those interfaces*. Where ({}) can be avoided, typically by use of an inline function, I think that's a good idea - that inline functions are typically to be preferred to ({}) for header interfaces where such optimizations are useful but the interface is suited to being defined using an inline function.

  Bug 20530 requests use of __builtin_bswap16 when available (GCC 4.8 and later), which this patch implements.

  Tested for x86_64, and with build-many-glibcs.py. Also did an x86_64 test with the __GNUC_PREREQ conditionals changed to "#if 0" to verify the old-GCC/non-GCC case in the headers. (There are already existing tests for correctness of results of these interfaces.)

  [BZ #14508]
  [BZ #15512]
  [BZ #17082]
  [BZ #20530]
  * bits/byteswap.h: Update file comment. Do not include <bits/byteswap-16.h>.
    (__bswap_constant_16): Cast result to __uint16_t. Use signed 0xff constant.
    (__bswap_16): Define as inline function.
    (__bswap_constant_32): Reformat definition.
    (__bswap_32): Always define as inline function, not macro, using __uint32_t. Use __builtin_bswap32 if [__GNUC_PREREQ (4, 3)], otherwise __bswap_constant_32.
    (__bswap_constant_64): Reformat definition. Do not use __extension__ here.
    (__bswap_64): Always define as inline function, not macro. Use __extension__ on function definition. Use __builtin_bswap64 if [__GNUC_PREREQ (4, 3)], otherwise __bswap_constant_64.
  * string/test-endian-file-scope.c: New file.
  * string/test-endian-sign-conversion.c: Likewise.
  * string/Makefile (headers): Remove bits/byteswap-16.h.
    (tests): Add test-endian-file-scope and test-endian-sign-conversion.
    (CFLAGS-test-endian-sign-conversion.c): New variable.
  * bits/byteswap-16.h: Remove file.
  * sysdeps/ia64/bits/byteswap-16.h: Likewise.
  * sysdeps/ia64/bits/byteswap.h: Likewise.
  * sysdeps/m68k/bits/byteswap.h: Likewise.
  * sysdeps/s390/bits/byteswap-16.h: Likewise.
  * sysdeps/s390/bits/byteswap.h: Likewise.
  * sysdeps/tile/bits/byteswap.h: Likewise.
  * sysdeps/x86/bits/byteswap-16.h: Likewise.
  * sysdeps/x86/bits/byteswap.h: Likewise.
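
  The shape of the byteswap change described above can be sketched roughly as follows. This is an illustration only, not the text of bits/byteswap.h: every `sketch_`/`SKETCH_` name is hypothetical, `<stdint.h>` types stand in for glibc's internal `__uintN_t` types, and a single GCC 4.8 check is used for brevity, whereas the commit gates the 32-bit and 64-bit built-ins on GCC 4.3.

```c
#include <assert.h>
#include <stdint.h>

/* Constant forms: plain expressions, so they stay usable in static
   initializers and other file-scope contexts.  The 16-bit form masks
   with signed 0xff, so no int -> unsigned int conversion occurs.  */
#define SKETCH_BSWAP_CONSTANT_16(x) \
  ((uint16_t) ((((x) >> 8) & 0xff) | (((x) & 0xff) << 8)))

#define SKETCH_BSWAP_CONSTANT_32(x) \
  ((((x) & 0xff000000u) >> 24) | (((x) & 0x00ff0000u) >> 8) \
   | (((x) & 0x0000ff00u) << 8) | (((x) & 0x000000ffu) << 24))

/* Hypothetical feature check standing in for __GNUC_PREREQ.  */
#if defined __GNUC__ \
    && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 8))
# define SKETCH_HAVE_BSWAP_BUILTINS 1
#else
# define SKETCH_HAVE_BSWAP_BUILTINS 0
#endif

/* Inline functions, never macros: use the compiler built-in when it
   is available, otherwise fall back to shifts and masking.  */
static inline uint16_t
sketch_bswap_16 (uint16_t x)
{
#if SKETCH_HAVE_BSWAP_BUILTINS
  return __builtin_bswap16 (x);
#else
  return SKETCH_BSWAP_CONSTANT_16 (x);
#endif
}

static inline uint32_t
sketch_bswap_32 (uint32_t x)
{
#if SKETCH_HAVE_BSWAP_BUILTINS
  return __builtin_bswap32 (x);
#else
  return SKETCH_BSWAP_CONSTANT_32 (x);
#endif
}

/* File-scope uses, which a ({ }) statement expression would not allow.  */
static const uint32_t sketch_swapped = SKETCH_BSWAP_CONSTANT_32 (0x11223344u);
static_assert (SKETCH_BSWAP_CONSTANT_16 (0x1234) == 0x3412,
               "constant form is an integer constant expression");
```

  Keeping the constant macros next to the inline functions mirrors the reason given in the message: static initializers need a constant expression, which a function call cannot provide in C.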
* Update copyright dates with scripts/update-copyrights.
  Author: Joseph Myers; Age: 2018-01-01; Files: 40; Lines: -40/+40

  * All files with FSF copyright notices: Update copyright dates using scripts/update-copyrights.
  * locale/programs/charmap-kw.h: Regenerated.
  * locale/programs/locfile-kw.h: Likewise.
* Add _Float64x function aliases.
  Author: Joseph Myers; Age: 2017-11-27; Files: 2; Lines: -0/+12

  This patch continues filling out TS 18661-3 support by adding *f64x function aliases on platforms with _Float64x support. (It so happens the set of such platforms is exactly the same as the set of platforms with _Float128 support, although on x86_64, x86 and ia64 the _Float64x format is Intel extended rather than binary128.) The API provided corresponds exactly to that provided for _Float128, mostly coming from TS 18661-3. As these functions always alias those for another type (long double, _Float128 or both), __* function names are not provided, as in other cases of alias types.

  Given the preparation done in previous patches, this one just enables the feature via Makeconfig and bits/floatn.h, adds symbol versions, and updates documentation and ABI baselines. The symbol versions are present unconditionally as GLIBC_2.27 in the relevant Versions files, as it's OK for those to specify versions for functions that may not be present in some configurations; no additional complexity is needed unless in future some configuration gains support for this type that didn't have such support in 2.27. The Makeconfig additions for ia64 and x86 aren't strictly needed, as those configurations also get float64x-alias-fcts definitions from sysdeps/ieee754/float128/Makeconfig, but still seem appropriate given that _Float64x is not _Float128 for those configurations.

  A libm-test-ulps update for x86 is included. This is because bits/mathinline.h does not have _Float64x support added and for two functions the use of out-of-line functions results in increased ulps (ifloat64x shares ulps with ildouble / ifloat128 as appropriate). Given that we'd like generally to eliminate bits/mathinline.h optimizations, preferring to have such optimizations in GCC instead, it seems reasonable not to add such support there for new types. GCC support for _FloatN / _FloatNx built-in functions is limited, but has been improved in GCC 8, and at some point I hope the full set of libm built-in functions in GCC, and other optimizations with per-floating-type aspects, will be enabled for all _FloatN / _FloatNx types.

  Tested for x86_64 and x86, and with build-many-glibcs.py, with both GCC 6 and GCC 7.

  * sysdeps/ia64/Makeconfig (float64x-alias-fcts): New variable.
  * sysdeps/ieee754/float128/Makeconfig (float64x-alias-fcts): Likewise.
  * sysdeps/ieee754/ldbl-128/Makeconfig (float64x-alias-fcts): Likewise.
  * sysdeps/x86/Makeconfig: New file.
  * bits/floatn-common.h (__HAVE_FLOAT64X): Remove macro.
    (__HAVE_FLOAT64X_LONG_DOUBLE): Likewise.
  * bits/floatn.h (__HAVE_FLOAT64X): New macro.
    (__HAVE_FLOAT64X_LONG_DOUBLE): Likewise.
  * sysdeps/ia64/bits/floatn.h (__HAVE_FLOAT64X): Likewise.
    (__HAVE_FLOAT64X_LONG_DOUBLE): Likewise.
  * sysdeps/ieee754/ldbl-128/bits/floatn.h (__HAVE_FLOAT64X): Likewise.
    (__HAVE_FLOAT64X_LONG_DOUBLE): Likewise.
  * sysdeps/mips/ieee754/bits/floatn.h (__HAVE_FLOAT64X): Likewise.
    (__HAVE_FLOAT64X_LONG_DOUBLE): Likewise.
  * sysdeps/powerpc/bits/floatn.h (__HAVE_FLOAT64X): Likewise.
    (__HAVE_FLOAT64X_LONG_DOUBLE): Likewise.
  * sysdeps/x86/bits/floatn.h (__HAVE_FLOAT64X): Likewise.
    (__HAVE_FLOAT64X_LONG_DOUBLE): Likewise.
  * manual/math.texi (Mathematics): Document support for _Float64x.
  * math/Versions (GLIBC_2.27): Add _Float64x functions.
  * stdlib/Versions (GLIBC_2.27): Likewise.
  * wcsmbs/Versions (GLIBC_2.27): Likewise.
  * sysdeps/unix/sysv/linux/aarch64/libc.abilist: Update.
  * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise.
  * sysdeps/unix/sysv/linux/alpha/libc.abilist: Likewise.
  * sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise.
  * sysdeps/unix/sysv/linux/i386/libc.abilist: Likewise.
  * sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise.
  * sysdeps/unix/sysv/linux/ia64/libc.abilist: Likewise.
  * sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise.
  * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise.
  * sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist: Likewise.
  * sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist: Likewise.
  * sysdeps/unix/sysv/linux/powerpc/powerpc64/libc-le.abilist: Likewise.
  * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise.
  * sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist: Likewise.
  * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise.
  * sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist: Likewise.
  * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise.
  * sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist: Likewise.
  * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise.
  * sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist: Likewise.
  * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise.
  * sysdeps/unix/sysv/linux/x86_64/64/libc.abilist: Likewise.
  * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise.
  * sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist: Likewise.
  * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.
  * sysdeps/i386/fpu/libm-test-ulps: Likewise.
  * sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Likewise.
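
  The idea that an *f64x function "always aliases" a function for another type can be illustrated with the GNU C alias attribute. The sketch below is not glibc's alias machinery (which goes through its own macros and symbol versioning); the function names are hypothetical, and it assumes GCC or a compatible compiler on an ELF target, where `long double` has the same format as `_Float64x` as on x86.

```c
#include <assert.h>

/* Stand-in for an existing long double function (hypothetical name).  */
long double
sketch_halfl (long double x)
{
  return x / 2;
}

/* A second public name bound to the same code.  The alias attribute is
   a GNU C extension; no separate function body is emitted.  */
extern __typeof (sketch_halfl) sketch_halff64x
  __attribute__ ((alias ("sketch_halfl")));

/* The alias keeps the long double signature of its target.  */
static_assert (sizeof (sketch_halff64x (0.0L)) == sizeof (long double),
               "alias shares the long double signature");
```

  Because both names resolve to one definition, only the public name needs a new symbol version, which matches the message's remark that no separate __* names are provided.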
* Support bits/floatn.h inclusion from .S files.
  Author: Joseph Myers; Age: 2017-11-17; Files: 1; Lines: -28/+32

  Further _FloatN / _FloatNx type alias support will involve making architecture-specific .S files use the common macros for libm function aliases. Making them use those macros will also serve to simplify existing code for aliases / symbol versions in various cases, similar to such simplifications for ldbl-opt code.

  The libm-alias-*.h files sometimes need to include <bits/floatn.h> to determine which aliases they should define. At present, this does not work for inclusion from .S files because <bits/floatn.h> can define typedefs for old compilers. This patch changes all the <bits/floatn.h> and <bits/floatn-common.h> headers to include __ASSEMBLER__ conditionals. Those conditionals disable everything related to C syntax in the __ASSEMBLER__ case, not just the problem typedefs, as that seemed cleanest. The __HAVE_* definitions remain in the __ASSEMBLER__ case, as those provide information that is required to define the correct set of aliases.

  Tested with build-many-glibcs.py for a representative set of configurations (x86_64-linux-gnu i686-linux-gnu ia64-linux-gnu powerpc64le-linux-gnu mips64-linux-gnu-n64 sparc64-linux-gnu) with GCC 6. Also tested with GCC 6 for i686-linux-gnu in conjunction with changes to use alias macros in .S files.

  * bits/floatn-common.h [!__ASSEMBLER__]: Disable everything related to C syntax instead of availability and properties of types.
  * bits/floatn.h [!__ASSEMBLER__]: Likewise.
  * sysdeps/ia64/bits/floatn.h [!__ASSEMBLER__]: Likewise.
  * sysdeps/ieee754/ldbl-128/bits/floatn.h [!__ASSEMBLER__]: Likewise.
  * sysdeps/mips/ieee754/bits/floatn.h [!__ASSEMBLER__]: Likewise.
  * sysdeps/powerpc/bits/floatn.h [!__ASSEMBLER__]: Likewise.
  * sysdeps/x86/bits/floatn.h [!__ASSEMBLER__]: Likewise.
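
  A header laid out the way this message describes might look roughly like the fragment below. It is a sketch under assumptions, not the contents of any glibc floatn.h: the `SKETCH_` macros and the `sketch_float64x` typedef are hypothetical stand-ins for the real `__HAVE_*` macros and compatibility typedefs. `__ASSEMBLER__` is the macro GCC predefines when preprocessing assembly sources.

```c
/* Availability facts stay outside the conditional, so a .S file that
   includes this header can still decide which aliases to define.  */
#define SKETCH_HAVE_FLOAT128 0
#define SKETCH_HAVE_FLOAT64X 1
#define SKETCH_HAVE_FLOAT64X_LONG_DOUBLE 1

#ifndef __ASSEMBLER__
/* Everything in this block is C syntax that an assembler would
   reject, so it is hidden whenever __ASSEMBLER__ is defined.  */
# include <assert.h>

# if SKETCH_HAVE_FLOAT64X_LONG_DOUBLE
/* Stand-in for the kind of typedef provided for old compilers.  */
typedef long double sketch_float64x;
# endif

static_assert (sizeof (sketch_float64x) == sizeof (long double),
               "typedef matches long double in this configuration");
#endif /* !__ASSEMBLER__ */
```

  Guarding all C syntax rather than only the typedefs follows the choice stated in the message; the availability macros are plain preprocessor definitions and are harmless in assembly.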