path: root/manual
Commit message | Author | Age | Files | Lines
* nptl: Move cancel state out of cancelhandling | Adhemerval Zanella | 2021-06-09 | 2 | -3/+1
  Now that thread cancellation state is no longer accessed concurrently, it can be moved out of 'cancelhandling'. The code is also simplified: CANCELLATION_P is replaced with an internal pthread_testcancel call and the CANCELSTATE_BIT{MASK} is removed.
  With this behavior pthread_setcancelstate does not need to act on cancellation if the cancel type is asynchronous (this is already handled either by pthread_setcanceltype or by the signal handler).
  Checked on x86_64-linux-gnu and aarch64-linux-gnu.
* fix typo | Xeonacid | 2021-06-02 | 1 | -1/+1
  "accomodate" should be "accommodate"
  Reviewed-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
* Update floating-point feature test macro handling for C2X | Joseph Myers | 2021-06-01 | 1 | -0/+7
  ISO C2X has made some changes to the handling of feature test macros related to features from the floating-point TSes, and to exactly what such features are present in what headers, that require corresponding changes in glibc.
  * For the few features that were controlled by __STDC_WANT_IEC_60559_BFP_EXT__ (and the corresponding DFP macro) in C2X, there is now instead a new feature test macro __STDC_WANT_IEC_60559_EXT__ covering both binary and decimal FP. This controls CR_DECIMAL_DIG in <float.h> (provided by GCC; I implemented support for the new feature test macro for GCC 11) and the totalorder and payload functions in <math.h>. C2X no longer says anything about __STDC_WANT_IEC_60559_BFP_EXT__ (so it's appropriate for that macro to continue to enable exactly the features from TS 18661-1).
  * The SNAN macros for each floating-point type have moved to <float.h> (and been renamed in the process). Thus, the copies in <math.h> should only be defined for __STDC_WANT_IEC_60559_BFP_EXT__, not for C2X.
  * The fmaxmag and fminmag functions have been removed (replaced by new functions for the new min/max operations in IEEE 754-2019). Thus those should also only be declared for __STDC_WANT_IEC_60559_BFP_EXT__.
  * The _FloatN / _FloatNx handling for the last two points in glibc is trickier, since __STDC_WANT_IEC_60559_TYPES_EXT__ is still in C2X (the integration of TS 18661-3 as an Annex, that is, which hasn't yet been merged into the C standard git repository but has been accepted by WG14), so C2X with that macro should not declare some things that are declared for older standards with that macro. The approach taken here is to provide the declarations (when __STDC_WANT_IEC_60559_TYPES_EXT__ is enabled) only when (defined __USE_GNU || !__GLIBC_USE (ISOC2X)), so if C2X features are enabled then those declarations (that are only in TS 18661-3 and not in C2X) will only be provided if _GNU_SOURCE is defined as well. Thus _GNU_SOURCE remains a superset of the TS features as well as of C2X.
  Some other somewhat related changes in C2X are not addressed here. There's an open proposal not to include the fmin and fmax functions for the _FloatN / _FloatNx types, given the new min/max operations, which could be handled like the previous point if adopted. And the fromfp functions have been changed to return a result in floating type rather than intmax_t / uintmax_t; my inclination there is to treat that like that change of totalorder type (new symbol versions etc. for the ABI change; old versions become compat symbols and are no longer supported as an API).
  Tested for x86_64 and x86.
* aarch64: Added optimized memcpy and memmove for A64FX | Naohiro Tamura | 2021-05-27 | 1 | -1/+2
  This patch optimizes the performance of memcpy/memmove for A64FX [1], which implements ARMv8-A SVE and has a 64KB L1 cache per core and an 8MB L2 cache per NUMA node.
  The optimization uses the Scalable Vector Registers together with several techniques: loop unrolling, memory access alignment, cache zero fill, and software pipelining.
  The SVE assembler code for memcpy/memmove is implemented as Vector Length Agnostic code, so in theory it can run on any SoC that supports the ARMv8-A SVE standard.
  We confirmed that all test cases pass by running 'make check' and 'make xcheck' not only on A64FX but also on ThunderX2.
  We also confirmed, by running 'make bench', that the SVE 512-bit vector register performance is roughly 4 times better than the Advanced SIMD 128-bit register and 8 times better than the scalar 64-bit register.
  [1] https://github.com/fujitsu/A64FX
  Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
  Reviewed-by: Szabolcs Nagy <Szabolcs.Nagy@arm.com>
* aarch64: Added Vector Length Set test helper script | Naohiro Tamura | 2021-05-26 | 1 | -0/+3
  This patch is a test helper script to change Vector Length for child process. This script can be used as test-wrapper for 'make check'.
  Usage examples:
    ~/build$ make check subdirs=string \
    test-wrapper='~/glibc/sysdeps/unix/sysv/linux/aarch64/vltest.py 16'
    ~/build$ ~/glibc/sysdeps/unix/sysv/linux/aarch64/vltest.py 16 \
    make test t=string/test-memcpy
    ~/build$ ~/glibc/sysdeps/unix/sysv/linux/aarch64/vltest.py 32 \
    ./debugglibc.sh string/test-memmove
    ~/build$ ~/glibc/sysdeps/unix/sysv/linux/aarch64/vltest.py 64 \
    ./testrun.sh string/test-memset
* nptl: Consolidate async cancel enable/disable implementation in libc | Florian Weimer | 2021-05-05 | 1 | -2/+2
  Previously, the source file nptl/cancellation.c was compiled multiple times, for libc, libpthread, librt. This commit switches to a single implementation, with new __pthread_enable_asynccancel@@GLIBC_PRIVATE, __pthread_disable_asynccancel@@GLIBC_PRIVATE exports.
  The almost-unused CANCEL_ASYNC and CANCEL_RESET macros are replaced by LIBC_CANCEL_ASYNC and LIBC_CANCEL_RESET macros. They call the __pthread_* functions unconditionally now. The macros are still needed because shared code uses them; Hurd has different definitions.
  Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* Improve documentation for malloc etc. (BZ#27719) | Paul Eggert | 2021-04-13 | 8 | -90/+136
  Cover key corner cases (e.g., whether errno is set) that are well settled in glibc, fix some examples to avoid integer overflow, and update some other dated examples (code needed for K&R C, e.g.).
  * manual/charset.texi (Non-reentrant String Conversion):
  * manual/filesys.texi (Symbolic Links):
  * manual/memory.texi (Allocating Cleared Space):
  * manual/socket.texi (Host Names):
  * manual/string.texi (Concatenating Strings):
  * manual/users.texi (Setting Groups):
  Use reallocarray instead of realloc, to avoid integer overflow issues.
  * manual/filesys.texi (Scanning Directory Content):
  * manual/memory.texi (The GNU Allocator, Hooks for Malloc):
  * manual/tunables.texi:
  Use code font for 'malloc' instead of roman font.
  (Symbolic Links): Don't assume readlink return value fits in 'int'.
  * manual/memory.texi (Memory Allocation and C, Basic Allocation)
  (Malloc Examples, Alloca Example):
  * manual/stdio.texi (Formatted Output Functions):
  * manual/string.texi (Concatenating Strings, Collation Functions):
  Omit pointer casts that are needed only in ancient K&R C.
  * manual/memory.texi (Basic Allocation):
  Say that malloc sets errno on failure.
  Say "convert" rather than "cast", since casts are no longer needed.
  * manual/memory.texi (Basic Allocation):
  * manual/string.texi (Concatenating Strings):
  In examples, use C99 declarations after statements for brevity.
  * manual/memory.texi (Malloc Examples): Add portability notes for malloc (0), errno setting, and PTRDIFF_MAX.
  (Changing Block Size): Say that realloc (p, 0) acts like (p ? (free (p), NULL) : malloc (0)).
  Add xreallocarray example, since other examples can use it.
  Add portability notes for realloc (0, 0), realloc (p, 0), PTRDIFF_MAX, and improve notes for reallocating to the same size.
  (Allocating Cleared Space): Reword now-confusing discussion about replacement, and xref "Replacing malloc".
  * manual/stdio.texi (Formatted Output Functions): Don't assume message size fits in 'int'.
  * manual/string.texi (Concatenating Strings): Fix undefined behavior involving arithmetic on a freed pointer.
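The switch from realloc to reallocarray described above can be sketched as follows. This is a minimal illustration, not the text added to the manual: resize_int_array is a hypothetical helper, and the zero-count guard is an assumption made here to sidestep the realloc (p, 0) portability issue the commit mentions.

```c
#define _GNU_SOURCE /* assumption: needed for reallocarray on some older glibc headers */
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: resize *ARRAY so it can hold NEW_COUNT ints.
   Returns 0 on success.  On failure returns -1, leaves *ARRAY valid
   and unchanged, and errno is whatever reallocarray set.  */
int
resize_int_array (int **array, size_t new_count)
{
  /* Avoid the realloc (p, 0) corner case by never asking for zero elements.  */
  if (new_count == 0)
    new_count = 1;

  /* reallocarray fails with ENOMEM when new_count * sizeof (int) would
     overflow; a hand-written realloc (p, n * size) would wrap silently.  */
  int *p = reallocarray (*array, new_count, sizeof **array);
  if (p == NULL)
    return -1;

  *array = p;
  return 0;
}
```

Because the old pointer is only overwritten on success, the caller can keep using or free the original block after a failed resize.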
* manual: clarify that scanf %n supports type modifiers | Alyssa Ross | 2021-03-30 | 1 | -5/+6
  My initial reading of the %n documentation was that it didn't support type conversions, because it only mentioned int*.
  Corresponding man-pages patch: https://lore.kernel.org/linux-man/20210328215509.31666-1-hi@alyssa.is/
  Reviewed-by: Arjun Shankar <arjun@redhat.com>
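As a small illustration of the point being documented, %n accepts the usual length modifiers, so the count can be stored through a pointer other than int *. The helper name parse_int_prefix is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: parse a leading decimal integer from TEXT and
   report how many characters sscanf consumed.
   Returns 1 if an integer was parsed, 0 otherwise.  */
int
parse_int_prefix (const char *text, int *value, long *consumed)
{
  *consumed = 0;
  /* "%ln" stores the character count through a long *; plain "%n"
     would require an int *.  %n does not add to sscanf's return value.  */
  return sscanf (text, "%d%ln", value, consumed) == 1;
}
```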
* math: Remove mpa files [BZ #15267] | Wilco Dijkstra | 2021-03-11 | 1 | -85/+0
  Finally remove all mpa related files, headers, declarations, probes, unused tables and update makefiles.
  Reviewed-By: Paul Zimmermann <Paul.Zimmermann@inria.fr>
* tst: Extend cross-test-ssh.sh to specify if target date can be altered | Lukasz Majewski | 2021-03-08 | 1 | -0/+20
  This code adds a new flag, '--allow-time-setting', to the cross-test-ssh.sh script to indicate whether it is allowed to alter the date on the system on which tests are executed. This change is intended for test systems that use virtual machines for testing.
  The GLIBC_TEST_ALLOW_TIME_SETTING env variable is exported to the remote environment on which the eligible test is run and brings no functional change when it is not.
  Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* tunables: Simplify TUNABLE_SET interface | Siddhesh Poyarekar | 2021-02-10 | 1 | -9/+7
  The TUNABLE_SET interface took a primitive C type argument, which resulted in inconsistent type conversions internally due to incorrect dereferencing of types, especially on 32-bit architectures. This change simplifies the TUNABLE setting logic along with the interfaces.
  Now all numeric tunable values are stored as signed numbers in tunable_num_t, which is intmax_t. All calls to set tunables cast the input value to its primitive type and then to tunable_num_t for storage. This relies on gcc-specific (although I suspect other compilers would also do the same) unsigned to signed integer conversion semantics, i.e. the bit pattern is conserved. The reverse conversion is guaranteed by the standard.
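The storage scheme described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the glibc code: store_unsigned and load_unsigned are hypothetical names, and only the tunable_num_t typedef comes from the commit message.

```c
#include <stdint.h>

typedef intmax_t tunable_num_t; /* per the commit message */

/* Hypothetical: store an unsigned 64-bit tunable value.  When V does not
   fit, the unsigned-to-signed conversion is implementation-defined; GCC
   documents it as reduction modulo 2^N, so the bit pattern is kept.  */
tunable_num_t
store_unsigned (uint64_t v)
{
  return (tunable_num_t) v;
}

/* Hypothetical: read the value back.  Signed-to-unsigned conversion is
   fully defined by the C standard (modulo 2^64), so this direction is
   portable.  */
uint64_t
load_unsigned (tunable_num_t n)
{
  return (uint64_t) n;
}
```

The round trip load_unsigned (store_unsigned (v)) returns v on compilers that keep the bit pattern in the first conversion, which is the property the commit relies on.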
* x86: Add PTWRITE feature detection [BZ #27346] | H.J. Lu | 2021-02-07 | 1 | -0/+3
  1. Add CPUID_INDEX_14_ECX_0 for CPUID leaf 0x14 to detect PTWRITE feature in EBX of CPUID leaf 0x14 with ECX == 0.
  2. Add PTWRITE detection to CPU feature tests.
  3. Add 2 static CPU feature tests.
* manual: Correct description of ENTRY [BZ #17183] | Florian Weimer | 2021-02-04 | 1 | -11/+15
  The struct tag is actually entry (not ENTRY). The data member has type void *, and it can point to binary data. Only the key member is required to be a null-terminated string.
  Reviewed-by: Arjun Shankar <arjun@redhat.com>
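A short sketch of what the corrected description implies for callers of hsearch: the key is a C string, while data may point to any object. The struct point payload and the lookup_demo function are hypothetical and exist only for this illustration.

```c
#include <search.h>
#include <stddef.h>

struct point { int x, y; }; /* hypothetical binary payload */

/* Hypothetical demo: store a struct point under a string key, look it up
   through a different buffer holding the same key text, and return x + y.
   Returns -1 if any step fails.  */
int
lookup_demo (void)
{
  static struct point offset = { 3, 4 };
  static char key[] = "offset";
  char query_key[] = "offset";
  int result = -1;

  if (hcreate (8) == 0)
    return -1;

  ENTRY item;          /* ENTRY is a typedef for struct entry */
  item.key = key;      /* must be a null-terminated string */
  item.data = &offset; /* void *: may point to binary data */

  if (hsearch (item, ENTER) != NULL)
    {
      ENTRY query = { .key = query_key, .data = NULL };
      ENTRY *found = hsearch (query, FIND);
      if (found != NULL)
        {
          const struct point *p = found->data;
          result = p->x + p->y;
        }
    }

  hdestroy ();
  return result;
}
```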
* sysconf: Add _SC_MINSIGSTKSZ/_SC_SIGSTKSZ [BZ #20305] | H.J. Lu | 2021-02-01 | 2 | -0/+27
  Add _SC_MINSIGSTKSZ for the minimum signal stack size derived from AT_MINSIGSTKSZ, which is the minimum number of bytes of free stack space required in order to guarantee successful, non-nested handling of a single signal whose handler is an empty function, and _SC_SIGSTKSZ which is the suggested minimum number of bytes of stack space required for a signal stack.
  If AT_MINSIGSTKSZ isn't available, sysconf (_SC_MINSIGSTKSZ) returns MINSIGSTKSZ. On Linux/x86 with XSAVE, the signal frame used by kernel is composed of the following areas and laid out as:
    ------------------------------
    | alignment padding          |
    ------------------------------
    | xsave buffer               |
    ------------------------------
    | fsave header (32-bit only) |
    ------------------------------
    | siginfo + ucontext         |
    ------------------------------
  Compute AT_MINSIGSTKSZ value as size of xsave buffer + size of fsave header (32-bit only) + size of siginfo and ucontext + alignment padding.
  If _SC_SIGSTKSZ_SOURCE or _GNU_SOURCE are defined, MINSIGSTKSZ and SIGSTKSZ are redefined as
    /* Default stack size for a signal handler: sysconf (SC_SIGSTKSZ). */
    # undef SIGSTKSZ
    # define SIGSTKSZ sysconf (_SC_SIGSTKSZ)
    /* Minimum stack size for a signal handler: SIGSTKSZ. */
    # undef MINSIGSTKSZ
    # define MINSIGSTKSZ SIGSTKSZ
  Compilation will fail if the source assumes constant MINSIGSTKSZ or SIGSTKSZ.
  The reason for not simply increasing the kernel's MINSIGSTKSZ #define (apart from the fact that it is rarely used, due to glibc's shadowing definitions) was that userspace binaries will have baked in the old value of the constant and may be making assumptions about it.
  For example, the type (char [MINSIGSTKSZ]) changes if this #define changes. This could be a problem if a newly built library tries to memcpy() or dump such an object defined by an old binary. Bounds-checking and the stack sizes passed to things like sigaltstack() and makecontext() could similarly go wrong.
* Update INSTALL with package versions that are known to work | Tulio Magno Quites Machado Filho | 2021-01-25 | 1 | -12/+12
  Most packages have been tested with their latest releases, except for Python, whose latest version is 3.9.1.
  Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* manual: Correct argument order in mount examples [BZ #27207] | John McCabe | 2021-01-22 | 1 | -2/+2
  Reviewed-by: DJ Delorie <dj@redhat.com>
* <sys/platform/x86.h>: Remove the C preprocessor magic | H.J. Lu | 2021-01-21 | 1 | -5/+2
  In <sys/platform/x86.h>, define CPU features as enum instead of using the C preprocessor magic to make it easier to wrap this functionality in other languages. Move the C preprocessor magic to internal header for better GCC codegen when more than one feature is checked in a single expression as in x86-64 dl-hwcaps-subdirs.c.
  1. Rename COMMON_CPUID_INDEX_XXX to CPUID_INDEX_XXX.
  2. Move CPUID_INDEX_MAX to sysdeps/x86/include/cpu-features.h.
  3. Remove struct cpu_features and __x86_get_cpu_features from <sys/platform/x86.h>.
  4. Add __x86_get_cpuid_feature_leaf to <sys/platform/x86.h> and put it in libc.
  5. Make __get_cpu_features() private to glibc.
  6. Replace __x86_get_cpu_features(N) with __get_cpu_features().
  7. Add _dl_x86_get_cpu_features to GLIBC_PRIVATE.
  8. Use a single enum index for each CPU feature detection.
  9. Pass the CPUID feature leaf to __x86_get_cpuid_feature_leaf.
  10. Return zero struct cpuid_feature for the older glibc binary with a smaller CPUID_INDEX_MAX [BZ #27104].
  11. Inside glibc, use the C preprocessor magic so that cpu_features data can be loaded just once leading to more compact code for glibc.
  256 bits are used for each CPUID leaf. Some leaves only contain a few features. We can add exceptions to such leaves. But it will increase code sizes and it is harder to provide backward/forward compatibilities when new features are added to such leaves in the future.
  When new leaves are added, _rtld_global_ro offsets will change which leads to race condition during in-place updates. We may avoid in-place updates by
  1. Rename the old glibc.
  2. Install the new glibc.
  3. Remove the old glibc.
  NB: A function, __x86_get_cpuid_feature_leaf, is used to avoid the copy relocation issue with IFUNC resolver as shown in IFUNC resolver tests.
* ld.so: Add --list-tunables to print tunable values | H.J. Lu | 2021-01-15 | 1 | -0/+38
  Pass --list-tunables to ld.so to print tunables with min and max values.
  Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* Sync FDL from https://www.gnu.org/licenses/fdl-1.3.texi | Paul Eggert | 2021-01-02 | 1 | -3/+3
* Update copyright dates with scripts/update-copyrights | Paul Eggert | 2021-01-02 | 48 | -48/+48
  I used these shell commands:
    ../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
    (cd ../glibc && git commit -am"[this commit message]")
  and then ignored the output, which consisted of lines saying "FOO: warning: copyright statement not found" for each of 6694 files FOO.
  I then removed trailing white space from benchtests/bench-pthread-locks.c and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this diagnostic from Savannah:
    remote: *** pre-commit check failed ...
    remote: *** error: lines with trailing whitespace found
    remote: error: hook declined to update refs/heads/master
* Introduce _FORTIFY_SOURCE=3 | Siddhesh Poyarekar | 2020-12-31 | 1 | -1/+2
  Introduce a new _FORTIFY_SOURCE level of 3 to enable additional fortifications that may have a noticeable performance impact, allowing more fortification coverage at the cost of some performance.
  With llvm 9.0 or later, this will replace the use of __builtin_object_size with __builtin_dynamic_object_size.

  __builtin_dynamic_object_size
  -----------------------------
  __builtin_dynamic_object_size is an LLVM builtin that is similar to __builtin_object_size. In addition to what __builtin_object_size does, i.e. replace the builtin call with a constant object size, __builtin_dynamic_object_size will replace the call site with an expression that evaluates to the object size, thus expanding its applicability. In practice, __builtin_dynamic_object_size evaluates these expressions through malloc/calloc calls that it can associate with the object being evaluated.
  A simple motivating example is below; -D_FORTIFY_SOURCE=2 would miss this and emit memcpy, but -D_FORTIFY_SOURCE=3 with the help of __builtin_dynamic_object_size is able to emit __memcpy_chk with the allocation size expression passed into the function:
    void *copy_obj (const void *src, size_t alloc, size_t copysize)
    {
      void *obj = malloc (alloc);
      memcpy (obj, src, copysize);
      return obj;
    }

  Limitations
  -----------
  If the object was allocated elsewhere that the compiler cannot see, or if it was allocated in the function with a function that the compiler does not recognize as an allocator then __builtin_dynamic_object_size also returns -1.
  Further, the expression used to compute object size may be non-trivial and may potentially incur a noticeable performance impact. These fortifications are hence enabled at a new _FORTIFY_SOURCE level to allow developers to make a choice on the tradeoff according to their environment.
* free: preserve errno [BZ#17924] | Paul Eggert | 2020-12-29 | 1 | -0/+9
  In the next release of POSIX, free must preserve errno <https://www.austingroupbugs.net/view.php?id=385>.
  Modify __libc_free to save and restore errno, so that any internal munmap etc. syscalls do not disturb the caller's errno.
  Add a test malloc/tst-free-errno.c (almost all by Bruno Haible), and document that free preserves errno.
  Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
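The save-and-restore idea can be shown at the caller level. The wrapper below is a hypothetical sketch for code that must also run on C libraries whose free does not yet preserve errno; it is not the __libc_free change itself.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical wrapper: release PTR without disturbing errno, so an
   error code set just before cleanup survives until it is reported.  */
void
free_keep_errno (void *ptr)
{
  int saved = errno;
  free (ptr);
  errno = saved;
}
```

With a free that preserves errno on its own, the wrapper becomes redundant and error paths can call free directly before reporting errno.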
* <sys/platform/x86.h>: Add Intel LAM support | H.J. Lu | 2020-12-22 | 1 | -0/+3
  Add Intel Linear Address Masking (LAM) support to <sys/platform/x86.h>. HAS_CPU_FEATURE (LAM) can be used to detect if LAM is enabled in CPU.
  LAM modifies the checking that is applied to 64-bit linear addresses, allowing software to use the untranslated address bits for metadata.
* elf: Add a tunable to control use of tagged memory | Richard Earnshaw | 2020-12-21 | 1 | -0/+35
  Add a new glibc tunable: mem.tagging. This is a decimal constant in the range 0-255 but used as a bit-field.
  Bit 0 enables use of tagged memory in the malloc family of functions.
  Bit 1 enables precise faulting of tag failure on platforms where this can be controlled.
  Other bits are currently unused, but if set will cause memory tag checking for the current process to be enabled in the kernel.
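To make the bit-field reading concrete, here is a small decoding sketch. All identifiers (MEM_TAG_MALLOC, MEM_TAG_PRECISE_FAULT, struct mem_tagging, decode_mem_tagging) are hypothetical names chosen for this illustration; only the bit meanings come from the commit message.

```c
#include <stdbool.h>

/* Hypothetical names for the two documented bits.  */
enum
{
  MEM_TAG_MALLOC = 1 << 0,        /* bit 0: tagged memory in malloc family */
  MEM_TAG_PRECISE_FAULT = 1 << 1  /* bit 1: precise faulting of tag failures */
};

struct mem_tagging
{
  bool malloc_tagging;
  bool precise_faults;
  bool other_bits_set; /* currently unused bits 2-7 */
};

/* Decode a mem.tagging value in the documented range 0-255.  */
struct mem_tagging
decode_mem_tagging (unsigned int value)
{
  struct mem_tagging t;

  value &= 0xff;
  t.malloc_tagging = (value & MEM_TAG_MALLOC) != 0;
  t.precise_faults = (value & MEM_TAG_PRECISE_FAULT) != 0;
  t.other_bits_set = (value & ~(unsigned int) (MEM_TAG_MALLOC | MEM_TAG_PRECISE_FAULT)) != 0;
  return t;
}
```

Under this reading, a value of 3 requests both malloc tagging and precise faulting, while 1 requests malloc tagging alone.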
* config: Allow memory tagging to be enabled when configuring glibc | Richard Earnshaw | 2020-12-21 | 1 | -0/+13
  This patch adds the configuration machinery to allow memory tagging to be enabled from the command line via the configure option --enable-memory-tagging.
  The current default is off, though in time we may change that once the API is more stable.
* ieee754: Remove unused __sin32 and __cos32 | Anssi Hannula | 2020-12-18 | 1 | -14/+0
  The __sin32 and __cos32 functions were only used in the now removed slow path of asin and acos.
* s390x: Require GCC 7.1 or later to build glibc. | Stefan Liebler | 2020-12-17 | 1 | -0/+2
  GCC 6.5 fails to correctly build ldconfig with recent ld.so.cache commits, e.g.:
    785969a047ad2f23f758901c6816422573544453
    elf: Implement a string table for ldconfig, with tail merging
  If glibc is built with gcc 6.5.0: __builtin_add_overflow is used in <glibc>/elf/stringtable.c:stringtable_finalize() which leads to ldconfig failing with "String table is too large". This is also recognizable in following tests:
    FAIL: elf/tst-glibc-hwcaps-cache
    FAIL: elf/tst-glibc-hwcaps-prepend-cache
    FAIL: elf/tst-ldconfig-X
    FAIL: elf/tst-ldconfig-bad-aux-cache
    FAIL: elf/tst-ldconfig-ld_so_conf-update
    FAIL: elf/tst-stringtable
  See gcc "Bug 98269 - gcc 6.5.0 __builtin_add_overflow() with small uint32_t values incorrectly detects overflow" (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98269)
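For readers unfamiliar with the builtin involved, the sketch below shows the expected behavior of __builtin_add_overflow on uint32_t operands, which is what the affected compiler got wrong for small values. add_u32 is a hypothetical helper and not the stringtable.c code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: add A and B.  Returns true and stores the result
   in *SUM when it fits in uint32_t; returns false on overflow.  */
bool
add_u32 (uint32_t a, uint32_t b, uint32_t *sum)
{
  /* The builtin returns true when the mathematically exact result does
     not fit in *SUM's type, so the return value is negated here.  */
  return !__builtin_add_overflow (a, b, sum);
}
```

A correct compiler reports no overflow for small operands such as 1 + 2 and reports overflow only when the exact sum exceeds UINT32_MAX.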
* manual: Clarify File Access Modes section and add O_PATH | Florian Weimer | 2020-12-03 | 1 | -28/+44
  Kees Cook reported that the current text is misleading: <https://lore.kernel.org/lkml/202005150847.2B1ED8F81@keescook/>
* nptl: Return EINVAL for invalid clock for pthread_clockjoin_np | Adhemerval Zanella | 2020-11-25 | 1 | -0/+2
  This aligns the GNU extension with the other functions that accept a clock to wait on (such as pthread_mutex_clocklock).
  Checked on x86_64-linux-gnu.
  Reviewed-by: Lukasz Majewski <lukma@denx.de>
* Argument Syntax: Use "option", @option, and @command. | Carlos O'Donell | 2020-10-30 | 1 | -6/+6
  Suggested-by: David O'Brien <daobrien@redhat.com>
* Reword description of SXID_* tunable properties | Siddhesh Poyarekar | 2020-10-22 | 1 | -5/+6
  The SXID_* tunable properties only influence processes that are AT_SECURE, so make that a bit more explicit in the documentation and comment.
  Revisiting the code after a few years I managed to confuse myself, so I imagine there could be others who may have incorrectly assumed like I did that the SXID_ERASE tunables are not inherited by children of non-AT_SECURE processes.
  Reviewed-by: Florian Weimer <fweimer@redhat.com>
* Move vtimes to a compatibility symbol | Adhemerval Zanella | 2020-10-19 | 1 | -61/+0
  I couldn't pinpoint which standard added it, but no other POSIX system supports it, or they no longer provide it. The 'struct vtimes' also has a lot of drawbacks due to its limited internal type size.
  I also couldn't find any project that actually uses this symbol, even in a diagnostic way (such as a sanitizer). So I think it is safer to move it to a compat symbol instead of just deprecating it. The idea is to avoid new ports exporting such a broken interface (riscv32, for instance).
  Checked on x86_64-linux-gnu and i686-linux-gnu.
* manual: correct the spelling of "MALLOC_PERTURB_" [BZ #23015] | Benno Schulenberg | 2020-10-13 | 1 | -1/+1
  Reported-by: Martin Dorey <martin.dorey@hds.com>
* manual: replace an obsolete collation example with a valid one | Benno Schulenberg | 2020-10-13 | 1 | -3/+3
  In the Spanish language, the digraph "ll" has not been considered a separate letter since 1994: https://www.rae.es/consultas/exclusion-de-ch-y-ll-del-abecedario
  Since January 1998 (commit 49891c106244888123557fca7fddda4fa1f96b1d), glibc's locale data no longer specifies "ch" and "ll" as separate collation elements. So, it's better to not use "ll" in an example. Also, the Czech "ch" is a better example as it collates in a more surprising place.
* <sys/platform/x86.h>: Add FSRCS/FSRS/FZLRM support | H.J. Lu | 2020-10-09 | 1 | -0/+9
  Add Fast Short REP CMP and SCA (FSRCS), Fast Short REP STO (FSRS) and Fast Zero-Length REP MOV (FZLRM) support to <sys/platform/x86.h>.
* <sys/platform/x86.h>: Add Intel HRESET support | H.J. Lu | 2020-10-09 | 1 | -0/+3
  Add Intel HRESET support to <sys/platform/x86.h>.
* <sys/platform/x86.h>: Add AVX-VNNI support | H.J. Lu | 2020-10-09 | 1 | -0/+3
  Add AVX-VNNI support to <sys/platform/x86.h>.
* <sys/platform/x86.h>: Add AVX512_FP16 support | H.J. Lu | 2020-10-09 | 1 | -0/+3
  Add AVX512_FP16 support to <sys/platform/x86.h>.
* <sys/platform/x86.h>: Add Intel UINTR support | H.J. Lu | 2020-10-09 | 1 | -0/+3
  Add Intel UINTR support to <sys/platform/x86.h>.
* Linux: Require properly configured /dev/pts for PTYs | Florian Weimer | 2020-10-07 | 1 | -8/+3
  Current systems do not have BSD terminals, so the fallback code in posix_openpt/getpt does not do anything. Also remove the file system check for /dev/pts. Current systems always have a devpts file system mounted there if /dev/ptmx exists.
  grantpt is now essentially a no-op. It only verifies that the argument is a ptmx-descriptor. Therefore, this change indirectly addresses bug 24941.
  Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* manual: Fix typo | Jonathan Wakely | 2020-10-05 | 1 | -1/+1
* Set tunable value as well as min/max values | H.J. Lu | 2020-09-29 | 1 | -2/+22
  Some tunable values and their minimum/maximum values must be determined at run-time. Add TUNABLE_SET_WITH_BOUNDS and TUNABLE_SET_WITH_BOUNDS_FULL to update tunable value together with minimum and maximum values. __tunable_set_val is updated to set tunable value as well as min/max values.
* Reversing calculation of __x86_shared_non_temporal_threshold | Patrick McGehearty | 2020-09-28 | 1 | -1/+5
  The __x86_shared_non_temporal_threshold determines when memcpy on x86 uses non_temporal stores to avoid pushing other data out of the last level cache.
  This patch proposes to revert the calculation change made by H.J. Lu's patch of June 2, 2017.
  H.J. Lu's patch selected a threshold suitable for a single thread getting maximum performance. It was tuned using the single threaded large memcpy micro benchmark on an 8 core processor. The last change changes the threshold from using 3/4 of one thread's share of the cache to using 3/4 of the entire cache of a multi-threaded system before switching to non-temporal stores. Multi-threaded systems with more than a few threads are server-class and typically have many active threads. If one thread consumes 3/4 of the available cache for all threads, it will cause other active threads to have data removed from the cache. Two examples show the range of the effect. John McCalpin's widely parallel Stream benchmark, which runs in parallel and fetches data sequentially, saw a 20% slowdown with this patch on an internal system test of 128 threads. This regression was discovered when comparing OL8 performance to OL7. An example that compares normal stores to non-temporal stores may be found at https://vgatherps.github.io/2018-09-02-nontemporal/. A simple test shows performance loss of 400 to 500% due to a failure to use nontemporal stores. These performance losses are most likely to occur when the system load is heaviest and good performance is critical.
  The tunable x86_non_temporal_threshold can be used to override the default for the knowledgeable user who really wants maximum cache allocation to a single thread in a multi-threaded system. The manual entry for the tunable has been expanded to provide more information about its purpose.
  modified: sysdeps/x86/cacheinfo.c
  modified: manual/tunables.texi
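The difference between the two policies is simple arithmetic, sketched below. This is an illustration only: the function names, the integer rounding, and the sample cache size are assumptions, and the real calculation lives in sysdeps/x86/cacheinfo.c.

```c
#include <stddef.h>

/* Hypothetical: threshold as 3/4 of one thread's share of the shared
   cache, the policy this patch returns to.  */
size_t
threshold_per_thread (size_t shared_cache_bytes, unsigned int threads)
{
  if (threads == 0)
    threads = 1;
  return shared_cache_bytes / threads * 3 / 4;
}

/* Hypothetical: threshold as 3/4 of the entire shared cache, the policy
   being reverted.  */
size_t
threshold_whole_cache (size_t shared_cache_bytes)
{
  return shared_cache_bytes / 4 * 3;
}
```

For an assumed 32 MiB shared cache with 16 threads, the per-thread policy switches to non-temporal stores at 1.5 MiB, whereas the whole-cache policy waits until 24 MiB, which is the gap the commit message argues hurts heavily loaded multi-threaded systems.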
* <sys/platform/x86.h>: Add Intel Key Locker support | H.J. Lu | 2020-09-16 | 1 | -0/+9
  Add Intel Key Locker: https://software.intel.com/content/www/us/en/develop/download/intel-key-locker-specification.html support to <sys/platform/x86.h>. Intel Key Locker has
  1. KL: AES Key Locker instructions.
  2. WIDE_KL: AES wide Key Locker instructions.
  3. AESKLE: AES Key Locker instructions are enabled by OS.
  Applications should use
    if (CPU_FEATURE_USABLE (KL))
  and
    if (CPU_FEATURE_USABLE (WIDE_KL))
  to check if AES Key Locker instructions and AES wide Key Locker instructions are usable.
* x86: Install <sys/platform/x86.h> [BZ #26124] | H.J. Lu | 2020-09-11 | 1 | -0/+517
  Install <sys/platform/x86.h> so that programmers can do
    #if __has_include(<sys/platform/x86.h>)
    #include <sys/platform/x86.h>
    #endif
    ...
      if (CPU_FEATURE_USABLE (SSE2))
    ...
      if (CPU_FEATURE_USABLE (AVX2))
    ...
  <sys/platform/x86.h> exports only:
    enum
    {
      COMMON_CPUID_INDEX_1 = 0,
      COMMON_CPUID_INDEX_7,
      COMMON_CPUID_INDEX_80000001,
      COMMON_CPUID_INDEX_D_ECX_1,
      COMMON_CPUID_INDEX_80000007,
      COMMON_CPUID_INDEX_80000008,
      COMMON_CPUID_INDEX_7_ECX_1,
      /* Keep the following line at the end. */
      COMMON_CPUID_INDEX_MAX
    };
    struct cpuid_features
    {
      struct cpuid_registers cpuid;
      struct cpuid_registers usable;
    };
    struct cpu_features
    {
      struct cpu_features_basic basic;
      struct cpuid_features features[COMMON_CPUID_INDEX_MAX];
    };
    /* Get a pointer to the CPU features structure. */
    extern const struct cpu_features *__x86_get_cpu_features (unsigned int max) __attribute__ ((const));
  Since all feature checks are done through macros, programs compiled with a newer <sys/platform/x86.h> are compatible with the older glibc binaries as long as the layout of struct cpu_features is identical. The features array can be expanded with backward binary compatibility for both .o and .so files. When COMMON_CPUID_INDEX_MAX is increased to support new processor features, __x86_get_cpu_features in the older glibc binaries returns NULL and HAS_CPU_FEATURE/CPU_FEATURE_USABLE return false on the new processor feature. No new symbol version is needed.
  Both CPU_FEATURE_USABLE and HAS_CPU_FEATURE are provided. HAS_CPU_FEATURE can be used to identify processor features.
  Note: Although GCC has __builtin_cpu_supports, it only supports a subset of <sys/platform/x86.h> and it is equivalent to CPU_FEATURE_USABLE. It doesn't support HAS_CPU_FEATURE.
* Add mallinfo2 function that support sizes >= 4GB. | Martin Liska | 2020-08-31 | 1 | -18/+18
  The current int type can easily overflow for allocation of more than 4GB.
* manual: Fix sigdescr_np and sigabbrev_np return type (BZ #26343) | Adhemerval Zanella | 2020-08-08 | 1 | -2/+2
* manual: Put the strerrorname_np and strerrordesc_np return type in braces | Adhemerval Zanella | 2020-08-07 | 1 | -2/+2
  Otherwise it is not rendered or indexed correctly.
* manual: Fix strerrorname_np and strerrordesc_np return type (BZ #26343) | Adhemerval Zanella | 2020-08-07 | 1 | -2/+2
* manual: Fix some @code/@var formatting glitches chapter Date And Time | Florian Weimer | 2020-08-05 | 1 | -10/+10