about summary refs log tree commit diff
Commit message (Collapse)AuthorAgeFilesLines
* support: Add check for TID zero in support_wait_for_thread_exitFlorian Weimer2021-10-011-1/+4
| | | | | | | | | | | | | | | | Some kernel versions (observed with kernel 5.14 and earlier) can list "0" entries in /proc/self/task. This happens when a thread exits while the task list is being constructed. Treat this entry as not present, like the proposed kernel patch does: [PATCH] procfs: Do not list TID 0 in /proc/<pid>/task <https://lore.kernel.org/all/8735pn5dx7.fsf@oldenburg.str.redhat.com/> Fixes commit 032d74eaf6179100048a5bf0ce942e97dc8b9a60 ("support: Add support_wait_for_thread_exit"). Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>
* nptl: Add CLOCK_MONOTONIC support for PI mutexesAdhemerval Zanella2021-10-014-38/+54
| | | | | | | | | | | | | | | Linux added FUTEX_LOCK_PI2 to support clock selection (commit bf22a6976897977b0a3f1aeba6823c959fc4fdae). With the new flag we can now proper support CLOCK_MONOTONIC for pthread_mutex_clocklock with Priority Inheritance. If kernel does not support, EINVAL is returned instead. The difference is the futex operation will be issued and the kernel will advertise the missing support (instead of hard-code error return). Checked on x86_64-linux-gnu and i686-linux-gnu on Linux 5.14, 5.11, and 4.15.
* support: Add support_mutex_pi_monotonicAdhemerval Zanella2021-10-013-0/+41
| | | | Returns true if Priority Inheritance support CLOCK_MONOTONIC.
* nptl: Use FUTEX_LOCK_PI2 when availableAdhemerval Zanella2021-10-015-56/+72
| | | | | | | | | | | This patch uses the new futex PI operation provided by Linux v5.14 when it is required. The futex_lock_pi64() is moved to futex-internal.c (since it used on two different places and its code size might be large depending of the kernel configuration) and clockid is added as an argument. Co-authored-by: Kurt Kanzenbach <kurt@linutronix.de>
* Linux: Add FUTEX_LOCK_PI2Kurt Kanzenbach2021-10-011-0/+8
| | | | | | | | | | Linux v5.14.0 introduced a new futex operation called FUTEX_LOCK_PI2. This kernel feature can be used to implement pthread_mutex_clocklock(MONOTONIC)/PI. Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* Add C2X _PRINTF_NAN_LEN_MAXJoseph Myers2021-09-303-1/+12
| | | | | | | | | | | | | | C2X adds a macro _PRINTF_NAN_LEN_MAX to <stdio.h>, giving the maximum length of printf output for a NaN. glibc never includes an n-char-sequence in its printf output for NaNs, so the correct value for glibc is 4 ("-nan" or "-NAN"); define the macro accordingly. This patch makes the macro definition conditional on __GLIBC_USE (ISOC2X), as is generally done with features from new standard versions. The name is in the implementation namespace for older standards, so it would also be possible to define it unconditionally. Tested for x86_64.
* Add exp10 macro to <tgmath.h> (bug 26108)Joseph Myers2021-09-305-4/+24
| | | | | | | | | | | | | glibc has had exp10 functions since long before they were standardized; now they are standardized in TS 18661-4 and C2X, they are also specified there to have a corresponding type-generic macro. Add one to <tgmath.h>, so fixing bug 26108. glibc doesn't have other functions from TS 18661-4 yet, but when added, it will be natural to add the type-generic macro for each function family at the same time as the functions. Tested for x86_64.
* elf: Replace nsid with args.nsid [BZ #27609]H.J. Lu2021-09-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | commit ec935dea6332cb22f9881cd1162bad156173f4b0 Author: Florian Weimer <fweimer@redhat.com> Date: Fri Apr 24 22:31:15 2020 +0200 elf: Implement __libc_early_init has @@ -856,6 +876,11 @@ no more namespaces available for dlmopen()")); /* See if an error occurred during loading. */ if (__glibc_unlikely (exception.errstring != NULL)) { + /* Avoid keeping around a dangling reference to the libc.so link + map in case it has been cached in libc_map. */ + if (!args.libc_already_loaded) + GL(dl_ns)[nsid].libc_map = NULL; + do_dlopen calls _dl_open with nsid == __LM_ID_CALLER (-2), which calls dl_open_worker with args.nsid = nsid. dl_open_worker updates args.nsid if it is __LM_ID_CALLER. After dl_open_worker returns, it is wrong to use nsid. Replace nsid with args.nsid after dl_open_worker returns. This fixes BZ #27609.
* Add missing braces to bsearch inline implementation [BZ #28400]Florian Weimer2021-09-301-1/+3
| | | | | | | | | | GCC treats the pragma as a statement, so that the else branch only consists of the pragma, not the return statement. Fixes commit a725ff1de965f4cc4f36a7e8ae795d40ca0350d7 ("Suppress -Wcast-qual warnings in bsearch"). Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* Update alpha libm-test-ulpsAdhemerval Zanella2021-09-301-49/+53
|
* Suppress -Wcast-qual warnings in bsearchJonathan Wakely2021-09-301-1/+8
| | | | | | | | | | The first cast to (void *) is redundant but should be (const void *) anyway, because that's the type of the lvalue being assigned to. The second cast is necessary and intentionally not const-correct, so tell the compiler not to warn about it. Reviewed-by: Florian Weimer <fweimer@redhat.com>
* elf: Copy l_addr/l_ld when adding ld.so to a new namespaceH.J. Lu2021-09-291-0/+4
| | | | | | | | | | | | When add ld.so to a new namespace, we don't actually load ld.so. We create a new link map and refers the real one for almost everything. Copy l_addr and l_ld from the real ld.so link map to avoid GDB warning: warning: .dynamic section for ".../elf/ld-linux-x86-64.so.2" is not at the expected address (wrong library or version mismatch?) when handling shared library loaded by dlmopen. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* powerpc: Fix unrecognized instruction errors with recent binutilsPaul A. Clarke2021-09-292-6/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Recent versions of binutils (with commit b25f942e18d6ecd7ec3e2d2e9930eb4f996c258a) stopped preserving "sticky" options across a base `.machine` directive, nullifying the use of passing "-many" through GCC to the assembler. As a result, some instructions which were recognized even under older, more stringent `.machine` directives become unrecognized instructions in that context. In `sysdeps/powerpc/tst-set_ppr.c`, the use of the `mfppr32` extended mnemonic became unrecognized, as the default compilation with GCC for 32bit powerpc adds a `.machine ppc` in the resulting assembly, so the command line option `-Wa,-many` is essentially ignored, and the ISA 2.06 instructions and mnemonics, like `mfppr32`, are unrecognized. The compilation of `sysdeps/powerpc/tst-set_ppr.c` fails with: Error: unrecognized opcode: `mfppr32' Add appropriate `.machine` directives in the assembly to bracket the `mfppr32` instruction. Part of a 2019 fix (commit 9250e6610fdb0f3a6f238d2813e319a41fb7a810) to the above test's Makefile to add `-many` to the compilation when GCC itself stopped passing `-many` to the assember no longer has any effect, so remove that. Reported-by: Joseph Myers <joseph@codesourcery.com>
* Do not declare fmax, fmin _FloatN, _FloatNx versions for C2XJoseph Myers2021-09-292-2/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At the last WG14 meeting, <http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2711.htm> was accepted, which places more emphasis on the new fmaximum / fminimum functions and less on the old fmax / fmin functions. Some of the changes are to examples, notes or otherwise don't require implementation changes. However, the changes include removing the _FloatN / _FloatNx versions of the fmax and fmin functions that came from TS 18661-3. Thus, those function versions should only be declared under similar conditions to the _FloatN / _FloatNx versions of fmaxmag and fminmag: for _GNU_SOURCE and pre-C2X use of __STDC_WANT_IEC_60559_TYPES_EXT__, but not for C2X without _GNU_SOURCE. In turn this requires a tgmath.h change so that the corresponding tgmath.h macros, for C2X with __STDC_WANT_IEC_60559_TYPES_EXT__ but without _GNU_SOURCE, don't try to use function variants that aren't declared. (That issue doesn't arise for the tgmath.h macros for fmaxmag and fminmag, because those aren't defined at all in those circumstances unless __STDC_WANT_IEC_60559_BFP_EXT__ (from TS 18661-1 and not specified at all by C2X) is also defined, and in that case the _FloatN / _FloatNx versions of fmaxmag and fminmag get declared - this is only ever an issue when it's possible for some functions corresponding to a type-generic-macro to be declared, and for _FloatN / _FloatNx functions in general to be declared, but without the _FloatN / _FloatNx functions corresponding to that particular macro being declared.) Tested for x86_64.
* Do not define tgmath.h fmaxmag, fminmag macros for C2X (bug 28397)Joseph Myers2021-09-291-0/+2
| | | | | | | | | | | | | C2X does not include fmaxmag and fminmag. When I updated feature test macro handling accordingly (commit 858045ad1c5ac1682288bbcb3676632b97a21ddf, "Update floating-point feature test macro handling for C2X", included in 2.34), I missed updating tgmath.h so it doesn't define the corresponding type-generic macros unless __STDC_WANT_IEC_60559_BFP_EXT__ is defined; I've now reported this as bug 28397. Adjust the conditionals in tgmath.h accordingly. Tested for x86_64.
* Add fmaximum, fminimum functionsJoseph Myers2021-09-2866-11/+3634
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | C2X adds new <math.h> functions for floating-point maximum and minimum, corresponding to the new operations that were added in IEEE 754-2019 because of concerns about the old operations not being associative in the presence of signaling NaNs. fmaximum and fminimum handle NaNs like most <math.h> functions (any NaN argument means the result is a quiet NaN). fmaximum_num and fminimum_num handle both quiet and signaling NaNs the way fmax and fmin handle quiet NaNs (if one argument is a number and the other is a NaN, return the number), but still raise "invalid" for a signaling NaN argument, making them exceptions to the normal rule that a function with a floating-point result raising "invalid" also returns a quiet NaN. fmaximum_mag, fminimum_mag, fmaximum_mag_num and fminimum_mag_num are corresponding functions returning the argument with greatest or least absolute value. All these functions also treat +0 as greater than -0. There are also corresponding <tgmath.h> type-generic macros. Add these functions to glibc. The implementations use type-generic templates based on those for fmax, fmin, fmaxmag and fminmag, and test inputs are based on those for those functions with appropriate adjustments to the expected results. The RISC-V maintainers might wish to add optimized versions of fmaximum_num and fminimum_num (for float and double), since RISC-V (F extension version 2.2 and later) provides instructions corresponding to those functions - though it might be at least as useful to add architecture-independent built-in functions to GCC and teach the RISC-V back end to expand those functions inline, which is what you generally want for functions that can be implemented with a single instruction. Tested for x86_64 and x86, and with build-many-glibcs.py.
* Linux: Simplify __opensock and fix race condition [BZ #28353]Florian Weimer2021-09-283-159/+22
| | | | | | | | | | | | AF_NETLINK support is not quite optional on modern Linux systems anymore, so it is likely that the first attempt will always succeed. Consequently, there is no need to cache the result. Keep AF_UNIX and the Internet address families as a fallback, for the rare case that AF_NETLINK is missing. The other address families previously probed are totally obsolete be now, so remove them. Use this simplified version as the generic implementation, disabling Netlink support as needed.
* pthread/tst-cancel28: Fix barrier re-init race conditionStafford Horne2021-09-281-1/+0
| | | | | | | | | | | | | | | | | | | | When running this test on the OpenRISC port I am working on this test fails with a timeout. The test passes when being straced or debugged. Looking at the code there seems to be a race condition in that: 1 main thread: calls xpthread_cancel 2 sub thread : receives cancel signal 3 sub thread : cleanup routine waits on barrier 4 main thread: re-inits barrier 5 main thread: waits on barrier After getting to 5 the main thread and sub thread wait forever as the 2 barriers are no longer the same. Removing the barrier re-init seems to fix this issue. Also, the barrier does not need to be reinitialized as that is done by default. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* powerpc: Delete unneeded ELF_MACHINE_BEFORE_RTLD_RELOCFangrui Song2021-09-272-4/+0
| | | | Reviewed-by: Raphael M Zinsly <rzinsly@linux.ibm.com>
* posix: Remove spawni.cAdhemerval Zanella2021-09-271-343/+0
| | | | | | | | Although it provide an alternate implementation that communicates using pipe() instead of shared memory, no port uses and it adds extra burden for posix_spawn() extensions. Reviewed-by: Florian Weimer <fweimer@redhat.com>
* Disable symbol hack in libc_nonshared.aH.J. Lu2021-09-272-2/+4
| | | | | Don't reference __GI_memmove, __GI_memset, __GI_memcpy, __divdi3_internal, __udivdi3_internal and __moddi3_internal in libc_nonshared.a.
* linux: Revert the use of sched_getaffinity on get_nproc (BZ #28310)Adhemerval Zanella2021-09-271-5/+134
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The use of sched_getaffinity on get_nproc and sysconf (_SC_NPROCESSORS_ONLN) done in 903bc7dcc2acafc40 (BZ #27645) breaks the top command in common hypervisor configurations and also other monitoring tools. The main issue using sched_getaffinity changed the symbols semantic from system-wide scope of online CPUs to per-process one (which can be changed with kernel cpusets or book parameters in VM). This patch reverts mostly of the 903bc7dcc2acafc40, with the exceptions: * No more cached values and atomic updates, since they are inherent racy. * No /proc/cpuinfo fallback, since /proc/stat is already used and it would require to revert more arch-specific code. * The alloca is replace with a static buffer of 1024 bytes. So the implementation first consult the sysfs, and fallbacks to procfs. Checked on x86_64-linux-gnu. Reviewed-by: Florian Weimer <fweimer@redhat.com>
* linux: Simplify get_nprocsAdhemerval Zanella2021-09-273-51/+68
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch simplifies the memory allocation code and uses the sched routines instead of reimplement it. This still uses a stack allocation buffer, so it can be used on malloc initialization code. Linux currently supports at maximum of 4096 cpus for most architectures: $ find -iname Kconfig | xargs git grep -A10 -w NR_CPUS | grep -w range arch/alpha/Kconfig- range 2 32 arch/arc/Kconfig- range 2 4096 arch/arm/Kconfig- range 2 16 if DEBUG_KMAP_LOCAL arch/arm/Kconfig- range 2 32 if !DEBUG_KMAP_LOCAL arch/arm64/Kconfig- range 2 4096 arch/csky/Kconfig- range 2 32 arch/hexagon/Kconfig- range 2 6 if SMP arch/ia64/Kconfig- range 2 4096 arch/mips/Kconfig- range 2 256 arch/openrisc/Kconfig- range 2 32 arch/parisc/Kconfig- range 2 32 arch/riscv/Kconfig- range 2 32 arch/s390/Kconfig- range 2 512 arch/sh/Kconfig- range 2 32 arch/sparc/Kconfig- range 2 32 if SPARC32 arch/sparc/Kconfig- range 2 4096 if SPARC64 arch/um/Kconfig- range 1 1 arch/x86/Kconfig-# [NR_CPUS_RANGE_BEGIN ... NR_CPUS_RANGE_END] range. arch/x86/Kconfig- range NR_CPUS_RANGE_BEGIN NR_CPUS_RANGE_END arch/xtensa/Kconfig- range 2 32 With x86 supporting 8192: arch/x86/Kconfig 976 config NR_CPUS_RANGE_END 977 int 978 depends on X86_64 979 default 8192 if SMP && CPUMASK_OFFSTACK 980 default 512 if SMP && !CPUMASK_OFFSTACK 981 default 1 if !SMP So using a maximum of 32k cpu should cover all cases (and I would expect once we start to have many more CPUs that Linux would provide a more straightforward way to query for such information). A test is added to check if sched_getaffinity can successfully return with large buffers. Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Florian Weimer <fweimer@redhat.com>
* misc: Add __get_nprocs_schedAdhemerval Zanella2021-09-275-2/+25
| | | | | | | | | | | This is an internal function meant to return the number of avaliable processor where the process can scheduled, different than the __get_nprocs which returns a the system available online CPU. The Linux implementation currently only calls __get_nprocs(), which in tuns calls sched_getaffinity. Reviewed-by: Florian Weimer <fweimer@redhat.com>
* htl: Fix sigset of main threadSamuel Thibault2021-09-261-2/+5
| | | | | | | d482ebfa6785 ('htl: Keep thread signals blocked during its initialization') fixed not letting signals get delivered too early during thread creation, but it also affected the main thread, thus making it block signals by default. We need to just let the main thread sigset as it is.
* htl: make pthread_sigstate read/write set/oset outside sigstate sectionSamuel Thibault2021-09-261-5/+11
| | | | so that if a segfault occurs, the handler can run fine.
* Avoid warning: overriding recipe for .../tst-ro-dynamic-mod.soH.J. Lu2021-09-251-2/+3
| | | | | | | | Add tst-ro-dynamic-mod to modules-names-nobuild to avoid ../Makerules:767: warning: ignoring old recipe for target '.../elf/tst-ro-dynamic-mod.so' This updates BZ #28340 fix.
* benchtests: Improve reliability of memcmp benchmarksNoah Goldstein2021-09-241-11/+10
| | | | | | | | | | | | | | No bug. Remove reallocation of bufs between implementation tests. Move initialization outside of foreach implementation test loop. Increase iteration count. Generally before this commit was seeing a great deal of variability between runs. The goal of this commit is to make the results more reliable. Benchtests build and bench-memcmp succeeding. Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
* Define __STDC_IEC_60559_BFP__ and __STDC_IEC_60559_COMPLEX__Joseph Myers2021-09-242-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | TS 18661-1 and C2X specify predefined macros __STDC_IEC_60559_BFP__ and __STDC_IEC_60559_COMPLEX__, making __STDC_IEC_559__ and __STDC_IEC_559_COMPLEX__ obsolescent (but still included in the standard). Now that we have all the functions from TS 18661-1, define these macros in stdc-predef.h, under the same conditions in which the older macros are defined, since support for the floating-point features in TS 18661-1 is now at the same level as that for those in C11 and before (all library functions and other library APIs present, but no standard pragma support). The macros are defined for now with their TS 18661-1 values. C2X will give them new values (listed as yyyymmL in the working drafts until the final standard), at which point there will be the question of what value to use in stdc-predef.h (where it could depend on __STDC_VERSION__, but not on feature test macros defined by the user). My inclination then would be to use the C2X value unconditionally rather than using an older value to indicate TS support, and only have any C standard version conditionals for the value when subsequent C standard versions define further values. (Note that I'm also inclined, when we implement the C2X change to the return types of fromfp functions, to make that change unconditional much like the change made to the types of totalorder functions, with the old version only supported with compat symbols for already-linked programs and not as an API for newly built objects. So using the C2X value would also accurately reflect not supporting the versions of APIs in the TS where those ended up being incompatible with the first version actually added to the standard.) Tested for x86_64.
* build-many-glibcs.py: add powerpc64le glibc variant without multiarchPaul E. Murphy2021-09-241-1/+3
| | | | | | This configuration tests the float128 to ldouble128 redirect support on powerpc64le without the extra wrappers needed to support ifunc on this target.
* Fix sysdeps/x86/fpu/s_ffma.c for 32-bit FMA processor caseJoseph Myers2021-09-241-2/+6
| | | | | | | | | | | | | | | | | | | | It turns out the __SSE2_MATH__ conditional in sysdeps/x86/fpu/s_ffma.c does not cover all cases where the x86 fenv_private.h macros might manipulate one of the SSE and 387 floating-point state, while the actual fma implementation uses the other. Specifically, in the 32-bit case, with a compiler not defaulting to -mfpmath=sse, but testing on a processor with hardware FMA support, the multiarch fma function implementations will end up using SSE, while the fenv_private.h macros will use the 387 state for double. Change the conditional to use the default macros rather than the optimized ones in all cases except when the compiler inlines an fma instruction (in which case, since all those instructions are SSE instructions and -mfpmath=sse must be in effect for them to be inlined, the optimized macros will only use the SSE state and it's OK for them to only use the SSE state). Tested for x86_64 and x86. H.J. reports in <https://sourceware.org/pipermail/libc-alpha/2021-September/131367.html> that it fixes the problems he observed.
* Linux: Avoid closing -1 on failure in __closefrom_fallbackFlorian Weimer2021-09-241-1/+1
| | | | Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* i386: Port elf_machine_{load_address,dynamic} from x86-64Fangrui Song2021-09-241-16/+9
| | | | | | | | | This drops reliance on _GLOBAL_OFFSET_TABLE_[0] being the link-time address of _DYNAMIC. The code sequence length does not change. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* aarch64: Disable A64FX memcpy/memmove BTI unconditionallyNaohiro Tamura2021-09-241-0/+3
| | | | | | | | | This patch disables A64FX memcpy/memmove BTI instruction insertion unconditionally such as A64FX memset patch [1] for performance. [1] commit 07b427296b8d59f439144029d9a948f6c1ce0a31 Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
* xsysconf: Only fail on error results and errno setStafford Horne2021-09-241-1/+1
| | | | | | | | | | | | | | | | When testing nptl/tst-pthread-attr-affinity-fail fails with: error: xsysconf.c:33: sysconf (83): Cannot allocate memory error: 1 test failures This happens as xsysconf checks the errno after running sysconf. Internally the sysconf request for _SC_NPROCESSORS_CONF on linux allocates memory. But there is a problem, even though malloc succeeds errno is getting set to ENOMEM. POSIX allows successful calls to clobber errno. So xsysconf just checking errno is wrong. Fix xsysconf by only failing if we have an error result and errno is set.
* powerpc64le: Avoid conflicting types for f64xfmaf128 when IFUNC is not usedTulio Magno Quites Machado Filho2021-09-231-0/+2
| | | | | | | | | Avoid defining f64xfmaf128 twice when building s_fmaf128.c. This can be reproduced on powerpc64le whenever f128 functions do not have IFUNC enabled, e.g. using "--with-cpu=power8 --disable-multi-arch", or when using "-with-cpu=power9". Fixes: b3f27d8150d4f ("Add narrowing fma functions")
* Fix ffma use of round-to-odd on x86Joseph Myers2021-09-231-0/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On 32-bit x86 with -mfpmath=sse, and on x86_64 with --disable-multi-arch, the tests of ffma and its aliases (fma narrowing from binary64 to binary32) fail. This is probably the issue reported by H.J. in <https://sourceware.org/pipermail/libc-alpha/2021-September/131277.html>. The problem is the use of fenv_private.h macros in the round-to-odd implementation. Those macros are set up to manipulate only one of the SSE and 387 floating-point state, whichever is relevant for the type indicated by the suffix on the macro name. But x86 configurations sometimes use the ldbl-96 implementation of binary64 fma (that's where --disable-multi-arch is relevant for x86_64: it causes the ldbl-96 implementation to be used, instead of an IFUNC implementation that falls back to the dbl-64 version), contrary to the expectations of those macros for functions operating on double when __SSE2_MATH__ is defined. This can be addressed by using the default versions of those macros (giving x86 its own version of s_ffma.c), as is done for the *f128 macro variants where it depends on the details of how GCC was configured when building libgcc which floating-point state is affected by _Float128 arithmetic. The issue only applies when __SSE2_MATH__ is defined, and doesn't apply when __FP_FAST_FMA is defined (because in that case, fma will be inlined by the compiler, meaning it's definitely an SSE operation; for the same reason, this is not an issue for narrowing sqrt, as hardware sqrt is always inlined in that implementation for x86), but in other cases it's safest to use the default versions of the fenv_private.h macros to ensure things work whichever fma implementation is used. Tested for x86_64 (with and without --disable-multi-arch) and x86 (with and without -mfpmath=sse).
* vfprintf: Unify argument handling in process_argFlorian Weimer2021-09-231-117/+89
| | | | | | | Instead of checking a pointer argument for NULL, use helper macros defined differently in the non-positional and positional cases. This avoids frequent conditional checks and a GCC 12 warning about comparing pointers against NULL which cannot be NULL.
* vfprintf: Handle floating-point cases outside of process_arg macroFlorian Weimer2021-09-231-111/+75
| | | | | | A lot of the code is unique to the positional and non-positional code. Also unify the decimal and hexadecimal cases via the new helper function __printf_fp_spec.
* nptl: Avoid setxid deadlock with blocked signals in thread exit [BZ #28361]Florian Weimer2021-09-233-2/+72
| | | | | | | | | | | | | | | | | | | | | | As part of the fix for bug 12889, signals are blocked during thread exit, so that application code cannot run on the thread that is about to exit. This would cause problems if the application expected signals to be delivered after the signal handler revealed the thread to still exist, despite pthread_kill can no longer be used to send signals to it. However, glibc internally uses the SIGSETXID signal in a way that is incompatible with signal blocking, due to the way the setxid handshake delays thread exit until the setxid operation has completed. With a blocked SIGSETXID, the handshake can never complete, causing a deadlock. As a band-aid, restore the previous handshake protocol by not blocking SIGSETXID during thread exit. The new test sysdeps/pthread/tst-pthread-setuid-loop.c is based on a downstream test by Martin Osvald. Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>
* Add narrowing fma functionsJoseph Myers2021-09-2282-309/+37046
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds the narrowing fused multiply-add functions from TS 18661-1 / TS 18661-3 / C2X to glibc's libm: ffma, ffmal, dfmal, f32fmaf64, f32fmaf32x, f32xfmaf64 for all configurations; f32fmaf64x, f32fmaf128, f64fmaf64x, f64fmaf128, f32xfmaf64x, f32xfmaf128, f64xfmaf128 for configurations with _Float64x and _Float128; __f32fmaieee128 and __f64fmaieee128 aliases in the powerpc64le case (for calls to ffmal and dfmal when long double is IEEE binary128). Corresponding tgmath.h macro support is also added. The changes are mostly similar to those for the other narrowing functions previously added, especially that for sqrt, so the description of those generally applies to this patch as well. As with sqrt, I reused the same test inputs in auto-libm-test-in as for non-narrowing fma rather than adding extra or separate inputs for narrowing fma. The tests in libm-test-narrow-fma.inc also follow those for non-narrowing fma. The non-narrowing fma has a known bug (bug 6801) that it does not set errno on errors (overflow, underflow, Inf * 0, Inf - Inf). Rather than fixing this or having narrowing fma check for errors when non-narrowing does not (complicating the cases when narrowing fma can otherwise be an alias for a non-narrowing function), this patch does not attempt to check for errors from narrowing fma and set errno; the CHECK_NARROW_FMA macro is still present, but as a placeholder that does nothing, and this missing errno setting is considered to be covered by the existing bug rather than needing a separate open bug. missing-errno annotations are duly added to many of the auto-libm-test-in test inputs for fma. This completes adding all the new functions from TS 18661-1 to glibc, so will be followed by corresponding stdc-predef.h changes to define __STDC_IEC_60559_BFP__ and __STDC_IEC_60559_COMPLEX__, as the support for TS 18661-1 will be at a similar level to that for C standard floating-point facilities up to C11 (pragmas not implemented, but library functions done). (There are still further changes to be done to implement changes to the types of fromfp functions from N2548.) Tested as followed: natively with the full glibc testsuite for x86_64 (GCC 11, 7, 6) and x86 (GCC 11); with build-many-glibcs.py with GCC 11, 7 and 6; cross testing of math/ tests for powerpc64le, powerpc32 hard float, mips64 (all three ABIs, both hard and soft float). The different GCC versions are to cover the different cases in tgmath.h and tgmath.h tests properly (GCC 6 has _Float* only as typedefs in glibc headers, GCC 7 has proper _Float* support, GCC 8 adds __builtin_tgmath).
* ld.so: Replace DL_RO_DYN_SECTION with dl_relocate_ld [BZ #28340]H.J. Lu2021-09-2216-41/+198
| | | | | | | | | | | | | | | | | We can't relocate entries in dynamic section if it is readonly: 1. Add a l_ld_readonly field to struct link_map to indicate if dynamic section is readonly and set it based on p_flags of PT_DYNAMIC segment. 2. Replace DL_RO_DYN_SECTION with dl_relocate_ld to decide if dynamic section should be relocated. 3. Remove DL_RO_DYN_TEMP_CNT. 4. Don't use a static dynamic section to make readonly dynamic section in vDSO writable. 5. Remove the temp argument from elf_get_dynamic_info. This fixes BZ #28340. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* Adjust new narrowing div/mul tests for IBM long double, update powerpc ULPsJoseph Myers2021-09-224-3664/+3667
| | | | | | Testing for powerpc shows some of the new narrowing div/mul tests need XFAILing for IBM long double and some ULPs updates are needed for those tests.
* Mention today's regex merge in SHARED-FILESPaul Eggert2021-09-211-0/+12
|
* Fix f64xdivf128, f64xmulf128 spurious underflows (bug 28358)Joseph Myers2021-09-2118-31/+9604
| | | | | | | | | | | | | | | | | | | | | | | | | | As described in bug 28358, the round-to-odd computations used in the libm functions that round their results to a narrower format can yield spurious underflow exceptions in the following circumstances: the narrowing only narrows the precision of the type and not the exponent range (i.e., it's narrowing _Float128 to _Float64x on x86_64, x86 or ia64), the architecture does after-rounding tininess detection (which applies to all those architectures), the result is inexact, tiny before rounding but not tiny after rounding (with the chosen rounding mode) for _Float64x (which is possible for narrowing mul, div and fma, not for narrowing add, sub or sqrt), so the underflow exception resulting from the toward-zero computation in _Float128 is spurious for _Float64x. Fixed by making ROUND_TO_ODD call feclearexcept (FE_UNDERFLOW) in the problem cases (as indicated by an extra argument to the macro); there is never any need to preserve underflow exceptions from this part of the computation, because the conversion of the round-to-odd value to the narrower type will underflow in exactly the cases in which the function should raise that exception, but it may be more efficient to avoid the extra manipulation of the floating-point environment when not needed. Tested for x86_64 and x86, and with build-many-glibcs.py.
* regex: copy back from GnulibPaul Eggert2021-09-219-78/+142
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Copy regex-related files back from Gnulib, to fix a problem with static checking of regex calls noted by Martin Sebor. This merges the following changes: * New macro __attribute_nonnull__ in misc/sys/cdefs.h, for use later when copying other files back from Gnulib. * Use __GNULIB_CDEFS instead of __GLIBC__ when deciding whether to include bits/wordsize.h etc. * Avoid duplicate entries in epsilon closure table. * New regex.h macro _REGEX_NELTS to let regexec say that its pmatch arg should contain nmatch elts. Use that for regexec, instead of __attr_access (which is incorrect). * New regex.h macro _Attr_access_ which is like __attr_access except portable to non-glibc platforms. * Add some DEBUG_ASSERTs to pacify gcc -fanalyzer and to catch recently-fixed performance bugs if they recur. * Add Gnulib-specific stuff to port the dynarray- and lock-using parts of regex code to non-glibc platforms. * Fix glibc bug 11053. * Avoid some undefined behavior when popping an empty fail stack.
* nptl: Fix type of pthread_mutexattr_getrobust_np, ↵Florian Weimer2021-09-211-2/+2
| | | | | | | pthread_mutexattr_setrobust_np (bug 28036) Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>
* powerpc: Fix unrecognized instruction errors with recent GCCPaul A. Clarke2021-09-201-0/+1
| | | | | | | | | | | | | | | | | | | | | | Recent binutils commit b25f942e18d6ecd7ec3e2d2e9930eb4f996c258a changes the behavior of `.machine` directives to override, rather than augment, the base CPU. This can result in _reduced_ functionality when, for example, compiling for default machine "power8", but explicitly asking for ".machine power5", which loses Altivec instructions. In tst-ucontext-ppc64-vscr.c, while the instructions provoking the new error messages are bracketed by ".machine power5", which is ostensibly Power ISA 2.03 (POWER5), the POWER5 processor did not support the VSX subset, so these instructions are not recognized as "power5". Error: unrecognized opcode: `vspltisb' Error: unrecognized opcode: `vpkuwus' Error: unrecognized opcode: `mfvscr' Error: unrecognized opcode: `stvx' Manually adding the VSX subset via ".machine altivec" is sufficient. Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
* elf: Include <sysdep.h> in elf/dl-debug-symbols.SFlorian Weimer2021-09-201-0/+4
| | | | | | | This is necessary to generate assembler marker sections on some targets. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
* nptl: pthread_kill needs to return ESRCH for old programs (bug 19193)Florian Weimer2021-09-202-10/+48
| | | | | | The fix for bug 19193 breaks some old applications which appear to use pthread_kill to probe if a thread is still running, something that is not supported by POSIX.