Commit message | Author | Age | Files | Lines
* linux: Simplify get_nprocs | Adhemerval Zanella | 2021-09-27 | 3 | -51/+68
    This patch simplifies the memory allocation code and uses the sched
    routines instead of reimplementing them.  This still uses a
    stack-allocated buffer, so it can be used in malloc initialization code.

    Linux currently supports a maximum of 4096 CPUs for most architectures:

      $ find -iname Kconfig | xargs git grep -A10 -w NR_CPUS | grep -w range
      arch/alpha/Kconfig-     range 2 32
      arch/arc/Kconfig-       range 2 4096
      arch/arm/Kconfig-       range 2 16 if DEBUG_KMAP_LOCAL
      arch/arm/Kconfig-       range 2 32 if !DEBUG_KMAP_LOCAL
      arch/arm64/Kconfig-     range 2 4096
      arch/csky/Kconfig-      range 2 32
      arch/hexagon/Kconfig-   range 2 6 if SMP
      arch/ia64/Kconfig-      range 2 4096
      arch/mips/Kconfig-      range 2 256
      arch/openrisc/Kconfig-  range 2 32
      arch/parisc/Kconfig-    range 2 32
      arch/riscv/Kconfig-     range 2 32
      arch/s390/Kconfig-      range 2 512
      arch/sh/Kconfig-        range 2 32
      arch/sparc/Kconfig-     range 2 32 if SPARC32
      arch/sparc/Kconfig-     range 2 4096 if SPARC64
      arch/um/Kconfig-        range 1 1
      arch/x86/Kconfig-# [NR_CPUS_RANGE_BEGIN ... NR_CPUS_RANGE_END] range.
      arch/x86/Kconfig-       range NR_CPUS_RANGE_BEGIN NR_CPUS_RANGE_END
      arch/xtensa/Kconfig-    range 2 32

    With x86 supporting 8192:

      arch/x86/Kconfig
      976 config NR_CPUS_RANGE_END
      977         int
      978         depends on X86_64
      979         default 8192 if  SMP && CPUMASK_OFFSTACK
      980         default  512 if  SMP && !CPUMASK_OFFSTACK
      981         default    1 if !SMP

    So using a maximum of 32k CPUs should cover all cases (and I would
    expect once we start to have many more CPUs that Linux would provide a
    more straightforward way to query for such information).

    A test is added to check if sched_getaffinity can successfully return
    with large buffers.

    Checked on x86_64-linux-gnu and i686-linux-gnu.

    Reviewed-by: Florian Weimer <fweimer@redhat.com>
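The approach described in this commit message can be illustrated with a minimal sketch. This is not glibc's code: it assumes a Linux target and a GCC-compatible compiler (for `__builtin_popcountl`), and it uses the raw `sched_getaffinity` system call only to keep the example short. The names `count_schedulable_cpus` and `MAX_CPUS` are invented for the illustration.

```c
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumed upper bound taken from the commit message: 32k CPUs.  */
enum { MAX_CPUS = 32768 };

/* Return the number of CPUs the calling thread may run on, or -1 on error.
   The affinity mask lives in a fixed 4 KiB stack buffer, so no heap
   allocation is needed.  */
int count_schedulable_cpus(void)
{
    unsigned long mask[MAX_CPUS / (8 * sizeof(unsigned long))] = { 0 };

    /* The raw system call returns the number of bytes the kernel wrote.  */
    long written = syscall(SYS_sched_getaffinity, 0, sizeof mask, mask);
    if (written <= 0)
        return -1;

    int count = 0;
    for (size_t i = 0; i < (size_t) written / sizeof mask[0]; i++)
        count += __builtin_popcountl(mask[i]);
    return count;
}
```

Because the buffer is on the stack, a routine of this shape stays usable in contexts where malloc is not yet available, which is the constraint the message calls out.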
* misc: Add __get_nprocs_sched | Adhemerval Zanella | 2021-09-27 | 5 | -2/+25
    This is an internal function meant to return the number of available
    processors on which the process can be scheduled, unlike __get_nprocs,
    which returns the number of online CPUs available in the system.

    The Linux implementation currently only calls __get_nprocs(), which in
    turn calls sched_getaffinity.

    Reviewed-by: Florian Weimer <fweimer@redhat.com>
* htl: Fix sigset of main thread | Samuel Thibault | 2021-09-26 | 1 | -2/+5
    d482ebfa6785 ('htl: Keep thread signals blocked during its
    initialization') stopped signals from being delivered too early during
    thread creation, but it also affected the main thread, making it block
    signals by default.  We need to just leave the main thread's sigset as
    it is.
* htl: make pthread_sigstate read/write set/oset outside sigstate section | Samuel Thibault | 2021-09-26 | 1 | -5/+11
    so that if a segfault occurs, the handler can run fine.
* Avoid warning: overriding recipe for .../tst-ro-dynamic-mod.so | H.J. Lu | 2021-09-25 | 1 | -2/+3
    Add tst-ro-dynamic-mod to modules-names-nobuild to avoid

      ../Makerules:767: warning: ignoring old recipe for target '.../elf/tst-ro-dynamic-mod.so'

    This updates the BZ #28340 fix.
* benchtests: Improve reliability of memcmp benchmarks | Noah Goldstein | 2021-09-24 | 1 | -11/+10
    No bug.

    Remove reallocation of bufs between implementation tests.  Move
    initialization outside of the foreach implementation test loop.
    Increase iteration count.

    Before this commit there was generally a great deal of variability
    between runs.  The goal of this commit is to make the results more
    reliable.

    The benchtests build and bench-memcmp succeed.

    Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
* Define __STDC_IEC_60559_BFP__ and __STDC_IEC_60559_COMPLEX__ | Joseph Myers | 2021-09-24 | 2 | -0/+7
    TS 18661-1 and C2X specify predefined macros __STDC_IEC_60559_BFP__ and
    __STDC_IEC_60559_COMPLEX__, making __STDC_IEC_559__ and
    __STDC_IEC_559_COMPLEX__ obsolescent (but still included in the
    standard).  Now that we have all the functions from TS 18661-1, define
    these macros in stdc-predef.h, under the same conditions in which the
    older macros are defined, since support for the floating-point features
    in TS 18661-1 is now at the same level as that for those in C11 and
    before (all library functions and other library APIs present, but no
    standard pragma support).

    The macros are defined for now with their TS 18661-1 values.  C2X will
    give them new values (listed as yyyymmL in the working drafts until the
    final standard), at which point there will be the question of what value
    to use in stdc-predef.h (where it could depend on __STDC_VERSION__, but
    not on feature test macros defined by the user).  My inclination then
    would be to use the C2X value unconditionally rather than using an older
    value to indicate TS support, and only have any C standard version
    conditionals for the value when subsequent C standard versions define
    further values.

    (Note that I'm also inclined, when we implement the C2X change to the
    return types of fromfp functions, to make that change unconditional much
    like the change made to the types of totalorder functions, with the old
    version only supported with compat symbols for already-linked programs
    and not as an API for newly built objects.  So using the C2X value would
    also accurately reflect not supporting the versions of APIs in the TS
    where those ended up being incompatible with the first version actually
    added to the standard.)

    Tested for x86_64.
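A program can test for these predefined macros at compile time. The following is a minimal sketch; the function names are invented for the illustration, and the values returned depend entirely on the compiler and C library in use (0 here means only that the macro is not predefined).

```c
/* Return the value of __STDC_IEC_60559_BFP__ when the implementation
   predefines it, or 0 when it does not.  */
long iec_60559_bfp_version(void)
{
#ifdef __STDC_IEC_60559_BFP__
    return __STDC_IEC_60559_BFP__;
#else
    return 0L;
#endif
}

/* Same check for the complex-arithmetic macro.  */
long iec_60559_complex_version(void)
{
#ifdef __STDC_IEC_60559_COMPLEX__
    return __STDC_IEC_60559_COMPLEX__;
#else
    return 0L;
#endif
}
```

Since the message notes that C2X will assign new values to these macros, code that cares about the exact level of support would compare against a specific value rather than only testing whether the macro is defined.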
* build-many-glibcs.py: add powerpc64le glibc variant without multiarch | Paul E. Murphy | 2021-09-24 | 1 | -1/+3
    This configuration tests the float128 to ldouble128 redirect support on
    powerpc64le without the extra wrappers needed to support ifunc on this
    target.
* Fix sysdeps/x86/fpu/s_ffma.c for 32-bit FMA processor case | Joseph Myers | 2021-09-24 | 1 | -2/+6
    It turns out the __SSE2_MATH__ conditional in sysdeps/x86/fpu/s_ffma.c
    does not cover all cases where the x86 fenv_private.h macros might
    manipulate one of the SSE and 387 floating-point state, while the actual
    fma implementation uses the other.  Specifically, in the 32-bit case,
    with a compiler not defaulting to -mfpmath=sse, but testing on a
    processor with hardware FMA support, the multiarch fma function
    implementations will end up using SSE, while the fenv_private.h macros
    will use the 387 state for double.

    Change the conditional to use the default macros rather than the
    optimized ones in all cases except when the compiler inlines an fma
    instruction (in which case, since all those instructions are SSE
    instructions and -mfpmath=sse must be in effect for them to be inlined,
    the optimized macros will only use the SSE state and it's OK for them to
    only use the SSE state).

    Tested for x86_64 and x86.  H.J. reports in
    <https://sourceware.org/pipermail/libc-alpha/2021-September/131367.html>
    that it fixes the problems he observed.
* Linux: Avoid closing -1 on failure in __closefrom_fallback | Florian Weimer | 2021-09-24 | 1 | -1/+1
    Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* i386: Port elf_machine_{load_address,dynamic} from x86-64 | Fangrui Song | 2021-09-24 | 1 | -16/+9
    This drops reliance on _GLOBAL_OFFSET_TABLE_[0] being the link-time
    address of _DYNAMIC.  The code sequence length does not change.

    Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* aarch64: Disable A64FX memcpy/memmove BTI unconditionally | Naohiro Tamura | 2021-09-24 | 1 | -0/+3
    This patch disables A64FX memcpy/memmove BTI instruction insertion
    unconditionally, as the A64FX memset patch [1] does, for performance.

    [1] commit 07b427296b8d59f439144029d9a948f6c1ce0a31

    Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
* xsysconf: Only fail on error results and errno set | Stafford Horne | 2021-09-24 | 1 | -1/+1
    Testing nptl/tst-pthread-attr-affinity-fail fails with:

      error: xsysconf.c:33: sysconf (83): Cannot allocate memory
      error: 1 test failures

    This happens because xsysconf checks errno after running sysconf.
    Internally, the sysconf request for _SC_NPROCESSORS_CONF on Linux
    allocates memory.  But there is a problem: even though malloc succeeds,
    errno is getting set to ENOMEM.  POSIX allows successful calls to
    clobber errno, so xsysconf just checking errno is wrong.

    Fix xsysconf by only failing if we have an error result and errno is
    set.
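The rule stated in this commit message (fail only when the result signals an error and errno is also set) can be sketched as follows. This is an illustration rather than the actual xsysconf source; `checked_sysconf` is a hypothetical name.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper.  Returns 0 and stores the result in *value when
   sysconf succeeds, including the case where it returns -1 without setting
   errno (meaning the limit is indeterminate).  Returns -1 only when sysconf
   returned -1 and errno was set.  */
int checked_sysconf(int name, long *value)
{
    errno = 0;
    long result = sysconf(name);

    /* A nonzero errno alone proves nothing: a successful call is allowed
       to clobber it.  */
    if (result == -1 && errno != 0)
        return -1;

    *value = result;
    return 0;
}
```

A caller might write `long pagesize; if (checked_sysconf(_SC_PAGESIZE, &pagesize) != 0) { /* handle error */ }`.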
* powerpc64le: Avoid conflicting types for f64xfmaf128 when IFUNC is not used | Tulio Magno Quites Machado Filho | 2021-09-23 | 1 | -0/+2
    Avoid defining f64xfmaf128 twice when building s_fmaf128.c.  This can be
    reproduced on powerpc64le whenever f128 functions do not have IFUNC
    enabled, e.g. using "--with-cpu=power8 --disable-multi-arch", or when
    using "--with-cpu=power9".

    Fixes: b3f27d8150d4f ("Add narrowing fma functions")
* Fix ffma use of round-to-odd on x86 | Joseph Myers | 2021-09-23 | 1 | -0/+46
    On 32-bit x86 with -mfpmath=sse, and on x86_64 with
    --disable-multi-arch, the tests of ffma and its aliases (fma narrowing
    from binary64 to binary32) fail.  This is probably the issue reported by
    H.J. in
    <https://sourceware.org/pipermail/libc-alpha/2021-September/131277.html>.

    The problem is the use of fenv_private.h macros in the round-to-odd
    implementation.  Those macros are set up to manipulate only one of the
    SSE and 387 floating-point state, whichever is relevant for the type
    indicated by the suffix on the macro name.  But x86 configurations
    sometimes use the ldbl-96 implementation of binary64 fma (that's where
    --disable-multi-arch is relevant for x86_64: it causes the ldbl-96
    implementation to be used, instead of an IFUNC implementation that falls
    back to the dbl-64 version), contrary to the expectations of those
    macros for functions operating on double when __SSE2_MATH__ is defined.

    This can be addressed by using the default versions of those macros
    (giving x86 its own version of s_ffma.c), as is done for the *f128 macro
    variants where it depends on the details of how GCC was configured when
    building libgcc which floating-point state is affected by _Float128
    arithmetic.

    The issue only applies when __SSE2_MATH__ is defined, and doesn't apply
    when __FP_FAST_FMA is defined (because in that case, fma will be inlined
    by the compiler, meaning it's definitely an SSE operation; for the same
    reason, this is not an issue for narrowing sqrt, as hardware sqrt is
    always inlined in that implementation for x86), but in other cases it's
    safest to use the default versions of the fenv_private.h macros to
    ensure things work whichever fma implementation is used.

    Tested for x86_64 (with and without --disable-multi-arch) and x86 (with
    and without -mfpmath=sse).
* vfprintf: Unify argument handling in process_arg | Florian Weimer | 2021-09-23 | 1 | -117/+89
    Instead of checking a pointer argument for NULL, use helper macros
    defined differently in the non-positional and positional cases.  This
    avoids frequent conditional checks and a GCC 12 warning about comparing
    pointers against NULL which cannot be NULL.
* vfprintf: Handle floating-point cases outside of process_arg macro | Florian Weimer | 2021-09-23 | 1 | -111/+75
    A lot of the code is unique to the positional and non-positional code.
    Also unify the decimal and hexadecimal cases via the new helper function
    __printf_fp_spec.
* nptl: Avoid setxid deadlock with blocked signals in thread exit [BZ #28361] | Florian Weimer | 2021-09-23 | 3 | -2/+72
    As part of the fix for bug 12889, signals are blocked during thread
    exit, so that application code cannot run on the thread that is about to
    exit.  This would cause problems if the application expected signals to
    be delivered after the signal handler revealed the thread to still
    exist, even though pthread_kill can no longer be used to send signals to
    it.  However, glibc internally uses the SIGSETXID signal in a way that
    is incompatible with signal blocking, due to the way the setxid
    handshake delays thread exit until the setxid operation has completed.
    With a blocked SIGSETXID, the handshake can never complete, causing a
    deadlock.

    As a band-aid, restore the previous handshake protocol by not blocking
    SIGSETXID during thread exit.

    The new test sysdeps/pthread/tst-pthread-setuid-loop.c is based on a
    downstream test by Martin Osvald.

    Reviewed-by: Carlos O'Donell <carlos@redhat.com>
    Tested-by: Carlos O'Donell <carlos@redhat.com>
* Add narrowing fma functions | Joseph Myers | 2021-09-22 | 82 | -309/+37046
    This patch adds the narrowing fused multiply-add functions from TS
    18661-1 / TS 18661-3 / C2X to glibc's libm: ffma, ffmal, dfmal,
    f32fmaf64, f32fmaf32x, f32xfmaf64 for all configurations; f32fmaf64x,
    f32fmaf128, f64fmaf64x, f64fmaf128, f32xfmaf64x, f32xfmaf128,
    f64xfmaf128 for configurations with _Float64x and _Float128;
    __f32fmaieee128 and __f64fmaieee128 aliases in the powerpc64le case (for
    calls to ffmal and dfmal when long double is IEEE binary128).
    Corresponding tgmath.h macro support is also added.

    The changes are mostly similar to those for the other narrowing
    functions previously added, especially that for sqrt, so the description
    of those generally applies to this patch as well.  As with sqrt, I
    reused the same test inputs in auto-libm-test-in as for non-narrowing
    fma rather than adding extra or separate inputs for narrowing fma.  The
    tests in libm-test-narrow-fma.inc also follow those for non-narrowing
    fma.

    The non-narrowing fma has a known bug (bug 6801) that it does not set
    errno on errors (overflow, underflow, Inf * 0, Inf - Inf).  Rather than
    fixing this or having narrowing fma check for errors when non-narrowing
    does not (complicating the cases when narrowing fma can otherwise be an
    alias for a non-narrowing function), this patch does not attempt to
    check for errors from narrowing fma and set errno; the CHECK_NARROW_FMA
    macro is still present, but as a placeholder that does nothing, and this
    missing errno setting is considered to be covered by the existing bug
    rather than needing a separate open bug.  missing-errno annotations are
    duly added to many of the auto-libm-test-in test inputs for fma.

    This completes adding all the new functions from TS 18661-1 to glibc, so
    will be followed by corresponding stdc-predef.h changes to define
    __STDC_IEC_60559_BFP__ and __STDC_IEC_60559_COMPLEX__, as the support
    for TS 18661-1 will be at a similar level to that for C standard
    floating-point facilities up to C11 (pragmas not implemented, but
    library functions done).  (There are still further changes to be done to
    implement changes to the types of fromfp functions from N2548.)

    Tested as follows: natively with the full glibc testsuite for x86_64
    (GCC 11, 7, 6) and x86 (GCC 11); with build-many-glibcs.py with GCC 11,
    7 and 6; cross testing of math/ tests for powerpc64le, powerpc32 hard
    float, mips64 (all three ABIs, both hard and soft float).  The different
    GCC versions are to cover the different cases in tgmath.h and tgmath.h
    tests properly (GCC 6 has _Float* only as typedefs in glibc headers, GCC
    7 has proper _Float* support, GCC 8 adds __builtin_tgmath).
* ld.so: Replace DL_RO_DYN_SECTION with dl_relocate_ld [BZ #28340] | H.J. Lu | 2021-09-22 | 16 | -41/+198
    We can't relocate entries in dynamic section if it is readonly:

    1. Add a l_ld_readonly field to struct link_map to indicate if dynamic
       section is readonly and set it based on p_flags of PT_DYNAMIC
       segment.
    2. Replace DL_RO_DYN_SECTION with dl_relocate_ld to decide if dynamic
       section should be relocated.
    3. Remove DL_RO_DYN_TEMP_CNT.
    4. Don't use a static dynamic section to make readonly dynamic section
       in vDSO writable.
    5. Remove the temp argument from elf_get_dynamic_info.

    This fixes BZ #28340.

    Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
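Step 1 above derives the read-only state from the p_flags of the PT_DYNAMIC program header. A minimal sketch of that check, assuming 64-bit ELF and using a hypothetical helper name (this is not the loader's code, which works on struct link_map):

```c
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: scan a program header table and report whether the
   PT_DYNAMIC segment lacks the write permission flag.  Returns false when
   no PT_DYNAMIC segment is present.  */
bool dynamic_segment_is_readonly(const Elf64_Phdr *phdr, size_t count)
{
    for (size_t i = 0; i < count; i++)
        if (phdr[i].p_type == PT_DYNAMIC)
            return (phdr[i].p_flags & PF_W) == 0;
    return false;
}
```

A loader that knows the segment is read-only can then skip relocating the dynamic section entries in place, which is the decision the message assigns to dl_relocate_ld.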
* Adjust new narrowing div/mul tests for IBM long double, update powerpc ULPs | Joseph Myers | 2021-09-22 | 4 | -3664/+3667
    Testing for powerpc shows some of the new narrowing div/mul tests need
    XFAILing for IBM long double and some ULPs updates are needed for those
    tests.
* Mention today's regex merge in SHARED-FILES | Paul Eggert | 2021-09-21 | 1 | -0/+12
* Fix f64xdivf128, f64xmulf128 spurious underflows (bug 28358) | Joseph Myers | 2021-09-21 | 18 | -31/+9604
    As described in bug 28358, the round-to-odd computations used in the
    libm functions that round their results to a narrower format can yield
    spurious underflow exceptions in the following circumstances: the
    narrowing only narrows the precision of the type and not the exponent
    range (i.e., it's narrowing _Float128 to _Float64x on x86_64, x86 or
    ia64), the architecture does after-rounding tininess detection (which
    applies to all those architectures), the result is inexact, tiny before
    rounding but not tiny after rounding (with the chosen rounding mode) for
    _Float64x (which is possible for narrowing mul, div and fma, not for
    narrowing add, sub or sqrt), so the underflow exception resulting from
    the toward-zero computation in _Float128 is spurious for _Float64x.

    Fixed by making ROUND_TO_ODD call feclearexcept (FE_UNDERFLOW) in the
    problem cases (as indicated by an extra argument to the macro); there is
    never any need to preserve underflow exceptions from this part of the
    computation, because the conversion of the round-to-odd value to the
    narrower type will underflow in exactly the cases in which the function
    should raise that exception, but it may be more efficient to avoid the
    extra manipulation of the floating-point environment when not needed.

    Tested for x86_64 and x86, and with build-many-glibcs.py.
* regex: copy back from Gnulib | Paul Eggert | 2021-09-21 | 9 | -78/+142
    Copy regex-related files back from Gnulib, to fix a problem with static
    checking of regex calls noted by Martin Sebor.  This merges the
    following changes:

    * New macro __attribute_nonnull__ in misc/sys/cdefs.h, for use later
      when copying other files back from Gnulib.
    * Use __GNULIB_CDEFS instead of __GLIBC__ when deciding whether to
      include bits/wordsize.h etc.
    * Avoid duplicate entries in epsilon closure table.
    * New regex.h macro _REGEX_NELTS to let regexec say that its pmatch arg
      should contain nmatch elts.  Use that for regexec, instead of
      __attr_access (which is incorrect).
    * New regex.h macro _Attr_access_ which is like __attr_access except
      portable to non-glibc platforms.
    * Add some DEBUG_ASSERTs to pacify gcc -fanalyzer and to catch
      recently-fixed performance bugs if they recur.
    * Add Gnulib-specific stuff to port the dynarray- and lock-using parts
      of regex code to non-glibc platforms.
    * Fix glibc bug 11053.
    * Avoid some undefined behavior when popping an empty fail stack.
* nptl: Fix type of pthread_mutexattr_getrobust_np, pthread_mutexattr_setrobust_np (bug 28036) | Florian Weimer | 2021-09-21 | 1 | -2/+2
    Reviewed-by: Carlos O'Donell <carlos@redhat.com>
    Tested-by: Carlos O'Donell <carlos@redhat.com>
* powerpc: Fix unrecognized instruction errors with recent GCC | Paul A. Clarke | 2021-09-20 | 1 | -0/+1
    Recent binutils commit b25f942e18d6ecd7ec3e2d2e9930eb4f996c258a changes
    the behavior of `.machine` directives to override, rather than augment,
    the base CPU.  This can result in _reduced_ functionality when, for
    example, compiling for default machine "power8", but explicitly asking
    for ".machine power5", which loses Altivec instructions.

    In tst-ucontext-ppc64-vscr.c, while the instructions provoking the new
    error messages are bracketed by ".machine power5", which is ostensibly
    Power ISA 2.03 (POWER5), the POWER5 processor did not support the VSX
    subset, so these instructions are not recognized as "power5".

      Error: unrecognized opcode: `vspltisb'
      Error: unrecognized opcode: `vpkuwus'
      Error: unrecognized opcode: `mfvscr'
      Error: unrecognized opcode: `stvx'

    Manually adding the VSX subset via ".machine altivec" is sufficient.

    Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
* elf: Include <sysdep.h> in elf/dl-debug-symbols.S | Florian Weimer | 2021-09-20 | 1 | -0/+4
    This is necessary to generate assembler marker sections on some targets.

    Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
* nptl: pthread_kill needs to return ESRCH for old programs (bug 19193) | Florian Weimer | 2021-09-20 | 2 | -10/+48
    The fix for bug 19193 breaks some old applications which appear to use
    pthread_kill to probe if a thread is still running, something that is
    not supported by POSIX.
* Extend struct r_debug to support multiple namespaces [BZ #15971] | H.J. Lu | 2021-09-19 | 16 | -45/+257
    Glibc does not provide an interface for debuggers to access libraries
    loaded in multiple namespaces via dlmopen.

    The current rtld-debugger interface is described in the file:

      elf/rtld-debugger-interface.txt

    under the "Standard debugger interface" heading.  This interface only
    provides access to the first link-map (LM_ID_BASE).

    1. Bump r_version to 2 when multiple namespaces are used.  This triggers
       the GDB bug:

         https://sourceware.org/bugzilla/show_bug.cgi?id=28236

    2. Add struct r_debug_extended to extend struct r_debug into a
       linked list, where each element corresponds to a unique namespace.
    3. Initialize the r_debug_extended structure.  Bump r_version to 2 for
       the new namespace and add the new namespace to the namespace linked
       list.
    4. Add _dl_debug_update to return the address of struct r_debug of a
       namespace.
    5. Add a hidden symbol, _r_debug_extended, for struct r_debug_extended.
    6. Provide the symbol, _r_debug, with size of struct r_debug, as an
       alias of _r_debug_extended, for programs which reference _r_debug.

    This fixes BZ #15971.

    Reviewed-by: Florian Weimer <fweimer@redhat.com>
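The layout described in steps 2 and 6 (a base structure extended into a linked list, with the base view aliasing the extended object) can be sketched with stand-in types. The structures below are hypothetical simplifications invented for this illustration; the real definitions are in glibc's <link.h> and carry more fields.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct r_debug: only the version field.  */
struct ns_debug {
    int version;
};

/* Hypothetical stand-in for struct r_debug_extended.  The base structure
   is the first member, so a pointer to the extended object is also a valid
   pointer to the base view.  */
struct ns_debug_extended {
    struct ns_debug base;
    struct ns_debug_extended *next;   /* next namespace, or NULL */
};

/* Walk the list the way a debugger might, counting namespaces.  */
size_t count_namespaces(const struct ns_debug_extended *head)
{
    size_t n = 0;
    for (; head != NULL; head = head->next)
        n++;
    return n;
}
```

A consumer that only understands the older layout keeps reading the base fields, while one that sees a version of 2 or higher can follow the list to the other namespaces.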
* Use $(pie-default) with conformtest | Joseph Myers | 2021-09-17 | 1 | -1/+2
    My glibc bot showed that my conformtest changes fail the build of the
    conformtest execution tests for x86_64-linux-gnu-static-pie, because
    linking the newly built object with the newly built libc and the
    associated options normally used for linking requires it to be built as
    PIE.  Add $(pie-default) to the compiler command used so that PIE
    options are used when required.

    There's a case for using the whole of $(CFLAGS-.o) (which includes
    $(pie-default)), but that raises questions of any impact from using
    optimization flags from CFLAGS in these tests.  So for now just use
    $(pie-default) as the key part of $(CFLAGS-.o) that's definitely needed.

    Tested with build-many-glibcs.py for x86_64-linux-gnu-static-pie.
* Run conform/ tests using newly built libc | Joseph Myers | 2021-09-17 | 3 | -10/+36
    Although the conform/ header tests are built using the headers of the
    glibc under test, the execution tests from conformtest (a few tests of
    the values of macros evaluating to string constants) are linked and run
    with system libc, not the newly built libc.

    Apart from preventing testing in cross environments, this can be a
    problem even for native testing.  Specifically, it can be useful to do
    native testing when building with a cross compiler that links with a
    libc that is not the system libc; for example, on x86_64, you can test
    all three ABIs that way if the kernel support is present, even if the
    host OS lacks 32-bit or x32 libraries or they are older than the
    libraries in the sysroot used by the compiler used to build glibc.  This
    works for almost all tests, but not for these conformtest tests.

    Arrange for conformtest to link and run test programs similarly to other
    tests, with consequent refactoring of various variables in Makeconfig to
    allow passing relevant parts of the link-time command lines down to
    conformtest.  In general, the parts of the link command involving $@ or
    $^ are separated out from the parts that should be passed to conformtest
    (the variables passed to conformtest still involve various variables
    whose names involve $(@F), but those variables simply won't be defined
    for the conformtest makefile rules and I think their presence there is
    harmless).

    This is also most of the support that would be needed to allow running
    those tests of string constants for cross testing when test-wrapper is
    defined.  That will also need changes to where conformtest.py puts the
    test executables, so it puts them in the main object directory (expected
    to be shared with a test system in cross testing) rather than /tmp (not
    expected to be shared) as at present.

    Tested for x86_64.
* posix: Fix attribute access mode on getcwd [BZ #27476] | Aurelien Jarno | 2021-09-16 | 2 | -5/+3
    There is a GNU extension that allows calling getcwd(NULL, >0).  It is
    described in the documentation, but also directly in the unistd.h
    header, just above the declaration.

    Therefore the attribute access mode added in commit 06febd8c6705 is not
    correct.  Drop it.
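The GNU extension mentioned here lets the caller pass a null buffer together with a nonzero size, in which case glibc allocates the buffer itself. A minimal sketch, assuming glibc; `alloc_cwd` is a hypothetical wrapper name, and portable code should not rely on this behavior because POSIX leaves it unspecified.

```c
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical wrapper: ask getcwd to allocate a buffer of `size` bytes
   and fill it with the current working directory.  Returns NULL on error
   (for example when the path does not fit in `size` bytes).  The caller
   owns the result and must release it with free().  */
char *alloc_cwd(size_t size)
{
    return getcwd(NULL, size);
}
```

Usage might look like `char *cwd = alloc_cwd(4096); if (cwd != NULL) { puts(cwd); free(cwd); }`. A call of this shape is valid under the extension, which is why an access attribute that requires a real buffer whenever the size is nonzero does not describe the function correctly.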
* Fix build-many-glibcs.py --strip for installed library renaming | Joseph Myers | 2021-09-16 | 1 | -9/+7
    The renaming of installed shared libraries to use the SONAME directly
    rather than linking to a versioned name stopped build-many-glibcs.py
    --strip (used to facilitate comparing binaries before and after changes
    that aren't meant to change any generated code in installed glibc shared
    libraries) from stripping most of the installed shared libraries,
    because it stripped only the *.so names.  Fix it to strip *.so* names
    instead and to detect the case of linker scripts using grep instead of
    hardcoding particular files that are linker scripts.

    Tested with build-many-glibcs.py --strip.
* benchtests: Fix validate_benchout.py exceptions | Naohiro Tamura | 2021-09-16 | 3 | -3/+9
    This patch fixes two exceptions in validate_benchout.py:
    1) AttributeError if benchout_strings.schema.json is specified, and
    2) json.decoder.JSONDecodeError if the benchout file is not JSON.

      $ ~/glibc/benchtests/scripts/validate_benchout.py bench-memset.out \
          ~/glibc/benchtests/scripts/benchout_strings.schema.json
      Traceback (most recent call last):
        File "/home/naohirot/glibc/benchtests/scripts/validate_benchout.py", line 86, in <module>
          sys.exit(main(sys.argv[1:]))
        File "/home/naohirot/glibc/benchtests/scripts/validate_benchout.py", line 69, in main
          bench.parse_bench(args[0], args[1])
        File "/home/naohirot/glibc/benchtests/scripts/import_bench.py", line 139, in parse_bench
          do_for_all_timings(bench, lambda b, f, v:
        File "/home/naohirot/glibc/benchtests/scripts/import_bench.py", line 107, in do_for_all_timings
          if 'timings' not in bench['functions'][func][k].keys():
      AttributeError: 'str' object has no attribute 'keys'

      $ ~/glibc/benchtests/scripts/validate_benchout.py bench-math-inlines.out \
          ~/glibc/benchtests/scripts/benchout_strings.schema.json
      Traceback (most recent call last):
        File "/home/naohirot/glibc/benchtests/scripts/validate_benchout.py", line 86, in <module>
          sys.exit(main(sys.argv[1:]))
        File "/home/naohirot/glibc/benchtests/scripts/validate_benchout.py", line 69, in main
          bench.parse_bench(args[0], args[1])
        File "/home/naohirot/glibc/benchtests/scripts/import_bench.py", line 137, in parse_bench
          bench = json.load(benchfile)
        File "/usr/lib/python3.6/json/__init__.py", line 299, in load
          parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
        File "/usr/lib/python3.6/json/__init__.py", line 354, in loads
          return _default_decoder.decode(s)
        File "/usr/lib/python3.6/json/decoder.py", line 342, in decode
          raise JSONDecodeError("Extra data", s, end)
      json.decoder.JSONDecodeError: Extra data: line 1 column 17 (char 16)

    Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* elf: Remove THREAD_GSCOPE_IN_TCB | Sergey Bugaev | 2021-09-16 | 22 | -35/+0
    All the ports now have THREAD_GSCOPE_IN_TCB set to 1.  Remove all
    support for !THREAD_GSCOPE_IN_TCB, along with the definition itself.

    Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
    Message-Id: <20210915171110.226187-4-bugaevc@gmail.com>
    Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
* htl: Reimplement GSCOPE | Sergey Bugaev | 2021-09-16 | 3 | -20/+76
    This is a new implementation of GSCOPE which largely mirrors its NPTL
    counterpart.  Same as in NPTL, instead of a global flag shared between
    threads, there is now a per-thread GSCOPE flag stored in each thread's
    TCB.  This makes entering and exiting a GSCOPE faster at the expense of
    making THREAD_GSCOPE_WAIT () slower.

    The largest win is the elimination of many redundant gsync_wake () RPC
    calls; previously, even the simplest programs would make dozens of fully
    redundant gsync_wake () calls.

    Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
    Message-Id: <20210915171110.226187-3-bugaevc@gmail.com>
    Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
* htl: Move thread table to ld.so | Sergey Bugaev | 2021-09-16 | 13 | -63/+83
    The next commit is going to introduce a new implementation of
    THREAD_GSCOPE_WAIT which needs to access the list of threads.  Since it
    must be usable from the dynamic loader, we have to move the symbols for
    the list of threads into the loader.

    Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
    Message-Id: <20210915171110.226187-2-bugaevc@gmail.com>
    Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
* Redirect fma calls to __fma in libm | Joseph Myers | 2021-09-15 | 25 | -0/+26
    include/math.h has a mechanism to redirect internal calls to various
    libm functions, that can often be inlined by the compiler, to call
    non-exported __* names for those functions in the case when the calls
    aren't inlined, with the redirection being disabled when
    NO_MATH_REDIRECT.  Add fma to the functions to which this mechanism is
    applied.

    At present, libm-internal fma calls (generally to __builtin_fma*
    functions) are only done when it's known the call will be inlined, with
    alternative code not relying on an fma operation being used in the
    caller otherwise.  This patch is in preparation for adding the TS 18661
    / C2X narrowing fma functions to glibc; it will be natural for the
    narrowing function implementations to call the underlying fma functions
    unconditionally, with this either being inlined or resulting in an
    __fma* call.  (Using two levels of round-to-odd computation like that,
    in the case where there isn't an fma hardware instruction, isn't optimal
    but is certainly a lot simpler for the initial implementation than
    writing different narrowing fma implementations for all the various
    pairs of formats.)

    Tested with build-many-glibcs.py that installed stripped shared
    libraries are unchanged by the patch (using
    <https://sourceware.org/pipermail/libc-alpha/2021-September/130991.html>
    to fix installed library stripping in build-many-glibcs.py).  Also
    tested for x86_64.
* time: Fix compile error in itimer test affecting hurd | Stafford Horne | 2021-09-16 | 2 | -2/+15
    The recent change to use __KERNEL_OLD_TIMEVAL_MATCHES_TIMEVAL64 to avoid
    doing 64-bit checks on some platforms broke the test for hurd, where
    __KERNEL_OLD_TIMEVAL_MATCHES_TIMEVAL64 is not defined.  The build fails
    with:

      tst-itimer.c: In function 'do_test':
      tst-itimer.c:103:11: error: '__KERNEL_OLD_TIMEVAL_MATCHES_TIMEVAL64' undeclared (first use in this function)
        103 |       if (__KERNEL_OLD_TIMEVAL_MATCHES_TIMEVAL64)
            |           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      tst-itimer.c:103:11: note: each undeclared identifier is reported only once for each function it appears in

    Define a support helper to detect when setitimer and getitimer support
    64-bit time_t.

    Fixes commit 6e8a0aac2f ("time: Fix overflow itimer tests on 32-bit
    systems").

    Cc: Adhemerval Zanella <adhemerval.zanella@linaro.org>
    Cc: Joseph Myers <joseph@codesourcery.com>
* mach lll_lock/unlock: Explicitly request private locking | Samuel Thibault | 2021-09-15 | 1 | -2/+2
    0 was actually LLL_PRIVATE, so this does not actually change the code.
* elf: Replace most uses of THREAD_GSCOPE_IN_TCB | Sergey Bugaev | 2021-09-15 | 5 | -13/+16

    While originally this definition was indeed used to distinguish
    between the cases where the GSCOPE flag was stored in TCB or not, it
    has since become used as a general way to distinguish between HTL and
    NPTL.

    THREAD_GSCOPE_IN_TCB will be removed in the following commits, as HTL,
    which currently is the only port that does not put the flag into TCB,
    will get ported to put the GSCOPE flag into the TCB as well. To
    prepare for that change, migrate all code that wants to distinguish
    between HTL and NPTL to use PTHREAD_IN_LIBC instead, which is a better
    choice since the distinction mostly has to do with whether libc has
    access to the list of thread structures and therefore can initialize
    thread-local storage.

    The parts of code that actually depend on whether the GSCOPE flag is
    in TCB are left unchanged.

    Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
    Message-Id: <20210907133325.255690-2-bugaevc@gmail.com>
    Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
* Add MADV_POPULATE_READ and MADV_POPULATE_WRITE from Linux 5.14 to bits/mman-linux.h | Joseph Myers | 2021-09-14 | 1 | -0/+4

    Linux 5.14 adds constants MADV_POPULATE_READ and MADV_POPULATE_WRITE
    (with the same values on all architectures). Add these to glibc's
    bits/mman-linux.h.

    Tested for x86_64.
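As a rough usage sketch of the new constants: the function below prefaults an anonymous mapping with `MADV_POPULATE_WRITE` and falls back to touching each page when the kernel does not accept the advice. The function name `prefault_anonymous_sketch` is hypothetical, and the `#ifndef` fallbacks (22 and 23, the values in the Linux 5.14 UAPI headers) are there only so the sketch builds against older glibc headers.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Fallbacks for headers that predate this change; the values are the
   ones Linux 5.14 uses on all architectures.  */
#ifndef MADV_POPULATE_READ
# define MADV_POPULATE_READ 22
#endif
#ifndef MADV_POPULATE_WRITE
# define MADV_POPULATE_WRITE 23
#endif

/* Hypothetical example: map LEN bytes of anonymous memory and prefault
   it for writing.  Returns 0 on success, -1 on failure.  */
static int
prefault_anonymous_sketch (size_t len)
{
  assert (len > 0);
  unsigned char *p = mmap (NULL, len, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    return -1;

  if (madvise (p, len, MADV_POPULATE_WRITE) != 0)
    {
      /* The kernel did not accept the advice (kernels before 5.14
         report EINVAL), so write one byte per page instead.  */
      size_t page = (size_t) sysconf (_SC_PAGESIZE);
      for (size_t off = 0; off < len; off += page)
        p[off] = 0;
    }

  /* The mapping must be usable either way.  */
  p[len - 1] = 1;
  int ok = p[len - 1] == 1;
  munmap (p, len);
  return ok ? 0 : -1;
}
```

The fallback loop keeps the sketch usable on older kernels; code that requires the populate semantics would instead check `errno` and report the failure.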
* Update kernel version to 5.14 in tst-mman-consts.py | Joseph Myers | 2021-09-14 | 1 | -1/+1

    This patch updates the kernel version in the test tst-mman-consts.py
    to 5.14. (There are no new MAP_* constants covered by this test in
    5.14 that need any other header changes.)

    Tested with build-many-glibcs.py.
* configure: Fix check for INSERT in linker script | Fangrui Song | 2021-09-13 | 2 | -2/+2

    GCC/Clang use local access when referencing a const variable, so the
    conftest.so may have no dynamic relocation. LLD reports `error: unable
    to insert .foo after .rela.dyn` when the destination section does not
    exist. Use a non-const int to ensure that .rela.dyn exists.

    Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* iconvconfig: Fix behaviour with --prefix [BZ #28199] | Siddhesh Poyarekar | 2021-09-13 | 3 | -12/+28

    The consolidation of configuration parsing broke behaviour with
    --prefix, where the prefix bled into the modules cache. Accept a
    prefix which, when non-NULL, is prepended to the path when looking for
    configuration files but only the original directory is added to the
    modules cache.

    This has no effect on the codegen of gconv_conf since it passes NULL.

    Reported-by: Patrick McCarty <patrick.mccarty@intel.com>
    Reported-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
    Reviewed-by: Andreas Schwab <schwab@linux-m68k.org>
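The prefix handling described above can be sketched in a few lines. This is an illustration only, not the iconvconfig code: the struct `conf_dir_sketch`, the function `resolve_conf_dir_sketch`, the buffer size, and the example paths are all hypothetical. It shows the separation the fix relies on: the prefixed path is used to find files, while the unprefixed directory is what gets recorded.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record keeping the two paths apart.  */
struct conf_dir_sketch
{
  char lookup[256];     /* prefix + directory, used to open files */
  const char *cached;   /* original directory, stored in the cache */
};

/* Hypothetical helper: PREFIX may be NULL, in which case the lookup
   path equals DIR.  Returns 0 on success, -1 if the path is too long.  */
static int
resolve_conf_dir_sketch (const char *prefix, const char *dir,
                         struct conf_dir_sketch *out)
{
  assert (dir != NULL && out != NULL);
  int n = snprintf (out->lookup, sizeof out->lookup, "%s%s",
                    prefix != NULL ? prefix : "", dir);
  if (n < 0 || (size_t) n >= sizeof out->lookup)
    return -1;
  out->cached = dir;    /* the prefix never reaches the cache */
  return 0;
}
```

Passing NULL as the prefix leaves both paths identical, which matches the statement that callers passing NULL see no change.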
* nptl: Fix race between pthread_kill and thread exit (bug 12889) | Florian Weimer | 2021-09-13 | 7 | -25/+275

    A new thread exit lock and flag are introduced. They are used to
    detect that the thread is about to exit or has exited in
    __pthread_kill_internal, and the signal is not sent in this case.

    The test sysdeps/pthread/tst-pthread_cancel-select-loop.c is derived
    from a downstream test originally written by Marek Polacek.

    Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* nptl: pthread_kill, pthread_cancel should not fail after exit (bug 19193) | Florian Weimer | 2021-09-13 | 6 | -95/+106

    This closes one remaining race condition related to bug 12889: if the
    thread already exited on the kernel side, returning ESRCH is not
    correct because that error is reserved for the thread IDs (pthread_t
    values) whose lifetime has ended. In case of a kernel-side exit and a
    valid thread ID, no signal needs to be sent and cancellation does not
    have an effect, so just return 0.

    sysdeps/pthread/tst-kill4.c triggers undefined behavior and is removed
    with this commit.

    Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
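The situation this commit addresses can be reproduced with a short sketch: a thread that has finished running but has not been joined still has a valid `pthread_t`, so `pthread_kill` on it should not report ESRCH. The function names below are hypothetical, and the `usleep` delay is a crude assumption used to let the thread exit; it is not a synchronization guarantee. With this fix the call is expected to return 0, while older glibc versions could return ESRCH.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <unistd.h>

/* Thread body that returns at once.  */
static void *
exit_immediately_sketch (void *arg)
{
  return arg;
}

/* Hypothetical demo: returns the pthread_kill result for a thread that
   has most likely exited but has not been joined, or -1 if the thread
   could not be created.  Signal 0 only performs the error checks.  */
static int
kill_after_exit_sketch (void)
{
  pthread_t thr;
  if (pthread_create (&thr, NULL, exit_immediately_sketch, NULL) != 0)
    return -1;

  usleep (100 * 1000);          /* crude wait for the thread to exit */
  int rc = pthread_kill (thr, 0);

  /* The thread ID stays valid until this join completes.  */
  int joined = pthread_join (thr, NULL);
  assert (joined == 0);
  return rc;
}
```

Once `pthread_join` has returned, the thread ID's lifetime has ended and passing it to `pthread_kill` is undefined, which is the case the commit message says ESRCH is reserved for.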
* benchtests: Remove redundant assert.h | Naohiro Tamura | 2021-09-13 | 2 | -2/+0

    This patch removes the redundant "#include <assert.h>" from
    bench-memset-large.c and bench-memset-walk.c.

    Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* benchtests: Enable scripts/plot_strings.py to read stdin | Naohiro Tamura | 2021-09-13 | 1 | -3/+8

    This patch enables scripts/plot_strings.py to read a benchmark result
    file from stdin. To keep backward compatibility, that is, to keep
    accepting multiple benchmark result files as arguments, an empty
    argument list does not mean stdin, but '-' does. Therefore the nargs
    parameter of ArgumentParser.add_argument() is not changed to '?' but
    kept as '+'.

    ex:
      $ jq '.' bench-memset.out | plot_strings.py -
      $ jq '.' bench-memset.out | plot_strings.py - bench-memset-large.out
      $ plot_strings.py bench-memset.out bench-memset-large.out

    error ex:
      $ jq '.' bench-memset.out | plot_strings.py

    Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* Add narrowing square root functions | Joseph Myers | 2021-09-10 | 79 | -92/+5736

    This patch adds the narrowing square root functions from TS 18661-1 /
    TS 18661-3 / C2X to glibc's libm: fsqrt, fsqrtl, dsqrtl, f32sqrtf64,
    f32sqrtf32x, f32xsqrtf64 for all configurations; f32sqrtf64x,
    f32sqrtf128, f64sqrtf64x, f64sqrtf128, f32xsqrtf64x, f32xsqrtf128,
    f64xsqrtf128 for configurations with _Float64x and _Float128;
    __f32sqrtieee128 and __f64sqrtieee128 aliases in the powerpc64le case
    (for calls to fsqrtl and dsqrtl when long double is IEEE binary128).
    Corresponding tgmath.h macro support is also added.

    The changes are mostly similar to those for the other narrowing
    functions previously added, so the description of those generally
    applies to this patch as well. However, the not-actually-narrowing
    cases (where the two types involved in the function have the same
    floating-point format) are aliased to sqrt, sqrtl or sqrtf128 rather
    than needing a separately built not-actually-narrowing function such
    as was needed for add / sub / mul / div. Thus, there is no
    __nldbl_dsqrtl name for ldbl-opt because no such name was needed
    (whereas the other functions needed such a name since the only other
    name for that entry point was e.g. f32xaddf64, not reserved by
    TS 18661-1); the headers are made to arrange for sqrt to be called in
    that case instead.

    The DIAG_* calls in sysdeps/ieee754/soft-fp/s_dsqrtl.c are because
    they were observed to be needed in GCC 7 testing of
    riscv32-linux-gnu-rv32imac-ilp32. The other sysdeps/ieee754/soft-fp/
    files added didn't need such DIAG_* in any configuration I tested with
    build-many-glibcs.py, but if they do turn out to be needed in more
    files with some other configuration / GCC version, they can always be
    added there.

    I reused the same test inputs in auto-libm-test-in as for
    non-narrowing sqrt rather than adding extra or separate inputs for
    narrowing sqrt. The tests in libm-test-narrow-sqrt.inc also follow
    those for non-narrowing sqrt.

    Tested as follows: natively with the full glibc testsuite for x86_64
    (GCC 11, 7, 6) and x86 (GCC 11); with build-many-glibcs.py with GCC
    11, 7 and 6; cross testing of math/ tests for powerpc64le, powerpc32
    hard float, mips64 (all three ABIs, both hard and soft float). The
    different GCC versions are to cover the different cases in tgmath.h
    and tgmath.h tests properly (GCC 6 has _Float* only as typedefs in
    glibc headers, GCC 7 has proper _Float* support, GCC 8 adds
    __builtin_tgmath).