about summary refs log tree commit diff
Commit message (Collapse)AuthorAgeFilesLines
...
* LoongArch: Add ifunc support for memset{aligned, unaligned, lsx, lasx}dengjianbo2023-08-298-0/+688
| | | | | | | | | | | | | | According to glibc memset microbenchmark test results, for LSX and LASX versions, A few cases with length less than 8 experience performace degradation, overall, the LASX version could reduce the runtime about 15% - 75%, LSX version could reduce the runtime about 15%-50%. The unaligned version uses unaligned memmory access to set data which length is less than 64 and make address aligned with 8. For this part, the performace is better than aligned version. Comparing with the generic version, the performance is close when the length is larger than 128. When the length is 8-128, the unaligned version could reduce the runtime about 30%-70%, the aligned version could reduce the runtime about 20%-50%.
* LoongArch: Add ifunc support for memrchr{lsx, lasx}dengjianbo2023-08-297-0/+335
| | | | | | | | | According to glibc memrchr microbenchmark, this implementation could reduce the runtime as following: Name Percent of rutime reduced memrchr-lasx 20%-83% memrchr-lsx 20%-64%
* LoongArch: Add ifunc support for memchr{aligned, lsx, lasx}dengjianbo2023-08-297-0/+401
| | | | | | | | | | According to glibc memchr microbenchmark, this implementation could reduce the runtime as following: Name Percent of runtime reduced memchr-lasx 37%-83% memchr-lsx 30%-66% memchr-aligned 0%-15%
* LoongArch: Add ifunc support for rawmemchr{aligned, lsx, lasx}dengjianbo2023-08-297-0/+365
| | | | | | | | | According to glibc rawmemchr microbenchmark, A few cases tested with char '\0' experience performance degradation due to the lasx and lsx versions don't handle the '\0' separately. Overall, rawmemchr-lasx implementation could reduce the runtime about 40%-80%, rawmemchr-lsx implementation could reduce the runtime about 40%-66%, rawmemchr-aligned implementation could reduce the runtime about 20%-40%.
* LoongArch: Micro-optimize LD_PCRELXi Ruoyao2023-08-291-6/+4
| | | | | | | We are requiring Binutils >= 2.41, so explicit relocation syntax is always supported by the assembler. Use it to reduce one instruction. Signed-off-by: Xi Ruoyao <xry111@xry111.site>
* LoongArch: Remove support code for old linker in start.SXi Ruoyao2023-08-291-16/+3
| | | | | | We are requiring Binutils >= 2.41, so la.pcrel always works here. Signed-off-by: Xi Ruoyao <xry111@xry111.site>
* LoongArch: Simplify the autoconf check for static PIEXi Ruoyao2023-08-292-50/+16
| | | | | | | We are strictly requiring GAS >= 2.41 now, so we don't need to check assembler capability anymore. Signed-off-by: Xi Ruoyao <xry111@xry111.site>
* Add F_SEAL_EXEC from Linux 6.3 to bits/fcntl-linux.h.Kir Kolyshkin2023-08-281-0/+1
| | | | | | | | This patch adds the new F_SEAL_EXEC constant from Linux 6.3 (see Linux commit 6fd7353829c ("mm/memfd: add F_SEAL_EXEC") to bits/fcntl-linux.h. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* argp-parse: Get rid of allocaJoe Simmons-Talbott2023-08-281-24/+11
| | | | | | | | | Even though the alloca usage is relatively small and fixed size the code can be written without using alloca. Convert to local variables. Checked on x86_64-linux-gnu. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* gencat: Get rid of alloca.Joe Simmons-Talbott2023-08-281-6/+31
| | | | | | | | Convert to scratch_buffers to avoid potential stack overflow. Checked on x86_64-linux-gnu and aarch64-linux-gnu. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* m68k: Use M68K_SCALE_AVAILABLE on __mpn_lshift and __mpn_rshiftAdhemerval Zanella2023-08-253-7/+14
| | | | | | | | | This patch adds a new macro, M68K_SCALE_AVAILABLE, similar to gmp scale_available_p (mpn/m68k/m68k-defs.m4) that expand to 1 if a scale factor can be used in addressing modes. This is used instead of __mc68020__ for some optimization decisions. Checked on a build for m68k-linux-gnu target mc68020 and mc68040.
* m68k: Fix build with -mcpu=68040 or higher (BZ 30740)Adhemerval Zanella2023-08-252-1/+21
| | | | | | | | GCC currently does not define __mc68020__ for -mcpu=68040 or higher, which memcpy/memmove assumptions. Since this memory copy optimization seems only intended for m68020, disable for other m680X0 variants. Checked on a build for m68k-linux-gnu target mc68020 and mc68040.
* elf: Check that --list-diagnostics output has the expected syntaxFlorian Weimer2023-08-254-0/+327
| | | | | | | | | | Parts of elf/tst-rtld-list-diagnostics.py have been copied from scripts/tst-ld-trace.py. The abnf module is entirely optional and used to verify the ABNF grammar as included in the manual. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* manual: Document ld.so --list-diagnostics outputFlorian Weimer2023-08-251-0/+207
| | | | Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* manual/jobs.texi: Add missing @item EPERM for getpgidMark Wielaard2023-08-251-0/+1
| | | | | | | The missing @item makes it look like errno will be set to ESRCH if a cross-session getpgid is not permitted. Found by ulfvonbelow on irc.
* LoongArch: Add ifunc support for strncmp{aligned, lsx}dengjianbo2023-08-246-0/+508
| | | | | | | | Based on the glibc microbenchmark, only a few short inputs with this strncmp-aligned and strncmp-lsx implementation experience performance degradation, overall, strncmp-aligned could reduce the runtime 0%-10% for aligned comparision, 10%-25% for unaligend comparision, strncmp-lsx could reduce the runtime about 0%-60%.
* LoongArch: Add ifunc support for strcmp{aligned, lsx}dengjianbo2023-08-246-0/+426
| | | | | | Based on the glibc microbenchmark, strcmp-aligned implementation could reduce the runtime 0%-10% for aligned comparison, 10%-20% for unaligned comparison, strcmp-lsx implemenation could reduce the runtime 0%-50%.
* LoongArch: Add ifunc support for strnlen{aligned, lsx, lasx}dengjianbo2023-08-247-0/+382
| | | | | | | Based on the glibc microbenchmark, strnlen-aligned implementation could reduce the runtime more than 10%, strnlen-lsx implementation could reduce the runtime about 50%-78%, strnlen-lasx implementation could reduce the runtime about 50%-88%.
* htl: move pthread_attr_setdetachstate into libcGuy-Fleury Iteriteka2023-08-247-11/+4
| | | | | Signed-off-by: Guy-Fleury Iteriteka <gfleury@disroot.org> Message-Id: <20230716084414.107245-11-gfleury@disroot.org>
* htl: move pthread_attr_getdetachstate into libcGuy-Fleury Iteriteka2023-08-247-11/+3
| | | | | Signed-off-by: Guy-Fleury Iteriteka <gfleury@disroot.org> Message-Id: <20230716084414.107245-10-gfleury@disroot.org>
* htl: move pthread_attr_setschedpolicy into libcGuy-Fleury Iteriteka2023-08-247-11/+3
| | | | | Signed-off-by: Guy-Fleury Iteriteka <gfleury@disroot.org> Message-Id: <20230716084414.107245-9-gfleury@disroot.org>
* htl: move pthread_attr_getschedpolicy into libcGuy-Fleury Iteriteka2023-08-247-10/+2
| | | | | Signed-off-by: Guy-Fleury Iteriteka <gfleury@disroot.org> Message-Id: <20230716084414.107245-8-gfleury@disroot.org>
* htl: move pthread_attr_setinheritsched into libcGuy-Fleury Iteriteka2023-08-247-11/+4
| | | | | Signed-off-by: Guy-Fleury Iteriteka <gfleury@disroot.org> Message-Id: <20230716084414.107245-7-gfleury@disroot.org>
* htl: move pthread_attr_getinheritsched into libcGuy-Fleury Iteriteka2023-08-247-10/+3
| | | | | Signed-off-by: Guy-Fleury Iteriteka <gfleury@disroot.org> Message-Id: <20230716084414.107245-6-gfleury@disroot.org>
* htl: move pthread_attr_getschedparam into libcGuy-Fleury Iteriteka2023-08-247-13/+3
| | | | | Signed-off-by: Guy-Fleury Iteriteka <gfleury@disroot.org> Message-Id: <20230716084414.107245-5-gfleury@disroot.org>
* htl: move pthread_setschedparam into libcGuy-Fleury Iteriteka2023-08-247-15/+3
| | | | | Signed-off-by: Guy-Fleury Iteriteka <gfleury@disroot.org> Message-Id: <20230716084414.107245-4-gfleury@disroot.org>
* htl: move pthread_getschedparam into libcGuy-Fleury Iteriteka2023-08-247-12/+4
| | | | | Signed-off-by: Guy-Fleury Iteriteka <gfleury@disroot.org> Message-Id: <20230716084414.107245-3-gfleury@disroot.org>
* htl: move pthread_equal into libcGuy-Fleury Iteriteka2023-08-247-12/+2
| | | | | Signed-off-by: Guy-Fleury Iteriteka <gfleury@disroot.org> Message-Id: <20230716084414.107245-2-gfleury@disroot.org>
* Linux: Avoid conflicting types in ld.so --list-diagnosticsFlorian Weimer2023-08-231-5/+8
| | | | | | | | The path auxv[*].a_val could either be an integer or a string, depending on the a_type value. Use a separate field, a_val_string, to simplify mechanical parsing of the --list-diagnostics output. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* elf: Do not run constructors for proxy objectsFlorian Weimer2023-08-221-2/+6
| | | | | Otherwise, the ld.so constructor runs for each audit namespace and each dlmopen namespace.
* x86_64: Add log1p with FMAH.J. Lu2023-08-214-0/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On Skylake, it changes log1p bench performance by: Before After Improvement max 63.349 58.347 8% min 4.448 5.651 -30% mean 12.0674 10.336 14% The minimum code path is if (hx < 0x3FDA827A) /* x < 0.41422 */ { if (__glibc_unlikely (ax >= 0x3ff00000)) /* x <= -1.0 */ { ... } if (__glibc_unlikely (ax < 0x3e200000)) /* |x| < 2**-29 */ { math_force_eval (two54 + x); /* raise inexact */ if (ax < 0x3c900000) /* |x| < 2**-54 */ { ... } else return x - x * x * 0.5; FMA and non-FMA code sequences look similar. Non-FMA version is slightly faster. Since log1p is called by asinh and atanh, it improves asinh performance by: Before After Improvement max 75.645 63.135 16% min 10.074 10.071 0% mean 15.9483 14.9089 6% and improves atanh performance by: Before After Improvement max 91.768 75.081 18% min 15.548 13.883 10% mean 18.3713 16.8011 8%
* Remove references to the defunct db2 subdirAndreas Schwab2023-08-213-11/+0
| | | | The db2 subdir has been removed more than 20 years ago.
* string: Fix tester build with fortify enable with gcc < 12Mahesh Bodapati2023-08-181-3/+8
| | | | | | | | | When building with fortify enabled, GCC < 12 issues a warning on the fortify strncat wrapper might overflow the destination buffer (the failure is tied to -Werror). Checked on ppc64 and x86_64. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* s390x: Fix static PIE condition for toolchain bootstrapping.Stefan Liebler2023-08-182-12/+138
| | | | | | | | | | The static PIE configure check uses link tests. When bootstrapping a cross-toolchain, the link tests fail due to missing crt-files / libc.so. As we explicitely want to test an issue in binutils (ld), we now also explicitely check for known linker versions. See also commit 368b7c614b102122b86af3953daea2b30230d0a8 S390: Use compile-only instead of also link-tests in configure.
* m68k: fix __mpn_lshift and __mpn_rshift for non-68020Andreas Schwab2023-08-172-4/+4
| | | | From revision 03f3d275d0d6 in the gmp repository.
* sysdeps: tst-bz21269: fix -Wreturn-typeSam James2023-08-171-2/+0
| | | | | | | Thanks to Andreas Schwab for reporting. Fixes: 652b9fdb77d9fd056d4dd26dad2c14142768ab49 Signed-off-by: Sam James <sam@gentoo.org>
* Loongarch: Add ifunc support for memcpy{aligned, unaligned, lsx, lasx} and ↵dengjianbo2023-08-1713-0/+2435
| | | | | | | | | | | | | | | memmove{aligned, unaligned, lsx, lasx} These implementations improve the time to copy data in the glibc microbenchmark as below: memcpy-lasx reduces the runtime about 8%-76% memcpy-lsx reduces the runtime about 8%-72% memcpy-unaligned reduces the runtime of unaligned data copying up to 40% memcpy-aligned reduece the runtime of unaligned data copying up to 25% memmove-lasx reduces the runtime about 20%-73% memmove-lsx reduces the runtime about 50% memmove-unaligned reduces the runtime of unaligned data moving up to 40% memmove-aligned reduces the runtime of unaligned data moving up to 25%
* Loongarch: Add ifunc support for strchr{aligned, lsx, lasx} and ↵dengjianbo2023-08-1712-0/+581
| | | | | | | | | | | | | strchrnul{aligned, lsx, lasx} These implementations improve the time to run strchr{nul} microbenchmark in glibc as below: strchr-lasx reduces the runtime about 50%-83% strchr-lsx reduces the runtime about 30%-67% strchr-aligned reduces the runtime about 10%-20% strchrnul-lasx reduces the runtime about 50%-83% strchrnul-lsx reduces the runtime about 36%-65% strchrnul-aligned reduces the runtime about 6%-10%
* sysdeps: tst-bz21269: handle ENOSYS & skip appropriatelySam James2023-08-161-1/+10
| | | | | | | | | SYS_modify_ldt requires CONFIG_MODIFY_LDT_SYSCALL to be set in the kernel, which some distributions may disable for hardening. Check if that's the case (unset) and mark the test as UNSUPPORTED if so. Reviewed-by: DJ Delorie <dj@redhat.com> Signed-off-by: Sam James <sam@gentoo.org>
* sysdeps: tst-bz21269: fix test parameterSam James2023-08-161-1/+1
| | | | | | | | All callers pass 1 or 0x11 anyway (same meaning according to man page), but still. Reviewed-by: DJ Delorie <dj@redhat.com> Signed-off-by: Sam James <sam@gentoo.org>
* hurd: Fix strictness of <mach/thread_state.h>Samuel Thibault2023-08-161-3/+3
| | | | Fixes: db25bc52026f ("hurd: Add prototype for and thus fix _hurdsig_abort_rpcs call")
* hurd: Add prototype for and thus fix _hurdsig_abort_rpcs callSamuel Thibault2023-08-152-10/+7
| | | | This was actually not a problem since NULL was getting passed.
* io/tst-statvfs: fix statfs().f_type comparison test on some archesнаб2023-08-151-1/+1
| | | | | | | | | On i686 f_type is an i32 so the test fails when that has the top bit set. Explicitly cast to u32. Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Reviewed-by: Florian Weimer <fweimer@redhat.com>
* fxprintf: Get rid of allocaJoe Simmons-Talbott2023-08-151-8/+6
| | | | | | | Use a scratch_buffer rather than alloca/malloc to avoid potential stack overflow. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* configure: Add -Wall again to the default CFLAGSFlorian Weimer2023-08-151-1/+1
| | | | | | | Commit 78ceef25d64efeeb6067d1cb282a00466e637e2a ("configure: Remove --enable-all-warnings option") removed it due to a missing +. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* malloc: Remove bin scanning from memalign (bug 30723)Florian Weimer2023-08-152-166/+10
| | | | | | | | | | | | | | | | | On the test workload (mpv --cache=yes with VP9 video decoding), the bin scanning has a very poor success rate (less than 2%). The tcache scanning has about 50% success rate, so keep that. Update comments in malloc/tst-memalign-2 to indicate the purpose of the tests. Even with the scanning removed, the additional merging opportunities since commit 542b1105852568c3ebc712225ae78b ("malloc: Enable merging of remainders in memalign (bug 30723)") are sufficient to pass the existing large bins test. Remove leftover variables from _int_free from refactoring in the same commit. Reviewed-by: DJ Delorie <dj@redhat.com>
* resolv/nss_dns/dns-host: Get rid of alloca.Joe Simmons-Talbott2023-08-141-2/+2
| | | | | | Since the alloca is a small constant size use an array instead. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* x86_64: Add expm1 with FMAH.J. Lu2023-08-144-0/+55
| | | | | | | | | | | | | | | | | | On Skylake, it improves expm1 bench performance by: Before After Improvement max 70.204 68.054 3% min 20.709 16.2 22% mean 22.1221 16.7367 24% NB: Add extern long double __expm1l (long double); extern long double __expm1f128 (long double); for __typeof (__expm1l) and __typeof (__expm1f128) when __expm1 is defined since __expm1 may be expanded in their declarations which causes the build failure.
* LoongArch: elf: Add new LoongArch reloc types 109 into elf.hcaiyinyu2023-08-141-0/+1
| | | | | These reloc types are generated by GNU assembler >= 2.41 for relaxation support.
* elf: Add new LoongArch reloc types (101 to 108) into elf.hXi Ruoyao2023-08-141-0/+8
| | | | | | | | These reloc types are generated by GNU assembler >= 2.41 for relaxation support. Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=57a930e3 Signed-off-by: Xi Ruoyao <xry111@xry111.site>