about summary refs log tree commit diff
Commit message (Collapse)AuthorAgeFilesLines
* riscv: align stack before calling _dl_init [BZ #28703]Aurelien Jarno2021-12-171-0/+6
| | | | | | | | | Align the stack pointer to 128 bits during the call to _dl_init() as specified by the RISC-V ABI [1]. This fixes the elf/tst-align2 test. Fixes bug 28703. [1] https://github.com/riscv-non-isa/riscv-elf-psabi-doc
* riscv: align stack in clone [BZ #28702]Aurelien Jarno2021-12-171-0/+3
| | | | | | | | | | | | The RISC-V ABI [1] mandates that "the stack pointer shall be aligned to a 128-bit boundary upon procedure entry". This as not the case in clone. This fixes the misc/tst-misalign-clone-internal and misc/tst-misalign-clone tests. Fixes bug 28702. [1] https://github.com/riscv-non-isa/riscv-elf-psabi-doc
* elf: Fix tst-cpu-features-cpuinfo for KVM guests on some AMD systems [BZ #28704]Aurelien Jarno2021-12-171-1/+8
| | | | | | | | | On KVM guests running on some AMD systems, the IBRS feature is reported as a synthetic feature using the Intel feature, while the cpuinfo entry keeps the same. Handle that by first checking the presence of the Intel feature on AMD systems. Fixes bug 28704.
* powerpc64[le]: Allocate extra stack frame on syscall.SMatheus Castanho2021-12-171-0/+4
| | | | | | | | | | | | | The syscall function does not allocate the extra stack frame for scv like other assembly syscalls using DO_CALL_SCV. So after commit d120fb9941 changed the offset that is used to save LR, syscall ended up using an invalid offset, causing regressions on powerpc64. So make sure the extra stack frame is allocated in syscall.S as well to make it consistent with other uses of DO_CALL_SCV and avoid similar issues in the future. Tested on powerpc, powerpc64, and powerpc64le (with and without scv) Reviewed-by: Raphael M Zinsly <rzinsly@linux.ibm.com>
* Update copyright header in recently merged ab_GE localeMaxim Kuvyrkov2021-12-171-6/+16
| | | | | | | | | | ab_GE locale was committed under DCO and this header proposed in [1] suits it better. [1] https://sourceware.org/pipermail/libc-alpha/2021-September/130692.html Signed-off-by: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org> Signed-off-by: Nart Tlisha <daniel.abzakh@gmail.com>
* fortify: Fix spurious warning with realpathSiddhesh Poyarekar2021-12-173-2/+40
| | | | | | | | | The length and object size arguments were swapped around for realpath. Also add a smoke test so that any changes in this area get caught in future. Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* nss: Use "files dns" as the default for the hosts database (bug 28700)Florian Weimer2021-12-173-6/+5
| | | | | | | | | | | | | | | | | | This matches what is currently in nss/nsswitch.conf. The new ordering matches what most distributions use in their installed configuration files. It is common to add localhost to /etc/hosts because the name does not exist in the DNS, but is commonly used as a host name. With the built-in "dns [!UNAVAIL=return] files" default, dns is searched first and provides an answer for "localhost" (NXDOMAIN). We never look at the files database as a result, so the contents of /etc/hosts is ignored. This means that "getent hosts localhost" fail without a /etc/nsswitch.conf file, even though the host name is listed in /etc/hosts. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* arm: Guard ucontext _rtld_global_ro access by SHARED, not PIC macroFlorian Weimer2021-12-172-4/+4
| | | | | | | | | | | | | | | | Due to PIE-by-default, PIC is now defined in more cases. libc.a does not have _rtld_global_ro, and statically linking setcontext fails. SHARED is the right condition to use, so that libc.a references _dl_hwcap instead of _rtld_global_ro. For static PIE support, the !SHARED case would still have to be made PIC. This patch does not achieve that. Fixes commit 23645707f12f2dd9d80b51effb2d9618a7b65565 ("Replace --enable-static-pie with --disable-default-pie"). Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org> Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
* Fix The GNU ToolChain Authors copyright noticeSiddhesh Poyarekar2021-12-176-6/+6
| | | | | | | | | | | I (and maybe one or two others) added a (C) to the copyright notice regardless of the contribution checklist[1] not mentioning it. Fix all these instances so that the notice reads as "Copyright The GNU Toolchain Authors" across the source code. [1] https://sourceware.org/glibc/wiki/Contribution%20checklist Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* Remove upper limit on tunable MALLOC_MMAP_THRESHOLDPatrick McGehearty2021-12-161-10/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The current limit on MALLOC_MMAP_THRESHOLD is either 1 Mbyte (for 32-bit apps) or 32 Mbytes (for 64-bit apps). This value was set by a patch dated 2006 (15 years ago). Attempts to set the threshold higher are currently ignored. The default behavior is appropriate for many highly parallel applications where many processes or threads are sharing RAM. In other situations where the number of active processes or threads closely matches the number of cores, a much higher limit may be desired by the application designer. By today's standards on personal computers and small servers, 2 Gbytes of RAM per core is commonly available. On larger systems 4 Gbytes or more of RAM is sometimes available. Instead of raising the limit to match current needs, this patch proposes to remove the limit of the tunable, leaving the decision up to the user of a tunable to judge the best value for their needs. This patch does not change any of the defaults for malloc tunables, retaining the current behavior of the dynamic malloc mmap threshold. bugzilla 27801 - Remove upper limit on tunable MALLOC_MMAP_THRESHOLD Reviewed-by: DJ Delorie <dj@redhat.com> malloc/ malloc.c changed do_set_mmap_threshold to remove test for HEAP_MAX_SIZE.
* localedata: add new locale ab_GENart Tlisha2021-12-161-0/+146
| | | | | | | | | | | | | | Add the Abkhazian language in the Georgia territory The ab_GE was just recently added to CLDR, it should be available in CLDR v41, https://github.com/unicode-org/cldr/pull/1402 The Abkhazian language has been added to Gnome for localization The locale has been tested on Ubuntu 20.04, Mint 20.2 and Fedora 35 Beta Signed-off-by: Nart Tlisha <daniel.abzakh@gmail.com> Reviewed-by: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>
* Fix __minimal_malloc segfaults in __mmap due to stack-protectorStefan Liebler2021-12-161-0/+4
| | | | | | | | | | | | | | | Starting with commit b05fae4d8e34604a72ee36d2d3164391b76fcf0b "elf: Use the minimal malloc on tunables_strdup", I get lots of segfaults in static tests on s390x when also using, e.g.: export GLIBC_TUNABLES="glibc.elision.enable=1" tunables_strdup callls __minimal_malloc which tries to call __mmap due to insufficient space left. __mmap itself first setups a new stack frame and segfaults when copying the stack-protector canary from thread-pointer. The latter one is not yet setup. Thus this patch also turns off stack-protection for mmap. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* __glibc_unsafe_len: Fix commentSiddhesh Poyarekar2021-12-161-1/+1
| | | | | | We know that the length is *unsafe*. Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* malloc: Enable huge page support on main arenaAdhemerval Zanella2021-12-153-6/+14
| | | | | | | | | | | | This patch adds support huge page support on main arena allocation, enable with tunable glibc.malloc.hugetlb=2. The patch essentially disable the __glibc_morecore() sbrk() call (similar when memory tag does when sbrk() call does not support it) and fallback to default page size if the memory allocation fails. Checked on x86_64-linux-gnu. Reviewed-by: DJ Delorie <dj@redhat.com>
* malloc: Move MORECORE fallback mmap to sysmalloc_mmap_fallbackAdhemerval Zanella2021-12-151-32/+53
| | | | | | So it can be used on hugepage code as well. Reviewed-by: DJ Delorie <dj@redhat.com>
* malloc: Add Huge Page support to arenasAdhemerval Zanella2021-12-153-44/+99
| | | | | | | | | | | | | | | | | | | | | | | It is enabled as default for glibc.malloc.hugetlb set to 2 or higher. It also uses a non configurable minimum value and maximum value, currently set respectively to 1 and 4 selected huge page size. The arena allocation with huge pages does not use MAP_NORESERVE. As indicate by kernel internal documentation [1], the flag might trigger a SIGBUS on soft page faults if at memory access there is no left pages in the pool. On systems without a reserved huge pages pool, is just stress the mmap(MAP_HUGETLB) allocation failure. To improve test coverage it is required to create a pool with some allocated pages. Checked on x86_64-linux-gnu with no reserved pages, 10 reserved pages (which trigger mmap(MAP_HUGETBL) failures) and with 256 reserved pages (which does not trigger mmap(MAP_HUGETLB) failures). [1] https://www.kernel.org/doc/html/v4.18/vm/hugetlbfs_reserv.html#resv-map-modifications Reviewed-by: DJ Delorie <dj@redhat.com>
* malloc: Add Huge Page support for mmapAdhemerval Zanella2021-12-1511-15/+207
| | | | | | | | | | | | | | | | | | | | | | | | | | | | With the morecore hook removed, there is not easy way to provide huge pages support on with glibc allocator without resorting to transparent huge pages. And some users and programs do prefer to use the huge pages directly instead of THP for multiple reasons: no splitting, re-merging by the VM, no TLB shootdowns for running processes, fast allocation from the reserve pool, no competition with the rest of the processes unlike THP, no swapping all, etc. This patch extends the 'glibc.malloc.hugetlb' tunable: the value '2' means to use huge pages directly with the system default size, while a positive value means and specific page size that is matched against the supported ones by the system. Currently only memory allocated on sysmalloc() is handled, the arenas still uses the default system page size. To test is a new rule is added tests-malloc-hugetlb2, which run the addes tests with the required GLIBC_TUNABLE setting. On systems without a reserved huge pages pool, is just stress the mmap(MAP_HUGETLB) allocation failure. To improve test coverage it is required to create a pool with some allocated pages. Checked on x86_64-linux-gnu. Reviewed-by: DJ Delorie <dj@redhat.com>
* malloc: Move mmap logic to its own functionAdhemerval Zanella2021-12-151-76/+88
| | | | | | So it can be used with different pagesize and flags. Reviewed-by: DJ Delorie <dj@redhat.com>
* malloc: Add THP/madvise support for sbrkAdhemerval Zanella2021-12-152-5/+37
| | | | | | | | | | To increase effectiveness with Transparent Huge Page with madvise, the large page size is use instead page size for sbrk increment for the main arena. Checked on x86_64-linux-gnu. Reviewed-by: DJ Delorie <dj@redhat.com>
* malloc: Add madvise support for Transparent Huge PagesAdhemerval Zanella2021-12-1513-0/+259
| | | | | | | | | | | | | | | | | | | | | | | | | | Linux Transparent Huge Pages (THP) current supports three different states: 'never', 'madvise', and 'always'. The 'never' is self-explanatory and 'always' will enable THP for all anonymous pages. However, 'madvise' is still the default for some system and for such case THP will be only used if the memory range is explicity advertise by the program through a madvise(MADV_HUGEPAGE) call. To enable it a new tunable is provided, 'glibc.malloc.hugetlb', where setting to a value diffent than 0 enables the madvise call. This patch issues the madvise(MADV_HUGEPAGE) call after a successful mmap() call at sysmalloc() with sizes larger than the default huge page size. The madvise() call is disable is system does not support THP or if it has the mode set to "never" and on Linux only support one page size for THP, even if the architecture supports multiple sizes. To test is a new rule is added tests-malloc-hugetlb1, which run the addes tests with the required GLIBC_TUNABLE setting. Checked on x86_64-linux-gnu. Reviewed-by: DJ Delorie <dj@redhat.com>
* powerpc: Use global register variable in <thread_pointer.h>Florian Weimer2021-12-152-12/+8
| | | | | | | | | | | | | A local register variable is merely a compiler hint, and so not appropriate in this context. Move the global register variable into <thread_pointer.h> and include it from <tls.h>, as there can only be one global definition for one particular register. Fixes commit 8dbeb0561eeb876f557ac9eef5721912ec074ea5 ("nptl: Add <thread_pointer.h> for defining __thread_pointer"). Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Reviewed-by: Raphael M Zinsly <rzinsly@linux.ibm.com>
* Use LFS and 64 bit time for installed programs (BZ #15333)Adhemerval Zanella2021-12-153-6/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The installed programs are built with a combination of different values for MODULE_NAME, as below. To enable both Long File Support and 64 bt time, -D_TIME_BITS=64 -D_FILE_OFFSET_BITS=64 is added for nonlibi, nscd, lddlibc4, libresolv, ldconfig, locale_programs, iconvprogs, libnss_files, libnss_compat, libnss_db, libnss_hesiod, libutil, libpcprofile, and libSegFault. nscd/nscd nscd/nscd.o MODULE_NAME=nscd nscd/connections.o MODULE_NAME=nscd nscd/pwdcache.o MODULE_NAME=nscd nscd/getpwnam_r.o MODULE_NAME=nscd nscd/getpwuid_r.o MODULE_NAME=nscd nscd/grpcache.o MODULE_NAME=nscd nscd/getgrnam_r.o MODULE_NAME=nscd nscd/getgrgid_r.o MODULE_NAME=nscd nscd/hstcache.o MODULE_NAME=nscd nscd/gethstbyad_r.o MODULE_NAME=nscd nscd/gethstbynm3_r.o MODULE_NAME=nscd nscd/getsrvbynm_r.o MODULE_NAME=nscd nscd/getsrvbypt_r.o MODULE_NAME=nscd nscd/servicescache.o MODULE_NAME=nscd nscd/dbg_log.o MODULE_NAME=nscd nscd/nscd_conf.o MODULE_NAME=nscd nscd/nscd_stat.o MODULE_NAME=nscd nscd/cache.o MODULE_NAME=nscd nscd/mem.o MODULE_NAME=nscd nscd/nscd_setup_thread.o MODULE_NAME=nscd nscd/xmalloc.o MODULE_NAME=nscd nscd/xstrdup.o MODULE_NAME=nscd nscd/aicache.o MODULE_NAME=nscd nscd/initgrcache.o MODULE_NAME=nscd nscd/gai.o MODULE_NAME=nscd nscd/res_hconf.o MODULE_NAME=nscd nscd/netgroupcache.o MODULE_NAME=nscd nscd/cachedumper.o MODULE_NAME=nscd elf/lddlibc4 elf/lddlibc4 MODULE_NAME=lddlibc4 elf/pldd elf/pldd.o MODULE_NAME=nonlib elf/xmalloc.o MODULE_NAME=nonlib elf/sln elf/sln.o MODULE_NAME=nonlib elf/static-stubs.o MODULE_NAME=nonlib elf/sprof MODULE_NAME=nonlib elf/ldconfig elf/ldconfig.o MODULE_NAME=ldconfig elf/cache.o MODULE_NAME=nonlib elf/readlib.o MODULE_NAME=nonlib elf/xmalloc.o MODULE_NAME=nonlib elf/xstrdup.o MODULE_NAME=nonlib elf/chroot_canon.o MODULE_NAME=nonlib elf/static-stubs.o MODULE_NAME=nonlib elf/stringtable.o MODULE_NAME=nonlib io/pwd io/pwd.o MODULE_NAME=nonlib locale/locale locale/locale.o MODULE_NAME=locale_programs locale/locale-spec.o MODULE_NAME=locale_programs locale/charmap-dir.o MODULE_NAME=locale_programs locale/simple-hash.o MODULE_NAME=locale_programs locale/xmalloc.o MODULE_NAME=locale_programs locale/xstrdup.o MODULE_NAME=locale_programs locale/record-status.o MODULE_NAME=locale_programs locale/xasprintf.o MODULE_NAME=locale_programs locale/localedef locale/localedef.o MODULE_NAME=locale_programs locale/ld-ctype.o MODULE_NAME=locale_programs locale/ld-messages.o MODULE_NAME=locale_programs locale/ld-monetary.o MODULE_NAME=locale_programs locale/ld-numeric.o MODULE_NAME=locale_programs locale/ld-time.o MODULE_NAME=locale_programs locale/ld-paper.o MODULE_NAME=locale_programs locale/ld-name.o MODULE_NAME=locale_programs locale/ld-address.o MODULE_NAME=locale_programs locale/ld-telephone.o MODULE_NAME=locale_programs locale/ld-measurement.o MODULE_NAME=locale_programs locale/ld-identification.o MODULE_NAME=locale_programs locale/ld-collate.o MODULE_NAME=locale_programs locale/charmap.o MODULE_NAME=locale_programs locale/linereader.o MODULE_NAME=locale_programs locale/locfile.o MODULE_NAME=locale_programs locale/repertoire.o MODULE_NAME=locale_programs locale/locarchive.o MODULE_NAME=locale_programs locale/md5.o MODULE_NAME=locale_programs locale/charmap-dir.o MODULE_NAME=locale_programs locale/simple-hash.o MODULE_NAME=locale_programs locale/xmalloc.o MODULE_NAME=locale_programs locale/xstrdup.o MODULE_NAME=locale_programs locale/record-status.o MODULE_NAME=locale_programs locale/xasprintf.o MODULE_NAME=locale_programs catgets/gencat catgets/gencat.o MODULE_NAME=nonlib catgets/xmalloc.o MODULE_NAME=nonlib nss/makedb nss/makedb.o MODULE_NAME=nonlib nss/xmalloc.o MODULE_NAME=nonlib nss/hash-string.o MODULE_NAME=nonlib nss/getent nss/getent.o MODULE_NAME=nonlib posix/getconf posix/getconf.o MODULE_NAME=nonlib login/utmpdump login/utmpdump.o MODULE_NAME=nonlib debug/pcprofiledump debug/pcprofiledump.o MODULE_NAME=nonlib timezone/zic timezone/zic.o MODULE_NAME=nonlib timezone/zdump timezone/zdump.o MODULE_NAME=nonlib iconv/iconv_prog iconv/iconv_prog.o MODULE_NAME=nonlib iconv/iconv_charmap.o MODULE_NAME=iconvprogs iconv/charmap.o MODULE_NAME=iconvprogs iconv/charmap-dir.o MODULE_NAME=iconvprogs iconv/linereader.o MODULE_NAME=iconvprogs iconv/dummy-repertoire.o MODULE_NAME=iconvprogs iconv/simple-hash.o MODULE_NAME=iconvprogs iconv/xstrdup.o MODULE_NAME=iconvprogs iconv/xmalloc.o MODULE_NAME=iconvprogs iconv/record-status.o MODULE_NAME=iconvprogs iconv/iconvconfig iconv/iconvconfig.o MODULE_NAME=nonlib iconv/strtab.o MODULE_NAME=iconvprogs iconv/xmalloc.o MODULE_NAME=iconvprogs iconv/hash-string.o MODULE_NAME=iconvprogs nss/libnss_files.so MODULE_NAME=libnss_files nss/libnss_compat.so.2 MODULE_NAME=libnss_compat nss/libnss_db.so MODULE_NAME=libnss_db hesiod/libnss_hesiod.so MODULE_NAME=libnss_hesiod login/libutil.so MODULE_NAME=libutil debug/libpcprofile.so MODULE_NAME=libpcprofile debug/libSegFault.so MODULE_NAME=libSegFault Also, to avoid adding both LFS and 64 bit time support on internal tests they are moved to a newer 'testsuite-internal' module. It should be similar to 'nonlib' regarding internal definition and linking namespace. This patch also enables LFS and 64 bit support of libsupport container programs (echo-container, test-container, shell-container, and true-container). Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: DJ Delorie <dj@redhat.com>
* Support target specific ALIGN for variable alignment test [BZ #28676]H.J. Lu2021-12-146-6/+82
| | | | | | | | | | | Add <tst-file-align.h> to support target specific ALIGN for variable alignment test: 1. Alpha: Use 0x10000. 2. MicroBlaze and Nios II: Use 0x8000. 3. All others: Use 0x200000. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* NEWS: Document LD_PREFER_MAP_32BIT_EXEC as x86-64 onlyH.J. Lu2021-12-141-3/+3
|
* elf: Align argument of __munmap to page size [BZ #28676]H.J. Lu2021-12-141-0/+1
| | | | | | | | | | | | On Linux/x86-64, for elf/tst-align3, we now get munmap(0x7f88f9401000, 1126424) = 0 instead of munmap(0x7f1615200018, 544768) = -1 EINVAL (Invalid argument) Reviewed-by: Florian Weimer <fweimer@redhat.com>
* elf: Use new dependency sorting algorithm by defaultFlorian Weimer2021-12-144-6/+7
| | | | | | | The default has to change eventually, and there are no known failures that require a delay. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* intl: Emit no lines in bison generated filesKhem Raj2021-12-141-1/+1
| | | | | | | | | | | Improve reproducibility: Do not put any #line preprocessor commands in bison generated files. These lines contain absolute paths containing file locations on the host build machine. Signed-off-by: Juro Bystricky <juro.bystricky@intel.com> Signed-off-by: Khem Raj <raj.khem@gmail.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* hurd: Do not set PIE_UNSUPPORTEDSamuel Thibault2021-12-142-11/+0
| | | | This is now supported.
* NEWS: Move LD_PREFER_MAP_32BIT_EXECH.J. Lu2021-12-131-4/+4
| | | | | | Move LD_PREFER_MAP_32BIT_EXEC to Deprecated and removed features, and other changes affecting compatibility:
* mach: Fix spurious inclusion of stack_chk_fail_local in libmachuser.aSamuel Thibault2021-12-141-0/+1
| | | | | When linking programs statically, stack_chk_fail_local already comes from libc_nonshared, so we don't need it in lib{mach,hurd}user.a.
* Disable DT_RUNPATH on NSS tests [BZ #28455]H.J. Lu2021-12-131-0/+8
| | | | | | | | | | | The glibc internal NSS functions should always load NSS modules from the system. For testing purpose, disable DT_RUNPATH on NSS tests so that the glibc internal NSS functions can load testing NSS modules via DT_RPATH. This partially fixes BZ #28455. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* sysdeps: Simplify sin Taylor Series calculationAkila Welihinda2021-12-131-5/+5
| | | | | | | | | | | | The macro TAYLOR_SIN adds the term `-0.5*da*a^2 + da` in hopes of regaining some precision as a function of da. However the comment says we add the term `-0.5*da*a^2 + 0.5*da` which is different. This fix updates the comment to reflect the code and also simplifies the calculation by replacing `a` with `x` because they always have the same value. Signed-off-by: Akila Welihinda <akilawelihinda@ucla.edu> Reviewed-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
* math: Remove the error handling wrapper from hypot and hypotfAdhemerval Zanella2021-12-1336-16/+132
| | | | | | | | | | | | The error handling is moved to sysdeps/ieee754 version with no SVID support. The compatibility symbol versions still use the wrapper with SVID error handling around the new code. There is no new symbol version nor compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv). Only ia64 is unchanged, since it still uses the arch specific __libm_error_region on its implementation. Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
* math: Use fmin/fmax on hypotWilco Dijkstra2021-12-131-2/+3
| | | | | | It optimizes for architectures that provides fast builtins. Checked on aarch64-linux-gnu.
* aarch64: Add math-use-builtins-f{max,min}.hAdhemerval Zanella2021-12-136-112/+8
| | | | It allows to remove the arch-specific implementations.
* math: Add math-use-builtinds-fmin.hAdhemerval Zanella2021-12-133-0/+10
| | | | | It allows the architecture to use the builtin instead of generic implementation.
* math: Add math-use-builtinds-fmax.hAdhemerval Zanella2021-12-137-0/+14
| | | | | It allows the architecture to use the builtin instead of generic implementation.
* math: Remove powerpc e_hypotAdhemerval Zanella2021-12-139-327/+1
| | | | | | | | | | | | | | | | | | | | | | | The generic implementation is shows only slight worse performance: POWER10 reciprocal-throughput latency master 8.28478 13.7253 new hypot 7.21945 13.1933 POWER9 reciprocal-throughput latency master 13.4024 14.0967 new hypot 14.8479 15.8061 POWER8 reciprocal-throughput latency master 15.5767 16.8885 new hypot 16.5371 18.4057 One way to improve might to make gcc generate xsmaxdp/xsmindp for fmax/fmin (it onl does for -ffast-math, clang does for default options). Checked on powerpc64-linux-gnu (power8) and powerpc64le-linux-gnu (power9).
* i386: Move hypot implementation to CAdhemerval Zanella2021-12-133-139/+48
| | | | | | | | | | The generic hypotf is slight slower, mostly due the tricks the assembly does to optimize the isinf/isnan/issignaling. The generic hypot is way slower, since the optimized implementation uses the i386 default excessive precision to issue the operation directly. A similar implementation is provided instead of using the generic implementation: Checked on i686-linux-gnu.
* math: Use an improved algorithm for hypotl (ldbl-128)Adhemerval Zanella2021-12-131-130/+96
| | | | | | | | | | | | | | | | | | | | | This implementation is based on 'An Improved Algorithm for hypot(a,b)' by Carlos F. Borges [1] using the MyHypot3 with the following changes: - Handle qNaN and sNaN. - Tune the 'widely varying operands' to avoid spurious underflow due the multiplication and fix the return value for upwards rounding mode. - Handle required underflow exception for subnormal results. The main advantage of the new algorithm is its precision. With a random 1e9 input pairs in the range of [LDBL_MIN, LDBL_MAX], glibc current implementation shows around 0.05% results with an error of 1 ulp (453266 results) while the new implementation only shows 0.0001% of total (1280). Checked on aarch64-linux-gnu and x86_64-linux-gnu. [1] https://arxiv.org/pdf/1904.09481.pdf
* math: Use an improved algorithm for hypotl (ldbl-96)Adhemerval Zanella2021-12-131-133/+98
| | | | | | | | | | | | | | | | | | | This implementation is based on 'An Improved Algorithm for hypot(a,b)' by Carlos F. Borges [1] using the MyHypot3 with the following changes: - Handle qNaN and sNaN. - Tune the 'widely varying operands' to avoid spurious underflow due the multiplication and fix the return value for upwards rounding mode. - Handle required underflow exception for subnormal results. The main advantage of the new algorithm is its precision. With a random 1e8 input pairs in the range of [LDBL_MIN, LDBL_MAX], glibc current implementation shows around 0.02% results with an error of 1 ulp (23158 results) while the new implementation only shows 0.0001% of total (111). [1] https://arxiv.org/pdf/1904.09481.pdf
* math: Improve hypot performance with FMAWilco Dijkstra2021-12-131-1/+16
| | | | | | | | | | | Improve hypot performance significantly by using fma when available. The fma version has twice the throughput of the previous version and 70% of the latency. The non-fma version has 30% higher throughput and 10% higher latency. Max ULP error is 0.949 with fma and 0.792 without fma. Passes GLIBC testsuite.
* math: Use an improved algorithm for hypot (dbl-64)Wilco Dijkstra2021-12-131-143/+92
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This implementation is based on the 'An Improved Algorithm for hypot(a,b)' by Carlos F. Borges [1] using the MyHypot3 with the following changes: - Handle qNaN and sNaN. - Tune the 'widely varying operands' to avoid spurious underflow due the multiplication and fix the return value for upwards rounding mode. - Handle required underflow exception for denormal results. The main advantage of the new algorithm is its precision: with a random 1e9 input pairs in the range of [DBL_MIN, DBL_MAX], glibc current implementation shows around 0.34% results with an error of 1 ulp (3424869 results) while the new implementation only shows 0.002% of total (18851). The performance result are also only slight worse than current implementation. On x86_64 (Ryzen 5900X) with gcc 12: Before: "hypot": { "workload-random": { "duration": 3.73319e+09, "iterations": 1.12e+08, "reciprocal-throughput": 22.8737, "latency": 43.7904, "max-throughput": 4.37184e+07, "min-throughput": 2.28361e+07 } } After: "hypot": { "workload-random": { "duration": 3.7597e+09, "iterations": 9.8e+07, "reciprocal-throughput": 23.7547, "latency": 52.9739, "max-throughput": 4.2097e+07, "min-throughput": 1.88772e+07 } } Co-Authored-By: Adhemerval Zanella <adhemerval.zanella@linaro.org> Checked on x86_64-linux-gnu and aarch64-linux-gnu. [1] https://arxiv.org/pdf/1904.09481.pdf
* math: Simplify hypotf implementationAdhemerval Zanella2021-12-132-35/+37
| | | | | | | | | | | | Use a more optimized comparison for check for NaN and infinite and add an inlined issignaling implementation for float. With gcc it results in 2 FP comparisons. The file Copyright is also changed to use GPL, the implementation was completely changed by 7c10fd3515f to use double precision instead of scaling and this change removes all the GET_FLOAT_WORD usage. Checked on x86_64-linux-gnu.
* Cleanup encoding in commentsSiddhesh Poyarekar2021-12-134-32/+32
| | | | | | | | | Replace non-UTF-8 and non-ASCII characters in comments with their UTF-8 equivalents so that files don't end up with mixed encodings. With this, all files (except tests that actually test different encodings) have a single encoding. Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* Replace --enable-static-pie with --disable-default-pieSiddhesh Poyarekar2021-12-1322-90/+205
| | | | | | | | | | | | | | | | | | | | | | | | Build glibc programs and tests as PIE by default and enable static-pie automatically if the architecture and toolchain supports it. Also add a new configuration option --disable-default-pie to prevent building programs as PIE. Only the following architectures now have PIE disabled by default because they do not work at the moment. hppa, ia64, alpha and csky don't work because the linker is unable to handle a pcrel relocation generated from PIE objects. The microblaze compiler is currently failing with an ICE. GNU hurd tries to enable static-pie, which does not work and hence fails. All these targets have default PIE disabled at the moment and I have left it to the target maintainers to enable PIE on their targets. build-many-glibcs runs clean for all targets. I also tested x86_64 on Fedora and Ubuntu, to verify that the default build as well as --disable-default-pie work as expected with both system toolchains. Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* hurd: Add rules for static PIE buildSamuel Thibault2021-12-122-5/+14
| | | | This fixes [BZ #28671].
* hurd: Fix gmon-staticSamuel Thibault2021-12-122-5/+5
| | | | We need to use crt0 for gmon-static too.
* x86-64: Remove LD_PREFER_MAP_32BIT_EXEC support [BZ #28656]H.J. Lu2021-12-105-95/+4
| | | | | | | | | | Remove the LD_PREFER_MAP_32BIT_EXEC environment variable support since the first PT_LOAD segment is no longer executable due to defaulting to -z separate-code. This fixes [BZ #28656]. Reviewed-by: Florian Weimer <fweimer@redhat.com>
* elf: Use errcode instead of (unset) errno in rtld_chain_loadFlorian Weimer2021-12-101-1/+1
|