about summary refs log tree commit diff
Commit message (Collapse)AuthorAgeFilesLines
* microblaze: Use the correct select syscall (BZ #28883)Adhemerval Zanella2022-02-161-1/+1
| | | | | | | | | On Microblaze only __NR_newselect is implemented, even though kernel advertise __NR_select on asm/unistd.h. Since microblaze is the only architecture that undef __ASSUME_PSELECT, the generic code change is simpler than chaging the architecture syscall number. Acked-by: Mark Hatle <mark.hatle@xilinx.com>
* Update kernel version to 5.16 in tst-mman-consts.pyJoseph Myers2022-02-161-1/+1
| | | | | | | | This patch updates the kernel version in the test tst-mman-consts.py to 5.16. (There are no new MAP_* constants covered by this test in 5.16 that need any other header changes.) Tested with build-many-glibcs.py.
* pthread: Use 64 bit time_t stat internally for sem_open (BZ #28880)Adhemerval Zanella2022-02-161-4/+4
| | | | | | | | | | The __sem_check_add_mapping internal stat calls fails with EOVERFLOW if system time is larger than 32 bit. It is a missing spot from 52a5fe70a2c fix to use 64 bit stat internally. Checked on x86_64-linux-gnu and i686-linux-gnu.
* x86: Fix bug in strncmp-evex and strncmp-avx2 [BZ #28895]Noah Goldstein2022-02-163-0/+25
| | | | | | | | | | | | | Logic can read before the start of `s1` / `s2` if both `s1` and `s2` are near the start of a page. To avoid having the result contimated by these comparisons the `strcmp` variants would mask off these comparisons. This was missing in the `strncmp` variants causing the bug. This commit adds the masking to `strncmp` so that out of range comparisons don't affect the result. test-strcmp, test-strncmp, test-wcscmp, and test-wcsncmp all pass as well a full xcheck on x86_64 linux. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* String: Strength memset tests in test-memset.cNoah Goldstein2022-02-151-15/+21
| | | | | | | | | The prior sentinel logic was broken and was checking the SIMPLE_MEMSET as opposed to the tested implementation. As well `s` (the test buffer) was not reset between implementation tests so it was possible for a buggy implementation to be hidden by a previously executed correct one. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* x86-64: Define __memcmpeq in ld.soH.J. Lu2022-02-141-1/+3
| | | | | Define __memcmpeq in ld.so so that compiler can generate __memcmpeq call when compiling for ld.so.
* htl: Destroy thread-specific data before releasing joinsSamuel Thibault2022-02-141-3/+3
| | | | | Applications may want to assume that after pthread_join() returns, all thread-specific data has been released.
* htl: Fix initializing the key lockSamuel Thibault2022-02-142-3/+5
| | | | | | The static pthread_once_t in the pt-key.h header was creating one pthread_once_t per includer. We have to use a shared common pthread_once_t instead.
* mach: Fix LLL_SHARED valueSamuel Thibault2022-02-141-1/+1
| | | | Mach defines GSYNC_SHARED, not SYNC_SHARED.
* htl: Make pthread_[gs]etspecific not check for key validitySamuel Thibault2022-02-142-4/+2
| | | | | | | Since __pthread_key_create might be concurrently reallocating the __pthread_key_destructors array, it's not safe to access it without the mutex held. Posix explicitly says we are allowed to prefer performance over error detection.
* x86-64: Remove bzero weak alias in SS2 memsetH.J. Lu2022-02-141-3/+1
| | | | | | | | | | | commit 3d9f171bfb5325bd5f427e9fc386453358c6e840 Author: H.J. Lu <hjl.tools@gmail.com> Date: Mon Feb 7 05:55:15 2022 -0800 x86-64: Optimize bzero added the optimized bzero. Remove bzero weak alias in SS2 memset to avoid undefined __bzero in memset-sse2-unaligned-erms.
* hppa: Fix typoJohn David Anglin2022-02-141-1/+1
|
* linux: Use socket-constants-time64.h on tst-socket-timestamp-compatAdhemerval Zanella2022-02-141-12/+13
| | | | | | | | The kernel header might not define the SO_TIMESTAMP{NS}_OLD or SO_TIMESTAMP{NS}_NEW if it older than v5.1. Reviewed-by: Carlos O'Donell <carlos@redhat.com> Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
* x86/configure.ac: Define PI_STATIC_AND_HIDDEN/SUPPORT_STATIC_PIEH.J. Lu2022-02-146-26/+13
| | | | | Move PI_STATIC_AND_HIDDEN and SUPPORT_STATIC_PIE to sysdeps/x86/configure.ac.
* Fix elf/tst-audit2 on hppaJohn David Anglin2022-02-141-14/+10
| | | | | | | | | | | | The test elf/tst-audit2 fails on hppa with a segmentation fault in the long branch stub used to call malloc from calloc. This occurs because the test is not a PIC executable and calloc is called from the dynamic linker before the dp register is initialized in _dl_start_user. The fix is to move the dp register initialization into elf_machine_runtime_setup. Since the address of $global$ can't be loaded directly, we continue to use the DT_PLTGOT value from the the main_map to initialize dp.
* x86: Use CHECK_FEATURE_PRESENT on PCONFIGH.J. Lu2022-02-141-1/+1
| | | | | PCONFIG is a privileged instruction. Use CHECK_FEATURE_PRESENT, instead of CHECK_FEATURE_ACTIVE, on PCONFIG in tst-cpu-features-supports.c.
* x86: Don't check PTWRITE in tst-cpu-features-cpuinfo.cH.J. Lu2022-02-141-0/+3
| | | | | Don't check PTWRITE against /proc/cpuinfo since kernel doesn't report PTWRITE in /proc/cpuinfo.
* x86: Set .text section in memset-vec-unaligned-ermsNoah Goldstein2022-02-121-0/+1
| | | | | | | | | | | commit 3d9f171bfb5325bd5f427e9fc386453358c6e840 Author: H.J. Lu <hjl.tools@gmail.com> Date: Mon Feb 7 05:55:15 2022 -0800 x86-64: Optimize bzero Remove setting the .text section for the code. This commit adds that back.
* Linux: Include <dl-auxv.h> in dl-sysdep.c only for SHAREDFlorian Weimer2022-02-111-1/+2
| | | | | | | | | Otherwise, <dl-auxv.h> on POWER ends up being included twice, once in dl-sysdep.c, once in dl-support.c. That leads to a linker failure due to multiple definitions of _dl_cache_line_size. Fixes commit d96d2995c1121d3310102afda2deb1f35761b5e6 ("Revert "Linux: Consolidate auxiliary vector parsing").
* Revert "Linux: Consolidate auxiliary vector parsing"Florian Weimer2022-02-116-106/+180
| | | | | | | | | | | | This reverts commit 8c8510ab2790039e58995ef3a22309582413d3ff. The revert is not perfect because the commit included a bug fix for _dl_sysdep_start with an empty argv, introduced in commit 2d47fa68628e831a692cba8fc9050cef435afc5e ("Linux: Remove DL_FIND_ARG_COMPONENTS"), and this bug fix is kept. The revert is necessary because the reverted commit introduced an early memset call on aarch64, which leads to crash due to lack of TCB initialization.
* String: Ensure 'MIN_PAGE_SIZE' is multiple of 'getpagesize'Noah Goldstein2022-02-112-16/+16
| | | | | When 'TEST_LEN' was defined as (4096 * 3) the allocation size Would not be a multiple of system page size if system page size > 4096.
* Use binutils 2.38 branch in build-many-glibcs.pyJoseph Myers2022-02-101-1/+1
| | | | | | This patch makes build-many-glibcs.py use binutils 2.38 branch. Tested with build-many-glibcs.py (compilers and glibcs builds).
* elf: Remove LD_USE_LOAD_BIASAdhemerval Zanella2022-02-106-22/+4
| | | | | | | | It is solely for prelink with PIE executables [1]. [1] https://sourceware.org/legacy-ml/libc-hacker/2003-11/msg00127.html Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* malloc: Remove LD_TRACE_PRELINKING usage from mtraceAdhemerval Zanella2022-02-104-33/+51
| | | | | | | | | | | | | | | | | | | | | | The fix for BZ#22716 replacde LD_TRACE_LOADED_OBJECTS with LD_TRACE_PRELINKING so mtrace could record executable address position. To provide the same information, LD_TRACE_LOADED_OBJECTS is extended where a value or '2' also prints the executable address as well. It avoid adding another loader environment variable to be used solely for mtrace. The vDSO will be printed as a default library (with '=>' pointing the same name), which is ok since both mtrace and ldd already handles it. The mtrace script is changed to also parse the new format. To correctly support PIE and non-PIE executables, both the default mtrace address and the one calculated as used (it fixes mtrace for non-PIE exectuable as for BZ#22716 for PIE). Checked on x86_64-linux-gnu. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* elf: Remove prelink supportAdhemerval Zanella2022-02-1025-885/+117
| | | | | | | | | | | | | Prelinked binaries and libraries still work, the dynamic tags DT_GNU_PRELINKED, DT_GNU_LIBLIST, DT_GNU_CONFLICT just ignored (meaning the process is reallocated as default). The loader environment variable TRACE_PRELINKING is also removed, since it used solely on prelink. Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* Linux: Consolidate auxiliary vector parsingFlorian Weimer2022-02-106-181/+107
| | | | | | | | | | | | | | | | | | | | And optimize it slightly. The large switch statement in _dl_sysdep_start can be replaced with a large array. This reduces source code and binary size. On i686-linux-gnu: Before: text data bss dec hex filename 7791 12 0 7803 1e7b elf/dl-sysdep.os After: text data bss dec hex filename 7135 12 0 7147 1beb elf/dl-sysdep.os Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* Linux: Assume that NEED_DL_SYSINFO_DSO is always definedFlorian Weimer2022-02-102-9/+3
| | | | | | The definition itself is still needed for generic code. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* Linux: Remove DL_FIND_ARG_COMPONENTSFlorian Weimer2022-02-101-15/+10
| | | | | | | The generic definition is always used since the Native Client port has been removed. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* Linux: Remove HAVE_AUX_SECURE, HAVE_AUX_XID, HAVE_AUX_PAGESIZEFlorian Weimer2022-02-102-66/+1
| | | | | | They are always defined. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* elf: Merge dl-sysdep.c into the Linux versionFlorian Weimer2022-02-102-362/+337
| | | | | | | The generic version is the de-facto Linux implementation. It requires an auxiliary vector, so Hurd does not use it. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* hppa: Fix bind-now audit (BZ #28857)Adhemerval Zanella2022-02-096-8/+15
| | | | | | | | | | | | | | | | | | | | | | | | | On hppa, a function pointer returned by la_symbind is actually a function descriptor has the plabel bit set (bit 30). This must be cleared to get the actual address of the descriptor. If the descriptor has been bound, the first word of the descriptor is the physical address of theA function, otherwise, the first word of the descriptor points to a trampoline in the PLT. This patch also adds a workaround on tests because on hppa (and it seems to be the only ABI I have see it), some shared library adds a dynamic PLT relocation to am empty symbol name: $ readelf -r elf/tst-audit25mod1.so [...] Relocation section '.rela.plt' at offset 0x464 contains 6 entries: Offset Info Type Sym.Value Sym. Name + Addend 00002008 00000081 R_PARISC_IPLT 508 [...] It breaks some assumptions on the test, where a symbol with an empty name ("") is passed on la_symbind. Checked on x86_64-linux-gnu and hppa-linux-gnu.
* x86-64: Optimize bzeroH.J. Lu2022-02-0810-25/+256
| | | | | | | | | | memset with zero as the value to set is by far the majority value (99%+ for Python3 and GCC). bzero can be slightly more optimized for this case by using a zero-idiom xor for broadcasting the set value to a register (vector or GPR). Co-developed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* benchtests: Add benches for bzeroH.J. Lu2022-02-084-0/+372
| | | | Add bench-bzero-large.c, bench-bzero-walk.c and bench-bzero.c.
* linux: fix accuracy of get_nprocs and get_nprocs_conf [BZ #28865]Dmitry V. Levin2022-02-071-31/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | | get_nprocs() and get_nprocs_conf() use various methods to obtain an accurate number of processors. Re-introduce __get_nprocs_sched() as a source of information, and fix the order in which these methods are used to return the most accurate information. The primary source of information used in both functions remains unchanged. This also changes __get_nprocs_sched() error return value from 2 to 0, but all its users are already prepared to handle that. Old fallback order: get_nprocs: /sys/devices/system/cpu/online -> /proc/stat -> 2 get_nprocs_conf: /sys/devices/system/cpu/ -> /proc/stat -> 2 New fallback order: get_nprocs: /sys/devices/system/cpu/online -> /proc/stat -> sched_getaffinity -> 2 get_nprocs_conf: /sys/devices/system/cpu/ -> /proc/stat -> sched_getaffinity -> 2 Fixes: 342298278e ("linux: Revert the use of sched_getaffinity on get_nproc") Closes: BZ #28865 Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* x86: Remove SSSE3 instruction for broadcast in memset.S (SSE2 Only)Noah Goldstein2022-02-071-3/+4
| | | | | | | | | | | commit b62ace2740a106222e124cc86956448fa07abf4d Author: Noah Goldstein <goldstein.w.n@gmail.com> Date: Sun Feb 6 00:54:18 2022 -0600 x86: Improve vec generation in memset-vec-unaligned-erms.S Revert usage of 'pshufb' in broadcast logic as it is an SSSE3 instruction and memset.S is restricted to only SSE2 instructions.
* benchtests: Sort benches in MakefileH.J. Lu2022-02-071-19/+110
| | | | Put one bench per line and sort them.
* Benchtests: Add length zero benchmark for memset in bench-memset.cNoah Goldstein2022-02-061-1/+1
| | | | | | Zero is a relevant size for some workloads (roughly 5% of uses for GCC) so we should be testing it's performance as well. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86: Improve vec generation in memset-vec-unaligned-erms.SNoah Goldstein2022-02-065-87/+152
| | | | | | | | | | | | | | | | | | | | | | | | | No bug. Split vec generation into multiple steps. This allows the broadcast in AVX2 to use 'xmm' registers for the L(less_vec) case. This saves an expensive lane-cross instruction and removes the need for 'vzeroupper'. For SSE2 replace 2x 'punpck' instructions with zero-idiom 'pxor' for byte broadcast. Results for memset-avx2 small (geomean of N = 20 benchset runs). size, New Time, Old Time, New / Old 0, 4.100, 3.831, 0.934 1, 5.074, 4.399, 0.867 2, 4.433, 4.411, 0.995 4, 4.487, 4.415, 0.984 8, 4.454, 4.396, 0.987 16, 4.502, 4.443, 0.987 All relevant string/wcsmbs tests are passing. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86-64: Add vector tan/tanf to libmvec microbenchmarkSunil K Pandey2022-02-063-0/+8201
| | | | | | | | | | | | | | | | | | | | Add vector tan/tanf and input files to libmvec microbenchmark. libmvec-tan-inputs: 90% Normal random distribution range: (-DBL_MAX, DBL_MAX) mean: 0.0 sigma: 5.0 10% uniform random distribution in range (-1000.0, 1000.0) libmvec-tanf-inputs: 90% Normal random distribution range: (-FLT_MAX, FLT_MAX) mean: 0.0f sigma: 5.0f 10% uniform random distribution in range (-1000.0f, 1000.0f) Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86-64: Add vector erfc/erfcf to libmvec microbenchmarkSunil K Pandey2022-02-063-0/+8201
| | | | | | | | | | | | | | | | | | | | Add vector erfc/erfcf and input files to libmvec microbenchmark. libmvec-erfc-inputs: 90% Normal random distribution range: (-6.0, 6.0) mean: 0.0 sigma: 1.0 10% uniform random distribution in range (-5.9, 5.9) libmvec-erfcf-inputs: 90% Normal random distribution range: (-4.0f, 4.0f) mean: 0.0f sigma: 1.0f 10% uniform random distribution in range (-3.9f, 3.9f) Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86-64: Add vector asinh/asinhf to libmvec microbenchmarkSunil K Pandey2022-02-063-0/+8201
| | | | | | | | | | | | | | | | | | | | Add vector asinh/asinhf and input files to libmvec microbenchmark. libmvec-asinh-inputs: 90% Normal random distribution range: (-DBL_MAX, DBL_MAX) mean: 0.0 sigma: 2.0 10% uniform random distribution in range (-1.0e6, 1.0e6) libmvec-asinhf-inputs: 90% Normal random distribution range: (-FLT_MAX, FLT_MAX) mean: 0.0f sigma: 2.0f 10% uniform random distribution in range (-1.0e6f, 1.0e6f) Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86-64: Add vector tanh/tanhf to libmvec microbenchmarkSunil K Pandey2022-02-063-0/+8201
| | | | | | | | | | | | | | | | | | | | Add vector tanh/tanhf and input files to libmvec microbenchmark. libmvec-tanh-inputs: 90% Normal random distribution range: (-19.0, 19.0) mean: 0.0 sigma: 2.0 10% uniform random distribution in range (-16.0, 16.0) libmvec-tanhf-inputs: 90% Normal random distribution range: (-10.0f, 10.0f) mean: 0.0f sigma: 2.0f 10% uniform random distribution in range (-8.0f, 8.0f) Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86-64: Add vector erf/erff to libmvec microbenchmarkSunil K Pandey2022-02-063-0/+8201
| | | | | | | | | | | | | | | | | | | | Add vector erf/erff and input files to libmvec microbenchmark. libmvec-erf-inputs: 90% Normal random distribution range: (-6.0, 6.0) mean: 0.0 sigma: 1.0 10% uniform random distribution in range (-5.9, 5.9) libmvec-erff-inputs: 90% Normal random distribution range: (-4.0f, 4.0f) mean: 0.0f sigma: 1.0f 10% uniform random distribution in range (-3.9f, 3.9f) Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86-64: Add vector acosh/acoshf to libmvec microbenchmarkSunil K Pandey2022-02-063-0/+8201
| | | | | | | | | | | | | | | | | | | | Add vector acosh/acoshf and input files to libmvec microbenchmark. libmvec-acosh-inputs: 90% Normal random distribution range: (1.0, DBL_MAX) mean: 1.0 sigma: 8.0 10% uniform random distribution in range (1.0, 1.0e6) libmvec-acoshf-inputs: 90% Normal random distribution range: (1.0f, FLT_MAX) mean: 1.0f sigma: 4.0f 10% uniform random distribution in range (1.0f, 1.0e6f) Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86-64: Add vector atanh/atanhf to libmvec microbenchmarkSunil K Pandey2022-02-063-0/+8201
| | | | | | | | | | | | | | | | | | | | Add vector atanh/atanhf and input files to libmvec microbenchmark. libmvec-atanh-inputs: 90% Normal random distribution range: (-1.0, 1.0) mean: 0.0 sigma: 1.0 10% uniform random distribution in range (-1.0, 1.0) libmvec-atanhf-inputs: 90% Normal random distribution range: (-1.0f, 1.0f) mean: 0.0f sigma: 1.0f 10% uniform random distribution in range (-1.0f, 1.0f) Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86-64: Add vector log1p/log1pf to libmvec microbenchmarkSunil K Pandey2022-02-063-0/+8201
| | | | | | | | | | | | | | | | | | | | Add vector log1p/log1pf and input files to libmvec microbenchmark. libmvec-log1p-inputs: 70% Normal random distribution range: (-1.0, DBL_MAX) mean: 0.0 sigma: 50.0 30% uniform random distribution in range (-1.0, 1.0e6) libmvec-log1pf-inputs: 70% Normal random distribution range: (-1.0f, FLT_MAX) mean: 0.0f sigma: 50.0f 30% uniform random distribution in range (-1.0f, 1.0e6f) Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86-64: Add vector log2/log2f to libmvec microbenchmarkSunil K Pandey2022-02-063-0/+8201
| | | | | | | | | | | | | | | | | | | | Add vector log2/log2f and input files to libmvec microbenchmark. libmvec-log2-inputs: 70% Normal random distribution range: (0.0, DBL_MAX) mean: 1.0 sigma: 50.0 30% uniform random distribution in range (0.0, 1.0e6) libmvec-log2f-inputs: 70% Normal random distribution range: (0.0f, FLT_MAX) mean: 1.0f sigma: 50.0f 30% uniform random distribution in range (0.0f, 1.0e6f) Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86-64: Add vector log10/log10f to libmvec microbenchmarkSunil K Pandey2022-02-063-0/+8201
| | | | | | | | | | | | | | | | | | | | Add vector log10/log10f and input files to libmvec microbenchmark. libmvec-log10-inputs: 70% Normal random distribution range: (0.0, DBL_MAX) mean: 1.0 sigma: 50.0 30% uniform random distribution in range (0.0, 1.0e6) libmvec-log10f-inputs: 70% Normal random distribution range: (0.0f, FLT_MAX) mean: 1.0f sigma: 50.0f 30% uniform random distribution in range (0.0f, 1.0e6f) Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86-64: Add vector atan2/atan2f to libmvec microbenchmarkSunil K Pandey2022-02-063-0/+8201
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add vector atan2/atan2f and input files to libmvec microbenchmark. libmvec-atan2-inputs: arg1: 90% Normal random distribution range: (-DBL_MAX, DBL_MAX) mean: 0.0 sigma: 4.0 10% uniform random distribution in range (-1.0e6, 1.0e6) arg2: 90% Normal random distribution range: (-DBL_MAX, DBL_MAX) mean: 0.0 sigma: 4.0 10% uniform random distribution in range (-1.0e6, 1.0e6) libmvec-atan2f-inputs: arg1: 90% Normal random distribution range: (-FLT_MAX, FLT_MAX) mean: 0.0f sigma: 4.0f 10% uniform random distribution in range (-1.0e6f, 1.0e6f) arg2: 90% Normal random distribution range: (-FLT_MAX, FLT_MAX) mean: 0.0f sigma: 4.0f 10% uniform random distribution in range (-1.0e6f, 1.0e6f) Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86-64: Add vector cbrt/cbrtf to libmvec microbenchmarkSunil K Pandey2022-02-063-0/+8201
| | | | | | | | | | | | | | | | | | | | Add vector cbrt/cbrtf and input files to libmvec microbenchmark. libmvec-cbrt-inputs: 90% Normal random distribution range: (-DBL_MAX, DBL_MAX) mean: 0.0 sigma: 10.0 10% uniform random distribution in range (-1000.0, 1000.0) libmvec-cbrtf-inputs: 90% Normal random distribution range: (-FLT_MAX, FLT_MAX) mean: 0.0f sigma: 10.0f 10% uniform random distribution in range (-1000.0f, 1000.0f) Reviewed-by: H.J. Lu <hjl.tools@gmail.com>