Commit message | Author | Age | Files | Lines
* ia64: Remove bcopyAdhemerval Zanella2022-02-231-10/+0
| | | | It just calls memmove, as the generic implementation does.
* hppa: Fix warnings from _dl_lookup_addressJohn David Anglin2022-02-224-9/+14
| | | | | | | | | | | | | This change fixes two warnings from _dl_lookup_address. The first warning comes from dropping the volatile keyword from desc in the call to _dl_read_access_allowed. We now have a full atomic barrier between loading desc[0] and the access check, so desc no longer needs to be declared as volatile. The second warning comes from the implicit declaration of _dl_fix_reloc_arg. This is fixed by including dl-runtime.h and declaring _dl_fix_reloc_arg in dl-runtime.h.
* hppa: Revise getcontext trampoline designJohn David Anglin2022-02-223-31/+35
| | | | | | | | | | | | | | | | | | | The current getcontext return trampoline is overly complex and it unnecessarily clobbers several registers. By saving the context pointer (r26) in the context, __getcontext_ret can restore any registers not restored by setcontext. This allows getcontext to save and restore the entire register context present when getcontext is entered. We use the unused oR0 context slot for the return from __getcontext_ret. While this is not directly useful in C, it can be exploited in assembly code. Registers r20, r23, r24 and r25 are not clobbered in the call path to getcontext. This allows a small simplification of swapcontext. It also allows saving and restoring the 6-bit SAR register in the LSB of the oSAR context slot. The getcontext flag value can be stored in the MSB of the oSAR slot.
* Add SOL_MPTCP, SOL_MCTP from Linux 5.16 to bits/socket.hJoseph Myers2022-02-211-0/+2
| | | | | | | Linux 5.16 adds constants SOL_MPTCP and SOL_MCTP to the getsockopt / setsockopt levels; add these constants to bits/socket.h. Tested for x86_64.
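For code that must also build against a C library whose bits/socket.h predates this change, a guarded fallback is one option. The sketch below is only an illustration: the numeric values are the ones Linux 5.16 uses in its socket header, and `sol_level_name` is a hypothetical helper, not part of glibc.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>

/* Fallbacks for headers that predate these constants.  The values
   mirror the Linux 5.16 kernel definitions.  */
#ifndef SOL_MPTCP
# define SOL_MPTCP 284
#endif
#ifndef SOL_MCTP
# define SOL_MCTP 285
#endif

/* Hypothetical helper: map a getsockopt/setsockopt level to a name.  */
static const char *
sol_level_name (int level)
{
  switch (level)
    {
    case SOL_SOCKET:
      return "SOL_SOCKET";
    case SOL_MPTCP:
      return "SOL_MPTCP";
    case SOL_MCTP:
      return "SOL_MCTP";
    default:
      return "unknown";
    }
}
```

With a glibc that already carries the constants, the `#ifndef` guards are no-ops and the header's definitions are used.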
* elf: Check invalid hole in PT_LOAD segments [BZ #28838]H.J. Lu2022-02-211-0/+3
| Changes in v2:
|
| 1. Update commit log.
|
| commit 163f625cf9becbb82dfec63a29e566324129c0cd
| Author: H.J. Lu <hjl.tools@gmail.com>
| Date: Tue Dec 21 12:35:47 2021 -0800
|
|     elf: Remove excessive p_align check on PT_LOAD segments [BZ #28688]
|
| removed the p_align check against the page size. It caused the loader error or crash on elf/tst-p_align3 when loading elf/tst-p_alignmod3.so, which has the invalid p_align in PT_LOAD segments, added by
|
| commit d8d94863ef125a392b929732b37e07dc927fbcd1
| Author: H.J. Lu <hjl.tools@gmail.com>
| Date: Tue Dec 21 13:42:28 2021 -0800
|
| The loader failure caused by a negative length passed to __mprotect is random, depending on architecture and toolchain. Update _dl_map_segments to detect invalid holes. This fixes BZ #28838.
|
| Reviewed-by: Florian Weimer <fweimer@redhat.com>
* realpath: Do not copy result on failure (BZ #28815)Siddhesh Poyarekar2022-02-212-3/+5
| | | | | | | | | | | On failure, the contents of the resolved buffer passed in by the caller to realpath are undefined. Do not copy any partial resolution to the buffer and also do not test resolved contents in test-canon.c. Resolves: BZ #28815 Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
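From the caller's side, the rule above means the resolved buffer should only be read after realpath has returned non-NULL. A minimal sketch, assuming a hypothetical wrapper named `resolve_path` (not a glibc function) that blanks the buffer on failure so later code cannot accidentally use undefined contents:

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper: resolve PATH into OUT, which must hold at
   least PATH_MAX bytes.  Returns 0 on success and -1 on failure.
   On failure OUT is reset to an empty string, because realpath
   leaves its contents undefined in that case.  */
static int
resolve_path (const char *path, char *out)
{
  if (realpath (path, out) == NULL)
    {
      out[0] = '\0';
      return -1;
    }
  return 0;
}
```

The design choice here is to never branch on what realpath left in the buffer; only its return value decides success.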
* x86: Fix TEST_NAME to make it a string in tst-strncmp-rtm.cNoah Goldstein2022-02-181-2/+2
| | | | | | Previously TEST_NAME was passing a function pointer. This didn't fail because of the -Wno-error flag (to allow for overflow sizes passed to strncmp/wcsncmp). Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86: Test wcscmp RTM in the wcsncmp overflow case [BZ #28896]Noah Goldstein2022-02-183-10/+48
| | | | | | | | | In the overflow fallback strncmp-avx2-rtm and wcsncmp-avx2-rtm would call strcmp-avx2 and wcscmp-avx2 respectively. These would have no checks around vzeroupper and would trigger spurious aborts. This commit fixes that. test-strcmp, test-strncmp, test-wcscmp, and test-wcsncmp all pass on AVX2 machines with and without RTM. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* hppa: Fix swapcontextJohn David Anglin2022-02-183-7/+58
| | | | | | | | | | | | | | | | | This change fixes the failure of stdlib/tst-setcontext2 and stdlib/tst-setcontext7 on hppa. The implementation of swapcontext in C is broken. C saves the return pointer (rp) and any non call-clobbered registers (in this case r3, r4 and r5) on the stack. However, the setcontext call in swapcontext pops the stack and subsequent calls clobber the saved registers. When the context in oucp is restored, both tests fault. Here we rewrite swapcontext in assembly code to avoid using the stack for register values that need to be used after restoration. The getcontext and setcontext routines are revised to save and restore register ret1 for normal returns. We copy the oucp pointer to ret1. This allows access to the old context after calling getcontext and setcontext.
* x86: Fallback {str|wcs}cmp RTM in the ncmp overflow case [BZ #28896]Noah Goldstein2022-02-177-10/+23
| | | | | | | | | | In the overflow fallback strncmp-avx2-rtm and wcsncmp-avx2-rtm would call strcmp-avx2 and wcscmp-avx2 respectively. These would have no checks around vzeroupper and would trigger spurious aborts. This commit fixes that. test-strcmp, test-strncmp, test-wcscmp, and test-wcsncmp all pass on AVX2 machines with and without RTM. Co-authored-by: H.J. Lu <hjl.tools@gmail.com>
* string: Add a testcase for wcsncmp with SIZE_MAX [BZ #28755]H.J. Lu2022-02-171-0/+13
| Verify that wcsncmp (L("abc"), L("abd"), SIZE_MAX) != 0. The new test fails without
|
| commit ddf0992cf57a93200e0c782e2a94d0733a5a0b87
| Author: Noah Goldstein <goldstein.w.n@gmail.com>
| Date: Sun Jan 9 16:02:21 2022 -0600
|
|     x86: Fix __wcsncmp_avx2 in strcmp-avx2.S [BZ# 28755]
|
| and
|
| commit 7e08db3359c86c94918feb33a1182cd0ff3bb10b
| Author: Noah Goldstein <goldstein.w.n@gmail.com>
| Date: Sun Jan 9 16:02:28 2022 -0600
|
|     x86: Fix __wcsncmp_evex in strcmp-evex.S [BZ# 28755]
|
| This is for BZ #28755.
|
| Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
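The property being tested can be sketched outside the glibc test harness. The helper name `wcsncmp_size_max_differs` is an assumption made for illustration; the actual test lives in the glibc string tests and uses that harness's own macros.

```c
#include <assert.h>
#include <stdint.h>
#include <wchar.h>

/* Hypothetical helper: return nonzero if wcsncmp reports a difference
   between A and B when the length limit is SIZE_MAX.  With such a
   limit the comparison must still stop at the terminating null wide
   character instead of overflowing an internal length computation.  */
static int
wcsncmp_size_max_differs (const wchar_t *a, const wchar_t *b)
{
  return wcsncmp (a, b, SIZE_MAX) != 0;
}
```

Two strings that differ in their last character should be reported as different, and identical strings as equal, regardless of the huge limit.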
* microblaze: Use the correct select syscall (BZ #28883)Adhemerval Zanella2022-02-161-1/+1
| | | | | | | On Microblaze only __NR_newselect is implemented, even though the kernel advertises __NR_select in asm/unistd.h. Since microblaze is the only architecture that undefines __ASSUME_PSELECT, the generic code change is simpler than changing the architecture syscall number. Acked-by: Mark Hatle <mark.hatle@xilinx.com>
* Update kernel version to 5.16 in tst-mman-consts.pyJoseph Myers2022-02-161-1/+1
| | | | | | | | This patch updates the kernel version in the test tst-mman-consts.py to 5.16. (There are no new MAP_* constants covered by this test in 5.16 that need any other header changes.) Tested with build-many-glibcs.py.
* pthread: Use 64 bit time_t stat internally for sem_open (BZ #28880)Adhemerval Zanella2022-02-161-4/+4
| | | | | | | | The __sem_check_add_mapping internal stat call fails with EOVERFLOW if the system time does not fit in 32 bits. It is a spot missed by the 52a5fe70a2c fix to use 64 bit stat internally. Checked on x86_64-linux-gnu and i686-linux-gnu.
* x86: Fix bug in strncmp-evex and strncmp-avx2 [BZ #28895]Noah Goldstein2022-02-163-0/+25
| | | | | | | | | | | Logic can read before the start of `s1` / `s2` if both `s1` and `s2` are near the start of a page. To avoid having the result contaminated by these comparisons the `strcmp` variants would mask off these comparisons. This was missing in the `strncmp` variants, causing the bug. This commit adds the masking to `strncmp` so that out of range comparisons don't affect the result. test-strcmp, test-strncmp, test-wcscmp, and test-wcsncmp all pass, as well as a full xcheck on x86_64 linux. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* String: Strengthen memset tests in test-memset.cNoah Goldstein2022-02-151-15/+21
| | | | | | | The prior sentinel logic was broken and was checking the SIMPLE_MEMSET as opposed to the tested implementation. Also, `s` (the test buffer) was not reset between implementation tests, so it was possible for a buggy implementation to be hidden by a previously executed correct one. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* x86-64: Define __memcmpeq in ld.soH.J. Lu2022-02-141-1/+3
| | | | | Define __memcmpeq in ld.so so that the compiler can generate __memcmpeq calls when compiling for ld.so.
* htl: Destroy thread-specific data before releasing joinsSamuel Thibault2022-02-141-3/+3
| | | | | Applications may want to assume that after pthread_join() returns, all thread-specific data has been released.
* htl: Fix initializing the key lockSamuel Thibault2022-02-142-3/+5
| | | | | | The static pthread_once_t in the pt-key.h header was creating one pthread_once_t per includer. We have to use a shared common pthread_once_t instead.
* mach: Fix LLL_SHARED valueSamuel Thibault2022-02-141-1/+1
| | | | Mach defines GSYNC_SHARED, not SYNC_SHARED.
* htl: Make pthread_[gs]etspecific not check for key validitySamuel Thibault2022-02-142-4/+2
| | | | | Since __pthread_key_create might be concurrently reallocating the __pthread_key_destructors array, it's not safe to access it without the mutex held. POSIX explicitly says we are allowed to prefer performance over error detection.
* x86-64: Remove bzero weak alias in SSE2 memsetH.J. Lu2022-02-141-3/+1
| commit 3d9f171bfb5325bd5f427e9fc386453358c6e840
| Author: H.J. Lu <hjl.tools@gmail.com>
| Date: Mon Feb 7 05:55:15 2022 -0800
|
|     x86-64: Optimize bzero
|
| added the optimized bzero. Remove the bzero weak alias in the SSE2 memset to avoid an undefined __bzero in memset-sse2-unaligned-erms.
* hppa: Fix typoJohn David Anglin2022-02-141-1/+1
* linux: Use socket-constants-time64.h on tst-socket-timestamp-compatAdhemerval Zanella2022-02-141-12/+13
| | | | | | The kernel header might not define the SO_TIMESTAMP{NS}_OLD or SO_TIMESTAMP{NS}_NEW if it is older than v5.1. Reviewed-by: Carlos O'Donell <carlos@redhat.com> Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
* x86/configure.ac: Define PI_STATIC_AND_HIDDEN/SUPPORT_STATIC_PIEH.J. Lu2022-02-146-26/+13
| | | | | Move PI_STATIC_AND_HIDDEN and SUPPORT_STATIC_PIE to sysdeps/x86/configure.ac.
* Fix elf/tst-audit2 on hppaJohn David Anglin2022-02-141-14/+10
| | | | | | | | | | The test elf/tst-audit2 fails on hppa with a segmentation fault in the long branch stub used to call malloc from calloc. This occurs because the test is not a PIC executable and calloc is called from the dynamic linker before the dp register is initialized in _dl_start_user. The fix is to move the dp register initialization into elf_machine_runtime_setup. Since the address of $global$ can't be loaded directly, we continue to use the DT_PLTGOT value from the main_map to initialize dp.
* x86: Use CHECK_FEATURE_PRESENT on PCONFIGH.J. Lu2022-02-141-1/+1
| | | | | PCONFIG is a privileged instruction. Use CHECK_FEATURE_PRESENT, instead of CHECK_FEATURE_ACTIVE, on PCONFIG in tst-cpu-features-supports.c.
* x86: Don't check PTWRITE in tst-cpu-features-cpuinfo.cH.J. Lu2022-02-141-0/+3
| | | | | Don't check PTWRITE against /proc/cpuinfo since the kernel doesn't report PTWRITE in /proc/cpuinfo.
* x86: Set .text section in memset-vec-unaligned-ermsNoah Goldstein2022-02-121-0/+1
| commit 3d9f171bfb5325bd5f427e9fc386453358c6e840
| Author: H.J. Lu <hjl.tools@gmail.com>
| Date: Mon Feb 7 05:55:15 2022 -0800
|
|     x86-64: Optimize bzero
|
| removed the setting of the .text section for the code. This commit adds that back.
* Linux: Include <dl-auxv.h> in dl-sysdep.c only for SHAREDFlorian Weimer2022-02-111-1/+2
| | | | | | | Otherwise, <dl-auxv.h> on POWER ends up being included twice, once in dl-sysdep.c, once in dl-support.c. That leads to a linker failure due to multiple definitions of _dl_cache_line_size. Fixes commit d96d2995c1121d3310102afda2deb1f35761b5e6 ("Revert "Linux: Consolidate auxiliary vector parsing"").
* Revert "Linux: Consolidate auxiliary vector parsing"Florian Weimer2022-02-116-106/+180
| | | | | | | | | | This reverts commit 8c8510ab2790039e58995ef3a22309582413d3ff. The revert is not perfect because the commit included a bug fix for _dl_sysdep_start with an empty argv, introduced in commit 2d47fa68628e831a692cba8fc9050cef435afc5e ("Linux: Remove DL_FIND_ARG_COMPONENTS"), and this bug fix is kept. The revert is necessary because the reverted commit introduced an early memset call on aarch64, which leads to a crash due to lack of TCB initialization.
* String: Ensure 'MIN_PAGE_SIZE' is multiple of 'getpagesize'Noah Goldstein2022-02-112-16/+16
| | | | | When 'TEST_LEN' was defined as (4096 * 3) the allocation size would not be a multiple of system page size if system page size > 4096.
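The arithmetic behind this fix is a plain round-up to the page size. The following is a sketch only: `round_up_to_page` and `min_buffer_size` are hypothetical names, not the identifiers used in the glibc string tests.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Round LEN up to the next multiple of PAGE.  PAGE must be nonzero.  */
static size_t
round_up_to_page (size_t len, size_t page)
{
  return (len + page - 1) / page * page;
}

/* Hypothetical stand-in for the test buffer size: the requested
   length rounded up to the page size of the running system.  */
static size_t
min_buffer_size (size_t test_len)
{
  return round_up_to_page (test_len, (size_t) getpagesize ());
}
```

For example, with a 16384-byte page, 4096 * 3 = 12288 is not a multiple of the page size, and rounding up gives 16384; with a 4096-byte page the value stays 12288.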
* Use binutils 2.38 branch in build-many-glibcs.pyJoseph Myers2022-02-101-1/+1
| | | | | | This patch makes build-many-glibcs.py use binutils 2.38 branch. Tested with build-many-glibcs.py (compilers and glibcs builds).
* elf: Remove LD_USE_LOAD_BIASAdhemerval Zanella2022-02-106-22/+4
| | | | | | | | It is solely for prelink with PIE executables [1]. [1] https://sourceware.org/legacy-ml/libc-hacker/2003-11/msg00127.html Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* malloc: Remove LD_TRACE_PRELINKING usage from mtraceAdhemerval Zanella2022-02-104-33/+51
| The fix for BZ#22716 replaced LD_TRACE_LOADED_OBJECTS with LD_TRACE_PRELINKING so mtrace could record executable address position.
|
| To provide the same information, LD_TRACE_LOADED_OBJECTS is extended so that a value of '2' also prints the executable address. This avoids adding another loader environment variable to be used solely for mtrace. The vDSO will be printed as a default library (with '=>' pointing to the same name), which is ok since both mtrace and ldd already handle it.
|
| The mtrace script is changed to also parse the new format. To correctly support PIE and non-PIE executables, both the default mtrace address and the calculated one are used (this fixes mtrace for non-PIE executables, as BZ#22716 did for PIE).
|
| Checked on x86_64-linux-gnu.
|
| Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* elf: Remove prelink supportAdhemerval Zanella2022-02-1025-885/+117
| | | | | | | | | | | Prelinked binaries and libraries still work; the dynamic tags DT_GNU_PRELINKED, DT_GNU_LIBLIST, DT_GNU_CONFLICT are just ignored (meaning the process is relocated as default). The loader environment variable LD_TRACE_PRELINKING is also removed, since it is used solely by prelink. Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* Linux: Consolidate auxiliary vector parsingFlorian Weimer2022-02-106-181/+107
| And optimize it slightly.
|
| The large switch statement in _dl_sysdep_start can be replaced with a large array. This reduces source code and binary size. On i686-linux-gnu:
|
| Before:
|
|    text    data     bss     dec     hex filename
|    7791      12       0    7803    1e7b elf/dl-sysdep.os
|
| After:
|
|    text    data     bss     dec     hex filename
|    7135      12       0    7147    1beb elf/dl-sysdep.os
|
| Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
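The switch-to-array idea can be illustrated with a simplified model. This is not the glibc code: `struct aux_entry`, `AUX_NULL`, `AUX_MAX` and `parse_auxv` are hypothetical names, and a real auxiliary vector uses the ELF auxv types rather than this stand-in structure.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for one auxiliary vector entry.  */
struct aux_entry
{
  unsigned long type;
  unsigned long value;
};

/* AUX_NULL terminates the vector; AUX_MAX bounds the lookup table.  */
enum { AUX_NULL = 0, AUX_MAX = 32 };

/* Instead of one switch case per entry type, store each value at the
   index given by its type.  Types outside the table are ignored.
   Parsing stops at the AUX_NULL terminator.  */
static void
parse_auxv (const struct aux_entry *av, unsigned long values[AUX_MAX])
{
  for (; av->type != AUX_NULL; ++av)
    if (av->type < AUX_MAX)
      values[av->type] = av->value;
}
```

After the single pass, each consumer reads its value by index, which is what lets the per-type case labels disappear from the source.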
* Linux: Assume that NEED_DL_SYSINFO_DSO is always definedFlorian Weimer2022-02-102-9/+3
| | | | | | The definition itself is still needed for generic code. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* Linux: Remove DL_FIND_ARG_COMPONENTSFlorian Weimer2022-02-101-15/+10
| | | | | | | The generic definition is always used since the Native Client port has been removed. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* Linux: Remove HAVE_AUX_SECURE, HAVE_AUX_XID, HAVE_AUX_PAGESIZEFlorian Weimer2022-02-102-66/+1
| | | | | | They are always defined. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* elf: Merge dl-sysdep.c into the Linux versionFlorian Weimer2022-02-102-362/+337
| | | | | | | The generic version is the de-facto Linux implementation. It requires an auxiliary vector, so Hurd does not use it. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* hppa: Fix bind-now audit (BZ #28857)Adhemerval Zanella2022-02-096-8/+15
| On hppa, a function pointer returned by la_symbind is actually a function descriptor that has the plabel bit set (bit 30). This must be cleared to get the actual address of the descriptor. If the descriptor has been bound, the first word of the descriptor is the physical address of the function; otherwise, the first word of the descriptor points to a trampoline in the PLT.
|
| This patch also adds a workaround on tests because on hppa (and it seems to be the only ABI where I have seen it), some shared libraries add a dynamic PLT relocation to an empty symbol name:
|
| $ readelf -r elf/tst-audit25mod1.so
| [...]
| Relocation section '.rela.plt' at offset 0x464 contains 6 entries:
|  Offset     Info    Type            Sym.Value  Sym. Name + Addend
| 00002008  00000081 R_PARISC_IPLT                508
| [...]
|
| It breaks some assumptions on the test, where a symbol with an empty name ("") is passed on la_symbind.
|
| Checked on x86_64-linux-gnu and hppa-linux-gnu.
* x86-64: Optimize bzeroH.J. Lu2022-02-0810-25/+256
| | | | | | | | | | memset with zero as the value to set is by far the majority value (99%+ for Python3 and GCC). bzero can be slightly more optimized for this case by using a zero-idiom xor for broadcasting the set value to a register (vector or GPR). Co-developed-by: Noah Goldstein <goldstein.w.n@gmail.com>
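The equivalence the optimization relies on, that bzero (p, n) has the same effect as memset (p, 0, n), can be checked with a small sketch. The helper `bzero_matches_memset` is a hypothetical name used only here; note that a compiler may itself rewrite one call into the other.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <string.h>
#include <strings.h>

/* Hypothetical check: clear the first N bytes of two identically
   filled buffers, one with bzero and one with memset, and report
   whether the buffers end up identical.  Returns 0 if N is larger
   than the 256-byte scratch buffers.  */
static int
bzero_matches_memset (size_t n)
{
  unsigned char a[256], b[256];

  if (n > sizeof a)
    return 0;
  memset (a, 0xAA, sizeof a);
  memset (b, 0xAA, sizeof b);
  bzero (a, n);
  memset (b, 0, n);
  return memcmp (a, b, sizeof a) == 0;
}
```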
* benchtests: Add benches for bzeroH.J. Lu2022-02-084-0/+372
| | | | Add bench-bzero-large.c, bench-bzero-walk.c and bench-bzero.c.
* linux: fix accuracy of get_nprocs and get_nprocs_conf [BZ #28865]Dmitry V. Levin2022-02-071-31/+63
| get_nprocs() and get_nprocs_conf() use various methods to obtain an accurate number of processors. Re-introduce __get_nprocs_sched() as a source of information, and fix the order in which these methods are used to return the most accurate information. The primary source of information used in both functions remains unchanged.
|
| This also changes __get_nprocs_sched() error return value from 2 to 0, but all its users are already prepared to handle that.
|
| Old fallback order:
|   get_nprocs: /sys/devices/system/cpu/online -> /proc/stat -> 2
|   get_nprocs_conf: /sys/devices/system/cpu/ -> /proc/stat -> 2
|
| New fallback order:
|   get_nprocs: /sys/devices/system/cpu/online -> /proc/stat -> sched_getaffinity -> 2
|   get_nprocs_conf: /sys/devices/system/cpu/ -> /proc/stat -> sched_getaffinity -> 2
|
| Fixes: 342298278e ("linux: Revert the use of sched_getaffinity on get_nproc")
| Closes: BZ #28865
| Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
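The fallback chain described above can be modelled in user code as "first source that yields a positive count wins, else a constant". The sketch below is an illustration under assumptions, not the glibc implementation: `nprocs_source`, `first_positive`, `source_unavailable` and `nprocs_from_affinity` are hypothetical names, and only the sched_getaffinity step is backed by a real call.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* A source returns a CPU count, or 0 if it cannot provide one.  */
typedef int (*nprocs_source) (void);

/* Model of a source that fails, such as an unreadable sysfs file.  */
static int
source_unavailable (void)
{
  return 0;
}

/* Count the CPUs in the calling thread's affinity mask; 0 on error.  */
static int
nprocs_from_affinity (void)
{
  cpu_set_t set;

  if (sched_getaffinity (0, sizeof set, &set) != 0)
    return 0;
  return CPU_COUNT (&set);
}

/* Try each source in order and return the first positive result.
   If none succeeds, return FALLBACK (2 in the commit message).  */
static int
first_positive (const nprocs_source *sources, size_t n, int fallback)
{
  for (size_t i = 0; i < n; ++i)
    {
      int count = sources[i] ();
      if (count > 0)
	return count;
    }
  return fallback;
}
```

Having a failing source return 0 rather than a fake count is what allows the chain to continue to the next source, which matches the change of the __get_nprocs_sched() error value from 2 to 0.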
* x86: Remove SSSE3 instruction for broadcast in memset.S (SSE2 Only)Noah Goldstein2022-02-071-3/+4
| commit b62ace2740a106222e124cc86956448fa07abf4d
| Author: Noah Goldstein <goldstein.w.n@gmail.com>
| Date: Sun Feb 6 00:54:18 2022 -0600
|
|     x86: Improve vec generation in memset-vec-unaligned-erms.S
|
| Revert usage of 'pshufb' in broadcast logic as it is an SSSE3 instruction and memset.S is restricted to only SSE2 instructions.
* benchtests: Sort benches in MakefileH.J. Lu2022-02-071-19/+110
| | | | Put one bench per line and sort them.
* Benchtests: Add length zero benchmark for memset in bench-memset.cNoah Goldstein2022-02-061-1/+1
| | | | | | Zero is a relevant size for some workloads (roughly 5% of uses for GCC) so we should be testing its performance as well. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86: Improve vec generation in memset-vec-unaligned-erms.SNoah Goldstein2022-02-065-87/+152
| No bug.
|
| Split vec generation into multiple steps. This allows the broadcast in AVX2 to use 'xmm' registers for the L(less_vec) case. This saves an expensive lane-cross instruction and removes the need for 'vzeroupper'.
|
| For SSE2 replace 2x 'punpck' instructions with zero-idiom 'pxor' for byte broadcast.
|
| Results for memset-avx2 small (geomean of N = 20 benchset runs).
|
| size, New Time, Old Time, New / Old
|    0,    4.100,    3.831,     0.934
|    1,    5.074,    4.399,     0.867
|    2,    4.433,    4.411,     0.995
|    4,    4.487,    4.415,     0.984
|    8,    4.454,    4.396,     0.987
|   16,    4.502,    4.443,     0.987
|
| All relevant string/wcsmbs tests are passing.
|
| Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86-64: Add vector tan/tanf to libmvec microbenchmarkSunil K Pandey2022-02-063-0/+8201
| Add vector tan/tanf and input files to libmvec microbenchmark.
|
| libmvec-tan-inputs:
|   90% Normal random distribution
|     range: (-DBL_MAX, DBL_MAX)
|     mean: 0.0
|     sigma: 5.0
|   10% uniform random distribution in range (-1000.0, 1000.0)
|
| libmvec-tanf-inputs:
|   90% Normal random distribution
|     range: (-FLT_MAX, FLT_MAX)
|     mean: 0.0f
|     sigma: 5.0f
|   10% uniform random distribution in range (-1000.0f, 1000.0f)
|
| Reviewed-by: H.J. Lu <hjl.tools@gmail.com>