Commit message  Author  Age  Files  Lines
* x86: Optimize strlen-evex.SNoah Goldstein2021-04-191-264/+317
| | | | | | | | | | No bug. This commit optimizes strlen-evex.S. The optimizations are mostly small things, but they add up to roughly a 10-30% performance improvement for strlen. The results for strnlen are a bit more ambiguous. test-strlen, test-strnlen, test-wcslen, and test-wcsnlen are all passing. Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
* Reindent string/test-memmove.cH.J. Lu2021-04-191-15/+15
|
* x86: Expand test-memset.c and bench-memset.cNoah Goldstein2021-04-192-7/+19
| | | | | | | | | No bug. This commit adds test cases and benchmarks for page crossing and for memset to the end of the page without crossing. In test-memset.c, this commit also adds sentinels at the start and end of tstbuf to test for overwrites. Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
* x86: Optimize less_vec evex and avx512 memset-vec-unaligned-erms.SNoah Goldstein2021-04-195-27/+74
| | | | | | | | No bug. This commit adds an optimized case for the less_vec memset case that uses the avx512vl/avx512bw mask store, avoiding the excessive branches. test-memset and test-wmemset are passing. Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
* x86-64: Require BMI2 for strchr-avx2.SH.J. Lu2021-04-192-5/+11
| Since strchr-avx2.S updated by
|
| commit 1f745ecc2109890886b161d4791e1406fdfc29b8
| Author: noah <goldstein.w.n@gmail.com>
| Date:   Wed Feb 3 00:38:59 2021 -0500
|
|     x86-64: Refactor and improve performance of strchr-avx2.S
|
| uses sarx:
|
| c4 e2 72 f7 c0    sarx   %ecx,%eax,%eax
|
| for strchr-avx2 family functions, require BMI2 in ifunc-impl-list.c and
| ifunc-avx2.h.
* x86-64: Require BMI2 for __strlen_evex and __strnlen_evexH.J. Lu2021-04-191-2/+4
| Since __strlen_evex and __strnlen_evex added by
|
| commit 1fd8c163a83d96ace1ff78fa6bac7aee084f6f77
| Author: H.J. Lu <hjl.tools@gmail.com>
| Date:   Fri Mar 5 06:24:52 2021 -0800
|
|     x86-64: Add ifunc-avx2.h functions with 256-bit EVEX
|
| use sarx:
|
| c4 e2 6a f7 c0    sarx   %edx,%eax,%eax
|
| require BMI2 for __strlen_evex and __strnlen_evex in ifunc-impl-list.c.
| ifunc-avx2.h already requires BMI2 for EVEX implementation.
* benchtests: Fix name of exp10f benchmark variantSiddhesh Poyarekar2021-04-181-1/+1
| | | | Variant names don't accept brackets.
* benchtests: Fix pthread-locks test to produce valid jsonSiddhesh Poyarekar2021-04-182-8/+11
| | | | | | | | | The benchtests json allows {function {variant}} categorization of results whereas the pthread-locks tests had {function {variant {subvariant}}}, which broke validation. Fix that by serializing the subvariants as variant-subvariant. Also update the schema to recognize the new benchmark attributes after fixing the naming conventions.
* x86: Expanding test-memmove.c, test-memcpy.c, bench-memcpy-large.cnoah2021-04-163-55/+82
| | | | | | | | | | | | No bug. This commit expands the range of tests and benchmarks for memmove and memcpy. The test expansion mostly increases the maximum size, increases the number of unique alignments tested, and tests both source < destination and vice versa. The benchmark expansion just increases the number of unique alignments. test-memcpy, test-memccpy, test-mempcpy, test-memmove, and tst-memmove-overflow all pass. Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
* Set the retain attribute on _elf_set_element if CC supports [BZ #27492]Fangrui Song2021-04-1613-4/+162
| So that text_set_element/data_set_element/bss_set_element defined
| variables will be retained by the linker.
|
| Note: 'used' and 'retain' are orthogonal: 'used' makes sure the variable
| will not be optimized out; 'retain' prevents section garbage collection
| if the linker supports SHF_GNU_RETAIN.
|
| GNU ld 2.37 and LLD 13 will support -z start-stop-gc, which allows C
| identifier name sections to be GCed even if there are live
| __start_/__stop_ references.
|
| Without the change, there are some static linking problems, e.g.
| _IO_cleanup (libio/genops.c) may be discarded by ld --gc-sections, so
| stdout is not flushed on exit.
|
| Note: GCC may warn that the 'retain' attribute is ignored while
| __has_attribute(retain) is 1
| (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99587).
|
| Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86: Update large memcpy case in memmove-vec-unaligned-erms.Snoah2021-04-161-73/+265
| | | | | | | | | | | | | | No bug. This commit updates the large memcpy case (no overlap). The update is to perform memcpy on either 2 or 4 contiguous pages at once. This 1) helps to alleviate the effects of false memory aliasing when destination and source have a close 4k alignment and 2) in most cases and for most DRAM units is a modestly more efficient access pattern. These changes are a clear performance improvement for VEC_SIZE=16/32, though more ambiguous for VEC_SIZE=64. test-memcpy, test-memccpy, test-mempcpy, test-memmove, and tst-memmove-overflow all pass. Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
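The access pattern described above (touching several contiguous pages in one pass) can be sketched in portable C. This is only an illustration of the idea, not the assembly in memmove-vec-unaligned-erms.S; the function name, the 4096-byte page size, the 64-byte chunk, and the choice of 4 pages are all assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative constants (assumptions, not taken from the commit). */
enum { DEMO_PAGE = 4096, DEMO_CHUNK = 64, DEMO_PAGES = 4 };

/* Copy n bytes from src to dst (no overlap).  Full blocks of
   DEMO_PAGES pages are copied by visiting the same offset in each
   page before moving on to the next offset; the tail uses memcpy.  */
void *demo_page_interleaved_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    const size_t block = (size_t)DEMO_PAGES * DEMO_PAGE;

    while (n >= block) {
        for (size_t off = 0; off < DEMO_PAGE; off += DEMO_CHUNK)
            for (size_t p = 0; p < DEMO_PAGES; p++)
                memcpy(d + p * DEMO_PAGE + off,
                       s + p * DEMO_PAGE + off, DEMO_CHUNK);
        d += block;
        s += block;
        n -= block;
    }
    memcpy(d, s, n);  /* remaining bytes, fewer than one block */
    return dst;
}
```

The result is byte-for-byte the same as a plain memcpy; only the order in which addresses are visited differs, which is the property the commit message credits for the reduced aliasing.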
* powerpc: Add missing registers to clobbers list for syscalls [BZ #27623]Matheus Castanho2021-04-161-3/+6
| Some registers that can be clobbered by the kernel during a syscall are
| not listed on the clobbers list in
| sysdeps/unix/sysv/linux/powerpc/sysdep.h.
|
| For syscalls using sc:
|     - XER is zeroed by the kernel on exit
|
| For syscalls using scv:
|     - XER is zeroed by the kernel on exit
|     - Different from the sc case, most CR fields can be clobbered
|       (according to the ELF ABI and the Linux kernel's syscall ABI
|       for powerpc (linux/Documentation/powerpc/syscall64-abi.rst))
|
| The same should apply to vsyscalls, which effectively execute a function
| call but are not currently adding these registers as clobbers either.
|
| These are likely not causing issues today, but they should be added to
| the clobbers list just in case things change on the kernel side in the
| future.
|
| Reported-by: Nicholas Piggin <npiggin@gmail.com>
| Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
| Reviewed-by: Raphael M Zinsly <rzinsly@linux.ibm.com>
* misc: syslog: Use static const for AF_UNIX addressAdhemerval Zanella2021-04-151-5/+6
| | | | Checked on x86_64-linux-gnu.
* misc: syslog: Use CLOC_EXEC with _PATH_CONSOLE (BZ #17145)Adhemerval Zanella2021-04-151-1/+2
| | | | | | | syslog opens '/dev/console' for LOG_CONS without O_CLOEXEC, so the file descriptor might leak in multithreaded programs that call fork. Checked on x86_64-linux-gnu.
* misc: syslog: Assume MSG_NOSIGNAL support (BZ #17144)Adhemerval Zanella2021-04-152-49/+4
| | | | | | | | MSG_NOSIGNAL was added in POSIX 2008 and Hurd seems to support it. The SIGPIPE handling also makes the implementation not thread-safe (due to the sigaction usage). Checked on x86_64-linux-gnu.
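As a rough illustration of why MSG_NOSIGNAL removes the need for SIGPIPE juggling: with the flag, a send to a closed peer fails with EPIPE instead of raising the signal, so no process-wide sigaction call is involved. The helper names below are hypothetical, and the socketpair setup exists only to make the example self-contained.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Send buf on fd; a vanished peer yields -1/EPIPE rather than SIGPIPE.  */
ssize_t demo_send_nosignal(int fd, const void *buf, size_t len)
{
    return send(fd, buf, len, MSG_NOSIGNAL);
}

/* Create a connected AF_UNIX stream pair, close one end, and send on
   the other.  Returns the errno seen by send, 0 if send unexpectedly
   succeeded, or -1 if the socket pair could not be created.  */
int demo_errno_after_peer_close(void)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    close(sv[1]);
    ssize_t n = demo_send_nosignal(sv[0], "x", 1);
    int err = n < 0 ? errno : 0;
    close(sv[0]);
    return err;
}
```

On Linux the helper is expected to report EPIPE while the calling process keeps running with its default SIGPIPE disposition untouched.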
* misc: syslog: Use bool for connectedAdhemerval Zanella2021-04-151-3/+3
| | | | Checked on x86_64-linux-gnu.
* posix: Add wait3 testsAdhemerval Zanella2021-04-154-191/+235
| | | | | | | | | The tst-wait4 is moved to common file and used for wait3 tests. Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* time: Add 64 bit tests for getdate / getdate_rAdhemerval Zanella2021-04-151-61/+92
| | | | | | | | The test is also converted to use libsupport. Checked on i686-linux-gnu and x86_64-linux-gnu. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* time: Add basic timespec_get testsAdhemerval Zanella2021-04-152-1/+42
| | | | | | Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* time: Add timegm/timelocal basic testsAdhemerval Zanella2021-04-152-1/+96
| | | | | | Checked on i686-linux-gnu and x86_64-linux-gnu. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* time: Add gmtime/gmtime_r testsAdhemerval Zanella2021-04-152-1/+125
| | | | | | Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* time: Add getitimer and setitimer basic testsAdhemerval Zanella2021-04-152-1/+176
| | | | | | Checked on i686-linux-gnu and x86_64-linux-gnu. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* io: Use temporary directory and file for ftwtest-shAdhemerval Zanella2021-04-151-123/+119
| | | | | | | | This allows it to be run in parallel. Checked on x86_64-linux-gnu. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* io: Add basic tests for utimensatAdhemerval Zanella2021-04-152-0/+71
| | | | | | Checked on x86_64-linux-gnu and i686-linux-gnu Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* linux: Add lutimes testAdhemerval Zanella2021-04-157-5/+64
| | | | | | | | It uses stat to compare against the values set by lutimes. Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* linux: Add futimes testAdhemerval Zanella2021-04-152-0/+47
| | | | | | | | It uses stat to compare against the values set by futimes. Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* io: Move file timestamps tests out of LinuxAdhemerval Zanella2021-04-156-2/+5
| | | | | | | | | | | | Now that libsupport abstracts support that may be missing on Linux (either due to a filesystem limitation that can't handle 64-bit timestamps or architectures that do not handle values larger than unsigned 32-bit values), the tests can be made generic. Checked on x86_64-linux-gnu and i686-linux-gnu. I also built the tests for i686-gnu. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* elf: Fix missing include in test case [BZ #27136]Szabolcs Nagy2021-04-151-0/+1
| | | | | | | Broken test was introduced in commit 8f85075a2e9c26ff7486d4bbaf358999807d215c elf: Add a DTV setup test [BZ #27136]
* s390: Update ulpsStefan Liebler2021-04-151-3/+3
| | | | | Required after 9acda61d94acc "Fix the inaccuracy of j0f/j1f/y0f/y1f [BZ #14469, #14470, #14471, #14472]".
* i386: Remove lazy tlsdesc relocation related codeSzabolcs Nagy2021-04-153-391/+2
| Like in commit e75711ebfa976d5468ec292282566a18b07e4d67 for x86_64,
| remove unused lazy tlsdesc relocation processing code:
|
|   _dl_tlsdesc_resolve_abs_plus_addend
|   _dl_tlsdesc_resolve_rel
|   _dl_tlsdesc_resolve_rela
|   _dl_tlsdesc_resolve_hold
* x86_64: Remove lazy tlsdesc relocation related codeSzabolcs Nagy2021-04-154-219/+2
| | | | | _dl_tlsdesc_resolve_rela and _dl_tlsdesc_resolve_hold are only used for lazy tlsdesc relocation processing which is no longer supported.
* i386: Avoid lazy relocation of tlsdesc [BZ #27137]Szabolcs Nagy2021-04-151-42/+34
| | | | | | | | | | | Lazy tlsdesc relocation is racy because the static tls optimization and tlsdesc management operations are done without holding the dlopen lock. This is similar to the commit b7cf203b5c17dd6d9878537d41e0c7cc3d270a67 for aarch64, but it fixes a different race: bug 27137. On i386 the code is a bit more complicated than on x86_64 because both rel and rela relocs are supported.
* x86_64: Avoid lazy relocation of tlsdesc [BZ #27137]Szabolcs Nagy2021-04-151-5/+14
| | | | | | | | | | | | | Lazy tlsdesc relocation is racy because the static tls optimization and tlsdesc management operations are done without holding the dlopen lock. This is similar to the commit b7cf203b5c17dd6d9878537d41e0c7cc3d270a67 for aarch64, but it fixes a different race: bug 27137. Another issue is that ld auditing ignores DT_BIND_NOW and thus tries to relocate tlsdesc lazily, but that does not work in a BIND_NOW module due to missing DT_TLSDESC_PLT. Unconditionally relocating tlsdesc at load time fixes this bug 27721 too.
* elf: Refactor _dl_update_slotinfo to avoid use after freeSzabolcs Nagy2021-04-151-16/+5
| | | | | | | | | | | | map is not valid to access here because it can be freed by a concurrent dlclose: during tls access (via __tls_get_addr) _dl_update_slotinfo is called without holding dlopen locks. So don't check the modid of map. The map == 0 and map != 0 code paths can be shared (avoiding the dtv resize in case of map == 0 is just an optimization: larger dtv than necessary would be fine too). Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* elf: Fix comments and logic in _dl_add_to_slotinfoSzabolcs Nagy2021-04-151-10/+1
| | | | | | | | | | | | | | | Since commit a509eb117fac1d764b15eba64993f4bdb63d7f3c Avoid late dlopen failure due to scope, TLS slotinfo updates [BZ #25112] the generation counter update is not needed in the failure path. That commit ensures allocation in _dl_add_to_slotinfo happens before the demarcation point in dlopen (it is called twice, first time is for allocation only where dlopen can still be reverted on failure, then second time actual dtv updates are done which then cannot fail). Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* elf: Add a DTV setup test [BZ #27136]Szabolcs Nagy2021-04-153-1/+109
| | | | | | | | | | | The test dlopens a large number of modules with TLS; they are reused from an existing test. The test relies on the reuse of slotinfo entries after dlclose; without bug 27135 fixed, this needs a failing dlopen. With a slotinfo list that has non-monotone increasing generation counters, bug 27136 can trigger. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* elf: Fix a DTV setup issue [BZ #27136]Szabolcs Nagy2021-04-151-1/+1
| | | | | | | | | | | The max modid is a valid index in the dtv; it should not be skipped. The bug is observable if the last module has modid == 64 and its generation is the same as or less than the max generation of the previous modules. Then dtv[0].counter implies dtv[64] is initialized but it isn't. Fixes bug 27136. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* ARC: Update ulpsVineet Gupta2021-04-142-25/+29
| | | | | | Needed after 43576de04afc6 Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
* Remove PR_TAGGED_ADDR_ENABLE from sys/prctl.hSzabolcs Nagy2021-04-141-4/+0
| | | | | | | | | | | | | | The value of PR_TAGGED_ADDR_ENABLE was incorrect in the installed headers, and the prctl command macros needed for it to be useful (PR_SET_TAGGED_ADDR_CTRL) were missing. Linux headers have had the definitions since 5.4, so they are widely available and we don't need to repeat them. The remaining definitions are from Linux 5.10. To build glibc with --enable-memory-tagging, Linux 5.4 headers and binutils 2.33.1 or newer are needed. Reviewed-by: DJ Delorie <dj@redhat.com>
* linux: sysconf: Use a more explicit maximum_ARG_MAXAdhemerval Zanella2021-04-131-1/+1
|
* linux: sysconf: limit _SC_MAX_ARG to 6 MiB (BZ #25305)Michal Nazarewicz2021-04-131-1/+7
| Since Linux 4.13, the kernel limits the maximum command line arguments
| length to 6 MiB [1].  Normally the limit is still a quarter of the
| maximum stack size, but if that limit exceeds 6 MiB it is clamped down.
|
| glibc's __sysconf implementation for the Linux platform is not aware of
| this limitation, and for stack sizes of over 24 MiB it returns a higher
| ARG_MAX than Linux will actually accept.  This can be verified by
| executing the following application on Linux 4.13 or newer:
|
|     #include <stdio.h>
|     #include <string.h>
|     #include <sys/resource.h>
|     #include <sys/time.h>
|     #include <unistd.h>
|
|     int main(void) {
|             const struct rlimit rlim = { 40 * 1024 * 1024,
|                                          40 * 1024 * 1024 };
|             if (setrlimit(RLIMIT_STACK, &rlim) < 0) {
|                     perror("setrlimit: RLIMIT_STACK");
|                     return 1;
|             }
|
|             printf("ARG_MAX : %8ld\n", sysconf(_SC_ARG_MAX));
|             printf("63 * 100 KiB: %8ld\n", 63L * 100 * 1024);
|             printf("6 MiB : %8ld\n", 6L * 1024 * 1024);
|
|             char str[100 * 1024], *argv[64], *envp[1];
|             memset(&str, 'A', sizeof str);
|             str[sizeof str - 1] = '\0';
|             for (size_t i = 0; i < sizeof argv / sizeof *argv - 1; ++i) {
|                     argv[i] = str;
|             }
|             argv[sizeof argv / sizeof *argv - 1] = envp[0] = 0;
|
|             execve("/bin/true", argv, envp);
|             perror("execve");
|             return 1;
|     }
|
| On affected systems the program will report ARG_MAX as 10 MiB, but
| despite that, executing /bin/true with a bit over 6 MiB of command line
| arguments will fail with an E2BIG error.  The expected result is that
| ARG_MAX is reported as 6 MiB.
|
| Update the __sysconf function to clamp the ARG_MAX value to 6 MiB if it
| would otherwise exceed it.  This resolves bug #25305, which was marked
| WONTFIX as the suggested solution was to cap ARG_MAX at 128 KiB.
|
| As an aside and point of comparison, bionic (a libc implementation for
| Android systems) decided to resolve this issue by always returning
| 128 KiB, ignoring any potential xargs regressions [2].
|
| On older kernels this results in returning an overly conservative value,
| but that's a safer option than being aggressive and returning an invalid
| value on recent systems.  It's also worth noting that at this point all
| supported Linux releases have the 6 MiB barrier, so only someone running
| an unsupported kernel version would get an incorrectly truncated result.
|
| Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
| [1] See https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=da029c11e6b12f321f36dac8771e833b65cec962
| [2] See https://android.googlesource.com/platform/bionic/+/baed51ee3a13dae4b87b11870bdf7f10bdc9efc1
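The clamping rule in this message (a quarter of the stack limit, capped at 6 MiB) can be written as a small pure function. This is a sketch rather than the sysconf.c code: the function and constant names are hypothetical, and the 128 KiB lower bound is an assumption about the pre-existing minimum that the message does not spell out.

```c
#include <stdint.h>

enum {
    DEMO_LEGACY_ARG_MAX  = 128 * 1024,      /* assumed lower bound */
    DEMO_MAXIMUM_ARG_MAX = 6 * 1024 * 1024  /* 6 MiB cap from the message */
};

/* Given the soft RLIMIT_STACK value in bytes, return the ARG_MAX that
   would be reported: stack / 4, kept within [128 KiB, 6 MiB].  */
long demo_arg_max(uint64_t stack_limit)
{
    uint64_t limit = stack_limit / 4;

    if (limit < DEMO_LEGACY_ARG_MAX)
        limit = DEMO_LEGACY_ARG_MAX;
    if (limit > DEMO_MAXIMUM_ARG_MAX)
        limit = DEMO_MAXIMUM_ARG_MAX;
    return (long)limit;
}
```

With the 40 MiB stack limit from the test program, the quarter rule alone gives 10 MiB and the cap brings it down to 6 MiB; the cap starts to apply at a 24 MiB stack limit.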
* misc: syslog: Fix calls to openlog() with LOG_KERN facility (BZ #3604)Dan Raymond2021-04-131-3/+3
| | | | | | | | | | | | | | | | | | POSIX states for syslog [1]: "Values of the priority argument are formed by OR'ing together a severity-level value and an optional facility value. If no facility value is specified, the current default facility value is used." So the patch fixes an existing violation of the openlog interface contract, where it ignores the facility argument when the value is zero. It allows the use of LOG_KERN by calling openlog prior to syslog usage. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> [1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/syslog.html
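The root of the problem is that LOG_KERN has the value zero, so a "non-zero means a facility was given" test cannot tell it apart from "no facility". The sketch below contrasts the two checks with hypothetical helper functions; it illustrates the logic described above and is not copied from syslog.c.

```c
#include <syslog.h>

/* Check that treats a zero facility as "not specified": LOG_KERN
   (value 0) can never replace the current facility.  */
int demo_choose_facility_old(int current, int requested)
{
    if (requested != 0 && (requested & ~LOG_FACMASK) == 0)
        return requested;
    return current;
}

/* Check that accepts any well-formed facility, including LOG_KERN.  */
int demo_choose_facility_fixed(int current, int requested)
{
    if ((requested & ~LOG_FACMASK) == 0)
        return requested;
    return current;
}
```

Under the second check, a call such as openlog ("ident", 0, LOG_KERN) would select the kernel facility for later syslog calls that pass only a severity.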
* s390: Update ulpsAdhemerval Zanella2021-04-131-1/+1
| | | | | Required after 43576de04afc6 "Improve the accuracy of tgamma (BZ #26983)"
* i386: Update ulpsAdhemerval Zanella2021-04-132-4/+4
| | | | | Required after 43576de04afc6 "Improve the accuracy of tgamma (BZ #26983)"
* Improve documentation for malloc etc. (BZ#27719)Paul Eggert2021-04-138-90/+136
| Cover key corner cases (e.g., whether errno is set) that are well
| settled in glibc, fix some examples to avoid integer overflow, and
| update some other dated examples (code needed for K&R C, e.g.).
|
| * manual/charset.texi (Non-reentrant String Conversion):
| * manual/filesys.texi (Symbolic Links):
| * manual/memory.texi (Allocating Cleared Space):
| * manual/socket.texi (Host Names):
| * manual/string.texi (Concatenating Strings):
| * manual/users.texi (Setting Groups):
| Use reallocarray instead of realloc, to avoid integer overflow issues.
| * manual/filesys.texi (Scanning Directory Content):
| * manual/memory.texi (The GNU Allocator, Hooks for Malloc):
| * manual/tunables.texi:
| Use code font for 'malloc' instead of roman font.
| (Symbolic Links): Don't assume readlink return value fits in 'int'.
| * manual/memory.texi (Memory Allocation and C, Basic Allocation)
| (Malloc Examples, Alloca Example):
| * manual/stdio.texi (Formatted Output Functions):
| * manual/string.texi (Concatenating Strings, Collation Functions):
| Omit pointer casts that are needed only in ancient K&R C.
| * manual/memory.texi (Basic Allocation):
| Say that malloc sets errno on failure.
| Say "convert" rather than "cast", since casts are no longer needed.
| * manual/memory.texi (Basic Allocation):
| * manual/string.texi (Concatenating Strings):
| In examples, use C99 declarations after statements for brevity.
| * manual/memory.texi (Malloc Examples): Add portability notes for
| malloc (0), errno setting, and PTRDIFF_MAX.
| (Changing Block Size): Say that realloc (p, 0) acts like
| (p ? (free (p), NULL) : malloc (0)).
| Add xreallocarray example, since other examples can use it.
| Add portability notes for realloc (0, 0), realloc (p, 0), PTRDIFF_MAX,
| and improve notes for reallocating to the same size.
| (Allocating Cleared Space): Reword now-confusing discussion
| about replacement, and xref "Replacing malloc".
| * manual/stdio.texi (Formatted Output Functions):
| Don't assume message size fits in 'int'.
| * manual/string.texi (Concatenating Strings):
| Fix undefined behavior involving arithmetic on a freed pointer.
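To make the integer overflow concern concrete, here is a sketch of the check that reallocarray exists to perform: refuse the request when nmemb * size would wrap around, instead of silently allocating a too-small block. The helper name is hypothetical and the code is a portable stand-in, not the glibc implementation or the manual's xreallocarray example.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Like realloc (p, nmemb * size), but fail with ENOMEM when the
   multiplication would overflow size_t.  On failure the original
   block is left untouched.  */
void *demo_checked_reallocarray(void *p, size_t nmemb, size_t size)
{
    if (size != 0 && nmemb > SIZE_MAX / size) {
        errno = ENOMEM;
        return NULL;
    }
    return realloc(p, nmemb * size);
}
```

With plain realloc (p, n * sizeof *p), a huge n can wrap the product to a small number and the call then succeeds with a block far smaller than the caller believes it has. The zero-size cases are deliberately left to realloc here, since, as the message notes, their behavior is a portability matter of its own.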
* linux: always update select timeout (BZ #27706)Adhemerval Zanella2021-04-122-2/+32
| | | | | | The timeout should be updated even on failure for time64 support. Checked on i686-linux-gnu.
* linux: Normalize and return timeout on select (BZ #27651)Adhemerval Zanella2021-04-124-10/+54
| | | | | | | | | | | | | | | | | | | | The commit 2433d39b697, which added time64 support to select, changed the function to use __NR_pselect6 (or __NR_pselect6_time64) on all architectures. However, on architectures where the symbol was implemented with __NR_select, the kernel normalizes the passed timeout instead of returning EINVAL. For instance, the input timeval { 0, 5000000 } is interpreted as { 5, 0 }. As indicated by BZ #27651, this semantic seems to be expected, and changing it results in some performance issues (most likely the program does not check the return code and keeps issuing select with an unnormalized tv_usec argument). To avoid different semantics depending on which syscall the architecture used, select now always normalizes the timeout input. This is a slight change for some ABIs (for instance aarch64). Checked on x86_64-linux-gnu and i686-linux-gnu.
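The normalization described above amounts to carrying whole seconds out of tv_usec. The following sketch uses a hypothetical struct and function name so that it stays independent of the platform's timeval layout; rejecting negative fields and overflow with -1 is a simplification chosen for the example, not necessarily what the glibc wrapper does.

```c
#include <stdint.h>

/* Stand-in for struct timeval with fixed-width fields (hypothetical).  */
struct demo_timeval {
    int64_t tv_sec;
    int64_t tv_usec;
};

/* Move whole seconds from tv_usec into tv_sec so that
   0 <= tv_usec < 1000000.  Returns 0 on success, or -1 if a field is
   negative or the carry would overflow tv_sec.  */
int demo_normalize_timeout(struct demo_timeval *tv)
{
    const int64_t usec_per_sec = 1000000;

    if (tv->tv_sec < 0 || tv->tv_usec < 0)
        return -1;

    int64_t carry = tv->tv_usec / usec_per_sec;
    if (tv->tv_sec > INT64_MAX - carry)
        return -1;

    tv->tv_sec += carry;
    tv->tv_usec %= usec_per_sec;
    return 0;
}
```

Applied to the example from the message, { 0, 5000000 } becomes { 5, 0 }.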
* libsupport: Add support_select_normalizes_timeoutAdhemerval Zanella2021-04-123-0/+34
| | | | It will be used on a select() test.
* libsupport: Add support_select_modifies_timeoutAdhemerval Zanella2021-04-123-0/+34
| | | | It will be used on a select() test.
* Fix SXID_ERASE behavior in setuid programs (BZ #27471)Siddhesh Poyarekar2021-04-122-30/+52
| When parse_tunables tries to erase a tunable marked as SXID_ERASE for
| setuid programs, it ends up setting the envvar string iterator
| incorrectly, because of which it may parse the next tunable
| incorrectly.  Given that currently the implementation allows malformed
| and unrecognized tunables to pass through, it may even allow SXID_ERASE
| tunables to go through.
|
| This change revamps the SXID_ERASE implementation so that:
|
| - Only valid tunables are written back to the tunestr string, because
|   of which children of SXID programs will only inherit a clean list of
|   identified tunables that are not SXID_ERASE.
|
| - Unrecognized tunables get scrubbed off from the environment and
|   subsequently from the child environment.
|
| - This has the side-effect that a tunable that is not identified by
|   the setxid binary will not be passed on to a non-setxid child even
|   if the child could have identified that tunable.  This may break
|   applications that expect this behaviour, but expecting such tunables
|   to cross the SXID boundary is wrong.
|
| Reviewed-by: Carlos O'Donell <carlos@redhat.com>
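The write-back behaviour listed above can be sketched as an in-place filter over a "name=value:name=value" string: only entries that are well-formed, recognized, and not marked for erasure are copied back. Everything here is hypothetical (the table, the tunable names, the DEMO_ levels, and the function names); it is meant to show the separate read and write positions, not to mirror dl-tunables.c.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical security levels, loosely modelled on the message.  */
enum demo_level { DEMO_SXID_ERASE, DEMO_SXID_IGNORE, DEMO_NONE };

struct demo_tunable {
    const char *name;
    enum demo_level level;
};

/* Hypothetical table of recognized tunables.  */
static const struct demo_tunable demo_table[] = {
    { "demo.malloc.check",     DEMO_SXID_ERASE  },
    { "demo.malloc.arena_max", DEMO_SXID_IGNORE },
    { "demo.cpu.hwcaps",       DEMO_NONE        },
};

static const struct demo_tunable *demo_lookup(const char *name, size_t len)
{
    for (size_t i = 0; i < sizeof demo_table / sizeof demo_table[0]; i++)
        if (strlen(demo_table[i].name) == len
            && memcmp(demo_table[i].name, name, len) == 0)
            return &demo_table[i];
    return NULL;
}

/* Rewrite tunestr in place, keeping only well-formed entries whose
   name is recognized and not marked DEMO_SXID_ERASE.  The write
   position never passes the read position, so no scratch buffer is
   needed.  */
void demo_scrub_tunables(char *tunestr)
{
    const char *in = tunestr;
    char *out = tunestr;

    while (*in != '\0') {
        size_t entry_len = strcspn(in, ":");
        const char *eq = memchr(in, '=', entry_len);
        const struct demo_tunable *t =
            eq != NULL ? demo_lookup(in, (size_t)(eq - in)) : NULL;

        if (t != NULL && t->level != DEMO_SXID_ERASE) {
            if (out != tunestr)
                *out++ = ':';
            memmove(out, in, entry_len);
            out += entry_len;
        }

        in += entry_len;   /* step over the entry ...        */
        if (*in == ':')
            in++;          /* ... and its separator, if any  */
    }
    *out = '\0';
}
```

Keeping the read and write cursors separate is what avoids the kind of iterator slip the message describes: dropping an entry only advances the read cursor, so the next entry is always parsed from its true start.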