about summary refs log tree commit diff
path: root/sysdeps
Commit message (Collapse)AuthorAgeFilesLines
...
* x86-64: Use ZMM16-ZMM31 in AVX512 memmove family functionsH.J. Lu2021-03-293-19/+42
| | | | | | Update ifunc-memmove.h to select the function optimized with AVX512 instructions using ZMM16-ZMM31 registers to avoid RTM abort with usable AVX512VL since VZEROUPPER isn't needed at function exit.
* x86-64: Use ZMM16-ZMM31 in AVX512 memset family functionsH.J. Lu2021-03-294-24/+31
| | | | | | | Update ifunc-memset.h/ifunc-wmemset.h to select the function optimized with AVX512 instructions using ZMM16-ZMM31 registers to avoid RTM abort with usable AVX512VL and AVX512BW since VZEROUPPER isn't needed at function exit.
* x86: Add string/memory function tests in RTM regionH.J. Lu2021-03-2912-0/+618
| | | | | | | | At function exit, AVX optimized string/memory functions have VZEROUPPER which triggers RTM abort. When such functions are called inside a transactionally executing RTM region, RTM abort causes severe performance degradation. Add tests to verify that string/memory functions won't cause RTM abort in RTM region.
* x86-64: Add AVX optimized string/memory functions for RTMH.J. Lu2021-03-2952-248/+670
| | | | | | | | | | | | | | | | | Since VZEROUPPER triggers RTM abort while VZEROALL won't, select AVX optimized string/memory functions with xtest jz 1f vzeroall ret 1: vzeroupper ret at function exit on processors with usable RTM, but without 256-bit EVEX instructions to avoid VZEROUPPER inside a transactionally executing RTM region.
* x86-64: Add memcmp family functions with 256-bit EVEXH.J. Lu2021-03-295-4/+467
| | | | | | | Update ifunc-memcmp.h to select the function optimized with 256-bit EVEX instructions using YMM16-YMM31 registers to avoid RTM abort with usable AVX512VL, AVX512BW and MOVBE since VZEROUPPER isn't needed at function exit.
* x86-64: Add memset family functions with 256-bit EVEXH.J. Lu2021-03-296-14/+90
| | | | | | | Update ifunc-memset.h/ifunc-wmemset.h to select the function optimized with 256-bit EVEX instructions using YMM16-YMM31 registers to avoid RTM abort with usable AVX512VL and AVX512BW since VZEROUPPER isn't needed at function exit.
* x86-64: Add memmove family functions with 256-bit EVEXH.J. Lu2021-03-295-11/+104
| | | | | | Update ifunc-memmove.h to select the function optimized with 256-bit EVEX instructions using YMM16-YMM31 registers to avoid RTM abort with usable AVX512VL since VZEROUPPER isn't needed at function exit.
* x86-64: Add strcpy family functions with 256-bit EVEXH.J. Lu2021-03-299-3/+1339
| | | | | | Update ifunc-strcpy.h to select the function optimized with 256-bit EVEX instructions using YMM16-YMM31 registers to avoid RTM abort with usable AVX512VL and AVX512BW since VZEROUPPER isn't needed at function exit.
* x86-64: Add ifunc-avx2.h functions with 256-bit EVEXH.J. Lu2021-03-2924-17/+2995
| | | | | | | | | | Update ifunc-avx2.h, strchr.c, strcmp.c, strncmp.c and wcsnlen.c to select the function optimized with 256-bit EVEX instructions using YMM16-YMM31 registers to avoid RTM abort with usable AVX512VL, AVX512BW and BMI2 since VZEROUPPER isn't needed at function exit. For strcmp/strncmp, prefer AVX2 strcmp/strncmp if Prefer_AVX2_STRCMP is set.
* x86: Set Prefer_No_VZEROUPPER and add Prefer_AVX2_STRCMPH.J. Lu2021-03-293-2/+21
| | | | | | | | | 1. Set Prefer_No_VZEROUPPER if RTM is usable to avoid RTM abort triggered by VZEROUPPER inside a transactionally executing RTM region. 2. Since to compare 2 32-byte strings, 256-bit EVEX strcmp requires 2 loads, 3 VPCMPs and 2 KORDs while AVX2 strcmp requires 1 load, 2 VPCMPEQs, 1 VPMINU and 1 VPMOVMSKB, AVX2 strcmp is faster than EVEX strcmp. Add Prefer_AVX2_STRCMP to prefer AVX2 strcmp family functions.
* linux: Add y2106 support on utimensat testsAdhemerval Zanella2021-03-294-166/+119
| | | | | | | | | | | | The tests are refactored to use a common skeleton that handles whether the underlying filesystem supports 64 bit time, skips 64 bit time tests when the TU only supports 32 bit, and also skip 64 bit time tests larger than 32 unsigned int (y2106) if the system does not support it (MIPSn64 on kernels without statx support). Checked on x86_64-linux-gnu and i686-linux-gnu. I also checked on a mips64el-linux-gnu with 4.1.4 and 5.10.0-4-5kc-malta kernel to verify if the y2106 are indeed skipped.
* linux: Use statx for MIPSn64Adhemerval Zanella2021-03-293-33/+29
| | | | | | | | | | | MIPSn64 kernel ABI for legacy stat uses unsigned 32 bit for second timestamp, which limits the maximum value to y2106. This patch make mips64 use statx as for 32-bit architectures. Thie __cp_stat64_t64_statx is open coded, its usage is solely on fstatat64 and it avoid the need to redefine the name for mips64 (which will call __cp_stat64_statx since its does not use __stat64_t64 internally).
* linux: Disable fstatat64 fallback if __ASSUME_STATX is definedAdhemerval Zanella2021-03-291-16/+40
| | | | | | | | | | | | If the minimum kernel supports statx there is no need to call the fallback stat legacy syscalls. The statx is also called on compat xstat syscall, but different than the fstatat it calls no fallback and it is assumed to be always present. Checked on powerpc-linux-gnu (with and without --enable-kernel=4.11) and on powerpc64-linux-gnu.
* linux: Implement fstatat with __fstatat64_time64Adhemerval Zanella2021-03-291-38/+11
| | | | | | | | | It makes fstatat use __NR_statx, which fix the s390 issue with missing nanoxsecond support on compat stat syscalls (at least on recent kernels) and limits the statx call to only one function (which simplifies the __ASSUME_STATX support). Checked on i686-linux-gnu and on powerpc-linux-gnu.
* x86: Properly disable XSAVE related features [BZ #27605]H.J. Lu2021-03-292-0/+56
| | | | | | | 1. Support GLIBC_TUNABLES=glibc.cpu.hwcaps=-XSAVE. 2. Disable all features which depend on XSAVE: a. If OSXSAVE is disabled by glibc tunables. Or b. If both XSAVE and XSAVEC aren't usable.
* nptl: Remove __libc_allocate_rtsig, __libc_current_sigrtmax, and ↵Adhemerval Zanella2021-03-2632-118/+9
| | | | | | | | __libc_current_sigrtmin The libc version is identical and built with same flags. Checked on x86_64-linux-gnu.
* nptl: Move sigaction to libcAdhemerval Zanella2021-03-2629-58/+0
| | | | | | The libc version is identical and built with same flags. Checked on x86_64-linux-gnu.
* nptl: Remove pthread raise implementationAdhemerval Zanella2021-03-2631-50/+0
| | | | | | | | | | The Linux version already target the current thread by using tgkill along with getpid and gettid. For arm, libpthread does not do a intra PLT since it will call the raise from libc. Checked on x86_64-linux-gnu.
* nptl: Move pthread_kill to libcAdhemerval Zanella2021-03-2661-29/+64
| | | | | | A new 2.34 version is also provided. Checked on x86_64-linux-gnu.
* nptl: Remove pwrite from libpthreadAdhemerval Zanella2021-03-2638-87/+37
| | | | | | | The libc version is identical and built with same flags, it is also uses as the default version. Checked on x86_64-linux-gnu.
* nptl: Remove pread from libpthreadAdhemerval Zanella2021-03-2638-87/+37
| | | | | | | The libc version is identical and built with same flags, it is also uses as the default version. Checked on x86_64-linux-gnu.
* nptl: Remove open from libpthreadAdhemerval Zanella2021-03-2638-120/+15
| | | | | | | The libc version is identical and built with same flags. The libc version is set as the default version. Checked on x86_64-linux-gnu.
* nptl: Remove lseek from libpthreadAdhemerval Zanella2021-03-2637-89/+12
| | | | | | | | | | | The libc version is identical and built with same flags. The libc version is set as the default version. The libpthread compat symbol requires to mask it when building the loader object otherwise ld might complain about a missing versioned symbol (as for alpha). Checked on x86_64-linux-gnu.
* nptl: Remove send from libpthreadAdhemerval Zanella2021-03-2637-72/+20
| | | | | | | | | | | | | The libc version is identical and built with same flags. Both aarch64 and nios2 also requires to export __send and tt was done previously with the HAVE_INTERNAL_SEND_SYMBOL (which forced the symbol creation). All __send callers are internal to libc and the original issue that required the symbol export was due a missing libc_hidden_def. So a compat symbol is added for __send and the libc_hidden_def is defined regardless. Checked on x86_64-linux-gnu and i686-linux-gnu.
* aarch64: Optimize __libc_mtag_tag_zero_regionSzabolcs Nagy2021-03-261-16/+80
| | | | | | | | This is a target hook for memory tagging, the original was a naive implementation. Uses the same algorithm as __libc_mtag_tag_region, but with instructions that also zero the memory. This was not benchmarked on real cpu, but expected to be faster than the naive implementation.
* aarch64: Optimize __libc_mtag_tag_regionSzabolcs Nagy2021-03-261-18/+80
| | | | | | | | This is a target hook for memory tagging, the original was a naive implementation. The optimized version relies on "dc gva" to tag 64 bytes at a time for large allocations and optimizes small cases without adding too many branches. This was not benchmarked on real cpu, but expected to be faster than the naive implementation.
* aarch64: inline __libc_mtag_new_tagSzabolcs Nagy2021-03-263-41/+11
| | | | | This is a common operation when heap tagging is enabled, so inline the instructions instead of using an extern call.
* aarch64: inline __libc_mtag_address_get_tagSzabolcs Nagy2021-03-263-39/+10
| | | | | | | | | | | | This is a common operation when heap tagging is enabled, so inline the instruction instead of using an extern call. The .inst directive is used instead of the name of the instruction (or acle intrinsics) because malloc.c is not compiled for armv8.5-a+memtag architecture, runtime cpu support detection is used. Prototypes are removed from the comments as they were not always correct.
* malloc: Only support zeroing and not arbitrary memset with mtagSzabolcs Nagy2021-03-264-17/+13
| | | | | | | | | | The memset api is suboptimal and does not provide much benefit. Memory tagging only needs a zeroing memset (and only for memory that's sized and aligned to multiples of the tag granule), so change the internal api and the target hooks accordingly. This is to simplify the implementation of the target hook. Reviewed-by: DJ Delorie <dj@redhat.com>
* malloc: Ensure the generic mtag hooks are not usedSzabolcs Nagy2021-03-261-10/+31
| | | | | | | | | | | | | | | | | | Use inline functions instead of macros, because macros can cause unused variable warnings and type conversion issues. We assume these functions may appear in the code but only in dead code paths (hidden by a runtime check), so it's important that they can compile with correct types, but if they are actually used that should be an error. Currently the hooks are only used when USE_MTAG is true which only happens on aarch64 and then the aarch64 specific code is used not this generic header. However followup refactoring will allow the hooks to be used with !USE_MTAG. Note: the const qualifier in the comment was wrong: changing tags is a write operation. Reviewed-by: DJ Delorie <dj@redhat.com>
* S390: Also check vector support in memmove ifunc-selector [BZ #27511]Stefan Liebler2021-03-264-6/+15
| | | | | | | | | | | | | | | The arch13 memmove variant is currently selected by the ifunc selector if the Miscellaneous-Instruction-Extensions Facility 3 facility bit is present, but the function is also using vector instructions. If the vector support is not present, one is receiving an operation exception. Therefore this patch also checks for vector support in the ifunc selector and in ifunc-impl-list.c. Just to be sure, the configure check is now also testing an arch13 vector instruction and an arch13 Miscellaneous-Instruction-Extensions Facility 3 instruction.
* Support for multiple versions in versioned_symbol, compat_symbolFlorian Weimer2021-03-252-3/+86
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This essentially folds compat_symbol_unique functionality into compat_symbol. This change eliminates the need for intermediate aliases for defining multiple symbol versions, for both compat_symbol and versioned_symbol. Some binutils versions do not suport multiple versions per symbol on some targets, so aliases are automatically introduced, similar to what compat_symbol_unique did. To reduce symbol table sizes, a configure check is added to avoid these aliases if they are not needed. The new mechanism works with data symbols as well as function symbols, due to the way an assembler-level redirect is used. It is not compatible with weak symbols for old binutils versions, which is why the definition of __malloc_initialize_hook had to be changed. This is not a loss of functionality because weak symbols do not matter to dynamic linking. The placeholder symbol needs repeating in nptl/libpthread-compat.c now that compat_symbol is used, but that seems more obvious than introducing yet another macro. A subtle difference was that compat_symbol_unique made the symbol global automatically. compat_symbol does not do this, so static had to be removed from the definition of __libpthread_version_placeholder. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* Change how the symbol_version_reference macro is definedFlorian Weimer2021-03-251-0/+38
| | | | | | | | | | | | A subsequent change will require including <config.h> for defining symbol_version_reference. <libc-symbol.h> should not include <config.h> for _ISOMAC, so it cannot define symbol_version_reference anymore, but symbol_version_reference is needed <shlib-compat.h> even for _ISOMAC. Moving the definition of symbol_version_reference to a separate file <libc-symver.h> makes it possible to use a single definition for both cases. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* elf: Fix not compiling ifunc tests that need gcc ifunc supportSamuel Thibault2021-03-241-0/+2
|
* htl: Add missing fork.hSamuel Thibault2021-03-241-0/+20
| | | | | | | | 2b47727c68b6 ("posix: Consolidate register-atfork") introduced a fork.h header to declare the atfork unregister hook, but was missing adding it for htl. This fixes tst-atfork2.
* hurd: handle EINTR during critical sectionsSamuel Thibault2021-03-2322-2/+102
| | | | | | | | | During critical sections, signal handling is deferred and thus RPCs return EINTR, even if SA_RESTART is set. We thus have to restart the whole critical section in that case. This also adds HURD_CRITICAL_UNLOCK in the cases where one wants to break the section in the middle.
* tst: Add test for sigtimedwaitLukasz Majewski2021-03-232-1/+63
| | | | | | | | | | | | | This change adds new test to assess sigtimedwait's timeout related functionality - the sigset_t is configured for SIGUSR1, which will not be triggered, so sigtimedwait just waits for timeout. To be more specific - two use cases are checked: - if sigtimedwait times out immediately when passed struct timespec has zero values of tv_nsec and tv_sec. - if sigtimedwait times out after timeout specified in passed argument Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* tst: Add test for ntp_gettimexLukasz Majewski2021-03-232-1/+23
| | | | | | This test is a wrapper on tst-ntp_gettime test. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* tst: Add test for ntp_gettimeLukasz Majewski2021-03-232-1/+57
| | | | | | | This code provides test to check if time on target machine is properly read via ntp_gettime syscall. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* fix: Always export ntp_gettimex functionLukasz Majewski2021-03-231-1/+1
| | | | | | After this patch applied the ntp_gettimex function is always declared in the sys/timex.h header. Currently it is not when __REDIRECT_NTH is defined (i.e. in ARM 32 bit port).
* nptl: Remove MULTI_PAGE_ALIASING [BZ #23554]H.J. Lu2021-03-192-24/+0
| | | | | MULTI_PAGE_ALIASING was introduced to mitigate an aliasing issue on Pentium 4. It is no longer needed for processors after Pentium 4.
* signal: Add __libc_sigactionAdhemerval Zanella2021-03-1810-17/+12
| | | | | | | | The generic implementation basically handle the system agnostic logic (filtering out the invalid signals) while the __libc_sigaction is the function with implements the system and architecture bits. Checked on x86_64-linux-gnu and i686-linux-gnu.
* nptl: Move system to libcAdhemerval Zanella2021-03-1825-25/+0
| | | | | | The libc version is identical and built with same flags. Checked on x86_64-linux-gnu.
* nptl: Move fcntl from libpthreadAdhemerval Zanella2021-03-1827-101/+0
| | | | | | The libc version is identical and built with same flags. Checked on x86_64-linux-gnu.
* nptl: Remove sendmsg from libpthreadAdhemerval Zanella2021-03-1829-29/+0
| | | | | | The libc version is identical and built with same flags. Checked on x86_64-linux-gnu.
* nptl: Remove recvmsg from libpthreadAdhemerval Zanella2021-03-1829-29/+0
| | | | | | The libc version is identical and built with same flags. Checked on x86_64-linux-gnu.
* nptl: Remove sigwait from libpthreadAdhemerval Zanella2021-03-1829-29/+0
| | | | | | The libc version is identical and built with same flags. Checked on x86_64-linux-gnu.
* nptl: Remove tcdrain from libpthreadAdhemerval Zanella2021-03-1829-29/+0
| | | | | | The libc version is identical and built with same flags. Checked on x86_64-linux-gnu.
* nptl: Remove pause from libpthreadAdhemerval Zanella2021-03-1829-29/+0
| | | | | | The libc version is identical and built with same flags. Checked on x86_64-linux-gnu.
* nptl: Remove msync from libpthreadAdhemerval Zanella2021-03-1829-29/+0
| | | | | | The libc version is identical and built with same flags. Checked on x86_64-linux-gnu.