about summary refs log tree commit diff
path: root/sysdeps
Commit message (Collapse)AuthorAgeFilesLines
* linux: Consolidate fstatvfs implementationsAdhemerval Zanella2021-02-117-46/+25
| | | | | | | | | | | There is no need to handle ENOSYS on fstatfs64 call, required only for alpha (where is already fallbacks to fstatfs). The wordsize internal_statvfs64.c is removed, since how the LFS support is provided by fstatvfs64.c (used on 64-bit architectures as well). Checked on x86_64-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* linux: Consolidate statfs implementationsAdhemerval Zanella2021-02-119-81/+125
| | | | | | | | | | | | | | | | | | | | | | | | The __NR_statfs64 syscall is supported on all architectures but aarch64, mips64, riscv64, and x86_64. And newer ABIs also uses the new statfs64 interface (where the struct size is used as second argument). So the default implementation now uses: 1. __NR_statfs64 for non-LFS call and handle overflow directly There is no need to handle __NR_statfs since all architectures that only support are LFS only. 2. __NR_statfs if defined or __NR_statfs64 otherwise for LFS call. Alpha is the only outlier, since it is a 64-bit architecture which provides non-LFS interface and only provides __NR_statfs64 on newer kernels (v5.1+). Checked on x86_64-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* linux: Consolidate fstatfs implementationsAdhemerval Zanella2021-02-119-73/+126
| | | | | | | | | | | | | | | | | | | | | | | | The __NR_fstatfs64 syscall is supported on all architectures but aarch64, mips64, riscv64, and x86_64. And newer ABIs also uses the new fstatfs64 interface (where the struct size is used as first argument). So the default implementation now uses: 1. __NR_fstatfs64 for non-LFS call and handle overflow directly There is no need to handle __NR_fstatfs since all architectures that only support are LFS only. 2. __NR_fstatfs if defined or __NR_fstatfs64 otherwise for LFS call. Alpha is the only outlier, it is a 64-bit architecture which provides non-LFS interface and only provides __NR_fstatfs64 on newer kernels (5.1+). Checked on x86_64-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* linux: Set LFS statfs as defaultAdhemerval Zanella2021-02-112-11/+75
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently glibc has three different struct statfs{64} definitions: 1. Non-LFS support where non-LFS and LFS struct have different size: alpha, arm, hppa, i686, m68k, microblaze, mips (all abis), powerpc32, s390, sh4, and sparc. 2. Non-LFS support where non-LFS and LFS struct have the same size: csky and nios2. 3. Only LFS support (where both struct have the same size): arc, ia64, powerpc64 (including LE), riscv (both 32 and 64 bits), s390x, sparc64, and x86 (including x32). The STATFS_IS_STATFS64/__STATFS_MATCHES_STATFS64 does not tell apart between 1. and 2. since for both the only difference is the struct size (for 2. both non-LFS and LFS uses the same syscall, where for 1. the old non-LFS is used for [f]statfs). This patch move the generic statfs.h for both csky and nios2, and make the default definitions for newer ABIs to assume that only LFS will be support (so there is no need to keep no-LFS and LFS struct statfs with the same size, it will be implicit). This patch does not change the code generation. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* linux: Set default kernel_stat.h to LFSAdhemerval Zanella2021-02-1115-200/+231
| | | | | | | | | | | | | | The XSTAT_IS_XSTAT64 and STAT_IS_KERNEL_STAT flags are now set to 1 and STATFS_IS_STATFS64 is set to __STATFS_MATCHES_STATFS64. This makes the default ABI for newer ports to provide only LFS calls. A copy of non-LFS support is provided to 32-bit ABIS with non-LFS support (arm, csky, i386, m68k, nios2, s390, and sh). Is also allows to remove the 64-bit ports, which already uses the default values. This patch does not change the code generation. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* linux: Fix STATFS_IS_STATFS64 definitionAdhemerval Zanella2021-02-118-8/+36
| | | | | | | | | | | | | | | aarch64, arc, ia64, mips64, powerpc64, riscv32, riscv64, s390x, sparc64, and x86_64 defines STATFS_IS_STATFS64 to 0, but all of them alias statfs to statfs64 and the struct statfs has the same and layout of struct statfs64. The correct definition will be used on the [f]statfs[64] consolidation. This patch does not change code generation since the symbols are implemented using the auto-generation syscall for all the aforementioned ABIs. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* x86: Use SIZE_MAX instead of (long int)-1 for tunable range valueSiddhesh Poyarekar2021-02-101-5/+5
| | | | | The tunable types are SIZE_T, so set the ranges to the correct maximum value, i.e. SIZE_MAX.
* tunables: Simplify TUNABLE_SET interfaceSiddhesh Poyarekar2021-02-102-10/+7
| | | | | | | | | | | | | | | The TUNABLE_SET interface took a primitive C type argument, which resulted in inconsistent type conversions internally due to incorrect dereferencing of types, especialy on 32-bit architectures. This change simplifies the TUNABLE setting logic along with the interfaces. Now all numeric tunable values are stored as signed numbers in tunable_num_t, which is intmax_t. All calls to set tunables cast the input value to its primitive type and then to tunable_num_t for storage. This relies on gcc-specific (although I suspect other compilers woul also do the same) unsigned to signed integer conversion semantics, i.e. the bit pattern is conserved. The reverse conversion is guaranteed by the standard.
* linux: Fix __sem_check_add_mapping search_semAdhemerval Zanella2021-02-091-1/+1
| | | | | | | Similar to __sem_check_add_mapping fix, take in consideration the trailling NULL. Checked x86_64-linux-gnu.
* linux: Fix __sem_check_add_mapping name lengthAdhemerval Zanella2021-02-091-0/+1
| | | | | | | | Take in consideration the trailling NULL since sem_search uses strcmp to compare entries. Checked on x86_64-linux-gnu and powerpc-linux-gnu (where it triggered a nptl/tst-sem7 regression).
* Add more ptrace constants for AArch64 and PowerPC.Joseph Myers2021-02-082-0/+26
| | | | | | | | | | | | Linux 5.10 adds PTRACE_PEEKMTETAGS and PTRACE_POKEMTETAGS for AArch64. Adding those shows up that glibc is also missing PTRACE_SYSEMU and PTRACE_SYSEMU_SINGLESTEP, for AArch64 (where they were added to Linux in 5.3) and for PowerPC (where they were added in Linux 4.20); it already has those two defines for x86. Add all those defines to glibc's headers. Tested with build-many-glibcs.py for aarch64-linux-gnu and powerpc-linux-gnu.
* x86-64: Refactor and improve performance of strchr-avx2.Snoah2021-02-082-113/+113
| | | | | | | | No bug. Just seemed the performance could be improved a bit. Observed and expected behavior are unchanged. Optimized body of main loop. Updated page cross logic and optimized accordingly. Made a few minor instruction selection modifications. No regressions in test suite. Both test-strchrnul and test-strchr passed.
* pthread: Remove alloca usage from __sem_check_add_mappingAdhemerval Zanella2021-02-081-6/+17
| | | | | | | sem_open already returns EINVAL for input names larger than NAME_MAX, so it can assume the largest name length with tfind. Checked on x86_64-linux-gnu.
* pthread: Refactor semaphore codeAdhemerval Zanella2021-02-084-165/+221
| | | | | | | | | | | The internal semaphore list code is moved to a specific file, sem_routine.c, and the internal usage is simplified to only two functions (one to insert a new semaphore and one to remove it from the internal list). There is no need to expose the internal locking, neither how the semaphore mapping is implemented. No functional or semantic change is expected, tested on x86_64-linux-gnu.
* linux: Require /dev/shm as the shared memory file systemFlorian Weimer2021-02-0810-342/+43
| | | | | | | | | | | | | | | | | | | | | Previously, glibc would pick an arbitrary tmpfs file system from /proc/mounts if /dev/shm was not available. This could lead to an unsuitable file system being picked for the backing storage for shm_open, sem_open, and related functions. This patch introduces a new function, __shm_get_name, which builds the file name under the appropriate (now hard-coded) directory. It is called from the various shm_* and sem_* function. Unlike the SHM_GET_NAME macro it replaces, the callers handle the return values and errno updates. shm-directory.c is moved directly into the posix subdirectory because it can be implemented directly using POSIX functionality. It resides in libc because it is needed by both librt and nptl/htl. In the sem_open implementation, tmpfname is initialized directly from a string constant. This happens to remove one alloca call. Checked on x86_64-linux-gnu.
* tst: Provide test for ppollLukasz Majewski2021-02-082-1/+57
| | | | | | | | | | | This change adds new test to assess ppoll()'s timeout related functionality (the struct pollfd does not provide valid fd to wait for - just wait for timeout). To be more specific - two use cases are checked: - if ppoll() times out immediately when passed struct timespec has zero values of tv_nsec and tv_sec. - if ppoll() times out after timeout specified in passed argument
* tst: Provide test for timerfd related functionsLukasz Majewski2021-02-082-1/+67
| | | | | | | This change adds new test to assess functionality of timerfd_* functions. It creates new timer (operates on its file descriptor) and checks if time before and after sleep is between expected values.
* x86: Add PTWRITE feature detection [BZ #27346]H.J. Lu2021-02-079-5/+44
| | | | | | | 1. Add CPUID_INDEX_14_ECX_0 for CPUID leaf 0x14 to detect PTWRITE feature in EBX of CPUID leaf 0x14 with ECX == 0. 2. Add PTWRITE detection to CPU feature tests. 3. Add 2 static CPU feature tests.
* nptl: Remove private futex optimization [BZ #27304]Florian Weimer2021-02-041-13/+1
| | | | | | | | | | | It is effectively used, unexcept for pthread_cond_destroy, where we do not want it; see bug 27304. The internal locks do not support a process-shared mode. This fixes commit dc6cfdc934db9997c33728082d63552b9eee4563 ("nptl: Move pthread_cond_destroy implementation into libc"). Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* i686: Regenerate ULPsSiddhesh Poyarekar2021-02-031-5/+5
|
* linux: Remove shmmax check from tst-sysvshm-linuxAdhemerval Zanella2021-02-021-12/+14
| | | | | | | | | | | | | | | | | | | | The shmmax expected value is tricky to check because kernel clamps it to INT_MAX in two cases: 1. Compat symbols with IPC_64, i.e, 32-bit binaries running on 64-bit kernels. 2. Default symbol without IPC_64 (defined as IPC_OLD within Linux) and glibc always use IPC_64 for 32-bit ABIs (to support 64-bit time_t). It means that 32-bit binaries running on 32-bit kernels will not see shmmax being clamped. And finding out whether the compat symbol is used would require checking the underlying kernel against the current ABI. The shmall and shmmni already provided enough coverage. Checked on x86_64-linux-gnu and i686-linux-gnu. It should fix the tst-sysvshm-linux failures on 32-bit kernels.
* x86: Adding an upper bound for Enhanced REP MOVSB.Sajan Karumanchi2021-02-024-3/+25
| | | | | | | | | | | | | | | In the process of optimizing memcpy for AMD machines, we have found the vector move operations are outperforming enhanced REP MOVSB for data transfers above the L2 cache size on Zen3 architectures. To handle this use case, we are adding an upper bound parameter on enhanced REP MOVSB:'__x86_rep_movsb_stop_threshold'. As per large-bench results, we are configuring this parameter to the L2 cache size for AMD machines and applicable from Zen3 architecture supporting the ERMS feature. For architectures other than AMD, it is the computed value of non-temporal threshold parameter. Reviewed-by: Premachandra Mallappa <premachandra.mallappa@amd.com>
* Add MS_NOSYMFOLLOW from Linux 5.10 to <sys/mount.h>.Joseph Myers2021-02-021-0/+2
| | | | | | | This patch adds the new constant MS_NOSYMFOLLOW from Linux 5.10 to <sys/mount.h>. Tested for x86_64.
* hurd TIOCFLUSH: fix fixing argumentSamuel Thibault2021-02-011-2/+2
| | | | The argument actually used inside send_rpc is argptr, not arg.
* sysconf: Add _SC_MINSIGSTKSZ/_SC_SIGSTKSZ [BZ #20305]H.J. Lu2021-02-019-0/+228
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add _SC_MINSIGSTKSZ for the minimum signal stack size derived from AT_MINSIGSTKSZ, which is the minimum number of bytes of free stack space required in order to gurantee successful, non-nested handling of a single signal whose handler is an empty function, and _SC_SIGSTKSZ which is the suggested minimum number of bytes of stack space required for a signal stack. If AT_MINSIGSTKSZ isn't available, sysconf (_SC_MINSIGSTKSZ) returns MINSIGSTKSZ. On Linux/x86 with XSAVE, the signal frame used by kernel is composed of the following areas and laid out as: ------------------------------ | alignment padding | ------------------------------ | xsave buffer | ------------------------------ | fsave header (32-bit only) | ------------------------------ | siginfo + ucontext | ------------------------------ Compute AT_MINSIGSTKSZ value as size of xsave buffer + size of fsave header (32-bit only) + size of siginfo and ucontext + alignment padding. If _SC_SIGSTKSZ_SOURCE or _GNU_SOURCE are defined, MINSIGSTKSZ and SIGSTKSZ are redefined as /* Default stack size for a signal handler: sysconf (SC_SIGSTKSZ). */ # undef SIGSTKSZ # define SIGSTKSZ sysconf (_SC_SIGSTKSZ) /* Minimum stack size for a signal handler: SIGSTKSZ. */ # undef MINSIGSTKSZ # define MINSIGSTKSZ SIGSTKSZ Compilation will fail if the source assumes constant MINSIGSTKSZ or SIGSTKSZ. The reason for not simply increasing the kernel's MINSIGSTKSZ #define (apart from the fact that it is rarely used, due to glibc's shadowing definitions) was that userspace binaries will have baked in the old value of the constant and may be making assumptions about it. For example, the type (char [MINSIGSTKSZ]) changes if this #define changes. This could be a problem if an newly built library tries to memcpy() or dump such an object defined by and old binary. Bounds-checking and the stack sizes passed to things like sigaltstack() and makecontext() could similarly go wrong.
* hurd TIOCFLUSH: Cope BSD 4.1 semanticSamuel Thibault2021-02-011-0/+4
| | | | | | BSD 4.1 did not have an argument for TIOCFLUSH, BSD 4.2 added it. There are still a lot of applications out there that pass a NULL argument to TIOCFLUSH, so we should rather cope with it.
* x86: Properly set usable CET feature bits [BZ #26625]H.J. Lu2021-01-2910-13/+120
| | | | | | | | | | | | | | | | | | | | | | | | commit 94cd37ebb293321115a36a422b091fdb72d2fb08 Author: H.J. Lu <hjl.tools@gmail.com> Date: Wed Sep 16 05:27:32 2020 -0700 x86: Use HAS_CPU_FEATURE with IBT and SHSTK [BZ #26625] broke GLIBC_TUNABLES=glibc.cpu.hwcaps=-IBT,-SHSTK since it can no longer disable IBT nor SHSTK. Handle IBT and SHSTK with: 1. Revert commit 94cd37ebb293321115a36a422b091fdb72d2fb08. 2. Clears the usable CET feature bits if kernel doesn't support CET. 3. Add GLIBC_TUNABLES tests without dlopen. 4. Add tests to verify that CPU_FEATURE_USABLE on IBT and SHSTK matches _get_ssp. 5. Update GLIBC_TUNABLES tests with dlopen to verify that CET is disabled with GLIBC_TUNABLES. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* Update ia64 libm-test-ulpsAdhemerval Zanella2021-01-281-2/+2
|
* sh: Update libm-tests-ulpsAdhemerval Zanella2021-01-281-19/+23
|
* ia64: Fix brk call on statupAdhemerval Zanella2021-01-281-0/+22
| | | | | | | | | brk used by statup before TCB is properly set, so we can't use IA64_USE_NEW_STUB. This patch fixes a regression introduced by 720480934ab910. Checked on ia64-linux-gnu.
* Update sparc libm-test-ulpsAdhemerval Zanella2021-01-281-11/+12
|
* Update alpha libm-test-ulpsAdhemerval Zanella2021-01-281-11/+12
|
* powerpc64: Workaround sigtramp vdso return callRaoni Fassina Firmino2021-01-281-1/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A not so recent kernel change[1] changed how the trampoline `__kernel_sigtramp_rt64` is used to call signal handlers. This was exposed on the test misc/tst-sigcontext-get_pc Before kernel 5.9, the kernel set LR to the trampoline address and jumped directly to the signal handler, and at the end the signal handler, as any other function, would `blr` to the address set. In other words, the trampoline was executed just at the end of the signal handler and the only thing it did was call sigreturn. But since kernel 5.9 the kernel set CTRL to the signal handler and calls to the trampoline code, the trampoline then `bctrl` to the address in CTRL, setting the LR to the next instruction in the middle of the trampoline, when the signal handler returns, the rest of the trampoline code executes the same code as before. Here is the full trampoline code as of kernel 5.11.0-rc5 for reference: V_FUNCTION_BEGIN(__kernel_sigtramp_rt64) .Lsigrt_start: bctrl /* call the handler */ addi r1, r1, __SIGNAL_FRAMESIZE li r0,__NR_rt_sigreturn sc .Lsigrt_end: V_FUNCTION_END(__kernel_sigtramp_rt64) This new behavior breaks how `backtrace()` uses to detect the trampoline frame to correctly reconstruct the stack frame when it is called from inside a signal handling. This workaround rely on the fact that the trampoline code is at very least two (maybe 3?) instructions in size (as it is in the 32 bits version, only on `li` and `sc`), so it is safe to check the return address be in the range __kernel_sigtramp_rt64 .. + 4. [1] subject: powerpc/64/signal: Balance return predictor stack in signal trampoline commit: 0138ba5783ae0dcc799ad401a1e8ac8333790df9 url: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0138ba5783ae0dcc799ad401a1e8ac8333790df9 Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* aarch64: Fix the list of tested IFUNC variants [BZ #26818]Szabolcs Nagy2021-01-252-4/+6
| | | | | | | | | | | | | | Some IFUNC variants are not compatible with BTI and MTE so don't set them as usable for testing and benchmarking on a BTI or MTE enabled system. As far as IFUNC selectors are concerned a system is BTI enabled if the cpu supports it and glibc was built with BTI branch protection. Most IFUNC variants are BTI compatible, but thunderx2 memcpy and memmove use a jump table with indirect jump, without a BTI j. Fixes bug 26818.
* aarch64: Move and update the definition of MTE_ENABLEDSzabolcs Nagy2021-01-252-11/+11
| | | | | | | | | | | | The hwcap value is now in linux 5.10 and in glibc bits/hwcap.h, so use that definition. Move the definition to init-arch.h so all ifunc selectors can use it and expose an "mte" shorthand for mte enabled runtime. For now we allow user code to enable tag checks and use PROT_MTE mappings without libc involvment, this is not guaranteed ABI, but can be useful for testing and debugging with MTE.
* Fix misplaced constAndreas Schwab2021-01-252-2/+2
| | | | | Constify __x86_cacheinfo_p and __x86_cpu_features_p, not their pointer target types.
* Update C-SKY libm-test-ulpsMao Han2021-01-232-54/+61
|
* linux: mips: Fix getdents64 fallback on mips64-n32Adhemerval Zanella2021-01-222-24/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | GCC mainline shows the following error: ../sysdeps/unix/sysv/linux/mips/mips64/getdents64.c: In function '__getdents64': ../sysdeps/unix/sysv/linux/mips/mips64/getdents64.c:121:7: error: 'memcpy' forming offset [4, 7] is out of the bounds [0, 4] [-Werror=array-bounds] 121 | memcpy (((char *) dp + offsetof (struct dirent64, d_ino)), | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 122 | KDP_MEMBER (kdp, d_ino), sizeof ((struct dirent64){0}.d_ino)); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../sysdeps/unix/sysv/linux/mips/mips64/getdents64.c:123:7: error: 'memcpy' forming offset [4, 7] is out of the bounds [0, 4] [-Werror=array-bounds] 123 | memcpy (((char *) dp + offsetof (struct dirent64, d_off)), | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 124 | KDP_MEMBER (kdp, d_off), sizeof ((struct dirent64){0}.d_off)); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The issue is due both d_ino and d_off fields for mips64-n32 kernel_dirent are 32-bits, while this is using memcpy to copy 64 bits from it into the glibc dirent64. The fix is to use a temporary buffer to read the correct type from kernel_dirent. Checked with a build-many-glibcs.py for mips64el-linux-gnu and I also checked the tst-getdents64 on mips64el 4.1.4 kernel with and without fallback enabled (by manually setting the getdents64_supported).
* x86: Properly match CPU features in /proc/cpuinfo [BZ #27222]H.J. Lu2021-01-221-13/+30
| | | | | | | | | | | | | | Search " YYY " and " YYY\n", instead of "YYY", to avoid matching "XXXYYYZZZ" with "YYY". Update /proc/cpuinfo CPU feature names: /proc/cpuinfo glibc ------------------------------------------------ avx512vbmi AVX512_VBMI dts DS pni SSE3 tsc_deadline_timer TSC_DEADLINE
* x86-64: Update tst-glibc-hwcaps-2.c for x86-64 baselineH.J. Lu2021-01-221-1/+1
| | | | | Return EXIT_FAILURE only if the level 2 libx86-64-isa-level.so is used on x86-64 baseline machine.
* powerpc64: Select POWER9 machine for the scv instructionFlorian Weimer2021-01-223-1/+10
| | | | | | | | | It is not available with the baseline ISA. Fixes commit 68ab82f56690ada86ac1e0c46bad06ba189a10ef ("powerpc: Runtime selection between sc and scv for syscalls"). Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
* x86: Check ifunc resolver with CPU_FEATURE_USABLE [BZ #27072]H.J. Lu2021-01-216-0/+184
| | | | | | | | Check ifunc resolver with CPU_FEATURE_USABLE and tunables in dynamic and static executables to verify that CPUID features are initialized early in static PIE. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* Revert "linux: Move {f}xstat{at} to compat symbols" for static buildAdhemerval Zanella2021-01-2116-28/+26
| | | | | | | | | | | | | | This reverts commit 20b39d59467b0c1d858e89ded8b0cebe55e22f60 for static library. This avoids the need to rebuild the world for the case where libstdc++ (and potentially other libraries) are linked to a old glibc. To avoid requering to provide xstat symbols for newer ABIs (such as riscv32) a new LIB_COMPAT macro is added. It is similar to SHLIB_COMPAT but also works for static case (thus evaluating similar to SHLIB_COMPAT for both shared and static case). Checked with a check-abi on all affected ABIs. I also check if the static library does contains the xstat symbols.
* aarch64: revert memcpy optimze for kunpeng to avoid performance degradationShuo Wang2021-01-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In commit 863d775c481704baaa41855fc93e5a1ca2dc6bf6, kunpeng920 is added to default memcpy version, however, there is performance degradation when the copy size is some large bytes, eg: 100k. This is the result, tested in glibc-2.28: before backport after backport Performance improvement memcpy_1k 0.005 0.005 0.00% memcpy_10k 0.032 0.029 10.34% memcpy_100k 0.356 0.429 -17.02% memcpy_1m 7.470 11.153 -33.02% This is the demo #include "stdio.h" #include "string.h" #include "stdlib.h" char a[1024*1024] = {12}; char b[1024*1024] = {13}; int main(int argc, char *argv[]) { int i = atoi(argv[1]); int j; int size = atoi(argv[2]); for (j = 0; j < i; j++) memcpy(b, a, size*1024); return 0; } # gcc -g -O0 memcpy.c -o memcpy # time taskset -c 10 ./memcpy 100000 1024 Co-authored-by: liqingqing <liqingqing3@huawei.com>
* Use hidden visibility for early static PIE codeSzabolcs Nagy2021-01-212-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | Extern symbol access in position independent code usually involves GOT indirection which needs RELATIVE reloc in a static linked PIE. (On some targets this is avoided e.g. because the linker can relax a GOT access to a pc-relative access, but this is not generally true.) Code that runs before static PIE self relocation must avoid relying on dynamic relocations which can be ensured by using hidden visibility. However we cannot just make all symbols hidden: On i386, all calls to IFUNC functions must go through PLT and calls to hidden functions CANNOT go through PLT in PIE since EBX used in PIE PLT may not be set up for local calls to hidden IFUNC functions. This patch aims to make symbol references hidden in code that is used before and by _dl_relocate_static_pie when building a static PIE libc. Note: for an object that is used in the startup code, its references and definition may not have consistent visibility: it is only forced hidden in the startup code. This is needed for fixing bug 27072. Co-authored-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* configure: Check for static PIE supportSzabolcs Nagy2021-01-216-0/+19
| | | | | | | | | | | | | | Add SUPPORT_STATIC_PIE that targets can define if they support static PIE. This requires PI_STATIC_AND_HIDDEN support and various linker features as described in commit 9d7a3741c9e59eba87fb3ca6b9f979befce07826 Add --enable-static-pie configure option to build static PIE [BZ #19574] Currently defined on x86_64, i386 and aarch64 where static PIE is known to work. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* <sys/platform/x86.h>: Remove the C preprocessor magicH.J. Lu2021-01-2123-842/+1160
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In <sys/platform/x86.h>, define CPU features as enum instead of using the C preprocessor magic to make it easier to wrap this functionality in other languages. Move the C preprocessor magic to internal header for better GCC codegen when more than one features are checked in a single expression as in x86-64 dl-hwcaps-subdirs.c. 1. Rename COMMON_CPUID_INDEX_XXX to CPUID_INDEX_XXX. 2. Move CPUID_INDEX_MAX to sysdeps/x86/include/cpu-features.h. 3. Remove struct cpu_features and __x86_get_cpu_features from <sys/platform/x86.h>. 4. Add __x86_get_cpuid_feature_leaf to <sys/platform/x86.h> and put it in libc. 5. Make __get_cpu_features() private to glibc. 6. Replace __x86_get_cpu_features(N) with __get_cpu_features(). 7. Add _dl_x86_get_cpu_features to GLIBC_PRIVATE. 8. Use a single enum index for each CPU feature detection. 9. Pass the CPUID feature leaf to __x86_get_cpuid_feature_leaf. 10. Return zero struct cpuid_feature for the older glibc binary with a smaller CPUID_INDEX_MAX [BZ #27104]. 11. Inside glibc, use the C preprocessor magic so that cpu_features data can be loaded just once leading to more compact code for glibc. 256 bits are used for each CPUID leaf. Some leaves only contain a few features. We can add exceptions to such leaves. But it will increase code sizes and it is harder to provide backward/forward compatibilities when new features are added to such leaves in the future. When new leaves are added, _rtld_global_ro offsets will change which leads to race condition during in-place updates. We may avoid in-place updates by 1. Rename the old glibc. 2. Install the new glibc. 3. Remove the old glibc. NB: A function, __x86_get_cpuid_feature_leaf , is used to avoid the copy relocation issue with IFUNC resolver as shown in IFUNC resolver tests.
* Use <startup.h> in __libc_init_secureH.J. Lu2021-01-192-2/+53
| | | | | | | | | Since __libc_init_secure is called before ARCH_SETUP_TLS, it must use "int $0x80" for system calls in i386 static PIE. Add startup_getuid, startup_geteuid, startup_getgid and startup_getegid to <startup.h>. Update __libc_init_secure to use them. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* libmvec: Add extra-test-objs to test-extrasH.J. Lu2021-01-191-0/+8
| | | | | | | Add extra-test-objs to test-extras so that they are compiled with -DMODULE_NAME=testsuite instead of -DMODULE_NAME=libc. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* Hurd: Add rtld-strncpy-c.cH.J. Lu2021-01-191-0/+1
| | | | | All IFUNC functions which are used in ld.so must have a rtld version if the IFUNC version isn't safe to use in ld.so.