path: root/sysdeps
Commit message | Author | Age | Files | Lines
* Fix array overflow in backtrace on PowerPC (bug 25423) | Andreas Schwab | 2020-03-20 | 2 | -0/+4
    When unwinding through a signal frame the backtrace function on PowerPC didn't check array bounds when storing the frame address. Fixes commit d400dcac5e ("PowerPC: fix backtrace to handle signal trampolines").

    (cherry picked from commit d93769405996dfc11d216ddbe415946617b5a494)
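    Editorial sketch (not part of the commit message, and not the PowerPC unwinder itself): the class of fix described above is to guard every store into the caller-supplied array with the size check, including the extra entry written for a signal frame.

        /* Minimal, illustrative helper: store one frame address only if the
           output array still has room, and report the updated count.  */
        static int
        store_frame_address (void **array, int size, int count, void *addr)
        {
          if (count < size)     /* the check the signal-frame path was missing */
            array[count++] = addr;
          return count;
        }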
* rtld: Check __libc_enable_secure before honoring LD_PREFER_MAP_32BIT_EXEC (CVE-2019-19126) [BZ #25204] | Marcin Kościelnicki | 2019-11-22 | 1 | -1/+2
    The problem was introduced in glibc 2.23, in commit b9eb92ab05204df772eb4929eccd018637c9f3e9 ("Add Prefer_MAP_32BIT_EXEC to map executable pages with MAP_32BIT").

    (cherry picked from commit d5dfad4326fc683c813df1e37bbf5cf920591c8e)
* mips: Force RWX stack for hard-float builds that can run on pre-4.8 kernels | Dragan Mladjenovic | 2019-11-05 | 3 | -5/+89
    Linux/Mips kernels prior to 4.8 could potentially crash the user process when doing FPU emulation while running on a non-executable user stack.

    Currently, gcc doesn't emit .note.GNU-stack for mips, but that will change in the future. To ensure that glibc can be used with such a future gcc, without silently resulting in binaries that might crash at runtime, this patch forces an RWX stack for all built objects if configured to run against a minimum kernel version less than 4.8.

    * sysdeps/unix/sysv/linux/mips/Makefile (test-xfail-check-execstack): Move under mips-has-gnustack != yes.
    (CFLAGS-.o*, ASFLAGS-.o*): New rules. Apply -Wa,-execstack if mips-force-execstack == yes.
    * sysdeps/unix/sysv/linux/mips/configure: Regenerated.
    * sysdeps/unix/sysv/linux/mips/configure.ac (mips-force-execstack): New var. Set to yes for hard-float builds with minimum_kernel < 4.8.0 or minimum_kernel not set at all.
    (mips-has-gnustack): New var. Use value of libc_cv_as_noexecstack if mips-force-execstack != yes, otherwise set to no.

    (cherry picked from commit 33bc9efd91de1b14354291fc8ebd5bce96379f12)
* [AArch64] Add ifunc support for Ares | Wilco Dijkstra | 2019-09-06 | 3 | -1/+5
    Add Ares to the midr_el0 list and support ifunc dispatch. Since Ares supports 2 128-bit loads/stores, use Neon registers for memcpy by selecting __memcpy_falkor by default (we should rename this to __memcpy_simd or similar).

    * manual/tunables.texi (glibc.cpu.name): Add ares tunable.
    * sysdeps/aarch64/multiarch/memcpy.c (__libc_memcpy): Use __memcpy_falkor for ares.
    * sysdeps/unix/sysv/linux/aarch64/cpu-features.h (IS_ARES): Add new define.
    * sysdeps/unix/sysv/linux/aarch64/cpu-features.c (cpu_list): Add ares cpu.

    (cherry picked from commit 02f440c1ef5d5d79552a524065aa3e2fabe469b9)
* aarch64,falkor: Use vector registers for memcpy | Siddhesh Poyarekar | 2019-09-06 | 1 | -72/+65
    Vector registers perform better than scalar register pairs for copying data, so prefer them instead. This results in a time reduction of over 50% (i.e. 2x speed improvement) for some smaller sizes for memcpy-walk. Larger sizes show improvements of around 1% to 2%. memcpy-random shows a very small improvement, in the range of 1-2%.

    * sysdeps/aarch64/multiarch/memcpy_falkor.S (__memcpy_falkor): Use vector registers.

    (cherry picked from commit 0aec4c1d1801e8016ebe89281d16597e0557b8be)
* aarch64,falkor: Ignore prefetcher tagging for smaller copies | Siddhesh Poyarekar | 2019-09-06 | 1 | -27/+41
    For smaller and medium sized copies, the effects of hardware prefetching are not as dominant as instruction level parallelism. Hence it makes more sense to load data into multiple registers than to try and route them to the same prefetch unit. This is also the case for the loop exit, where we are unable to latch on to the same prefetch unit anyway, so it makes more sense to have data loaded in parallel.

    The performance results are a bit mixed with memcpy-random, with numbers jumping between -1% and +3%, i.e. the numbers don't seem repeatable. memcpy-walk sees a 70% improvement (i.e. > 2x) for 128 bytes and that improvement reduces down as the impact of the tail copy decreases in comparison to the loop.

    * sysdeps/aarch64/multiarch/memcpy_falkor.S (__memcpy_falkor): Use multiple registers to copy data in loop tail.

    (cherry picked from commit db725a458e1cb0e17204daa543744faf08bb2e06)
* aarch64/strncmp: Use lsr instead of mov+lsr | Siddhesh Poyarekar | 2019-09-06 | 1 | -4/+2
    A lsr can do what the mov and lsr did.

    (cherry picked from commit b47c3e7637efb77818cbef55dcd0ed1f0ea0ddf1)
* aarch64/strncmp: Unbreak builds with old binutils | Siddhesh Poyarekar | 2019-09-06 | 1 | -2/+4
    Binutils 2.26.* and older do not support moves with shifted registers, so use a separate shift instruction instead.

    (cherry picked from commit d46f84de745db8f3f06a37048261f4e5ceacf0a3)
* aarch64: Improve strncmp for mutually misaligned inputs | Siddhesh Poyarekar | 2019-09-06 | 1 | -15/+80
    The mutually misaligned inputs on aarch64 are compared with a simple byte copy, which is not very efficient. Enhance the comparison similar to strcmp by loading a double-word at a time. The peak performance improvement (i.e. 4k maxlen comparisons) due to this on the strncmp microbenchmark is as follows:

        falkor:     3.5x (up to 72% time reduction)
        cortex-a73: 3.5x (up to 71% time reduction)
        cortex-a53: 3.5x (up to 71% time reduction)

    All mutually misaligned inputs from 16 bytes maxlen onwards show upwards of 15% improvement and there is no measurable effect on the performance of aligned/mutually aligned inputs.

    * sysdeps/aarch64/strncmp.S (count): New macro.
    (strncmp): Store misaligned length in SRC1 in COUNT.
    (mutual_align): Adjust.
    (misaligned8): Load dword at a time when it is safe.

    (cherry picked from commit 7108f1f944792ac68332967015d5e6418c5ccc88)
* aarch64/strcmp: fix misaligned loop jump target | Siddhesh Poyarekar | 2019-09-06 | 1 | -1/+1
    I accidentally set the loop jump back label as misaligned8 instead of do_misaligned. The typo is harmless but it's always nice to not have to unnecessarily execute those two instructions.

    * sysdeps/aarch64/strcmp.S (do_misaligned): Jump back to do_misaligned, not misaligned8.

    (cherry picked from commit 6ca24c43481e2c93a6eec362b04c3e77a35b28e3)
* aarch64: Improve strcmp unaligned performance | Siddhesh Poyarekar | 2019-09-06 | 1 | -2/+29
    Replace the simple byte-wise compare in the misaligned case with a dword compare with page boundary checks in place. For simplicity I've chosen a 4K page boundary so that we don't have to query the actual page size on the system.

    This results in up to 3x improvement in performance in the unaligned case on falkor and about 2.5x improvement on mustang as measured using bench-strcmp.

    * sysdeps/aarch64/strcmp.S (misaligned8): Compare dword at a time whenever possible.

    (cherry picked from commit 2bce01ebbaf8db52ba4a5635eb5744f989cdbf69)
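    Editorial sketch (not part of the commit message; the real check lives in hand-written assembly): the guard described above amounts to verifying that an 8-byte load from a misaligned pointer cannot cross the assumed 4 KiB page boundary.

        #include <stdbool.h>
        #include <stdint.h>

        /* Illustrative guard, assuming a 4096-byte page as the commit does:
           true when bytes p..p+7 all lie within one page, so a dword load
           cannot fault past the end of a mapped page.  */
        static bool
        dword_load_is_safe (const void *p)
        {
          return ((uintptr_t) p & 4095) <= 4096 - 8;
        }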
* aarch64: Fix branch target to loop16 | Siddhesh Poyarekar | 2019-09-06 | 1 | -1/+1
    I goofed up when changing the loop8 name to loop16 and missed out the branch instance. Fixed and actually build tested this time.

    * sysdeps/aarch64/memcmp.S (more16): Fix branch target loop16.

    (cherry picked from commit 4e54d918630ea53e29dd70d3bdffcb00d29ed3d4)
* aarch64: Optimized memcmp for medium to large sizes | Siddhesh Poyarekar | 2019-09-06 | 1 | -21/+55
    This improved memcmp provides a fast path for compares up to 16 bytes and then compares 16 bytes at a time, thus optimizing loads from both sources. The glibc memcmp microbenchmark retains performance (with an error of ~1ns) for smaller compare sizes and reduces up to 31% of execution time for compares up to 4K on the APM Mustang. On Qualcomm Falkor this improves to almost 48%, i.e. it is almost a 2x improvement for sizes of 2K and above.

    * sysdeps/aarch64/memcmp.S: Widen comparison to 16 bytes at a time.

    (cherry picked from commit 30a81dae5b752f8aa5f96e7f7c341ec57cba3585)
* aarch64: Use the L() macro for labels in memcmp | Siddhesh Poyarekar | 2019-09-06 | 1 | -16/+16
    The L() macro makes the assembly a bit more readable.

    * sysdeps/aarch64/memcmp.S: Use L() macro for labels.

    (cherry picked from commit 84c94d2fd90d84ae7e67657ee8e22c2d1b796f63)
* [AArch64] Optimized memcmp. | Wilco Dijkstra | 2019-09-06 | 1 | -105/+71
    This is an optimized memcmp for AArch64. This is a complete rewrite using a different algorithm. The previous version split into cases where both inputs were aligned, the inputs were mutually aligned and unaligned using a byte loop. The new version combines all these cases, while small inputs of less than 8 bytes are handled separately.

    This allows the main code to be sped up using unaligned loads since there are now at least 8 bytes to be compared. After the first 8 bytes, align the first input. This ensures each iteration does at most one unaligned access and mutually aligned inputs behave as aligned. After the main loop, process the last 8 bytes using unaligned accesses.

    This improves performance of (mutually) aligned cases by 25% and unaligned by >500% (yes, >6 times faster) on large inputs.

    * sysdeps/aarch64/memcmp.S (memcmp): Rewrite of optimized memcmp.

    (cherry picked from commit 922369032c604b4dcfd535e1bcddd4687e7126a5)
* posix: Fix large mmap64 offset for mips64n32 (BZ#24699) | Adhemerval Zanella | 2019-07-12 | 3 | -1/+37
    The fix for BZ#21270 (commit 158d5fa0e19) added a mask to avoid an offset larger than 2^44 being used along with __NR_mmap2. However mips64n32 uses __NR_mmap, as mips64n64 does, but still defines off_t as the old non-LFS type (other ILP32 ABIs, such as x32, define off_t equal to off64_t). This leads to using the mask meant only for the __NR_mmap2 call with __NR_mmap as well, thus limiting the maximum offset that can be used with mmap64.

    This patch fixes it by setting the high mask only for __NR_mmap2 usage. The posix/tst-mmap-offset.c test already exercises it and also fails for mips64n32. The patch also changes the test to check for an arch-specific header that defines the maximum supported offset.

    Checked on x86_64-linux-gnu, i686-linux-gnu, and I also tested tst-mmap-offset on QEMU-simulated mips64 with a 3.2.0 kernel for both mips-linux-gnu and mips64-n32-linux-gnu.

    [BZ #24699]
    * posix/tst-mmap-offset.c: Mention BZ #24699.
    (do_test_bz21270): Rename to do_test_large_offset and use mmap64_maximum_offset to check for maximum expected offset value.
    * sysdeps/generic/mmap_info.h: New file.
    * sysdeps/unix/sysv/linux/mips/mmap_info.h: Likewise.
    * sysdeps/unix/sysv/linux/mmap64.c (MMAP_OFF_HIGH_MASK): Define iff __NR_mmap2 is used.

    (cherry picked from commit a008c76b56e4f958cf5a0d6f67d29fade89421b7)
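    Editorial sketch (an assumed shape for the mmap64_maximum_offset helper named above, not the actual glibc headers; mips provides its own override precisely because this generic rule does not hold there): report the largest offset mmap64 can honour, limited only when the offset is funnelled through the page-unit, 32-bit mmap2 interface.

        #define _GNU_SOURCE
        #include <stdint.h>
        #include <sys/types.h>

        /* Assumed sketch: when off_t is narrower than off64_t the offset is
           passed to mmap2 in page-size units in a 32-bit register, so only
           page_shift + 8 * sizeof (off_t) offset bits survive; otherwise the
           full 64-bit offset is usable.  */
        static inline uint64_t
        mmap64_maximum_offset (long int page_shift)
        {
          if (sizeof (off_t) < sizeof (off64_t))
            return (UINT64_C (1) << (page_shift + 8 * sizeof (off_t))) - 1;
          return UINT64_MAX;
        }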
* aarch64: handle STO_AARCH64_VARIANT_PCS | Szabolcs Nagy | 2019-07-12 | 1 | -4/+31
    Backport of commit 82bc69c012838a381c4167c156a06f4598f34227 and commit 30ba0375464f34e4bf8129f3d3dc14d0c09add17 without using DT_AARCH64_VARIANT_PCS for optimizing the symbol table check. This is needed so the internal ABI between ld.so and libc.so is unchanged.

    Avoid lazy binding of symbols that may follow a variant PCS with different register usage convention from the base PCS.

    Currently the lazy binding entry code does not preserve all the registers required for AdvSIMD and SVE vector calls. Saving and restoring all registers unconditionally may break existing binaries, even if they never use vector calls, because of the larger stack requirement for lazy resolution, which can be significant on an SVE system.

    The solution is to mark all symbols in the symbol table that may follow a variant PCS so the dynamic linker can handle them specially. In this patch such symbols are always resolved at load time, not lazily.

    So currently LD_AUDIT for variant PCS symbols is not supported; for that the _dl_runtime_profile entry needs to be changed e.g. to unconditionally save/restore all registers (but pass down arg and retval registers to pltentry/exit callbacks according to the base PCS).

    This patch also removes a __builtin_expect from the modified code because the branch prediction hint did not seem useful.

    * sysdeps/aarch64/dl-machine.h (elf_machine_lazy_rel): Check STO_AARCH64_VARIANT_PCS and bind such symbols at load time.
* x86-64 memcmp: Use unsigned Jcc instructions on size [BZ #24155] | H.J. Lu | 2019-02-04 | 3 | -9/+93
    Since the size argument is unsigned, we should use unsigned Jcc instructions, instead of signed, to check size.

    Tested on x86-64 and x32, with and without --disable-multi-arch.

    [BZ #24155] CVE-2019-7309
    * NEWS: Updated for CVE-2019-7309.
    * sysdeps/x86_64/memcmp.S: Use RDX_LP for size. Clear the upper 32 bits of RDX register for x32. Use unsigned Jcc instructions, instead of signed.
    * sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memcmp-2.
    * sysdeps/x86_64/x32/tst-size_t-memcmp-2.c: New test.

    (cherry picked from commit 3f635fb43389b54f682fc9ed2acc0b2aaf4a923d)
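    Editorial sketch (not part of the commit; a C analogue of why the signedness of the branch matters): once a size has its top bit set, a signed view of the same bits is negative and the bound check branches the wrong way.

        #include <stdio.h>
        #include <sys/types.h>

        int
        main (void)
        {
          /* A length with its top bit set, like the huge sizes BZ #24155 covers.  */
          size_t len = ~(size_t) 0 - 15;

          /* Unsigned view: len is enormous, so the bound check passes correctly.  */
          printf ("unsigned: len > 16 -> %d\n", len > 16);

          /* Signed view of the same bits, which is what a signed Jcc effectively
             tests: on the usual two's-complement targets the value is negative,
             so the branch goes the wrong way.  */
          printf ("signed:   (ssize_t) len > 16 -> %d\n", (ssize_t) len > 16);
          return 0;
        }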
* x86-64 strnlen/wcsnlen: Properly handle the length parameter [BZ #24097] | H.J. Lu | 2019-02-01 | 5 | -11/+106
    On x32, the size_t parameter may be passed in the lower 32 bits of a 64-bit register with non-zero upper 32 bits. The string/memory functions written in assembly can only use the lower 32 bits of a 64-bit register as length or must clear the upper 32 bits before using the full 64-bit register for length.

    This patch fixes strnlen/wcsnlen for x32. Tested on x86-64 and x32. On x86-64, libc.so is the same with and without the fix.

    [BZ #24097] CVE-2019-6488
    * sysdeps/x86_64/multiarch/strlen-avx2.S: Use RSI_LP for length. Clear the upper 32 bits of RSI register.
    * sysdeps/x86_64/strlen.S: Use RSI_LP for length.
    * sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-strnlen and tst-size_t-wcsnlen.
    * sysdeps/x86_64/x32/tst-size_t-strnlen.c: New file.
    * sysdeps/x86_64/x32/tst-size_t-wcsnlen.c: Likewise.

    (cherry picked from commit 5165de69c0908e28a380cbd4bb054e55ea4abc95)
* x86-64 strncpy: Properly handle the length parameter [BZ #24097] | H.J. Lu | 2019-02-01 | 4 | -6/+64
    On x32, the size_t parameter may be passed in the lower 32 bits of a 64-bit register with non-zero upper 32 bits. The string/memory functions written in assembly can only use the lower 32 bits of a 64-bit register as length or must clear the upper 32 bits before using the full 64-bit register for length.

    This patch fixes strncpy for x32. Tested on x86-64 and x32. On x86-64, libc.so is the same with and without the fix.

    [BZ #24097] CVE-2019-6488
    * sysdeps/x86_64/multiarch/strcpy-sse2-unaligned.S: Use RDX_LP for length.
    * sysdeps/x86_64/multiarch/strcpy-ssse3.S: Likewise.
    * sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-strncpy.
    * sysdeps/x86_64/x32/tst-size_t-strncpy.c: New file.

    (cherry picked from commit c7c54f65b080affb87a1513dee449c8ad6143c8b)
* x86-64 strncmp family: Properly handle the length parameter [BZ #24097] | H.J. Lu | 2019-02-01 | 6 | -8/+167
    On x32, the size_t parameter may be passed in the lower 32 bits of a 64-bit register with non-zero upper 32 bits. The string/memory functions written in assembly can only use the lower 32 bits of a 64-bit register as length or must clear the upper 32 bits before using the full 64-bit register for length.

    This patch fixes the strncmp family for x32. Tested on x86-64 and x32. On x86-64, libc.so is the same with and without the fix.

    [BZ #24097] CVE-2019-6488
    * sysdeps/x86_64/multiarch/strcmp-sse42.S: Use RDX_LP for length.
    * sysdeps/x86_64/strcmp.S: Likewise.
    * sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-strncasecmp, tst-size_t-strncmp and tst-size_t-wcsncmp.
    * sysdeps/x86_64/x32/tst-size_t-strncasecmp.c: New file.
    * sysdeps/x86_64/x32/tst-size_t-strncmp.c: Likewise.
    * sysdeps/x86_64/x32/tst-size_t-wcsncmp.c: Likewise.

    (cherry picked from commit ee915088a0231cd421054dbd8abab7aadf331153)
* x86-64 memset/wmemset: Properly handle the length parameter [BZ #24097] | H.J. Lu | 2019-02-01 | 5 | -15/+120
    On x32, the size_t parameter may be passed in the lower 32 bits of a 64-bit register with non-zero upper 32 bits. The string/memory functions written in assembly can only use the lower 32 bits of a 64-bit register as length or must clear the upper 32 bits before using the full 64-bit register for length.

    This patch fixes memset/wmemset for x32. Tested on x86-64 and x32. On x86-64, libc.so is the same with and without the fix.

    [BZ #24097] CVE-2019-6488
    * sysdeps/x86_64/multiarch/memset-avx512-no-vzeroupper.S: Use RDX_LP for length. Clear the upper 32 bits of RDX register.
    * sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: Likewise.
    * sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-wmemset.
    * sysdeps/x86_64/x32/tst-size_t-memset.c: New file.
    * sysdeps/x86_64/x32/tst-size_t-wmemset.c: Likewise.

    (cherry picked from commit 82d0b4a4d76db554eb6757acb790fcea30b19965)
* x86-64 memrchr: Properly handle the length parameter [BZ #24097] | H.J. Lu | 2019-02-01 | 4 | -5/+63
    On x32, the size_t parameter may be passed in the lower 32 bits of a 64-bit register with non-zero upper 32 bits. The string/memory functions written in assembly can only use the lower 32 bits of a 64-bit register as length or must clear the upper 32 bits before using the full 64-bit register for length.

    This patch fixes memrchr for x32. Tested on x86-64 and x32. On x86-64, libc.so is the same with and without the fix.

    [BZ #24097] CVE-2019-6488
    * sysdeps/x86_64/memrchr.S: Use RDX_LP for length.
    * sysdeps/x86_64/multiarch/memrchr-avx2.S: Likewise.
    * sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memrchr.
    * sysdeps/x86_64/x32/tst-size_t-memrchr.c: New file.

    (cherry picked from commit ecd8b842cf37ea112e59cd9085ff1f1b6e208ae0)
* x86-64 memcpy: Properly handle the length parameter [BZ #24097] | H.J. Lu | 2019-02-01 | 6 | -40/+120
    On x32, the size_t parameter may be passed in the lower 32 bits of a 64-bit register with non-zero upper 32 bits. The string/memory functions written in assembly can only use the lower 32 bits of a 64-bit register as length or must clear the upper 32 bits before using the full 64-bit register for length.

    This patch fixes memcpy for x32. Tested on x86-64 and x32. On x86-64, libc.so is the same with and without the fix.

    [BZ #24097] CVE-2019-6488
    * sysdeps/x86_64/multiarch/memcpy-ssse3-back.S: Use RDX_LP for length. Clear the upper 32 bits of RDX register.
    * sysdeps/x86_64/multiarch/memcpy-ssse3.S: Likewise.
    * sysdeps/x86_64/multiarch/memmove-avx512-no-vzeroupper.S: Likewise.
    * sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S: Likewise.
    * sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memcpy. tst-size_t-wmemchr.
    * sysdeps/x86_64/x32/tst-size_t-memcpy.c: New file.

    (cherry picked from commit 231c56760c1e2ded21ad96bbb860b1f08c556c7a)
* x86-64 memcmp/wmemcmp: Properly handle the length parameter [BZ #24097] | H.J. Lu | 2019-02-01 | 6 | -9/+114
    On x32, the size_t parameter may be passed in the lower 32 bits of a 64-bit register with non-zero upper 32 bits. The string/memory functions written in assembly can only use the lower 32 bits of a 64-bit register as length or must clear the upper 32 bits before using the full 64-bit register for length.

    This patch fixes memcmp/wmemcmp for x32. Tested on x86-64 and x32. On x86-64, libc.so is the same with and without the fix.

    [BZ #24097] CVE-2019-6488
    * sysdeps/x86_64/multiarch/memcmp-avx2-movbe.S: Use RDX_LP for length. Clear the upper 32 bits of RDX register.
    * sysdeps/x86_64/multiarch/memcmp-sse4.S: Likewise.
    * sysdeps/x86_64/multiarch/memcmp-ssse3.S: Likewise.
    * sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memcmp and tst-size_t-wmemcmp.
    * sysdeps/x86_64/x32/tst-size_t-memcmp.c: New file.
    * sysdeps/x86_64/x32/tst-size_t-wmemcmp.c: Likewise.

    (cherry picked from commit b304fc201d2f6baf52ea790df8643e99772243cd)
* x86-64 memchr/wmemchr: Properly handle the length parameter [BZ #24097] | H.J. Lu | 2019-02-01 | 6 | -5/+148
    On x32, the size_t parameter may be passed in the lower 32 bits of a 64-bit register with non-zero upper 32 bits. The string/memory functions written in assembly can only use the lower 32 bits of a 64-bit register as length or must clear the upper 32 bits before using the full 64-bit register for length.

    This patch fixes memchr/wmemchr for x32. Tested on x86-64 and x32. On x86-64, libc.so is the same with and without the fix.

    [BZ #24097] CVE-2019-6488
    * sysdeps/x86_64/memchr.S: Use RDX_LP for length. Clear the upper 32 bits of RDX register.
    * sysdeps/x86_64/multiarch/memchr-avx2.S: Likewise.
    * sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memchr and tst-size_t-wmemchr.
    * sysdeps/x86_64/x32/test-size_t.h: New file.
    * sysdeps/x86_64/x32/tst-size_t-memchr.c: Likewise.
    * sysdeps/x86_64/x32/tst-size_t-wmemchr.c: Likewise.

    (cherry picked from commit 97700a34f36721b11a754cf37a1cc40695ece1fd)
* powerpc: Regenerate ULPs | Gabriel F. T. Gomes | 2019-01-11 | 1 | -2/+2
    On POWER9, cbrtf128 fails by 1 ULP.

    * sysdeps/powerpc/fpu/libm-test-ulps: Regenerate.

    Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com>
    (cherry picked from commit 428fc49eaafe0fe5352445fcf23d9f603e9083a2)
* [BZ #21745] powerpc: build some IFUNC math functions for libc and libm | Tulio Magno Quites Machado Filho | 2019-01-11 | 2 | -17/+22
    Some math functions have to be distributed in libc because they're required by printf. libc and libm require their own builds of these functions, e.g. libc functions have to call __stack_chk_fail_local in order to bypass the PLT, while libm functions have to call __stack_chk_fail.

    While math/Makefile treats the generic cases, i.e. s_isinff, the multiarch Makefile has to treat its own files, i.e. s_isinff-ppc64.

    [BZ #21745]
    * sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile:
    [$(subdir) = math] (sysdep_calls): New variable. Has the previous contents of sysdep_routines, but re-sorted.
    [$(subdir) = math] (sysdep_routines): Re-use the contents from sysdep_calls.
    [$(subdir) = math] (libm-sysdep_routines): Remove the functions defined in sysdep_calls and replace by the respective m_* names.
    * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-ppc64.S: (compat_symbol): Undefine to avoid duplicated compat symbols in libc.

    (cherry picked from commit 61c45f250528dae431391823a9766053e61ccde1)
* CVE-2018-19591: if_nametoindex: Fix descriptor for overlong name [BZ #23927] | Florian Weimer | 2018-11-27 | 1 | -5/+6
    (cherry picked from commit d527c860f5a3f0ed687bd03f0cb464612dc23408)
* x86: Fix Haswell CPU string flags (BZ#23709) | Adhemerval Zanella | 2018-11-02 | 1 | -0/+6
    The commit 'Disable TSX on some Haswell processors.' (2702856bf4) changed the default flags for Haswell models. Previously, new models were handled by the default switch path, which assumed a Core i3/i5/i7 if AVX is available. After the patch, Haswell models (0x3f, 0x3c, 0x45, 0x46) do not set the flags Fast_Rep_String, Fast_Unaligned_Load, Fast_Unaligned_Copy, and Prefer_PMINUB_for_stringop (only the TSX one).

    This patch fixes it by disentangling the TSX flag handling from the memory optimization ones. The strstr case cited in the patch now selects __strstr_sse2_unaligned as expected for the Haswell CPU.

    Checked on x86_64-linux-gnu.

    [BZ #23709]
    * sysdeps/x86/cpu-features.c (init_cpu_features): Set TSX bits independently of other flags.

    (cherry picked from commit c3d8dc45c9df199b8334599a6cbd98c9950dba62)
* utmp: Avoid -Wstringop-truncation warning | Martin Sebor | 2018-10-22 | 2 | -6/+12
    The -Wstringop-truncation option new in GCC 8 detects common misuses of the strncat and strncpy functions that may result in truncating the copied string before the terminating NUL. To avoid false positive warnings for correct code that intentionally creates sequences of characters that aren't guaranteed to be NUL-terminated, arrays that are intended to store such sequences should be decorated with a new nonstring attribute. This change adds this attribute to Glibc and uses it to suppress such false positives.

    ChangeLog:
    * misc/sys/cdefs.h (__attribute_nonstring__): New macro.
    * sysdeps/gnu/bits/utmp.h (struct utmp): Use it.
    * sysdeps/unix/sysv/linux/s390/bits/utmp.h (struct utmp): Same.

    (cherry picked from commit 7532837d7b03b3ca5b9a63d77a5bd81dd23f3d9c)
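    Editorial sketch (not part of the commit; the struct and field sizes are illustrative, and it assumes GCC 8 or newer for the attribute): marking a fixed-size, possibly unterminated character array as nonstring lets strncpy fill it completely without triggering -Wstringop-truncation.

        #include <string.h>

        #if defined __GNUC__ && __GNUC__ >= 8
        # define NONSTRING __attribute__ ((__nonstring__))
        #else
        # define NONSTRING
        #endif

        /* ut_user-style field: a fixed-size character sequence that is not
           required to carry a terminating NUL byte.  */
        struct login_record
        {
          char user[32] NONSTRING;
        };

        void
        set_user (struct login_record *rec, const char *name)
        {
          /* strncpy may legitimately drop the NUL when NAME fills the array;
             the nonstring attribute tells GCC that this is intentional.  */
          strncpy (rec->user, name, sizeof rec->user);
        }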
* Avoid use of strlen in getlogin_r (bug 22447). | Joseph Myers | 2018-10-22 | 1 | -2/+3
    Building glibc with current mainline GCC fails, among other reasons, because of an error for use of strlen on the nonstring ut_user field. This patch changes the problem code in getlogin_r to use __strnlen instead. It also needs to set the trailing NUL byte of the result explicitly, because of the case where ut_user does not have such a trailing NUL byte (but the result should always have one).

    Tested for x86_64. Also tested that, in conjunction with <https://sourceware.org/ml/libc-alpha/2017-11/msg00797.html>, it fixes the build for arm with mainline GCC.

    [BZ #22447]
    * sysdeps/unix/getlogin_r.c (__getlogin_r): Use __strnlen not strlen to compute length of ut_user and set trailing NUL byte of result explicitly.

    (cherry picked from commit 4bae615022cb5a5da79ccda83cc6c9ba9f2d479c)
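    Editorial sketch of the pattern described above (illustrative names, not the glibc function): measure the possibly unterminated field with strnlen and terminate the destination explicitly.

        #include <errno.h>
        #include <string.h>

        /* Copy a fixed-size, possibly non-NUL-terminated field into NAME,
           which has room for NAME_LEN bytes.  Returns 0 or an errno value.  */
        static int
        copy_user_field (char *name, size_t name_len,
                         const char *ut_user, size_t ut_user_size)
        {
          size_t len = strnlen (ut_user, ut_user_size);  /* never runs past the field */

          if (len + 1 > name_len)
            return ERANGE;

          memcpy (name, ut_user, len);
          name[len] = '\0';        /* the source may lack the trailing NUL */
          return 0;
        }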
* signal: Use correct type for si_band in siginfo_t [BZ #23562] | Ilya Yu. Malakhov | 2018-10-22 | 1 | -1/+1
    (cherry picked from commit f997b4be18f7e57d757d39e42f7715db26528aa0)
* Fix misreported errno on preadv2/pwritev2 (BZ#23579) | Adhemerval Zanella | 2018-09-28 | 4 | -4/+4
    The fallback code of the Linux wrapper for preadv2/pwritev2 executes regardless of the errno code from preadv2, instead of only in the case where the syscall is not supported. This fixes it by calling the fallback code iff errno is ENOSYS. The patch also adds tests for both an invalid file descriptor and an invalid iov_len and vector count.

    The only discrepancy between preadv2 and the fallback code regarding error reporting is when invalid flags are used. The fallback code bails out earlier with ENOTSUP instead of EINVAL/EBADF when the syscall is used.

    Checked on x86_64-linux-gnu on a 4.4.0 and 4.15.0 kernel.

    [BZ #23579]
    * misc/tst-preadvwritev2-common.c (do_test_with_invalid_fd): New test.
    * misc/tst-preadvwritev2.c, misc/tst-preadvwritev64v2.c (do_test): Call do_test_with_invalid_fd.
    * sysdeps/unix/sysv/linux/preadv2.c (preadv2): Use fallback code iff errno is ENOSYS.
    * sysdeps/unix/sysv/linux/preadv64v2.c (preadv64v2): Likewise.
    * sysdeps/unix/sysv/linux/pwritev2.c (pwritev2): Likewise.
    * sysdeps/unix/sysv/linux/pwritev64v2.c (pwritev64v2): Likewise.

    (cherry picked from commit 7a16bdbb9ff4122af0a28dc20996c95352011fdd)
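    Editorial sketch of the dispatch pattern (not the glibc wrapper; try_syscall and emulate are stand-ins for the real syscall wrapper and the userspace fallback): fall back only when the kernel lacks the syscall entirely, otherwise preserve the real error.

        #include <errno.h>

        static long
        call_with_fallback (long (*try_syscall) (void *), long (*emulate) (void *),
                            void *args)
        {
          long ret = try_syscall (args);
          if (ret >= 0 || errno != ENOSYS)
            return ret;           /* success, or a genuine error such as EBADF */
          return emulate (args);  /* kernel too old: use the emulation */
        }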
* preadv2/pwritev2: Handle offset == -1 [BZ #22753] | Florian Weimer | 2018-09-28 | 8 | -8/+33
    Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
    (cherry picked from commit d4b4a00a462348750bb18544eb30853ee6ac5d10)
* Fix segfault in maybe_script_execute. | Stefan Liebler | 2018-09-10 | 1 | -1/+1
    If glibc is built with gcc 8 and -march=z900, the testcase posix/tst-spawn4-compat crashes with a segfault.

    In function maybe_script_execute, the new_argv array is dynamically initialized on stack with (argc + 1) elements. The function wants to add _PATH_BSHELL as the first argument and writes out of bounds of new_argv. There is an off-by-one because maybe_script_execute fails to count the terminating NULL when sizing new_argv.

    ChangeLog:
    * sysdeps/unix/sysv/linux/spawni.c (maybe_script_execute): Increment size of new_argv by one.

    (cherry picked from commit 28669f86f6780a18daca264f32d66b1428c9c6f1)
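    Editorial sketch of the sizing rule behind the fix (a generic helper, not the glibc code): an argv with argc entries carries argc + 1 pointers including the terminating NULL, so prepending one element needs argc + 2 slots.

        #include <stdlib.h>
        #include <string.h>

        /* Return a newly allocated argument vector with EXTRA prepended to ARGV,
           or NULL on allocation failure.  */
        static char **
        prepend_arg (char *const argv[], const char *extra)
        {
          size_t argc = 0;
          while (argv[argc] != NULL)
            argc++;

          /* argc existing arguments + 1 prepended + 1 terminating NULL.  */
          char **new_argv = malloc ((argc + 2) * sizeof (char *));
          if (new_argv == NULL)
            return NULL;

          new_argv[0] = (char *) extra;
          memcpy (&new_argv[1], argv, (argc + 1) * sizeof (char *));
          return new_argv;
        }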
* x86: Populate COMMON_CPUID_INDEX_80000001 for Intel CPUs [BZ #23459] | H.J. Lu | 2018-07-29 | 2 | -10/+19
    Reviewed-by: Carlos O'Donell <carlos@redhat.com>

    [BZ #23459]
    * sysdeps/x86/cpu-features.c (get_extended_indices): New function.
    (init_cpu_features): Call get_extended_indices for both Intel and AMD CPUs.
    * sysdeps/x86/cpu-features.h (COMMON_CPUID_INDEX_80000001): Remove "for AMD" comment.

    (cherry picked from commit be525a69a6630abc83144c0a96474f2e26da7443)
* x86: Correct index_cpu_LZCNT [BZ #23456] | H.J. Lu | 2018-07-29 | 1 | -1/+1
    cpu-features.h has

        #define bit_cpu_LZCNT   (1 << 5)
        #define index_cpu_LZCNT COMMON_CPUID_INDEX_1
        #define reg_LZCNT

    But the LZCNT feature bit is in COMMON_CPUID_INDEX_80000001:

        Initial EAX Value: 80000001H
        ECX Extended Processor Signature and Feature Bits:
        Bit 05: LZCNT available

    index_cpu_LZCNT should be COMMON_CPUID_INDEX_80000001, not COMMON_CPUID_INDEX_1. The VMX feature bit is in COMMON_CPUID_INDEX_1:

        Initial EAX Value: 01H
        Feature Information Returned in the ECX Register:
        5 VMX

    Reviewed-by: Carlos O'Donell <carlos@redhat.com>

    [BZ #23456]
    * sysdeps/x86/cpu-features.h (index_cpu_LZCNT): Set to COMMON_CPUID_INDEX_80000001.

    (cherry picked from commit 65d87ade1ee6f3ac099105e3511bd09bdc24cf3f)
* Check length of ifname before copying it into the ifreq structure. | Steve Ellcey | 2018-06-29 | 1 | -0/+6
    [BZ #22442]
    * sysdeps/unix/sysv/linux/if_index.c (__if_nametoindex): Check if ifname is too long.

    (cherry picked from commit 2180fee114b778515b3f560e5ff1e795282e60b0)
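    Editorial sketch of the kind of check involved (illustrative helper; the error value is an assumption, not taken from the commit): ifr_name holds at most IFNAMSIZ bytes, so overlong names should be rejected up front rather than silently truncated.

        #include <errno.h>
        #include <string.h>
        #include <net/if.h>

        /* Fill IFR->ifr_name from IFNAME, rejecting names that cannot fit.  */
        static int
        fill_ifreq_name (struct ifreq *ifr, const char *ifname)
        {
          if (strlen (ifname) >= IFNAMSIZ)
            {
              errno = ENODEV;   /* assumed choice: treat it like an unknown interface */
              return -1;
            }
          strcpy (ifr->ifr_name, ifname);
          return 0;
        }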
* getifaddrs: Don't return ifa entries with NULL names [BZ #21812] | Daniel Alvarez | 2018-06-29 | 1 | -0/+8
    A lookup operation in map_newlink could turn into an insert because of holes in the interface part of the map.

    This leads to the name of the interface being incorrectly set to NULL when the interface is not present for the address being processed (most likely because the interface was added between the RTM_GETLINK and RTM_GETADDR calls to the kernel).

    When such changes are detected by the kernel, it'll mark the dump as "inconsistent" by setting the NLM_F_DUMP_INTR flag on the next netlink message.

    This patch checks this condition and retries the whole operation. Hopes are that next time the interface corresponding to the address entry is present in the list and the correct name is returned.

    (cherry picked from commit c1f86a33ca32e26a9d6e29fc961e5ecb5e2e5eb4)
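    Editorial sketch of the consistency check mentioned above (not the glibc netlink reader itself): the kernel flags an interrupted dump on the message header, and the caller is expected to restart the whole enumeration when it sees the flag.

        #include <stdbool.h>
        #include <linux/netlink.h>

        /* True when the kernel marked this dump as inconsistent, i.e. the link
           or address set changed while the dump was in progress and the caller
           should redo the RTM_GETLINK/RTM_GETADDR sequence from scratch.  */
        static bool
        netlink_dump_interrupted (const struct nlmsghdr *nlh)
        {
          return (nlh->nlmsg_flags & NLM_F_DUMP_INTR) != 0;
        }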
* Don't write beyond destination in __mempcpy_avx512_no_vzeroupper (bug 23196) | Andreas Schwab | 2018-05-24 | 1 | -2/+3
    When compiled as mempcpy, the return value is the end of the destination buffer, thus it cannot be used to refer to the start of it.

    (cherry picked from commit 9aaaab7c6e4176e61c59b0a63c6ba906d875dc0e)
* Fix blocking pthread_join. [BZ #23137] | Stefan Liebler | 2018-05-17 | 1 | -5/+8
    On s390 (31bit), if glibc is built with -Os, pthread_join sometimes blocks indefinitely. This is e.g. observable with testcase intl/tst-gettext6.

    pthread_join is calling lll_wait_tid(tid), which performs the futex-wait syscall in a loop as long as tid != 0 (thread is alive).

    On s390 (and built with -Os), tid is loaded from memory before comparing against zero and then the tid is loaded a second time in order to pass it to the futex-wait syscall. If the thread exits in between, then the futex-wait syscall is called with the value zero and it waits until a futex-wake occurs. As the thread has already exited, there won't be a futex-wake.

    In lll_wait_tid, the tid is stored to the local variable __tid, which is then used as argument for the futex-wait syscall. But unfortunately the compiler is allowed to reload the value from memory.

    With this patch, the tid is loaded with atomic_load_acquire. Then the compiler is not allowed to reload the value for __tid from memory.

    ChangeLog:
    [BZ #23137]
    * sysdeps/nptl/lowlevellock.h (lll_wait_tid): Use atomic_load_acquire to load __tid.

    (cherry picked from commit 1660901840dfc9fde6c5720a32f901af6f08f00a)
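    Editorial sketch of the fixed loop shape (C11 atomics and a raw futex call stand in for the glibc internals): the value that is tested against zero is the same value handed to the futex wait, so the compiler cannot legally re-read it in between.

        #define _GNU_SOURCE
        #include <stdatomic.h>
        #include <unistd.h>
        #include <sys/syscall.h>
        #include <linux/futex.h>

        /* Wait until *TIDP becomes 0 (the thread has exited).  */
        static void
        wait_for_thread_exit (_Atomic int *tidp)
        {
          int tid;

          while ((tid = atomic_load_explicit (tidp, memory_order_acquire)) != 0)
            /* Pass the value already loaded; with a plain load the compiler
               could reload *tidp here, see 0, and then block forever because
               no wake ever arrives for an already-exited thread.  */
            syscall (SYS_futex, tidp, FUTEX_WAIT, tid, NULL, NULL, 0);
        }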
* i386: Fix i386 sigaction sa_restorer initialization (BZ#21269) | Adhemerval Zanella | 2018-05-17 | 3 | -1/+238
    This patch fixes the i386 sa_restorer field initialization for the sigaction syscall for kernels with vDSO. As described in the bug report, i386 Linux (and compat on x86_64) interprets SA_RESTORER clear with nonzero sa_restorer as a request for stack switching if the SS segment is 'funny'. This means that anything that tries to mix glibc's signal handling with segmentation (for instance through the modify_ldt syscall) is randomly broken depending on what value lands in sa_restorer.

    The testcase added is based on the Linux test tools/testing/selftests/x86/ldt_gdt.c, more specifically on the do_multicpu_tests function. The main changes are:

    - C11 atomics instead of plain access.
    - Remove x86_64 support, which simplifies the syscall handling and fallbacks.
    - Replicate only the test required to trigger the issue.

    Checked on i686-linux-gnu.

    [BZ #21269]
    * sysdeps/unix/sysv/linux/i386/Makefile (tests): Add tst-bz21269.
    * sysdeps/unix/sysv/linux/i386/sigaction.c (SET_SA_RESTORER): Clear sa_restorer for vDSO case.
    * sysdeps/unix/sysv/linux/i386/tst-bz21269.c: New file.

    (cherry picked from commit 68448be208ee06e76665918b37b0a57e3e00c8b4)
* Fix i386 memmove issue (bug 22644). | Andrew Senkevich | 2018-05-17 | 1 | -6/+6
    [BZ #22644]
    * sysdeps/i386/i686/multiarch/memcpy-sse2-unaligned.S: Fixed branch conditions.
    * string/test-memmove.c (do_test2): New testcase.

    (cherry picked from commit cd66c0e584c6d692bc8347b5e72723d02b8a8ada)
* getlogin_r: return early when linux sentinel value is set | Jesse Hathaway | 2018-05-17 | 1 | -0/+9
    When there is no login uid, Linux sets /proc/self/loginuid to the sentinel value of (uid_t) -1. If this is set we can return early and avoid needlessly looking up the sentinel value in any configured nss databases.

    Checked on aarch64-linux-gnu.

    * sysdeps/unix/sysv/linux/getlogin_r.c (__getlogin_r_loginuid): Return early when linux sentinel value is set.

    Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
    (cherry picked from commit cc8a1620eb97ccddd337d157263c13c57b39ab71)
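    Editorial sketch of the check (illustrative, not the glibc function): read /proc/self/loginuid and treat the kernel's unset sentinel, (uid_t) -1, as "no login uid" without consulting NSS.

        #include <stdio.h>

        /* Returns 1 and stores the login uid in *UID if one is set,
           0 if the kernel reports the unset sentinel or the file is absent.  */
        static int
        read_loginuid (unsigned int *uid)
        {
          FILE *f = fopen ("/proc/self/loginuid", "r");
          if (f == NULL)
            return 0;

          unsigned long long value;
          int ok = (fscanf (f, "%llu", &value) == 1);
          fclose (f);

          /* (uid_t) -1 means no audit login uid was ever assigned.  */
          if (!ok || value == (unsigned int) -1)
            return 0;

          *uid = (unsigned int) value;
          return 1;
        }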
* powerpc: Fix syscalls during early process initialization [BZ #22685] | Tulio Magno Quites Machado Filho | 2018-01-29 | 3 | -4/+36
    The tunables framework needs to execute syscalls early in process initialization, before the TCB is available for consumption. This behavior conflicts with powerpc{|64|64le}'s lock elision code, which checks the TCB before trying to abort transactions immediately before executing a syscall.

    This patch adds a powerpc-specific implementation of __access_noerrno that does not abort transactions before executing the syscall.

    Tested on powerpc{|64|64le}.

    [BZ #22685]
    * sysdeps/powerpc/powerpc32/sysdep.h (ABORT_TRANSACTION_IMPL): Renamed from ABORT_TRANSACTION.
    (ABORT_TRANSACTION): Redirect to ABORT_TRANSACTION_IMPL.
    * sysdeps/powerpc/powerpc64/sysdep.h (ABORT_TRANSACTION, ABORT_TRANSACTION_IMPL): Likewise.
    * sysdeps/unix/sysv/linux/powerpc/not-errno.h: New file. Reuse Linux code, but remove the code that aborts transactions.

    Signed-off-by: Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com>
    Tested-by: Aurelien Jarno <aurelien@aurel32.net>
    (cherry picked from commit 4612268a0ad8e3409d8ce2314dd2dd8ee0af5269)
* Provide a C++ version of iseqsig (bug 22377) | Gabriel F. T. Gomes | 2018-01-29 | 1 | -1/+6
    In C++ mode, __MATH_TG cannot be used for defining iseqsig, because __MATH_TG relies on __builtin_types_compatible_p, which is a C-only builtin. This is true when float128 is provided as an ABI-distinct type from long double.

    Moreover, the comparison macros from ISO C take two floating-point arguments, which need not have the same type. Choosing what underlying function to call requires evaluating the formats of the arguments, then selecting which is wider. The macro __MATH_EVAL_FMT2 provides this information, however, only the type of the macro expansion is relevant (actually evaluating the expression would be incorrect).

    This patch provides a C++ version of iseqsig, in which only the type of __MATH_EVAL_FMT2 (__typeof or decltype) is used as a template parameter for __iseqsig_type. This function calls the appropriate underlying function.

    Tested for powerpc64le and x86_64.

    [BZ #22377]
    * math/Makefile [C++] (tests): Add test for iseqsig.
    * math/math.h [C++] (iseqsig): New implementation, which does not rely on __MATH_TG/__builtin_types_compatible_p.
    * math/test-math-iseqsig.cc: New file.
    * sysdeps/powerpc/powerpc64le/Makefile (CFLAGS-test-math-iseqsig.cc): New variable.

    (cherry picked from commit c85e54ac6cef0faed7b7ffc722f52523dec59bf5)
* [AARCH64] Rewrite elf_machine_load_address using _DYNAMIC symbol | Szabolcs Nagy | 2018-01-26 | 1 | -34/+5
    This patch rewrites the aarch64 elf_machine_load_address to use the special _DYNAMIC symbol instead of _dl_start.

    The static address of the _DYNAMIC symbol is stored in the first GOT entry. Here is the change which makes this solution work (part of binutils 2.24): https://sourceware.org/ml/binutils/2013-06/msg00248.html

    The i386 and x86_64 targets use the same method to do this as well.

    The original implementation relies on a trick in which the R_AARCH64_ABS32 relocation is resolved at link time and the static address fits in 32 bits. However, in LP64 the address is normally defined to be 64 bit. Here is the C version, which should be portable in all cases.

    * sysdeps/aarch64/dl-machine.h (elf_machine_load_address): Use _DYNAMIC symbol to calculate load address.

    (cherry picked from commit a68ba2f3cd3cbe32c1f31e13c20ed13487727b32)
* x86-64: Properly align La_x86_64_retval to VEC_SIZE [BZ #22715] | H.J. Lu | 2018-01-19 | 1 | -2/+10
    _dl_runtime_profile calls _dl_call_pltexit, passing a pointer to La_x86_64_retval which is allocated on stack. The lrv_vector0 field in La_x86_64_retval must be aligned to size of vector register. When allocating stack space for La_x86_64_retval, we need to make sure that the address of La_x86_64_retval + RV_VECTOR0_OFFSET is aligned to VEC_SIZE. This patch checks the alignment of the lrv_vector0 field and pads the stack space if needed.

    Tested with x32 and x86-64 on SSE4, AVX and AVX512 machines. It fixed

    FAIL: elf/tst-audit10
    FAIL: elf/tst-audit4
    FAIL: elf/tst-audit5
    FAIL: elf/tst-audit6
    FAIL: elf/tst-audit7

    on x32 AVX512 machine.

    (cherry picked from commit 207a72e2988c6d6343f50fe0128eb4fc4edfdd15)

    [BZ #22715]
    * sysdeps/x86_64/dl-trampoline.h (_dl_runtime_profile): Properly align La_x86_64_retval to VEC_SIZE.
* nptl: Open libgcc.so with RTLD_NOW during pthread_cancel [BZ #22636] | Florian Weimer | 2018-01-15 | 1 | -1/+1
    Disabling lazy binding reduces stack usage during unwinding.

    Note that RTLD_NOW only makes a difference if libgcc.so has not already been loaded, so this is only a partial fix.

    Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
    (cherry picked from commit f993b8754080ac7572b692870e926d8b493db16c)