| Commit message (Collapse) | Author | Age | Files | Lines |
|\ |
|
| |
| |
| |
| |
| |
| |
| |
| | |
The value of `end_name' points into the value of `dirname', thus don't
deallocate the latter before the last use of the former.
(cherry picked from commit ddc650e9b3dc916eab417ce9f79e67337b05035c with
changes from commit d711a00f93fa964f41a53839228598fbf1a6b482)
|
| |
| |
| |
| |
| |
| |
| |
| | |
When unwinding through a signal frame the backtrace function on PowerPC
didn't check array bounds when storing the frame address. Fixes commit
d400dcac5e ("PowerPC: fix backtrace to handle signal trampolines").
(cherry picked from commit d93769405996dfc11d216ddbe415946617b5a494)
|
|\| |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Commit c402355dfa7807b8e0adb27c009135a7e2b9f1b0 ("libio: Disable
vtable validation in case of interposition [BZ #23313]") only covered
the interposable glibc 2.1 handles, in libio/stdfiles.c. The
parallel code in libio/oldstdfiles.c needs similar detection logic.
Fixes (again) commit db3476aff19b75c4fdefbe65fcd5f0a90588ba51
("libio: Implement vtable verification [BZ #20191]").
Change-Id: Ief6f9f17e91d1f7263421c56a7dc018f4f595c21
(cherry picked from commit cb61630ed712d033f54295f776967532d3f4b46a)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
(CVE-2019-19126) [BZ #25204]
The problem was introduced in glibc 2.23, in commit
b9eb92ab05204df772eb4929eccd018637c9f3e9
("Add Prefer_MAP_32BIT_EXEC to map executable pages with MAP_32BIT").
(cherry picked from commit d5dfad4326fc683c813df1e37bbf5cf920591c8e)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Linux/Mips kernels prior to 4.8 could potentially crash the user
process when doing FPU emulation while running on non-executable
user stack.
Currently, gcc doesn't emit .note.GNU-stack for mips, but that will
change in the future. To ensure that glibc can be used with such
future gcc, without silently resulting in binaries that might crash
in runtime, this patch forces RWX stack for all built objects if
configured to run against minimum kernel version less than 4.8.
* sysdeps/unix/sysv/linux/mips/Makefile
(test-xfail-check-execstack):
Move under mips-has-gnustack != yes.
(CFLAGS-.o*, ASFLAGS-.o*): New rules.
Apply -Wa,-execstack if mips-force-execstack == yes.
* sysdeps/unix/sysv/linux/mips/configure: Regenerated.
* sysdeps/unix/sysv/linux/mips/configure.ac
(mips-force-execstack): New var.
Set to yes for hard-float builds with minimum_kernel < 4.8.0
or minimum_kernel not set at all.
(mips-has-gnustack): New var.
Use value of libc_cv_as_noexecstack
if mips-force-execstack != yes, otherwise set to no.
(cherry picked from commit 33bc9efd91de1b14354291fc8ebd5bce96379f12)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This patch significantly improves performance of memmem using a novel
modified Horspool algorithm. Needles up to size 256 use a bad-character
table indexed by hashed pairs of characters to quickly skip past mismatches.
Long needles use a self-adapting filtering step to avoid comparing the whole
needle repeatedly.
By limiting the needle length to 256, the shift table only requires 8 bits
per entry, lowering preprocessing overhead and minimizing cache effects.
This limit also implies worst-case performance is linear.
Small needles up to size 2 use a dedicated linear search. Very long needles
use the Two-Way algorithm (to avoid increasing stack size or slowing down
the common case, inlining is disabled).
The performance gain is 6.6 times on English text on AArch64 using random
needles with average size 8.
Tested against GLIBC testsuite and randomized tests.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
* string/memmem.c (__memmem): Rewrite to improve performance.
(cherry picked from commit 680942b0167715e123d934b609060cd382f8e39f)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This patch significantly improves performance of strstr using a novel
modified Horspool algorithm. Needles up to size 256 use a bad-character
table indexed by hashed pairs of characters to quickly skip past mismatches.
Long needles use a self-adapting filtering step to avoid comparing the whole
needle repeatedly.
By limiting the needle length to 256, the shift table only requires 8 bits
per entry, lowering preprocessing overhead and minimizing cache effects.
This limit also implies worst-case performance is linear.
Small needles up to size 3 use a dedicated linear search. Very long needles
use the Two-Way algorithm.
The performance gain using the improved bench-strstr on Cortex-A72 is 5.8
times basic_strstr and 3.7 times twoway_strstr.
Tested against GLIBC testsuite, randomized tests and the GNULIB strstr test
(https://git.savannah.gnu.org/cgit/gnulib.git/tree/tests/test-strstr.c).
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
* string/str-two-way.h (two_way_short_needle): Add inline to avoid
warning.
(two_way_long_needle): Block inlining.
* string/strstr.c (strstr2): Add new function.
(strstr3): Likewise.
(STRSTR): Completely rewrite strstr to improve performance.
(cherry picked from commit 5e0a7ecb6629461b28adc1a5aabcc0ede122f201)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The generic strstr in GLIBC 2.28 fails to match huge needles. The optimized
AVAILABLE macro reads ahead a large fixed amount to reduce the overhead of
repeatedly checking for the end of the string. However if the needle length
is larger than this, two_way_long_needle may confuse this as meaning the end
of the string and return NULL. This is fixed by adding the needle length to
the amount to read ahead.
[BZ #23637]
* string/test-strstr.c (pr23637): New function.
(test_main): Add tests with longer needles.
* string/strcasestr.c (AVAILABLE): Fix readahead distance.
* string/strstr.c (AVAILABLE): Likewise.
(cherry picked from commit 83a552b0bb9fc2a5e80a0ab3723c0a80ce1db9f2)
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
As done in commit 284f42bc778e487dfd5dff5c01959f93b9e0c4f5, memcmp
can be used after memchr to avoid the initialization overhead of the
two-way algorithm for the first match. This has shown improvement
>40% for first match.
(cherry picked from commit c8dd67e7c958de04c3783cbea7c384431707b5f8)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Looking at the benchtests, both strstr and strcasestr spend a lot of time
in a slow initialization loop handling one character per iteration.
This can be simplified and use the much faster strlen/strnlen/strchr/memcmp.
Read ahead a few cachelines to reduce the number of strnlen calls, which
improves performance by ~3-4%. This patch improves the time taken for the
full strstr benchtest by >40%.
* string/strcasestr.c (STRCASESTR): Simplify and speedup first match.
* string/strstr.c (AVAILABLE): Likewise.
(cherry picked from commit 284f42bc778e487dfd5dff5c01959f93b9e0c4f5)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Improve strstr performance. Strstr tends to be slow because it uses
many calls to memchr and a slow byte loop to scan for the next match.
Performance is significantly improved by using strnlen on larger blocks
and using strchr to search for the next matching character. strcasestr
can also use strnlen to scan ahead, and memmem can use memchr to check
for the next match.
On the GLIBC bench tests the performance gains on Cortex-A72 are:
strstr: +25%
strcasestr: +4.3%
memmem: +18%
On a 256KB dataset strstr performance improves by 67%, strcasestr by 47%.
Reviewd-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit 3ae725dfb6d7f61447d27d00ed83e573bd5454f4)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Add Ares to the midr_el0 list and support ifunc dispatch. Since Ares
supports 2 128-bit loads/stores, use Neon registers for memcpy by
selecting __memcpy_falkor by default (we should rename this to
__memcpy_simd or similar).
* manual/tunables.texi (glibc.cpu.name): Add ares tunable.
* sysdeps/aarch64/multiarch/memcpy.c (__libc_memcpy): Use
__memcpy_falkor for ares.
* sysdeps/unix/sysv/linux/aarch64/cpu-features.h (IS_ARES):
Add new define.
* sysdeps/unix/sysv/linux/aarch64/cpu-features.c (cpu_list):
Add ares cpu.
(cherry picked from commit 02f440c1ef5d5d79552a524065aa3e2fabe469b9)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Vector registers perform better than scalar register pairs for copying
data so prefer them instead. This results in a time reduction of over
50% (i.e. 2x speed improvemnet) for some smaller sizes for memcpy-walk.
Larger sizes show improvements of around 1% to 2%. memcpy-random shows
a very small improvement, in the range of 1-2%.
* sysdeps/aarch64/multiarch/memcpy_falkor.S (__memcpy_falkor):
Use vector registers.
(cherry picked from commit 0aec4c1d1801e8016ebe89281d16597e0557b8be)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
For smaller and medium sized copies, the effect of hardware
prefetching are not as dominant as instruction level parallelism.
Hence it makes more sense to load data into multiple registers than to
try and route them to the same prefetch unit. This is also the case
for the loop exit where we are unable to latch on to the same prefetch
unit anyway so it makes more sense to have data loaded in parallel.
The performance results are a bit mixed with memcpy-random, with
numbers jumping between -1% and +3%, i.e. the numbers don't seem
repeatable. memcpy-walk sees a 70% improvement (i.e. > 2x) for 128
bytes and that improvement reduces down as the impact of the tail copy
decreases in comparison to the loop.
* sysdeps/aarch64/multiarch/memcpy_falkor.S (__memcpy_falkor):
Use multiple registers to copy data in loop tail.
(cherry picked from commit db725a458e1cb0e17204daa543744faf08bb2e06)
|
| |
| |
| |
| |
| |
| | |
A lsr can do what the mov and lsr did.
(cherry picked from commit b47c3e7637efb77818cbef55dcd0ed1f0ea0ddf1)
|
| |
| |
| |
| |
| |
| |
| | |
Binutils 2.26.* and older do not support moves with shifted registers,
so use a separate shift instruction instead.
(cherry picked from commit d46f84de745db8f3f06a37048261f4e5ceacf0a3)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The mutually misaligned inputs on aarch64 are compared with a simple
byte copy, which is not very efficient. Enhance the comparison
similar to strcmp by loading a double-word at a time. The peak
performance improvement (i.e. 4k maxlen comparisons) due to this on
the strncmp microbenchmark is as follows:
falkor: 3.5x (up to 72% time reduction)
cortex-a73: 3.5x (up to 71% time reduction)
cortex-a53: 3.5x (up to 71% time reduction)
All mutually misaligned inputs from 16 bytes maxlen onwards show
upwards of 15% improvement and there is no measurable effect on the
performance of aligned/mutually aligned inputs.
* sysdeps/aarch64/strncmp.S (count): New macro.
(strncmp): Store misaligned length in SRC1 in COUNT.
(mutual_align): Adjust.
(misaligned8): Load dword at a time when it is safe.
(cherry picked from commit 7108f1f944792ac68332967015d5e6418c5ccc88)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
I accidentally set the loop jump back label as misaligned8 instead of
do_misaligned. The typo is harmless but it's always nice to not have
to unnecessarily execute those two instructions.
* sysdeps/aarch64/strcmp.S (do_misaligned): Jump back to
do_misaligned, not misaligned8.
(cherry picked from commit 6ca24c43481e2c93a6eec362b04c3e77a35b28e3)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Replace the simple byte-wise compare in the misaligned case with a
dword compare with page boundary checks in place. For simplicity I've
chosen a 4K page boundary so that we don't have to query the actual
page size on the system.
This results in up to 3x improvement in performance in the unaligned
case on falkor and about 2.5x improvement on mustang as measured using
bench-strcmp.
* sysdeps/aarch64/strcmp.S (misaligned8): Compare dword at a
time whenever possible.
(cherry picked from commit 2bce01ebbaf8db52ba4a5635eb5744f989cdbf69)
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
I goofed up when changing the loop8 name to loop16 and missed on out
the branch instance. Fixed and actually build tested this time.
* sysdeps/aarch64/memcmp.S (more16): Fix branch target loop16.
(cherry picked from commit 4e54d918630ea53e29dd70d3bdffcb00d29ed3d4)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This improved memcmp provides a fast path for compares up to 16 bytes
and then compares 16 bytes at a time, thus optimizing loads from both
sources. The glibc memcmp microbenchmark retains performance (with an
error of ~1ns) for smaller compare sizes and reduces up to 31% of
execution time for compares up to 4K on the APM Mustang. On Qualcomm
Falkor this improves to almost 48%, i.e. it is almost 2x improvement
for sizes of 2K and above.
* sysdeps/aarch64/memcmp.S: Widen comparison to 16 bytes at a
time.
(cherry picked from commit 30a81dae5b752f8aa5f96e7f7c341ec57cba3585)
|
| |
| |
| |
| |
| |
| |
| |
| | |
The L() macro makes the assembly a bit more readable.
* sysdeps/aarch64/memcmp.S: Use L() macro for labels.
(cherry picked from commit 84c94d2fd90d84ae7e67657ee8e22c2d1b796f63)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This is an optimized memcmp for AArch64. This is a complete rewrite
using a different algorithm. The previous version split into cases
where both inputs were aligned, the inputs were mutually aligned and
unaligned using a byte loop. The new version combines all these cases,
while small inputs of less than 8 bytes are handled separately.
This allows the main code to be sped up using unaligned loads since
there are now at least 8 bytes to be compared. After the first 8 bytes,
align the first input. This ensures each iteration does at most one
unaligned access and mutually aligned inputs behave as aligned.
After the main loop, process the last 8 bytes using unaligned accesses.
This improves performance of (mutually) aligned cases by 25% and
unaligned by >500% (yes >6 times faster) on large inputs.
* sysdeps/aarch64/memcmp.S (memcmp):
Rewrite of optimized memcmp.
(cherry picked from commit 922369032c604b4dcfd535e1bcddd4687e7126a5)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The fix for BZ#21270 (commit 158d5fa0e19) added a mask to avoid offset larger
than 1^44 to be used along __NR_mmap2. However mips64n32 users __NR_mmap,
as mips64n64, but still defines off_t as old non-LFS type (other ILP32, such
x32, defines off_t being equal to off64_t). This leads to use the same
mask meant only for __NR_mmap2 call for __NR_mmap, thus limiting the maximum
offset it can use with mmap64.
This patch fixes by setting the high mask only for __NR_mmap2 usage. The
posix/tst-mmap-offset.c already tests it and also fails for mips64n32. The
patch also change the test to check for an arch-specific header that defines
the maximum supported offset.
Checked on x86_64-linux-gnu, i686-linux-gnu, and I also tests tst-mmap-offset
on qemu simulated mips64 with kernel 3.2.0 kernel for both mips-linux-gnu and
mips64-n32-linux-gnu.
[BZ #24699]
* posix/tst-mmap-offset.c: Mention BZ #24699.
(do_test_bz21270): Rename to do_test_large_offset and use
mmap64_maximum_offset to check for maximum expected offset value.
* sysdeps/generic/mmap_info.h: New file.
* sysdeps/unix/sysv/linux/mips/mmap_info.h: Likewise.
* sysdeps/unix/sysv/linux/mmap64.c (MMAP_OFF_HIGH_MASK): Define iff
__NR_mmap2 is used.
(cherry picked from commit a008c76b56e4f958cf5a0d6f67d29fade89421b7)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Backport of commit 82bc69c012838a381c4167c156a06f4598f34227
and commit 30ba0375464f34e4bf8129f3d3dc14d0c09add17
without using DT_AARCH64_VARIANT_PCS for optimizing the symbol table check.
This is needed so the internal abi between ld.so and libc.so is unchanged.
Avoid lazy binding of symbols that may follow a variant PCS with different
register usage convention from the base PCS.
Currently the lazy binding entry code does not preserve all the registers
required for AdvSIMD and SVE vector calls. Saving and restoring all
registers unconditionally may break existing binaries, even if they never
use vector calls, because of the larger stack requirement for lazy
resolution, which can be significant on an SVE system.
The solution is to mark all symbols in the symbol table that may follow
a variant PCS so the dynamic linker can handle them specially. In this
patch such symbols are always resolved at load time, not lazily.
So currently LD_AUDIT for variant PCS symbols are not supported, for that
the _dl_runtime_profile entry needs to be changed e.g. to unconditionally
save/restore all registers (but pass down arg and retval registers to
pltentry/exit callbacks according to the base PCS).
This patch also removes a __builtin_expect from the modified code because
the branch prediction hint did not seem useful.
* sysdeps/aarch64/dl-machine.h (elf_machine_lazy_rel): Check
STO_AARCH64_VARIANT_PCS and bind such symbols at load time.
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
STO_AARCH64_VARIANT_PCS is a non-visibility st_other flag for marking
symbols that reference functions that may follow a variant PCS with
different register usage convention from the base PCS.
DT_AARCH64_VARIANT_PCS is a dynamic tag that marks ELF modules that
have R_*_JUMP_SLOT relocations for symbols marked with
STO_AARCH64_VARIANT_PCS (i.e. have variant PCS calls via a PLT).
* elf/elf.h (STO_AARCH64_VARIANT_PCS): Define.
(DT_AARCH64_VARIANT_PCS): Define.
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The tcache counts[] array is a char, which has a very small range and thus
may overflow. When setting tcache_count tunable, there is no overflow check.
However the tunable must not be larger than the maximum value of the tcache
counts[] array, otherwise it can overflow when filling the tcache.
[BZ #24531]
* malloc/malloc.c (MAX_TCACHE_COUNT): New define.
(do_set_tcache_count): Only update if count is small enough.
* manual/tunables.texi (glibc.malloc.tcache_count): Document max value.
(cherry picked from commit 5ad533e8e65092be962e414e0417112c65d154fb)
|
| |
| |
| |
| |
| |
| |
| |
| | |
When computing the length of the converted part of the stdio buffer, use
the number of consumed wide characters, not the (negative) distance to the
end of the wide buffer.
(cherry picked from commit 32ff397533715988c19cbf3675dcbd727ec13e18)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
pthread_mutex_trylock. [BZ #24180]
While debugging a kernel warning, Thomas Gleixner, Sebastian Sewior and
Heiko Carstens found a bug in pthread_mutex_trylock due to misordered
instructions:
140: a5 1b 00 01 oill %r1,1
144: e5 48 a0 f0 00 00 mvghi 240(%r10),0 <--- THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
14a: e3 10 a0 e0 00 24 stg %r1,224(%r10) <--- last THREAD_SETMEM of ENQUEUE_MUTEX_PI
vs (with compiler barriers):
140: a5 1b 00 01 oill %r1,1
144: e3 10 a0 e0 00 24 stg %r1,224(%r10)
14a: e5 48 a0 f0 00 00 mvghi 240(%r10),0
Please have a look at the discussion:
"Re: WARN_ON_ONCE(!new_owner) within wake_futex_pi() triggerede"
(https://lore.kernel.org/lkml/20190202112006.GB3381@osiris/)
This patch is introducing the same compiler barriers and comments
for pthread_mutex_trylock as introduced for pthread_mutex_lock and
pthread_mutex_timedlock by commit 8f9450a0b7a9e78267e8ae1ab1000ebca08e473e
"Add compiler barriers around modifications of the robust mutex list."
ChangeLog:
[BZ #24180]
* nptl/pthread_mutex_trylock.c (__pthread_mutex_trylock):
Add compiler barriers and comments.
(cherry picked from commit 823624bdc47f1f80109c9c52dee7939b9386d708)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Since the size argument is unsigned. we should use unsigned Jcc
instructions, instead of signed, to check size.
Tested on x86-64 and x32, with and without --disable-multi-arch.
[BZ #24155]
CVE-2019-7309
* NEWS: Updated for CVE-2019-7309.
* sysdeps/x86_64/memcmp.S: Use RDX_LP for size. Clear the
upper 32 bits of RDX register for x32. Use unsigned Jcc
instructions, instead of signed.
* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memcmp-2.
* sysdeps/x86_64/x32/tst-size_t-memcmp-2.c: New test.
(cherry picked from commit 3f635fb43389b54f682fc9ed2acc0b2aaf4a923d)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits. The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.
This pach fixes strnlen/wcsnlen for x32. Tested on x86-64 and x32. On
x86-64, libc.so is the same with and withou the fix.
[BZ #24097]
CVE-2019-6488
* sysdeps/x86_64/multiarch/strlen-avx2.S: Use RSI_LP for length.
Clear the upper 32 bits of RSI register.
* sysdeps/x86_64/strlen.S: Use RSI_LP for length.
* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-strnlen
and tst-size_t-wcsnlen.
* sysdeps/x86_64/x32/tst-size_t-strnlen.c: New file.
* sysdeps/x86_64/x32/tst-size_t-wcsnlen.c: Likewise.
(cherry picked from commit 5165de69c0908e28a380cbd4bb054e55ea4abc95)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits. The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.
This pach fixes strncpy for x32. Tested on x86-64 and x32. On x86-64,
libc.so is the same with and withou the fix.
[BZ #24097]
CVE-2019-6488
* sysdeps/x86_64/multiarch/strcpy-sse2-unaligned.S: Use RDX_LP
for length.
* sysdeps/x86_64/multiarch/strcpy-ssse3.S: Likewise.
* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-strncpy.
* sysdeps/x86_64/x32/tst-size_t-strncpy.c: New file.
(cherry picked from commit c7c54f65b080affb87a1513dee449c8ad6143c8b)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits. The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.
This pach fixes the strncmp family for x32. Tested on x86-64 and x32.
On x86-64, libc.so is the same with and withou the fix.
[BZ #24097]
CVE-2019-6488
* sysdeps/x86_64/multiarch/strcmp-sse42.S: Use RDX_LP for length.
* sysdeps/x86_64/strcmp.S: Likewise.
* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-strncasecmp,
tst-size_t-strncmp and tst-size_t-wcsncmp.
* sysdeps/x86_64/x32/tst-size_t-strncasecmp.c: New file.
* sysdeps/x86_64/x32/tst-size_t-strncmp.c: Likewise.
* sysdeps/x86_64/x32/tst-size_t-wcsncmp.c: Likewise.
(cherry picked from commit ee915088a0231cd421054dbd8abab7aadf331153)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits. The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.
This pach fixes memset/wmemset for x32. Tested on x86-64 and x32. On
x86-64, libc.so is the same with and withou the fix.
[BZ #24097]
CVE-2019-6488
* sysdeps/x86_64/multiarch/memset-avx512-no-vzeroupper.S: Use
RDX_LP for length. Clear the upper 32 bits of RDX register.
* sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: Likewise.
* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-wmemset.
* sysdeps/x86_64/x32/tst-size_t-memset.c: New file.
* sysdeps/x86_64/x32/tst-size_t-wmemset.c: Likewise.
(cherry picked from commit 82d0b4a4d76db554eb6757acb790fcea30b19965)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits. The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.
This pach fixes memrchr for x32. Tested on x86-64 and x32. On x86-64,
libc.so is the same with and withou the fix.
[BZ #24097]
CVE-2019-6488
* sysdeps/x86_64/memrchr.S: Use RDX_LP for length.
* sysdeps/x86_64/multiarch/memrchr-avx2.S: Likewise.
* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memrchr.
* sysdeps/x86_64/x32/tst-size_t-memrchr.c: New file.
(cherry picked from commit ecd8b842cf37ea112e59cd9085ff1f1b6e208ae0)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits. The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.
This pach fixes memcpy for x32. Tested on x86-64 and x32. On x86-64,
libc.so is the same with and withou the fix.
[BZ #24097]
CVE-2019-6488
* sysdeps/x86_64/multiarch/memcpy-ssse3-back.S: Use RDX_LP for
length. Clear the upper 32 bits of RDX register.
* sysdeps/x86_64/multiarch/memcpy-ssse3.S: Likewise.
* sysdeps/x86_64/multiarch/memmove-avx512-no-vzeroupper.S:
Likewise.
* sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:
Likewise.
* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memcpy.
tst-size_t-wmemchr.
* sysdeps/x86_64/x32/tst-size_t-memcpy.c: New file.
(cherry picked from commit 231c56760c1e2ded21ad96bbb860b1f08c556c7a)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits. The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.
This pach fixes memcmp/wmemcmp for x32. Tested on x86-64 and x32. On
x86-64, libc.so is the same with and withou the fix.
[BZ #24097]
CVE-2019-6488
* sysdeps/x86_64/multiarch/memcmp-avx2-movbe.S: Use RDX_LP for
length. Clear the upper 32 bits of RDX register.
* sysdeps/x86_64/multiarch/memcmp-sse4.S: Likewise.
* sysdeps/x86_64/multiarch/memcmp-ssse3.S: Likewise.
* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memcmp and
tst-size_t-wmemcmp.
* sysdeps/x86_64/x32/tst-size_t-memcmp.c: New file.
* sysdeps/x86_64/x32/tst-size_t-wmemcmp.c: Likewise.
(cherry picked from commit b304fc201d2f6baf52ea790df8643e99772243cd)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
On x32, the size_t parameter may be passed in the lower 32 bits of a
64-bit register with the non-zero upper 32 bits. The string/memory
functions written in assembly can only use the lower 32 bits of a
64-bit register as length or must clear the upper 32 bits before using
the full 64-bit register for length.
This pach fixes memchr/wmemchr for x32. Tested on x86-64 and x32. On
x86-64, libc.so is the same with and withou the fix.
[BZ #24097]
CVE-2019-6488
* sysdeps/x86_64/memchr.S: Use RDX_LP for length. Clear the
upper 32 bits of RDX register.
* sysdeps/x86_64/multiarch/memchr-avx2.S: Likewise.
* sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memchr and
tst-size_t-wmemchr.
* sysdeps/x86_64/x32/test-size_t.h: New file.
* sysdeps/x86_64/x32/tst-size_t-memchr.c: Likewise.
* sysdeps/x86_64/x32/tst-size_t-wmemchr.c: Likewise.
(cherry picked from commit 97700a34f36721b11a754cf37a1cc40695ece1fd)
|
|\| |
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
On POWER9, cbrtf128 fails by 1 ULP.
* sysdeps/powerpc/fpu/libm-test-ulps: Regenerate.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com>
(cherry picked from commit 428fc49eaafe0fe5352445fcf23d9f603e9083a2)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Some math functions have to be distributed in libc because they're
required by printf.
libc and libm require their own builds of these functions, e.g. libc
functions have to call __stack_chk_fail_local in order to bypass the
PLT, while libm functions have to call __stack_chk_fail.
While math/Makefile treat the generic cases, i.e. s_isinff, the
multiarch Makefile has to treat its own files, i.e. s_isinff-ppc64.
[BZ #21745]
* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile:
[$(subdir) = math] (sysdep_calls): New variable. Has the
previous contents of sysdep_routines, but re-sorted..
[$(subdir) = math] (sysdep_routines): Re-use the contents from
sysdep_calls.
[$(subdir) = math] (libm-sysdep_routines): Remove the functions
defined in sysdep_calls and replace by the respective m_* names.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-ppc64.S:
(compat_symbol): Undefine to avoid duplicated compat symbols in
libc.
(cherry picked from commit 61c45f250528dae431391823a9766053e61ccde1)
|
| |
| |
| |
| |
| |
| |
| | |
Fixes commit 9695dd0c9309712ed8e9c17a7040fe7af347f2dc ("DCIGETTEXT:
Use getcwd, asprintf to construct absolute pathname").
(cherry picked from commit 8c1aafc1f34d090a5b41dc527c33e8687f6a1287)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This commit removes the custom memcpy implementation from _int_realloc
for small chunk sizes. The ncopies variable has the wrong type, and
an integer wraparound could cause the existing code to copy too few
elements (leaving the new memory region mostly uninitialized).
Therefore, removing this code fixes bug 24027.
(cherry picked from commit b50dd3bc8cbb1efe85399b03d7e6c0310c2ead84)
|
| |
| |
| |
| | |
(cherry picked from commit d527c860f5a3f0ed687bd03f0cb464612dc23408)
|
| |
| |
| |
| |
| |
| | |
Test for the infinite loop in getnetbyname, bug #17630.
(cherry picked from commit ac8060265bcaca61568ef3a20b9a0140a270af54)
|
| |
| |
| |
| | |
(cherry picked from commit bd3b0fbae33a9a4cc5e2daf049443d5cf03d4251)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Th commit 'Disable TSX on some Haswell processors.' (2702856bf4) changed the
default flags for Haswell models. Previously, new models were handled by the
default switch path, which assumed a Core i3/i5/i7 if AVX is available. After
the patch, Haswell models (0x3f, 0x3c, 0x45, 0x46) do not set the flags
Fast_Rep_String, Fast_Unaligned_Load, Fast_Unaligned_Copy, and
Prefer_PMINUB_for_stringop (only the TSX one).
This patch fixes it by disentangle the TSX flag handling from the memory
optimization ones. The strstr case cited on patch now selects the
__strstr_sse2_unaligned as expected for the Haswell cpu.
Checked on x86_64-linux-gnu.
[BZ #23709]
* sysdeps/x86/cpu-features.c (init_cpu_features): Set TSX bits
independently of other flags.
(cherry picked from commit c3d8dc45c9df199b8334599a6cbd98c9950dba62)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
nptl/tst-attr3 fails to build with GCC mainline because of
(deliberate) aliasing between the second (attributes) and fourth
(argument to thread start routine) arguments to pthread_create.
Although both those arguments are restrict-qualified in POSIX,
pthread_create does not actually dereference its fourth argument; it's
an opaque pointer passed to the thread start routine. Thus, the
aliasing is actually valid in this case, and it's deliberate in the
test. So this patch makes the test disable -Wrestrict for the two
pthread_create calls in question. (-Wrestrict was added in GCC 7,
hence the __GNUC_PREREQ conditions, but the particular warning in
question is new in GCC 8.)
Tested compilation with build-many-glibcs.py for aarch64-linux-gnu.
* nptl/tst-attr3.c: Include <libc-diag.h>.
(do_test) [__GNUC_PREREQ (7, 0)]: Ignore -Wrestrict for two tests.
(cherry picked from commit 40c4162df6766fb1e8ede875ca8df25d8075d3a5)
|