Commit message    Author    Age    Files    Lines
...
| * librt: add test (bug 28213)Nikita Popov2021-08-172-0/+102
| | This test implements the following logic:
| | 1) Create a POSIX message queue. Register a notification with mq_notify
| |    (using NULL attributes). Then immediately unregister the notification
| |    with mq_notify. The helper thread in a vulnerable version of glibc
| |    should cause a NULL pointer dereference after these steps.
| | 2) Once again, register the same notification. Try to send a dummy
| |    message. The test is considered successful if the dummy message is
| |    successfully received by the callback function.
| |
| | Signed-off-by: Nikita Popov <npv1310@gmail.com>
| | Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
| |
| | (cherry picked from commit 4cc79c217744743077bf7a0ec5e0a4318f1e6641)
| * librt: fix NULL pointer dereference (bug 28213)Nikita Popov2021-08-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Helper thread frees copied attribute on NOTIFY_REMOVED message received from the OS kernel. Unfortunately, it fails to check whether copied attribute actually exists (data.attr != NULL). This worked earlier because free() checks passed pointer before actually attempting to release corresponding memory. But __pthread_attr_destroy assumes pointer is not NULL. So passing NULL pointer to __pthread_attr_destroy will result in segmentation fault. This scenario is possible if notification->sigev_notify_attributes == NULL (which means default thread attributes should be used). Signed-off-by: Nikita Popov <npv1310@gmail.com> Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org> (cherry picked from commit b805aebd42364fe696e417808a700fdb9800c9e8)
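The guard described in the commit above can be sketched as follows. This is a minimal illustration, not the glibc source: `struct notify_data` and `release_copied_attr` are hypothetical names, and only the NULL check before `pthread_attr_destroy` reflects the commit message.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the helper thread's per-notification data.  */
struct notify_data
{
  pthread_attr_t *attr;   /* NULL means default thread attributes.  */
};

/* Release the copied attribute, if there is one.  Returns 1 when an
   attribute was destroyed and freed, 0 when there was nothing to do.  */
static int
release_copied_attr (struct notify_data *data)
{
  assert (data != NULL);
  if (data->attr == NULL)
    return 0;               /* Nothing was copied, so nothing to destroy.  */
  pthread_attr_destroy (data->attr);
  free (data->attr);
  data->attr = NULL;
  return 1;
}
```

Calling the helper twice on the same data is harmless in this sketch, because the pointer is reset to NULL after the first release.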
| * x86_64: Remove unneeded static PIE check for undefined weak diagnosticFangrui Song2021-07-082-58/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | https://sourceware.org/bugzilla/show_bug.cgi?id=21782 dropped an ld diagnostic for R_X86_64_PC32 referencing an undefined weak symbol in -pie links. Arguably keeping the diagnostic like other ports is more correct, since statically resolving movl foo(%rip), %eax to the link-time zero address produces a corrupted output. It turns out that --enable-static-pie builds do not depend on the ld behavior. GCC generates GOT indirection for weak declarations for -fPIE/-fPIC, so what ld does with the PC-relative relocation doesn't really matter. Reviewed-by: H.J. Lu <hjl.tools@gmail.com> (cherry picked from commit 115d242456de158e698ffb0f9a5fee3118e9e825)
| * wordexp: handle overflow in positional parameter number (bug 28011)Andreas Schwab2021-07-062-1/+2
| | | | | | | | | | | | Use strtoul instead of atoi so that overflow can be detected. (cherry picked from commit 5adda61f62b77384718b4c0d8336ade8f2b4b35c)
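As an illustration of why strtoul helps here, the following sketch parses a decimal number and reports overflow instead of silently misbehaving as atoi may. The helper name `parse_param_number` and the choice to reject values above INT_MAX are assumptions for this example, not the actual wordexp code.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parse TEXT as a decimal parameter number.
   Returns 0 and stores the value in *OUT on success, or -1 if TEXT is
   not entirely a number or the value does not fit in an int.  */
static int
parse_param_number (const char *text, int *out)
{
  char *end;
  unsigned long value;

  assert (text != NULL && out != NULL);
  errno = 0;
  value = strtoul (text, &end, 10);
  if (end == text || *end != '\0')
    return -1;              /* No digits, or trailing garbage.  */
  if (errno == ERANGE || value > INT_MAX)
    return -1;              /* Overflow is detected, unlike with atoi.  */
  *out = (int) value;
  return 0;
}
```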
| * Fix use of __pthread_attr_copy in mq_notify (bug 27896)Florian Weimer2021-06-101-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | __pthread_attr_copy can fail and does not initialize the attribute structure in that case. If __pthread_attr_copy is never called and there is no allocated attribute, pthread_attr_destroy should not be called, otherwise there is a null pointer dereference in rt/tst-mqueue6. Fixes commit 42d359350510506b87101cf77202fefcbfc790cb ("Use __pthread_attr_copy in mq_notify (bug 27896)"). Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org> (cherry picked from commit 217b6dc298156bdb0d6aea9ea93e7e394a5ff091)
| * Use __pthread_attr_copy in mq_notify (bug 27896)Andreas Schwab2021-06-102-5/+14
| | | | | | | | | | | | | | Make a deep copy of the pthread attribute object to remove a potential use-after-free issue. (cherry picked from commit 42d359350510506b87101cf77202fefcbfc790cb)
* | Merge branch release/2.32/master into ibm/2.32/masterTulio Magno Quites Machado Filho2021-04-2737-541/+688
|\|
| * support: Typo and formatting fixesSiddhesh Poyarekar2021-04-142-4/+4
| | | | | | | | | | | | | | - Add a newline to the end of error messages in transfer(). - Fixed the name of support_subprocess_init(). (cherry picked from commit 95c68080a3ded882789b1629f872c3ad531efda0)
| * support: Pass environ to child processSiddhesh Poyarekar2021-04-141-1/+1
| | | | | | | | | | | | | | Pass environ to posix_spawn so that the child process can inherit environment of the test. (cherry picked from commit e958490f8c74e660bd93c128b3bea746e268f3f6)
| * Fix SXID_ERASE behavior in setuid programs (BZ #27471)Siddhesh Poyarekar2021-04-142-30/+52
| | When parse_tunables tries to erase a tunable marked as SXID_ERASE for
| | setuid programs, it ends up setting the envvar string iterator
| | incorrectly, because of which it may parse the next tunable
| | incorrectly. Given that the implementation currently allows malformed
| | and unrecognized tunables to pass through, it may even allow
| | SXID_ERASE tunables to go through.
| |
| | This change revamps the SXID_ERASE implementation so that:
| |
| | - Only valid tunables are written back to the tunestr string, because
| |   of which children of SXID programs will only inherit a clean list
| |   of identified tunables that are not SXID_ERASE.
| |
| | - Unrecognized tunables get scrubbed off from the environment and
| |   subsequently from the child environment.
| |
| | - This has the side-effect that a tunable that is not identified by
| |   the setxid binary will not be passed on to a non-setxid child even
| |   if the child could have identified that tunable. This may break
| |   applications that expect this behaviour, but expecting such tunables
| |   to cross the SXID boundary is wrong.
| |
| | Reviewed-by: Carlos O'Donell <carlos@redhat.com>
| |
| | (cherry picked from commit 2ed18c5b534d9e92fc006202a5af0df6b72e7aca)
| * Enhance setuid-tunables testSiddhesh Poyarekar2021-04-142-23/+69
| | | | | | | | | | | | | | | | | | | | | | | | Instead of passing GLIBC_TUNABLES via the environment, pass the environment variable from parent to child. This allows us to test multiple variables to ensure better coverage. The test list currently only includes the case that's already being tested. More tests will be added later. Reviewed-by: Carlos O'Donell <carlos@redhat.com> (cherry picked from commit 061fe3f8add46a89b7453e87eabb9c4695005ced)
| * tst-env-setuid: Use support_capture_subprogram_self_sgidSiddhesh Poyarekar2021-04-141-183/+14
| | | | | | | | | | | | | | Use the support_capture_subprogram_self_sgid to spawn an sgid child. Reviewed-by: Carlos O'Donell <carlos@redhat.com> (cherry picked from commit ca335281068a1ed549a75ee64f90a8310755956f)
| * support: Add capability to fork an sgid childSiddhesh Poyarekar2021-04-145-181/+168
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a new function support_capture_subprogram_self_sgid that spawns an sgid child of the running program with its own image and returns the exit code of the child process. This functionality is used by at least three tests in the testsuite at the moment, so it makes sense to consolidate. There is also a new function support_subprogram_wait which should provide simple system() like functionality that does not set up file actions. This is useful in cases where only the return code of the spawned subprocess is interesting. This patch also ports tst-secure-getenv to this new function. A subsequent patch will port other tests. This also brings an important change to tst-secure-getenv behaviour. Now instead of succeeding, the test fails as UNSUPPORTED if it is unable to spawn a setgid child, which is how it should have been in the first place. Reviewed-by: Carlos O'Donell <carlos@redhat.com> (cherry picked from commit 716a3bdc41b2b4b864dc64475015ba51e35e1273)
| * S390: Also check vector support in memmove ifunc-selector [BZ #27511]Stefan Liebler2021-03-264-6/+15
| | The arch13 memmove variant is currently selected by the ifunc selector
| | if the Miscellaneous-Instruction-Extensions Facility 3 bit is present,
| | but the function also uses vector instructions. If vector support is
| | not present, an operation exception is raised. Therefore this patch
| | also checks for vector support in the ifunc selector and in
| | ifunc-impl-list.c.
| |
| | Just to be sure, the configure check now also tests an arch13 vector
| | instruction and an arch13 Miscellaneous-Instruction-Extensions
| | Facility 3 instruction.
| |
| | (cherry picked from commit 7759be2593b689cb1eafc0f52ee7f59c639e5d2f)
| * powerpc64: Workaround sigtramp vdso return callRaoni Fassina Firmino2021-03-081-1/+12
| | A not so recent kernel change[1] changed how the trampoline
| | `__kernel_sigtramp_rt64` is used to call signal handlers.
| |
| | This was exposed by the test misc/tst-sigcontext-get_pc.
| |
| | Before kernel 5.9, the kernel set LR to the trampoline address and
| | jumped directly to the signal handler. At the end, the signal handler,
| | like any other function, would `blr` to that address. In other words,
| | the trampoline was executed just at the end of the signal handler and
| | the only thing it did was call sigreturn. Since kernel 5.9, however,
| | the kernel sets CTR to the signal handler and calls the trampoline
| | code. The trampoline then does a `bctrl` to the address in CTR, which
| | sets LR to the next instruction in the middle of the trampoline. When
| | the signal handler returns, the rest of the trampoline executes the
| | same code as before.
| |
| | Here is the full trampoline code as of kernel 5.11.0-rc5 for
| | reference:
| |
| |     V_FUNCTION_BEGIN(__kernel_sigtramp_rt64)
| |     .Lsigrt_start:
| |             bctrl   /* call the handler */
| |             addi    r1, r1, __SIGNAL_FRAMESIZE
| |             li      r0,__NR_rt_sigreturn
| |             sc
| |     .Lsigrt_end:
| |     V_FUNCTION_END(__kernel_sigtramp_rt64)
| |
| | This new behavior breaks the way `backtrace()` detects the trampoline
| | frame in order to correctly reconstruct the stack frame when it is
| | called from inside a signal handler.
| |
| | This workaround relies on the fact that the trampoline code is at the
| | very least two (maybe 3?) instructions in size (as in the 32-bit
| | version, which has only an `li` and an `sc`), so it is safe to check
| | that the return address is in the range
| | __kernel_sigtramp_rt64 .. + 4.
| |
| | [1] subject: powerpc/64/signal: Balance return predictor stack in signal trampoline
| |     commit: 0138ba5783ae0dcc799ad401a1e8ac8333790df9
| |     url: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0138ba5783ae0dcc799ad401a1e8ac8333790df9
| |
| | Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
| |
| | (cherry picked from commit 5ee506ed35a2c9184bcb1fb5e79b6cceb9bb0dd1)
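The range check described in the commit above might look roughly like this. It is a sketch under stated assumptions: `is_sigtramp_return` and its parameters are hypothetical names, and the code only assumes 4-byte PowerPC instructions, so the two accepted addresses are the trampoline start (pre-5.9 kernels) and the instruction after `bctrl` (5.9 and later).

```c
#include <stdint.h>

/* Return nonzero if RETURN_ADDRESS points at the start of the
   trampoline or at the instruction right after its first one.  */
static int
is_sigtramp_return (uintptr_t return_address, uintptr_t sigtramp_start)
{
  return return_address == sigtramp_start
         || return_address == sigtramp_start + 4;
}
```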
| * nscd: Fix double free in netgroupcache [BZ #27462]DJ Delorie2021-03-081-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | In commit 745664bd798ec8fd50438605948eea594179fba1 a use-after-free was fixed, but this led to an occasional double-free. This patch tracks the "live" allocation better. Tested manually by a third party. Related: RHBZ 1927877 Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org> Reviewed-by: Carlos O'Donell <carlos@redhat.com> (cherry picked from commit dca565886b5e8bd7966e15f0ca42ee5cff686673)
| * gconv: Fix assertion failure in ISO-2022-JP-3 module (bug 27256)Florian Weimer2021-01-273-20/+178
| | The conversion loop to the internal encoding does not follow
| | the interface contract that __GCONV_FULL_OUTPUT is only returned
| | after the internal wchar_t buffer has been filled completely. This
| | is enforced by the first of the two asserts in iconv/skeleton.c:
| |
| |     /* We must run out of output buffer space in this rerun. */
| |     assert (outbuf == outerr);
| |     assert (nstatus == __GCONV_FULL_OUTPUT);
| |
| | This commit solves this issue by queuing a second wide character
| | which cannot be written immediately in the state variable, like
| | other converters already do (e.g., BIG5-HKSCS or TSCII).
| |
| | Reported-by: Tavis Ormandy <taviso@gmail.com>
| |
| | (cherry picked from commit 7d88c6142c6efc160c0ee5e4f85cde382c072888)
| * aarch64: fix static PIE start code for BTI [BZ #27068]Guillaume Gardet2021-01-211-0/+1
| | | | | | | | | | | | | | | | | | A bti c was missing from rcrt1.o which made all -static-pie binaries fail at program startup on BTI enabled systems. Fixes bug 27068. (cherry picked from commit d4136903a29baabeec8987b53081def8b4a49826)
| * __vfscanf_internal: fix aliasing violation (bug 26690)Andreas Schwab2021-01-211-11/+11
| | As noted in <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97264>, the
| | cast in the call to the read_int function is an aliasing violation.
| | Change the type of local variable f to a pointer to unsigned, which
| | eliminates most casts while only adding three new ones.
| |
| | (cherry picked from commit c0e9ddf59e73e21afe15fca4e94cf7b4b7359bf2)
| * aarch64: Use mmap to add PROT_BTI instead of mprotect [BZ #26831]Szabolcs Nagy2021-01-213-19/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Re-mmap executable segments if possible instead of using mprotect to add PROT_BTI. This allows using BTI protection with security policies that prevent mprotect with PROT_EXEC. If the fd of the ELF module is not available because it was kernel mapped then mprotect is used and failures are ignored. To protect the main executable even when mprotect is filtered the linux kernel will have to be changed to add PROT_BTI to it. The delayed failure reporting is mainly needed because currently _dl_process_gnu_properties does not propagate failures such that the required cleanups happen. Using the link_map_machine struct for error propagation is not ideal, but this seemed to be the least intrusive solution. Fixes bug 26831. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> (cherry picked from commit cd543b5eb3642d76e365a131ce676f31fe3f1dd4)
| * elf: Pass the fd to note processingSzabolcs Nagy2021-01-216-18/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To handle GNU property notes on aarch64 some segments need to be mmaped again, so the fd of the loaded ELF module is needed. When the fd is not available (kernel loaded modules), then -1 is passed. The fd is passed to both _dl_process_pt_gnu_property and _dl_process_pt_note for consistency. Target specific note processing functions are updated accordingly. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> (cherry picked from commit c00452d7757a300931ee186d043c43b48eeb0875)
| * elf: Move note processing after l_phdr is updatedSzabolcs Nagy2021-01-211-15/+15
| | Program headers are processed in two passes: after the first pass,
| | load segments are mmapped, so in the second pass target-specific note
| | processing logic can access the notes.
| |
| | The second pass is moved later so that various link_map fields that
| | may be useful for note processing, such as l_phdr, are already set
| | up. The second pass should run before the fd is closed so that it is
| | still available.
| |
| | Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
| |
| | (cherry picked from commit 38a3836011f3fe3290a94ab136dcb5f3c5c9f4e2)
| |
| | elf: Fix dl-load.c
| |
| | Rebasing broke commit 38a3836011f3fe3290a94ab136dcb5f3c5c9f4e2,
| | which was only supposed to move code.
| |
| | (cherry picked from commit 751acde7ec335506b54e94ed6f2c998f6c0a22c6)
| * aarch64: align address for BTI protection [BZ #26988]Szabolcs Nagy2021-01-211-6/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | Handle unaligned executable load segments (the bfd linker is not expected to produce such binaries, but other linkers may). Computing the mapping bounds follows _dl_map_object_from_fd more closely now. Fixes bug 26988. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> (cherry picked from commit 8b8f616e6a594b91d0afb152384bf2a9f72b7288)
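For an unaligned load segment, the mapping bounds have to be widened to page boundaries before a protection change can be applied. The sketch below shows that rounding only; `struct range` and `page_align_segment` are hypothetical names, and the exact computation in glibc may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Page-aligned start address and length covering a segment.  */
struct range
{
  uintptr_t start;
  uintptr_t len;
};

/* Round the segment [VADDR, VADDR + MEMSZ) outwards to PAGESIZE
   boundaries.  PAGESIZE must be a power of two.  */
static struct range
page_align_segment (uintptr_t vaddr, uintptr_t memsz, uintptr_t pagesize)
{
  struct range r;
  uintptr_t end;

  assert (pagesize != 0 && (pagesize & (pagesize - 1)) == 0);
  r.start = vaddr & ~(pagesize - 1);                        /* Align down.  */
  end = (vaddr + memsz + pagesize - 1) & ~(pagesize - 1);   /* Align up.  */
  r.len = end - r.start;
  return r;
}
```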
| * aarch64: Fix missing BTI protection from dependencies [BZ #26926]Szabolcs Nagy2021-01-211-2/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | The _dl_open_check and _rtld_main_check hooks are not called on the dependencies of a loaded module, so BTI protection was missed on every module other than the main executable and directly dlopened libraries. The fix just iterates over dependencies to enable BTI. Fixes bug 26926. (cherry picked from commit 72739c79f61989a76b7dd719f34fcfb7b8eadde9)
| * x86: Check IFUNC definition in unrelocated executable [BZ #20019]H.J. Lu2021-01-136-25/+32
| | Calling an IFUNC function defined in an unrelocated executable also
| | leads to a segfault. Issue a fatal error message when calling an
| | IFUNC function defined in the unrelocated executable from a shared
| | library.
| |
| | On x86, ifuncmain6pie failed with:
| |
| | [hjl@gnu-cfl-2 build-i686-linux]$ ./elf/ifuncmain6pie --direct
| | ./elf/ifuncmain6pie: IFUNC symbol 'foo' referenced in '/export/build/gnu/tools-build/glibc-32bit/build-i686-linux/elf/ifuncmod6.so' is defined in the executable and creates an unsatisfiable circular dependency.
| | [hjl@gnu-cfl-2 build-i686-linux]$ readelf -rW elf/ifuncmod6.so | grep foo
| | 00003ff4  00000706 R_386_GLOB_DAT         0000400c   foo_ptr
| | 00003ff8  00000406 R_386_GLOB_DAT         00000000   foo
| | 0000400c  00000401 R_386_32               00000000   foo
| | [hjl@gnu-cfl-2 build-i686-linux]$
| |
| | Remove non-JUMP_SLOT relocations against foo in ifuncmod6.so, which
| | trigger the circular IFUNC dependency, and build ifuncmain6pie with
| | -Wl,-z,lazy.
| |
| | (cherry picked from commits 6ea5b57afa5cdc9ce367d2b69a2cebfb273e4617
| | and 7137d682ebfcb6db5dfc5f39724718699922f06c)
| * x86: Set header.feature_1 in TCB for always-on CET [BZ #27177]H.J. Lu2021-01-134-1/+12
| | | | | | | | | | | | | | Update dl_cet_check() to set header.feature_1 in TCB when both IBT and SHSTK are always on. (cherry picked from commit 2ef23b520597f4ea1790a669b83e608f24f4cf12)
| * Update for [BZ #27130] fixH.J. Lu2021-01-121-0/+1
| |
| * x86-64: Avoid rep movsb with short distance [BZ #27130]H.J. Lu2021-01-121-0/+21
| | When copying with "rep movsb", if the distance between source and
| | destination is N*4GB + [1..63] with N >= 0, performance may be very
| | slow. This patch updates memmove-vec-unaligned-erms.S for AVX and
| | AVX512 versions with the distance in RCX:
| |
| |     cmpl    $63, %ecx
| |     // Don't use "rep movsb" if ECX <= 63
| |     jbe     L(Don't use "rep movsb")
| |     Use "rep movsb"
| |
| | Benchtests data with bench-memcpy, bench-memcpy-large,
| | bench-memcpy-random and bench-memcpy-walk on Skylake, Ice Lake and
| | Tiger Lake show that its performance impact is within noise range, as
| | "rep movsb" is only used for data size >= 4KB.
| |
| | (cherry picked from commit 3ec5d83d2a237d39e7fd6ef7a0bc8ac4c171a4a5)
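The check on ECX in the commit above looks only at the low 32 bits of the distance, which is why distances of N*4GB + [1..63] are all caught. A C rendering of the same comparison, assuming the distance is computed as destination minus source and using the hypothetical name `avoid_rep_movsb`, could be:

```c
#include <stdint.h>

/* Return nonzero when the low 32 bits of (DST - SRC) are <= 63, the
   condition under which the commit avoids "rep movsb".  */
static int
avoid_rep_movsb (uint64_t dst, uint64_t src)
{
  uint32_t low = (uint32_t) (dst - src);   /* Mirrors using ECX, not RCX.  */
  return low <= 63;
}
```

Like the assembly comparison, this sketch also returns nonzero when the low 32 bits are exactly zero.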
* | Merge branch release/2.32/master into ibm/2.32/masterTulio Magno Quites Machado Filho2021-01-1155-459/+2168
|\|
| * Fix buffer overrun in EUC-KR conversion module (bz #24973)Andreas Schwab2021-01-064-9/+59
| | | | | | | | | | | | | | | | | | The byte 0xfe as input to the EUC-KR conversion denotes a user-defined area and is not allowed. The from_euc_kr function used to skip two bytes when told to skip over the unknown designation, potentially running over the buffer end. (cherry picked from commit ee7a3144c9922808181009b7b3e50e852fb4999b)
| * tests-mcheck: New variable to run tests with MALLOC_CHECK_=3Siddhesh Poyarekar2020-12-242-1/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This new variable allows various subsystems in glibc to run all or some of their tests with MALLOC_CHECK_=3. This patch adds infrastructure support for this variable as well as an implementation in malloc/Makefile to allow running some of the tests with MALLOC_CHECK_=3. At present some tests in malloc/ have been excluded from the mcheck tests either because they're specifically testing MALLOC_CHECK_ or they are failing in master even without the Memory Tagging patches that prompted this work. Some tests were reviewed and found to need specific error points that MALLOC_CHECK_ defeats by terminating early but a thorough review of all tests is needed to bring them into mcheck coverage. Backported from 4f969166ce4ab535fa798dcbaa5de4c4e05773ec.
| * iconv: Accept redundant shift sequences in IBM1364 [BZ #26224]Arjun Shankar2020-11-303-18/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The IBM1364, IBM1371, IBM1388, IBM1390 and IBM1399 character sets share converter logic (iconvdata/ibm1364.c) which would reject redundant shift sequences when processing input in these character sets. This led to a hang in the iconv program (CVE-2020-27618). This commit adjusts the converter to ignore redundant shift sequences and adds test cases for iconv_prog hangs that would be triggered upon their rejection. This brings the implementation in line with other converters that also ignore redundant shift sequences (e.g. IBM930 etc., fixed in commit 692de4b3960d). Reviewed-by: Carlos O'Donell <carlos@redhat.com> (cherry picked from commit 9a99c682144bdbd40792ebf822fe9264e0376fb5)
| * sh: Add sh4 fpu Implies folderAdhemerval Zanella2020-11-275-0/+5
| | The commit 605f38177db (sh: Split BE/LE abilist) did not take into
| | consideration the SH4 fpu support.
| |
| | Checked with a build for sh4-linux-gnu and manually checked that the
| | implementations at sysdeps/sh/sh4/fpu/ are selected.
| |
| | John Paul Adrian Glaubitz also confirmed it fixes the build issues he
| | encountered.
| |
| | (cherry-picked from 9ff2674ef82eccd5ae5dfa6bb733c0e3613764c6)
| * struct _Unwind_Exception alignment should not depend on compiler flagsFlorian Weimer2020-11-161-9/+15
| | __attribute__((__aligned__)) selects an alignment that depends on
| | the micro-architecture selected by GCC flags. Enabling vector
| | extensions may increase the alignment. This is a problem when
| | building glibc as a collection of ELF multilibs with different
| | GCC flags because ld.so and libc.so/libpthread.so/&c may end up
| | with a different layout of struct pthread because of the
| | changing offset of its struct _Unwind_Exception field.
| |
| | Tested-By: Matheus Castanho <msc@linux.ibm.com>
| |
| | (cherry picked from commit 30af7c7fa13e17d82c3f1f91536384715844f432)
| * resolv: Serialize processing in resolv/tst-resolv-txnid-collisionFlorian Weimer2020-11-101-0/+5
| | When switching name servers, response processing by two server
| | threads clobbers the global test state.
| |
| | (There is still some risk that this test is negatively impacted by
| | packet drops and packet reordering, but this applies to many of the
| | resolver tests and is difficult to avoid.)
| |
| | Fixes commit f1f00c072138af90ae6da180f260111f09afe7a3 ("resolv: Handle
| | transaction ID collisions in parallel queries (bug 26600)").
| |
| | (cherry picked from commit b8b53b338f6da91e86d115a39da860cefac736ad)
| * resolv: Handle transaction ID collisions in parallel queries (bug 26600)Florian Weimer2020-11-104-20/+357
| | | | | | | | | | | | | | | | If the transaction IDs are equal, the old check attributed both responses to the first query, not recognizing the second response. This fixes bug 26600. (cherry picked from commit f1f00c072138af90ae6da180f260111f09afe7a3)
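To illustrate the problem, the sketch below matches a response against two parallel queries using more than the transaction ID. It is not the resolver code: the structures, the function `match_response`, and the use of the question type as the distinguishing field are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced view of a DNS query and a DNS response.  */
struct query
{
  unsigned short id;      /* Transaction ID.  */
  unsigned short qtype;   /* Question type, e.g. 1 for A, 28 for AAAA.  */
};

struct response
{
  unsigned short id;
  unsigned short qtype;
};

/* Return the index (0 or 1) of the query that RESP answers, or -1 if
   it answers neither.  Comparing the ID alone would attribute every
   response to query 0 when both queries share an ID.  */
static int
match_response (const struct query queries[2], const struct response *resp)
{
  assert (queries != NULL && resp != NULL);
  for (int i = 0; i < 2; ++i)
    if (queries[i].id == resp->id && queries[i].qtype == resp->qtype)
      return i;
  return -1;
}
```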
| * support: Provide a way to clear the RA bit in DNS server responsesFlorian Weimer2020-11-102-1/+7
| | | | | | | | (cherry picked from commit 08443b19965f48862b02c2fd7b33a39d66daf2ff)
| * support: Provide a way to reorder responses within the DNS test serverFlorian Weimer2020-11-105-28/+135
| | | | | | | | (cherry picked from commit 873e239a4c3d8ec235c27439c1bdc5bbf8aa1818)
| * Remove __warndeclSiddhesh Poyarekar2020-11-103-74/+1
| | | | | | | | | | | | | | | | The macro is not used anymore, so remove it and warning-nop.c. Reviewed-by: Florian Weimer <fweimer@redhat.com> (cherry-picked from 34aec973e15a81926198f4b71ff99081dff87a92)
| * Remove __warn_memset_zero_len [BZ #25399]Siddhesh Poyarekar2020-11-101-15/+0
| | Non-gcc compilers (clang and possibly other compilers that do not
| | masquerade as gcc 5.0 or later) are unable to use
| | __warn_memset_zero_len since the symbol is no longer available on
| | glibc built with gcc 5.0 or later. While it was likely an oversight
| | that caused this omission, it wasn't noticed until recently (when
| | clang closed the gap on _FORTIFY_SOURCE support) that the symbol was
| | missing.
| |
| | Given that both gcc and clang are capable of doing this check in the
| | compiler, drop all remaining signs of __warn_memset_zero_len from
| | glibc so that no more objects are built with this symbol in future.
| |
| | (cherry-picked from dc274b141666766b8ef70992d887e3c0c5e41bed)
| * aarch64: Add unwind information to _start (bug 26853)Florian Weimer2020-11-102-4/+4
| | | | | | | | | | | | | | | | | | This adds CFI directives which communicate that the stack ends with this function. Fixes bug 26853. (cherry picked from commit 5edf3d9fd6efe06fda37b2a460e60690a90457a4)
| * aarch64: Fix DT_AARCH64_VARIANT_PCS handling [BZ #26798]Szabolcs Nagy2020-11-041-8/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The variant PCS support was ineffective because in the common case linkmap->l_mach.plt == 0 but then the symbol table flags were ignored and normal lazy binding was used instead of resolving the relocs early. (This was a misunderstanding about how GOT[1] is setup by the linker.) In practice this mainly affects SVE calls when the vector length is more than 128 bits, then the top bits of the argument registers get clobbered during lazy binding. Fixes bug 26798. (cherry picked from commit 558251bd8785760ad40fcbfeaaee5d27fa5b0fe4)
| * x86: Optimizing memcpy for AMD Zen architecture.Sajan Karumanchi2020-10-301-3/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Modifying the shareable cache '__x86_shared_cache_size', which is a factor in computing the non-temporal threshold parameter '__x86_shared_non_temporal_threshold' to optimize memcpy for AMD Zen architectures. In the existing implementation, the shareable cache is computed as 'L3 per thread, L2 per core'. Recomputing this shareable cache as 'L3 per CCX(Core-Complex)' has brought in performance gains. As per the large bench variant results, this patch also addresses the regression problem on AMD Zen architectures. Backport of commit 59803e81f96b479c17f583b31eac44b57591a1bf upstream, with the fix from cb3a749a22a55645dc6a52659eea765300623f98 ("x86: Restore processing of cache size tunables in init_cacheinfo") applied. Reviewed-by: Premachandra Mallappa <premachandra.mallappa@amd.com> Co-Authored-by: Florian Weimer <fweimer@redhat.com>
| * Reversing calculation of __x86_shared_non_temporal_thresholdPatrick McGehearty2020-10-282-6/+16
| | The __x86_shared_non_temporal_threshold determines when memcpy on x86
| | uses non_temporal stores to avoid pushing other data out of the last
| | level cache.
| |
| | This patch proposes to revert the calculation change made by H.J. Lu's
| | patch of June 2, 2017.
| |
| | H.J. Lu's patch selected a threshold suitable for a single thread
| | getting maximum performance. It was tuned using the single threaded
| | large memcpy micro benchmark on an 8 core processor. The last change
| | changes the threshold from using 3/4 of one thread's share of the
| | cache to using 3/4 of the entire cache of a multi-threaded system
| | before switching to non-temporal stores. Multi-threaded systems with
| | more than a few threads are server-class and typically have many
| | active threads. If one thread consumes 3/4 of the available cache for
| | all threads, it will cause other active threads to have data removed
| | from the cache. Two examples show the range of the effect. John
| | McCalpin's widely parallel Stream benchmark, which runs in parallel
| | and fetches data sequentially, saw a 20% slowdown with this patch on
| | an internal system test of 128 threads. This regression was discovered
| | when comparing OL8 performance to OL7. An example that compares
| | normal stores to non-temporal stores may be found at
| | https://vgatherps.github.io/2018-09-02-nontemporal/. A simple test
| | shows performance loss of 400 to 500% due to a failure to use
| | nontemporal stores. These performance losses are most likely to occur
| | when the system load is heaviest and good performance is critical.
| |
| | The tunable x86_non_temporal_threshold can be used to override the
| | default for the knowledgeable user who really wants maximum cache
| | allocation to a single thread in a multi-threaded system.
| | The manual entry for the tunable has been expanded to provide
| | more information about its purpose.
| |
| | modified: sysdeps/x86/cacheinfo.c
| | modified: manual/tunables.texi
| |
| | (cherry picked from commit d3c57027470b78dba79c6d931e4e409b1fecfc80)
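The two calculations contrasted in the commit above can be compared with a little arithmetic. This is only a sketch: the function names and the explicit `threads` parameter are assumptions for illustration, and the real code in sysdeps/x86/cacheinfo.c derives its inputs differently.

```c
#include <assert.h>

/* 3/4 of one thread's share of the shared cache, the calculation the
   commit returns to.  */
static unsigned long long
threshold_per_thread (unsigned long long shared_cache, unsigned int threads)
{
  assert (threads > 0);
  return shared_cache / threads * 3 / 4;
}

/* 3/4 of the entire shared cache, the calculation being reverted.  */
static unsigned long long
threshold_whole_cache (unsigned long long shared_cache)
{
  return shared_cache * 3 / 4;
}
```

For a 32 MiB shared cache and 16 threads, the per-thread variant yields 1.5 MiB while the whole-cache variant yields 24 MiB, so a single large memcpy would have to be sixteen times larger before it switched to non-temporal stores.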
| * sysvipc: Fix IPC_INFO and SHM_INFO handling [BZ #26636]Adhemerval Zanella2020-10-154-7/+206
| | Both commands are Linux extensions where the third argument is either
| | a 'struct shminfo' (IPC_INFO) or a 'struct shm_info' (SHM_INFO) instead
| | of 'struct shmid_ds'. Their information does not contain any
| | time-related fields, so there is no need for extra conversion for
| | __IPC_TIME64.
| |
| | The regression testcase checks for Linux-specific SysV ipc message
| | control extension. For SHM_INFO it tries to match the values against
| | the tunable /proc values and for MSG_STAT/MSG_STAT_ANY it checks
| | whether the created shared memory is within the global list returned
| | by the kernel.
| |
| | Checked on x86_64-linux-gnu and on i686-linux-gnu (Linux v5.4 and on
| | Linux v4.15).
| |
| | (cherry picked from commit a49d7fd4f764e97ccaf922e433046590ae52fce9)
| * sysvipc: Fix IPC_INFO and MSG_INFO handling [BZ #26639]Adhemerval Zanella2020-10-154-5/+197
| | Both commands are Linux extensions where the third argument is a
| | 'struct msginfo' instead of 'struct msqid_ds', and its information
| | does not contain any time-related fields (so there is no need for
| | extra conversion for __IPC_TIME64).
| |
| | The regression testcase checks for Linux-specific SysV ipc message
| | control extension. For IPC_INFO/MSG_INFO it tries to match the values
| | against the tunable /proc values and for MSG_STAT/MSG_STAT_ANY it
| | checks whether the created message queue is within the global list
| | returned by the kernel.
| |
| | Checked on x86_64-linux-gnu and on i686-linux-gnu (Linux v5.4 and on
| | Linux v4.15).
| |
| | (cherry picked from commit 20a00dbefca5695cccaa44846a482db8ccdd85ab)
| * sysvipc: Fix SEM_STAT_ANY kernel argument pass [BZ #26637]Dmitry V. Levin2020-10-155-1/+194
| | Handle SEM_STAT_ANY the same way as SEM_STAT so that the buffer
| | argument of SEM_STAT_ANY is properly passed to the kernel and back.
| |
| | The regression testcase checks for Linux-specific SysV ipc message
| | control extension. For IPC_INFO/SEM_INFO it tries to match the values
| | against the tunable /proc values and for SEM_STAT/SEM_STAT_ANY it
| | checks whether the created message queue is within the global list
| | returned by the kernel.
| |
| | Checked on x86_64-linux-gnu and on i686-linux-gnu (Linux v5.4 and on
| | Linux v4.15).
| |
| | Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
| |
| | (cherry picked from commit 574500a108be1d2a6a0dc97a075c9e0a98371aba)
| * AArch64: Use __memcpy_simd on Neoverse N2/V1Wilco Dijkstra2020-10-143-2/+8
| | | | | | | | | | | | | | | | Add CPU detection of Neoverse N2 and Neoverse V1, and select __memcpy_simd as the memcpy/memmove ifunc. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> (cherry picked from commit e11ed9d2b4558eeacff81557dc9557001af42a6b)
| * AArch64: Improve backwards memmove performanceWilco Dijkstra2020-10-121-3/+4
| | | | | | | | | | | | | | | | | | On some microarchitectures performance of the backwards memmove improves if the stores use STR with decreasing addresses. So change the memmove loop in memcpy_advsimd.S to use 2x STR rather than STP. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> (cherry picked from commit bd394d131c10c9ec22c6424197b79410042eed99)
| * Set version.h RELEASE to "stable" (Bug 26700)Carlos O'Donell2020-10-021-1/+1
| | The RELEASE macro was accidentally set to "release" instead of the
| | expected "stable" by the release manager. This is a mistake that
| | leads to the build using "-g -O1" instead of "-g -O2" if configure
| | was executed with "CFLAGS=" (CFLAGS set but empty).