Commit message | Author | Age | Files | Lines
* support: Add xsetlocale function | Arjun Shankar | 2022-08-30 | 3 | -0/+32
    (cherry picked from commit cce35a50c1de0cec5cd1f6c18979ff6ee3ea1dd1)
* support: Implement TEST_COMPARE_STRING | Florian Weimer | 2022-08-30 | 5 | -0/+224
    (cherry picked from commit 1df872fd74f730bcae3df201a229195445d2e18a)
* Fix memmove call in vfprintf-internal.c:group_number | Joseph Myers | 2022-08-30 | 1 | -1/+2
    A recent GCC mainline change introduces errors of the form:

    vfprintf-internal.c: In function 'group_number':
    vfprintf-internal.c:2093:15: error: 'memmove' specified bound between 9223372036854775808 and 18446744073709551615 exceeds maximum object size 9223372036854775807 [-Werror=stringop-overflow=]
     2093 |     memmove (w, s, (front_ptr -s) * sizeof (CHAR_T));
          |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    This is a genuine bug in the glibc code: s > front_ptr is always true at this point in the code, and the intent is clearly for the subtraction to be the other way round. The other arguments to the memmove call here also appear to be wrong; w and s point just *after* the destination and source for copying the rest of the number, so the size needs to be subtracted to get appropriate pointers for the copying. Adjust the memmove call to conform to the apparent intent of the code, so fixing the -Wstringop-overflow error.

    Now, if the original code were ever executed, a buffer overrun would result. However, I believe this code (introduced in commit edc1686af0c0fc2eb535f1d38cdf63c1a5a03675, "vfprintf: Reuse work_buffer in group_number", so in glibc 2.26) is unreachable in prior glibc releases (so there is no need for a bug in Bugzilla, no need to consider any backports unless someone wants to build older glibc releases with GCC 12 and no possibility of this buffer overrun resulting in a security issue).

    work_buffer is 1000 bytes / 250 wide characters. This case is only reachable if an initial part of the number, plus a grouped copy of the rest of the number, fail to fit in that space; that is, if the grouped number fails to fit in the space. In the wide character case, grouping is always one wide character, so even with a locale (of which there aren't any in glibc) grouping every digit, a number would need to occupy at least 125 wide characters to overflow, and a 64-bit integer occupies at most 23 characters in octal including a leading 0. In the narrow character case, the multibyte encoding of the grouping separator would need to be at least 42 bytes to overflow, again supposing grouping every digit, but MB_LEN_MAX is 16. So even if we admit the case of artificially constructed locales not shipped with glibc, given that such a locale would need to use one of the character sets supported by glibc, this code cannot be reached at present. (And POSIX only actually specifies the ' flag for grouping for decimal output, though glibc acts on it for other bases as well.)

    With binary output (if you consider use of grouping there to be valid), you'd need a 15-byte multibyte character for overflow; I don't know if any supported character set has such a character (if, again, we admit constructed locales using grouping every digit and a grouping separator chosen to have a multibyte encoding as long as possible, as well as accepting use of grouping with binary), but given that we have this code at all (clearly it's not *correct*, or in accordance with the principle of avoiding arbitrary limits, to skip grouping on running out of internal space like that), I don't think it should need any further changes for binary printf support to go in. On the other hand, support for large sizes of _BitInt in printf (see the N2858 proposal) *would* require something to be done about such arbitrary limits (presumably using dynamic allocation in printf again, for sufficiently large _BitInt arguments only - currently only floating-point uses dynamic allocation, and, as previously discussed, that could actually be replaced by bounded allocation given smarter code).

    Tested with build-many-glibcs.py for aarch64-linux-gnu (GCC mainline). Also tested natively for x86_64.

    (cherry picked from commit db6c4935fae6005d46af413b32aa92f4f6059dce)
* Remove most vfprintf width/precision-dependent allocations (bug 14231, bug 26211). | Joseph Myers | 2022-08-30 | 5 | -120/+47
    The vfprintf implementation (used for all printf-family functions) contains complicated logic to allocate internal buffers of a size depending on the width and precision used for a format, using either malloc or alloca depending on that size, and with consequent checks for size overflow and allocation failure.

    As noted in bug 26211, the version of that logic used when '$' plus argument number formats are in use is missing the overflow checks, which can result in segfaults (quite possibly exploitable, I didn't try to work that out) when the width or precision is in the range 0x7fffffe0 through 0x7fffffff (maybe smaller values as well in the wprintf case on 32-bit systems, when the multiplication by sizeof (CHAR_T) can overflow).

    All that complicated logic in fact appears to be useless. As far as I can tell, there has been no need (outside the floating-point printf code, which does its own allocations) for allocations depending on width or precision since commit 3e95f6602b226e0de06aaff686dc47b282d7cc16 ("Remove limitation on size of precision for integers", Sun Sep 12 21:23:32 1999 +0000). Thus, this patch removes that logic completely, thereby fixing both problems with excessive allocations for large width and precision for non-floating-point formats, and the problem with missing overflow checks with such allocations. Note that this does have the consequence that width and precision up to INT_MAX are now allowed where previously INT_MAX / sizeof (CHAR_T) - EXTSIZ or more would have been rejected, so could potentially expose any other overflows where the value would previously have been rejected by those removed checks.

    I believe this completely fixes bugs 14231 and 26211.

    Excessive allocations are still possible in the floating-point case (bug 21127), as are other integer or buffer overflows (see bug 26201). This does not address the cases where a precision larger than INT_MAX (embedded in the format string) would be meaningful without printf's return value overflowing (when it's used with a string format, or %g without the '#' flag, so the actual output will be much smaller), as mentioned in bug 17829 comment 8; using size_t internally for precision to handle that case would be complicated by struct printf_info being a public ABI. Nor does it address the matter of an INT_MIN width being negated (bug 17829 comment 7; the same logic appears a second time in the file as well, in the form of multiplying by -1). There may be other sources of memory allocations with malloc in printf functions as well (bug 24988, bug 16060). From inspection, I think there are also integer overflows in two copies of "if ((width -= len) < 0)" logic (where width is int, len is size_t and a very long string could result in spurious padding being output on a 32-bit system before printf overflows the count of output characters).

    Tested for x86-64 and x86.

    (cherry picked from commit 6caddd34bd7ffb5ac4f36c8e036eee100c2cc535)
* stdio: Add tests for printf multibyte convertion leak [BZ#25691] | Adhemerval Zanella | 2022-08-30 | 2 | -2/+115
    Checked on x86_64-linux-gnu and i686-linux-gnu.

    (cherry picked from commit 910a835dc96c1f518ac2a6179fc622ba81ffb159)
* stdio: Remove memory leak from multibyte convertion [BZ#25691] | Florian Weimer | 2022-08-30 | 2 | -144/+183
    This is an updated version of a previous patch [1] with the following changes:

    - Use compiler overflow builtins on done_add_func function.
    - Define the scratch +utstring_converted_wide_string using CHAR_T.
    - Added a testcase and mention the bug report.

    Both the default and wide printf functions might leak memory when handling multibyte character conversion, depending on the size of the input (whether or not __libc_use_alloca triggers the fallback heap allocation).

    This patch fixes it by removing the extra memory allocation on string formatting with conversion parts.

    The testcase uses input argument sizes that trigger memory leaks on unpatched code (using a scratch buffer, the threshold to use heap allocation is lower).

    Checked on x86_64-linux-gnu and i686-linux-gnu.

    Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

    [1] https://sourceware.org/pipermail/libc-alpha/2017-June/082098.html

    (cherry picked from commit 3cc4a8367c23582b7db14cf4e150e4068b7fd461)
* NEWS: Add a bug fix entry for BZ #28896 | H.J. Lu | 2022-02-18 | 1 | -0/+2
* x86: Fix TEST_NAME to make it a string in tst-strncmp-rtm.c | Noah Goldstein | 2022-02-18 | 1 | -2/+2
    Previously TEST_NAME was passing a function pointer. This didn't fail because of the -Wno-error flag (to allow for overflow sizes passed to strncmp/wcsncmp).

    Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

    (cherry picked from commit b98d0bbf747f39770e0caba7e984ce9f8f900330)
* x86: Test wcscmp RTM in the wcsncmp overflow case [BZ #28896] | Noah Goldstein | 2022-02-18 | 3 | -10/+48
    In the overflow fallback strncmp-avx2-rtm and wcsncmp-avx2-rtm would call strcmp-avx2 and wcscmp-avx2 respectively. These have no checks around vzeroupper and would trigger spurious aborts. This commit fixes that.

    test-strcmp, test-strncmp, test-wcscmp, and test-wcsncmp all pass on AVX2 machines with and without RTM.

    Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

    (cherry picked from commit 7835d611af0854e69a0c71e3806f8fe379282d6f)
* x86: Fallback {str|wcs}cmp RTM in the ncmp overflow case [BZ #28896] | Noah Goldstein | 2022-02-18 | 7 | -5/+22
    In the overflow fallback strncmp-avx2-rtm and wcsncmp-avx2-rtm would call strcmp-avx2 and wcscmp-avx2 respectively. These have no checks around vzeroupper and would trigger spurious aborts. This commit fixes that.

    test-strcmp, test-strncmp, test-wcscmp, and test-wcsncmp all pass on AVX2 machines with and without RTM.

    Co-authored-by: H.J. Lu <hjl.tools@gmail.com>

    (cherry picked from commit c6272098323153db373f2986c67786ea8c85f1cf)
* string: Add a testcase for wcsncmp with SIZE_MAX [BZ #28755] | H.J. Lu | 2022-02-17 | 1 | -0/+13
    Verify that wcsncmp (L("abc"), L("abd"), SIZE_MAX) == 0. The new test fails without

    commit ddf0992cf57a93200e0c782e2a94d0733a5a0b87
    Author: Noah Goldstein <goldstein.w.n@gmail.com>
    Date:   Sun Jan 9 16:02:21 2022 -0600

        x86: Fix __wcsncmp_avx2 in strcmp-avx2.S [BZ# 28755]

    and

    commit 7e08db3359c86c94918feb33a1182cd0ff3bb10b
    Author: Noah Goldstein <goldstein.w.n@gmail.com>
    Date:   Sun Jan 9 16:02:28 2022 -0600

        x86: Fix __wcsncmp_evex in strcmp-evex.S [BZ# 28755]

    This is for BZ #28755.

    Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>

    (cherry picked from commit aa5a720056d37cf24924c138a3dbe6dace98e97c)
* x86-64: Test strlen and wcslen with 0 in the RSI register [BZ #28064] | H.J. Lu | 2022-02-01 | 3 | -0/+108
    commit 6f573a27b6c8b4236445810a44660612323f5a73
    Author: Noah Goldstein <goldstein.w.n@gmail.com>
    Date:   Wed Jun 23 01:19:34 2021 -0400

        x86-64: Add wcslen optimize for sse4.1

    added wcsnlen-sse4.1 to the wcslen ifunc implementation list. Since the random value in the RSI register is larger than the wide-character string length in the existing wcslen test, it didn't trigger the wcslen test failure. Add a test to force 0 into the RSI register before calling wcslen.

    (cherry picked from commit a6e7c3745d73ff876b4ba6991fb00768a938aef5)
* x86: Remove wcsnlen-sse4_1 from wcslen ifunc-impl-list [BZ #28064] | Noah Goldstein | 2022-02-01 | 1 | -2/+2
    The following commit

    commit 6f573a27b6c8b4236445810a44660612323f5a73
    Author: Noah Goldstein <goldstein.w.n@gmail.com>
    Date:   Wed Jun 23 01:19:34 2021 -0400

        x86-64: Add wcslen optimize for sse4.1

    added wcsnlen-sse4.1 to the wcslen ifunc implementation list and did not add wcslen-sse4.1 to the wcslen ifunc implementation list. This commit fixes that by removing wcsnlen-sse4.1 from the wcslen ifunc implementation list and adding wcslen-sse4.1 to the ifunc implementation list.

    Testing: test-wcslen.c, test-rsi-wcslen.c, and test-rsi-strlen.c are passing as well as all other tests in wcsmbs and string.

    Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
    Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

    (cherry picked from commit 0679442defedf7e52a94264975880ab8674736b2)
* x86: Black list more Intel CPUs for TSX [BZ #27398] | H.J. Lu | 2022-02-01 | 1 | -3/+32
    Disable TSX and enable RTM_ALWAYS_ABORT for Intel CPUs listed in:

    https://www.intel.com/content/www/us/en/support/articles/000059422/processors.html

    This fixes BZ #27398.

    Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>

    (cherry picked from commit 1e000d3d33211d5a954300e2a69b90f93f18a1a1)
* x86: Check RTM_ALWAYS_ABORT for RTM [BZ #28033] | H.J. Lu | 2022-02-01 | 2 | -0/+6
    From

    https://www.intel.com/content/www/us/en/support/articles/000059422/processors.html

    * Intel TSX will be disabled by default.
    * The processor will force abort all Restricted Transactional Memory (RTM) transactions by default.
    * A new CPUID bit CPUID.07H.0H.EDX[11](RTM_ALWAYS_ABORT) will be enumerated, which is set to indicate to updated software that the loaded microcode is forcing RTM abort.
    * On processors that enumerate support for RTM, the CPUID enumeration bits for Intel TSX (CPUID.07H.0H.EBX[11] and CPUID.07H.0H.EBX[4]) continue to be set by default after microcode update.
    * Workloads that were benefited from Intel TSX might experience a change in performance.
    * System software may use a new bit in Model-Specific Register (MSR) 0x10F TSX_FORCE_ABORT[TSX_CPUID_CLEAR] functionality to clear the Hardware Lock Elision (HLE) and RTM bits to indicate to software that Intel TSX is disabled.

    1. Add RTM_ALWAYS_ABORT to CPUID features.
    2. Set RTM usable only if RTM_ALWAYS_ABORT isn't set. This skips the string/tst-memchr-rtm etc. testcases on the affected processors, which always fail after a microcode update.
    3. Check RTM feature, instead of usability, against /proc/cpuinfo.

    This fixes BZ #28033.

    (cherry picked from commit ea8e465a6b8d0f26c72bcbe453a854de3abf68ec)
* NEWS: Add a bug fix entry for BZ #27974 | H.J. Lu | 2022-01-27 | 1 | -0/+1
* String: Add overflow tests for strnlen, memchr, and strncat [BZ #27974] | Noah Goldstein | 2022-01-27 | 3 | -3/+130
    This commit adds tests for a bug in the wide char variant of the functions where the implementation may assume that maxlen for wcsnlen or n for wmemchr/strncat will not overflow when multiplied by sizeof(wchar_t).

    These tests show the following implementations failing on x86_64:

    wcsnlen-sse4_1
    wcsnlen-avx2
    wmemchr-sse2
    wmemchr-avx2

    strncat would fail as well if it were on a system that preferred either of the wcsnlen implementations that failed, as it relies on wcsnlen.

    Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
    Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

    (cherry picked from commit da5a6fba0febbfc90896ce1b2eb75c6d8a88a72d)
* x86: Optimize strlen-evex.S | Noah Goldstein | 2022-01-27 | 1 | -264/+317
    No bug. This commit optimizes strlen-evex.S. The optimizations are mostly small things but they add up to roughly 10-30% performance improvement for strlen. The results for strnlen are a bit more ambiguous. test-strlen, test-strnlen, test-wcslen, and test-wcsnlen are all passing.

    Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>

    (cherry picked from commit 4ba65586847751372520a36757c17f114588794e)
* x86: Fix overflow bug in wcsnlen-sse4_1 and wcsnlen-avx2 [BZ #27974] | Noah Goldstein | 2022-01-27 | 2 | -38/+107
    This commit fixes the bug mentioned in the previous commit.

    The previous implementations of wcsnlen in these files relied on maxlen * sizeof(wchar_t) not overflowing, which is not guaranteed by the standard.

    The new overflow tests added in the previous commit now pass (as well as all the other tests).

    Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
    Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

    (cherry picked from commit a775a7a3eb1e85b54af0b4ee5ff4dcf66772a1fb)
* x86-64: Add wcslen optimize for sse4.1 | Noah Goldstein | 2022-01-27 | 6 | -36/+63
    No bug. This commit adds the ifunc / build infrastructure necessary for wcslen to prefer the sse4.1 implementation in strlen-vec.S. test-wcslen.c is passing.

    Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
    Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

    (cherry picked from commit 6f573a27b6c8b4236445810a44660612323f5a73)
* x86-64: Move strlen.S to multiarch/strlen-vec.S | H.J. Lu | 2022-01-27 | 4 | -242/+262
    Since strlen.S contains SSE2 version of strlen/strnlen and SSE4.1 version of wcslen/wcsnlen, move strlen.S to multiarch/strlen-vec.S and include multiarch/strlen-vec.S from SSE2 and SSE4.1 variants. This also removes the unused symbols, __GI___strlen_sse2 and __GI___wcsnlen_sse4_1.

    (cherry picked from commit a0db678071c60b6c47c468d231dd0b3694ba7a98)
* x86-64: Fix an unknown vector operation in memchr-evex.S | Alice Xu | 2022-01-27 | 1 | -1/+1
    An unknown vector operation occurred in commit 2a76821c308. Fixed it by using "ymm{k1}{z}" instead of "ymm {k1} {z}".

    Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

    (cherry picked from commit 6ea916adfa0ab9af6e7dc6adcf6f977dfe017835)
* x86: Optimize memchr-evex.S | Noah Goldstein | 2022-01-27 | 1 | -225/+322
    No bug. This commit optimizes memchr-evex.S. The optimizations include replacing some branches with cmovcc, avoiding some branches entirely in the less_4x_vec case, making the page cross logic less strict, saving some ALU in the alignment process, and most importantly increasing ILP in the 4x loop. test-memchr, test-rawmemchr, and test-wmemchr are all passing.

    Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
    Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

    (cherry picked from commit 2a76821c3081d2c0231ecd2618f52662cb48fccd)
* x86: Optimize strlen-avx2.S | Noah Goldstein | 2022-01-27 | 2 | -214/+334
    No bug. This commit optimizes strlen-avx2.S. The optimizations are mostly small things but they add up to roughly 10-30% performance improvement for strlen. The results for strnlen are a bit more ambiguous. test-strlen, test-strnlen, test-wcslen, and test-wcsnlen are all passing.

    Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>

    (cherry picked from commit aaa23c35071537e2dcf5807e956802ed215210aa)
* x86: Fix overflow bug with wmemchr-sse2 and wmemchr-avx2 [BZ #27974] | Noah Goldstein | 2022-01-27 | 2 | -37/+98
    This commit fixes the bug mentioned in the previous commit.

    The previous implementations of wmemchr in these files relied on n * sizeof(wchar_t) not overflowing, which is not guaranteed by the standard.

    The new overflow tests added in the previous commit now pass (as well as all the other tests).

    Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
    Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

    (cherry picked from commit 645a158978f9520e74074e8c14047503be4db0f0)
* x86: Optimize memchr-avx2.S | Noah Goldstein | 2022-01-27 | 1 | -178/+247
    No bug. This commit optimizes memchr-avx2.S. The optimizations include replacing some branches with cmovcc, avoiding some branches entirely in the less_4x_vec case, making the page cross logic less strict, and saving a few instructions in the loop return path. test-memchr, test-rawmemchr, and test-wmemchr are all passing.

    Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
    Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

    (cherry picked from commit acfd088a1963ba51cd83c78f95c0ab25ead79e04)
* test-strnlen.c: Check that strnlen won't go beyond the maximum length | H.J. Lu | 2022-01-27 | 1 | -0/+30
    Place strings ending at page boundary without the null byte. If an implementation goes beyond EXP_LEN, it will trigger the segfault.

    (cherry picked from commit cb882b21b63606aabd6e55afe23b42434d95f2ef)
* test-strnlen.c: Initialize wchar_t string with wmemset [BZ #27655] | H.J. Lu | 2022-01-27 | 1 | -1/+3
    Use wmemset to initialize wchar_t string.

    (cherry picked from commit 86859b7e58d8670b186c5209ba25f0fbd6612fb7)
* x86-64: Require BMI2 for __strlen_evex and __strnlen_evex | H.J. Lu | 2022-01-27 | 1 | -2/+4
    Since __strlen_evex and __strnlen_evex added by

    commit 1fd8c163a83d96ace1ff78fa6bac7aee084f6f77
    Author: H.J. Lu <hjl.tools@gmail.com>
    Date:   Fri Mar 5 06:24:52 2021 -0800

        x86-64: Add ifunc-avx2.h functions with 256-bit EVEX

    use sarx:

        c4 e2 6a f7 c0       sarx   %edx,%eax,%eax

    require BMI2 for __strlen_evex and __strnlen_evex in ifunc-impl-list.c. ifunc-avx2.h already requires BMI2 for EVEX implementation.

    (cherry picked from commit 55bf411b451c13f0fb7ff3d3bf9a820020b45df1)
* NEWS: Add a bug fix entry for BZ #27457 | H.J. Lu | 2022-01-27 | 1 | -0/+1
* x86-64: Fix ifdef indentation in strlen-evex.S | Sunil K Pandey | 2022-01-27 | 1 | -8/+8
    Fix some indentations of ifdef in file strlen-evex.S which are off by 1 and confusing to read.

    (cherry picked from commit 595c22ecd8e87a27fd19270ed30fdbae9ad25426)
* x86-64: Use ZMM16-ZMM31 in AVX512 memmove family functions | H.J. Lu | 2022-01-27 | 3 | -19/+35
    Update ifunc-memmove.h to select the function optimized with AVX512 instructions using ZMM16-ZMM31 registers to avoid RTM abort with usable AVX512VL since VZEROUPPER isn't needed at function exit.

    (cherry picked from commit e4fda4631017e49d4ee5a2755db34289b6860fa4)
* x86-64: Use ZMM16-ZMM31 in AVX512 memset family functions | H.J. Lu | 2022-01-27 | 4 | -24/+31
    Update ifunc-memset.h/ifunc-wmemset.h to select the function optimized with AVX512 instructions using ZMM16-ZMM31 registers to avoid RTM abort with usable AVX512VL and AVX512BW since VZEROUPPER isn't needed at function exit.

    (cherry picked from commit 4e2d8f352774b56078c34648b14a2412c38384f4)
* x86: Add string/memory function tests in RTM region | H.J. Lu | 2022-01-27 | 12 | -0/+622
    At function exit, AVX optimized string/memory functions have VZEROUPPER which triggers RTM abort. When such functions are called inside a transactionally executing RTM region, RTM abort causes severe performance degradation. Add tests to verify that string/memory functions won't cause RTM abort in RTM region.

    (cherry picked from commit 4bd660be40967cd69072f69ebc2ad32bfcc1f206)
* x86-64: Add AVX optimized string/memory functions for RTM | H.J. Lu | 2022-01-27 | 48 | -190/+594
    Since VZEROUPPER triggers RTM abort while VZEROALL won't, select AVX optimized string/memory functions with

        xtest
        jz      1f
        vzeroall
        ret
    1:
        vzeroupper
        ret

    at function exit on processors with usable RTM, but without 256-bit EVEX instructions to avoid VZEROUPPER inside a transactionally executing RTM region.

    (cherry picked from commit 7ebba91361badf7531d4e75050627a88d424872f)
* x86-64: Add memcmp family functions with 256-bit EVEX | H.J. Lu | 2022-01-27 | 5 | -4/+467
    Update ifunc-memcmp.h to select the function optimized with 256-bit EVEX instructions using YMM16-YMM31 registers to avoid RTM abort with usable AVX512VL, AVX512BW and MOVBE since VZEROUPPER isn't needed at function exit.

    (cherry picked from commit 91264fe3577fe887b4860923fa6142b5274c8965)
* x86-64: Add memset family functions with 256-bit EVEX | H.J. Lu | 2022-01-27 | 6 | -14/+90
    Update ifunc-memset.h/ifunc-wmemset.h to select the function optimized with 256-bit EVEX instructions using YMM16-YMM31 registers to avoid RTM abort with usable AVX512VL and AVX512BW since VZEROUPPER isn't needed at function exit.

    (cherry picked from commit 1b968b6b9b3aac702ac2f133e0dd16cfdbb415ee)
* x86-64: Add memmove family functions with 256-bit EVEX | H.J. Lu | 2022-01-27 | 5 | -11/+97
    Update ifunc-memmove.h to select the function optimized with 256-bit EVEX instructions using YMM16-YMM31 registers to avoid RTM abort with usable AVX512VL since VZEROUPPER isn't needed at function exit.

    (cherry picked from commit 63ad43566f7a25d140dc723598aeb441ad657eed)
* x86-64: Add strcpy family functions with 256-bit EVEX | H.J. Lu | 2022-01-27 | 9 | -0/+1338
    Update ifunc-strcpy.h to select the function optimized with 256-bit EVEX instructions using YMM16-YMM31 registers to avoid RTM abort with usable AVX512VL and AVX512BW since VZEROUPPER isn't needed at function exit.

    (cherry picked from commit 525bc2a32c9710df40371f951217c6ae7a923aee)
* x86-64: Add ifunc-avx2.h functions with 256-bit EVEX | H.J. Lu | 2022-01-27 | 24 | -17/+2996
    Update ifunc-avx2.h, strchr.c, strcmp.c, strncmp.c and wcsnlen.c to select the function optimized with 256-bit EVEX instructions using YMM16-YMM31 registers to avoid RTM abort with usable AVX512VL, AVX512BW and BMI2 since VZEROUPPER isn't needed at function exit.

    For strcmp/strncmp, prefer AVX2 strcmp/strncmp if Prefer_AVX2_STRCMP is set.

    (cherry picked from commit 1fd8c163a83d96ace1ff78fa6bac7aee084f6f77)
* x86: Add AVX512VL_Usable and AVX512BW_Usable | H.J. Lu | 2022-01-27 | 2 | -0/+12
    Add AVX512VL_Usable and AVX512BW_Usable for backporting string/memory functions optimized with 256-bit EVEX.
* x86: Set Prefer_No_VZEROUPPER and add Prefer_AVX2_STRCMP | H.J. Lu | 2022-01-27 | 3 | -2/+23
    1. Set Prefer_No_VZEROUPPER if RTM is usable to avoid RTM abort triggered by VZEROUPPER inside a transactionally executing RTM region.
    2. Since to compare 2 32-byte strings, 256-bit EVEX strcmp requires 2 loads, 3 VPCMPs and 2 KORDs while AVX2 strcmp requires 1 load, 2 VPCMPEQs, 1 VPMINU and 1 VPMOVMSKB, AVX2 strcmp is faster than EVEX strcmp. Add Prefer_AVX2_STRCMP to prefer AVX2 strcmp family functions.

    (cherry picked from commit 1da50d4bda07f04135dca39f40e79fc9eabed1f8)
* NEWS: Add a bug fix entry for BZ #28755 | H.J. Lu | 2022-01-27 | 1 | -0/+1
* x86: Fix __wcsncmp_avx2 in strcmp-avx2.S [BZ# 28755] | Noah Goldstein | 2022-01-27 | 1 | -0/+10
    Fixes [BZ# 28755] for wcsncmp by redirecting length >= 2^56 to __wcscmp_avx2. For x86_64 this covers the entire address range so any length larger could not possibly be used to bound `s1` or `s2`.

    test-strcmp, test-strncmp, test-wcscmp, and test-wcsncmp all pass.

    Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>

    (cherry picked from commit ddf0992cf57a93200e0c782e2a94d0733a5a0b87)
* Fix SXID_ERASE behavior in setuid programs (BZ #27471) | Siddhesh Poyarekar | 2021-04-14 | 2 | -30/+52
    When parse_tunables tries to erase a tunable marked as SXID_ERASE for setuid programs, it ends up setting the envvar string iterator incorrectly, because of which it may parse the next tunable incorrectly. Given that currently the implementation allows malformed and unrecognized tunables pass through, it may even allow SXID_ERASE tunables to go through.

    This change revamps the SXID_ERASE implementation so that:

    - Only valid tunables are written back to the tunestr string, because of which children of SXID programs will only inherit a clean list of identified tunables that are not SXID_ERASE.
    - Unrecognized tunables get scrubbed off from the environment and subsequently from the child environment.
    - This has the side-effect that a tunable that is not identified by the setxid binary, will not be passed on to a non-setxid child even if the child could have identified that tunable. This may break applications that expect this behaviour but expecting such tunables to cross the SXID boundary is wrong.

    Reviewed-by: Carlos O'Donell <carlos@redhat.com>

    (cherry picked from commit 2ed18c5b534d9e92fc006202a5af0df6b72e7aca)
* Enhance setuid-tunables test | Siddhesh Poyarekar | 2021-04-14 | 3 | -23/+524
    Instead of passing GLIBC_TUNABLES via the environment, pass the environment variable from parent to child. This allows us to test multiple variables to ensure better coverage.

    The test list currently only includes the case that's already being tested. More tests will be added later.

    Reviewed-by: Carlos O'Donell <carlos@redhat.com>

    (cherry picked from commit 061fe3f8add46a89b7453e87eabb9c4695005ced)

    Also add intprops.h from 2.29 from commit 8e6fd2bdb21efe2cc1ae7571ff8fb2599db6a05a
* tst-env-setuid: Use support_capture_subprogram_self_sgid | Siddhesh Poyarekar | 2021-04-14 | 1 | -183/+14
    Use the support_capture_subprogram_self_sgid to spawn an sgid child.

    Reviewed-by: Carlos O'Donell <carlos@redhat.com>

    (cherry picked from commit ca335281068a1ed549a75ee64f90a8310755956f)
* support: Add capability to fork an sgid child | Siddhesh Poyarekar | 2021-04-14 | 5 | -170/+168
    Add a new function support_capture_subprogram_self_sgid that spawns an sgid child of the running program with its own image and returns the exit code of the child process. This functionality is used by at least three tests in the testsuite at the moment, so it makes sense to consolidate.

    There is also a new function support_subprogram_wait which should provide simple system() like functionality that does not set up file actions. This is useful in cases where only the return code of the spawned subprocess is interesting.

    This patch also ports tst-secure-getenv to this new function. A subsequent patch will port other tests. This also brings an important change to tst-secure-getenv behaviour. Now instead of succeeding, the test fails as UNSUPPORTED if it is unable to spawn a setgid child, which is how it should have been in the first place.

    Reviewed-by: Carlos O'Donell <carlos@redhat.com>

    (cherry picked from commit 716a3bdc41b2b4b864dc64475015ba51e35e1273)
* support: Typo and formatting fixes | Siddhesh Poyarekar | 2021-04-14 | 2 | -4/+4
    - Add a newline to the end of error messages in transfer().
    - Fixed the name of support_subprocess_init().

    (cherry picked from commit 95c68080a3ded882789b1629f872c3ad531efda0)
* support: Pass environ to child process | Siddhesh Poyarekar | 2021-04-14 | 1 | -1/+1
    Pass environ to posix_spawn so that the child process can inherit the environment of the test.

    (cherry picked from commit e958490f8c74e660bd93c128b3bea746e268f3f6)