about summary refs log tree commit diff
path: root/stdlib
Commit message (Collapse)AuthorAgeFilesLines
* stdlib: Undo post review change to 16adc58e73f3 [BZ #27749]Vitaly Buka2023-02-203-2/+81
| | | | | | | | | | Post review removal of "goto restart" from https://sourceware.org/pipermail/libc-alpha/2021-April/125470.html introduced a bug when some atexit handers skipped. Signed-off-by: Vitaly Buka <vitalybuka@google.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* Use uintptr_t instead of performing pointer subtraction with a null pointerQihao Chencao2023-02-171-3/+3
| | | | | | Signed-off-by: Qihao Chencao <twose@qq.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* stdlib: Simplify getenvAdhemerval Zanella2023-02-171-59/+5
| | | | | | | | And remove _STRING_ARCH_unaligned usage. Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
* C2x strtol binary constant handlingJoseph Myers2023-02-1618-10/+523
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | C2x adds binary integer constants starting with 0b or 0B, and supports those constants in strtol-family functions when the base passed is 0 or 2. Implement that strtol support for glibc. As discussed at <https://sourceware.org/pipermail/libc-alpha/2020-December/120414.html>, this is incompatible with previous C standard versions, in that such an input string starting with 0b or 0B was previously required to be parsed as 0 (with the rest of the string unprocessed). Thus, as proposed there, this patch adds 20 new __isoc23_* functions with appropriate header redirection support. This patch does *not* do anything about scanf %i (which will need 12 new functions per long double variant, so 12, 24 or 36 depending on the glibc configuration), instead leaving that for a future patch. The function names would remain as __isoc23_* even if C2x ends up published in 2024 rather than 2023. Making this change leads to the question of what should happen to internal uses of these functions in glibc and its tests. The header redirection (which applies for _GNU_SOURCE or any other feature test macros enabling C2x features) has the effect of redirecting internal uses but without those uses then ending up at a hidden alias (see the comment in include/stdio.h about interaction with libc_hidden_proto). It seems desirable for the default for internal uses to be the same versions used by normal code using _GNU_SOURCE, so rather than doing anything to disable that redirection, similar macro definitions to those in include/stdio.h are added to the include/ headers for the new functions. Given that the default for uses in glibc is for the redirections to apply, the next question is whether the C2x semantics are correct for all those uses. Uses with the base fixed to 10, 16 or any other value other than 0 or 2 can be ignored. I think this leaves the following internal uses to consider (an important consideration for review of this patch will be both whether this list is complete and whether my conclusions on all entries in it are correct): benchtests/bench-malloc-simple.c benchtests/bench-string.h elf/sotruss-lib.c math/libm-test-support.c nptl/perf.c nscd/nscd_conf.c nss/nss_files/files-parse.c posix/tst-fnmatch.c posix/wordexp.c resolv/inet_addr.c rt/tst-mqueue7.c soft-fp/testit.c stdlib/fmtmsg.c support/support_test_main.c support/test-container.c sysdeps/pthread/tst-mutex10.c I think all of these places are OK with the new semantics, except for resolv/inet_addr.c, where the POSIX semantics of inet_addr do not allow for binary constants; thus, I changed that file (to use __strtoul_internal, whose semantics are unchanged) and added a test for this case. In the case of posix/wordexp.c I think accepting binary constants is OK since POSIX explicitly allows additional forms of shell arithmetic expressions, and in stdlib/fmtmsg.c SEV_LEVEL is not in POSIX so again I think accepting binary constants is OK. Functions such as __strtol_internal, which are only exported for compatibility with old binaries from when those were used in inline functions in headers, have unchanged semantics; the __*_l_internal versions (purely internal to libc and not exported) have a new argument to specify whether to accept binary constants. As well as for the standard functions, the header redirection also applies to the *_l versions (GNU extensions), and to legacy functions such as strtoq, to avoid confusing inconsistency (the *q functions redirect to __isoc23_*ll rather than needing their own __isoc23_* entry points). For the functions that are only declared with _GNU_SOURCE, this means the old versions are no longer available for normal user programs at all. An internal __GLIBC_USE_C2X_STRTOL macro is used to control the redirections in the headers, and cases in glibc that wish to avoid the redirections - the function implementations themselves and the tests of the old versions of the GNU functions - then undefine and redefine that macro to allow the old versions to be accessed. (There would of course be greater complexity should we wish to make any of the old versions into compat symbols / avoid them being defined at all for new glibc ABIs.) strtol_l.c has some similarity to strtol.c in gnulib, but has already diverged some way (and isn't listed at all at https://sourceware.org/glibc/wiki/SharedSourceFiles unlike strtoll.c and strtoul.c); I haven't made any attempts at gnulib compatibility in the changes to that file. I note incidentally that inttypes.h and wchar.h are missing the __nonnull present on declarations of this family of functions in stdlib.h; I didn't make any changes in that regard for the new declarations added.
* Replace rawmemchr (s, '\0') with strchrWilco Dijkstra2023-02-061-2/+1
| | | | | | | | | Almost all uses of rawmemchr find the end of a string. Since most targets use a generic implementation, replacing it with strchr is better since that is optimized by compilers into strlen (s) + s. Also fix the generic rawmemchr implementation to use a cast to unsigned char in the if statement. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* stdlib: tests: don't double-define _FORTIFY_SOURCESam James2023-02-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | If using -D_FORITFY_SOURCE=3 (in my case, I've patched GCC to add =3 instead of =2 (we've done =2 for years in Gentoo)), building glibc tests will fail on testmb like: ``` <command-line>: error: "_FORTIFY_SOURCE" redefined [-Werror] <built-in>: note: this is the location of the previous definition cc1: all warnings being treated as errors make[2]: *** [../o-iterator.mk:9: /var/tmp/portage/sys-libs/glibc-2.36/work/build-x86-x86_64-pc-linux-gnu-nptl/stdlib/testmb.o] Error 1 make[2]: *** Waiting for unfinished jobs.... ``` It's just because we're always setting -D_FORTIFY_SOURCE=2 rather than unsetting it first. If F_S is already 2, it's harmless, but if it's another value (say, 1, or 3), the compiler will bawk. (I'm not aware of a reason this couldn't be tested with =3, but the toolchain support is limited for that (too new), and we want to run the tests everywhere possible.) Signed-off-by: Sam James <sam@gentoo.org> Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* Update copyright dates with scripts/update-copyrightsJoseph Myers2023-01-06216-216/+216
|
* Remove trailing whitespace in gmp.hJoseph Myers2023-01-061-1/+1
|
* stdio-common: Convert vfprintf and related functions to buffersFlorian Weimer2022-12-192-151/+83
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | vfprintf is entangled with vfwprintf (of course), __printf_fp, __printf_fphex, __vstrfmon_l_internal, and the strfrom family of functions. The latter use the internal snprintf functionality, so vsnprintf is converted as well. The simples conversion is __printf_fphex, followed by __vstrfmon_l_internal and __printf_fp, and finally __vfprintf_internal and __vfwprintf_internal. __vsnprintf_internal and strfrom* are mostly consuming the new interfaces, so they are comparatively simple. __printf_fp is a public symbol, so the FILE *-based interface had to preserved. The __printf_fp rewrite does not change the actual binary-to-decimal conversion algorithm, and digits are still not emitted directly to the target buffer. However, the staging buffer now uses bytes instead of wide characters, and one buffer copy is eliminated. The changes are at least performance-neutral in my testing. Floating point printing and snprintf improved measurably, so that this Lua script for i=1,5000000 do print(i, i * math.pi) end runs about 5% faster for me. To preserve fprintf performance for a simple "%d" format, this commit has some logic changes under LABEL (unsigned_number) to avoid additional function calls. There are certainly some very easy performance improvements here: binary, octal and hexadecimal formatting can easily avoid the temporary work buffer (the number of digits can be computed ahead-of-time using one of the __builtin_clz* built-ins). Decimal formatting can use a specialized version of _itoa_word for base 10. The existing (inconsistent) width handling between strfmon and printf is preserved here. __print_fp_buffer_1 would have to use __translated_number_width to achieve ISO conformance for printf. Test expectations in libio/tst-vtables-common.c are adjusted because the internal staging buffer merges all virtual function calls into one. In general, stack buffer usage is greatly reduced, particularly for unbuffered input streams. __printf_fp can still use a large buffer in binary128 mode for %g, though. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* stdlib: Move _IO_cleanup to call_function_static_weakAdhemerval Zanella2022-12-121-4/+2
| | | | Reviewed-by: Florian Weimer <fweimer@redhat.com>
* Apply asm redirection in gmp.h before first useAdhemerval Zanella2022-11-071-33/+39
| | | | | | | | For clang the redeclaration after the first use, the visibility attribute is silently ignored (symbol is STV_DEFAULT) while the asm label attribute causes an error. Reviewed-by: Fangrui Song <maskray@google.com>
* Fix OOB read in stdlib thousand grouping parsing [BZ #29727]Szabolcs Nagy2022-11-021-9/+7
| | | | | | | | | | | | | | | | | | | __correctly_grouped_prefixmb only worked with thousands_len == 1, otherwise it read past the end of cp or thousands. This affects scanf formats like %'d, %'f and the internal but exposed __strto{l,ul,f,d,..}_internal with grouping flag set and an LC_NUMERIC locale where thousands_len > 1. Avoid OOB access by considering thousands_len when initializing cp. This fixes bug 29727. Found by the morello port with strict bounds checking where FAIL: stdlib/tst-strtod4 FAIL: stdlib/tst-strtod5i crashed using a locale with thousands_len==3.
* configure: Use -Wno-ignored-attributes if compiler warns about multiple aliasesAdhemerval Zanella2022-11-011-0/+12
| | | | | | | | | clang emits an warning when a double alias redirection is used, to warn the the original symbol will be used even when weak definition is overridden. However, this is a common pattern for weak_alias, where multiple alias are set to same symbol. Reviewed-by: Fangrui Song <maskray@google.com>
* stdlib/strfrom: Add copysign to fix NAN issue on riscv (BZ #29501)Letu Ren2022-10-281-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | According to the specification of ISO/IEC TS 18661-1:2014, The strfromd, strfromf, and strfroml functions are equivalent to snprintf(s, n, format, fp) (7.21.6.5), except the format string contains only the character %, an optional precision that does not contain an asterisk *, and one of the conversion specifiers a, A, e, E, f, F, g, or G, which applies to the type (double, float, or long double) indicated by the function suffix (rather than by a length modifier). Use of these functions with any other 20 format string results in undefined behavior. strfromf will convert the arguement with type float to double first. According to the latest version of IEEE754 which is published in 2019, Conversion of a quiet NaN from a narrower format to a wider format in the same radix, and then back to the same narrower format, should not change the quiet NaN payload in any way except to make it canonical. When either an input or result is a NaN, this standard does not interpret the sign of a NaN. However, operations on bit strings—copy, negate, abs, copySign—specify the sign bit of a NaN result, sometimes based upon the sign bit of a NaN operand. The logical predicates totalOrder and isSignMinus are also affected by the sign bit of a NaN operand. For all other operations, this standard does not specify the sign bit of a NaN result, even when there is only one input NaN, or when the NaN is produced from an invalid operation. converting NAN or -NAN with type float to double doesn't need to keep the signbit. As a result, this test case isn't mandatory. The problem is that according to RISC-V ISA manual in chapter 11.3 of riscv-isa-20191213, Except when otherwise stated, if the result of a floating-point operation is NaN, it is the canonical NaN. The canonical NaN has a positive sign and all significand bits clear except the MSB, a.k.a. the quiet bit. For single-precision floating-point, this corresponds to the pattern 0x7fc00000. which means that conversion -NAN from float to double won't keep the signbit. Since glibc ought to be consistent here between types and architectures, this patch adds copysign to fix this problem if the string is NAN. This patch adds two different functions under sysdeps directory to work around the issue. This patch has been tested on x86_64 and riscv64. Resolves: BZ #29501 v2: Change from macros to different inline functions. v3: Add unlikely check to isnan. v4: Fix wrong commit message header. v5: Fix style: add space before parentheses. v6: Add copyright. Signed-off-by: Letu Ren <fantasquex@gmail.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* longlong.h: update from GCC for LoongArch clz/ctz supportXi Ruoyao2022-10-281-0/+12
| | | | | Update longlong.h to GCC r13-3269. Keep our local change (prefer https for gnu.org URL).
* Use PTR_MANGLE and PTR_DEMANGLE unconditionally in C sourcesFlorian Weimer2022-10-185-16/+4
| | | | | | | | | | | | | | | | | In the future, this will result in a compilation failure if the macros are unexpectedly undefined (due to header inclusion ordering or header inclusion missing altogether). Assembler sources are more difficult to convert. In many cases, they are hand-optimized for the mangling and no-mangling variants, which is why they are not converted. sysdeps/s390/s390-32/__longjmp.c and sysdeps/s390/s390-64/__longjmp.c are special: These are C sources, but most of the implementation is in assembler, so the PTR_DEMANGLE macro has to be undefined in some cases, to match the assembler style. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* Introduce <pointer_guard.h>, extracted from <sysdep.h>Florian Weimer2022-10-185-4/+5
| | | | | | | | | | | | | | This allows us to define a generic no-op version of PTR_MANGLE and PTR_DEMANGLE. In the future, we can use PTR_MANGLE and PTR_DEMANGLE unconditionally in C sources, avoiding an unintended loss of hardening due to missing include files or unlucky header inclusion ordering. In i386 and x86_64, we can avoid a <tls.h> dependency in the C code by using the computed constant from <tcb-offsets.h>. <sysdep.h> no longer includes these definitions, so there is no cyclic dependency anymore when computing the <tcb-offsets.h> constants. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* malloc: Do not clobber errno on __getrandom_nocancel (BZ #29624)Adhemerval Zanella2022-09-301-1/+1
| | | | | | | | | | Use INTERNAL_SYSCALL_CALL instead of INLINE_SYSCALL_CALL. This requires emulate the semantic for hurd call (so __arc4random_buf uses the fallback). Checked on x86_64-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
* stdlib: Fix __getrandom_nocancel type and arc4random usage (BZ #29638)Adhemerval Zanella2022-09-301-1/+1
| | | | | | | | | Using an unsigned type prevents the fallback to be used if kernel does not support getrandom syscall. Checked on x86_64-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
* stdlib: Fix macro expansion producing 'defined' has undefined behaviorAdhemerval Zanella2022-08-301-3/+6
| | | | | | | | | | | | | The FPIOCONST_HAVE_EXTENDED_RANGE is defined as: #define FPIOCONST_HAVE_EXTENDED_RANGE \ ((!defined __NO_LONG_DOUBLE_MATH && __LDBL_MAX_EXP__ > 1024) \ || __HAVE_DISTINCT_FLOAT128) Which is undefined behavior accordingly to C Standard (Preprocessing directives, p4). Checked on x86_64-linux-gnu.
* inet: Turn __ivaliduser into a compatibility symbolFlorian Weimer2022-08-101-0/+2
| | | | | It is not declared in a header file, and as the comment indicates, it is not expected to be used.
* assert: Do not use stderr in libc-internal assertFlorian Weimer2022-08-031-1/+1
| | | | | | | | | | | | | | | | | | | | Redirect internal assertion failures to __libc_assert_fail, based on based on __libc_message, which writes directly to STDERR_FILENO and calls abort. Also disable message translation and reword the error message slightly (adjusting stdlib/tst-bz20544 accordingly). As a result of these changes, malloc no longer needs its own redefinition of __assert_fail. __libc_assert_fail needs to be stubbed out during rtld dependency analysis because the rtld rebuilds turn __libc_assert_fail into __assert_fail, which is unconditionally provided by elf/dl-minimal.c. This change is not possible for the public assert macro and its __assert_fail function because POSIX requires that the diagnostic is written to stderr. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* stdlib: Simplify arc4random_uniformAdhemerval Zanella2022-08-011-99/+30
| | | | | | | | | | | | | | | | | It uses the bitmask with rejection [1], which calculates a mask being the lowest power of two bounding the request upper bound, successively queries new random values, and rejects values outside the requested range. Performance-wise, there is no much gain in trying to conserve bits since arc4random is wrapper on getrandom syscall. It should be cheaper to just query a uint32_t value. The algorithm also avoids modulo and divide operations, which might be costly depending of the architecture. [1] https://www.pcg-random.org/posts/bounded-rands.html Reviewed-by: Yann Droneaud <ydroneaud@opteya.com>
* stdlib: Tuned down tst-arc4random-thread internal parametersAdhemerval Zanella2022-07-291-6/+18
| | | | | | | | | | | | | | | | With new arc4random implementation, the internal parameters might require a lot of runtime and/or trigger some contention on older kernels (which might trigger spurious timeout failures). Also, since we are now testing getrandom entropy instead of an userspace RNG, there is no much need to extensive testing. With this change the tst-arc4random-thread goes from about 1m to 5s on a Ryzen 9 with 5.15.0-41-generic. Checked on x86_64-linux-gnu. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
* arc4random: simplify design for better safetyJason A. Donenfeld2022-07-275-559/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rather than buffering 16 MiB of entropy in userspace (by way of chacha20), simply call getrandom() every time. This approach is doubtlessly slower, for now, but trying to prematurely optimize arc4random appears to be leading toward all sorts of nasty properties and gotchas. Instead, this patch takes a much more conservative approach. The interface is added as a basic loop wrapper around getrandom(), and then later, the kernel and libc together can work together on optimizing that. This prevents numerous issues in which userspace is unaware of when it really must throw away its buffer, since we avoid buffering all together. Future improvements may include userspace learning more from the kernel about when to do that, which might make these sorts of chacha20-based optimizations more possible. The current heuristic of 16 MiB is meaningless garbage that doesn't correspond to anything the kernel might know about. So for now, let's just do something conservative that we know is correct and won't lead to cryptographic issues for users of this function. This patch might be considered along the lines of, "optimization is the root of all evil," in that the much more complex implementation it replaces moves too fast without considering security implications, whereas the incremental approach done here is a much safer way of going about things. Once this lands, we can take our time in optimizing this properly using new interplay between the kernel and userspace. getrandom(0) is used, since that's the one that ensures the bytes returned are cryptographically secure. But on systems without it, we fallback to using /dev/urandom. This is unfortunate because it means opening a file descriptor, but there's not much of a choice. Secondly, as part of the fallback, in order to get more or less the same properties of getrandom(0), we poll on /dev/random, and if the poll succeeds at least once, then we assume the RNG is initialized. This is a rough approximation, as the ancient "non-blocking pool" initialized after the "blocking pool", not before, and it may not port back to all ancient kernels, though it does to all kernels supported by glibc (≥3.2), so generally it's the best approximation we can do. The motivation for including arc4random, in the first place, is to have source-level compatibility with existing code. That means this patch doesn't attempt to litigate the interface itself. It does, however, choose a conservative approach for implementing it. Cc: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org> Cc: Florian Weimer <fweimer@redhat.com> Cc: Cristian Rodríguez <crrodriguez@opensuse.org> Cc: Paul Eggert <eggert@cs.ucla.edu> Cc: Mark Harris <mark.hsj@gmail.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: linux-crypto@vger.kernel.org Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* aarch64: Add optimized chacha20Adhemerval Zanella Netto2022-07-221-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It adds vectorized ChaCha20 implementation based on libgcrypt cipher/chacha20-aarch64.S. It is used as default and only little-endian is supported (BE uses generic code). As for generic implementation, the last step that XOR with the input is omited. The final state register clearing is also omitted. On a virtualized Linux on Apple M1 it shows the following improvements (using formatted bench-arc4random data): GENERIC MB/s ----------------------------------------------- arc4random [single-thread] 380.89 arc4random_buf(16) [single-thread] 500.73 arc4random_buf(32) [single-thread] 552.61 arc4random_buf(48) [single-thread] 566.82 arc4random_buf(64) [single-thread] 574.01 arc4random_buf(80) [single-thread] 581.02 arc4random_buf(96) [single-thread] 591.19 arc4random_buf(112) [single-thread] 592.29 arc4random_buf(128) [single-thread] 596.43 ----------------------------------------------- OPTIMIZED MB/s ----------------------------------------------- arc4random [single-thread] 569.60 arc4random_buf(16) [single-thread] 825.78 arc4random_buf(32) [single-thread] 987.03 arc4random_buf(48) [single-thread] 1042.39 arc4random_buf(64) [single-thread] 1075.50 arc4random_buf(80) [single-thread] 1094.68 arc4random_buf(96) [single-thread] 1130.16 arc4random_buf(112) [single-thread] 1129.58 arc4random_buf(128) [single-thread] 1137.91 ----------------------------------------------- Checked on aarch64-linux-gnu.
* stdlib: Add arc4random testsAdhemerval Zanella Netto2022-07-225-0/+860
| | | | | | | | | | | | | | | | | | | | The basic tst-arc4random-chacha20.c checks if the output of ChaCha20 implementation matches the reference test vectors from RFC8439. The tst-arc4random-fork.c check if subprocesses generate distinct streams of randomness (if fork handling is done correctly). The tst-arc4random-stats.c is a statistical test to the randomness of arc4random, arc4random_buf, and arc4random_uniform. The tst-arc4random-thread.c check if threads generate distinct streams of randomness (if function are thread-safe). Checked on x86_64-linux-gnu, aarch64-linux, and powerpc64le-linux-gnu. Co-authored-by: Florian Weimer <fweimer@redhat.com> Checked on x86_64-linux-gnu and aarch64-linux-gnu.
* stdlib: Add arc4random, arc4random_buf, and arc4random_uniform (BZ #4417)Adhemerval Zanella Netto2022-07-227-0/+603
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The implementation is based on scalar Chacha20 with per-thread cache. It uses getrandom or /dev/urandom as fallback to get the initial entropy, and reseeds the internal state on every 16MB of consumed buffer. To improve performance and lower memory consumption the per-thread cache is allocated lazily on first arc4random functions call, and if the memory allocation fails getentropy or /dev/urandom is used as fallback. The cache is also cleared on thread exit iff it was initialized (so if arc4random is not called it is not touched). Although it is lock-free, arc4random is still not async-signal-safe (the per thread state is not updated atomically). The ChaCha20 implementation is based on RFC8439 [1], omitting the final XOR of the keystream with the plaintext because the plaintext is a stream of zeros. This strategy is similar to what OpenBSD arc4random does. The arc4random_uniform is based on previous work by Florian Weimer, where the algorithm is based on Jérémie Lumbroso paper Optimal Discrete Uniform Generation from Coin Flips, and Applications (2013) [2], who credits Donald E. Knuth and Andrew C. Yao, The complexity of nonuniform random number generation (1976), for solving the general case. The main advantage of this method is the that the unit of randomness is not the uniform random variable (uint32_t), but a random bit. It optimizes the internal buffer sampling by initially consuming a 32-bit random variable and then sampling byte per byte. Depending of the upper bound requested, it might lead to better CPU utilization. Checked on x86_64-linux-gnu, aarch64-linux, and powerpc64le-linux-gnu. Co-authored-by: Florian Weimer <fweimer@redhat.com> Reviewed-by: Yann Droneaud <ydroneaud@opteya.com> [1] https://datatracker.ietf.org/doc/html/rfc8439 [2] https://arxiv.org/pdf/1304.1916.pdf
* stdlib: Simplify buffer management in canonicalizeFlorian Weimer2022-07-051-64/+51
| | | | | | | | | | | Move the buffer management from realpath_stk to __realpath. This allows returning directly after allocation errors. Always make a copy of the result buffer using strdup even if it is already heap-allocated. (Heap-allocated buffers are somewhat rare.) This avoids GCC warnings at certain optimization levels. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* Refactor internal-signals.hAdhemerval Zanella2022-06-301-5/+5
| | | | | | | | | | | | | | | | | | The main drive is to optimize the internal usage and required size when sigset_t is embedded in other data structures. On Linux, the current supported signal set requires up to 8 bytes (16 on mips), was lower than the user defined sigset_t (128 bytes). A new internal type internal_sigset_t is added, along with the functions to operate on it similar to the ones for sigset_t. The internal-signals.h is also refactored to remove unused functions Besides small stack usage on some functions (posix_spawn, abort) it lower the struct pthread by about 120 bytes (112 on mips). Checked on x86_64-linux-gnu. Reviewed-by: Arjun Shankar <arjun@redhat.com>
* stdlib: Fixup mbstowcs NULL __dst handling. [BZ #29279]Noah Goldstein2022-06-231-4/+4
| | | | | | | | | | | | | | | | commit 464d189b9622932a75302290625de84931656ec0 (origin/master, origin/HEAD) Author: Noah Goldstein <goldstein.w.n@gmail.com> Date: Wed Jun 22 08:24:21 2022 -0700 stdlib: Remove attr_write from mbstows if dst is NULL [BZ: 29265] Incorrectly called `__mbstowcs_chk` in the NULL __dst case which is incorrect as in the NULL __dst case we are explicitly skipping the objsize checks. As well, remove the `__always_inline` attribute which exists in `__fortify_function`. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* stdlib: Remove attr_write from mbstows if dst is NULL [BZ: 29265]Noah Goldstein2022-06-223-5/+21
| | | | | | | | mbstows is defined if dst is NULL and is defined to special cased if dst is NULL so the fortify objsize check if incorrect in that case. Tested on x86-64 linux. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* stdlib: Remove trailing whitespace from MakefileNoah Goldstein2022-06-221-1/+1
| | | | | This causes precommit tests to fail when pushing commits that modify this file.
* stdlib: Reflow and sort most variable assignmentsAdhemerval Zanella2022-04-131-63/+227
|
* realpath: Bring back GNU extension on ENOENT and EACCES [BZ #28996]Siddhesh Poyarekar2022-03-312-5/+8
| | | | | | | | | | | | | | | The GNU extension for realpath states that if the path resolution fails with ENOENT or EACCES and the resolved buffer is non-NULL, it will contain part of the path that failed resolution. commit 949ad78a189194048df8a253bb31d1d11d919044 broke this when it omitted the copy on failure. Bring it back partially to continue supporting this GNU extension. Resolves: BZ #28996 Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org> Reviewed-by: Andreas Schwab <schwab@linux-m68k.org>
* stdlib: Fix tst-getrandom memcmp callAdhemerval Zanella2022-03-311-1/+1
| | | | | | | The idea is to check if the up sizeof (buf) are equal, not only the first byte. Checked on x86_64-linux-gnu and i686-linux-gnu.
* stdlib: Fix tst-rand48.c printf typesAdhemerval Zanella2022-03-311-3/+3
| | | | Checked on x86_64-linux-gnu and i686-linux-gnu.
* Add some missing access function attributesSteve Grubb2022-03-101-2/+4
| | | | | | | This patch adds some missing access function attributes to getrandom / getentropy and several functions in sys/xattr.h Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* realpath: Do not copy result on failure (BZ #28815)Siddhesh Poyarekar2022-02-212-3/+5
| | | | | | | | | | | On failure, the contents of the resolved buffer passed in by the caller to realpath are undefined. Do not copy any partial resolution to the buffer and also do not test resolved contents in test-canon.c. Resolves: BZ #28815 Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* stdlib: Avoid -Wuse-after-free in __add_to_environ [BZ #26779]Martin Sebor2022-01-251-2/+4
| | | | Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* realpath: Avoid overwriting preexisting error (CVE-2021-3998)Siddhesh Poyarekar2022-01-241-1/+1
| | | | | | | | | | Set errno and failure for paths that are too long only if no other error occurred earlier. Related: BZ #28770 Reviewed-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* tst-realpath-toolong: Fix hurd buildSiddhesh Poyarekar2022-01-241-0/+4
| | | | | | Define PATH_MAX to a constant if it isn't already defined, like in hurd. Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* realpath: Set errno to ENAMETOOLONG for result larger than PATH_MAX [BZ #28770]Siddhesh Poyarekar2022-01-213-2/+60
| | | | | | | | | | | | | | realpath returns an allocated string when the result exceeds PATH_MAX, which is unexpected when its second argument is not NULL. This results in the second argument (resolved) being uninitialized and also results in a memory leak since the caller expects resolved to be the same as the returned value. Return NULL and set errno to ENAMETOOLONG if the result exceeds PATH_MAX. This fixes [BZ #28770], which is CVE-2021-3998. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* stdlib: Fix formatting of tests list in MakefileSiddhesh Poyarekar2022-01-131-75/+77
| | | | | Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org> Reviewed-by: Florian Weimer <fweimer@redhat.com>
* stdlib: Sort tests in MakefileSiddhesh Poyarekar2022-01-131-24/+75
| | | | | | Put one test per line and sort them. Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* Update copyright dates with scripts/update-copyrightsPaul Eggert2022-01-01211-211/+211
| | | | | | | | | | | | | | | | | | | | | | | I used these shell commands: ../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright (cd ../glibc && git commit -am"[this commit message]") and then ignored the output, which consisted lines saying "FOO: warning: copyright statement not found" for each of 7061 files FOO. I then removed trailing white space from math/tgmath.h, support/tst-support-open-dev-null-range.c, and sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following obscure pre-commit check failure diagnostics from Savannah. I don't know why I run into these diagnostics whereas others evidently do not. remote: *** 912-#endif remote: *** 913: remote: *** 914- remote: *** error: lines with trailing whitespace found ... remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
* fortify: Fix spurious warning with realpathSiddhesh Poyarekar2021-12-171-1/+1
| | | | | | | | | The length and object size arguments were swapped around for realpath. Also add a smoke test so that any changes in this area get caught in future. Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* Add alloc_align attribute to memalign et alJonathan Wakely2021-10-211-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GCC 4.9.0 added the alloc_align attribute to say that a function argument specifies the alignment of the returned pointer. Clang supports the attribute too. Using the attribute can allow a compiler to generate better code if it knows the returned pointer has a minimum alignment. See https://gcc.gnu.org/PR60092 for more details. GCC implicitly knows the semantics of aligned_alloc and posix_memalign, but not the obsolete memalign. As a result, GCC generates worse code when memalign is used, compared to aligned_alloc. Clang knows about aligned_alloc and memalign, but not posix_memalign. This change adds a new __attribute_alloc_align__ macro to <sys/cdefs.h> and then uses it on memalign (where it helps GCC) and aligned_alloc (where GCC and Clang already know the semantics, but it doesn't hurt) and xposix_memalign. It can't be used on posix_memalign because that doesn't return a pointer (the allocated pointer is returned via a void** parameter instead). Unlike the alloc_size attribute, alloc_align only allows a single argument. That means the new __attribute_alloc_align__ macro doesn't really need to be used with double parentheses to protect a comma between its arguments. For consistency with __attribute_alloc_size__ this patch defines it the same way, so that double parentheses are required. Signed-off-by: Jonathan Wakely <jwakely@redhat.com> Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>
* stdlib: Fix tst-canon-bz26341 when the glibc build current working directory ↵omain GEISSLER2021-10-201-0/+6
| | | | is itself using symlinks.
* Make sure that the fortified function conditionals are constantSiddhesh Poyarekar2021-10-201-40/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In _FORTIFY_SOURCE=3, the size expression may be non-constant, resulting in branches in the inline functions remaining intact and causing a tiny overhead. Clang (and in future, gcc) make sure that the -1 case is always safe, i.e. any comparison of the generated expression with (size_t)-1 is always false so that bit is taken care of. The rest is avoidable since we want the _chk variant whenever we have a size expression and it's not -1. Rework the conditionals in a uniform way to clearly indicate two conditions at compile time: - Either the size is unknown (-1) or we know at compile time that the operation length is less than the object size. We can call the original function in this case. It could be that either the length, object size or both are non-constant, but the compiler, through range analysis, is able to fold the *comparison* to a constant. - The size and length are known and the compiler can see at compile time that operation length > object size. This is valid grounds for a warning at compile time, followed by emitting the _chk variant. For everything else, emit the _chk variant. This simplifies most of the fortified function implementations and at the same time, ensures that only one call from _chk or the regular function is emitted. Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>