about summary refs log tree commit diff
Commit message (Collapse)AuthorAgeFilesLines
...
* ldso: use pthread_t rather than kernel tid to track ctor visitorRich Felker2020-10-141-3/+3
| | | | | | | | commit 188759bbee057aa94db2bbb7cf7f5855f3b9ab53 documented the intent to allow recursive dlopen based on tracking ctor_visitor, but used a kernel tid rather than the pthread_t to identify the caller. as a result, it would not behave as intended under fork by a ctor, where the child tid would not match.
* fix stale lock when allocation of ctor queue fails during dlopenRich Felker2020-10-141-1/+2
| | | | | | | | | | queue_ctors should not be called with the init_fini_lock held, since it may longjmp out on allocation failure. this introduces a minor TOCTOU race with p->constructed, but one already exists further down anyway, and by design it's okay to run through the queue more than once anyway. the only reason we bother to check p->constructed at all is to avoid spurious failure of dlopen when the library is already fully loaded and constructed.
* drop use of pthread_once in mutexattr kernel support testsRich Felker2020-10-142-21/+18
| | | | | | | | this makes the code slightly smaller and eliminates these functions from relevance to possible future changes to multithreaded fork. the barrier of a_store isn't technically needed here, but a_store is used anyway for internal consistency of the memory model.
* fix missing synchronization of fork with abortRich Felker2020-10-141-0/+3
| | | | | | | | | if the multithreaded parent forked while another thread was calling sigaction for SIGABRT or calling abort, the child could inherit a lock state in which future calls to abort will deadlock, or in which the disposition for SIGABRT has already been reset to SIG_DFL. this is nonconforming since abort is AS-safe and permitted to be called concurrently with fork or in the MT-forked child.
* move __abort_lock to its own file and drop pointless weak_alias trickRich Felker2020-10-144-8/+5
| | | | | | | | the dummy definition of __abort_lock in sigaction.c was performing exactly the same role that putting the lock in its own source file could and should have been used to achieve. while we're moving it, give it a proper declaration.
* fix fork of processes with active async io contextsRich Felker2020-09-283-0/+19
| | | | | | | | | | | | | | previously, if a file descriptor had aio operations pending in the parent before fork, attempting to close it in the child would attempt to cancel a thread belonging to the parent. this could deadlock, fail, or crash the whole process of the cancellation signal handler was not yet installed in the parent. in addition, further use of aio from the child could malfunction or deadlock. POSIX specifies that async io operations are not inherited by the child on fork, so clear the entire aio fd map in the child, and take the aio map lock (with signals blocked) across the fork so that the lock is kept in a consistent state.
* avoid set*id/setrlimit misbehavior and hang in vforked/cloned childRich Felker2020-09-171-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | taking the deprecated/dropped vfork spec strictly, doing pretty much anything but execve in the child is wrong and undefined. however, these are commonly needed operations to setup the child state before exec, and historical implementations tolerated them. for single-threaded parents, these operations already worked as expected in the vforked child. however, due to the need for __synccall to synchronize id/resource limit changes among all threads, calling these functions in the vforked child of a multithreaded parent caused a misdirected broadcast signaling of all threads in the parent. these signals could kill the parent entirely if the synccall signal handler had never been installed in the parent, or could be ignored if it had, or could signal/kill one or more utterly wrong processes if the parent already terminated (due to vfork semantics, only possible via fatal signal) and the parent tids were recycled. in any case, the expected number of semaphore posts would never happen, so the child would permanently hang (with all signals blocked) waiting for them. to mitigate this, and also make the normal usage case work as intended, treat the condition where the caller's actual tid does not match the tid in its thread structure as single-threaded, and bypass the entire synccall broadcast operation.
* use new SYS_faccessat2 syscall to implement faccessat with flagsRich Felker2020-09-091-3/+8
| | | | | | | | | | | | | commit 0a05eace163cee9b08571d2ff9d90f5e82d9c228 implemented AT_EACCESS for faccessat with a horrible hack, creating a child process to change switch uid/gid and perform the access probe without making potentially irreversible changes to the caller's credentials. this was due to the syscall lacking a flags argument. linux 5.8 introduced a new syscall, SYS_faccessat2, fixing this deficiency. use it if any flags are passed, and fallback to the old strategy on ENOSYS. continue using the old syscall when there are no flags.
* netinet/if_ether.h: add ETH_P_MRP from linux v5.8Szabolcs Nagy2020-09-091-0/+1
| | | | | | | Ethernet protocol number for media redundancy protocol, see linux commit 4714d13791f831d253852c8b5d657270becb8b2a bridge: uapi: mrp: Add mrp attributes.
* elf.h: add .note.gnu.property related definitionsSzabolcs Nagy2020-09-091-0/+2
| | | | On x86 and aarch64 GNU properties may be used to mark ELF objects.
* bits/syscall.h: add __NR_faccessat2 from linux v5.8Szabolcs Nagy2020-09-0916-0/+16
| | | | | | | | the linux faccessat syscall lacks a flag argument that is necessary to implement the posix api, see linux commit c8ffd8bcdd28296a198f237cc595148a8d4adfbe vfs: add faccessat2 syscall
* netinet/tcp.h: update to linux v5.7Szabolcs Nagy2020-09-091-0/+3
| | | | | | | | | | | | | add TCP_NLA_BYTES_NOTSENT and new tcp_zerocopy_receive fields, see linux commit c8856c051454909e5059df4e81c77b9c366c5515 tcp-zerocopy: Return inq along with tcp receive zerocopy. linux commit 33946518d493cdf10aedb4a483f1aa41948a3dab tcp-zerocopy: Return sk_err (if set) along with tcp receive zerocopy. linux commit e08ab0b377a1489760533424437c5f4be7f484a4 tcp: add bytes not sent to SCM_TIMESTAMPING_OPT_STATS
* sys/mman.h: add MREMAP_DONTUNMAP from linux v5.7Szabolcs Nagy2020-09-091-0/+1
| | | | | | | | it remaps anon mappings without unmapping the original. chromeos plans to use it with userfaultfd, see: linux commit e346b3813067d4b17383f975f197a9aa28a3b077 mm/mremap: add MREMAP_DONTUNMAP to mremap()
* sys/fanotify.h: update to linux v5.7Szabolcs Nagy2020-09-091-1/+3
| | | | | | | | | | see linux commit 9e2ba2c34f1922ca1e0c7d31b30ace5842c2e7d1 fanotify: send FAN_DIR_MODIFY event flavor with dir inode and name linux commit 44d705b0370b1d581f46ff23e5d33e8b5ff8ec58 fanotify: report name info for FAN_DIR_MODIFY event
* aarch64: add new HWCAP2_ macros from linux v5.6Szabolcs Nagy2020-09-091-0/+8
| | | | | | | | | | added in linux commit 1a50ec0b3b2e9a83f1b1245ea37a853aac2f741c arm64: Implement archrandom.h for ARMv8.5-RNG linux commit d4209d8b717311d114b5d47ba7f8249fd44e97c2 arm64: cpufeature: Export matrix and other features to userspace
* aarch64: add HWCAP2_ macros from linux v5.3Szabolcs Nagy2020-09-091-0/+2
| | | | | | | | | | these were missed before, added in linux commit 1201937491822b61641c1878ebcd16a93aed4540 arm64: Expose ARMv8.5 CondM capability to userspace linux commit ca9503fc9e9812aa6258e55d44edb03eb30fc46f arm64: Expose FRINT capabilities to userspace
* sched.h: add CLONE_NEWTIME from linux v5.6Szabolcs Nagy2020-09-091-0/+1
| | | | | | | | reuses a bit from CSIGNAL so it can only be used with unshare and clone3, added in linux commit 769071ac9f20b6a447410c7eaa55d1a5233ef40c ns: Introduce Time Namespace
* sys/random.h: add GRND_INSECURE from linux v5.6Szabolcs Nagy2020-09-091-0/+1
| | | | | | | added in linux commit 75551dbf112c992bc6c99a972990b3f272247e23 random: add GRND_INSECURE to return best-effort non-cryptographic bytes
* sys/prctl.h: add PR_{SET,GET}_IO_FLUSHER from linux v5.6Szabolcs Nagy2020-09-091-0/+3
| | | | | | | | needed for storage drivers with userspace component that may run in the IO path, see linux commit 8d19f1c8e1937baf74e1962aae9f90fa3aeab463 prctl: PR_{G,S}ET_IO_FLUSHER to support controlling memory reclaim
* netinet/udp.h: add TCP_ENCAP_ESPINTCP from linux v5.6Szabolcs Nagy2020-09-091-0/+1
| | | | | | | | The use of TCP_ in udp.h is not known, fortunately udp.h is not specified by posix so there are no strict namespace rules, added in linux commit e27cca96cd68fa2c6814c90f9a1cfd36bb68c593 xfrm: add espintcp (RFC 8229)
* netinet/tcp.h: update for linux v5.6Szabolcs Nagy2020-09-091-2/+4
| | | | | | | | | | | | | TCP_NLA_TIMEOUT_REHASH queries timeout-triggered rehash attempts, tcpm_ifindex limits the scope of TCP_MD5SIG* sockopt to a device. see linux commit 32efcc06d2a15fa87585614d12d6c2308cc2d3f3 tcp: export count for rehash attempts linux commit 6b102db50cdde3ba2f78631ed21222edf3a5fb51 net: Add device index to tcp_md5sig
* netinet/in.h: add IPPROTO_ macros from linux v5.6Szabolcs Nagy2020-09-091-1/+3
| | | | | | | | | | add IPPROTO_ETHERNET and IPPROTO_MPTCP, see linux commit 2677625387056136e256c743e3285b4fe3da87bb seg6: fix SRv6 L2 tunnels to use IANA-assigned protocol number linux commit faf391c3826cd29feae02078ca2022d2f912f7cc tcp: Define IPPROTO_MPTCP
* add pidfd_getfd and openat2 syscall numbers from linux v5.6Szabolcs Nagy2020-09-0916-0/+34
| | | | | | | | | | | | | | | | | | | also added clone3 on sh and m68k, on sh it's still missing (not yet wired up), but reserved so safe to add. see linux commit fddb5d430ad9fa91b49b1d34d0202ffe2fa0e179 open: introduce openat2(2) syscall linux commit 9a2cef09c801de54feecd912303ace5c27237f12 arch: wire up pidfd_getfd syscall linux commit 8649c322f75c96e7ced2fec201e123b2b073bf09 pid: Implement pidfd_getfd syscall linux commit e8bb2a2a1d51511e6b3f7e08125d52ec73c11139 m68k: Wire up clone3() syscall
* netinet/tcp.h: update tcp_info for linux v5.5Szabolcs Nagy2020-09-091-1/+8
| | | | | | | see linux commit 480274787d7e3458bc5a7cfbbbe07033984ad711 tcp: add TCP_INFO status for failed client TFO
* use generic bits/fcntl.h for x86_64 and riscv64Rich Felker2020-09-032-78/+0
| | | | | these were only using a custom version because they needed the "non-64" variants of the file locking command macros.
* make generic bits/fcntl.h shareable with 64-bit archsRich Felker2020-09-031-0/+6
| | | | | | | | | | | the fcntl file locking command macro values in the existing generic bits/fcntl.h were the "64" variants, requiring 64-bit archs that use the "plain" variants to have their own bits/fcntl.h, even if they otherwise use the common definitions for everything. since commit 7cc79d10afd43811a486fd5e9fcdf8e45ac599e0 exposed __LONG_MAX to all bits headers, we can now make the generic one common between 32- and 64-bit archs.
* fix missing O_LARGEFILE values on x86_64, x32, and mips64Rich Felker2020-09-033-3/+3
| | | | | | | | | | | | | | | | | prior to commit 685e40bb09f5f24a2af54ea09c97328808f76990, x86_64 was correctly passing O_LARGEFILE to SYS_open; it was removed (defined to 0 in the public header, and changed to use the public definition) as part of that change, probably out of a mistaken belief that it's not needed. however, on a mixed system with 32-bit and 64-bit binaries, it's important that all files be opened with O_LARGEFILE, even if the opening process is 64-bit, in case a descriptor is passed to a 32-bit process. otherwise, attempts to access past 2GB in the 32-bit process could produce EOVERFLOW. most 64-bit archs added later got this right alread, except for mips64. x32 was also affected. there are now fixed.
* fix missing newline in herror outputRich Felker2020-09-031-1/+1
|
* fix i386 __set_thread_area fallbackRich Felker2020-08-301-0/+1
| | | | | | | | this code is only needed for pre-2.6 kernels, which are not actually supported anyway, and was never tested. the fallback path using SYS_modify_ldt failed to clear the upper bits of %eax (all ones due to SYS_set_thread_area's return value being an error) before modifying %al to attempt a new syscall.
* restore h_errno ABI compatibility with ancient binariesRich Felker2020-08-301-0/+4
| | | | | | | | | | | | prior to commit e68c51ac46a9f273927aef8dcebc89912ab19ece, h_errno was actually an external data object not a macro. bring back the symbol, and use it as the storage for the main thread's h_errno. technically this still doesn't provide full compatibility if the application was multithreaded, but at the time there were no res_* functions (and they did not set h_errno anyway), so any use of h_errno would have been via thread-unsafe functions. thus a solution that just fixes single-threaded applications seems acceptable.
* clean up overinclusion in files using TIOCGWINSZRich Felker2020-08-303-3/+0
| | | | | now that struct winsize is available via sys/ioctl.h once again, including termios.h is not needed.
* fix regression with applications that expect struct winsize in ioctl.hRich Felker2020-08-303-7/+5
| | | | | | putting the (simple) definition in alltypes.h seems like the best solution here. making sys/ioctl.h implicitly include termios.h is probably excess namespace pollution.
* configure: enable warnings by defaultRich Felker2020-08-271-2/+2
| | | | | | | now that -Wall is not used and we control which warnings are enabled, it makes sense to have the wanted ones on by default. hopefully this will also discourage manually adding -Wall to CFLAGS and making incorrect changes or bug reports based on the compiler's output.
* configure: use additive warnings instead of subtracting from -WallRich Felker2020-08-271-9/+15
| | | | | | | | | | -Wall varies too much by compiler and version. rather than trying to track all the unwanted style warnings that need to be subtracted, just enable wanted warnings. also, move -Wno-pointer-to-int-cast outside --enable-warnings conditional so that it always applies, since it's turning off a nuisance warning that's on-by-default with most compilers.
* configure: add further -Werror=... options to detected CFLAGSRich Felker2020-08-271-0/+4
| | | | | | | these four warning options were overlooked previously, likely because they're not part of GCC's -Wall. they all detect constraint violations (invalid C at the source level) and should always be on in -Werror form.
* remove redundant pthread struct members repeated for layout purposesRich Felker2020-08-278-18/+19
| | | | | | | dtv_copy, canary2, and canary_at_end existed solely to match multiple ABI and asm-accessed layouts simultaneously. now that pthread_arch.h can be included before struct __pthread is defined, the struct layout can depend on macros defined by pthread_arch.h.
* deduplicate __pthread_self thread pointer adjustment out of each archRich Felker2020-08-2717-65/+65
| | | | | | | | | | | | | | | | the adjustment made is entirely a function of TLS_ABOVE_TP and TP_OFFSET. aside from avoiding repetition of the TP_OFFSET value and arithmetic, this change makes pthread_arch.h independent of the definition of struct __pthread from pthread_impl.h. this in turn will allow inclusion of pthread_arch.h to be moved to the top of pthread_impl.h so that it can influence the definition of the structure. previously, arch files were very inconsistent about the type used for the thread pointer. this change unifies the new __get_tp interface to always use uintptr_t, which is the most correct when performing arithmetic that may involve addresses outside the actual pointed-to object (due to TP_OFFSET).
* deduplicate TP_ADJ logic out of each arch, replace with TP_OFFSETRich Felker2020-08-2417-21/+16
| | | | | | the only part of TP_ADJ that was not uniquely determined by TLS_ABOVE_TP was the 0x7000 adjustment used mainly on mips and powerpc variants.
* report res_query failures, including nxdomain/nodata, via h_errnoRich Felker2020-08-241-1/+15
| | | | | | | | | while it's not clearly documented anywhere, this is the historical behavior which some applications expect. applications which need to see the response packet in these cases, for example to distinguish between nonexistence in a secure vs insecure zone, must already use res_mkquery with res_send in order to be portable, since most if not all other implementations of res_query don't provide it.
* make h_errno thread-localRich Felker2020-08-242-4/+3
| | | | | | | | | the framework to do this always existed but it was deemed unnecessary because the only [ex-]standard functions using h_errno were not thread-safe anyway. however, some of the nonstandard res_* functions are also supposed to set h_errno to indicate the cause of error, and were unable to do so because it was not thread-safe. this change is a prerequisite for fixing them.
* add tcgetwinsize and tcsetwinsize functions, move struct winsizeRich Felker2020-08-247-7/+29
| | | | | | | | | | | | these have been adopted for future issue of POSIX as the outcome of Austin Group issue 1151, and are simply functions performing the roles of the historical ioctls. since struct winsize is being standardized along with them, its definition is moved to the appropriate header. there is some chance this will break source files that expect struct winsize to be defined by sys/ioctl.h without including termios.h. if this happens, further changes will be needed to have sys/ioctl.h expose it too.
* fix MUSL_LOCPATH searchRich Felker2020-08-221-1/+1
| | | | all path elements but the last had the final byte truncated.
* add gettid functionRich Felker2020-08-172-0/+9
| | | | | | | | | | | | | | | this is a prerequisite for addition of other interfaces that use kernel tids, including futex and SIGEV_THREAD_ID. there is some ambiguity as to whether the semantic return type should be int or pid_t. either way, futex API imposes a contract that the values fit in int (excluding some upper reserved bits). glibc used pid_t, so in the interest of not having gratuitous mismatch (the underlying types are the same anyway), pid_t is used here as well. while conceptually this is a syscall, the copy stored in the thread structure is always valid in all contexts where it's valid to call libc functions, so it's used to avoid the syscall.
* aarch64: fix setjmp return valueSzabolcs Nagy2020-08-121-4/+3
| | | | | | | longjmp should set the return value of setjmp, but 64bit registers were used for the 0 check while the type is int. use the code that gcc generates for return val ? val : 1;
* setjmp: optimize longjmp prologuesAlexander Monakov2020-08-123-14/+8
| | | | | Use a branchless sequence that is one byte shorter on 64-bit, same size on 32-bit. Thanks to Pete Cawley for suggesting this variant.
* setjmp: optimize x86 longjmp epiloguesAlexander Monakov2020-08-113-12/+6
|
* setjmp: avoid useless REX-prefix on xor %eax, %eaxAlexander Monakov2020-08-112-2/+2
|
* setjmp: fix x86-64 longjmp argument adjustmentAlexander Monakov2020-08-112-6/+6
| | | | | | | | | | | | | | | | | longjmp 'val' argument is an int, but the assembly is referencing 64-bit registers as if the argument was a long, or the caller was responsible for extending the argument. Though the psABI is not clear on this, the interpretation in GCC is that high bits may be arbitrary and the callee is responsible for sign/zero-extending the value as needed (likewise for return values: callers must anticipate that high bits may be garbage). Therefore testing %rax is a functional bug: setjmp would wrongly return zero if longjmp was called with val==0, but high bits of %rsi happened to be non-zero. Rewrite the prologue to refer to 32-bit registers. In passing, change 'test' to use %rsi, as there's no advantage to using %rax and the new form is cheaper on processors that do not perform move elimination.
* prefer new socket syscalls, fallback to SYS_socketcall only if neededRich Felker2020-08-086-14/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | a number of users performing seccomp filtering have requested use of the new individual syscall numbers for socket syscalls, rather than the legacy multiplexed socketcall, since the latter has the arguments all in memory where they can't participate in filter decisions. previously, some archs used the multiplexed socketcall if it was historically all that was available, while other archs used the separate syscalls. the intent was that the latter set only include archs that have "always" had separate socket syscalls, at least going back to linux 2.6.0. however, at least powerpc, powerpc64, and sh were wrongly included in this set, and thus socket operations completely failed on old kernels for these archs. with the changes made here, the separate syscalls are always preferred, but fallback code is compiled for archs that also define SYS_socketcall. two such archs, mips (plain o32) and microblaze, define SYS_socketcall despite never having needed it, so it's now undefined by their versions of syscall_arch.h to prevent inclusion of useless fallback code. some archs, where the separate syscalls were only added after the addition of SYS_accept4, lack SYS_accept. because socket calls are always made with zeros in the unused argument positions, it suffices to just use SYS_accept4 to provide a definition of SYS_accept, and this is done to make happy the macro machinery that concatenates the socket call name onto __SC_ and SYS_.
* math: new software sqrtlSzabolcs Nagy2020-08-051-1/+253
| | | | | | | | | | | | | | | | same approach as in sqrt. sqrtl was broken on aarch64, riscv64 and s390x targets because of missing quad precision support and on m68k-sf because of missing ld80 sqrtl. this implementation is written for quad precision and then edited to make it work for both m68k and x86 style ld80 formats too, but it is not expected to be optimal for them. note: using fp instructions for the initial estimate when such instructions are available (e.g. double prec sqrt or rsqrt) is avoided because of fenv correctness.