about summary refs log tree commit diff
path: root/src/process/posix_spawn.c
Commit message (Collapse)AuthorAgeFilesLines
* optimize posix_spawn to avoid spurious sigaction syscallsRich Felker2013-08-091-7/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | the trick here is that sigaction can track for us which signals have ever had a signal handler set for them, and only those signals need to be considered for reset. this tracking mask may have false positives, since it is impossible to remove bits from it without race conditions. false negatives are not possible since the mask is updated with atomic operations prior to making the sigaction syscall. implementation-internal signals are set to SIG_IGN rather than SIG_DFL so that a signal raised in the parent (e.g. calling pthread_cancel on the thread executing pthread_spawn) does not have any chance make it to the child, where it would cause spurious termination by signal. this change reduces the minimum/typical number of syscalls in the child from around 70 to 4 (including execve). this should greatly improve the performance of posix_spawn and other interfaces which use it (popen and system). to facilitate these changes, sigismember is also changed to return 0 rather than -1 for invalid signals, and to return the actual status of implementation-internal signals. POSIX allows but does not require an error on invalid signal numbers, and in fact returning an error tends to confuse applications which wrongly assume the return value of sigismember is boolean.
* fix missing errno from exec failure in posix_spawnRich Felker2013-08-091-0/+1
| | | | | failures prior to the exec attempt were reported correctly, but on exec failure, the return value contained junk.
* make posix_spawn (and functions that use it) use CLONE_VFORK flagRich Felker2013-07-171-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | this is both a minor scheduling optimization and a workaround for a difficult-to-fix bug in qemu app-level emulation. from the scheduling standpoint, it makes no sense to schedule the parent thread again until the child has exec'd or exited, since the parent will immediately block again waiting for it. on the qemu side, as regular application code running on an underlying libc, qemu cannot make arbitrary clone syscalls itself without confusing the underlying implementation. instead, it breaks them down into either fork-like or pthread_create-like cases. it was treating the code in posix_spawn as pthread_create-like, due to CLONE_VM, which caused horribly wrong behavior: CLONE_FILES broke the synchronization mechanism, CLONE_SIGHAND broke the parent's signals, and CLONE_THREAD caused the child's exec to end the parent -- if it hadn't already crashed. however, qemu special-cases CLONE_VFORK and emulates that with fork, even when CLONE_VM is also specified. this also gives incorrect semantics for code that really needs the memory sharing, but posix_spawn does not make use of the vm sharing except to avoid momentary double commit charge. programs using posix_spawn (including via popen) should now work correctly under qemu app-level emulation.
* remove explicit locking to prevent __synccall setuid during posix_spawnRich Felker2013-04-261-13/+0
| | | | | | | | | | | | | | | for the duration of the vm-sharing clone used by posix_spawn, all signals are blocked in the parent process, including implementation-internal signals. since __synccall cannot do anything until successfully signaling all threads, the fact that signals are blocked automatically yields the necessary safety. aside from debloating and general simplification, part of the motivation for removing the explicit lock is to simplify the synchronization logic of __synccall in hopes that it can be made async-signal-safe, which is needed to make setuid and setgid, which depend on __synccall, conform to the standard. whether this will be possible remains to be seen.
* fix unsigned comparison bug in posix_spawnRich Felker2013-02-031-1/+1
| | | | | | | read should never return anything but 0 or sizeof ec here, but if it does, we want to treat any other return as "success". then the caller will get back the pid and is responsible for waiting on it when it immediately exits.
* overhaul posix_spawn to use CLONE_VM instead of vforkRich Felker2013-02-031-52/+122
| | | | | | | | | | | | | | | | | | | | | the proposed change was described in detail in detail previously on the mailing list. in short, vfork is unsafe because: 1. the compiler could make optimizations that cause the child to clobber the parent's local vars. 2. strace is buggy and allows the vforking parent to run before the child execs when run under strace. the new design uses a close-on-exec pipe instead of vfork semantics to synchronize the parent and child so that the parent does not return before the child has finished using its arguments (and now, also its stack). this also allows reporting exec failures to the caller instead of giving the caller a child that mysteriously exits with status 127 on exec error. basic testing has been performed on both the success and failure code paths. further testing should be done.
* fix usage of locks with vforkRich Felker2012-10-191-1/+1
| | | | | | __release_ptc() is only valid in the parent; if it's performed in the child, the lock will be unlocked early then double-unlocked later, corrupting the lock state.
* fix parent-memory-clobber in posix_spawn (environ)Rich Felker2012-10-181-4/+3
|
* overhaul system() and popen() to use vfork; fix various related bugsRich Felker2012-10-181-6/+7
| | | | | | | | | | | | | | | | since we target systems without overcommit, special care should be taken that system() and popen(), like posix_spawn(), do not fail in processes whose commit charges are too high to allow ordinary forking. this in turn requires special precautions to ensure that the parent process's signal handlers do not end up running in the shared-memory child, where they could corrupt the state of the parent process. popen has also been updated to use pipe2, so it does not have a fd-leak race in multi-threaded programs. since pipe2 is missing on older kernels, (non-atomic) emulation has been added. some silly bugs in the old code should be gone too.
* block uid/gid changes during posix_spawnRich Felker2012-10-151-0/+10
| | | | | | | | | | | | | | | | | | usage of vfork creates a situation where a process of lower privilege may momentarily have write access to the memory of a process of higher privilege. consider the case of a multi-threaded suid program which is calling posix_spawn in one thread while another thread drops the elevated privileges then runs untrusted (relative to the elevated privilege) code as the original invoking user. this untrusted code can then potentially modify the data the child process will use before calling exec, for example changing the pathname or arguments that will be passed to exec. note that if vfork is implemented as fork, the lock will not be held until the child execs, but since memory is not shared it does not matter.
* use vfork if possible in posix_spawnRich Felker2012-09-141-1/+3
| | | | | | vfork is implemented as the fork syscall (with no atfork handlers run) on archs where it is not available, so this change does not introduce any change in behavior or regression for such archs.
* use restrict everywhere it's required by c99 and/or posix 2008Rich Felker2012-09-061-6/+6
| | | | | | | | to deal with the fact that the public headers may be used with pre-c99 compilers, __restrict is used in place of restrict, and defined appropriately for any supported compiler. we also avoid the form [restrict] since older versions of gcc rejected it due to a bug in the original c99 standard, and instead use the form *restrict.
* fix various errors in function signatures/prototypes found by nszRich Felker2011-09-131-3/+5
|
* fix backwards posix_spawn file action orderRich Felker2011-05-291-2/+3
|
* add file actions support to posix_spawnRich Felker2011-05-281-0/+28
|
* posix_spawn: honor POSIX_SPAWN_SETSIGDEF flagRich Felker2011-05-281-1/+3
|
* initial implementation of posix_spawnRich Felker2011-05-281-0/+65
file actions are not yet implemented, but everything else should be mostly complete and roughly correct.