path: root/src/aio
Commit message | Author | Age | Files | Lines
* fix AS-safety of close when aio is in use and fd map is expanded | Rich Felker | 2022-10-19 | 1 | -0/+6
  the aio operations that lead to calling __aio_get_queue with the possibility to expand the fd map are not AS-safe, but if they are interrupted by a signal handler, the signal handler may call close, which is required to be AS-safe. due to __aio_get_queue taking the write lock without blocking signals, such a call to close from a signal handler could deadlock. change __aio_get_queue to block signals if it needs to obtain a write lock, and restore when finished.
* fix use of uninitialized dummy_fut in aio_suspend | Alexey Izbyshev | 2022-10-19 | 1 | -1/+1
  aio_suspend waits on a dummy futex in the corner case when the array of requests contains NULL pointers only. But the value of this futex was left uninitialized, so if it happens to be non-zero, aio_suspend degrades to spinning instead of blocking.
* fix potential deadlock between multithreaded fork and aio | Rich Felker | 2022-10-19 | 1 | -2/+17
  as reported by Alexey Izbyshev, there is a lock order inversion deadlock between the malloc lock and aio maplock at MT-fork time: _Fork attempts to take the aio maplock while fork already has the malloc lock, but a concurrent aio operation holding the maplock may attempt to allocate memory. move the __aio_atfork calls in the parent from _Fork to fork, and reorder the lock before most other locks, since nothing else depends on aio(*). this leaves us with the possibility that the child will not be able to obtain the read lock, if _Fork is used directly and happens concurrent with an aio operation. however, in that case, the child context is an async signal context that cannot call any further aio functions, so all we need is to ensure that close does not attempt to perform any aio cancellation. this can be achieved just by nulling out the map pointer.
  (*) even if other functions call close, they will only need a read lock, not a write lock, and read locks being recursive ensures they can obtain it. moreover, the number of read references held is bounded by something like twice the number of live threads, meaning that the read lock count cannot saturate.
* remove LFS64 symbol aliases; replace with dynamic linker remapping | Rich Felker | 2022-10-19 | 3 | -13/+0
  originally the namespace-infringing "large file support" interfaces were included as part of glibc-ABI-compat, with the intent that they not be used for linking, since our off_t is and always has been unconditionally 64-bit and since we usually do not aim to support nonstandard interfaces when there is an equivalent standard interface. unfortunately, having the symbols present and available for linking caused configure scripts to detect them and attempt to use them without declarations, producing all the expected ill effects that entails. as a result, commit 2dd8d5e1b8ba1118ff1782e96545cb8a2318592c was made to prevent this, using macros to redirect the LFS64 names to the standard names, conditional on _GNU_SOURCE or _LARGEFILE64_SOURCE. however, this has turned out to be a source of further problems, especially since g++ defines _GNU_SOURCE by default. in particular, the presence of these names as macros breaks a lot of valid code. this commit removes all the LFS64 symbols and replaces them with a mechanism in the dynamic linker symbol lookup failure path to retry with the spurious "64" removed from the symbol name. in the future, if/when the rest of glibc-ABI-compat is moved out of libc, this can be removed.
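The name rewriting described in this message (retry a failed lookup with the trailing "64" removed) might look roughly like the sketch below. This is only an illustration of the string handling: `strip_lfs64_suffix` is a hypothetical helper, not the dynamic linker's actual code, and the real lookup path may restrict which names are eligible.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not the actual dynamic linker code): if `name`
 * ends in "64", copy it without that suffix into `buf` and return 1.
 * Return 0 if there is no suffix to strip or the result does not fit. */
static int strip_lfs64_suffix(const char *name, char *buf, size_t size)
{
	size_t len;

	assert(name && buf);
	len = strlen(name);
	if (len <= 2 || strcmp(name + len - 2, "64") != 0)
		return 0;               /* no "64" suffix to remove */
	if (len - 2 >= size)
		return 0;               /* no room for the name plus NUL */
	memcpy(buf, name, len - 2);
	buf[len - 2] = '\0';
	return 1;
}
```

A caller would attempt the normal lookup first and fall back to the stripped name only on failure, so programs that never reference the legacy names pay nothing.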
* drop use of pthread_once for aio thread stack size init | Rich Felker | 2020-12-08 | 1 | -10/+8
  pthread_once is not compatible with MT-fork constraints (commit 167390f05564e0a4d3fcb4329377fd7743267560) and is not needed here anyway; we already have a lock suitable for initialization. while changing this, fix a corner case where AT_MINSIGSTKSZ gives a value that's more than MINSIGSTKSZ but by a margin of less than 2048, thereby causing the size to be reduced. it shouldn't matter but the intent was to be the larger of a 2048-byte margin over the legacy fixed minimum stack requirement or a 512-byte margin over the minimum the kernel reports at runtime.
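The sizing rule stated above (the larger of a 2048-byte margin over the legacy minimum or a 512-byte margin over the kernel-reported minimum) can be sketched as a pure function. The function name and parameters are hypothetical; `legacy_min` stands in for MINSIGSTKSZ and `kernel_min` for the AT_MINSIGSTKSZ value, with 0 assumed to mean the kernel reported nothing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch of the sizing rule from the commit message.
 * legacy_min: compile-time minimum (stand-in for MINSIGSTKSZ).
 * kernel_min: runtime minimum (stand-in for AT_MINSIGSTKSZ), 0 if absent. */
static size_t io_thread_stack_size(size_t legacy_min, size_t kernel_min)
{
	size_t size;

	assert(legacy_min > 0);
	size = legacy_min + 2048;            /* legacy minimum plus 2048 margin */
	if (kernel_min && kernel_min + 512 > size)
		size = kernel_min + 512;     /* kernel minimum plus 512 margin */
	return size;
}
```

Taking the maximum of the two candidates is what avoids the corner case in the message: a kernel value only slightly above the legacy minimum can no longer shrink the result below the legacy-based size.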
* convert malloc use under libc-internal locks to use internal allocator | Rich Felker | 2020-11-11 | 1 | -0/+5
  this change lifts undocumented restrictions on calls by replacement mallocs to libc functions that might take these locks, and sets the stage for lifting restrictions on the child execution environment after multithreaded fork. care is taken to #define macros to replace all four functions (malloc, calloc, realloc, free) even if not all of them will be used, using an undefined symbol name for the ones intended not to be used so that any inadvertent future use will be caught at compile time rather than directed to the wrong implementation.
* move aio implementation details to a proper internal header | Rich Felker | 2020-10-14 | 2 | -0/+2
  also fix the lack of declaration (and thus hidden visibility) in __stdio_close's use of __aio_close.
* fix fork of processes with active async io contexts | Rich Felker | 2020-09-28 | 1 | -0/+14
  previously, if a file descriptor had aio operations pending in the parent before fork, attempting to close it in the child would attempt to cancel a thread belonging to the parent. this could deadlock, fail, or crash the whole process if the cancellation signal handler was not yet installed in the parent. in addition, further use of aio from the child could malfunction or deadlock. POSIX specifies that async io operations are not inherited by the child on fork, so clear the entire aio fd map in the child, and take the aio map lock (with signals blocked) across the fork so that the lock is kept in a consistent state.
* disable lfs64 aliases for remapped time64 functions | Rich Felker | 2019-10-28 | 1 | -0/+2
  these functions cannot provide the glibc lfs64-ABI-compatible symbols when time_t differs from what it was in that ABI. instead, the aliases need to be provided by the time32 compat shims or through some other mechanism.
* fix restrict violations in internal use of several functions | Samuel Holland | 2019-07-10 | 1 | -3/+3
  The old/new parameters to pthread_sigmask, sigprocmask, and setitimer are marked restrict, so passing the same address to both is prohibited. Modify callers of these functions to use a separate object for each argument.
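A minimal sketch of the corrected calling pattern, using `sigprocmask` with one object for the new mask and a different object for the saved mask. The helper `call_with_signals_blocked` and the callback `count_call` are hypothetical names introduced only for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical helper: run fn(arg) with all signals blocked, then
 * restore the previous mask. Returns 0 on success, -1 on failure. */
static int call_with_signals_blocked(void (*fn)(void *), void *arg)
{
	/* Two distinct objects: both pointer parameters of sigprocmask
	 * are restrict-qualified, so they must not alias each other. */
	sigset_t all, saved;

	assert(fn);
	sigfillset(&all);
	if (sigprocmask(SIG_BLOCK, &all, &saved) != 0)
		return -1;
	fn(arg);
	return sigprocmask(SIG_SETMASK, &saved, NULL);
}

/* Example callback: increments the int that arg points to. */
static void count_call(void *arg)
{
	++*(int *)arg;
}
```

In a multithreaded program `pthread_sigmask` would be the appropriate call; it takes the same arguments and is subject to the same restrict constraint.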
* on failed aio submission, set aiocb error and return value | Rich Felker | 2018-12-11 | 1 | -2/+4
  it's not clear whether this is required, but it seems arguable that it should happen. for example aio_suspend is supposed to return immediately if any of the operations has "completed", which includes ending with an error status asynchronously and might also be interpreted to include doing so synchronously.
* don't create aio queue/map structures for invalid file descriptors | Rich Felker | 2018-12-11 | 1 | -4/+8
  the map structures in particular are permanent once created, and thus a large number of aio function calls with invalid file descriptors could exhaust memory, whereas, assuming normal resource limits, only a very small number of entries ever need to be allocated. check validity of the fd before allocating anything new, so that allocation of large amounts of memory is only possible when resource limits have been increased and a large number of files are actually open. this change also improves error reporting for bad file descriptors to happen at the time the aio submission call is made, as opposed to asynchronously.
* move aio queue allocation from io thread to submitting thread | Rich Felker | 2018-12-11 | 1 | -16/+21
  since commit c9f415d7ea2dace5bf77f6518b6afc36bb7a5732, it has been possible that the allocator is application-provided code, which cannot necessarily run safely on io thread stacks, and which should not be able to see the existence of io threads, since they are an implementation detail. instead of having the io thread request and possibly allocate its queue (and the map structures leading to it), make the submitting thread responsible for this, and pass the queue pointer into the io thread via its args structure. this eliminates the only early error case in io threads, making it no longer necessary to pass an error status back to the submitting thread via the args structure.
* fix and future-proof against stack overflow in aio io threads | Rich Felker | 2018-12-09 | 1 | -1/+12
  aio threads not using SIGEV_THREAD notification are created with small stacks and no guard page, which is possible since they only run the code for the requested io operation, not any application code. the motivation is not creating a lot of VMAs. however, the io thread needs to be able to receive a cancellation signal in case aio_cancel (implemented via pthread_cancel) is called. this requires sufficient stack space for a signal frame, which PTHREAD_STACK_MIN does not necessarily include. in principle MINSIGSTKSZ from signal.h should give us sufficient space for a signal frame, but the value is incorrect on some existing archs due to kernel addition of new vector register support without consideration for impact on ABI. some powerpc models exceed MINSIGSTKSZ by about 0.5k, and x86[_64] with AVX-512 can exceed it by up to about 1.5k. so use MINSIGSTKSZ+2048 to allow for the discrepancy plus some working space. unfortunately, it's possible that signal frame sizes could continue to grow, and some archs (aarch64) explicitly specify that they may. passing of a runtime value for MINSIGSTKSZ via AT_MINSIGSTKSZ in the aux vector was added to aarch64 linux, and presumably other archs will use this mechanism to report if they further increase the signal frame size. when AT_MINSIGSTKSZ is present, assume it's correct, so that we only need a small amount of working space in addition to it; in this case just add 512.
* remove spurious inclusion of libc.h for LFS64 ABI aliases | Rich Felker | 2018-09-12 | 3 | -8/+8
  the LFS64 macro was not self-documenting and barely saved any characters. simply use weak_alias directly so that it's clear what's being done, and doesn't depend on a header to provide a strange macro.
* reduce spurious inclusion of libc.h | Rich Felker | 2018-09-12 | 3 | -3/+0
  libc.h was intended to be a header for access to global libc state and related interfaces, but ended up included all over the place because it was the way to get the weak_alias macro. most of the inclusions removed here are places where weak_alias was needed. a few were recently introduced for hidden. some go all the way back to when libc.h defined CANCELPT_BEGIN and _END, and all (wrongly implemented) cancellation points had to include it. remaining spurious users are mostly callers of the LOCK/UNLOCK macros and files that use the LFS64 macro to define the awful *64 aliases. in a few places, new inclusion of libc.h is added because several internal headers no longer implicitly include libc.h. declarations for __lockfile and __unlockfile are moved from libc.h to stdio_impl.h so that the latter does not need libc.h. putting them in libc.h made no sense at all, since the macros in stdio_impl.h are needed to use them correctly anyway.
* move additional pthread internal declarations to pthread_impl.h, hide | Rich Felker | 2018-09-12 | 1 | -2/+0
  these were overlooked for various reasons in earlier stages.
* make all objects used with atomic operations volatile | Rich Felker | 2015-03-03 | 1 | -1/+2
  the memory model we use internally for atomics permits plain loads of values which may be subject to concurrent modification without requiring that a special load function be used. since a compiler is free to make transformations that alter the number of loads or the way in which loads are performed, the compiler is theoretically free to break this usage. the most obvious concern is with atomic cas constructs: something of the form tmp=*p;a_cas(p,tmp,f(tmp)); could be transformed to a_cas(p,*p,f(*p)); where the latter is intended to show multiple loads of *p whose resulting values might fail to be equal; this would break the atomicity of the whole operation. but even more fundamental breakage is possible. with the changes being made now, objects that may be modified by atomics are modeled as volatile, and the atomic operations performed on them by other threads are modeled as asynchronous stores by hardware which happens to be acting on the request of another thread. such modeling of course does not itself address memory synchronization between cores/cpus, but that aspect was already handled. this all seems less than ideal, but it's the best we can do without mandating a C11 compiler and using the C11 model for atomics. in the case of pthread_once_t, the ABI type of the underlying object is not volatile-qualified. so we are assuming that accessing the object through a volatile-qualified lvalue via casts yields volatile access semantics. the language of the C standard is somewhat unclear on this matter, but this is an assumption the linux kernel also makes, and seems to be the correct interpretation of the standard.
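The load-then-cas construct discussed in this message can be sketched as follows. `cas_int` is a stand-in built on the GCC/Clang `__sync_val_compare_and_swap` builtin, which is an assumption made for this illustration and not the library's own `a_cas`; `atomic_update` and `add_one` are likewise hypothetical names.

```c
#include <assert.h>

/* Stand-in compare-and-swap (assumes a GCC/Clang builtin; not the
 * library's internal a_cas). Returns the value *p held before the call. */
static int cas_int(volatile int *p, int expected, int desired)
{
	return __sync_val_compare_and_swap(p, expected, desired);
}

/* Atomically replace *p with f(*p) and return the previous value. */
static int atomic_update(volatile int *p, int (*f)(int))
{
	int old;

	assert(p && f);
	do {
		/* Because *p is volatile, this is exactly one load per
		 * iteration; the compiler may not re-read *p when
		 * computing the arguments below. */
		old = *p;
	} while (cas_int(p, old, f(old)) != old);
	return old;
}

/* Example update function. */
static int add_one(int x)
{
	return x + 1;
}
```

The loop retries whenever another thread changed `*p` between the load and the compare-and-swap, which is only sound if `old` really is the single value that was loaded.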
* make aio_suspend a cancellation point and properly handle cancellation | Rich Felker | 2015-03-02 | 1 | -3/+9
* factor cancellation cleanup push/pop out of futex __timedwait function | Rich Felker | 2015-03-02 | 1 | -1/+1
  previously, the __timedwait function was optionally a cancellation point depending on whether it was passed a pointer to a cleanup function and context to register. as of now, only one caller actually used such a cleanup function (and it may face removal soon); most callers either passed a null pointer to disable cancellation or a dummy cleanup function. now, __timedwait is never a cancellation point, and __timedwait_cp is the cancellable version. this makes the intent of the calling code more obvious and avoids ugly dummy functions and long argument lists.
* fix type error (arch-dependent) in new aio code | Rich Felker | 2015-02-14 | 1 | -1/+1
  a_store is only valid for int, but ssize_t may be defined as long or another type. since there is no valid way for another thread to access the return value without first checking the error/completion status of the aiocb anyway, an atomic store is not necessary.
* overhaul aio implementation for correctness | Rich Felker | 2015-02-13 | 7 | -192/+424
  previously, aio operations were not tracked by file descriptor; each operation was completely independent. this resulted in non-conforming behavior for non-seekable/append-mode writes (which are required to be ordered) and made it impossible to implement aio_cancel, which in turn made closing file descriptors with outstanding aio operations unsafe. the new implementation is significantly heavier (roughly twice the size, and seems to be slightly slower) and presently aims mainly at correctness, not performance. most of the public interfaces have been moved into a single file, aio.c, because there is little benefit to be had from splitting them. whenever any aio functions are used, aio_cancel and the internal queue lifetime management and fd-to-queue mapping code must be linked, and these functions make up the bulk of the code size. the close function's interaction with aio is implemented with weak alias magic, to avoid pulling in heavy aio cancellation code in programs that don't use aio, and the expensive cancellation path (which includes signal blocking) is optimized out when there are no active aio queues.
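The "weak alias magic" mentioned at the end of this message can be illustrated with the GCC/Clang `weak` and `alias` attributes on an ELF target. This is a sketch of the general technique only: `dummy_close_hook`, `aio_close_hook`, and `close_sketch` are hypothetical names, and the actual symbols and hook signature in the library may differ.

```c
#include <assert.h>

/* Default no-op hook: nothing to cancel, so hand the fd back unchanged. */
static int dummy_close_hook(int fd)
{
	return fd;
}

/* Weak alias for the no-op. If another object file linked into the
 * program provides a strong definition of aio_close_hook (for example
 * an aio implementation), the linker uses that one instead. */
int aio_close_hook(int fd) __attribute__((weak, alias("dummy_close_hook")));

/* Sketch of a close path that always calls the hook. */
static int close_sketch(int fd)
{
	assert(fd >= 0);
	fd = aio_close_hook(fd);   /* no-op unless aio code is linked in */
	return fd;                 /* a real close would now issue the syscall */
}
```

The design choice is that programs which never call an aio function never reference the object file holding the strong definition, so a static link does not pull in the cancellation code at all.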
* add missing legacy LFS *64 symbol aliases | Szabolcs Nagy | 2014-09-05 | 7 | -0/+19
  versionsort64, aio*64 and lio*64 symbols were missing. they are only needed for glibc ABI compatibility; at the source level, dirent.h and aio.h already redirect them.
* eliminate use of cached pid from thread structure | Rich Felker | 2014-07-05 | 2 | -2/+2
  the main motivation for this change is to remove the assumption that the tid of the main thread is also the pid of the process. (the value returned by the set_tid_address syscall was used to fill both fields despite it semantically being the tid.) this is historically and presently true on linux and unlikely to change, but it conceivably could be false on other systems that otherwise reproduce the linux syscall api/abi. only a few parts of the code were actually still using the cached pid. in a couple places (aio and synccall) it was a minor optimization to avoid a syscall. caching could be reintroduced, but lazily as part of the public getpid function rather than at program startup, if it's deemed important for performance later. in other places (cancellation and pthread_kill) the pid was completely unnecessary; the tkill syscall can be used instead of tgkill. this is actually a rather subtle issue, since tgkill is supposedly a solution to race conditions that can affect use of tkill. however, as documented in the commit message for commit 7779dbd2663269b465951189b4f43e70839bc073, tgkill does not actually solve this race; it just limits it to happening within one process rather than between processes. we use a lock that avoids the race in pthread_kill, and the use in the cancellation signal handler is self-targeted and thus not subject to tid reuse races, so both are safe regardless of which syscall (tgkill or tkill) is used.
* support configurable page size on mips, powerpc and microblaze | Szabolcs Nagy | 2013-09-15 | 2 | -2/+2
  PAGE_SIZE was hardcoded to 4096, which is historically what most systems use, but on several archs it is a kernel config parameter; user space can only know it at execution time from the aux vector. PAGE_SIZE and PAGESIZE are not defined on archs where page size is a runtime parameter; applications should use sysconf(_SC_PAGE_SIZE) to query it. Internally libc code defines PAGE_SIZE to libc.page_size, which is set to aux[AT_PAGESZ] in __init_libc and early in __dynlink as well. (Note that libc.page_size can be accessed without GOT, i.e. before relocations are done.) Some fpathconf settings are hardcoded to 4096; these should actually be queried from the filesystem using statfs.
* fix invalid access in aio notification | Rich Felker | 2013-06-16 | 1 | -1/+1
  issue found and patch provided by Jens Gustedt. after the atomic store to the error code field of the aiocb, the application is permitted to free or reuse the storage, so further access is invalid. instead, use the local copy that was already made.
* fix uninitialized variable in lio (aio) code | Rich Felker | 2013-06-16 | 1 | -1/+1
* fix lio_listio return value in LIO_WAIT mode | Szabolcs Nagy | 2013-01-13 | 1 | -1/+1
* use alternate argument syntax for restrict with lio_listio | Rich Felker | 2012-12-04 | 1 | -1/+1
  for some reason I have not been able to determine, gcc 3.2 rejects the array notation. this seems to be a gcc bug, but since it's easy to work around, let's do the workaround and avoid gratuitously requiring newer compilers.
* clean up sloppy nested inclusion from pthread_impl.h | Rich Felker | 2012-11-08 | 2 | -0/+6
  this mirrors the stdio_impl.h cleanup. one header which is not strictly needed, errno.h, is left in pthread_impl.h, because since pthread functions return their error codes rather than using errno, nearly every single pthread function needs the errno constants. in a few places, rather than bringing in string.h to use memset, the memset was replaced by direct assignment. this seems to generate much better code anyway, and makes many functions which were previously non-leaf functions into leaf functions (possibly eliminating a great deal of bloat on some platforms where non-leaf functions require ugly prologue and/or epilogue).
* use restrict everywhere it's required by c99 and/or posix 2008 | Rich Felker | 2012-09-06 | 1 | -2/+2
  to deal with the fact that the public headers may be used with pre-c99 compilers, __restrict is used in place of restrict, and defined appropriately for any supported compiler. we also avoid the form [restrict] since older versions of gcc rejected it due to a bug in the original c99 standard, and instead use the form *restrict.
* stupid typo (caused by rather ugly spelling in POSIX..) in aio | Rich Felker | 2011-09-28 | 1 | -1/+1
* fix idiotic const-correctness error in lio_listio | Rich Felker | 2011-09-16 | 1 | -1/+1
  i blame this one on posix for using hideous const-qualified double pointers which are unusable without hideous casts.
* fix inconsistent signature for aio_error | Rich Felker | 2011-09-14 | 1 | -1/+1
* fix return types for aio_read and aio_write again | Rich Felker | 2011-09-13 | 1 | -2/+2
  previous fix was backwards and propagated the wrong type rather than the right one...
* fix various errors in function signatures/prototypes found by nsz | Rich Felker | 2011-09-13 | 1 | -1/+1
* implement POSIX asynchronous io | Rich Felker | 2011-09-09 | 7 | -0/+338
  some features are not yet supported, and only minimal testing has been performed. should be considered experimental at this point.