path: root/src/mman
Commit message | Author | Age | Files | Lines
* fix mremap memory synchronization and use of variadic argument | Rich Felker | 2015-11-02 | 1 | -4/+11
  since mremap with the MREMAP_FIXED flag is an operation that unmaps existing mappings, it needs to use the vm lock mechanism to ensure that any in-progress synchronization operations using vm identities from before the call have finished.
  also, the variadic argument was erroneously being read even if the MREMAP_FIXED flag was not passed. in practice this didn't break anything, but it's UB and in theory LTO could turn it into a hard error.
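The variadic-argument part of this fix can be illustrated with a small sketch. It is not musl's code: `optional_new_addr` is a hypothetical helper, and the fallback definition of `MREMAP_FIXED` is only there so the fragment compiles without the Linux headers. The point is that `va_arg` is reached only when the flag says the caller actually passed the extra argument.

```c
#include <stdarg.h>
#include <stddef.h>

#ifndef MREMAP_FIXED
#define MREMAP_FIXED 2 /* Linux value, defined here so the sketch stands alone */
#endif

/* Hypothetical helper: returns the optional trailing argument of an
 * mremap-style call, or NULL when MREMAP_FIXED is not set. */
static void *optional_new_addr(int flags, ...)
{
	void *new_addr = NULL;

	if (flags & MREMAP_FIXED) {
		/* Only touch the variadic argument when the caller promised
		 * to supply one; reading it otherwise is undefined behavior. */
		va_list ap;
		va_start(ap, flags);
		new_addr = va_arg(ap, void *);
		va_end(ap);
	}
	return new_addr;
}
```

A caller that omits the flag, as in `optional_new_addr(0)`, never reaches `va_arg`.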
* prevent allocs larger than PTRDIFF_MAX via mremap | Daniel Micay | 2015-11-02 | 1 | -1/+8
  It's quite feasible for this to happen via MREMAP_MAYMOVE.
* redesign and simplify vmlock system | Rich Felker | 2015-04-10 | 2 | -15/+7
  this global lock allows certain unlock-type primitives to exclude mmap/munmap operations which could change the identity of virtual addresses while references to them still exist.
  the original design mistakenly assumed mmap/munmap would conversely need to exclude the same operations which exclude mmap/munmap, so the vmlock was implemented as a sort of 'symmetric recursive rwlock'. this turned out to be unnecessary.
  commit 25d12fc0fc51f1fae0f85b4649a6463eb805aa8f already shortened the interval during which mmap/munmap held their side of the lock, but left the inappropriate lock design and some inefficiency.
  the new design uses a separate function, __vm_wait, which does not hold any lock itself and only waits for lock users which were already present when it was called to release the lock. this is sufficient because of the way operations that need to be excluded are sequenced: the "unlock-type" operations using the vmlock need only block mmap/munmap operations that are precipitated by (and thus sequenced after) the atomic-unlock they perform while holding the vmlock. this allows for a spectacular lack of synchronization in the __vm_wait function itself.
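The shape of this design can be sketched with C11 atomics. This is an illustration under assumptions, not musl's implementation: the `sketch_` names and the `vm_users` counter are hypothetical, and the waiter here simply yields until the counter reaches zero, where a real implementation would more likely sleep on a futex. The waiting side takes no lock of its own.

```c
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical counter: threads currently inside a vm-locked region. */
static atomic_int vm_users;

/* Entered by "unlock-type" primitives before their atomic unlock. */
static void sketch_vm_lock(void)
{
	atomic_fetch_add(&vm_users, 1);
}

/* Left once the primitive no longer touches the shared memory. */
static void sketch_vm_unlock(void)
{
	atomic_fetch_sub(&vm_users, 1);
}

/* Called before an unmapping operation. It holds nothing itself; it
 * only waits for the lock users to drain. Waiting for the count to hit
 * zero is a conservative simplification of "users already present". */
static void sketch_vm_wait(void)
{
	while (atomic_load(&vm_users) != 0)
		sched_yield();
}
```

Under this scheme the unmapping side calls `sketch_vm_wait()` once and then proceeds with the syscall, rather than holding anything across it.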
* make fsync, fdatasync, and msync cancellation points | Trutz Behn | 2015-01-30 | 1 | -1/+1
  these are mandatory cancellation points per POSIX, so their omission was a conformance bug.
* use weak symbols for the POSIX functions that will be used by C threads | Jens Gustedt | 2014-09-06 | 1 | -1/+3
  The intent of this is to avoid name space pollution of the C threads implementation.
  This has two sides to it. First we have to provide symbols that wouldn't pollute the name space for the C threads implementation. Second we have to clean up some internal uses of POSIX functions such that they don't implicitly drag in such symbols.
* optimize locking against vm changes for mmap/munmap | Rich Felker | 2014-08-16 | 2 | -8/+7
  the whole point of this locking is to prevent munmap, or mmap with MAP_FIXED, from deallocating virtual addresses, or changing the backing a given virtual address refers to, during certain race windows involving self-synchronized unmapping or destruction of pthread synchronization objects.
  there is no need for exclusion in the other direction, so it suffices to take the lock momentarily and release it before making the syscall, rather than holding it across the syscall.
* add framework for mmap2 syscall unit to vary by arch | Rich Felker | 2014-07-30 | 1 | -2/+3
* include cleanups: remove unused headers and add feature test macros | Szabolcs Nagy | 2013-12-12 | 2 | -2/+0
* support configurable page size on mips, powerpc and microblaze | Szabolcs Nagy | 2013-09-15 | 1 | -1/+1
  PAGE_SIZE was hardcoded to 4096, which is historically what most systems use, but on several archs it is a kernel config parameter; user space can only know it at execution time from the aux vector.
  PAGE_SIZE and PAGESIZE are not defined on archs where page size is a runtime parameter; applications should use sysconf(_SC_PAGE_SIZE) to query it. Internally libc code defines PAGE_SIZE to libc.page_size, which is set to aux[AT_PAGESZ] in __init_libc and early in __dynlink as well. (Note that libc.page_size can be accessed without GOT, i.e. before relocations are done.)
  Some fpathconf settings are hardcoded to 4096; these should actually be queried from the filesystem using statfs.
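On the application side, the runtime query recommended above might look like the following minimal sketch. `runtime_page_size` is a hypothetical wrapper name; `sysconf` and `_SC_PAGE_SIZE` are the standard POSIX interface.

```c
#include <unistd.h>

/* Hypothetical wrapper: ask the system for the page size at run time
 * instead of relying on a compile-time PAGE_SIZE constant. */
static long runtime_page_size(void)
{
	return sysconf(_SC_PAGE_SIZE);
}
```

Code that needs page-aligned buffers would call this once at startup and keep the result, since the value does not change during the life of the process.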
* fix shm_open wrongly being cancellable | Rich Felker | 2013-07-20 | 1 | -1/+6
* disallow creation of objects larger than PTRDIFF_MAX via mmap | Rich Felker | 2013-06-27 | 1 | -0/+5
  internally, other parts of the library assume sizes don't overflow ssize_t and/or ptrdiff_t, and the way this assumption is made valid is by preventing creation of such large objects. malloc already does so, but the check was missing from mmap.
  this is also a quality of implementation issue: even if the implementation internally could handle such objects, applications could inadvertently invoke undefined behavior by subtracting pointers within an object. it is very difficult to guard against this in applications, so a good implementation should simply ensure that it does not happen.
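A size guard of the kind described here could be sketched as below. This is an assumption-laden illustration, not the code from the commit: `check_map_len` is a hypothetical helper, and both the use of ENOMEM as the error code and the strict "greater than" comparison are choices made for the sketch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pre-check run before asking the kernel for a mapping:
 * refuse any length whose pointer differences could overflow ptrdiff_t. */
static int check_map_len(size_t len)
{
	if (len > PTRDIFF_MAX) {
		errno = ENOMEM; /* assumed error code for this sketch */
		return -1;
	}
	return 0;
}
```

A wrapper would run this check first and report failure without making the syscall at all when it returns -1.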
* clean up and fix logic for making mmap fail on invalid/unsupported offsets | Rich Felker | 2012-12-20 | 1 | -3/+7
  the previous logic was assuming the kernel would give EINVAL when passed an invalid address, but instead with MAP_FIXED it was giving EPERM, as it considered this an attempt to map over kernel memory. instead of trying to get the kernel to do the right thing, the new code just handles the error in userspace.
  I have also cleaned up the code to use a single mask to check for invalid low bits and unsupported high bits, so it's simpler and more clearly correct. the old code was actually wrong for sizeof(long) smaller than sizeof(off_t) but not equal to 4; now it should be correct for all possibilities.
  for 64-bit systems, the low-bits test is new and extraneous (the kernel should catch the error anyway when the mmap2 syscall is not used), but it's cheap anyway. if this is an issue, the OFF_MASK definition could be tweaked to omit the low bits when SYS_mmap2 is not defined.
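One way to build a single mask of this kind is sketched below. It rests on assumptions that the commit message does not spell out: the syscall is taken to count offsets in 4096-byte units held in a long-sized register, and the width of long is passed in as a parameter (`long_bytes`) purely so both the 32-bit and 64-bit cases can be exercised. All names are hypothetical.

```c
#include <stdint.h>

/* Assumed for illustration: offsets are passed in 4096-byte units. */
#define SKETCH_UNIT 4096ULL

/* Bits that must be clear in a valid offset: the low bits (not a
 * multiple of the unit) and the high bits that would not fit once the
 * offset is divided by the unit and stored in a long. Shifting
 * -(2*UNIT) by (bits-1) equals -(UNIT) shifted by bits, but avoids an
 * undefined shift by 64 when long is 8 bytes. */
static uint64_t sketch_off_mask(unsigned long_bytes)
{
	uint64_t high = -(2 * SKETCH_UNIT) << (8 * long_bytes - 1);
	return high | (SKETCH_UNIT - 1);
}

/* Nonzero when the offset passes the single-mask test. */
static int sketch_offset_ok(uint64_t off, unsigned long_bytes)
{
	return (off & sketch_off_mask(long_bytes)) == 0;
}
```

With a 4-byte long the mask rejects offsets at or above 2^44; with an 8-byte long the high part shifts out entirely and only the low-bits test remains.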
* overhaul sem_open | Rich Felker | 2012-09-30 | 1 | -3/+3
  this function was overly complicated and not even obviously correct. avoid using openat/linkat just like in shm_open, and instead expand pathname using code shared with shm_open. remove bogus (and dangerous, with priorities) use of spinlocks.
  this commit also heavily streamlines the code and ensures there are no failure cases that can happen after a new semaphore has been created in the filesystem, since that case is unreportable.
* clean up, bugfixes, and general improvement for shm_open/shm_unlink | Rich Felker | 2012-09-30 | 2 | -30/+28
  1. don't make non-cloexec file descriptors
  2. cancellation safety (cleanup handlers were missing, now unneeded)
  3. share name validation/mapping code between open/unlink functions
  4. avoid wasteful/slow syscalls
* mincore syscall wrapper | Rich Felker | 2012-09-09 | 1 | -0/+8
* process-shared barrier support, based on discussion with bdonlan | Rich Felker | 2011-09-27 | 2 | -3/+21
  this implementation is rather heavy-weight, but it's the first solution i've found that's actually correct. all waiters actually wait twice at the barrier so that they can synchronize exit, and they hold a "vm lock" that prevents changes to virtual memory mappings (and blocks pthread_barrier_destroy) until all waiters are finished inspecting the barrier.
  thus, it is safe for any thread to destroy and/or unmap the barrier's memory as soon as pthread_barrier_wait returns, without further synchronization.
* work around linux bug in mprotect | Rich Felker | 2011-06-29 | 1 | -1/+5
  per POSIX:
    The mprotect() function shall change the access protections to be that specified by prot for those whole pages containing any part of the address space of the process starting at address addr and continuing for len bytes.
  on the other hand, linux mprotect fails with EINVAL if the base address and/or length is not page-aligned, so we have to align them before making the syscall.
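The alignment step described here can be sketched as a small helper that widens a byte range to whole pages. The names `page_range` and `align_to_pages` are hypothetical, and the page size is passed in as a parameter and assumed to be a power of two; this is an illustration of the rounding, not the code from the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: a page-aligned start and length. */
struct page_range {
	uintptr_t start;
	size_t len;
};

/* Widen [addr, addr+len) to the whole pages that contain any part of
 * it: round the start down and the end up to a page boundary. */
static struct page_range align_to_pages(uintptr_t addr, size_t len, size_t page)
{
	assert(page != 0 && (page & (page - 1)) == 0); /* power of two */

	uintptr_t mask = -(uintptr_t)page;        /* clears the low bits */
	uintptr_t start = addr & mask;            /* round down */
	uintptr_t end = (addr + len + page - 1) & mask; /* round up */

	return (struct page_range){ start, end - start };
}
```

For example, with 4096-byte pages a 10-byte range starting at address 4097 widens to the single page starting at 4096; the widened start and length are what would be handed to the syscall.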
* fix missing include in posix_madvise.c (compile error) | Rich Felker | 2011-04-20 | 1 | -0/+1
* support posix_madvise (previously a stub) | Rich Felker | 2011-04-20 | 1 | -1/+3
  the check against MADV_DONTNEED is because linux MADV_DONTNEED semantics conflict dangerously with the POSIX semantics
* consistency: change all remaining syscalls to use SYS_ rather than __NR_ prefix | Rich Felker | 2011-04-06 | 1 | -1/+1
* global cleanup to use the new syscall interface | Rich Felker | 2011-03-20 | 10 | -11/+11
* implement POSIX shared memory | Rich Felker | 2011-03-03 | 2 | -0/+42
* cleaning up syscalls in preparation for x86_64 port | Rich Felker | 2011-02-13 | 1 | -0/+4
  - hide all the legacy xxxxxx32 name cruft in syscall.h so the actual source files can be clean and uniform across all archs.
  - cleanup llseek/lseek and mmap2/mmap handling for 32/64 bit systems
  - alternate implementation for nice if the target lacks nice syscall
* initial check-in, version 0.5.0 [v0.5.0] | Rich Felker | 2011-02-12 | 11 | -0/+107