about summary refs log tree commit diff
path: root/src/malloc
Commit message (Collapse)AuthorAgeFilesLines
* remove useless check of bin match in mallocRich Felker2015-03-041-1/+1
| | | | | | | | | | | | | | | | this re-check idiom seems to have been copied from the alloc_fwd and alloc_rev functions, which guess a bin based on non-synchronized memory access to adjacent chunk headers then need to confirm, after locking the bin, that the chunk is actually in the bin they locked. the check being removed, however, was being performed on a chunk obtained from the already-locked bin. there is no race to account for here; the check could only fail in the event of corrupt free lists, and even then it would not catch them but simply continue running. since the bin_index function is mildly expensive, it seems preferable to remove the check rather than trying to convert it into a useful consistency check. casual testing shows a 1-5% reduction in run time.
* fix init race that could lead to deadlock in malloc init codeRich Felker2015-03-041-39/+14
| | | | | | | | | | | | | the malloc init code provided its own version of pthread_once type logic, including the exact same bug that was fixed in pthread_once in commit 0d0c2f40344640a2a6942dda156509593f51db5d. since this code is called adjacent to expand_heap, which takes a lock, there is no reason to have pthread_once-type initialization. simply moving the init code into the interval where expand_heap already holds its lock on the brk achieves the same result with much less synchronization logic, and allows the buggy code to be eliminated rather than just fixed.
* make all objects used with atomic operations volatileRich Felker2015-03-032-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the memory model we use internally for atomics permits plain loads of values which may be subject to concurrent modification without requiring that a special load function be used. since a compiler is free to make transformations that alter the number of loads or the way in which loads are performed, the compiler is theoretically free to break this usage. the most obvious concern is with atomic cas constructs: something of the form tmp=*p;a_cas(p,tmp,f(tmp)); could be transformed to a_cas(p,*p,f(*p)); where the latter is intended to show multiple loads of *p whose resulting values might fail to be equal; this would break the atomicity of the whole operation. but even more fundamental breakage is possible. with the changes being made now, objects that may be modified by atomics are modeled as volatile, and the atomic operations performed on them by other threads are modeled as asynchronous stores by hardware which happens to be acting on the request of another thread. such modeling of course does not itself address memory synchronization between cores/cpus, but that aspect was already handled. this all seems less than ideal, but it's the best we can do without mandating a C11 compiler and using the C11 model for atomics. in the case of pthread_once_t, the ABI type of the underlying object is not volatile-qualified. so we are assuming that accessing the object through a volatile-qualified lvalue via casts yields volatile access semantics. the language of the C standard is somewhat unclear on this matter, but this is an assumption the linux kernel also makes, and seems to be the correct interpretation of the standard.
* add malloc_usable_size function and non-stub malloc.hRich Felker2014-08-251-0/+17
| | | | | | | | | | | | | | | | | | | | | | this function is needed for some important practical applications of ABI compatibility, and may be useful for supporting some non-portable software at the source level too. I was hesitant to add a function which imposes any constraints on malloc internals; however, it turns out that any malloc implementation which has realloc must already have an efficient way to determine the size of existing allocations, so no additional constraint is imposed. for now, some internal malloc definitions are duplicated in the new source file. if/when malloc is refactored to put them in a shared internal header file, these could be removed. since malloc_usable_size is conventionally declared in malloc.h, the empty stub version of this file was no longer suitable. it's updated to provide the standard allocator functions, nonstandard ones (even if stdlib.h would not expose them based on the feature test macros in effect), and any malloc-extension functions provided (currently, only malloc_usable_size).
* avoid malloc failure for small requests when brk can't be extendedRich Felker2014-04-021-1/+23
| | | | | | | | | | | | | | | | | this issue mainly affects PIE binaries and execution of programs via direct invocation of the dynamic linker binary: depending on kernel behavior, in these cases the initial brk may be placed at at location where it cannot be extended, due to conflicting adjacent maps. when brk fails, mmap is used instead to expand the heap. in order to avoid expensive bookkeeping for managing fragmentation by merging these new heap regions, the minimum size for new heap regions increases exponentially in the number of regions. this limits the number of regions, and thereby the number of fixed fragmentation points, to a quantity which is logarithmic with respect to the size of virtual address space and thus negligible. the exponential growth is tuned so as to avoid expanding the heap by more than approximately 50% of its current total size.
* include cleanups: remove unused headers and add feature test macrosSzabolcs Nagy2013-12-121-1/+0
|
* slightly optimize __brk for sizeRich Felker2013-10-051-1/+1
| | | | | | | | | | there is no reason to check the return value for setting errno, since brk never returns errors, only the new value of the brk (which may be the same as the old, or otherwise differ from the requested brk, on failure). it may be beneficial to eventually just eliminate this file and make the syscalls inline in malloc.c.
* fix failure of malloc to set errno on heap (brk) exhaustionRich Felker2013-10-051-0/+1
| | | | | I wrongly assumed the brk syscall would set errno, but on failure it returns the old value of the brk rather than an error code.
* fix potential deadlock bug in libc-internal locking logicRich Felker2013-09-201-8/+7
| | | | | | | | | | | | | | | | | | | | if a multithreaded program became non-multithreaded (i.e. all other threads exited) while one thread held an internal lock, the remaining thread would fail to release the lock. the the program then became multithreaded again at a later time, any further attempts to obtain the lock would deadlock permanently. the underlying cause is that the value of libc.threads_minus_1 at unlock time might not match the value at lock time. one solution would be returning a flag to the caller indicating whether the lock was taken and needs to be unlocked, but there is a simpler solution: using the lock itself as such a flag. note that this flag is not needed anyway for correctness; if the lock is not held, the unlock code is harmless. however, the memory synchronization properties associated with a_store are costly on some archs, so it's best to avoid executing the unlock code when it is unnecessary.
* remove redundant check in memalignRich Felker2013-07-231-1/+1
| | | | | the case where mem was already aligned is handled earlier in the function now.
* fix heap corruption bug in memalignRich Felker2013-07-231-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | this bug was caught by the new footer-corruption check in realloc and free. if the block returned by malloc was already aligned to the desired alignment, memalign's logic to split off the misaligned head was incorrect; rather than writing to a point inside the allocated block, it was overwriting the footer of the previous block on the heap with the value 1 (length 0 plus an in-use flag). fortunately, the impact of this bug was fairly low. (this is probably why it was not caught sooner.) due to the way the heap works, malloc will never return a block whose previous block is free. (doing so would be harmful because it would increase fragmentation with no benefit.) the footer is actually not needed for in-use blocks, except that its in-use bit needs to remain set so that it does not get merged with free blocks, so there was no harm in it being set to 1 instead of the correct value. however, there is one case where this bug could have had an impact: in multi-threaded programs, if another thread freed the previous block after memalign's call to malloc returned, but before memalign overwrote the previous block's footer, the resulting block in the free list could be left in a corrupt state. I have not analyzed the impact of this bad state and whether it could lead to more serious malfunction.
* harden realloc/free to detect simple overflowsRich Felker2013-07-191-0/+6
| | | | | | | | | | | the sizes in the header and footer for a chunk should always match. if they don't, the program has definitely invoked undefined behavior, and the most likely cause is a simple overflow, either of a buffer in the block being freed or the one just below it. crashing here should not only improve security of buggy programs, but also aid in debugging, since the crash happens in a context where you have a pointer to the likely-overflowed buffer.
* move core memalign code from aligned_alloc to __memalignRich Felker2013-07-043-49/+55
| | | | | | | | | | | | there are two motivations for this change. one is to avoid gratuitously depending on a C11 symbol for implementing a POSIX function. the other pertains to the documented semantics. C11 does not define any behavior for aligned_alloc when the length argument is not a multiple of the alignment argument. posix_memalign on the other hand places no requirements on the length argument. using __memalign as the implementation of both, rather than trying to implement one in terms of the other when their documented contracts differ, eliminates this confusion.
* move alignment check from aligned_alloc to posix_memalignRich Felker2013-07-042-1/+2
| | | | | | | | C11 has no requirement that the alignment be a multiple of sizeof(void*), and in fact seems to require any "valid alignment supported by the implementation" to work. since the alignment of char is 1 and thus a valid alignment, an alignment argument of 1 should be accepted.
* page-align initial brk value used by malloc in shared libcRich Felker2012-12-071-1/+5
| | | | | | | | | | this change fixes an obscure issue with some nonstandard kernels, where the initial brk syscall returns a pointer just past the end of bss rather than the beginning of a new page. in that case, the dynamic linker has already reclaimed the space between the end of bss and the page end for use by malloc, and memory corruption (allocating the same memory twice) will occur when malloc again claims it on the first call to brk.
* fix invalid read in aligned_allocRich Felker2012-12-061-2/+3
| | | | | | in case of mmap-obtained chunks, end points past the end of the mapping and reading it may fault. since the value is not needed until after the conditional, move the access to prevent invalid reads.
* workaround gcc got-register-reload performance problems in mallocRich Felker2012-09-141-4/+8
| | | | | | | with this patch, the malloc in libc.so built with -Os is nearly the same speed as the one built with -O3. thus it solves the performance regression that resulted from removing the forced -O3 when building libc.so; now libc.so can be both small and fast.
* implement "low hanging fruit" from C11Rich Felker2012-08-253-47/+55
| | | | | | | | based on Gregor's patch sent to the list. includes: - stdalign.h - removing gets in C11 mode - adding aligned_alloc and adjusting other functions to use it - adding 'x' flag to fopen for exclusive mode
* ditch the priority inheritance locks; use malloc's version of lockRich Felker2012-04-241-4/+4
| | | | | | | | | | | | | | | | | | | i did some testing trying to switch malloc to use the new internal lock with priority inheritance, and my malloc contention test got 20-100 times slower. if priority inheritance futexes are this slow, it's simply too high a price to pay for avoiding priority inversion. maybe we can consider them somewhere down the road once the kernel folks get their act together on this (and perferably don't link it to glibc's inefficient lock API)... as such, i've switch __lock to use malloc's implementation of lightweight locks, and updated all the users of the code to use an array with a waiter count for their locks. this should give optimal performance in the vast majority of cases, and it's simple. malloc is still using its own internal copy of the lock code because it seems to yield measurably better performance with -O3 when it's inlined (20% or more difference in the contention stress test).
* fix issue with excessive mremap syscalls on reallocRich Felker2011-11-161-4/+2
| | | | | | | | | | CHUNK_SIZE macro was defined incorrectly and shaving off at least one significant bit in the size of mmapped chunks, resulting in the test for oldlen==newlen always failing and incurring a syscall. fortunately i don't think this issue caused any other observable behavior; the definition worked correctly for all non-mmapped chunks where its correctness matters more, since their lengths are always multiples of the alignment.
* use new a_crash() asm to optimize double-free handler.Rich Felker2011-08-231-2/+2
| | | | | | | gcc generates extremely bad code (7 byte immediate mov) for the old null pointer write approach. it should be generating something like "xor %eax,%eax ; mov %al,(%eax)". in any case, using a dedicated crashing opcode accomplishes the same thing in one byte.
* simplify and improve double-free checkRich Felker2011-08-151-2/+2
| | | | | | | | | | | | a valid mmapped block will have an even (actually aligned) "extra" field, whereas a freed chunk on the heap will always have an in-use neighbor. this fixes a potential bug if mmap ever allocated memory below the main program/brk (in which case it would be wrongly-detected as a double-free by the old code) and allows the double-free check to work for donated memory outside of the brk area (or, in the future, secondary heap zones if support for their creation is added).
* posix_memalign should fail if size is not a multiple of sizeof(void *)Rich Felker2011-06-291-1/+1
|
* eliminate OOB array hacks in mallocRich Felker2011-06-261-46/+45
|
* malloc: cast size down to int in bin_index functionsRich Felker2011-06-121-2/+2
| | | | | | | | | | | even if size_t was 32-bit already, the fact that the value was unsigned and that gcc is too stupid to figure out it would be positive as a signed quantity (due to the immediately-prior arithmetic and conditionals) results in gcc compiling the integer-to-float conversion as zero extension to 64 bits followed by an "fildll" (64 bit) instruction rather than a simple "fildl" (32 bit) instruction on x86. reportedly fildll is very slow on certain p4-class machines; even if not, the new code is slightly smaller.
* use volatile pointers for intentional-crash code.Rich Felker2011-06-061-2/+2
|
* namespace fixes for sys/mman.hRich Felker2011-04-201-0/+1
|
* fix rare but nasty under-allocation bug in malloc with large requestsRich Felker2011-04-041-1/+1
| | | | | | | | | the bug appeared only with requests roughly 2*sizeof(size_t) to 4*sizeof(size_t) bytes smaller than a multiple of the page size, and only for requests large enough to be serviced by mmap instead of the normal heap. it was only ever observed on 64-bit machines but presumably could also affect 32-bit (albeit with a smaller window of opportunity).
* avoid over-allocation of brk on first mallocRich Felker2011-04-011-4/+4
| | | | | | | if init_malloc returns positive (successful first init), malloc will retry getting a chunk from the free bins rather than expanding the heap again. also pass init_malloc a hint for the size of the initial allocation.
* rename __simple_malloc.c to lite_malloc.c - yes this affects behavior!Rich Felker2011-03-301-0/+0
| | | | | | | | | | | | | | | why does this affect behavior? well, the linker seems to traverse archive files starting from its current position when resolving symbols. since calloc.c comes alphabetically (and thus in sequence in the archive file) between __simple_malloc.c and malloc.c, attempts to resolve the "malloc" symbol for use by calloc.c were pulling in the full malloc.c implementation rather than the __simple_malloc.c implementation. as of now, lite_malloc.c and malloc.c are adjacent in the archive and in the correct order, so malloc.c should never be used to resolve "malloc" unless it's already needed to resolve another symbol ("free" or "realloc").
* very cheap double-free checks in mallocRich Felker2011-03-231-0/+4
|
* global cleanup to use the new syscall interfaceRich Felker2011-03-201-1/+1
|
* make malloc(0) return unique pointers rather than NULLRich Felker2011-02-202-6/+10
| | | | | | | this change is made with some reluctance, but i think it's for the best. correct programs must handle either behavior, so there is little advantage to having malloc(0) return NULL. and i managed to actually make the malloc code slightly smaller with this change.
* fix simple_malloc malloc(0) behavior not to return non-unique pointersRich Felker2011-02-201-0/+1
|
* fix simple_malloc size restrictionsRich Felker2011-02-201-5/+6
| | | | | | do not allow allocations that overflow ptrdiff_t; fix some overflow checks that were not quite right but didn't matter due to address layout implementation.
* initial check-in, version 0.5.0 v0.5.0Rich Felker2011-02-127-0/+671