| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
unlike other implementations, this one reserves memory for new TLS in
all pre-existing threads at dlopen-time, and dlopen will fail with no
resources consumed and no new libraries loaded if memory is not
available. memory is not immediately distributed to running threads;
that would be too complex and too costly. instead, assurances are made
that threads needing the new TLS can obtain it in an async-signal-safe
way from a buffer belonging to the dynamic linker/new module (via
atomic fetch-and-add based allocator).
I've re-appropriated the lock that was previously used for __synccall
(synchronizing set*id() syscalls between threads) as a general
pthread_create lock. it's a "backwards" rwlock where the "read"
operation is safe atomic modification of the live thread count, which
multiple threads can perform at the same time, and the "write"
operation is making sure the count does not increase during an
operation that depends on it remaining bounded (__synccall or dlopen).
in static-linked programs that don't use __synccall, this lock is a
no-op and has no cost.
|
|
|
|
|
| |
orig_tail was being saved before the lock was obtained, allowing
dlopen failure to roll-back other dlopens that had succeeded.
|
|
|
|
|
|
|
|
| |
currently, only i386 is tested. x86_64 and arm should probably work.
the necessary relocation types for mips and microblaze have not been
added because I don't understand how they're supposed to work, and I'm
not even sure if it's defined yet on microblaze. I may be able to
reverse engineer the requirements out of gcc/binutils output.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
this was an optimization to save/recover a minimal amount of extra
memory for use by malloc, that's becoming increasingly costly to keep
around. freeing this data:
1. breaks debugging with gdb (it can't find library symbols)
2. breaks thread-local storage in shared libraries
it would be possible to disable freeing when TLS is used, but in
addition to the above breakages, tracking whether dlopen/dlsym is used
adds a cost to every symbol lookup, possibly making program startup
slower for large programs. combined with the complexity, it's not
worth it. we already save/recover plenty of memory in the dynamic
linker with reclaim_gaps.
|
|
|
|
|
|
| |
this code will not work yet because the necessary relocations are not
supported, and cannot be supported without some internal changes to
how relocation processing works (coming soon).
|
|
|
|
|
| |
only TLS in the main program is supported so far; TLS defined in
shared libraries will not work yet.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the design for TLS in dynamic-linked programs is mostly complete too,
but I have not yet implemented it. cost is nonzero but still low for
programs which do not use TLS and/or do not use threads (a few hundred
bytes of new code, plus dependency on memcpy). i believe it can be
made smaller at some point by merging __init_tls and __init_security
into __libc_start_main and avoiding duplicate auxv-parsing code.
at the same time, I've also slightly changed the logic pthread_create
uses to allocate guard pages to ensure that guard pages are not
counted towards commit charge.
|
|
|
|
|
|
|
| |
based on proposed patches by Daniel Cegiełka, with minor changes:
- use a weak symbol for optreset so it doesn't clash with namespace
- also reset optpos (position in multi-option arg like -lR)
- also make getopt_long support reset
|
|
|
|
|
| |
also fix one minor bug: failure to free the early-reserved slot when
the semaphore later found to already be mapped.
|
|
|
|
|
|
|
|
|
|
|
| |
this function was overly complicated and not even obviously correct.
avoid using openat/linkat just like in shm_open, and instead expand
pathname using code shared with shm_open. remove bogus (and dangerous,
with priorities) use of spinlocks.
this commit also heavily streamlines the code and ensures there are no
failure cases that can happen after a new semaphore has been created
in the filesystem, since that case is unreportable.
|
|
|
|
|
|
|
| |
1. don't make non-cloexec file descriptors
2. cancellation safety (cleanup handlers were missing, now unneeded)
3. share name validation/mapping code between open/unlink functions
4. avoid wasteful/slow syscalls
|
| |
|
|
|
|
|
|
| |
this feature will be in the next version of POSIX, and can be used
internally immediately. there are many internal uses of fopen where
close-on-exec is needed to fix bugs.
|
| |
|
|
|
|
|
| |
these interfaces have been adopted by the Austin Group for inclusion
in the next version of POSIX.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
also update syslog to use SOCK_CLOEXEC rather than separate fcntl
step, to make it safe in multithreaded programs that run external
programs.
emulation is not atomic; it could be made atomic by holding a lock on
forking during the operation, but this seems like overkill. my goal is
not to achieve perfect behavior on old kernels (which have plenty of
other imperfect behavior already) but to avoid catastrophic breakage
in (1) syslog, which would give no output on old kernels with the
change to use SOCK_CLOEXEC, and (2) programs built on a new kernel
where configure scripts detected a working SOCK_CLOEXEC, which later
get run on older kernels (they may otherwise fail to work completely).
|
|
|
|
|
| |
this did not matter because we don't yet treat process-shared special.
when private futex support is added, however, it will matter.
|
| |
|
| |
|
|
|
|
|
|
| |
based on initial work by rdp, with heavy modifications. some features
including threads are untested because qemu app-level emulation seems
to be broken and I do not have a proper system image for testing.
|
|
|
|
|
| |
the code to exit the new thread/process after the start function
returns was mixed up in its syscall convention.
|
|
|
|
|
|
|
|
|
| |
when strchr fails, and important piece of information already
computed, the string length, is thrown away. have strchrnul (with
namespace protection) be the underlying function so this information
can be kept, and let strchr be a wrapper for it. this also allows
strcspn to be considerably faster in the case where the match set has
a single element that's not matched.
|
|
|
|
|
|
|
| |
testing with gcc 4.6.3 on x86, -Os, the old version does a duplicate
null byte check after the first loop. this is purely the compiler
being stupid, but the old code was also stupid and unintuitive in how
it expressed the check.
|
|
|
|
| |
also optimized a bit.
|
|
|
|
|
|
|
|
|
|
| |
austin group interpretation for defect #529
(http://austingroupbugs.net/view.php?id=529) tightens the
requirements on close such that, if it returns with EINTR, the file
descriptor must not be closed. the linux kernel developers vehemently
disagree with this, and will not change it. we catch and remap EINTR
to EINPROGRESS, which the standard allows close() to return when the
operation was not finished but the file descriptor has been closed.
|
|
|
|
|
|
|
|
| |
new behavior can be summarized as:
inputs that parse completely as a decimal number are treated as one,
and rejected only if the result is out of 16-bit range.
inputs that do not parse as a decimal number (where strtoul leaves
anything left over in the input) are searched in /etc/services.
|
|
|
|
| |
also cleanup cruft related to the issue
|
| |
|
|
|
|
|
| |
not tested on mips and arm; they may still be broken. x86_64 should be
ok now.
|
|
|
|
| |
issue reported/requested by Justin Cormack
|
|
|
|
| |
patch by Justin Cormack, with slight modification
|
|
|
|
| |
contributed by nsz
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
it was determined in discussion that these kind of limits are not
sufficient to protect single-threaded servers against denial of
service attacks from maliciously large round counts. the time scales
simply vary too much; many users will want login passwords with rounds
counts on a scale that gives decisecond latency, while highly loaded
webservers will need millisecond latency or shorter.
still some limit is left in place; the idea is not to protect against
attacks, but to avoid the runtime of a single call to crypt being, for
all practical purposes, infinite, so that configuration errors can be
caught and fixed without bringing down whole systems. these limits are
very high, on the order of minute-long runtimes for modest systems.
|
|
|
|
|
| |
these fixes were already made to the normal syscall asm but not the
cancellation point version.
|
|
|
|
|
|
|
| |
with this patch, the malloc in libc.so built with -Os is nearly the
same speed as the one built with -O3. thus it solves the performance
regression that resulted from removing the forced -O3 when building
libc.so; now libc.so can be both small and fast.
|
|
|
|
|
|
| |
vfork is implemented as the fork syscall (with no atfork handlers run)
on archs where it is not available, so this change does not introduce
any change in behavior or regression for such archs.
|
|
|
|
|
|
|
| |
for the sake of simplicity, I've only used rep movsb rather than
breaking up the copy for using rep movsd/q. on all modern cpus, this
seems to be fine, but if there are performance problems, there might
be a need to go back and add support for rep movsd/q.
|
| |
|
|
|
|
|
|
|
|
|
| |
before restrict was added, memove called memcpy for forward copies and
used a byte-at-a-time loop for reverse copies. this was changed to
avoid invoking UB now that memcpy has an undefined copying order,
making memmove considerably slower.
performance is still rather bad, so I'll be adding asm soon.
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
| |
no syscalls actually use that many arguments; the issue is that some
syscalls with 64-bit arguments have them ordered badly so that
breaking them into aligned 32-bit half-arguments wastes slots with
padding, and a 7th slot is needed for the last argument.
|
| |
|
|
|
|
|
|
|
| |
this code was using $10 to save the syscall number, but $10 is not
necessarily preserved by the kernel across syscalls. only mattered for
syscalls that got interrupted by a signal and restarted. as far as i
can tell, $25 is preserved by the kernel across syscalls.
|
|
|
|
|
|
|
| |
something is wrong with the logic for the argument layout, resulting
in compile errors on mips due to too many args to syscall... further
information on how it's supposed to work will be needed before it can
be reactivated.
|