about summary refs log tree commit diff
path: root/nptl/allocatestack.c
Commit message (Collapse)AuthorAgeFilesLines
* nptl: Remove pthread_clock_gettime pthread_clock_settimeAdhemerval Zanella2019-03-221-48/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes CLOCK_THREAD_CPUTIME_ID and CLOCK_PROCESS_CPUTIME_ID support from clock_gettime and clock_settime generic implementation. For Linux, kernel already provides supports through the syscall and Hurd HTL lacks __pthread_clock_gettime and __pthread_clock_settime internal implementation. As described in clock_gettime man-page [1] on 'Historical note for SMP system', implementing CLOCK_{THREAD,PROCESS}_CPUTIME_ID with timer registers is error-prone and susceptible to timing and accurary issues that the libc can not deal without kernel support. This allows removes unused code which, however, still incur in some runtime overhead in thread creation (the struct pthread cpuclock_offset initialization). If hurd eventually wants to support them it should either either implement as a kernel facility (or something related due its architecture) or in system specific implementation. Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu. I also checked on a i686-gnu build. * nptl/Makefile (libpthread-routines): Remove pthread_clock_gettime and pthread_clock_settime. * nptl/pthreadP.h (__find_thread_by_id): Remove prototype. * elf/dl-support.c [!HP_TIMING_NOAVAIL] (_dl_cpuclock_offset): Remove. (_dl_non_dynamic_init): Remove _dl_cpuclock_offset setting. * elf/rtld.c (_dl_start_final): Likewise. * nptl/allocatestack.c (__find_thread_by_id): Remove function. * sysdeps/generic/ldsodefs.h [!HP_TIMING_NOAVAIL] (_dl_cpuclock_offset): Remove. * sysdeps/mach/hurd/dl-sysdep.c [!HP_TIMING_NOAVAIL] (_dl_cpuclock_offset): Remove. * nptl/descr.h (struct pthread): Rename cpuclock_offset to cpuclock_offset_ununsed. * nptl/nptl-init.c (__pthread_initialize_minimal_internal): Remove cpuclock_offset set. * nptl/pthread_create.c (START_THREAD_DEFN): Likewise. * sysdeps/nptl/fork.c (__libc_fork): Likewise. * nptl/pthread_clock_gettime.c: Remove file. * nptl/pthread_clock_settime.c: Likewise. * sysdeps/unix/clock_gettime.c (hp_timing_gettime): Remove function. [HP_TIMING_AVAIL] (realtime_gettime): Remove CLOCK_THREAD_CPUTIME_ID and CLOCK_PROCESS_CPUTIME_ID support. * sysdeps/unix/clock_settime.c (hp_timing_gettime): Likewise. [HP_TIMING_AVAIL] (realtime_gettime): Likewise. * sysdeps/posix/clock_getres.c (hp_timing_getres): Likewise. [HP_TIMING_AVAIL] (__clock_getres): Likewise. * sysdeps/unix/clock_nanosleep.c (CPUCLOCK_P, INVALID_CLOCK_P): Likewise. (__clock_nanosleep): Remove CPUCLOCK_P and INVALID_CLOCK_P usage. [1] http://man7.org/linux/man-pages/man2/clock_gettime.2.html
* Avoid "inline" after return type in function definitions.Joseph Myers2019-02-061-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | One group of warnings seen with -Wextra is warnings for static or inline not at the start of a declaration (-Wold-style-declaration). This patch fixes various such cases for inline, ensuring it comes at the start of the declaration (after any static). A common case of the fix is "static inline <type> __always_inline"; the definition of __always_inline starts with __inline, so the natural change is to "static __always_inline <type>". Other cases of the warning may be harder to fix (one pattern is a function definition that gets rewritten to be static by an including file, "#define funcname static wrapped_funcname" or similar), but it seems worth fixing these cases with inline anyway. Tested for x86_64. * elf/dl-load.h (_dl_postprocess_loadcmd): Use __always_inline before return type, without separate inline. * elf/dl-tunables.c (maybe_enable_malloc_check): Likewise. * elf/dl-tunables.h (tunable_is_name): Likewise. * malloc/malloc.c (do_set_trim_threshold): Likewise. (do_set_top_pad): Likewise. (do_set_mmap_threshold): Likewise. (do_set_mmaps_max): Likewise. (do_set_mallopt_check): Likewise. (do_set_perturb_byte): Likewise. (do_set_arena_test): Likewise. (do_set_arena_max): Likewise. (do_set_tcache_max): Likewise. (do_set_tcache_count): Likewise. (do_set_tcache_unsorted_limit): Likewise. * nis/nis_subr.c (count_dots): Likewise. * nptl/allocatestack.c (advise_stack_range): Likewise. * sysdeps/ieee754/dbl-64/s_sin.c (do_cos): Likewise. (do_sin): Likewise. (reduce_sincos): Likewise. (do_sincos): Likewise. * sysdeps/unix/sysv/linux/x86/elision-conf.c (do_set_elision_enable): Likewise. (TUNABLE_CALLBACK_FNDECL): Likewise.
* Fix alignment of TLS variables for tls variant TLS_TCB_AT_TP [BZ #23403]Stefan Liebler2019-02-061-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The alignment of TLS variables is wrong if accessed from within a thread for architectures with tls variant TLS_TCB_AT_TP. For the main thread the static tls data is properly aligned. For other threads the alignment depends on the alignment of the thread pointer as the static tls data is located relative to this pointer. This patch adds this alignment for TLS_TCB_AT_TP variants in the same way as it is already done for TLS_DTV_AT_TP. The thread pointer is also already properly aligned if the user provides its own stack for the new thread. This patch extends the testcase nptl/tst-tls1.c in order to check the alignment of the tls variables and it adds a pthread_create invocation with a user provided stack. The test itself is migrated from test-skeleton.c to test-driver.c and the missing support functions xpthread_attr_setstack and xposix_memalign are added. ChangeLog: [BZ #23403] * nptl/allocatestack.c (allocate_stack): Align pointer pd for TLS_TCB_AT_TP tls variant. * nptl/tst-tls1.c: Migrate to support/test-driver.c. Add alignment checks. * support/Makefile (libsupport-routines): Add xposix_memalign and xpthread_setstack. * support/support.h: Add xposix_memalign. * support/xthread.h: Add xpthread_attr_setstack. * support/xposix_memalign.c: New File. * support/xpthread_attr_setstack.c: Likewise.
* Update copyright dates with scripts/update-copyrights.Joseph Myers2019-01-011-1/+1
| | | | | | | * All files with FSF copyright notices: Update copyright dates using scripts/update-copyrights. * locale/programs/charmap-kw.h: Regenerated. * locale/programs/locfile-kw.h: Likewise.
* nptl: Use __mprotect consistently for _STACK_GROWS_UPFlorian Weimer2018-07-121-1/+1
|
* libc: Extend __libc_freeres framework (Bug 23329).Carlos O'Donell2018-06-291-3/+9
| | | | | | | | | | | | | | | | | | | | | | | | | The __libc_freeres framework does not extend to non-libc.so objects. This causes problems in general for valgrind and mtrace detecting unfreed objects in both libdl.so and libpthread.so. This change is a pre-requisite to properly moving the malloc hooks out of malloc since such a move now requires precise accounting of all allocated data before destructors are run. This commit adds a proper hook in libc.so.6 for both libdl.so and for libpthread.so, this ensures that shm-directory.c which uses freeit () to free memory is called properly. We also remove the nptl_freeres hook and fall back to using weak-ref-and-check idiom for a loaded libpthread.so, thus making this process similar for all DSOs. Lastly we follow best practice and use explicit free calls for both libdl.so and libpthread.so instead of the generic hook process which has undefined order. Tested on x86_64 with no regressions. Signed-off-by: DJ Delorie <dj@redhat.com> Signed-off-by: Carlos O'Donell <carlos@redhat.com>
* nptl: Remove __ASSUME_PRIVATE_FUTEXH.J. Lu2018-05-171-12/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since __ASSUME_PRIVATE_FUTEX is always defined, this patch removes the !__ASSUME_PRIVATE_FUTEX paths. Tested with build-many-glibcs.py. * nptl/allocatestack.c (allocate_stack): Remove the !__ASSUME_PRIVATE_FUTEX paths. * nptl/descr.h (header): Remove the !__ASSUME_PRIVATE_FUTEX path. * nptl/nptl-init.c (__pthread_initialize_minimal_internal): Likewise. * sysdeps/i386/nptl/tcb-offsets.sym (PRIVATE_FUTEX): Removed. * sysdeps/powerpc/nptl/tcb-offsets.sym (PRIVATE_FUTEX): Likewise. * sysdeps/sh/nptl/tcb-offsets.sym (PRIVATE_FUTEX): Likewise. * sysdeps/x86_64/nptl/tcb-offsets.sym (PRIVATE_FUTEX): Likewise. * sysdeps/i386/nptl/tls.h: (tcbhead_t): Remve the !__ASSUME_PRIVATE_FUTEX path. * sysdeps/s390/nptl/tls.h (tcbhead_t): Likewise. * sysdeps/sparc/nptl/tls.h (tcbhead_t): Likewise. * sysdeps/x86_64/nptl/tls.h (tcbhead_t): Likewise. * sysdeps/unix/sysv/linux/i386/lowlevellock.S: Remove the !__ASSUME_PRIVATE_FUTEX macros. * sysdeps/unix/sysv/linux/lowlevellock-futex.h: Likewise. * sysdeps/unix/sysv/linux/x86_64/cancellation.S: Likewise. * sysdeps/unix/sysv/linux/x86_64/lowlevellock.S: Likewise. * sysdeps/unix/sysv/linux/kernel-features.h (__ASSUME_PRIVATE_FUTEX): Removed.
* [BZ #22637] Fix stack guard size accountingSzabolcs Nagy2018-01-081-0/+4
| | | | | | | | | | | | | | | | | | | | | Previously if user requested S stack and G guard when creating a thread, the total mapping was S and the actual available stack was S - G - static_tls, which is not what the user requested. This patch fixes the guard size accounting by pretending the user requested S+G stack. This way all later logic works out except when reporting the user requested stack size (pthread_getattr_np) or when computing the minimal stack size (__pthread_get_minstack). Normally this will increase thread stack allocations by one page. TLS accounting is not affected, that will require a separate fix. [BZ #22637] * nptl/descr.h (stackblock, stackblock_size): Update comments. * nptl/allocatestack.c (allocate_stack): Add guardsize to stacksize. * nptl/nptl-init.c (__pthread_get_minstack): Remove guardsize from stacksize. * nptl/pthread_getattr_np.c (pthread_getattr_np): Likewise.
* Update copyright dates with scripts/update-copyrights.Joseph Myers2018-01-011-1/+1
| | | | | | | * All files with FSF copyright notices: Update copyright dates using scripts/update-copyrights. * locale/programs/charmap-kw.h: Regenerated. * locale/programs/locfile-kw.h: Likewise.
* nptl: Define __PTHREAD_MUTEX_{NUSERS_AFTER_KIND,USE_UNION}Adhemerval Zanella2017-11-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds two new internal defines to set the internal pthread_mutex_t layout required by the supported ABIS: 1. __PTHREAD_MUTEX_NUSERS_AFTER_KIND which control whether to define __nusers fields before or after __kind. The preferred value for is 0 for new ports and it sets __nusers before __kind. 2. __PTHREAD_MUTEX_USE_UNION which control whether internal __spins and __list members will be place inside an union for linuxthreads compatibility. The preferred value is 0 for ports and it sets to not use an union to define both fields. It fixes the wrong offsets value for __kind value on x86_64-linux-gnu-x32. Checked with a make check run-built-tests=no on all afected ABIs. [BZ #22298] * nptl/allocatestack.c (allocate_stack): Check if __PTHREAD_MUTEX_HAVE_PREV is non-zero, instead if __PTHREAD_MUTEX_HAVE_PREV is defined. * nptl/descr.h (pthread): Likewise. * nptl/nptl-init.c (__pthread_initialize_minimal_internal): Likewise. * nptl/pthread_create.c (START_THREAD_DEFN): Likewise. * sysdeps/nptl/fork.c (__libc_fork): Likewise. * sysdeps/nptl/pthread.h (PTHREAD_MUTEX_INITIALIZER): Likewise. * sysdeps/nptl/bits/thread-shared-types.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): New defines. (__pthread_internal_list): Check __PTHREAD_MUTEX_USE_UNION instead of __WORDSIZE for internal layout. (__pthread_mutex_s): Check __PTHREAD_MUTEX_NUSERS_AFTER_KIND instead of __WORDSIZE for internal __nusers layout and __PTHREAD_MUTEX_USE_UNION instead of __WORDSIZE whether to use an union for __spins and __list fields. (__PTHREAD_MUTEX_HAVE_PREV): Define also for __PTHREAD_MUTEX_USE_UNION case. * sysdeps/aarch64/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): New defines. * sysdeps/alpha/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/arm/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/hppa/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/ia64/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/m68k/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/microblaze/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/mips/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/nios2/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/powerpc/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/s390/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/sh/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/sparc/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/tile/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/x86/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* nptl: Preserve error in setxid thread broadcast in coredumps [BZ #22153]Peter Zelezny2017-10-131-2/+7
|
* nptl: Remove internal_function attributeFlorian Weimer2017-08-311-5/+0
|
* ia64: Fix thread stack allocation permission set (BZ #21672)Adhemerval Zanella2017-08-291-1/+28
| | | | | | | | | | | | | | | | | | | | | | This patch fixes ia64 failures on thread exit by madvise the required area taking in consideration its disjoing stacks (NEED_SEPARATE_REGISTER_STACK). Also the snippet that setup the madvise call to advertise kernel the area won't be used anymore in near future is reallocated in allocatestack.c (for consistency to put all stack management function in one place). Checked on x86_64-linux-gnu and i686-linux-gnu for sanity (since it is not expected code changes for architecture that do not define NEED_SEPARATE_REGISTER_STACK) and also got a report that it fixes ia64-linux-gnu failures from Sergei Trofimovich <slyfox@gentoo.org>. [BZ #21672] * nptl/allocatestack.c [_STACK_GROWS_DOWN] (setup_stack_prot): Set to use !NEED_SEPARATE_REGISTER_STACK as well. (advise_stack_range): New function. * nptl/pthread_create.c (START_THREAD_DEFN): Move logic to mark stack non required to advise_stack_range at allocatestack.c
* NPTL: Remove internal_function from stack marking functionsFlorian Weimer2017-08-131-1/+0
| | | | | These are called across DSO boundaries and therefore should use the ABI calling convention.
* Fix guard alignment in allocate_stack when stack grows up.John David Anglin2017-07-151-2/+8
|
* Clean pthread functions namespaces for C11 threadsAdhemerval Zanella2017-06-231-16/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds internal definition (through {libc_}hidden_{proto,def}) and also change some strong to weak alias for symbols that might be used by C11 threads implementations. The patchset should not change libc/libpthread functional, although object changes are expected (since now internal symbols are used instead) and final exported symbols through GLIBC_PRIVATE is also expanded (to cover libpthread usage of __mmap{64}, __munmap, __mprotect). Checked with a build for all major ABI (aarch64-linux-gnu, alpha-linux-gnu, arm-linux-gnueabi, i386-linux-gnu, ia64-linux-gnu, m68k-linux-gnu, microblaze-linux-gnu [1], mips{64}-linux-gnu, nios2-linux-gnu, powerpc{64le}-linux-gnu, s390{x}-linux-gnu, sparc{64}-linux-gnu, tile{pro,gx}-linux-gnu, and x86_64-linux-gnu). * include/sched.h (__sched_get_priority_max): Add libc hidden proto. (__sched_get_prioriry_min): Likewise. * include/sys/mman.h (__mmap): Likewise. (__mmap64): Likewise. (__munmap): Likewise. (__mprotect): Likewise. * include/termios.h (__tcsetattr): Likewise. * include/time.h (__nanosleep): Use hidden_proto instead of libc_hidden_proto. * posix/nanosleep.c (__nanosleep): Likewise. * misc/Versions (libc): Export __mmap, __munmap, __mprotect, __sched_get_priority_min, and __sched_get_priority_max under GLIBC_PRIVATE. * nptl/allocatestack.c (__free_stacks): Use internal definition for libc symbols. (change_stack_perm): Likewise. (allocate_stack): Likewise. * sysdeps/posix/gethostname.c: Likewise. * nptl/tpp.c (__init_sched_fifo_prio): Likewise. * sysdeps/unix/sysv/linux/i386/smp.h (is_smp_system): Likewise. * sysdeps/unix/sysv/linux/powerpc/ioctl.c (__ioctl): Likewise. * nptl/pthreadP.h (__pthread_mutex_timedlock): Add definition. (__pthread_key_delete): Likewise. (__pthread_detach): Likewise. (__pthread_cancel): Likewise. (__pthread_mutex_trylock): Likewise. (__pthread_mutexattr_init): Likewise. (__pthread_mutexattr_settype): Likewise. * nptl/pthread_cancel.c (pthread_cancel): Change to internal name and create alias for exported one. * nptl/pthread_join.c (pthread_join): Likewise. * nptl/pthread_detach.c (pthread_detach): Likewise. * nptl/pthread_key_delete.c (pthread_key_delete): Likewise. * nptl/pthread_mutex_timedlock.c (pthread_mutex_timedlock): Likewise. * nptl/pthread_create.c: Change static requirements for pthread symbols. * nptl/pthread_equal.c (__pthread_equal): Change strong alias to weak for internal definition. * nptl/pthread_exit.c (__pthread_exit): Likewise. * nptl/pthread_getspecific.c (__pthread_getspecific): Likewise. * nptl/pthread_key_create.c (__pthread_key_create): Likewise. * nptl/pthread_mutex_destroy.c (__pthread_mutex_destroy): Likewise. * nptl/pthread_mutex_init.c (__pthread_mutex_init): Likewise. * nptl/pthread_mutex_lock.c (__pthread_mutex_lock): Likewise. * nptl/pthread_mutex_trylock.c (__pthread_mutex_trylock): Likewise. * nptl/pthread_mutex_unlock.c (__pthread_mutex_unlock): Likewise. * nptl/pthread_mutexattr_init.c (__pthread_mutexattr_init): Likwise. * nptl/pthread_mutexattr_settype.c (__pthread_mutexattr_settype): Likewise. * nptl/pthread_self.c (__pthread_self): Likewise. * nptl/pthread_setspecific.c (__pthread_setspecific): Likewise. * sysdeps/unix/sysv/linux/tcsetattr.c (tcsetattr): Likewise. * misc/mmap.c (__mmap): Add internal symbol definition. * misc/mmap.c (__mmap64): Likewise. * sysdeps/unix/sysv/linux/mmap.c (__mmap): Likewise. * sysdeps/unix/sysv/linux/mmap64.c (__mmap): Likewise. (__mmap64): Likewise. * sysdeps/unix/sysv/linux/i386/Versions (libc) [GLIBC_PRIVATE): Add __uname.
* nptl: Invert the mmap/mprotect logic on allocated stacks (BZ#18988)Adhemerval Zanella2017-06-141-8/+58
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current allocate_stack logic for create stacks is to first mmap all the required memory with the desirable memory and then mprotect the guard area with PROT_NONE if required. Although it works as expected, it pessimizes the allocation because it requires the kernel to actually increase commit charge (it counts against the available physical/swap memory available for the system). The only issue is to actually check this change since side-effects are really Linux specific and to actually account them it would require a kernel specific tests to parse the system wide information. On the kernel I checked /proc/self/statm does not show any meaningful difference for vmm and/or rss before and after thread creation. I could only see really meaningful information checking on system wide /proc/meminfo between thread creation: MemFree, MemAvailable, and Committed_AS shows large difference without the patch. I think trying to use these kind of information on a testcase is fragile. The BZ#18988 reports shows that the commit pages are easily seen with mlockall (MCL_FUTURE) (with lock all pages that become mapped in the process) however a more straighfoward testcase shows that pthread_create could be faster using this patch: -- static const int inner_count = 256; static const int outer_count = 128; static void *thread1(void *arg) { return NULL; } static void *sleeper(void *arg) { pthread_t ts[inner_count]; for (int i = 0; i < inner_count; i++) pthread_create (&ts[i], &a, thread1, NULL); for (int i = 0; i < inner_count; i++) pthread_join (ts[i], NULL); return NULL; } int main(void) { pthread_attr_init(&a); pthread_attr_setguardsize(&a, 1<<20); pthread_attr_setstacksize(&a, 1134592); pthread_t ts[outer_count]; for (int i = 0; i < outer_count; i++) pthread_create(&ts[i], &a, sleeper, NULL); for (int i = 0; i < outer_count; i++) pthread_join(ts[i], NULL); assert(r == 0); } return 0; } -- On x86_64 (4.4.0-45-generic, gcc 5.4.0) running the small benchtests I see: $ time ./test real 0m3.647s user 0m0.080s sys 0m11.836s While with the patch I see: $ time ./test real 0m0.696s user 0m0.040s sys 0m1.152s So I added a pthread_create benchtest (thread_create) which check the thread creation latency. As for the simple benchtests, I saw improvements in thread creation on all architectures I tested the change. Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu, arm-linux-gnueabihf, powerpc64le-linux-gnu, sparc64-linux-gnu, and sparcv9-linux-gnu. [BZ #18988] * benchtests/thread_create-inputs: New file. * benchtests/thread_create-source.c: Likewise. * support/xpthread_attr_setguardsize.c: Likewise. * support/Makefile (libsupport-routines): Add xpthread_attr_setguardsize object. * support/xthread.h: Add xpthread_attr_setguardsize prototype. * benchtests/Makefile (bench-pthread): Add thread_create. * nptl/allocatestack.c (allocate_stack): Call mmap with PROT_NONE and then mprotect the required area.
* nptl: Remove COLORING_INCREMENTAdhemerval Zanella2017-02-061-38/+2
| | | | | | | | | | | This patch removes the COLORING_INCREMENT define and usage on allocatestack.c. It has not been used since 564cd8b67ec487f (glibc-2.3.3) by any architecture. The idea is to simplify the code by removing obsolete code. * nptl/allocatestack.c [COLORING_INCREMENT] (nptl_ncreated): Remove. (allocate_stack): Remove COLORING_INCREMENT usage. * nptl/stack-aliasing.h (COLORING_INCREMENT). Likewise. * sysdeps/i386/i686/stack-aliasing.h (COLORING_INCREMENT): Likewise.
* Bug 20915: Do not initialize DTV of other threads.Alexandre Oliva2017-02-031-5/+0
| | | | | | | | | | | | | | | In _dl_nothread_init_static_tls() and init_one_static_tls() we must not touch the DTV of other threads since we do not have ownership of them. The DTV need not be initialized at this point anyway since only LD/GD accesses will use them. If LD/GD accesses occur they will take care to initialize their own thread's DTV. Concurrency comments were removed from the patch since they need to be reworked along with a full description of DTV ownership and when it is or is not safe to modify these structures. Alexandre Oliva's original patch and discussion: https://sourceware.org/ml/libc-alpha/2016-09/msg00512.html
* Update copyright dates with scripts/update-copyrights.Joseph Myers2017-01-011-1/+1
|
* Remove cached PID/TID in cloneAdhemerval Zanella2016-11-241-18/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch remove the PID cache and usage in current GLIBC code. Current usage is mainly used a performance optimization to avoid the syscall, however it adds some issues: - The exposed clone syscall will try to set pid/tid to make the new thread somewhat compatible with current GLIBC assumptions. This cause a set of issue with new workloads and usecases (such as BZ#17214 and [1]) as well for new internal usage of clone to optimize other algorithms (such as clone plus CLONE_VM for posix_spawn, BZ#19957). - The caching complexity also added some bugs in the past [2] [3] and requires more effort of each port to handle such requirements (for both clone and vfork implementation). - Caching performance gain in mainly on getpid and some specific code paths. The getpid performance leverage is questionable [4], either by the idea of getpid being a hotspot as for the getpid implementation itself (if it is indeed a justifiable hotspot a vDSO symbol could let to a much more simpler solution). Other usage is mainly for non usual code paths, such as pthread cancellation signal and handling. For thread creation (on stack allocation) the code simplification in fact adds some performance gain due the no need of transverse the stack cache and invalidate each element pid. Other thread usages will require a direct getpid syscall, such as cancellation/setxid signal, thread cancellation, thread fail path (at create_thread), and thread signal (pthread_kill and pthread_sigqueue). However these are hardly usual hotspots and I think adding a syscall is justifiable. It also simplifies both the clone and vfork arch-specific implementation. And by review each fork implementation there are some discrepancies that this patch also solves: - microblaze clone/vfork does not set/reset the pid/tid field - hppa uses the default vfork implementation that fallback to fork. Since vfork is deprecated I do not think we should bother with it. The patch also removes the TID caching in clone. My understanding for such semantic is try provide some pthread usage after a user program issue clone directly (as done by thread creation with CLONE_PARENT_SETTID and pthread tid member). However, as stated before in multiple discussions threads, GLIBC provides clone syscalls without further supporting all this semantics. I ran a full make check on x86_64, x32, i686, armhf, aarch64, and powerpc64le. For sparc32, sparc64, and mips I ran the basic fork and vfork tests from posix/ folder (on a qemu system). So it would require further testing on alpha, hppa, ia64, m68k, nios2, s390, sh, and tile (I excluded microblaze because it is already implementing the patch semantic regarding clone/vfork). [1] https://codereview.chromium.org/800183004/ [2] https://sourceware.org/ml/libc-alpha/2006-07/msg00123.html [3] https://sourceware.org/bugzilla/show_bug.cgi?id=15368 [4] http://yarchive.net/comp/linux/getpid_caching.html * sysdeps/nptl/fork.c (__libc_fork): Remove pid cache setting. * nptl/allocatestack.c (allocate_stack): Likewise. (__reclaim_stacks): Likewise. (setxid_signal_thread): Obtain pid through syscall. * nptl/nptl-init.c (sigcancel_handler): Likewise. (sighandle_setxid): Likewise. * nptl/pthread_cancel.c (pthread_cancel): Likewise. * sysdeps/unix/sysv/linux/pthread_kill.c (__pthread_kill): Likewise. * sysdeps/unix/sysv/linux/pthread_sigqueue.c (pthread_sigqueue): Likewise. * sysdeps/unix/sysv/linux/createthread.c (create_thread): Likewise. * sysdeps/unix/sysv/linux/getpid.c: Remove file. * nptl/descr.h (struct pthread): Change comment about pid value. * nptl/pthread_getattr_np.c (pthread_getattr_np): Remove thread pid assert. * sysdeps/unix/sysv/linux/pthread-pids.h (__pthread_initialize_pids): Do not set pid value. * nptl_db/td_ta_thr_iter.c (iterate_thread_list): Remove thread pid cache check. * nptl_db/td_thr_validate.c (td_thr_validate): Likewise. * sysdeps/aarch64/nptl/tcb-offsets.sym: Remove pid offset. * sysdeps/alpha/nptl/tcb-offsets.sym: Likewise. * sysdeps/arm/nptl/tcb-offsets.sym: Likewise. * sysdeps/hppa/nptl/tcb-offsets.sym: Likewise. * sysdeps/i386/nptl/tcb-offsets.sym: Likewise. * sysdeps/ia64/nptl/tcb-offsets.sym: Likewise. * sysdeps/m68k/nptl/tcb-offsets.sym: Likewise. * sysdeps/microblaze/nptl/tcb-offsets.sym: Likewise. * sysdeps/mips/nptl/tcb-offsets.sym: Likewise. * sysdeps/nios2/nptl/tcb-offsets.sym: Likewise. * sysdeps/powerpc/nptl/tcb-offsets.sym: Likewise. * sysdeps/s390/nptl/tcb-offsets.sym: Likewise. * sysdeps/sh/nptl/tcb-offsets.sym: Likewise. * sysdeps/sparc/nptl/tcb-offsets.sym: Likewise. * sysdeps/tile/nptl/tcb-offsets.sym: Likewise. * sysdeps/x86_64/nptl/tcb-offsets.sym: Likewise. * sysdeps/unix/sysv/linux/aarch64/clone.S: Remove pid and tid caching. * sysdeps/unix/sysv/linux/alpha/clone.S: Likewise. * sysdeps/unix/sysv/linux/arm/clone.S: Likewise. * sysdeps/unix/sysv/linux/hppa/clone.S: Likewise. * sysdeps/unix/sysv/linux/i386/clone.S: Likewise. * sysdeps/unix/sysv/linux/ia64/clone2.S: Likewise. * sysdeps/unix/sysv/linux/mips/clone.S: Likewise. * sysdeps/unix/sysv/linux/nios2/clone.S: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/clone.S: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/clone.S: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/clone.S: Likewise. * sysdeps/unix/sysv/linux/sh/clone.S: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/clone.S: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/clone.S: Likewise. * sysdeps/unix/sysv/linux/tile/clone.S: Likewise. * sysdeps/unix/sysv/linux/x86_64/clone.S: Likewise. * sysdeps/unix/sysv/linux/aarch64/vfork.S: Remove pid set and reset. * sysdeps/unix/sysv/linux/alpha/vfork.S: Likewise. * sysdeps/unix/sysv/linux/arm/vfork.S: Likewise. * sysdeps/unix/sysv/linux/i386/vfork.S: Likewise. * sysdeps/unix/sysv/linux/ia64/vfork.S: Likewise. * sysdeps/unix/sysv/linux/m68k/clone.S: Likewise. * sysdeps/unix/sysv/linux/m68k/vfork.S: Likewise. * sysdeps/unix/sysv/linux/mips/vfork.S: Likewise. * sysdeps/unix/sysv/linux/nios2/vfork.S: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/vfork.S: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/vfork.S: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/vfork.S: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/vfork.S: Likewise. * sysdeps/unix/sysv/linux/sh/vfork.S: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/vfork.S: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/vfork.S: Likewise. * sysdeps/unix/sysv/linux/tile/vfork.S: Likewise. * sysdeps/unix/sysv/linux/x86_64/vfork.S: Likewise. * sysdeps/unix/sysv/linux/tst-clone2.c (f): Remove direct pthread struct access. (clone_test): Remove function. (do_test): Rewrite to take in consideration pid is not cached anymore.
* [PR19826] fix non-LE TLS in static programsAlexandre Oliva2016-09-211-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | An earlier fix for TLS dropped early initialization of DTV entries for modules using static TLS, leaving it for __tls_get_addr to set them up. That worked on platforms that require the GD access model to be relaxed to LE in the main executable, but it caused a regression on platforms that allow GD in the main executable, particularly in statically-linked programs: they use a custom __tls_get_addr that does not update the DTV, which fails when the DTV early initialization is not performed. In static programs, __libc_setup_tls performs the DTV initialization for the main thread, but the DTV of other threads is set up in _dl_allocate_tls_init, so that's the fix that matters. Restoring the initialization in the remaining functions modified by this patch was just for uniformity. It's not clear that it is ever needed: even on platforms that allow GD in the main executable, the dynamically-linked version of __tls_get_addr would set up the DTV entries, even for static TLS modules, while updating the DTV counter. for ChangeLog [BZ #19826] * elf/dl-tls.c (_dl_allocate_tls_init): Restore DTV early initialization of static TLS entries. * elf/dl-reloc.c (_dl_nothread_init_static_tls): Likewise. * nptl/allocatestack.c (init_one_static_tls): Likewise.
* elf: Avoid using memalign for TLS allocations [BZ #17730]Florian Weimer2016-08-031-3/+1
| | | | | | | | | | | | Instead of a flag which indicates the pointer can be freed, dtv_t now includes the pointer which should be freed. Due to padding, the size of dtv_t does not increase. To avoid using memalign, the new allocate_dtv_entry function allocates a sufficiently large buffer so that a sub-buffer can be found in it which starts with an aligned pointer. Both the aligned and original pointers are kept, the latter for calling free later.
* nptl: support thread stacks that grow upCarlos O'Donell2016-02-191-7/+13
| | | | | | Gentoo has been carrying this for all arches since 2.17. URL: http://bugs.gentoo.org/301642
* Update copyright dates with scripts/update-copyrights.Joseph Myers2016-01-041-1/+1
|
* nptl: fix set-but-unused warning w/_STACK_GROWS_UPMike Frysinger2015-08-051-7/+10
| | | | | | On arches that set _STACK_GROWS_UP, the stacktop variable is declared and set, but never actually used. Refactor the code a bit so that the variable is only declared/set under _STACK_GROWS_DOWN settings.
* Add and use new glibc-internal futex API.Torvald Riegel2015-07-101-6/+10
| | | | | | | | | | | | | | | | | | | | This adds new functions for futex operations, starting with wait, abstimed_wait, reltimed_wait, wake. They add documentation and error checking according to the current draft of the Linux kernel futex manpage. Waiting with absolute or relative timeouts is split into separate functions. This allows for removing a few cases of code duplication in pthreads code, which uses absolute timeouts; also, it allows us to put platform-specific code to go from an absolute to a relative timeout into the platform-specific futex abstractions.. Futex operations that can be canceled are also split out into separate functions suffixed by "_cancelable". There are separate versions for both Linux and NaCl; while they currently differ only slightly, my expectation is that the separate versions of lowlevellock-futex.h will eventually be merged into futex-internal.h when we get to move the lll_ functions over to the new futex API.
* Fix DTV race, assert, DTV_SURPLUS Static TLS limit, and nptl_db garbageAlexandre Oliva2015-03-171-6/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | for ChangeLog [BZ #17090] [BZ #17620] [BZ #17621] [BZ #17628] * NEWS: Update. * elf/dl-tls.c (_dl_update_slotinfo): Clean up outdated DTV entries with Static TLS too. Skip entries past the end of the allocated DTV, from Alan Modra. (tls_get_addr_tail): Update to glibc_likely/unlikely. Move Static TLS DTV entry set up from... (_dl_allocate_tls_init): ... here (fix modid assertion), ... * elf/dl-reloc.c (_dl_nothread_init_static_tls): ... here... * nptl/allocatestack.c (init_one_static_tls): ... and here... * elf/dlopen.c (dl_open_worker): Drop l_tls_modid upper bound for Static TLS. * elf/tlsdeschtab.h (map_generation): Return size_t. Check that the slot we find is associated with the given map before using its generation count. * nptl_db/db_info.c: Include ldsodefs.h. (rtld_global, dtv_slotinfo_list, dtv_slotinfo): New typedefs. * nptl_db/structs.def (DB_RTLD_VARIABLE): New macro. (DB_MAIN_VARIABLE, DB_RTLD_GLOBAL_FIELD): Likewise. (link_map::l_tls_offset): New struct field. (dtv_t::counter): Likewise. (rtld_global): New struct. (_rtld_global): New rtld variable. (dl_tls_dtv_slotinfo_list): New rtld global field. (dtv_slotinfo_list): New struct. (dtv_slotinfo): Likewise. * nptl_db/td_symbol_list.c: Drop gnu/lib-names.h include. (td_lookup): Rename to... (td_mod_lookup): ... this. Use new mod parameter instead of LIBPTHREAD_SO. * nptl_db/td_thr_tlsbase.c: Include link.h. (dtv_slotinfo_list, dtv_slotinfo): New functions. (td_thr_tlsbase): Check DTV generation. Compute Static TLS addresses even if the DTV is out of date or missing them. * nptl_db/fetch-value.c (_td_locate_field): Do not refuse to index zero-length arrays. * nptl_db/thread_dbP.h: Include gnu/lib-names.h. (td_lookup): Make it a macro implemented in terms of... (td_mod_lookup): ... this declaration. * nptl_db/db-symbols.awk (DB_RTLD_VARIABLE): Override. (DB_MAIN_VARIABLE): Likewise.
* Update copyright dates with scripts/update-copyrights.Joseph Myers2015-01-021-1/+1
|
* NPTL: Clean up THREAD_SYSINFO macros.Roland McGrath2014-10-171-4/+2
|
* NPTL: Conditionalize more uses of SIGCANCEL and SIGSETXID.Roland McGrath2014-10-171-0/+3
|
* nptl: Fix abort in case of set*id failure [BZ #17135]Florian Weimer2014-07-111-2/+25
| | | | | | | | | | If a call to the set*id functions fails in a multi-threaded program, the abort introduced in commit 13f7fe35ae2b0ea55dc4b9628763aafdc8bdc30c was triggered. We address by checking that all calls to set*id on all threads give the same result, and only abort if we see success followed by failure (or vice versa).
* Clean up stack-coloring macros.Roland McGrath2014-06-201-0/+1
|
* Inline nested function check_listSiddhesh Poyarekar2014-06-061-20/+17
|
* Use glibc_likely instead __builtin_expect.Ondřej Bílka2014-02-101-7/+7
|
* Revert "Patch 3/4 of the effort to make TLS access async-signal-safe."Allan McRae2014-02-061-1/+5
| | | | This reverts commit 35e8f7ab94c910659de9d507aa0f3e1f8973d914.
* Revert "Async-signal safe TLS."Allan McRae2014-02-061-9/+4
| | | | | | | | | This reverts commit 7f507ee17aee720fa423fa38502bc3caa0dd03d7. Conflicts: ChangeLog nptl/tst-tls7.c nptl/tst-tls7mod.c
* Async-signal safe TLS.Andrew Hunter2014-01-031-4/+9
| | | | | | | | | | | | | | | | | | | | | | ChangeLog: 2014-01-03 Andrew Hunter <ahh@google.com> * elf/dl-open.c (): New comment. * elf/dl-reloc.c (_dl_try_allocate_static_tls): Use atomic_compare_and_exchange_bool_acq (_dl_allocate_static_tls): Block signals. * elf/dl-tls.c (allocate_and_init): Return void. (_dl_update_slotinfo): Block signals, use atomic update. nptl/ChangeLog: 2014-01-03 Andrew Hunter <ahh@google.com> * nptl/Makefile (tst-tls7): New test. * nptl/tst-tls7.c: New file. * nptl/tst-tls7mod.c: New file. * nptl/allocatestack.c (init_one_static_tls): Use atomic barrier.
* Update copyright notices with scripts/update-copyrightsAllan McRae2014-01-011-1/+1
|
* Patch 3/4 of the effort to make TLS access async-signal-safe.Paul Pluzhnikov2013-12-181-5/+1
| | | | | | | | | | | Factor out _dl_clear_dtv. 2013-12-18 Andrew Hunter <ahh@google.com> * elf/Versions (ld): Add _dl_clear_dtv. * sysdeps/generic/ldsodefs.h (_dl_clear_dtv): New prototype. * elf/dl-tls.c (_dl_clear_dtv): New function. * nptl/allocatestack.c (get_cached_stack): Call _dl_clear_dtv.
* New API to set default thread attributesSiddhesh Poyarekar2013-06-151-2/+10
| | | | | | | This patch introduces two new convenience functions to set the default thread attributes used for creating threads. This allows a programmer to set the default thread attributes just once in a process and then run pthread_create without additional attributes.
* Move __default_stacksize into __default_pthread_attrSiddhesh Poyarekar2013-03-191-1/+1
| | | | | Make __default_pthread_attr object to store default attribute values for threads.
* Remove unnecessary assert on attr in allocate_stack().Carlos O'Donell2013-01-111-1/+4
|
* Update copyright notices with scripts/update-copyrights.Joseph Myers2013-01-021-1/+1
|
* Remove __ASSUME_TGKILL.Joseph Myers2012-08-081-11/+1
|
* Replace FSF snail mail address with URLs.Paul Eggert2012-02-091-3/+2
|
* Return errno on failure in allocate_stackCarlos O'Donell2011-12-141-12/+4
| | | | | | In cases where a function call fails return errno and allow the caller to fixup the return code as required by their API.
* Avoid race between {,__de}allocate_stack and __reclaim_stacks during forkAndreas Schwab2011-09-151-0/+1
|
* Fix setxid race handling exiting threadsAndreas Schwab2011-08-311-1/+10
|
* Quash a warning in nptl/allocatestack.cRoland McGrath2011-07-141-1/+1
|