mirror/glibc - mirror of git://sourceware.org/git/glibc.git

	Commit message (Collapse)	Author	Age	Files	Lines
*	New pthread rwlock that is more scalable.	Torvald Riegel	2017-01-10	1	-14/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This replaces the pthread rwlock with a new implementation that uses a more scalable algorithm (primarily through not using a critical section anymore to make state changes). The fast path for rdlock acquisition and release is now basically a single atomic read-modify write or CAS and a few branches. See nptl/pthread_rwlock_common.c for details. * nptl/DESIGN-rwlock.txt: Remove. * nptl/lowlevelrwlock.sym: Remove. * nptl/Makefile: Add new tests. * nptl/pthread_rwlock_common.c: New file. Contains the new rwlock. * nptl/pthreadP.h (PTHREAD_RWLOCK_PREFER_READER_P): Remove. (PTHREAD_RWLOCK_WRPHASE, PTHREAD_RWLOCK_WRLOCKED, PTHREAD_RWLOCK_RWAITING, PTHREAD_RWLOCK_READER_SHIFT, PTHREAD_RWLOCK_READER_OVERFLOW, PTHREAD_RWLOCK_WRHANDOVER, PTHREAD_RWLOCK_FUTEX_USED): New. * nptl/pthread_rwlock_init.c (__pthread_rwlock_init): Adapt to new implementation. * nptl/pthread_rwlock_rdlock.c (__pthread_rwlock_rdlock_slow): Remove. (__pthread_rwlock_rdlock): Adapt. * nptl/pthread_rwlock_timedrdlock.c (pthread_rwlock_timedrdlock): Adapt. * nptl/pthread_rwlock_timedwrlock.c (pthread_rwlock_timedwrlock): Adapt. * nptl/pthread_rwlock_trywrlock.c (pthread_rwlock_trywrlock): Adapt. * nptl/pthread_rwlock_tryrdlock.c (pthread_rwlock_tryrdlock): Adapt. * nptl/pthread_rwlock_unlock.c (pthread_rwlock_unlock): Adapt. * nptl/pthread_rwlock_wrlock.c (__pthread_rwlock_wrlock_slow): Remove. (__pthread_rwlock_wrlock): Adapt. * nptl/tst-rwlock10.c: Adapt. * nptl/tst-rwlock11.c: Adapt. * nptl/tst-rwlock17.c: New file. * nptl/tst-rwlock18.c: New file. * nptl/tst-rwlock19.c: New file. * nptl/tst-rwlock2b.c: New file. * nptl/tst-rwlock8.c: Adapt. * nptl/tst-rwlock9.c: Adapt. * sysdeps/aarch64/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Adapt. * sysdeps/arm/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Adapt. * sysdeps/hppa/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Adapt. * sysdeps/ia64/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Adapt. * sysdeps/m68k/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Adapt. * sysdeps/microblaze/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Adapt. * sysdeps/mips/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Adapt. * sysdeps/nios2/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Adapt. * sysdeps/s390/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Adapt. * sysdeps/sh/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Adapt. * sysdeps/sparc/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Adapt. * sysdeps/tile/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Adapt. * sysdeps/unix/sysv/linux/alpha/bits/pthreadtypes.h (pthread_rwlock_t): Adapt. * sysdeps/unix/sysv/linux/powerpc/bits/pthreadtypes.h (pthread_rwlock_t): Adapt. * sysdeps/x86/bits/pthreadtypes.h (pthread_rwlock_t): Adapt. * nptl/nptl-printers.py (): Adapt. * nptl/nptl_lock_constants.pysym: Adapt. * nptl/test-rwlock-printers.py: Adapt. * nptl/test-rwlockattr-printers.c: Adapt. * nptl/test-rwlockattr-printers.py: Adapt.
*	Update copyright dates with scripts/update-copyrights.	Joseph Myers	2017-01-01	259	-259/+259
\|
*	New condvar implementation that provides stronger ordering guarantees.	Torvald Riegel	2016-12-31	1	-8/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a new implementation for condition variables, required after http://austingroupbugs.net/view.php?id=609 to fix bug 13165. In essence, we need to be stricter in which waiters a signal or broadcast is required to wake up; this couldn't be solved using the old algorithm. ISO C++ made a similar clarification, so this also fixes a bug in current libstdc++, for example. We can't use the old algorithm anymore because futexes do not guarantee to wake in FIFO order. Thus, when we wake, we can't simply let any waiter grab a signal, but we need to ensure that one of the waiters happening before the signal is woken up. This is something the previous algorithm violated (see bug 13165). There's another issue specific to condvars: ABA issues on the underlying futexes. Unlike mutexes that have just three states, or semaphores that have no tokens or a limited number of them, the state of a condvar is the order of the waiters. A waiter on a semaphore can grab a token whenever one is available; a condvar waiter must only consume a signal if it is eligible to do so as determined by the relative order of the waiter and the signal. Therefore, this new algorithm maintains two groups of waiters: Those eligible to consume signals (G1), and those that have to wait until previous waiters have consumed signals (G2). Once G1 is empty, G2 becomes the new G1. 64b counters are used to avoid ABA issues. This condvar doesn't yet use a requeue optimization (ie, on a broadcast, waking just one thread and requeueing all others on the futex of the mutex supplied by the program). I don't think doing the requeue is necessarily the right approach (but I haven't done real measurements yet): * If a program expects to wake many threads at the same time and make that scalable, a condvar isn't great anyway because of how it requires waiters to operate mutually exclusive (due to the mutex usage). Thus, a thundering herd problem is a scalability problem with or without the optimization. Using something like a semaphore might be more appropriate in such a case. * The scalability problem is actually at the mutex side; the condvar could help (and it tries to with the requeue optimization), but it should be the mutex who decides how that is done, and whether it is done at all. * Forcing all but one waiter into the kernel-side wait queue of the mutex prevents/avoids the use of lock elision on the mutex. Thus, it prevents the only cure against the underlying scalability problem inherent to condvars. * If condvars use short critical sections (ie, hold the mutex just to check a binary flag or such), which they should do ideally, then forcing all those waiter to proceed serially with kernel-based hand-off (ie, futex ops in the mutex' contended state, via the futex wait queues) will be less efficient than just letting a scalable mutex implementation take care of it. Our current mutex impl doesn't employ spinning at all, but if critical sections are short, spinning can be much better. * Doing the requeue stuff requires all waiters to always drive the mutex into the contended state. This leads to each waiter having to call futex_wake after lock release, even if this wouldn't be necessary. [BZ #13165] * nptl/pthread_cond_broadcast.c (__pthread_cond_broadcast): Rewrite to use new algorithm. * nptl/pthread_cond_destroy.c (__pthread_cond_destroy): Likewise. * nptl/pthread_cond_init.c (__pthread_cond_init): Likewise. * nptl/pthread_cond_signal.c (__pthread_cond_signal): Likewise. * nptl/pthread_cond_wait.c (__pthread_cond_wait): Likewise. (__pthread_cond_timedwait): Move here from pthread_cond_timedwait.c. (__condvar_confirm_wakeup, __condvar_cancel_waiting, __condvar_cleanup_waiting, __condvar_dec_grefs, __pthread_cond_wait_common): New. (__condvar_cleanup): Remove. * npt/pthread_condattr_getclock.c (pthread_condattr_getclock): Adapt. * npt/pthread_condattr_setclock.c (pthread_condattr_setclock): Likewise. * npt/pthread_condattr_getpshared.c (pthread_condattr_getpshared): Likewise. * npt/pthread_condattr_init.c (pthread_condattr_init): Likewise. * nptl/tst-cond1.c: Add comment. * nptl/tst-cond20.c (do_test): Adapt. * nptl/tst-cond22.c (do_test): Likewise. * sysdeps/aarch64/nptl/bits/pthreadtypes.h (pthread_cond_t): Adapt structure. * sysdeps/arm/nptl/bits/pthreadtypes.h (pthread_cond_t): Likewise. * sysdeps/ia64/nptl/bits/pthreadtypes.h (pthread_cond_t): Likewise. * sysdeps/m68k/nptl/bits/pthreadtypes.h (pthread_cond_t): Likewise. * sysdeps/microblaze/nptl/bits/pthreadtypes.h (pthread_cond_t): Likewise. * sysdeps/mips/nptl/bits/pthreadtypes.h (pthread_cond_t): Likewise. * sysdeps/nios2/nptl/bits/pthreadtypes.h (pthread_cond_t): Likewise. * sysdeps/s390/nptl/bits/pthreadtypes.h (pthread_cond_t): Likewise. * sysdeps/sh/nptl/bits/pthreadtypes.h (pthread_cond_t): Likewise. * sysdeps/tile/nptl/bits/pthreadtypes.h (pthread_cond_t): Likewise. * sysdeps/unix/sysv/linux/alpha/bits/pthreadtypes.h (pthread_cond_t): Likewise. * sysdeps/unix/sysv/linux/powerpc/bits/pthreadtypes.h (pthread_cond_t): Likewise. * sysdeps/x86/bits/pthreadtypes.h (pthread_cond_t): Likewise. * sysdeps/nptl/internaltypes.h (COND_NWAITERS_SHIFT): Remove. (COND_CLOCK_BITS): Adapt. * sysdeps/nptl/pthread.h (PTHREAD_COND_INITIALIZER): Adapt. * nptl/pthreadP.h (__PTHREAD_COND_CLOCK_MONOTONIC_MASK, __PTHREAD_COND_SHARED_MASK): New. * nptl/nptl-printers.py (CLOCK_IDS): Remove. (ConditionVariablePrinter, ConditionVariableAttributesPrinter): Adapt. * nptl/nptl_lock_constants.pysym: Adapt. * nptl/test-cond-printers.py: Adapt. * sysdeps/unix/sysv/linux/hppa/internaltypes.h (cond_compat_clear, cond_compat_check_and_clear): Adapt. * sysdeps/unix/sysv/linux/hppa/pthread_cond_timedwait.c: Remove file ... * sysdeps/unix/sysv/linux/hppa/pthread_cond_wait.c (__pthread_cond_timedwait): ... and move here. * nptl/DESIGN-condvar.txt: Remove file. * nptl/lowlevelcond.sym: Likewise. * nptl/pthread_cond_timedwait.c: Likewise. * sysdeps/unix/sysv/linux/i386/i486/pthread_cond_broadcast.S: Likewise. * sysdeps/unix/sysv/linux/i386/i486/pthread_cond_signal.S: Likewise. * sysdeps/unix/sysv/linux/i386/i486/pthread_cond_timedwait.S: Likewise. * sysdeps/unix/sysv/linux/i386/i486/pthread_cond_wait.S: Likewise. * sysdeps/unix/sysv/linux/i386/i586/pthread_cond_broadcast.S: Likewise. * sysdeps/unix/sysv/linux/i386/i586/pthread_cond_signal.S: Likewise. * sysdeps/unix/sysv/linux/i386/i586/pthread_cond_timedwait.S: Likewise. * sysdeps/unix/sysv/linux/i386/i586/pthread_cond_wait.S: Likewise. * sysdeps/unix/sysv/linux/i386/i686/pthread_cond_broadcast.S: Likewise. * sysdeps/unix/sysv/linux/i386/i686/pthread_cond_signal.S: Likewise. * sysdeps/unix/sysv/linux/i386/i686/pthread_cond_timedwait.S: Likewise. * sysdeps/unix/sysv/linux/i386/i686/pthread_cond_wait.S: Likewise. * sysdeps/unix/sysv/linux/x86_64/pthread_cond_broadcast.S: Likewise. * sysdeps/unix/sysv/linux/x86_64/pthread_cond_signal.S: Likewise. * sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S: Likewise. * sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S: Likewise.
*	Fix typos in the spelling of "implementation"	Dmitry V. Levin	2016-12-27	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	Apply the following spelling fix: $ git grep -El 'implemetn?ation' \| xargs sed -ri 's/implemetn?ation/implementation/g' [BZ #19514] * resolv/res_send.c: Fix typo in comment. * sysdeps/i386/i386-mcount.S: Likewise. * sysdeps/s390/s390-32/s390-mcount.S: Likewise. * sysdeps/s390/s390-64/s390x-mcount.S: Likewise. * sysdeps/sparc/sparc-mcount.S: Likewise.
*	Refactor long double information into bits/long-double.h.	Joseph Myers	2016-12-14	3	-42/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Information about whether the ABI of long double is the same as that of double is split between bits/mathdef.h and bits/wordsize.h. When the ABIs are the same, bits/mathdef.h defines __NO_LONG_DOUBLE_MATH. In addition, in the case where the same glibc binary supports both -mlong-double-64 and -mlong-double-128, bits/wordsize.h defines __LONG_DOUBLE_MATH_OPTIONAL, along with __NO_LONG_DOUBLE_MATH if this particular compilation is with -mlong-double-64. As part of the refactoring I proposed in <https://sourceware.org/ml/libc-alpha/2016-11/msg00745.html>, this patch puts all that information in a single header, bits/long-double.h. It is included from sys/cdefs.h alongside the include of bits/wordsize.h, so other headers generally do not need to include bits/long-double.h directly. Previously, various bits/mathdef.h headers and bits/wordsize.h headers had this long double information (including implicitly in some bits/mathdef.h headers through not having the defines present in the default version). After the patch, it's all in six bits/long-double.h headers. Furthermore, most of those new headers are not architecture-specific. Architectures with optional long double all use the ldbl-opt sysdeps directory, either in the order (ldbl-64-128, ldbl-opt, ldbl-128) or (ldbl-128ibm, ldbl-opt). Thus a generic header for the case where long double = double, and headers in ldbl-128, ldbl-96 and ldbl-opt, suffices to cover every architecture except for cases where long double properties vary between different ABIs sharing a set of installed headers; fortunately all the ldbl-opt cases share a single compiler-predefined macro __LONG_DOUBLE_128__ that can be used to tell whether this compilation is -mlong-double-64 or -mlong-double-128. The two cases where a set of headers is shared between ABIs with different long double properties, MIPS (o32 has long double = double, other ABIs use ldbl-128) and SPARC (32-bit has optional long double, 64-bit has required long double), need their own bits/long-double.h headers. As with bits/wordsize.h, multiple-include protection for this header is generally implicit through the include guards on sys/cdefs.h, and multiple inclusion is harmless in any case. There is one subtlety: the header must not define __LONG_DOUBLE_MATH_OPTIONAL if __NO_LONG_DOUBLE_MATH was defined before its inclusion, because doing so breaks how sysdeps/ieee754/ldbl-opt/nldbl-compat.h defines __NO_LONG_DOUBLE_MATH itself before including system headers. Subject to keeping that working, it would be reasonable to move these macros from defined/undefined #ifdef to always-defined 1/0 #if semantics, but this patch does not attempt to do so, just rearranges where the macros are defined. After this patch, the only use of bits/mathdef.h is the alpha one for modifying complex function ABIs for old GCC. Thus, all versions of the header other than the default and alpha versions are removed, as is the include from math.h. Tested for x86_64 and x86. Also did compilation-only testing with build-many-glibcs.py. * bits/long-double.h: New file. * sysdeps/ieee754/ldbl-128/bits/long-double.h: Likewise. * sysdeps/ieee754/ldbl-96/bits/long-double.h: Likewise. * sysdeps/ieee754/ldbl-opt/bits/long-double.h: Likewise. * sysdeps/mips/bits/long-double.h: Likewise. * sysdeps/unix/sysv/linux/sparc/bits/long-double.h: Likewise. * math/Makefile (headers): Add bits/long-double.h. * misc/sys/cdefs.h: Include <bits/long-double.h>. * stdlib/strtold.c: Include <bits/long-double.h> instead of <bits/wordsize.h>. * bits/mathdef.h [!_COMPLEX_H]: Do not allow inclusion. [!__NO_LONG_DOUBLE_MATH]: Remove conditional code. * math/math.h: Do not include <bits/mathdef.h>. * sysdeps/aarch64/bits/mathdef.h: Remove file. * sysdeps/alpha/bits/mathdef.h [!_COMPLEX_H]: Do not allow inclusion. * sysdeps/ia64/bits/mathdef.h: Remove file. * sysdeps/m68k/m680x0/bits/mathdef.h: Likewise. * sysdeps/mips/bits/mathdef.h: Likewise. * sysdeps/powerpc/bits/mathdef.h: Likewise. * sysdeps/s390/bits/mathdef.h: Likewise. * sysdeps/sparc/bits/mathdef.h: Likewise. * sysdeps/x86/bits/mathdef.h: Likewise. * sysdeps/s390/s390-32/bits/wordsize.h [!__NO_LONG_DOUBLE_MATH && !__LONG_DOUBLE_MATH_OPTIONAL]: Remove conditional code. * sysdeps/s390/s390-64/bits/wordsize.h [!__NO_LONG_DOUBLE_MATH && !__LONG_DOUBLE_MATH_OPTIONAL]: Likewise. * sysdeps/unix/sysv/linux/alpha/bits/wordsize.h [!__NO_LONG_DOUBLE_MATH && !__LONG_DOUBLE_MATH_OPTIONAL]: Likewise. * sysdeps/unix/sysv/linux/powerpc/bits/wordsize.h [!__NO_LONG_DOUBLE_MATH && !__LONG_DOUBLE_MATH_OPTIONAL]: Likewise. * sysdeps/unix/sysv/linux/sparc/bits/wordsize.h [!__NO_LONG_DOUBLE_MATH && !__LONG_DOUBLE_MATH_OPTIONAL]: Likewise.
*	S390: Regenerate ULPs.	Stefan Liebler	2016-12-02	1	-6/+6
\| \| \| \| \| \| \| \|	Updated ulps file. ChangeLog: * sysdeps/s390/fpu/libm-test-ulps: Regenerated.
*	Refactor FP_ILOGB* out of bits/mathdef.h.	Joseph Myers	2016-12-01	1	-9/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Continuing the refactoring of bits/mathdef.h, this patch stops it defining FP_ILOGB0 and FP_ILOGBNAN, moving the required information to a new header bits/fp-logb.h. There are only two possible values of each of those macros permitted by ISO C. TS 18661-1 adds corresponding macros for llogb, and their values are required to correspond to those of the ilogb macros in the obvious way. Thus two boolean values - for which the same choices are correct for most architectures - suffice to determine the value of all these macros, and by defining macros for those boolean values in bits/fp-logb.h we can then define the public FP_* macros in math.h and avoid the present duplication of the associated feature test macro logic. This patch duly moves to bits/fp-logb.h defining __FP_LOGB0_IS_MIN and __FP_LOGBNAN_IS_MIN. Default definitions of those to 0 are correct for both architectures, while ia64, m68k and x86 get their own versions of bits/fp-logb.h to reflect their use of values different from the defaults. The patch renders many copies of bits/mathdef.h trivial (needed only to avoid the default __NO_LONG_DOUBLE_MATH). I'll revise <https://sourceware.org/ml/libc-alpha/2016-11/msg00865.html> accordingly so that it removes all bits/mathdef.h headers except the default one and the alpha one, and arranges for the header to be included only by complex.h as the only remaining use at that point will be for the alpha ABI issues there. Tested for x86_64 and x86. Also did compile-only testing with build-many-glibcs.py (using glibc sources from before the commit that introduced many build failures with undefined __GI___sigsetjmp). * bits/fp-logb.h: New file. * sysdeps/ia64/bits/fp-logb.h: Likewise. * sysdeps/m68k/m680x0/bits/fp-logb.h: Likewise. * sysdeps/x86/bits/fp-logb.h: Likewise. * math/Makefile (headers): Add bits/fp-logb.h. * math/math.h: Include <bits/fp-logb.h>. [__USE_ISOC99] (FP_ILOGB0): Define based on __FP_LOGB0_IS_MIN. [__USE_ISOC99] (FP_ILOGBNAN): Define based on __FP_LOGBNAN_IS_MIN. * bits/mathdef.h (FP_ILOGB0): Remove. (FP_ILOGBNAN): Likewise. * sysdeps/aarch64/bits/mathdef.h (FP_ILOGB0): Likewise. (FP_ILOGBNAN): Likewise. * sysdeps/alpha/bits/mathdef.h (FP_ILOGB0): Likewise. (FP_ILOGBNAN): Likewise. * sysdeps/ia64/bits/mathdef.h (FP_ILOGB0): Likewise. (FP_ILOGBNAN): Likewise. * sysdeps/m68k/m680x0/bits/mathdef.h (FP_ILOGB0): Likewise. (FP_ILOGBNAN): Likewise. * sysdeps/mips/bits/mathdef.h (FP_ILOGB0): Likewise. (FP_ILOGBNAN): Likewise. * sysdeps/powerpc/bits/mathdef.h (FP_ILOGB0): Likewise. (FP_ILOGBNAN): Likewise. * sysdeps/s390/bits/mathdef.h (FP_ILOGB0): Likewise. (FP_ILOGBNAN): Likewise. * sysdeps/sparc/bits/mathdef.h (FP_ILOGB0): Likewise. (FP_ILOGBNAN): Likewise. * sysdeps/x86/bits/mathdef.h (FP_ILOGB0): Likewise. (FP_ILOGBNAN): Likewise.
*	Remove cached PID/TID in clone	Adhemerval Zanella	2016-11-24	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch remove the PID cache and usage in current GLIBC code. Current usage is mainly used a performance optimization to avoid the syscall, however it adds some issues: - The exposed clone syscall will try to set pid/tid to make the new thread somewhat compatible with current GLIBC assumptions. This cause a set of issue with new workloads and usecases (such as BZ#17214 and [1]) as well for new internal usage of clone to optimize other algorithms (such as clone plus CLONE_VM for posix_spawn, BZ#19957). - The caching complexity also added some bugs in the past [2] [3] and requires more effort of each port to handle such requirements (for both clone and vfork implementation). - Caching performance gain in mainly on getpid and some specific code paths. The getpid performance leverage is questionable [4], either by the idea of getpid being a hotspot as for the getpid implementation itself (if it is indeed a justifiable hotspot a vDSO symbol could let to a much more simpler solution). Other usage is mainly for non usual code paths, such as pthread cancellation signal and handling. For thread creation (on stack allocation) the code simplification in fact adds some performance gain due the no need of transverse the stack cache and invalidate each element pid. Other thread usages will require a direct getpid syscall, such as cancellation/setxid signal, thread cancellation, thread fail path (at create_thread), and thread signal (pthread_kill and pthread_sigqueue). However these are hardly usual hotspots and I think adding a syscall is justifiable. It also simplifies both the clone and vfork arch-specific implementation. And by review each fork implementation there are some discrepancies that this patch also solves: - microblaze clone/vfork does not set/reset the pid/tid field - hppa uses the default vfork implementation that fallback to fork. Since vfork is deprecated I do not think we should bother with it. The patch also removes the TID caching in clone. My understanding for such semantic is try provide some pthread usage after a user program issue clone directly (as done by thread creation with CLONE_PARENT_SETTID and pthread tid member). However, as stated before in multiple discussions threads, GLIBC provides clone syscalls without further supporting all this semantics. I ran a full make check on x86_64, x32, i686, armhf, aarch64, and powerpc64le. For sparc32, sparc64, and mips I ran the basic fork and vfork tests from posix/ folder (on a qemu system). So it would require further testing on alpha, hppa, ia64, m68k, nios2, s390, sh, and tile (I excluded microblaze because it is already implementing the patch semantic regarding clone/vfork). [1] https://codereview.chromium.org/800183004/ [2] https://sourceware.org/ml/libc-alpha/2006-07/msg00123.html [3] https://sourceware.org/bugzilla/show_bug.cgi?id=15368 [4] http://yarchive.net/comp/linux/getpid_caching.html * sysdeps/nptl/fork.c (__libc_fork): Remove pid cache setting. * nptl/allocatestack.c (allocate_stack): Likewise. (__reclaim_stacks): Likewise. (setxid_signal_thread): Obtain pid through syscall. * nptl/nptl-init.c (sigcancel_handler): Likewise. (sighandle_setxid): Likewise. * nptl/pthread_cancel.c (pthread_cancel): Likewise. * sysdeps/unix/sysv/linux/pthread_kill.c (__pthread_kill): Likewise. * sysdeps/unix/sysv/linux/pthread_sigqueue.c (pthread_sigqueue): Likewise. * sysdeps/unix/sysv/linux/createthread.c (create_thread): Likewise. * sysdeps/unix/sysv/linux/getpid.c: Remove file. * nptl/descr.h (struct pthread): Change comment about pid value. * nptl/pthread_getattr_np.c (pthread_getattr_np): Remove thread pid assert. * sysdeps/unix/sysv/linux/pthread-pids.h (__pthread_initialize_pids): Do not set pid value. * nptl_db/td_ta_thr_iter.c (iterate_thread_list): Remove thread pid cache check. * nptl_db/td_thr_validate.c (td_thr_validate): Likewise. * sysdeps/aarch64/nptl/tcb-offsets.sym: Remove pid offset. * sysdeps/alpha/nptl/tcb-offsets.sym: Likewise. * sysdeps/arm/nptl/tcb-offsets.sym: Likewise. * sysdeps/hppa/nptl/tcb-offsets.sym: Likewise. * sysdeps/i386/nptl/tcb-offsets.sym: Likewise. * sysdeps/ia64/nptl/tcb-offsets.sym: Likewise. * sysdeps/m68k/nptl/tcb-offsets.sym: Likewise. * sysdeps/microblaze/nptl/tcb-offsets.sym: Likewise. * sysdeps/mips/nptl/tcb-offsets.sym: Likewise. * sysdeps/nios2/nptl/tcb-offsets.sym: Likewise. * sysdeps/powerpc/nptl/tcb-offsets.sym: Likewise. * sysdeps/s390/nptl/tcb-offsets.sym: Likewise. * sysdeps/sh/nptl/tcb-offsets.sym: Likewise. * sysdeps/sparc/nptl/tcb-offsets.sym: Likewise. * sysdeps/tile/nptl/tcb-offsets.sym: Likewise. * sysdeps/x86_64/nptl/tcb-offsets.sym: Likewise. * sysdeps/unix/sysv/linux/aarch64/clone.S: Remove pid and tid caching. * sysdeps/unix/sysv/linux/alpha/clone.S: Likewise. * sysdeps/unix/sysv/linux/arm/clone.S: Likewise. * sysdeps/unix/sysv/linux/hppa/clone.S: Likewise. * sysdeps/unix/sysv/linux/i386/clone.S: Likewise. * sysdeps/unix/sysv/linux/ia64/clone2.S: Likewise. * sysdeps/unix/sysv/linux/mips/clone.S: Likewise. * sysdeps/unix/sysv/linux/nios2/clone.S: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/clone.S: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/clone.S: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/clone.S: Likewise. * sysdeps/unix/sysv/linux/sh/clone.S: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/clone.S: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/clone.S: Likewise. * sysdeps/unix/sysv/linux/tile/clone.S: Likewise. * sysdeps/unix/sysv/linux/x86_64/clone.S: Likewise. * sysdeps/unix/sysv/linux/aarch64/vfork.S: Remove pid set and reset. * sysdeps/unix/sysv/linux/alpha/vfork.S: Likewise. * sysdeps/unix/sysv/linux/arm/vfork.S: Likewise. * sysdeps/unix/sysv/linux/i386/vfork.S: Likewise. * sysdeps/unix/sysv/linux/ia64/vfork.S: Likewise. * sysdeps/unix/sysv/linux/m68k/clone.S: Likewise. * sysdeps/unix/sysv/linux/m68k/vfork.S: Likewise. * sysdeps/unix/sysv/linux/mips/vfork.S: Likewise. * sysdeps/unix/sysv/linux/nios2/vfork.S: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/vfork.S: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/vfork.S: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/vfork.S: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/vfork.S: Likewise. * sysdeps/unix/sysv/linux/sh/vfork.S: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/vfork.S: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/vfork.S: Likewise. * sysdeps/unix/sysv/linux/tile/vfork.S: Likewise. * sysdeps/unix/sysv/linux/x86_64/vfork.S: Likewise. * sysdeps/unix/sysv/linux/tst-clone2.c (f): Remove direct pthread struct access. (clone_test): Remove function. (do_test): Rewrite to take in consideration pid is not cached anymore.
*	Refactor float_t, double_t information into bits/flt-eval-method.h.	Joseph Myers	2016-11-24	2	-7/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	At present, definitions of float_t and double_t are split among many bits/mathdef.h headers. For all but three architectures, these types are float and double. Furthermore, if you assume __FLT_EVAL_METHOD__ to be defined, that provides a more generic way of determining the correct values of these typedefs. Defining these typedefs more generally based on __FLT_EVAL_METHOD__ was previously proposed by Paul Eggert in <https://sourceware.org/ml/libc-alpha/2012-02/msg00002.html>. This patch refactors things in the way I proposed in <https://sourceware.org/ml/libc-alpha/2016-11/msg00745.html>. A new header bits/flt-eval-method.h defines a single macro, __GLIBC_FLT_EVAL_METHOD, which is then used by math.h to define float_t and double_t. The default is based on __FLT_EVAL_METHOD__ (although actually a default to 0 would have the same effect for current ports, because ports where values other than 0 or 16 are possible all have their own headers). To avoid changing the existing semantics in any case, including for compilers not defining __FLT_EVAL_METHOD__, architecture-specific files are then added for m68k, s390, x86 which replicate the existing semantics. At least with __FLT_EVAL_METHOD__ values possible with GCC, there should be no change to the choices of float_t and double_t for any supported configuration. Architecture maintainer notes: * m68k: sysdeps/m68k/m680x0/bits/flt-eval-method.h always defines __GLIBC_FLT_EVAL_METHOD to 2 to replicate the existing logic. But actually GCC defines __FLT_EVAL_METHOD__ to 0 if TARGET_68040. It might make sense to make the header prefer to base things on __FLT_EVAL_METHOD__ if defined, like the x86 version, and so make the choices of these types more accurate (with a NEWS entry as for the other changes to these types on particular architectures). * s390: sysdeps/s390/bits/flt-eval-method.h always defines __GLIBC_FLT_EVAL_METHOD to 1 to replicate the existing logic. As previously discussed, it might make sense in coordination with GCC to eliminate the historic mistake, avoid excess precision in the -fexcess-precision=standard case and make the typedefs match (with a NEWS entry, again). Tested for x86-64 and x86. Also did compilation-only testing with build-many-glibcs.py. * bits/flt-eval-method.h: New file. * sysdeps/m68k/m680x0/bits/flt-eval-method.h: Likewise. * sysdeps/s390/bits/flt-eval-method.h: Likewise. * sysdeps/x86/bits/flt-eval-method.h: Likewise. * math/Makefile (headers): Add bits/flt-eval-method.h. * math/math.h: Include <bits/flt-eval-method.h>. [__USE_ISOC99] (float_t): Define based on __GLIBC_FLT_EVAL_METHOD. [__USE_ISOC99] (double_t): Likewise. * bits/mathdef.h (float_t): Remove. (double_t): Likewise. * sysdeps/aarch64/bits/mathdef.h (float_t): Likewise. (double_t): Likewise. * sysdeps/alpha/bits/mathdef.h (float_t): Likewise. (double_t): Likewise. * sysdeps/arm/bits/mathdef.h (float_t): Likewise. (double_t): Likewise. * sysdeps/hppa/fpu/bits/mathdef.h (float_t): Likewise. (double_t): Likewise. * sysdeps/ia64/bits/mathdef.h (float_t): Likewise. (double_t): Likewise. * sysdeps/m68k/m680x0/bits/mathdef.h (float_t): Likewise. (double_t): Likewise. * sysdeps/mips/bits/mathdef.h (float_t): Likewise. (double_t): Likewise. * sysdeps/powerpc/bits/mathdef.h (float_t): Likewise. (double_t): Likewise. * sysdeps/s390/bits/mathdef.h (float_t): Likewise. (double_t): Likewise. * sysdeps/sh/sh4/bits/mathdef.h (float_t): Likewise. (double_t): Likewise. * sysdeps/sparc/bits/mathdef.h (float_t): Likewise. (double_t): Likewise. * sysdeps/tile/bits/mathdef.h (float_t): Likewise. (double_t): Likewise. * sysdeps/x86/bits/mathdef.h (float_t): Likewise. (double_t): Likewise.
*	s390x: Add hidden definition for __sigsetjmp	Florian Weimer	2016-11-15	2	-36/+48
\|
*	nptl: Document the reason why __kind in pthread_mutex_t is part of the ABI	Florian Weimer	2016-11-07	1	-1/+1
\|
*	Do not hardcode platform names in manual/libm-err-tab.pl (bug 14139).	Joseph Myers	2016-11-04	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	manual/libm-err-tab.pl hardcodes a list of names for particular platforms (mapping from sysdeps directory name to friendly name for the manual). This goes against the principle of keeping information about individual platforms in their corresponding sysdeps directory, and the list is also very out-of-date regarding supported platforms and their corresponding sysdeps directories. This patch fixes this by adding a libm-test-ulps-name file alongside each libm-test-ulps file. The script then gets the friendly name from that file, which is required to exist, so it no longer needs to allow for the mapping being missing. Tested for x86_64. [BZ #14139] * manual/libm-err-tab.pl (%pplatforms): Initialize to empty. (find_files): Obtain platform name from libm-test-ulps-name and store in %pplatforms. (canonicalize_platform): Remove. (print_platforms): Use $pplatforms directly. (by_platforms): Do not allow for platforms missing from %pplatforms. * sysdeps/aarch64/libm-test-ulps-name: New file. * sysdeps/alpha/fpu/libm-test-ulps-name: Likewise. * sysdeps/arm/libm-test-ulps-name: Likewise. * sysdeps/generic/libm-test-ulps-name: Likewise. * sysdeps/hppa/fpu/libm-test-ulps-name: Likewise. * sysdeps/i386/fpu/libm-test-ulps-name: Likewise. * sysdeps/i386/i686/fpu/multiarch/libm-test-ulps-name: Likewise. * sysdeps/ia64/fpu/libm-test-ulps-name: Likewise. * sysdeps/m68k/coldfire/fpu/libm-test-ulps-name: Likewise. * sysdeps/m68k/m680x0/fpu/libm-test-ulps-name: Likewise. * sysdeps/microblaze/libm-test-ulps-name: Likewise. * sysdeps/mips/mips32/libm-test-ulps-name: Likewise. * sysdeps/mips/mips64/libm-test-ulps-name: Likewise. * sysdeps/nios2/libm-test-ulps-name: Likewise. * sysdeps/powerpc/fpu/libm-test-ulps-name: Likewise. * sysdeps/powerpc/nofpu/libm-test-ulps-name: Likewise. * sysdeps/s390/fpu/libm-test-ulps-name: Likewise. * sysdeps/sh/libm-test-ulps-name: Likewise. * sysdeps/sparc/fpu/libm-test-ulps-name: Likewise. * sysdeps/tile/libm-test-ulps-name: Likewise. * sysdeps/x86_64/fpu/libm-test-ulps-name: Likewise.
*	Define wordsize.h macros everywhere	Steve Ellcey	2016-11-04	2	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* bits/wordsize.h: Add documentation. * sysdeps/aarch64/bits/wordsize.h : New file * sysdeps/generic/stdint.h (PTRDIFF_MIN, PTRDIFF_MAX): Update definitions. (SIZE_MAX): Change ifdef to if in __WORDSIZE32_SIZE_ULONG check. * sysdeps/gnu/bits/utmp.h (__WORDSIZE_TIME64_COMPAT32): Check with #if instead of #ifdef. * sysdeps/gnu/bits/utmpx.h (__WORDSIZE_TIME64_COMPAT32): Ditto. * sysdeps/mips/bits/wordsize.h (__WORDSIZE32_SIZE_ULONG, __WORDSIZE32_PTRDIFF_LONG, __WORDSIZE_TIME64_COMPAT32): Add or change defines. * sysdeps/powerpc/powerpc32/bits/wordsize.h: Likewise. * sysdeps/powerpc/powerpc64/bits/wordsize.h: Likewise. * sysdeps/s390/s390-32/bits/wordsize.h: Likewise. * sysdeps/s390/s390-64/bits/wordsize.h: Likewise. * sysdeps/sparc/sparc32/bits/wordsize.h: Likewise. * sysdeps/sparc/sparc64/bits/wordsize.h: Likewise. * sysdeps/tile/tilegx/bits/wordsize.h: Likewise. * sysdeps/tile/tilepro/bits/wordsize.h: Likewise. * sysdeps/unix/sysv/linux/alpha/bits/wordsize.h: Likewise. * sysdeps/unix/sysv/linux/powerpc/bits/wordsize.h: Likewise. * sysdeps/unix/sysv/linux/sparc/bits/wordsize.h: Likewise. * sysdeps/wordsize-32/bits/wordsize.h: Likewise. * sysdeps/wordsize-64/bits/wordsize.h: Likewise. * sysdeps/x86/bits/wordsize.h: Likewise.
*	S390: Fix fp comparison not raising FE_INVALID.	Stefan Liebler	2016-10-17	1	-0/+36
\| \| \| \| \| \| \|	As gcc is using unordered comparison instructions which do not raise invalid exception if any operand is quiet NAN, FIX_COMPARE_INVALID is defined to 1. Thus iseqsig is calling feraiseexcept as workaround.
*	s390: Refactor ifunc resolvers due to false debuginfo.	Stefan Liebler	2016-10-07	30	-85/+169
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adjusts the s390 specific ifunc helper macros in ifunc-resolve.h to use the common __ifunc macro, which uses gcc attribute ifunc to get rid of the false debuginfo. Therefore the redirection construct is applied where needed. Perhaps in future we can switch some of the internal symbols __GI_* from the fallback variant to the ifunc function. But this change is not straightforward due to a segmentation fault while linking libc.so with older binutils on s390. ChangeLog: [BZ #20478] * sysdeps/s390/multiarch/ifunc-resolve.h (s390_vx_libc_ifunc2, s390_libc_ifunc): Use __ifunc from libc-symbols.h to create ifunc symbols. (s390_vx_libc_ifunc_init, s390_vx_libc_ifunc_redirected , s390_vx_libc_ifunc2_redirected, s390_libc_ifunc_init): New define. * sysdeps/s390/multiarch/memchr.c: Redirect ifunced function in header for using it as type for ifunc function. * sysdeps/s390/multiarch/mempcpy.c: Likewise. * sysdeps/s390/multiarch/rawmemchr.c: Likewise. * sysdeps/s390/multiarch/stpcpy.c: Likewise. * sysdeps/s390/multiarch/stpncpy.c: Likewise. * sysdeps/s390/multiarch/strcat.c: Likewise. * sysdeps/s390/multiarch/strchr.c: Likewise. * sysdeps/s390/multiarch/strcmp.c: Likewise. * sysdeps/s390/multiarch/strcpy.c: Likewise. * sysdeps/s390/multiarch/strcspn.c: Likewise. * sysdeps/s390/multiarch/strlen.c: Likewise. * sysdeps/s390/multiarch/strncmp.c: Likewise. * sysdeps/s390/multiarch/strncpy.c: Likewise. * sysdeps/s390/multiarch/strnlen.c: Likewise. * sysdeps/s390/multiarch/strpbrk.c: Likewise. * sysdeps/s390/multiarch/strrchr.c: Likewise. * sysdeps/s390/multiarch/strspn.c: Likewise. * sysdeps/s390/multiarch/wcschr.c: Likewise. * sysdeps/s390/multiarch/wcscmp.c: Likewise. * sysdeps/s390/multiarch/wcspbrk.c: Likewise. * sysdeps/s390/multiarch/wcsspn.c: Likewise. * sysdeps/s390/multiarch/wmemchr.c: Likewise. * sysdeps/s390/multiarch/wmemset.c: Likewise. * sysdeps/s390/s390-32/multiarch/memcmp.c: Likewise. * sysdeps/s390/s390-32/multiarch/memcpy.c: Likewise. * sysdeps/s390/s390-32/multiarch/memset.c: Likewise. * sysdeps/s390/s390-64/multiarch/memcmp.c: Likewise. * sysdeps/s390/s390-64/multiarch/memcpy.c: Likewise. * sysdeps/s390/s390-64/multiarch/memset.c: Likewise.
*	S390: Regenerate ULPs	Stefan Liebler	2016-10-04	1	-8/+2
\| \| \| \| \| \| \| \|	Regenerated ulps file after recent commit "Use __builtin_fma more in dbl-64 code.". ChangeLog: * sysdeps/s390/fpu/libm-test-ulps: Regenerated.
*	Remove the ptw-% patterns	Florian Weimer	2016-09-14	1	-2/+2
\| \| \| \| \| \| \| \| \| \|	Nothing depends on the PTW macro anymore, so the mechanism to define PTW for recompliations of libc routines is no longer needed. The source files are still recompiled for the nptl directory, just without the “ptw-” prefix. (Reducing the number of pattern rules in sysd-rules is critical for improving make performance.)
*	Add femode_t functions: s390.	Joseph Myers	2016-09-07	2	-0/+66
\| \| \| \| \| \| \|	This patch adds S/390 versions of fegetmode and fesetmode. Untested. * sysdeps/s390/fpu/fegetmode.c: New file. * sysdeps/s390/fpu/fesetmode.c: Likewise.
*	Add femode_t functions.	Joseph Myers	2016-09-07	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	TS 18661-1 defines a type femode_t to represent the set of dynamic floating-point control modes (such as the rounding mode and trap enablement modes), and functions fegetmode and fesetmode to manipulate those modes (without affecting other state such as the raised exception flags) and a corresponding macro FE_DFL_MODE. This patch series implements those interfaces for glibc. This first patch adds the architecture-independent pieces, the x86 and x86_64 implementations, and the <bits/fenv.h> and ABI baseline updates for all architectures so glibc keeps building and passing the ABI tests on all architectures. Subsequent patches add the fegetmode and fesetmode implementations for other architectures. femode_t is generally an integer type - the same type as fenv_t, or as the single element of fenv_t where fenv_t is a structure containing a single integer (or the single relevant element, where it has elements for both status and control registers) - except where architecture properties or consistency with the fenv_t implementation indicate otherwise. FE_DFL_MODE follows FE_DFL_ENV in whether it's a magic pointer value (-1 cast to const femode_t ), a value that can be distinguished from valid pointers by its high bits but otherwise contains a representation of the desired register contents, or a pointer to a constant variable (the powerpc case; __fe_dfl_mode is added as an exported constant object, an alias to __fe_dfl_env). Note that where architectures (that share a register between control and status bits) gain definitions of new floating-point control or status bits in future, the implementations of fesetmode for those architectures may need updating (depending on whether the new bits are control or status bits and what the implementation does with previously unknown bits), just like existing implementations of <fenv.h> functions that take care not to touch reserved bits may need updating when the set of reserved bits changes. (As any new bits are outside the scope of ISO C, that's just a quality-of-implementation issue for supporting them, not a conformance issue.) As with fenv_t, femode_t should properly include any software DFP rounding mode (and for both fenv_t and femode_t I'd consider that fragment of DFP support appropriate for inclusion in glibc even in the absence of the rest of libdfp; hardware DFP rounding modes should already be included if the definitions of which bits are status / control bits are correct). Tested for x86_64, x86, mips64 (hard float, and soft float to test the fallback version), arm (hard float) and powerpc (hard float, soft float and e500). Other architecture versions are untested. math/fegetmode.c: New file. * math/fesetmode.c: Likewise. * sysdeps/i386/fpu/fegetmode.c: Likewise. * sysdeps/i386/fpu/fesetmode.c: Likewise. * sysdeps/x86_64/fpu/fegetmode.c: Likewise. * sysdeps/x86_64/fpu/fesetmode.c: Likewise. * math/fenv.h: Update comment on inclusion of <bits/fenv.h>. [__GLIBC_USE (IEC_60559_BFP_EXT)] (fegetmode): New function declaration. [__GLIBC_USE (IEC_60559_BFP_EXT)] (fesetmode): Likewise. * bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/aarch64/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/alpha/fpu/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/arm/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/hppa/fpu/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/ia64/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/m68k/fpu/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/microblaze/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/mips/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/nios2/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/powerpc/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (__fe_dfl_mode): New variable declaration. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/s390/fpu/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/sh/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/sparc/fpu/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/tile/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/x86/fpu/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * manual/arith.texi (FE_DFL_MODE): Document macro. (fegetmode): Document function. (fesetmode): Likewise. * math/Versions (fegetmode): New libm symbol at version GLIBC_2.25. (fesetmode): Likewise. * math/Makefile (libm-support): Add fegetmode and fesetmode. (tests): Add test-femode and test-femode-traps. * math/test-femode-traps.c: New file. * math/test-femode.c: Likewise. * sysdeps/powerpc/fpu/fenv_const.c (__fe_dfl_mode): Declare as alias for __fe_dfl_env. * sysdeps/powerpc/nofpu/fenv_const.c (__fe_dfl_mode): Likewise. * sysdeps/powerpc/powerpc32/e500/nofpu/fenv_const.c (__fe_dfl_mode): Likewise. * sysdeps/powerpc/Versions (__fe_dfl_mode): New libm symbol at version GLIBC_2.25. * sysdeps/nacl/libm.abilist: Update. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilepro/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.
*	S390: Do not set FE_INEXACT with feraiseexcept (FE_OWERFLOW\|FE_UNDERFLOW).	Stefan Liebler	2016-08-31	4	-6/+100
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On s390 feraiseexcept (FE_OVERFLOW\|FE_UNDERFLOW) sets FE_INEXACT, too. This patch uses z196 zarch load rounded instruction which can suppress FE_INEXACT exception if gcc has z196 support in used configuration. Otherwise FE_INEXACT flag is set as before. The gcc support is tested in a new configure-check. A comment in fsetexcptflg.c is corrected as new exceptions are not executed with the next floating-point instruction if fpc is set with _FPU_SETCW macro. It seems the comment was copied e.g. from sysdeps/x86_64/fpu/fsetexcptflg.c file. ChangeLog: * config.h.in (HAVE_S390_MIN_Z196_ZARCH_ASM_SUPPORT): New undefine. * sysdeps/s390/configure.ac: Add test for z196 zarch support. * sysdeps/s390/configure: Regenerated. * sysdeps/s390/fpu/fraiseexcpt.c (__feraiseexcept): Use ledbra instruction for raising over-/underflow if z196 zarch is supported by default. * sysdeps/s390/fpu/fsetexcptflg.c (fesetexceptflag): Correct comment.
*	Add fetestexceptflag.	Joseph Myers	2016-08-29	1	-0/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	TS 18661-1 defines an fetestexceptflag function to test the exception state saved in an fexcept_t object by fegetexceptflag. This patch implements this function for glibc. Almost all architectures save exception state in such a way that it can be directly ANDed with exception flag bits, so rather than having lots of fetestexceptflag implementations that all do the same thing, the math/ implementation is made to use this generic logic (which is also OK in the fallback case where FE_ALL_EXCEPT is zero). The only architecture that seems to need anything different is s390. (fegetexceptflag and fesetexceptflag use abbreviated filenames fgetexcptflg.c and fsetexcptflg.c. Because we are no longer concerned by 14-character filename limits, fetestexceptflag uses the obvious filename fetestexceptflag.c.) The NEWS entry is intended to be expanded along the lines given in <https://sourceware.org/ml/libc-alpha/2016-08/msg00356.html> when fegetmode and fesetmode are added. Tested for x86_64, x86, mips64 and powerpc. * math/fetestexceptflag.c: New file. * sysdeps/s390/fpu/fetestexceptflag.c: Likewise. Comment by Stefan Liebler. * math/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (fetestexceptflag): New function declaration. * manual/arith.texi (fetestexceptflag): Document function. * math/Versions (fetestexceptflag): New libm symbol at version GLIBC_2.25. * math/Makefile (libm-support): Add fetestexceptflag. (tests): Add test-fetestexceptflag. * math/test-fetestexceptflag.c: New file. * sysdeps/nacl/libm.abilist: Update. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilepro/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.
*	Do not override objects in libc.a in other static libraries [BZ #20452]	Florian Weimer	2016-08-17	1	-0/+1
\| \| \| \| \|	With this change, we no longer add sysdep.o and similar objects which are present in libc.a to other static libraries.
*	Add fesetexcept: s390.	Joseph Myers	2016-08-16	1	-0/+33
\| \| \| \| \| \| \|	This patch adds an S/390 version of fesetexcept. Tested and corrected by Stefan Liebler. * sysdeps/s390/fpu/fesetexcept.c: New file.
*	S390: Do not clobber r13 with memcpy on 31bit with copies >1MB.	Stefan Liebler	2016-07-20	1	-8/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the default memcpy variant is called with a length of >1MB on 31bit, r13 is clobbered as the algorithm is switching to mvcle. The mvcle code returns without restoring r13. All other cases are restoring r13. If memcpy is called from outside libc the ifunc resolver will only select this variant if running on machines older than z10. Otherwise or if memcpy is called from inside libc, this default memcpy variant is called. The testcase timezone/tst-tzset is triggering this issue in some combinations of gcc versions and optimization levels. This bug was introduced in commit 04bb21ac93e90d7696bcaf8febe2b2dd2d83585a and thus is a regression compared to former glibc 2.23 release. This patch removes the usage of r13 at all. Thus it is not saved and restored. The base address for execute-instruction is now stored in r5 which is obtained after r5 is not needed anymore as 256byte block counter. ChangeLog: * sysdeps/s390/s390-32/memcpy.S (memcpy): Eliminate the usage of r13 as it is not restored in mvcle case.
*	S390: Use DT_JUMPREL in prelink undo code.	Stefan Liebler	2016-07-06	3	-10/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On s390, the current prelink undo code in elf_machine_lazy_rel() has the requirement, that the plt stubs use the first got slots after the 3 reserved ones. In case of undoing prelink, the plt got slots are reset to the correct addresses whithin the corresponding plt-stub. Therefore the address is calculated by the address of the first plt-stub-address which was written by prelink (see l->l_mach.plt) to got[1] and index of current relocation multiplied with 32 (=size of one plt slot). The index was calculated with &current-got-slot - &got[3]. This patch removes the requirement, that the plt-got-slots are starting at got[3]. The index is now calculated with &current-reloc - &reloc[0]. The first struct Elf64_Rela is stored at DT_JMPREL. This patch is needed to prepare for partial relro support. Ulrich Weigand suggested this approach to use DT_JMPREL - Thanks. ChangeLog: * sysdeps/s390/linkmap.h (struct link_map_machine): Remove member gotplt and add member jmprel. * sysdeps/s390/s390-32/dl-machine.h (elf_machine_runtime_setup): Setup member jmprel with DT_JMPREL instead of gotplt with &got[3]. (elf_machine_lazy_rel): Calculate address with reloc and jmprel. * sysdeps/s390/s390-64/dl-machine.h: Likewise.
*	elf: Consolidate machine-agnostic DTV definitions in <dl-dtv.h>	Florian Weimer	2016-06-20	2	-16/+1
\| \| \| \| \|	Identical definitions of dtv_t and TLS_DTV_UNALLOCATED were repeated for all architectures using DTVs.
*	S390: Fix utf32 to utf16 handling of low surrogates (disable cu42).	Stefan Liebler	2016-05-25	1	-93/+62
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	According to the latest Unicode standard, a conversion from/to UTF-xx has to report an error if the character value is in range of an utf16 surrogate (0xd800..0xdfff). See https://sourceware.org/ml/libc-help/2015-12/msg00015.html. Thus the cu42 instruction, which converts from utf32 to utf16, has to be disabled because it does not report an error in case of a value in range of a low surrogate (0xdc00..0xdfff). The etf3eh variant is removed and the c, vector variant is adjusted to handle the value in range of an utf16 low surrogate correctly. ChangeLog: * sysdeps/s390/utf16-utf32-z9.c: Disable cu42 instruction and report an error in case of a value in range of an utf16 low surrogate.
*	S390: Fix utf32 to utf8 handling of low surrogates (disable cu41).	Stefan Liebler	2016-05-25	1	-73/+115
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	According to the latest Unicode standard, a conversion from/to UTF-xx has to report an error if the character value is in range of an utf16 surrogate (0xd800..0xdfff). See https://sourceware.org/ml/libc-help/2015-12/msg00015.html. Thus the cu41 instruction, which converts from utf32 to utf8, has to be disabled because it does not report an error in case of a value in range of a low surrogate (0xdc00..0xdfff). The etf3eh variant is removed and the c, vector variant is adjusted to handle the value in range of an utf16 low surrogate correctly. ChangeLog: * sysdeps/s390/utf8-utf32-z9.c: Disable cu41 instruction and report an error in case of a value in range of an utf16 low surrogate.
*	S390: Use s390-64 specific ionv-modules on s390-32, too.	Stefan Liebler	2016-05-25	6	-46/+88
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch reworks the existing s390 64bit specific iconv modules in order to use them on s390 31bit, too. Thus the parts for subdirectory iconvdata in sysdeps/s390/s390-64/Makefile were moved to sysdeps/s390/Makefile so that they apply on 31bit, too. All those modules are moved from sysdeps/s390/s390-64 directory to sysdeps/s390. The iso-8859-1 to/from cp037 module was adjusted, to use brct (branch relative on count) instruction on 31bit s390 instead of brctg, because the brctg is a zarch instruction and is not available on a 31bit kernel. The utf modules are using zarch instructions, thus the directive machinemode zarch_nohighgprs was added to the inline assemblies to omit the high-gprs flag in the shared libraries. Otherwise they can't be loaded on a 31bit kernel. The ifunc resolvers were adjusted in order to call the etf3eh or vector variants only if zarch instructions are available (64bit kernel in 31bit compat-mode). Furthermore some variable types were changed. E.g. unsigned long long would be a register pair on s390 31bit, but we want only one single register. For variables of type size_t the register contents have to be enlarged from a 32bit to a 64bit value on 31bit, because the inline assemblies uses 64bit values in such cases. ChangeLog: * sysdeps/s390/s390-64/Makefile (iconvdata-subdirectory): Move to ... * sysdeps/s390/Makefile: ... here. * sysdeps/s390/s390-64/iso-8859-1_cp037_z900.c: Move to ... * sysdeps/s390/iso-8859-1_cp037_z900.c: ... here. (BRANCH_ON_COUNT): New define. (TR_LOOP): Use BRANCH_ON_COUNT instead of brctg. * sysdeps/s390/s390-64/utf16-utf32-z9.c: Move to ... * sysdeps/s390/utf16-utf32-z9.c: ... here and adjust to run on s390-32, too. * sysdeps/s390/s390-64/utf8-utf16-z9.c: Move to ... * sysdeps/s390/utf8-utf16-z9.c: ... here and adjust to run on s390-32, too. * sysdeps/s390/s390-64/utf8-utf32-z9.c: Move to ... * sysdeps/s390/utf8-utf32-z9.c: ... here and adjust to run on s390-32, too.
*	S390: Optimize utf16-utf32 module.	Stefan Liebler	2016-05-25	1	-92/+379
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch reworks the s390 specific module to convert between utf16 and utf32. Now ifunc is used to choose either the c or etf3eh (with convert utf instruction) variants at runtime. Furthermore a new vector variant for z13 is introduced which will be build and chosen if vector support is available at build / runtime. In case of converting utf 32 to utf16, the vector variant optimizes input of 2byte utf16 characters. The convert utf instruction is used if an utf16 surrogate is found. For the other direction utf16 to utf32, the cu24 instruction can't be re- enabled, because it does not report an error, if the input-stream consists of a single low surrogate utf16 char (e.g. 0xdc00). This applies to the newest z13, too. Thus there is only the c or the new vector variant, which can handle utf16 surrogate characters. This patch also fixes some whitespace errors. Furthermore, the etf3eh variant is handling the "UTF-xx//IGNORE" case now. Before they ignored the ignore-case and always stopped at an error. ChangeLog: * sysdeps/s390/s390-64/utf16-utf32-z9.c: Use ifunc to select c, etf3eh or new vector loop-variant.
*	S390: Optimize utf8-utf16 module.	Stefan Liebler	2016-05-25	1	-106/+441
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch reworks the s390 specific module to convert between utf8 and utf16. Now ifunc is used to choose either the c or etf3eh (with convert utf instruction) variants at runtime. Furthermore a new vector variant for z13 is introduced which will be build and chosen if vector support is available at build / runtime. In case of converting utf 8 to utf16, the vector variant optimizes input of 1byte utf8 characters. The convert utf instruction is used if a multibyte utf8 character is found. For the other direction utf16 to utf8, the cu21 instruction can't be re-enabled, because it does not report an error, if the input-stream consists of a single low surrogate utf16 char (e.g. 0xdc00). This applies to the newest z13, too. Thus there is only the c or the new vector variant, which can handle 1..4 byte utf8 characters. The c variant from utf16 to utf8 has beed fixed. If a high surrogate was at the end of the input-buffer, then errno was set to EINVAL and the input-pointer pointed just after the high surrogate. Now it points to the beginning of the high surrogate. This patch also fixes some whitespace errors. The c variant from utf8 to utf16 is now checking that tail-bytes starts with 0b10... and the value is not in range of an utf16 surrogate. Furthermore, the etf3eh variants are handling the "UTF-xx//IGNORE" case now. Before they ignored the ignore-case and always stopped at an error. ChangeLog: * sysdeps/s390/s390-64/utf8-utf16-z9.c: Use ifunc to select c, etf3eh or new vector loop-variant.
*	S390: Optimize utf8-utf32 module.	Stefan Liebler	2016-05-25	1	-184/+480
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch reworks the s390 specific module to convert between utf8 and utf32. Now ifunc is used to choose either the c or etf3eh (with convert utf instruction) variants at runtime. Furthermore a new vector variant for z13 is introduced which will be build and chosen if vector support is available at build / runtime. The vector variants optimize input of 1byte utf8 characters. The convert utf instruction is used if a multibyte utf8 character is found. This patch also fixes some whitespace errors. The c variants are rejecting UTF-16 surrogates and values above 0x10ffff now. Furthermore, the etf3eh variants are handling the "UTF-xx//IGNORE" case now. Before they ignored the ignore-case and always stopped at an error. ChangeLog: * sysdeps/s390/s390-64/utf8-utf32-z9.c: Use ifunc to select c, etf3eh or new vector loop-variant.
*	S390: Optimize iso-8859-1 to ibm037 iconv-module.	Stefan Liebler	2016-05-25	1	-37/+56
\| \| \| \| \| \| \| \| \| \| \|	This patch reworks the s390 specific module which used the z900 translate one to one instruction. Now the g5 translate instruction is used, because it outperforms the troo instruction. ChangeLog: * sysdeps/s390/s390-64/iso-8859-1_cp037_z900.c (TROO_LOOP): Rename to TR_LOOP and usage of tr instead of troo instruction.
*	S390: Optimize builtin iconv-modules.	Stefan Liebler	2016-05-25	2	-0/+1270
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch introduces a s390 specific gconv_simple.c file which provides optimized versions for z13 with vector instructions, which will be chosen at runtime via ifunc. The optimized conversions can convert between internal and ascii, ucs4, ucs4le, ucs2, ucs2le. If the build-environment lacks vector support, then iconv/gconv_simple.c is used wihtout any change. Otherwise iconvdata/gconv_simple.c is used to create conversion loop routines without vector instructions as fallback, if vector instructions aren't available at runtime. ChangeLog: * sysdeps/s390/multiarch/gconv_simple.c: New File. * sysdeps/s390/multiarch/Makefile (sysdep_routines): Add gconv_simple.
*	S390: Optimize 8bit-generic iconv modules.	Stefan Liebler	2016-05-25	4	-0/+452
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch introduces a s390 specific 8bit-generic.c file which provides an optimized version for z13 with translate-/vector-instructions, which will be chosen at runtime via ifunc. If the build-environment lacks vector support, then iconvdata/8bit-generic.c is used wihtout any change. Otherwise iconvdata/8bit-generic.c is used to create conversion loop routines without vector instructions as fallback, if vector instructions aren't available at runtime. The vector routines can only be used with charsets where the maximum UCS4 value fits in 1 byte size. Then the hardware translate-instruction is used to translate between up to 256 generic characters and "1 byte UCS4" characters at once. The vector instructions are used to convert between the "1 byte UCS4" and UCS4. The gen-8bit.sh script in sysdeps/s390/multiarch generates the conversion table to_ucs1. Therefore in sysdeps/s390/multiarch/Makefile is added an override define generate-8bit-table, which is originally defined in iconvdata/Makefile. This version calls the gen-8bit.sh in iconvdata folder and the s390 one. ChangeLog: * sysdeps/s390/multiarch/8bit-generic.c: New File. * sysdeps/s390/multiarch/gen-8bit.sh: New File. * sysdeps/s390/multiarch/Makefile (generate-8bit-table): New override define. * sysdeps/s390/multiarch/iconv/skeleton.c: Likewise.
*	S390: Configure check for vector support in gcc.	Stefan Liebler	2016-05-25	2	-0/+53
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The S390 specific test checks if the gcc has support for vector registers by compiling an inline assembly which clobbers vector registers. On success the macro HAVE_S390_VX_GCC_SUPPORT is defined. This macro can be used to determine if e.g. clobbering vector registers is allowed or not. ChangeLog: * config.h.in (HAVE_S390_VX_GCC_SUPPORT): New macro undefine. * sysdeps/s390/configure.ac: Add test for S390 vector register support in gcc. * sysdeps/s390/configure: Regenerated.
*	S390: Get rid of make warning: overriding recipe for target gconv-modules.	Stefan Liebler	2016-05-25	2	-50/+51
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch introduces a way to provide an architecture dependent gconv-modules file. Before this patch, the gconv-modules file was normally installed from src-dir/iconvdata/gconv-modules. The S390 Makefile had overridden the installation recipe (with a make warning) in order to install the gconv-module-s390 file from build-dir. The iconvdata/Makefile provides another recipe, which copies the gconv-modules file from src to build dir, which are used by the testcases. Thus the testcases does not use the currently build s390-modules. This patch uses build-dir/iconvdata/gconv-modules for installation, which is generated by concatenating src-dir/iconvdata/gconv-modules and the architecture specific one. The latter one can be specified by setting the variable sysdeps-gconv-modules in sysdeps/.../Makefile. The architecture specific gconv-modules file is emitted before the common one because these modules aren't used in all possible conversions. E.g. the converting from INTERNAL to UTF-16 used the common UTF-16.so module instead of UTF16_UTF32_Z9.so. This way, the s390-Makefile does not need to override the recipe for gconv-modules and no warning is emitted anymore. Since we no longer support empty objpfx the conditional test in iconvdata/Makefile is removed. ChangeLog: * iconvdata/Makefile ($(inst_gconvdir)/gconv-modules): Install file from $(objpfx)gconv-modules. ($(objpfx)gconv-modules): Concatenate architecture specific file in variable sysdeps-gconv-modules and gconv-modules in src dir. * sysdeps/s390/gconv-modules: New file. * sysdeps/s390/s390-64/Makefile: ($(inst_gconvdir)/gconv-modules): Deleted. ($(objpfx)gconv-modules-s390): Deleted. (sysdeps-gconv-modules): New variable.
*	S390: Implement mempcpy with help of memcpy. [BZ #19765]	Stefan Liebler	2016-05-24	8	-45/+167
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There exist optimized memcpy functions on s390, but no optimized mempcpy. This patch adds mempcpy entry points in memcpy.S files, which use the memcpy implementation. Now mempcpy itself is also an IFUNC function as memcpy is and the variants are listed in ifunc-impl-list.c. The s390 string.h does not define _HAVE_STRING_ARCH_mempcpy. Instead mempcpy string/string.h inlines memcpy() + n. If n is constant and small enough, GCC emits instructions like mvi or mvc and avoids the function call to memcpy. If n is not constant, then memcpy is called and n is added afterwards. If _HAVE_STRING_ARCH_mempcpy would be defined, mempcpy would be called in every case. According to PR70140 "Inefficient expansion of __builtin_mempcpy" (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70140) GCC should handle a call to mempcpy in the same way as memcpy. Then either the mempcpy macro in string/string.h has to be removed or _HAVE_STRING_ARCH_mempcpy has to be defined for S390. ChangeLog: [BZ #19765] * sysdeps/s390/mempcpy.S: New File. * sysdeps/s390/multiarch/mempcpy.c: Likewise. * sysdeps/s390/multiarch/Makefile (sysdep_routines): Add mempcpy. * sysdeps/s390/multiarch/ifunc-impl-list.c (__libc_ifunc_impl_list): Add mempcpy variants. * sysdeps/s390/s390-32/memcpy.S: Add mempcpy entry point. (memcpy): Adjust to be usable from mempcpy entry point. (__memcpy_mvcle): Likewise. * sysdeps/s390/s390-64/memcpy.S: Likewise. * sysdeps/s390/s390-32/multiarch/memcpy-s390.S: Add entry points ____mempcpy_z196, ____mempcpy_z10 and add __GI_ symbols for mempcpy. (__memcpy_z196): Adjust to be usable from mempcpy entry point. (__memcpy_z10): Likewise. * sysdeps/s390/s390-64/multiarch/memcpy-s390x.S: Likewise.
*	S390: Do not call memcpy, memcmp, memset within libc.so via ifunc-plt.	Stefan Liebler	2016-05-24	7	-5/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On s390, the memcpy, memcmp, memset functions are IFUNC symbols, which are created with s390_libc_ifunc-macro. This macro creates a __GI_ symbol which is set to the ifunced symbol. Thus calls within libc.so to e.g. memcpy result in a call to ABS+0x954c0@plt stub and afterwards to the resolved memcpy-ifunc-variant. This patch sets the __GI_ symbol to the default-ifunc-variant to avoid the plt call. The __GI_ symbols are now created at the default variant of ifunced function. ChangeLog: * sysdeps/s390/multiarch/ifunc-resolve.h (s390_libc_ifunc): Remove __GI_ symbol. * sysdeps/s390/s390-32/multiarch/memcmp-s390.S: Add __GI_memcmp symbol. * sysdeps/s390/s390-64/multiarch/memcmp-s390x.S: Likewise. * sysdeps/s390/s390-32/multiarch/memcpy-s390.S: Add __GI_memcpy symbol. * sysdeps/s390/s390-64/multiarch/memcpy-s390x.S: Likewise. * sysdeps/s390/s390-32/multiarch/memset-s390.S: Add __GI_memset symbol. * sysdeps/s390/s390-64/multiarch/memset-s390x.S: Likewise.
*	S390: Use 64bit instruction to check for copies of > 1MB with mvcle.	Stefan Liebler	2016-05-24	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	The __memcpy_default variant on s390 64bit calculates the number of 256byte blocks in a 64bit register and checks, if they exceed 1MB to jump to mvcle. Otherwise a mvc-loop is used. The compare-instruction only checks a 32bit value. This patch uses a 64bit compare. ChangeLog: * sysdeps/s390/s390-64/memcpy.S (memcpy): Use cghi instead of chi to compare 64bit value.
*	S390: Use mvcle for copies > 1MB on 32bit with default memcpy variant.	Stefan Liebler	2016-05-24	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	If more than 255 bytes should be copied, the algorithm jumps away. Before this patch, it jumps to the mvc-loop (.L_G5_12). Now it jumps first to the "> 1MB" check, which jumps away to __memcpy_mvcle. Otherwise, the mvc-loop (.L_G5_12) copies the bytes. ChangeLog: * sysdeps/s390/s390-32/memcpy.S (memcpy): Jump to 1MB check before executing mvc-loop.
*	S390: Use fPIC to avoid R_390_GOT12 relocation in gcrt1.o.	Stefan Liebler	2016-05-11	2	-4/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	if glibc is build with -march=z900 \| -march=z990, the startup file gcrt1.o (used if you link with gcc -pg) contains R_390_GOT12 \| R_390_GOT20 relocations. Thus, an entry in the GOT can be addressed relative to the GOT pointer with a 12 \| 20 bit displacement value. The startup files should not contain R_390_GOT12, R_390_GOT20 relocations, but R_390_GOTENT ones. This patch removes the overrides of pic-ccflag and the default pic-ccflag = -fPIC in Makeconfig is used instead to get the R_390_GOTENT relocations in gcrt1.o. ChangeLog: * sysdeps/s390/s390-32/Makefile (pic-ccflag): Remove. * sysdeps/s390/s390-64/Makefile: Likewise.
*	S390: Use ahi instead of aghi in 32bit _dl_runtime_resolve.	Stefan Liebler	2016-04-01	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	This patch uses ahi instead of aghi in 32bit _dl_runtime_resolve to adjust the stack pointer. This is no functional change, but a cosmetic one. ChangeLog: * sysdeps/s390/s390-32/dl-trampoline.h (_dl_runtime_resolve): Use ahi instead of aghi to adjust stack pointer.
*	S390: Extend structs La_s390_regs / La_s390_retval with vector-registers.	Stefan Liebler	2016-03-31	3	-65/+124
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Starting with z13, vector registers can also occur as argument registers. Thus the passed input/output register structs for la_s390_[32\|64]_gnu_plt[enter\|exit] functions should reflect those new registers. This patch extends these structs La_s390_regs and La_s390_retval and adjusts _dl_runtime_profile() to handle those fields in case of running on a z13 machine. ChangeLog: * sysdeps/s390/bits/link.h: (La_s390_vr) New typedef. (La_s390_32_regs): Append vector register lr_v24-lr_v31. (La_s390_64_regs): Likewise. (La_s390_32_retval): Append vector register lrv_v24. (La_s390_64_retval): Likeweise. * sysdeps/s390/s390-32/dl-trampoline.h (_dl_runtime_profile): Handle extended structs La_s390_32_regs and La_s390_32_retval. * sysdeps/s390/s390-64/dl-trampoline.h (_dl_runtime_profile): Handle extended structs La_s390_64_regs and La_s390_64_retval.
*	S390: Save and restore fprs/vrs while resolving symbols.	Stefan Liebler	2016-03-31	6	-248/+496
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On s390, no fpr/vrs were saved while resolving a symbol via _dl_runtime_resolve/_dl_runtime_profile. According to the abi, the fpr-arguments are defined as call clobbered. In leaf-functions, gcc 4.9 and newer can use fprs for saving/restoring gprs instead of saving them to the stack. If gcc do this in one of the resolver-functions, then the floating point arguments of a library-function are invalid for the first library-function-call. Thus, this patch saves/restores the fprs around the resolving code. The same could occur for vector registers. Furthermore an ifunc-resolver could also clobber the vector/floating point argument registers. Thus this patch provides the further variants _dl_runtime_resolve_vx/ _dl_runtime_profile_vx, which are used if the kernel claims, that we run on a machine with vector registers. Furthermore, if _dl_runtime_profile calls _dl_call_pltexit, the pointers to inregs-/outregs-structs were setup invalid. Now they point to the correct location in the stack-frame. Before branching back to the caller, the return values are now restored instead of containing the return values of the _dl_call_pltexit() call. On s390-32, an endless loop occurs if _dl_call_pltexit() should be called. Now, this code-path branches to this function instead of just after the preceding basr-instruction. ChangeLog: * sysdeps/s390/s390-32/dl-trampoline.S: Include dl-trampoline.h twice to create a non-vector/vector version for _dl_runtime_resolve and _dl_runtime_profile. Move implementation to ... * sysdeps/s390/s390-32/dl-trampoline.h: ... here. (_dl_runtime_resolve) Save and restore fpr/vrs. (_dl_runtime_profile) Save and restore vrs and fix some issues if _dl_call_pltexit is called. * sysdeps/s390/s390-32/dl-machine.h (elf_machine_runtime_setup): Choose the correct resolver function if running on a machine with vx. * sysdeps/s390/s390-64/dl-trampoline.S: Include dl-trampoline.h twice to create a non-vector/vector version for _dl_runtime_resolve and _dl_runtime_profile. Move implementation to ... * sysdeps/s390/s390-64/dl-trampoline.h: ... here. (_dl_runtime_resolve) Save and restore fpr/vrs. (_dl_runtime_profile) Save and restore vrs and fix some issues * sysdeps/s390/s390-64/dl-machine.h: (elf_machine_runtime_setup): Choose the correct resolver function if running on a machine with vx.
*	Add _STRING_INLINE_unaligned and string_private.h	H.J. Lu	2016-02-18	2	-2/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As discussed in https://sourceware.org/ml/libc-alpha/2015-10/msg00403.html the setting of _STRING_ARCH_unaligned currently controls the external GLIBC ABI as well as selecting the use of unaligned accesses withing GLIBC. Since _STRING_ARCH_unaligned was recently changed for AArch64, this would potentially break the ABI in GLIBC 2.23, so split the uses and add _STRING_INLINE_unaligned to select the string ABI. This setting must be fixed for each target, while _STRING_ARCH_unaligned may be changed from release to release. _STRING_ARCH_unaligned is used unconditionally in glibc. But <bits/string.h>, which defines _STRING_ARCH_unaligned, isn't included with -Os. Since _STRING_ARCH_unaligned is internal to glibc and may change between glibc releases, it should be made private to glibc. _STRING_ARCH_unaligned should defined in the new string_private.h heade file which is included unconditionally from internal <string.h> for glibc build. [BZ #19462] * bits/string.h (_STRING_ARCH_unaligned): Renamed to ... (_STRING_INLINE_unaligned): This. * include/string.h: Include <string_private.h>. * string/bits/string2.h: Replace _STRING_ARCH_unaligned with _STRING_INLINE_unaligned. * sysdeps/aarch64/bits/string.h (_STRING_ARCH_unaligned): Removed. (_STRING_INLINE_unaligned): New. * sysdeps/aarch64/string_private.h: New file. * sysdeps/generic/string_private.h: Likewise. * sysdeps/m68k/m680x0/m68020/string_private.h: Likewise. * sysdeps/s390/string_private.h: Likewise. * sysdeps/x86/string_private.h: Likewise. * sysdeps/m68k/m680x0/m68020/bits/string.h (_STRING_ARCH_unaligned): Renamed to ... (_STRING_INLINE_unaligned): This. * sysdeps/s390/bits/string.h (_STRING_ARCH_unaligned): Renamed to ... (_STRING_INLINE_unaligned): This. * sysdeps/sparc/bits/string.h (_STRING_ARCH_unaligned): Renamed to ... (_STRING_INLINE_unaligned): This. * sysdeps/x86/bits/string.h (_STRING_ARCH_unaligned): Renamed to ... (_STRING_INLINE_unaligned): This.
*	S390: Regenerate ULPs	Stefan Liebler	2016-01-19	1	-238/+268
\| \| \| \| \| \| \| \| \|	I've regenerated ulps from scratch for s390/s390x. All math testcases are passing afterwards. ChangeLog: * sysdeps/s390/fpu/libm-test-ulps: Regenerated.
*	S/390: Do not raise inexact exception in lrint/lround. [BZ #19486]	Stefan Liebler	2016-01-18	1	-0/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I get some math test-failures on s390 for float/double/ldouble for various lrint/lround functions like: lrint (0x1p64): Exception "Inexact" set lrint (-0x1p64): Exception "Inexact" set lround (0x1p64): Exception "Inexact" set lround (-0x1p64): Exception "Inexact" set ... GCC emits "convert to fixed" instructions for casting floating point values to integer values. These instructions raise invalid and inexact exceptions if the floating point value exceeds the integer type ranges. This patch enables the various FIX_DBL_LONG_CONVERT_OVERFLOW macros in order to avoid a cast from floating point to integer type and raise the invalid exception with feraiseexcept. The ldbl-128 rint/round functions are now using the same logic. ChangeLog: [BZ #19486] * sysdeps/s390/fix-fp-int-convert-overflow.h: New File. * sysdeps/generic/fix-fp-int-convert-overflow.h (FIX_LDBL_LONG_CONVERT_OVERFLOW, FIX_LDBL_LLONG_CONVERT_OVERFLOW): New define. * sysdeps/arm/fix-fp-int-convert-overflow.h: Likewise. * sysdeps/mips/mips32/fpu/fix-fp-int-convert-overflow.h: Likewise. * sysdeps/ieee754/ldbl-128/s_lrintl.c (__lrintl): Avoid conversions to long int where inexact exceptions could be raised. * sysdeps/ieee754/ldbl-128/s_lroundl.c (__lroundl): Likewise. * sysdeps/ieee754/ldbl-128/s_llrintl.c (__llrintl): Avoid conversions to long long int where inexact exceptions could be raised. * sysdeps/ieee754/ldbl-128/s_llroundl.c (__llroundl): Likewise.
*	Add __private_ss to s390 struct tcbhead.	Marcin Kościelnicki	2016-01-14	1	-0/+4
\| \| \| \| \| \| \| \| \| \|	Preparation for gcc -fsplit-stack support (gcc bug #68191). The new field is basically identical to the one on x86. Its TCB offset needs to be constant, as it'll be hardcoded in gcc. ChangeLog: * sysdeps/s390/nptl/tls.h (struct tcbhead_t): Add __private_ss field.
*	Update copyright dates with scripts/update-copyrights.	Joseph Myers	2016-01-04	244	-244/+244
\|