about summary refs log tree commit diff
Commit message (Collapse)AuthorAgeFilesLines
* Speedup tanf range reductionWilco Dijkstra2018-08-232-1/+32
| | | | | | | | | | | | | | | | Speedup tanf range reduction by using the new sincosf range reduction algorithm. Overall code quality is improved due to inlining, so there is a speedup even if no range reduction is required. tanf throughput gains on Cortex-A72: * |x| < M_PI_4 : 1.1x * |x| < M_PI_2 : 1.2x * |x| < 2 * M_PI: 1.5x * |x| < 120.0 : 1.6x * |x| < Inf : 12.1x * sysdeps/ieee754/flt-32/s_tanf.c (__tanf): Use fast range reduction.
* Add test-in-container infrastructure.DJ Delorie2018-08-2220-6/+1779
| | | | | | | | | | | | | | | | | | | | | | | | | * Makefile (testroot.pristine): New rules to initialize the test-in-container "testroot". * Makerules (all-testsuite): Add tests-container. * Rules (tests-expected): Add tests-container. (binaries-all-tests): Likewise. (tests-container): New, run these tests in the testroot container. * support/Makefile (others): Add *-container, support_paths.c, xmkdirp, and links-dso-program. * support/links-dso-program-c.c: New. * support/links-dso-program.cc: New. * support/test-container.c: New. * support/shell-container.c: New. * support/echo-container.c: New. * support/true-container.c: New. * support/xmkdirp.c: New. * support/xsymlink.c: New. * support/support_paths.c: New. * support/support.h: Add support paths prototypes. * support/xunistd.h: Add xmkdirp () and xsymlink (). * nss/tst-nss-test3.c: Convert to test-in-container. * nss/tst-nss-test3.root/: New.
* regex: port Gnulib code to z/OS POSIX environmentPaul Eggert2018-08-222-0/+11
| | | | | | | Problem reported by Arnold Robbins in: https://lists.gnu.org/r/bug-gnulib/2018-08/msg00129.html * posix/regex_internal.h (__iswalnum, __towlower, __towupper) [!_LIBC]: Undef.
* Don't redefine ROUNDING_TESTS_* in math/test-*-vlen*.h.Joseph Myers2018-08-228-24/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch completes the move of ROUNDING_TESTS_* macros to typo-proof conventions by stopping redefining them in test-*-vlen*.h. Instead, libm-test-driver.c is made to check TEST_MATHVEC when setting non-to-nearest rounding modes. Tested for x86_64. * math/test-double-vlen2.h: Don't include <math-tests-rounding.h>. (ROUNDING_TESTS_double): Remove. * math/test-double-vlen4.h: Don't include <math-tests-rounding.h>. (ROUNDING_TESTS_double): Remove. * math/test-double-vlen8.h: Don't include <math-tests-rounding.h>. (ROUNDING_TESTS_double): Remove. * math/test-float-vlen16.h: Don't include <math-tests-rounding.h>. (ROUNDING_TESTS_float): Remove. * math/test-float-vlen4.h: Don't include <math-tests-rounding.h>. (ROUNDING_TESTS_float): Remove. * math/test-float-vlen8.h: Don't include <math-tests-rounding.h>. (ROUNDING_TESTS_float): Remove. * math/libm-test-driver.c (IF_ROUND_INIT_FE_DOWNWARD): Check !TEST_MATHVEC here. (IF_ROUND_INIT_FE_TOWARDZERO): Likewise. (IF_ROUND_INIT_FE_UPWARD): Likewise.
* Move ROUNDING_TESTS_* out of math-tests.h.Joseph Myers2018-08-2216-35/+141
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Continuing moving macros out of math-tests.h to smaller headers following typo-proof conventions instead of using #ifndef, this patch moves the ROUNDING_TESTS_* macros for individual types out to their own sysdeps header. In the soft-float case where FE_TONEAREST is the only rounding mode macro defined, there is no need to define ROUNDING_TESTS_*; it is only necessary when rounding modes macros are defined that may not be supported at runtime. Thus, the ROUNDING_TESTS_* definitions for some configurations are just removed, not moved to new math-tests-rounding.h headers; the only architectures needing math-tests-rounding.h are those where the macros are defined in bits/fenv.h because of the possibility of a soft-float compilation using a hard-float glibc with the same ABI (i.e., ARM and RISC-V). The test-*-vlen*.h headers, by using #undef, do not yet follow typo-proof conventions (but they no longer implicitly rely on being included before math-tests.h, and this area can always be cleaned up further in future). Tested with build-many-glibcs.py. * sysdeps/generic/math-tests-rounding.h: New file. * sysdeps/generic/math-tests.h: Include <math-tests-rounding.h>. (ROUNDING_TESTS_float): Do not define here. (ROUNDING_TESTS_double): Likewise. (ROUNDING_TESTS_long_double): Likewise. (ROUNDING_TESTS_float128): Likewise. * math/test-double-vlen2.h: Include <math-tests-rounding.h>. (ROUNDING_TESTS_double): Undefine before defining. * math/test-double-vlen4.h: Include <math-tests-rounding.h>. (ROUNDING_TESTS_double): Undefine before defining. * math/test-double-vlen8.h: Include <math-tests-rounding.h>. (ROUNDING_TESTS_double): Undefine before defining. * math/test-float-vlen16.h: Include <math-tests-rounding.h>. (ROUNDING_TESTS_float): Undefine before defining. * math/test-float-vlen4.h: Include <math-tests-rounding.h>. (ROUNDING_TESTS_float): Undefine before defining. * math/test-float-vlen8.h: Include <math-tests-rounding.h>. (ROUNDING_TESTS_float): Undefine before defining. * sysdeps/arm/nofpu/math-tests-rounding.h: New file. * sysdeps/arm/math-tests.h [__SOFTFP__] (ROUNDING_TESTS_float): Do not define here. [__SOFTFP__] (ROUNDING_TESTS_double): Likewise. [__SOFTFP__] (ROUNDING_TESTS_long_double): Likewise. * sysdeps/riscv/nofpu/math-tests-rounding.h: New file. * sysdeps/riscv/math-tests.h [!__riscv_flen] (ROUNDING_TESTS_float): Do not define here. [!__riscv_flen] (ROUNDING_TESTS_double): Likewise. [!__risv_flen] (ROUNDING_TESTS_long_double): Likewise. * sysdeps/m68k/coldfire/math-tests.h [!__mcffpu__] (ROUNDING_TESTS_float): Likewise. [!__mcffpu__] (ROUNDING_TESTS_double): Likewise. [!__mcffpu__] (ROUNDING_TESTS_long_double): Likewise. * sysdeps/mips/math-tests.h [__mips_soft_float] (ROUNDING_TESTS_float): Likewise. [__mips_soft_float] (ROUNDING_TESTS_double): Likewise. [__mips_soft_float] (ROUNDING_TESTS_long_double): Likewise. * sysdeps/nios2/math-tests.h (ROUNDING_TESTS_float): Likewise. (ROUNDING_TESTS_double): Likewise. (ROUNDING_TESTS_long_double): Likewise.
* Add PF_XDP, AF_XDP and SOL_XDP from Linux 4.18 to bits/socket.h.Tobias Klauser2018-08-212-1/+11
| | | | | | | | | | This patch adds the PF_XDP, AF_XDP and SOL_XDP macros from Linux 4.18 to sysdeps/unix/sysv/linux/bits/socket.h. * sysdeps/unix/sysv/linux/bits/socket.h (PF_MAX): Set to 45. (PF_XDP): New macro. (AF_XDP): New macro. (SOL_XDP): New macro.
* Update netinet/tcp.h from Linux 4.18.Joseph Myers2018-08-212-0/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds constants from netinet/tcp.h in Linux 4.18, and an associated struct tcp_zerocopy_receive, to sysdeps/gnu/netinet/tcp.h. The new TCP_REPAIR_* constants seemed sufficiently related to those already present to include them. Note that this patch does not include additions to struct tcp_info; there are many other elements in this structure in the Linux kernel that are not included in the glibc version (which was last extended in 2007, it seems). Such additions to the end of the structure may be OK with the expected way it is used (size passed explicitly to the kernel with getsockopt), but in principle any change to the size of a type provided by glibc is an ABI change for external applications / libraries using that type in their ABIs, and has the associated risks of such a change. Tested for x86_64. * sysdeps/gnu/netinet/tcp.h (TCP_ZEROCOPY_RECEIVE): New macro. (TCP_INQ): Likewise. (TCP_CM_INQ): Likewise. (TCP_REPAIR_ON): Likewise. (TCP_REPAIR_OFF): Likewise. (TCP_REPAIR_OFF_NO_WP): Likewise. (struct tcp_zerocopy_receive): New type.
* Avoid running some tests if the file system does not support holesFlorian Weimer2018-08-2110-1/+135
| | | | | Otherwise, these tests fills up the entire disk (or just run very slowly and eventually time out).
* Makeconfig: Do not sort and deduplicate +cflags [BZ # 17248]Florian Weimer2018-08-212-3/+5
| | | | | The original intent behind this is unclear. It interferes with flags that has to be ordered in a particular way.
* __readlink_chk: Remove micro-optimizationFlorian Weimer2018-08-213-37/+5
|
* __readlink_chk: Assume HAVE_INLINED_SYSCALLSFlorian Weimer2018-08-202-7/+6
| | | | | HAVE_INLINED_SYSCALLS is always defined on Linux. Switch to INLINE_SYSCALL_CALL as well.
* Update struct signalfd_siginfo from Linux 4.18.Joseph Myers2018-08-202-1/+10
| | | | | | | | | | | | | This patch updates struct signalfd_siginfo in sys/signalfd.h with new members from Linux 4.18 (plus ssi_addr_lsb, added to the kernel in 2.6.37 without being added to sys/signalfd.h at that time). The __pad2 member name follows the kernel and the existing __pad name. Tested for x86_64. * sysdeps/unix/sysv/linux/sys/signalfd.h (struct signalfd_siginfo): Add ssi_addr_lsb, ssi_syscall, ssi_call_addr and ssi_arch members.
* Add NT_VMCOREDD, AT_MINSIGSTKSZ from Linux 4.18 to elf.h.Joseph Myers2018-08-202-0/+9
| | | | | | | | | | This patch adds two new constants from Linux 4.18 to elf.h, NT_VMCOREDD and AT_MINSIGSTKSZ. Tested for x86_64. * elf/elf.c (NT_VMCOREDD): New macro. (AT_MINSIGSTKSZ): Likewise.
* malloc: Add ChangeLog for accidentally committed changeFlorian Weimer2018-08-202-1/+5
| | | | | | Commit b90ddd08f6dd688e651df9ee89ca3a69ff88cd0c ("malloc: Additional checks for unsorted bin integrity I.") was committed without a whitespace fix, so it is adjusted here as well.
* powerpc: Remove powerpc specific sinf and cosf optimizationRajalakshmi Srinivasaraghavan2018-08-2014-1453/+18
| | | | | | | New generic optimization of sinf and cosf introduced by commit 599cf3976679e1b345307d9c02057f02aa95528f shows improvement compared to powerpc specific assembly version. Hence removing the powerpc assembly versions to make use of generic code.
* math: Regenerate s390 ulpsFlorian Weimer2018-08-172-0/+22
| | | | | Based on results on a s390x 2964 machine, with -march=z196 and -mtune=zEC12, and separately with -march=z13 and -mtune=z14.
* malloc: Additional checks for unsorted bin integrity I.Istvan Kurucsai2018-08-171-4/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On Thu, Jan 11, 2018 at 3:50 PM, Florian Weimer <fweimer@redhat.com> wrote: > On 11/07/2017 04:27 PM, Istvan Kurucsai wrote: >> >> + next = chunk_at_offset (victim, size); > > > For new code, we prefer declarations with initializers. Noted. >> + if (__glibc_unlikely (chunksize_nomask (victim) <= 2 * SIZE_SZ) >> + || __glibc_unlikely (chunksize_nomask (victim) > >> av->system_mem)) >> + malloc_printerr("malloc(): invalid size (unsorted)"); >> + if (__glibc_unlikely (chunksize_nomask (next) < 2 * SIZE_SZ) >> + || __glibc_unlikely (chunksize_nomask (next) > >> av->system_mem)) >> + malloc_printerr("malloc(): invalid next size (unsorted)"); >> + if (__glibc_unlikely ((prev_size (next) & ~(SIZE_BITS)) != >> size)) >> + malloc_printerr("malloc(): mismatching next->prev_size >> (unsorted)"); > > > I think this check is redundant because prev_size (next) and chunksize > (victim) are loaded from the same memory location. I'm fairly certain that it compares mchunk_size of victim against mchunk_prev_size of the next chunk, i.e. the size of victim in its header and footer. >> + if (__glibc_unlikely (bck->fd != victim) >> + || __glibc_unlikely (victim->fd != unsorted_chunks (av))) >> + malloc_printerr("malloc(): unsorted double linked list >> corrupted"); >> + if (__glibc_unlikely (prev_inuse(next))) >> + malloc_printerr("malloc(): invalid next->prev_inuse >> (unsorted)"); > > > There's a missing space after malloc_printerr. Noted. > Why do you keep using chunksize_nomask? We never investigated why the > original code uses it. It may have been an accident. You are right, I don't think it makes a difference in these checks. So the size local can be reused for the checks against victim. For next, leaving it as such avoids the masking operation. > Again, for non-main arenas, the checks against av->system_mem could be made > tighter (against the heap size). Maybe you could put the condition into a > separate inline function? We could also do a chunk boundary check similar to what I proposed in the thread for the first patch in the series to be even more strict. I'll gladly try to implement either but believe that refining these checks would bring less benefits than in the case of the top chunk. Intra-arena or intra-heap overlaps would still be doable here with unsorted chunks and I don't see any way to counter that besides more generic measures like randomizing allocations and your metadata encoding patches. I've attached a revised version with the above comments incorporated but without the refined checks. Thanks, Istvan From a12d5d40fd7aed5fa10fc444dcb819947b72b315 Mon Sep 17 00:00:00 2001 From: Istvan Kurucsai <pistukem@gmail.com> Date: Tue, 16 Jan 2018 14:48:16 +0100 Subject: [PATCH v2 1/1] malloc: Additional checks for unsorted bin integrity I. Ensure the following properties of chunks encountered during binning: - victim chunk has reasonable size - next chunk has reasonable size - next->prev_size == victim->size - valid double linked list - PREV_INUSE of next chunk is unset * malloc/malloc.c (_int_malloc): Additional binning code checks.
* Add --with-nonshared-cflags option to configureFlorian Weimer2018-08-177-1/+64
|
* Makeconfig (ASFLAGS): Always append required assembler flagsFlorian Weimer2018-08-172-1/+5
| | | | | Otherwise, it is impossible to set ASFLAGS differently from CFLAGS without also overriding essential flags such as -Wa,--noexecstack.
* Fix attribution of previous change in ChangeLogFlorian Weimer2018-08-171-1/+1
|
* malloc: Mitigate null-byte overflow attacksMoritz Eckert2018-08-162-0/+9
| | | | | * malloc/malloc.c (_int_free): Check for corrupt prev_size vs size. (malloc_consolidate): Likewise.
* malloc: Verify size of top chunk.Pochang Chen2018-08-162-0/+7
| | | | | | | | | | | | The House of Force is a well-known technique to exploit heap overflow. In essence, this exploit takes three steps: 1. Overwrite the size of top chunk with very large value (e.g. -1). 2. Request x bytes from top chunk. As the size of top chunk is corrupted, x can be arbitrarily large and top chunk will still be offset by x. 3. The next allocation from top chunk will thus be controllable. If we verify the size of top chunk at step 2, we can stop such attack.
* Reallocate buffers for every run in strlenSiddhesh Poyarekar2018-08-162-7/+13
| | | | Try and avoid influencing performance of neighbouring functions.
* Print strlen benchmark output in jsonSiddhesh Poyarekar2018-08-162-15/+42
| | | | Allow reading the benchmark using the compare_strings.py script.
* powerpc: Rearrange little endian specific filesRajalakshmi Srinivasaraghavan2018-08-169-24/+53
| | | | | | This patch moves little endian specific POWER9 optimization files to sysdeps/powerpc/powerpc64/le and creates POWER9 ifunc functions only for little endian.
* [aarch64] Add an ASIMD variant of strlen for falkorSiddhesh Poyarekar2018-08-157-4/+273
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This variant of strlen uses vector loads and operations to reduce the size of the code and also eliminate the non-ascii fallback. This works very well for falkor because of its two vector units and efficient vector ops. In the best case it reduces latency of cases in bench-strlen by 48%, with gains throughout the benchmark. strlen-walk also sees uniform gains in the 5%-15% range. Overall the routine appears to work better than the stock one for falkor regardless of the benchmark, length of string or cache state. The same cannot be said of a53 and a72 though. a53 performance was greatly reduced and for a72 it was a bit of a mixed bag, slightly on the negative side but I reckon it might be fast in some situations. * sysdeps/aarch64/strlen.S (__strlen): Rename to STRLEN. [!STRLEN](STRLEN): Set to __strlen. * sysdeps/aarch64/multiarch/strlen.c: New file. * sysdeps/aarch64/multiarch/strlen_generic.S: Likewise. * sysdeps/aarch64/multiarch/strlen_asimd.S: Likewise. * sysdeps/aarch64/multiarch/ifunc-impl-list.c (__libc_ifunc_impl_list): Add strlen. * sysdeps/aarch64/multiarch/Makefile (sysdep_routines): Add strlen_generic and strlen_asimd. Reviewed-By: szabolcs.nagy@arm.com CC: pinskia@gmail.com
* Use generic sinf/cosf in lgammaf_rWilco Dijkstra2018-08-154-112/+14
| | | | | | | | | | | | | | The internal functions __kernel_sinf and __kernel_cosf are used only by lgammaf_r. Removing the internal functions and using the generic sinf and cosf is better overall. Benchmarking on Cortex-A72 shows the generic sinf and cosf are 1.4x and 2.3x faster in the range |x| < PI/4, and 0.66x and 1.1x for |x| < PI/2, so it should make lgammaf_r faster on average. GLIBC regression tests pass on AArch64. * sysdeps/ieee754/flt-32/e_lgammaf_r.c (sin_pif): Use __sinf/__cosf. * sysdeps/ieee754/flt-32/k_cosf.c (__kernel_cosf): Remove all code. * sysdeps/ieee754/flt-32/k_sinf.c (__kernel_sinf): Likewise.
* Fix spaces in x86_64 ULP fileWilco Dijkstra2018-08-152-3/+7
| | | | | | | | Fix a few missing spaces, it's now identical to the regenerated version. Passes GLIBC tests on x64. * sysdeps/x86_64/fpu/libm-test-ulps: Regenerate to fix spaces.
* error, warn, warnx: Use __fxprintf for wide printing [BZ #23519]Florian Weimer2018-08-147-145/+129
| | | | Also introduce the __vfxprintf function.
* Improve performance of sinf and cosfWilco Dijkstra2018-08-149-384/+207
| | | | | | | | | | | | | | | | | | | | | | | | | The second patch improves performance of sinf and cosf using the same algorithms and polynomials. The returned values are identical to sincosf for the same input. ULP definitions for AArch64 and x64 are updated. sinf/cosf througput gains on Cortex-A72: * |x| < 0x1p-12 : 1.2x * |x| < M_PI_4 : 1.8x * |x| < 2 * M_PI: 1.7x * |x| < 120.0 : 2.3x * |x| < Inf : 3.0x * NEWS: Mention sinf, cosf, sincosf. * sysdeps/aarch64/libm-test-ulps: Update ULP for sinf, cosf, sincosf. * sysdeps/x86_64/fpu/libm-test-ulps: Update ULP for sinf and cosf. * sysdeps/x86_64/fpu/multiarch/s_sincosf-fma.c: Add definitions of constants rather than including generic sincosf.h. * sysdeps/x86_64/fpu/s_sincosf_data.c: Remove. * sysdeps/ieee754/flt-32/s_cosf.c (cosf): Rewrite. * sysdeps/ieee754/flt-32/s_sincosf.h (reduced_sin): Remove. (reduced_cos): Remove. (sinf_poly): New function. * sysdeps/ieee754/flt-32/s_sinf.c (sinf): Rewrite.
* nss_files: Fix file stream leak in aliases lookup [BZ #23521]Florian Weimer2018-08-144-0/+262
| | | | | In order to get a clean test case, it was necessary to fix partially fixed bug 23522 as well.
* nscd: Deallocate existing user names in file parserFlorian Weimer2018-08-142-1/+10
| | | | | This avoids a theoretical memory leak (theoretical because it depends on multiple server-user/stat-user directives in the configuration file).
* Update syscall-names.list for Linux 4.18.Joseph Myers2018-08-132-2/+9
| | | | | | | | | | | | | This patch updates sysdeps/unix/sysv/linux/syscall-names.list for Linux 4.18. The io_pgetevents and rseq syscalls are added to the kernel on various architectures, so need to be mentioned in this file. Tested with build-many-glibcs.py. * sysdeps/unix/sysv/linux/syscall-names.list: Update kernel version to 4.18. (io_pgetevents): New syscall. (rseq): Likewise.
* Update install.texi documentation of uses of Perl and Python.Joseph Myers2018-08-133-73/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The install.texi documentation of uses of Perl and Python is substantially out of date. The description of Perl is "to test the installation" (which I interpret as referring to test-installation.pl), but it's used for more tests than that, and to build the manual, and to regenerate one file in the source tree. The description of Python is only for pretty-printer tests, but it's used for other tests / benchmarks as well (and for other internal uses such as updating Unicode data, for which we already require Python 3, but I think install.texi only needs to describe uses from the main glibc Makefiles). This patch updates the descriptions of what those tools are used for. The Python information (and information about other tools for testing pretty printers) was awkwardly in the middle of the general description of building and testing glibc, rather than with the rest of information about tools used in glibc build and test; this patch moves the information about those tools into the main list. Tested with regeneration of INSTALL as well as "make info" and "make pdf". * manual/install.texi (Configuring and compiling): Do not list tools used for testing pretty printers here. (Tools for Compilation): List Python, PExpect and GDB here. Update descriptions of uses of Perl and Python. * INSTALL: Regenerate.
* Use Linux 4.18 in build-many-glibcs.py.Joseph Myers2018-08-132-1/+6
| | | | | * scripts/build-many-glibcs.py (Context.checkout): Default Linux version to 4.18.
* error, error_at_line: Add missing va_end callsFlorian Weimer2018-08-132-0/+7
|
* mbstowcs: Remove outdated commentFlorian Weimer2018-08-132-6/+5
| | | | | | ISO C requires that there is no effect on any global conversion state, so the change in commit 9f097308c7465443765d1e25699a4cf33eff5455 was correct in princple.
* [benchtests] Add workload test properties to schemaSiddhesh Poyarekar2018-08-112-0/+7
| | | | | | | | Add the workload test properties (max-throughput, latency, etc.) to the schema to prevent benchmark output validation from failing. * benchtests/scripts/benchout.schema.json (properties): Add new properties.
* [benchtests] Add mandatory attributes to workload testsSiddhesh Poyarekar2018-08-112-0/+7
| | | | | | | | Add the duration and iterations attributes to the workloads tests to make the json schema parser happy * benchtests/bench-skeleton.c (main): Add duration and iterations attributes.
* ChangeLog: Fix an obvious typo.Rafal Luzynski2018-08-101-1/+1
| | | | | The typo has been introduced in commit 08a5ee14c6fcd87caa4f6f5c442be2fc345211f0.
* regex: Gnulib unibyte RRI uses bytes not charsPaul Eggert2018-08-102-5/+16
| | | | | | | | | | | Adjust the non-glibc code to agree with what Gawk needs for rational range interpretation (RRI) for regular expression ranges. In unibyte locales, Gawk wants ranges to use the underlying byte rather than the character code point. This change does not affect glibc proper. * posix/regcomp.c (parse_byte) [!LIBC && RE_ENABLE_I18N]: In unibyte locales, use the byte value rather than running it through btowc.
* Move SNAN_TESTS_* out of math-tests.h.Joseph Myers2018-08-106-33/+93
| | | | | | | | | | | | | | | | | | | | | | | Continuing moving macros out of math-tests.h to smaller headers following typo-proof conventions instead of using #ifndef, this patch moves the SNAN_TESTS_* macros for individual types out to their own sysdeps header (while the type-generic SNAN_TESTS wrapper for those macros remains in math-tests.h). Tested for x86_64 and x86, and with build-many-glibcs.py. * sysdeps/generic/math-tests-snan.h: New file. * sysdeps/generic/math-tests.h: Include <math-tests-snan.h>. (SNAN_TESTS_float): Do not define here. (SNAN_TESTS_double): Likewise. (SNAN_TESTS_long_double): Likewise. (SNAN_TESTS_float128): Likewise. * sysdeps/i386/fpu/math-tests-snan.h: New file. * sysdeps/i386/fpu/math-tests.h: Remove file. * sysdeps/ia64/math-tests-snan.h: New file. * sysdeps/ia64/math-tests.h: Remove file. * sysdeps/x86/math-tests.h: Likewise. * sysdeps/x86_64/fpu/math-tests-snan.h: New file.
* Improve performance of sincosfWilco Dijkstra2018-08-108-133/+274
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch is a complete rewrite of sincosf. The new version is significantly faster, as well as simple and accurate. The worst-case ULP is 0.5607, maximum relative error is 0.5303 * 2^-23 over all 4 billion inputs. In non-nearest rounding modes the error is 1ULP. The algorithm uses 3 main cases: small inputs which don't need argument reduction, small inputs which need a simple range reduction and large inputs requiring complex range reduction. The code uses approximate integer comparisons to quickly decide between these cases. The small range reducer uses a single reduction step to handle values up to 120.0. It is fastest on targets which support inlined round instructions. The large range reducer uses integer arithmetic for simplicity. It does a 32x96 bit multiply to compute a 64-bit modulo result. This is more than accurate enough to handle the worst-case cancellation for values close to an integer multiple of PI/4. It could be further optimized, however it is already much faster than necessary. sincosf throughput gains on Cortex-A72: * |x| < 0x1p-12 : 1.6x * |x| < M_PI_4 : 1.7x * |x| < 2 * M_PI: 1.5x * |x| < 120.0 : 1.8x * |x| < Inf : 2.3x * math/Makefile: Add s_sincosf_data.c. * sysdeps/ia64/fpu/s_sincosf_data.c: New file. * sysdeps/ieee754/flt-32/s_sincosf.h (abstop12): Add new function. (sincosf_poly): Likewise. (reduce_small): Likewise. (reduce_large): Likewise. * sysdeps/ieee754/flt-32/s_sincosf.c (sincosf): Rewrite. * sysdeps/ieee754/flt-32/s_sincosf_data.c: New file with sincosf data. * sysdeps/m68k/m680x0/fpu/s_sincosf_data.c: New file. * sysdeps/x86_64/fpu/s_sincosf_data.c: New file.
* Clean up converttoint handling and document the semanticsSzabolcs Nagy2018-08-104-19/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch currently only affects aarch64. The roundtoint and converttoint internal functions are only called with small values, so 32 bit result is enough for converttoint and it is a signed int conversion so the return type is changed to int32_t. The original idea was to help the compiler keeping the result in uint64_t, then it's clear that no sign extension is needed and there is no accidental undefined or implementation defined signed int arithmetics. But it turns out gcc does a good job with inlining so changing the type has no overhead and the semantics of the conversion is less surprising this way. Since we want to allow the asuint64 (x + 0x1.8p52) style conversion, the top bits were never usable and the existing code ensures that only the bottom 32 bits of the conversion result are used. On aarch64 the neon intrinsics (which round ties to even) are changed to round and lround (which round ties away from zero) this does not affect the results in a significant way, but more portable (relies on round and lround being inlined which works with -fno-math-errno). The TOINT_SHIFT and TOINT_RINT macros were removed, only keep separate code paths for TOINT_INTRINSICS and !TOINT_INTRINSICS. * sysdeps/aarch64/fpu/math_private.h (roundtoint): Use round. (converttoint): Use lround. * sysdeps/ieee754/flt-32/math_config.h (roundtoint): Declare and document the semantics when TOINT_INTRINSICS is set. (converttoint): Likewise. (TOINT_RINT): Remove. (TOINT_SHIFT): Remove. * sysdeps/ieee754/flt-32/e_expf.c (__expf): Remove the TOINT_RINT code path.
* Linux: Rewrite __old_getdents64 [BZ #23497]Florian Weimer2018-08-104-25/+187
| | | | | | | | | Commit 298d0e3129c0b5137f4989275b13fe30d0733c4d ("Consolidate Linux getdents{64} implementation") broke the implementation because it does not take into account struct offset differences. The new implementation is close to the old one, before the consolidation, but has been cleaned up slightly.
* S390: Fix unwind in 32-bit _mcountIlya Leoshkevich2018-08-102-2/+11
| | | | | | | | | | * Fix CFI offset for %r14. * Fix unwound value of %r15 being off by 128 bytes. ChangeLog: * sysdeps/s390/s390-32/s390-mcount.S (_mcount): Fix unwind.
* S390: Implement 64-bit __fentry__Ilya Leoshkevich2018-08-105-57/+134
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Since __fentry__ is almost the same as _mcount, reuse the code by #including it twice with different #defines around. * Remove LA usages - they are needed in 31-bit mode to clear the top bit, but in 64-bit they appear to do nothing. * Add CFI rule for the nonstandard return register. This rule applies to the current function (binutils generates a new CIE - see gas/dw2gencfi.c:select_cie_for_fde()), so it is not necessary to put __fentry__ into a new file. * Fix CFI offset for %r14. * Add CFI rule for %r0. * Fix unwound value of %r15 being off by 244 bytes. * Unwinding in __fentry__@plt does not work, no plan to fix it - it would require asking linker to generate CFI for return address in %r0. From functional perspective keeping it broken is fine, since the callee did not have a chance to do anything yet. From convenience perspective it would be possible to enhance GDB in the future to treat __fentry__@plt in a special way. * Fix whitespace. * Fix offsets in comments, which were copied from 32-bit code. * 32-bit version will not be implemented, since it's not compatible with the corresponding PLT stubs: they assume %r12 points to GOT, which is not the case for gcc-emitted __fentry__ stub, which runs before the prolog. This patch adds the runtime support in glibc for the -mfentry gcc feature introduced in [1] and [2]. [1] https://gcc.gnu.org/ml/gcc-patches/2018-07/msg00784.html [2] https://gcc.gnu.org/ml/gcc-patches/2018-07/msg00912.html ChangeLog: * sysdeps/s390/s390-64/Versions (__fentry__): Add. * sysdeps/s390/s390-64/s390x-mcount.S: Move the common code to s390x-mcount.h and #include it. * sysdeps/s390/s390-64/s390x-mcount.h: New file. * sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist (__fentry__): Add.
* Move __fentry__ version definition to sysdeps/{i386,x86_64}Ilya Leoshkevich2018-08-104-3/+12
| | | | | | | | | | | | | | __fentry__ symbol is currently not defined for other architectures. Attempts to introduce it cause abicheck to fail, because it will be available since 2.29 earliest, and not 2.13, which is the case for Intel. With the new code, abicheck passes for i686-linux-gnu, x86_64-linux-gnu and x86_64-linux-gnu32 triples. ChangeLog: * stdlib/Versions: Remove __fentry__. * sysdeps/i386/Versions: Add __fentry__. * sysdeps/x86_64/Versions: Add __fentry__.
* S390: Test that lazy binding does not clobber R0Ilya Leoshkevich2018-08-108-0/+230
| | | | | | | | | | | | | | | | | | | | | | The following combinations need to be tested: * 32- (g5, esa and zarch) and 64-bit * linux32 glibc/configure CC='gcc -m31 -march=g5' * linux32 glibc/configure CC='gcc -m31' * linux32 glibc/configure CC='gcc -m31 -mzarch' * With and without VX: * glibc/configure libc_cv_asm_s390_vx=no * With and without profiling (using LD_PROFILE) * With and without pltexit (using LD_AUDIT) ChangeLog: * sysdeps/s390/Makefile: Register the new tests. * sysdeps/s390/tst-dl-runtime-mod.S: New file. * sysdeps/s390/tst-dl-runtime-profile-audit.c: New file. * sysdeps/s390/tst-dl-runtime-profile-noaudit.c: New file. * sysdeps/s390/tst-dl-runtime-resolve-audit.c: New file. * sysdeps/s390/tst-dl-runtime-resolve-noaudit.c: New file. * sysdeps/s390/tst-dl-runtime.c: New file.
* S390: Do not clobber R0 in 64-bit _dl_runtime_profileIlya Leoshkevich2018-08-102-0/+11
| | | | | | | | | Preparation for the usage of R0 by __fentry__. ChangeLog: * sysdeps/s390/s390-64/dl-trampoline.h (_dl_runtime_profile): Do not clobber R0.