about summary refs log tree commit diff
path: root/sysdeps/i386
Commit message (Collapse)AuthorAgeFilesLines
* Move tests of catan, catanh to auto-libm-test-*.Joseph Myers2017-02-172-0/+24
| | | | | | | | | | | | | | | | | | | | | | | | This patch moves tests of catan and catanh with finite inputs (other than the divide-by-zero cases producing an exact infinity) to using the auto-libm-test machinery. Each of auto-libm-test-out-catan and auto-libm-test-out-catanh takes about three seconds to generate on my system (so in fact it wasn't necessary after all to defer the move to auto-libm-test-* until the output files were split up by function). Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add tests of catan and catanh. * math/auto-libm-test-out-catan: New generated file. * math/auto-libm-test-out-catanh: Likewise. * math/libm-test-catan.inc (catan_test_data): Use AUTO_TESTS_c_c. Move tests with finite inputs, except divide-by-zero cases, to auto-libm-test-in. * math/libm-test-catanh.inc (catanh_test_data): Likewise. * math/Makefile (libm-test-funcs-auto): Add catan and catanh. (libm-test-funcs-noauto): Remove catan and catanh. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Likewise. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
* Move tests of casin, casinh to auto-libm-test-*.Joseph Myers2017-02-172-72/+72
| | | | | | | | | | | | | | | | | | | | | | | This patch moves tests of casin and casinh with finite inputs to using the auto-libm-test machinery. Each of auto-libm-test-out-casin and auto-libm-test-out-casinh takes about 38 minutes to generate on my system because of MPC slowness on special cases that appear in the tests (with MPC 1.0.3; I don't know to what extent current MPC master might speed it up). Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add tests of casin and casinh. * math/auto-libm-test-out-casin: New generated file. * math/auto-libm-test-out-casinh: Likewise. * math/libm-test-casin.inc (casin_test_data): Use AUTO_TESTS_c_c. Move tests with finite inputs to auto-libm-test-in. * math/libm-test-casinh.inc (casinh_test_data): Likewise. * math/Makefile (libm-test-funcs-auto): Add casin and casinh. (libm-test-funcs-noauto): Remove casin and casinh. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Likewise. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
* Move tests of cacos, cacosh to auto-libm-test-*.Joseph Myers2017-02-172-38/+38
| | | | | | | | | | | | | | | | | | | | | | | This patch moves tests of cacos and cacosh with finite inputs to using the auto-libm-test machinery. Each of auto-libm-test-out-cacos and auto-libm-test-out-cacosh takes about 80 minutes to generate on my system because of MPC slowness on special cases that appear in the tests (with MPC 1.0.3; I don't know to what extent current MPC master might speed it up). Tested for x86_64 and x86 and ulps updated accordingly. * math/auto-libm-test-in: Add tests of cacos and cacosh. * math/auto-libm-test-out-cacos: New generated file. * math/auto-libm-test-out-cacosh: Likewise. * math/libm-test-cacos.inc (cacos_test_data): Use AUTO_TESTS_c_c. Move tests with finite inputs to auto-libm-test-in. * math/libm-test-cacosh.inc (cacosh_test_data): Likewise. * math/Makefile (libm-test-funcs-auto): Add cacos and cacosh. (libm-test-funcs-noauto): Remove cacos and cacosh. * sysdeps/i386/fpu/libm-test-ulps: Update. * sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Likewise. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
* nptl: Remove COLORING_INCREMENTAdhemerval Zanella2017-02-061-5/+0
| | | | | | | | | | | This patch removes the COLORING_INCREMENT define and usage on allocatestack.c. It has not been used since 564cd8b67ec487f (glibc-2.3.3) by any architecture. The idea is to simplify the code by removing obsolete code. * nptl/allocatestack.c [COLORING_INCREMENT] (nptl_ncreated): Remove. (allocate_stack): Remove COLORING_INCREMENT usage. * nptl/stack-aliasing.h (COLORING_INCREMENT). Likewise. * sysdeps/i386/i686/stack-aliasing.h (COLORING_INCREMENT): Likewise.
* Remove i686, x86_64, and powerpc strtok implementationsAdhemerval Zanella2017-02-064-612/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Based on comments on previous attempt to address BZ#16640 [1], the idea is not support invalid use of strtok (the original bug report proposal). This leader to a new strtok optimized strtok implementation [2]. The idea of this patch is to fix BZ#16640 to align all the implementations to a same contract. However, with newer strtok code it is better to get remove the old assembly ones instead of fix them. For x86 is a gain in all cases since the new implementation can potentially use sse2/sse42 implementation for strspn and strcspn. This shows a better performance on both i686 and x86_64 using the string benchtests. On powerpc64 the gains are mixed, where only for larger inputs or keys some gains are showns (based on benchtest it seems that it shows some gains for keys larger than 10 and inputs larger than 32). I would prefer to remove the optimized implementation based on first code simplicity and second because some more gain could be optimized using a better optimized strcspn/strspn code (as for x86). However if powerpc arch maintainers prefer I can send a v2 with the assembly code adjusted instead. Checked on x86_64-linux-gnu, i686-linux-gnu, and powerpc64le-linux-gnu. [BZ #16640] * sysdeps/i386/i686/strtok.S: Remove file. * sysdeps/i386/i686/strtok_r.S: Likewise. * sysdeps/i386/strtok.S: Likewise. * sysdeps/i386/strtok_r.S: Likewise. * sysdeps/powerpc/powerpc64/strtok.S: Likewise. * sysdeps/powerpc/powerpc64/strtok_r.S: Likewise. * sysdeps/x86_64/strtok.S: Likewise. * sysdeps/x86_64/strtok_r.S: Likewise. [1] https://sourceware.org/ml/libc-alpha/2016-10/msg00411.html [2] https://sourceware.org/ml/libc-alpha/2016-12/msg00461.html
* Allow IFUNC relocation against unrelocated shared libraryH.J. Lu2017-02-021-1/+1
| | | | | | | | | | | | | | | | | IFUNC relocation against definition in unrelocated shared library will lead to segfault when the IFUNC function is called. This patch allows such IFUNC relocations with a warning. This isn't a real fix for https://sourceware.org/bugzilla/show_bug.cgi?id=21041 It simply allows the program to load. The program will segfault when longjmp is called. * sysdeps/i386/dl-machine.h (elf_machine_rel): Replace _dl_fatal_printf with _dl_error_printf for IFUNC relocation against unrelocated shared library. * sysdeps/x86_64/dl-machine.h (elf_machine_rela): Likewise.
* Move wrappers to libm-compat-calls-autoGabriel F. T. Gomes2017-01-041-1/+1
| | | | | | | | | | This commit moves one step towards the deprecation of wrappers that use _LIB_VERSION / matherr / __kernel_standard functionality, by adding the suffix '_compat' to their filenames and adjusting Makefiles and #includes accordingly. New template wrappers that do not use such functionality will be added by future patches and will be first used by the float128 wrappers.
* Update i386 libm-test-ulps.Joseph Myers2017-01-031-4/+4
| | | | | | | | | | When testing changes to i386 libm functions (that are shadowed for i686 builds by i686 versions) recently, I saw that the plain i386 libm-test-ulps (as opposed to the i686 multiarch version) needed updating for tests that had been added since it was last updated. This patch updates it accordingly. * sysdeps/i386/fpu/libm-test-ulps: Update.
* Fix x86 strncat optimized implementation for large sizesAdhemerval Zanella2017-01-031-0/+2
| | | | | | | | | | | | | | | | | | | | | | Similar to BZ#19387, BZ#21014, and BZ#20971, both x86 sse2 strncat optimized assembly implementations do not handle the size overflow correctly. The x86_64 one is in fact an issue with strcpy-sse2-unaligned, but that is triggered also with strncat optimized implementation. This patch uses a similar strategy used on 3daef2c8ee4df2, where saturared math is used for overflow case. Checked on x86_64-linux-gnu and i686-linux-gnu. It fixes BZ #19390. [BZ #19390] * string/test-strncat.c (test_main): Add tests with SIZE_MAX as maximum string size. * sysdeps/i386/i686/multiarch/strcat-sse2.S (STRCAT): Avoid overflow in pointer addition. * sysdeps/x86_64/multiarch/strcpy-sse2-unaligned.S (STRCPY): Likewise.
* Fix i686 memchr for large input sizesAdhemerval Zanella2017-01-022-3/+15
| | | | | | | | | | | | | | | | | | | Similar to BZ#19387 and BZ#20971, both i686 memchr optimized assembly implementations (memchr-sse2-bsf and memchr-sse2) do not handle the size overflow correctly. It is shown by the new tests added by commit 3daef2c8ee4df29, where both implementation fails with size as SIZE_MAX. This patch uses a similar strategy used on 3daef2c8ee4df2, where saturared math is used for overflow case. Checked on i686-linux-gnu. [BZ #21014] * sysdeps/i386/i686/multiarch/memchr-sse2-bsf.S (MEMCHR): Avoid overflow in pointer addition. * sysdeps/i386/i686/multiarch/memchr-sse2.S (MEMCHR): Likewise.
* Update copyright dates with scripts/update-copyrights.Joseph Myers2017-01-01249-249/+249
|
* Fix typos in the spelling of "implementation"Dmitry V. Levin2016-12-271-1/+1
| | | | | | | | | | | | | Apply the following spelling fix: $ git grep -El 'implemetn?ation' | xargs sed -ri 's/implemetn?ation/implementation/g' [BZ #19514] * resolv/res_send.c: Fix typo in comment. * sysdeps/i386/i386-mcount.S: Likewise. * sysdeps/s390/s390-32/s390-mcount.S: Likewise. * sysdeps/s390/s390-64/s390x-mcount.S: Likewise. * sysdeps/sparc/sparc-mcount.S: Likewise.
* Compile the dynamic linker without stack protection [BZ #7065]Nick Alcock2016-12-261-1/+1
| | | | | Also compile corresponding routines in the static libc.a with the same flag.
* Fix x86, x86_64 fmax, fmin sNaN handling, add tests (bug 20947).Joseph Myers2016-12-154-15/+111
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Various fmax and fmin function implementations mishandle sNaN arguments: (a) When both arguments are NaNs, the return value should be a qNaN, but sometimes it is an sNaN if at least one argument is an sNaN. (b) Under TS 18661-1 semantics, if either argument is an sNaN then the result should be a qNaN (whereas if one argument is a qNaN and the other is not a NaN, the result should be the non-NaN argument). Various implementations treat sNaNs like qNaNs here. This patch fixes the x86 and x86_64 versions (ignoring float and double for 32-bit x86 given the inability to reliably avoid the sNaN turning into a qNaN before it gets to the called function). Tests of sNaN inputs to these functions are added. Note on architecture versions I haven't changed for this issue: AArch64 already gets this right (it uses a hardware instruction with the correct semantics for both quiet and signaling NaNs) and does not need changes. It's possible Alpha, IA64, SPARC might need changes (this would be shown by the testsuite if so). Tested for x86_64 and x86 (both i686 and i586 builds, to cover the different x86 implementations). [BZ #20947] * sysdeps/i386/fpu/s_fmaxl.S (__fmaxl): Add the arguments when either is a signaling NaN. * sysdeps/i386/fpu/s_fminl.S (__fminl): Likewise. Make code follow fmaxl more closely. * sysdeps/i386/i686/fpu/s_fmaxl.S (__fmaxl): Add the arguments when either is a signaling NaN. * sysdeps/i386/i686/fpu/s_fminl.S (__fminl): Likewise. * sysdeps/x86_64/fpu/s_fmax.S (__fmax): Likewise. * sysdeps/x86_64/fpu/s_fmaxf.S (__fmaxf): Likewise. * sysdeps/x86_64/fpu/s_fmaxl.S (__fmaxl): Likewise. * sysdeps/x86_64/fpu/s_fmin.S (__fmin): Likewise. * sysdeps/x86_64/fpu/s_fminf.S (__fminf): Likewise. * sysdeps/x86_64/fpu/s_fminl.S (__fminl): Likewise. * math/libm-test.inc (fmax_test_data): Add tests of sNaN inputs. (fmin_test_data): Likewise.
* Fix x86_64/x86 powl handling of sNaN arguments (bug 20916).Joseph Myers2016-12-061-5/+24
| | | | | | | | | | | | | | | | | | | | | | | | | The x86_64/x86 powl implementations mishandle sNaN arguments, both by returning sNaN in some cases (instead of doing arithmetic on the arguments to produce the result when NaN arguments result in NaN results) and by treating sNaN the same as qNaN for arguments (1, sNaN) and (sNaN, 0), contrary to TS 18661-1 which requires those cases to return qNaN instead of 1. This patch makes the x86_64/x86 powl implementations follow TS 18661-1 semantics for sNaN arguments; sNaN tests are also added for pow. Given the problems with testing float and double sNaN arguments on 32-bit x86 (sNaN tests disabled because the compiler may convert unnecessarily to a qNaN when passing arguments), no changes are made to the powf and pow implementations there. Tested for x86_64 and x86. [BZ #20916] * sysdeps/i386/fpu/e_powl.S (__ieee754_powl): Do not return 1 for arguments (sNaN, 0) or (1, sNaN). Do arithmetic on NaN arguments to compute result. * sysdeps/x86_64/fpu/e_powl.S (__ieee754_powl): Likewise. * math/libm-test.inc (pow_test_data): Add tests of sNaN arguments.
* Remove cached PID/TID in cloneAdhemerval Zanella2016-11-241-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch remove the PID cache and usage in current GLIBC code. Current usage is mainly used a performance optimization to avoid the syscall, however it adds some issues: - The exposed clone syscall will try to set pid/tid to make the new thread somewhat compatible with current GLIBC assumptions. This cause a set of issue with new workloads and usecases (such as BZ#17214 and [1]) as well for new internal usage of clone to optimize other algorithms (such as clone plus CLONE_VM for posix_spawn, BZ#19957). - The caching complexity also added some bugs in the past [2] [3] and requires more effort of each port to handle such requirements (for both clone and vfork implementation). - Caching performance gain in mainly on getpid and some specific code paths. The getpid performance leverage is questionable [4], either by the idea of getpid being a hotspot as for the getpid implementation itself (if it is indeed a justifiable hotspot a vDSO symbol could let to a much more simpler solution). Other usage is mainly for non usual code paths, such as pthread cancellation signal and handling. For thread creation (on stack allocation) the code simplification in fact adds some performance gain due the no need of transverse the stack cache and invalidate each element pid. Other thread usages will require a direct getpid syscall, such as cancellation/setxid signal, thread cancellation, thread fail path (at create_thread), and thread signal (pthread_kill and pthread_sigqueue). However these are hardly usual hotspots and I think adding a syscall is justifiable. It also simplifies both the clone and vfork arch-specific implementation. And by review each fork implementation there are some discrepancies that this patch also solves: - microblaze clone/vfork does not set/reset the pid/tid field - hppa uses the default vfork implementation that fallback to fork. Since vfork is deprecated I do not think we should bother with it. The patch also removes the TID caching in clone. My understanding for such semantic is try provide some pthread usage after a user program issue clone directly (as done by thread creation with CLONE_PARENT_SETTID and pthread tid member). However, as stated before in multiple discussions threads, GLIBC provides clone syscalls without further supporting all this semantics. I ran a full make check on x86_64, x32, i686, armhf, aarch64, and powerpc64le. For sparc32, sparc64, and mips I ran the basic fork and vfork tests from posix/ folder (on a qemu system). So it would require further testing on alpha, hppa, ia64, m68k, nios2, s390, sh, and tile (I excluded microblaze because it is already implementing the patch semantic regarding clone/vfork). [1] https://codereview.chromium.org/800183004/ [2] https://sourceware.org/ml/libc-alpha/2006-07/msg00123.html [3] https://sourceware.org/bugzilla/show_bug.cgi?id=15368 [4] http://yarchive.net/comp/linux/getpid_caching.html * sysdeps/nptl/fork.c (__libc_fork): Remove pid cache setting. * nptl/allocatestack.c (allocate_stack): Likewise. (__reclaim_stacks): Likewise. (setxid_signal_thread): Obtain pid through syscall. * nptl/nptl-init.c (sigcancel_handler): Likewise. (sighandle_setxid): Likewise. * nptl/pthread_cancel.c (pthread_cancel): Likewise. * sysdeps/unix/sysv/linux/pthread_kill.c (__pthread_kill): Likewise. * sysdeps/unix/sysv/linux/pthread_sigqueue.c (pthread_sigqueue): Likewise. * sysdeps/unix/sysv/linux/createthread.c (create_thread): Likewise. * sysdeps/unix/sysv/linux/getpid.c: Remove file. * nptl/descr.h (struct pthread): Change comment about pid value. * nptl/pthread_getattr_np.c (pthread_getattr_np): Remove thread pid assert. * sysdeps/unix/sysv/linux/pthread-pids.h (__pthread_initialize_pids): Do not set pid value. * nptl_db/td_ta_thr_iter.c (iterate_thread_list): Remove thread pid cache check. * nptl_db/td_thr_validate.c (td_thr_validate): Likewise. * sysdeps/aarch64/nptl/tcb-offsets.sym: Remove pid offset. * sysdeps/alpha/nptl/tcb-offsets.sym: Likewise. * sysdeps/arm/nptl/tcb-offsets.sym: Likewise. * sysdeps/hppa/nptl/tcb-offsets.sym: Likewise. * sysdeps/i386/nptl/tcb-offsets.sym: Likewise. * sysdeps/ia64/nptl/tcb-offsets.sym: Likewise. * sysdeps/m68k/nptl/tcb-offsets.sym: Likewise. * sysdeps/microblaze/nptl/tcb-offsets.sym: Likewise. * sysdeps/mips/nptl/tcb-offsets.sym: Likewise. * sysdeps/nios2/nptl/tcb-offsets.sym: Likewise. * sysdeps/powerpc/nptl/tcb-offsets.sym: Likewise. * sysdeps/s390/nptl/tcb-offsets.sym: Likewise. * sysdeps/sh/nptl/tcb-offsets.sym: Likewise. * sysdeps/sparc/nptl/tcb-offsets.sym: Likewise. * sysdeps/tile/nptl/tcb-offsets.sym: Likewise. * sysdeps/x86_64/nptl/tcb-offsets.sym: Likewise. * sysdeps/unix/sysv/linux/aarch64/clone.S: Remove pid and tid caching. * sysdeps/unix/sysv/linux/alpha/clone.S: Likewise. * sysdeps/unix/sysv/linux/arm/clone.S: Likewise. * sysdeps/unix/sysv/linux/hppa/clone.S: Likewise. * sysdeps/unix/sysv/linux/i386/clone.S: Likewise. * sysdeps/unix/sysv/linux/ia64/clone2.S: Likewise. * sysdeps/unix/sysv/linux/mips/clone.S: Likewise. * sysdeps/unix/sysv/linux/nios2/clone.S: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/clone.S: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/clone.S: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/clone.S: Likewise. * sysdeps/unix/sysv/linux/sh/clone.S: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/clone.S: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/clone.S: Likewise. * sysdeps/unix/sysv/linux/tile/clone.S: Likewise. * sysdeps/unix/sysv/linux/x86_64/clone.S: Likewise. * sysdeps/unix/sysv/linux/aarch64/vfork.S: Remove pid set and reset. * sysdeps/unix/sysv/linux/alpha/vfork.S: Likewise. * sysdeps/unix/sysv/linux/arm/vfork.S: Likewise. * sysdeps/unix/sysv/linux/i386/vfork.S: Likewise. * sysdeps/unix/sysv/linux/ia64/vfork.S: Likewise. * sysdeps/unix/sysv/linux/m68k/clone.S: Likewise. * sysdeps/unix/sysv/linux/m68k/vfork.S: Likewise. * sysdeps/unix/sysv/linux/mips/vfork.S: Likewise. * sysdeps/unix/sysv/linux/nios2/vfork.S: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/vfork.S: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/vfork.S: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/vfork.S: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/vfork.S: Likewise. * sysdeps/unix/sysv/linux/sh/vfork.S: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/vfork.S: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/vfork.S: Likewise. * sysdeps/unix/sysv/linux/tile/vfork.S: Likewise. * sysdeps/unix/sysv/linux/x86_64/vfork.S: Likewise. * sysdeps/unix/sysv/linux/tst-clone2.c (f): Remove direct pthread struct access. (clone_test): Remove function. (do_test): Rewrite to take in consideration pid is not cached anymore.
* Do not hardcode platform names in manual/libm-err-tab.pl (bug 14139).Joseph Myers2016-11-042-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | manual/libm-err-tab.pl hardcodes a list of names for particular platforms (mapping from sysdeps directory name to friendly name for the manual). This goes against the principle of keeping information about individual platforms in their corresponding sysdeps directory, and the list is also very out-of-date regarding supported platforms and their corresponding sysdeps directories. This patch fixes this by adding a libm-test-ulps-name file alongside each libm-test-ulps file. The script then gets the friendly name from that file, which is required to exist, so it no longer needs to allow for the mapping being missing. Tested for x86_64. [BZ #14139] * manual/libm-err-tab.pl (%pplatforms): Initialize to empty. (find_files): Obtain platform name from libm-test-ulps-name and store in %pplatforms. (canonicalize_platform): Remove. (print_platforms): Use $pplatforms directly. (by_platforms): Do not allow for platforms missing from %pplatforms. * sysdeps/aarch64/libm-test-ulps-name: New file. * sysdeps/alpha/fpu/libm-test-ulps-name: Likewise. * sysdeps/arm/libm-test-ulps-name: Likewise. * sysdeps/generic/libm-test-ulps-name: Likewise. * sysdeps/hppa/fpu/libm-test-ulps-name: Likewise. * sysdeps/i386/fpu/libm-test-ulps-name: Likewise. * sysdeps/i386/i686/fpu/multiarch/libm-test-ulps-name: Likewise. * sysdeps/ia64/fpu/libm-test-ulps-name: Likewise. * sysdeps/m68k/coldfire/fpu/libm-test-ulps-name: Likewise. * sysdeps/m68k/m680x0/fpu/libm-test-ulps-name: Likewise. * sysdeps/microblaze/libm-test-ulps-name: Likewise. * sysdeps/mips/mips32/libm-test-ulps-name: Likewise. * sysdeps/mips/mips64/libm-test-ulps-name: Likewise. * sysdeps/nios2/libm-test-ulps-name: Likewise. * sysdeps/powerpc/fpu/libm-test-ulps-name: Likewise. * sysdeps/powerpc/nofpu/libm-test-ulps-name: Likewise. * sysdeps/s390/fpu/libm-test-ulps-name: Likewise. * sysdeps/sh/libm-test-ulps-name: Likewise. * sysdeps/sparc/fpu/libm-test-ulps-name: Likewise. * sysdeps/tile/libm-test-ulps-name: Likewise. * sysdeps/x86_64/fpu/libm-test-ulps-name: Likewise.
* Check IFUNC definition in unrelocated shared library [BZ #20019]H.J. Lu2016-10-281-1/+17
| | | | | | | | | | | | Calling an IFUNC function defined in unrelocated shared library may lead to segfault. This patch issues an error message to request relinking the shared library if it references IFUNC function defined in the unrelocated shared library. [BZ #20019] * sysdeps/i386/dl-machine.h (elf_machine_rel): Check IFUNC definition in unrelocated shared library. * sysdeps/x86_64/dl-machine.h (elf_machine_rela): Likewise.
* Installed-header hygiene (BZ#20366): stack_t.Zack Weinberg2016-09-231-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sys/ucontext.h unconditionally uses stack_t, and it does not make sense to change that. But signal.h only declares stack_t under __USE_XOPEN_EXTENDED || __USE_XOPEN2K8. The actual definition is already in a bits header, bits/sigstack.h, but that header insists on only being included by signal.h, so we have to change that as well as all of the sys/ucontext.h variants. (Some but not all variants of bits/sigcontext.h, which sys/ucontext.h may also need, had already received this adjustment; for consistency, I made them all the same, even if that's not strictly necessary in some configurations.) bits/sigcontext.h and bits/sigstack.h also all need to receive multiple inclusion guards. * sysdeps/generic/sys/ucontext.h * sysdeps/arm/sys/ucontext.h * sysdeps/i386/sys/ucontext.h * sysdeps/m68k/sys/ucontext.h * sysdeps/mips/sys/ucontext.h * sysdeps/unix/sysv/linux/aarch64/sys/ucontext.h * sysdeps/unix/sysv/linux/alpha/sys/ucontext.h * sysdeps/unix/sysv/linux/arm/sys/ucontext.h * sysdeps/unix/sysv/linux/hppa/sys/ucontext.h * sysdeps/unix/sysv/linux/ia64/sys/ucontext.h * sysdeps/unix/sysv/linux/m68k/sys/ucontext.h * sysdeps/unix/sysv/linux/mips/sys/ucontext.h * sysdeps/unix/sysv/linux/nios2/sys/ucontext.h * sysdeps/unix/sysv/linux/powerpc/sys/ucontext.h * sysdeps/unix/sysv/linux/s390/sys/ucontext.h * sysdeps/unix/sysv/linux/sh/sys/ucontext.h * sysdeps/unix/sysv/linux/sparc/sys/ucontext.h * sysdeps/unix/sysv/linux/tile/sys/ucontext.h * sysdeps/unix/sysv/linux/x86/sys/ucontext.h: Include both bits/sigcontext.h and bits/sigstack.h. Fix grammar error in comment, if present. * bits/sigstack.h * sysdeps/unix/sysv/linux/aarch64/bits/sigstack.h * sysdeps/unix/sysv/linux/alpha/bits/sigstack.h * sysdeps/unix/sysv/linux/bits/sigstack.h * sysdeps/unix/sysv/linux/ia64/bits/sigstack.h * sysdeps/unix/sysv/linux/mips/bits/sigstack.h * sysdeps/unix/sysv/linux/powerpc/bits/sigstack.h * sysdeps/unix/sysv/linux/sparc/bits/sigstack.h * bits/sigcontext.h * sysdeps/mach/hurd/i386/bits/sigcontext.h * sysdeps/unix/sysv/linux/bits/sigcontext.h * sysdeps/unix/sysv/linux/ia64/bits/sigcontext.h * sysdeps/unix/sysv/linux/sparc/bits/sigcontext.h: Add multiple inclusion guard. Permit inclusion by sys/ucontext.h as well as signal.h, if this was not already allowed. Request definition of size_t if necessary. Minimize semantically-null differences across files.
* Remove remnants of .og patternsFlorian Weimer2016-09-201-2/+0
| | | | | | | | | | | This was used by --enable-omitfp, and the bulk of it was removed in this commit: commit bdeba1354b7364d9b7857a048286a71ddbcdff86 Author: Ulrich Drepper <drepper@gmail.com> Date: Sat Jan 7 11:29:31 2012 -0500 Remove --enable-omitfp support
* Add femode_t functions.Joseph Myers2016-09-072-0/+86
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | TS 18661-1 defines a type femode_t to represent the set of dynamic floating-point control modes (such as the rounding mode and trap enablement modes), and functions fegetmode and fesetmode to manipulate those modes (without affecting other state such as the raised exception flags) and a corresponding macro FE_DFL_MODE. This patch series implements those interfaces for glibc. This first patch adds the architecture-independent pieces, the x86 and x86_64 implementations, and the <bits/fenv.h> and ABI baseline updates for all architectures so glibc keeps building and passing the ABI tests on all architectures. Subsequent patches add the fegetmode and fesetmode implementations for other architectures. femode_t is generally an integer type - the same type as fenv_t, or as the single element of fenv_t where fenv_t is a structure containing a single integer (or the single relevant element, where it has elements for both status and control registers) - except where architecture properties or consistency with the fenv_t implementation indicate otherwise. FE_DFL_MODE follows FE_DFL_ENV in whether it's a magic pointer value (-1 cast to const femode_t *), a value that can be distinguished from valid pointers by its high bits but otherwise contains a representation of the desired register contents, or a pointer to a constant variable (the powerpc case; __fe_dfl_mode is added as an exported constant object, an alias to __fe_dfl_env). Note that where architectures (that share a register between control and status bits) gain definitions of new floating-point control or status bits in future, the implementations of fesetmode for those architectures may need updating (depending on whether the new bits are control or status bits and what the implementation does with previously unknown bits), just like existing implementations of <fenv.h> functions that take care not to touch reserved bits may need updating when the set of reserved bits changes. (As any new bits are outside the scope of ISO C, that's just a quality-of-implementation issue for supporting them, not a conformance issue.) As with fenv_t, femode_t should properly include any software DFP rounding mode (and for both fenv_t and femode_t I'd consider that fragment of DFP support appropriate for inclusion in glibc even in the absence of the rest of libdfp; hardware DFP rounding modes should already be included if the definitions of which bits are status / control bits are correct). Tested for x86_64, x86, mips64 (hard float, and soft float to test the fallback version), arm (hard float) and powerpc (hard float, soft float and e500). Other architecture versions are untested. * math/fegetmode.c: New file. * math/fesetmode.c: Likewise. * sysdeps/i386/fpu/fegetmode.c: Likewise. * sysdeps/i386/fpu/fesetmode.c: Likewise. * sysdeps/x86_64/fpu/fegetmode.c: Likewise. * sysdeps/x86_64/fpu/fesetmode.c: Likewise. * math/fenv.h: Update comment on inclusion of <bits/fenv.h>. [__GLIBC_USE (IEC_60559_BFP_EXT)] (fegetmode): New function declaration. [__GLIBC_USE (IEC_60559_BFP_EXT)] (fesetmode): Likewise. * bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/aarch64/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/alpha/fpu/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/arm/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/hppa/fpu/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/ia64/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/m68k/fpu/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/microblaze/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/mips/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/nios2/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/powerpc/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (__fe_dfl_mode): New variable declaration. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/s390/fpu/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/sh/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/sparc/fpu/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/tile/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * sysdeps/x86/fpu/bits/fenv.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (femode_t): New typedef. [__GLIBC_USE (IEC_60559_BFP_EXT)] (FE_DFL_MODE): New macro. * manual/arith.texi (FE_DFL_MODE): Document macro. (fegetmode): Document function. (fesetmode): Likewise. * math/Versions (fegetmode): New libm symbol at version GLIBC_2.25. (fesetmode): Likewise. * math/Makefile (libm-support): Add fegetmode and fesetmode. (tests): Add test-femode and test-femode-traps. * math/test-femode-traps.c: New file. * math/test-femode.c: Likewise. * sysdeps/powerpc/fpu/fenv_const.c (__fe_dfl_mode): Declare as alias for __fe_dfl_env. * sysdeps/powerpc/nofpu/fenv_const.c (__fe_dfl_mode): Likewise. * sysdeps/powerpc/powerpc32/e500/nofpu/fenv_const.c (__fe_dfl_mode): Likewise. * sysdeps/powerpc/Versions (__fe_dfl_mode): New libm symbol at version GLIBC_2.25. * sysdeps/nacl/libm.abilist: Update. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilepro/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.
* Remove unneeded stubs for k_rem_pio2l.Paul E. Murphy2016-09-011-3/+0
| | | | | | | | | | This is only used for the float and double variants. Instead, just add it to the type specific list of files, and remove all stubs, and remove the declaration from math_private.h. I verified x86_64, i486, ia64, m68k, and ppc64 build.
* Add fesetexcept.Joseph Myers2016-08-161-0/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | TS 18661-1 defines an fesetexcept function for setting floating-point exception flags without the side-effect of causing enabled traps to be taken. This patch series implements this function for glibc. The present patch adds the fallback stub implementation, x86 and x86_64 implementations, documentation, tests and ABI baseline updates. The remaining patches, some of them untested, add implementations for other architectures. The implementations generally follow those of the fesetexceptflag function. As for fesetexceptflag, the approach taken for architectures where setting flags causes enabled traps to be taken is to set the flags (and potentially cause traps) rather than refusing to set the flags and returning an error. Since ISO C and TS 18661 provide no way to enable traps, this is formally in accordance with the standards. The NEWS entry should be considered a placeholder, since this patch series is intended to be followed by further such series adding other TS 18661-1 features, so that the NEWS entry would end up looking more like * New <fenv.h> features from TS 18661-1:2014 are added to libm: the fesetexcept, fetestexceptflag, fegetmode and fesetmode functions, the femode_t type and the FE_DFL_MODE macro. with hopefully more such entries for other features, rather than having an entry for a single function in the end. I believe we have consensus for adding TS 18661-1 interfaces as per <https://sourceware.org/ml/libc-alpha/2016-06/msg00421.html>. Tested for x86_64, x86, mips64 (hard float, and soft float to test the fallback version), arm (hard float) and powerpc (hard float, soft float and e500). * math/fesetexcept.c: New file. * sysdeps/i386/fpu/fesetexcept.c: Likewise. * sysdeps/x86_64/fpu/fesetexcept.c: Likewise. * math/fenv.h: Define __GLIBC_INTERNAL_STARTING_HEADER_IMPLEMENTATION and include <bits/libc-header-start.h> instead of including <features.h>. [__GLIBC_USE (IEC_60559_BFP_EXT)] (fesetexcept): New function declaration. * manual/arith.texi (fesetexcept): Document function. * math/Versions (fesetexcept): New libm symbol at version GLIBC_2.25. * math/Makefile (libm-support): Add fesetexcept. (tests): Add test-fesetexcept and test-fesetexcept-traps. * math/test-fesetexcept.c: New file. * math/test-fesetexcept-traps.c: Likewise. * sysdeps/nacl/libm.abilist: Update. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilepro/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.
* i386: Compile rtld-*.os with -mno-sse -mno-mmx -mfpmath=387H.J. Lu2016-07-181-2/+5
| | | | | | | | | | | | | Compile i386 rtld-*.os with -mno-sse -mno-mmx -mfpmath=387 so that no code in ld.so uses mm/xmm/ymm/zmm registers on i386 since the first 3 mm/xmm/ymm/zmm registers are used to pass vector parameters which must be preserved. * sysdeps/i386/Makefile (rtld-CFLAGS): New. [subdir == elf] (CFLAGS-.os): Replace -mno-sse -mno-mmx -mfpmath=387 with $(rtld-CFLAGS). [subdir != elf] (CFLAGS-.os): Compile rtld-*.os with $(rtld-CFLAGS).
* Regenerate i686 libm-test-ulps with GCC 6.1 at -O3 [BZ #20347]H.J. Lu2016-07-131-2/+2
| | | | | | | | | | | | | | | | | | This fixes with GCC 6.1 and -O3 on i686: Failure: Test: j0_downward (0xap+0) Result: is: -2.45935813e-01 -0x1.f7ad32p-3 should be: -2.45935768e-01 -0x1.f7ad2cp-3 difference: 4.47034835e-08 0x1.800000p-25 ulp : 3.0000 max.ulp : 2.0000 Maximal error of `j0_downward' is : 3 ulp accepted: 2 ulp [BZ #20347] * sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Regenerated.
* i686/multiarch: Regenerate ulpsAurelien Jarno2016-06-301-8/+8
| | | | | | | This comes from running “make regen-ulps” on AMD Opteron 6272 CPUs. Changelog: * sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Regenerated.
* Avoid "inexact" exceptions in i386/x86_64 trunc functions (bug 15479).Joseph Myers2016-06-273-18/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As discussed in <https://sourceware.org/ml/libc-alpha/2016-05/msg00577.html>, TS 18661-1 disallows ceil, floor, round and trunc functions from raising the "inexact" exception, in accordance with general IEEE 754 semantics for when that exception is raised. Fixing this for x87 floating point is more complicated than for the other versions of these functions, because they use the frndint instruction that raises "inexact" and this can only be avoided by saving and restoring the whole floating-point environment. As I noted in <https://sourceware.org/ml/libc-alpha/2016-06/msg00128.html>, I have now implemented a GCC option -fno-fp-int-builtin-inexact for GCC 7, such that GCC will inline these functions on x86, without caring about "inexact", when the default -ffp-int-builtin-inexact is in effect. This allows users to get optimized code depending on the options they pass to the compiler, while making the out-of-line functions follow TS 18661-1 semantics and avoid "inexact". This patch duly fixes the out-of-line trunc function implementations to avoid "inexact", in the same way as the nearbyint implementations. I do not know how the performance of implementations such as these based on saving the environment and changing the rounding mode temporarily compares to that of the C versions or SSE 4.1 versions (of course, for 32-bit x86 SSE implementations still need to get the return value in an x87 register); it's entirely possible other implementations could be faster in some cases. Tested for x86_64 and x86. [BZ #15479] * sysdeps/i386/fpu/s_trunc.S (__trunc): Save and restore floating-point environment rather than just control word. * sysdeps/i386/fpu/s_truncf.S (__truncf): Likewise. * sysdeps/i386/fpu/s_truncl.S (__truncl): Save and restore floating-point environment, with "invalid" exceptions merged in, rather than just control word. * sysdeps/x86_64/fpu/s_truncl.S (__truncl): Likewise. * math/libm-test.inc (trunc_test_data): Do not allow spurious "inexact" exceptions.
* Avoid "inexact" exceptions in i386/x86_64 floor functions (bug 15479).Joseph Myers2016-06-273-18/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As discussed in <https://sourceware.org/ml/libc-alpha/2016-05/msg00577.html>, TS 18661-1 disallows ceil, floor, round and trunc functions from raising the "inexact" exception, in accordance with general IEEE 754 semantics for when that exception is raised. Fixing this for x87 floating point is more complicated than for the other versions of these functions, because they use the frndint instruction that raises "inexact" and this can only be avoided by saving and restoring the whole floating-point environment. As I noted in <https://sourceware.org/ml/libc-alpha/2016-06/msg00128.html>, I have now implemented a GCC option -fno-fp-int-builtin-inexact for GCC 7, such that GCC will inline these functions on x86, without caring about "inexact", when the default -ffp-int-builtin-inexact is in effect. This allows users to get optimized code depending on the options they pass to the compiler, while making the out-of-line functions follow TS 18661-1 semantics and avoid "inexact". This patch duly fixes the out-of-line floor function implementations to avoid "inexact", in the same way as the nearbyint implementations. I do not know how the performance of implementations such as these based on saving the environment and changing the rounding mode temporarily compares to that of the C versions or SSE 4.1 versions (of course, for 32-bit x86 SSE implementations still need to get the return value in an x87 register); it's entirely possible other implementations could be faster in some cases. Tested for x86_64 and x86. [BZ #15479] * sysdeps/i386/fpu/s_floor.S (__floor): Save and restore floating-point environment rather than just control word. * sysdeps/i386/fpu/s_floorf.S (__floorf): Likewise. * sysdeps/i386/fpu/s_floorl.S (__floorl): Save and restore floating-point environment, with "invalid" exceptions merged in, rather than just control word. * sysdeps/x86_64/fpu/s_floorl.S (__floorl): Likewise. * math/libm-test.inc (floor_test_data): Do not allow spurious "inexact" exceptions.
* Avoid "inexact" exceptions in i386/x86_64 ceil functions (bug 15479).Joseph Myers2016-06-273-18/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As discussed in <https://sourceware.org/ml/libc-alpha/2016-05/msg00577.html>, TS 18661-1 disallows ceil, floor, round and trunc functions from raising the "inexact" exception, in accordance with general IEEE 754 semantics for when that exception is raised. Fixing this for x87 floating point is more complicated than for the other versions of these functions, because they use the frndint instruction that raises "inexact" and this can only be avoided by saving and restoring the whole floating-point environment. As I noted in <https://sourceware.org/ml/libc-alpha/2016-06/msg00128.html>, I have now implemented a GCC option -fno-fp-int-builtin-inexact for GCC 7, such that GCC will inline these functions on x86, without caring about "inexact", when the default -ffp-int-builtin-inexact is in effect. This allows users to get optimized code depending on the options they pass to the compiler, while making the out-of-line functions follow TS 18661-1 semantics and avoid "inexact". This patch duly fixes the out-of-line ceil function implementations to avoid "inexact", in the same way as the nearbyint implementations. I do not know how the performance of implementations such as these based on saving the environment and changing the rounding mode temporarily compares to that of the C versions or SSE 4.1 versions (of course, for 32-bit x86 SSE implementations still need to get the return value in an x87 register); it's entirely possible other implementations could be faster in some cases. Tested for x86_64 and x86. [BZ #15479] * sysdeps/i386/fpu/s_ceil.S (__ceil): Save and restore floating-point environment rather than just control word. * sysdeps/i386/fpu/s_ceilf.S (__ceilf): Likewise. * sysdeps/i386/fpu/s_ceill.S (__ceill): Save and restore floating-point environment, with "invalid" exceptions merged in, rather than just control word. * sysdeps/x86_64/fpu/s_ceill.S (__ceill): Likewise. * math/libm-test.inc (ceil_test_data): Do not allow spurious "inexact" exceptions.
* Fix i386/x86_64 scalbl with sNaN input (bug 20296).Joseph Myers2016-06-231-13/+3
| | | | | | | | | | | | | | | | | The x86_64 and i386 versions of scalbl return sNaN for some cases of sNaN input and are missing "invalid" exceptions for other cases. This results from overly complicated code that either returns a NaN input, or discards both inputs when one is NaN and loads a NaN from memory. This patch fixes this by simplifying the code to add the arguments when either one is NaN. Tested for x86_64 and x86. [BZ #20296] * sysdeps/i386/fpu/e_scalbl.S (__ieee754_scalbl): Add arguments when either argument is a NaN. * sysdeps/x86_64/fpu/e_scalbl.S (__ieee754_scalbl): Likewise. * math/libm-test.inc (scalb_test_data): Add sNaN tests.
* Simplify x86 nearbyint functions.Joseph Myers2016-06-223-12/+0
| | | | | | | | | | | | | | | | | | | | The i386 implementations of nearbyint functions, and x86_64 nearbyintl, contain code to mask the "inexact" exception. However, the fnstenv instruction has the effect of masking all exceptions, so this masking code has been redundant since fnstenv was added to those implementations (by commit 846d9a4a3acdb4939ca7bf6aed48f9f6f26911be; commit 71d1b0166b4ace0d804af2993b3815758b852efc added the test math/test-nearbyint-except-2.c that verifies these functions do work when called with "inexact" traps enabled); this patch removes the redundant code. Tested for x86_64 and x86. * sysdeps/i386/fpu/s_nearbyint.S (__nearbyint): Do not mask "inexact" exceptions after fnstenv. * sysdeps/i386/fpu/s_nearbyintf.S (__nearbyintf): Likewise. * sysdeps/i386/fpu/s_nearbyintl.S (__nearbyintl): Likewise. * sysdeps/x86_64/fpu/s_nearbyintl.S (__nearbyintl): Likewise.
* elf: Consolidate machine-agnostic DTV definitions in <dl-dtv.h>Florian Weimer2016-06-202-16/+1
| | | | | Identical definitions of dtv_t and TLS_DTV_UNALLOCATED were repeated for all architectures using DTVs.
* Fix i386 fdim double rounding (bug 20255).Joseph Myers2016-06-141-0/+50
| | | | | | | | | | | | | | | | | | | fdim suffers from double rounding on i386 because subtracting two double values can produce an inexact long double value exactly half way between two double values. This patch fixes this by creating an i386-specific version of fdim - C, based on the generic version, unlike the previous .S version - which sets the x87 precision control to double precision for the subtraction and then restores it afterwards. As noted in the comment added, there are no issues of double rounding for subnormals (a case that setting precision control does not address) because subtraction cannot produce an inexact result in the subnormal range. Tested for x86_64 and x86. [BZ #20255] * sysdeps/i386/fpu/s_fdim.c: New file. Based on math/s_fdim.c. * math/libm-test.inc (fdim_test_data): Add another test.
* Use generic fdim on more architectures (bug 6796, bug 20255, bug 20256).Joseph Myers2016-06-146-282/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some architectures have their own versions of fdim functions, which are missing errno setting (bug 6796) and may also return sNaN instead of qNaN for sNaN input, in the case of the x86 / x86_64 long double versions (bug 20256). These versions are not actually doing anything that a compiler couldn't generate, just straightforward comparisons / arithmetic (and, in the x86 / x86_64 case, testing for NaNs with fxam, which isn't actually needed once you use an unordered comparison and let the NaNs pass through the same subtraction as non-NaN inputs). This patch removes the x86 / x86_64 / powerpc versions, so that those architectures use the generic C versions, which correctly handle setting errno and deal properly with sNaN inputs. This seems better than dealing with setting errno in lots of .S versions. The i386 versions also return results with excess range and precision, which is not appropriate for a function exactly defined by reference to IEEE operations. For errno setting to work correctly on overflow, it's necessary to remove excess range with math_narrow_eval, which this patch duly does in the float and double versions so that the tests can reliably pass on x86. For float, this avoids any double rounding issues as the long double precision is more than twice that of float. For double, double rounding issues will need to be addressed separately, so this patch does not fully fix bug 20255. Tested for x86_64, x86 and powerpc. [BZ #6796] [BZ #20255] [BZ #20256] * math/s_fdim.c: Include <math_private.h>. (__fdim): Use math_narrow_eval on result. * math/s_fdimf.c: Include <math_private.h>. (__fdimf): Use math_narrow_eval on result. * sysdeps/i386/fpu/s_fdim.S: Remove file. * sysdeps/i386/fpu/s_fdimf.S: Likewise. * sysdeps/i386/fpu/s_fdiml.S: Likewise. * sysdeps/i386/i686/fpu/s_fdim.S: Likewise. * sysdeps/i386/i686/fpu/s_fdimf.S: Likewise. * sysdeps/i386/i686/fpu/s_fdiml.S: Likewise. * sysdeps/powerpc/fpu/s_fdim.c: Likewise. * sysdeps/powerpc/fpu/s_fdimf.c: Likewise. * sysdeps/powerpc/powerpc32/fpu/s_fdim.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_fdim.c: Likewise. * sysdeps/x86_64/fpu/s_fdiml.S: Likewise. * math/libm-test.inc (fdim_test_data): Expect errno setting on overflow. Add sNaN tests.
* Fix frexp (NaN) (bug 20250).Joseph Myers2016-06-131-1/+8
| | | | | | | | | | | | | | | | | | | | | Various implementations of frexp functions return sNaN for sNaN input. This patch fixes them to add such arguments to themselves so that qNaN is returned. Tested for x86_64, x86, mips64 and powerpc. [BZ #20250] * sysdeps/i386/fpu/s_frexpl.S (__frexpl): Add non-finite input to itself. * sysdeps/ieee754/dbl-64/s_frexp.c (__frexp): Add non-finite or zero input to itself. * sysdeps/ieee754/dbl-64/wordsize-64/s_frexp.c (__frexp): Likewise. * sysdeps/ieee754/flt-32/s_frexpf.c (__frexpf): Likewise. * sysdeps/ieee754/ldbl-128/s_frexpl.c (__frexpl): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_frexpl.c (__frexpl): Likewise. * sysdeps/ieee754/ldbl-96/s_frexpl.c (__frexpl): Likewise. * math/libm-test.inc (frexp_test_data): Add sNaN tests.
* Fix i386/x86_64 log2l (sNaN) (bug 20235).Joseph Myers2016-06-091-0/+1
| | | | | | | | | | | | | | The i386/x86_64 versions of log2l return sNaN for sNaN input. This patch fixes them to add NaN inputs to themselves so that qNaN is returned in this case. Tested for x86_64 and x86. [BZ #20235] * sysdeps/i386/fpu/e_log2l.S (__ieee754_log2l): Add NaN input to itself. * sysdeps/x86_64/fpu/e_log2l.S (__ieee754_log2l): Likewise. * math/libm-test.inc (log2_test_data): Add sNaN tests.
* Fix i386/x86_64 log1pl (sNaN) (bug 20229).Joseph Myers2016-06-081-0/+1
| | | | | | | | | | | | | The i386/x86_64 versions of log1pl return sNaN for sNaN input. This patch fixes them to add a NaN input to itself so that qNaN is returned in this case. Tested for x86_64 and x86. [BZ #20229] * sysdeps/i386/fpu/s_log1pl.S (__log1pl): Add NaN input to itself. * sysdeps/x86_64/fpu/s_log1pl.S (__log1pl): Likewise. * math/libm-test.inc (log1p_test_data): Add sNaN tests.
* Fix i386/x86_64 log10l (sNaN) (bug 20228).Joseph Myers2016-06-081-0/+1
| | | | | | | | | | | | | | The i386/x86_64 versions of log10l return sNaN for sNaN input. This patch fixes them to add a NaN input to itself so that qNaN is returned in this case. Tested for x86_64 and x86. [BZ #20228] * sysdeps/i386/fpu/e_log10l.S (__ieee754_log10l): Add NaN input to itself. * sysdeps/x86_64/fpu/e_log10l.S (__ieee754_log10l): Likewise. * math/libm-test.inc (log10_test_data): Add sNaN tests.
* Fix i386/x86_64 logl (sNaN) (bug 20227).Joseph Myers2016-06-082-0/+2
| | | | | | | | | | | | | | | | The i386/x86_64 versions of logl return sNaN for sNaN input. This patch fixes them to add a NaN input to itself so that qNaN is returned in this case. Tested for x86_64 and x86 (including a build for i586 to cover the non-i686 logl version). [BZ #20227] * sysdeps/i386/fpu/e_logl.S (__ieee754_logl): Add NaN input to itself. * sysdeps/i386/i686/fpu/e_logl.S (__ieee754_logl): Likewise. * sysdeps/x86_64/fpu/e_logl.S (__ieee754_logl): Likewise. * math/libm-test.inc (log_test_data): Add sNaN tests.
* Fix i386/x86_64 expl, exp10l, expm1l for sNaN input (bug 20226).Joseph Myers2016-06-081-2/+5
| | | | | | | | | | | | | | | | | The i386 and x86_64 implementations of expl, exp10l and expm1l (code shared between the functions) return sNaN for sNaN input. This patch fixes them to add NaN inputs to themselves so that qNaN is returned in this case. Tested for x86_64 and x86. [BZ #20226] * sysdeps/i386/fpu/e_expl.S (IEEE754_EXPL): Add NaN argument to itself. * sysdeps/x86_64/fpu/e_expl.S (IEEE754_EXPL): Likewise. * math/libm-test.inc (exp_test_data): Add sNaN tests. (exp10_test_data): Likewise. (expm1_test_data): Likewise.
* Fix i386 cbrtl (sNaN) (bug 20224).Joseph Myers2016-06-081-0/+1
| | | | | | | | | | | | | | | The i386 version of cbrtl returns sNaN (without raising any exceptions) for sNaN input. This patch fixes it to add non-finite arguments to themselves (the code path in question is also reached for zero arguments, for which adding them to themselves is also harmless), so that "invalid" is raised and qNaN returned. Tested for x86_64 and x86. [BZ #20224] * sysdeps/i386/fpu/s_cbrtl.S (__cbrtl): Add non-finite or zero argument to itself. * math/libm-test.inc (cbrt_test_data): Add sNaN tests.
* Fix i386 atanhl (sNaN) (bug 20219).Joseph Myers2016-06-071-0/+1
| | | | | | | | | | | | | The i386 version of atanhl returns sNaN for sNaN input. This patch fixes it to add NaN arguments to themselves so it returns qNaN in this case. Tested for x86_64 and x86. [BZ #20219] * sysdeps/i386/fpu/e_atanhl.S (__ieee754_atanhl): Add NaN argument to itself. * math/libm-test.inc (atanh_test_data): Add sNaN tests.
* Fix i386 asinhl (sNaN) (bug 20218).Joseph Myers2016-06-071-0/+1
| | | | | | | | | | | | | | The i386 version of asinhl returns sNaN (without raising any exceptions) for sNaN input. This patch fixes it to add non-finite arguments to themselves, so that "invalid" is raised and qNaN returned. Tested for x86_64 and x86. [BZ #20218] * sysdeps/i386/fpu/s_asinhl.S (__asinhl): Add non-finite argument to itself. * math/libm-test.inc (asinh_test_data): Add sNaN tests.
* Fix x86/x86_64 nextafterl incrementing negative subnormals (bug 20205).Joseph Myers2016-06-031-1/+1
| | | | | | | | | | | | | | | | | The x86 / x86_64 implementation of nextafterl (also used for nexttowardl) produces incorrect results (NaNs) when negative subnormals, the low 32 bits of whose mantissa are zero, are incremented towards zero. This patch fixes this by disabling the logic to decrement the exponent in that case. Tested for x86_64 and x86. [BZ #20205] * sysdeps/i386/fpu/s_nextafterl.c (__nextafterl): Do not adjust exponent when incrementing negative subnormal with low mantissa word zero. * math/libm-test.inc (nextafter_test_data) [TEST_COND_intel96]: Add another test.
* Call init_cpu_features only if SHARED is definedH.J. Lu2016-05-131-0/+4
| | | | | | | | | | In static executable, since init_cpu_features is called early from __libc_start_main, there is no need to call it again in dl_platform_init. [BZ #20072] * sysdeps/i386/dl-machine.h (dl_platform_init): Call init_cpu_features only if SHARED is defined. * sysdeps/x86_64/dl-machine.h (dl_platform_init): Likewise.
* Remove x86 ifunc-defines.sym and rtld-global-offsets.symH.J. Lu2016-05-112-20/+0
| | | | | | | | | | | | | | | | | | | | Merge x86 ifunc-defines.sym with x86 cpu-features-offsets.sym. Remove x86 ifunc-defines.sym and rtld-global-offsets.sym. No code changes on i686 and x86-64. * sysdeps/i386/i686/multiarch/Makefile (gen-as-const-headers): Remove ifunc-defines.sym. * sysdeps/x86_64/multiarch/Makefile (gen-as-const-headers): Likewise. * sysdeps/i386/i686/multiarch/ifunc-defines.sym: Removed. * sysdeps/x86/rtld-global-offsets.sym: Likewise. * sysdeps/x86_64/multiarch/ifunc-defines.sym: Likewise. * sysdeps/x86/Makefile (gen-as-const-headers): Remove rtld-global-offsets.sym. * sysdeps/x86_64/multiarch/ifunc-defines.sym: Merged with ... * sysdeps/x86/cpu-features-offsets.sym: This. * sysdeps/x86/cpu-features.h: Include <cpu-features-offsets.h> instead of <ifunc-defines.h> and <rtld-global-offsets.h>.
* Move sysdeps/x86_64/cacheinfo.c to sysdeps/x86H.J. Lu2016-05-081-1/+1
| | | | | | | | | | Move sysdeps/x86_64/cacheinfo.c to sysdeps/x86. No code changes on x86 and x86_64. * sysdeps/i386/cacheinfo.c: Include <sysdeps/x86/cacheinfo.c> instead of <sysdeps/x86_64/cacheinfo.c>. * sysdeps/x86_64/cacheinfo.c: Moved to ... * sysdeps/x86/cacheinfo.c: Here.
* When disabling SSE, make sure -fpmath is not set to use SSE eitherKhem Raj2016-04-091-1/+2
| | | | | | | | | | | | | | | This fixes errors when we inject sse options through CFLAGS and now that we have -Werror turned on by default this warning turns into an error on x86: $ gcc -m32 -march=core2 -mtune=core2 -msse3 -mfpmath=sse -x c /dev/null -S -mno-sse -mno-mmx /dev/null:1:0: warning: SSE instruction set disabled, using 387 arithmetics Where as: $ gcc -m32 -march=core2 -mtune=core2 -msse3 -mfpmath=sse -x c /dev/null -S -mno-sse -mno-mmx -mfpmath=387 Generates no warnings.
* configure: fix `test ==` usageMike Frysinger2016-04-092-2/+2
| | | | | POSIX defines the = operator, but not ==. Fix the few places where we incorrectly used ==.
* Improve generic strcspn performanceWilco Dijkstra2016-04-011-18/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Improve strcspn performance using a much faster algorithm. It is kept simple so it works well on most targets. It is generally at least 10 times faster than the existing implementation on bench-strcspn on a few AArch64 implementations, and for some tests 100 times as fast (repeatedly calling strchr on a small string is extremely slow...). In fact the string/bits/string2.h inlines make no longer sense, as GCC already uses strlen if reject is an empty string, strchrnul is 5 times as fast as __strcspn_c1, while __strcspn_c2 and __strcspn_c3 are slower than the strcspn main loop for large strings (though reject length 2-4 could be special cased in the future to gain even more performance). Tested on x86_64, i686, and aarch64. * string/Version (libc): Add GLIBC_2.24. * string/strcspn.c (strcspn): Rewrite function. * string/bits/string2.h (strcspn): Use __builtin_strcspn. (__strcspn_c1): Remove inline function. (__strcspn_c2): Likewise. (__strcspn_c3): Likewise. * string/string-inline.c [SHLIB_COMPAT(libc, GLIBC_2_1_1, GLIBC_2_24)] (__strcspn_c1): Add compatibility symbol. [SHLIB_COMPAT(libc, GLIBC_2_1_1, GLIBC_2_24)] (__strcspn_c2): Likewise. [SHLIB_COMPAT(libc, GLIBC_2_1_1, GLIBC_2_24)] (__strcspn_c3): Likewise. * sysdeps/i386/string-inlines.c: Include generic string-inlines.c.