about summary refs log tree commit diff
Commit message (Collapse)AuthorAgeFilesLines
...
* x86: Add EVEX optimized memchr family not safe for RTMNoah Goldstein2021-05-0810-41/+217
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | No bug. This commit adds a new implementation for EVEX memchr that is not safe for RTM because it uses vzeroupper. The benefit is that by using ymm0-ymm15 it can use vpcmpeq and vpternlogd in the 4x loop which is faster than the RTM safe version which cannot use vpcmpeq because there is no EVEX encoding for the instruction. All parts of the implementation aside from the 4x loop are the same for the two versions and the optimization is only relevant for large sizes. Tigerlake: size , algn , Pos , Cur T , New T , Win , Dif 512 , 6 , 192 , 9.2 , 9.04 , no-RTM , 0.16 512 , 7 , 224 , 9.19 , 8.98 , no-RTM , 0.21 2048 , 0 , 256 , 10.74 , 10.54 , no-RTM , 0.2 2048 , 0 , 512 , 14.81 , 14.87 , RTM , 0.06 2048 , 0 , 1024 , 22.97 , 22.57 , no-RTM , 0.4 2048 , 0 , 2048 , 37.49 , 34.51 , no-RTM , 2.98 <-- Icelake: size , algn , Pos , Cur T , New T , Win , Dif 512 , 6 , 192 , 7.6 , 7.3 , no-RTM , 0.3 512 , 7 , 224 , 7.63 , 7.27 , no-RTM , 0.36 2048 , 0 , 256 , 8.48 , 8.38 , no-RTM , 0.1 2048 , 0 , 512 , 11.57 , 11.42 , no-RTM , 0.15 2048 , 0 , 1024 , 17.92 , 17.38 , no-RTM , 0.54 2048 , 0 , 2048 , 30.37 , 27.34 , no-RTM , 3.03 <-- test-memchr, test-wmemchr, and test-rawmemchr are all passing. Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com> Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86-64: Fix an unknown vector operation in memchr-evex.SAlice Xu2021-05-071-1/+1
| | | | | | | An unknown vector operation occurred in commit 2a76821c308. Fixed it by using "ymm{k1}{z}" but not "ymm {k1} {z}". Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* powerpc64le: Fix ifunc selection for memset, memmove, bzero and bcopyRaoni Fassina Firmino2021-05-075-20/+22
| | | | | | | | | The hwcap2 check for the aforementioned functions should check for both PPC_FEATURE2_ARCH_3_1 and PPC_FEATURE2_HAS_ISEL but was mistakenly checking for any one of them, enabling isa 3.1 version of the functions in incompatible processors, like POWER8. Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
* malloc: Make tunable callback functions staticH.J. Lu2021-05-071-2/+2
| | | | | Since malloc tunable callback functions are only used within the same file, we should make them static.
* linux: implement ttyname as a wrapper around ttyname_r.Érico Nogueira2021-05-073-161/+15
| | | | | | | | | | | | | Big win in binary size and avoids duplicating the logic in multiple places. On x86_64, dropped from 1883206 to 1881790, a 1416 byte decrease. Also changed logic to track if ttyname_buf has been allocated by checking if it's NULL instead of tracking buflen as an additional variable. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* linux: use fd_to_filename instead of _fitoa_word in ttyname_r.Érico Nogueira2021-05-071-5/+3
| | | | | | | | | Simplifies the logic and makes intent clearer, while at the same time decreasing binary size. On x86_64, dropped from 1883270 to 1883206, a 64 byte decrease. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* misc: use _fitoa_word to implement __fd_to_filename.Érico Nogueira2021-05-071-5/+2
| | | | | | | | | | In a default build for x86_64, size decreased by 24 bytes: 1883294 to 1883270. Aditionally, avoids repeating the number printing logic in multiple places. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* linux: Remove /proc/cpuinfo fallback on alpha and sparcAdhemerval Zanella2021-05-073-97/+1
| | | | | | | | | There is no much gain in fallback to cpuinfo if sysfs is no present, usually on restricted environment neither will be present. It also simplifies the code and make all architecture use the sched_getaffinity as the sysfs fallback. Checked on sparc64-linux-gnu.
* linux: Use sched_getaffinity for __get_nprocs (BZ #27645)Adhemerval Zanella2021-05-078-338/+31
| | | | | | | | Both the sysfs and procfs parsing (through GET_NPROCS_PARSER) are removed in favor the syscall. The initial scratch buffer should fit to most of the common usage (1024 bytes with maps to 8192 CPUs). Checked on x86_64-linux-gnu and aarch64-linux-gnu.
* Remove architecture specific sched_cpucount optimizationsAdhemerval Zanella2021-05-076-130/+14
| | | | | | | | | | | | | | And replace the generic algorithm with the Brian Kernighan's one. GCC optimize it with popcnt if the architecture supports, so there is no need to add the extra POPCNT define to enable it. This is really a micro-optimization that only adds complexity: recent ABIs already support it (x86-64-v2 or power64le) and it simplifies the code for internal usage, since i686 does not allow an internal iFUNC call. Checked on x86_64-linux-gnu, aarch64-linux-gnu, and powerpc64le-linux-gnu.
* Run $(objpfx)iconvconfig with $(run-program-prefix) [BZ #27477]H.J. Lu2021-05-071-2/+3
| | | | | | | | | | | | | | | When glibc is configured with --enable-hardcoded-path-in-tests, "make xcheck" failed with ... env GCONV_PATH=/export/build/gnu/tools-build/glibc-cet-gitlab/build-x86_64-linux/iconvdata LOCPATH=/export/build/gnu/tools-build/glibc-cet-gitlab/build-x86_64-linux/localedata LC_ALL=C /export/build/gnu/tools-build/glibc-cet-gitlab/build-x86_64-linux/iconv/iconvconfig --output=$tmp --nostdlib /usr/lib64/gconv; ... /export/build/gnu/tools-build/glibc-cet-gitlab/build-x86_64-linux/iconv/iconvconfig: /lib64/libc.so.6: version `GLIBC_2.34' not found (required by /export/build/gnu/tools-build/glibc-cet-gitlab/build-x86_64-linux/iconv/iconvconfig) ... FAIL: iconv/test-iconvconfig Since $(objpfx)iconvconfig is an installed program, run it with $(run-program-prefix).
* Use the correct diagnostic macro.Martin Sebor2021-05-061-1/+1
|
* Annotate additional APIs with GCC attribute access.Martin Sebor2021-05-0616-45/+95
| | | | | | | | | | | | | | | | This change continues the improvements to compile-time out of bounds checking by decorating more APIs with either attribute access, or by explicitly providing the array bound in APIs such as tmpnam() that expect arrays of some minimum size as arguments. (The latter feature is new in GCC 11.) The only effects of the attribute and/or the array bound is to check and diagnose calls to the functions that fail to provide a sufficient number of elements, and the definitions of the functions that access elements outside the specified bounds. (There is no interplay with _FORTIFY_SOURCE here yet.) Tested with GCC 7 through 11 on x86_64-linux.
* nptl: Move pthread_barrierattr_setpshared into libcFlorian Weimer2021-05-0664-33/+76
| | | | | | The symbol was moved using scripts/move-symbol-to-libc.py. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* nptl: Move pthread_barrierattr_getpshared into libcFlorian Weimer2021-05-0664-33/+76
| | | | | | The symbol was moved using scripts/move-symbol-to-libc.py. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* nptl: Move pthread_barrierattr_init into libcFlorian Weimer2021-05-0664-33/+76
| | | | | | The symbol was moved using scripts/move-symbol-to-libc.py. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* nptl: Move pthread_barrierattr_destroy into libcFlorian Weimer2021-05-0664-33/+76
| | | | | | The symbol was moved using scripts/move-symbol-to-libc.py. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* nptl: Move pthread_barrier_wait into libcFlorian Weimer2021-05-0664-34/+80
| | | | | | The symbol was moved using scripts/move-symbol-to-libc.py. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* nptl: Move pthread_barrier_init into libcFlorian Weimer2021-05-0665-35/+83
| | | | | | The symbol was moved using scripts/move-symbol-to-libc.py. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* nptl: Move pthread_barrier_destroy into libcFlorian Weimer2021-05-0664-33/+76
| | | | | | The symbol was moved using scripts/move-symbol-to-libc.py. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* nptl: Move sem_trywait, sem_wait into libcFlorian Weimer2021-05-0564-83/+170
| | | | | | The symbols were moved using scripts/move-symbol-to-libc.py. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* nptl: Move sem_unlink into libcFlorian Weimer2021-05-0565-33/+93
| | | | | | | | | | | | The symbol was moved using scripts/move-symbol-to-libc.py. A small adjust to the sem_unlink implementation is necessary to avoid a check-localplt failure. A placeholder symbol to keep the GLIBC_2.1.1 version alive in libpthread is added with this commit. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* nptl: Move sem_timedwait into libcFlorian Weimer2021-05-0565-37/+81
| | | | | | The symbol was moved using scripts/move-symbol-to-libc.py. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* nptl: Move sem_post into libcFlorian Weimer2021-05-0564-42/+84
| | | | | | The symbol was moved using scripts/move-symbol-to-libc.py. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* nptl: Move sem_init into libcFlorian Weimer2021-05-0564-43/+84
| | | | | | The symbol was moved using scripts/move-symbol-to-libc.py. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* nptl: Move sem_getvalue into libcFlorian Weimer2021-05-0564-42/+86
| | | | | | The symbol was moved using scripts/move-symbol-to-libc.py. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* nptl: Move sem_destroy into libcFlorian Weimer2021-05-0564-42/+84
| | | | | | The symbol was moved using scripts/move-symbol-to-libc.py. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* nptl: Move sem_close, sem_open into libcFlorian Weimer2021-05-0570-82/+182
| | | | | | | | | | | The symbols were moved using move-symbol-to-libc.py. Both functions are moved at the same time because they depend on internal functions in sysdeps/pthread/sem_routines.c, which are moved in this commit as well. Additional hidden prototypes are required to avoid check-localplt failures. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* nptl: Move sem_clockwait into libcFlorian Weimer2021-05-0566-37/+111
| | | | | | | | | The symbol was moved using scripts/move-symbol-to-libc.py. A new placeholder version is added at version GLIBC_2.30, to preserve that version in libpthread.so.0. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* nptl: Consolidate async cancel enable/disable implementation in libcFlorian Weimer2021-05-0514-110/+25
| | | | | | | | | | | | | | Previously, the source file nptl/cancellation.c was compiled multiple times, for libc, libpthread, librt. This commit switches to a single implementation, with new __pthread_enable_asynccancel@@GLIBC_PRIVATE, __pthread_disable_asynccancel@@GLIBC_PRIVATE exports. The almost-unused CANCEL_ASYNC and CANCEL_RESET macros are replaced by LIBC_CANCEL_ASYNC and LIBC_CANCEL_ASYNC macros. They call the __pthread_* functions unconditionally now. The macros are still needed because shared code uses them; Hurd has different definitions. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* nptl: Move pthread_testcancel into libcFlorian Weimer2021-05-0565-36/+79
| | | | | | | | | The symbol was moved using scripts/move-symbol-to-libc.py. A temporary __pthread_testcancel@@GLIBC_PRIVATE export is created because it is needed by the semaphore implementation. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* elf, nptl: Initialize static TLS directly in ld.soFlorian Weimer2021-05-059-44/+65
| | | | | | | | | | The stack list is available in ld.so since commit 1daccf403b1bd86370eb94edca794dc106d02039 ("nptl: Move stack list variables into _rtld_global"), so it's possible to walk the stack list directly in ld.so and perform the initialization there. This eliminates an unprotected function pointer from _rtld_global and reduces the libpthread initialization code.
* posix: Fix Hurd build failure in tst-execveatFlorian Weimer2021-05-041-1/+4
| | | | | This avoids a -Werror compilation failure due to unused local variables.
* x86: Optimize memchr-evex.SNoah Goldstein2021-05-031-225/+322
| | | | | | | | | | | | No bug. This commit optimizes memchr-evex.S. The optimizations include replacing some branches with cmovcc, avoiding some branches entirely in the less_4x_vec case, making the page cross logic less strict, saving some ALU in the alignment process, and most importantly increasing ILP in the 4x loop. test-memchr, test-rawmemchr, and test-wmemchr are all passing. Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com> Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86: Optimize memchr-avx2.SNoah Goldstein2021-05-031-178/+247
| | | | | | | | | | | No bug. This commit optimizes memchr-avx2.S. The optimizations include replacing some branches with cmovcc, avoiding some branches entirely in the less_4x_vec case, making the page cross logic less strict, asaving a few instructions the in loop return loop. test-memchr, test-rawmemchr, and test-wmemchr are all passing. Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com> Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* linux: use __fd_to_filename helper function instead of snprintf.Érico Nogueira2021-05-032-16/+7
| | | | | Change made to fchmodat and fexecve. There are tests using xasprintf instead of this helper as well, but this commit doesn't touch them.
* linux: Add execveat system call wrapperAlexandra Hájková2021-05-0340-2/+305
| | | | | | | | | | | It operates similar to execve and it is is already used to implement fexecve without requiring /proc to be mounted. However, different than fexecve, if the syscall is not supported by the kernel an error is returned instead of trying a fallback. Checked on x86_64-linux-gnu and powerpc64le-linux-gnu. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* Bench: Expand bench-memchr.cNoah Goldstein2021-05-031-0/+13
| | | | | | | | No bug. This commit adds some additional cases for bench-memchr.c including testing medium sizes and testing short length with both an inbound match and out of bound match. Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
* locale: Align _nl_C_LC_CTYPE_class and _nl_C_LC_CTYPE_class32Lirong Yuan2021-05-031-2/+3
| | | | | Otherwise, programs that use character classification macros such as isspace may observe unaligned pointers.
* nptl: Re-sort Versions fileFlorian Weimer2021-05-031-2/+2
| | | | | Due to an incorrect conflict resolution, libc/GLIBC_2.2 section was no longer sorted lexicographically.
* x86: Set rep_movsb_threshold to 2112 on processors with FSRMH.J. Lu2021-05-031-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | The glibc memcpy benchmark on Intel Core i7-1065G7 (Ice Lake) showed that REP MOVSB became faster after 2112 bytes: Vector Move REP MOVSB length=2112, align1=0, align2=0: 24.20 24.40 length=2112, align1=1, align2=0: 26.07 23.13 length=2112, align1=0, align2=1: 27.18 28.13 length=2112, align1=1, align2=1: 26.23 25.16 length=2176, align1=0, align2=0: 23.18 22.52 length=2176, align1=2, align2=0: 25.45 22.52 length=2176, align1=0, align2=2: 27.14 27.82 length=2176, align1=2, align2=2: 22.73 25.56 length=2240, align1=0, align2=0: 24.62 24.25 length=2240, align1=3, align2=0: 29.77 27.15 length=2240, align1=0, align2=3: 35.55 29.93 length=2240, align1=3, align2=3: 34.49 25.15 length=2304, align1=0, align2=0: 34.75 26.64 length=2304, align1=4, align2=0: 32.09 22.63 length=2304, align1=0, align2=4: 28.43 31.24 Use REP MOVSB for data size > 2112 bytes in memcpy on processors with fast short REP MOVSB (FSRM). * sysdeps/x86/dl-cacheinfo.h (dl_init_cacheinfo): Set rep_movsb_threshold to 2112 on processors with fast short REP MOVSB (FSRM).
* bench-memcpy: Collect data from 2KB to 4KBH.J. Lu2021-05-031-0/+8
| | | | Collect data on memcpy from 2KB to 4KB with the 64-byte increment value.
* stdio: fix vfscanf with matches longer than INT_MAX (bug 27650)Alyssa Ross2021-05-031-9/+4
| | | | | | | | | | Patterns like %*[ can safely be used to match a great many characters, and it's quite realisitic to use them for more than INT_MAX characters from an IO stream. With the previous approach, after INT_MAX characters (v)fscanf would return successfully, indicating an end to the match, even though there wasn't one.
* nptl: Move pthread_yield into libc, as a compatibility symbolFlorian Weimer2021-05-0366-40/+53
| | | | | | | | | | | | | And deprecate it in <pthread.h>, redirecting it to sched_yield for the time being. The symbol was moved using scripts/move-symbol-to-libc.py. No GLIBC_2.34 symbol version is added because of the compatibility symbol status. Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>
* nptl: Move pthread_rwlockattr_setpshared into libcFlorian Weimer2021-05-0364-33/+76
| | | | | | | The symbol was moved using scripts/move-symbol-to-libc.py. Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>
* nptl: Move pthread_rwlockattr_setkind_np into libcFlorian Weimer2021-05-0364-33/+76
| | | | | | | The symbol was moved using scripts/move-symbol-to-libc.py. Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>
* nptl: Move pthread_rwlockattr_init into libcFlorian Weimer2021-05-0364-33/+76
| | | | | | | The symbol was moved using scripts/move-symbol-to-libc.py. Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>
* nptl: Move pthread_rwlockattr_getpshared into libcFlorian Weimer2021-05-0364-33/+77
| | | | | | | The symbol was moved using scripts/move-symbol-to-libc.py. Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>
* nptl: Move pthread_rwlockattr_getkind_np into libcFlorian Weimer2021-05-0364-33/+76
| | | | | | | The symbol was moved using scripts/move-symbol-to-libc.py. Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>
* nptl: Move pthread_rwlockattr_destroy into libcFlorian Weimer2021-05-0364-33/+76
| | | | | | | The symbol was moved using scripts/move-symbol-to-libc.py. Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>