Commit message | Author | Age | Files | Lines
* x86-64: Add vector log1p/log1pf implementation to libmvec | Sunil K Pandey | 2021-12-29 | 50 | -1/+4447
  Implement vectorized log1p/log1pf containing SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI. It also contains accuracy and ABI tests for vector log1p/log1pf with regenerated ulps.
  Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86-64: Add vector log2/log2f implementation to libmvec | Sunil K Pandey | 2021-12-29 | 50 | -1/+4208
  Implement vectorized log2/log2f containing SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI. It also contains accuracy and ABI tests for vector log2/log2f with regenerated ulps.
  Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86-64: Add vector log10/log10f implementation to libmvec | Sunil K Pandey | 2021-12-29 | 50 | -1/+3758
  Implement vectorized log10/log10f containing SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI. It also contains accuracy and ABI tests for vector log10/log10f with regenerated ulps.
  Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86-64: Add vector atan2/atan2f implementation to libmvec | Sunil K Pandey | 2021-12-29 | 50 | -1/+3117
  Implement vectorized atan2/atan2f containing SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI. It also contains accuracy and ABI tests for vector atan2/atan2f with regenerated ulps.
  Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86-64: Add vector cbrt/cbrtf implementation to libmvec | Sunil K Pandey | 2021-12-29 | 50 | -1/+3031
  Implement vectorized cbrt/cbrtf containing SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI. It also contains accuracy and ABI tests for vector cbrt/cbrtf with regenerated ulps.
  Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86-64: Add vector sinh/sinhf implementation to libmvec | Sunil K Pandey | 2021-12-29 | 50 | -1/+2894
  Implement vectorized sinh/sinhf containing SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI. It also contains accuracy and ABI tests for vector sinh/sinhf with regenerated ulps.
  Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86-64: Add vector expm1/expm1f implementation to libmvec | Sunil K Pandey | 2021-12-29 | 50 | -1/+2725
  Implement vectorized expm1/expm1f containing SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI. It also contains accuracy and ABI tests for vector expm1/expm1f with regenerated ulps.
  Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86-64: Add vector cosh/coshf implementation to libmvec | Sunil K Pandey | 2021-12-29 | 50 | -1/+2637
  Implement vectorized cosh/coshf containing SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI. It also contains accuracy and ABI tests for vector cosh/coshf with regenerated ulps.
  Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86-64: Add vector exp10/exp10f implementation to libmvec | Sunil K Pandey | 2021-12-29 | 50 | -1/+2617
  Implement vectorized exp10/exp10f containing SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI. It also contains accuracy and ABI tests for vector exp10/exp10f with regenerated ulps.
  Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86-64: Add vector exp2/exp2f implementation to libmvec | Sunil K Pandey | 2021-12-29 | 50 | -1/+2293
  Implement vectorized exp2/exp2f containing SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI. It also contains accuracy and ABI tests for vector exp2/exp2f with regenerated ulps.
  Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86-64: Add vector hypot/hypotf implementation to libmvec | Sunil K Pandey | 2021-12-29 | 50 | -1/+2151
  Implement vectorized hypot/hypotf containing SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI. It also contains accuracy and ABI tests for vector hypot/hypotf with regenerated ulps.
  Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86-64: Add vector asin/asinf implementation to libmvec | Sunil K Pandey | 2021-12-29 | 50 | -1/+2189
  Implement vectorized asin/asinf containing SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI. It also contains accuracy and ABI tests for vector asin/asinf with regenerated ulps.
  Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86-64: Add vector atan/atanf implementation to libmvec | Sunil K Pandey | 2021-12-29 | 50 | -1/+1741
  Implement vectorized atan/atanf containing SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI. It also contains accuracy and ABI tests for vector atan/atanf with regenerated ulps.
  Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* elf: Add _dl_find_object function | Florian Weimer | 2021-12-28 | 73 | -18/+2038
  It can be used to speed up the libgcc unwinder, and the internal _dl_find_dso_for_object function (which is used for caller identification in dlopen and related functions, and in dladdr).

  _dl_find_object is in the internal namespace due to bug 28503. If libgcc switches to _dl_find_object, this namespace issue will be fixed. It is located in libc for two reasons: it is necessary to forward the call to the static libc after static dlopen, and there is a link ordering issue with -static-libgcc and libgcc_eh.a because libc.so is not a linker script that includes ld.so in the glibc build tree (so that GCC's internal -lc after libgcc_eh.a does not pick up ld.so).

  It is necessary to do the i386 customization in the sysdeps/x86/bits/dl_find_object.h header shared with x86-64 because otherwise, multilib installations are broken.

  The implementation uses software transactional memory, as suggested by Torvald Riegel. Two copies of the supporting data structures are used, also achieving full async-signal-safety.

  Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
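The commit message above mentions two copies of the supporting data and a transactional read scheme. The following is a minimal sketch of that general idea, a versioned two-copy table with optimistic reads. It is not glibc's actual code: struct table, table_update and table_contains are hypothetical names, the table holds a single address range for brevity, and a single writer is assumed.

```c
#include <stdatomic.h>
#include <stdint.h>

/* One address range; a real table would hold many (hypothetical layout). */
struct range {
    atomic_uintptr_t start;
    atomic_uintptr_t end;
};

struct table {
    atomic_ulong version;   /* low bit selects the currently active copy */
    struct range copy[2];
};

/* Writer (assumed to be serialized externally): fill the inactive copy,
   then publish it by bumping the version.  */
void table_update(struct table *t, uintptr_t start, uintptr_t end)
{
    unsigned long v = atomic_load_explicit(&t->version, memory_order_relaxed);
    struct range *r = &t->copy[(v + 1) & 1];

    /* Order the previous version bump before the stores below, so a reader
       that observes a new field value also observes the version change.  */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&r->start, start, memory_order_relaxed);
    atomic_store_explicit(&r->end, end, memory_order_relaxed);
    atomic_store_explicit(&t->version, v + 1, memory_order_release);
}

/* Reader: never blocks and takes no lock; it retries if the version
   changed while it was reading.  Returns 1 if addr is in the range.  */
int table_contains(struct table *t, uintptr_t addr)
{
    for (;;) {
        unsigned long v = atomic_load_explicit(&t->version, memory_order_acquire);
        struct range *r = &t->copy[v & 1];
        uintptr_t s = atomic_load_explicit(&r->start, memory_order_relaxed);
        uintptr_t e = atomic_load_explicit(&r->end, memory_order_relaxed);

        atomic_thread_fence(memory_order_acquire);
        if (atomic_load_explicit(&t->version, memory_order_relaxed) == v)
            return addr >= s && addr < e;
    }
}
```

Because the reader takes no lock, a lookup of this shape can run inside a signal handler that interrupted the writer, which is the property the commit message refers to as async-signal-safety.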
* malloc: Remove memusage.h | Adhemerval Zanella | 2021-12-28 | 24 | -406/+20
  And use machine-sp.h instead. The Linux implementation is based on the already provided CURRENT_STACK_FRAME (used in nptl code), and STACK_GROWS_UPWARD is replaced with _STACK_GROWS_UP.
* malloc: Use hp-timing on libmemusage | Adhemerval Zanella | 2021-12-28 | 5 | -24/+21
  Instead of reimplementing it in the GETTIME macro.
* Remove atomic-machine.h atomic typedefs | Adhemerval Zanella | 2021-12-28 | 19 | -386/+8
  Now that memusage.c uses generic types we can remove them.
* malloc: Remove atomic_* usage | Adhemerval Zanella | 2021-12-28 | 4 | -36/+18
  These typedefs are used solely in memusage and can be replaced with generic types.
* microblaze: Add missing implementation when !__ASSUME_TIME64_SYSCALLS | Thomas Petazzoni | 2021-12-28 | 1 | -1/+3
  In commit a92f4e6299fe0e3cb6f77e79de00817aece501ce ("linux: Add time64 pselect support"), a Microblaze specific implementation of __pselect32() was added to cover the case of kernels < 3.15 which lack the pselect6 system call. This new file sysdeps/unix/sysv/linux/microblaze/pselect32.c takes precedence over the default implementation sysdeps/unix/sysv/linux/pselect32.c.

  However sysdeps/unix/sysv/linux/pselect32.c provides an implementation of __pselect32() which is needed when __ASSUME_TIME64_SYSCALLS is not defined. On Microblaze, which is a 32-bit architecture, __ASSUME_TIME64_SYSCALLS is only true for kernels >= 5.1.

  Due to sysdeps/unix/sysv/linux/microblaze/pselect32.c taking precedence over sysdeps/unix/sysv/linux/pselect32.c, it means that when we are with a kernel >= 3.15 but < 5.1, we need a __pselect32() implementation, but sysdeps/unix/sysv/linux/microblaze/pselect32.c doesn't provide it, and sysdeps/unix/sysv/linux/pselect32.c which would provide it is not compiled in.

  This causes the following build failure on Microblaze with for example Linux kernel headers 4.9:

    [...]/build/libc_pic.os: in function `__pselect64':
    (.text+0x120b44): undefined reference to `__pselect32'
    collect2: error: ld returned 1 exit status

  Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
  Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* elf: Do not fail for failed dlmopen on audit modules (BZ #28061) | Adhemerval Zanella | 2021-12-28 | 4 | -2/+87
  dl_main sets the LM_ID_BASE state to RT_ADD just before it starts loading new shared objects. The state is set to RT_CONSISTENT just after all objects are loaded.

  However, if an audit module tries to dlmopen a nonexistent module, _dl_open will assert that the namespace is in an inconsistent state.

  This is different from dlopen, since first it will not use LM_ID_BASE and second _dl_map_object_from_fd is solely responsible for setting and resetting the r_state value.

  So the assert on _dl_open can not really be seen if the state is consistent, since _dt_main resets it. This patch removes the assert.

  Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
  Reviewed-by: Florian Weimer <fweimer@redhat.com>
* elf: Issue audit la_objopen for vDSO | Adhemerval Zanella | 2021-12-28 | 7 | -8/+199
  The vDSO is listed in the link_map chain, but is never the subject of an la_objopen call. A new internal flag __RTLD_VDSO is added that acts as __RTLD_OPENEXEC to allocate the required 'struct auditstate' extra space for the 'struct link_map'.

  The return value from the callback is currently ignored, since there is no PLT call involved by glibc when using the vDSO, nor is the vDSO exported directly.

  Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
  Reviewed-by: Florian Weimer <fweimer@redhat.com>
* elf: Add audit tests for modules with TLSDESC | Adhemerval Zanella | 2021-12-28 | 6 | -0/+242
  Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
  Reviewed-by: Florian Weimer <fweimer@redhat.com>
* elf: Avoid unnecessary slowdown from profiling with audit (BZ#15533) | Adhemerval Zanella | 2021-12-28 | 11 | -10/+294
  The rtld-audit interface introduces a slowdown due to enabling profiling instrumentation (as if LD_AUDIT implied LD_PROFILE). However, instrumenting is only necessary if one of the audit libraries provides PLT callbacks (la_pltenter or la_pltexit symbols). Otherwise, the slowdown can be avoided.

  The following patch adjusts the logic that enables profiling to iterate over all audit modules and check if any of those provides a PLT hook. To keep la_symbind working even without PLT callbacks, _dl_fixup now calls the audit callback if the module implements it.

  Co-authored-by: Alexander Monakov <amonakov@ispras.ru>
  Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
  Reviewed-by: Florian Weimer <fweimer@redhat.com>
* elf: Add _dl_audit_pltexit | Adhemerval Zanella | 2021-12-28 | 27 | -122/+158
  It consolidates the code required to call the la_pltexit audit callback.
  Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
  Reviewed-by: Florian Weimer <fweimer@redhat.com>
* elf: Add _dl_audit_pltenter | Adhemerval Zanella | 2021-12-28 | 3 | -72/+82
  It consolidates the code required to call the la_pltenter audit callback.
  Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
  Reviewed-by: Florian Weimer <fweimer@redhat.com>
* elf: Add _dl_audit_preinit | Adhemerval Zanella | 2021-12-28 | 4 | -21/+22
  It consolidates the code required to call the la_preinit audit callback.
  Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
  Reviewed-by: Florian Weimer <fweimer@redhat.com>
* elf: Add _dl_audit_symbind_alt and _dl_audit_symbind | Adhemerval Zanella | 2021-12-28 | 5 | -124/+135
  It consolidates the code required to call the la_symbind{32,64} audit callback.
  Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
  Reviewed-by: Florian Weimer <fweimer@redhat.com>
* elf: Add _dl_audit_objclose | Adhemerval Zanella | 2021-12-28 | 4 | -34/+27
  It consolidates the code required to call the la_objclose audit callback.
  Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
  Reviewed-by: Florian Weimer <fweimer@redhat.com>
* elf: Add _dl_audit_objsearch | Adhemerval Zanella | 2021-12-28 | 3 | -49/+47
  It consolidates the code required to call the la_objsearch audit callback.
  Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
  Reviewed-by: Florian Weimer <fweimer@redhat.com>
* elf: Add _dl_audit_activity_map and _dl_audit_activity_nsid | Adhemerval Zanella | 2021-12-28 | 6 | -111/+45
  It consolidates the code required to call the la_activity audit callback.

  Also, for a new Lmid_t the namespace link_map list is empty, so it must be checked before use. This can happen when an audit module is used along with dlmopen.

  Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
  Reviewed-by: Florian Weimer <fweimer@redhat.com>
* elf: Add _dl_audit_objopen | Adhemerval Zanella | 2021-12-28 | 5 | -38/+49
  It consolidates the code required to call the la_objopen audit callback.
  Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
  Reviewed-by: Florian Weimer <fweimer@redhat.com>
* hurd: Fix static-PIE startup | Samuel Thibault | 2021-12-28 | 14 | -40/+140
  hurd initialization stages use RUN_HOOK to run various initialization functions. That is however using absolute addresses which need to be relocated, which is done later by csu. We can however easily make the linker compute relative addresses which thus don't need a relocation. The new SET_RELHOOK and RUN_RELHOOK macros implement this.
* hurd: let csu initialize tls | Samuel Thibault | 2021-12-28 | 3 | -49/+24
  Since 9cec82de715b ("htl: Initialize later"), we let csu initialize pthreads. We can thus let it initialize tls later too, to better align with the generic order. Initialization however accesses ports which links/unlinks into the sigstate for unwinding. We can however easily skip that during initialization.
* hurd: Fix XFAIL-ing mallocfork2 tests | Samuel Thibault | 2021-12-27 | 1 | -4/+10
  They are using setpshared but are outside the htl directory.
* hurd: XFAIL more tests that require setpshared support | Samuel Thibault | 2021-12-27 | 1 | -0/+2
* malloc: Add missing shared thread library flags | Samuel Thibault | 2021-12-27 | 1 | -0/+16
* stdio-common: Fix %m sprintf test output for GNU/Hurd | Samuel Thibault | 2021-12-27 | 1 | -0/+10
  GNU/Hurd has slightly different error messages for undefined numbers, due to the notion of error subsystems.
* x86: Optimize L(less_vec) case in memcmpeq-evex.S | Noah Goldstein | 2021-12-27 | 1 | -127/+43
  No bug. Optimizations are twofold.

  1) Replace page cross and 0/1 checks with masked load instructions in L(less_vec). In applications this reduces branch-misses in the hot [0, 32] case.
  2) Change control flow so that the L(less_vec) case gets the fall through.

  Change 2) helps copies in the [0, 32] size range but comes at the cost of copies in the [33, 64] size range. From profiles of GCC and Python3, 94%+ and 99%+ of calls are in the [0, 32] range so this appears to be the right tradeoff.

  Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
  Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* x86: Optimize L(less_vec) case in memcmp-evex-movbe.S | Noah Goldstein | 2021-12-27 | 1 | -193/+56
  No bug. Optimizations are twofold.

  1) Replace page cross and 0/1 checks with masked load instructions in L(less_vec). In applications this reduces branch-misses in the hot [0, 32] case.
  2) Change control flow so that the L(less_vec) case gets the fall through.

  Change 2) helps copies in the [0, 32] size range but comes at the cost of copies in the [33, 64] size range. From profiles of GCC and Python3, 94%+ and 99%+ of calls are in the [0, 32] range so this appears to be the right tradeoff.

  Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
  Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* elf: Remove AArch64 from comment for AT_MINSIGSTKSZ | H.J. Lu | 2021-12-23 | 1 | -2/+1
  Remove AArch64 from comment for AT_MINSIGSTKSZ to match

    commit 7cd60e43a6def40ecb75deb8decc677995970d0b
    Author: Chang S. Bae <chang.seok.bae@intel.com>
    Date:   Tue May 18 13:03:15 2021 -0700

        uapi/auxvec: Define the aux vector AT_MINSIGSTKSZ

        Define AT_MINSIGSTKSZ in the generic uapi header. It is already used as generic ABI in glibc's generic elf.h, and this define will prevent future namespace conflicts. In particular, x86 is also using this generic definition.

  in Linux kernel 5.14.
* math: Properly cast X_TLOSS to float [BZ #28713] | H.J. Lu | 2021-12-23 | 4 | -6/+11
  Add

    #define AS_FLOAT_CONSTANT_1(x) x##f
    #define AS_FLOAT_CONSTANT(x) AS_FLOAT_CONSTANT_1(x)

  to cast X_TLOSS to float at compile-time to fix:

    FAIL: math/test-float-j0
    FAIL: math/test-float-jn
    FAIL: math/test-float-y0
    FAIL: math/test-float-y1
    FAIL: math/test-float-yn
    FAIL: math/test-float32-j0
    FAIL: math/test-float32-jn
    FAIL: math/test-float32-y0
    FAIL: math/test-float32-y1
    FAIL: math/test-float32-yn

  when compiling with GCC 12.

  Reviewed-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
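To show how the two macros quoted above behave, here is a small sketch. The two-level definition makes the preprocessor expand X_TLOSS to its numeric text before the f suffix is pasted on, so the result is a float literal and not a double converted at run time. The value given to X_TLOSS and the helper exceeds_tloss are assumptions made for illustration; they are not taken from the commit.

```c
#define AS_FLOAT_CONSTANT_1(x) x##f
#define AS_FLOAT_CONSTANT(x) AS_FLOAT_CONSTANT_1(x)

/* Assumed value, for illustration only.  */
#define X_TLOSS 1.41484755040568800000e+16

/* Hypothetical helper: compares in float, with no promotion of x to
   double, because the macro yields the literal 1.41484755040568800000e+16f. */
int exceeds_tloss(float x)
{
    return x > AS_FLOAT_CONSTANT(X_TLOSS);
}
```

A single-level macro would not work here: AS_FLOAT_CONSTANT_1(X_TLOSS) pastes onto the unexpanded name and produces the identifier X_TLOSSf.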
* Set default __TIMESIZE default to 64 | Adhemerval Zanella | 2021-12-23 | 14 | -15/+216
  This is the expected size for newer ABIs.
* stdio: Implement %#m for vfprintf and related functions | Florian Weimer | 2021-12-23 | 5 | -8/+125
  %#m prints errno as an error constant if one is available, or a decimal number as a fallback. This intends to address the gap that strerrorname_np does not work well with printf for unknown error codes due to its NULL return values in those cases.

  Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* elf: Remove unused NEED_DL_BASE_ADDR and _dl_base_addr | Florian Weimer | 2021-12-23 | 1 | -8/+0
  Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* x86-64: Add vector acos/acosf implementation to libmvec | Sunil K Pandey | 2021-12-22 | 51 | -1/+2313
  Implement vectorized acos/acosf containing SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI. It also contains accuracy and ABI tests for vector acos/acosf with regenerated ulps.
  Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* intl/plural.y: Avoid conflicting declarations of yyerror and yylex | Andrea Monaco | 2021-12-22 | 1 | -0/+5
  bison-3.8 includes these lines in the generated intl/plural.c:

    #if !defined __gettexterror && !defined YYERROR_IS_DECLARED
    void __gettexterror (struct parse_args *arg, const char *msg);
    #endif
    #if !defined __gettextlex && !defined YYLEX_IS_DECLARED
    int __gettextlex (YYSTYPE *yylvalp, struct parse_args *arg);
    #endif

  Those default prototypes provided by bison conflict with the declarations later on in plural.y. This patch solves the issue.

  Reviewed-by: Arjun Shankar <arjun@redhat.com>
* elf: Remove excessive p_align check on PT_LOAD segments [BZ #28688] | H.J. Lu | 2021-12-22 | 1 | -7/+2
  p_align does not have to be a multiple of the page size. Only PT_LOAD segment layout should be aligned to the page size.

  1. Remove p_align check against the page size.
  2. Use the page size, instead of p_align, to check PT_LOAD segment layout.

  Reviewed-by: Florian Weimer <fweimer@redhat.com>
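A minimal sketch of the kind of layout check the commit above describes, assuming the usual requirement that a PT_LOAD segment's file offset and virtual address be congruent modulo the page size so the segment can be mapped. The function name load_segment_layout_ok is hypothetical and this is not the loader's actual code; note that p_align is deliberately not consulted.

```c
#include <elf.h>
#include <stdint.h>

/* Return 1 if the segment's layout is compatible with mapping it at page
   granularity, 0 otherwise.  `pagesize` must be a power of two.  */
int load_segment_layout_ok(const Elf64_Phdr *ph, uint64_t pagesize)
{
    if (ph->p_type != PT_LOAD)
        return 1;   /* only PT_LOAD segments are subject to this check */

    /* Offset and address must differ by a multiple of the page size.  */
    return ((ph->p_vaddr - ph->p_offset) & (pagesize - 1)) == 0;
}
```

With this check a segment whose p_align is, say, 16 is accepted as long as its offset and address line up with the page size.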
* s_sincosf.h: Change pio4 type to float [BZ #28713] | H.J. Lu | 2021-12-21 | 1 | -1/+1
  s_cosf.c and s_sinf.c have

    if (abstop12 (y) < abstop12 (pio4))

  where abstop12 takes a float argument, but pio4 is static const double. pio4 is used only in calls to abstop12 and never in arithmetic. Apply

    -static const double pio4 = 0x1.921FB54442D18p-1;
    +static const float pio4 = 0x1.921FB6p-1f;

  to fix:

    FAIL: math/test-float-cos
    FAIL: math/test-float-sin
    FAIL: math/test-float-sincos
    FAIL: math/test-float32-cos
    FAIL: math/test-float32-sin
    FAIL: math/test-float32-sincos

  when compiling with GCC 12.

  Reviewed-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
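The comparison quoted above can be tried in isolation with the sketch below. The definitions of asuint and abstop12 here are assumptions written for this example (abstop12 is taken to return the top 12 bits of the float representation with the sign bit masked off), and is_small_arg is a hypothetical wrapper. Only the pio4 constant is copied from the commit message.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a float as a 32-bit integer.  */
static inline uint32_t asuint(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Top 12 bits of the representation, sign bit cleared (assumed definition). */
static inline uint32_t abstop12(float x)
{
    return (asuint(x) >> 20) & 0x7ff;
}

static const float pio4 = 0x1.921FB6p-1f;

/* Coarse magnitude test: compares only exponent and leading mantissa bits. */
int is_small_arg(float y)
{
    return abstop12(y) < abstop12(pio4);
}
```

Since pio4 is now a float, the argument passed to abstop12 has exactly the type the function expects and no double-to-float conversion is involved.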
* Linux: Fix 32-bit vDSO for clock_gettime on powerpc32 | maminjie | 2021-12-21 | 1 | -1/+1
  When the clock_id is CLOCK_PROCESS_CPUTIME_ID or CLOCK_THREAD_CPUTIME_ID, on the 5.10 kernel powerpc 32-bit, the 32-bit vDSO is executed successfully (because the __kernel_clock_gettime in arch/powerpc/kernel/vdso32/gettimeofday.S does not support these two IDs, the 32-bit time_t syscall will be used), but tp32.tv_sec is equal to 0, causing the 64-bit time_t syscall to continue to be used, resulting in two system calls.

  Fix commit 72e84d1db22203e01a43268de71ea8669eca2863.

  Signed-off-by: maminjie <maminjie2@huawei.com>
  Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* Regenerate ulps on x86_64 with GCC 12 | H.J. Lu | 2021-12-20 | 1 | -1/+1
  Fix

    FAIL: math/test-float-clog10
    FAIL: math/test-float32-clog10

  on Intel Core i7-1165G7 with GCC 12.