about summary refs log tree commit diff
path: root/sysdeps
Commit message (Collapse)AuthorAgeFilesLines
...
* malloc: Assure that THP mode is always null terminatedAdhemerval Zanella2023-04-131-0/+1
|
* hurd: Mark two tests as unsupportedSamuel Thibault2023-04-131-0/+10
| | | | They make the whole testsuite hang/crash.
* hurd: Restore destroying receive rights on sigreturnSamuel Thibault2023-04-131-2/+2
| | | | | Just subtracting a ref is making signal/tst-signal signal/tst-raise signal/tst-minsigstksz-5 htl/tst-raise1 fail.
* Revert "hurd: Only check for TLS initialization inside rtld or in static builds"Samuel Thibault2023-04-116-83/+33
| | | | | | | | This reverts commit b37899d34d2190ef4b454283188f22519f096048. Apparently we load libc.so (and thus start using its functions) before calling TLS_INIT_TP, so libc.so functions should not actually assume that TLS is always set up.
* hurd: Don't leak __hurd_reply_port0Sergey Bugaev2023-04-112-1/+13
| | | | | | | | | | | | | Previously, once we set up TLS, we would implicitly switch from using __hurd_reply_port0 to reply_port inside the TCB, leaving the former unused. But we never deallocated it, so it got leaked. Instead, migrate the port into the new TCB's reply_port slot. This avoids both the port leak and an extra syscall to create a new reply port for the TCB. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-Id: <20230319151017.531737-28-bugaevc@gmail.com>
* hurd: Improve reply port handling when exiting signal handlersSergey Bugaev2023-04-101-16/+5
| | | | | | | | | | | | | | | | | If we're doing signals, that means we've already got the signal thread running, and that implies TLS having been set up. So we know that __hurd_local_reply_port will resolve to THREAD_SELF->reply_port, and can access that directly using the THREAD_GETMEM and THREAD_SETMEM macros. This avoids potential miscompilations, and should also be a tiny bit faster. Also, use mach_port_mod_refs () and not mach_port_destroy () to destroy the receive right. mach_port_destroy () should *never* be used on mach_task_self (); this can easily lead to port use-after-free vulnerabilities if the task has any other references to the same port. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-Id: <20230319151017.531737-26-bugaevc@gmail.com>
* hurd: Only check for TLS initialization inside rtld or in static buildsSergey Bugaev2023-04-106-34/+85
| | | | | | | | | | | | | | | | | | | | | When glibc is built as a shared library, TLS is always initialized by the call of TLS_INIT_TP () macro made inside the dynamic loader, prior to running the main program (see dl-call_tls_init_tp.h). We can take advantage of this: we know for sure that __LIBC_NO_TLS () will evaluate to 0 in all other cases, so let the compiler know that explicitly too. Also, only define _hurd_tls_init () and TLS_INIT_TP () under the same conditions (either !SHARED or inside rtld), to statically assert that this is the case. Other than a microoptimization, this also helps with avoiding awkward sharing of the __libc_tls_initialized variable between ld.so and libc.so that we would have to do otherwise -- we know for sure that no sharing is required, simply because __libc_tls_initialized would always be set to true inside libc.so. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-Id: <20230319151017.531737-25-bugaevc@gmail.com>
* elf: Stop including tls.h in ldsodefs.hSergey Bugaev2023-04-101-1/+0
| | | | | | | Nothing in there needs tls.h Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-Id: <20230319151017.531737-24-bugaevc@gmail.com>
* hurd: Port trampoline.c to x86_64Sergey Bugaev2023-04-101-7/+151
| | | | | Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-Id: <20230403115621.258636-3-bugaevc@gmail.com>
* hurd: Do not declare local variables volatileSergey Bugaev2023-04-101-2/+2
| | | | | | | | | | | | | | These are just regular local variables that are not accessed in any funny ways, not even though a pointer. There's absolutely no reason to declare them volatile. It only ends up hurting the quality of the generated machine code. If anything, it would make sense to decalre sigsp as *pointing* to volatile memory (volatile void *sigsp), but evidently that's not needed either. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-Id: <20230403115621.258636-2-bugaevc@gmail.com>
* hurd: Implement x86_64/intr-msg.hSergey Bugaev2023-04-101-0/+119
| | | | | Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-Id: <20230319151017.531737-18-bugaevc@gmail.com>
* hurd: Add sys/ucontext.h and sigcontext.h for x86_64Sergey Bugaev2023-04-103-0/+326
| | | | | | | This is based on the Linux port's version, but laid out to match Mach's struct i386_thread_state, much like the i386 version does. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
* hurd: Stop depending on the default_pager stubs provided by gnumachFlavio Cruz2023-04-102-4/+2
| | | | | | The hurd source tree already provides the same stubs and they are only needed there. Message-Id: <ZDN3rDdjMowtUWf7@jupiter.tail36e24.ts.net>
* <sys/platform/x86.h>: Add PREFETCHI supportH.J. Lu2023-04-054-0/+7
| | | | | Add PREFETCHI support to <sys/platform/x86.h>. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* <sys/platform/x86.h>: Add AMX-COMPLEX supportH.J. Lu2023-04-054-0/+8
| | | | | Add AMX-COMPLEX support to <sys/platform/x86.h>. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* <sys/platform/x86.h>: Add AVX-NE-CONVERT supportH.J. Lu2023-04-054-0/+8
| | | | | Add AVX-NE-CONVERT support to <sys/platform/x86.h>. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* <sys/platform/x86.h>: Add AVX-VNNI-INT8 supportH.J. Lu2023-04-054-0/+17
| | | | | Add AVX-VNNI-INT8 support to <sys/platform/x86.h>. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* <sys/platform/x86.h>: Add MSRLIST supportH.J. Lu2023-04-052-0/+2
| | | | | Add MSRLIST support to <sys/platform/x86.h>. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* <sys/platform/x86.h>: Add AVX-IFMA supportH.J. Lu2023-04-054-0/+8
| | | | | Add AVX-IFMA support to <sys/platform/x86.h>. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* <sys/platform/x86.h>: Add AMX-FP16 supportH.J. Lu2023-04-054-0/+8
| | | | | Add AMX-FP16 support to <sys/platform/x86.h>. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* <sys/platform/x86.h>: Add WRMSRNS supportH.J. Lu2023-04-052-0/+2
| | | | | Add WRMSRNS support to <sys/platform/x86.h>. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* <sys/platform/x86.h>: Add ArchPerfmonExt supportH.J. Lu2023-04-052-0/+2
| | | | | | Add Architectural Performance Monitoring Extended Leaf (EAX = 23H) support to <sys/platform/x86.h>. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* <sys/platform/x86.h>: Add CMPCCXADD supportH.J. Lu2023-04-054-0/+7
| | | | | Add CMPCCXADD support to <sys/platform/x86.h>. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* <sys/platform/x86.h>: Add LASS supportH.J. Lu2023-04-052-0/+2
| | | | | Add Linear Address Space Separation (LASS) support to <sys/platform/x86.h>. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* <sys/platform/x86.h>: Add RAO-INT supportH.J. Lu2023-04-054-0/+7
| | | | | Add RAO-INT support to <sys/platform/x86.h>. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* <sys/platform/x86.h>: Add LBR supportH.J. Lu2023-04-052-1/+2
| | | | | Add architectural LBR support to <sys/platform/x86.h>. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* <sys/platform/x86.h>: Add RTM_FORCE_ABORT supportH.J. Lu2023-04-052-1/+2
| | | | | Add RTM_FORCE_ABORT support to <sys/platform/x86.h>. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* <sys/platform/x86.h>: Add SGX-KEYS supportH.J. Lu2023-04-052-1/+2
| | | | | Add SGX-KEYS support to <sys/platform/x86.h>. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* <sys/platform/x86.h>: Add BUS_LOCK_DETECT supportH.J. Lu2023-04-052-1/+2
| | | | | | Add Bus lock debug exceptions (BUS_LOCK_DETECT) support to <sys/platform/x86.h>. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* <sys/platform/x86.h>: Add LA57 supportH.J. Lu2023-04-052-1/+2
| | | | | | Add 57-bit linear addresses and five-level paging (LA57) support to <sys/platform/x86.h>. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* <bits/platform/x86.h>: Rename to x86_cpu_INDEX_7_ECX_15H.J. Lu2023-04-051-1/+1
| | | | | | Rename x86_cpu_INDEX_7_ECX_1 to x86_cpu_INDEX_7_ECX_15 for the unused bit 15 in ECX from CPUID with EAX == 0x7 and ECX == 0. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* hppa: Update struct __pthread_rwlock_arch_t comment.John David Anglin2023-04-051-5/+5
| | | | Signed-off-by: John David Anglin <dave.anglin@bell.net>
* hppa: Revise __TIMESIZE define to use __WORDSIZEJohn David Anglin2023-04-051-1/+3
| | | | | | Handle both 32 and 64-bit ABIs. Signed-off-by: John David Anglin <dave.anglin@bell.net>
* htl: move pthread_self info libc.Guy-Fleury Iteriteka2023-04-054-4/+4
| | | | | Signed-off-by: Guy-Fleury Iteriteka <gfleury@disroot.org> Message-Id: <20230318095826.1125734-4-gfleury@disroot.org>
* htl: move ___pthread_self into libc.Guy-Fleury Iteriteka2023-04-053-2/+25
| | | | | | | | | | | sysdeps/mach/hurd/htl/pt-pthread_self.c: New file. htl/Makefile: .. Add it to libc routine. sysdeps/mach/hurd/htl/pt-sysdep.c(__pthread_self): Remove it. sysdeps/mach/hurd/htl/pt-sysdep.h(__pthread_self): Add hidden propertie. htl/Versions(__pthread_self) Version it as private symbol. Signed-off-by: Guy-Fleury Iteriteka <gfleury@disroot.org> Message-Id: <20230318095826.1125734-3-gfleury@disroot.org>
* x86/dl-cacheinfo: remove unsused parameter from handle_amdAndreas Schwab2023-04-041-36/+30
| | | | Also replace an unreachable assert with __builtin_unreachable.
* powerpc: Disable stack protector in early static initializationAdhemerval Zanella2023-04-031-0/+3
| | | | | | | | Similar to fb95c316382679c0826cc8399760977cd95f15c9, also disable for string-ppc64.c (pulled on rltd as the default string implementation). Checked on powerpc64-linux-gnu.
* nptl: Fix tst-cancel30 on sparc64Adhemerval Zanella2023-04-031-3/+1
| | | | | | As indicated by sparc kernel-features.h, even though sparc64 defines __NR_pause, it is not supported (ENOSYS). Always use ppoll or the 64 bit time_t variant instead.
* math: Remove the error handling wrapper from fmod and fmodfAdhemerval Zanella Netto2023-04-0335-7/+155
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The error handling is moved to sysdeps/ieee754 version with no SVID support. The compatibility symbol versions still use the wrapper with SVID error handling around the new code. There is no new symbol version nor compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv). The ia64 is unchanged, since it still uses the arch specific __libm_error_region on its implementation. For both i686 and m68k, which provive arch specific implementation, wrappers are added so no new symbol are added (which would require to change the implementations). It shows an small improvement, the results for fmod: Architecture | Input | master | patch -----------------|-----------------|----------|-------- x86_64 (Ryzen 9) | subnormals | 12.5049 | 9.40992 x86_64 (Ryzen 9) | normal | 296.939 | 296.738 x86_64 (Ryzen 9) | close-exponents | 16.0244 | 13.119 aarch64 (N1) | subnormal | 6.81778 | 4.33313 aarch64 (N1) | normal | 155.620 | 152.915 aarch64 (N1) | close-exponents | 8.21306 | 5.76138 armhf (N1) | subnormal | 15.1083 | 14.5746 armhf (N1) | normal | 244.833 | 241.738 armhf (N1) | close-exponents | 21.8182 | 22.457 Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
* math: Improve fmodfAdhemerval Zanella Netto2023-04-032-93/+187
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This uses a new algorithm similar to already proposed earlier [1]. With x = mx * 2^ex and y = my * 2^ey (mx, my, ex, ey being integers), the simplest implementation is: mx * 2^ex == 2 * mx * 2^(ex - 1) while (ex > ey) { mx *= 2; --ex; mx %= my; } With mx/my being mantissa of double floating pointer, on each step the argument reduction can be improved 8 (which is sizeof of uint32_t minus MANTISSA_WIDTH plus the signal bit): while (ex > ey) { mx << 8; ex -= 8; mx %= my; } */ The implementation uses builtin clz and ctz, along with shifts to convert hx/hy back to doubles. Different than the original patch, this path assume modulo/divide operation is slow, so use multiplication with invert values. I see the following performance improvements using fmod benchtests (result only show the 'mean' result): Architecture | Input | master | patch -----------------|-----------------|----------|-------- x86_64 (Ryzen 9) | subnormals | 17.2549 | 12.0318 x86_64 (Ryzen 9) | normal | 85.4096 | 49.9641 x86_64 (Ryzen 9) | close-exponents | 19.1072 | 15.8224 aarch64 (N1) | subnormal | 10.2182 | 6.81778 aarch64 (N1) | normal | 60.0616 | 20.3667 aarch64 (N1) | close-exponents | 11.5256 | 8.39685 I also see similar improvements on arm-linux-gnueabihf when running on the N1 aarch64 chips, where it a lot of soft-fp implementation (for modulo, and multiplication): Architecture | Input | master | patch -----------------|-----------------|----------|-------- armhf (N1) | subnormal | 11.6662 | 10.8955 armhf (N1) | normal | 69.2759 | 34.1524 armhf (N1) | close-exponents | 13.6472 | 18.2131 Instead of using the math_private.h definitions, I used the math_config.h instead which is used on newer math implementations. Co-authored-by: kirill <kirill.okhotnikov@gmail.com> [1] https://sourceware.org/pipermail/libc-alpha/2020-November/119794.html Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
* math: Improve fmodAdhemerval Zanella Netto2023-04-032-96/+208
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This uses a new algorithm similar to already proposed earlier [1]. With x = mx * 2^ex and y = my * 2^ey (mx, my, ex, ey being integers), the simplest implementation is: mx * 2^ex == 2 * mx * 2^(ex - 1) while (ex > ey) { mx *= 2; --ex; mx %= my; } With mx/my being mantissa of double floating pointer, on each step the argument reduction can be improved 11 (which is sizeo of uint64_t minus MANTISSA_WIDTH plus the signal bit): while (ex > ey) { mx << 11; ex -= 11; mx %= my; } */ The implementation uses builtin clz and ctz, along with shifts to convert hx/hy back to doubles. Different than the original patch, this path assume modulo/divide operation is slow, so use multiplication with invert values. I see the following performance improvements using fmod benchtests (result only show the 'mean' result): Architecture | Input | master | patch -----------------|-----------------|----------|-------- x86_64 (Ryzen 9) | subnormals | 19.1584 | 12.5049 x86_64 (Ryzen 9) | normal | 1016.51 | 296.939 x86_64 (Ryzen 9) | close-exponents | 18.4428 | 16.0244 aarch64 (N1) | subnormal | 11.153 | 6.81778 aarch64 (N1) | normal | 528.649 | 155.62 aarch64 (N1) | close-exponents | 11.4517 | 8.21306 I also see similar improvements on arm-linux-gnueabihf when running on the N1 aarch64 chips, where it a lot of soft-fp implementation (for modulo, clz, ctz, and multiplication): Architecture | Input | master | patch -----------------|-----------------|----------|-------- armhf (N1) | subnormal | 15.908 | 15.1083 armhf (N1) | normal | 837.525 | 244.833 armhf (N1) | close-exponents | 16.2111 | 21.8182 Instead of using the math_private.h definitions, I used the math_config.h instead which is used on newer math implementations. Co-authored-by: kirill <kirill.okhotnikov@gmail.com> [1] https://sourceware.org/pipermail/libc-alpha/2020-November/119794.html Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
* x86: Set FSGSBASE to active if enabled by kernelH.J. Lu2023-04-034-0/+58
| | | | | | | | | | | | | Linux kernel uses AT_HWCAP2 to indicate if FSGSBASE instructions are enabled. If the HWCAP2_FSGSBASE bit in AT_HWCAP2 is set, FSGSBASE instructions can be used in user space. Define dl_check_hwcap2 to set the FSGSBASE feature to active on Linux when the HWCAP2_FSGSBASE bit is set. Add a test to verify that FSGSBASE is active on current kernels. NB: This test will fail if the kernel doesn't set the HWCAP2_FSGSBASE bit in AT_HWCAP2 while fsgsbase shows up in /proc/cpuinfo. Reviewed-by: Florian Weimer <fweimer@redhat.com>
* x86_64: Fix asm constraints in feraiseexcept (bug 30305)Florian Weimer2023-04-031-2/+2
| | | | | | | | | The divss instruction clobbers its first argument, and the constraints need to reflect that. Fortunately, with GCC 12, generated code does not actually change, so there is no externally visible bug. Suggested-by: Jakub Jelinek <jakub@redhat.com> Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* hurd: Add vm_param.h for x86_64Sergey Bugaev2023-04-031-0/+24
| | | | | Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-Id: <20230319151017.531737-30-bugaevc@gmail.com>
* hurd: Implement _hurd_longjmp_thread_state for x86_64Sergey Bugaev2023-04-031-0/+41
| | | | | Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-Id: <20230319151017.531737-29-bugaevc@gmail.com>
* htl: Implement thread_set_pcsptp for x86_64Sergey Bugaev2023-04-031-0/+73
| | | | | Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-Id: <20230319151017.531737-23-bugaevc@gmail.com>
* x86_64: Add rtld-stpncpy & rtld-strncpySergey Bugaev2023-04-032-0/+36
| | | | | | | | Just like the other existing rtld-str* files, this provides rtld with usable versions of stpncpy and strncpy. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-Id: <20230319151017.531737-22-bugaevc@gmail.com>
* htl: Add tcb-offsets.sym for x86_64Sergey Bugaev2023-04-032-0/+28
| | | | | | | | The source code is the same as sysdeps/i386/htl/tcb-offsets.sym, but of course the produced tcb-offsets.h will be different. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-Id: <20230319151017.531737-21-bugaevc@gmail.com>
* hurd: Move a couple of signal-related files to x86Sergey Bugaev2023-04-032-0/+0
| | | | | | | These do not need any changes to be used on x86_64. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-Id: <20230319151017.531737-20-bugaevc@gmail.com>
* hurd: Use uintptr_t for register values in trampoline.cSergey Bugaev2023-04-031-7/+6
| | | | | | | | | | | | This is more correct, if only because these fields are defined as having the type unsigned int in the Mach headers, so casting them to a signed int and then back is suboptimal. Also, remove an extra reassignment of uesp -- this is another remnant of the ecx kludge. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-Id: <20230319151017.531737-16-bugaevc@gmail.com>