about summary refs log tree commit diff
Commit message (Collapse)AuthorAgeFilesLines
* aarch64/fpu: Add vector variants of sinhJoe Ramsay2024-04-0416-0/+572
| | | | Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
* aarch64/fpu: Add vector variants of atanhJoe Ramsay2024-04-0414-0/+288
| | | | Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
* aarch64/fpu: Add vector variants of asinhJoe Ramsay2024-04-0414-0/+489
| | | | Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
* aarch64/fpu: Add vector variants of acoshJoe Ramsay2024-04-0419-0/+653
| | | | Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
* aarch64/fpu: Add vector variants of coshJoe Ramsay2024-04-0418-1/+648
| | | | Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
* aarch64/fpu: Add vector variants of erfJoe Ramsay2024-04-0419-1/+4531
| | | | Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
* misc: Add support for Linux uio.h RWF_NOAPPEND flagStafford Horne2024-04-043-1/+9
| | | | | | | | | | | | | | | | | | | In Linux 6.9 a new flag is added to allow for Per-io operations to disable append mode even if a file was opened with the flag O_APPEND. This is done with the new RWF_NOAPPEND flag. This caused two test failures as these tests expected the flag 0x00000020 to be unused. Adding the flag definition now fixes these tests on Linux 6.9 (v6.9-rc1). FAIL: misc/tst-preadvwritev2 FAIL: misc/tst-preadvwritev64v2 This patch adds the flag, adjusts the test and adds details to documentation. Link: https://lore.kernel.org/all/20200831153207.GO3265@brightrain.aerifal.cx/ Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* manual: significand() uses FLT_RADIX, not 2Alejandro Colomar2024-04-031-1/+1
| | | | | | | | | | | | | | It's implemented using scalb(), which uses FLT_RADIX, AFAIK. Link: <https://lore.kernel.org/linux-man/ZeYKUOKYS7G90SaV@debian/T/#mf21ab57e16b92eb6be6c7df79dc0eb43d4454056> Reported-by: Morten Welinder <mwelinder@gmail.com> Cc: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org> Cc: Vincent Lefevre <vincent@vinc17.net> Cc: DJ Delorie <dj@redhat.com> Cc: Paul Zimmermann <Paul.Zimmermann@inria.fr> Cc: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Alejandro Colomar <alx@kernel.org> Reviewed-by: DJ Delorie <dj@redhat.com>
* manual: Clarify return value of cbrt(3)Alejandro Colomar2024-04-031-2/+6
| | | | | | | | | | | | | Link: <https://lore.kernel.org/linux-man/ZeYKUOKYS7G90SaV@debian/T/#mff0ab388000c6afdb5e5162804d4a0073de481de> Reported-by: Morten Welinder <mwelinder@gmail.com> Cowritten-by: Morten Welinder <mwelinder@gmail.com> Cc: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org> Cc: Vincent Lefevre <vincent@vinc17.net> Cc: DJ Delorie <dj@redhat.com> Cc: Paul Zimmermann <Paul.Zimmermann@inria.fr> Cc: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Alejandro Colomar <alx@kernel.org> Reviewed-by: DJ Delorie <dj@redhat.com>
* manual: floor(log2(fabs(x))) has rounding errorsAlejandro Colomar2024-04-031-2/+5
| | | | | | | | | | | | Link: <https://inbox.sourceware.org/libc-alpha/20240305150131.GD3653@qaa.vinc17.org/T/#m3ceecda630012995339bcc5448fee451cf277a8b> Reported-by: Vincent Lefevre <vincent@vinc17.net> Suggested-by: Vincent Lefevre <vincent@vinc17.net> Reviewed-by: DJ Delorie <dj@redhat.com> Cc: Morten Welinder <mwelinder@gmail.com> Cc: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org> Cc: Paul Zimmermann <Paul.Zimmermann@inria.fr> Cc: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Alejandro Colomar <alx@kernel.org>
* manual: logb(x) is floor(log2(fabs(x)))Alejandro Colomar2024-04-031-1/+1
| | | | | | | | | | | | | | log2(3) doesn't accept negative input, but it seems logb(3) does accept it. Link: <https://lore.kernel.org/linux-man/ZeYKUOKYS7G90SaV@debian/T/#u> Reported-by: Morten Welinder <mwelinder@gmail.com> Reviewed-by: DJ Delorie <dj@redhat.com> Cc: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org> Cc: Vincent Lefevre <vincent@vinc17.net> Cc: Paul Zimmermann <Paul.Zimmermann@inria.fr> Cc: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Alejandro Colomar <alx@kernel.org>
* powerpc: Add missing arch flags on rounding ifunc variantsAdhemerval Zanella2024-04-021-0/+6
| | | | | | | | | | The ifunc variants now uses the powerpc implementation which in turn uses the compiler builtin. Without the proper -mcpu switch the builtin does not generate the expected optimization. Checked on powerpc-linux-gnu. Reviewed-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
* math: Reformat Makefile.Adhemerval Zanella2024-04-021-159/+685
| | | | | | | | | Reflow all long lines adding comment terminators. Sort all reflowed text using scripts/sort-makefile-lines.py. No code generation changes observed in binary artifacts. No regressions on x86_64 and i686. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* Always define __USE_TIME_BITS64 when 64 bit time_t is usedAdhemerval Zanella2024-04-0275-182/+178
| | | | | | | | | | | | | | | | | | | | It was raised on libc-help [1] that some Linux kernel interfaces expect the libc to define __USE_TIME_BITS64 to indicate the time_t size for the kABI. Different than defined by the initial y2038 design document [2], the __USE_TIME_BITS64 is only defined for ABIs that support more than one time_t size (by defining the _TIME_BITS for each module). The 64 bit time_t redirects are now enabled using a different internal define (__USE_TIME64_REDIRECTS). There is no expected change in semantic or code generation. Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu, and arm-linux-gnueabi [1] https://sourceware.org/pipermail/libc-help/2024-January/006557.html [2] https://sourceware.org/glibc/wiki/Y2038ProofnessDesign Reviewed-by: DJ Delorie <dj@redhat.com>
* benchtests: Improve benchtests for strstrAdhemerval Zanella2024-04-011-76/+273
| | | | | | | Use same strategy as bench-strstr.c (93eebae5168e5cf2 and 80b2bfb53504) and use json_ctx for output to help standardize format across all benchtests. Reviewed-by: Arjun Shankar <arjun@redhat.com>
* x86_64: Remove avx512 strstr implementationAdhemerval Zanella2024-03-274-248/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As indicated in a recent thread, this it is a simple brute-force algorithm that checks the whole needle at a matching character pair (and does so 1 byte at a time after the first 64 bytes of a needle). Also it never skips ahead and thus can match at every haystack position after trying to match all of the needle, which generic implementation avoids. As indicated by Wilco, a 4x larger needle and 16x larger haystack gives a clear 65x slowdown both basic_strstr and __strstr_avx512: "ifuncs": ["basic_strstr", "twoway_strstr", "__strstr_avx512", "__strstr_sse2_unaligned", "__strstr_generic"], { "len_haystack": 65536, "len_needle": 1024, "align_haystack": 0, "align_needle": 0, "fail": 1, "desc": "Difficult bruteforce needle", "timings": [4.0948e+07, 15094.5, 3.20818e+07, 108558, 10839.2] }, { "len_haystack": 1048576, "len_needle": 4096, "align_haystack": 0, "align_needle": 0, "fail": 1, "desc": "Difficult bruteforce needle", "timings": [2.69767e+09, 100797, 2.08535e+09, 495706, 82666.9] } PS: I don't have an AVX512 capable machine to verify this issues, but skimming through the code it does seems to follow what Wilco has described. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* signal: Avoid system signal disposition to interfere with testsAdhemerval Zanella2024-03-272-0/+7
| | | | | Both tst-sigset2 and tst-signal1 expectes that SIGINT disposition is set to SIG_DFL.
* RISC-V: Fix the static-PIE non-relocated object checkPalmer Dabbelt2024-03-251-1/+1
| | | | | | | | | | | | The value of l_scope is only valid post relocation, so this original check was triggering undefined behavior. Instead just directly check to see if the object has been relocated, at which point using l_scope is safe. Reported-by: Andreas Schwab <schwab@suse.de> Closes: BZ #31317 Fixes: e0590f41fe ("RISC-V: Enable static-pie.") Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
* htl: Implement some support for TLS_DTV_AT_TPSergey Bugaev2024-03-233-2/+25
| | | | | Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-ID: <20240323173301.151066-19-bugaevc@gmail.com>
* htl: Respect GL(dl_stack_flags) when allocating stacksSergey Bugaev2024-03-232-2/+11
| | | | | | | | | | | | Previously, HTL would always allocate non-executable stacks. This has never been noticed, since GNU Mach on x86 ignores VM_PROT_EXECUTE and makes all pages implicitly executable. Since GNU Mach on AArch64 supports non-executable pages, HTL forgetting to pass VM_PROT_EXECUTE immediately breaks any code that (unfortunately, still) relies on executable stacks. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-ID: <20240323173301.151066-7-bugaevc@gmail.com>
* hurd: Use the RETURN_ADDRESS macroSergey Bugaev2024-03-231-1/+1
| | | | | | | This gives us PAC stripping on AArch64. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-ID: <20240323173301.151066-6-bugaevc@gmail.com>
* hurd: Disable Prefer_MAP_32BIT_EXEC on non-x86_64 for nowSergey Bugaev2024-03-232-2/+2
| | | | | | | | While we could support it on any architecture, the tunable is currently only defined on x86_64. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-ID: <20240323173301.151066-5-bugaevc@gmail.com>
* Allow glibc to be compiled without EXEC_PAGESIZESergey Bugaev2024-03-233-2/+8
| | | | | | | | | | | | | | | | | We would like to avoid statically defining any specific page size on aarch64-gnu, and instead make sure that everything uses the dynamic page size, available via vm_page_size and GLRO(dl_pagesize). There are currently a few places in glibc that require EXEC_PAGESIZE to be defined. Per Roland's suggestion [0], drop the static GLRO(dl_pagesize) initializers (for now, only if EXEC_PAGESIZE is not defined), and don't require EXEC_PAGESIZE definition for libio to enable mmap usage. [0]: https://mail.gnu.org/archive/html/bug-hurd/2011-10/msg00035.html Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-ID: <20240323173301.151066-4-bugaevc@gmail.com>
* hurd: Stop relying on VM_MAX_ADDRESSSergey Bugaev2024-03-231-4/+4
| | | | | | | | | | | | | | | | | | We'd like to avoid committing to a specific size of virtual address space (i.e. the value of VM_AARCH64_T0SZ) on AArch64. While the current version of GNU Mach still exports VM_MAX_ADDRESS for compatibility, we should try to avoid relying on it when we can. This piece of logic in _hurdsig_getenv () doesn't actually care about the size of user- accessible virtual address space, it just wants to preempt faults on any addresses starting from the value of the P pointer and above. So, use (unsigned long int) -1 instead of VM_MAX_ADDRESS. While at it, change the casts to (unsigned long int) and not just (long int), since the type of struct hurd_signal_preemptor.{first,last} is unsigned long int. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-ID: <20240323173301.151066-3-bugaevc@gmail.com>
* hurd: Move internal functions to internal headerSergey Bugaev2024-03-232-87/+78
| | | | | | | | | | | Move _hurd_self_sigstate (), _hurd_critical_section_lock (), and _hurd_critical_section_unlock () inline implementations (that were already guarded by #if defined _LIBC) to the internal version of the header. While at it, add <tls.h> to the includes, and use __LIBC_NO_TLS () unconditionally. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-ID: <20240323173301.151066-2-bugaevc@gmail.com>
* stdlib: Fix tst-makecontext2 log when swapcontext failsStafford Horne2024-03-231-1/+1
| | | | | | | | The log incorrectly prints, setcontext failed. Update this to indicate that actually swapcontext failed. Signed-off-by: Stafford Horne <shorne@gmail.com> Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* or1k: Add prctl wrapper to unwrap variadic argsStafford Horne2024-03-221-0/+42
| | | | | | | | | | | | | On OpenRISC variadic functions and regular functions have different calling conventions so this wrapper is needed to translate. This wrapper is copied from x86_64/x32. I don't know the build system enough to find a cleaner way to share the code between x86_64/x32 and or1k (maybe Implies?), so I went with the straight copy. This fixes test failures: misc/tst-prctl nptl/tst-setgetname
* or1k: Only define fpu rouding and exceptions with hard-floatStafford Horne2024-03-221-0/+19
| | | | | | | | | | | | | | This test failure: math/test-fenv If rounding mode and exception macros are defined then the fenv tests run and always fail. This patch adds an ifdef using the __or1k_hard_float__ macro provided by gcc to avoid defining these fenv macros when they cnnot be used. This is similar to what is done in csky. Note, I will post the or1k hard-float support soon. So, I prefer to leave the hard-float bits here for now.
* or1k: Update libm test ulpsStafford Horne2024-03-221-0/+1
| | | | | | | To fix test failures: FAIL: math/test-float-hypot FAIL: math/test-float32-hypot
* AArch64: Check kernel version for SVE ifuncsWilco Dijkstra2024-03-215-2/+53
| | | | | | | | | | | | | | | | Old Linux kernels disable SVE after every system call. Calling the SVE-optimized memcpy afterwards will then cause a trap to reenable SVE. As a result, applications with a high use of syscalls may run slower with the SVE memcpy. This is true for kernels between 4.15.0 and before 6.2.0, except for 5.14.0 which was patched. Avoid this by checking the kernel version and selecting the SVE ifunc on modern kernels. Parse the kernel version reported by uname() into a 24-bit kernel.major.minor value without calling any library functions. If uname() is not supported or if the version format is not recognized, assume the kernel is modern. Tested-by: Florian Weimer <fweimer@redhat.com> Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
* powerpc: Placeholder and infrastructure/build support to add Power11 ↵Amrita H S2024-03-1915-5/+27
| | | | | | | | | | | | related changes. The following three changes have been added to provide initial Power11 support. 1. Add the directories to hold Power11 files. 2. Add support to select Power11 libraries based on AT_PLATFORM. 3. Let submachine=power11 be set automatically. Reviewed-by: Florian Weimer <fweimer@redhat.com> Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
* powerpc: Add HWCAP3/HWCAP4 data to TCB for Power Architecture.Manjunath Matti2024-03-1912-19/+74
| | | | | | | | | | | | | This patch adds a new feature for powerpc. In order to get faster access to the HWCAP3/HWCAP4 masks, similar to HWCAP/HWCAP2 (i.e. for implementing __builtin_cpu_supports() in GCC) without the overhead of reading them from the auxiliary vector, we now reserve space for them in the TCB. This is an ABI change for GLIBC 2.39. Suggested-by: Peter Bergner <bergner@linux.ibm.com> Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
* elf: Enable TLS descriptor tests on aarch64Adhemerval Zanella2024-03-195-33/+40
| | | | | | | | | | | | The aarch64 uses 'trad' for traditional tls and 'desc' for tls descriptors, but unlike other targets it defaults to 'desc'. The gnutls2 configure check does not set aarch64 as an ABI that uses TLS descriptors, which then disable somes stests. Also rename the internal machinery fron gnu2 to tls descriptors. Checked on aarch64-linux-gnu. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* arm: Update _dl_tlsdesc_dynamic to preserve caller-saved registers (BZ 31372)Adhemerval Zanella2024-03-1910-15/+250
| | | | | | | | | | | | | | | | | | | | | | ARM _dl_tlsdesc_dynamic slow path has two issues: * The ip/r12 is defined by AAPCS as a scratch register, and gcc is used to save the stack pointer before on some function calls. So it should also be saved/restored as well. It fixes the tst-gnu2-tls2. * None of the possible VFP registers are saved/restored. ARM has the additional complexity to have different VFP bank sizes (depending of VFP support by the chip). The tst-gnu2-tls2 test is extended to check for VFP registers, although only for hardfp builds. Different than setcontext, _dl_tlsdesc_dynamic does not have HWCAP_ARM_IWMMXT (I don't have a way to properly test it and it is almost a decade since newer hardware was released). With this patch there is no need to mark tst-gnu2-tls2 as XFAIL. Checked on arm-linux-gnueabihf. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* Ignore undefined symbols for -mtls-dialect=gnu2Adhemerval Zanella2024-03-192-2/+2
| | | | | | So it does not fail for arm config that defaults to -mtp=soft (which issues a call to __aeabi_read_tp). Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* Add tst-gnu2-tls2mod1 to test-internal-extrasAndreas Schwab2024-03-191-0/+2
| | | | | | That allows sysdeps/x86_64/tst-gnu2-tls2mod1.S to use internal headers. Fixes: 717ebfa85c ("x86-64: Allocate state buffer space for RDI, RSI and RBX")
* x86-64: Allocate state buffer space for RDI, RSI and RBXH.J. Lu2024-03-183-11/+147
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | _dl_tlsdesc_dynamic preserves RDI, RSI and RBX before realigning stack. After realigning stack, it saves RCX, RDX, R8, R9, R10 and R11. Define TLSDESC_CALL_REGISTER_SAVE_AREA to allocate space for RDI, RSI and RBX to avoid clobbering saved RDI, RSI and RBX values on stack by xsave to STATE_SAVE_OFFSET(%rsp). +==================+<- stack frame start aligned at 8 or 16 bytes | |<- RDI saved in the red zone | |<- RSI saved in the red zone | |<- RBX saved in the red zone | |<- paddings for stack realignment of 64 bytes |------------------|<- xsave buffer end aligned at 64 bytes | |<- | |<- | |<- |------------------|<- xsave buffer start at STATE_SAVE_OFFSET(%rsp) | |<- 8-byte padding for 64-byte alignment | |<- 8-byte padding for 64-byte alignment | |<- R11 | |<- R10 | |<- R9 | |<- R8 | |<- RDX | |<- RCX +==================+<- RSP aligned at 64 bytes Define TLSDESC_CALL_REGISTER_SAVE_AREA, the total register save area size for all integer registers by adding 24 to STATE_SAVE_OFFSET since RDI, RSI and RBX are saved onto stack without adjusting stack pointer first, using the red-zone. This fixes BZ #31501. Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
* riscv: Update nofpu libm test ulpsDarius Rad2024-03-181-0/+1
| | | | | | Fix two test failures. Reviewed-by: Florian Weimer <fweimer@redhat.com>
* Add STATX_MNT_ID_UNIQUE from Linux 6.8 to bits/statx-generic.hJoseph Myers2024-03-151-0/+1
| | | | | | | Linux 6.8 adds a new STATX_MNT_ID_UNIQUE constant. Add it to glibc's bits/statx-generic.h. Tested for x86_64.
* linux: Use rseq area unconditionally in sched_getcpu (bug 31479)Florian Weimer2024-03-151-8/+0
| | | | | | | | | | | | | | | | | | | Originally, nptl/descr.h included <sys/rseq.h>, but we removed that in commit 2c6b4b272e6b4d07303af25709051c3e96288f2d ("nptl: Unconditionally use a 32-byte rseq area"). After that, it was not ensured that the RSEQ_SIG macro was defined during sched_getcpu.c compilation that provided a definition. This commit always checks the rseq area for CPU number information before using the other approaches. This adds an unnecessary (but well-predictable) branch on architectures which do not define RSEQ_SIG, but its cost is small compared to the system call. Most architectures that have vDSO acceleration for getcpu also have rseq support. Fixes: 2c6b4b272e6b4d07303af25709051c3e96288f2d Fixes: 1d350aa06091211863e41169729cee1bca39f72f Reviewed-by: Arjun Shankar <arjun@redhat.com>
* aarch64: fix check for SVE support in assemblerSzabolcs Nagy2024-03-142-4/+6
| | | | | | | | | | | | | Due to GCC bug 110901 -mcpu can override -march setting when compiling asm code and thus a compiler targetting a specific cpu can fail the configure check even when binutils gas supports SVE. The workaround is that explicit .arch directive overrides both -mcpu and -march, and since that's what the actual SVE memcpy uses the configure check should use that too even if the GCC issue is fixed independently. Reviewed-by: Florian Weimer <fweimer@redhat.com>
* Update kernel version to 6.8 in header constant testsJoseph Myers2024-03-133-4/+4
| | | | | | | | | This patch updates the kernel version in the tests tst-mman-consts.py, tst-mount-consts.py and tst-pidfd-consts.py to 6.8. (There are no new constants covered by these tests in 6.8 that need any other header changes.) Tested with build-many-glibcs.py.
* Update syscall lists for Linux 6.8Joseph Myers2024-03-1327-2/+137
| | | | | | | | Linux 6.8 adds five new syscalls. Update syscall-names.list and regenerate the arch-syscall.h headers with build-many-glibcs.py update-syscalls. Tested with build-many-glibcs.py.
* Use Linux 6.8 in build-many-glibcs.pyJoseph Myers2024-03-131-1/+1
| | | | | | | This patch makes build-many-glibcs.py use Linux 6.8. Tested with build-many-glibcs.py (host-libraries, compilers and glibcs builds).
* powerpc: Remove power8 strcasestr optimizationAdhemerval Zanella2024-03-128-675/+1
| | | | | | | | | | | | | | | | | | Similar to strstr (1e9a550ba4), power8 strcasestr does not show much improvement compared to the generic implementation. The geomean on bench-strcasestr shows: __strcasestr_power8 __strcasestr_ppc power10 1159 1120 power9 1640 1469 power8 1787 1904 The strcasestr uses the same 'trick' as power7 strstr to detect potential quadradic behavior, which only adds overheads for input that trigger quadradic behavior and it is really a hack. Checked on powerpc64le-linux-gnu. Reviewed-by: DJ Delorie <dj@redhat.com>
* riscv: Fix alignment-ignorant memcpy implementationAdhemerval Zanella2024-03-129-167/+215
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The memcpy optimization (commit 587a1290a1af7bee6db) has a series of mistakes: - The implementation is wrong: the chunk size calculation is wrong leading to invalid memory access. - It adds ifunc supports as default, so --disable-multi-arch does not work as expected for riscv. - It mixes Linux files (memcpy ifunc selection which requires the vDSO/syscall mechanism) with generic support (the memcpy optimization itself). - There is no __libc_ifunc_impl_list, which makes testing only check the selected implementation instead of all supported by the system. This patch also simplifies the required bits to enable ifunc: there is no need to memcopy.h; nor to add Linux-specific files. The __memcpy_noalignment tail handling now uses a branchless strategy similar to aarch64 (overlap 32-bits copies for sizes 4..7 and byte copies for size 1..3). Checked on riscv64 and riscv32 by explicitly enabling the function on __libc_ifunc_impl_list on qemu-system. Changes from v1: * Implement the memcpy in assembly to correctly handle RISCV strict-alignment. Reviewed-by: Evan Green <evan@rivosinc.com> Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
* linux/sigsetops: fix type confusion (bug 31468)Andreas Schwab2024-03-123-9/+20
| | | | | | Each mask in the sigset array is an unsigned long, so fix __sigisemptyset to use that instead of int. The __sigword function returns a simple array index, so it can return int instead of unsigned long.
* LoongArch: Correct {__ieee754, _}_scalb -> {__ieee754, _}_scalbfcaiyinyu2024-03-121-1/+1
|
* duplocale: protect use of global locale (bug 23970)Andreas Schwab2024-03-111-6/+8
| | | | | | | Protect the global locale from being modified while we compute the size of the locale category names. That allows the use of the global locale in a single thread, while all other threads use the thread safe locale functions.
* x86-64: Simplify minimum ISA check ifdef conditional with ifSunil K Pandey2024-03-031-11/+8
| | | | | | | | | Replace minimum ISA check ifdef conditional with if. Since MINIMUM_X86_ISA_LEVEL and AVX_X86_ISA_LEVEL are compile time constants, compiler will perform constant folding optimization, getting same results. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>