path: root/sysdeps/loongarch
Commit message | Author | Age | Files | Lines
* LoongArch: Ensure sp 16-byte aligned for tlsdesc (HEAD, master) | Xi Ruoyao | 13 hours | 2 | -7/+4
    "ADDI sp, sp, 24" and "ADDI sp, sp, SZFCSREG" (SZFCSREG = 4) misalign the stack: the ABI mandates 16-byte alignment. Fix it by changing the first one to "ADDI sp, sp, 32", and reuse the spare 4th slot for saving fcsr.
    Reported-by: Jinyang He <hejinyang@loongson.cn>
    Signed-off-by: Xi Ruoyao <xry111@xry111.site>
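The alignment arithmetic behind this fix can be checked in a few lines of C. This is only a sketch: `keeps_sp_aligned` and `align_frame` are hypothetical helper names, not glibc code, and the only fact taken from the commit message is the 16-byte ABI requirement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The commit message states that the ABI mandates 16-byte stack alignment. */
#define STACK_ALIGN 16

/* Hypothetical helper: does adjusting sp by `bytes` preserve alignment? */
static bool keeps_sp_aligned(size_t bytes)
{
    return bytes % STACK_ALIGN == 0;
}

/* Hypothetical helper: round a frame size up to a multiple of STACK_ALIGN. */
static size_t align_frame(size_t bytes)
{
    assert(bytes <= SIZE_MAX - (STACK_ALIGN - 1)); /* guard against overflow */
    return (bytes + STACK_ALIGN - 1) & ~(size_t)(STACK_ALIGN - 1);
}
```

With these helpers, adjustments of 24 and 4 bytes both fail the check, while 32 passes. Rounding 24 up yields 32, which is where the spare fourth 8-byte slot comes from.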
* LoongArch: Use "$fcsr0" instead of "$r0" in _FPU_{GET,SET}CW | Xi Ruoyao | 2024-05-28 | 1 | -2/+2
    The Clang inline-asm parser does not allow "$r0" in movfcsr2gr/movgr2fcsr, so everything using _FPU_{GET,SET}CW now fails to build with Clang on LoongArch. As we now require Binutils >= 2.41, which supports "$fcsr0" here, use it instead of "$r0" to fix the issue.
    Link: https://github.com/loongson-community/discussions/issues/53#issuecomment-2081507390
    Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=4142b2368353
    Signed-off-by: Xi Ruoyao <xry111@xry111.site>
* loongarch: Remove duplicate strnlen in libc.a (BZ 31785) | Adhemerval Zanella | 2024-05-23 | 1 | -0/+2
    The generic version provides weak definitions of strnlen, which are already provided by the ifunc resolver.
    Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* LoongArch: Update ulps | caiyinyu | 2024-05-21 | 1 | -0/+20
    For the log2p1 implementation.
* LoongArch: Fix tst-gnu2-tls2 compiler error | mengqinggang | 2024-05-21 | 3 | -2/+8
    Add -mno-lsx to tst-gnu2-tlsmod*.c if GCC supports -mno-lsx.
    Add the escape character '\' in the vector support test function.
* LoongArch: Add support for TLS Descriptors | mengqinggang | 2024-05-15 | 14 | -8/+1071
    This is mostly based on the AArch64 and RISC-V implementations.
    Add R_LARCH_TLS_DESC32 and R_LARCH_TLS_DESC64 relocations.
    For the slow path of the _dl_tlsdesc_dynamic function, temporarily save and restore all vector registers.
* LoongArch: Add glibc.cpu.hwcap support. | caiyinyu | 2024-04-24 | 7 | -5/+312
    The current IFUNC selection always uses the most recent features available via AT_HWCAP. In some scenarios it is useful to adjust this selection.
    The environment variable
        GLIBC_TUNABLES=glibc.cpu.hwcaps=-xxx,yyy,zzz,....
    can be used to enable HWCAP feature yyy and disable HWCAP feature xxx, where the feature name is case-sensitive and has to match the ones used in sysdeps/loongarch/cpu-tunables.c.
    Signed-off-by: caiyinyu <caiyinyu@loongson.cn>
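The enable/disable syntax described above can be illustrated with a small parser. This is a sketch under stated assumptions, not the code in cpu-tunables.c: the feature names (UAL, LSX, LASX) are borrowed from the earlier 2023 commit message, the bit values and the names `hwcap_lookup` and `apply_hwcaps` are hypothetical, and unknown names are simply ignored here, whereas a real implementation would also validate requests against the hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical feature bits, for illustration only. */
enum { F_UAL = 1u << 0, F_LSX = 1u << 1, F_LASX = 1u << 2 };

struct feature { const char *name; unsigned bit; };

static const struct feature features[] = {
    { "UAL", F_UAL }, { "LSX", F_LSX }, { "LASX", F_LASX },
};

/* Case-sensitive lookup of a name of length `len`; returns 0 if unknown. */
static unsigned hwcap_lookup(const char *name, size_t len)
{
    for (size_t i = 0; i < sizeof features / sizeof features[0]; i++)
        if (strlen(features[i].name) == len
            && memcmp(features[i].name, name, len) == 0)
            return features[i].bit;
    return 0;
}

/* Apply a "-xxx,yyy" style list to `mask`: a leading '-' clears the
   feature bit, a bare name sets it. */
static unsigned apply_hwcaps(const char *spec, unsigned mask)
{
    assert(spec != NULL);
    while (*spec != '\0') {
        size_t len = strcspn(spec, ",");          /* length of this token */
        size_t skip = (len > 0 && spec[0] == '-') ? 1 : 0;
        unsigned bit = hwcap_lookup(spec + skip, len - skip);
        if (skip)
            mask &= ~bit;
        else
            mask |= bit;
        spec += len;
        if (*spec == ',')
            spec++;
    }
    return mask;
}
```

For example, applying "-LASX,UAL" to a mask with LSX and LASX set leaves LSX and UAL set. A lowercase "lasx" matches nothing because the lookup is case-sensitive.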
* LoongArch: Correct {__ieee754, _}_scalb -> {__ieee754, _}_scalbf | caiyinyu | 2024-03-12 | 1 | -1/+1
* Apply the Makefile sorting fix | H.J. Lu | 2024-02-15 | 1 | -40/+40
    Apply the Makefile sorting fix generated by sort-makefile-lines.py.
* LoongArch: Use builtins for ffs and ffsll | Xi Ruoyao | 2024-02-05 | 1 | -0/+2
    On LoongArch, GCC compiles __builtin_ffs{,ll} to basically `(x ? __builtin_ctz (x) : -1) + 1`. Since a hardware ctz instruction is available, this is much better than the table-driven generic implementation.
    Tested on loongarch64.
    Signed-off-by: Xi Ruoyao <xry111@xry111.site>
    Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
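The expansion quoted in this commit message can be written out directly. The sketch below assumes a compiler that provides the GCC-style `__builtin_ctz` and `__builtin_ctzll` builtins (GCC or Clang); `sketch_ffs` and `sketch_ffsll` are hypothetical names used to avoid clashing with the libc functions.

```c
#include <assert.h>
#include <limits.h>

/* ffs returns the 1-based index of the least significant set bit, or 0 when
   no bit is set. __builtin_ctz is undefined for 0, hence the conditional. */
static int sketch_ffs(int x)
{
    return (x ? __builtin_ctz((unsigned int)x) : -1) + 1;
}

/* Same idea for long long, using the 64-bit count-trailing-zeros builtin. */
static int sketch_ffsll(long long x)
{
    return (x ? __builtin_ctzll((unsigned long long)x) : -1) + 1;
}
```

The zero case maps to -1 + 1 = 0, which matches the documented return value of ffs for an input with no bits set.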
* Refer to C23 in place of C2X in glibc | Joseph Myers | 2024-02-01 | 1 | -1/+1
    WG14 decided to use the name C23 as the informal name of the next revision of the C standard (notwithstanding the publication date in 2024). Update references to C2X in glibc to use the C23 name.
    This is intended to update everything *except* where it involves renaming files (the changes involving renaming tests are intended to be done separately). In the case of the _ISOC2X_SOURCE feature test macro - the only user-visible interface involved - support for that macro is kept for backwards compatibility, while adding _ISOC23_SOURCE.
    Tested for x86_64.
* Update copyright dates with scripts/update-copyrights | Paul Eggert | 2024-01-01 | 171 | -171/+171
* elf: Remove LD_PROFILE for static binaries | Adhemerval Zanella | 2023-11-21 | 2 | -2/+6
    _dl_non_dynamic_init does not parse LD_PROFILE, so profiling is not enabled for dlopen objects. Since dlopen is deprecated for static objects, it is better to remove the support. This also allows trimming profile support from libc.a.
    Checked on x86_64-linux-gnu.
    Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* LoongArch: Delete excessively allocated memory. | caiyinyu | 2023-10-26 | 1 | -34/+34
* LoongArch: Unify Register Names. | caiyinyu | 2023-10-26 | 2 | -19/+19
* Revert "LoongArch: Add glibc.cpu.hwcap support." | caiyinyu | 2023-09-21 | 6 | -171/+4
    This reverts commit a53451559dc9cce765ea5bcbb92c4007e058e92b.
* LoongArch: Add glibc.cpu.hwcap support. | caiyinyu | 2023-09-19 | 6 | -4/+171
    Key Points:
    1. On lasx & lsx platforms, we must use _dl_runtime_{profile, resolve}_{lsx, lasx} to save vector registers.
    2. Via "tunables", users can choose str/mem_{lasx,lsx,unaligned} functions with `export GLIBC_TUNABLES=glibc.cpu.hwcaps=LASX,...`.
    Note: glibc.cpu.hwcaps doesn't affect _dl_runtime_{profile, resolve}_{lsx, lasx} selection.
    Usage Notes:
    1. Only valid inputs: LASX, LSX, UAL. Case-sensitive, comma-separated, no spaces.
    2. Example: `export GLIBC_TUNABLES=glibc.cpu.hwcaps=LASX,UAL` turns on LASX & UAL. Unmentioned features turn off. With default ifunc: lasx > lsx > unaligned > aligned > generic, effect is: lasx > unaligned > aligned > generic; lsx off.
    3. Incorrect GLIBC_TUNABLES settings will show error messages. For example: on lsx platforms, you cannot enable lasx features. If you do that, you will get error messages.
    4. Valid input examples:
       - GLIBC_TUNABLES=glibc.cpu.hwcaps=LASX: lasx > aligned > generic.
       - GLIBC_TUNABLES=glibc.cpu.hwcaps=LSX,UAL: lsx > unaligned > aligned > generic.
       - GLIBC_TUNABLES=glibc.cpu.hwcaps=LASX,UAL,LASX,UAL,LSX,LASX,UAL: Repetitions allowed but not recommended. Results in: lasx > lsx > unaligned > aligned > generic.
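The "unmentioned features turn off" rule and the priority order in the usage notes above can be sketched as follows. This is an illustration only, not the reverted glibc code: the bit values and the names `parse_enabled` and `select_variant` are hypothetical, unrecognized tokens are silently skipped instead of producing an error message, and the sketch assumes every variant exists for the function being resolved.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical feature bits, for illustration only. */
enum { HW_UAL = 1u << 0, HW_LSX = 1u << 1, HW_LASX = 1u << 2 };

/* Build the enabled set from a comma-separated list. The mask starts
   empty, so any feature that is not mentioned stays off. */
static unsigned parse_enabled(const char *spec)
{
    unsigned mask = 0;
    assert(spec != NULL);
    while (*spec != '\0') {
        size_t len = strcspn(spec, ",");
        if (len == 4 && memcmp(spec, "LASX", 4) == 0)
            mask |= HW_LASX;
        else if (len == 3 && memcmp(spec, "LSX", 3) == 0)
            mask |= HW_LSX;
        else if (len == 3 && memcmp(spec, "UAL", 3) == 0)
            mask |= HW_UAL;
        spec += len;
        if (*spec == ',')
            spec++;
    }
    return mask;
}

/* Pick the highest-priority variant: lasx > lsx > unaligned > aligned. */
static const char *select_variant(unsigned mask)
{
    if (mask & HW_LASX)
        return "lasx";
    if (mask & HW_LSX)
        return "lsx";
    if (mask & HW_UAL)
        return "unaligned";
    return "aligned";
}
```

With "LSX,UAL" the sketch selects the lsx variant, and with "UAL" alone it selects the unaligned one. Repeated names are harmless because setting a bit twice has no further effect.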
* LoongArch: Change to put magic number to .rodata section | dengjianbo | 2023-09-15 | 1 | -10/+10
    Put the magic number in the .rodata section in memmove-lsx, and use pcalau12i and %pc_lo12 with vld to load the data.
* LoongArch: Add ifunc support for strrchr{aligned, lsx, lasx} | dengjianbo | 2023-09-15 | 7 | -0/+578
    According to glibc strrchr microbenchmark test results, this implementation could reduce the runtime as follows:
        Name               Percent of runtime reduced
        strrchr-lasx       10%-50%
        strrchr-lsx        0%-50%
        strrchr-aligned    5%-50%
    Generic strrchr is implemented as strlen + memrchr. The lasx version is compared with generic strrchr implemented by strlen-lasx + memrchr-lasx, the lsx version with generic strrchr implemented by strlen-lsx + memrchr-lsx, and the aligned version with generic strrchr implemented by strlen-aligned + memrchr-generic.
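The "strlen + memrchr" composition used as the baseline above can be sketched in portable C. The names `sketch_memrchr` and `sketch_strrchr` are hypothetical. memrchr is a GNU extension, so a simple byte loop stands in for it here; this is not glibc's generic implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for memrchr: scan the n bytes at s backwards for byte c. */
static void *sketch_memrchr(const void *s, int c, size_t n)
{
    const unsigned char *p = (const unsigned char *)s + n;
    while (n-- > 0)
        if (*--p == (unsigned char)c)
            return (void *)p;
    return NULL;
}

/* strrchr expressed as strlen followed by a backwards search. The length
   includes the terminator so that searching for '\0' finds it. */
static char *sketch_strrchr(const char *s, int c)
{
    assert(s != NULL);
    return sketch_memrchr(s, c, strlen(s) + 1);
}
```

The composition makes two passes over the string, one forward and one backward, which is presumably part of what a dedicated single-pass strrchr can avoid.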
* LoongArch: Add ifunc support for strcpy, stpcpy{aligned, unaligned, lsx, lasx} | dengjianbo | 2023-09-15 | 12 | -0/+963
    According to glibc strcpy and stpcpy microbenchmark test results (changed to use generic_strcpy and generic_stpcpy instead of strlen + memcpy), compared with the generic version, this implementation could reduce the runtime as follows:
        Name                Percent of runtime reduced
        strcpy-aligned      8%-45%
        strcpy-unaligned    8%-48%
        strcpy-lsx          20%-80%
        strcpy-lasx         15%-86%
        stpcpy-aligned      6%-43%
        stpcpy-unaligned    8%-48%
        stpcpy-lsx          10%-80%
        stpcpy-lasx         10%-87%
    Note on strcpy-unaligned: compared with the aligned version, the unaligned version takes fewer instructions to copy the tail of data whose length is less than 8. It also performs better when src and dest cannot both be aligned to 8 bytes.
* LoongArch: Replace deprecated $v0 with $a0 to eliminate 'as' Warnings. | caiyinyu | 2023-09-15 | 1 | -1/+1
* LoongArch: Add lasx/lsx support for _dl_runtime_profile. | caiyinyu | 2023-09-15 | 7 | -179/+331
* LoongArch: Change loongarch to LoongArch in comments | dengjianbo | 2023-08-29 | 24 | -24/+24
* LoongArch: Add ifunc support for memcmp{aligned, lsx, lasx} | dengjianbo | 2023-08-29 | 7 | -0/+861
    According to glibc memcmp microbenchmark test results (with a generic memcmp added), this implementation improves performance except when the length is less than 3. Details:
        Name              Percent of time reduced
        memcmp-lasx       16%-74%
        memcmp-lsx        20%-50%
        memcmp-aligned    5%-20%
* LoongArch: Add ifunc support for memset{aligned, unaligned, lsx, lasx} | dengjianbo | 2023-08-29 | 8 | -0/+688
    According to glibc memset microbenchmark test results, for the LSX and LASX versions a few cases with length less than 8 experience performance degradation. Overall, the LASX version could reduce the runtime by about 15%-75%, and the LSX version by about 15%-50%.
    The unaligned version uses unaligned memory accesses to set data whose length is less than 64 and to make the address 8-byte aligned. For this part, the performance is better than the aligned version. Compared with the generic version, the performance is close when the length is larger than 128. When the length is 8-128, the unaligned version could reduce the runtime by about 30%-70%, and the aligned version by about 20%-50%.
* LoongArch: Add ifunc support for memrchr{lsx, lasx} | dengjianbo | 2023-08-29 | 7 | -0/+335
    According to the glibc memrchr microbenchmark, this implementation could reduce the runtime as follows:
        Name            Percent of runtime reduced
        memrchr-lasx    20%-83%
        memrchr-lsx     20%-64%
* LoongArch: Add ifunc support for memchr{aligned, lsx, lasx} | dengjianbo | 2023-08-29 | 7 | -0/+401
    According to the glibc memchr microbenchmark, this implementation could reduce the runtime as follows:
        Name              Percent of runtime reduced
        memchr-lasx       37%-83%
        memchr-lsx        30%-66%
        memchr-aligned    0%-15%
* LoongArch: Add ifunc support for rawmemchr{aligned, lsx, lasx} | dengjianbo | 2023-08-29 | 7 | -0/+365
    According to the glibc rawmemchr microbenchmark, a few cases tested with char '\0' experience performance degradation because the lasx and lsx versions don't handle '\0' separately. Overall, the rawmemchr-lasx implementation could reduce the runtime by about 40%-80%, rawmemchr-lsx by about 40%-66%, and rawmemchr-aligned by about 20%-40%.
* LoongArch: Remove support code for old linker in start.S | Xi Ruoyao | 2023-08-29 | 1 | -16/+3
    We are requiring Binutils >= 2.41, so la.pcrel always works here.
    Signed-off-by: Xi Ruoyao <xry111@xry111.site>
* LoongArch: Simplify the autoconf check for static PIE | Xi Ruoyao | 2023-08-29 | 2 | -50/+16
    We are strictly requiring GAS >= 2.41 now, so we don't need to check assembler capability anymore.
    Signed-off-by: Xi Ruoyao <xry111@xry111.site>
* LoongArch: Add ifunc support for strncmp{aligned, lsx} | dengjianbo | 2023-08-24 | 6 | -0/+508
    Based on the glibc microbenchmark, only a few short inputs experience performance degradation with the strncmp-aligned and strncmp-lsx implementations. Overall, strncmp-aligned could reduce the runtime by 0%-10% for aligned comparison and 10%-25% for unaligned comparison, and strncmp-lsx could reduce the runtime by about 0%-60%.
* LoongArch: Add ifunc support for strcmp{aligned, lsx} | dengjianbo | 2023-08-24 | 6 | -0/+426
    Based on the glibc microbenchmark, the strcmp-aligned implementation could reduce the runtime by 0%-10% for aligned comparison and 10%-20% for unaligned comparison, and the strcmp-lsx implementation could reduce the runtime by 0%-50%.
* LoongArch: Add ifunc support for strnlen{aligned, lsx, lasx} | dengjianbo | 2023-08-24 | 7 | -0/+382
    Based on the glibc microbenchmark, the strnlen-aligned implementation could reduce the runtime by more than 10%, strnlen-lsx by about 50%-78%, and strnlen-lasx by about 50%-88%.
* Loongarch: Add ifunc support for memcpy{aligned, unaligned, lsx, lasx} and memmove{aligned, unaligned, lsx, lasx} | dengjianbo | 2023-08-17 | 13 | -0/+2435
    These implementations improve the time to copy data in the glibc microbenchmark as below:
        memcpy-lasx reduces the runtime about 8%-76%
        memcpy-lsx reduces the runtime about 8%-72%
        memcpy-unaligned reduces the runtime of unaligned data copying up to 40%
        memcpy-aligned reduces the runtime of unaligned data copying up to 25%
        memmove-lasx reduces the runtime about 20%-73%
        memmove-lsx reduces the runtime about 50%
        memmove-unaligned reduces the runtime of unaligned data moving up to 40%
        memmove-aligned reduces the runtime of unaligned data moving up to 25%
* Loongarch: Add ifunc support for strchr{aligned, lsx, lasx} and strchrnul{aligned, lsx, lasx} | dengjianbo | 2023-08-17 | 12 | -0/+581
    These implementations improve the time to run the strchr{nul} microbenchmark in glibc as below:
        strchr-lasx reduces the runtime about 50%-83%
        strchr-lsx reduces the runtime about 30%-67%
        strchr-aligned reduces the runtime about 10%-20%
        strchrnul-lasx reduces the runtime about 50%-83%
        strchrnul-lsx reduces the runtime about 36%-65%
        strchrnul-aligned reduces the runtime about 6%-10%
* Loongarch: Add ifunc support and add different versions of strlen | dengjianbo | 2023-08-14 | 8 | -0/+416
    strlen-lasx is implemented with LASX SIMD instructions (256-bit).
    strlen-lsx is implemented with LSX SIMD instructions (128-bit).
    strlen-align is implemented with LA basic instructions and never uses unaligned memory access.
* LoongArch: Add minimum binutils required version | dengjianbo | 2023-08-14 | 4 | -8/+7
    LoongArch glibc may include LASX/LSX vector instruction code, so change the required minimum binutils version to 2.41, which supports vector instructions. HAVE_LOONGARCH_VEC_ASM is removed accordingly.
* LoongArch: Redefine macro LEAF/ENTRY. | dengjianbo | 2023-08-14 | 1 | -10/+26
    The following usages of the LEAF/ENTRY macros are both valid:
    1. LEAF(fcn) -- the align value of fcn is .align 3 (default value)
    2. LEAF(fcn, 6) -- the align value of fcn is .align 6
* LoongArch: Fix static PIE condition for toolchain bootstrapping. | Yang Yujie | 2023-08-04 | 2 | -2/+2
    This patch allows the static PIE startfile rcrt1.o to be built without requiring libgcc_s.so from GCC, which depends on libc in the first place.
* configure: Use autoconf 2.71 | Siddhesh Poyarekar | 2023-07-17 | 2 | -37/+38
    Bump autoconf requirement to 2.71 to allow regenerating configure on more recent distributions. autoconf 2.71 has been in Fedora since F36 and is the current version in Debian stable (bookworm). It appears to be current in Gentoo as well.
    All sysdeps configure and preconfigure scripts have also been regenerated; all changes are trivial transformations that do not affect functionality.
    Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
    Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* LoongArch: Fix soft-float bug about _dl_runtime_resolve{,lsx,lasx} | caiyinyu | 2023-07-11 | 3 | -11/+9
* LoongArch: Add vector implementation for _dl_runtime_resolve. | caiyinyu | 2023-07-11 | 6 | -69/+178
* LoongArch: config: Added HAVE_LOONGARCH_VEC_ASM. | caiyinyu | 2023-07-11 | 2 | -0/+43
    This patch checks whether the assembler supports vector instructions for generating LASX/LSX code, and then defines the HAVE_LOONGARCH_VEC_ASM macro.
    We have added support for vector instructions in binutils-2.41. See:
    https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=75b2f521b101d974354f6ce9ed7c054d8b2e3b7a
        commit 75b2f521b101d974354f6ce9ed7c054d8b2e3b7a
        Author: mengqinggang <mengqinggang@loongson.cn>
        Date: Thu Jun 22 10:35:28 2023 +0800
        LoongArch: gas: Add lsx and lasx instructions support
        gas/ChangeLog:
        * config/tc-loongarch.c (md_parse_option): Add lsx and lasx option.
        (loongarch_after_parse_args): Add lsx and lasx option.
        opcodes/ChangeLog:
        * loongarch-opc.c (struct loongarch_ase): Add lsx and lasx instructions.
* LoongArch: config: Rewrite check on static PIE. | caiyinyu | 2023-07-07 | 2 | -14/+14
    It's better to add "\" before "EOF" and remove "\" before "$".
* LoongArch: Add support for dl_runtime_profile | caiyinyu | 2023-06-13 | 5 | -4/+220
    This commit can fix the FAIL item: elf/tst-sprof-basic.
* Fix misspellings in sysdeps/ -- BZ 25337 | Paul Pluzhnikov | 2023-05-30 | 2 | -2/+2
* LoongArch: Add get_rounding_mode. | caiyinyu | 2023-03-13 | 1 | -0/+38
* LoongArch: Update libm-test-ulps. | caiyinyu | 2023-03-02 | 1 | -0/+1
* LoongArch: Further refine the condition to enable static PIE | Xi Ruoyao | 2023-03-02 | 2 | -0/+7
    Before GCC r13-2728, -static-pie would produce a normal dynamically linked executable. I mistakenly believed it would produce a statically linked executable, so I failed to detect the breakage. Then, with Binutils 2.40 and (vanilla) GCC 12, libc_cv_static_pie_on_loongarch is mistakenly enabled and causes a build failure with "undefined reference to _DYNAMIC".
    Fix the issue by disabling static PIE if -static-pie creates something with an INTERP header.
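The idea behind the INTERP check can be expressed as a scan over an ELF file's program headers. The commit message does not say how the configure check inspects the linker output, so this C sketch is a stand-in technique. It assumes a Linux/glibc system where `<elf.h>` provides `Elf64_Phdr` and the `PT_*` constants, and `has_interp_segment` is a hypothetical name.

```c
#include <assert.h>
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if any of the phnum program headers is PT_INTERP. A file
   with such a header requests a dynamic loader, so a linker that emits
   one for -static-pie did not produce a real static PIE. */
static bool has_interp_segment(const Elf64_Phdr *phdr, size_t phnum)
{
    assert(phdr != NULL || phnum == 0);
    for (size_t i = 0; i < phnum; i++)
        if (phdr[i].p_type == PT_INTERP)
            return true;
    return false;
}
```

A caller would read the program header table from the linked test binary and treat a true result as "static PIE not usable with this toolchain".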
* LoongArch: Add math-barriers.h | Xi Ruoyao | 2023-02-27 | 1 | -0/+28
    This patch implements the LoongArch specific math barriers in order to omit the store and load from stack if possible.
    Signed-off-by: Xi Ruoyao <xry111@xry111.site>
    Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>