| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
This patch updates x86 elision-conf.c to use the newly defined
HAS_CPU_FEATURE from <cpu-features.h>.
* sysdeps/unix/sysv/linux/x86/elision-conf.c (elision_init):
Replace HAS_RTM with HAS_CPU_FEATURE (RTM).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch updates i686 multiarch functions to use the newly defined
HAS_CPU_FEATURE, HAS_ARCH_FEATURE, LOAD_GOT_AND_RTLD_GLOBAL_RO and
LOAD_FUNC_GOT_EAX from <cpu-features.h>.
* sysdeps/i386/i686/fpu/multiarch/e_expf.c: Replace HAS_XXX
with HAS_CPU_FEATURE/HAS_ARCH_FEATURE (XXX).
* sysdeps/i386/i686/fpu/multiarch/s_cosf.c: Likewise.
* sysdeps/i386/i686/fpu/multiarch/s_cosf.c: Likewise.
* sysdeps/i386/i686/fpu/multiarch/s_sincosf.c: Likewise.
* sysdeps/i386/i686/fpu/multiarch/s_sinf.c: Likewise.
* sysdeps/i386/i686/multiarch/ifunc-impl-list.c: Likewise.
* sysdeps/i386/i686/multiarch/s_fma.c: Likewise.
* sysdeps/i386/i686/multiarch/s_fmaf.c: Likewise.
* sysdeps/i386/i686/multiarch/bcopy.S: Remove __init_cpu_features
call. Merge SHARED and !SHARED. Add LOAD_GOT_AND_RTLD_GLOBAL_RO.
Use LOAD_FUNC_GOT_EAX to load function address. Replace HAS_XXX
with HAS_CPU_FEATURE/HAS_ARCH_FEATURE (XXX).
* sysdeps/i386/i686/multiarch/bzero.S: Likewise.
* sysdeps/i386/i686/multiarch/memchr.S: Likewise.
* sysdeps/i386/i686/multiarch/memcmp.S: Likewise.
* sysdeps/i386/i686/multiarch/memcpy.S: Likewise.
* sysdeps/i386/i686/multiarch/memcpy_chk.S: Likewise.
* sysdeps/i386/i686/multiarch/memmove.S: Likewise.
* sysdeps/i386/i686/multiarch/memmove_chk.S: Likewise.
* sysdeps/i386/i686/multiarch/mempcpy.S: Likewise.
* sysdeps/i386/i686/multiarch/mempcpy_chk.S: Likewise.
* sysdeps/i386/i686/multiarch/memrchr.S: Likewise.
* sysdeps/i386/i686/multiarch/memset.S: Likewise.
* sysdeps/i386/i686/multiarch/memset_chk.S: Likewise.
* sysdeps/i386/i686/multiarch/rawmemchr.S: Likewise.
* sysdeps/i386/i686/multiarch/strcasecmp.S: Likewise.
* sysdeps/i386/i686/multiarch/strcat.S: Likewise.
* sysdeps/i386/i686/multiarch/strchr.S: Likewise.
* sysdeps/i386/i686/multiarch/strcmp.S: Likewise.
* sysdeps/i386/i686/multiarch/strcpy.S: Likewise.
* sysdeps/i386/i686/multiarch/strcspn.S: Likewise.
* sysdeps/i386/i686/multiarch/strlen.S: Likewise.
* sysdeps/i386/i686/multiarch/strncase.S: Likewise.
* sysdeps/i386/i686/multiarch/strnlen.S: Likewise.
* sysdeps/i386/i686/multiarch/strrchr.S: Likewise.
* sysdeps/i386/i686/multiarch/strspn.S: Likewise.
* sysdeps/i386/i686/multiarch/wcschr.S: Likewise.
* sysdeps/i386/i686/multiarch/wcscmp.S: Likewise.
* sysdeps/i386/i686/multiarch/wcscpy.S: Likewise.
* sysdeps/i386/i686/multiarch/wcslen.S: Likewise.
* sysdeps/i386/i686/multiarch/wcsrchr.S: Likewise.
* sysdeps/i386/i686/multiarch/wmemcmp.S: Likewise.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch updates x86_64 multiarch functions to use the newly defined
HAS_CPU_FEATURE, HAS_ARCH_FEATURE and LOAD_RTLD_GLOBAL_RO_RDX from
<cpu-features.h>.
* sysdeps/x86_64/fpu/multiarch/e_asin.c: Replace HAS_XXX with
HAS_CPU_FEATURE/HAS_ARCH_FEATURE (XXX).
* sysdeps/x86_64/fpu/multiarch/e_atan2.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/e_exp.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/e_log.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/e_pow.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/s_atan.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/s_fma.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/s_fmaf.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/s_sin.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/s_tan.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/s_ceil.S: Use
LOAD_RTLD_GLOBAL_RO_RDX and HAS_CPU_FEATURE (SSE4_1).
* sysdeps/x86_64/fpu/multiarch/s_ceilf.S: Likewise.
* sysdeps/x86_64/fpu/multiarch/s_floor.S: Likewise.
* sysdeps/x86_64/fpu/multiarch/s_floorf.S: Likewise.
* sysdeps/x86_64/fpu/multiarch/s_nearbyint.S : Likewise.
* sysdeps/x86_64/fpu/multiarch/s_nearbyintf.S: Likewise.
* sysdeps/x86_64/fpu/multiarch/s_rintf.S: Likewise.
* sysdeps/x86_64/fpu/multiarch/s_rintf.S : Likewise.
* sysdeps/x86_64/multiarch/ifunc-impl-list.c: Likewise.
* sysdeps/x86_64/multiarch/sched_cpucount.c: Likewise.
* sysdeps/x86_64/multiarch/strstr.c: Likewise.
* sysdeps/x86_64/multiarch/memmove.c: Likewise.
* sysdeps/x86_64/multiarch/memmove_chk.c: Likewise.
* sysdeps/x86_64/multiarch/test-multiarch.c: Likewise.
* sysdeps/x86_64/multiarch/memcmp.S: Remove __init_cpu_features
call. Add LOAD_RTLD_GLOBAL_RO_RDX. Replace HAS_XXX with
HAS_CPU_FEATURE/HAS_ARCH_FEATURE (XXX).
* sysdeps/x86_64/multiarch/memcpy.S: Likewise.
* sysdeps/x86_64/multiarch/memcpy_chk.S: Likewise.
* sysdeps/x86_64/multiarch/mempcpy.S: Likewise.
* sysdeps/x86_64/multiarch/mempcpy_chk.S: Likewise.
* sysdeps/x86_64/multiarch/memset.S: Likewise.
* sysdeps/x86_64/multiarch/memset_chk.S: Likewise.
* sysdeps/x86_64/multiarch/strcat.S: Likewise.
* sysdeps/x86_64/multiarch/strchr.S: Likewise.
* sysdeps/x86_64/multiarch/strcmp.S: Likewise.
* sysdeps/x86_64/multiarch/strcpy.S: Likewise.
* sysdeps/x86_64/multiarch/strcspn.S: Likewise.
* sysdeps/x86_64/multiarch/strspn.S: Likewise.
* sysdeps/x86_64/multiarch/wcscpy.S: Likewise.
* sysdeps/x86_64/multiarch/wmemcmp.S: Likewise.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds _dl_x86_cpu_features to rtld_global in x86 ld.so
and initializes it early before __libc_start_main is called so that
cpu_features is always available when it is used and we can avoid
calling __init_cpu_features in IFUNC selectors.
* sysdeps/i386/dl-machine.h: Include <cpu-features.c>.
(dl_platform_init): Call init_cpu_features.
* sysdeps/i386/dl-procinfo.c (_dl_x86_cpu_features): New.
* sysdeps/i386/i686/cacheinfo.c
(DISABLE_PREFERRED_MEMORY_INSTRUCTION): Removed.
* sysdeps/i386/i686/multiarch/Makefile (aux): Remove init-arch.
* sysdeps/i386/i686/multiarch/Versions: Removed.
* sysdeps/i386/i686/multiarch/ifunc-defines.sym (KIND_OFFSET):
Removed.
* sysdeps/i386/ldsodefs.h: Include <cpu-features.h>.
* sysdeps/unix/sysv/linux/x86/Makefile
(libpthread-sysdep_routines): Remove init-arch.
* sysdeps/unix/sysv/linux/x86_64/dl-procinfo.c: Include
<sysdeps/x86_64/dl-procinfo.c> instead of
sysdeps/generic/dl-procinfo.c>.
* sysdeps/x86/Makefile [$(subdir) == csu] (gen-as-const-headers):
Add cpu-features-offsets.sym and rtld-global-offsets.sym.
[$(subdir) == elf] (sysdep-dl-routines): Add dl-get-cpu-features.
[$(subdir) == elf] (sysdep-rtld-routines): Likewise.
[$(subdir) == elf] (sysdep_routines): Likewise.
[$(subdir) == elf] (elide-routines.os): Likewise.
[$(subdir) == elf] (tests): Add tst-get-cpu-features.
[$(subdir) == elf] (tests-static): Add
tst-get-cpu-features-static.
* sysdeps/x86/Versions: New file.
* sysdeps/x86/cpu-features-offsets.sym: Likewise.
* sysdeps/x86/cpu-features.c: Likewise.
* sysdeps/x86/cpu-features.h: Likewise.
* sysdeps/x86/dl-get-cpu-features.c: Likewise.
* sysdeps/x86/libc-start.c: Likewise.
* sysdeps/x86/rtld-global-offsets.sym: Likewise.
* sysdeps/x86/tst-get-cpu-features-static.c: Likewise.
* sysdeps/x86/tst-get-cpu-features.c: Likewise.
* sysdeps/x86_64/dl-procinfo.c: Likewise.
* sysdeps/x86_64/cacheinfo.c (__cpuid_count): Removed.
Assume USE_MULTIARCH is defined and don't check it.
(is_intel): Replace __cpu_features with GLRO(dl_x86_cpu_features).
(is_amd): Likewise.
(max_cpuid): Likewise.
(intel_check_word): Likewise.
(__cache_sysconf): Don't call __init_cpu_features.
(__x86_preferred_memory_instruction): Removed.
(init_cacheinfo): Don't call __init_cpu_features. Replace
__cpu_features with GLRO(dl_x86_cpu_features).
* sysdeps/x86_64/dl-machine.h: <cpu-features.c>.
(dl_platform_init): Call init_cpu_features.
* sysdeps/x86_64/ldsodefs.h: Include <cpu-features.h>.
* sysdeps/x86_64/multiarch/Makefile (aux): Remove init-arch.
* sysdeps/x86_64/multiarch/Versions: Removed.
* sysdeps/x86_64/multiarch/cacheinfo.c: Likewise.
* sysdeps/x86_64/multiarch/init-arch.c: Likewise.
* sysdeps/x86_64/multiarch/ifunc-defines.sym (KIND_OFFSET):
Removed.
* sysdeps/x86_64/multiarch/init-arch.h: Rewrite.
|
|
|
|
|
|
|
|
|
|
| |
If x86-64 assembler doesn't support MPX, we encode bndmov instruction by
hand. When displacement is zero, assembler generates shorter encoding.
This patch improves bndmov encoding with zero displacement so that ld.so
is identical when using assemblers with and without MPX support.
* sysdeps/x86_64/dl-trampoline.S (_dl_runtime_resolve): Improve
bndmov encoding with zero displacement.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We need to save/restore bound registers and add a BND prefix before
branches in _dl_runtime_profile so that bound registers for pointer
pass and return are preserved when LD_AUDIT is used.
[BZ #18134]
* sysdeps/i386/configure.ac: Set HAVE_MPX_SUPPORT.
* sysdeps/i386/configure: Regenerated.
* sysdeps/i386/dl-trampoline.S (PRESERVE_BND_REGS_PREFIX): New.
(_dl_runtime_profile): Save and restore Intel MPX return bound
registers when calling _dl_call_pltexit. Add
PRESERVE_BND_REGS_PREFIX before return.
* sysdeps/i386/link-defines.sym (LRV_BND0_OFFSET): New.
(LRV_BND1_OFFSET): Likewise.
* sysdeps/x86/bits/link.h (La_i86_retval): Add lrv_bnd0 and
lrv_bnd1.
* sysdeps/x86_64/dl-trampoline.S (_dl_runtime_profile): Fix
typo in bndmov encoding.
* sysdeps/x86_64/dl-trampoline.h: Properly save and restore
Intel MPX bound registers. Add PRESERVE_BND_REGS_PREFIX before
branch instructions to preserve bounds.
|
|
|
|
|
|
|
|
|
| |
We need to add a BND prefix before indirect branch at the end of
_dl_runtime_resolve to preserve bound registers.
[BZ #18134]
* sysdeps/x86_64/dl-trampoline.S (PRESERVE_BND_REGS_PREFIX): New.
(_dl_runtime_resolve): Add a BND prefix before indirect branch.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Define macros for fields in La_i86_regs and La_i86_retval and use them
in dl-trampoline.S, instead of hardcoded values.
* sysdeps/i386/Makefile (gen-as-const-headers)[elf]: Add
link-defines.sym.
* sysdeps/i386/dl-trampoline.S: Include <link-defines.h>.
(_dl_runtime_profile): Use LONG_DOUBLE_SIZE, LRV_SIZE,
LRV_EAX_OFFSET, LRV_EDX_OFFSET, LRV_ST0_OFFSET, LRV_ST1_OFFSET
and LR_SIZE.
* sysdeps/i386/link-defines.sym: New file.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds a testcase for i386 LD_AUDIT to check function return
and parameters passed in registers.
* sysdeps/i386/Makefile (tests)[elf]: Add tst-audit3.
(modules-names): Add tst-auditmod3a tst-auditmod3b.
($(objpfx)tst-audit3): New rule.
($(objpfx)tst-audit3.out): Likewise.
* sysdeps/i386/tst-audit3.c: New file.
* sysdeps/i386/tst-audit3.h: Likewise.
* sysdeps/i386/tst-auditmod3a.c: Likewise.
* sysdeps/i386/tst-auditmod3b.c: Likewise.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With copy relocation, address of protected data defined in the shared
library may be external. When there is a relocation against the
protected data symbol within the shared library, we need to check if we
should skip the definition in the executable copied from the protected
data. This patch adds ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA and defines
it for x86. If ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA isn't 0, do_lookup_x
will skip the data definition in the executable from copy reloc.
Cherry-pick from master: 62da1e3b00b51383ffa7efc89d8addda0502e107
[BZ #17711]
* elf/dl-lookup.c (do_lookup_x): When UNDEF_MAP is NULL, which
indicates it is called from do_lookup_x on relocation against
protected data, skip the data definion in the executable from
copy reloc.
(_dl_lookup_symbol_x): Pass ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA,
instead of ELF_RTYPE_CLASS_PLT, to do_lookup_x for
EXTERN_PROTECTED_DATA relocation against STT_OBJECT symbol.
* sysdeps/generic/ldsodefs.h * (ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA):
New. Defined to 4 if DL_EXTERN_PROTECTED_DATA is defined,
otherwise to 0.
* sysdeps/i386/dl-lookupcfg.h (DL_EXTERN_PROTECTED_DATA): New.
* sysdeps/i386/dl-machine.h (elf_machine_type_class): Set class
to ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA for R_386_GLOB_DAT.
* sysdeps/x86_64/dl-lookupcfg.h (DL_EXTERN_PROTECTED_DATA): New.
* sysdeps/x86_64/dl-machine.h (elf_machine_type_class): Set class
to ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA for R_X86_64_GLOB_DAT.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Way back in 2005 the atomic_exchange_and_add function was cleaned up to
avoid the explicit size checking and instead let gcc handle things itself.
Unfortunately that change ended up leaving beyond a cast to int, even when
the incoming value was a long. This has flown under the radar for a long
time due to the function not being heavily used in the tree (especially as
a full 64bit field), but a recent change to semaphores made some nptl tests
fail reliably. This is due to the code packing two 32bit values into one
64bit variable (where the high 32bits contained the number of waiters), and
then the whole variable being atomically updated between threads. On ia64,
that meant we never atomically updated the count, so sometimes the sem_post
would not wake up the waiters.
(cherry picked from commit cf31a2c79957936b60de34ea1e718e892baf669c)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit a059d359d86130b5fa74e04a978c8523a0293f77 changed the sigaction
struct to pass conform tests, but it ended up also changing the ABI for
32 bit builds. For 64 bit builds, changing the long to two ints works,
but for 32 bit builds, it inserts 4 extra bytes. This leads to many
packages randomly failing like bash that spews things like:
configure: line 471: wait_for: No record of process 0
Bracket the new member by a wordsize check to fix the ABI for 32bit.
(cherry picked from commit 7fde904c73c57faea48c9679bbdc0932d81b3a2f)
|
|
|
|
|
|
|
| |
In commit 8b4416d, the 1: jump label in __mempcpy_chk was accidentally
moved. This resulted in failures of mempcpy on CPU without SSE2.
(cherry picked from commit 132a1328eccd20621b77f7810eebbeec0a1af187)
|
|
|
|
|
|
|
|
| |
This reverts part of the previous commit to refactor pthread.h.
The refactoring must be done by having pthread.h include arch
bits headers, not the other way around. Then hppa provides the
arch bits header. For now we synchronzie again with pthread.h
and include the entire contents in the hppa copy.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Update all translations.
Update contributions in the manual.
Update installation notes with information about newest working tools.
Reconfigure using exactly autoconf 2.69.
Regenerate INSTALL.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(1) Fix warnings.
This is a bulk update to fix all the warnings that were causing
build failures with -Werror on hppa.
The most egregious problems are in dl-fptr.c which needs to be
entirely rewritten, thus I've used -Wno-error for that.
(2) Fix conformance errors.
The sysdep.c file had __syscall_error and syscall in one file
which caused conformance issues by including syscall when
__syscall_error was linked to. The fix is obviously to split
the file and use syscall.c to implement syscall.
|
| |
|
|
|
|
|
|
|
|
| |
* sysdeps/sparc/sparc32/bits/atomic.h
(__sparc32_atomic_do_unlock24): Put the memory barrier before the
unlock not after it.
(__v9_compare_and_exchange_val_32_acq): Use unions to avoid getting
volatile register usage warnings from the compiler.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* sysdeps/sparc/nptl/sem_init.c: Delete.
* sysdeps/sparc/nptl/sem_post.c: Delete.
* sysdeps/sparc/nptl/sem_timedwait.c: Delete.
* sysdeps/sparc/nptl/sem_wait.c: Delete.
* sysdeps/sparc/sparc32/sem_init.c: New file.
* sysdeps/sparc/sparc32/sem_waitcommon.c: New file.
* sysdeps/sparc/sparc32/sem_open.c: Generic nptl version with
padding explicitly initialized.
* sysdeps/sparc/sparc32/sem_post.c: Generic nptl version using
padding for in-semaphore spinlock.
* sysdeps/sparc/sparc32/sem_wait.c: Likewise.
* sysdeps/sparc/sparc32/sem_trywait.c: Delete.
* sysdeps/sparc/sparc32/sem_timedwait.c: Delete.
* sysdeps/sparc/sparc32/sparcv9/sem_init.c: New file.
* sysdeps/sparc/sparc32/sparcv9/sem_open.c: New file.
* sysdeps/sparc/sparc32/sparcv9/sem_post.c: New file.
* sysdeps/sparc/sparc32/sparcv9/sem_waitcommon.c: New file.
* sysdeps/sparc/sparc32/sparcv9/sem_wait.c: Redirect to nptl
version.
* sysdeps/sparc/sparc32/sparcv9/sem_timedwait.c: Delete.
* sysdeps/sparc/sparc32/sparcv9/sem_trywait.c: Delete.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
memcpy with unaligned 256-bit AVX register loads/stores are slow on older
processorsl like Sandy Bridge. This patch adds bit_AVX_Fast_Unaligned_Load
and sets it only when AVX2 is available.
[BZ #17801]
* sysdeps/x86_64/multiarch/init-arch.c (__init_cpu_features):
Set the bit_AVX_Fast_Unaligned_Load bit for AVX2.
* sysdeps/x86_64/multiarch/init-arch.h (bit_AVX_Fast_Unaligned_Load):
New.
(index_AVX_Fast_Unaligned_Load): Likewise.
(HAS_AVX_FAST_UNALIGNED_LOAD): Likewise.
* sysdeps/x86_64/multiarch/memcpy.S (__new_memcpy): Check the
bit_AVX_Fast_Unaligned_Load bit instead of the bit_AVX_Usable bit.
* sysdeps/x86_64/multiarch/memcpy_chk.S (__memcpy_chk): Likewise.
* sysdeps/x86_64/multiarch/mempcpy.S (__mempcpy): Likewise.
* sysdeps/x86_64/multiarch/mempcpy_chk.S (__mempcpy_chk): Likewise.
* sysdeps/x86_64/multiarch/memmove.c (__libc_memmove): Replace
HAS_AVX with HAS_AVX_FAST_UNALIGNED_LOAD.
* sysdeps/x86_64/multiarch/memmove_chk.c (__memmove_chk): Likewise.
|
|
|
|
|
| |
Architectures which don't use hp-timing-common.h don't include <signal.h>
via <sys/param.h>.
|
|
|
|
|
|
| |
This is because of alignment issues in the sem_t support.
tilegx32 does in fact support 64-bit atomics and we will need
to revisit this after the 2.21 freeze.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch disables use of 64-bit atomics for MIPS n32 to fix the
problems with unaligned semaphores.
Before 64-bit atomics are used for anything for which such alignment
issues do not arise, and before the addition of any new ILP32 ports
with 64-bit semaphores for which the ABI can be set to have the
greater alignment (AARCH64?), a better approach will need to be
established that allows architectures to declare their 64-bit atomics
availability accurately, without doing so causing inappropriate use of
such atomics on unaligned semaphores.
Tested for MIPS n32 that this fixes the nptl/tst-sem3 failure.
* sysdeps/mips/bits/atomic.h [_MIPS_SIM == _ABIN32]
(__HAVE_64B_ATOMICS): Define to 0.
|
|
|
|
|
|
|
|
|
| |
This patch fixes a bug introduced by 18f2945ae9216cfc, where it optimizes
the FPSCR set by just issuing a mtfs instruction if new flag is different
from older one. The issue is a typo, where the new flag should the the
new value, instead of the old one.
It fixes BZ#17885.
|
|
|
|
|
|
|
|
|
|
| |
Some powerpc64 processors (e5500 core for instance) does not provide the
fsqrt instruction, however current check to use in math_private.h is
__WORDSIZE and _ARCH_PWR4 (ISA 2.02). This is patch change it to use
the compiler flag _ARCH_PPCSQ (which is the same condition GCC uses to
decide whether to generate fsqrt instruction).
It fixes BZ#16576.
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GLIBC memset optimization for POWER8 uses the '.machine power8'
directive, which is only supported officially on binutils 2.24+. This
causes a build failure on older binutils.
Since the requirement of .machine power8 is to correctly assembly the
'mtvsrd' instruction and it is already handled by the MTVSRD_V1_R4
macro, there is no really needed of using it.
The patch replaces the power8 with power7 for .machine directive.
It fixes BZ#17869.
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fix the elf/ifuncmain6pie failure when building with GCC
4.9+. For some reason, the compiler removes the branch taken code at
resolve_ifunc (sysdeps/powerpc/powerpc64/dl-machine.h) as dead-code
and thus the testcase fails because the ifunc resolves branches to an
invalid memory location. It fixes by explicit adding a dependency of
value based on odp variable to avoid compiler optimization.
It fixes BZ#17868.
|
| |
|
|
|
|
|
| |
* sysdeps/x86_64/multiarch/init-arch.c (__init_cpu_features):
Treat model numbers 0x4a/0x4d as Intel Silvermont architecture.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch replaces unsigned long int and 1UL with uint64_t and
(uint64_t) 1 to support ILP32 targets like x32.
[BZ #17870]
* nptl/sem_post.c (__new_sem_post): Replace unsigned long int
with uint64_t.
* nptl/sem_waitcommon.c (__sem_wait_cleanup): Replace 1UL with
(uint64_t) 1.
(__new_sem_wait_slow): Replace unsigned long int with uint64_t.
Replace 1UL with (uint64_t) 1.
* sysdeps/nptl/internaltypes.h (new_sem): Replace unsigned long
int with uint64_t.
|
|
|
|
|
|
|
|
| |
This patch fix powerpc __get_clockfreq racy and cancel-safe issues by
dropping internal static cache and by using nocancel file operations.
The vDSO failure check is also removed, since kernel code does not
return an error (it cleans cr0.so bit on function return) and the static
code (to read value /proc) now uses non-cancellable calls.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The ability to recursively call dlopen is useful for malloc
implementations that wish to load other dynamic modules that
implement reentrant/AS-safe functions to use in their own
implementation.
Given that a user malloc implementation may be called by an
ongoing dlopen to allocate memory the user malloc
implementation interrupts dlopen and if it calls dlopen again
that's a reentrant call.
This patch fixes the issues with the ld.so.cache mapping
and the _r_debug assertion which prevent this from working
as expected.
See:
https://sourceware.org/ml/libc-alpha/2014-12/msg00446.html
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit fixes semaphore destruction by either using 64b atomic
operations (where available), or by using two separate fields when only
32b atomic operations are available. In the latter case, we keep a
conservative estimate of whether there are any waiting threads in one
bit of the field that counts the number of available tokens, thus
allowing sem_post to atomically both add a token and determine whether
it needs to call futex_wake.
See:
https://sourceware.org/ml/libc-alpha/2014-12/msg00155.html
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When fixing namespace issues for <fenv.h> functions I missed one call
to fesetenv for powerpc-nofpu. This patch changes this to a call to
__fesetenv.
Tested for powerpc-nofpu; it fixes the previously observed math.h
linknamespace test failures.
[BZ #17748]
* sysdeps/powerpc/nofpu/feholdexcpt.c (__feholdexcept): Call
__fesetenv instead of fesetenv.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 050f7298e1ecc39887c329037575ccd972071255 added an extern
declaration for __tls_get_addr that conflicts with the one in s390
dl-tls.h, based on whether __tls_get_addr is defined as a macro. The
rationale seems to be based on the assumption that __tls_get_addr is
exported for every architecture and hence an internal non-plt alias is
needed. This is not true for s390 though, since it exports
__tls_get_offset and not __tls_get_addr. This results in tst-audit9
being stuck in an infinite loop.
This patch fixes this by defining a __tls_get_addr macro to itself so
as to not use the conflicting declaration.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes a performance regression on the POWER7/PPC64 memcmp
porting for Little Endian. The LE code uses 'ldbrx' instruction to read
the memory on byte reversed form, however ISA 2.06 just provide the indexed
form which uses a register value as additional index, instead of a fixed value
enconded in the instruction.
And the port strategy for LE uses r0 index value and update the address
value on each compare loop interation. For large compare size values,
it adds 8 more instructions plus some more depending of trailing
size. This patch fixes it by adding pre-calculate indexes to remove the
address update on loops and tailing sizes.
For large sizes it shows a considerable gain, with double performance
pairing with BE.
|
|
|
|
|
|
|
|
|
|
| |
This patch adds an optimized POWER8 strncmp. The implementation focus
on speeding up unaligned cases follwing the ideas of power8 strcmp.
The algorithm first check the initial 16 bytes, then align the first
function source and uses unaligned loads on second argument only.
Aditional checks for page boundaries are done for unaligned cases
(where sources alignment are different).
|
|
|
|
|
|
| |
This patch optimized the POWER7 trailing check by avoiding using byte
read operations and instead use the doubleword already readed with
bitwise operations.
|
|
|
|
|
|
|
| |
This patch adds an optimized POWER8 strcmp using unaligned accesses.
The algorithm first check the initial 16 bytes, then align the first
function source and uses unaligned loads on second argument only.
Aditional checks for page boundaries are done for unaligned cases
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds an optimized POWER8 st{r,p}ncpy using unaligned accesses.
It shows 10%-80% improvement over the optimized POWER7 one that uses
only aligned accesses, specially on unaligned inputs.
The algorithm first read and check 16 bytes (if inputs do not cross a 4K
page size). The it realign source to 16-bytes and issue a 16 bytes read
and compare loop to speedup null byte checks for large strings. Also,
different from POWER7 optimization, the null pad is done inline in the
implementation using possible unaligned accesses, instead of realying on
a memset call. Special case is added for page cross reads.
|
|
|
|
|
|
|
|
|
|
|
| |
With 3eb38795dbbbd816 (Simplify strncat) the generic algorithms uses
strlen, strnlen, and memcpy. This is faster than POWER7 current
implementation, especially for unaligned strings (where POWER7 code
uses byte-byte operations).
This patch removes the assembly implementation and uses a multiarch
specialization based on default algorithm calling optimized POWER7
symbols.
|
|
|
|
|
| |
With new optimized strcpy for POWER8, this patch adds an optimized
strcat which uses it along with default implementation at strings/.
|