| Commit message | Author | Age | Files | Lines |
| |
When the x86-64 assembler doesn't support AVX512, we should make
_dl_runtime_resolve_avx512/_dl_runtime_profile_avx512 aliases of
_dl_runtime_resolve_avx/_dl_runtime_profile_avx. Tested on x86-64
using GCC 5.2 with binutils 20151008 and GCC 4.8 with binutils 20130219.
There are no differences in ld.so with binutils 20151008. There are no
unexpected failures with binutils 20130219 and 20151008.
[BZ #19124]
* sysdeps/x86_64/dl-trampoline.S [!HAVE_AVX512_ASM_SUPPORT]
(_dl_runtime_resolve_avx512): Make it a hidden alias of
_dl_runtime_resolve_avx.
(_dl_runtime_profile_avx512): Make it a hidden alias of
_dl_runtime_profile_avx.
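
As a rough C analogue of the aliasing described above (the real change is made in assembly), the sketch below gives one function body a second, hidden name when a configuration macro is absent. The function names and return values are hypothetical stand-ins, and the alias attribute assumes GCC or Clang on an ELF target.

```c
#include <assert.h>

/* Hypothetical stand-in for the AVX trampoline body. */
int resolve_avx(int slot) { return slot + 1; }

#ifdef HAVE_AVX512_ASM_SUPPORT
/* Assembler understands AVX512: a separate implementation exists. */
int resolve_avx512(int slot) { return slot + 2; }
#else
/* No AVX512 support in the assembler: expose the AVX body under the
   AVX512 name as a hidden alias, so callers can still reference it. */
extern int resolve_avx512(int slot)
    __attribute__((alias("resolve_avx"), visibility("hidden")));
#endif
```

Without HAVE_AVX512_ASM_SUPPORT defined, resolve_avx512(1) and resolve_avx(1) run the same code.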
| |
This patch adds SSE, AVX and AVX512 versions of _dl_runtime_resolve
and _dl_runtime_profile, which save and restore the first 8 vector
registers used for parameter passing. elf_machine_runtime_setup
selects the proper _dl_runtime_resolve or _dl_runtime_profile based
on _dl_x86_cpu_features. This avoids the race condition caused by the
FOREIGN_CALL macros, which are only used for x86-64.
The performance impact of saving and restoring 8 vector registers is
negligible on Nehalem, Sandy Bridge, Ivy Bridge and Haswell when
ld.so is optimized with SSE2.
[BZ #15128]
* sysdeps/x86_64/Makefile [$(subdir) == elf] (tests): Add
ifuncmain8.
(modules-names): Add ifuncmod8.
($(objpfx)ifuncmain8): New rule.
* sysdeps/x86_64/dl-machine.h: Include <dl-procinfo.h> and
<cpuid.h>.
(elf_machine_runtime_setup): Use _dl_runtime_resolve_sse,
_dl_runtime_resolve_avx, or _dl_runtime_resolve_avx512,
_dl_runtime_profile_sse, _dl_runtime_profile_avx, or
_dl_runtime_profile_avx512, based on HAS_ARCH_FEATURE.
* sysdeps/x86_64/dl-trampoline.S: Rewrite.
* sysdeps/x86_64/dl-trampoline.h: Likewise.
* sysdeps/x86_64/ifuncmain8.c: New file.
* sysdeps/x86_64/ifuncmod8.c: Likewise.
* sysdeps/x86_64/nptl/tcb-offsets.sym (RTLD_SAVESPACE_SSE):
Removed.
* sysdeps/x86_64/nptl/tls.h (__128bits): Removed.
(tcbhead_t): Change rtld_must_xmm_save to __glibc_unused1.
Change rtld_savespace_sse to __glibc_unused2.
(RTLD_CHECK_FOREIGN_CALL): Removed.
(RTLD_ENABLE_FOREIGN_CALL): Likewise.
(RTLD_PREPARE_FOREIGN_CALL): Likewise.
(RTLD_FINALIZE_FOREIGN_CALL): Likewise.
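
The selection step described above can be sketched as follows. This is a minimal illustration, not the glibc code: the feature bits, the function names, and the idea of returning the per-register save width are all assumptions made for the example.

```c
#include <assert.h>

/* Hypothetical feature bits; glibc keeps the real ones in
   _dl_x86_cpu_features. */
enum { FEATURE_AVX = 1u << 0, FEATURE_AVX512F = 1u << 1 };

typedef int (*trampoline_fn)(void);

/* Each stand-in returns the number of bytes it would save per register. */
static int resolve_sse(void)    { return 16; }
static int resolve_avx(void)    { return 32; }
static int resolve_avx512(void) { return 64; }

/* Pick the widest usable variant once, at setup time. */
static trampoline_fn select_resolver(unsigned features)
{
    if (features & FEATURE_AVX512F)
        return resolve_avx512;
    if (features & FEATURE_AVX)
        return resolve_avx;
    return resolve_sse;
}
```

Because the choice is made once during setup, the trampoline itself needs no per-call feature test.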
| |
If the x86-64 assembler doesn't support MPX, we encode the bndmov instruction
by hand. When the displacement is zero, the assembler generates a shorter encoding.
This patch improves bndmov encoding with zero displacement so that ld.so
is identical when using assemblers with and without MPX support.
* sysdeps/x86_64/dl-trampoline.S (_dl_runtime_resolve): Improve
bndmov encoding with zero displacement.
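
To illustrate why a zero displacement yields a shorter encoding, here is a small sketch of the standard x86 ModRM/SIB rules for an (%rsp)-based memory operand. The helper name is hypothetical and it covers only the operand bytes, not the bndmov prefix and opcode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Emit ModRM + SIB (+ displacement) for an (%rsp)-based memory operand.
   reg is the 3-bit register field (e.g. 0 for bnd0).
   Returns the number of bytes written to out. */
static size_t encode_rsp_operand(uint8_t *out, unsigned reg, int32_t disp)
{
    size_t n = 0;
    unsigned mod;

    if (disp == 0)
        mod = 0;                          /* no displacement byte */
    else if (disp >= -128 && disp <= 127)
        mod = 1;                          /* 8-bit displacement */
    else
        mod = 2;                          /* 32-bit displacement */

    /* rm = 100 means a SIB byte follows. */
    out[n++] = (uint8_t)((mod << 6) | ((reg & 7u) << 3) | 4u);
    out[n++] = 0x24;                      /* SIB: no index, base = %rsp */

    if (mod == 1) {
        out[n++] = (uint8_t)disp;
    } else if (mod == 2) {
        for (int i = 0; i < 4; i++)       /* little-endian disp32 */
            out[n++] = (uint8_t)((uint32_t)disp >> (8 * i));
    }
    return n;
}
```

A hand-written encoding that always emits a disp8 byte would be one byte longer than the assembler's output whenever the displacement is zero, which is the mismatch the patch removes.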
| |
We need to save/restore bound registers and add a BND prefix before
branches in _dl_runtime_profile so that bound registers for pointer
passing and return are preserved when LD_AUDIT is used.
[BZ #18134]
* sysdeps/i386/configure.ac: Set HAVE_MPX_SUPPORT.
* sysdeps/i386/configure: Regenerated.
* sysdeps/i386/dl-trampoline.S (PRESERVE_BND_REGS_PREFIX): New.
(_dl_runtime_profile): Save and restore Intel MPX return bound
registers when calling _dl_call_pltexit. Add
PRESERVE_BND_REGS_PREFIX before return.
* sysdeps/i386/link-defines.sym (LRV_BND0_OFFSET): New.
(LRV_BND1_OFFSET): Likewise.
* sysdeps/x86/bits/link.h (La_i86_retval): Add lrv_bnd0 and
lrv_bnd1.
* sysdeps/x86_64/dl-trampoline.S (_dl_runtime_profile): Fix
typo in bndmov encoding.
* sysdeps/x86_64/dl-trampoline.h: Properly save and restore
Intel MPX bound registers. Add PRESERVE_BND_REGS_PREFIX before
branch instructions to preserve bounds.
| |
We need to add a BND prefix before the indirect branch at the end of
_dl_runtime_resolve to preserve bound registers.
[BZ #18134]
* sysdeps/x86_64/dl-trampoline.S (PRESERVE_BND_REGS_PREFIX): New.
(_dl_runtime_resolve): Add a BND prefix before indirect branch.
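
At the byte level, the BND prefix is the single byte 0xF2 placed in front of the branch. The sketch below shows that with a hypothetical helper; the sample bytes in the usage note are the standard encoding of jmp *%r11, used here only as an illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 0xF2 is the BND prefix when it precedes a branch instruction. */
#define BND_PREFIX 0xF2u

/* Copy an already-encoded branch into out, optionally BND-prefixed.
   Returns the number of bytes written. */
static size_t emit_branch(uint8_t *out, const uint8_t *branch, size_t len,
                          int preserve_bounds)
{
    size_t n = 0;

    if (preserve_bounds)
        out[n++] = BND_PREFIX;
    memcpy(out + n, branch, len);
    return n + len;
}
```

For example, passing the three bytes 41 FF E3 (jmp *%r11) with preserve_bounds set yields F2 41 FF E3.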
| |
This patch saves and restores bound registers in x86-64 PLT for
ld.so profile and LD_AUDIT:
* sysdeps/x86_64/bits/link.h (La_x86_64_regs): Add lr_bnd.
(La_x86_64_retval): Add lrv_bnd0 and lrv_bnd1.
* sysdeps/x86_64/dl-trampoline.S (_dl_runtime_profile): Save
Intel MPX bound registers before _dl_profile_fixup.
* sysdeps/x86_64/dl-trampoline.h: Restore Intel MPX bound
registers after _dl_profile_fixup. Save and restore bound
registers bnd0/bnd1 when calling _dl_call_pltexit.
* sysdeps/x86_64/link-defines.sym (BND_SIZE): New.
(LR_BND_OFFSET): Likewise.
(LRV_BND0_OFFSET): Likewise.
(LRV_BND1_OFFSET): Likewise.
| |
This patch saves and restores bound registers in symbol lookup for x86-64:
1. Branches without BND prefix clear bound registers.
2. x86-64 passes bounds in bound registers as specified in the MPX psABI
extension on hjl/mpx/master branch at
https://github.com/hjl-tools/x86-64-psABI
https://groups.google.com/forum/#!topic/x86-64-abi/KFsB0XTgWYc
Binutils has been updated to create an alternate PLT to add BND prefix
when branching to ld.so.
* config.h.in (HAVE_MPX_SUPPORT): New #undef.
* sysdeps/x86_64/configure.ac: Set HAVE_MPX_SUPPORT.
* sysdeps/x86_64/configure: Regenerated.
* sysdeps/x86_64/dl-trampoline.S (REGISTER_SAVE_AREA): New
macro.
(REGISTER_SAVE_RAX): Likewise.
(REGISTER_SAVE_RCX): Likewise.
(REGISTER_SAVE_RDX): Likewise.
(REGISTER_SAVE_RSI): Likewise.
(REGISTER_SAVE_RDI): Likewise.
(REGISTER_SAVE_R8): Likewise.
(REGISTER_SAVE_R9): Likewise.
(REGISTER_SAVE_BND0): Likewise.
(REGISTER_SAVE_BND1): Likewise.
(REGISTER_SAVE_BND2): Likewise.
(_dl_runtime_resolve): Use them. Save and restore Intel MPX
bound registers when calling _dl_fixup.
| |
AVX-512 ISA adds 512-bit zmm registers. This patch updates
_dl_runtime_profile to pass zmm registers to run-time audit. It also
changes _dl_x86_64_save_sse and _dl_x86_64_restore_sse to support zmm
registers, which are called only when RTLD_PREPARE_FOREIGN_CALL
is used. Its performance impact is minimal.
* config.h.in (HAVE_AVX512_SUPPORT): New #undef.
(HAVE_AVX512_ASM_SUPPORT): Likewise.
* sysdeps/x86_64/bits/link.h (La_x86_64_zmm): New.
(La_x86_64_vector): Add zmm.
* sysdeps/x86_64/Makefile (tests): Add tst-audit10.
(modules-names): Add tst-auditmod10a and tst-auditmod10b.
($(objpfx)tst-audit10): New target.
($(objpfx)tst-audit10.out): Likewise.
(tst-audit10-ENV): New.
(AVX512-CFLAGS): Likewise.
(CFLAGS-tst-audit10.c): Likewise.
(CFLAGS-tst-auditmod10a.c): Likewise.
(CFLAGS-tst-auditmod10b.c): Likewise.
* sysdeps/x86_64/configure.ac: Set config-cflags-avx512,
HAVE_AVX512_SUPPORT and HAVE_AVX512_ASM_SUPPORT.
* sysdeps/x86_64/configure: Regenerated.
* sysdeps/x86_64/dl-trampoline.S (_dl_runtime_profile): Add
AVX-512 zmm register support.
(_dl_x86_64_save_sse): Likewise.
(_dl_x86_64_restore_sse): Likewise.
* sysdeps/x86_64/dl-trampoline.h: Updated to support different
size vector registers.
* sysdeps/x86_64/link-defines.sym (YMM_SIZE): New.
(ZMM_SIZE): Likewise.
* sysdeps/x86_64/tst-audit10.c: New file.
* sysdeps/x86_64/tst-auditmod10a.c: Likewise.
* sysdeps/x86_64/tst-auditmod10b.c: Likewise.
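
One way to picture "different size vector registers" sharing a slot is a union sized for the widest register. The sketch below is an illustration only: the type name is hypothetical and the byte-array members stand in for whatever element types the real La_x86_64_vector uses.

```c
#include <assert.h>
#include <stdint.h>

/* Register widths in bytes. */
enum { XMM_SIZE = 16, YMM_SIZE = 32, ZMM_SIZE = 64 };

/* One audit slot sized for the widest register, viewable at each width.
   The low bytes of all three views overlap. */
typedef union {
    uint8_t xmm[XMM_SIZE];
    uint8_t ymm[YMM_SIZE];
    uint8_t zmm[ZMM_SIZE];
} vector_slot;
```

An audit module built for a narrower width can then read the low part of the same slot without knowing about zmm.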
| |
The AVX bit is set if the CPU supports AVX. But this doesn't mean the
kernel does. Add checks according to Intel's documentation.
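
The check can be sketched as a pure function over values that would be read with CPUID (leaf 1, ECX) and XGETBV (XCR0). The function and macro names are hypothetical; the bit positions are the ones given in Intel's documentation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CPUID1_ECX_OSXSAVE (1u << 27)  /* OS has enabled XSAVE/XGETBV */
#define CPUID1_ECX_AVX     (1u << 28)  /* CPU implements AVX */
#define XCR0_SSE_STATE     (1u << 1)   /* OS saves xmm state */
#define XCR0_AVX_STATE     (1u << 2)   /* OS saves upper ymm state */

/* AVX is usable only if the CPU has it, the OS enabled XSAVE, and the OS
   manages both the SSE and AVX register state. */
static bool avx_usable(uint32_t cpuid1_ecx, uint64_t xcr0)
{
    const uint32_t need = CPUID1_ECX_OSXSAVE | CPUID1_ECX_AVX;
    const uint64_t state = XCR0_SSE_STATE | XCR0_AVX_STATE;

    if ((cpuid1_ecx & need) != need)
        return false;
    return (xcr0 & state) == state;
}
```

Testing the AVX bit alone corresponds to the first half of this check; the XCR0 test is what covers the kernel side.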
| |
If a signal arrived during a symbol lookup and the signal handler also
required a symbol lookup, the end of the lookup in the signal handler reset
the flag indicating whether AVX/SSE registers need to be restored. In this
case, resetting means that the tail of the outer lookup code will try to
restore the registers, and this can fail miserably. We now restore the flag
to its previous value, which makes nested calls possible.
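
The save-and-restore pattern can be sketched as below. The flag, its meaning, and the function names are hypothetical; the nested call stands in for a lookup made from a signal handler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-thread state flag touched by every lookup. */
static int lookup_flag;

/* Records what the outer lookup sees after the nested one returns. */
static int seen_after_nested;

/* inner is an optional nested lookup, standing in for a signal handler. */
static void lookup(void (*inner)(void))
{
    int previous = lookup_flag;   /* remember the caller's state */

    lookup_flag = 1;
    if (inner != NULL)
        inner();
    lookup_flag = previous;       /* restore, rather than reset to 0 */
}

static void outer_body(void)
{
    lookup(NULL);                 /* nested lookup */
    seen_after_nested = lookup_flag;
}
```

After lookup(outer_body), the outer lookup still sees its own flag value once the nested lookup has returned; resetting to zero on exit would have clobbered it.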
| |
This patch fixes mixed SSE/AVX audit and checks AVX only once in
_dl_runtime_profile. When an AVX or SSE register value in pltenter is
modified, we have to make sure that the SSE part value is the same in both
lr_xmm and lr_vector fields so that pltexit will get the correct value
from either lr_xmm or lr_vector fields. AVX-enabled pltenter should
update both lr_xmm and lr_vector fields to support stacked AVX/SSE
pltenter functions.
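
A simplified sketch of keeping the two views consistent follows. It assumes a slot with a 16-byte SSE view and a 32-byte AVX view, and a copy of the SSE value taken before the hook ran; the struct and function names are hypothetical, and the actual trampoline logic may differ in detail.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint8_t lr_xmm[16];     /* SSE view of the register */
    uint8_t lr_vector[32];  /* AVX view; low 16 bytes mirror lr_xmm */
} reg_slot;

/* After a pltenter hook ran: if it changed lr_xmm (compared with the
   value saved before the hook), copy that into the low half of
   lr_vector.  Otherwise treat lr_vector as authoritative and refresh
   lr_xmm from it.  Either way the SSE part ends up identical. */
static void sync_slot(reg_slot *slot, const uint8_t saved_xmm[16])
{
    if (memcmp(slot->lr_xmm, saved_xmm, 16) != 0)
        memcpy(slot->lr_vector, slot->lr_xmm, 16);
    else
        memcpy(slot->lr_xmm, slot->lr_vector, 16);
}
```

With both fields in agreement, a later hook or pltexit can read either one and get the same SSE value.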
| |
tst-audit4 and tst-audit5 fail under AVX emulator due to je instead of
jne. This patch fixes them.
| |
SSE registers are used for passing parameters and must be preserved
in runtime relocations. Inside ld.so this is enforced through the
tests in tst-xmmymm.sh. But the malloc routines used after startup
come from libc.so and can be arbitrarily complex. It's overkill
to save the SSE registers all the time because of that. These calls
are rare. Instead we save them on demand. The new infrastructure
put in place in this patch makes this possible and efficient.
| |
The patch mainly reduces the code size but also avoids some jumps.
| |
Don't use AVX instructions too often.
| |
The original AVX patch used a function pointer to handle the difference
between machines with and without AVX support. This is insecure. A
well-placed memory exploit could lead to redirection of the execution.
Using a variable and several tests is a bit slower but cannot be
exploited in this way.
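
The trade-off can be sketched as follows, with a hypothetical flag and return values. Branching on plain data means a corrupted value can only choose between the fixed code paths, whereas a corrupted function pointer could send execution anywhere.

```c
#include <assert.h>

/* Hypothetical state: 0 = not probed, 1 = AVX available, -1 = SSE only. */
static int have_avx;

/* Returns how many bytes per register would be saved.  The test is
   repeated on every call, which costs a branch but leaves no writable
   code address for an attacker to overwrite. */
static int save_width(void)
{
    if (have_avx > 0)
        return 32;   /* ymm-sized save */
    return 16;       /* xmm-sized save */
}
```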
| |
* inet/inet6_rth.c (inet6_rth_add): Add some error checking.
Patch mostly by Yang Hongyang <yanghy@cn.fujitsu.com>.
* inet/Makefile (tests): Add tst-inet6_rth.
* inet/tst-inet6_rth.c: New file.
alignment of La_x86_64_regs. Store xmm parameters.
| |
(reloc_index): Define.
(_dl_fixup): Rename reloc_offset parameter to reloc_arg.
(_dl_fixup_profile): Likewise. Use reloc_index instead of
computing index from reloc_offset.
(_dl_call_pltexit): Likewise.
* sysdeps/x86_64/dl-trampoline.S (_dl_runtime_resolve): Just pass
the relocation index to _dl_fixup.
(_dl_runtime_profile): Likewise for _dl_fixup_profile and
_dl_call_pltexit.
* sysdeps/x86_64/dl-runtime.c: New file.
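
Since x86-64 PLT stubs push a relocation index rather than a byte offset, the offset into the PLT relocation table has to be derived from it. A minimal sketch, using a locally defined struct with the ELF64 RELA layout and a hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout of an ELF64 RELA entry: three 8-byte fields, 24 bytes. */
typedef struct {
    uint64_t r_offset;
    uint64_t r_info;
    int64_t  r_addend;
} rela64;

/* Convert the index passed by the trampoline into a byte offset. */
static size_t reloc_offset_from_index(size_t reloc_index)
{
    return reloc_index * sizeof(rela64);
}
```

Code that needs the index directly, such as the profiling path, can then use the argument as-is instead of dividing an offset.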
| |
* sysdeps/x86_64/dl-trampoline.S (_dl_runtime_profile): Fix
alignment of La_x86_64_regs. Store xmm parameters.
Patch mostly by Jiri Olsa <olsajiri@gmail.com>.
| |
stack is properly aligned for the target function.
Correct unwind info.
| |
align stack for call if pltexit is to be used.
| |
consider_profiling always to zero. Don't count on compiler to
remove unreached if block.
* sysdeps/x86_64/dl-trampoline.S [PROF] (_dl_runtime_profile):
Don't compile.
* sysdeps/i386/dl-trampoline.S [PROF] (_dl_runtime_profile): Likewise.
* sysdeps/ia64/dl-trampoline.S [PROF] (_dl_runtime_profile): Likewise.
* sysdeps/s390/s390-64/dl-trampoline.S [PROF] (_dl_runtime_profile):
Likewise.
* sysdeps/s390/s390-32/dl-trampoline.S [PROF] (_dl_runtime_profile):
Likewise.
* sysdeps/powerpc/powerpc64/dl-trampoline.S [PROF]
(_dl_profile_resolve): Likewise.
* sysdeps/powerpc/powerpc32/dl-trampoline.S [PROF]
(_dl_profile_resolve): Likewise.
* gmon/Makefile: Add rules to build and run tst-profile-static.
* gmon/tst-profile-static.c: New file.
* Makeconfig (+link-static): Allow passing program-specific flags.
| |
* sysdeps/generic/libc-start.c: Don't register program destructor here.
* dlfcn/Makefile: Add rules to build dlfcn.c.
(LDFLAGS-dl.so): Removed.
* dlfcn/dlclose.c: _dl_close is now in ld.so, use function pointer
table.
* dlfcn/dlmopen.c: Likewise for _dl_open.
* dlfcn/dlopen.c: Likewise.
* dlfcn/dlopenold.c: Likewise.
* elf/dl-libc.c: Likewise for _dl_open and _dl_close.
* elf/Makefile (routines): Remove dl-open and dl-close.
(dl-routines): Add dl-open, dl-close, and dl-trampoline.
Add rules to build and run tst-audit1.
* elf/tst-audit1.c: New file.
* elf/tst-auditmod1.c: New file.
* elf/Versions [libc]: Remove _dl_open and _dl_close.
* elf/dl-close.c: Change for use inside ld.so instead of libc.so.
* elf/dl-open.c: Likewise.
* elf/dl-debug.c (_dl_debug_initialize): Allow reinitialization,
signaled by nonzero parameter.
* elf/dl-init.c: Fix use of r_state.
* elf/dl-load.c: Likewise.
* elf/dl-close.c: Add auditing checkpoints.
* elf/dl-open.c: Likewise.
* elf/dl-fini.c: Likewise.
* elf/dl-load.c: Likewise.
* elf/dl-sym.c: Likewise.
* sysdeps/generic/libc-start.c: Likewise.
* elf/dl-object.c: Allocate memory for auditing information.
* elf/dl-reloc.c: Remove RESOLV. We now always need the map.
Correctly initialize slotinfo.
* elf/dynamic-link.h: Adjust after removal of RESOLV.
* sysdeps/hppa/dl-lookupcfg.h: Likewise.
* sysdeps/ia64/dl-lookupcfg.h: Likewise.
* sysdeps/powerpc/powerpc64/dl-lookupcfg.h: Removed.
* elf/dl-runtime.c (_dl_fixup): Little cleanup.
(_dl_profile_fixup): New parameters to point to register struct and
variable for frame size.
Add auditing checkpoints.
(_dl_call_pltexit): New function.
Don't define trampoline code here.
* elf/rtld.c: Recognize LD_AUDIT. Load modules on startup.
Remove all the functions from _rtld_global_ro which only _dl_open
and _dl_close needed.
Add auditing checkpoints.
* elf/link.h: Define symbols for auditing interfaces.
* include/link.h: Likewise.
* include/dlfcn.h: Define __RTLD_AUDIT.
Remove prototypes for _dl_open and _dl_close.
Adjust access to argc and argv in libdl.
* dlfcn/dlfcn.c: New file.
* sysdeps/generic/dl-lookupcfg.h: Remove all content now that RESOLVE
is gone.
* sysdeps/generic/ldsodefs.h: Add definitions for auditing interfaces.
* sysdeps/generic/unsecvars.h: Add LD_AUDIT.
* sysdeps/i386/dl-machine.h: Remove trampoline code here.
Adjust for removal of RESOLVE.
* sysdeps/x86_64/dl-machine.h: Likewise.
* sysdeps/generic/dl-trampoline.c: New file.
* sysdeps/i386/dl-trampoline.c: New file.
* sysdeps/x86_64/dl-trampoline.c: New file.
* sysdeps/generic/dl-tls.c: Cleanups. Fixup for dtv_t change.
Fix updating of DTV.
* sysdeps/generic/libc-tls.c: Likewise.
* sysdeps/arm/bits/link.h: Renamed to ...
* sysdeps/arm/bits/linkmap.h: ...this.
* sysdeps/generic/bits/link.h: Renamed to...
* sysdeps/generic/bits/linkmap.h: ...this.
* sysdeps/hppa/bits/link.h: Renamed to...
* sysdeps/hppa/bits/linkmap.h: ...this.
* sysdeps/hppa/i386/link.h: Renamed to...
* sysdeps/hppa/i386/linkmap.h: ...this.
* sysdeps/hppa/ia64/link.h: Renamed to...
* sysdeps/hppa/ia64/linkmap.h: ...this.
* sysdeps/hppa/s390/link.h: Renamed to...
* sysdeps/hppa/s390/linkmap.h: ...this.
* sysdeps/hppa/sh/link.h: Renamed to...
* sysdeps/hppa/sh/linkmap.h: ...this.
* sysdeps/hppa/x86_64/link.h: Renamed to...
* sysdeps/hppa/x86_64/linkmap.h: ...this.
2005-01-06 Ulrich Drepper <drepper@redhat.com>
* allocatestack.c (init_one_static_tls): Adjust initialization of DTV
entry for static tls deallocation fix.
* sysdeps/alpha/tls.h (dtv_t): Change pointer type to be struct which
also contains information whether the memory pointed to is static
TLS or not.
* sysdeps/i386/tls.h: Likewise.
* sysdeps/ia64/tls.h: Likewise.
* sysdeps/powerpc/tls.h: Likewise.
* sysdeps/s390/tls.h: Likewise.
* sysdeps/sh/tls.h: Likewise.
* sysdeps/sparc/tls.h: Likewise.
* sysdeps/x86_64/tls.h: Likewise.
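
The dtv_t change described above can be pictured as replacing a bare pointer with a small struct that also records whether the block lives in static TLS. The sketch below uses hypothetical type and function names and shows why the extra flag matters at deallocation time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef union {
    size_t counter;          /* generation counter kept in the first slot */
    struct {
        void *val;           /* the module's TLS block */
        bool is_static;      /* true: part of static TLS, must not be freed */
    } pointer;
} dtv_entry;

/* Release an entry's block only if it was dynamically allocated.
   Returns 1 if memory was freed, 0 otherwise. */
static int release_entry(dtv_entry *e)
{
    int freed = 0;

    if (!e->pointer.is_static && e->pointer.val != NULL) {
        free(e->pointer.val);
        freed = 1;
    }
    e->pointer.val = NULL;
    return freed;
}
```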