path: root/sysdeps/x86_64/dl-trampoline.S
Commit message | Author | Age | Files | Lines
* Save and restore vector registers in x86-64 ld.so | H.J. Lu | 2015-08-25 | 1 | -389/+73
    This patch adds SSE, AVX and AVX512 versions of _dl_runtime_resolve and _dl_runtime_profile, which save and restore the first 8 vector registers used for parameter passing. elf_machine_runtime_setup selects the proper _dl_runtime_resolve or _dl_runtime_profile based on _dl_x86_cpu_features. It avoids the race condition caused by the FOREIGN_CALL macros, which are only used for x86-64. The performance impact of saving and restoring 8 vector registers is negligible on Nehalem, Sandy Bridge, Ivy Bridge and Haswell when ld.so is optimized with SSE2.
    [BZ #15128]
    * sysdeps/x86_64/Makefile [$(subdir) == elf] (tests): Add ifuncmain8.
    (modules-names): Add ifuncmod8.
    ($(objpfx)ifuncmain8): New rule.
    * sysdeps/x86_64/dl-machine.h: Include <dl-procinfo.h> and <cpuid.h>.
    (elf_machine_runtime_setup): Use _dl_runtime_resolve_sse, _dl_runtime_resolve_avx, or _dl_runtime_resolve_avx512, _dl_runtime_profile_sse, _dl_runtime_profile_avx, or _dl_runtime_profile_avx512, based on HAS_ARCH_FEATURE.
    * sysdeps/x86_64/dl-trampoline.S: Rewrite.
    * sysdeps/x86_64/dl-trampoline.h: Likewise.
    * sysdeps/x86_64/ifuncmain8.c: New file.
    * sysdeps/x86_64/ifuncmod8.c: Likewise.
    * sysdeps/x86_64/nptl/tcb-offsets.sym (RTLD_SAVESPACE_SSE): Removed.
    * sysdeps/x86_64/nptl/tls.h (__128bits): Removed.
    (tcbhead_t): Change rtld_must_xmm_save to __glibc_unused1. Change rtld_savespace_sse to __glibc_unused2.
    (RTLD_CHECK_FOREIGN_CALL): Removed.
    (RTLD_ENABLE_FOREIGN_CALL): Likewise.
    (RTLD_PREPARE_FOREIGN_CALL): Likewise.
    (RTLD_FINALIZE_FOREIGN_CALL): Likewise.
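The selection described in this entry can be pictured with a small C sketch. It assumes a hypothetical `cpu_features_sketch` struct with two flags in place of the real `_dl_x86_cpu_features` data, and it returns the variant's name as a string instead of a code address. Only the three `_dl_runtime_resolve_*` names come from the ChangeLog; everything else is illustrative.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the cached CPU feature data; the real
   information lives in _dl_x86_cpu_features.  */
struct cpu_features_sketch {
    bool avx512f_usable;   /* assumed flag: zmm state usable */
    bool avx_usable;       /* assumed flag: ymm state usable */
};

/* Pick the resolver variant that saves the widest vector registers
   available, falling back to the SSE version.  */
static const char *
select_runtime_resolve_sketch(const struct cpu_features_sketch *f)
{
    if (f->avx512f_usable)
        return "_dl_runtime_resolve_avx512";
    if (f->avx_usable)
        return "_dl_runtime_resolve_avx";
    return "_dl_runtime_resolve_sse";
}
```

Because the choice is made once at setup time, the trampoline itself needs no per-call feature test in this model.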
* Improve bndmov encoding with zero displacement | H.J. Lu | 2015-07-09 | 1 | -0/+8
    If the x86-64 assembler doesn't support MPX, we encode the bndmov instruction by hand. When the displacement is zero, the assembler generates a shorter encoding. This patch improves the bndmov encoding with zero displacement so that ld.so is identical when using assemblers with and without MPX support.
    * sysdeps/x86_64/dl-trampoline.S (_dl_runtime_resolve): Improve bndmov encoding with zero displacement.
* Preserve bound registers for pointer pass/return | Igor Zamyatin | 2015-07-09 | 1 | -2/+2
    We need to save/restore bound registers and add a BND prefix before branches in _dl_runtime_profile so that bound registers for pointer pass and return are preserved when LD_AUDIT is used.
    [BZ #18134]
    * sysdeps/i386/configure.ac: Set HAVE_MPX_SUPPORT.
    * sysdeps/i386/configure: Regenerated.
    * sysdeps/i386/dl-trampoline.S (PRESERVE_BND_REGS_PREFIX): New.
    (_dl_runtime_profile): Save and restore Intel MPX return bound registers when calling _dl_call_pltexit. Add PRESERVE_BND_REGS_PREFIX before return.
    * sysdeps/i386/link-defines.sym (LRV_BND0_OFFSET): New.
    (LRV_BND1_OFFSET): Likewise.
    * sysdeps/x86/bits/link.h (La_i86_retval): Add lrv_bnd0 and lrv_bnd1.
    * sysdeps/x86_64/dl-trampoline.S (_dl_runtime_profile): Fix typo in bndmov encoding.
    * sysdeps/x86_64/dl-trampoline.h: Properly save and restore Intel MPX bound registers. Add PRESERVE_BND_REGS_PREFIX before branch instructions to preserve bounds.
* Preserve bound registers in _dl_runtime_resolve | H.J. Lu | 2015-03-16 | 1 | -0/+8
    We need to add a BND prefix before the indirect branch at the end of _dl_runtime_resolve to preserve bound registers.
    [BZ #18134]
    * sysdeps/x86_64/dl-trampoline.S (PRESERVE_BND_REGS_PREFIX): New.
    (_dl_runtime_resolve): Add a BND prefix before indirect branch.
* Update copyright dates with scripts/update-copyrights. | Joseph Myers | 2015-01-02 | 1 | -1/+1
* Save/restore bound registers for _dl_runtime_profile | Igor Zamyatin | 2014-04-16 | 1 | -0/+14
    This patch saves and restores bound registers in x86-64 PLT for ld.so profile and LD_AUDIT:
    * sysdeps/x86_64/bits/link.h (La_x86_64_regs): Add lr_bnd.
    (La_x86_64_retval): Add lrv_bnd0 and lrv_bnd1.
    * sysdeps/x86_64/dl-trampoline.S (_dl_runtime_profile): Save Intel MPX bound registers before _dl_profile_fixup.
    * sysdeps/x86_64/dl-trampoline.h: Restore Intel MPX bound registers after _dl_profile_fixup. Save and restore bound registers bnd0/bnd1 when calling _dl_call_pltexit.
    * sysdeps/x86_64/link-defines.sym (BND_SIZE): New.
    (LR_BND_OFFSET): Likewise.
    (LRV_BND0_OFFSET): Likewise.
    (LRV_BND1_OFFSET): Likewise.
* Save/restore bound registers in _dl_runtime_resolve | Igor Zamyatin | 2014-04-09 | 1 | -20/+79
    This patch saves and restores bound registers in symbol lookup for x86-64:
    1. Branches without BND prefix clear bound registers.
    2. x86-64 passes bounds in bound registers as specified in the MPX psABI extension on the hjl/mpx/master branch at
       https://github.com/hjl-tools/x86-64-psABI
       https://groups.google.com/forum/#!topic/x86-64-abi/KFsB0XTgWYc
    Binutils has been updated to create an alternate PLT to add a BND prefix when branching to ld.so.
    * config.h.in (HAVE_MPX_SUPPORT): New #undef.
    * sysdeps/x86_64/configure.ac: Set HAVE_MPX_SUPPORT.
    * sysdeps/x86_64/configure: Regenerated.
    * sysdeps/x86_64/dl-trampoline.S (REGISTER_SAVE_AREA): New macro.
    (REGISTER_SAVE_RAX): Likewise.
    (REGISTER_SAVE_RCX): Likewise.
    (REGISTER_SAVE_RDX): Likewise.
    (REGISTER_SAVE_RSI): Likewise.
    (REGISTER_SAVE_RDI): Likewise.
    (REGISTER_SAVE_R8): Likewise.
    (REGISTER_SAVE_R9): Likewise.
    (REGISTER_SAVE_BND0): Likewise.
    (REGISTER_SAVE_BND1): Likewise.
    (REGISTER_SAVE_BND2): Likewise.
    (_dl_runtime_resolve): Use them. Save and restore Intel MPX bound registers when calling _dl_fixup.
* Save and restore AVX-512 zmm registers to x86-64 ld.so | Igor Zamyatin | 2014-03-13 | 1 | -18/+104
    AVX-512 ISA adds 512-bit zmm registers. This patch updates _dl_runtime_profile to pass zmm registers to run-time audit. It also changes _dl_x86_64_save_sse and _dl_x86_64_restore_sse to support zmm registers; these are called only when RTLD_PREPARE_FOREIGN_CALL is used. Its performance impact is minimal.
    * config.h.in (HAVE_AVX512_SUPPORT): New #undef.
    (HAVE_AVX512_ASM_SUPPORT): Likewise.
    * sysdeps/x86_64/bits/link.h (La_x86_64_zmm): New.
    (La_x86_64_vector): Add zmm.
    * sysdeps/x86_64/Makefile (tests): Add tst-audit10.
    (modules-names): Add tst-auditmod10a and tst-auditmod10b.
    ($(objpfx)tst-audit10): New target.
    ($(objpfx)tst-audit10.out): Likewise.
    (tst-audit10-ENV): New.
    (AVX512-CFLAGS): Likewise.
    (CFLAGS-tst-audit10.c): Likewise.
    (CFLAGS-tst-auditmod10a.c): Likewise.
    (CFLAGS-tst-auditmod10b.c): Likewise.
    * sysdeps/x86_64/configure.ac: Set config-cflags-avx512, HAVE_AVX512_SUPPORT and HAVE_AVX512_ASM_SUPPORT.
    * sysdeps/x86_64/configure: Regenerated.
    * sysdeps/x86_64/dl-trampoline.S (_dl_runtime_profile): Add AVX-512 zmm register support.
    (_dl_x86_64_save_sse): Likewise.
    (_dl_x86_64_restore_sse): Likewise.
    * sysdeps/x86_64/dl-trampoline.h: Updated to support different size vector registers.
    * sysdeps/x86_64/link-defines.sym (YMM_SIZE): New.
    (ZMM_SIZE): Likewise.
    * sysdeps/x86_64/tst-audit10.c: New file.
    * sysdeps/x86_64/tst-auditmod10a.c: Likewise.
    * sysdeps/x86_64/tst-auditmod10b.c: Likewise.
* Update copyright notices with scripts/update-copyrights | Allan McRae | 2014-01-01 | 1 | -1/+1
* Fix typos. | Ondřej Bílka | 2013-08-30 | 1 | -1/+1
* Update copyright notices with scripts/update-copyrights. | Joseph Myers | 2013-01-02 | 1 | -1/+1
* Check if RTLD_SAVESPACE_SSE is aligned to 32 bytes | H.J. Lu | 2012-05-11 | 1 | -0/+4
* Replace FSF snail mail address with URLs. | Paul Eggert | 2012-02-09 | 1 | -3/+2
* Simplify AVX check | H.J. Lu | 2011-09-07 | 1 | -4/+1
* Fix minor CFI problem in regular x86-64 trampoline | Ulrich Drepper | 2011-08-20 | 1 | -1/+2
* Fix CFI info in x86-64 trampolines for non-AVX code | Ulrich Drepper | 2011-08-20 | 1 | -2/+3
* One more typo in AVX test | Ulrich Drepper | 2011-07-23 | 1 | -2/+2
* One more change to XSAVE patch | Ulrich Drepper | 2011-07-22 | 1 | -2/+4
* Fix AVX check | Andreas Schwab | 2011-07-22 | 1 | -6/+15
* Fix check for AVX enablement | Ulrich Drepper | 2011-07-20 | 1 | -5/+12
    The AVX bit is set if the CPU supports AVX. But this doesn't mean the kernel does. Add checks according to Intel's documentation.
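A minimal C sketch of the kind of check this entry refers to, following the detection sequence in Intel's documentation: the CPU must report AVX, the OS must have enabled XSAVE (OSXSAVE), and XCR0 must show that the OS saves both SSE and AVX state. The function name and the choice to pass the raw register values as parameters are assumptions made so the logic can be read and tested in isolation; it is not the code from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define CPUID1_ECX_OSXSAVE (1u << 27)  /* OS has enabled XSAVE/XGETBV */
#define CPUID1_ECX_AVX     (1u << 28)  /* CPU implements AVX */
#define XCR0_SSE_STATE     (1u << 1)   /* OS saves xmm state */
#define XCR0_AVX_STATE     (1u << 2)   /* OS saves upper ymm state */

/* cpuid1_ecx: ECX from CPUID leaf 1.
   xcr0: value of XCR0, only meaningful when OSXSAVE is set.  */
static bool
avx_usable_sketch(uint32_t cpuid1_ecx, uint64_t xcr0)
{
    const uint32_t need_cpu = CPUID1_ECX_OSXSAVE | CPUID1_ECX_AVX;
    const uint64_t need_os = XCR0_SSE_STATE | XCR0_AVX_STATE;

    if ((cpuid1_ecx & need_cpu) != need_cpu)
        return false;                  /* CPU or OS lacks the basics */
    return (xcr0 & need_os) == need_os;
}
```

On real hardware the first argument would come from CPUID and the second from XGETBV with ECX = 0; XGETBV must only be executed once OSXSAVE is known to be set.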
* Handle AVX saving on x86-64 in interrupted symbol lookups. | Ulrich Drepper | 2009-08-25 | 1 | -1/+0
    If a signal arrived during a symbol lookup and the signal handler also required a symbol lookup, the end of the lookup in the signal handler reset the flag that records whether restoring AVX/SSE registers is needed. Resetting means in this case that the tail part of the outer lookup code will try to restore the registers and this can fail miserably. We now restore the flag to its previous value, which makes nested calls possible.
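The save-previous-then-restore pattern this entry describes can be sketched in C. The flag name, its polarity (nonzero meaning "registers not yet saved") and the recursive call that stands in for a signal handler are all assumptions for illustration; the real code is assembly and macros in ld.so.

```c
/* Hypothetical per-thread flag: nonzero while the vector registers
   have NOT been saved, zero once they sit in the save area.  */
static int must_save_sketch;

/* Model one symbol lookup.  nested_signal != 0 simulates a signal
   handler that performs its own lookup in the middle of this one.
   Returns nonzero if the tail of this lookup would restore registers.  */
static int
lookup_sketch(int nested_signal)
{
    int previous = must_save_sketch;   /* remember the outer state */
    must_save_sketch = 1;

    if (nested_signal)
        lookup_sketch(0);              /* inner lookup runs to completion */

    /* Tail: restore only if something was actually saved.  */
    int would_restore = (must_save_sketch == 0);

    must_save_sketch = previous;       /* put back, do not reset to 0 */
    return would_restore;
}
```

If the last assignment wrote 0 instead of `previous`, the outer lookup's tail would see 0 after the nested call and attempt a restore from a save area that was never filled.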
* Support mixed SSE/AVX audit and check AVX only once. | H.J. Lu | 2009-08-08 | 1 | -237/+7
    This patch fixes mixed SSE/AVX audit and checks AVX only once in _dl_runtime_profile. When an AVX or SSE register value in pltenter is modified, we have to make sure that the SSE part value is the same in both lr_xmm and lr_vector fields so that pltexit will get the correct value from either lr_xmm or lr_vector fields. AVX-enabled pltenter should update both lr_xmm and lr_vector fields to support stacked AVX/SSE pltenter functions.
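One way an AVX-aware pltenter hook could keep the two views consistent is sketched below. The `regs_sketch` layout is a simplified, hypothetical stand-in for La_x86_64_regs (only the lr_xmm and lr_vector field names come from the entry), and registers are modelled as plain byte arrays.

```c
#include <stdint.h>
#include <string.h>

typedef struct { uint8_t b[16]; } xmm_sketch;   /* 128-bit register image */
typedef struct { uint8_t b[32]; } ymm_sketch;   /* 256-bit register image */

/* Simplified stand-in for the register block passed to pltenter.  */
struct regs_sketch {
    xmm_sketch lr_xmm[8];
    ymm_sketch lr_vector[8];
};

/* Overwrite vector argument i and mirror its low 128 bits into
   lr_xmm[i], so SSE-only and AVX-aware readers agree.  On x86 the
   low half of a ymm image is its first 16 bytes in memory.  */
static void
set_vector_arg_sketch(struct regs_sketch *r, int i, const ymm_sketch *v)
{
    r->lr_vector[i] = *v;
    memcpy(r->lr_xmm[i].b, v->b, sizeof r->lr_xmm[i].b);
}
```

A hook that wrote only one of the two fields would leave a later SSE-only or AVX-aware hook in the same chain looking at a stale value.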
* Improve CFI in x86-64 ld.so trampoline code. | Ulrich Drepper | 2009-07-29 | 1 | -1/+2
* Properly restore AVX registers on x86-64. | H.J. Lu | 2009-07-29 | 1 | -10/+10
    tst-audit4 and tst-audit5 fail under AVX emulator due to je instead of jne. This patch fixes them.
* Preserve SSE registers in runtime relocations on x86-64. | Ulrich Drepper | 2009-07-29 | 1 | -0/+82
    SSE registers are used for passing parameters and must be preserved in runtime relocations. Inside ld.so this is enforced through the tests in tst-xmmymm.sh. But the malloc routines used after startup come from libc.so and can be arbitrarily complex. It's overkill to save the SSE registers all the time because of that. These calls are rare. Instead we save them on demand. The new infrastructure put in place in this patch makes this possible and efficient.
* Optimize restoring of ymm registers on x86-64. | Ulrich Drepper | 2009-07-16 | 1 | -43/+34
    The patch mainly reduces the code size but also avoids some jumps.
* Fix thinko in AVX audit patch. | Ulrich Drepper | 2009-07-15 | 1 | -20/+4
    Don't use AVX instructions too often.
* Fix typo in last change. | Ulrich Drepper | 2009-07-15 | 1 | -1/+1
* Secure AVX changes for auditing code. | Ulrich Drepper | 2009-07-15 | 1 | -32/+295
    The original AVX patch used a function pointer to handle the difference between machines with and without AVX support. This is insecure. A well-placed memory exploit could lead to redirection of the execution. Using a variable and several tests is a bit slower but cannot be exploited in this way.
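The trade-off in this entry can be illustrated with a short C sketch that dispatches on a cached integer instead of calling through a writable function pointer. The variable name, its three states and the string return values are assumptions chosen for illustration.

```c
/* Hypothetical cached probe result:
   0 = not probed yet, 1 = AVX usable, -1 = AVX not usable.  */
static int avx_state_sketch;

/* Decide which save path to take.  cpu_has_avx stands in for the
   result of the real hardware probe and is consulted only once.  */
static const char *
save_path_sketch(int cpu_has_avx)
{
    if (avx_state_sketch == 0)
        avx_state_sketch = cpu_has_avx ? 1 : -1;

    /* Compare-and-branch to fixed code.  Corrupting the variable can
       at worst select the other fixed path, never an arbitrary
       address as with an overwritten function pointer.  */
    if (avx_state_sketch > 0)
        return "ymm";
    return "xmm";
}
```

The cost is one extra load and compare per call, which matches the "a bit slower" remark in the commit message.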
* Add AVX support to ld.so auditing for x86-64. | H.J. Lu | 2009-07-10 | 1 | -124/+55
* Fix handling of xmm6 in ld.so audit hooks on x86-64. | H.J. Lu | 2009-07-02 | 1 | -2/+4
* [BZ #9881] | Ulrich Drepper | 2009-03-15 | 1 | -3/+3
    * inet/inet6_rth.c (inet6_rth_add): Add some error checking. Patch mostly by Yang Hongyang <yanghy@cn.fujitsu.com>.
    * inet/Makefile (tests): Add tst-inet6_rth.
    * inet/tst-inet6_rth.c: New file.
    alignment of La_x86_64_regs. Store xmm parameters.
* * elf/dl-runtime.c (reloc_offset): Define. | Ulrich Drepper | 2009-03-15 | 1 | -12/+0
    (reloc_index): Define.
    (_dl_fixup): Rename reloc_offset parameter to reloc_arg.
    (_dl_fixup_profile): Likewise. Use reloc_index instead of computing index from reloc_offset.
    (_dl_call_pltexit): Likewise.
    * sysdeps/x86_64/dl-trampoline.S (_dl_runtime_resolve): Just pass the relocation index to _dl_fixup.
    (_dl_runtime_profile): Likewise for _dl_fixup_profile and _dl_call_pltexit.
    * sysdeps/x86_64/dl-runtime.c: New file.
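The relationship between a relocation index and a relocation offset can be sketched in C. The struct below mirrors the three 64-bit fields of an Elf64_Rela entry, and the helper names are hypothetical; the sketch only shows why passing the index lets the C side derive either value.

```c
#include <stddef.h>
#include <stdint.h>

/* Same layout as an Elf64_Rela entry: three 64-bit fields, 24 bytes.  */
typedef struct {
    uint64_t r_offset;
    uint64_t r_info;
    int64_t  r_addend;
} rela_sketch;

/* Byte offset of entry reloc_index within the PLT relocation table.  */
static size_t
reloc_offset_sketch(size_t reloc_index)
{
    return reloc_index * sizeof(rela_sketch);
}

/* Look up the entry directly by index (hypothetical helper).  */
static const rela_sketch *
plt_reloc_sketch(const rela_sketch *jmprel, size_t reloc_index)
{
    return &jmprel[reloc_index];
}
```

With the index as the argument, the trampoline does not have to multiply by the entry size in assembly.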
* [BZ #9893] | Ulrich Drepper | 2009-03-14 | 1 | -99/+140
    * sysdeps/x86_64/dl-trampoline.S (_dl_runtime_profile): Fix alignment of La_x86_64_regs. Store xmm parameters. Patch mostly by Jiri Olsa <olsajiri@gmail.com>.
* * sysdeps/x86_64/dl-trampoline.S (_dl_runtime_profile): Make sure | Ulrich Drepper | 2007-10-31 | 1 | -26/+28
    stack is properly aligned for the target function. Correct unwind info.
* * sysdeps/x86_64/dl-trampoline.S (_dl_runtime_profile): Correctly | Ulrich Drepper | 2007-08-24 | 1 | -3/+4
    align stack for call if pltexit is to be used.
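The x86-64 calling convention expects the stack pointer to be 16-byte aligned at the point of a call, which is the property this entry restores. A minimal C sketch of the arithmetic follows; the helper names are hypothetical and the trampoline itself does this in assembly.

```c
#include <stdint.h>

/* Round a stack address down to a 16-byte boundary.  The stack
   grows downwards, so rounding down never overlaps live data.  */
static uintptr_t
align_down_16_sketch(uintptr_t sp)
{
    return sp & ~(uintptr_t)15;
}

/* Reserve frame_size bytes below sp and return an aligned stack
   pointer suitable for making a call.  */
static uintptr_t
reserve_aligned_frame_sketch(uintptr_t sp, uintptr_t frame_size)
{
    return align_down_16_sketch(sp - frame_size);
}
```

The reserved area may end up slightly larger than `frame_size` because of the rounding.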
* * elf/dl-reloc.c [PROF] (_dl_relocate_object): Define | Ulrich Drepper | 2005-07-07 | 1 | -1/+2
    consider_profiling always to zero. Don't count on the compiler to remove the unreached if block.
    * sysdeps/x86_64/dl-trampoline.S [PROF] (_dl_runtime_profile): Don't compile.
    * sysdeps/i386/dl-trampoline.S [PROF] (_dl_runtime_profile): Likewise.
    * sysdeps/ia64/dl-trampoline.S [PROF] (_dl_runtime_profile): Likewise.
    * sysdeps/s390/s390-64/dl-trampoline.S [PROF] (_dl_runtime_profile): Likewise.
    * sysdeps/s390/s390-32/dl-trampoline.S [PROF] (_dl_runtime_profile): Likewise.
    * sysdeps/powerpc/powerpc64/dl-trampoline.S [PROF] (_dl_profile_resolve): Likewise.
    * sysdeps/powerpc/powerpc32/dl-trampoline.S [PROF] (_dl_profile_resolve): Likewise.
    * gmon/Makefile: Add rules to build and run tst-profile-static.
    * gmon/tst-profile-static.c: New file.
    * Makeconfig (+link-static): Allow passing program-specific flags.
* * csu/elf-init.c (__libc_csu_fini): Don't do anything here. | Ulrich Drepper | 2005-01-06 | 1 | -0/+188
    * sysdeps/generic/libc-start.c: Don't register program destructor here. * dlfcn/Makefile: Add rules to build dlfcn.c. (LDFLAGS-dl.so): Removed. * dlfcn/dlclose.c: _dl_close is now in ld.so, use function pointer table. * dlfcn/dlmopen.c: Likewise for _dl_open. * dlfcn/dlopen.c: Likewise. * dlfcn/dlopenold.c: Likewise. * elf/dl-libc.c: Likewise for _dl_open and _dl_close. * elf/Makefile (routines): Remove dl-open and dl-close. (dl-routines): Add dl-open, dl-close, and dl-trampoline. Add rules to build and run tst-audit1. * elf/tst-audit1.c: New file. * elf/tst-auditmod1.c: New file. * elf/Versions [libc]: Remove _dl_open and _dl_close. * elf/dl-close.c: Change for use inside ld.so instead of libc.so. * elf/dl-open.c: Likewise. * elf/dl-debug.c (_dl_debug_initialize): Allow reinitialization, signaled by nonzero parameter. * elf/dl-init.c: Fix use of r_state. * elf/dl-load.c: Likewise. * elf/dl-close.c: Add auditing checkpoints. * elf/dl-open.c: Likewise. * elf/dl-fini.c: Likewise. * elf/dl-load.c: Likewise. * elf/dl-sym.c: Likewise. * sysdeps/generic/libc-start.c: Likewise. * elf/dl-object.c: Allocate memory for auditing information. * elf/dl-reloc.c: Remove RESOLV. We now always need the map. Correctly initialize slotinfo. * elf/dynamic-link.h: Adjust after removal of RESOLV. * sysdeps/hppa/dl-lookupcfg.h: Likewise. * sysdeps/ia64/dl-lookupcfg.h: Likewise. * sysdeps/powerpc/powerpc64/dl-lookupcfg.h: Removed. * elf/dl-runtime.c (_dl_fixup): Little cleanup. (_dl_profile_fixup): New parameters to point to register struct and variable for frame size. Add auditing checkpoints. (_dl_call_pltexit): New function. Don't define trampoline code here. * elf/rtld.c: Recognize LD_AUDIT. Load modules on startup.
Remove all the functions from _rtld_global_ro which only _dl_open and _dl_close needed. Add auditing checkpoints. * elf/link.h: Define symbols for auditing interfaces. * include/link.h: Likewise. * include/dlfcn.h: Define __RTLD_AUDIT. Remove prototypes for _dl_open and _dl_close. Adjust access to argc and argv in libdl. * dlfcn/dlfcn.c: New file. * sysdeps/generic/dl-lookupcfg.h: Remove all content now that RESOLVE is gone. * sysdeps/generic/ldsodefs.h: Add definitions for auditing interfaces. * sysdeps/generic/unsecvars.h: Add LD_AUDIT. * sysdeps/i386/dl-machine.h: Remove trampoline code here. Adjust for removal of RESOLVE. * sysdeps/x86_64/dl-machine.h: Likewise. * sysdeps/generic/dl-trampoline.c: New file. * sysdeps/i386/dl-trampoline.c: New file. * sysdeps/x86_64/dl-trampoline.c: New file. * sysdeps/generic/dl-tls.c: Cleanups. Fixup for dtv_t change. Fix updating of DTV. * sysdeps/generic/libc-tls.c: Likewise. * sysdeps/arm/bits/link.h: Renamed to ... * sysdeps/arm/buts/linkmap.h: ...this. * sysdeps/generic/bits/link.h: Renamed to... * sysdeps/generic/bits/linkmap.h: ...this. * sysdeps/hppa/bits/link.h: Renamed to... * sysdeps/hppa/bits/linkmap.h: ...this. * sysdeps/hppa/i386/link.h: Renamed to... * sysdeps/hppa/i386/linkmap.h: ...this. * sysdeps/hppa/ia64/link.h: Renamed to... * sysdeps/hppa/ia64/linkmap.h: ...this. * sysdeps/hppa/s390/link.h: Renamed to... * sysdeps/hppa/s390/linkmap.h: ...this. * sysdeps/hppa/sh/link.h: Renamed to... * sysdeps/hppa/sh/linkmap.h: ...this. * sysdeps/hppa/x86_64/link.h: Renamed to... * sysdeps/hppa/x86_64/linkmap.h: ...this. 2005-01-06 Ulrich Drepper <drepper@redhat.com> * allocatestack.c (init_one_static_tls): Adjust initialization of DTV entry for static tls deallocation fix. * sysdeps/alpha/tls.h (dtv_t): Change pointer type to be struct which also contains information whether the memory pointed to is static TLS or not. * sysdeps/i386/tls.h: Likewise. * sysdeps/ia64/tls.h: Likewise. * sysdeps/powerpc/tls.h: Likewise. 
* sysdeps/s390/tls.h: Likewise. * sysdeps/sh/tls.h: Likewise. * sysdeps/sparc/tls.h: Likewise. * sysdeps/x86_64/tls.h: Likewise.
* (CFLAGS-tst-align.c): Add -mpreferred-stack-boundary=4. | Ulrich Drepper | 2004-12-22 | 1 | -189/+0
* 2.5-18.1 | Jakub Jelinek | 2007-07-12 | 1 | -0/+189