about summary refs log tree commit diff
path: root/arch
Commit message (Collapse)AuthorAgeFilesLines
* fix arm a_crash for big endianRich Felker2016-01-251-2/+4
| | | | | | | | contrary to commit 89e149d275a7699a4a5e4c98bab267648f64cbba, big endian arm does need the instruction bytes in big endian order. rather than trying to use a special encoding that works as arm or thumb, simply encode the simplest/canonical undefined instructions dependent on whether __thumb__ is defined.
* add native a_crash primitive for armRich Felker2016-01-251-0/+10
| | | | | | | | the .byte directive encodes a guaranteed-undefined instruction, the same one Linux fills the kuser helper page with when it's disabled. the udf mnemonic and and .insn directives are not supported by old binutils versions, and larger-than-byte integer directives would produce the wrong output on big-endian.
* clean powerpc syscall.hSzabolcs Nagy2016-01-241-24/+0
| | | | remove ifdefs for powerpc64.
* add missing powerpc specific PROT_SAO memory protection flagSzabolcs Nagy2016-01-241-0/+1
| | | | | this flag for strong access ordering was added in linux v2.6.27 commit aba46c5027cb59d98052231b36efcbbde9c77a1d
* fix powerpc MCL_* mlockall flags in bits/mman.hSzabolcs Nagy2016-01-241-2/+2
| | | | the definitions didn't match the linux uapi headers.
* fix aarch64 atomics to load/store 32bit onlySzabolcs Nagy2016-01-241-2/+2
| | | | | | a_ll/a_sc inline asm used 64bit register operands (%0) instead of 32bit ones (%w0), this at least broke a_and_64 (which always cleared the top 32bit, leaking memory in malloc).
* improve aarch64 atomicsRich Felker2016-01-231-16/+36
| | | | | | | | | | | | | | | | | aarch64 provides ll/sc variants with acquire/release memory order, freeing us from the need to have full barriers both before and after the ll/sc operation. previously they were not used because the a_cas can fail without performing a_sc, in which case half of the barrier would be omitted. instead, define a custom version of a_cas for aarch64 which uses a_barrier explicitly when aborting the cas operation. aside from cas, other operations built on top of ll/sc are not affected since they never abort but rather loop until they succeed. a split ll/sc version of the pointer-sized a_cas_p is also introduced using the same technique. patch by Szabolcs Nagy.
* remove sh port's __fpscr_values source fileRich Felker2016-01-221-5/+0
| | | | | | | | | commit f3ddd173806fd5c60b3f034528ca24542aecc5b9, the dynamic linker bootstrap overhaul, silently disabled the definition of __fpscr_values in this file since libc.so's copy of __fpscr_values now comes from crt_arch.h, the same place the public definition in the main program's crt1.o ultimately comes from. remove this file which is no longer in use.
* move sh port's __shcall internal function from arch/sh/src to src treeRich Felker2016-01-221-5/+0
|
* move sh __unmapself code from arch/sh/src to main src treeRich Felker2016-01-221-24/+0
|
* move x32 sysinfo impl and syscall fixup code out of arch/x32/srcRich Felker2016-01-222-88/+0
| | | | | all such arch-specific translation units are being moved to appropriate arch dirs under the main src tree.
* overhaul powerpc atomics for new atomics frameworkRich Felker2016-01-221-14/+38
| | | | | | | | | | | | | previously powerpc had a_cas defined in terms of its native ll/sc style operations, but all other atomics were defined in terms of a_cas. instead define a_ll and a_sc so the compiler can generate optimized versions of all the atomic ops and perform better inlining of a_cas. extracting the result of the sc (stwcx.) instruction is rather awkward because it's natively stored in a condition flag, which is not representable in inline asm. but even with this limitation the new code still seems significantly better.
* clean up x86_64 (and x32) atomics for new atomics frameworkRich Felker2016-01-222-113/+130
| | | | | | | this commit mostly makes consistent things like spacing, function ordering in atomic_arch.h, argument names, use of volatile, etc. a_ctz_l was also removed from x86_64 since atomic.h provides it automatically using a_ctz_64.
* clean up i386 atomics for new atomics frameworkRich Felker2016-01-221-66/+58
| | | | | | | | | | | | this commit mostly makes consistent things like spacing, function ordering in atomic_arch.h, argument names, use of volatile, etc. the fake 64-bit and/or atomics are also removed because the shared atomic.h does a better job of implementing them; it avoids making two atomic memory accesses when only one 32-bit half needs to be touched. no major overhaul is needed or possible because x86 actually has native versions of all the usual atomic operations, rather than using ll/sc or needing cas loops.
* overhaul mips atomics for new atomics frameworkRich Felker2016-01-221-53/+31
|
* move arm-specific translation units out of arch/arm/src, to src/*/armRich Felker2016-01-228-244/+0
| | | | | | | this is possible with the new build system that allows src/*/$(ARCH)/* files which do not shadow a file in the parent directory, and yields a more logical organization. eventually it will be possible to remove arch/*/src from the build system.
* overhaul arm atomics for new atomics frameworkRich Felker2016-01-211-142/+38
| | | | | | | | | | | | switch to ll/sc model so that new atomic.h can provide optimized versions of all the atomic primitives without needing an ll/sc loop written in asm for each one. all isa levels which use ldrex/strex now use the inline ll/sc model even if the type of barrier to use is not known until runtime (v6). the cas model is only used for arm v5 and earlier, and it has been optimized to make the call via inline asm with custom constraints rather than as a C function call.
* overhaul aarch64 atomics for new atomics frameworkRich Felker2016-01-211-174/+25
|
* overhaul sh atomics for new atomics framework, add j-core cas.l backendRich Felker2016-01-214-287/+30
| | | | | | | | | | | | | | | | sh needs runtime-selected atomic backends since there are a number of supported models that use non-forwards-compatible (non-smp-compatible) atomic mechanisms. previously, the code paths for this were highly inefficient since they involved C function calls with multiple branches in the callee and heavy spills in the caller. the new code performs calls the runtime-selected asm fragment from inline asm with extremely minimal clobbers, rather than using a function call. for the sh4a case where the atomic mechanism is known and there is no forward-compatibility issue, the movli.l and movco.l instructions are provided as a_ll and a_sc, allowing the new shared atomic.h to generate efficient inline versions of all the basic atomic operations without needing a cas loop.
* refactor internal atomic.hRich Felker2016-01-2114-834/+216
| | | | | | | | | | | | | | | rather than having each arch provide its own atomic.h, there is a new shared atomic.h in src/internal which pulls arch-specific definitions from arc/$(ARCH)/atomic_arch.h. the latter can be extremely minimal, defining only a_cas or new ll/sc type primitives which the shared atomic.h will use to construct everything else. this commit avoids making heavy changes to the individual archs' atomic implementations. definitions which are identical or near-identical to what the new shared atomic.h would produce have been removed, but otherwise the changes made are just hooking up the arch-specific files to the new infrastructure. major changes to take advantage of the new system will come in subsequent commits.
* fix build regression for arm pre-v7 from out-of-tree build patchRich Felker2016-01-202-0/+0
| | | | | | | | | | commit 2f853dd6b9a95d5b13ee8f9df762125e0588df5d failed to replicate the old makefile logic that caused arch/arm/src/arm/atomics.s to be built. since this was the only .s file under arch/*/src, rather than trying to reproduce the old logic, I'm just moving it up a level and adjusting the glob pattern in the makefile to catch it. eventually arch/*/src will probably be removed in favor of moving all these files to appropriate src/*/$(ARCH) locations.
* fix dynamic linker path file selection for arm vs armhfRich Felker2016-01-201-3/+3
| | | | | | | | | | the __SOFTFP__ macro which was wrongly being used does not reflect the ABI (arm vs armhf) but just the availability of floating point instructions/registers, so -mfloat-abi=softfp was wrongly being treated as armhf. __ARM_PCS_VFP is the correct predefined macro to check for the armhf EABI variant. this macro usage was corrected for the build process in commit 4918c2bb206bfaaf5a1f7d3448c2f63d5e2b7d56 but reloc.h was apparently overlooked at the time.
* adjust mips crt_arch entry point asm to avoid assembler bugsRich Felker2015-12-291-1/+4
| | | | | | | | | | | | | apparently the .gpword directive does not work reliably with local text labels; values produced were offset by 64k from the correct value, resulting in incorrect computation of the got pointer at runtime. instead, use an external label so that the assembler does not munge the relocation; the linker will then get it right. commit 6fef8cafbd0f6f185897bc87feb1ff66e2e204e1 exposed this issue by removing the old, non-PIE-compatible handwritten crt1.s, which was not affected. presumably mips PIE executables (using Scrt1.o produced from crt_arch.h) were already affected at the time.
* adjust i386 max_align_t definition to work around some broken compilersRich Felker2015-12-291-3/+5
| | | | | | | | | | at least gcc 4.7 claims c++11 support but does not accept the alignas keyword, causing breakage when stddef.h is included in c++11 mode. instead, prefer using __attribute__((__aligned__)) on any compiler with GNU extensions, and only use the alignas keyword as a fallback for other C++ compilers. C code should not be affected by this patch.
* remove visibility suppression by SHARED macro in mips and x32 arch filesRich Felker2015-12-152-6/+0
| | | | | | commit 8a8fdf6398b85c99dffb237e47fa577e2ddc9e77 was intended to remove all such usage, but these arch-specific files were overlooked, leading to inconsistent declarations and definitions.
* fix dynamic loader library mapping for nommu systemsRich Felker2015-11-111-0/+2
| | | | | | | | | | | | | | | | | | | | | on linux/nommu, non-writable private mappings of files may actually use memory shared with other processes or the fs cache. the old nommu loader code (used when mmap with MAP_FIXED fails) simply wrote over top of the original file mapping, possibly clobbering this shared memory. no such breakage was observed in practice, but it should have been possible. the new code starts by mapping anonymous writable memory on archs that might support nommu, then maps load segments over top of it, falling back to read if MAP_FIXED fails. we use an anonymous map rather than a writable file map to avoid reading more data from disk than needed. since pages cannot be loaded lazily on fault, in case of large data/bss, mapping the full file may read a lot of data that will subsequently be thrown away when processing additional LOAD segments. as a result, we cannot skip the first LOAD segment when operating in this mode. these changes affect only non-FDPIC nommu support.
* explicitly assemble all arm asm sources as UALRich Felker2015-11-101-0/+1
| | | | | | | | these files are all accepted as legacy arm syntax when producing arm code, but legacy syntax cannot be used for producing thumb2 with access to the full ISA. even after switching to UAL, some asm source files contain instructions which are not valid in thumb mode, so these will need to be addressed separately.
* remove non-working pre-armv4t support from arm asmRich Felker2015-11-092-11/+0
| | | | | | | | | | | | | | | the idea of the three-instruction sequence being removed was to be able to return to thumb code when used on armv4t+ from a thumb caller, but also to be able to run on armv4 without the bx instruction available (in which case the low bit of lr would always be 0). however, without compiler support for generating such a sequence from C code, which does not exist and which there is unlikely to be interest in implementing, there is little point in having it in the asm, and it would likely be easier to add pre-armv4t support via enhanced linker handling of R_ARM_V4BX than at the compiler level. removing this code simplifies adding support for building libc in thumb2-only form (for cortex-m).
* generalize sh entry point asm not to assume call dests fit in 12 bitsRich Felker2015-11-021-5/+12
| | | | | | this assumption is borderline-unsafe to begin with, and fails badly with -ffunction-sections since the linker can move the callee arbitrarily far away when it lies in a different section.
* properly access mcontext_t program counter in cancellation handlerRich Felker2015-11-0210-12/+10
| | | | | | | | | using the actual mcontext_t definition rather than an overlaid pointer array both improves correctness/readability and eliminates some ugly hacks for archs with 64-bit registers bit 32-bit program counter. also fix UB due to comparison of pointers not in a common array object.
* prevent reordering of or1k and powerpc thread pointer loadsRich Felker2015-10-152-0/+2
| | | | | | | | | | | | | | | | | other archs use asm for the thread pointer load, so making that asm volatile is sufficient to inform the compiler that it has a "side effect" (crashing or giving the wrong result if the thread pointer was not yet initialized) that prevents reordering. however, powerpc and or1k have dedicated general purpose registers for the thread pointer and did not need to use any asm to access it; instead, "local register variables with a specified register" were used. however, there is no specification for ordering constraints on this type of usage, and presumably use of the thread pointer could be reordered across its initialization. to impose an ordering, I have added empty volatile asm blocks that produce the "local register variable with a specified register" as an output constraint.
* mark arm thread-pointer-loading inline asm as volatileRich Felker2015-10-151-3/+3
| | | | | | | | | | this builds on commits a603a75a72bb469c6be4963ed1b55fabe675fe15 and 0ba35d69c0e77b225ec640d2bd112ff6d9d3b2af to ensure that a compiler cannot conclude that it's valid to reorder the asm to a point before the thread pointer is set up, or to treat the inline function as if it were declared with attribute((const)). other archs already use volatile asm for thread pointer loading.
* add comment documenting hard-coded opcode for reading mips thread pointerRich Felker2015-10-151-0/+1
|
* remove attribute((const)) from arm __pthread_self inline functionRich Felker2015-10-151-2/+2
| | | | | commit a603a75a72bb469c6be4963ed1b55fabe675fe15 did this for the public pthread_self function but not the internal inline one.
* fix signal return for sh/fdpicRich Felker2015-09-232-0/+10
| | | | | | | | | | | | | | | | the restorer function pointer provided in the kernel sigaction structure is interpreted by the kernel as a raw code address, not a function descriptor. this commit moves the declarations of the __restore and __restore_rt symbols to ksigaction.h so that arch versions of the file can override them, and introduces a version for sh which declares them as objects rather than functions. an alternate solution would have been defining SA_RESTORER to 0 so that the functions are not used, but this both requires executable stack (since the sh kernel does not have a vdso page with permanent restorer functions) and crashes on qemu user-level emulation.
* have sh/fdpic entry point set fdpic personality if neededRich Felker2015-09-221-0/+12
| | | | | | | | | | | | | | | | | the entry point code supports being loaded by a loader which is not fdpic-aware (in practice, either kernel with mmu or qemu without fdpic support). this mostly just works, but signal handling will wrongly use a function descriptor address as a code address if the personality is not adjusted to fdpic. ideally this code could be placed with sigaction so that it's not needed except if/when a signal handler is installed. however, personality is incorrectly maintained per-thread by the kernel, rather than per-process, so it's necessary to correct the personality before any threads are started. also, in order to skip the personality syscall when an fdpic-aware loader is used, we need to be able to detect how the program was loaded, and this information is only readily available at the entry point.
* add real fdpic loading of shared librariesRich Felker2015-09-221-0/+1
| | | | | | previously, the normal ELF library loading code was used even for fdpic, so only the kernel-loaded dynamic linker and main app could benefit from separate placement of segments and shared text.
* size-optimize sh/fdpic dynamic entry pointRich Felker2015-09-221-0/+4
| | | | | | the __fdpic_fixup code is not needed for ET_DYN executables, which instead use reloctions, so we can omit it from the dynamic linker and static-pie entry point and save some code size.
* work around breakage in sh/fdpic __unmapself functionRich Felker2015-09-221-0/+5
| | | | | | | | | | | | the C implementation of __unmapself used for potentially-nommu sh assumed CRTJMP takes a function descriptor rather than a code address; however, the actual dynamic linker needs a code address, and so commit 7a9669e977e5f750cf72ccbd2614f8b72ce02c4c changed the definition of the macro in reloc.h. this commit puts the old macro back in a place where it only affects __unmapself. this is an ugly workaround and should be cleaned up at some point, but at least it's well isolated.
* add general fdpic support in dynamic linker and arch support for shRich Felker2015-09-222-3/+12
| | | | | | | | | | | | | | | | | | at this point not all functionality is complete. the dynamic linker itself, and main app if it is also loaded by the kernel, take advantage of fdpic and do not need constant displacement between segments, but additional libraries loaded by the dynamic linker follow normal ELF semantics for mapping still. this fully works, but does not admit shared text on nommu. in terms of actual functional correctness, dlsym's results are presently incorrect for function symbols, RTLD_NEXT fails to identify the caller correctly, and dladdr fails almost entirely. with the dynamic linker entry point working, support for static pie is automatically included, but linking the main application as ET_DYN (pie) probably does not make sense for fdpic anyway. ET_EXEC is equally relocatable but more efficient at representing relocations.
* new dlstart stage-2 chaining for x86_64 and x32Rich Felker2015-09-172-0/+10
|
* new dlstart stage-2 chaining for powerpcRich Felker2015-09-171-0/+9
|
* new dlstart stage-2 chaining for or1kRich Felker2015-09-171-0/+9
|
* new dlstart stage-2 chaining for mipsRich Felker2015-09-171-0/+15
|
* new dlstart stage-2 chaining for microblazeRich Felker2015-09-171-0/+7
|
* introduce new symbol-lookup-free rcrt1/dlstart stage chainingRich Felker2015-09-171-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | previously, the call into stage 2 was made by looking up the symbol name "__dls2" (which was chosen short to be easy to look up) from the dynamic symbol table. this was no problem for the dynamic linker, since it always exports all its symbols. in the case of the static pie entry point, however, the dynamic symbol table does not contain the necessary symbol unless -rdynamic/-E was used when linking. this linking requirement is a major obstacle both to practical use of static-pie as a nommu binary format (since it greatly enlarges the file) and to upstream toolchain support for static-pie (adding -E to default linking specs is not reasonable). this patch replaces the runtime symbolic lookup with a link-time lookup via an inline asm fragment, which reloc.h is responsible for providing. in this initial commit, the asm is provided only for i386, and the old lookup code is left in place as a fallback for archs that have not yet transitioned. modifying crt_arch.h to pass the stage-2 function pointer as an argument was considered as an alternative, but such an approach would not be compatible with fdpic, where it's impossible to compute function pointers without already having performed relocations. it was also deemed desirable to keep crt_arch.h as simple/minimal as possible. in principle, archs with pc-relative or got-relative addressing of static variables could instead load the stage-2 function pointer from a static volatile object. that does not work for fdpic, and is not safe against reordering on mips-like archs that use got slots even for static functions, but it's a valid on i386 and many others, and could provide a reasonable default implementation in the future.
* reindent powerpc's bits/termios.h to be consistent with other archsFelix Janda2015-09-151-140/+138
|
* fix namespace violations in aarch64/bits/termios.hFelix Janda2015-09-151-7/+7
| | | | in analogy with commit a627eb35864d5c29a3c3300dfe83745ab1e7a00f
* add sh fdpic subarch variantsRich Felker2015-09-121-1/+16
| | | | | | | | | with this commit it should be possible to produce a working static-linked fdpic libc and application binaries for sh. the changes in reloc.h are largely unused at this point since dynamic linking is not supported, but the CRTJMP macro is used one place outside of dynamic linking, in __unmapself.
* add fdpic version of entry point code for shRich Felker2015-09-121-0/+29
| | | | | | | this version of the entry point is only suitable for static linking in ET_EXEC form. neither dynamic linking nor pie is supported yet. at some point in the future the fdpic and non-fdpic versions of this code may be unified but for now it's easiest to work with them separately.