path: root/arch/arm
Commit message (Author, Age, Files, Lines)
* add TLSDESC support for 32-bit arm (Rich Felker, 2018-10-01, 1 file, -1/+3)
| | | | | | | | | | | | | | | unlike other asm where the baseline ISA is used, these functions are hot paths and use ISA-level specializations. call-clobbered vfp registers are saved before calling __tls_get_new, since there is no guarantee it won't use them. while setjmp/longjmp have to use hwcap to decide whether the fpu is in use, since application code could be using vfp registers even if libc was compiled as pure softfloat, __tls_get_new is part of libc and can be assumed not to have access to vfp registers if tlsdesc.S does not. thus it suffices just to check the predefined preprocessor macros. the check for __ARM_PCS_VFP is redundant; !__SOFTFP__ must always be true if the target ISA level includes fpu instructions/registers.
* add arm and sh bits/ptrace.h (Szabolcs Nagy, 2018-09-20, 1 file, -0/+25)
| | | | | | These should have been added in commit df6d9450ea19fd71e52cf5cdb4c85beb73066394 that added target specific PTRACE_ macros, but somehow got missed.
* define and use internal macros for hidden visibility, weak refs (Rich Felker, 2018-09-05, 2 files, -3/+4)
| | | | | | | | | this cleans up what had become widespread direct inline use of "GNU C" style attributes directly in the source, and lowers the barrier to increased use of hidden visibility, which will be useful to recovering some of the efficiency lost when the protected visibility hack was dropped in commit dc2f368e565c37728b0d620380b849c3a1ddd78f, especially on archs where the PLT ABI is costly.
* work around broken kernel struct ipc_perm on some big endian archs (Rich Felker, 2018-06-20, 1 file, -0/+2)
| | | | | | | | | | | | | | | the mode member of struct ipc_perm is specified by POSIX to have type mode_t, which is uniformly defined as unsigned int. however, Linux defines it with type __kernel_mode_t, and defines __kernel_mode_t as unsigned short on some archs. since there is a subsequent padding field, treating it as a 32-bit unsigned int works on little endian archs, but the order is backwards on big endian archs with the erroneous definition. since multiple archs are affected, remedy the situation with fixup code in the affected functions (shmctl, semctl, and msgctl) rather than repeating the same shims in syscall_arch.h for every affected arch.
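The endianness problem described above can be illustrated with a small host-side sketch. The struct layout (a 16-bit mode followed by 16-bit padding) and all helper names here are hypothetical stand-ins chosen for illustration; this is not musl's actual fixup code in shmctl, semctl, or msgctl.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the kernel's view: 16-bit mode, then 16-bit padding. */
struct kernel_mode_slot {
    uint16_t mode;
    uint16_t pad;
};

/* Read the same four bytes the way a libc with a 32-bit mode_t would. */
static uint32_t raw_mode_word(uint16_t mode)
{
    struct kernel_mode_slot k = { mode, 0 };
    uint32_t w;
    memcpy(&w, &k, sizeof w);
    return w;
}

/* Detect byte order at run time by inspecting the first byte of 0x0001. */
static int host_is_big_endian(void)
{
    uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 0;
}

/* After reading from the kernel: on big endian the mode sits in the high half. */
static uint32_t fixup_mode_after_read(uint32_t w)
{
    return host_is_big_endian() ? w >> 16 : w;
}

/* Before writing to the kernel: move the mode back into the high half. */
static uint32_t fixup_mode_before_write(uint32_t mode)
{
    return host_is_big_endian() ? mode << 16 : mode;
}
```

On a little endian host both fixups are no-ops, which matches the observation in the commit message that the 32-bit view already works there.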
* fix TLS layout of TLS variant I when there is a gap above TP (Szabolcs Nagy, 2018-06-02, 2 files, -4/+5)
| | | | | | | | | | | | | | | | | | | | | | | | | | In TLS variant I the TLS is above TP (or above a fixed offset from TP) but on some targets there is a reserved gap above TP before TLS starts. This matters for the local-exec tls access model when the offsets of TLS variables from the TP are hard coded by the linker into the executable, so the libc must compute these offsets the same way as the linker. The tls offset of the main module has to be alignup(GAP_ABOVE_TP, main_tls_align). If there is no TLS in the main module then the gap can be ignored since musl does not use it and the tls access models of shared libraries are not affected. The previous setup only worked if (tls_align & -GAP_ABOVE_TP) == 0 (i.e. TLS did not require large alignment) because the gap was treated as a fixed offset from TP. Now the TP points at the end of the pthread struct (which is aligned) and there is a gap above it (which may also need alignment). The fix required changing TP_ADJ and __pthread_self on affected targets (aarch64, arm and sh) and in the tlsdesc asm the offset to access the dtv changed too.
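The offset rule stated above, alignup(GAP_ABOVE_TP, main_tls_align), can be sketched as follows. The function names are hypothetical and the gap values used in the usage note are illustrative assumptions, not values quoted from this log.

```c
#include <stddef.h>

/* Round x up to the next multiple of align; align must be a power of two. */
static size_t align_up(size_t x, size_t align)
{
    return (x + align - 1) & ~(align - 1);
}

/*
 * Offset of the main module's TLS block from the thread pointer in
 * TLS variant I when a reserved gap sits directly above TP.
 * Both parameter names are illustrative.
 */
static size_t main_tls_offset(size_t gap_above_tp, size_t main_tls_align)
{
    return align_up(gap_above_tp, main_tls_align);
}
```

With an assumed gap of 8 bytes, a TLS alignment of 4 leaves the offset at 8, while an alignment of 64 pushes it to 64. The second case is the kind of large alignment that a fixed-offset treatment of the gap cannot express.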
* work around arm gcc's rejection of r7 asm constraints in thumb mode (Rich Felker, 2018-05-01, 1 file, -14/+39)
| | | | | | | | | | | | | | | | | | | | | | | in thumb mode, r7 is the ABI frame pointer register, and unless frame pointer is disabled, gcc insists on treating it as a fixed register, refusing to spill it to satisfy constraints. unfortunately, r7 is also used in the syscall ABI for passing the syscall number. up til now we just treated this as a requirement to disable frame pointer when generating code as thumb, but it turns out gcc forcibly enables frame pointer, and the fixed register constraint that goes with it, for functions which contain VLAs. this produces an unacceptable arch-specific constraint that (non-arm-specific) source files making syscalls cannot use VLAs. as a workaround, avoid r7 register constraints when producing thumb code and instead save/restore r7 in a temp register as part of the asm block. at some point we may want/need to support armv6-m/thumb1, so the asm has been tweaked to be thumb1-compatible while also near-optimal for thumb2: it allows the temp and/or syscall number to be in high registers (necessary since r0-r5 may all be used for syscall args) and in thumb2 mode allows the syscall number to be an 8-bit immediate.
* arm: use a_ll/a_sc atomics when building for ARMv6T2 (Andre McCurdy, 2018-04-19, 1 file, -1/+1)
| | | | | ARMv6 cores with support for Thumb2 can take advantage of the "ldrex" and "strex" based implementations of a_ll and a_sc.
* arm: respect both __ARM_ARCH_6KZ__ and __ARM_ARCH_6ZK__ macros (Andre McCurdy, 2018-04-19, 2 files, -2/+2)
| | | | | | | | | | | | __ARM_ARCH_6ZK__ is a gcc specific historical typo which may not be defined by other compilers. https://gcc.gnu.org/ml/gcc-patches/2015-07/msg02237.html To avoid unexpected results when building for ARMv6KZ with clang, the correct form of the macro (ie 6KZ) needs to be tested. The incorrect form of the macro (ie 6ZK) still needs to be tested for compatibility with pre-2015 versions of gcc.
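A minimal sketch of the dual test described above. The macro TARGET_IS_ARMV6KZ and the function name are hypothetical; only the two compiler-predefined macro names come from the commit message.

```c
/*
 * Accept both spellings: 6KZ is the correct form, 6ZK is the historical
 * gcc spelling that older gcc releases still define.
 * TARGET_IS_ARMV6KZ is an illustrative name, not a musl macro.
 */
#if defined(__ARM_ARCH_6KZ__) || defined(__ARM_ARCH_6ZK__)
#define TARGET_IS_ARMV6KZ 1
#else
#define TARGET_IS_ARMV6KZ 0
#endif

/* Expose the compile-time decision so it can be inspected at run time. */
static int target_is_armv6kz(void)
{
    return TARGET_IS_ARMV6KZ;
}
```

Testing only one spelling would silently select the wrong code path on whichever compiler defines the other one, which is the failure mode the commit guards against.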
* provide optimized a_ctz_32 for arm (Andre McCurdy, 2018-04-19, 1 file, -0/+12)
| | | | | | Provide an ARM specific a_ctz_32 helper function for architecture versions for which it can be implemented efficiently via the "rbit" instruction (ie all Thumb-2 capable versions of ARM v6 and above).
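The idea behind an rbit-based a_ctz_32 is that reversing the bits of a word turns its trailing zeros into leading zeros, which a count-leading-zeros step can then count. The portable sketch below emulates both steps in plain C; the function names are hypothetical and this is not the inline asm the commit adds.

```c
#include <stdint.h>

/* Software stand-in for the "rbit" instruction: reverse all 32 bits. */
static uint32_t reverse_bits32(uint32_t x)
{
    x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);
    x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);
    x = ((x >> 4) & 0x0f0f0f0fu) | ((x & 0x0f0f0f0fu) << 4);
    x = ((x >> 8) & 0x00ff00ffu) | ((x & 0x00ff00ffu) << 8);
    return (x >> 16) | (x << 16);
}

/* Software stand-in for "clz". Must not be called with x == 0. */
static int count_leading_zeros32(uint32_t x)
{
    int n = 0;
    while (!(x & 0x80000000u)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Trailing-zero count expressed as clz(rbit(x)). Must not be called with 0. */
static int ctz32_via_rbit(uint32_t x)
{
    return count_leading_zeros32(reverse_bits32(x));
}
```

On hardware with both instructions the whole function collapses to two instructions, which is why the specialization is limited to ISA levels that provide "rbit".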
* arm: add get_tls syscall from linux v4.15 (Szabolcs Nagy, 2018-02-22, 1 file, -0/+1)
| | | | | for systems without tp register or kuser helper, new in linux commit 8fcd6c45f5a65621ec809b7866a3623e9a01d4ed
* add statx syscall numbers from linux v4.11 (Szabolcs Nagy, 2017-11-05, 1 file, -0/+1)
| | | | | statx was added in linux commit a528d35e8bfcc521d7cb70aaf03e1bd296c8493f (there is no libc wrapper yet and microblaze and sh miss the number).
* fix build regression on ARM for ISA levels less than v5 (Rich Felker, 2017-10-25, 1 file, -0/+4)
| | | | | | | | | | | commit 06fbefd10046a0fae7e588b7c6d25fb51811b931 (first included in release 1.1.17) introduced this regression. patch by Adrian Bunk. it fixes the regression in all cases, but spuriously prevents use of the clz instruction on very old compiler versions that don't define __ARM_ARCH. this may be fixed in a more general way at some point in the future. it also omits thumb1 logic since building as thumb1 code is currently not supported.
* make syscall.h consistent with linux (Szabolcs Nagy, 2017-09-06, 1 file, -0/+2)
| most of the found naming differences don't matter to musl, because internally it unifies the syscall names that vary across targets, but for external code the names should match the kernel uapi.
|
| aarch64:
|   __NR_fstatat is called __NR_newfstatat in linux.
|   __NR_or1k_atomic got mistakenly copied from or1k.
| arm:
|   __NR_arm_sync_file_range is an alias for __NR_sync_file_range2
|   __NR_fadvise64_64 is called __NR_arm_fadvise64_64 in linux, the old non-arm name is kept too, it should not cause issues. (powerpc has similar nonstandard fadvise and it uses the normal name.)
| i386:
|   __NR_madvise1 was removed from linux in commit 303395ac3bf3e2cb488435537d416bc840438fcb 2011-11-11
| microblaze:
|   __NR_fadvise, __NR_fstatat, __NR_pread, __NR_pwrite had different names in linux.
| mips:
|   __NR_fadvise, __NR_fstatat, __NR_pread, __NR_pwrite, __NR_select had different names in linux.
| mipsn32:
|   __NR_fstatat is called __NR_newfstatat in linux.
| or1k:
|   __NR__llseek is called __NR_llseek in linux. the old name is kept too because that's the name musl uses internally.
| powerpc:
|   __NR_{get,set}res{gid,uid}32 was never present in powerpc linux.
|   __NR_timerfd was briefly defined in linux but then got renamed.
* arm: add HWCAP_ARM_ hwcap macros (Szabolcs Nagy, 2017-08-29, 1 file, -0/+24)
| | | | | | | Glibc renamed the linux uapi HWCAP_* macros to HWCAP_ARM_* so have both variants in case some code depends on it. (The HWCAP2_ macros are not defined in glibc currently so those only have the linux uapi variant.)
* add a_clz_64 helper function (Szabolcs Nagy, 2017-08-29, 1 file, -0/+7)
| | | | | | | | | counts leading zero bits of a 64bit int, undefined on zero input. (has nothing to do with atomics, added to atomic.h so target specific helper functions are together.) there is a logarithmic generic implementation and another in terms of a 32bit a_clz_32 on targets where that's available.
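The second variant mentioned above, a 64-bit count built from a 32-bit one, can be sketched like this. The names are hypothetical and the 32-bit helper is a plain C stand-in for whatever a target-specific a_clz_32 would provide.

```c
#include <stdint.h>

/* Stand-in for a target's 32-bit clz helper. Must not be called with 0. */
static int clz32_helper(uint32_t x)
{
    int n = 0;
    while (!(x & 0x80000000u)) {
        x <<= 1;
        n++;
    }
    return n;
}

/*
 * Count leading zero bits of a 64-bit value using only the 32-bit helper.
 * As in the commit message, the result is undefined for x == 0, so
 * callers must not pass zero.
 */
static int clz64_from_clz32(uint64_t x)
{
    uint32_t high = (uint32_t)(x >> 32);
    if (high)
        return clz32_helper(high);           /* answer lies in the upper half */
    return 32 + clz32_helper((uint32_t)x);   /* upper half is all zeros */
}
```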
* allow page size to vary on arm (Rich Felker, 2017-02-22, 1 file, -1/+0)
| | | | | | | | | | | the ABI for arm was silently changed at some point to allow page sizes other than 4k; traditional binaries built with only 4k-aligned offsets between load segments cannot run on such systems, but newer binutils versions use 64k offset alignment. while larger page size is undesirable for various reasons, users have encountered hardware and/or kernels that lock the page size to a larger value, so follow the new ABI and allow it to vary.
* add pkey_{mprotect,alloc,free} syscalls from linux v4.9 (Szabolcs Nagy, 2016-12-29, 1 file, -0/+3)
| | | | | see linux commit e8c24d3a23a469f1f40d4de24d872ca7023ced0a and linux Documentation/x86/protection-keys.txt
* rework arm atomic/tp backends to be thumb-compatible and fdpic-ready (Rich Felker, 2016-12-19, 2 files, -14/+23)
| three problems are addressed:
|
| - use of pc arithmetic, which was difficult if not impossible to make correct in thumb mode on all models, so that relative rather than absolute pointers to the backends could be used. this was designed back when there was no coherent model for the early stages of the dynamic linker before relocations, and is no longer necessary.
|
| - assumption that data (the relative pointers to the backends) can be accessed at a constant displacement from the code. this will not be possible on future fdpic subarchs (for cortex-m), so move responsibility for loading the backend code address to the caller.
|
| - hard-coded arm opcodes using the .word directive. instead, use the .arch directive to work around the assembler's refusal to assemble instructions not available (or in some cases, available but just considered deprecated) in the target isa level. the obscure v6t2 arch is used for v6 code so as to (1) allow generation of thumb2 output if -mthumb is active, and (2) avoid warnings/errors for mcr barriers that clang would produce if we just set arch to v7-a.
|
| in addition, the __aeabi_read_tp function is moved out of the inner workings and implemented as an asm wrapper around a C function, so that asm code does not need to read global data. the asm wrapper serves to satisfy the ABI calling convention requirements for this function.
* add bits/hwcap.h and include it in sys/auxv.h (Szabolcs Nagy, 2016-10-20, 1 file, -0/+29)
| | | | | | | | | aarch64, arm, mips, mips64, mipsn32, powerpc, powerpc64 and sh have cpu feature bits defined in linux for AT_HWCAP auxv entry, so expose those in sys/auxv.h it seems the mips hwcaps were never exposed to userspace by either linux or glibc, but that's most likely an oversight.
* make brace placement in public header typedef'd structs consistent (Rich Felker, 2016-07-03, 1 file, -2/+1)
| | | | | | commit befa5866ee30d09c0c96e88af2eabff5911342ea performed this change for struct definitions that did not also involve typedef, but omitted the latter.
* make brace placement in public header struct definitions consistent (Rich Felker, 2016-07-03, 1 file, -2/+1)
| placing the opening brace on the same line as the struct keyword/tag is the style I prefer and seems to be the prevailing practice in more recent additions.
|
| these changes were generated by the command:
|
|     find include/ arch/*/bits -name '*.h' \
|         -exec sed -i '/^struct [^;{]*$/{N;s/\n/ /;}' {} +
|
| and subsequently checked by hand to ensure that the regex did not pick up any false positives.
* fix FIOQSIZE in arm ioctl.h (Szabolcs Nagy, 2016-07-03, 1 file, -0/+2)
| | | | | arm ioctl.h is the same as the generic one except this macro, so a workaround solution is used to avoid another ioctl.h copy.
* fix posix_fadvise syscall args on powerpc, unify with arm fix (Rich Felker, 2016-07-01, 1 file, -0/+2)
| | | | | | | | | commit 6d38c9cf80f47623e5e48190046673bbd0dc410b provided an arm-specific version of posix_fadvise to address the alternate argument order the kernel expects on arm, but neglected to address that powerpc (32-bit) has the same issue. instead of having arch variant files in duplicate, simply put the alternate version in the top-level file under the control of a macro defined in syscall_arch.h.
* add preadv2 and pwritev2 syscall numbers for linux v4.6 (Szabolcs Nagy, 2016-06-09, 1 file, -0/+2)
| | | | | | | | the syscalls take an additional flag argument, they were added in commit f17d8b35452cab31a70d224964cd583fb2845449 and a RWF_HIPRI priority hint flag was added to linux/fs.h in 97be7ebe53915af504fb491fb99f064c7cf3cb09. the syscall is not allocated for microblaze and sh yet.
* deduplicate __NR_* and SYS_* syscall number definitions (Bobby Bingham, 2016-05-12, 1 file, -349/+0)
|
* add copy_file_range syscall numbers from linux v4.5 (Szabolcs Nagy, 2016-03-19, 1 file, -0/+2)
| | | | | | | it was introduced for offloading copying between regular files in linux commit 29732938a6289a15e907da234d6692a2ead71855 (microblaze and sh do not yet have the syscall number.)
* deduplicate bits/mman.h (Szabolcs Nagy, 2016-03-18, 1 file, -59/+0)
| | | | | | | | | | | currently five targets use the same mman.h constants and the rest share most constants too, so move them to sys/mman.h before the bits/mman.h include where the differences can be corrected by redefinition of the macros. this fixes two minor bugs: POSIX_MADV_DONTNEED was wrong on most targets (it should be the same as MADV_DONTNEED), and sh defined the x86-only MAP_32BIT mmap flag.
* better a_sc inline asm constraint on aarch64 and arm (Szabolcs Nagy, 2016-01-31, 1 file, -1/+1)
| | | | | | | | | | "Q" input constraint was used for the written object, instead of "=Q" output constraint. this should not cause problems because "memory" is on the clobber list, but "=Q" better documents the intent and is more consistent with the actual asm code. this changes the generated code, because different registers are used, but other than the register names nothing should change.
* deduplicate the bulk of the arch bits headers (Rich Felker, 2016-01-27, 13 files, -594/+0)
| | | | | | | | | | | | all bits headers that were identical for a number of 'clean' archs are moved to the new arch/generic tree. in addition, a few headers that differed only cosmetically from the new generic version are removed. additional deduplication may be possible in mman.h and in several headers (limits.h, posix.h, stdint.h) that mostly depend on whether the arch is 32- or 64-bit, but they are left alone for now because greater gains are likely possible with more invasive changes to header logic, which is beyond the scope of this commit.
* add MCL_ONFAULT and MLOCK_ONFAULT mlockall and mlock2 flags (Szabolcs Nagy, 2016-01-26, 1 file, -0/+1)
| | | | | | | | they lock faulted pages into memory (useful when a small part of a large mapped file needs efficient access), new in linux v4.4, commit b0f205c2a3082dd9081f9a94e50658c5fa906ff1 MLOCK_* is not in the POSIX reserved namespace for sys/mman.h
* add mlock2 syscall number from linux v4.4 (Szabolcs Nagy, 2016-01-26, 1 file, -0/+2)
| | | | | | | this is mlock with a flags argument, new in linux commit a8ca5d0ecbdde5cc3d7accacbd69968b0c98764e as usual microblaze and sh don't have allocated syscall number yet.
* add new membarrier, userfaultfd and switch_endian syscalls (Szabolcs Nagy, 2016-01-26, 1 file, -0/+4)
| | | | | | | | | | | | | | | new in linux v4.3 added for aarch64, arm, i386, mips, or1k, powerpc, x32 and x86_64. membarrier is a system wide memory barrier, moves most of the synchronization cost to one side, new in kernel commit 5b25b13ab08f616efd566347d809b4ece54570d1 userfaultfd is useful for qemu and is new in kernel commit 8d2afd96c20316d112e04d935d9e09150e988397 switch_endian is powerpc only for switching endianness, new in commit 529d235a0e190ded1d21ccc80a73e625ebcad09b
* fix arm a_crash for big endian (Rich Felker, 2016-01-25, 1 file, -2/+4)
| | | | | | | | contrary to commit 89e149d275a7699a4a5e4c98bab267648f64cbba, big endian arm does need the instruction bytes in big endian order. rather than trying to use a special encoding that works as arm or thumb, simply encode the simplest/canonical undefined instructions dependent on whether __thumb__ is defined.
* add native a_crash primitive for arm (Rich Felker, 2016-01-25, 1 file, -0/+10)
| | | | | | | | the .byte directive encodes a guaranteed-undefined instruction, the same one Linux fills the kuser helper page with when it's disabled. the udf mnemonic and .insn directives are not supported by old binutils versions, and larger-than-byte integer directives would produce the wrong output on big-endian.
* move arm-specific translation units out of arch/arm/src, to src/*/arm (Rich Felker, 2016-01-22, 8 files, -244/+0)
| | | | | | | this is possible with the new build system that allows src/*/$(ARCH)/* files which do not shadow a file in the parent directory, and yields a more logical organization. eventually it will be possible to remove arch/*/src from the build system.
* overhaul arm atomics for new atomics framework (Rich Felker, 2016-01-21, 1 file, -142/+38)
| | | | | | | | | | | | switch to ll/sc model so that new atomic.h can provide optimized versions of all the atomic primitives without needing an ll/sc loop written in asm for each one. all isa levels which use ldrex/strex now use the inline ll/sc model even if the type of barrier to use is not known until runtime (v6). the cas model is only used for arm v5 and earlier, and it has been optimized to make the call via inline asm with custom constraints rather than as a C function call.
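The ll/sc model described above lets generic code build every atomic primitive from one load-linked/store-conditional pair. The sketch below emulates that pair on top of C11 atomics so it can run on any host. All names are hypothetical, and the emulation is only an approximation: real ldrex/strex fails on any intervening write, while this version only fails when the value has changed.

```c
#include <stdatomic.h>

/* Value observed by the most recent emulated load-linked in this thread. */
static _Thread_local int ll_seen;

/* Emulated load-linked: load the value and remember it. */
static int sim_ll(_Atomic int *p)
{
    ll_seen = atomic_load(p);
    return ll_seen;
}

/* Emulated store-conditional: succeeds (returns nonzero) only if *p still
 * holds the value seen by the matching sim_ll. */
static int sim_sc(_Atomic int *p, int v)
{
    int expected = ll_seen;
    return atomic_compare_exchange_strong(p, &expected, v);
}

/* Compare-and-swap built from the ll/sc pair; returns the old value. */
static int cas_via_llsc(_Atomic int *p, int test, int set)
{
    int old;
    do {
        old = sim_ll(p);
    } while (old == test && !sim_sc(p, set));
    return old;
}

/* Fetch-and-add built from the same pair; returns the old value. */
static int fetch_add_via_llsc(_Atomic int *p, int v)
{
    int old;
    do {
        old = sim_ll(p);
    } while (!sim_sc(p, old + v));
    return old;
}
```

Only sim_ll and sim_sc would need a per-target implementation; the two composite operations are written once, which is the saving the commit message refers to.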
* refactor internal atomic.h (Rich Felker, 2016-01-21, 1 file, -104/+11)
| | | | | | | | | | | | | | | rather than having each arch provide its own atomic.h, there is a new shared atomic.h in src/internal which pulls arch-specific definitions from arch/$(ARCH)/atomic_arch.h. the latter can be extremely minimal, defining only a_cas or new ll/sc type primitives which the shared atomic.h will use to construct everything else. this commit avoids making heavy changes to the individual archs' atomic implementations. definitions which are identical or near-identical to what the new shared atomic.h would produce have been removed, but otherwise the changes made are just hooking up the arch-specific files to the new infrastructure. major changes to take advantage of the new system will come in subsequent commits.
* fix build regression for arm pre-v7 from out-of-tree build patch (Rich Felker, 2016-01-20, 2 files, -0/+0)
| | | | | | | | | | commit 2f853dd6b9a95d5b13ee8f9df762125e0588df5d failed to replicate the old makefile logic that caused arch/arm/src/arm/atomics.s to be built. since this was the only .s file under arch/*/src, rather than trying to reproduce the old logic, I'm just moving it up a level and adjusting the glob pattern in the makefile to catch it. eventually arch/*/src will probably be removed in favor of moving all these files to appropriate src/*/$(ARCH) locations.
* fix dynamic linker path file selection for arm vs armhf (Rich Felker, 2016-01-20, 1 file, -3/+3)
| | | | | | | | | | the __SOFTFP__ macro which was wrongly being used does not reflect the ABI (arm vs armhf) but just the availability of floating point instructions/registers, so -mfloat-abi=softfp was wrongly being treated as armhf. __ARM_PCS_VFP is the correct predefined macro to check for the armhf EABI variant. this macro usage was corrected for the build process in commit 4918c2bb206bfaaf5a1f7d3448c2f63d5e2b7d56 but reloc.h was apparently overlooked at the time.
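A minimal sketch of selecting the ABI name from __ARM_PCS_VFP rather than __SOFTFP__. The macro ABI_SUFFIX and the function name are hypothetical and do not reproduce the actual contents of reloc.h.

```c
/*
 * __ARM_PCS_VFP is defined only when the hard-float calling convention
 * (armhf) is in use. -mfloat-abi=softfp leaves it undefined even though
 * fpu instructions are available, so it reflects the ABI and not merely
 * the instruction set. ABI_SUFFIX is an illustrative name.
 */
#ifdef __ARM_PCS_VFP
#define ABI_SUFFIX "hf"
#else
#define ABI_SUFFIX ""
#endif

/* Name of the ABI variant, built by string literal concatenation. */
static const char *ldso_arch_name(void)
{
    return "arm" ABI_SUFFIX;
}
```

Compiled for any non-armhf target, including a non-arm host, the function returns "arm"; only a hard-float arm toolchain yields "armhf".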
* explicitly assemble all arm asm sources as UAL (Rich Felker, 2015-11-10, 1 file, -0/+1)
| | | | | | | | these files are all accepted as legacy arm syntax when producing arm code, but legacy syntax cannot be used for producing thumb2 with access to the full ISA. even after switching to UAL, some asm source files contain instructions which are not valid in thumb mode, so these will need to be addressed separately.
* remove non-working pre-armv4t support from arm asm (Rich Felker, 2015-11-09, 2 files, -11/+0)
| | | | | | | | | | | | | | | the idea of the three-instruction sequence being removed was to be able to return to thumb code when used on armv4t+ from a thumb caller, but also to be able to run on armv4 without the bx instruction available (in which case the low bit of lr would always be 0). however, without compiler support for generating such a sequence from C code, which does not exist and which there is unlikely to be interest in implementing, there is little point in having it in the asm, and it would likely be easier to add pre-armv4t support via enhanced linker handling of R_ARM_V4BX than at the compiler level. removing this code simplifies adding support for building libc in thumb2-only form (for cortex-m).
* properly access mcontext_t program counter in cancellation handler (Rich Felker, 2015-11-02, 1 file, -1/+1)
| | | | | | | | | using the actual mcontext_t definition rather than an overlaid pointer array both improves correctness/readability and eliminates some ugly hacks for archs with 64-bit registers but 32-bit program counter. also fix UB due to comparison of pointers not in a common array object.
* mark arm thread-pointer-loading inline asm as volatile (Rich Felker, 2015-10-15, 1 file, -3/+3)
| | | | | | | | | | this builds on commits a603a75a72bb469c6be4963ed1b55fabe675fe15 and 0ba35d69c0e77b225ec640d2bd112ff6d9d3b2af to ensure that a compiler cannot conclude that it's valid to reorder the asm to a point before the thread pointer is set up, or to treat the inline function as if it were declared with attribute((const)). other archs already use volatile asm for thread pointer loading.
* remove attribute((const)) from arm __pthread_self inline function (Rich Felker, 2015-10-15, 1 file, -2/+2)
| | | | | commit a603a75a72bb469c6be4963ed1b55fabe675fe15 did this for the public pthread_self function but not the internal inline one.
* implement arm eabi mem* functions (Timo Teräs, 2015-08-31, 4 files, -0/+36)
| | | | | | | these functions are part of the ARM EABI, meaning compilers may generate references to them. known versions of gcc do not use them, but llvm does. they are not provided by libgcc, and the de facto standard seems to be that libc provides them.
* arm: add vdso support (Szabolcs Nagy, 2015-06-14, 1 file, -0/+4)
| | | | | vdso will be available on arm in linux v4.2, the user-space code for it is in kernel commit 8512287a8165592466cb9cb347ba94892e9c56a5
* add .text section directive to all crt_arch.h files missing it (Rich Felker, 2015-05-22, 1 file, -0/+1)
| | | | | | | | i386 and x86_64 versions already had the .text directive; other archs did not. normally, top-level (file scope) __asm__ starts in the .text section anyway, but problems were reported with some versions of clang, and it seems preferable to set it explicitly anyway, at least for the sake of consistency between archs.
* make arm reloc.h CRTJMP macro compatible with thumb (Rich Felker, 2015-05-14, 1 file, -0/+5)
| | | | | | | | | | | | | compilers targeting armv7 may be configured to produce thumb2 code instead of arm code by default, and in the future we may wish to support targets where only the thumb instruction set is available. the instructions this patch omits in thumb mode are needed only for non-thumb versions of armv4 or earlier, which are not supported by any current compilers/toolchains and thus rather pointless to have. at some point these compatibility return sequences may be removed from all asm source files, and in that case it would make sense to remove them here too and remove the ifdef.
* make arm crt_arch.h compatible with thumb code generation (Rich Felker, 2015-05-14, 1 file, -4/+6)
| | | | | | | | | | | | | | | | | | compilers targeting armv7 may be configured to produce thumb2 code instead of arm code by default, and in the future we may wish to support targets where only the thumb instruction set is available. the changes made here avoid operating directly on the sp register, which is not possible in thumb code, and address an issue with the way the address of _DYNAMIC is computed. previously, the relative address of _DYNAMIC was stored with an additional offset of -8 versus the pc-relative add instruction, since on arm the pc register evaluates to ".+8". in thumb code, it instead evaluates to ".+4". both are two (normal-size) instructions beyond "." in the current execution mode, so the numbered label 2 used in the relative address expression is simply moved two instructions ahead to be compatible with both instruction sets.
* fix __syscall declaration with wrong visibility in syscall_arch.h (Szabolcs Nagy, 2015-04-30, 1 file, -2/+0)
| | | | | remove __syscall declaration where it is not needed (aarch64, arm, microblaze, or1k) and add the hidden attribute where it is (mips).