about summary refs log tree commit diff
path: root/arch/sh
Commit message (Collapse)AuthorAgeFilesLines
* move arch-invariant definitions out of bits/ioctl.hBobby Bingham2019-02-071-96/+0
|
* make thread-pointer-loading asm non-volatileRich Felker2018-10-161-1/+1
| | | | | | | | | | this will allow the compiler to cache and reuse the result, meaning we no longer have to take care not to load it more than once for the sake of archs where the load may be expensive. depends on commit 1c84c99913bf1cd47b866ed31e665848a0da84a2 for correctness, since otherwise the compiler could hoist loads during stage 3 of dynamic linking before the initial thread-pointer setup.
* add arm and sh bits/ptrace.hSzabolcs Nagy2018-09-201-0/+5
| | | | | | These should have been added in commit df6d9450ea19fd71e52cf5cdb4c85beb73066394 that added target specific PTRACE_ macros, but somehow got missed.
* apply hidden visibility to sigreturn code fragmentsRich Felker2018-09-121-1/+3
| | | | | | | these were overlooked in the declarations overhaul work because they are not properly declared, and the current framework even allows their declared types to vary by arch. at some point this should be cleaned up, but I'm not sure what the right way would be.
* define and use internal macros for hidden visibility, weak refsRich Felker2018-09-051-1/+3
| | | | | | | | | this cleans up what had become widespread direct inline use of "GNU C" style attributes directly in the source, and lowers the barrier to increased use of hidden visibility, which will be useful to recovering some of the efficiency lost when the protected visibility hack was dropped in commit dc2f368e565c37728b0d620380b849c3a1ddd78f, especially on archs where the PLT ABI is costly.
* fix async thread cancellation on sh-fdpicRich Felker2018-08-291-0/+5
| | | | | | | | | | | | | | if __cp_cancel was reached via __syscall_cp, r12 will necessarily still contain a GOT pointer (for libc.so or for the static-linked main program) valid for entering __cancel. however, in the case of async cancellation, r12 may contain any scratch value; it's not necessarily even a valid GOT pointer for the code that was interrupted. unlike in commit 0ec49dab6794166d67fae4764ce7fdea42ea6103 where the corresponding issue was fixed for powerpc64, there is fundamentally no way for fdpic code to recompute its GOT pointer. so a new mechanism is introduced for cancel_handler to write a GOT register value into the interrupted context on archs where it is needed.
* work around broken kernel struct ipc_perm on some big endian archsRich Felker2018-06-201-0/+2
| | | | | | | | | | | | | | | the mode member of struct ipc_perm is specified by POSIX to have type mode_t, which is uniformly defined as unsigned int. however, Linux defines it with type __kernel_mode_t, and defines __kernel_mode_t as unsigned short on some archs. since there is a subsequent padding field, treating it as a 32-bit unsigned int works on little endian archs, but the order is backwards on big endian archs with the erroneous definition. since multiple archs are affected, remedy the situation with fixup code in the affected functions (shmctl, semctl, and msgctl) rather than repeating the same shims in syscall_arch.h for every affected arch.
* fix TLS layout of TLS variant I when there is a gap above TPSzabolcs Nagy2018-06-022-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | In TLS variant I the TLS is above TP (or above a fixed offset from TP) but on some targets there is a reserved gap above TP before TLS starts. This matters for the local-exec tls access model when the offsets of TLS variables from the TP are hard coded by the linker into the executable, so the libc must compute these offsets the same way as the linker. The tls offset of the main module has to be alignup(GAP_ABOVE_TP, main_tls_align). If there is no TLS in the main module then the gap can be ignored since musl does not use it and the tls access models of shared libraries are not affected. The previous setup only worked if (tls_align & -GAP_ABOVE_TP) == 0 (i.e. TLS did not require large alignment) because the gap was treated as a fixed offset from TP. Now the TP points at the end of the pthread struct (which is aligned) and there is a gap above it (which may also need alignment). The fix required changing TP_ADJ and __pthread_self on affected targets (aarch64, arm and sh) and in the tlsdesc asm the offset to access the dtv changed too.
* reverse definition dependency between PAGESIZE and PAGE_SIZERich Felker2018-03-101-1/+1
| | | | | | PAGESIZE is actually the version defined in POSIX base, with PAGE_SIZE being in the XSI option. use PAGESIZE as the underlying definition to facilitate making exposure of PAGE_SIZE conditional.
* ioctl TIOCGPTPEER from linux v4.13Szabolcs Nagy2017-11-051-0/+1
| | | | | added for safe opening of peer end of pty in a mount namespace. new in linux commit c6325179238f1d4683edbec53d8322575d76d7e2
* add SIOCGSTAMPNS socket ioctl macro to ioctl.hSzabolcs Nagy2017-08-291-0/+1
| | | | | | | | | | it is defined in linux asm/sockios.h since commit ae40eb1ef30ab4120bd3c8b7e3da99ee53d27a23 (linux v2.6.22) but was missing from musl by accident. in musl the sockios macros are exposed in sys/ioctl.h together with other ioctl requests instead of in sys/socket.h because of namespace rules. (glibc has them in sys/socket.h under _GNU_SOURCE.)
* fix build failure for sh4a due to missing colon in asm statementThomas Petazzoni2017-08-111-1/+1
| | | | | | Due to a missing ":" in an asm() statement, the "memory" clobber is considered by gcc as an input operand and not a clobber, which causes a build failure.
* add bits/hwcap.h and include it in sys/auxv.hSzabolcs Nagy2016-10-201-0/+11
| | | | | | | | | aarch64, arm, mips, mips64, mipsn32, powerpc, powerpc64 and sh have cpu feature bits defined in linux for AT_HWCAP auxv entry, so expose those in sys/auxv.h it seems the mips hwcaps were never exposed to userspace neither by linux nor by glibc, but that's most likely an oversight.
* add sh syscall numbers from linux v4.8Szabolcs Nagy2016-10-201-0/+14
| | | | | sh was updated in linux commit 74bdaa611fa69368fb4032ad437af073d31116bd to have numbers for new syscalls.
* fix pread/pwrite syscall calling convention on shRich Felker2016-08-111-0/+1
| | | | | | | despite sh not generally using register-pair alignment for 64-bit syscall arguments, there are arch-specific versions of the syscall entry points for pread and pwrite which include a dummy argument for alignment before the 64-bit offset argument.
* make brace placement in public header struct definitions consistentRich Felker2016-07-032-4/+2
| | | | | | | | | | | | | | placing the opening brace on the same line as the struct keyword/tag is the style I prefer and seems to be the prevailing practice in more recent additions. these changes were generated by the command: find include/ arch/*/bits -name '*.h' \ -exec sed -i '/^struct [^;{]*$/{N;s/\n/ /;}' {} + and subsequently checked by hand to ensure that the regex did not pick up any false positives.
* remove termios2 related ioctls from sh ioctl.hSzabolcs Nagy2016-07-031-4/+0
| | | | musl does not define these on other targets either.
* add missing TIOC* macros to ioctl.hSzabolcs Nagy2016-07-031-0/+2
| | | | | these are defined in linux asm/ioctls.h. (powerpc64 and powerpc bits/ioctl.h are now identical)
* add missing SIOCSIFNAME from linux/sockios.h to ioctl.hSzabolcs Nagy2016-07-031-0/+1
| | | | glibc ioctl.h has it too.
* remove ioctl macros that were removed from linux uapiSzabolcs Nagy2016-07-031-2/+0
| | | | | TIOCTTYGSTRUCT, TIOCGHAYESESP, TIOCSHAYESESP and TIOCM_MODEM_BITS were removed from the linux uapi and not present in glibc ioctl.h
* deduplicate __NR_* and SYS_* syscall number definitionsBobby Bingham2016-05-122-684/+341
|
* deduplicate bits/mman.hSzabolcs Nagy2016-03-181-60/+0
| | | | | | | | | | | currently five targets use the same mman.h constants and the rest share most constants too, so move them to sys/mman.h before the bits/mman.h include where the differences can be corrected by redefinition of the macros. this fixes two minor bugs: POSIX_MADV_DONTNEED was wrong on most targets (it should be the same as MADV_DONTNEED), and sh defined the x86-only MAP_32BIT mmap flag.
* deduplicate the bulk of the arch bits headersRich Felker2016-01-2712-408/+0
| | | | | | | | | | | | all bits headers that were identical for a number of 'clean' archs are moved to the new arch/generic tree. in addition, a few headers that differed only cosmetically from the new generic version are removed. additional deduplication may be possible in mman.h and in several headers (limits.h, posix.h, stdint.h) that mostly depend on whether the arch is 32- or 64-bit, but they are left alone for now because greater gains are likely possible with more invasive changes to header logic, which is beyond the scope of this commit.
* add MCL_ONFAULT and MLOCK_ONFAULT mlockall and mlock2 flagsSzabolcs Nagy2016-01-261-0/+1
| | | | | | | | they lock faulted pages into memory (useful when a small part of a large mapped file needs efficient access), new in linux v4.4, commit b0f205c2a3082dd9081f9a94e50658c5fa906ff1 MLOCK_* is not in the POSIX reserved namespace for sys/mman.h
* remove sh port's __fpscr_values source fileRich Felker2016-01-221-5/+0
| | | | | | | | | commit f3ddd173806fd5c60b3f034528ca24542aecc5b9, the dynamic linker bootstrap overhaul, silently disabled the definition of __fpscr_values in this file since libc.so's copy of __fpscr_values now comes from crt_arch.h, the same place the public definition in the main program's crt1.o ultimately comes from. remove this file which is no longer in use.
* move sh port's __shcall internal function from arch/sh/src to src treeRich Felker2016-01-221-5/+0
|
* move sh __unmapself code from arch/sh/src to main src treeRich Felker2016-01-221-24/+0
|
* overhaul sh atomics for new atomics framework, add j-core cas.l backendRich Felker2016-01-214-287/+30
| | | | | | | | | | | | | | | | sh needs runtime-selected atomic backends since there are a number of supported models that use non-forwards-compatible (non-smp-compatible) atomic mechanisms. previously, the code paths for this were highly inefficient since they involved C function calls with multiple branches in the callee and heavy spills in the caller. the new code performs calls the runtime-selected asm fragment from inline asm with extremely minimal clobbers, rather than using a function call. for the sh4a case where the atomic mechanism is known and there is no forward-compatibility issue, the movli.l and movco.l instructions are provided as a_ll and a_sc, allowing the new shared atomic.h to generate efficient inline versions of all the basic atomic operations without needing a cas loop.
* refactor internal atomic.hRich Felker2016-01-211-72/+0
| | | | | | | | | | | | | | | rather than having each arch provide its own atomic.h, there is a new shared atomic.h in src/internal which pulls arch-specific definitions from arc/$(ARCH)/atomic_arch.h. the latter can be extremely minimal, defining only a_cas or new ll/sc type primitives which the shared atomic.h will use to construct everything else. this commit avoids making heavy changes to the individual archs' atomic implementations. definitions which are identical or near-identical to what the new shared atomic.h would produce have been removed, but otherwise the changes made are just hooking up the arch-specific files to the new infrastructure. major changes to take advantage of the new system will come in subsequent commits.
* fix dynamic loader library mapping for nommu systemsRich Felker2015-11-111-0/+2
| | | | | | | | | | | | | | | | | | | | | on linux/nommu, non-writable private mappings of files may actually use memory shared with other processes or the fs cache. the old nommu loader code (used when mmap with MAP_FIXED fails) simply wrote over top of the original file mapping, possibly clobbering this shared memory. no such breakage was observed in practice, but it should have been possible. the new code starts by mapping anonymous writable memory on archs that might support nommu, then maps load segments over top of it, falling back to read if MAP_FIXED fails. we use an anonymous map rather than a writable file map to avoid reading more data from disk than needed. since pages cannot be loaded lazily on fault, in case of large data/bss, mapping the full file may read a lot of data that will subsequently be thrown away when processing additional LOAD segments. as a result, we cannot skip the first LOAD segment when operating in this mode. these changes affect only non-FDPIC nommu support.
* generalize sh entry point asm not to assume call dests fit in 12 bitsRich Felker2015-11-021-5/+12
| | | | | | this assumption is borderline-unsafe to begin with, and fails badly with -ffunction-sections since the linker can move the callee arbitrarily far away when it lies in a different section.
* properly access mcontext_t program counter in cancellation handlerRich Felker2015-11-021-1/+1
| | | | | | | | | using the actual mcontext_t definition rather than an overlaid pointer array both improves correctness/readability and eliminates some ugly hacks for archs with 64-bit registers bit 32-bit program counter. also fix UB due to comparison of pointers not in a common array object.
* fix signal return for sh/fdpicRich Felker2015-09-231-0/+8
| | | | | | | | | | | | | | | | the restorer function pointer provided in the kernel sigaction structure is interpreted by the kernel as a raw code address, not a function descriptor. this commit moves the declarations of the __restore and __restore_rt symbols to ksigaction.h so that arch versions of the file can override them, and introduces a version for sh which declares them as objects rather than functions. an alternate solution would have been defining SA_RESTORER to 0 so that the functions are not used, but this both requires executable stack (since the sh kernel does not have a vdso page with permanent restorer functions) and crashes on qemu user-level emulation.
* have sh/fdpic entry point set fdpic personality if neededRich Felker2015-09-221-0/+12
| | | | | | | | | | | | | | | | | the entry point code supports being loaded by a loader which is not fdpic-aware (in practice, either kernel with mmu or qemu without fdpic support). this mostly just works, but signal handling will wrongly use a function descriptor address as a code address if the personality is not adjusted to fdpic. ideally this code could be placed with sigaction so that it's not needed except if/when a signal handler is installed. however, personality is incorrectly maintained per-thread by the kernel, rather than per-process, so it's necessary to correct the personality before any threads are started. also, in order to skip the personality syscall when an fdpic-aware loader is used, we need to be able to detect how the program was loaded, and this information is only readily available at the entry point.
* add real fdpic loading of shared librariesRich Felker2015-09-221-0/+1
| | | | | | previously, the normal ELF library loading code was used even for fdpic, so only the kernel-loaded dynamic linker and main app could benefit from separate placement of segments and shared text.
* size-optimize sh/fdpic dynamic entry pointRich Felker2015-09-221-0/+4
| | | | | | the __fdpic_fixup code is not needed for ET_DYN executables, which instead use reloctions, so we can omit it from the dynamic linker and static-pie entry point and save some code size.
* work around breakage in sh/fdpic __unmapself functionRich Felker2015-09-221-0/+5
| | | | | | | | | | | | the C implementation of __unmapself used for potentially-nommu sh assumed CRTJMP takes a function descriptor rather than a code address; however, the actual dynamic linker needs a code address, and so commit 7a9669e977e5f750cf72ccbd2614f8b72ce02c4c changed the definition of the macro in reloc.h. this commit puts the old macro back in a place where it only affects __unmapself. this is an ugly workaround and should be cleaned up at some point, but at least it's well isolated.
* add general fdpic support in dynamic linker and arch support for shRich Felker2015-09-222-3/+12
| | | | | | | | | | | | | | | | | | at this point not all functionality is complete. the dynamic linker itself, and main app if it is also loaded by the kernel, take advantage of fdpic and do not need constant displacement between segments, but additional libraries loaded by the dynamic linker follow normal ELF semantics for mapping still. this fully works, but does not admit shared text on nommu. in terms of actual functional correctness, dlsym's results are presently incorrect for function symbols, RTLD_NEXT fails to identify the caller correctly, and dladdr fails almost entirely. with the dynamic linker entry point working, support for static pie is automatically included, but linking the main application as ET_DYN (pie) probably does not make sense for fdpic anyway. ET_EXEC is equally relocatable but more efficient at representing relocations.
* add sh fdpic subarch variantsRich Felker2015-09-121-1/+16
| | | | | | | | | with this commit it should be possible to produce a working static-linked fdpic libc and application binaries for sh. the changes in reloc.h are largely unused at this point since dynamic linking is not supported, but the CRTJMP macro is used one place outside of dynamic linking, in __unmapself.
* add fdpic version of entry point code for shRich Felker2015-09-121-0/+29
| | | | | | | this version of the entry point is only suitable for static linking in ET_EXEC form. neither dynamic linking nor pie is supported yet. at some point in the future the fdpic and non-fdpic versions of this code may be unified but for now it's easiest to work with them separately.
* make sh clone asm fdpic-compatibleRich Felker2015-09-121-0/+5
| | | | | | | | | | | clone calls back to a function pointer provided by the caller, which will actually be a pointer to a function descriptor on fdpic. the obvious solution is to have a separate version of clone for fdpic, but I have taken a simpler approach to go around the problem. instead of calling the pointed-to function from asm, a direct call is made to an internal C function which then calls the pointed-to function. this lets the C compiler generate the appropriate calling convention for an indirect call with no need for ABI-specific assembly.
* switch to using trap number 31 for syscalls on shRich Felker2015-06-161-1/+1
| | | | | | | | | | | | | | | | | | | nominally the low bits of the trap number on sh are the number of syscall arguments, but they have never been used by the kernel, and some code making syscalls does not even know the number of arguments and needs to pass an arbitrary high number anyway. sh3/sh4 traditionally used the trap range 16-31 for syscalls, but part of this range overlapped with hardware exceptions/interrupts on sh2 hardware, so an incompatible range 32-47 was chosen for sh2. using trap number 31 everywhere, since it's in the existing sh3/sh4 range and does not conflict with sh2 hardware, is a proposed unification of the kernel syscall convention that will allow binaries to be shared between sh2 and sh3/sh4. if this is not accepted into the kernel, we can refit the sh2 target with runtime selection mechanisms for the trap number, but doing so would be invasive and would entail non-trivial overhead.
* switch sh port's __unmapself to generic version when running on sh2/nommuRich Felker2015-06-161-0/+19
| | | | | | | | | | | | | due to the way the interrupt and syscall trap mechanism works, userspace on sh2 must never set the stack pointer to an invalid value. thus, the approach used on most archs, where __unmapself executes with no stack for the interval between SYS_munmap and SYS_exit, is not viable on sh2. in order not to pessimize sh3/sh4, the sh asm version of __unmapself is not removed. instead it's renamed and redirected through code that calls either the generic (safe) __unmapself or the sh3/sh4 asm, depending on compile-time and run-time conditions.
* add support for sh2 interrupt-masking-based atomics to sh portRich Felker2015-06-163-8/+113
| | | | | | | | | | | | | | | | | | | the sh2 target is being considered an ISA subset of sh3/sh4, in the sense that binaries built for sh2 are intended to be usable on later cpu models/kernels with mmu support. so rather than hard-coding sh2-specific atomics, the runtime atomic selection mechanisms that was already in place has been extended to add sh2 atomics. at this time, the sh2 atomics are not SMP-compatible; since the ISA lacks actual atomic operations, the new code instead masks interrupts for the duration of the atomic operation, producing an atomic result on single-core. this is only possible because the kernel/hardware does not impose protections against userspace doing so. additional changes will be needed to support future SMP systems. care has been taken to avoid producing significant additional code size in the case where it's known at compile-time that the target is not sh2 and does not need sh2-specific code.
* add .text section directive to all crt_arch.h files missing itRich Felker2015-05-221-0/+1
| | | | | | | | i386 and x86_64 versions already had the .text directive; other archs did not. normally, top-level (file scope) __asm__ starts in the .text section anyway, but problems were reported with some versions of clang, and it seems preferable to set it explicitly anyway, at least for the sake of consistency between archs.
* inline llsc atomics when building for sh4aBobby Bingham2015-05-192-90/+128
| | | | | | | If we're building for sh4a, the compiler is already free to use instructions only available on sh4a, so we can do the same and inline the llsc atomics. If we're building for an older processor, we still do the same runtime atomics selection as before.
* fix sh jmp_buf size to match ABIRich Felker2015-04-271-1/+1
| | | | | | | | | | | | | | | | | while the sh port is still experimental and subject to ABI instability, this is not actually an application/libc boundary ABI change. it only affects third-party APIs where jmp_buf is used in a shared structure at the ABI boundary, because nothing anywhere near the end of the jmp_buf object (which includes the oversized sigset_t) is accessed by libc. both glibc and uclibc have 15-slot jmp_buf for sh. presumably the smaller version was used in musl because the slots for fpu status register and thread pointer register (gbr) were incorrect and must not be restored by longjmp, but the size should have been preserved, as it's generally treated as a libc-agnostic ABI property for the arch, and having extra slots free in case we ever need them for something is useful anyway.
* fix ldso name for sh-nofpu subarchRich Felker2015-04-241-1/+7
| | | | | | | | | | | | | | | previously it was using the same name as the default ABI with hard float (floating point args and return value in registers). the test __SH_FPU_ANY__ || __SH4__ matches what's used in the configure script already, and seems correct under casual review against gcc's config/sh.h, but may need tweaks. the logic for predefined macros for sh, and what they all mean, is very complex. eventually this should be documented in comments here. configure already rejects "half-hard" configurations on sh where double=float since these do not conform to Annex F and are not suitable for musl, so these do not need to be considered here.
* fix failure of sh reloc.h to properly detect endianness for ldso nameRich Felker2015-04-241-0/+2
| | | | | versions of reloc.h that rely on endian macros much include endian.h to ensure they are available.
* dynamic linker bootstrap overhaulRich Felker2015-04-133-34/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | this overhaul further reduces the amount of arch-specific code needed by the dynamic linker and removes a number of assumptions, including: - that symbolic function references inside libc are bound at link time via the linker option -Bsymbolic-functions. - that libc functions used by the dynamic linker do not require access to data symbols. - that static/internal function calls and data accesses can be made without performing any relocations, or that arch-specific startup code handled any such relocations needed. removing these assumptions paves the way for allowing libc.so itself to be built with stack protector (among other things), and is achieved by a three-stage bootstrap process: 1. relative relocations are processed with a flat function. 2. symbolic relocations are processed with no external calls/data. 3. main program and dependency libs are processed with a fully-functional libc/ldso. reduction in arch-specific code is achived through the following: - crt_arch.h, used for generating crt1.o, now provides the entry point for the dynamic linker too. - asm is no longer responsible for skipping the beginning of argv[] when ldso is invoked as a command. - the functionality previously provided by __reloc_self for heavily GOT-dependent RISC archs is now the arch-agnostic stage-1. - arch-specific relocation type codes are mapped directly as macros rather than via an inline translation function/switch statement.