mirror/musl - mirror of git://git.musl-libc.org/musl

	Commit message	Author	Age	Files	Lines
*	fix excess precision in return value of i386 log-family functions	Rich Felker	2020-02-06	8	-0/+20
\|
*	fix excess precision in return value of i386 acos[f] and asin[f]	Rich Felker	2020-02-06	6	-42/+75
\| \| \| \| \|	analogous to commit 1c9afd69051a64cf085c6fb3674a444ff9a43857 for atan[2][f].
*	fix excess precision in return value of i386 atan[2][f]	Rich Felker	2020-02-06	4	-2/+8
\| \| \| \| \| \| \| \| \| \| \| \|	for functions implemented in C, this is a requirement of C11 (F.6); strictly speaking that text does not apply to standard library functions, but it seems to be intended to apply to them, and C2x is expected to make it a requirement. failure to drop excess precision is particularly bad for inverse trig functions, where a value with excess precision can be outside the range of the function (entire range, or range for a particular subdomain), breaking reasonable invariants a caller may expect.
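The commit above changes i386 assembly, but the same idea can be sketched in C. This is only an illustration, assuming a conforming compiler where a cast or assignment discards extra range and precision. The helper names `narrow_to_float` and `within_float_half_pi` are hypothetical and do not come from musl.

```c
#include <assert.h>

/* Hypothetical helper: round a value computed in a wider format down to
 * float. The volatile store forces an actual float-sized store, so the
 * rounding step cannot be skipped by keeping the value in a register. */
float narrow_to_float(long double wide)
{
    volatile float f = (float)wide;
    return f;
}

/* Once narrowed, an asin-like result cannot exceed the float nearest to
 * pi/2 in magnitude. Returns 1 if that invariant holds for the input. */
int within_float_half_pi(long double wide_result)
{
    const float half_pi = 1.57079632679489661923f;
    float r = narrow_to_float(wide_result);
    assert(r == r); /* this sketch does not handle NaN */
    return r <= half_pi && r >= -half_pi;
}
```

A caller comparing the result against `(float)M_PI_2`-style bounds relies on exactly this rounding having taken place before the value is returned.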
*	remove legacy time32 timer[fd] syscalls from public syscall.h	Rich Felker	2020-02-05	1	-0/+16
\| \| \| \| \| \| \|	this extends commit 5a105f19b5aae79dd302899e634b6b18b3dcd0d6, removing timer[fd]_settime and timer[fd]_gettime. the timerfd ones are likely to have been used in software that started using them before it could rely on libc exposing functions.
*	remove further legacy time32 clock syscalls from public syscall.h	Rich Felker	2020-02-05	1	-0/+16
\| \| \| \| \|	this extends commit 5a105f19b5aae79dd302899e634b6b18b3dcd0d6, removing clock_settime, clock_getres, clock_nanosleep, and settimeofday.
*	fix incorrect results for catanf and catanl with some inputs	Rich Felker	2020-02-05	2	-26/+2
\| \| \| \| \| \|	catan was fixed in 10e4bd3780050e75b72aac5d85c31816419bb17d but the same bug in catanf and catanl was overlooked. the patch is completely analogous.
*	remove legacy clock_gettime and gettimeofday from public syscall.h	Rich Felker	2020-01-30	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	some nontrivial number of applications have historically performed direct syscalls for these operations rather than using the public functions. such usage is invalid now that time_t is 64-bit and these syscalls no longer match the types they are used with, and it was already harmful before (by suppressing use of vdso). since syscall() has no type safety, incorrect usage of these syscalls can't be caught at compile-time. so, without manually inspecting or running additional tools to check sources, the risk of such errors slipping through is high. this patch renames the syscalls on 32-bit archs to clock_gettime32 and gettimeofday_time32, so that applications using the original names will fail to build without being fixed. note that there are a number of other syscalls that may also be unsafe to use directly after the time64 switchover, but (1) these are the main two that seem to be in widespread use, and (2) most of the others continue to have valid usage with a null timeval/timespec argument, as the argument is an optional timeout or similar.
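For application code affected by this rename, the usual fix is to call the public function instead of issuing the syscall directly. A minimal sketch, assuming a POSIX system that provides `CLOCK_MONOTONIC`; the function name `monotonic_ns` is made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Read the monotonic clock through the libc function rather than
 * syscall(SYS_clock_gettime, ...). The libc function always matches the
 * current struct timespec layout and is free to use the vdso. */
int64_t monotonic_ns(void)
{
    struct timespec ts;
    int r = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(r == 0); /* assumption: CLOCK_MONOTONIC is available */
    (void)r;
    return (int64_t)ts.tv_sec * 1000000000 + ts.tv_nsec;
}
```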
*	math/x32: correct lrintl.s for 32-bit long	Alexander Monakov	2020-01-27	1	-2/+2
\|
*	add thumb2 support to arm assembler memcpy	Andre McCurdy	2020-01-16	2	-6/+9
\| \| \| \| \| \| \|	For Thumb2 compatibility, replace two instances of a single instruction "orr with a variable shift" with the two-instruction equivalent. Neither of the replacements is in a performance-critical loop.
*	fix wcwidth wrongly returning 0 for most of planes 4 and up	Rich Felker	2020-01-01	1	-1/+1
\| \| \| \| \| \|	commit 1b0ce9af6d2aa7b92edaf3e9c631cb635bae22bd introduced this bug back in 2012 and it was never noticed, presumably since the affected planes are essentially unused in Unicode.
*	move stage3_func typedef out of shared internal dynlink.h header	Rich Felker	2019-12-31	1	-1/+0
\| \| \| \|	this interface contract is entirely internal to dynlink.c.
*	spare archs without time32 legacy the cost of ioctl fallback conversions	Rich Felker	2019-12-22	1	-1/+1
\| \| \| \| \| \|	adding this condition makes the entire convert_ioctl_struct function and compat_map table statically unreachable, and thereby optimized out by dead code elimination, on archs where they are not needed.
*	add further ioctl time64 fallback conversion for device-specific command	Rich Felker	2019-12-22	1	-0/+3
\| \| \| \| \| \| \| \|	VIDIOC_OMAP3ISP_STAT_REQ is a device-specific command for the omap3isp video device. the command number is in a device-private range and therefore could theoretically be used by other devices too in the future, but problematic clashes should not be able to arise without intentional misuse.
*	don't continue looping through ioctl compat_map after finding match	Rich Felker	2019-12-21	1	-0/+1
\| \| \| \| \| \|	there's only one matching entry for any given command so this had no functional distinction, but additional loops are pointless and wasteful.
*	revert unwanted and inadvertent change that slipped into mmap.c	Rich Felker	2019-12-20	1	-1/+0
\| \| \| \| \| \| \|	commit ae388becb529428ac926da102f1d025b3c3968da accidentally introduced #define SYSCALL_NO_TLS 1 in mmap.c, which was probably a stale change left around from unrelated syscall timing measurements. reverse it.
*	add further ioctl time64 fallback conversions	Rich Felker	2019-12-20	1	-1/+49
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	this commit covers all remaining ioctls I'm aware of that use time_t-derived types in their interfaces. it may still be incomplete, and has undergone only minimal testing for a few commands used in audio playback. the SNDRV_PCM_IOCTL_SYNC_PTR command is special-cased because, rather than the whole structure expanding, it has two substructures each padded to 64 bytes that expand within their own 64-byte reserved zone. as long as it's the only one of its type, it doesn't really make sense to make a general framework for it, but the existing table framework is still used for the substructures in the special-case. one of the substructures, snd_pcm_mmap_status, has a snd_pcm_uframes_t member which is not a timestamp but is expanded just like one, to match the 64-bit-arch version of the structure. this is handled just like a timestamp at offset 8, and is the motivation for the conversions table holding offsets of individual values to be expanded rather than timespec/timeval type pairs. for some of the types, the size to which they expand is dependent on whether the arch's ABI aligns 8-byte types on 8-byte boundaries. new_req entries in the table need to reflect this size to get the right ioctl request number that will match what callers pass, but we don't have access to the actual structure type definitions here and duplicating them would be cumbersome. instead, the new_misaligned macro introduced here constructs an artificial object whose size is the result of expanding a misaligned timespec/timeval to 64-bit and imposing the arch's alignment on the result, which can be passed to the _IO{R,W,WR} macros.
*	improve ioctl time64 conversion fallback framework	Rich Felker	2019-12-19	1	-17/+18
\| \| \| \| \| \| \| \| \|	record offsets of individual slots that expand from 32- to 64-bit, rather than timespec/timeval pairs. this flexibility will be needed for some ioctls. reduce size of types in table. adjust representation of offsets to include a count rather than needing -1 padding so that the table is less ugly and doesn't need large diffs if we increase max number of slots.
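The table layout described above (a count plus offsets of individual 32-bit slots that grow to 64-bit) can be sketched roughly as follows. This is not musl's code: `struct conv_entry`, `find_entry`, and `expand_slots` are hypothetical names, and the sketch ignores alignment padding, which the real conversion has to account for.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical table entry: the command it applies to, the size of the
 * old (time32) structure, and the byte offsets of the 32-bit slots that
 * must grow to 64 bits. noffs counts the valid offsets, so no padding
 * values are needed in offs. */
struct conv_entry {
    unsigned cmd;
    unsigned char old_size;
    unsigned char noffs;
    unsigned char offs[4];
};

/* Return the entry for cmd, or NULL. Returns on the first match, since
 * at most one entry is expected per command. */
const struct conv_entry *find_entry(const struct conv_entry *tab,
                                    size_t n, unsigned cmd)
{
    for (size_t i = 0; i < n; i++)
        if (tab[i].cmd == cmd) return &tab[i];
    return NULL;
}

/* Copy old into new_, widening each listed 32-bit slot to a
 * sign-extended 64-bit value. Returns the number of bytes written. */
size_t expand_slots(const struct conv_entry *e,
                    const unsigned char *old, unsigned char *new_)
{
    size_t op = 0, np = 0; /* read and write positions */
    for (unsigned i = 0; i < e->noffs; i++) {
        size_t off = e->offs[i];
        assert(off >= op && off + 4 <= e->old_size);
        memcpy(new_ + np, old + op, off - op); /* unchanged bytes */
        np += off - op;
        int32_t v32;
        memcpy(&v32, old + off, sizeof v32);
        int64_t v64 = v32; /* sign-extend */
        memcpy(new_ + np, &v64, sizeof v64);
        np += sizeof v64;
        op = off + sizeof v32;
    }
    memcpy(new_ + np, old + op, e->old_size - op);
    return np + (e->old_size - op);
}
```

Storing slot offsets rather than timespec/timeval pairs means a lone 32-bit field that must widen can be listed the same way as a timestamp member.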
*	convert ioctl time64 fallbacks to table-driven framework	Rich Felker	2019-12-18	1	-17/+66
\| \| \| \| \| \| \| \|	with the current set of supported ioctls, this conversion is hardly an improvement, but it sets the stage for being able to do alsa, v4l2, ppp, and other ioctls with timespec/timeval-derived types. without this capability, a lot of functionality users depend on would stop working with the time64 switchover.
*	hook recvmmsg up to SO_TIMESTAMP[NS] fallback for pre-time64 kernels	Rich Felker	2019-12-17	2	-6/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	always try the time64 syscall first since we can use its success to conclude that no conversion is needed (any setsockopt for the timestamp options would have succeeded without need for fallbacks). otherwise, we have to remember the original controllen for each msghdr, requiring O(vlen) space, so vlen must be bounded. linux clamps it to IOV_MAX for sendmmsg only (not recvmmsg), but doing the same for recvmmsg is not unreasonable, especially since the limitation will only apply to old kernels. we could optimize to avoid trying SYS_recvmmsg_time64 first if all msghdrs have controllen zero, or support unlimited vlen by looping and emulating the timeout logic, but I'm not inclined to do complex and error-prone optimizations on a function that has so many underlying problems it should really never be used.
*	implement SO_TIMESTAMP[NS] fallback for kernels without time64 versions	Rich Felker	2019-12-17	5	-0/+71
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	the definitions of SO_TIMESTAMP* changed on 32-bit archs in commit 38143339646a4ccce8afe298c34467767c899f51 to the new versions that provide 64-bit versions of timeval/timespec structure in control message payload. socket options, being state attached to the socket rather than function calls, are not trivial to implement as fallbacks on ENOSYS, and support for them was initially omitted on the assumption that the ioctl-based polling alternatives (SIOCGSTAMP*) could be used instead by applications if setsockopt fails. unfortunately, it turns out that SO_TIMESTAMP is sufficiently old and widely supported that a number of applications assume it's available and treat errors as fatal. this patch introduces emulation of SO_TIMESTAMP[NS] on pre-time64 kernels by falling back to setting the "_OLD" (time32) versions of the options if the time64 ones are not recognized, and performing translation of the SCM_TIMESTAMP[NS] control messages in recvmsg. since recvmsg does not know whether its caller is legacy time32 code or time64, it performs translation for any SCM_TIMESTAMP[NS]_OLD control messages it sees, leaving the original time32 timestamp as-is (it can't be rewritten in-place anyway, and memmove would be mildly expensive) and appending the converted time64 control message at the end of the buffer. legacy time32 callers will see the converted one as a spurious control message of unknown type; time64 callers running on pre-time64 kernels will see the original one as a spurious control message of unknown type. a time64 caller running on a kernel with native time64 support will only see the time64 version of the control message. 
emulation of SO_TIMESTAMPING is not included at this time since (1) applications which use it seem to be prepared for the possibility that it's not present or working, and (2) it can also be used in sendmsg control messages, in a manner that looks complex to emulate completely, and costly even when running on a time64-supporting kernel. corresponding changes in recvmmsg are not made at this time; they will be done separately.
*	arm: avoid conditional branch to PLT in sigsetjmp	Andre McCurdy	2019-12-07	1	-2/+3
\| \| \| \| \|	The R_ARM_THM_JUMP19 relocation type generated for the original code when targeting Thumb 2 is not supported by the gold linker.
*	riscv64: fix fesetenv(FE_DFL_ENV) crash	Ruinland ChuanTzu Tsai	2019-12-07	1	-1/+4
\| \| \| \| \|	When FE_DFL_ENV is passed to fesetenv(), the very first instruction lw t1, 0(a0) will fail since a0 is -1.
*	ppc: add configure check for older compilers erroring on 'd' constraint	rofl0r	2019-11-05	2	-2/+2
\|
*	fix time64 link regression of dlsym stub for static-linked programs	Rich Felker	2019-11-03	1	-0/+4
\| \| \| \| \| \| \| \| \|	in commit 22daaea39f1cc5f7391f0a5cd84576ffb58c2860, the __dlsym_redir_time64 function providing the backend for __dlsym_time64 was defined only in the dynamic linker, and thus was undefined when static linking a program referencing dlsym. use the same stub_dlsym definition that provides __dlsym (the non-redirecting backend) for static linked programs to provide it, conditional on _REDIR_TIME64.
*	add __dlsym_time64 asm entry point for all legacy-32bit-time_t archs	Rich Felker	2019-11-02	9	-0/+27
\|
*	make fstatat fill in old time32 stat fields too	Rich Felker	2019-10-28	1	-0/+16
\| \| \| \| \| \| \| \| \| \| \|	here _REDIR_TIME64 is used as an indication that there's an old ABI, and thereby the old time32 timespec fields of struct stat. keeping struct stat compatible and providing both versions of the timespec fields is done so that ftw/nftw does not need painful compat shims, and (more importantly) so that similar interfaces between pairs of libc consumers (applications/libraries) will be less likely to break when one has been rebuilt for time64 but the other has not.
*	disable lfs64 aliases for remapped time64 functions	Rich Felker	2019-10-28	6	-0/+14
\| \| \| \| \| \| \|	these functions cannot provide the glibc lfs64-ABI-compatible symbols when time_t differs from what it was in that ABI. instead, the aliases need to be provided by the time32 compat shims or through some other mechanism.
*	update case mappings to unicode 12.1.0	Rich Felker	2019-10-25	1	-85/+92
\|
*	update ctype data to unicode 12.1.0	u_quark	2019-10-25	4	-201/+232
\|
*	overhaul wide character case mapping implementation	Rich Felker	2019-10-25	2	-290/+345
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	the existing implementation of case mappings was very small (typically around 1.5k), but unmaintainable, requiring manual addition of new case mappings with each new edition of Unicode. often, it turned out that newly-added case mappings were not easily representable in the existing tightly-constrained table structures, requiring new hacks to be invented and delaying support for new characters. the new implementation added here follows the pattern used for character class membership, with a two-level table allowing Unicode blocks for which no data is needed to be elided. however, rather than single-bit data, each character maps to one of up to 6 case-mapping rules available to its block, where 6 is floor(cbrt(256)) and allows 3 characters to be represented per byte (vs 8 with bit tables). blocks that would need more than 6 rules designate one as an exception and let lookup pass into a binary search of exceptional cases for the block. the number 6 was chosen empirically; many blocks would be ok with 4 rules (uncased, lower, upper, possible exceptions), some even just with 2, but the latter are rare and fitting 4 characters per byte rather than 3 does not save significant space. moreover, somewhat surprisingly, there are sufficiently many blocks where even 4 rules don't suffice without a lot of exceptions (blocks where some case pairs are laced, others offset) that originally I was looking at supporting variable-width tables, with 1-, 2-, or 3-bit entries, thereby allowing blocks with 8 rules. as implemented in my experiments, that version was significantly larger and involved more memory accesses/cache lines. improvements in size at the expense of some performance might be possible by utilizing iswalpha data or merging the table of case mapping identity with alphabetic identity.
these were explored somewhat when the code was first written, and might be worth revisiting in the future.
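The "3 characters per byte" figure follows from 6*6*6 = 216 fitting in a byte. The arithmetic can be sketched as below; the actual table layout in musl is not shown in this log, so `pack_rules` and `rule_at` are hypothetical names that illustrate only the base-6 packing.

```c
#include <assert.h>

/* Pack three rule indices (each 0..5) into one byte as base-6 digits:
 * the largest value is 5 + 6*5 + 36*5 = 215, which fits in 8 bits. */
unsigned char pack_rules(unsigned r0, unsigned r1, unsigned r2)
{
    assert(r0 < 6 && r1 < 6 && r2 < 6);
    return (unsigned char)(r0 + 6 * r1 + 36 * r2);
}

/* Look up the rule index for character position pos within a block
 * whose rule indices are stored three per byte. */
unsigned rule_at(const unsigned char *tab, unsigned pos)
{
    static const unsigned divisor[3] = { 1, 6, 36 };
    return tab[pos / 3] / divisor[pos % 3] % 6;
}
```

With 4 rules the same scheme would fit 4 characters per byte (4*4*4*4 = 256), which is the alternative the message compares against.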
*	add missing case mapping between U+03F3 and U+037F	Rich Felker	2019-10-25	1	-0/+1
\| \| \| \| \| \|	somehow this seems to have been overlooked. add it now so that subsequent overhaul of case mapping implementation will not introduce a functional change at the same time.
*	fix errno for posix_openpt with no free ptys available	Rich Felker	2019-10-24	1	-1/+3
\| \| \| \|	linux fails the open with ENOSPC, but POSIX mandates EAGAIN.
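The errno translation can be sketched as follows. This is an illustration rather than musl's source: `remap_openpt_result` and `openpt_sketch` are hypothetical names, and opening `/dev/ptmx` directly is an assumption about how the pty multiplexer is reached on Linux.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>

/* Translate the kernel's ENOSPC into the EAGAIN that POSIX specifies
 * for pty exhaustion; every other result passes through unchanged. */
int remap_openpt_result(int r)
{
    if (r < 0 && errno == ENOSPC) errno = EAGAIN;
    return r;
}

/* Hypothetical wrapper: open the pty multiplexer and apply the
 * remapping to the result. */
int openpt_sketch(int flags)
{
    int r = remap_openpt_result(open("/dev/ptmx", flags));
    assert(r >= 0 || errno != ENOSPC); /* ENOSPC never escapes */
    return r;
}
```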
*	clock_adjtime: generalize time64 not to assume old struct layout match	Rich Felker	2019-10-20	1	-11/+46
\| \| \| \| \| \| \| \| \| \| \| \| \|	commit 2b4fd6f75b4fa66d28cddcf165ad48e8fda486d1 added time64 for this function, but did so with a hidden assumption that the new time64 version of struct timex will be layout-compatible with the old one. however, there is little benefit to doing it that way, and the cost is permanent special-casing of 32-bit archs with 64-bit time_t in the public interface definitions. instead, do a full translation of the structure going in and out. this commit is actually a revision to an earlier uncommitted version of the code.
*	wait4, getrusage: add time64/x32 variant	Rich Felker	2019-10-19	2	-3/+61
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	presently the kernel does not actually define time64 versions of these syscalls, and they're not really needed except to represent extreme cpu time usage. however, x32's versions of the syscalls already behave as time64 ones, meaning the functions were broken on x32 if the caller used any part of the rusage result other than ru_utime and ru_stime. commit 7e8171143124f7f510db555dc6f6327a965a3e84 made it possible to fix this by treating x32's syscalls as time64 versions. in the non-time64-syscall case, make the syscall with the rusage destination pointer adjusted so that all members but the timevals line up between the libc and kernel structures. on 64-bit archs, or present 32-bit archs with 32-bit time_t, the timevals will line up too and no further work is needed. for future 32-bit archs with 64-bit time_t, the timevals are copied into place, contingent on time_t being larger than long.
*	fix return value of ungetc when argument is outside unsigned char range	Rich Felker	2019-10-18	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	aside from the special value EOF, ungetc is specified to accept and convert values outside the range of unsigned char. conversion takes place automatically as part of assignment when storing into the buffer, but the return value is also required to be the resulting converted value, and this requirement was not satisfied. simplified from patch by Wang Jianjian.
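The return-value rule can be shown with a toy pushback buffer. This is a sketch only, not the stdio implementation; `struct pushback` and `push_back` are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Toy pushback buffer standing in for a FILE's unget area. */
struct pushback {
    unsigned char buf[8];
    int len;
};

/* Mirror the ungetc contract: EOF is rejected, any other value is
 * converted to unsigned char both when stored and when returned. */
int push_back(struct pushback *p, int c)
{
    assert(p->len >= 0);
    if (c == EOF || p->len == (int)sizeof p->buf) return EOF;
    p->buf[p->len++] = (unsigned char)c; /* conversion on store */
    return (unsigned char)c;             /* same conversion on return */
}
```

Returning `c` unchanged instead of `(unsigned char)c` is the kind of mismatch the commit describes: the buffer holds the converted value while the caller sees the unconverted one.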
*	fix incorrect use of fabs on long double operand in floatscan.c	Rich Felker	2019-10-18	1	-4/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	based on patch by Dan Gohman, who caught this via compiler warnings. analysis by Szabolcs Nagy determined that it's a bug, whereby errno can be set incorrectly for values where the coercion from long double to double causes rounding. it seems likely that floating point status flags may be set incorrectly as a result too. at the same time, clean up use of preprocessor concatenation involving LDBL_MANT_DIG, which spuriously depends on it being a single unadorned decimal integer literal, and instead use the equivalent formulation 2/LDBL_EPSILON. an equivalent change on the printf side was made in commit bff6095d915f3e41206e47ea2a570ecb937ef926.
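Both points in this message can be checked with a few lines of C. The sketch below assumes a binary floating-point format (FLT_RADIX of 2); the function names are hypothetical and the code is not taken from floatscan.c.

```c
#include <assert.h>
#include <float.h>

/* 2/LDBL_EPSILON equals 2^LDBL_MANT_DIG for binary formats, because
 * LDBL_EPSILON is 2^(1-LDBL_MANT_DIG). No token pasting is involved, so
 * it does not matter how the implementation spells LDBL_MANT_DIG. */
long double two_pow_mant_dig(void)
{
    return 2 / LDBL_EPSILON;
}

/* The same power of two built by repeated doubling, for comparison. */
long double two_pow(int n)
{
    assert(n >= 0);
    long double p = 1;
    while (n-- > 0) p *= 2;
    return p;
}

/* Nonzero if converting x to double changes its value. That conversion
 * happens implicitly when a long double is passed to fabs rather than
 * fabsl. */
int rounds_as_double(long double x)
{
    volatile double d = (double)x;
    return (long double)d != x;
}
```

On targets where long double is wider than double, `rounds_as_double(1.0L + LDBL_EPSILON)` is nonzero, which is the rounding the analysis in the message refers to.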
*	mips: add single-instruction math functions	info@mobile-stream.com	2019-10-14	4	-0/+64
\| \| \| \| \| \| \|	SQRT.fmt exists on MIPS II+ (float), MIPS III+ (double). ABS.fmt exists on MIPS I+ but only cores with ABS2008 flag in FCSR implement the required behaviour.
*	fix cacosh results for arguments with negative imaginary part	Michael Morrell	2019-10-14	3	-3/+12
\|
*	math: fix signed int left shift ub in sqrt	Szabolcs Nagy	2019-10-13	2	-4/+2
\| \| \| \| \| \| \|	Both sqrt and sqrtf shifted the signed exponent as signed int to adjust the bit representation of the result. There are signed right shifts too in the code but those are implementation defined and are expected to compile to arithmetic shift on supported compilers and targets.
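A minimal sketch of the kind of fix described, assuming an IEEE single-precision layout with the exponent field starting at bit 23. The function name `exp_field` is hypothetical and the code is not the actual sqrt source.

```c
#include <assert.h>
#include <stdint.h>

/* Move a possibly negative exponent adjustment into the exponent field
 * position. Converting to uint32_t first makes the shift well defined;
 * left-shifting a negative int32_t would be undefined behavior. */
uint32_t exp_field(int32_t e)
{
    assert(e >= -255 && e <= 255);
    return (uint32_t)e << 23;
}
```

The unsigned result wraps modulo 2^32, so adding it to the bit representation of a float adjusts the exponent in either direction.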
*	fix aliasing-based undefined behavior in mbsrtowcs	Rich Felker	2019-10-13	1	-2/+8
\| \| \| \| \| \| \| \| \| \| \|	mbsrtowcs contains "vectorized" loops to quickly step over bytes without the high bit set; these have undefined behavior by virtue of aliasing uint32_t over top of char data for the accesses. commit 4d0a82170a25464c39522d7190b9fe302045ddb2 fixed the corresponding usage in string functions by using the may_alias attribute conditional on __GNUC__ and disabled the vectorized code in its absence. do the same for mbsrtowcs.
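The pattern described, a word-at-a-time loop that is only enabled when the may_alias attribute is available, can be sketched as below. This is not the mbsrtowcs code: `w32` and `ascii_prefix_len` are hypothetical names, and the sketch simply counts leading bytes without the high bit set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef __GNUC__
/* A 32-bit type that is allowed to alias char data. */
typedef uint32_t __attribute__((__may_alias__)) w32;
#endif

/* Count leading bytes of s (at most n) that have the high bit clear. */
size_t ascii_prefix_len(const unsigned char *s, size_t n)
{
    size_t i = 0;
    assert(s != NULL || n == 0);
#ifdef __GNUC__
    /* Step bytewise until the pointer is aligned for word access. */
    while (i < n && (uintptr_t)(s + i) % sizeof(w32) && s[i] < 0x80) i++;
    if (i < n && (uintptr_t)(s + i) % sizeof(w32) == 0) {
        /* Skip four bytes at a time while none has the high bit set. */
        while (n - i >= sizeof(w32)
               && !(*(const w32 *)(s + i) & 0x80808080u))
            i += sizeof(w32);
    }
#endif
    /* Bytewise tail; also the whole loop when the attribute is absent. */
    while (i < n && s[i] < 0x80) i++;
    return i;
}
```

Without `__GNUC__` only the bytewise loop remains, which is slower but has no aliasing problem.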
*	remove remaining traces of __tls_get_new	Szabolcs Nagy	2019-09-29	5	-12/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some declarations of __tls_get_new were left in the code, even though the definition got removed in commit 9d44b6460ab603487dab4d916342d9ba4467e6b9 ("install dynamic tls synchronously at dlopen, streamline access"). This can make the build fail with "ld: lib/libc.so: hidden symbol `__tls_get_new' isn't defined" when libc.so is linked without --gc-sections, because a .hidden declaration in asm code creates a reference even if the symbol is not actually used.
*	math: optimize lrint on 32bit targets	Szabolcs Nagy	2019-09-27	1	-1/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	lrint in (LONG_MAX, 1/DBL_EPSILON) and in (-1/DBL_EPSILON, LONG_MIN) is not trivial: rounding to int may be inexact, but the conversion to int may overflow and then the inexact flag must not be raised. (the overflow threshold is rounding mode dependent). this matters on 32bit targets (without single instruction lrint or rint), so the common case (when there is no overflow) is optimized by inlining the lrint logic, otherwise the old code is kept as a fallback. on my laptop an i486 lrint call is asm:10ns, old c:30ns, new c:21ns; on a smaller arm core: old c:71ns, new c:34ns; on a bigger arm core: old c:27ns, new c:19ns.
*	fix mips setjmp/longjmp fpu state on r6, related issues	Rich Felker	2019-09-26	2	-24/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	mips32 has two fpu register file variants: FR=0 with 32 32-bit registers, where pairs of neighboring even/odd registers are used to represent doubles, and FR=1 with 32 64-bit registers, each of which can store a single or double. up through r5 (our "mips" arch), the supported ABI uses FR=0, but modern compilers generate "fpxx" model code that can safely operate with either model. r6, which is an incompatible but similar ISA, drops FR=0 and only provides the FR=1 model. as such, setjmp and longjmp, which depended on being able to save and restore call-saved doubles by storing and loading their 32-bit halves, were completely broken in the presence of floating point code on mips r6. to fix this, use the s.d and l.d mnemonics to store and load fpu registers. these expand to the existing swc1 and lwc1 instructions for pairs of 32-bit fpu registers on mips1, but on mips2 and later they translate directly to the 64-bit sdc1 and ldc1. with FR=0, sdc1 and ldc1 behave just like the pairs of swc1 and lwc1 instructions they replace, storing or loading the even/odd pair of fpu registers that can be treated as separate single-precision floats or as a unit representing a double. but with FR=1, they store/load individual 64-bit registers. this yields the ABI-correct behavior on mips r6, and should make linking of pre-r6 (plain "mips") code with "fp64" model code workable, although this is and will likely remain unsupported usage. in addition to the mips r6 problem this change fixes, reportedly clang's internal assembler refuses to assemble swc1 and lwc1 instructions for odd register indices when building for "fpxx" model (the default). this caused setjmp and longjmp not to build. by using the s.d and l.d forms, this problem is avoided too. as a bonus, code size is reduced everywhere but mips1.
*	arm: fix setjmp and longjmp asm for armv8-a	Szabolcs Nagy	2019-09-26	2	-0/+14
\| \| \| \| \| \| \| \| \|	armv8 removed the coprocessor instructions other than cp14, so on an armv8 system the related hwcaps should never be set. new llvm complains about the use of coprocessor instructions in armv8-a mode (even though they are never executed at runtime), so ifdef them out when musl is built for armv8.
*	fix data race in timer_create with SIGEV_THREAD notification	Rich Felker	2019-09-25	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	in the timer thread start function, self->timer_id was accessed without synchronization; the timer thread could fail to see the store from the calling thread, resulting in timer_delete failing to delete the correct kernel-level timer. this fix is based on a patch by changdiankang, but with the load moved to after receiving the timer_delete signal rather than just after the start barrier, so as not to retain the possibility of data race with timer_delete.
*	harden thread start with failed scheduling against broken __clone	Rich Felker	2019-09-13	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 8a544ee3a2a75af278145b09531177cab4939b41 introduced a dependency of the failure path for explicit scheduling at thread creation on __clone's handling of the start function returning, which should result in SYS_exit. as noted in commit 05870abeaac0588fb9115cfd11f96880a0af2108, the arm version of __clone was broken in this case. in the past, the mips version was also broken; it was fixed in commit 8b2b61e0001281be0dcd3dedc899bf187172fecb. since this code path is pretty much entirely untested (previously only reachable in applications that call the public clone() and return from the start function) and consists of fragile per-arch asm, don't assume it works, at least not until it's been thoroughly tested. instead make the SYS_exit syscall from the start function's failure path.
*	fix %lf in wprintf	Brion Vibber	2019-09-13	1	-0/+2
\| \| \| \| \| \| \|	commit cc3a4466605fe8dfc31f3b75779110ac93055bc1 fixed this for printf but neglected to fix wprintf. Previously, %lf caused a failure to output.
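In C99 and later the l length modifier has no effect on the f conversion, so %lf should format exactly like %f in the wide functions too. A small check using swprintf, assuming the default "C" locale; the wrapper name `format_double` is hypothetical.

```c
#include <assert.h>
#include <wchar.h>

/* Format x with %lf into a wide buffer of n elements. Returns the
 * number of wide characters written, or a negative value on failure. */
int format_double(wchar_t *buf, size_t n, double x)
{
    assert(n > 0);
    return swprintf(buf, n, L"%lf", x);
}
```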
*	fix arm __tlsdesc_dynamic when built as thumb code without __ARM_ARCH>=5	Rich Felker	2019-09-11	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	we don't actually support building asm source files as thumb1, but it's possible that the condition __ARM_ARCH>=5 would be false on old compilers that did not define __ARM_ARCH at all. avoiding that would require enumerating all of the possible __ARM_ARCH_*__ macros for testing. as noted in commit 05870abeaac0588fb9115cfd11f96880a0af2108, mov lr,pc is not valid for saving a return address when in thumb mode. since this code is a hot path (dynamic TLS access), don't do the out-of-line bl->bx chaining to save the return value; instead, use the fact that this file is preprocessed asm to add the missing thumb bit with an add in place of the mov. the change here does not affect builds for ISA levels new enough to have a thread pointer read instruction, or for armv5 and later as long as the compiler properly defines __ARM_ARCH, or for any build as arm (not thumb) code. it's likely that it makes no difference whatsoever to any present-day practical build environments, but nonetheless now it's safe. as an alternative, we could just assume __thumb__ implies availability of blx since we don't support building asm source files as thumb1. I didn't do that in order to avoid having a wrong assumption here if that ever changes.
*	fix arm __a_barrier_oldkuser when built as thumb	Rich Felker	2019-09-11	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \|	as noted in commit 05870abeaac0588fb9115cfd11f96880a0af2108, mov lr,pc is not a valid method for saving the return address in code that might be built as thumb. this one is unlikely to matter, since any ISA level that has thumb2 should also have native implementations of atomics that don't involve kuser_helper, and the affected code is only used on very old kernels to begin with.
*	fix code path where child function returns in arm __clone built as thumb	Rich Felker	2019-09-11	1	-7/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	mov lr,pc is not a valid way to save the return address in thumb mode since it omits the thumb bit. use a chain of bl and bx to emulate blx. this could be avoided by converting to a .S file with preprocessor conditions to use blx if available, but the time cost here is dominated by the syscall anyway. while making this change, also remove the remnants of support for pre-bx ISA levels. commit 9f290a49bf9ee247d540d3c83875288a7991699c removed the hack from the parent code paths, but left the unnecessary code in the child. keeping it would require rewriting two code paths rather than one, and is useless for reasons described in that commit.