path: root/src
Commit message | Author | Age | Files | Lines
* optimized C memcpy | Rich Felker | 2013-08-28 | 1 | -16/+111
  unlike the old C memcpy, this version handles word-at-a-time reads and writes even for misaligned copies. it does not require that the cpu support misaligned accesses; instead, it performs bit shifts to realign the bytes for the destination. essentially, this is the C version of the ARM assembly language memcpy. the ideas are all the same, and it should perform well on any arch with a decent number of general-purpose registers that has a barrel shift operation. since the barrel shifter is an optional cpu feature on microblaze, it may be desirable to provide an alternate asm implementation on microblaze, but otherwise the C code provides a competitive implementation for "generic risc-y" cpu archs that should alleviate the urgent need for arch-specific memcpy asm.
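The shift-based realignment described above can be sketched as follows. This is a minimal illustration and not the code from the commit: the helper name `realign_words`, the 32-bit word size, and the runtime endianness probe are all assumptions made for the example. It reads only aligned words from the source and combines each adjacent pair with two shifts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not musl's memcpy): fill dst[0..nwords-1] with the
 * bytes that start `off` bytes past the aligned word src[0].
 * It reads nwords + 1 aligned words from src and never performs a
 * misaligned access. */
static void realign_words(uint32_t *dst, const uint32_t *src,
                          unsigned off, size_t nwords)
{
    /* off == 0 needs no shifting, and would make `hi` a shift by 32 */
    assert(off >= 1 && off <= 3);

    /* runtime endianness probe; real code would decide at compile time */
    const union { uint32_t w; unsigned char b[4]; } probe = { 1 };
    int little = probe.b[0];

    unsigned lo = 8 * off, hi = 32 - lo;
    uint32_t w = src[0];
    for (size_t i = 0; i < nwords; i++) {
        uint32_t x = src[i + 1];
        /* drop the first `off` bytes of w, append the first `off` bytes of x */
        dst[i] = little ? (w >> lo) | (x << hi)
                        : (w << lo) | (x >> hi);
        w = x;
    }
}
```

A complete memcpy would also need byte-wise handling of the head and tail so that it never reads outside the source buffer; the sketch leaves that out.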
* fix invalid instruction mnemonics in powerpc fenv asm | Rich Felker | 2013-08-27 | 1 | -3/+3
  there is no non-dot version of the andis instruction, but there's no harm in updating the flags anyway, so just use the dot version.
* optimized C memset | Rich Felker | 2013-08-27 | 1 | -12/+77
  this version of memset is optimized both for small and large values of n, and makes no misaligned writes, so it is usable (and near-optimal) on all archs. it is capable of filling up to 52 or 56 bytes without entering a loop and with at most 7 branches, all of which can be fully predicted if memset is called multiple times with the same size.

  it also uses the attribute extension to inform the compiler that it is violating the aliasing rules, unlike the previous code which simply assumed it was safe to violate the aliasing rules since translation unit boundaries hide the violations from the compiler. for non-GNUC compilers, 100% portable fallback code in the form of a naive loop is provided. I intend to eventually apply this approach to all of the string/memory functions which are doing word-at-a-time accesses.
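One way to fill short buffers without a loop is to store from both ends at once, so that a few size checks cover every small length. The sketch below shows only that idea for up to 8 bytes; the name `fill_small` and the plain byte loop used for longer buffers are assumptions of the example, and the aligned word stores and aliasing attribute the commit describes are not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch of head/tail filling; not the committed memset. */
static void *fill_small(void *dest, int c, size_t n)
{
    unsigned char *s = dest;
    unsigned char b = (unsigned char)c;

    assert(dest || !n);
    if (!n) return dest;

    /* each pair writes one byte from the front and one from the back;
     * for small n the two stores overlap, which is harmless */
    s[0] = s[n - 1] = b;
    if (n <= 2) return dest;
    s[1] = s[n - 2] = b;
    s[2] = s[n - 3] = b;
    if (n <= 6) return dest;
    s[3] = s[n - 4] = b;
    if (n <= 8) return dest;

    /* middle part: naive loop here, where an optimized version
     * would switch to aligned word-sized stores */
    for (size_t i = 4; i < n - 4; i++) s[i] = b;
    return dest;
}
```

Because the branches depend only on n, repeated calls with the same size follow the same path every time, which is what makes them predictable.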
* add the %s (seconds since the epoch) format to strftime | Rich Felker | 2013-08-25 | 1 | -0/+4
  this is a nonstandard extension but will be required in the next version of POSIX, and it's widely used/useful in shell scripts utilizing the date utility.
* fix strftime regression in %e format | Rich Felker | 2013-08-24 | 1 | -2/+2
  %e pads with spaces instead of zeros.
* properly fill in tzname[] for old (pre-64-bit-format) zoneinfo files | Rich Felker | 2013-08-24 | 1 | -1/+22
  in this case, the first standard-time and first daylight-time rules should be taken as the "default" ones to expose.
* minor fix to tz name checking | Rich Felker | 2013-08-24 | 1 | -2/+2
  if a zoneinfo file is not (or is no longer) in use, don't check the abbrevs pointers, which may be invalid.
* fix strftime handling of time zone data | Rich Felker | 2013-08-24 | 4 | -8/+36
  this may need further revision in the future, since POSIX is rather unclear on the requirements, and is designed around the assumption of POSIX TZ specifiers which are not sufficiently powerful to represent real-world timezones (this is why zoneinfo support was added).

  the basic issue is that strftime gets the string and numeric offset for the timezone from the extra fields in struct tm, which are initialized when calling localtime/gmtime/etc. however, a conforming application might have created its own struct tm without initializing these fields, in which case using __tm_zone (a pointer) could crash. other zoneinfo-based implementations simply check for a null pointer, but otherwise can still crash if the field contains junk.

  simply ignoring __tm_zone and using tzname[] would "work" but would give incorrect results in time zones with more complex rules. I feel like this would lower the quality of implementation. instead, simply validate __tm_zone: unless it points to one of the zone name strings managed by the timezone system, assume it's invalid.

  this commit also fixes several other minor bugs with formatting: tm_isdst being negative is required to suppress printing of the zone formats, and %z was using the wrong format specifiers since the type of val was changed, resulting in bogus output.
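The validation idea can be illustrated with a pointer range check that never dereferences the untrusted pointer. Everything named here is hypothetical: `zone_names` stands in for whatever storage the timezone code actually manages, and the real check may compare against several separate strings rather than one buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the zone name strings owned by the
 * timezone code; the real layout is not shown in this log. */
static char zone_names[16] = "UTC\0EST\0EDT";

/* Decide whether p points into the managed storage without reading
 * through p, so a junk pointer cannot cause a crash. */
static int zone_ptr_is_managed(const char *p)
{
    uintptr_t a = (uintptr_t)p, lo = (uintptr_t)zone_names;
    return a >= lo && a - lo < sizeof zone_names;
}

/* Use the zone string only when it is known to be valid. */
static const char *zone_or_empty(const char *p)
{
    return zone_ptr_is_managed(p) ? p : "";
}
```

A null check alone would accept any non-null junk value, which is the failure mode the commit message attributes to other implementations.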
* make dlopen honor the rpath of the main program | Rich Felker | 2013-08-23 | 1 | -1/+1
  this seems to match what other systems do, and seems useful for programs that have their libraries and plugins stored relative to the executable.
* fix mishandling of empty or blank TZ environment variable | Rich Felker | 2013-08-23 | 1 | -1/+1
  the empty TZ string compared equal to the initial value of the cached TZ name, thus causing do_tzset never to run and never to initialize the time zone data.
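The failure mode is easy to reproduce with any cache whose initial contents equal a legal input. The sketch below is an assumption-laden illustration, not the one-line fix from the commit: `cached_tz`, `cache_valid`, `reload_count`, and `tz_update` are invented names, and the explicit validity flag is just one possible way to keep the first call from being skipped.

```c
#include <assert.h>
#include <string.h>

static char cached_tz[32];   /* zero-initialized, so it compares equal to "" */
static int cache_valid;      /* assumption of this sketch: explicit flag */
static int reload_count;     /* counts simulated time zone reloads */

/* Reload zone data only when TZ differs from the cached value. Without
 * the cache_valid test, a first call with "" would match the empty
 * initial cache and the data would never be initialized. */
static void tz_update(const char *tz)
{
    if (!tz) tz = "";
    if (cache_valid && strcmp(tz, cached_tz) == 0) return;
    strncpy(cached_tz, tz, sizeof cached_tz - 1);
    cached_tz[sizeof cached_tz - 1] = 0;
    cache_valid = 1;
    reload_count++;
}
```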
* fix regression in dn_expand/reverse dns | Rich Felker | 2013-08-23 | 1 | -1/+1
  off-by-one error copying the name components was yielding junk at the beginning and truncating one character at the end (of every component).
* fix bugs in $ORIGIN handling | Rich Felker | 2013-08-23 | 1 | -3/+9
  1. an occurrence of ${ORIGIN} before $ORIGIN would be ignored due to the strstr logic. (note that rpath contains multiple :-delimited paths to be searched.)
  2. data read by readlink was not null-terminated.
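The first bug comes from searching for one spelling before the other. A single left-to-right scan that tests both spellings at each `$` avoids it, as in this sketch; `find_origin` is a hypothetical name and the dynamic linker's actual fix may be structured differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return a pointer to the earliest "$ORIGIN" or
 * "${ORIGIN}" token in s and store its length in *len, or return NULL
 * if neither spelling occurs. */
static const char *find_origin(const char *s, size_t *len)
{
    for (const char *p = strchr(s, '$'); p; p = strchr(p + 1, '$')) {
        if (!strncmp(p, "${ORIGIN}", 9)) { *len = 9; return p; }
        if (!strncmp(p, "$ORIGIN", 7))   { *len = 7; return p; }
    }
    return NULL;
}
```

The second bug has the usual remedy: readlink returns a byte count and does not write a terminator, so the caller has to store the null byte at that index itself.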
* use AT_EXECFN, if available, for dynamic linker to identify main program | Rich Felker | 2013-08-23 | 1 | -1/+5
  fall back to argv[0] as before. unlike argv[0], AT_EXECFN was a valid (but possibly relative) pathname for the new program image at the time the execve syscall was made. as a special case, ignore AT_EXECFN if it begins with "/proc/", in order not to give bogus (and possibly harmful) results when fexecve was used.
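The selection rule reads naturally as a small pure function. This sketch assumes the caller has already fetched the AT_EXECFN string from the auxiliary vector (or has NULL if it is absent); the name `main_program_name` is invented for the example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: prefer the AT_EXECFN value when present, but
 * fall back to argv[0] when it is missing or names something under
 * /proc/, as it would after fexecve. */
static const char *main_program_name(const char *execfn, const char *argv0)
{
    if (execfn && strncmp(execfn, "/proc/", 6) != 0)
        return execfn;
    return argv0;
}
```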
* add rpath $ORIGIN processing to dynamic linker | Rich Felker | 2013-08-23 | 1 | -3/+59
* add recursive rpath support to dynamic linker | Rich Felker | 2013-08-23 | 1 | -12/+13
  previously, rpath was only honored for direct dependencies. in other words, if A depends on B and B depends on C, only B's rpath (if any), not A's rpath, was being searched for C. this limitation made rpath-based deployment difficult in the presence of multiple levels of library dependency. at present, $ORIGIN processing in rpath is still unsupported.
* fix missing string.h in strftime.c (needed by new strftime code) | Rich Felker | 2013-08-23 | 1 | -0/+1
  this bug was masked by local experimental CFLAGS in my config.mak.
* add strftime and wcsftime field widths | Rich Felker | 2013-08-22 | 2 | -24/+81
  at present, since POSIX requires %F to behave as %+4Y-%m-%d and ISO C requires %F to behave as %Y-%m-%d, the default behavior for %Y has been changed to match %+4Y. this seems to be the only way to conform to the requirements of both standards, and it does not affect years prior to the year 10000. depending on the outcome of interpretations from the standards bodies, this may be adjusted at some point.
* simplify strftime and fix integer overflows | Rich Felker | 2013-08-22 | 1 | -28/+12
  use a long long value so that even with offsets, values cannot overflow. instead of using different format strings for different numeric formats, simply use a per-format width and %0*lld for all of them. this width specifier is not for use with strftime field widths; that will be a separate step in the caller.
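The single-format approach can be shown with a thin wrapper around snprintf. The helper name `fmt_field` is an assumption for the example; only the `%0*lld` format string comes from the commit message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format val zero-padded to at least `width`
 * digits. Taking a long long means sums such as tm_year + 1900 can be
 * computed by the caller in long long without overflowing int. */
static int fmt_field(char *buf, size_t n, int width, long long val)
{
    return snprintf(buf, n, "%0*lld", width, val);
}
```

A caller would pass width 2 for fields such as the month or the hour and width 4 for the year, for example `fmt_field(buf, sizeof buf, 4, 1900LL + tm->tm_year)`.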
* strftime cleanup: avoid recomputing strlen when it's known | Rich Felker | 2013-08-22 | 1 | -10/+16
* more strftime refactoring | Rich Felker | 2013-08-22 | 1 | -23/+25
  make __strftime_fmt_1 return a string (possibly in the caller-provided temp buffer) rather than writing into the output buffer. this approach makes more sense when padding to a minimum field width might be required, and it's also closer to what wcsftime wants.
* begin refactoring strftime to make adding field widths easier | Rich Felker | 2013-08-22 | 1 | -151/+161
* unbreak vwarn: print ": " before errno message | Rich Felker | 2013-08-21 | 1 | -2/+5
  patch by Strake. this seems to be a regression caused by fixing the behavior of perror("") to match perror(0) at some point in the past.
* fix fenv exception functions to mask their argument | Szabolcs Nagy | 2013-08-18 | 8 | -18/+55
  fesetround.c is a wrapper to do the arch-independent argument check (on archs where the rounding mode is not stored in 2 bits, __fesetround still has to check its arguments).

  on powerpc, the fe*except functions do not accept the extra invalid flags of its fpscr register.

  the useless FENV_ACCESS pragma was removed from feupdateenv.
* optimize x86 feclearexcept: only use save/restore x87 fenv if needed | Szabolcs Nagy | 2013-08-18 | 2 | -27/+38
  the x87 exception summary (ES) and stack fault (SF) flags may be spuriously cleared by feclearexcept using the fnclex instruction, but these flags are not observable through libc hence maintaining their state is not critical.
* add sse fenv support on i386 through hwcap | Szabolcs Nagy | 2013-08-18 | 2 | -9/+61
  the sse and x87 rounding modes should always be the same, and the visible exception flags are the bitwise or of the two fenv states (so it's enough to query the rounding mode or raise exceptions on one fenv).
* fix i386 fesetenv: FE_DFL_ENV is (fenv_t*)-1 not 0 | Szabolcs Nagy | 2013-08-18 | 1 | -2/+2
* remove spurious tmp file present since initial git check-in | Rich Felker | 2013-08-17 | 1 | -390/+0
* add hkscs/big5-2003/eten extensions to iconv big5 | Rich Felker | 2013-08-17 | 3 | -977/+1433
  with these changes, the character set implemented as "big5" in musl is a pure superset of cp950, the canonical "big5", and agrees with the normative parts of Unicode. this means it has minor differences from both hkscs and big5-2003:

  - the range A2CC-A2CE maps to CJK ideographs rather than numerals, contrary to changes made in big5-2003.
  - C6CD maps to a CJK ideograph rather than its corresponding Kangxi radical character, contrary to changes made in hkscs.
  - F9FE maps to U+2593 rather than U+FFED.

  of these differences, none but the last are visually distinct, and the last is a character used purely for text-based graphics, not to convey linguistic content.

  should there be future demand for strict conformance to big5-2003 or hkscs mappings, the present charset aliases can be replaced with distinct variants.

  reportedly there are other non-standard big5 extensions in common use in Taiwan and perhaps elsewhere, which could also be added as layers on top of the existing big5 support.

  there may be additional characters which should be added to the hkscs table: the whatwg standard for big5 defines what appears to be a superset of hkscs.
* some initial math asm for armhf (fabs[f] and sqrt[f]) | Rich Felker | 2013-08-16 | 12 | -0/+32
* support floating point environment (fenv) on armhf (hard float) subarchs | Rich Felker | 2013-08-16 | 3 | -0/+62
  patch by nsz. I've tested it on an armhf machine and it seems to be working correctly.
* fix build of x86_64 expl assembly | Rich Felker | 2013-08-16 | 1 | -1/+1
  apparently this label change was not carried over when adapting the changes from the i386 version.
* math: fix pow(x,-1) to raise underflow properly | Szabolcs Nagy | 2013-08-15 | 1 | -2/+14
  if FLT_EVAL_METHOD!=0, check if (double)(1/x) is subnormal and not a power of 2 (if 1/x is a power of 2 then either it is exact or the long double to double rounding already raised inexact and underflow).
* math: fix i386 atan2.s to raise underflow for subnormal results | Szabolcs Nagy | 2013-08-15 | 2 | -2/+24
* math: clean up atan2.c | Szabolcs Nagy | 2013-08-15 | 4 | -103/+73
  * remove volatile hacks
  * don't care about inexact flag for now (removed all the +-tiny)
  * fix atanl to raise underflow properly
  * remove signed int arithmetics
  * use pi/2 instead of pi_o_2 (gcc generates the same code, which is not correct, but it does not matter: we mainly care about nearest rounding)
* math: fix x86 asin, atan, exp, log1p to raise underflow | Szabolcs Nagy | 2013-08-15 | 6 | -3/+98
  underflow is raised by an inexact subnormal float store. since subnormal operations are slow, check the underflow flag and skip the store if it's already raised.
* math: fix x86 expl.s to raise underflow and clean up special case handling | Szabolcs Nagy | 2013-08-15 | 2 | -45/+31
* math: fix asin, atan, log1p, tanh to raise underflow on subnormal | Szabolcs Nagy | 2013-08-15 | 9 | -26/+39
  for these functions f(x)=x for small inputs, because f(0)=0 and f'(0)=1, but for subnormal values they should raise the underflow flag (required by annex F). if they are approximated by a polynomial around 0 then spurious underflow should be avoided (not required by annex F).

  all these functions should raise the inexact flag for small x if x!=0, but it's not required by the standard and it does not seem a worthy goal, so support for it is removed in some cases.

  raising underflow:
  - x*x may not raise underflow for subnormal x if FLT_EVAL_METHOD!=0
  - x*x may raise spurious underflow for normal x if FLT_EVAL_METHOD==0
  - in case of double subnormal x, store x as float
  - in case of float subnormal x, store x*x as float
* math: fix tgamma to raise underflow for large negative values | Szabolcs Nagy | 2013-08-15 | 1 | -0/+1
* math: fix pow(0,-inf) to raise divbyzero flag | Szabolcs Nagy | 2013-08-15 | 2 | -2/+2
* math: minor scalbn*.c simplification | Szabolcs Nagy | 2013-08-15 | 3 | -18/+10
* fix length computation in dn_expand | Rich Felker | 2013-08-14 | 1 | -3/+5
  there are two possible points where the length is evaluated: either the first 'compression' jump, or the null terminator if no jumps have taken place yet. the previous code only measured the length of the first component.
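The two measurement points can be seen in a compact name-expansion routine. The following is a sketch written for this log, not the source of dn_expand: the name `expand_name`, the parameter layout, and the cap of 127 jumps are assumptions. The jump cap is there because a compression pointer that targets itself produces no output, so an output-length limit alone would not stop it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Sketch: expand the DNS name stored at src inside the message
 * [base, end) into dest as dotted text. Returns the number of bytes
 * the name occupies at src, or -1 on malformed input. */
static int expand_name(const unsigned char *base, const unsigned char *end,
                       const unsigned char *src, char *dest, int space)
{
    const unsigned char *p = src;
    char *dbegin = dest, *dend;
    int len = -1, jumps = 0;

    if (p >= end || space <= 0) return -1;
    dend = dest + (space > 254 ? 254 : space);  /* clamp to max name length */

    for (;;) {                       /* invariants: p < end, dest < dend */
        if (*p & 0xc0) {             /* compression pointer */
            if (end - p < 2) return -1;
            int off = ((p[0] & 0x3f) << 8) | p[1];
            /* measurement point 1: the first jump ends the name at src */
            if (len < 0) len = (int)(p + 2 - src);
            if (off >= end - base) return -1;
            if (++jumps > 127) return -1;   /* assumption: bound the jumps */
            p = base + off;
        } else if (*p) {             /* ordinary label */
            int n = *p++;
            if (dest != dbegin) *dest++ = '.';
            if (n >= end - p || n >= dend - dest) return -1;
            while (n--) *dest++ = (char)*p++;
        } else {                     /* null terminator */
            *dest = 0;
            /* measurement point 2: no jump was taken before the end */
            if (len < 0) len = (int)(p + 1 - src);
            return len;
        }
    }
}
```

For a name stored as the label `ftp` followed by a pointer, the returned length is 6 (four bytes of label plus two bytes of pointer), even though the expanded text is much longer.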
* de-duplicate dn_expand, fix return value and signature, clean up | Rich Felker | 2013-08-14 | 2 | -48/+23
  the duplicate code in dn_expand and its incorrect return values are both results of the history of the code: the version in __dns.c was originally written with no awareness of the legacy resolver API, and was later copy-and-paste duplicated to provide the legacy API.

  this commit is the first of a series that will restructure the internal dns code to share as much code as possible with the legacy resolver API functions.

  I have also removed the loop detection logic, since the output buffer length limit naturally prevents loops. in order to avoid long runtime when encountering a loop if the caller provided a ridiculously long buffer, the caller-provided length is clamped at the maximum dns name length.
* add arm-optimized memcpy implementation from bionic libc | Rich Felker | 2013-08-14 | 3 | -0/+383
  the approach of this implementation was heavily investigated prior to adopting it. attempts to obtain similar performance with pure C code were capping out at about 75% of the performance of the asm, with considerably larger code size, and were fragile in that the compiler would sometimes compile part of memcpy into a call to itself. therefore, just using the asm seems to be the best option.

  this commit is the first to make use of the new subarch-specific asm framework. the new armel directory is the location for arm asm that should not be used for all arm subarchs, only the default one. armhf is the name of the little-endian hardfloat-ABI subarch, which can use the exact same asm. in both cases, the build system finds the asm by following a memcpy.sub file.

  the other two subarchs, armeb and armebhf, would need a big-endian variant of this code. it would not be hard to adapt the code to big endian, but I will hold off on doing so until there is demand for it.
* fix _NSIG and SIGRTMAX on mips | Rich Felker | 2013-08-10 | 1 | -1/+3
  a mips signal mask contains 128 bits, enough for signals 1 through 128. however, the exit status obtained from the wait-family functions only has room for values up to 127. reportedly signal 128 was causing kernelspace bugs, so it was removed from the kernel recently; even without that issue, however, it was impossible to support it correctly in userspace.

  at the same time, the bug was masked on musl by SIGRTMAX incorrectly yielding 64 on mips, rather than the "correct" value of 128. now that the _NSIG issue is fixed, SIGRTMAX can be fixed at the same time, exposing the full range of signals for application use.

  note that the (nonstandardized) libc _NSIG value is actually one greater than the max signal number, and also one greater than the kernel headers' idea of _NSIG. this is the reason for the discrepancy with the recent kernel changes. since reducing _NSIG by one brought it down from 129 to 128, rather than from 128 to 127, _NSIG/8, used widely in the musl sources, is unchanged.
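Both numeric claims in this message are small integer facts that can be checked directly. The helpers below are illustrative only: `mask_bytes` and `status_termsig` are invented names, and the 7-bit field reflects how Linux encodes the terminating signal in the low bits of a wait status.

```c
#include <assert.h>

/* Bytes of signal mask implied by a libc-style _NSIG
 * (max signal number + 1), using the _NSIG/8 idiom. */
static int mask_bytes(int nsig)
{
    return nsig / 8;
}

/* The terminating-signal field of a wait status is 7 bits wide,
 * so only values up to 127 can be represented. */
static int status_termsig(int status)
{
    return status & 0x7f;
}
```

Integer division gives 16 for both 129 and 128 but 15 for 127, which is why lowering _NSIG from 129 to 128 leaves every _NSIG/8 computation unchanged. A terminating signal of 128 would be truncated to 0 in the status field.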
* add pthread_setaffinity_np and pthread_getaffinity_np functions | Rich Felker | 2013-08-10 | 3 | -18/+26
* add cpu affinity interfaces | Rich Felker | 2013-08-10 | 3 | -0/+29
  this first commit just includes the CPU_* and sched_* interfaces, not the pthread_* interfaces, which may be added later. simple sanity-check testing has been done for the basic interfaces, but most of the macros have not yet been tested.
* change sigset_t functions to restrict to _NSIG | Rich Felker | 2013-08-09 | 4 | -5/+5
  the idea here is to avoid advertising signals that don't exist and to make these functions safe to call (e.g. from within other parts of the implementation) on fake sigset_t objects which do not have the HURD padding.
* optimize posix_spawn to avoid spurious sigaction syscalls | Rich Felker | 2013-08-09 | 3 | -12/+37
  the trick here is that sigaction can track for us which signals have ever had a signal handler set for them, and only those signals need to be considered for reset. this tracking mask may have false positives, since it is impossible to remove bits from it without race conditions. false negatives are not possible since the mask is updated with atomic operations prior to making the sigaction syscall.

  implementation-internal signals are set to SIG_IGN rather than SIG_DFL so that a signal raised in the parent (e.g. calling pthread_cancel on the thread executing posix_spawn) does not have any chance to make it to the child, where it would cause spurious termination by signal.

  this change reduces the minimum/typical number of syscalls in the child from around 70 to 4 (including execve). this should greatly improve the performance of posix_spawn and other interfaces which use it (popen and system).

  to facilitate these changes, sigismember is also changed to return 0 rather than -1 for invalid signals, and to return the actual status of implementation-internal signals. POSIX allows but does not require an error on invalid signal numbers, and in fact returning an error tends to confuse applications which wrongly assume the return value of sigismember is boolean.
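The tracking mask can be sketched with C11 atomics. This is an illustration under stated assumptions, not the code from the commit: `handler_set`, `note_handler`, `signals_to_reset`, and the constant `SKETCH_NSIG` are invented names, and the real implementation may use its own atomic primitives and mask layout.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stddef.h>

#define SKETCH_NSIG 65   /* assumption: signals 1..64 */
#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* One bit per signal that has ever had a handler installed.
 * Bits are only ever set, never cleared. */
static _Atomic unsigned long handler_set[(SKETCH_NSIG + WORD_BITS - 1) / WORD_BITS];

/* Called before a handler is installed, so a signal with a handler
 * always has its bit set (no false negatives). */
static void note_handler(int sig)
{
    assert(sig >= 1 && sig < SKETCH_NSIG);
    size_t bit = (size_t)(sig - 1);
    atomic_fetch_or(&handler_set[bit / WORD_BITS], 1UL << (bit % WORD_BITS));
}

/* Collect the signals a spawned child would need to reset;
 * returns how many were written to out. */
static int signals_to_reset(int *out)
{
    int n = 0;
    for (int sig = 1; sig < SKETCH_NSIG; sig++) {
        size_t bit = (size_t)(sig - 1);
        if (atomic_load(&handler_set[bit / WORD_BITS]) & (1UL << (bit % WORD_BITS)))
            out[n++] = sig;
    }
    return n;
}
```

A process that never installs a handler gets an empty list, so the child can skip the per-signal resets entirely.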
* fix missing errno from exec failure in posix_spawn | Rich Felker | 2013-08-09 | 1 | -0/+1
  failures prior to the exec attempt were reported correctly, but on exec failure, the return value contained junk.
* block all signals, even implementation-internal ones, in faccessat child | Rich Felker | 2013-08-09 | 1 | -1/+1
  the child process's stack may be of insufficient size to support a signal frame, and there is no reason these signal handlers should run in the child anyway.