path: root/src/locale/langinfo.c
Commit message (Author, Age, Files, Lines)
* reduce spurious inclusion of libc.h (Rich Felker, 2018-09-12, 1 file, -1/+0)
| libc.h was intended to be a header for access to global libc state and related interfaces, but ended up included all over the place because it was the way to get the weak_alias macro.
|
| most of the inclusions removed here are places where weak_alias was needed. a few were recently introduced for hidden. some go all the way back to when libc.h defined CANCELPT_BEGIN and _END, and all (wrongly implemented) cancellation points had to include it. remaining spurious users are mostly callers of the LOCK/UNLOCK macros and files that use the LFS64 macro to define the awful *64 aliases.
|
| in a few places, new inclusion of libc.h is added because several internal headers no longer implicitly include libc.h.
|
| declarations for __lockfile and __unlockfile are moved from libc.h to stdio_impl.h so that the latter does not need libc.h. putting them in libc.h made no sense at all, since the macros in stdio_impl.h are needed to use them correctly anyway.
* fix nl_langinfo_l(CODESET, loc) reporting wrong locale's value (Rich Felker, 2018-03-07, 1 file, -1/+1)
| use of MB_CUR_MAX encoded a hidden dependency on the currently active locale for the calling thread, whereas nl_langinfo_l is supposed to report for the locale passed as an argument.
* add _NL_LOCALE_NAME extension to nl_langinfo (Rich Felker, 2017-07-31, 1 file, -0/+4)
| since setlocale(cat, NULL) is required to return the setting for the global locale, there is no standard mechanism to obtain the name of the currently active thread-local locale set by uselocale. this makes it impossible for application/library software to load appropriate translations, etc. unless using the gettext implementation provided by libc, which has privileged access to libc internals.
|
| to fill this gap, glibc introduced the _NL_LOCALE_NAME macro which can be used with nl_langinfo to obtain the name. GNU gettext/gnulib code already use this functionality on glibc, and can easily be adapted to make use of it on non-glibc systems if it's available; for other systems they poke at locale implementation internals, which we want to avoid.
|
| this patch provides a compatible interface to the one glibc introduced.
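As a rough illustration of how application code might use the extension described above, the sketch below asks for the name of the active locale for one category. The function name `active_locale_name` is hypothetical. The fallback branch is an assumption for systems without the macro, and it can only report the global locale.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* exposes _NL_LOCALE_NAME where the libc offers it */
#endif
#include <langinfo.h>
#include <locale.h>

/* Hypothetical helper: name of the locale in effect for category
 * (for example LC_MESSAGES) in the calling thread. */
const char *active_locale_name(int category)
{
#ifdef _NL_LOCALE_NAME
    /* Reflects a thread-local locale installed with uselocale, if any. */
    return nl_langinfo(_NL_LOCALE_NAME(category));
#else
    /* Fallback: reports the global locale only. */
    return setlocale(category, NULL);
#endif
}
```

A translation loader could pass the returned name to its own catalog lookup instead of reaching into locale implementation internals.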
* fix return value of nl_langinfo for invalid item arguments (Rich Felker, 2015-11-10, 1 file, -5/+5)
| it was wrongly returning a null pointer instead of an empty string.
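The practical consequence for callers can be sketched as follows: because an invalid item yields an empty string rather than a null pointer, the result can be dereferenced without a null check. The helper name `item_has_text` is hypothetical, and the item value used in the usage note is only assumed to be invalid.

```c
#include <langinfo.h>

/* Hypothetical helper: returns 1 if nl_langinfo yields a non-empty
 * string for item, 0 otherwise. Note that some valid items may also
 * be empty in a given locale, so 0 does not prove the item is invalid. */
int item_has_text(nl_item item)
{
    const char *s = nl_langinfo(item);
    /* The null check is defensive; a conforming libc returns "". */
    return s != 0 && s[0] != '\0';
}
```

For example, `item_has_text(CODESET)` is expected to be 1, while an out-of-range value such as `(nl_item)0x7f0000` (assumed here not to correspond to any real item) should give 0.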
* make nl_langinfo(CODESET) always return "ASCII" in byte-based C locale (Rich Felker, 2015-10-01, 1 file, -1/+1)
| commit 844212d94f582c4e3c5055e0a1524931e89ebe76, which did not make it into any releases, changed nl_langinfo(CODESET) to always return "UTF-8", even in the byte-based C locale.
|
| this was problematic because application software was found to use the string match for "UTF-8" to activate its own UTF-8 processing. this both undermines the byte-based functionality of the C locale, and if mixed with calls to the standard multibyte functions, which happened in practice, could result in severe mis-handling of input.
|
| the motive for the previous change was that, to avoid widespread compatibility problems, the string returned by nl_langinfo(CODESET) needs to be accepted by iconv and by third-party character conversion code. thus, the only remaining choice is "ASCII". this choice accurately represents the intent that high bytes do not have individual meaning in the C locale, but it does mean that iconv, when passed nl_langinfo(CODESET) in the C locale, will produce errors in cases where mbrtowc would have succeeded. for reference, glibc behaves similarly in this regard, so I don't think it will be a problem.
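The application pattern mentioned in this message, matching the codeset string against "UTF-8" to switch on UTF-8 handling, might look roughly like the sketch below. The function name `active_codeset_is_utf8` is hypothetical.

```c
#include <langinfo.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the active locale reports "UTF-8"
 * as its codeset, 0 otherwise. In the C locale this returns 0 once
 * CODESET is "ASCII", so byte-based processing stays in effect. */
int active_codeset_is_utf8(void)
{
    return strcmp(nl_langinfo(CODESET), "UTF-8") == 0;
}
```

A program that never calls setlocale runs in the C locale, so this check yields 0 there; after `setlocale(LC_ALL, "")` the result depends on the environment.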
* fix breakage in nl_langinfo from previous commit (Rich Felker, 2015-09-09, 1 file, -1/+1)
|
* make nl_langinfo(CODESET) always return "UTF-8" (Rich Felker, 2015-09-09, 1 file, -2/+1)
| this restores the original behavior prior to the addition of the byte-based C locale and fixes what is effectively a regression in musl's property of always providing working UTF-8 support.
|
| commit 1507ebf837334e9e07cfab1ca1c2e88449069a80 introduced the codeset name "UTF-8-CODE-UNITS" for the byte-based C locale to represent that the semantic content is UTF-8 but that it is being processed as code units (bytes) rather than whole multibyte characters. however, many programs assume that the codeset name is usable with iconv and/or comes from a set of standard/widely-used names known to the application. such programs are likely to produce warnings or errors, run with reduced functionality, or mangle character data when run explicitly in the C locale.
|
| the standard places basically no requirements for the string returned by nl_langinfo(CODESET) and how it interacts with other interfaces, so returning "UTF-8" is permissible. moreover, it seems like the right thing to do, since the identity of the character encoding as "UTF-8" is independent of whether it is being processed as bytes or characters by the standard library functions.
* byte-based C locale, phase 1: multibyte character handling functions (Rich Felker, 2015-06-16, 1 file, -1/+2)
| this patch makes the functions which work directly on multibyte characters treat the high bytes as individual abstract code units rather than as multibyte sequences when MB_CUR_MAX is 1.
|
| since MB_CUR_MAX is presently defined as a constant 4, all of the new code added is dead code, and optimizing compilers' code generation should not be affected at all. a future commit will activate the new code.
|
| as abstract code units, bytes 0x80 to 0xff are represented by wchar_t values 0xdf80 to 0xdfff, at the end of the surrogates range. this ensures that they will never be misinterpreted as Unicode characters, and that all wctype functions return false for these "characters" without needing locale-specific logic. a high range outside of Unicode such as 0x7fffff80 to 0x7fffffff was also considered, but since C11's char16_t also needs to be able to represent conversions of these bytes, the surrogate range was the natural choice.
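The byte-to-code-unit mapping described above can be sketched in a few lines. The helper names below are hypothetical and chosen for illustration; they are not claimed to match the names used inside musl.

```c
#include <wchar.h>

/* Hypothetical helper: map a high byte (0x80..0xff) to its abstract
 * code unit in 0xdf80..0xdfff, the end of the surrogates range. */
static wchar_t byte_to_codeunit(unsigned char b)
{
    return (wchar_t)(0xdf00 | b);
}

/* Hypothetical helper: true if wc lies in 0xdf80..0xdfff. The unsigned
 * subtraction wraps for smaller values, so one comparison suffices. */
static int is_codeunit(wchar_t wc)
{
    return (unsigned)wc - 0xdf80u < 0x80u;
}

/* Hypothetical helper: recover the original byte from a code unit. */
static unsigned char codeunit_to_byte(wchar_t wc)
{
    return (unsigned char)(wc & 0xff);
}
```

Because every value in this range is a low surrogate, none of them is a valid Unicode scalar value, which is what keeps them from being mistaken for real characters.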
* add support for LC_TIME and LC_MESSAGES translations (Rich Felker, 2014-07-26, 1 file, -0/+1)
| for LC_MESSAGES, translation of strerror and similar literal message functions is supported. for messages in other places (particularly the dynamic linker) that use format strings, translation is not yet supported. in order to make it possible and safe, such messages will need to be refactored to separate the textual content from the format.
|
| for LC_TIME, the day and month names and strftime-style format strings provided by nl_langinfo are supported for translation. however there may be limitations, as some of the original C-locale nl_langinfo strings are non-unique and thus perhaps non-suitable as keys.
|
| overall, the locale support activated by this commit should not be seen as complete and polished but as a basis for beginning to test locale functionality and implement locales.
* add missing yes/no strings to nl_langinfo (Rich Felker, 2014-07-26, 1 file, -2/+2)
| these were removed from the standard but still offered as an extension in langinfo.h, so nl_langinfo should support them.
* fix nl_langinfo table for LC_TIME era-related items (Rich Felker, 2014-07-26, 1 file, -1/+2)
| due to a skipped slot and missing null terminator, the last few strings were off by one or two slots from their item codes.
* properly pass current locale to *_l functions when used internally (Rich Felker, 2014-07-02, 1 file, -1/+2)
| this change is presently non-functional since the callees do not yet use their locale argument for anything.
* fix semantically incorrect use of LC_GLOBAL_LOCALE (Rich Felker, 2013-07-28, 1 file, -1/+1)
| LC_GLOBAL_LOCALE refers to the global locale, controlled by setlocale, not the thread-local locale in effect which these functions should be using. neither LC_GLOBAL_LOCALE nor 0 as an argument to the *_l functions has behavior defined by the standard, but 0 is a more logical choice for requesting the callee to lookup the current locale. in the future I may move the current locale lookup to the caller (the non-_l-suffixed wrapper).
|
| at this point, all of the locale logic is dummied out, so no harm was done, but it should at least avoid misleading usage.
* rework langinfo code for ABI compat and for use by time code (Rich Felker, 2013-07-24, 1 file, -2/+8)
|
* fix nl_langinfo to actually use the existing, correct internal version (Rich Felker, 2011-04-03, 1 file, -2/+5)
|
* initial check-in, version 0.5.0 [tag: v0.5.0] (Rich Felker, 2011-02-12, 1 file, -0/+58)