path: root/localedata/locales
Commit message | Author | Age | Files | Lines
* ja_JP locale: Add entry for the new Japanese era [BZ #22964] | TAMUKI Shoichi | 2019-04-02 | 1 | -2/+4
    The Japanese era name will be changed on May 1, 2019. The Japanese
    government made a preliminary announcement on April 1, 2019. The glibc
    ja_JP locale must be updated to include the new era name for strftime's
    alternative year format support.

    Checked on x86_64-linux-gnu.

    Reviewed-by: Carlos O'Donell <carlos@redhat.com>

    ChangeLog:

    [BZ #22964]
    * localedata/locales/ja_JP (LC_TIME): Add entry for the new Japanese era.
    * time/tst-strftime2.c (dates): Add 2019-04-30 and 2019-05-01.
      (mkreftable): Add rules for the new Japanese era and the new dates.
* Add verbose comments to 'era' in ja_JP locale. | Carlos O'Donell | 2019-04-01 | 1 | -0/+18
    Reviewed-by: Rafal Luzynski <digitalfreak@lingonborough.com>
    Reviewed-by: TAMUKI Shoichi <tamuki@linet.gr.jp>
* tt_RU: Fix orthographic mistakes in day and abday sections [BZ #24296] | mansayk | 2019-03-20 | 1 | -14/+14
    This commit fixes some errors and converts all weekday names to
    lowercase. The content is synchronized with CLDR-34 now, but trailing
    dots are removed from abday values in order to maintain consistency with
    the previous values and with many other locales which do the same.

    [BZ #24296]
    * localedata/locales/tt_RU (day): Update from CLDR-34, fix errors.
      (abday): Likewise, but remove the trailing dots.
* localedata: Add Minguo calendar support to Taiwanese locales [BZ #24293] | Felix Yan | 2019-03-15 | 5 | -0/+20
    The Minguo calendar is the official calendar system in Taiwan and is
    very widely used there. This commit adds support for it to glibc.

    Some background information: the government website (www.gov.tw) uses
    it, and popular public services like Taiwan HSR also use this calendar
    system.

    Link to Wikipedia: https://en.wikipedia.org/wiki/Minguo_calendar

    [BZ #24293]
    * localedata/locales/zh_TW (era): Add, support Minguo calendar.
    * localedata/locales/cmn_TW (era): Likewise.
    * localedata/locales/hak_TW (era): Likewise.
    * localedata/locales/lzh_TW (era): Likewise.
    * localedata/locales/nan_TW (era): Likewise.
* Bug 24307: Update to Unicode 12.0.0 | Mike FABIAN | 2019-03-08 | 8 | -2333/+2443
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Unicode 12.0.0 Support: Character encoding, character type info, and transliteration tables are all updated to Unicode 12.0.0, using the generator scripts contributed by Mike FABIAN (Red Hat). Some info about the number of characters added or changed: Total added characters in newly generated CHARMAP: 554 Total added characters in newly generated WIDTH: 106 alpha: Missing 8 characters of old ctype in new ctype (These are combining marks, apparently they were removed from alpha on purpose) alpha: Added 295 characters in new ctype which were not in old ctype combining: Missing 2 characters of old ctype in new ctype (U+1CF2 VEDIC SIGN ARDHAVISARGA and U+1CF3 VEDIC SIGN ROTATED ARDHAVISARGA, these are now "Alphabetic" in Unicode 12.0.0) combining: Added 37 characters in new ctype which were not in old ctype combining_level3: Missing 2 characters of old ctype in new ctype (U+1CF2 VEDIC SIGN ARDHAVISARGA and U+1CF3 VEDIC SIGN ROTATED ARDHAVISARGA, these are now "Alphabetic" in Unicode 12.0.0) combining_level3: Added 26 characters in new ctype which were not in old ctype graph: Added 554 characters in new ctype which were not in old ctype lower: Added 6 characters in new ctype which were not in old ctype print: Added 554 characters in new ctype which were not in old ctype punct: Missing 29 characters of old ctype in new ctype (These characters have all become "Alphabetic" in Unicode 12.0.0. 
Therefore, they are not in "punct" anymore (see: is_punct() in unicode_utils.py)) punct: Added 296 characters in new ctype which were not in old ctype tolower: Added 7 characters in new ctype which were not in old ctype totitle: Added 7 characters in new ctype which were not in old ctype toupper: Added 7 characters in new ctype which were not in old ctype upper: Added 7 characters in new ctype which were not in old ctype [BZ #24307] * localedata/unicode-gen/Makefile (UNICODE_VERSION): Set to 12.0.0. * localedata/unicode-gen/DerivedCoreProperties.txt: Update to Unicode 12.0.0. * localedata/unicode-gen/EastAsianWidth.txt: Likewise. * localedata/unicode-gen/PropList.txt: Likewise. * localedata/unicode-gen/UnicodeData.txt: Likewise. * localedata/unicode-gen/ctype_compatibility_test_cases.py: U+108D became "Alphabetic" in Unicode 12.0.0. Adapt test case. * localedata/charmaps/UTF-8: Regenerate. * localedata/locales/i18n_ctype: Likewise. * localedata/locales/tr_TR: Likewise. * localedata/locales/translit_circle: Likewise. * localedata/locales/translit_cjk_compat: Likewise. * localedata/locales/translit_combining: Likewise. * localedata/locales/translit_compat: Likewise. * localedata/locales/translit_font: Likewise. * localedata/locales/translit_fraction: Likewise.
* ja_JP: Change the offset for Taisho gan-nen from 2 to 1 [BZ #24162] | TAMUKI Shoichi | 2019-03-02 | 1 | -1/+1
    The offset in era-string format for Taisho gan-nen (1912) is currently
    defined as 2, but it should be 1. So fix it. "Gan-nen" means the 1st
    (origin) year; Taisho started on July 30, 1912.

    Reported-by: Morimitsu, Junji <junji.morimitsu@hpe.com>
    Reviewed-by: Rafal Luzynski <digitalfreak@lingonborough.com>

    ChangeLog:

    [BZ #24162]
    * localedata/locales/ja_JP (LC_TIME): Change the offset for Taisho
      gan-nen from 2 to 1. Problem reported by Morimitsu, Junji.
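To illustrate why the offset matters, here is a minimal sketch of how an era year could be derived from an entry's offset and start date. The table layout, the `era_year` helper, and the end dates are assumptions made for illustration; this is neither the glibc implementation nor the actual `ja_JP` era syntax.

```python
from datetime import date

# Hypothetical, simplified era table: (offset, start, end, name).
# The offset is the era year assigned to the entry's start date.
# End dates are assumed for this sketch.
ERAS = [
    (2, date(1913, 1, 1), date(1926, 12, 24), "Taisho"),
    (1, date(1912, 7, 30), date(1912, 12, 31), "Taisho"),  # gan-nen entry
]


def era_year(d):
    """Return (era name, year within the era) for date d."""
    for offset, start, end, name in ERAS:
        if start <= d <= end:
            # Years elapsed since the entry's start year, plus the offset.
            return name, offset + (d.year - start.year)
    raise ValueError(f"no era entry covers {d}")


print(era_year(date(1912, 8, 1)))   # a date in the gan-nen entry
print(era_year(date(1913, 1, 1)))   # first date of the second entry
```

Under these assumptions, a gan-nen offset of 2 would label every date from July 30 to December 31, 1912 as year 2 of the era rather than year 1.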
* en_US: define date_fmt (bug 24046) | Aurelien Jarno | 2019-01-07 | 1 | -0/+3
    The en_US locale uses a 12h am/pm format in both d_fmt and d_t_fmt,
    which is correct, but does not define date_fmt. This causes the default
    value to be used, which is in 24h format.

    This patch adds the date_fmt entry to the en_US locale with the same
    value as d_t_fmt, as the latter already includes the timezone.

    Changelog

    [BZ #24046]
    * localedata/locales/en_US (date_fmt): Add, set to "%a %d %b %Y %r %Z".
* bs_BA: Fix a small typo in comment (bug 24011). | PanderMusubi | 2019-01-02 | 1 | -1/+1
    [BZ #24011]
    * localedata/locales/bs_BA (LC_TELEPHONE): Fix a typo in comment.
* Multiple locales: Use the correct 12-hour time formats (bug 10496). | Rafal Luzynski | 2018-12-28 | 79 | -112/+113
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It has been discovered that some locales use the 12-hour time formats but do not use any AM/PM indicator thus making the time ambiguous. This commit adds "%p" wherever it was missing. In some cases it has been identified that a locale should use 24-hour time format rather than 12-hour. All time formats come from CLDR but this commit introduces as few changes as possible (for example, it tries not to change the time zone display). For the locales which are not supported by CLDR the consistency with similar locales (which means the same language or the same country) has been preserved: if the time formats were the same before the change then they are still the same after the change. The time format updates can be roughly summarized as follows: * Most of the locales of Djibouti, Eritrea, and Ethiopia now use "%l:%M:%S %p". * Most of the locales of India and some surrounding countries (Bangladesh, Nepal etc.) now use "%I:%M:%S %p %Z". * Most of the Arabic locales now use "%Z %I:%M:%S %p". * Ge'ez language (Eritrea and Ethiopia) now uses "%l:%M:%S፡%p" (note the consistent use of Ethiopic wordspace character). * Tamil (India) now uses "%p %I:%M:%S %Z". * Chinese (Hong Kong) t_fmt now uses "%p %I<U6642>%M<U5206>%S<U79D2> %Z". * Additionally, the following locales have been switched from 12-hour time formats to 24-hour, according to CLDR: Arabic (Morocco), Maltese, Somali (Kenya), and Tamil (Sri Lanka). * Finally, the Bulgarian, Czech, and Slovak locales used 24-hour time format correctly but their t_fmt_ampm field was not empty containing 12-hour time format which was incorrect so it is now replaced with an empty string. 
[BZ #10496] * localedata/locales/aa_DJ (t_fmt): Set to "%l:%M:%S %p". (t_fmt_ampm): Likewise. * localedata/locales/aa_ER (t_fmt): Likewise. (t_fmt_ampm): Likewise. * localedata/locales/aa_ER@saaho (t_fmt): Likewise. (t_fmt_ampm): Likewise. * localedata/locales/aa_ET (t_fmt): Likewise. (t_fmt_ampm): Likewise. * localedata/locales/am_ET (t_fmt): Likewise. (t_fmt_ampm): Likewise. * localedata/locales/byn_ER (t_fmt): Likewise. (t_fmt_ampm): Likewise. * localedata/locales/om_ET (t_fmt): Likewise. (t_fmt_ampm): Likewise. * localedata/locales/sid_ET (t_fmt): Likewise. (t_fmt_ampm): Likewise. * localedata/locales/so_DJ (t_fmt): Likewise. (t_fmt_ampm): Likewise. * localedata/locales/so_ET (t_fmt): Likewise. (t_fmt_ampm): Likewise. * localedata/locales/so_SO (t_fmt): Likewise. (t_fmt_ampm): Likewise. * localedata/locales/ti_ER (t_fmt): Likewise. (t_fmt_ampm): Likewise. * localedata/locales/ti_ET (t_fmt): Likewise. (t_fmt_ampm): Likewise. * localedata/locales/tig_ER (t_fmt): Likewise. (t_fmt_ampm): Likewise. * localedata/locales/wal_ET (t_fmt): Likewise. (t_fmt_ampm): Likewise. * localedata/locales/anp_IN (t_fmt): Set to "%I:%M:%S %p %Z". * localedata/locales/ar_IN (t_fmt): Likewise. * localedata/locales/bhb_IN (t_fmt): Likewise. * localedata/locales/bho_IN (t_fmt): Likewise. * localedata/locales/bi_VU (t_fmt): Likewise. * localedata/locales/bn_BD (t_fmt): Likewise. * localedata/locales/bn_IN (t_fmt): Likewise. * localedata/locales/brx_IN (t_fmt): Likewise. * localedata/locales/doi_IN (t_fmt): Likewise. * localedata/locales/en_HK (t_fmt): Likewise. (t_fmt_ampm): Likewise. * localedata/locales/en_IN (t_fmt): Likewise. * localedata/locales/en_PH (t_fmt): Likewise. * localedata/locales/gu_IN (t_fmt): Likewise. * localedata/locales/hi_IN (t_fmt): Likewise. * localedata/locales/hif_FJ (t_fmt): Likewise. * localedata/locales/hne_IN (t_fmt): Likewise. * localedata/locales/kn_IN (t_fmt): Likewise. * localedata/locales/kok_IN (t_fmt): Likewise. 
* localedata/locales/ks_IN (t_fmt): Likewise. * localedata/locales/ks_IN@devanagari (t_fmt): Likewise. * localedata/locales/mag_IN (t_fmt): Likewise. * localedata/locales/mai_IN (t_fmt): Likewise. * localedata/locales/mjw_IN (t_fmt): Likewise. * localedata/locales/ml_IN (t_fmt): Likewise. * localedata/locales/mni_IN (t_fmt): Likewise. * localedata/locales/mr_IN (t_fmt): Likewise. * localedata/locales/ms_MY (t_fmt): Likewise. * localedata/locales/pa_IN (t_fmt): Likewise. * localedata/locales/raj_IN (t_fmt): Likewise. * localedata/locales/sa_IN (t_fmt): Likewise. * localedata/locales/sat_IN (t_fmt): Likewise. * localedata/locales/sd_IN (t_fmt): Likewise. * localedata/locales/sd_IN@devanagari (t_fmt): Likewise. * localedata/locales/tcy_IN (t_fmt): Likewise. * localedata/locales/the_NP (t_fmt): Likewise. * localedata/locales/to_TO (t_fmt): Likewise. * localedata/locales/ur_IN (t_fmt): Likewise. * localedata/locales/hif_FJ (d_t_fmt): Set to "%A %d %b %Y %I:%M:%S %p". (date_fmt): Add, set to "%A %d %b %Y %I:%M:%S %p %Z". * localedata/locales/ar_AE (t_fmt): Set to "%Z %I:%M:%S %p". * localedata/locales/ar_BH (t_fmt): Likewise. * localedata/locales/ar_DZ (t_fmt): Likewise. * localedata/locales/ar_EG (t_fmt): Likewise. * localedata/locales/ar_IQ (t_fmt): Likewise. * localedata/locales/ar_JO (t_fmt): Likewise. * localedata/locales/ar_KW (t_fmt): Likewise. * localedata/locales/ar_LB (t_fmt): Likewise. * localedata/locales/ar_LY (t_fmt): Likewise. * localedata/locales/ar_OM (t_fmt): Likewise. * localedata/locales/ar_QA (t_fmt): Likewise. * localedata/locales/ar_SD (t_fmt): Likewise. * localedata/locales/ar_SS (t_fmt): Likewise. * localedata/locales/ar_SY (t_fmt): Likewise. * localedata/locales/ar_TN (t_fmt): Likewise. * localedata/locales/ar_YE (t_fmt): Likewise. * localedata/locales/gez_ER (t_fmt): Set to "%l:%M:%S<U1361>%p". (t_fmt_ampm): Likewise. * localedata/locales/gez_ET (t_fmt): Likewise. (t_fmt_ampm): Likewise. 
* localedata/locales/ta_IN (t_fmt): Set to "%p %I:%M:%S %Z". (t_fmt_ampm): Likewise. (d_t_fmt): Set to "%A %d %B %Y %p %I:%M:%S %Z". * localedata/locales/zh_HK (t_fmt): Set to "%p %I<U6642>%M<U5206>%S<U79D2> %Z". * localedata/locales/ar_MA (t_fmt_ampm): Set to "" (empty string) because this locale does not use the 12-hour clock. (t_fmt): Set to "%Z %H:%M:%S". (d_t_fmt): Set to "%d %b, %Y %Z %H:%M:%S". * localedata/locales/mt_MT (t_fmt_ampm): Set to "" (empty string) because this locale does not use the 12-hour clock. (t_fmt): Set to "%H:%M:%S %Z". (d_t_fmt): Set to "%A, %d ta %b, %Y %H:%M:%S %Z". * localedata/locales/so_KE (t_fmt_ampm): Set to "" (empty string) because this locale does not use the 12-hour clock. (t_fmt): Set to "%T". (d_t_fmt): Set to "%A, %B %e, %Y %X %Z". (date_fmt): Set to "%A, %B %e, %X %Z %Y". * localedata/locales/ta_LK (t_fmt_ampm): Set to "" (empty string) because this locale does not use the 12-hour clock. (t_fmt): Set to "%H:%M:%S %Z". (d_t_fmt): Set to "%A %d %B %Y %H:%M:%S %Z". * localedata/locales/bg_BG (t_fmt_ampm): Set to "" (empty string) because this locale does not use the 12-hour clock. * localedata/locales/cs_CZ (t_fmt_ampm): Likewise. * localedata/locales/sk_SK (t_fmt_ampm): Likewise.
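The ambiguity that this commit addresses can be reproduced with a short sketch. It assumes the process runs with the default "C" `LC_TIME` locale and that the platform's `strftime` supports `%I` and `%p`; the `fmt` helper and the sample date are illustrative only.

```python
import time


def fmt(hour, pattern):
    """Format 2018-12-28 at hour:30:00 using the given strftime pattern."""
    # struct_time fields: year, month, day, hour, minute, second,
    # weekday (Monday = 0), day of year, DST flag.
    t = (2018, 12, 28, hour, 30, 0, 4, 362, 0)
    return time.strftime(pattern, t)


# Without an AM/PM indicator, morning and afternoon render identically.
print(fmt(3, "%I:%M:%S"), fmt(15, "%I:%M:%S"))

# Adding %p makes the two times distinguishable again.
print(fmt(3, "%I:%M:%S %p"), fmt(15, "%I:%M:%S %p"))
```

Locales that genuinely use the 24-hour clock avoid the problem differently: they use `%H` and leave `t_fmt_ampm` empty, as the ChangeLog entries for ar_MA, mt_MT, so_KE, and ta_LK show.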
* sq_AL: Use the correct date and time formats (bug 10496, 23724). | Rafal Luzynski | 2018-12-28 | 1 | -4/+7
    Albanian locale uses the 12-hour clock but some time formats did not use
    any AM/PM indicator, making the time ambiguous. This commit adds "%p"
    wherever it was missing. It also sets the correct date format because
    the old "%Y-%b-%d" produced rather weird results like "2018-Sht-28".

    All time formats come from CLDR, but this commit introduces as few
    changes as possible. Some articles from MSDN and other available online
    sources have also been taken into account.

    [BZ #10496]
    [BZ #23724]
    * localedata/locales/sq_AL (t_fmt): Set to "%I:%M:%S.%p %Z".
      (t_fmt_ampm): Likewise.
      (d_t_fmt): Set to "%a %-d %b %Y %I:%M:%S.%p".
      (date_fmt): Add, set to "%a %-d %b %Y %I:%M:%S.%p %Z".
      (d_fmt): Set to "%-d.%-m.%y".
* localedata: Remove executable bit from localedata/locales/bi_VU [BZ #23995] | Florian Weimer | 2018-12-18 | 1 | -0/+0
* Currency symbol should not precede amount for [BZ #23791] | Sergi Almacellas Abellana | 2018-10-29 | 1 | -3/+3
    CLDR also has the currency symbol after the amount for Catalan.
    Also set grouping in LC_NUMERIC to 3;3.

    Reviewed-by: Mike FABIAN <mfabian@redhat.com>
* kl_GL: Update the month names and date formats (bug 23740). | Rafal Luzynski | 2018-10-08 | 1 | -14/+26
    Month names are as provided by Oqaasileriffik, the official Greenlandic
    language regulator, which has recently reached a consensus regarding the
    orthography of the month names. Date formats are updated to match the
    correct Greenlandic order, which is MDY.

    [BZ #23740]
    * localedata/locales/kl_GL (mon): Update, the relative case.
      (alt_mon): Add, fill with month names in the nominative case.
      (d_t_fmt): Set to "%a %b %d %Y %T %Z".
      (d_fmt): Set to "%b %d %Y".
* kl_GL: Fix spelling of Sunday, should be "sapaat" (bug 20209). | Rafal Luzynski | 2018-10-02 | 1 | -2/+2
    Although CLDR says otherwise, it is confirmed by Oqaasileriffik, the
    official Greenlandic language regulator, that this change is correct.

    [BZ #20209]
    * localedata/locales/kl_GL (abday): Fix spelling of Sun (Sunday),
      should be "sap" rather than "sab".
      (day): Fix spelling of Sunday, should be "sapaat" rather than "sabaat".
* it_CH/it_IT locales: Correct some LC_TIME formats (bug 10425). | Rafal Luzynski | 2018-09-21 | 2 | -5/+5
    Synchronize some values with CLDR and apply some suggestions from
    Bugzilla.

    [BZ #10425]
    * localedata/locales/it_IT (d_t_fmt): Use "%a %-d %b %Y, %T".
      (date_fmt): Use "%a %-d %b %Y, %T, %Z".
    * localedata/locales/it_CH (d_t_fmt): Use "%a %-d %b %Y, %T" which is
      the same as in it_IT.
      (d_fmt): Use "%d.%m.%Y" which is the same as in de_CH.
      (date_fmt): Use "%a %-d %b %Y, %T, %Z" which is the same as in it_IT.
* Italian and Swiss locales: Use the correct separators (bug 10797). | Rafal Luzynski | 2018-09-10 | 3 | -7/+5
    CLDR and many other sources say that it_IT (Italian) should use a dot
    (".") as a thousands separator and a comma (",") as a decimal separator.
    For it_CH and de_CH CLDR says that they should use the Right Single
    Quotation Mark ("’") as a thousands separator and a dot (".") as a
    decimal separator. Consequently, the same rules are copied to all other
    locales in Switzerland. These rules apply to both LC_MONETARY and
    LC_NUMERIC.

    [BZ #10797]
    * localedata/locales/de_CH (mon_thousands_sep): Use "<U2019>" (Right
      Single Quotation Mark).
      (thousands_sep): Likewise.
    * localedata/locales/it_CH (LC_NUMERIC): Use “copy "de_CH"”.
    * localedata/locales/it_IT (thousands_sep): Use ".".
      (grouping): Use "3;3".
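The effect of these separator settings can be sketched with a small formatter. The `group_digits` and `format_number` helpers are hypothetical, handle only non-negative values, and assume that the last group size repeats, which is how a "3;3" grouping is read here.

```python
def group_digits(digits, sep, sizes=(3, 3)):
    """Split a digit string into groups from the right; the last size repeats."""
    groups = []
    end = len(digits)
    index = 0
    while end > 0:
        size = sizes[min(index, len(sizes) - 1)]
        groups.append(digits[max(0, end - size):end])
        end -= size
        index += 1
    return sep.join(reversed(groups))


def format_number(value, thousands_sep, decimal_point, decimals=2):
    """Format a non-negative number with the given separators."""
    whole, frac = f"{value:.{decimals}f}".split(".")
    return group_digits(whole, thousands_sep) + decimal_point + frac


# it_IT style: dot for thousands, comma for decimals.
print(format_number(1234567.89, ".", ","))

# de_CH / it_CH style: U+2019 for thousands, dot for decimals.
print(format_number(1234567.89, "\u2019", "."))
```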
* Indian and similar locales: Set the correct date format (bug 17426). | Rafal Luzynski | 2018-09-05 | 33 | -33/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit also fixes d_fmt in bn_BD which is identical to bn_IN, in ne_NP which is identical to ne_IN (not supported by Glibc but supported by CLDR), and in ta_LK which is identical to ta_IN. For those locales which are supported by CLDR data is imported from CLDR v33. For others it is copied from those locales which were identical before this commit. [BZ #17426] * localedata/locales/anp_IN (d_fmt): Use "%-d//%-m//%y". * localedata/locales/ar_IN (d_fmt): Likewise. * localedata/locales/bhb_IN (d_fmt): Likewise. * localedata/locales/bho_IN (d_fmt): Likewise. * localedata/locales/bn_BD (d_fmt): Likewise. * localedata/locales/bn_IN (d_fmt): Likewise. * localedata/locales/doi_IN (d_fmt): Likewise. * localedata/locales/gu_IN (d_fmt): Likewise. * localedata/locales/hi_IN (d_fmt): Likewise. * localedata/locales/hne_IN (d_fmt): Likewise. * localedata/locales/kn_IN (d_fmt): Likewise. * localedata/locales/mag_IN (d_fmt): Likewise. * localedata/locales/mai_IN (d_fmt): Likewise. * localedata/locales/mjw_IN (d_fmt): Likewise. * localedata/locales/ml_IN (d_fmt): Likewise. * localedata/locales/mni_IN (d_fmt): Likewise. * localedata/locales/mr_IN (d_fmt): Likewise. * localedata/locales/pa_IN (d_fmt): Likewise. * localedata/locales/raj_IN (d_fmt): Likewise. * localedata/locales/sat_IN (d_fmt): Likewise. * localedata/locales/sd_IN (d_fmt): Likewise. * localedata/locales/sd_IN@devanagari (d_fmt): Likewise. * localedata/locales/ta_IN (d_fmt): Likewise. * localedata/locales/ta_LK (d_fmt): Likewise. * localedata/locales/tcy_IN (d_fmt): Likewise. * localedata/locales/ur_IN (d_fmt): Likewise. * localedata/locales/brx_IN (d_fmt): Use "%-m//%-d//%y". * localedata/locales/ks_IN (d_fmt): Likewise. * localedata/locales/ks_IN@devanagari (d_fmt): Likewise. * localedata/locales/kok_IN (d_fmt): Use "%-d-%-m-%y". * localedata/locales/ne_NP (d_fmt): Use "%y//%-m//%-d". 
* localedata/locales/sa_IN (d_fmt): Use "%-d-%m-%y". * localedata/locales/te_IN (d_fmt): Use "%d-%m-%y".
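The doubled slashes in these `d_fmt` values are presumably escaped literal slashes, on the assumption that the affected locale files declare `/` as their escape character, as many glibc locale sources do. The sketch below decodes such a value and renders it with a tiny hand-written formatter; `unescape` and `render` are hypothetical helpers that cover only the conversions appearing in this ChangeLog, not a substitute for `strftime`.

```python
from datetime import date


def unescape(fmt, escape_char="/"):
    """Drop each escape character and keep the character that follows it."""
    out = []
    i = 0
    while i < len(fmt):
        if fmt[i] == escape_char and i + 1 < len(fmt):
            out.append(fmt[i + 1])
            i += 2
        else:
            out.append(fmt[i])
            i += 1
    return "".join(out)


def render(fmt, d):
    """Expand the few conversions used above (%-d, %-m, %d, %m, %y)."""
    fields = {
        "%-d": str(d.day),            # day without zero padding
        "%-m": str(d.month),          # month without zero padding
        "%d": f"{d.day:02d}",
        "%m": f"{d.month:02d}",
        "%y": f"{d.year % 100:02d}",
    }
    for conversion, value in fields.items():
        fmt = fmt.replace(conversion, value)
    return fmt


sample = date(2018, 9, 5)
print(render(unescape("%-d//%-m//%y"), sample))   # hi_IN and similar
print(render(unescape("%y//%-m//%-d"), sample))   # ne_NP
print(render(unescape("%d-%m-%y"), sample))       # te_IN
```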
* en_IN: Set the correct date format for "%x" (bug 17426). | Rafal Luzynski | 2018-08-28 | 1 | -2/+2
    [BZ #17426]
    * localedata/locales/en_IN (d_fmt): Use "%d/%m/%y".
* Keep expected behaviour for [a-z] and [A-z] (Bug 23393). | Carlos O'Donell | 2018-07-25 | 1 | -964/+964
    In commit 9479b6d5e08eacce06c6ab60abc9b2f4eb8b71e4 we updated all of the
    collation data to harmonize with the new version of ISO 14651 which is
    derived from Unicode 9.0.0. This collation update brought with it some
    changes to locales which were not desired by some users; in particular,
    it altered the meaning of the locale-dependent-range regular expression,
    namely [a-z] and [A-Z], and for en_US it caused uppercase letters to be
    matched by [a-z] for the first time. The matching of uppercase letters
    by [a-z] is something which is already known to users of other locales
    which have this property, but this change could cause significant
    problems to en_US and other similar locales that had never had this
    behaviour before. Whether this behaviour is desirable or not is
    contentious and GNU Awk has this to say on the topic:
    https://www.gnu.org/software/gawk/manual/html_node/Ranges-and-Locales.html
    While the POSIX standard also has this further to say: "RE Bracket
    Expression":
    http://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xbd_chap09.html
    "The current standard leaves unspecified the behavior of a range
    expression outside the POSIX locale. ... As noted above, efforts were
    made to resolve the differences, but no solution has been found that
    would be specific enough to allow for portable software while not
    invalidating existing implementations."

    In glibc we implement the requirement of ISO POSIX-2:1993 and use
    collation element order (CEO) to construct the range expression; the API
    internally is __collseq_table_lookup(). The fact that we use CEO and
    also have 4-level weights on each collation rule means that we can in
    practice reorder the collation rules in iso14651_t1_common (the new
    data) to provide consistent range expression resolution *and* the
    weights should maintain the expected total order.

    Therefore this patch does three things:

    * Reorder the collation rules for the LATIN script in
      iso14651_t1_common to deinterlace uppercase and lowercase letters in
      the collation element orders.

    * Add new test data en_US.UTF-8.in for sort-test.sh which exercises
      strcoll* and strxfrm* and ensures the ISO 14651 collation remains.

    * Add back tests to tst-fnmatch.input and tst-regexloc.c which exercise
      that [a-z] does not match A or Z.

    The reordering of the ISO 14651 data is done in an entirely mechanical
    fashion using the following program attached to the bug:
    https://sourceware.org/bugzilla/show_bug.cgi?id=23393#c28

    It is up for discussion if the iso14651_t1_common data should be refined
    further to have 3 very tight collation element ranges that include only
    a-z, A-Z, and 0-9, which would implement the solution sought after in:
    https://sourceware.org/bugzilla/show_bug.cgi?id=23393#c12
    and implemented here:
    https://www.sourceware.org/ml/libc-alpha/2018-07/msg00854.html

    No regressions on x86_64.

    Verified that removal of the iso14651_t1_common change causes
    tst-fnmatch to regress with:

      422: fnmatch ("[a-z]", "A", 0) = 0 (FAIL, expected FNM_NOMATCH) ***
      ...
      425: fnmatch ("[A-Z]", "z", 0) = 0 (FAIL, expected FNM_NOMATCH) ***
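The effect of interleaved versus deinterlaced collation element order on range expressions can be shown with a toy model. The two orderings below are simplified assumptions covering ASCII letters only; they do not reproduce the real iso14651_t1_common data, and `in_range` is a hypothetical stand-in for the range test, not the glibc lookup.

```python
import string

LOWER = string.ascii_lowercase
UPPER = string.ascii_uppercase

# Toy interleaved order: a A b B ... z Z.
INTERLEAVED = [c for pair in zip(LOWER, UPPER) for c in pair]

# Toy deinterlaced order: all lowercase letters, then all uppercase letters.
DEINTERLACED = list(LOWER) + list(UPPER)


def in_range(ch, lo, hi, order):
    """Return True if ch falls between lo and hi in the given element order."""
    position = {c: i for i, c in enumerate(order)}
    return position[lo] <= position[ch] <= position[hi]


# Interleaved: [a-z] also covers A..Y, and [A-Z] also covers b..z.
print(in_range("A", "a", "z", INTERLEAVED), in_range("z", "A", "Z", INTERLEAVED))

# Deinterlaced: the ranges contain only the expected case.
print(in_range("A", "a", "z", DEINTERLACED), in_range("z", "A", "Z", DEINTERLACED))
```

In this model the interleaved order reproduces both failures quoted above: "A" falls inside [a-z] and "z" falls inside [A-Z].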
* oc_FR locale: Multiple updates (bug 23140, bug 23422). | Quentin PAGÈS | 2018-07-18 | 1 | -24/+40
    Multiple updates for Occitan language including alternative month names,
    update abday and abmon, fix typos in day, fix d_fmt, correct LC_NAME,
    and use “copy "ca_ES"” as LC_COLLATE.

    [BZ #23140]
    * localedata/locales/oc_FR (mon): Rename to...
      (alt_mon): This, then update October (typo fix).
      (mon): New content (genitive case, month names preceded by "de"
      or "d’").

    [BZ #23422]
    * localedata/locales/oc_FR (abday): Update all items.
      (day): Update Wednesday and Saturday (typo fixes).
      (abmon): Update all items, except May.
      (d_fmt): Update "%d.%m.%Y" -> "%d/%m/%Y".
      (LC_IDENTIFICATION): Bump the revision number and date. Keep the
      "category" entries in alphabetic order.
      (LC_ADDRESS): Remove no longer needed comment.
      (LC_COLLATE): Use “copy "ca_ES"”.
      (LC_NAME): Set the correct values of "name_fmt", "name_mr",
      and "name_mrs".

    Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* New locale: Yakut (Sakha) for Russia (sah_RU) [BZ #22241] | Valery Timiriliyev | 2018-07-18 | 1 | -0/+290
    * localedata/Makefile (test-input): Add sah_RU.UTF-8.
      (LOCALES): Likewise.
    * localedata/SUPPORTED (sah_RU/UTF-8): New entry.
    * localedata/locales/sah_RU: New file.
    * localedata/sah_RU.UTF-8.in: New file.

    Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* os_RU: Add alternative month names (bug 23140). | Rafal Luzynski | 2018-07-17 | 1 | -1/+14
    [BZ #23140]
    * localedata/locales/os_RU (mon): Rename to...
      (alt_mon): This.
      (mon): Import from CLDR (genitive case).

    Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* dsb_DE locale: Fix syntax error and add tests (bug 23208). | Rafal Luzynski | 2018-07-13 | 1 | -2/+2
    Fixed syntax error in the collation rules of Lower Sorbian language.
    Collation test added in order to catch bugs like this early.

    Reported-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>

    [BZ #23208]
    * localedata/Makefile (test-input): Add dsb_DE.UTF-8.
      (LOCALES): Likewise.
    * localedata/dsb_DE.UTF-8.in: New file.
    * localedata/locales/dsb_DE (LC_COLLATE): Fix syntax error.
* Put the correct Unicode version number 11.0.0 into the generated files | Mike FABIAN | 2018-07-10 | 1 | -3/+3
    In some places there was still the old Unicode version 10.0.0 in the
    files.

    * localedata/charmaps/UTF-8: Use correct Unicode version 11.0.0 in
      comment.
    * localedata/locales/i18n_ctype: Use correct Unicode version in comments
      and headers.
    * localedata/unicode-gen/utf8_gen.py: Add option to specify Unicode
      version.
    * localedata/unicode-gen/Makefile: Use option to specify Unicode version
      for utf8_gen.py.
* Bug 23308: Update to Unicode 11.0.0 | Mike FABIAN | 2018-07-04 | 8 | -2092/+2406
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Unicode 11.0.0 Support: Character encoding, character type info, and transliteration tables are all updated to Unicode 11.0.0, using the generator scripts contributed by Mike FABIAN (Red Hat). Some info about the number of characters added: Total added characters in newly generated CHARMAP: 684 Total added characters in newly generated WIDTH: 119 alpha: Added 380 characters in new ctype which were not in old ctype combining: Added 56 characters in new ctype which were not in old ctype combining_level3: Added 37 characters in new ctype which were not in old ctype graph: Added 684 characters in new ctype which were not in old ctype lower: Added 82 characters in new ctype which were not in old ctype print: Added 684 characters in new ctype which were not in old ctype punct: Added 304 characters in new ctype which were not in old ctype tolower: Added 79 characters in new ctype which were not in old ctype totitle: Added 33 characters in new ctype which were not in old ctype toupper: Added 79 characters in new ctype which were not in old ctype upper: Added 79 characters in new ctype which were not in old ctype No characters were removed. [BZ #23308] * unicode-gen/Makefile (UNICODE_VERSION): Set to 11.0.0. * localedata/unicode-gen/DerivedCoreProperties.txt: Update to Unicode 11.0.0. * localedata/unicode-gen/EastAsianWidth.txt: likewise. * localedata/unicode-gen/PropList.txt: likewise. * localedata/unicode-gen/UnicodeData.txt: likewise. * localedata/charmaps/UTF-8: Regenerate. * localedata/locales/i18n_ctype: likewise. * localedata/locales/tr_TR: likewise. * localedata/locales/translit_circle: likewise. * localedata/locales/translit_cjk_compat: likewise. * localedata/locales/translit_combining: likewise. * localedata/locales/translit_compat: likewise. * localedata/locales/translit_font: likewise. * localedata/locales/translit_fraction: likewise.
* New locale: Lower Sorbian (dsb_DE) [BZ #23208] | Michael Wolf | 2018-06-29 | 1 | -0/+251
    [BZ #23208]
    * localedata/SUPPORTED (dsb_DE/UTF-8): New entry.
    * localedata/locales/dsb_DE: New file.
* hy_AM: Add alternative month names (bug 23140). | Rafal Luzynski | 2018-06-29 | 1 | -12/+24
    This locale already contained correct data in the mon array. It is
    updated from CLDR so that the month names start with lowercase letters.
    alt_mon is a new import from CLDR. The change has been discussed
    off-list with a native speaker.

    [BZ #23140]
    * localedata/locales/hy_AM (mon): Synchronize with CLDR (lowercase,
      genitive case).
      (alt_mon): New entry, import from CLDR (nominative case).
* es_BO locale: Change LC_PAPER to en_US (bug 22996). | Sylvain Lesage | 2018-06-29 | 1 | -1/+1
    [BZ #22996]
    * localedata/locales/es_BO (LC_PAPER): Change to “copy "en_US"”.
* ast_ES: Add alternative month names (bug 23140). | Rafal Luzynski | 2018-06-29 | 1 | -1/+16
    [BZ #23140]
    * localedata/locales/ast_ES (mon): Rename to...
      (alt_mon): This.
      (mon): Import from CLDR (genitive case).
* csb_PL: Add alternative month names (bug 23140). | Rafal Luzynski | 2018-06-25 | 1 | -2/+20
    Kashubian language is not supported by CLDR. Data was copied from
    Wikipedia and documents released by RJK (official Kashubian Language
    Council), and was also checked with a native speaker. Note that this
    language also needs the ab_alt_mon feature due to the month May:
    nominative "môj", genitive "maja"; abbreviated nominative "môj",
    abbreviated genitive "maj".

    [BZ #23140]
    * localedata/locales/csb_PL (mon): Rename to...
      (alt_mon): This.
      (abmon): Rename to...
      (ab_alt_mon): This.
      (mon): Add with proper genitive forms, copy from Wikipedia.
      (abmon): Likewise.
* csb_PL: Update month translations + add yesstr/nostr (bug 19485). | Rafal Luzynski | 2018-06-25 | 1 | -2/+4
    Thank you Michal Ostrowski for the feedback.

    [BZ #19485]
    * localedata/locales/csb_PL (mon): Fix typos: "łżëkwiôt" -> "łżëkwiat"
      (April); "lëpinc" -> "lëpińc" (July).
      (yesstr): Add, value is "jo".
      (nostr): Add, value is "nié".
* gd_GB, hsb_DE, wa_BE: Add alternative month names (bug 23140). | Rafal Luzynski | 2018-06-12 | 3 | -10/+45
    As a followup of fixing bug 10871, these three languages now support two
    grammatical cases of the month names. This commit does not resolve the
    bug because there are more languages to be committed.

    [BZ #23140]
    * localedata/locales/gd_GB (mon): Rename to...
      (alt_mon): This.
      (mon): Import from CLDR (genitive case).
    * localedata/locales/hsb_DE (mon): Rename to...
      (alt_mon): This.
      (mon): Import from CLDR (genitive case).
    * localedata/locales/wa_BE (mon): Rename to...
      (alt_mon): This.
      (mon): Add, fill with the proper genitive forms, but CLDR data is
      incomplete; completed according to the comments in this file.
      (d_t_fmt): Do not use "di" before the month name, no longer needed.
    * localedata/locales/wa_BE (country_name): Reword "Beljike" ->
      "Beldjike".
* gd_GB: Fix typo in abbreviated "May" (bug 23152). | Rafal Luzynski | 2018-05-11 | 1 | -2/+2
    [BZ #23152]
    * localedata/locales/gd_GB (abmon): Fix typo in May: "Mhàrt" -> "Cèit".
      Adjust the comment according to the change.

    Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* hr_HR locale: fix thousands_sep and mon_thousands_sep | Dragan Stanojevic - Nevidljivi | 2018-04-23 | 1 | -2/+2
    [BZ #23094]
    * localedata/locales/hr_HR: fix thousands_sep and mon_thousands_sep
* cs_CZ locale: Add alternative month names (bug 22963). | Rafal Luzynski | 2018-03-15 | 1 | -1/+14
    Add alternative month names, primary month names are genitive now.

    [BZ #22963]
    * localedata/locales/cs_CZ (mon): Rename to...
      (alt_mon): This.
      (mon): Import from CLDR (genitive case).
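The split between `mon` and `alt_mon` can be pictured with a small sketch: the genitive form is used inside a full date, while the nominative form stands alone, for example in a calendar heading. The two tables below are a three-month excerpt typed in for illustration, and `full_date` and `month_heading` are hypothetical helpers. In glibc itself the corresponding `strftime` conversions are `%B` (from `mon`) and `%OB` (from `alt_mon`).

```python
# Illustrative excerpt of Czech month names; only three months are listed.
MON = {1: "ledna", 3: "března", 5: "května"}      # genitive, used in full dates
ALT_MON = {1: "leden", 3: "březen", 5: "květen"}  # nominative, used standalone


def full_date(day, month, year):
    """Render a complete date, which calls for the genitive month name."""
    return f"{day}. {MON[month]} {year}"


def month_heading(month, year):
    """Render a standalone month, which calls for the nominative form."""
    return f"{ALT_MON[month]} {year}"


print(full_date(15, 3, 2018))
print(month_heading(3, 2018))
```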
* Greek (el_CY, el_GR) locales: Introduce ab_alt_mon (bug 22937). | Rafal Luzynski | 2018-03-15 | 2 | -2/+14
    As spotted by the GNOME translation team, Greek has a visible difference
    between the abbreviated nominative and the abbreviated genitive case for
    some month names. Examples:

    May: abbreviated nominative: "Μάι" -> abbreviated genitive: "Μαΐ"
    July: abbreviated nominative: "Ιούλ" -> abbreviated genitive: "Ιουλ"

    and more month names with similar differences.

    Original discussion: https://bugzilla.gnome.org/show_bug.cgi?id=793645#c21

    [BZ #22937]
    * localedata/locales/el_CY (abmon): Rename to...
      (ab_alt_mon): This.
      (abmon): Import from CLDR (abbreviated genitive case).
    * localedata/locales/el_GR (abmon): Rename to...
      (ab_alt_mon): This.
      (abmon): Import from CLDR (abbreviated genitive case).
* lt_LT locale: Update abbreviated month names (bug 22932).Rafal Luzynski2018-03-151-6/+6
| | | | | | | | | A GNOME translator asked to use the same abbreviated month names as provided by CLDR. This sounds reasonable. See the discussion: https://bugzilla.gnome.org/show_bug.cgi?id=793645#c27 [BZ #22932] * localedata/locales/lt_LT (abmon): Synchronize with CLDR.
* ca_ES locale: Update LC_TIME (bug 22848).Robert Buj2018-03-151-40/+71
| | | | | | | | | | | | | | | | | | | | | | | Add/fix alternative month names, long & short formats, am_pm, abday settings, and improve indentation for Catalan. [BZ #22848] * localedata/locales/ca_ES (abmon): Rename to... (ab_alt_mon): This, then synchronize with CLDR (nominative case). (mon): Rename to... (alt_mon): This. (abmon): Import from CLDR (genitive case, month names preceded by "de" or "d’"). (mon): Likewise. (abday): Synchronize with CLDR. (d_t_fmt): Likewise. (d_fmt): Likewise. (am_pm): Likewise. (LC_TIME): Improve indentation. (LC_TELEPHONE): Likewise. (LC_NAME): Likewise. (LC_ADDRESS): Likewise.
* an_ES locale: update some locale data [BZ #22896]Mike FABIAN2018-03-011-35/+22
| | | | | | | [BZ #22896] * localedata/locales/an_ES: update month and day names, improve d_fmt, improve postal_fmt, add country_post, add country_isbn
* bg_BG locale: Fix a typo in a commentMike FABIAN2018-03-011-1/+1
| | | | | * localedata/locales/bg_BG (LC_COLLATE): The comment mentioned Ukrainian instead of Bulgarian.
* Adapt collation in several locales to the new iso14651_t1_common fileMike FABIAN2018-02-2780-5479/+6449
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [BZ #22550] - es_ES locale (and other es_* locales): collation should treat ñ as a primary different character, sync the collation for Spanish with CLDR [BZ #21547] - Tibetan script collation broken (Dzongkha and Tibetan) * localedata/Makefile: Add new test files. * localedata/lv_LV.UTF-8.in: Adapt test file to new collation order. * localedata/sv_SE.ISO-8859-1.in: Adapt test file to new collation order. * localedata/uk_UA.UTF-8.in: Adapt test file to new collation order. * localedata/am_ET.UTF-8.in: New test file. * localedata/az_AZ.UTF-8.in: Likewise. * localedata/be_BY.UTF-8.in: Likewise. * localedata/ber_DZ.UTF-8.in: Likewise. * localedata/ber_MA.UTF-8.in: Likewise. * localedata/bg_BG.UTF-8.in: Likewise. * localedata/br_FR.UTF-8.in: Likewise. * localedata/cmn_TW.UTF-8.in: Likewise. * localedata/crh_UA.UTF-8.in: Likewise. * localedata/csb_PL.UTF-8.in: Likewise. * localedata/cv_RU.UTF-8.in: Likewise. * localedata/cy_GB.UTF-8.in: Likewise. * localedata/dz_BT.UTF-8.in: Likewise. * localedata/eo.UTF-8.in: Likewise. * localedata/es_ES.UTF-8.in: Likewise. * localedata/fa_IR.UTF-8.in: Likewise. * localedata/fi_FI.UTF-8.in: Likewise. * localedata/fil_PH.UTF-8.in: Likewise. * localedata/fur_IT.UTF-8.in: Likewise. * localedata/gez_ER.UTF-8@abegede.in: Likewise. * localedata/ha_NG.UTF-8.in: Likewise. * localedata/ig_NG.UTF-8.in: Likewise. * localedata/ik_CA.UTF-8.in: Likewise. * localedata/kk_KZ.UTF-8.in: Likewise. * localedata/ku_TR.UTF-8.in: Likewise. * localedata/ky_KG.UTF-8.in: Likewise. * localedata/ln_CD.UTF-8.in: Likewise. * localedata/mi_NZ.UTF-8.in: Likewise. * localedata/ml_IN.UTF-8.in: Likewise. * localedata/mn_MN.UTF-8.in: Likewise. * localedata/mr_IN.UTF-8.in: Likewise. 
* localedata/mt_MT.UTF-8.in: Likewise. * localedata/nb_NO.UTF-8.in: Likewise. * localedata/om_KE.UTF-8.in: Likewise. * localedata/os_RU.UTF-8.in: Likewise. * localedata/ps_AF.UTF-8.in: Likewise. * localedata/ro_RO.UTF-8.in: Likewise. * localedata/ru_RU.UTF-8.in: Likewise. * localedata/sc_IT.UTF-8.in: Likewise. * localedata/se_NO.UTF-8.in: Likewise. * localedata/sq_AL.UTF-8.in: Likewise. * localedata/sv_SE.UTF-8.in: Likewise. * localedata/szl_PL.UTF-8.in: Likewise. * localedata/tg_TJ.UTF-8.in: Likewise. * localedata/tk_TM.UTF-8.in: Likewise. * localedata/tt_RU.UTF-8.in: Likewise. * localedata/tt_RU.UTF-8@iqtelif.in: Likewise. * localedata/ug_CN.UTF-8.in: Likewise. * localedata/uz_UZ.UTF-8.in: Likewise. * localedata/vi_VN.UTF-8.in: Likewise. * localedata/yi_US.UTF-8.in: Likewise. * localedata/yo_NG.UTF-8.in: Likewise. * localedata/zh_CN.UTF-8.in: Likewise. * localedata/locales/am_ET: Adapt collation rules to new iso14651_t1_common file and fix bugs in the collation. * localedata/locales/az_AZ: Likewise. * localedata/locales/be_BY: Likewise. * localedata/locales/ber_DZ: Likewise. * localedata/locales/ber_MA: Likewise. * localedata/locales/bg_BG: Likewise. * localedata/locales/br_FR: Likewise. * localedata/locales/br_FR@euro: Likewise. * localedata/locales/ca_ES: Likewise. * localedata/locales/cns11643_stroke: Likewise. * localedata/locales/crh_UA: Likewise. * localedata/locales/cs_CZ: Likewise. * localedata/locales/csb_PL: Likewise. * localedata/locales/cv_RU: Likewise. * localedata/locales/cy_GB: Likewise. * localedata/locales/da_DK: Likewise. * localedata/locales/dz_BT: Likewise. * localedata/locales/en_CA: Likewise. * localedata/locales/eo: Likewise. * localedata/locales/es_CU: Likewise. * localedata/locales/es_EC: Likewise. * localedata/locales/es_ES: Likewise. * localedata/locales/es_US: Likewise. * localedata/locales/et_EE: Likewise. * localedata/locales/fa_IR: Likewise. * localedata/locales/fi_FI: Likewise. * localedata/locales/fil_PH: Likewise. 
* localedata/locales/fur_IT: Likewise. * localedata/locales/gez_ER@abegede: Likewise. * localedata/locales/ha_NG: Likewise. * localedata/locales/hr_HR: Likewise. * localedata/locales/hsb_DE: Likewise. * localedata/locales/hu_HU: Likewise. * localedata/locales/ig_NG: Likewise. * localedata/locales/ik_CA: Likewise. * localedata/locales/is_IS: Likewise. * localedata/locales/iso14651_t1_pinyin: Likewise. * localedata/locales/kk_KZ: Likewise. * localedata/locales/ku_TR: Likewise. * localedata/locales/ky_KG: Likewise. * localedata/locales/ln_CD: Likewise. * localedata/locales/lt_LT: Likewise. * localedata/locales/lv_LV: Likewise. * localedata/locales/mi_NZ: Likewise. * localedata/locales/ml_IN: Likewise. * localedata/locales/mn_MN: Likewise. * localedata/locales/mr_IN: Likewise. * localedata/locales/mt_MT: Likewise. * localedata/locales/nb_NO: Likewise. * localedata/locales/om_KE: Likewise. * localedata/locales/os_RU: Likewise. * localedata/locales/pl_PL: Likewise. * localedata/locales/ps_AF: Likewise. * localedata/locales/ro_RO: Likewise. * localedata/locales/ru_RU: Likewise. * localedata/locales/ru_UA: Likewise. * localedata/locales/sc_IT: Likewise. * localedata/locales/se_NO: Likewise. * localedata/locales/si_LK: Likewise. * localedata/locales/sq_AL: Likewise. * localedata/locales/sv_FI: Likewise. * localedata/locales/sv_FI@euro: Likewise. * localedata/locales/sv_SE: Likewise. * localedata/locales/szl_PL: Likewise. * localedata/locales/tg_TJ: Likewise. * localedata/locales/ti_ER: Likewise. * localedata/locales/tk_TM: Likewise. * localedata/locales/tl_PH: Likewise. * localedata/locales/tr_TR: Likewise. * localedata/locales/tt_RU: Likewise. * localedata/locales/tt_RU@iqtelif: Likewise. * localedata/locales/ug_CN: Likewise. * localedata/locales/uk_UA: Likewise. * localedata/locales/uz_UZ: Likewise. * localedata/locales/uz_UZ@cyrillic: Likewise. * localedata/locales/vi_VN: Likewise. * localedata/locales/yi_US: Likewise. * localedata/locales/yo_NG: Likewise.
* Add sections for various scripts to the iso14651_t1_common fileMike FABIAN2018-02-271-9/+68
| | | | | * localedata/locales/iso14651_t1_common: Add sections for various scripts to the iso14651_t1_common file.
* iso14651_t1_common: make the fourth level the codepoint for characters which are ignorable on all 4 levelsMike FABIAN2018-02-271-457/+457
| | | | | | | | | | | | | | | | | | | | | | Entries for characters which have “IGNORE” on all 4 levels like: <U0001> IGNORE;IGNORE;IGNORE;IGNORE % START OF HEADING (in ISO 6429) are changed into: <U0001> IGNORE;IGNORE;IGNORE;<U0001> % START OF HEADING (in ISO 6429) i.e. putting the code point of the character into the fourth level instead of “IGNORE”. Without that change, all such characters would compare equal which would make a wcscoll test case fail. It is better to have a clearly defined sort order even for characters like this so it is good to use the code point as a tie-break. * localedata/locales/iso14651_t1_common: Use the code point of a character in the fourth collation level instead of IGNORE for all entries which have IGNORE on all 4 levels.
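The rewrite described above can be sketched in a few lines. This is only an illustration of the transformation, not the script used for the commit (the source does not say how the change was produced); the helper name `add_tie_break` and the regular expression are assumptions.

```python
import re

# Matches an entry whose four levels are all IGNORE, for example:
#   <U0001> IGNORE;IGNORE;IGNORE;IGNORE % START OF HEADING (in ISO 6429)
ALL_IGNORE = re.compile(r"^(<U[0-9A-F]+>)(\s+)IGNORE;IGNORE;IGNORE;IGNORE\b")


def add_tie_break(line: str) -> str:
    """Put the entry's own code point into the fourth collation level."""
    return ALL_IGNORE.sub(r"\1\2IGNORE;IGNORE;IGNORE;\1", line)
```

Lines that carry a weight on any level do not match the pattern and are returned unchanged, so only fully ignorable characters gain the code point tie-break.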
* Add convenience symbols like <AFTER-A>, <BEFORE-A> to iso14651_t1_commonMike FABIAN2018-02-271-0/+120
| | | | | | * localedata/locales/iso14651_t1_common: Add some convenient collation symbols like <AFTER-A>, <BEFORE-A> to make tailoring easier using rules similar to those in CLDR.
* Fixing syntax errors after updating the iso14651_t1_common fileMike FABIAN2018-02-271-4/+32811
| | | | | | * localedata/locales/iso14651_t1_common: The new version of this file downloaded from ISO contained several syntax errors which are fixed by this patch.
* iso14651_t1_common: <U\([0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F]\)> → <U000\1>Mike FABIAN2018-02-271-13294/+13294
| | | | | * localedata/locales/iso14651_t1_common: Replace all <U.....> with <U000.....> because glibc understands only 4-digit or 8-digit code point symbols.
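The substitution in the commit subject is written in sed-style syntax; an equivalent sketch in Python follows. The function name `pad_code_points` is hypothetical, and the sketch assumes uppercase hexadecimal digits, as in the pattern from the subject line.

```python
import re

# A <U.....> symbol with exactly five hexadecimal digits.
FIVE_DIGIT = re.compile(r"<U([0-9A-F]{5})>")


def pad_code_points(text: str) -> str:
    """Pad five-digit <U.....> symbols to the eight-digit <U000.....> form."""
    return FIVE_DIGIT.sub(r"<U000\1>", text)
```

Symbols that already have four or eight digits do not match the pattern, so running the function twice gives the same result as running it once.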
* Necessary changes after updating the iso14651_t1_common fileMike FABIAN2018-02-271-8/+16
| | | | | * localedata/locales/iso14651_t1_common: Necessary changes to make the file downloaded from ISO usable by glibc.
* Update iso14651_t1_common file to ISO14651_2016_TABLE1_en.txt [BZ #14095]Mike FABIAN2018-02-271-9494/+52571
| | | | | | | | | | | | | | | | | | | | | | [BZ #14095] - Review / update collation data from Unicode / ISO 14651 File downloaded from: http://standards.iso.org/iso-iec/14651/ed-4/ISO14651_2016_TABLE1_en.txt Updating this file alone is not enough, there are problems in the new file which need to be fixed and the collation rules for many locales need to be adapted. This is done by the following patches. This update also fixes the problem that many characters are treated as identical when sorting because they were not yet in the old iso14651_t1_common file, see: https://bugzilla.redhat.com/show_bug.cgi?id=1336308 - Infinite (∞) and empty set (∅) are treated as if they were the same character by sort and uniq [BZ #14095] * localedata/locales/iso14651_t1_common: Update file to latest version from ISO (ISO14651_2016_TABLE1_en.txt).
* Use / instead of - in d_fmt for pt_BR and pt_PT [BZ #17438]Mike FABIAN2018-02-232-2/+2
| | | | | | | [BZ #17438] * localedata/locales/pt_BR (LC_TIME): use / instead of - in d_fmt. * localedata/locales/pt_PT (LC_TIME): likewise
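The visible effect of the separator change can be sketched with Python's `strftime`. The commit message only states that `/` replaces `-` in `d_fmt`; the day-month-year field order in the two format strings below is an assumption made for illustration, as are the constant and function names.

```python
from datetime import date

OLD_D_FMT = "%d-%m-%Y"  # assumed shape of d_fmt before the change
NEW_D_FMT = "%d/%m/%Y"  # assumed shape of d_fmt after the change


def format_date(day: date, fmt: str) -> str:
    """Format a date with an explicit d_fmt-style pattern."""
    return day.strftime(fmt)
```

For example, `format_date(date(2018, 2, 23), NEW_D_FMT)` gives a slash-separated date, whereas the old pattern gives the same fields separated by hyphens.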
* Use “copy "es_BO"” in LC_TIME of es_CU, es_CL, and es_ECMike FABIAN2018-02-233-108/+3
| | | | | | | | | | LC_TIME in these 4 locales is identical, using “copy "es_BO"” makes that more obvious. [BZ #22646] * localedata/locales/es_CL (LC_TIME): copy "es_BO". * localedata/locales/es_CU (LC_TIME): copy "es_BO". * localedata/locales/es_EC (LC_TIME): copy "es_BO".