path: root/localedata
Commit message | Author | Age | Files | Lines
* Reinstate gconv-modules as the default configuration file | Siddhesh Poyarekar | 2021-06-14 | 1 | -2/+2
| Reinstate gconv-modules as the main file so that the configuration files in gconv-modules.d/ become add-on configuration. With this, the effective user-visible change is that GCONV_PATH can now have supplementary configuration in GCONV_PATH/gconv-modules.d/ in addition to the main GCONV_PATH/gconv-modules file.
* iconvdata: Move gconv-modules configuration to gconv-modules.conf | Siddhesh Poyarekar | 2021-06-09 | 1 | -2/+2
| Move all gconv-modules configuration files to gconv-modules.conf. That is, the S390 extensions now become gconv-modules-s390.conf. Move both configuration files into gconv-modules.d.
|
| Now GCONV_PATH/gconv-modules is read only for backward compatibility for third-party gconv modules directories.
|
| Reviewed-by: DJ Delorie <dj@redhat.com>
* localedata: Use U+00AF MACRON in more EBCDIC charsets [BZ #27882] | Florian Weimer | 2021-05-18 | 7 | -7/+7
| This updates IBM256, IBM277, IBM278, IBM280, IBM284, IBM297, IBM424 in the same way that IBM273 was updated for bug 23290.
|
| IBM256 and IBM424 still have holes after this change, so HAS_HOLES is not updated.
|
| Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* Update sv_SE to treat 'W' as a distinct character (Bug 25036) | Sebastian Rasmussen | 2021-04-06 | 3 | -21/+13
| The 13th edition of Svenska Akademiens ordlista lists 'W' as a distinct letter that sorts after 'V'. We adjust the sv_SE locale (and tests) to match this updated and "reformed" language change. This harmonizes us with CLDR 1.5.0 (2007) for sv_SE sorting of the letter 'W'.
|
| No regressions on x86_64, and locale sorting tests all pass.
|
| Co-authored-by: Carlos O'Donell <carlos@redhat.com>
* POSIX locale: Fix typo in comment | Marc Aurèle La France | 2021-01-09 | 1 | -1/+1
* Update copyright dates with scripts/update-copyrights | Paul Eggert | 2021-01-02 | 48 | -48/+48
| I used these shell commands:
|
|   ../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
|   (cd ../glibc && git commit -am"[this commit message]")
|
| and then ignored the output, which consisted of lines saying "FOO: warning: copyright statement not found" for each of 6694 files FOO.
|
| I then removed trailing white space from benchtests/bench-pthread-locks.c and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this diagnostic from Savannah:
|
|   remote: *** pre-commit check failed ...
|   remote: *** error: lines with trailing whitespace found
|   remote: error: hook declined to update refs/heads/master
* Revert "Fix missing redirects in testsuite targets" | Andreas Schwab | 2020-10-08 | 1 | -2/+2
| This reverts commit d5afb38503. The log files are actually created by the various shell scripts that drive the tests.
* en_US: Minimize changes to date_fmt (Bug 25923) | Carlos O'Donell | 2020-07-16 | 1 | -2/+5
| In 2000, when date_fmt was originally added as an extension, the en_US locale did not have a date_fmt specifier and so used the default, which resulted in the abbreviated month name coming before the day of the month (as expected in the US and other locales).
|
| In commit 7395f3a0efad9fc51bb54fa383ef6524702e0c49 the date_fmt was added to en_US with a 12H time to better align with US user expectations. Unfortunately the abbreviated month name and day were inverted during that transition, and that was seen as a regression and reported against Fedora 32:
| https://bugzilla.redhat.com/show_bug.cgi?id=1830623
|
| The progression of date_fmt looks like this:
|
|   "%a %b %e %H:%M:%S %Z %Y"     <- Originally (2000)
|   "%a %d %b %Y %I:%M:%S %p %Z"  <- glibc 2.29 (2019)
|   "%a %b %e %r %Z %Y"           <- glibc 2.32 (2020) [this commit]
|
| Note: "%r" is "%I:%M:%S %p" in en_US and so shorter to write.
|
| Likewise the year is in the wrong place in commit 7395f3a0efad9fc51bb54fa383ef6524702e0c49 and this is corrected in this patch.
|
| For reference d_t_fmt:
|
|   "%a %d %b %Y %r %Z"  <- d_t_fmt (1997)
|
| Yes, d_t_fmt and date_fmt are *not* the same, this is just the history of this locale. This commit does not change d_t_fmt to better align with date_fmt. No users have requested we change d_t_fmt or given any justification for such a change.
|
| The only goals of this change are to place the abbreviated month name before the day of the month as it has been printed since 2000, and place the year at the end. This minimizes the change from commit 7395f3a0efad9fc51bb54fa383ef6524702e0c49 and makes good on changing only from 24H clock to 12H clock.
|
| Reviewed-by: Florian Weimer <fweimer@redhat.com>
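The progression of en_US date_fmt values above can be compared side by side with a small sketch. It assumes Python's strftime is running in the default "C" LC_TIME setting (English day and month names); `render` and the fixed zone name "UTC" are hypothetical helpers for illustration, and `%r`, `%e` and `%Z` are expanded by hand so the result does not depend on the platform's C library.

```python
from datetime import datetime

# The three en_US date_fmt values quoted in the commit message.
DATE_FMT_HISTORY = {
    "original (2000)": "%a %b %e %H:%M:%S %Z %Y",
    "glibc 2.29 (2019)": "%a %d %b %Y %I:%M:%S %p %Z",
    "glibc 2.32 (2020)": "%a %b %e %r %Z %Y",
}


def render(fmt, when, zone="UTC"):
    """Render a date_fmt string for `when` (hypothetical helper).

    %r, %e and %Z are substituted manually; everything else is left to
    Python's strftime, assumed to be in the default "C" locale.
    """
    fmt = fmt.replace("%r", "%I:%M:%S %p")      # meaning of %r in en_US, per the commit
    fmt = fmt.replace("%e", f"{when.day:2d}")   # day of month, space padded
    fmt = fmt.replace("%Z", zone)               # fixed zone name, an assumption
    return when.strftime(fmt)


if __name__ == "__main__":
    when = datetime(2020, 7, 5, 14, 3, 9)
    for label, fmt in DATE_FMT_HISTORY.items():
        print(f"{label:18} {render(fmt, when)}")
```

Only the middle entry puts the day before the month and the year before the time, which is the regression the commit reverts.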
* Set width of JUNGSEONG/JONGSEONG characters from U+D7B0 to U+D7FB to 0 [BZ #26120] | Mike FABIAN | 2020-06-26 | 10 | -9/+18
| Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* ckb_IQ, or_IN locales: Add missing reorder-end keywords | Florian Weimer | 2020-05-08 | 2 | -0/+4
| This suppresses a non-fatal error during locale building.
|
| Reviewed-by: Rafał Lużyński <digitalfreak@lingonborough.com>
* localedef: Add tests-container test for --no-hard-links. | Carlos O'Donell | 2020-04-30 | 6 | -0/+154
| The new tst-localedef-hardlinks test compiles two locales (with the default output directory), one with --no-hard-links and one without the option, and verifies that the result is the expected behaviour. When --no-hard-links is used, the link count on LC_CTYPE is 1, indicating that even though the two locales are identical (apart from differently named source files and output directories), localedef did not carry out the hard link optimization. When --no-hard-links is omitted, the localedef hard link optimization is correctly carried out, and for 2 compiled locales the link count for LC_CTYPE is 2.
|
| Reviewed-by: DJ Delorie <dj@redhat.com>
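The link-count check described above can be illustrated outside of glibc. This is a minimal sketch, not the actual tst-localedef-hardlinks test: `link_counts` is a hypothetical helper that writes the same payload into two temporary directories, either as one hard-linked inode or as two independent copies, and reports `st_nlink` for both paths. It assumes the temporary directory lives on a filesystem that supports hard links.

```python
import os
import tempfile


def link_counts(use_hard_links):
    """Install identical 'LC_CTYPE' data into two directories and return
    the link count of each copy (hypothetical stand-in for localedef)."""
    with tempfile.TemporaryDirectory() as root:
        paths = []
        for name in ("first", "second"):
            os.mkdir(os.path.join(root, name))
            paths.append(os.path.join(root, name, "LC_CTYPE"))

        with open(paths[0], "wb") as f:
            f.write(b"identical category data")

        if use_hard_links:
            # Optimization on: both names share one inode.
            os.link(paths[0], paths[1])
        else:
            # Equivalent of --no-hard-links: write an independent copy.
            with open(paths[1], "wb") as f:
                f.write(b"identical category data")

        return [os.stat(p).st_nlink for p in paths]


if __name__ == "__main__":
    print("no hard links:", link_counts(False))
    print("hard links:   ", link_counts(True))
```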
* Bug 25819: Update to Unicode 13.0.0 | Mike FABIAN | 2020-04-21 | 14 | -1438/+3833
| Unicode 13.0.0 Support: Character encoding, character type info, and transliteration tables are all updated to Unicode 13.0.0, using the generator scripts contributed by Mike FABIAN (Red Hat).
|
| Total added characters in newly generated CHARMAP: 5930
| Total added characters in newly generated WIDTH: 5536
* Updates to the shn_MM locale [BZ #25532] | kokoye2007 | 2020-04-08 | 1 | -69/+62
* oc_FR locale: Fix spelling of April (bug 25639) | Rafał Lużyński | 2020-04-07 | 1 | -2/+2
| Confirmed by CLDR and a native speaker: "abril" is more often used even if "abrial" is also correct. Both nominative (alt_mon) and genitive (mon) cases are updated.
* oc_FR locale: Fix spelling of Thursday (bug 25639) | Rafał Lużyński | 2020-03-19 | 1 | -1/+1
| As reported by a native speaker:
| Thursday: "dijóus" -> "dijòus" (also confirmed by CLDR)
* Fix typo in the name for Wednesday in Kurdish [BZ #9809] | Mike FABIAN | 2020-02-11 | 1 | -1/+1
* Update or_IN collation [BZ #22525] | Mike FABIAN | 2020-02-03 | 3 | -526/+186
| - Add a test file or_IN.UTF-8.in.
| - Make the collation agree with CLDR.
* Fix ckb_IQ [BZ #9809] | Mike FABIAN | 2020-02-03 | 4 | -309/+194
| Add ckb_IQ to SUPPORTED file.
| Add ckb_IQ.UTF-8.in collation test file.
| Mention new ckb_IQ locale in NEWS.
* Add new locale: ckb_IQ (Kurdish/Sorani spoken in Iraq) [BZ #9809] | Jwtiyar Nariman | 2020-02-03 | 1 | -0/+452
* sl_SI locale: Use "." as the thousands separator (bug 25233) | Rafał Lużyński | 2020-01-08 | 1 | -2/+2
| This is correct according to CLDR [1] and Florian Weimer's quick research. [2]
|
| [1] https://st.unicode.org/cldr-apps/v#/sl/Symbols/
| [2] https://sourceware.org/bugzilla/show_bug.cgi?id=25233#c0
|
| Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* Multiple locales: Add date_fmt (bug 24054) | Rafał Lużyński | 2020-01-02 | 204 | -203/+564
| It is not specified what should be the content of d_t_fmt and date_fmt but in the built-in C locale those fields have only one difference: date_fmt contains "%Z" (the current time zone) while d_t_fmt does not.
|
| For most of the locales this commit does the following operation: copy d_t_fmt to date_fmt, and then remove "%Z" from d_t_fmt. If "%Z" was originally missing from d_t_fmt add it to date_fmt. It also corrects comments where necessary.
|
| Exceptions:
|
| * In bo_CN, dz_BT, and km_KH "%Z" has not been added to date_fmt because it was too difficult. In these locales date_fmt has been set to the copy of d_t_fmt.
| * In en_DK "%Z" has not been removed from d_t_fmt in order to preserve the conformance with the standard mentioned in the comment.
|
| The command to identify and initially edit the locales that need the update was:
|
|   for i in `grep -lw d_t_fmt *`
|   do
|     if ! grep -qw date_fmt $i ; then
|       awk '/d_t_fmt/ { print $0; gsub("d_t_fmt", "date_fmt"); } //{ print $0 }' < $i > $i.next
|       mv $i.next $i
|     fi
|   done
|
| and then each file was further edited manually.
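The "copy d_t_fmt to date_fmt, then remove %Z from d_t_fmt" rule above can be sketched as a pure function. This is an illustration only: `derive_formats` is a hypothetical name, and appending " %Z" at the end when it was missing is an assumption, since the commit message does not say where in the string it was added (each file was edited manually afterwards).

```python
import re


def derive_formats(d_t_fmt):
    """Return (new_d_t_fmt, date_fmt) following the rule in the commit.

    Assumption: when "%Z" is missing it is appended at the end of date_fmt;
    the real placement was decided per locale by hand.
    """
    if "%Z" in d_t_fmt:
        date_fmt = d_t_fmt
        # Drop the first "%Z" together with the whitespace before it.
        new_d_t_fmt = re.sub(r"\s*%Z", "", d_t_fmt, count=1).strip()
    else:
        date_fmt = d_t_fmt + " %Z"
        new_d_t_fmt = d_t_fmt
    return new_d_t_fmt, date_fmt


if __name__ == "__main__":
    for fmt in ("%a %d %b %Y %r %Z", "%a %d %b %Y %T"):
        print(fmt, "->", derive_formats(fmt))
```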
* Update copyright dates with scripts/update-copyrights. | Joseph Myers | 2020-01-01 | 47 | -47/+47
* lv_LV locale: Correct the time part of d_t_fmt (bug 25324) | Rafał Lużyński | 2019-12-30 | 1 | -1/+1
| Currently d_t_fmt formats time as "plkst. %H un %M". A quick Google search says that "plkst." means "o’clock" and "un" means "and". Also this format does not display seconds.
|
| CLDR does not mention anything like that. We have no reason to use anything different than "%H:%M:%S".
* km_KH locale: Use "%M" instead of "m" in d_t_fmt (bug 25323) | Rafał Lużyński | 2019-12-30 | 1 | -1/+1
| A quick analysis suggests that the original author meant "%M" (minutes format specifier) instead of "m" which is just a literal "m" letter.
* mnw_MM, my_MM, and shn_MM locales: Do not use %Op | Rafał Lużyński | 2019-12-23 | 3 | -4/+4
| The "O" modifier does nothing when used with "%p" so let's better not use it at all and replace "%Op" with "%p".
* ru_UA locale: use copy "ru_RU" in LC_TIME (bug 25044) | Rafał Lużyński | 2019-11-26 | 1 | -69/+1
| Replacing incorrect abbreviated weekday names "Пнд", "Вто", "Срд"... with correct ones "Пн", "Вт", "Ср"... makes the LC_TIME sections in those two locales almost identical. The only remaining difference was that ab_alt_mon elements in ru_UA were lowercase while in ru_RU they had the first letter uppercase; the latter was pointed out as the better choice by a native speaker.
|
| This commit unifies LC_TIME between ru_RU and ru_UA.
* Add new locale: mnw_MM (Mon language spoken in Myanmar) [BZ #25139] | Talachan Mon | 2019-11-06 | 2 | -0/+288
* Add Transliterations for Unicode Misc. Mathematical Symbols-A/B [BZ #23132] | Arjun Shankar | 2019-10-25 | 3 | -3/+157
| This commit adds previously missing transliterations for several code points in the Unicode blocks "Miscellaneous Mathematical Symbols-A/B", transliterated to their approximate ASCII representations. It also adds a corresponding iconv transliteration test.
|
| Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* Install charmaps uncompressed in testroot | DJ Delorie | 2019-10-24 | 1 | -0/+18
| The testroot does not have a gunzip command, so the charmap files should not be installed gzipped else they cannot be used (and thus tested). With this patch, installing with INSTALL_UNCOMPRESSED=yes installs uncompressed charmaps instead.
|
| Note that we must purge the $(symbolic_link_list) as it contains references to $(DESTDIR), which we change during the testroot installation.
|
| Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* Sync "language", "lang_name", "territory", "country_name" with CLDR/langtable | Mike FABIAN | 2019-10-01 | 69 | -88/+135
| Sync these values with CLDR and langtable as much as possible. Add missing values.
|
| If possible, take the values from CLDR, if CLDR does not have it, take it from langtable. The values from langtable which are not from CLDR are from Wikipedia or native speakers.
* Prefer https to http for gnu.org and fsf.org URLs | Paul Eggert | 2019-09-07 | 78 | -78/+78
| Also, change sources.redhat.com to sourceware.org.
| This patch was automatically generated by running the following shell script, which uses GNU sed, and which avoids modifying files imported from upstream:
|
|   sed -ri '
|     s,(http|ftp)(://(.*\.)?(gnu|fsf|sourceware)\.org($|[^.]|\.[^a-z])),https\2,g
|     s,(http|ftp)(://(.*\.)?)sources\.redhat\.com($|[^.]|\.[^a-z]),https\2sourceware.org\4,g
|   ' \
|     $(find $(git ls-files) -prune -type f \
|         ! -name '*.po' \
|         ! -name 'ChangeLog*' \
|         ! -path COPYING ! -path COPYING.LIB \
|         ! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \
|         ! -path manual/texinfo.tex ! -path scripts/config.guess \
|         ! -path scripts/config.sub ! -path scripts/install-sh \
|         ! -path scripts/mkinstalldirs ! -path scripts/move-if-change \
|         ! -path INSTALL ! -path locale/programs/charmap-kw.h \
|         ! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \
|         ! '(' -name configure \
|               -execdir test -f configure.ac -o -f configure.in ';' ')' \
|         ! '(' -name preconfigure \
|               -execdir test -f preconfigure.ac ';' ')' \
|         -print)
|
| and then by running 'make dist-prepare' to regenerate files built from the altered files, and then executing the following to cleanup:
|
|   chmod a+x sysdeps/unix/sysv/linux/riscv/configure
|   # Omit irrelevant whitespace and comment-only changes,
|   # perhaps from a slightly-different Autoconf version.
|   git checkout -f \
|     sysdeps/csky/configure \
|     sysdeps/hppa/configure \
|     sysdeps/riscv/configure \
|     sysdeps/unix/sysv/linux/csky/configure
|   # Omit changes that caused a pre-commit check to fail like this:
|   # remote: *** error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines
|   git checkout -f \
|     sysdeps/powerpc/powerpc64/ppc-mcount.S \
|     sysdeps/unix/sysv/linux/s390/s390-64/syscall.S
|   # Omit change that caused a pre-commit check to fail like this:
|   # remote: *** error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline
|   git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S
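To see what the two sed substitutions above accept and reject, they can be transcribed into Python's `re` module. This is a sketch for experimenting with single lines, not the tool used for the commit; `prefer_https` is a hypothetical name, and Python's regex engine is assumed to behave like GNU sed's extended regexes for these particular patterns.

```python
import re

# Direct transcriptions of the two sed expressions from the commit message.
GNU_RE = re.compile(
    r"(http|ftp)(://(.*\.)?(gnu|fsf|sourceware)\.org($|[^.]|\.[^a-z]))"
)
REDHAT_RE = re.compile(
    r"(http|ftp)(://(.*\.)?)sources\.redhat\.com($|[^.]|\.[^a-z])"
)


def prefer_https(line):
    """Apply both substitutions to one line of text (hypothetical helper)."""
    line = GNU_RE.sub(r"https\2", line)
    line = REDHAT_RE.sub(r"https\g<2>sourceware.org\4", line)
    return line


if __name__ == "__main__":
    for sample in (
        "See http://www.gnu.org/licenses/.",
        "ftp://ftp.gnu.org/gnu/glibc",
        "http://sources.redhat.com/glibc",
        "http://gnu.org.evil.com/",  # left alone: ".org" is followed by ".e"
    ):
        print(prefer_https(sample))
```

The trailing alternative `($|[^.]|\.[^a-z])` is what keeps a host such as gnu.org.evil.com from matching: after `.org` the pattern only allows end of line, a non-dot character, or a dot that is not followed by a lowercase letter.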
* Chinese locales: Set first_weekday to 2 (bug 24682). | Rafal Luzynski | 2019-08-23 | 3 | -0/+3
| The first day of the week in China (Mainland) should be Monday according to the national standard GB/T 7408-2005.
|
| References:
| * https://www.doc88.com/p-1166696540287.html
| * https://unicode-org.atlassian.net/browse/CLDR-11510
|
| [BZ #24682]
| * localedata/locales/bo_CN (first_weekday): Add, set to 2 (Monday).
| * localedata/locales/ug_CN (first_weekday): Likewise.
| * localedata/locales/zh_CN (first_weekday): Likewise.
* Afar locales: Months and days updated from CLDR (bug 21897). | Rafal Luzynski | 2019-07-17 | 4 | -12/+12
| This commit updates month and weekday names (full and abbreviated) from CLDR 35.1 with the following exceptions.
|
| It was not clear why the full name of February in aa_DJ and aa_ET was "Kudo" while the abbreviated version is "Nah" but some additional sources [1] [2] as well as the content of aa_ER and aa_ER@saaho suggest it should be "Naharsi Kudo". This commit consequently sets the translation of February to "Naharsi Kudo" in aa_DJ and aa_ET.
|
| aa_ER@saaho is not supported by CLDR but since the month names were identical to aa_ER before this commit, the same values have been copied from aa_ER.
|
| Links:
| [1] https://fr.wiktionary.org/wiki/naharsi_kudo
| [2] http://www.mcit.gov.et/web/guest/-/localization-standard-for-afaraf
|
| [BZ #21897]
| * localedata/locales/aa_DJ (abday): Update from CLDR, all words begin with an uppercase letter now.
| (abmon): Likewise.
| (mon): Update from CLDR, reword February from "Kudo" to "Naharsi Kudo", April from "Agda Baxisso" to "Agda Baxis", and August from "Liiqen" to "Leqeeni".
| * localedata/locales/aa_ER (mon): Update from CLDR, reword April from "Agda Baxisso" to "Agda Baxis" and August from "Leqeeni" to "Liiqen".
| * localedata/locales/aa_ER@saaho (mon): Likewise.
| * localedata/locales/aa_ET (abmon): Update from CLDR, reword abbreviated February from "Kud" to "Nah".
| (mon): Update from CLDR, reword February from "Kudo" to "Naharsi Kudo" and April from "Agda Baxisso" to "Agda Baxis".
* nl_BE locale: Use "copy "nl_NL"" in LC_NAME (bug 23996). | Rafal Luzynski | 2019-07-17 | 1 | -6/+1
| The content of the section is identical in both languages.
|
| [BZ #23996]
| * localedata/locales/nl_BE (LC_NAME): Replace with “copy "nl_NL"”.
* nl_BE and nl_NL locales: Dutch salutations (bug 23996). | PanderMusubi | 2019-07-17 | 2 | -0/+10
| [BZ #23996]
| * localedata/locales/nl_BE (LC_NAME): Add name_gen, name_mr, name_mrs, name_miss, and name_ms.
| * localedata/locales/nl_NL (LC_NAME): Likewise.
* ga_IE and en_IE locales: Revert first_weekday removal (bug 24200). | Daniil Zhilin | 2019-07-17 | 2 | -0/+2
| These values were removed by the commit 0a410e76f5.
|
| [BZ #24200]
| * localedata/locales/ga_IE (first_weekday): Add, set to 2 (Monday).
| * localedata/locales/en_IE (first_weekday): Likewise.
* szl_PL locale: Fix a typo in the previous commit (bug 24652). | Rafal Luzynski | 2019-06-24 | 1 | -1/+1
| The Unicode sequences in the format <Uxxxx> should be used instead of non-ASCII characters.
|
| Reported by Piotr Drąg: https://sourceware.org/bugzilla/show_bug.cgi?id=24652#c8
|
| [BZ #24652]
| * localedata/locales/szl_PL (day): Use the correct Unicode sequences instead of non-ASCII characters.
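The <Uxxxx> convention mentioned above is easy to apply mechanically. A minimal sketch, assuming ASCII characters stay literal and that code points beyond U+FFFF use the eight-digit <Uxxxxxxxx> form; `to_locale_source` is a hypothetical helper, not a glibc tool.

```python
def to_locale_source(text):
    """Replace every non-ASCII character with a <Uxxxx> sequence.

    Assumption: code points above U+FFFF are written with eight hex digits.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)                # ASCII is kept as-is
        elif cp <= 0xFFFF:
            out.append(f"<U{cp:04X}>")    # four hex digits for the BMP
        else:
            out.append(f"<U{cp:08X}>")    # assumed eight-digit form
    return "".join(out)


if __name__ == "__main__":
    print(to_locale_source("Drąg"))
```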
* szl_PL locale: Spelling corrections (bug 24652). | Grzegorz Kulik | 2019-06-24 | 1 | -15/+27
| This commit also provides the correct month names in both nominative and genitive case for Silesian language, as required by the fix for the bug 10871.
|
| [BZ #24652]
| * localedata/locales/szl_PL (abday): Spelling corrections.
| (day): Likewise.
| (abmon): Likewise.
| (mon): Rename to...
| (alt_mon): This, then apply spelling corrections.
| (mon): New entry, month names in the genitive case.
* nl_{AW,NL}: Correct the thousands separator and grouping (bug 23831). | Rafal Luzynski | 2019-06-21 | 2 | -4/+4
| According to CLDR 35.1 and the bug report the thousands grouping separator should be always "." (a single dot) and digits should be grouped by 3.
|
| [BZ #23831]
| * localedata/locales/nl_AW (mon_thousands_sep): Set to ".".
| * localedata/locales/nl_NL (mon_thousands_sep): Likewise.
| (thousands_sep): Likewise.
| (grouping): Set to 3;3.
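The effect of a "." separator with digits grouped by 3 can be shown without any installed locale. This is a hand-rolled sketch rather than a call into the C library; `group_digits` is a hypothetical helper that only handles a single repeating group size.

```python
def group_digits(digits, sep=".", size=3):
    """Insert `sep` between groups of `size` digits, counting from the right."""
    groups = []
    while len(digits) > size:
        groups.append(digits[-size:])   # peel off the rightmost group
        digits = digits[:-size]
    groups.append(digits)               # whatever is left is the leading group
    return sep.join(reversed(groups))


if __name__ == "__main__":
    for value in ("123", "1234", "1234567"):
        print(value, "->", group_digits(value))
```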
* nl_AW locale: Correct the negative monetary format (bug 24614). | Rafal Luzynski | 2019-06-19 | 1 | -2/+2
| Follow the same changes as made in the commit 02d8b5ab1c because the respective entries in nl_NL and nl_AW had been the same before the change so they should be the same after. CLDR does not provide complete data for nl_AW, it says it is missing and displays a copy of nl_NL.
|
| [BZ #24614]
| * localedata/locales/nl_AW (n_sep_by_space): Set to 2 (a space between the currency symbol and the minus sign).
| (n_sign_posn): Set to 4 (the minus sign after the currency symbol).
* nl_NL locale: Correct the negative monetary format (bug 24614). | Rafal Luzynski | 2019-06-17 | 3 | -3/+5
| According to CLDR 35.1 and the bug report the correct monetary format for negative amounts should be "EUR -1 234,56" while previously it was "EUR 1 234,56-". This patch does not change the thousands (grouping) separator.
|
| [BZ #24614]
| * localedata/Makefile (LOCALES): Add nl_NL.UTF-8.
| * localedata/locales/nl_NL (n_sep_by_space): Set to 2 (a space between the currency symbol and the minus sign).
| (n_sign_posn): Set to 4 (the minus sign after the currency symbol).
| * localedata/tst-strfmon1.c (tests): Add test data for nl_NL.UTF-8.
* tt_RU: Add lang_name [BZ #24370] | mansayk | 2019-05-28 | 1 | -0/+1
| This commit adds a lang_name according to CLDR-35.1.
|
| [BZ #24370]
| * localedata/locales/tt_RU (lang_name): Add from CLDR-35.1.
* tt_RU: Fix orthographic mistakes in mon and abmon sections [BZ #24369] | mansayk | 2019-05-28 | 1 | -24/+24
| This commit fixes some errors and converts all month names to lowercase. The content is synchronized with CLDR-35.1 now but trailing dots are removed from abmon values in order to maintain consistency with the previous values and with many other locales which do the same.
|
| [BZ #24369]
| * localedata/locales/tt_RU (mon): Update from CLDR-35.1, fix errors.
| (abmon): Likewise, but remove the trailing dots.
* Bug 24535: Update to Unicode 12.1.0 | Mike FABIAN | 2019-05-13 | 14 | -235/+233
| Unicode 12.1.0 Support: Character encoding, character type info, and transliteration tables are all updated to Unicode 12.1.0, using the generator scripts contributed by Mike FABIAN (Red Hat).
|
| Some info about the number of characters added or changed:
|
| Total added characters in newly generated CHARMAP: 1
|   added: <U32FF> /xe3/x8b/xbf SQUARE ERA NAME REIWA
|
| Total added characters in newly generated WIDTH: 1
|   added: <U32FF> 2 : eaw=W category=So bidi=L name=SQUARE ERA NAME REIWA
|
| graph: Added 1 characters in new ctype which were not in old ctype
|   graph: Added: ㋿ U+32FF SQUARE ERA NAME REIWA
|
| print: Added 1 characters in new ctype which were not in old ctype
|   print: Added: ㋿ U+32FF SQUARE ERA NAME REIWA
|
| punct: Added 1 characters in new ctype which were not in old ctype
|   punct: Added: ㋿ U+32FF SQUARE ERA NAME REIWA
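The CHARMAP line quoted above pairs a code point with its UTF-8 byte sequence in /xNN notation. A small sketch that rebuilds that pairing for a BMP code point; `charmap_entry` is a hypothetical helper and deliberately omits the character name, which the real generator scripts take from the Unicode data files.

```python
def charmap_entry(code_point):
    """Format '<Uxxxx> /x../x..' for a BMP code point (name column omitted)."""
    encoded = chr(code_point).encode("utf-8")
    byte_text = "".join(f"/x{b:02x}" for b in encoded)
    return f"<U{code_point:04X}> {byte_text}"


if __name__ == "__main__":
    # U+32FF SQUARE ERA NAME REIWA, the one character added in Unicode 12.1.0.
    print(charmap_entry(0x32FF))
```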
* ja_JP locale: Add entry for the new Japanese era [BZ #22964] | TAMUKI Shoichi | 2019-04-02 | 1 | -2/+4
| The Japanese era name will be changed on May 1, 2019. The Japanese government made a preliminary announcement on April 1, 2019.
|
| The glibc ja_JP locale must be updated to include the new era name for strftime's alternative year format support.
|
| Checked on x86_64-linux-gnu.
|
| Reviewed-by: Carlos O'Donell <carlos@redhat.com>
|
| ChangeLog:
|
| [BZ #22964]
| * localedata/locales/ja_JP (LC_TIME): Add entry for the new Japanese era.
| * time/tst-strftime2.c (dates): Add 2019-04-30 and 2019-05-01.
| (mkreftable): Add rules for the new Japanese era and the new dates.
* Add verbose comments to 'era' in ja_JP locale. | Carlos O'Donell | 2019-04-01 | 1 | -0/+18
| Reviewed-by: Rafal Luzynski <digitalfreak@lingonborough.com>
| Reviewed-by: TAMUKI Shoichi <tamuki@linet.gr.jp>
* tt_RU: Fix orthographic mistakes in day and abday sections [BZ #24296] | mansayk | 2019-03-20 | 1 | -14/+14
| This commit fixes some errors and converts all weekday names to lowercase. The content is synchronized with CLDR-34 now, but trailing dots are removed from abday values in order to maintain consistency with the previous values and with many other locales which do the same.
|
| [BZ #24296]
| * localedata/locales/tt_RU (day): Update from CLDR-34, fix errors.
| (abday): Likewise, but remove the trailing dots.
* localedata: Add Minguo calendar support to Taiwanese locales [BZ #24293] | Felix Yan | 2019-03-15 | 5 | -0/+20
| The Minguo calendar is the official calendar system and is very widely used in Taiwan. This commit adds support for it to glibc.
|
| Some background information: The government website (www.gov.tw) uses it, and popular public services like Taiwan HSR also use this calendar system.
|
| Link to Wikipedia: https://en.wikipedia.org/wiki/Minguo_calendar
|
| [BZ #24293]
| * localedata/locales/zh_TW (era): Add, support Minguo calendar.
| * localedata/locales/cmn_TW (era): Likewise.
| * localedata/locales/hak_TW (era): Likewise.
| * localedata/locales/lzh_TW (era): Likewise.
| * localedata/locales/nan_TW (era): Likewise.
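For readers unfamiliar with the Minguo calendar, the year arithmetic is a fixed offset from the Gregorian year, with 1912 as year 1. The sketch below shows only that arithmetic; `to_minguo_year` is a hypothetical helper and does not reproduce how the locales' era entries are written or how years before 1912 are handled.

```python
MINGUO_EPOCH = 1911  # Minguo year 1 corresponds to Gregorian year 1912


def to_minguo_year(gregorian_year):
    """Convert a Gregorian year (1912 or later) to a Minguo year."""
    if gregorian_year <= MINGUO_EPOCH:
        raise ValueError("years before 1912 are outside this sketch")
    return gregorian_year - MINGUO_EPOCH


if __name__ == "__main__":
    for year in (1912, 2019):
        print(year, "->", to_minguo_year(year))
```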
* Bug 24307: Update to Unicode 12.0.0 | Mike FABIAN | 2019-03-08 | 15 | -2595/+4039
| Unicode 12.0.0 Support: Character encoding, character type info, and transliteration tables are all updated to Unicode 12.0.0, using the generator scripts contributed by Mike FABIAN (Red Hat).
|
| Some info about the number of characters added or changed:
|
| Total added characters in newly generated CHARMAP: 554
| Total added characters in newly generated WIDTH: 106
|
| alpha: Missing 8 characters of old ctype in new ctype (These are combining marks, apparently they were removed from alpha on purpose)
| alpha: Added 295 characters in new ctype which were not in old ctype
| combining: Missing 2 characters of old ctype in new ctype (U+1CF2 VEDIC SIGN ARDHAVISARGA and U+1CF3 VEDIC SIGN ROTATED ARDHAVISARGA, these are now "Alphabetic" in Unicode 12.0.0)
| combining: Added 37 characters in new ctype which were not in old ctype
| combining_level3: Missing 2 characters of old ctype in new ctype (U+1CF2 VEDIC SIGN ARDHAVISARGA and U+1CF3 VEDIC SIGN ROTATED ARDHAVISARGA, these are now "Alphabetic" in Unicode 12.0.0)
| combining_level3: Added 26 characters in new ctype which were not in old ctype
| graph: Added 554 characters in new ctype which were not in old ctype
| lower: Added 6 characters in new ctype which were not in old ctype
| print: Added 554 characters in new ctype which were not in old ctype
| punct: Missing 29 characters of old ctype in new ctype (These characters have all become "Alphabetic" in Unicode 12.0.0. Therefore, they are not in "punct" anymore (see: is_punct() in unicode_utils.py))
| punct: Added 296 characters in new ctype which were not in old ctype
| tolower: Added 7 characters in new ctype which were not in old ctype
| totitle: Added 7 characters in new ctype which were not in old ctype
| toupper: Added 7 characters in new ctype which were not in old ctype
| upper: Added 7 characters in new ctype which were not in old ctype
|
| [BZ #24307]
| * localedata/unicode-gen/Makefile (UNICODE_VERSION): Set to 12.0.0.
| * localedata/unicode-gen/DerivedCoreProperties.txt: Update to Unicode 12.0.0.
| * localedata/unicode-gen/EastAsianWidth.txt: Likewise.
| * localedata/unicode-gen/PropList.txt: Likewise.
| * localedata/unicode-gen/UnicodeData.txt: Likewise.
| * localedata/unicode-gen/ctype_compatibility_test_cases.py: U+108D became "Alphabetic" in Unicode 12.0.0. Adapt test case.
| * localedata/charmaps/UTF-8: Regenerate.
| * localedata/locales/i18n_ctype: Likewise.
| * localedata/locales/tr_TR: Likewise.
| * localedata/locales/translit_circle: Likewise.
| * localedata/locales/translit_cjk_compat: Likewise.
| * localedata/locales/translit_combining: Likewise.
| * localedata/locales/translit_compat: Likewise.
| * localedata/locales/translit_font: Likewise.
| * localedata/locales/translit_fraction: Likewise.
* ja_JP: Change the offset for Taisho gan-nen from 2 to 1 [BZ #24162] | TAMUKI Shoichi | 2019-03-02 | 1 | -1/+1
| The offset in era-string format for Taisho gan-nen (1912) is currently defined as 2, but it should be 1. So fix it. "Gan-nen" means the 1st (origin) year, Taisho started on July 30, 1912.
|
| Reported-by: Morimitsu, Junji <junji.morimitsu@hpe.com>
| Reviewed-by: Rafal Luzynski <digitalfreak@lingonborough.com>
|
| ChangeLog:
|
| [BZ #24162]
| * localedata/locales/ja_JP (LC_TIME): Change the offset for Taisho gan-nen from 2 to 1. Problem reported by Morimitsu, Junji.
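The role of the offset can be sketched with simple arithmetic: for a forward-counting era entry, the year shown is the offset plus the number of years elapsed since the entry's start year. This is a simplified model that ignores months, days and backward-counting eras. `era_year` is a hypothetical helper, and the split of Taisho into a 1912 gan-nen entry and a separate entry starting in 1913 with offset 2 is an assumption about how the locale file is laid out.

```python
def era_year(gregorian_year, entry_start_year, offset):
    """Year number within an era for a forward-counting era entry.

    Simplified: only whole years are considered.
    """
    return offset + (gregorian_year - entry_start_year)


if __name__ == "__main__":
    # Gan-nen entry (assumed to cover 1912 only): offset 1 gives year 1.
    print("1912 with offset 1:", era_year(1912, 1912, 1))
    # The old, incorrect offset 2 would have labelled 1912 as year 2.
    print("1912 with offset 2:", era_year(1912, 1912, 2))
    # Assumed follow-up entry starting in 1913 with offset 2.
    print("1926 in follow-up entry:", era_year(1926, 1913, 2))
```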