author | Mike FABIAN <mfabian@redhat.com> | 2017-12-11 18:26:22 +0100 |
---|---|---|
committer | Mike FABIAN <mfabian@redhat.com> | 2018-02-27 17:47:50 +0100 |
commit | 159738548130d5ac4fe6178977e940ed5f8cfdc4 (patch) | |
tree | 03f90b90e7bb794cfdbd4b3e66c9fff7ad6a9b24 /localedata/locales/ml_IN | |
parent | ce6636b06b67d6bb9b3d6927bf2a926b9b7478f5 (diff) | |
Adapt collation in several locales to the new iso14651_t1_common file
[BZ #22550] - es_ES locale (and other es_* locales): collation should treat ñ as a primary different character, sync the collation for Spanish with CLDR
[BZ #21547] - Tibetan script collation broken (Dzongkha and Tibetan)

* localedata/Makefile: Add new test files.
* localedata/lv_LV.UTF-8.in: Adapt test file to new collation order.
* localedata/sv_SE.ISO-8859-1.in: Adapt test file to new collation order.
* localedata/uk_UA.UTF-8.in: Adapt test file to new collation order.
* localedata/am_ET.UTF-8.in: New test file.
* localedata/az_AZ.UTF-8.in: Likewise.
* localedata/be_BY.UTF-8.in: Likewise.
* localedata/ber_DZ.UTF-8.in: Likewise.
* localedata/ber_MA.UTF-8.in: Likewise.
* localedata/bg_BG.UTF-8.in: Likewise.
* localedata/br_FR.UTF-8.in: Likewise.
* localedata/cmn_TW.UTF-8.in: Likewise.
* localedata/crh_UA.UTF-8.in: Likewise.
* localedata/csb_PL.UTF-8.in: Likewise.
* localedata/cv_RU.UTF-8.in: Likewise.
* localedata/cy_GB.UTF-8.in: Likewise.
* localedata/dz_BT.UTF-8.in: Likewise.
* localedata/eo.UTF-8.in: Likewise.
* localedata/es_ES.UTF-8.in: Likewise.
* localedata/fa_IR.UTF-8.in: Likewise.
* localedata/fi_FI.UTF-8.in: Likewise.
* localedata/fil_PH.UTF-8.in: Likewise.
* localedata/fur_IT.UTF-8.in: Likewise.
* localedata/gez_ER.UTF-8@abegede.in: Likewise.
* localedata/ha_NG.UTF-8.in: Likewise.
* localedata/ig_NG.UTF-8.in: Likewise.
* localedata/ik_CA.UTF-8.in: Likewise.
* localedata/kk_KZ.UTF-8.in: Likewise.
* localedata/ku_TR.UTF-8.in: Likewise.
* localedata/ky_KG.UTF-8.in: Likewise.
* localedata/ln_CD.UTF-8.in: Likewise.
* localedata/mi_NZ.UTF-8.in: Likewise.
* localedata/ml_IN.UTF-8.in: Likewise.
* localedata/mn_MN.UTF-8.in: Likewise.
* localedata/mr_IN.UTF-8.in: Likewise.
* localedata/mt_MT.UTF-8.in: Likewise.
* localedata/nb_NO.UTF-8.in: Likewise.
* localedata/om_KE.UTF-8.in: Likewise.
* localedata/os_RU.UTF-8.in: Likewise.
* localedata/ps_AF.UTF-8.in: Likewise.
* localedata/ro_RO.UTF-8.in: Likewise.
* localedata/ru_RU.UTF-8.in: Likewise.
* localedata/sc_IT.UTF-8.in: Likewise.
* localedata/se_NO.UTF-8.in: Likewise.
* localedata/sq_AL.UTF-8.in: Likewise.
* localedata/sv_SE.UTF-8.in: Likewise.
* localedata/szl_PL.UTF-8.in: Likewise.
* localedata/tg_TJ.UTF-8.in: Likewise.
* localedata/tk_TM.UTF-8.in: Likewise.
* localedata/tt_RU.UTF-8.in: Likewise.
* localedata/tt_RU.UTF-8@iqtelif.in: Likewise.
* localedata/ug_CN.UTF-8.in: Likewise.
* localedata/uz_UZ.UTF-8.in: Likewise.
* localedata/vi_VN.UTF-8.in: Likewise.
* localedata/yi_US.UTF-8.in: Likewise.
* localedata/yo_NG.UTF-8.in: Likewise.
* localedata/zh_CN.UTF-8.in: Likewise.
* localedata/locales/am_ET: Adapt collation rules to new iso14651_t1_common file and fix bugs in the collation.
* localedata/locales/az_AZ: Likewise.
* localedata/locales/be_BY: Likewise.
* localedata/locales/ber_DZ: Likewise.
* localedata/locales/ber_MA: Likewise.
* localedata/locales/bg_BG: Likewise.
* localedata/locales/br_FR: Likewise.
* localedata/locales/br_FR@euro: Likewise.
* localedata/locales/ca_ES: Likewise.
* localedata/locales/cns11643_stroke: Likewise.
* localedata/locales/crh_UA: Likewise.
* localedata/locales/cs_CZ: Likewise.
* localedata/locales/csb_PL: Likewise.
* localedata/locales/cv_RU: Likewise.
* localedata/locales/cy_GB: Likewise.
* localedata/locales/da_DK: Likewise.
* localedata/locales/dz_BT: Likewise.
* localedata/locales/en_CA: Likewise.
* localedata/locales/eo: Likewise.
* localedata/locales/es_CU: Likewise.
* localedata/locales/es_EC: Likewise.
* localedata/locales/es_ES: Likewise.
* localedata/locales/es_US: Likewise.
* localedata/locales/et_EE: Likewise.
* localedata/locales/fa_IR: Likewise.
* localedata/locales/fi_FI: Likewise.
* localedata/locales/fil_PH: Likewise.
* localedata/locales/fur_IT: Likewise.
* localedata/locales/gez_ER@abegede: Likewise.
* localedata/locales/ha_NG: Likewise.
* localedata/locales/hr_HR: Likewise.
* localedata/locales/hsb_DE: Likewise.
* localedata/locales/hu_HU: Likewise.
* localedata/locales/ig_NG: Likewise.
* localedata/locales/ik_CA: Likewise.
* localedata/locales/is_IS: Likewise.
* localedata/locales/iso14651_t1_pinyin: Likewise.
* localedata/locales/kk_KZ: Likewise.
* localedata/locales/ku_TR: Likewise.
* localedata/locales/ky_KG: Likewise.
* localedata/locales/ln_CD: Likewise.
* localedata/locales/lt_LT: Likewise.
* localedata/locales/lv_LV: Likewise.
* localedata/locales/mi_NZ: Likewise.
* localedata/locales/ml_IN: Likewise.
* localedata/locales/mn_MN: Likewise.
* localedata/locales/mr_IN: Likewise.
* localedata/locales/mt_MT: Likewise.
* localedata/locales/nb_NO: Likewise.
* localedata/locales/om_KE: Likewise.
* localedata/locales/os_RU: Likewise.
* localedata/locales/pl_PL: Likewise.
* localedata/locales/ps_AF: Likewise.
* localedata/locales/ro_RO: Likewise.
* localedata/locales/ru_RU: Likewise.
* localedata/locales/ru_UA: Likewise.
* localedata/locales/sc_IT: Likewise.
* localedata/locales/se_NO: Likewise.
* localedata/locales/si_LK: Likewise.
* localedata/locales/sq_AL: Likewise.
* localedata/locales/sv_FI: Likewise.
* localedata/locales/sv_FI@euro: Likewise.
* localedata/locales/sv_SE: Likewise.
* localedata/locales/szl_PL: Likewise.
* localedata/locales/tg_TJ: Likewise.
* localedata/locales/ti_ER: Likewise.
* localedata/locales/tk_TM: Likewise.
* localedata/locales/tl_PH: Likewise.
* localedata/locales/tr_TR: Likewise.
* localedata/locales/tt_RU: Likewise.
* localedata/locales/tt_RU@iqtelif: Likewise.
* localedata/locales/ug_CN: Likewise.
* localedata/locales/uk_UA: Likewise.
* localedata/locales/uz_UZ: Likewise.
* localedata/locales/uz_UZ@cyrillic: Likewise.
* localedata/locales/vi_VN: Likewise.
* localedata/locales/yi_US: Likewise.
* localedata/locales/yo_NG: Likewise.
Diffstat (limited to 'localedata/locales/ml_IN')
-rw-r--r-- | localedata/locales/ml_IN | 158 |
1 files changed, 157 insertions, 1 deletions
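As a reading aid for the ml_IN diff that follows: each rule line assigns a collating element one weight per comparison level, separated by semicolons (primary; secondary; tertiary; fourth). The sketch below is a toy Python model of the first three levels for the four NA-based entries, not glibc's implementation. It assumes the symbolic weights compare as `BASE < VRNT1` and `MIN < COMPAT < COMPATCAP`; the numeric values, `WEIGHTS`, `sort_key`, and `first_difference` are hypothetical names introduced only for illustration.

```python
# Toy model of the per-level weights in the ml_IN rules (illustration only).
# Assumption: BASE < VRNT1 and MIN < COMPAT < COMPATCAP; the integers below
# are made-up stand-ins for the symbolic weights, not values from glibc.
BASE, VRNT1 = 0, 1
MIN, COMPAT, COMPATCAP = 0, 1, 2

# All four entries share the primary weight "<S0D28><S0D4D>" (NA + virama).
NA_VIRAMA = ("S0D28", "S0D4D")

# element -> (primary, secondary, tertiary), mirroring the rule lines
WEIGHTS = {
    "\u0d28\u0d4d":       (NA_VIRAMA, (BASE, BASE),  (MIN, MIN)),              # NA + virama
    "\u0d28\u0d4d\u200d": (NA_VIRAMA, (BASE, VRNT1), (MIN, MIN)),              # NA + virama + ZWJ
    "\u0d7b\u0d4d":       (NA_VIRAMA, (BASE, VRNT1), (COMPAT, COMPAT)),        # chillu N + virama
    "\u0d7b":             (NA_VIRAMA, (BASE, VRNT1), (COMPATCAP, COMPATCAP)),  # chillu N
}


def sort_key(element):
    """Tuple comparison checks primary first, then secondary, then tertiary."""
    return WEIGHTS[element]


def first_difference(a, b):
    """Return the first level (1 to 3) at which a and b differ, or None."""
    for level, (wa, wb) in enumerate(zip(WEIGHTS[a], WEIGHTS[b]), start=1):
        if wa != wb:
            return level
    return None


ordered = sorted(
    ["\u0d7b", "\u0d7b\u0d4d", "\u0d28\u0d4d\u200d", "\u0d28\u0d4d"],
    key=sort_key,
)
```

Under these assumptions `ordered` follows the combined rule quoted in the diff comments: NA + virama, then the ZWJ form (secondary difference), then chillu N + virama and chillu N (tertiary differences). Real collation compares whole strings level by level; for single elements the tuple comparison used here is equivalent.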
diff --git a/localedata/locales/ml_IN b/localedata/locales/ml_IN
index 32b467f96d..2e6cfe52ca 100644
--- a/localedata/locales/ml_IN
+++ b/localedata/locales/ml_IN
@@ -65,8 +65,164 @@ END LC_CTYPE % % LC_COLLATE
-% Copy the template from ISO/IEC 14651
+% CLDR collation rules for Malayalam:
+% (see: https://unicode.org/cldr/trac/browser/trunk/common/collation/ml.xml)
+%
+% <collation type="standard" references="Sabdatharavali Malayalam Dictionary 23rd Ed. by Sahithya Pravarthaka Co-operative Society Ltd.">
+% <cr><![CDATA[
+% [reorder Mlym Latn Deva Arab Taml Knda Telu Beng Guru Gujr Orya Sinh] # native speaker's special list
+% #
+% # Avagraha and Visarga are primary ignorables.
+% #
+% &ഃ<<ഽ
+% #
+% # Vowel sign AU ( ൌ) and AU length mark ( ൗ) need to differ
+% # only on secondary level, not primary.
+% #
+% &\u0D4C<<\u0D57
+% #
+% # Pre-5.1 Chillus secondary equal to 5.1 chillus.
+% # Chillus primary equal to their consonant_dead form.
+% #
+% &ക്<<ക്\u200D<<<ൿ
+% &ണ്<<ണ്\u200D<<<ൺ
+% &ന്<<ന്\u200D<<<ൻ
+% &ര്<<ര്\u200D<<<ർ
+% &ല്<<ല്\u200D<<<ൽ
+% &ള്<<ള്\u200D<<<ൾ
+% #
+% # Anuswara primary equal to MA_dead.
+% #
+% &മ്<<ം
+% #
+% # /nta/ is sorted as <NA, Virama, RRA>.
+% #
+% &ന്<<<ൻ്
+% ]]></cr>
+% </collation>
+%
+% And CLDR also lists the following
+% index characters:
+% (see: https://unicode.org/cldr/trac/browser/trunk/common/main/ml.xml)
+%
+% <exemplarCharacters type="index" draft="contributed">[അ ആ ഇ ഈ ഉ ഊ ഋ എ ഏ ഐ ഒ ഓ ഔ ക ഖ ഗ ഘ ങ ച ഛ ജ ഝ ഞ ട ഠ ഡ ഢ ണ ത ഥ ദ ധ ന പ ഫ ബ ഭ മ യ ര ല വ ശ ഷ സ ഹ ള ഴ റ]</exemplarCharacters>
+%
+% The following rules implement the same order for glibc.
 copy "iso14651_t1"
+% &ക്<<ക്\u200D<<<ൿ
+collating-element <e0d15-0d4d> from "<U0D15><U0D4D>"
+collating-symbol <s0d15-0d4d>
+collating-element <e0d15-0d4d-200d> from "<U0D15><U0D4D><U200D>"
+collating-symbol <s0d15-0d4d-200d>
+% &ണ്<<ണ്\u200D<<<ൺ
+collating-element <e0d23-0d4d> from "<U0D23><U0D4D>"
+collating-symbol <s0d23-0d4d>
+collating-element <e0d23-0d4d-200d> from "<U0D23><U0D4D><U200D>"
+collating-symbol <s0d23-0d4d-200d>
+% &ന്<<ന്\u200D<<<ൻ
+collating-element <e0d28-0d4d> from "<U0D28><U0D4D>"
+collating-symbol <s0d28-0d4d>
+collating-element <e0d28-0d4d-200d> from "<U0D28><U0D4D><U200D>"
+collating-symbol <s0d28-0d4d-200d>
+% &ര്<<ര്\u200D<<<ർ
+collating-element <e0d30-0d4d> from "<U0D30><U0D4D>"
+collating-symbol <s0d30-0d4d>
+collating-element <e0d30-0d4d-200d> from "<U0D30><U0D4D><U200D>"
+collating-symbol <s0d30-0d4d-200d>
+% &ല്<<ല്\u200D<<<ൽ
+collating-element <e0d32-0d4d> from "<U0D32><U0D4D>"
+collating-symbol <s0d32-0d4d>
+collating-element <e0d32-0d4d-200d> from "<U0D32><U0D4D><U200D>"
+collating-symbol <s0d32-0d4d-200d>
+% &ള്<<ള്\u200D<<<ൾ
+collating-element <e0d33-0d4d> from "<U0D33><U0D4D>"
+collating-symbol <s0d33-0d4d>
+collating-element <e0d33-0d4d-200d> from "<U0D33><U0D4D><U200D>"
+collating-symbol <s0d33-0d4d-200d>
+% #
+% # Anuswara primary equal to MA_dead.
+% #
+% &മ്<<ം
+collating-element <e0d2e-0d4d> from "<U0D2e><U0D4D>"
+collating-symbol <s0d2e-0d4d>
+% #
+% # /nta/ is sorted as <NA, Virama, RRA>.
+% #
+% &ന്<<<ൻ്
+% already defined:
+% collating-element <e0d28-0d4d> from "<U0D28><U0D4D>"
+% already defined:
+% collating-symbol <s0d28-0d4d>
+collating-element <e0d7b-0d4d> from "<U0D7B><U0D4D>"
+collating-symbol <s0d7b-0d4d>
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+% Finished defining collating-elements and collating-symbols
+%
+% One dummy reorder-after statement here to avoid a syntax error
+% because the first rule reordering stuff starts without a reorder-after:
+collating-symbol <dummy>
+reorder-after <AFTER-A>
+<dummy>
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+% # Avagraha and Visarga are primary ignorables.
+% &ഃ<<ഽ
+<U0D03> IGNORE;<VISARGA>;<MIN>;<U0D03> % MALAYALAM SIGN VISARGA
+<U0D3D> IGNORE;<VRNT1>;<MIN>;<U0D3D> % MALAYALAM SIGN AVAGRAHA
+% # Vowel sign AU ( ൌ) and AU length mark ( ൗ) need to differ
+% # only on secondary level, not primary.
+% #
+% &\u0D4C<<\u0D57
+<U0D4C> <S0D4C>;<BASE>;<MIN>;<U0D4C> % MALAYALAM VOWEL SIGN AU
+<U0D57> <S0D4C>;<VRNT1>;<MIN>;<U0D57> % MALAYALAM AU LENGTH MARK
+% &ക്<<ക്\u200D<<<ൿ
+<e0d15-0d4d> "<S0D15><S0D4D>";"<BASE><BASE>";"<MIN><MIN>";IGNORE
+<e0d15-0d4d-200d> "<S0D15><S0D4D>";"<BASE><VRNT1>";"<MIN><MIN>";IGNORE
+<U0D7F> "<S0D15><S0D4D>";"<BASE><VRNT1>";"<COMPAT><COMPAT>";<U0D7F>
+% &ണ്<<ണ്\u200D<<<ൺ
+<e0d23-0d4d> "<S0D23><S0D4D>";"<BASE><BASE>";"<MIN><MIN>";IGNORE
+<e0d23-0d4d-200d> "<S0D23><S0D4D>";"<BASE><VRNT1>";"<MIN><MIN>";IGNORE
+<U0D7A> "<S0D23><S0D4D>";"<BASE><VRNT1>";"<COMPAT><COMPAT>";<U0D7A>
+% &ന്<<ന്\u200D<<<ൻ
+<e0d28-0d4d> "<S0D28><S0D4D>";"<BASE><BASE>";"<MIN><MIN>";IGNORE % ന്
+<e0d28-0d4d-200d> "<S0D28><S0D4D>";"<BASE><VRNT1>";"<MIN><MIN>";IGNORE % ന്
+<U0D7B> "<S0D28><S0D4D>";"<BASE><VRNT1>";"<COMPATCAP><COMPATCAP>";<U0D7B> % ൻ
+% &ര്<<ര്\u200D<<<ർ
+<e0d30-0d4d> "<S0D30><S0D4D>";"<BASE><BASE>";"<MIN><MIN>";IGNORE
+<e0d30-0d4d-200d> "<S0D30><S0D4D>";"<BASE><VRNT1>";"<MIN><MIN>";IGNORE
+<U0D7C> "<S0D30><S0D4D>";"<BASE><VRNT1>";"<COMPAT><COMPAT>";<U0D7C> % ർ
+% &ല്<<ല്\u200D<<<ൽ
+<e0d32-0d4d> "<S0D32><S0D4D>";"<BASE><BASE>";"<MIN><MIN>";IGNORE
+<e0d32-0d4d-200d> "<S0D32><S0D4D>";"<BASE><VRNT1>";"<MIN><MIN>";IGNORE
+<U0D7D> "<S0D32><S0D4D>";"<BASE><VRNT1>";"<COMPAT><COMPAT>";<U0D7D>
+% &ള്<<ള്\u200D<<<ൾ
+<e0d33-0d4d> "<S0D33><S0D4D>";"<BASE><BASE>";"<MIN><MIN>";IGNORE
+<e0d33-0d4d-200d> "<S0D33><S0D4D>";"<BASE><VRNT1>";"<MIN><MIN>";IGNORE
+<U0D7E> "<S0D33><S0D4D>";"<BASE><VRNT1>";"<COMPAT><COMPAT>";<U0D7E>
+% #
+% # Anuswara primary equal to MA_dead.
+% #
+% &മ്<<ം
+<e0d2e-0d4d> "<S0D2E><S0D4D>";"<BASE><BASE>";"<MIN><MIN>";IGNORE % മ്
+<U0D02> "<S0D2E><S0D4D>";"<BASE><VRNT1>";"<MIN><MIN>";IGNORE % MALAYALAM SIGN ANUSVARA
+% #
+% # /nta/ is sorted as <NA, Virama, RRA>.
+% #
+% &ന്<<<ൻ്
+%
+% It looks to me that the above line
+% is a contradiction to the earlier rule: &ന്<<ന്\u200D<<<ൻ
+% I experimented with libicu to see how libicu sorts given these rules.
+% And the end result seems to be the same as if the above two rules had been
+% combined in a rule like this:
+%
+% &ന്<<ന്\u200D<<<ൻ്<<<ൻ
+%
+% So I write the glibc rules to reproduce that behaviour.
+<e0d28-0d4d> "<S0D28><S0D4D>";"<BASE><BASE>";"<MIN><MIN>";<U0D28> % ന്
+<e0d7b-0d4d> "<S0D28><S0D4D>";"<BASE><VRNT1>";"<COMPAT><COMPAT>";<U0D7B> % ൻ്
+
+reorder-end
+
 END LC_COLLATE
 % LC_MONETARY