author | Florian Weimer <fweimer@redhat.com> | 2022-07-05 09:05:45 +0200 |
---|---|---|
committer | Florian Weimer <fweimer@redhat.com> | 2022-07-05 09:06:50 +0200 |
commit | b15538d77c6a7893c8bb42831dcd3a1a12b727d4 (patch) | |
tree | 0ebaa0c09cc8f21437021e25d06a22d62a82cb72 /localedata | |
parent | 7dcaabb94caa00c9dd68a207ea62fef5a2551ac4 (diff) | |
locale: localedef input files are now encoded in UTF-8
Previously, they were assumed to be in ISO-8859-1, and that the output charset overlapped with ISO-8859-1 for the characters actually used. However, this did not work as intended on many architectures even for an ISO-8859-1 output encoding because of the char signedness bug in lr_getc. Therefore, this commit switches to UTF-8 without making provisions for backwards compatibility.

The following Elisp code can be used to convert locale definition files to UTF-8:

    (defun glibc/convert-localedef (from to)
      (interactive "r")
      (save-excursion
        (save-restriction
          (narrow-to-region from to)
          (goto-char (point-min))
          (save-match-data
            (while (re-search-forward "<U\\([0-9a-fA-F]+\\)>" nil t)
              (let* ((codepoint (string-to-number (match-string 1) 16))
                     (converted
                      (cond
                       ((memq codepoint '(?/ ?\ ?< ?>))
                        (string ?/ codepoint))
                       ((= codepoint ?\")
                        "<U0022>")
                       (t (string codepoint)))))
                (replace-match converted t)))))))

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
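The commit message refers to a char signedness bug in lr_getc without showing it. As a rough illustration of that class of bug (not the actual glibc code), the sketch below reads bytes from a buffer through plain `char`. The function names `sketch_getc_buggy` and `sketch_getc_fixed` are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical reader, NOT the real lr_getc.  It returns the next byte
   of BUF as an int, or EOF once IDX reaches LEN.  The byte is read
   through plain char, so on targets where char is signed a byte such
   as 0xFF is sign-extended to a negative int (typically -1, which is
   indistinguishable from EOF).  */
static int
sketch_getc_buggy (const char *buf, size_t *idx, size_t len)
{
  assert (buf != NULL && idx != NULL);
  if (*idx >= len)
    return EOF;
  return buf[(*idx)++];
}

/* Same reader with the usual fix: convert through unsigned char so
   every byte maps to 0..255 regardless of the signedness of char.  */
static int
sketch_getc_fixed (const char *buf, size_t *idx, size_t len)
{
  assert (buf != NULL && idx != NULL);
  if (*idx >= len)
    return EOF;
  return (unsigned char) buf[(*idx)++];
}
```

With a buffer holding the ISO-8859-1 bytes 0xE9 and 0xFF, the fixed variant returns 233 and 255 on every target, while the result of the buggy variant depends on whether the platform's plain `char` is signed. That kind of platform dependence is consistent with the commit's remark that the old behaviour did not work as intended on many architectures.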
Diffstat (limited to 'localedata')
0 files changed, 0 insertions, 0 deletions