about summary refs log tree commit diff
path: root/Doc
diff options
context:
space:
mode:
authorPeter Stephenson <pws@users.sourceforge.net>2006-04-09 21:47:21 +0000
committerPeter Stephenson <pws@users.sourceforge.net>2006-04-09 21:47:21 +0000
commitef330a5dfddc763b83fe2406a91c61519279de68 (patch)
treea443ed27b17283591c3bd51e2938659ac03f6687 /Doc
parent82dc72e03462d2f0ebae2f6f4794fbb941cb3c8c (diff)
downloadzsh-ef330a5dfddc763b83fe2406a91c61519279de68.tar.gz
zsh-ef330a5dfddc763b83fe2406a91c61519279de68.tar.xz
zsh-ef330a5dfddc763b83fe2406a91c61519279de68.zip
22408: support for multibyte characters in patterns
Diffstat (limited to 'Doc')
-rw-r--r--Doc/Zsh/expn.yo27
-rw-r--r--Doc/Zsh/options.yo14
2 files changed, 34 insertions, 7 deletions
diff --git a/Doc/Zsh/expn.yo b/Doc/Zsh/expn.yo
index e4e270f98..71a702809 100644
--- a/Doc/Zsh/expn.yo
+++ b/Doc/Zsh/expn.yo
@@ -1461,20 +1461,20 @@ tt(LPAR()#)var(X)tt(RPAR()) where var(X) may have one of the following
 forms:
 
 startitem()
-item(i)(
+item(tt(i))(
 Case insensitive:  upper or lower case characters in the pattern match
 upper or lower case characters.
 )
-item(l)(
+item(tt(l))(
 Lower case characters in the pattern match upper or lower case
 characters; upper case characters in the pattern still only match
 upper case characters.
 )
-item(I)(
+item(tt(I))(
 Case sensitive:  locally negates the effect of tt(i) or tt(l) from
 that point on.
 )
-item(b)(
+item(tt(b))(
 Activate backreferences for parenthesised groups in the pattern;
 this does not work in filename generation.  When a pattern with a set of
 active parentheses is matched, the strings matched by the groups are
@@ -1525,11 +1525,11 @@ start and end indices are set to -1.
 
 Pattern matching with backreferences is slightly slower than without.
 )
-item(B)(
+item(tt(B))(
 Deactivate backreferences, negating the effect of the tt(b) flag from that
 point on.
 )
-item(m)(
+item(tt(m))(
 Set references to the match data for the entire string matched; this is
 similar to backreferencing and does not work in filename generation.  The
 flag must be in effect at the end of the pattern, i.e. not local to a
@@ -1550,7 +1550,7 @@ Unlike backreferences, there is no speed penalty for using match
 references, other than the extra substitutions required for the
 replacement strings in cases such as the example shown.
 )
-item(M)(
+item(tt(M))(
 Deactivate the tt(m) flag, hence no references to match data will be
 created.
 )
@@ -1596,6 +1596,19 @@ the latter case the `tt((#b))' is useful for backreferences and the
 `tt((#q.))' will be ignored.  Note that colon modifiers in the glob
 qualifiers are also not applied in ordinary pattern matching.
 )
+item(tt(u))(
+Respect the current locale in determining the presence of multibyte
+characters in a pattern, provided the shell was compiled with 
+tt(MULTIBYTE_SUPPORT).  This overrides the tt(MULTIBYTE)
+option; the default behaviour is taken from the option.  Compare tt(U).
+(Mnemonic: typically multibyte characters are from Unicode in the UTF-8
+encoding, although any extension of ASCII supported by the system
+library may be used.)
+)
+item(tt(U))(
+All characters are considered to be a single byte long.  The opposite
+of tt(u).  This overrides the tt(MULTIBYTE) option.
+)
 enditem()
 
 For example, the test string tt(fooxx) can be matched by the pattern
diff --git a/Doc/Zsh/options.yo b/Doc/Zsh/options.yo
index 74f8b4c84..0fb87302e 100644
--- a/Doc/Zsh/options.yo
+++ b/Doc/Zsh/options.yo
@@ -411,6 +411,20 @@ item(tt(MARK_DIRS) (tt(-8), ksh: tt(-X)))(
 Append a trailing `tt(/)' to all directory
 names resulting from filename generation (globbing).
 )
+pindex(MULTIBYTE)
+cindex(characters, multibyte, in expansion and globbing)
+cindex(multibyte characters, in expansion and globbing)
+item(tt(MULTIBYTE))(
+Respect multibyte characters when found during pattern matching.
+When this option is set, characters strings are examined using the
+system library to determine how many bytes form a character, depending
+on the current locale.  If the option is unset
+(or the shell was not compiled with the configuration option
+tt(MULTIBYTE_SUPPORT)) a single byte is always treated as a single
+character.  The option will eventually be extended to cover expansion.
+Note, however, that it does not affect the shellʼs editor, which always
+uses the locale to determine multibyte characters.
+)
 pindex(NOMATCH)
 cindex(globbing, no matches)
 item(tt(NOMATCH) (tt(PLUS()3)) <C> <Z>)(