From bb68ee8db7971b683fba7dd7bf404186872ba7cf Mon Sep 17 00:00:00 2001
From: Peter Stephenson <pws@users.sourceforge.net>
Date: Sun, 8 Jun 2008 17:53:53 +0000
Subject: 25138(? mailing list stuck): rewrite of completion matching. Will one
 day use multibyte/wide characters, doesn't yet.

---
 Doc/Zsh/compwid.yo | 74 ++++++++++++++++++++++++++++++++++--------------------
 1 file changed, 47 insertions(+), 27 deletions(-)

(limited to 'Doc/Zsh/compwid.yo')

diff --git a/Doc/Zsh/compwid.yo b/Doc/Zsh/compwid.yo
index 438dc059b..a87aeac87 100644
--- a/Doc/Zsh/compwid.yo
+++ b/Doc/Zsh/compwid.yo
@@ -577,7 +577,7 @@ the next character typed inserts one of the characters given in the
 var(remove-chars).  This string is parsed as a characters class and
 understands the backslash sequences used by the tt(print) command.  For
 example, `tt(-r "a-z\t")' removes the suffix if the next character typed
-inserts a lowercase character or a TAB, and `tt(-r "^0-9")' removes the
+inserts a lower case character or a TAB, and `tt(-r "^0-9")' removes the
 suffix if the next character typed inserts anything but a digit. One extra
 backslash sequence is understood in this string: `tt(\-)' stands for
 all characters that insert nothing. Thus `tt(-S "=" -q)' is the same
@@ -857,9 +857,9 @@ which character sequences in the trial completion.  Any sequence of
 characters not handled in this fashion must match exactly, as usual.
 
 The forms of var(match-spec) understood are as follows. In each case, the
-form with an uppercase initial character retains the string already
+form with an upper case initial character retains the string already
 typed on the command line as the final result of completion, while with
-a lowercase initial character the string on the command line is changed
+a lower case initial character the string on the command line is changed
 into the corresponding part of the trial completion.
 
 startitem()
@@ -918,15 +918,35 @@ are not allowed, so the characters tt(!) and tt(^) have no special
 meaning directly after the opening brace.  They indicate that a range of
 characters on the line match a range of characters in the trial
 completion, but (unlike ordinary character classes) paired according to
-the corresponding position in the sequence. For example, to make any
-lowercase letter on the line match the corresponding uppercase letter in
-the trial completion, you can use `tt(m:{a-z}={A-Z})'.  More than one
-pair of classes can occur, in which case the first class before the
-tt(=) corresponds to the first after it, and so on.  If one side has
+the corresponding position in the sequence.  For example, to make any
+ASCII lower case letter on the line match the corresponding upper case
+letter in the trial completion, you can use `tt(m:{a-z}={A-Z})'
+(however, see below for the recommended form for this).  More
+than one pair of classes can occur, in which case the first class before
+the tt(=) corresponds to the first after it, and so on.  If one side has
 more such classes than the other side, the superfluous classes behave
 like normal character classes.  In anchor patterns correspondence classes
 also behave like normal character classes.
 
+The standard `tt([:)var(name)tt(:])' forms described for standard shell
+patterns,
+ifnzman(noderef(Filename Generation))\
+ifzman(see the section FILENAME GENERATION in zmanref(zshexpn)),
+may appear in correspondence classes as well as normal character
+classes.  The only special behaviour in correspondence classes is if
+the form on the left and the form on the right are each one of
+tt([:upper:]), tt([:lower:]).  In these cases the
+character in the word and the character on the line must be the same up
+to a difference in case.  Hence to make any lower case character on the
+line match the corresponding upper case character in the trial
+completion you can use `tt(m:{[:lower:]}={[:upper:]})'.  Although the
+matching system does not yet handle multibyte characters, this is likely
+to be a future extension, at which point this syntax will handle
+arbitrary alphabets; hence this form, rather than the use of explicit
+ranges, is the recommended form.  In other cases
+`tt([:)var(name)tt(:])' forms are allowed, but imply no special
+constraint on the characters beyond that implied by the test itself.
+
 The pattern var(tpat) may also be one or two stars, `tt(*)' or
 `tt(**)'. This means that the pattern on the command line can match
 any number of characters in the trial completion. In this case the
@@ -939,16 +959,16 @@ anchor can be matched, too.
 Examples:
 
 The keys of the tt(options) association defined by the tt(parameter)
-module are the option names in all-lowercase form, without
+module are the option names in all-lower-case form, without
 underscores, and without the optional tt(no) at the beginning even
 though the builtins tt(setopt) and tt(unsetopt) understand option names
-with uppercase letters, underscores, and the optional tt(no).  The
+with upper case letters, underscores, and the optional tt(no).  The
 following alters the matching rules so that the prefix tt(no) and any
 underscore are ignored when trying to match the trial completions
-generated and uppercase letters on the line match the corresponding
-lowercase letters in the words:
+generated and upper case letters on the line match the corresponding
+lower case letters in the words:
 
-example(compadd -M 'L:|[nN][oO]= M:_= M:{A-Z}={a-z}' - \ 
+example(compadd -M 'L:|[nN][oO]= M:_= M:{[:upper:]}={[:lower:]}' - \ 
   ${(k)options} )
 
 The first part says that the pattern `tt([nN][oO])' at the beginning
@@ -957,8 +977,8 @@ line matches the empty string in the list of words generated by
 completion, so it will be ignored if present. The second part does the
 same for an underscore anywhere in the command line string, and the
 third part uses correspondence classes so that any
-uppercase letter on the line matches the corresponding lowercase
-letter in the word. The use of the uppercase forms of the
+upper case letter on the line matches the corresponding lower case
+letter in the word. The use of the upper case forms of the
 specification characters (tt(L) and tt(M)) guarantees that what has
 already been typed on the command line (in particular the prefix
 tt(no)) will not be deleted.
@@ -979,12 +999,12 @@ The second example makes completion case insensitive.  This is just
 the same as in the option example, except here we wish to retain the
 characters in the list of completions:
 
-example(compadd -M 'm:{a-z}={A-Z}' ... )
+example(compadd -M 'm:{[:lower:]}={[:upper:]}' ... )
 
-This makes lowercase letters match their uppercase counterparts.
-To make uppercase letters match the lowercase forms as well:
+This makes lower case letters match their upper case counterparts.
+To make upper case letters match the lower case forms as well:
 
-example(compadd -M 'm:{a-zA-Z}={A-Za-z}' ... )
+example(compadd -M 'm:{[:lower:][:upper:]}={[:upper:][:lower:]}' ... )
 
 A nice example for the use of tt(*) patterns is partial word
 completion. Sometimes you would like to make strings like `tt(c.s.u)'
@@ -1042,27 +1062,27 @@ The specifications with both a left and a right anchor are useful to
 complete partial words whose parts are not separated by some
 special character. For example, in some places strings have to be
 completed that are formed `tt(LikeThis)' (i.e. the separate parts are
-determined by a leading uppercase letter) or maybe one has to
+determined by a leading upper case letter) or maybe one has to
 complete strings with trailing numbers. Here one could use the simple
 form with only one anchor as in:
 
-example(compadd -M 'r:|[A-Z0-9]=* r:|=*' LikeTHIS FooHoo 5foo123 5bar234)
+example(compadd -M 'r:|[[:upper:]0-9]=* r:|=*' LikeTHIS FooHoo 5foo123 5bar234)
 
 But with this, the string `tt(H)' would neither complete to `tt(FooHoo)'
-nor to `tt(LikeTHIS)' because in each case there is an uppercase
+nor to `tt(LikeTHIS)' because in each case there is an upper case
 letter before the `tt(H)' and that is matched by the anchor. Likewise, 
 a `tt(2)' would not be completed. In both cases this could be changed
-by using `tt(r:|[A-Z0-9]=**)', but then `tt(H)' completes to both
+by using `tt(r:|[[:upper:]0-9]=**)', but then `tt(H)' completes to both
 `tt(LikeTHIS)' and `tt(FooHoo)' and a `tt(2)' matches the other
-strings because characters can be inserted before every uppercase
+strings because characters can be inserted before every upper case
 letter and digit. To avoid this one would use:
 
-example(compadd -M 'r:[^A-Z0-9]||[A-Z0-9]=** r:|=*' \ 
+example(compadd -M 'r:[^[:upper:]0-9]||[[:upper:]0-9]=** r:|=*' \ 
     LikeTHIS FooHoo foo123 bar234)
 
-By using these two anchors, a `tt(H)' matches only uppercase `tt(H)'s that 
+By using these two anchors, a `tt(H)' matches only upper case `tt(H)'s that 
 are immediately preceded by something matching the left anchor
-`tt([^A-Z0-9])'. The effect is, of course, that `tt(H)' matches only
+`tt([^[:upper:]0-9])'. The effect is, of course, that `tt(H)' matches only
 the string `tt(FooHoo)', a `tt(2)' matches only `tt(bar234)' and so on.
 
 When using the completion system (see
-- 
cgit 1.4.1