From 418671fdb06c1920414056f9b47245aa062f7b6f Mon Sep 17 00:00:00 2001
From: Peter Stephenson
Date: Wed, 25 Mar 2009 11:29:11 +0000
Subject: Jon Strait: 26778, 26781: extra options for PCRE matching

---
 Doc/Zsh/mod_pcre.yo | 37 +++++++++++++++++++++++++++++++++++--
 1 file changed, 35 insertions(+), 2 deletions(-)

(limited to 'Doc')

diff --git a/Doc/Zsh/mod_pcre.yo b/Doc/Zsh/mod_pcre.yo
index 33b864478..9b8d9d6a7 100644
--- a/Doc/Zsh/mod_pcre.yo
+++ b/Doc/Zsh/mod_pcre.yo
@@ -6,7 +6,7 @@ The tt(zsh/pcre) module makes some commands available as builtins:
 
 startitem()
 findex(pcre_compile)
-item(tt(pcre_compile) [ tt(-aimx) ] var(PCRE))(
+item(tt(pcre_compile) [ tt(-aimxs) ] var(PCRE))(
 Compiles a perl-compatible regular expression.
 
 Option tt(-a) will force the pattern to be anchored.
@@ -15,6 +15,8 @@ Option tt(-m) will compile a multi-line pattern; that is,
 tt(^) and tt($) will match newlines within the pattern.
 Option tt(-x) will compile an extended pattern, wherein
 whitespace and tt(#) comments are ignored.
+Option tt(-s) makes the dot metacharacter match all characters,
+including those that indicate newline.
 )
 findex(pcre_study)
 item(tt(pcre_study))(
@@ -22,7 +24,8 @@ Studies the previously-compiled PCRE which may result in faster
 matching.
 )
 findex(pcre_match)
-item(tt(pcre_match) [ tt(-v) var(var) ] [ tt(-a) var(arr) ] var(string))(
+item(tt(pcre_match) [ tt(-v) var(var) ] [ tt(-a) var(arr) ] \
+[ tt(-n) var(offset) ] [ tt(-b) ] var(string))(
 Returns successfully if tt(string) matches the previously-compiled
 PCRE.
 
@@ -35,6 +38,36 @@ var(MATCH) will be set to the entire matched portion of the
 string, unless the tt(-v) option is given, in which case the variable
 var(var) will be set.
 No variables are altered if there is no successful match.
+A tt(-n) option starts searching for a match from the
+byte var(offset) position in var(string). If the tt(-b) option is given,
+the variable var(ZPCRE_OP) will be set to an offset pair string,
+representing the byte offset positions of the entire matched portion
+within the var(string). For example, a var(ZPCRE_OP) set to "32 45" indicates
+that the matched portion began on byte offset 32 and ended on byte offset 44.
+Here, byte offset position 45 is the position directly after the matched
+portion. Keep in mind that the byte position isn't necessarily the same
+as the character position when UTF-8 characters are involved.
+Consequently, the byte offset positions are only to be relied on in the
+context of using them for subsequent searches on var(string), using an offset
+position as an argument to the tt(-n) option. This is mostly
+used to implement the "find all non-overlapping matches" functionality.
+
+A simple example of "find all non-overlapping matches":
+
+example(
+string="The following zip codes: 78884 90210 99513"
+pcre_compile -m "\d{5}"
+accum=()
+pcre_match -b -- $string
+while [[ $? -eq 0 ]] do
+    b=($=ZPCRE_OP)
+    accum+=$MATCH
+    pcre_match -b -n $b[2] -- $string
+done
+print -l $accum
+
+
+)
 )
 enditem()
-- 
cgit 1.4.1