data/TWiki/RegularExpression.txt,v
author Colas Nahaboo <colas@nahaboo.net>
Sat, 26 Jan 2008 15:50:53 +0100
changeset 0 414e01d06fd5
permissions -rw-r--r--
RELEASE 4.2.0 freetown
colas@0
     1
head	1.8;
colas@0
     2
access;
colas@0
     3
symbols;
colas@0
     4
locks; strict;
colas@0
     5
comment	@# @;
colas@0
     6
colas@0
     7
colas@0
     8
1.8
colas@0
     9
date	2007.01.16.04.12.04;	author TWikiContributor;	state Exp;
colas@0
    10
branches;
colas@0
    11
next	1.7;
colas@0
    12
colas@0
    13
1.7
colas@0
    14
date	2006.04.01.05.55.09;	author TWikiContributor;	state Exp;
colas@0
    15
branches;
colas@0
    16
next	1.6;
colas@0
    17
colas@0
    18
1.6
colas@0
    19
date	2006.02.01.12.01.17;	author TWikiContributor;	state Exp;
colas@0
    20
branches;
colas@0
    21
next	1.5;
colas@0
    22
colas@0
    23
1.5
colas@0
    24
date	2003.04.15.05.19.25;	author PeterThoeny;	state Exp;
colas@0
    25
branches;
colas@0
    26
next	1.4;
colas@0
    27
colas@0
    28
1.4
colas@0
    29
date	2003.03.22.05.12.00;	author PeterThoeny;	state Exp;
colas@0
    30
branches;
colas@0
    31
next	1.3;
colas@0
    32
colas@0
    33
1.3
colas@0
    34
date	2002.11.23.05.52.00;	author PeterThoeny;	state Exp;
colas@0
    35
branches;
colas@0
    36
next	1.2;
colas@0
    37
colas@0
    38
1.2
colas@0
    39
date	2000.08.23.06.58.32;	author PeterThoeny;	state Exp;
colas@0
    40
branches;
colas@0
    41
next	1.1;
colas@0
    42
colas@0
    43
1.1
colas@0
    44
date	2000.08.18.08.47.58;	author PeterThoeny;	state Exp;
colas@0
    45
branches;
colas@0
    46
next	;
colas@0
    47
colas@0
    48
colas@0
    49
desc
colas@0
    50
@none
colas@0
    51
@
colas@0
    52
colas@0
    53
colas@0
    54
1.8
colas@0
    55
log
colas@0
    56
@buildrelease
colas@0
    57
@
colas@0
    58
text
colas@0
    59
@%META:TOPICINFO{author="TWikiContributor" date="1164227726" format="1.1" version="8"}%
colas@0
    60
---+!! Regular Expressions
colas@0
    61
colas@0
    62
%TOC%
colas@0
    63
---++ Introduction
colas@0
    64
colas@0
    65
Regular expressions (REs), unlike simple queries, allow you to search for text which matches a particular pattern.
colas@0
    66
colas@0
    67
REs are similar to (but more powerful than) the "wildcards" used in the command-line interfaces found in operating systems such as Unix and MS-DOS. REs are used by sophisticated search engines, as well as by many Unix-based languages and tools ( e.g., =awk=, =grep=, =lex=, =perl=, and =sed= ).
colas@0
    68
colas@0
    69
---++ Examples
colas@0
    70
colas@0
    71
| =compan(y&#124;ies)= | Search for *company*, *companies* |
colas@0
    72
| =(peter&#124;paul)= | Search for *peter*, *paul* |
colas@0
    73
| =bug*= | Search for *bug*, *bugg*, *buggg* or simply *bu* (a star matches *zero* or more instances of the previous character) |
colas@0
    74
| =bug.*= | Search for *bug*, *bugs*, *bugfix* (a dot-star matches zero or more instances of *any* character) |
colas@0
    75
| =[Bb]ag= | Search for *Bag*, *bag* |
colas@0
    76
| =b[aiueo]g= | Second letter is a vowel. Matches *bag*, *bug*, *big* |
colas@0
    77
| =b.g= | Second character is any single character, not only a letter. Also matches *b&amp;g* |
colas@0
    78
| =[a-zA-Z]= | Matches any one letter (but not a number or a symbol) |
colas@0
    79
| =[^0-9a-zA-Z]= | Matches any symbol (but not a number or a letter) |
colas@0
    80
| =[A-Z][A-Z]*= | Matches one or more uppercase letters |
colas@0
    81
| =[0-9]{3}-[0-9]{2}-[0-9]{4}= | US social security number, e.g. *123-45-6789* |
colas@0
    82
| =PNG;Chart= | Search for topics containing the words *PNG* _and_ *Chart*. The =";"= _and_ separator is TWiki-specific and is not a regular expression; it is a useful facility that is enabled when regular expression searching is enabled. |
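The regular expression rows of the table above can be checked mechanically. The following is a minimal sketch in Python, not part of TWiki: it assumes Python's =re= module, whose syntax is Perl-like and accepts the patterns shown here, and the sample words and the helper names =matches_whole= and =check_examples= are illustrative. It uses =re.fullmatch= so that each sample word must match in its entirety; a search inside a topic behaves more like =re.search= , which accepts a match anywhere in the text.

```python
import re

# Patterns copied from the Examples table; the sample words are illustrative.
EXAMPLES = [
    (r"compan(y|ies)", ["company", "companies"]),
    (r"(peter|paul)", ["peter", "paul"]),
    (r"bug*", ["bu", "bug", "bugg"]),
    (r"bug.*", ["bug", "bugs", "bugfix"]),
    (r"[Bb]ag", ["Bag", "bag"]),
    (r"b[aiueo]g", ["bag", "bug", "big"]),
    (r"b.g", ["bag", "b&g"]),
    (r"[A-Z][A-Z]*", ["A", "TWIKI"]),
    (r"[0-9]{3}-[0-9]{2}-[0-9]{4}", ["123-45-6789"]),
]


def matches_whole(pattern, text):
    """Return True if the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None


def check_examples():
    """Return the (pattern, word) pairs that unexpectedly fail to match."""
    return [
        (pattern, word)
        for pattern, words in EXAMPLES
        for word in words
        if not matches_whole(pattern, word)
    ]


if __name__ == "__main__":
    # An empty list means every sample word matched its pattern.
    print(check_examples())
```

The =PNG;Chart= row is left out on purpose, because the semicolon is a TWiki search feature rather than regular expression syntax.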
colas@0
    83
colas@0
    84
---++ Searches with "and" combinations
colas@0
    85
colas@0
    86
   * TWiki extends the regular expressions with an _and_ search. The delimiter is a semicolon =;=. Example search for "form" _and_ "template": =form;template=
colas@0
    87
colas@0
    88
   * Use Google if your TWiki site is public. Example search for "form" _and_ "template" at TWiki.org: =site:twiki.org +form +template=
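The semicolon _and_ search can be pictured as several independent regular expression searches that must all succeed. The sketch below is an illustration in Python under that assumption and is not TWiki's implementation; the function name =and_search= is hypothetical, and details such as case sensitivity may differ in a real TWiki search.

```python
import re


def and_search(query, text):
    """Return True if every ';'-separated pattern is found somewhere in text."""
    # Split the query on the TWiki-style separator and ignore empty parts.
    patterns = [part for part in query.split(";") if part]
    # Each part is treated as its own regular expression.
    return all(re.search(pattern, text) for pattern in patterns)


if __name__ == "__main__":
    print(and_search("form;template", "a template for a form"))
    print(and_search("form;template", "a form only"))
```

Because each part is searched separately, the order of the words in the topic does not matter.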
colas@0
    89
colas@0
    90
---++ Advanced
colas@0
    91
colas@0
    92
Here is stuff for our UNIX freaks: (copied from 'man egrep')
colas@0
    93
colas@0
    94
<blockquote>
colas@0
    95
A regular expression is a pattern that describes a set of strings. Regular expressions are constructed analogously to arithmetic expressions, by using various operators to combine smaller expressions.
colas@0
    96
colas@0
    97
The fundamental building blocks are the regular expressions that match a single character. Most characters, including all letters and digits, are regular expressions that match themselves. Any metacharacter with special meaning may be quoted by preceding it with a backslash.
colas@0
    98
colas@0
    99
A bracket expression is a list of characters enclosed by [ and ]. It matches any single character in that list; if the first character of the list is the caret ^ then it matches any character not in the list. For example, the regular expression [0123456789] matches any single digit.
colas@0
   100
colas@0
   101
Within a bracket expression, a range expression consists of two characters separated by a hyphen. It matches any single character that sorts between the two characters, inclusive, using the locale's collating sequence and character set. For example, in the default C locale, [a-d] is equivalent to [abcd]. Many locales sort characters in dictionary order, and in these locales [a-d] is typically not equivalent to [abcd]; it might be equivalent to [aBbCcDd], for example.
colas@0
   102
colas@0
   103
Finally, certain named classes of characters are predefined within bracket expressions, as follows. Their names are self explanatory, and they are [:alnum:], [:alpha:], [:cntrl:], [:digit:], [:graph:], [:lower:], [:print:], [:punct:], [:space:], [:upper:], and [:xdigit:]. For example, [<nop>[:alnum:]<nop>] means [0-9A-Za-z], except the latter form depends upon the C locale and the ASCII character encoding, whereas the former is independent of locale and character set. (Note that the brackets in these class names are part of the symbolic names, and must be included in addition to the brackets delimiting the bracket list.) Most metacharacters lose their special meaning inside lists. To include a literal ] place it first in the list. Similarly, to include a literal ^ place it anywhere but first. Finally, to include a literal - place it last.
colas@0
   104
colas@0
   105
The period . matches any single character. The symbol \w is a synonym for [<nop>[:alnum:]<nop>] and \W is a synonym for [^[:alnum:]<nop>].
colas@0
   106
colas@0
   107
The caret ^ and the dollar sign $ are metacharacters that respectively match the empty string at the beginning and end of a line. The symbols \&lt; and \&gt; respectively match the empty string at the beginning and end of a word. The symbol \b matches the empty string at the edge of a word, and \B matches the empty string provided it's not at the edge of a word.
colas@0
   108
colas@0
   109
A regular expression may be followed by one of several repetition operators:
colas@0
   110
| ? | The preceding item is optional and matched at most once. |
colas@0
   111
| * | The preceding item will be matched zero or more times. |
colas@0
   112
| + | The preceding item will be matched one or more times. |
colas@0
   113
| {n} | The preceding item is matched exactly n times. |
colas@0
   114
| {n,} | The preceding item is matched n or more times. |
colas@0
   115
| {n,m} | The preceding item is matched at least n times, but not more than m times. |
colas@0
   116
colas@0
   117
Two regular expressions may be concatenated; the resulting regular expression matches any string formed by concatenating two substrings that respectively match the concatenated subexpressions.
colas@0
   118
colas@0
   119
Two regular expressions may be joined by the infix operator |; the resulting regular expression matches any string matching either subexpression.
colas@0
   120
colas@0
   121
Repetition takes precedence over concatenation, which in turn takes precedence over alternation. A whole subexpression may be enclosed in parentheses to override these precedence rules.
colas@0
   122
colas@0
   123
The backreference \n, where n is a single digit, matches the substring previously matched by the nth parenthesized subexpression of the regular expression.
colas@0
   124
</blockquote>
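The precedence, repetition, backreference, and word-edge rules quoted above can be tried out directly. The following is a minimal sketch in Python, offered as an illustration rather than as a description of =egrep= itself: Python's =re= module agrees with the quoted text for the operators used here, but it does not implement the \&lt; and \&gt; word anchors or the [:alnum:] style named classes, so those are not shown. The helper names =full= , =found= , and =run_checks= are illustrative.

```python
import re


def full(pattern, text):
    """True if the pattern matches the whole text."""
    return re.fullmatch(pattern, text) is not None


def found(pattern, text):
    """True if the pattern matches anywhere in the text."""
    return re.search(pattern, text) is not None


def run_checks():
    """Evaluate a few patterns and return the results keyed by description."""
    return {
        # Repetition binds tighter than concatenation.
        "ab* is a then zero or more b": full(r"ab*", "abbb") and not full(r"ab*", "abab"),
        # Parentheses make the repetition apply to the whole group.
        "(ab)* repeats the group": full(r"(ab)*", "abab"),
        # Concatenation binds tighter than alternation.
        "ab|cd is ab or cd": full(r"ab|cd", "cd") and not full(r"ab|cd", "abd"),
        "a(b|c)d limits the alternation": full(r"a(b|c)d", "abd"),
        "{2,3} bounds the count": full(r"[0-9]{2,3}", "123") and not full(r"[0-9]{2,3}", "1234"),
        # A backreference must repeat exactly what the group matched.
        r"\1 repeats the first group": full(r"(a+)-\1", "aa-aa") and not full(r"(a+)-\1", "aa-a"),
        # \b matches only at the edge of a word.
        r"\b marks a word edge": found(r"\bbug\b", "a bug here") and not found(r"\bbug\b", "debugger"),
    }


if __name__ == "__main__":
    for description, ok in run_checks().items():
        print(description, ok)
```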
colas@0
   125
colas@0
   126
__Related Links:__ 
colas@0
   127
   * http://perldoc.perl.org/perlretut.html - Regular expressions tutorial
colas@0
   128
   * http://www.perl.com/doc/manual/html/pod/perlre.html - Perl regular expressions
colas@0
   129
colas@0
   130
__Related Topics:__ UserDocumentationCategory
colas@0
   131
@
colas@0
   132
colas@0
   133
colas@0
   134
1.7
colas@0
   135
log
colas@0
   136
@buildrelease
colas@0
   137
@
colas@0
   138
text
colas@0
   139
@d1 1
colas@0
   140
a1 1
colas@0
   141
%META:TOPICINFO{author="TWikiContributor" date="1111929255" format="1.0" version="7"}%
colas@0
   142
d68 4
colas@0
   143
@
colas@0
   144
colas@0
   145
colas@0
   146
1.6
colas@0
   147
log
colas@0
   148
@buildrelease
colas@0
   149
@
colas@0
   150
text
colas@0
   151
@d1 1
colas@0
   152
a1 1
colas@0
   153
%META:TOPICINFO{author="TWikiContributor" date="1111929255" format="1.0" version="6"}%
colas@0
   154
d28 1
colas@0
   155
a28 1
colas@0
   156
	* TWiki extends the regular expressions with an _and_ search. The delimiter is a semicolon =;=. Example search for "form" _and_ "template": =form;template=
colas@0
   157
d30 1
colas@0
   158
a30 1
colas@0
   159
	* Use Google if your TWiki site is public. Example search for "form" _and_ "template" at TWiki.org: =site:twiki.org +form +template=
colas@0
   160
@
colas@0
   161
colas@0
   162
colas@0
   163
1.5
colas@0
   164
log
colas@0
   165
@none
colas@0
   166
@
colas@0
   167
text
colas@0
   168
@d1 68
colas@0
   169
a68 66
colas@0
   170
%META:TOPICINFO{author="PeterThoeny" date="1050383965" format="1.0" version="1.5"}%
colas@0
   171
---+!! Regular Expressions
colas@0
   172
colas@0
   173
%TOC%
colas@0
   174
---++ Introduction
colas@0
   175
colas@0
   176
Regular expressions (REs), unlike simple queries, allow you to search for text which matches a particular pattern.
colas@0
   177
colas@0
   178
REs are similar to (but more poweful than) the "wildcards" used in the command-line interfaces found in operating systems such as Unix and MS-DOS. REs are used by sophisticated search engines, as well as by many Unix-based languages and tools ( e.g., =awk=, =grep=, =lex=, =perl=, and =sed= ).
colas@0
   179
colas@0
   180
---++ Examples
colas@0
   181
colas@0
   182
| =compan(y&#124;ies)= | Search for *company*, *companies* |
colas@0
   183
| =(peter&#124;paul)= | Search for *peter*, *paul* |
colas@0
   184
| =bug*= | Search for *bug*, *bugg*, *buggg* or simply *bu* (a star matches *zero* or more instances of the previous character) |
colas@0
   185
| =bug.*= | Search for *bug*, *bugs*, *bugfix* (a dot-star matches zero or more instances of *any* character) |
colas@0
   186
| =[Bb]ag= | Search for *Bag*, *bag* |
colas@0
   187
| =b[aiueo]g= | Second letter is a vowel. Matches *bag*, *bug*, *big* |
colas@0
   188
| =b.g= | Second letter is any letter. Matches also *b&amp;g* |
colas@0
   189
| =[a-zA-Z]= | Matches any one letter (but not a number or a symbol) |
colas@0
   190
| =[^0-9a-zA-Z]= | Matches any symbol (but not a number or a letter) |
colas@0
   191
| =[A-Z][A-Z]*= | Matches one or more uppercase letters |
colas@0
   192
| =[0-9]{3}-[0-9]{2}-[0-9]{4}= | US social security number, e.g. *123-45-6789* |
colas@0
   193
| =PNG;Chart= | Search for topics containing the words *PNG* _and_ *Chart*. The =";"= _and_ separator is TWiki-specific and is not a regular expression; it is a useful facility that is enabled when regular expression searching is enabled. |
colas@0
   194
colas@0
   195
---++ Searches with "and" combinations
colas@0
   196
colas@0
   197
	* TWiki extends the regular expressions with an _and_ search. The delimiter is a semicolon =;=. Example search for "form" _and_ "template": =form;template=
colas@0
   198
colas@0
   199
	* Use Google if your TWiki site is public. Example search for "form" _and_ "template" at TWiki.org: =site:twiki.org +form +template=
colas@0
   200
colas@0
   201
---++ Advanced
colas@0
   202
colas@0
   203
Here is stuff for our UNIX freaks: (copied from 'man egrep')
colas@0
   204
colas@0
   205
<blockquote>
colas@0
   206
A regular expression is a pattern that describes a set of strings. Regular expressions are constructed analogously to arithmetic expressions, by using various operators to combine smaller expressions.
colas@0
   207
colas@0
   208
The fundamental building blocks are the regular expressions that match a single character. Most characters, including all letters and digits, are regular expressions that match themselves. Any metacharacter with special meaning may be quoted by preceding it with a backslash.
colas@0
   209
colas@0
   210
A bracket expression is a list of characters enclosed by [ and ]. It matches any single character in that list; if the first character of the list is the caret ^ then it matches any character not in the list. For example, the regular expression [0123456789] matches any single digit.
colas@0
   211
colas@0
   212
Within a bracket expression, a range expression consists of two characters separated by a hyphen. It matches any single character that sorts between the two characters, inclusive, using the locale's collating sequence and character set. For example, in the default C locale, [a-d] is equivalent to [abcd]. Many locales sort characters in dictionary order, and in these locales [a-d] is typically not equivalent to [abcd]; it might be equivalent to [aBbCcDd], for example.
colas@0
   213
colas@0
   214
Finally, certain named classes of characters are predefined within bracket expressions, as follows. Their names are self explanatory, and they are [:alnum:], [:alpha:], [:cntrl:], [:digit:], [:graph:], [:lower:], [:print:], [:punct:], [:space:], [:upper:], and [:xdigit:]. For example, [<nop>[:alnum:]<nop>] means [0-9A-Za-z], except the latter form depends upon the C locale and the ASCII character encoding, whereas the former is independent of locale and character set. (Note that the brackets in these class names are part of the symbolic names, and must be included in addition to the brackets delimiting the bracket list.) Most metacharacters lose their special meaning inside lists. To include a literal ] place it first in the list. Similarly, to include a literal ^ place it anywhere but first. Finally, to include a literal - place it last.
colas@0
   215
colas@0
   216
The period . matches any single character. The symbol \w is a synonym for [<nop>[:alnum:]<nop>] and \W is a synonym for [^[:alnum]<nop>].
colas@0
   217
colas@0
   218
The caret ^ and the dollar sign $ are metacharacters that respectively match the empty string at the beginning and end of a line. The symbols \&lt; and \&gt; respectively match the empty string at the beginning and end of a word. The symbol \b matches the empty string at the edge of a word, and \B matches the empty string provided it's not at the edge of a word.
colas@0
   219
colas@0
   220
A regular expression may be followed by one of several repetition operators:
colas@0
   221
| ? | The preceding item is optional and matched at most once. |
colas@0
   222
| * | The preceding item will be matched zero or more times. |
colas@0
   223
| + | The preceding item will be matched one or more times. |
colas@0
   224
| {n} | The preceding item is matched exactly n times. |
colas@0
   225
| {n,} | The preceding item is matched n or more times. |
colas@0
   226
| {n,m} | The preceding item is matched at least n times, but not more than m times. |
colas@0
   227
colas@0
   228
Two regular expressions may be concatenated; the resulting regular expression matches any string formed by concatenating two substrings that respectively match the concatenated subexpressions.
colas@0
   229
colas@0
   230
Two regular expressions may be joined by the infix operator |; the resulting regular expression matches any string matching either subexpression.
colas@0
   231
colas@0
   232
Repetition takes precedence over concatenation, which in turn takes precedence over alternation. A whole subexpression may be enclosed in parentheses to override these precedence rules.
colas@0
   233
colas@0
   234
The backreference \n, where n is a single digit, matches the substring previously matched by the nth parenthesized subexpression of the regular expression.
colas@0
   235
</blockquote>
colas@0
   236
@
colas@0
   237
colas@0
   238
colas@0
   239
1.4
colas@0
   240
log
colas@0
   241
@none
colas@0
   242
@
colas@0
   243
text
colas@0
   244
@d1 1
colas@0
   245
a1 1
colas@0
   246
%META:TOPICINFO{author="PeterThoeny" date="1048309920" format="1.0" version="1.4"}%
colas@0
   247
d15 2
colas@0
   248
a16 1
colas@0
   249
| =bug*= | Search for *bug*, *bugs*, *bugfix* |
colas@0
   250
@
colas@0
   251
colas@0
   252
colas@0
   253
1.3
colas@0
   254
log
colas@0
   255
@none
colas@0
   256
@
colas@0
   257
text
colas@0
   258
@d1 1
colas@0
   259
a1 1
colas@0
   260
%META:TOPICINFO{author="PeterThoeny" date="1038030720" format="1.0" version="1.3"}%
colas@0
   261
a4 1
colas@0
   262
colas@0
   263
d7 1
colas@0
   264
a7 1
colas@0
   265
Regular expressions (REs), unlike simple queries, allow you to search for text which matches a particular pattern. 
colas@0
   266
d23 1
colas@0
   267
a23 1
colas@0
   268
| =PNG;Chart= | Search for topics containing the words *PNG* _and_ *Chart*. This is not a regular expression! But a useful facility that is enabled when regular expression searching is enabled. |
colas@0
   269
d42 1
colas@0
   270
a42 1
colas@0
   271
Within a bracket expression, a range expression consists of two characters separated by a hyphen. It matches any single character that sorts between the two characters, inclusive, using the locale's collating sequence and character set. For example, in the default C locale, [a-d] is equivalent to [abcd]. Many locales sort characters in dictionary order, and in these locales [a-d] is typically not equivalent to [abcd]; it might be equivalent to [aBbCcDd], for example. 
colas@0
   272
@
colas@0
   273
colas@0
   274
colas@0
   275
1.2
colas@0
   276
log
colas@0
   277
@none
colas@0
   278
@
colas@0
   279
text
colas@0
   280
@d1 7
colas@0
   281
d12 50
colas@0
   282
a61 1
colas@0
   283
*Examples*
colas@0
   284
d63 1
colas@0
   285
a63 161
colas@0
   286
<TABLE>
colas@0
   287
  <TR>
colas@0
   288
	 <TD>
colas@0
   289
		compan(y|ies)
colas@0
   290
	 </TD><TD>
colas@0
   291
		Search for _company_ , _companies_
colas@0
   292
	 </TD>
colas@0
   293
  </TR><TR>
colas@0
   294
	 <TD>
colas@0
   295
		(peter|paul)
colas@0
   296
	 </TD><TD>
colas@0
   297
		Search for _peter_ , _paul_
colas@0
   298
	 </TD>
colas@0
   299
  </TR><TR>
colas@0
   300
	 <TD>
colas@0
   301
		bug*
colas@0
   302
	 </TD><TD>
colas@0
   303
		Search for _bug_ , _bugs_ , _bugfix_
colas@0
   304
	 </TD>
colas@0
   305
  </TR><TR>
colas@0
   306
	 <TD>
colas@0
   307
		[Bb]ag
colas@0
   308
	 </TD><TD>
colas@0
   309
		Search for _Bag_ , _bag_
colas@0
   310
	 </TD>
colas@0
   311
  </TR><TR>
colas@0
   312
	 <TD>
colas@0
   313
		b[aiueo]g
colas@0
   314
	 </TD><TD>
colas@0
   315
		Second letter is a vowel. Matches _bag_ , _bug_ , _big_
colas@0
   316
	 </TD>
colas@0
   317
  </TR><TR>
colas@0
   318
	 <TD>
colas@0
   319
		b.g
colas@0
   320
	 </TD><TD>
colas@0
   321
		Second letter is any letter. Matches also _b&g_
colas@0
   322
	 </TD>
colas@0
   323
  </TR><TR>
colas@0
   324
	 <TD>
colas@0
   325
		[a-zA-Z]
colas@0
   326
	 </TD><TD>
colas@0
   327
		Matches any one letter (not a number and a symbol)
colas@0
   328
	 </TD>
colas@0
   329
  </TR><TR>
colas@0
   330
	 <TD>
colas@0
   331
		[^0-9a-zA-Z]
colas@0
   332
	 </TD><TD>
colas@0
   333
		Matches any symbol (not a number or a letter)
colas@0
   334
	 </TD>
colas@0
   335
  </TR><TR>
colas@0
   336
	 <TD>
colas@0
   337
		[A-Z][A-Z]*
colas@0
   338
	 </TD><TD>
colas@0
   339
		Matches one or more uppercase letters
colas@0
   340
	 </TD>
colas@0
   341
  </TR><TR>
colas@0
   342
	 <TD>
colas@0
   343
		[0-9][0-9][0-9]-[0-9][0-9]- <br> [0-9][0-9][0-9][0-9]
colas@0
   344
	 </TD><TD VALIGN="top">
colas@0
   345
		US social security number, e.g. 123-45-6789
colas@0
   346
	 </TD>
colas@0
   347
  </TR>
colas@0
   348
</TABLE>
colas@0
   349
colas@0
   350
Here is stuff for our UNIX freaks: <BR>
colas@0
   351
(copied from 'man grep')
colas@0
   352
colas@0
   353
<pre>
colas@0
   354
	  \c	A backslash (\) followed by any special character is  a
colas@0
   355
			 one-character  regular expression that matches the spe-
colas@0
   356
			 cial character itself.  The special characters are:
colas@0
   357
colas@0
   358
					+	 `.', `*', `[',  and  `\'  (period,  asterisk,
colas@0
   359
						  left  square  bracket, and backslash, respec-
colas@0
   360
						  tively), which  are  always  special,  except
colas@0
   361
						  when they appear within square brackets ([]).
colas@0
   362
colas@0
   363
					+	 `^' (caret or circumflex), which  is  special
colas@0
   364
						  at the beginning of an entire regular expres-
colas@0
   365
						  sion, or when it immediately follows the left
colas@0
   366
						  of a pair of square brackets ([]).
colas@0
   367
colas@0
   368
					+	 $ (currency symbol), which is special at  the
colas@0
   369
						  end of an entire regular expression.							  
colas@0
   370
colas@0
   371
	  .	 A `.' (period) is a  one-character  regular  expression
colas@0
   372
			 that matches any character except NEWLINE.
colas@0
   373
 
colas@0
   374
	  [string]
colas@0
   375
			 A non-empty string of  characters  enclosed  in  square
colas@0
   376
			 brackets  is  a  one-character  regular expression that
colas@0
   377
			 matches any one character in that string.  If, however,
colas@0
   378
			 the  first  character of the string is a `^' (a circum-
colas@0
   379
			 flex or caret), the  one-character  regular  expression
colas@0
   380
			 matches  any character except NEWLINE and the remaining
colas@0
   381
			 characters in the string.  The  `^'  has  this  special
colas@0
   382
			 meaning only if it occurs first in the string.  The `-'
colas@0
   383
			 (minus) may be used to indicate a range of  consecutive
colas@0
   384
			 ASCII  characters;  for example, [0-9] is equivalent to
colas@0
   385
			 [0123456789].  The `-' loses this special meaning if it
colas@0
   386
			 occurs  first (after an initial `^', if any) or last in
colas@0
   387
			 the string.  The `]' (right square  bracket)  does  not
colas@0
   388
			 terminate  such a string when it is the first character
colas@0
   389
			 within it (after an initial  `^',  if  any);  that  is,
colas@0
   390
			 []a-f]  matches either `]' (a right square bracket ) or
colas@0
   391
			 one of the letters a through  f  inclusive.	The  four
colas@0
   392
			 characters  `.', `*', `[', and `\' stand for themselves
colas@0
   393
			 within such a string of characters.
colas@0
   394
colas@0
   395
	  The following rules may be used to construct regular expres-
colas@0
   396
	  sions:
colas@0
   397
colas@0
   398
	  *	 A one-character regular expression followed by `*'  (an
colas@0
   399
			 asterisk)  is a regular expression that matches zero or
colas@0
   400
			 more occurrences of the one-character  regular  expres-
colas@0
   401
			 sion.	If  there  is  any choice, the longest leftmost
colas@0
   402
			 string that permits a match is chosen.
colas@0
   403
colas@0
   404
	  ^	 A circumflex or caret (^) at the beginning of an entire
colas@0
   405
			 regular  expression  constrains that regular expression
colas@0
   406
			 to match an initial segment of a line.
colas@0
   407
colas@0
   408
	  $	 A currency symbol ($) at the end of an  entire  regular
colas@0
   409
			 expression  constrains that regular expression to match
colas@0
   410
			 a final segment of a line.
colas@0
   411
colas@0
   412
	  *	 A  regular  expression  (not  just	a	one-
colas@0
   413
			 character regular expression) followed by `*'
colas@0
   414
			 (an asterisk) is a  regular  expression  that
colas@0
   415
			 matches  zero or more occurrences of the one-
colas@0
   416
			 character regular expression.	If  there  is
colas@0
   417
			 any  choice, the longest leftmost string that
colas@0
   418
			 permits a match is chosen.
colas@0
   419
colas@0
   420
	  +	 A regular expression followed by `+' (a  plus
colas@0
   421
			 sign)  is  a  regular expression that matches
colas@0
   422
			 one or more occurrences of the  one-character
colas@0
   423
			 regular  expression.  If there is any choice,
colas@0
   424
			 the longest leftmost string  that  permits  a
colas@0
   425
			 match is chosen.
colas@0
   426
colas@0
   427
	  ?	 A regular expression followed by `?' (a ques-
colas@0
   428
			 tion  mark)  is  a  regular  expression  that
colas@0
   429
			 matches zero or one occurrences of  the  one-
colas@0
   430
			 character  regular  expression.	If there is
colas@0
   431
			 any choice, the longest leftmost string  that
colas@0
   432
			 permits a match is chosen.
colas@0
   433
colas@0
   434
	  |	 Alternation:	 two	 regular	 expressions
colas@0
   435
			 separated  by  `|'  or NEWLINE match either a
colas@0
   436
			 match for  the  first  or  a  match  for  the
colas@0
   437
			 second.
colas@0
   438
colas@0
   439
	  ()	A regular expression enclosed in  parentheses
colas@0
   440
			 matches a match for the regular expression.
colas@0
   441
colas@0
   442
	  The order of precedence of operators at the same parenthesis
colas@0
   443
	  level  is  `[ ]'  (character  classes),  then  `*'  `+'  `?'
colas@0
   444
	  (closures),then  concatenation,  then  `|'  (alternation)and
colas@0
   445
	  NEWLINE.
colas@0
   446
</pre>
colas@0
   447
d65 2
colas@0
   448
@
colas@0
   449
colas@0
   450
colas@0
   451
1.1
colas@0
   452
log
colas@0
   453
@none
colas@0
   454
@
colas@0
   455
text
colas@0
   456
@d1 3
colas@0
   457
a3 1
colas@0
   458
Regular expressions allow more specific queries then a simple query.
colas@0
   459
@