1 @c Copyright (C) 1994, 1996, 1998, 2000, 2001, 2003, 2004, 2005, 2006,
2 @c 2007, 2009, 2010, 2011 Free Software Foundation, Inc.
4 @c Permission is granted to copy, distribute and/or modify this document
5 @c under the terms of the GNU Free Documentation License, Version 1.3 or
6 @c any later version published by the Free Software Foundation; with no
7 @c Invariant Sections, with no Front-Cover Texts, and with no Back-Cover
8 @c Texts. A copy of the license is included in the ``GNU Free
9 @c Documentation License'' file as part of this distribution.
11 @c this regular expression description is for: findutils
@menu
* findutils-default regular expression syntax::
* awk regular expression syntax::
* egrep regular expression syntax::
* emacs regular expression syntax::
* gnu-awk regular expression syntax::
* grep regular expression syntax::
* posix-awk regular expression syntax::
* posix-basic regular expression syntax::
* posix-egrep regular expression syntax::
* posix-extended regular expression syntax::
@end menu
26 @node findutils-default regular expression syntax
27 @subsection @samp{findutils-default} regular expression syntax
30 The character @samp{.} matches any single character.
@table @samp
@item +
indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
@item ?
indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
@end table
46 Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are ignored. Within square brackets, @samp{\} is taken literally. Character classes are not supported, so for example you would need to use @samp{[0-9]} instead of @samp{[[:digit:]]}.
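As a rough illustration of the backward-range rule, the following Python sketch scans the text between @samp{[} and @samp{]} and reports ranges such as @samp{z-a}. It is not part of findutils: the function name @code{backward_ranges} is hypothetical, and comparing characters by code point is an assumption (a real implementation may use the locale's collating order instead).

```python
def backward_ranges(bracket_body):
    """Return backward ranges (like 'z-a') in the text between '[' and ']'.

    Assumes no character classes and no backslash escapes, as in the
    findutils-default rules. Characters are compared by code point,
    which is an assumption of this sketch.
    """
    # A leading '^' only negates the list; it is not a range endpoint.
    body = bracket_body[1:] if bracket_body.startswith("^") else bracket_body
    found = []
    i = 0
    while i < len(body):
        # A range is three characters: start, '-', end.
        if i + 2 < len(body) and body[i + 1] == "-":
            start, end = body[i], body[i + 2]
            if ord(start) > ord(end):
                found.append(start + "-" + end)
            i += 3
        else:
            i += 1
    return found
```

Under these assumptions, @code{backward_ranges("z-a")} reports the single range @samp{z-a}, while @code{backward_ranges("a-z0-9")} reports nothing.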
GNU extensions are supported:
@enumerate
@item @samp{\w} matches a character within a word
@item @samp{\W} matches a character which is not within a word
@item @samp{\<} matches the beginning of a word
@item @samp{\>} matches the end of a word
@item @samp{\b} matches a word boundary
@item @samp{\B} matches a position which is not a word boundary
@item @samp{\`} matches the beginning of the whole input
@item @samp{\'} matches the end of the whole input
@end enumerate
70 Grouping is performed with backslashes followed by parentheses @samp{\(}, @samp{\)}. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{\(}.
72 The alternation operator is @samp{\|}.
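The backslash style of grouping and alternation described here differs from the unescaped style used by syntaxes such as @samp{posix-extended}. The Python sketch below shows one way the escaping of @samp{(}, @samp{)} and @samp{|} could be swapped between the two styles. The function name @code{emacs_to_extended} is hypothetical, and the sketch deliberately ignores every other difference between the syntaxes (intervals, character classes, and the context-dependent meaning of @samp{^}, @samp{$} and @samp{*}).

```python
def emacs_to_extended(pattern):
    r"""Swap the escaping of ( ) | between the two styles.

    \( \) \| become ( ) |, and bare ( ) | become \( \) \|.
    Bracket expressions are copied unchanged.
    """
    out = []
    i = 0
    n = len(pattern)
    while i < n:
        ch = pattern[i]
        if ch == "[":
            # Copy the bracket expression verbatim. A ']' directly
            # after '[' or '[^' is a literal member, not the terminator.
            j = i + 1
            if j < n and pattern[j] == "^":
                j += 1
            if j < n and pattern[j] == "]":
                j += 1
            while j < n and pattern[j] != "]":
                j += 1
            out.append(pattern[i:j + 1])
            i = j + 1
        elif ch == "\\" and i + 1 < n:
            nxt = pattern[i + 1]
            # An escaped group or alternation operator loses its
            # backslash; any other escape is copied as-is.
            out.append(nxt if nxt in "()|" else ch + nxt)
            i += 2
        elif ch in "()|":
            # Literal in the backslash style, so it must be escaped.
            out.append("\\" + ch)
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

For example, @code{emacs_to_extended(r".*\.\(c\|h\)")} yields the pattern @samp{.*\.(c|h)}.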
The character @samp{^} only represents the beginning of a string when it appears:
@enumerate
@item At the beginning of a regular expression
@item After an open-group, signified by @samp{\(}
@item After the alternation operator @samp{\|}
@end enumerate
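The three contexts in which @samp{^} acts as an anchor can be checked mechanically. The following Python sketch is one possible reading of that rule; the function name @code{caret_is_anchor} is hypothetical, and the sketch assumes the given @samp{^} is not inside a bracket expression.

```python
def caret_is_anchor(pattern, index):
    """Return True if the '^' at pattern[index] is in an anchoring context.

    Assumes the '^' is outside any bracket expression.
    """
    if pattern[index] != "^":
        raise ValueError("no '^' at that index")
    if index == 0:
        return True  # at the beginning of the regular expression
    if index >= 2 and pattern[index - 1] in "(|":
        # Count the backslashes immediately before the '(' or '|'.
        # An odd count means the last backslash forms '\(' or '\|';
        # an even count means the backslashes escape each other.
        k = index - 2
        backslashes = 0
        while k >= 0 and pattern[k] == "\\":
            backslashes += 1
            k -= 1
        return backslashes % 2 == 1
    return False
```

With this sketch, the @samp{^} in @samp{a^b} is reported as an ordinary character, while the one in @samp{a\|^b} is reported as an anchor.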
The character @samp{$} only represents the end of a string when it appears:
@enumerate
@item At the end of a regular expression
@item Before a close-group, signified by @samp{\)}
@item Before the alternation operator @samp{\|}
@end enumerate
@samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except:
@enumerate
@item At the beginning of a regular expression
@item After an open-group, signified by @samp{\(}
@item After the alternation operator @samp{\|}
@end enumerate
114 The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
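The longest-match rule differs from engines that return the first alternative that succeeds. The Python sketch below contrasts the two behaviours. Python's @code{re} module is used only as a stand-in: it is not one of the syntaxes described in this document and it does not follow the longest-match rule. The helper name @code{longest_alternative_match} is hypothetical, and it emulates the rule only for a flat list of top-level alternatives.

```python
import re


def longest_alternative_match(alternatives, text):
    """Return the longest prefix of text matched by any alternative, or None."""
    best = None
    for alt in alternatives:
        m = re.match(alt, text)
        if m and (best is None or len(m.group()) > len(best)):
            best = m.group()
    return best


# Python's own engine stops at the first alternative that matches.
first = re.match("a|ab", "ab").group()

# Trying every alternative and keeping the longest result mimics
# the longest-match rule for this simple case.
longest = longest_alternative_match(["a", "ab"], "ab")
```

Here @code{first} is @samp{a}, whereas @code{longest} is @samp{ab}, which is the result the longest-match rule calls for.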
117 @node awk regular expression syntax
118 @subsection @samp{awk} regular expression syntax
121 The character @samp{.} matches any single character except the null character.
@table @samp
@item +
indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
@item ?
indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
@end table
137 Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} can be used to quote the following character. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.
139 GNU extensions are not supported and so @samp{\w}, @samp{\W}, @samp{\<}, @samp{\>}, @samp{\b}, @samp{\B}, @samp{\`}, and @samp{\'} match @samp{w}, @samp{W}, @samp{<}, @samp{>}, @samp{b}, @samp{B}, @samp{`}, and @samp{'} respectively.
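To make the effect concrete, the Python sketch below rewrites a pattern so that each of these eight escapes is replaced by the plain character it matches in this syntax. The function name @code{awk_view} is hypothetical, and the sketch leaves every other escape sequence untouched.

```python
# The second character of each GNU escape that this syntax
# treats as an ordinary character.
GNU_ESCAPES = "wW<>bB`'"


def awk_view(pattern):
    """Replace unsupported GNU escapes by the literal character they match."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern[i] == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            # Drop the backslash for the eight GNU escapes only.
            out.append(nxt if nxt in GNU_ESCAPES else "\\" + nxt)
            i += 2
        else:
            out.append(pattern[i])
            i += 1
    return "".join(out)
```

For example, @code{awk_view(r"\<foo\w")} yields @samp{<foow}, which shows that the pattern matches those five literal characters rather than a word beginning with @samp{foo}.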
141 Grouping is performed with parentheses @samp{()}. An unmatched @samp{)} matches just itself. A backslash followed by a digit matches that digit.
143 The alternation operator is @samp{|}.
145 The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified.
@samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except:
@enumerate
@item At the beginning of a regular expression
@item After an open-group, signified by @samp{(}
@item After the alternation operator @samp{|}
@end enumerate
161 The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
164 @node egrep regular expression syntax
165 @subsection @samp{egrep} regular expression syntax
168 The character @samp{.} matches any single character except newline.
@table @samp
@item +
indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
@item ?
indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
@end table
184 Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are ignored. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit. Non-matching lists @samp{[^@dots{}]} do not ever match newline.
GNU extensions are supported:
@enumerate
@item @samp{\w} matches a character within a word
@item @samp{\W} matches a character which is not within a word
@item @samp{\<} matches the beginning of a word
@item @samp{\>} matches the end of a word
@item @samp{\b} matches a word boundary
@item @samp{\B} matches a position which is not a word boundary
@item @samp{\`} matches the beginning of the whole input
@item @samp{\'} matches the end of the whole input
@end enumerate
208 Grouping is performed with parentheses @samp{()}. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{(}.
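Because groups are numbered by the position of their opening parenthesis, the number of a nested group can be found by scanning the pattern from left to right. The Python sketch below does this for the unescaped-parenthesis style; the function name @code{group_openers} is hypothetical, and the bracket-expression handling is a simplification.

```python
def group_openers(pattern):
    """Return the index of each group-opening '(' in numbering order.

    Element 0 is where group 1 opens, element 1 is where group 2
    opens, and so on.
    """
    openers = []
    i = 0
    n = len(pattern)
    while i < n:
        ch = pattern[i]
        if ch == "\\":
            i += 2  # skip an escaped character such as '\('
        elif ch == "[":
            # Skip the whole bracket expression, where '(' is literal.
            j = i + 1
            if j < n and pattern[j] == "^":
                j += 1
            if j < n and pattern[j] == "]":
                j += 1
            while j < n and pattern[j] != "]":
                if pattern.startswith("[:", j):
                    # Step over a character class such as [:digit:].
                    end = pattern.find(":]", j + 2)
                    j = end + 2 if end != -1 else n
                else:
                    j += 1
            i = j + 1
        else:
            if ch == "(":
                openers.append(i)
            i += 1
    return openers
```

For the pattern @samp{((a)(b))\2} the result is @code{[0, 1, 4]}: group 2 opens at index 1, so @samp{\2} refers to the text matched by @samp{(a)}.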
210 The alternation operator is @samp{|}.
212 The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified.
214 The characters @samp{*}, @samp{+} and @samp{?} are special anywhere in a regular expression.
218 The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
221 @node emacs regular expression syntax
222 @subsection @samp{emacs} regular expression syntax
225 The character @samp{.} matches any single character except newline.
@table @samp
@item +
indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
@item ?
indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
@end table
241 Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are ignored. Within square brackets, @samp{\} is taken literally. Character classes are not supported, so for example you would need to use @samp{[0-9]} instead of @samp{[[:digit:]]}.
GNU extensions are supported:
@enumerate
@item @samp{\w} matches a character within a word
@item @samp{\W} matches a character which is not within a word
@item @samp{\<} matches the beginning of a word
@item @samp{\>} matches the end of a word
@item @samp{\b} matches a word boundary
@item @samp{\B} matches a position which is not a word boundary
@item @samp{\`} matches the beginning of the whole input
@item @samp{\'} matches the end of the whole input
@end enumerate
265 Grouping is performed with backslashes followed by parentheses @samp{\(}, @samp{\)}. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{\(}.
267 The alternation operator is @samp{\|}.
The character @samp{^} only represents the beginning of a string when it appears:
@enumerate
@item At the beginning of a regular expression
@item After an open-group, signified by @samp{\(}
@item After the alternation operator @samp{\|}
@end enumerate
The character @samp{$} only represents the end of a string when it appears:
@enumerate
@item At the end of a regular expression
@item Before a close-group, signified by @samp{\)}
@item Before the alternation operator @samp{\|}
@end enumerate
@samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except:
@enumerate
@item At the beginning of a regular expression
@item After an open-group, signified by @samp{\(}
@item After the alternation operator @samp{\|}
@end enumerate
309 The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
312 @node gnu-awk regular expression syntax
313 @subsection @samp{gnu-awk} regular expression syntax
316 The character @samp{.} matches any single character.
@table @samp
@item +
indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
@item ?
indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
@end table
332 Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} can be used to quote the following character. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.
GNU extensions are supported:
@enumerate
@item @samp{\w} matches a character within a word
@item @samp{\W} matches a character which is not within a word
@item @samp{\<} matches the beginning of a word
@item @samp{\>} matches the end of a word
@item @samp{\b} matches a word boundary
@item @samp{\B} matches a position which is not a word boundary
@item @samp{\`} matches the beginning of the whole input
@item @samp{\'} matches the end of the whole input
@end enumerate
356 Grouping is performed with parentheses @samp{()}. An unmatched @samp{)} matches just itself. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{(}.
358 The alternation operator is @samp{|}.
360 The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified.
@samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except:
@enumerate
@item At the beginning of a regular expression
@item After an open-group, signified by @samp{(}
@item After the alternation operator @samp{|}
@end enumerate
Intervals are specified by @samp{@{} and @samp{@}}. Invalid intervals are treated as literals; for example, @samp{a@{1} is treated as @samp{a\@{1}.
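The literal treatment of invalid intervals can be pictured as a rewriting step that escapes every @samp{@{} which does not begin a well-formed interval. The Python sketch below is one such rewriting. The function name @code{escape_invalid_intervals} is hypothetical, and the sketch assumes that only the forms @samp{@{m@}}, @samp{@{m,@}} and @samp{@{m,n@}} with decimal numbers are valid; it does not check that the lower bound is no greater than the upper bound, and it does not treat bracket expressions specially.

```python
import re

# Assumed shape of a valid interval: {m}, {m,} or {m,n}.
_VALID_INTERVAL = re.compile(r"\{\d+(,\d*)?\}")


def escape_invalid_intervals(pattern):
    """Escape each '{' that does not start a well-formed interval."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            out.append(pattern[i:i + 2])  # keep existing escapes untouched
            i += 2
        elif ch == "{":
            m = _VALID_INTERVAL.match(pattern, i)
            if m:
                out.append(m.group())  # a well-formed interval stays as-is
                i = m.end()
            else:
                out.append("\\{")  # anything else becomes a literal brace
                i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

Under these assumptions, @code{escape_invalid_intervals("a@{1")} returns the pattern @samp{a\@{1}, while @samp{a@{1,3@}} is returned unchanged.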
376 The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
379 @node grep regular expression syntax
380 @subsection @samp{grep} regular expression syntax
383 The character @samp{.} matches any single character except newline.
@table @samp
@item \+
indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
@item \?
indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
@end table
397 Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are ignored. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit. Non-matching lists @samp{[^@dots{}]} do not ever match newline.
GNU extensions are supported:
@enumerate
@item @samp{\w} matches a character within a word
@item @samp{\W} matches a character which is not within a word
@item @samp{\<} matches the beginning of a word
@item @samp{\>} matches the end of a word
@item @samp{\b} matches a word boundary
@item @samp{\B} matches a position which is not a word boundary
@item @samp{\`} matches the beginning of the whole input
@item @samp{\'} matches the end of the whole input
@end enumerate
421 Grouping is performed with backslashes followed by parentheses @samp{\(}, @samp{\)}. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{\(}.
423 The alternation operator is @samp{\|}.
The character @samp{^} only represents the beginning of a string when it appears:
@enumerate
@item At the beginning of a regular expression
@item After an open-group, signified by @samp{\(}
@item After a newline
@item After the alternation operator @samp{\|}
@end enumerate
The character @samp{$} only represents the end of a string when it appears:
@enumerate
@item At the end of a regular expression
@item Before a close-group, signified by @samp{\)}
@item Before a newline
@item Before the alternation operator @samp{\|}
@end enumerate
@samp{*}, @samp{\+} and @samp{\?} are special at any point in a regular expression except:
@enumerate
@item At the beginning of a regular expression
@item After an open-group, signified by @samp{\(}
@item After a newline
@item After the alternation operator @samp{\|}
@end enumerate
469 Intervals are specified by @samp{\@{} and @samp{\@}}. Invalid intervals such as @samp{a\@{1z} are not accepted.
471 The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
474 @node posix-awk regular expression syntax
475 @subsection @samp{posix-awk} regular expression syntax
478 The character @samp{.} matches any single character except the null character.
@table @samp
@item +
indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
@item ?
indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
@end table
494 Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} can be used to quote the following character. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.
496 GNU extensions are not supported and so @samp{\w}, @samp{\W}, @samp{\<}, @samp{\>}, @samp{\b}, @samp{\B}, @samp{\`}, and @samp{\'} match @samp{w}, @samp{W}, @samp{<}, @samp{>}, @samp{b}, @samp{B}, @samp{`}, and @samp{'} respectively.
498 Grouping is performed with parentheses @samp{()}. An unmatched @samp{)} matches just itself. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{(}.
500 The alternation operator is @samp{|}.
502 The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified.
@samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except the following places, where they are not allowed:
@enumerate
@item At the beginning of a regular expression
@item After an open-group, signified by @samp{(}
@item After the alternation operator @samp{|}
@end enumerate
Intervals are specified by @samp{@{} and @samp{@}}. Invalid intervals are treated as literals; for example, @samp{a@{1} is treated as @samp{a\@{1}.
518 The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
521 @node posix-basic regular expression syntax
522 @subsection @samp{posix-basic} regular expression syntax
This is a synonym for @samp{ed}.
524 @node posix-egrep regular expression syntax
525 @subsection @samp{posix-egrep} regular expression syntax
528 The character @samp{.} matches any single character except newline.
@table @samp
@item +
indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
@item ?
indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
@end table
544 Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are ignored. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit. Non-matching lists @samp{[^@dots{}]} do not ever match newline.
GNU extensions are supported:
@enumerate
@item @samp{\w} matches a character within a word
@item @samp{\W} matches a character which is not within a word
@item @samp{\<} matches the beginning of a word
@item @samp{\>} matches the end of a word
@item @samp{\b} matches a word boundary
@item @samp{\B} matches a position which is not a word boundary
@item @samp{\`} matches the beginning of the whole input
@item @samp{\'} matches the end of the whole input
@end enumerate
568 Grouping is performed with parentheses @samp{()}. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{(}.
570 The alternation operator is @samp{|}.
572 The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified.
574 The characters @samp{*}, @samp{+} and @samp{?} are special anywhere in a regular expression.
Intervals are specified by @samp{@{} and @samp{@}}. Invalid intervals are treated as literals; for example, @samp{a@{1} is treated as @samp{a\@{1}.
578 The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
581 @node posix-extended regular expression syntax
582 @subsection @samp{posix-extended} regular expression syntax
585 The character @samp{.} matches any single character except the null character.
@table @samp
@item +
indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
@item ?
indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
@end table
601 Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.
GNU extensions are supported:
@enumerate
@item @samp{\w} matches a character within a word
@item @samp{\W} matches a character which is not within a word
@item @samp{\<} matches the beginning of a word
@item @samp{\>} matches the end of a word
@item @samp{\b} matches a word boundary
@item @samp{\B} matches a position which is not a word boundary
@item @samp{\`} matches the beginning of the whole input
@item @samp{\'} matches the end of the whole input
@end enumerate
625 Grouping is performed with parentheses @samp{()}. An unmatched @samp{)} matches just itself. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{(}.
627 The alternation operator is @samp{|}.
629 The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified.
@samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except the following places, where they are not allowed:
@enumerate
@item At the beginning of a regular expression
@item After an open-group, signified by @samp{(}
@item After the alternation operator @samp{|}
@end enumerate
643 Intervals are specified by @samp{@{} and @samp{@}}. Invalid intervals such as @samp{a@{1z} are not accepted.
645 The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
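The syntax names documented here are the values accepted by the @samp{-regextype} option of GNU @code{find}, which affects the @samp{-regex} tests that follow it on the command line. The Python sketch below only assembles such a command line and does not run it. The function name @code{find_command} and the list @code{SYNTAXES} are hypothetical, and the list simply repeats the names from the menu of this document.

```python
import shlex

# Syntax names taken from the menu of this document.
SYNTAXES = [
    "findutils-default", "awk", "egrep", "emacs", "gnu-awk", "grep",
    "posix-awk", "posix-basic", "posix-egrep", "posix-extended",
]


def find_command(start, syntax, pattern):
    """Build an argument list for find using the given regex syntax."""
    if syntax not in SYNTAXES:
        raise ValueError("unknown regular expression syntax: " + syntax)
    # -regextype is placed before -regex because it only changes
    # the syntax of the tests that come after it.
    return ["find", start, "-regextype", syntax, "-regex", pattern]


argv = find_command(".", "posix-extended", r".*\.(c|h)")
# shlex.join quotes the pattern so the line can be pasted into a shell.
print(shlex.join(argv))
```

Note that @samp{-regex} matches against the whole path, which is why the example pattern begins with @samp{.*}.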