1 <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
4 <meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">
5 <title>User's Guide</title>
6 <link rel="stylesheet" href="../../../doc/src/boostbook.css" type="text/css">
7 <meta name="generator" content="DocBook XSL Stylesheets V1.79.1">
8 <link rel="home" href="../index.html" title="The Boost C++ Libraries BoostBook Documentation Subset">
9 <link rel="up" href="../xpressive.html" title="Chapter 46. Boost.Xpressive">
10 <link rel="prev" href="../xpressive.html" title="Chapter 46. Boost.Xpressive">
11 <link rel="next" href="reference.html" title="Reference">
13 <body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
27 <div class="titlepage"><div><div><h2 class="title" style="clear: both">
28 <a name="xpressive.user_s_guide"></a><a class="link" href="user_s_guide.html" title="User's Guide">User's Guide</a>
29 </h2></div></div></div>
30 <div class="toc"><dl class="toc">
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.introduction">Introduction</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.installing_xpressive">Installing
xpressive</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.quick_start">Quick Start</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object">Creating
a Regex Object</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.matching_and_searching">Matching
and Searching</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.accessing_results">Accessing
Results</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions">String
Substitutions</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.string_splitting_and_tokenization">String
Splitting and Tokenization</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.named_captures">Named Captures</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches">Grammars
and Nested Matches</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions">Semantic
Actions and User-Defined Assertions</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.symbol_tables_and_attributes">Symbol
Tables and Attributes</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.localization_and_regex_traits">Localization
and Regex Traits</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks">Tips 'N Tricks</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.concepts">Concepts</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.examples">Examples</a></span></dt>
<p>
This section describes how to use xpressive to accomplish text manipulation
and parsing tasks. If you are looking for detailed information regarding specific
components in xpressive, check the <a class="link" href="reference.html" title="Reference">Reference</a>
section.
</p>
65 <div class="titlepage"><div><div><h3 class="title">
66 <a name="boost_xpressive.user_s_guide.introduction"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.introduction" title="Introduction">Introduction</a>
67 </h3></div></div></div>
<a name="boost_xpressive.user_s_guide.introduction.h0"></a>
<span class="phrase"><a name="boost_xpressive.user_s_guide.introduction.what_is_xpressive_"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.introduction.what_is_xpressive_">What is xpressive?</a>
<p>
xpressive is a regular expression template library. Regular expressions (regexes)
can be written as strings that are parsed dynamically at runtime (dynamic
regexes), or as <span class="emphasis"><em>expression templates</em></span><a href="#ftn.boost_xpressive.user_s_guide.introduction.f0" class="footnote" name="boost_xpressive.user_s_guide.introduction.f0"><sup class="footnote">[36]</sup></a> that are parsed at compile-time (static regexes). Dynamic regexes
have the advantage that they can be accepted from the user as input at runtime
or read from an initialization file. Static regexes have several advantages.
Since they are C++ expressions instead of strings, they can be syntax-checked
at compile-time. Also, they can naturally refer to code and data elsewhere
in your program, giving you the ability to call back into your code from
within a regex match. Finally, since they are statically bound, the compiler
can generate faster code for static regexes.
</p>
<p>
xpressive's dual nature is unique and powerful. Static xpressive is a bit
like the <a href="http://spirit.sourceforge.net" target="_top">Spirit Parser Framework</a>.
As with <a href="http://spirit.sourceforge.net" target="_top">Spirit</a>, you can build
grammars with static regexes using expression templates. (Unlike <a href="http://spirit.sourceforge.net" target="_top">Spirit</a>,
xpressive does exhaustive backtracking, trying every possibility to find
a match for your pattern.) Dynamic xpressive is a bit like <a href="../../../libs/regex" target="_top">Boost.Regex</a>.
In fact, xpressive's interface should be familiar to anyone who has used
<a href="../../../libs/regex" target="_top">Boost.Regex</a>. xpressive's innovation
comes from allowing you to mix and match static and dynamic regexes in the
same program, and even in the same expression! You can embed a dynamic regex
in a static regex, or <span class="emphasis"><em>vice versa</em></span>, and the embedded regex
will participate fully in the search, back-tracking as needed to make the
match succeed.
</p>
<a name="boost_xpressive.user_s_guide.introduction.h1"></a>
<span class="phrase"><a name="boost_xpressive.user_s_guide.introduction.hello__world_"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.introduction.hello__world_">Hello, World!</a>
<p>
Enough theory. Let's have a look at <span class="emphasis"><em>Hello World</em></span>, xpressive
style:
</p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iostream</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span>

<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>

<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">hello</span><span class="special">(</span> <span class="string">"hello world!"</span> <span class="special">);</span>

    <span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"(\\w+) (\\w+)!"</span> <span class="special">);</span>
    <span class="identifier">smatch</span> <span class="identifier">what</span><span class="special">;</span>

    <span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_match</span><span class="special">(</span> <span class="identifier">hello</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">rex</span> <span class="special">)</span> <span class="special">)</span>
    <span class="special">{</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// whole match</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="number">1</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// first capture</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="number">2</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// second capture</span>
    <span class="special">}</span>

    <span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
This program outputs the following:
</p>
<pre class="programlisting">hello world!
hello
world
</pre>
<p>
The first thing you'll notice about the code is that all the types in xpressive
live in the <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span></code> namespace.
</p>
<div class="note"><table border="0" summary="Note">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
<th align="left">Note</th>
</tr>
<tr><td align="left" valign="top"><p>
Most of the rest of the examples in this document will leave off the <code class="computeroutput"><span class="keyword">using</span> <span class="keyword">namespace</span>
<span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span></code>
directive. Just pretend it's there.
</p></td></tr>
</table></div>
<p>
Next, you'll notice the type of the regular expression object is <code class="computeroutput"><span class="identifier">sregex</span></code>. If you are familiar with <a href="../../../libs/regex" target="_top">Boost.Regex</a>, this is different from what you
are used to. The "<code class="computeroutput"><span class="identifier">s</span></code>"
in "<code class="computeroutput"><span class="identifier">sregex</span></code>" stands
for "<code class="computeroutput"><span class="identifier">string</span></code>", indicating
that this regex can be used to find patterns in <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code>
objects. I'll discuss this difference and its implications in detail later.
</p>
<p>
Notice how the regex object is initialized:
</p>
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"(\\w+) (\\w+)!"</span> <span class="special">);</span>
</pre>
<p>
To create a regular expression object from a string, you must call a factory
method such as <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html#id-1_3_47_5_18_2_1_1_9_1-bb">basic_regex<>::compile()</a></code></code>.
This is another area in which xpressive differs from other object-oriented
regular expression libraries. Other libraries encourage you to think of a
regular expression as a kind of string on steroids. In xpressive, regular
expressions are not strings; they are little programs in a domain-specific
language. Strings are only one <span class="emphasis"><em>representation</em></span> of that
language. Another representation is an expression template. For example,
the above line of code is equivalent to the following:
</p>
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">s1</span><span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">)</span> <span class="special">>></span> <span class="char">' '</span> <span class="special">>></span> <span class="special">(</span><span class="identifier">s2</span><span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">)</span> <span class="special">>></span> <span class="char">'!'</span><span class="special">;</span>
</pre>
<p>
This describes the same regular expression, except it uses the domain-specific
embedded language defined by static xpressive.
</p>
<p>
As you can see, static regexes have a syntax that is noticeably different
from standard Perl syntax. That is because we are constrained by C++'s syntax.
The biggest difference is the use of <code class="computeroutput"><span class="special">>></span></code>
to mean "followed by". For instance, in Perl you can just put sub-expressions
next to each other:
</p>
<pre class="programlisting"><span class="identifier">abc</span>
</pre>
<p>
But in C++, there must be an operator separating sub-expressions:
</p>
<pre class="programlisting"><span class="identifier">a</span> <span class="special">>></span> <span class="identifier">b</span> <span class="special">>></span> <span class="identifier">c</span>
</pre>
<p>
In Perl, parentheses <code class="computeroutput"><span class="special">()</span></code> have
special meaning. They group, but as a side-effect they also create back-references
like <code class="literal">$1</code> and <code class="literal">$2</code>. In C++, there is no
way to overload parentheses to give them side-effects. To get the same effect,
we use the special <code class="computeroutput"><span class="identifier">s1</span></code>, <code class="computeroutput"><span class="identifier">s2</span></code>, etc. tokens. Assign to one to create
a back-reference (known as a sub-match in xpressive).
</p>
<p>
You'll also notice that the one-or-more repetition operator <code class="computeroutput"><span class="special">+</span></code> has moved from postfix to prefix position.
That's because C++ doesn't have a postfix <code class="computeroutput"><span class="special">+</span></code>
operator. So:
</p>
<pre class="programlisting"><span class="string">"\\w+"</span>
</pre>
<p>
is the same as:
</p>
<pre class="programlisting"><span class="special">+</span><span class="identifier">_w</span>
</pre>
<p>
We'll cover all the other differences <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes" title="Static Regexes">later</a>.
</p>
221 <div class="section">
222 <div class="titlepage"><div><div><h3 class="title">
<a name="boost_xpressive.user_s_guide.installing_xpressive"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.installing_xpressive" title="Installing xpressive">Installing xpressive</a>
225 </h3></div></div></div>
<a name="boost_xpressive.user_s_guide.installing_xpressive.h0"></a>
<span class="phrase"><a name="boost_xpressive.user_s_guide.installing_xpressive.getting_xpressive"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.installing_xpressive.getting_xpressive">Getting xpressive</a>
<p>
There are two ways to get xpressive. The first and simplest is to download
the latest version of Boost. Just go to <a href="http://sf.net/projects/boost" target="_top">http://sf.net/projects/boost</a>
and follow the <span class="quote">“<span class="quote">Download</span>”</span> link.
</p>
<p>
The second way is by directly accessing the Boost Subversion repository.
Just go to <a href="http://svn.boost.org/trac/boost/" target="_top">http://svn.boost.org/trac/boost/</a>
and follow the instructions there for anonymous Subversion access. The version
in Boost Subversion is unstable.
</p>
<a name="boost_xpressive.user_s_guide.installing_xpressive.h1"></a>
<span class="phrase"><a name="boost_xpressive.user_s_guide.installing_xpressive.building_with_xpressive"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.installing_xpressive.building_with_xpressive">Building with xpressive</a>
<p>
Xpressive is a header-only template library, which means you don't need to
alter your build scripts or link to any separate lib file to use it. All
you need to do is <code class="computeroutput"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span></code>.
If you are only using static regexes, you can improve compile times by only
including <code class="computeroutput"><span class="identifier">xpressive_static</span><span class="special">.</span><span class="identifier">hpp</span></code>. Likewise,
you can include <code class="computeroutput"><span class="identifier">xpressive_dynamic</span><span class="special">.</span><span class="identifier">hpp</span></code> if
you only plan on using dynamic regexes.
</p>
<p>
If you would also like to use semantic actions or custom assertions with
your static regexes, you will need to additionally include <code class="computeroutput"><span class="identifier">regex_actions</span><span class="special">.</span><span class="identifier">hpp</span></code>.
</p>
261 <a name="boost_xpressive.user_s_guide.installing_xpressive.h2"></a>
262 <span class="phrase"><a name="boost_xpressive.user_s_guide.installing_xpressive.requirements"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.installing_xpressive.requirements">Requirements</a>
265 Xpressive requires Boost version 1.34.1 or higher.
<a name="boost_xpressive.user_s_guide.installing_xpressive.h3"></a>
<span class="phrase"><a name="boost_xpressive.user_s_guide.installing_xpressive.supported_compilers"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.installing_xpressive.supported_compilers">Supported Compilers</a>
273 Currently, Boost.Xpressive is known to work on the following compilers:
275 <div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
276 <li class="listitem">
277 Visual C++ 7.1 and higher
279 <li class="listitem">
280 GNU C++ 3.4 and higher
282 <li class="listitem">
283 Intel for Linux 8.1 and higher
285 <li class="listitem">
286 Intel for Windows 10 and higher
288 <li class="listitem">
289 tru64cxx 71 and higher
291 <li class="listitem">
294 <li class="listitem">
<p>
Check the latest test results at Boost's <a href="http://beta.boost.org/development/tests/trunk/developer/xpressive.html" target="_top">Regression Results Page</a>.
</p>
<div class="note"><table border="0" summary="Note">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
<th align="left">Note</th>
</tr>
<tr><td align="left" valign="top"><p>
Please send any questions, comments and bug reports to eric <at>
boost-consulting <dot> com.
</p></td></tr>
</table></div>
313 <div class="section">
314 <div class="titlepage"><div><div><h3 class="title">
315 <a name="boost_xpressive.user_s_guide.quick_start"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.quick_start" title="Quick Start">Quick Start</a>
316 </h3></div></div></div>
318 You don't need to know much to start being productive with xpressive. Let's
319 begin with the nickel tour of the types and algorithms xpressive provides.
322 <a name="boost_xpressive.user_s_guide.quick_start.t0"></a><p class="title"><b>Table 46.1. xpressive's Tool-Box</b></p>
<div class="table-contents"><table class="table" summary="xpressive's Tool-Box">
<thead><tr>
<th>Tool</th>
<th>Description</th>
</tr></thead>
<tbody>
<tr>
<td>
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>
</td>
<td>
Contains a compiled regular expression. <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>
is the most important type in xpressive. Everything you do with
xpressive will begin with creating an object of type <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>.
</td>
</tr>
<tr>
<td>
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>,
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>
</td>
<td>
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
contains the results of a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
or <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
operation. It acts like a vector of <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>
objects. A <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>
object contains a marked sub-expression (also known as a back-reference
in Perl). It is basically just a pair of iterators representing
the begin and end of the marked sub-expression.
</td>
</tr>
<tr>
<td>
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
</td>
<td>
Checks to see if a string matches a regex. For <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
to succeed, the <span class="emphasis"><em>whole string</em></span> must match the
regex, from beginning to end. If you give <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>,
it will write into it any marked sub-expressions it finds.
</td>
</tr>
<tr>
<td>
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
</td>
<td>
Searches a string to find a sub-string that matches the regex.
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
will try to find a match at every position in the string, starting
at the beginning, and stopping when it finds a match or when the
string is exhausted. As with <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>,
if you give <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>,
it will write into it any marked sub-expressions it finds.
</td>
</tr>
<tr>
<td>
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
</td>
<td>
Given an input string, a regex, and a substitution string, <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
builds a new string by replacing those parts of the input string
that match the regex with the substitution string. The substitution
string can contain references to marked sub-expressions.
</td>
</tr>
<tr>
<td>
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_iterator.html" title="Struct template regex_iterator">regex_iterator<></a></code></code>
</td>
<td>
An STL-compatible iterator that makes it easy to find all the places
in a string that match a regex. Dereferencing a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_iterator.html" title="Struct template regex_iterator">regex_iterator<></a></code></code>
returns a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>.
Incrementing a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_iterator.html" title="Struct template regex_iterator">regex_iterator<></a></code></code>
finds the next match.
</td>
</tr>
<tr>
<td>
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
</td>
<td>
Like <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_iterator.html" title="Struct template regex_iterator">regex_iterator<></a></code></code>,
except dereferencing a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
returns a string. By default, it will return the whole sub-string
that the regex matched, but it can be configured to return any
or all of the marked sub-expressions one at a time, or even the
parts of the string that <span class="emphasis"><em>didn't</em></span> match the
regex.
</td>
</tr>
<tr>
<td>
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>
</td>
<td>
A factory for <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>
objects. It "compiles" a string into a regular expression.
You will not usually have to deal directly with <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>
because the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>
class has a factory method that uses <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>
internally. But if you need to do anything fancy like create a
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>
object with a different <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span></code>,
you will need to use a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>
explicitly.
</td>
</tr>
</tbody>
</table></div>
<br class="table-break"><p>
Now that you know a bit about the tools xpressive provides, you can pick
the right tool for you by answering the following two questions:
</p>
<div class="orderedlist"><ol class="orderedlist" type="1">
<li class="listitem">
What <span class="emphasis"><em>iterator</em></span> type will you use to traverse your
data?
</li>
<li class="listitem">
What do you want to <span class="emphasis"><em>do</em></span> to your data?
</li>
</ol></div>
<a name="boost_xpressive.user_s_guide.quick_start.h0"></a>
<span class="phrase"><a name="boost_xpressive.user_s_guide.quick_start.know_your_iterator_type"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.quick_start.know_your_iterator_type">Know
Your Iterator Type</a>
<p>
Most of the classes in xpressive are templates that are parameterized on
the iterator type. xpressive defines some common typedefs to make the job
of choosing the right types easier. You can use the table below to find the
right types based on the type of your iterator.
</p>
508 <a name="boost_xpressive.user_s_guide.quick_start.t1"></a><p class="title"><b>Table 46.2. xpressive Typedefs vs. Iterator Types</b></p>
509 <div class="table-contents"><table class="table" summary="xpressive Typedefs vs. Iterator Types">
522 std::string::const_iterator
532 std::wstring::const_iterator
545 <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>
550 <code class="computeroutput"><span class="identifier">sregex</span></code>
555 <code class="computeroutput"><span class="identifier">cregex</span></code>
560 <code class="computeroutput"><span class="identifier">wsregex</span></code>
565 <code class="computeroutput"><span class="identifier">wcregex</span></code>
572 <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
577 <code class="computeroutput"><span class="identifier">smatch</span></code>
582 <code class="computeroutput"><span class="identifier">cmatch</span></code>
587 <code class="computeroutput"><span class="identifier">wsmatch</span></code>
592 <code class="computeroutput"><span class="identifier">wcmatch</span></code>
599 <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>
604 <code class="computeroutput"><span class="identifier">sregex_compiler</span></code>
609 <code class="computeroutput"><span class="identifier">cregex_compiler</span></code>
614 <code class="computeroutput"><span class="identifier">wsregex_compiler</span></code>
619 <code class="computeroutput"><span class="identifier">wcregex_compiler</span></code>
626 <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_iterator.html" title="Struct template regex_iterator">regex_iterator<></a></code></code>
631 <code class="computeroutput"><span class="identifier">sregex_iterator</span></code>
636 <code class="computeroutput"><span class="identifier">cregex_iterator</span></code>
641 <code class="computeroutput"><span class="identifier">wsregex_iterator</span></code>
646 <code class="computeroutput"><span class="identifier">wcregex_iterator</span></code>
653 <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
658 <code class="computeroutput"><span class="identifier">sregex_token_iterator</span></code>
663 <code class="computeroutput"><span class="identifier">cregex_token_iterator</span></code>
668 <code class="computeroutput"><span class="identifier">wsregex_token_iterator</span></code>
673 <code class="computeroutput"><span class="identifier">wcregex_token_iterator</span></code>
680 <br class="table-break"><p>
Notice the systematic naming convention. Many of these types are
used together, so the naming convention helps you use them consistently.
For instance, if you have an <code class="computeroutput"><span class="identifier">sregex</span></code>,
you should also be using an <code class="computeroutput"><span class="identifier">smatch</span></code>.
687 If you are not using one of those four iterator types, then you can use the
688 templates directly and specify your iterator type.
691 <a name="boost_xpressive.user_s_guide.quick_start.h1"></a>
692 <span class="phrase"><a name="boost_xpressive.user_s_guide.quick_start.know_your_task"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.quick_start.know_your_task">Know Your
696 Do you want to find a pattern once? Many times? Search and replace? xpressive
697 has tools for all that and more. Below is a quick reference:
700 <a name="boost_xpressive.user_s_guide.quick_start.t2"></a><p class="title"><b>Table 46.3. Tasks and Tools</b></p>
701 <div class="table-contents"><table class="table" summary="Tasks and Tools">
722 <span class="inlinemediaobject"><img src="../images/tip.png" alt="tip"></span> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.see_if_a_whole_string_matches_a_regex">See
723 if a whole string matches a regex</a>
728 The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
736 <span class="inlinemediaobject"><img src="../images/tip.png" alt="tip"></span> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.see_if_a_string_contains_a_sub_string_that_matches_a_regex">See
737 if a string contains a sub-string that matches a regex</a>
742 The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
750 <span class="inlinemediaobject"><img src="../images/tip.png" alt="tip"></span> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.replace_all_sub_strings_that_match_a_regex">Replace
751 all sub-strings that match a regex</a>
756 The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
764 <span class="inlinemediaobject"><img src="../images/tip.png" alt="tip"></span> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.find_all_the_sub_strings_that_match_a_regex_and_step_through_them_one_at_a_time">Find
765 all the sub-strings that match a regex and step through them one
771 The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_iterator.html" title="Struct template regex_iterator">regex_iterator<></a></code></code>
779 <span class="inlinemediaobject"><img src="../images/tip.png" alt="tip"></span> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.split_a_string_into_tokens_that_each_match_a_regex">Split
780 a string into tokens that each match a regex</a>
785 The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
793 <span class="inlinemediaobject"><img src="../images/tip.png" alt="tip"></span> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.split_a_string_using_a_regex_as_a_delimiter">Split
794 a string using a regex as a delimiter</a>
799 The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
807 <br class="table-break"><p>
808 These algorithms and classes are described in excruciating detail in the
811 <div class="tip"><table border="0" summary="Tip">
813 <td rowspan="2" align="center" valign="top" width="25"><img alt="[Tip]" src="../../../doc/src/images/tip.png"></td>
814 <th align="left">Tip</th>
816 <tr><td align="left" valign="top"><p>
817 Try clicking on a task in the table above to see a complete example program
818 that uses xpressive to solve that particular task.
822 <div class="section">
823 <div class="titlepage"><div><div><h3 class="title">
824 <a name="xpressive.user_s_guide.creating_a_regex_object"></a><a class="link" href="user_s_guide.html#xpressive.user_s_guide.creating_a_regex_object" title="Creating a Regex Object">Creating
826 </h3></div></div></div>
827 <div class="toc"><dl class="toc">
828 <dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes">Static
829 Regexes</a></span></dt>
830 <dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes">Dynamic
831 Regexes</a></span></dt>
834 When using xpressive, the first thing you'll do is create a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>
835 object. This section goes over the nuts and bolts of building a regular expression
836 in the two dialects xpressive supports: static and dynamic.
838 <div class="section">
839 <div class="titlepage"><div><div><h4 class="title">
840 <a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes" title="Static Regexes">Static
842 </h4></div></div></div>
844 <a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.h0"></a>
845 <span class="phrase"><a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.overview"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.overview">Overview</a>
848 The feature that really sets xpressive apart from other C/C++ regular expression
849 libraries is the ability to author a regular expression using C++ expressions.
850 xpressive achieves this through operator overloading, using a technique
851 called <span class="emphasis"><em>expression templates</em></span> to embed a mini-language
852 dedicated to pattern matching within C++. These "static regexes"
853 have many advantages over their string-based brethren. In particular, static
856 <div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
857 <li class="listitem">
858 are syntax-checked at compile-time; they will never fail at run-time
859 due to a syntax error.
861 <li class="listitem">
862 can naturally refer to other C++ data and code, including other regexes,
863 making it simple to build grammars out of regular expressions and bind
864 user-defined actions that execute when parts of your regex match.
866 <li class="listitem">
867 are statically bound for better inlining and optimization. Static regexes
868 require no state tables, virtual functions, byte-code or calls through
869 function pointers that cannot be resolved at compile time.
871 <li class="listitem">
872 are not limited to searching for patterns in strings. You can declare
873 a static regex that finds patterns in an array of integers, for instance.
877 Since we compose static regexes using C++ expressions, we are constrained
878 by the rules for legal C++ expressions. Unfortunately, that means that
879 "classic" regular expression syntax cannot always be mapped cleanly
880 into C++. Rather, we map the regex <span class="emphasis"><em>constructs</em></span>, picking
881 new syntax that is legal C++.
884 <a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.h1"></a>
885 <span class="phrase"><a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.construction_and_assignment"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.construction_and_assignment">Construction
889 You create a static regex by assigning one to an object of type <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>.
890 For instance, the following defines a regex that can be used to find patterns
891 in objects of type <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code>:
893 <pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="char">'$'</span> <span class="special">>></span> <span class="special">+</span><span class="identifier">_d</span> <span class="special">>></span> <span class="char">'.'</span> <span class="special">>></span> <span class="identifier">_d</span> <span class="special">>></span> <span class="identifier">_d</span><span class="special">;</span>
896 Assignment works similarly.
899 <a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.h2"></a>
900 <span class="phrase"><a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.character_and_string_literals"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.character_and_string_literals">Character
901 and String Literals</a>
904 In static regexes, character and string literals match themselves. For
905 instance, in the regex above, <code class="computeroutput"><span class="char">'$'</span></code>
906 and <code class="computeroutput"><span class="char">'.'</span></code> match the characters
907 <code class="computeroutput"><span class="char">'$'</span></code> and <code class="computeroutput"><span class="char">'.'</span></code>
908 respectively. Don't be confused by the fact that <code class="literal">$</code> and
909 <code class="literal">.</code> are meta-characters in Perl. In xpressive, literals
910 always represent themselves.
913 When using literals in static regexes, you must take care that at least
914 one operand is not a literal. For instance, the following are <span class="emphasis"><em>not</em></span>
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">re1</span> <span class="special">=</span> <span class="char">'a'</span> <span class="special">&gt;&gt;</span> <span class="char">'b'</span><span class="special">;</span> <span class="comment">// ERROR!</span>
<span class="identifier">sregex</span> <span class="identifier">re2</span> <span class="special">=</span> <span class="special">+</span><span class="char">'a'</span><span class="special">;</span> <span class="comment">// ERROR!</span>
921 The two operands to the binary <code class="computeroutput"><span class="special">>></span></code>
922 operator are both literals, and the operand of the unary <code class="computeroutput"><span class="special">+</span></code> operator is also a literal, so these statements
923 will call the native C++ binary right-shift and unary plus operators, respectively.
924 That's not what we want. To get operator overloading to kick in, at least
925 one operand must be a user-defined type. We can use xpressive's <code class="computeroutput"><span class="identifier">as_xpr</span><span class="special">()</span></code>
926 helper function to "taint" an expression with regex-ness, forcing
927 operator overloading to find the correct operators. The two regexes above
928 should be written as:
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">re1</span> <span class="special">=</span> <span class="identifier">as_xpr</span><span class="special">(</span><span class="char">'a'</span><span class="special">)</span> <span class="special">&gt;&gt;</span> <span class="char">'b'</span><span class="special">;</span> <span class="comment">// OK</span>
<span class="identifier">sregex</span> <span class="identifier">re2</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">as_xpr</span><span class="special">(</span><span class="char">'a'</span><span class="special">);</span> <span class="comment">// OK</span>
934 <a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.h3"></a>
935 <span class="phrase"><a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.sequencing_and_alternation"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.sequencing_and_alternation">Sequencing
939 As you've probably already noticed, sub-expressions in static regexes must
940 be separated by the sequencing operator, <code class="computeroutput"><span class="special">>></span></code>.
941 You can read this operator as "followed by".
<pre class="programlisting"><span class="comment">// Match an 'a' followed by a digit</span>
<span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="char">'a'</span> <span class="special">&gt;&gt;</span> <span class="identifier">_d</span><span class="special">;</span>
947 Alternation works just as it does in Perl with the <code class="computeroutput"><span class="special">|</span></code>
948 operator. You can read this operator as "or". For example:
<pre class="programlisting"><span class="comment">// Match a digit character or a word character one or more times</span>
<span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">|</span> <span class="identifier">_w</span> <span class="special">);</span>
954 <a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.h4"></a>
955 <span class="phrase"><a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.grouping_and_captures"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.grouping_and_captures">Grouping
959 In Perl, parentheses <code class="computeroutput"><span class="special">()</span></code> have
960 special meaning. They group, but as a side-effect they also create back-references
961 like <code class="literal">$1</code> and <code class="literal">$2</code>. In C++, parentheses
962 only group -- there is no way to give them side-effects. To get the same
963 effect, we use the special <code class="computeroutput"><span class="identifier">s1</span></code>,
964 <code class="computeroutput"><span class="identifier">s2</span></code>, etc. tokens. Assigning
965 to one creates a back-reference. You can then use the back-reference later
966 in your expression, like using <code class="literal">\1</code> and <code class="literal">\2</code>
967 in Perl. For example, consider the following regex, which finds matching
<pre class="programlisting"><span class="string">"&lt;(\\w+)&gt;.*?&lt;/\\1&gt;"</span>
973 In static xpressive, this would be:
<pre class="programlisting"><span class="char">'&lt;'</span> <span class="special">&gt;&gt;</span> <span class="special">(</span><span class="identifier">s1</span><span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">)</span> <span class="special">&gt;&gt;</span> <span class="char">'&gt;'</span> <span class="special">&gt;&gt;</span> <span class="special">-*</span><span class="identifier">_</span> <span class="special">&gt;&gt;</span> <span class="string">"&lt;/"</span> <span class="special">&gt;&gt;</span> <span class="identifier">s1</span> <span class="special">&gt;&gt;</span> <span class="char">'&gt;'</span>
978 Notice how you capture a back-reference by assigning to <code class="computeroutput"><span class="identifier">s1</span></code>,
979 and then you use <code class="computeroutput"><span class="identifier">s1</span></code> later
980 in the pattern to find the matching end tag.
982 <div class="tip"><table border="0" summary="Tip">
984 <td rowspan="2" align="center" valign="top" width="25"><img alt="[Tip]" src="../../../doc/src/images/tip.png"></td>
985 <th align="left">Tip</th>
987 <tr><td align="left" valign="top"><p>
988 <span class="bold"><strong>Grouping without capturing a back-reference</strong></span>
<br> <br> In xpressive, if you want grouping without capturing
a back-reference, use <code class="computeroutput"><span class="special">()</span></code>
without <code class="computeroutput"><span class="identifier">s1</span></code>. That is the
equivalent of Perl's <code class="literal">(?:)</code> non-capturing grouping construct.
996 <a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.h5"></a>
997 <span class="phrase"><a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.case_insensitivity_and_internationalization"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.case_insensitivity_and_internationalization">Case-Insensitivity
998 and Internationalization</a>
1001 Perl lets you make part of your regular expression case-insensitive by
1002 using the <code class="literal">(?i:)</code> pattern modifier. xpressive also has
1003 a case-insensitivity pattern modifier, called <code class="computeroutput"><span class="identifier">icase</span></code>.
1004 You can use it as follows:
1006 <pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="string">"this"</span> <span class="special">>></span> <span class="identifier">icase</span><span class="special">(</span> <span class="string">"that"</span> <span class="special">);</span>
1009 In this regular expression, <code class="computeroutput"><span class="string">"this"</span></code>
1010 will be matched exactly, but <code class="computeroutput"><span class="string">"that"</span></code>
1011 will be matched irrespective of case.
1014 Case-insensitive regular expressions raise the issue of internationalization:
1015 how should case-insensitive character comparisons be evaluated? Also, many
1016 character classes are locale-specific. Which characters are matched by
1017 <code class="computeroutput"><span class="identifier">digit</span></code> and which are matched
1018 by <code class="computeroutput"><span class="identifier">alpha</span></code>? The answer depends
1019 on the <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span></code> object the regular expression
1020 object is using. By default, all regular expression objects use the global
1021 locale. You can override the default by using the <code class="computeroutput"><span class="identifier">imbue</span><span class="special">()</span></code> pattern modifier, as follows:
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span> <span class="identifier">my_locale</span> <span class="special">=</span> <span class="comment">/* initialize a std::locale object */</span><span class="special">;</span>
<span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="identifier">imbue</span><span class="special">(</span> <span class="identifier">my_locale</span> <span class="special">)(</span> <span class="special">+</span><span class="identifier">alpha</span> <span class="special">&gt;&gt;</span> <span class="special">+</span><span class="identifier">digit</span> <span class="special">);</span>
1027 This regular expression will evaluate <code class="computeroutput"><span class="identifier">alpha</span></code>
1028 and <code class="computeroutput"><span class="identifier">digit</span></code> according to
1029 <code class="computeroutput"><span class="identifier">my_locale</span></code>. See the section
1030 on <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.localization_and_regex_traits" title="Localization and Regex Traits">Localization
1031 and Regex Traits</a> for more information about how to customize the
1032 behavior of your regexes.
1035 <a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.h6"></a>
1036 <span class="phrase"><a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.static_xpressive_syntax_cheat_sheet"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.static_xpressive_syntax_cheat_sheet">Static
1037 xpressive Syntax Cheat Sheet</a>
1040 The table below lists the familiar regex constructs and their equivalents
1041 in static xpressive.
1044 <a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.t0"></a><p class="title"><b>Table 46.4. Perl syntax vs. Static xpressive syntax</b></p>
1045 <div class="table-contents"><table class="table" summary="Perl syntax vs. Static xpressive syntax">
1072 <code class="literal">.</code>
1077 <code class="computeroutput"><a class="link" href="../boost/xpressive/_.html" title="Global _">_</a></code>
1082 any character (assuming Perl's /s modifier).
1089 <code class="literal">ab</code>
1094 <code class="computeroutput"><span class="identifier">a</span> <span class="special">>></span>
1095 <span class="identifier">b</span></code>
1100 sequencing of <code class="literal">a</code> and <code class="literal">b</code> sub-expressions.
1107 <code class="literal">a|b</code>
1112 <code class="computeroutput"><span class="identifier">a</span> <span class="special">|</span>
1113 <span class="identifier">b</span></code>
1118 alternation of <code class="literal">a</code> and <code class="literal">b</code>
1126 <code class="literal">(a)</code>
1131 <code class="computeroutput"><span class="special">(</span><a class="link" href="../boost/xpressive/s1.html" title="Global s1">s1</a><span class="special">=</span> <span class="identifier">a</span><span class="special">)</span></code>
1136 group and capture a back-reference.
1143 <code class="literal">(?:a)</code>
1148 <code class="computeroutput"><span class="special">(</span><span class="identifier">a</span><span class="special">)</span></code>
1153 group and do not capture a back-reference.
1160 <code class="literal">\1</code>
1165 <code class="computeroutput"><a class="link" href="../boost/xpressive/s1.html" title="Global s1">s1</a></code>
1170 a previously captured back-reference.
1177 <code class="literal">a*</code>
1182 <code class="computeroutput"><span class="special">*</span><span class="identifier">a</span></code>
1187 zero or more times, greedy.
1194 <code class="literal">a+</code>
1199 <code class="computeroutput"><span class="special">+</span><span class="identifier">a</span></code>
1204 one or more times, greedy.
1211 <code class="literal">a?</code>
1216 <code class="computeroutput"><span class="special">!</span><span class="identifier">a</span></code>
1221 zero or one time, greedy.
1228 <code class="literal">a{n,m}</code>
<code class="computeroutput"><a class="link" href="../boost/xpressive/repeat.html" title="Function repeat">repeat</a><span class="special">&lt;</span><span class="identifier">n</span><span class="special">,</span><span class="identifier">m</span><span class="special">&gt;(</span><span class="identifier">a</span><span class="special">)</span></code>
1238 between <code class="literal">n</code> and <code class="literal">m</code> times,
1246 <code class="literal">a*?</code>
1251 <code class="computeroutput"><span class="special">-*</span><span class="identifier">a</span></code>
1256 zero or more times, non-greedy.
1263 <code class="literal">a+?</code>
1268 <code class="computeroutput"><span class="special">-+</span><span class="identifier">a</span></code>
1273 one or more times, non-greedy.
1280 <code class="literal">a??</code>
1285 <code class="computeroutput"><span class="special">-!</span><span class="identifier">a</span></code>
1290 zero or one time, non-greedy.
1297 <code class="literal">a{n,m}?</code>
<code class="computeroutput"><span class="special">-</span><a class="link" href="../boost/xpressive/repeat.html" title="Function repeat">repeat</a><span class="special">&lt;</span><span class="identifier">n</span><span class="special">,</span><span class="identifier">m</span><span class="special">&gt;(</span><span class="identifier">a</span><span class="special">)</span></code>
1307 between <code class="literal">n</code> and <code class="literal">m</code> times,
1315 <code class="literal">^</code>
1320 <code class="computeroutput"><a class="link" href="../boost/xpressive/bos.html" title="Global bos">bos</a></code>
1325 beginning of sequence assertion.
1332 <code class="literal">$</code>
1337 <code class="computeroutput"><a class="link" href="../boost/xpressive/eos.html" title="Global eos">eos</a></code>
1342 end of sequence assertion.
1349 <code class="literal">\b</code>
1354 <code class="computeroutput"><a class="link" href="../boost/xpressive/_b.html" title="Global _b">_b</a></code>
1359 word boundary assertion.
1366 <code class="literal">\B</code>
1371 <code class="computeroutput"><span class="special">~</span><a class="link" href="../boost/xpressive/_b.html" title="Global _b">_b</a></code>
1376 not word boundary assertion.
1383 <code class="literal">\n</code>
1388 <code class="computeroutput"><a class="link" href="../boost/xpressive/_n.html" title="Global _n">_n</a></code>
1400 <code class="literal">.</code>
1405 <code class="computeroutput"><span class="special">~</span><a class="link" href="../boost/xpressive/_n.html" title="Global _n">_n</a></code>
1410 any character except a literal newline (without Perl's /s modifier).
1417 <code class="literal">\r?\n|\r</code>
1422 <code class="computeroutput"><a class="link" href="../boost/xpressive/_ln.html" title="Global _ln">_ln</a></code>
1434 <code class="literal">[^\r\n]</code>
1439 <code class="computeroutput"><span class="special">~</span><a class="link" href="../boost/xpressive/_ln.html" title="Global _ln">_ln</a></code>
1444 any single character not a logical newline.
1451 <code class="literal">\w</code>
1456 <code class="computeroutput"><a class="link" href="../boost/xpressive/_w.html" title="Global _w">_w</a></code>
1461 a word character, equivalent to set[alnum | '_'].
1468 <code class="literal">\W</code>
1473 <code class="computeroutput"><span class="special">~</span><a class="link" href="../boost/xpressive/_w.html" title="Global _w">_w</a></code>
1478 not a word character, equivalent to ~set[alnum | '_'].
1485 <code class="literal">\d</code>
1490 <code class="computeroutput"><a class="link" href="../boost/xpressive/_d.html" title="Global _d">_d</a></code>
1502 <code class="literal">\D</code>
1507 <code class="computeroutput"><span class="special">~</span><a class="link" href="../boost/xpressive/_d.html" title="Global _d">_d</a></code>
1512 not a digit character.
1519 <code class="literal">\s</code>
1524 <code class="computeroutput"><a class="link" href="../boost/xpressive/_s.html" title="Global _s">_s</a></code>
1536 <code class="literal">\S</code>
1541 <code class="computeroutput"><span class="special">~</span><a class="link" href="../boost/xpressive/_s.html" title="Global _s">_s</a></code>
1546 not a space character.
1553 <code class="literal">[:alnum:]</code>
1558 <code class="computeroutput"><a class="link" href="../boost/xpressive/alnum.html" title="Global alnum">alnum</a></code>
1563 an alpha-numeric character.
1570 <code class="literal">[:alpha:]</code>
1575 <code class="computeroutput"><a class="link" href="../boost/xpressive/alpha.html" title="Global alpha">alpha</a></code>
1580 an alphabetic character.
1587 <code class="literal">[:blank:]</code>
1592 <code class="computeroutput"><a class="link" href="../boost/xpressive/blank.html" title="Global blank">blank</a></code>
1597 a horizontal white-space character.
1604 <code class="literal">[:cntrl:]</code>
1609 <code class="computeroutput"><a class="link" href="../boost/xpressive/cntrl.html" title="Global cntrl">cntrl</a></code>
1614 a control character.
1621 <code class="literal">[:digit:]</code>
1626 <code class="computeroutput"><a class="link" href="../boost/xpressive/digit.html" title="Global digit">digit</a></code>
1638 <code class="literal">[:graph:]</code>
1643 <code class="computeroutput"><a class="link" href="../boost/xpressive/graph.html" title="Global graph">graph</a></code>
a graphic character (printable and not a space).
1655 <code class="literal">[:lower:]</code>
1660 <code class="computeroutput"><a class="link" href="../boost/xpressive/lower.html" title="Global lower">lower</a></code>
1665 a lower-case character.
1672 <code class="literal">[:print:]</code>
1677 <code class="computeroutput"><a class="link" href="../boost/xpressive/print.html" title="Global print">print</a></code>
1682 a printing character.
1689 <code class="literal">[:punct:]</code>
1694 <code class="computeroutput"><a class="link" href="../boost/xpressive/punct.html" title="Global punct">punct</a></code>
1699 a punctuation character.
1706 <code class="literal">[:space:]</code>
1711 <code class="computeroutput"><a class="link" href="../boost/xpressive/space.html" title="Global space">space</a></code>
1716 a white-space character.
1723 <code class="literal">[:upper:]</code>
1728 <code class="computeroutput"><a class="link" href="../boost/xpressive/upper.html" title="Global upper">upper</a></code>
1733 an upper-case character.
1740 <code class="literal">[:xdigit:]</code>
1745 <code class="computeroutput"><a class="link" href="../boost/xpressive/xdigit.html" title="Global xdigit">xdigit</a></code>
1750 a hexadecimal digit character.
1757 <code class="literal">[0-9]</code>
1762 <code class="computeroutput"><a class="link" href="../boost/xpressive/range.html" title="Function template range">range</a><span class="special">(</span><span class="char">'0'</span><span class="special">,</span><span class="char">'9'</span><span class="special">)</span></code>
1767 characters in range <code class="computeroutput"><span class="char">'0'</span></code>
1768 through <code class="computeroutput"><span class="char">'9'</span></code>.
1775 <code class="literal">[abc]</code>
1780 <code class="computeroutput"><span class="identifier">as_xpr</span><span class="special">(</span><span class="char">'a'</span><span class="special">)</span> <span class="special">|</span> <span class="char">'b'</span> <span class="special">|</span><span class="char">'c'</span></code>
1785 characters <code class="computeroutput"><span class="char">'a'</span></code>, <code class="computeroutput"><span class="char">'b'</span></code>, or <code class="computeroutput"><span class="char">'c'</span></code>.
1792 <code class="literal">[abc]</code>
1797 <code class="computeroutput"><span class="special">(</span><a class="link" href="../boost/xpressive/set.html" title="Global set">set</a><span class="special">=</span> <span class="char">'a'</span><span class="special">,</span><span class="char">'b'</span><span class="special">,</span><span class="char">'c'</span><span class="special">)</span></code>
1802 <span class="emphasis"><em>same as above</em></span>
1809 <code class="literal">[0-9abc]</code>
1814 <code class="computeroutput"><a class="link" href="../boost/xpressive/set.html" title="Global set">set</a><span class="special">[</span> <a class="link" href="../boost/xpressive/range.html" title="Function template range">range</a><span class="special">(</span><span class="char">'0'</span><span class="special">,</span><span class="char">'9'</span><span class="special">)</span> <span class="special">|</span>
1815 <span class="char">'a'</span> <span class="special">|</span>
1816 <span class="char">'b'</span> <span class="special">|</span>
1817 <span class="char">'c'</span> <span class="special">]</span></code>
1822 characters <code class="computeroutput"><span class="char">'a'</span></code>, <code class="computeroutput"><span class="char">'b'</span></code>, <code class="computeroutput"><span class="char">'c'</span></code>
1823 or in range <code class="computeroutput"><span class="char">'0'</span></code> through
1824 <code class="computeroutput"><span class="char">'9'</span></code>.
1831 <code class="literal">[0-9abc]</code>
1836 <code class="computeroutput"><a class="link" href="../boost/xpressive/set.html" title="Global set">set</a><span class="special">[</span> <a class="link" href="../boost/xpressive/range.html" title="Function template range">range</a><span class="special">(</span><span class="char">'0'</span><span class="special">,</span><span class="char">'9'</span><span class="special">)</span> <span class="special">|</span>
1837 <span class="special">(</span><a class="link" href="../boost/xpressive/set.html" title="Global set">set</a><span class="special">=</span> <span class="char">'a'</span><span class="special">,</span><span class="char">'b'</span><span class="special">,</span><span class="char">'c'</span><span class="special">)</span> <span class="special">]</span></code>
1842 <span class="emphasis"><em>same as above</em></span>
1849 <code class="literal">[^abc]</code>
1854 <code class="computeroutput"><span class="special">~(</span><a class="link" href="../boost/xpressive/set.html" title="Global set">set</a><span class="special">=</span> <span class="char">'a'</span><span class="special">,</span><span class="char">'b'</span><span class="special">,</span><span class="char">'c'</span><span class="special">)</span></code>
1859 any character other than <code class="computeroutput"><span class="char">'a'</span></code>,
1860 <code class="computeroutput"><span class="char">'b'</span></code>, or <code class="computeroutput"><span class="char">'c'</span></code>.
1867 <code class="literal">(?i:<span class="emphasis"><em>stuff</em></span>)</code>
1872 <code class="computeroutput"><a class="link" href="../boost/xpressive/icase.html" title="Function template icase">icase</a><span class="special">(</span></code><code class="literal"><span class="emphasis"><em>stuff</em></span></code><code class="computeroutput"><span class="special">)</span></code>
1877 match <span class="emphasis"><em>stuff</em></span> disregarding case.
1884 <code class="literal">(?><span class="emphasis"><em>stuff</em></span>)</code>
1889 <code class="computeroutput"><a class="link" href="../boost/xpressive/keep.html" title="Function template keep">keep</a><span class="special">(</span></code><code class="literal"><span class="emphasis"><em>stuff</em></span></code><code class="computeroutput"><span class="special">)</span></code>
1894 independent sub-expression: match <span class="emphasis"><em>stuff</em></span>
1895 and turn off backtracking into it.
1902 <code class="literal">(?=<span class="emphasis"><em>stuff</em></span>)</code>
1907 <code class="computeroutput"><a class="link" href="../boost/xpressive/before.html" title="Function template before">before</a><span class="special">(</span></code><code class="literal"><span class="emphasis"><em>stuff</em></span></code><code class="computeroutput"><span class="special">)</span></code>
1912 positive look-ahead assertion: match only if followed by <span class="emphasis"><em>stuff</em></span>,
1913 but don't include <span class="emphasis"><em>stuff</em></span> in the match.
1920 <code class="literal">(?!<span class="emphasis"><em>stuff</em></span>)</code>
1925 <code class="computeroutput"><span class="special">~</span><a class="link" href="../boost/xpressive/before.html" title="Function template before">before</a><span class="special">(</span></code><code class="literal"><span class="emphasis"><em>stuff</em></span></code><code class="computeroutput"><span class="special">)</span></code>
1930 negative look-ahead assertion: match only if not followed by <span class="emphasis"><em>stuff</em></span>.
1937 <code class="literal">(?<=<span class="emphasis"><em>stuff</em></span>)</code>
1942 <code class="computeroutput"><a class="link" href="../boost/xpressive/after.html" title="Function template after">after</a><span class="special">(</span></code><code class="literal"><span class="emphasis"><em>stuff</em></span></code><code class="computeroutput"><span class="special">)</span></code>
1947 positive look-behind assertion: match only if preceded by <span class="emphasis"><em>stuff</em></span>,
1948 but don't include <span class="emphasis"><em>stuff</em></span> in the match. (<span class="emphasis"><em>stuff</em></span>
1949 must be constant-width.)
1956 <code class="literal">(?<!<span class="emphasis"><em>stuff</em></span>)</code>
1961 <code class="computeroutput"><span class="special">~</span><a class="link" href="../boost/xpressive/after.html" title="Function template after">after</a><span class="special">(</span></code><code class="literal"><span class="emphasis"><em>stuff</em></span></code><code class="computeroutput"><span class="special">)</span></code>
1966 negative look-behind assertion: match only if not preceded by <span class="emphasis"><em>stuff</em></span>.
1967 (<span class="emphasis"><em>stuff</em></span> must be constant-width.)
1974 <code class="literal">(?P<<span class="emphasis"><em>name</em></span>><span class="emphasis"><em>stuff</em></span>)</code>
1979 <code class="computeroutput"><code class="literal"><a class="link" href="../boost/xpressive/mark_tag.html" title="Struct mark_tag">mark_tag</a></code>
1980 </code><code class="literal"><span class="emphasis"><em>name</em></span></code><code class="computeroutput"><span class="special">(</span></code><span class="emphasis"><em>n</em></span><code class="computeroutput"><span class="special">);</span></code><br> ...<br> <code class="computeroutput"><span class="special">(</span></code><code class="literal"><span class="emphasis"><em>name</em></span></code><code class="computeroutput"><span class="special">=</span> </code><code class="literal"><span class="emphasis"><em>stuff</em></span></code><code class="computeroutput"><span class="special">)</span></code>
1985 Create a named capture.
1992 <code class="literal">(?P=<span class="emphasis"><em>name</em></span>)</code>
1997 <code class="computeroutput"><code class="literal"><a class="link" href="../boost/xpressive/mark_tag.html" title="Struct mark_tag">mark_tag</a></code>
1998 </code><code class="literal"><span class="emphasis"><em>name</em></span></code><code class="computeroutput"><span class="special">(</span></code><span class="emphasis"><em>n</em></span><code class="computeroutput"><span class="special">);</span></code><br> ...<br> <code class="literal"><span class="emphasis"><em>name</em></span></code>
2003 Refer back to a previously created named capture.
2010 <br class="table-break"><p>
2014 <div class="section">
2015 <div class="titlepage"><div><div><h4 class="title">
2016 <a name="boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes" title="Dynamic Regexes">Dynamic
2018 </h4></div></div></div>
2020 <a name="boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.h0"></a>
2021 <span class="phrase"><a name="boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.overview"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.overview">Overview</a>
2024 Static regexes are dandy, but sometimes you need something a bit more ...
2025 dynamic. Imagine you are developing a text editor with a regex search/replace
2026 feature. You need to accept a regular expression from the end user as input
2027 at run-time. There should be a way to parse a string into a regular expression.
2028 That's what xpressive's dynamic regexes are for. They are built from the
2029 same core components as their static counterparts, but they are late-bound
2030 so you can specify them at run-time.
2033 <a name="boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.h1"></a>
2034 <span class="phrase"><a name="boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.construction_and_assignment"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.construction_and_assignment">Construction
2038 There are two ways to create a dynamic regex: with the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html#id-1_3_47_5_18_2_1_1_9_1-bb">basic_regex<>::compile()</a></code></code>
2039 function or with the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>
2040 class template. Use <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html#id-1_3_47_5_18_2_1_1_9_1-bb">basic_regex<>::compile()</a></code></code>
2041 if you want the default locale. Use <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>
2042 if you need to specify a different locale. In the section on <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches" title="Grammars and Nested Matches">regex
2043 grammars</a>, we'll see another use for <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>.
2046 Here is an example of using <code class="computeroutput"><span class="identifier">basic_regex</span><span class="special"><>::</span><span class="identifier">compile</span><span class="special">()</span></code>:
2048 <pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"this|that"</span><span class="special">,</span> <span class="identifier">regex_constants</span><span class="special">::</span><span class="identifier">icase</span> <span class="special">);</span>
2051 Here is the same example using <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>:
2053 <pre class="programlisting"><span class="identifier">sregex_compiler</span> <span class="identifier">compiler</span><span class="special">;</span>
2054 <span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="identifier">compiler</span><span class="special">.</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"this|that"</span><span class="special">,</span> <span class="identifier">regex_constants</span><span class="special">::</span><span class="identifier">icase</span> <span class="special">);</span>
2057 <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html#id-1_3_47_5_18_2_1_1_9_1-bb">basic_regex<>::compile()</a></code></code>
2058 is implemented in terms of <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>.
2061 <a name="boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.h2"></a>
2062 <span class="phrase"><a name="boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.dynamic_xpressive_syntax"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.dynamic_xpressive_syntax">Dynamic
2063 xpressive Syntax</a>
2066 Since the dynamic syntax is not constrained by the rules for valid C++
2067 expressions, we are free to use familiar syntax for dynamic regexes. For
2068 this reason, the syntax used by xpressive for dynamic regexes follows the
2069 lead set by John Maddock's <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2003/n1429.htm" target="_top">proposal</a>
2070 to add regular expressions to the Standard Library. It is essentially the
2071 syntax standardized by <a href="http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-262.pdf" target="_top">ECMAScript</a>,
2072 with minor changes in support of internationalization.
2075 Since the syntax is documented exhaustively elsewhere, I will simply refer
2076 you to the existing standards, rather than duplicate the specification
2080 <a name="boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.h3"></a>
2081 <span class="phrase"><a name="boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.internationalization"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.internationalization">Internationalization</a>
2084 As with static regexes, dynamic regexes support internationalization by
2085 allowing you to specify a different <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span></code>.
2086 To do this, you must use <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>.
2087 The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>
2088 class has an <code class="computeroutput"><span class="identifier">imbue</span><span class="special">()</span></code>
2089 function. After you have imbued a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>
2090 object with a custom <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span></code>,
2091 all regex objects compiled by that <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>
2092 will use that locale. For example:
2094 <pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span> <span class="identifier">my_locale</span> <span class="special">=</span> <span class="comment">/* initialize your locale object here */</span><span class="special">;</span>
2095 <span class="identifier">sregex_compiler</span> <span class="identifier">compiler</span><span class="special">;</span>
2096 <span class="identifier">compiler</span><span class="special">.</span><span class="identifier">imbue</span><span class="special">(</span> <span class="identifier">my_locale</span> <span class="special">);</span>
2097 <span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="identifier">compiler</span><span class="special">.</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"\\w+|\\d+"</span> <span class="special">);</span>
2100 This regex will use <code class="computeroutput"><span class="identifier">my_locale</span></code>
2101 when evaluating the intrinsic character sets <code class="computeroutput"><span class="string">"\\w"</span></code>
2102 and <code class="computeroutput"><span class="string">"\\d"</span></code>.
2106 <div class="section">
2107 <div class="titlepage"><div><div><h3 class="title">
2108 <a name="boost_xpressive.user_s_guide.matching_and_searching"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.matching_and_searching" title="Matching and Searching">Matching
2110 </h3></div></div></div>
2112 <a name="boost_xpressive.user_s_guide.matching_and_searching.h0"></a>
2113 <span class="phrase"><a name="boost_xpressive.user_s_guide.matching_and_searching.overview"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.matching_and_searching.overview">Overview</a>
2116 Once you have created a regex object, you can use the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
2117 and <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
2118 algorithms to find patterns in strings. This page covers the basics of regex
2119 matching and searching. In all cases, if you are familiar with how <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
2120 and <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
2121 in the <a href="../../../libs/regex" target="_top">Boost.Regex</a> library work, xpressive's
2122 versions work the same way.
2125 <a name="boost_xpressive.user_s_guide.matching_and_searching.h1"></a>
2126 <span class="phrase"><a name="boost_xpressive.user_s_guide.matching_and_searching.seeing_if_a_string_matches_a_regex"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.matching_and_searching.seeing_if_a_string_matches_a_regex">Seeing
2127 if a String Matches a Regex</a>
2130 The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
2131 algorithm checks to see if a regex matches a given input.
2133 <div class="warning"><table border="0" summary="Warning">
2135 <td rowspan="2" align="center" valign="top" width="25"><img alt="[Warning]" src="../../../doc/src/images/warning.png"></td>
2136 <th align="left">Warning</th>
2138 <tr><td align="left" valign="top"><p>
2139 The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
2140 algorithm will only report success if the regex matches the <span class="emphasis"><em>whole
2141 input</em></span>, from beginning to end. If the regex matches only a part
2142 of the input, <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
2143 will return false. If you want to search through the string looking for
2144 sub-strings that the regex matches, use the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
2149 The input can be a bidirectional range such as <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code>,
2150 a C-style null-terminated string, or a pair of iterators. In all cases, the
2151 type of the iterator used to traverse the input sequence must match the iterator
2152 type used to declare the regex object. (You can use the table in the <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.quick_start.know_your_iterator_type">Quick
2153 Start</a> to find the correct regex type for your iterator.)
2155 <pre class="programlisting"><span class="identifier">cregex</span> <span class="identifier">cre</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">;</span> <span class="comment">// this regex can match C-style strings</span>
2156 <span class="identifier">sregex</span> <span class="identifier">sre</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">;</span> <span class="comment">// this regex can match std::strings</span>
2158 <span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_match</span><span class="special">(</span> <span class="string">"hello"</span><span class="special">,</span> <span class="identifier">cre</span> <span class="special">)</span> <span class="special">)</span> <span class="comment">// OK</span>
2159 <span class="special">{</span> <span class="comment">/*...*/</span> <span class="special">}</span>
2161 <span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_match</span><span class="special">(</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">(</span><span class="string">"hello"</span><span class="special">),</span> <span class="identifier">sre</span> <span class="special">)</span> <span class="special">)</span> <span class="comment">// OK</span>
2162 <span class="special">{</span> <span class="comment">/*...*/</span> <span class="special">}</span>
2164 <span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_match</span><span class="special">(</span> <span class="string">"hello"</span><span class="special">,</span> <span class="identifier">sre</span> <span class="special">)</span> <span class="special">)</span> <span class="comment">// ERROR! iterator mis-match!</span>
2165 <span class="special">{</span> <span class="comment">/*...*/</span> <span class="special">}</span>
2168 The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
2169 algorithm optionally accepts a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
2170 struct as an out parameter. If given, the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
2171 algorithm fills in the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
2172 struct with information about which parts of the regex matched which parts
2175 <pre class="programlisting"><span class="identifier">cmatch</span> <span class="identifier">what</span><span class="special">;</span>
2176 <span class="identifier">cregex</span> <span class="identifier">cre</span> <span class="special">=</span> <span class="special">+(</span><span class="identifier">s1</span><span class="special">=</span> <span class="identifier">_w</span><span class="special">);</span>
2178 <span class="comment">// store the results of the regex_match in "what"</span>
2179 <span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_match</span><span class="special">(</span> <span class="string">"hello"</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">cre</span> <span class="special">)</span> <span class="special">)</span>
2180 <span class="special">{</span>
2181 <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="number">1</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// prints "o"</span>
2182 <span class="special">}</span>
2185 The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
2186 algorithm also optionally accepts a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_constants/match_flag_type.html" title="Type match_flag_type">match_flag_type</a></code></code>
2187 bitmask. With <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_constants/match_flag_type.html" title="Type match_flag_type">match_flag_type</a></code></code>,
2188 you can control certain aspects of how the match is evaluated. See the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_constants/match_flag_type.html" title="Type match_flag_type">match_flag_type</a></code></code>
2189 reference for a complete list of the flags and their meanings.
2191 <pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"hello"</span><span class="special">);</span>
2192 <span class="identifier">sregex</span> <span class="identifier">sre</span> <span class="special">=</span> <span class="identifier">bol</span> <span class="special">>></span> <span class="special">+</span><span class="identifier">_w</span><span class="special">;</span>
2194 <span class="comment">// match_not_bol means that "bol" should not match at [begin,begin)</span>
2195 <span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_match</span><span class="special">(</span> <span class="identifier">str</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">str</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">sre</span><span class="special">,</span> <span class="identifier">regex_constants</span><span class="special">::</span><span class="identifier">match_not_bol</span> <span class="special">)</span> <span class="special">)</span>
2196 <span class="special">{</span>
2197 <span class="comment">// should never get here!!!</span>
2198 <span class="special">}</span>
2201 Click <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.see_if_a_whole_string_matches_a_regex">here</a>
2202 to see a complete example program that shows how to use <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>.
2203 And check the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
2204 reference to see a complete list of the available overloads.
2207 <a name="boost_xpressive.user_s_guide.matching_and_searching.h2"></a>
2208 <span class="phrase"><a name="boost_xpressive.user_s_guide.matching_and_searching.searching_for_matching_sub_strings"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.matching_and_searching.searching_for_matching_sub_strings">Searching
2209 for Matching Sub-Strings</a>
2212 Use <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
2213 when you want to know if an input sequence contains a sub-sequence that a
2214 regex matches. <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
2215 will try to match the regex at the beginning of the input sequence and scan
2216 forward in the sequence until it either finds a match or exhausts the sequence.
2219 In all other regards, <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
2220 behaves like <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
2221 <span class="emphasis"><em>(see above)</em></span>. In particular, it can operate on a bidirectional
2222 range such as <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code>, C-style null-terminated strings
2223 or iterator ranges. The same care must be taken to ensure that the iterator
2224 type of your regex matches the iterator type of your input sequence. As with
2225 <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>,
2226 you can optionally provide a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
2227 struct to receive the results of the search, and a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_constants/match_flag_type.html" title="Type match_flag_type">match_flag_type</a></code></code>
2228 bitmask to control how the match is evaluated.
2231 Click <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.see_if_a_string_contains_a_sub_string_that_matches_a_regex">here</a>
2232 to see a complete example program that shows how to use <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>.
2233 And check the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
2234 reference to see a complete list of the available overloads.
2237 <div class="section">
2238 <div class="titlepage"><div><div><h3 class="title">
2239 <a name="boost_xpressive.user_s_guide.accessing_results"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.accessing_results" title="Accessing Results">Accessing
2241 </h3></div></div></div>
2243 <a name="boost_xpressive.user_s_guide.accessing_results.h0"></a>
2244 <span class="phrase"><a name="boost_xpressive.user_s_guide.accessing_results.overview"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.accessing_results.overview">Overview</a>
2247 Sometimes it is not enough to know whether a call to <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
2248 or <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
2249 succeeded. If you pass an object of type <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
2250 to <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
2251 or <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>,
2252 then after the algorithm has completed successfully the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
2253 will contain extra information about which parts of the regex matched which
2254 parts of the sequence. In Perl, these sub-sequences are called <span class="emphasis"><em>back-references</em></span>,
2255 and they are stored in the variables <code class="literal">$1</code>, <code class="literal">$2</code>,
2256 etc. In xpressive, they are objects of type <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>,
2257 and they are stored in the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
2258 structure, which acts as a vector of <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>
2262 <a name="boost_xpressive.user_s_guide.accessing_results.h1"></a>
2263 <span class="phrase"><a name="boost_xpressive.user_s_guide.accessing_results.match_results"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.accessing_results.match_results">match_results</a>
2266 So, you've passed a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
2267 object to a regex algorithm, and the algorithm has succeeded. Now you want
2268 to examine the results. Most of what you'll be doing with the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
2269 object is indexing into it to access its internally stored <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>
2270 objects, but there are a few other things you can do with a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
2274 The table below shows how to access the information stored in a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
2275 object named <code class="computeroutput"><span class="identifier">what</span></code>.
2278 <a name="boost_xpressive.user_s_guide.accessing_results.t0"></a><p class="title"><b>Table 46.5. match_results<> Accessors</b></p>
2279 <div class="table-contents"><table class="table" summary="match_results<> Accessors">
2300 <code class="computeroutput"><span class="identifier">what</span><span class="special">.</span><span class="identifier">size</span><span class="special">()</span></code>
2305 Returns the number of sub-matches, which is always greater than
2306 zero after a successful match because the full match is stored
2307 in the zero-th sub-match.
2314 <code class="computeroutput"><span class="identifier">what</span><span class="special">[</span><span class="identifier">n</span><span class="special">]</span></code>
2319 Returns the <span class="emphasis"><em>n</em></span>-th sub-match.
2326 <code class="computeroutput"><span class="identifier">what</span><span class="special">.</span><span class="identifier">length</span><span class="special">(</span><span class="identifier">n</span><span class="special">)</span></code>
2331 Returns the length of the <span class="emphasis"><em>n</em></span>-th sub-match.
2332 Same as <code class="computeroutput"><span class="identifier">what</span><span class="special">[</span><span class="identifier">n</span><span class="special">].</span><span class="identifier">length</span><span class="special">()</span></code>.
2339 <code class="computeroutput"><span class="identifier">what</span><span class="special">.</span><span class="identifier">position</span><span class="special">(</span><span class="identifier">n</span><span class="special">)</span></code>
2344 Returns the offset into the input sequence at which the <span class="emphasis"><em>n</em></span>-th
2352 <code class="computeroutput"><span class="identifier">what</span><span class="special">.</span><span class="identifier">str</span><span class="special">(</span><span class="identifier">n</span><span class="special">)</span></code>
2357 Returns a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">basic_string</span><span class="special"><></span></code>
2358 constructed from the <span class="emphasis"><em>n</em></span>-th sub-match. Same
2359 as <code class="computeroutput"><span class="identifier">what</span><span class="special">[</span><span class="identifier">n</span><span class="special">].</span><span class="identifier">str</span><span class="special">()</span></code>.
2366 <code class="computeroutput"><span class="identifier">what</span><span class="special">.</span><span class="identifier">prefix</span><span class="special">()</span></code>
2371 Returns a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>
2372 object which represents the sub-sequence from the beginning of
2373 the input sequence to the start of the full match.
2380 <code class="computeroutput"><span class="identifier">what</span><span class="special">.</span><span class="identifier">suffix</span><span class="special">()</span></code>
2385 Returns a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>
2386 object which represents the sub-sequence from the end of the full
2387 match to the end of the input sequence.
2394 <code class="computeroutput"><span class="identifier">what</span><span class="special">.</span><span class="identifier">regex_id</span><span class="special">()</span></code>
2399 Returns the <code class="computeroutput"><span class="identifier">regex_id</span></code>
2400 of the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>
2401 object that was last used with this <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
2409 <br class="table-break"><p>
2410 There is more you can do with the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
2411 object, but that will be covered when we talk about <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches" title="Grammars and Nested Matches">Grammars
2412 and Nested Matches</a>.
2415 <a name="boost_xpressive.user_s_guide.accessing_results.h2"></a>
2416 <span class="phrase"><a name="boost_xpressive.user_s_guide.accessing_results.sub_match"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.accessing_results.sub_match">sub_match</a>
2419 When you index into a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
2420 object, you get back a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>
2421 object. A <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>
2422 is basically a pair of iterators. It is defined like this:
2424 <pre class="programlisting"><span class="keyword">template</span><span class="special"><</span> <span class="keyword">class</span> <span class="identifier">BidirectionalIterator</span> <span class="special">></span>
2425 <span class="keyword">struct</span> <span class="identifier">sub_match</span>
2426 <span class="special">:</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">pair</span><span class="special"><</span> <span class="identifier">BidirectionalIterator</span><span class="special">,</span> <span class="identifier">BidirectionalIterator</span> <span class="special">></span>
2427 <span class="special">{</span>
2428 <span class="keyword">bool</span> <span class="identifier">matched</span><span class="special">;</span>
2429 <span class="comment">// ...</span>
2430 <span class="special">};</span>
2433 Since it inherits publicly from <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">pair</span><span class="special"><></span></code>, <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>
2434 has <code class="computeroutput"><span class="identifier">first</span></code> and <code class="computeroutput"><span class="identifier">second</span></code> data members of type <code class="computeroutput"><span class="identifier">BidirectionalIterator</span></code>. These are the beginning
2435 and end of the sub-sequence this <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>
2436 represents. <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>
2437 also has a Boolean <code class="computeroutput"><span class="identifier">matched</span></code>
2438 data member, which is true if this <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>
2439 participated in the full match.
2442 The following table shows how you might access the information stored in
2443 a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>
2444 object called <code class="computeroutput"><span class="identifier">sub</span></code>.
2447 <a name="boost_xpressive.user_s_guide.accessing_results.t1"></a><p class="title"><b>Table 46.6. sub_match<> Accessors</b></p>
2448 <div class="table-contents"><table class="table" summary="sub_match<> Accessors">
2469 <code class="computeroutput"><span class="identifier">sub</span><span class="special">.</span><span class="identifier">length</span><span class="special">()</span></code>
2474 Returns the length of the sub-match. Same as <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">distance</span><span class="special">(</span><span class="identifier">sub</span><span class="special">.</span><span class="identifier">first</span><span class="special">,</span><span class="identifier">sub</span><span class="special">.</span><span class="identifier">second</span><span class="special">)</span></code>.
2481 <code class="computeroutput"><span class="identifier">sub</span><span class="special">.</span><span class="identifier">str</span><span class="special">()</span></code>
2486 Returns a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">basic_string</span><span class="special"><></span></code>
2487 constructed from the sub-match. Same as <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">basic_string</span><span class="special"><</span><span class="identifier">char_type</span><span class="special">>(</span><span class="identifier">sub</span><span class="special">.</span><span class="identifier">first</span><span class="special">,</span><span class="identifier">sub</span><span class="special">.</span><span class="identifier">second</span><span class="special">)</span></code>.
2494 <code class="computeroutput"><span class="identifier">sub</span><span class="special">.</span><span class="identifier">compare</span><span class="special">(</span><span class="identifier">str</span><span class="special">)</span></code>
2499 Performs a string comparison between the sub-match and <code class="computeroutput"><span class="identifier">str</span></code>, where <code class="computeroutput"><span class="identifier">str</span></code>
2500 can be a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">basic_string</span><span class="special"><></span></code>,
2501 C-style null-terminated string, or another sub-match. Same as
2502 <code class="computeroutput"><span class="identifier">sub</span><span class="special">.</span><span class="identifier">str</span><span class="special">().</span><span class="identifier">compare</span><span class="special">(</span><span class="identifier">str</span><span class="special">)</span></code>.
2509 <br class="table-break"><h3>
2510 <a name="boost_xpressive.user_s_guide.accessing_results.h3"></a>
2511 <span class="phrase"><a name="boost_xpressive.user_s_guide.accessing_results._inlinemediaobject__imageobject__imagedata_fileref__images_caution_png____imagedata___imageobject__textobject__phrase_caution__phrase___textobject___inlinemediaobject__results_invalidation__inlinemediaobject__imageobject__imagedata_fileref__images_caution_png____imagedata___imageobject__textobject__phrase_caution__phrase___textobject___inlinemediaobject_"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.accessing_results._inlinemediaobject__imageobject__imagedata_fileref__images_caution_png____imagedata___imageobject__textobject__phrase_caution__phrase___textobject___inlinemediaobject__results_invalidation__inlinemediaobject__imageobject__imagedata_fileref__images_caution_png____imagedata___imageobject__textobject__phrase_caution__phrase___textobject___inlinemediaobject_"><span class="inlinemediaobject"><img src="../images/caution.png" alt="caution"></span> Results Invalidation <span class="inlinemediaobject"><img src="../images/caution.png" alt="caution"></span></a>
2514 Results are stored as iterators into the input sequence. Anything which invalidates
2515 the input sequence will invalidate the match results. For instance, if you
2516 match a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code> object, the results are only valid
2517 until your next call to a non-const member function of that <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code>
2518 object. After that, the results held by the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
2519 object are invalid. Don't use them!
2522 <div class="section">
2523 <div class="titlepage"><div><div><h3 class="title">
2524 <a name="boost_xpressive.user_s_guide.string_substitutions"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions" title="String Substitutions">String
2526 </h3></div></div></div>
2528 Regular expressions are not only good for searching text; they're good at
2529 <span class="emphasis"><em>manipulating</em></span> it. And one of the most common text manipulation
2530 tasks is search-and-replace. xpressive provides the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
2531 algorithm for searching and replacing.
2534 <a name="boost_xpressive.user_s_guide.string_substitutions.h0"></a>
2535 <span class="phrase"><a name="boost_xpressive.user_s_guide.string_substitutions.regex_replace__"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions.regex_replace__">regex_replace()</a>
2538 Performing search-and-replace using <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
2539 is simple. All you need is an input sequence, a regex object, and a format
2540 string or a formatter object. There are several versions of the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
2541 algorithm. Some accept the input sequence as a bidirectional container such
2542 as <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code> and return the result in a new
2543 container of the same type. Others accept the input as a null-terminated
2544 string and return a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code>. Still others accept the input sequence
2545 as a pair of iterators and write the result into an output iterator. The
2546 substitution may be specified as a string with format sequences or as a formatter
2547 object. Below are some simple examples of using string-based substitutions.
2549 <pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">input</span><span class="special">(</span><span class="string">"This is his face"</span><span class="special">);</span>
2550 <span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="identifier">as_xpr</span><span class="special">(</span><span class="string">"his"</span><span class="special">);</span> <span class="comment">// find all occurrences of "his" ...</span>
2551 <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">format</span><span class="special">(</span><span class="string">"her"</span><span class="special">);</span> <span class="comment">// ... and replace them with "her"</span>
2553 <span class="comment">// use the version of regex_replace() that operates on strings</span>
2554 <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">output</span> <span class="special">=</span> <span class="identifier">regex_replace</span><span class="special">(</span> <span class="identifier">input</span><span class="special">,</span> <span class="identifier">re</span><span class="special">,</span> <span class="identifier">format</span> <span class="special">);</span>
2555 <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">output</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
2557 <span class="comment">// use the version of regex_replace() that operates on iterators</span>
2558 <span class="identifier">std</span><span class="special">::</span><span class="identifier">ostream_iterator</span><span class="special"><</span> <span class="keyword">char</span> <span class="special">></span> <span class="identifier">out_iter</span><span class="special">(</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">);</span>
2559 <span class="identifier">regex_replace</span><span class="special">(</span> <span class="identifier">out_iter</span><span class="special">,</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">re</span><span class="special">,</span> <span class="identifier">format</span> <span class="special">);</span>
2562 The above program prints out the following:
2564 <pre class="programlisting">Ther is her face
2568 Notice that <span class="emphasis"><em>all</em></span> the occurrences of <code class="computeroutput"><span class="string">"his"</span></code>
2569 have been replaced with <code class="computeroutput"><span class="string">"her"</span></code>.
2572 Click <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.replace_all_sub_strings_that_match_a_regex">here</a>
2573 to see a complete example program that shows how to use <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>.
2574 And check the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
2575 reference to see a complete list of the available overloads.
2578 <a name="boost_xpressive.user_s_guide.string_substitutions.h1"></a>
2579 <span class="phrase"><a name="boost_xpressive.user_s_guide.string_substitutions.replace_options"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions.replace_options">Replace
2583 The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
2584 algorithm takes an optional bitmask parameter to control the formatting.
2585 The possible values of the bitmask are:
2588 <a name="boost_xpressive.user_s_guide.string_substitutions.t0"></a><p class="title"><b>Table 46.7. Format Flags</b></p>
2589 <div class="table-contents"><table class="table" summary="Format Flags">
2610 <code class="computeroutput"><span class="identifier">format_default</span></code>
2615 Recognize the ECMA-262 format sequences (see below).
2622 <code class="computeroutput"><span class="identifier">format_first_only</span></code>
2627 Only replace the first match, not all of them.
2634 <code class="computeroutput"><span class="identifier">format_no_copy</span></code>
2639 Don't copy the parts of the input sequence that didn't match the
2640 regex to the output sequence.
2647 <code class="computeroutput"><span class="identifier">format_literal</span></code>
2652 Treat the format string as a literal; that is, don't recognize
2653 any escape sequences.
2660 <code class="computeroutput"><span class="identifier">format_perl</span></code>
2665 Recognize the Perl format sequences (see below).
2672 <code class="computeroutput"><span class="identifier">format_sed</span></code>
2677 Recognize the sed format sequences (see below).
2684 <code class="computeroutput"><span class="identifier">format_all</span></code>
2689 In addition to the Perl format sequences, recognize some Boost-specific
2697 <br class="table-break"><p>
2698 These flags live in the <code class="computeroutput"><span class="identifier">xpressive</span><span class="special">::</span><span class="identifier">regex_constants</span></code>
2699 namespace. If the substitution parameter is a function object instead of
2700 a string, the flags <code class="computeroutput"><span class="identifier">format_literal</span></code>,
2701 <code class="computeroutput"><span class="identifier">format_perl</span></code>, <code class="computeroutput"><span class="identifier">format_sed</span></code>, and <code class="computeroutput"><span class="identifier">format_all</span></code>
2705 <a name="boost_xpressive.user_s_guide.string_substitutions.h2"></a>
2706 <span class="phrase"><a name="boost_xpressive.user_s_guide.string_substitutions.the_ecma_262_format_sequences"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions.the_ecma_262_format_sequences">The
2707 ECMA-262 Format Sequences</a>
2710 When you haven't specified a substitution string dialect with one of the
2711 format flags above, you get the dialect defined by ECMA-262, the standard
2712 for ECMAScript. The table below shows the escape sequences recognized in
2716 <a name="boost_xpressive.user_s_guide.string_substitutions.t1"></a><p class="title"><b>Table 46.8. Format Escape Sequences</b></p>
2717 <div class="table-contents"><table class="table" summary="Format Escape Sequences">
2738 <code class="literal">$1</code>, <code class="literal">$2</code>, etc.
2743 the corresponding sub-match
2750 <code class="literal">$&</code>
2762 <code class="literal">$`</code>
2774 <code class="literal">$'</code>
2786 <code class="literal">$$</code>
2791 a literal <code class="computeroutput"><span class="char">'$'</span></code> character
2798 <br class="table-break"><p>
2799 Any other sequence beginning with <code class="computeroutput"><span class="char">'$'</span></code>
2800 simply represents itself. For example, if the format string were <code class="computeroutput"><span class="string">"$a"</span></code> then <code class="computeroutput"><span class="string">"$a"</span></code>
2801 would be inserted into the output sequence.
2804 <a name="boost_xpressive.user_s_guide.string_substitutions.h3"></a>
2805 <span class="phrase"><a name="boost_xpressive.user_s_guide.string_substitutions.the_sed_format_sequences"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions.the_sed_format_sequences">The
2806 Sed Format Sequences</a>
2809 When specifying the <code class="computeroutput"><span class="identifier">format_sed</span></code>
2810 flag to <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>,
2811 the following escape sequences are recognized:
2814 <a name="boost_xpressive.user_s_guide.string_substitutions.t2"></a><p class="title"><b>Table 46.9. Sed Format Escape Sequences</b></p>
2815 <div class="table-contents"><table class="table" summary="Sed Format Escape Sequences">
2836 <code class="literal">\1</code>, <code class="literal">\2</code>, etc.
2841 The corresponding sub-match
2848 <code class="literal">&</code>
2860 <code class="literal">\a</code>
2865 A literal <code class="computeroutput"><span class="char">'\a'</span></code>
2872 <code class="literal">\e</code>
2877 A literal <code class="computeroutput"><span class="identifier">char_type</span><span class="special">(</span><span class="number">27</span><span class="special">)</span></code>
2884 <code class="literal">\f</code>
2889 A literal <code class="computeroutput"><span class="char">'\f'</span></code>
2896 <code class="literal">\n</code>
2901 A literal <code class="computeroutput"><span class="char">'\n'</span></code>
2908 <code class="literal">\r</code>
2913 A literal <code class="computeroutput"><span class="char">'\r'</span></code>
2920 <code class="literal">\t</code>
2925 A literal <code class="computeroutput"><span class="char">'\t'</span></code>
2932 <code class="literal">\v</code>
2937 A literal <code class="computeroutput"><span class="char">'\v'</span></code>
2944 <code class="literal">\xFF</code>
2949 A literal <code class="computeroutput"><span class="identifier">char_type</span><span class="special">(</span><span class="number">0xFF</span><span class="special">)</span></code>, where <code class="literal"><span class="emphasis"><em>F</em></span></code>
2957 <code class="literal">\x{FFFF}</code>
2962 A literal <code class="computeroutput"><span class="identifier">char_type</span><span class="special">(</span><span class="number">0xFFFF</span><span class="special">)</span></code>, where <code class="literal"><span class="emphasis"><em>F</em></span></code>
2970 <code class="literal">\cX</code>
2975 The control character <code class="literal"><span class="emphasis"><em>X</em></span></code>
2982 <br class="table-break"><h3>
2983 <a name="boost_xpressive.user_s_guide.string_substitutions.h4"></a>
2984 <span class="phrase"><a name="boost_xpressive.user_s_guide.string_substitutions.the_perl_format_sequences"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions.the_perl_format_sequences">The
Perl Format Sequences</a></h3>
When you specify the <code class="computeroutput"><span class="identifier">format_perl</span></code>
2989 flag to <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>,
2990 the following escape sequences are recognized:
<a name="boost_xpressive.user_s_guide.string_substitutions.t3"></a><p class="title"><b>Table 46.10. Perl Format Escape Sequences</b></p>
<div class="table-contents"><table class="table" summary="Perl Format Escape Sequences">
<thead><tr><th>Escape Sequence</th><th>Meaning</th></tr></thead>
<tbody>
<tr><td><code class="literal">$1</code>, <code class="literal">$2</code>, etc.</td><td>The corresponding sub-match</td></tr>
<tr><td><code class="literal">$&amp;</code></td><td>The full match</td></tr>
<tr><td><code class="literal">$`</code></td><td>The match prefix</td></tr>
<tr><td><code class="literal">$'</code></td><td>The match suffix</td></tr>
<tr><td><code class="literal">$$</code></td><td>A literal <code class="computeroutput"><span class="char">'$'</span></code> character</td></tr>
<tr><td><code class="literal">\a</code></td><td>A literal <code class="computeroutput"><span class="char">'\a'</span></code></td></tr>
<tr><td><code class="literal">\e</code></td><td>A literal <code class="computeroutput"><span class="identifier">char_type</span><span class="special">(</span><span class="number">27</span><span class="special">)</span></code></td></tr>
<tr><td><code class="literal">\f</code></td><td>A literal <code class="computeroutput"><span class="char">'\f'</span></code></td></tr>
<tr><td><code class="literal">\n</code></td><td>A literal <code class="computeroutput"><span class="char">'\n'</span></code></td></tr>
<tr><td><code class="literal">\r</code></td><td>A literal <code class="computeroutput"><span class="char">'\r'</span></code></td></tr>
<tr><td><code class="literal">\t</code></td><td>A literal <code class="computeroutput"><span class="char">'\t'</span></code></td></tr>
<tr><td><code class="literal">\v</code></td><td>A literal <code class="computeroutput"><span class="char">'\v'</span></code></td></tr>
<tr><td><code class="literal">\xFF</code></td><td>A literal <code class="computeroutput"><span class="identifier">char_type</span><span class="special">(</span><span class="number">0xFF</span><span class="special">)</span></code>, where <code class="literal"><span class="emphasis"><em>F</em></span></code> is any hex digit</td></tr>
<tr><td><code class="literal">\x{FFFF}</code></td><td>A literal <code class="computeroutput"><span class="identifier">char_type</span><span class="special">(</span><span class="number">0xFFFF</span><span class="special">)</span></code>, where <code class="literal"><span class="emphasis"><em>F</em></span></code> is any hex digit</td></tr>
<tr><td><code class="literal">\cX</code></td><td>The control character <code class="literal"><span class="emphasis"><em>X</em></span></code></td></tr>
<tr><td><code class="literal">\l</code></td><td>Make the next character lowercase</td></tr>
<tr><td><code class="literal">\L</code></td><td>Make the rest of the substitution lowercase until the next <code class="literal">\E</code></td></tr>
<tr><td><code class="literal">\u</code></td><td>Make the next character uppercase</td></tr>
<tr><td><code class="literal">\U</code></td><td>Make the rest of the substitution uppercase until the next <code class="literal">\E</code></td></tr>
<tr><td><code class="literal">\E</code></td><td>Terminate <code class="literal">\L</code> or <code class="literal">\U</code></td></tr>
<tr><td><code class="literal">\1</code>, <code class="literal">\2</code>, etc.</td><td>The corresponding sub-match</td></tr>
<tr><td><code class="literal">\g&lt;name&gt;</code></td><td>The named backref <span class="emphasis"><em>name</em></span></td></tr>
</tbody>
</table></div>
3281 <br class="table-break"><h3>
3282 <a name="boost_xpressive.user_s_guide.string_substitutions.h5"></a>
3283 <span class="phrase"><a name="boost_xpressive.user_s_guide.string_substitutions.the_boost_specific_format_sequences"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions.the_boost_specific_format_sequences">The
Boost-Specific Format Sequences</a></h3>
When you specify the <code class="computeroutput"><span class="identifier">format_all</span></code>
3288 flag to <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>,
3289 the escape sequences recognized are the same as those above for <code class="computeroutput"><span class="identifier">format_perl</span></code>. In addition, conditional expressions
3290 of the following form are recognized:
<pre class="programlisting">?Ntrue-expression:false-expression</pre>
3295 where <span class="emphasis"><em>N</em></span> is a decimal digit representing a sub-match.
3296 If the corresponding sub-match participated in the full match, then the substitution
3297 is <span class="emphasis"><em>true-expression</em></span>. Otherwise, it is <span class="emphasis"><em>false-expression</em></span>.
3298 In this mode, you can use parens <code class="literal">()</code> for grouping. If you
3299 want a literal paren, you must escape it as <code class="literal">\(</code>.
<h3>
<a name="boost_xpressive.user_s_guide.string_substitutions.h6"></a>
<span class="phrase"><a name="boost_xpressive.user_s_guide.string_substitutions.formatter_objects"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions.formatter_objects">Formatter
Objects</a>
</h3>
3307 Format strings are not always expressive enough for all your text substitution
3308 needs. Consider the simple example of wanting to map input strings to output
3309 strings, as you may want to do with environment variables. Rather than a
3310 format <span class="emphasis"><em>string</em></span>, for this you would use a formatter <span class="emphasis"><em>object</em></span>.
3311 Consider the following code, which finds embedded environment variables of
3312 the form <code class="computeroutput"><span class="string">"$(XYZ)"</span></code> and
computes the substitution string by looking up the environment variable in a map.
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">map</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">string</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">iostream</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">;</span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">xpressive</span><span class="special">;</span>

<span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special">&lt;</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">&gt;</span> <span class="identifier">env</span><span class="special">;</span>

<span class="comment">// look up the captured variable name (sub-match 1) in the global map</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="keyword">const</span> <span class="special">&amp;</span><span class="identifier">format_fun</span><span class="special">(</span><span class="identifier">smatch</span> <span class="keyword">const</span> <span class="special">&amp;</span><span class="identifier">what</span><span class="special">)</span>
<span class="special">{</span>
    <span class="keyword">return</span> <span class="identifier">env</span><span class="special">[</span><span class="identifier">what</span><span class="special">[</span><span class="number">1</span><span class="special">].</span><span class="identifier">str</span><span class="special">()];</span>
<span class="special">}</span>

<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
    <span class="identifier">env</span><span class="special">[</span><span class="string">"X"</span><span class="special">]</span> <span class="special">=</span> <span class="string">"this"</span><span class="special">;</span>
    <span class="identifier">env</span><span class="special">[</span><span class="string">"Y"</span><span class="special">]</span> <span class="special">=</span> <span class="string">"that"</span><span class="special">;</span>

    <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">input</span><span class="special">(</span><span class="string">"\"$(X)\" has the value \"$(Y)\""</span><span class="special">);</span>

    <span class="comment">// replace strings like "$(XYZ)" with the result of env["XYZ"]</span>
    <span class="identifier">sregex</span> <span class="identifier">envar</span> <span class="special">=</span> <span class="string">"$("</span> <span class="special">&gt;&gt;</span> <span class="special">(</span><span class="identifier">s1</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">)</span> <span class="special">&gt;&gt;</span> <span class="char">')'</span><span class="special">;</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">output</span> <span class="special">=</span> <span class="identifier">regex_replace</span><span class="special">(</span><span class="identifier">input</span><span class="special">,</span> <span class="identifier">envar</span><span class="special">,</span> <span class="identifier">format_fun</span><span class="special">);</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">output</span> <span class="special">&lt;&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
<span class="special">}</span>
</pre>
In this case, we use a function, <code class="computeroutput"><span class="identifier">format_fun</span><span class="special">()</span></code>, to compute the substitution string on the
3345 fly. It accepts a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
3346 object which contains the results of the current match. <code class="computeroutput"><span class="identifier">format_fun</span><span class="special">()</span></code> uses the first submatch as a key into the
global <code class="computeroutput"><span class="identifier">env</span></code> map. The above code displays:
<pre class="programlisting">"this" has the value "that"</pre>
3353 The formatter need not be an ordinary function. It may be an object of class
3354 type. And rather than return a string, it may accept an output iterator into
3355 which it writes the substitution. Consider the following, which is functionally
3356 equivalent to the above.
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">map</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">string</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">algorithm</span><span class="special">&gt;</span> <span class="comment">// for std::copy</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">iostream</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">;</span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">xpressive</span><span class="special">;</span>

<span class="keyword">struct</span> <span class="identifier">formatter</span>
<span class="special">{</span>
    <span class="keyword">typedef</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special">&lt;</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">&gt;</span> <span class="identifier">env_map</span><span class="special">;</span>
    <span class="identifier">env_map</span> <span class="identifier">env</span><span class="special">;</span>

    <span class="comment">// write the replacement for the current match into out and return out</span>
    <span class="keyword">template</span><span class="special">&lt;</span><span class="keyword">typename</span> <span class="identifier">Out</span><span class="special">&gt;</span>
    <span class="identifier">Out</span> <span class="keyword">operator</span><span class="special">()(</span><span class="identifier">smatch</span> <span class="keyword">const</span> <span class="special">&amp;</span><span class="identifier">what</span><span class="special">,</span> <span class="identifier">Out</span> <span class="identifier">out</span><span class="special">)</span> <span class="keyword">const</span>
    <span class="special">{</span>
        <span class="identifier">env_map</span><span class="special">::</span><span class="identifier">const_iterator</span> <span class="identifier">where</span> <span class="special">=</span> <span class="identifier">env</span><span class="special">.</span><span class="identifier">find</span><span class="special">(</span><span class="identifier">what</span><span class="special">[</span><span class="number">1</span><span class="special">]);</span>
        <span class="keyword">if</span><span class="special">(</span><span class="identifier">where</span> <span class="special">!=</span> <span class="identifier">env</span><span class="special">.</span><span class="identifier">end</span><span class="special">())</span>
        <span class="special">{</span>
            <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="keyword">const</span> <span class="special">&amp;</span><span class="identifier">sub</span> <span class="special">=</span> <span class="identifier">where</span><span class="special">-&gt;</span><span class="identifier">second</span><span class="special">;</span>
            <span class="identifier">out</span> <span class="special">=</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">copy</span><span class="special">(</span><span class="identifier">sub</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">sub</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">out</span><span class="special">);</span>
        <span class="special">}</span>
        <span class="keyword">return</span> <span class="identifier">out</span><span class="special">;</span>
    <span class="special">}</span>

<span class="special">};</span>

<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
    <span class="identifier">formatter</span> <span class="identifier">fmt</span><span class="special">;</span>
    <span class="identifier">fmt</span><span class="special">.</span><span class="identifier">env</span><span class="special">[</span><span class="string">"X"</span><span class="special">]</span> <span class="special">=</span> <span class="string">"this"</span><span class="special">;</span>
    <span class="identifier">fmt</span><span class="special">.</span><span class="identifier">env</span><span class="special">[</span><span class="string">"Y"</span><span class="special">]</span> <span class="special">=</span> <span class="string">"that"</span><span class="special">;</span>

    <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">input</span><span class="special">(</span><span class="string">"\"$(X)\" has the value \"$(Y)\""</span><span class="special">);</span>

    <span class="identifier">sregex</span> <span class="identifier">envar</span> <span class="special">=</span> <span class="string">"$("</span> <span class="special">&gt;&gt;</span> <span class="special">(</span><span class="identifier">s1</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">)</span> <span class="special">&gt;&gt;</span> <span class="char">')'</span><span class="special">;</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">output</span> <span class="special">=</span> <span class="identifier">regex_replace</span><span class="special">(</span><span class="identifier">input</span><span class="special">,</span> <span class="identifier">envar</span><span class="special">,</span> <span class="identifier">fmt</span><span class="special">);</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">output</span> <span class="special">&lt;&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
<span class="special">}</span>
</pre>
3398 The formatter must be a callable object -- a function or a function object
3399 -- that has one of three possible signatures, detailed in the table below.
3400 For the table, <code class="computeroutput"><span class="identifier">fmt</span></code> is a function
3401 pointer or function object, <code class="computeroutput"><span class="identifier">what</span></code>
3402 is a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
3403 object, <code class="computeroutput"><span class="identifier">out</span></code> is an OutputIterator,
3404 and <code class="computeroutput"><span class="identifier">flags</span></code> is a value of
3405 <code class="computeroutput"><span class="identifier">regex_constants</span><span class="special">::</span><span class="identifier">match_flag_type</span></code>:
<a name="boost_xpressive.user_s_guide.string_substitutions.t4"></a><p class="title"><b>Table 46.11. Formatter Signatures</b></p>
<div class="table-contents"><table class="table" summary="Formatter Signatures">
<thead><tr><th>Formatter Invocation</th><th>Return Type</th><th>Semantics</th></tr></thead>
<tbody>
<tr>
<td><code class="computeroutput"><span class="identifier">fmt</span><span class="special">(</span><span class="identifier">what</span><span class="special">)</span></code></td>
<td>Range of characters (e.g. <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code>) or null-terminated string</td>
<td>The string matched by the regex is replaced with the string returned by the formatter.</td>
</tr>
<tr>
<td><code class="computeroutput"><span class="identifier">fmt</span><span class="special">(</span><span class="identifier">what</span><span class="special">,</span> <span class="identifier">out</span><span class="special">)</span></code></td>
<td>OutputIterator</td>
<td>The formatter writes the replacement string into <code class="computeroutput"><span class="identifier">out</span></code> and returns <code class="computeroutput"><span class="identifier">out</span></code>.</td>
</tr>
<tr>
<td><code class="computeroutput"><span class="identifier">fmt</span><span class="special">(</span><span class="identifier">what</span><span class="special">,</span> <span class="identifier">out</span><span class="special">,</span> <span class="identifier">flags</span><span class="special">)</span></code></td>
<td>OutputIterator</td>
<td>The formatter writes the replacement string into <code class="computeroutput"><span class="identifier">out</span></code> and returns <code class="computeroutput"><span class="identifier">out</span></code>. The <code class="computeroutput"><span class="identifier">flags</span></code> parameter is the value of the match flags passed to the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code> algorithm.</td>
</tr>
</tbody>
</table></div>
3494 <br class="table-break"><h3>
3495 <a name="boost_xpressive.user_s_guide.string_substitutions.h7"></a>
<span class="phrase"><a name="boost_xpressive.user_s_guide.string_substitutions.formatter_expressions"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions.formatter_expressions">Formatter
Expressions</a></h3>
3500 In addition to format <span class="emphasis"><em>strings</em></span> and formatter <span class="emphasis"><em>objects</em></span>,
3501 <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
3502 also accepts formatter <span class="emphasis"><em>expressions</em></span>. A formatter expression
3503 is a lambda expression that generates a string. It uses the same syntax as
3504 that for <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions" title="Semantic Actions and User-Defined Assertions">Semantic
3505 Actions</a>, which are covered later. The above example, which uses <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
3506 to substitute strings for environment variables, is repeated here using a
3507 formatter expression.
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">map</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">string</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">iostream</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">regex_actions</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>

<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special">&lt;</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">&gt;</span> <span class="identifier">env</span><span class="special">;</span>
    <span class="identifier">env</span><span class="special">[</span><span class="string">"X"</span><span class="special">]</span> <span class="special">=</span> <span class="string">"this"</span><span class="special">;</span>
    <span class="identifier">env</span><span class="special">[</span><span class="string">"Y"</span><span class="special">]</span> <span class="special">=</span> <span class="string">"that"</span><span class="special">;</span>

    <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">input</span><span class="special">(</span><span class="string">"\"$(X)\" has the value \"$(Y)\""</span><span class="special">);</span>

    <span class="identifier">sregex</span> <span class="identifier">envar</span> <span class="special">=</span> <span class="string">"$("</span> <span class="special">&gt;&gt;</span> <span class="special">(</span><span class="identifier">s1</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">)</span> <span class="special">&gt;&gt;</span> <span class="char">')'</span><span class="special">;</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">output</span> <span class="special">=</span> <span class="identifier">regex_replace</span><span class="special">(</span><span class="identifier">input</span><span class="special">,</span> <span class="identifier">envar</span><span class="special">,</span> <span class="identifier">ref</span><span class="special">(</span><span class="identifier">env</span><span class="special">)[</span><span class="identifier">s1</span><span class="special">]);</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">output</span> <span class="special">&lt;&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
<span class="special">}</span>
</pre>
3530 In the above, the formatter expression is <code class="computeroutput"><span class="identifier">ref</span><span class="special">(</span><span class="identifier">env</span><span class="special">)[</span><span class="identifier">s1</span><span class="special">]</span></code>. This
3531 means to use the value of the first submatch, <code class="computeroutput"><span class="identifier">s1</span></code>,
3532 as a key into the <code class="computeroutput"><span class="identifier">env</span></code> map.
3533 The purpose of <code class="computeroutput"><span class="identifier">xpressive</span><span class="special">::</span><span class="identifier">ref</span><span class="special">()</span></code>
3534 here is to make the reference to the <code class="computeroutput"><span class="identifier">env</span></code>
3535 local variable <span class="emphasis"><em>lazy</em></span> so that the index operation is deferred
until we know what to replace <code class="computeroutput"><span class="identifier">s1</span></code> with.
3540 <div class="section">
3541 <div class="titlepage"><div><div><h3 class="title">
3542 <a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_splitting_and_tokenization" title="String Splitting and Tokenization">String
3543 Splitting and Tokenization</a>
3544 </h3></div></div></div>
3546 <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
3547 is the Ginsu knife of the text manipulation world. It slices! It dices! This
section describes how to use the highly configurable <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator&lt;&gt;</a></code></code>
3549 to chop up input sequences.
3552 <a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization.h0"></a>
3553 <span class="phrase"><a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization.overview"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_splitting_and_tokenization.overview">Overview</a>
3556 You initialize a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
3557 with an input sequence, a regex, and some optional configuration parameters.
3558 The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
3559 will use <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
3560 to find the first place in the sequence that the regex matches. When dereferenced,
3561 the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
3562 returns a <span class="emphasis"><em>token</em></span> in the form of a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">basic_string</span><span class="special"><></span></code>. Which string it returns depends
3563 on the configuration parameters. By default it returns a string corresponding
3564 to the full match, but it could also return a string corresponding to a particular
3565 marked sub-expression, or even the part of the sequence that <span class="emphasis"><em>didn't</em></span>
3566 match. When you increment the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>,
3567 it will move to the next token. Which token is next depends on the configuration
3568 parameters. It could simply be a different marked sub-expression in the current
3569 match, or it could be part or all of the next match. Or it could be the part
3570 that <span class="emphasis"><em>didn't</em></span> match.
3573 As you can see, <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
can do a lot. That makes it hard to describe, but some examples should make it clear.
3578 <a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization.h1"></a>
3579 <span class="phrase"><a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization.example_1__simple_tokenization"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_splitting_and_tokenization.example_1__simple_tokenization">Example
3580 1: Simple Tokenization</a>
3583 This example uses <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
3584 to chop a sequence into a series of tokens consisting of words.
3586 <pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">input</span><span class="special">(</span><span class="string">"This is his face"</span><span class="special">);</span>
3587 <span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">;</span> <span class="comment">// find a word</span>
3589 <span class="comment">// iterate over all the words in the input</span>
3590 <span class="identifier">sregex_token_iterator</span> <span class="identifier">begin</span><span class="special">(</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">re</span> <span class="special">),</span> <span class="identifier">end</span><span class="special">;</span>
3592 <span class="comment">// write all the words to std::cout</span>
3593 <span class="identifier">std</span><span class="special">::</span><span class="identifier">ostream_iterator</span><span class="special"><</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="special">></span> <span class="identifier">out_iter</span><span class="special">(</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span><span class="special">,</span> <span class="string">"\n"</span> <span class="special">);</span>
3594 <span class="identifier">std</span><span class="special">::</span><span class="identifier">copy</span><span class="special">(</span> <span class="identifier">begin</span><span class="special">,</span> <span class="identifier">end</span><span class="special">,</span> <span class="identifier">out_iter</span> <span class="special">);</span>
3597 This program displays the following:
<pre class="programlisting">This
is
his
face
</pre>
3605 <a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization.h2"></a>
3606 <span class="phrase"><a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization.example_2__simple_tokenization__reloaded"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_splitting_and_tokenization.example_2__simple_tokenization__reloaded">Example
3607 2: Simple Tokenization, Reloaded</a>
3610 This example also uses <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
3611 to chop a sequence into a series of tokens consisting of words, but it uses
3612 the regex as a delimiter. When we pass a <code class="computeroutput"><span class="special">-</span><span class="number">1</span></code> as the last parameter to the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
3613 constructor, it instructs the token iterator to consider as tokens those
3614 parts of the input that <span class="emphasis"><em>didn't</em></span> match the regex.
3616 <pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">input</span><span class="special">(</span><span class="string">"This is his face"</span><span class="special">);</span>
3617 <span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">_s</span><span class="special">;</span> <span class="comment">// find white space</span>
3619 <span class="comment">// iterate over all non-white space in the input. Note the -1 below:</span>
3620 <span class="identifier">sregex_token_iterator</span> <span class="identifier">begin</span><span class="special">(</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">re</span><span class="special">,</span> <span class="special">-</span><span class="number">1</span> <span class="special">),</span> <span class="identifier">end</span><span class="special">;</span>
3622 <span class="comment">// write all the words to std::cout</span>
3623 <span class="identifier">std</span><span class="special">::</span><span class="identifier">ostream_iterator</span><span class="special"><</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="special">></span> <span class="identifier">out_iter</span><span class="special">(</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span><span class="special">,</span> <span class="string">"\n"</span> <span class="special">);</span>
3624 <span class="identifier">std</span><span class="special">::</span><span class="identifier">copy</span><span class="special">(</span> <span class="identifier">begin</span><span class="special">,</span> <span class="identifier">end</span><span class="special">,</span> <span class="identifier">out_iter</span> <span class="special">);</span>
3627 This program displays the following:
<pre class="programlisting">This
is
his
face
</pre>
3635 <a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization.h3"></a>
3636 <span class="phrase"><a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization.example_3__simple_tokenization__revolutions"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_splitting_and_tokenization.example_3__simple_tokenization__revolutions">Example
3637 3: Simple Tokenization, Revolutions</a>
3640 This example also uses <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
3641 to chop a sequence containing a bunch of dates into a series of tokens consisting
3642 of just the years. When we pass a positive integer <code class="literal"><span class="emphasis"><em>N</em></span></code>
3643 as the last parameter to the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
3644 constructor, it instructs the token iterator to consider as tokens only the
<code class="literal"><span class="emphasis"><em>N</em></span></code>-th marked sub-expression of each match.
3648 <pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">input</span><span class="special">(</span><span class="string">"01/02/2003 blahblah 04/23/1999 blahblah 11/13/1981"</span><span class="special">);</span>
3649 <span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span><span class="string">"(\\d{2})/(\\d{2})/(\\d{4})"</span><span class="special">);</span> <span class="comment">// find a date</span>
3651 <span class="comment">// iterate over all the years in the input. Note the 3 below, corresponding to the 3rd sub-expression:</span>
3652 <span class="identifier">sregex_token_iterator</span> <span class="identifier">begin</span><span class="special">(</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">re</span><span class="special">,</span> <span class="number">3</span> <span class="special">),</span> <span class="identifier">end</span><span class="special">;</span>
<span class="comment">// write all the years to std::cout</span>
3655 <span class="identifier">std</span><span class="special">::</span><span class="identifier">ostream_iterator</span><span class="special"><</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="special">></span> <span class="identifier">out_iter</span><span class="special">(</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span><span class="special">,</span> <span class="string">"\n"</span> <span class="special">);</span>
3656 <span class="identifier">std</span><span class="special">::</span><span class="identifier">copy</span><span class="special">(</span> <span class="identifier">begin</span><span class="special">,</span> <span class="identifier">end</span><span class="special">,</span> <span class="identifier">out_iter</span> <span class="special">);</span>
3659 This program displays the following:
<pre class="programlisting">2003
1999
1981
</pre>
3666 <a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization.h4"></a>
3667 <span class="phrase"><a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization.example_4__not_so_simple_tokenization"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_splitting_and_tokenization.example_4__not_so_simple_tokenization">Example
3668 4: Not-So-Simple Tokenization</a>
3671 This example is like the previous one, except that instead of tokenizing
3672 just the years, this program turns the days, months and years into tokens.
3673 When we pass an array of integers <code class="literal"><span class="emphasis"><em>{I,J,...}</em></span></code>
3674 as the last parameter to the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
3675 constructor, it instructs the token iterator to consider as tokens the <code class="literal"><span class="emphasis"><em>I</em></span></code>-th,
<code class="literal"><span class="emphasis"><em>J</em></span></code>-th, etc. marked sub-expression of each match.
3679 <pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">input</span><span class="special">(</span><span class="string">"01/02/2003 blahblah 04/23/1999 blahblah 11/13/1981"</span><span class="special">);</span>
3680 <span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span><span class="string">"(\\d{2})/(\\d{2})/(\\d{4})"</span><span class="special">);</span> <span class="comment">// find a date</span>
3682 <span class="comment">// iterate over the days, months and years in the input</span>
3683 <span class="keyword">int</span> <span class="keyword">const</span> <span class="identifier">sub_matches</span><span class="special">[]</span> <span class="special">=</span> <span class="special">{</span> <span class="number">2</span><span class="special">,</span> <span class="number">1</span><span class="special">,</span> <span class="number">3</span> <span class="special">};</span> <span class="comment">// day, month, year</span>
3684 <span class="identifier">sregex_token_iterator</span> <span class="identifier">begin</span><span class="special">(</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">re</span><span class="special">,</span> <span class="identifier">sub_matches</span> <span class="special">),</span> <span class="identifier">end</span><span class="special">;</span>
<span class="comment">// write all the date fields to std::cout</span>
3687 <span class="identifier">std</span><span class="special">::</span><span class="identifier">ostream_iterator</span><span class="special"><</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="special">></span> <span class="identifier">out_iter</span><span class="special">(</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span><span class="special">,</span> <span class="string">"\n"</span> <span class="special">);</span>
3688 <span class="identifier">std</span><span class="special">::</span><span class="identifier">copy</span><span class="special">(</span> <span class="identifier">begin</span><span class="special">,</span> <span class="identifier">end</span><span class="special">,</span> <span class="identifier">out_iter</span> <span class="special">);</span>
3691 This program displays the following:
<pre class="programlisting">02
01
2003
23
04
1999
13
11
1981
</pre>
3704 The <code class="computeroutput"><span class="identifier">sub_matches</span></code> array instructs
3705 the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
3706 to first take the value of the 2nd sub-match, then the 1st sub-match, and
3707 finally the 3rd. Incrementing the iterator again instructs it to use <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
3708 again to find the next match. At that point, the process repeats -- the token
3709 iterator takes the value of the 2nd sub-match, then the 1st, et cetera.
3712 <div class="section">
3713 <div class="titlepage"><div><div><h3 class="title">
3714 <a name="boost_xpressive.user_s_guide.named_captures"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.named_captures" title="Named Captures">Named Captures</a>
3715 </h3></div></div></div>
3717 <a name="boost_xpressive.user_s_guide.named_captures.h0"></a>
3718 <span class="phrase"><a name="boost_xpressive.user_s_guide.named_captures.overview"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.named_captures.overview">Overview</a>
For complicated regular expressions, dealing with numbered captures can be
a pain. Counting left parentheses to figure out which capture to reference
is no fun. Less fun is the fact that merely editing a regular expression
could cause a capture to be assigned a new number, invalidating code that refers
back to it by the old number.
Other regular expression engines solve this problem with a feature called
<span class="emphasis"><em>named captures</em></span>. This feature allows you to assign a
name to a capture, and to refer back to the capture by name rather than by number.
Xpressive also supports named captures, both in dynamic and in static regexes.
3734 <a name="boost_xpressive.user_s_guide.named_captures.h1"></a>
<span class="phrase"><a name="boost_xpressive.user_s_guide.named_captures.dynamic_named_captures"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.named_captures.dynamic_named_captures">Dynamic
Named Captures</a>
3739 For dynamic regular expressions, xpressive follows the lead of other popular
3740 regex engines with the syntax of named captures. You can create a named capture
3741 with <code class="computeroutput"><span class="string">"(?P<xxx>...)"</span></code>
3742 and refer back to that capture with <code class="computeroutput"><span class="string">"(?P=xxx)"</span></code>.
3743 Here, for instance, is a regular expression that creates a named capture
3744 and refers back to it:
3746 <pre class="programlisting"><span class="comment">// Create a named capture called "char" that matches a single</span>
3747 <span class="comment">// character and refer back to that capture by name.</span>
3748 <span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span><span class="string">"(?P<char>.)(?P=char)"</span><span class="special">);</span>
3751 The effect of the above regular expression is to find the first doubled character.
3754 Once you have executed a match or search operation using a regex with named
3755 captures, you can access the named capture through the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
3756 object using the capture's name.
3758 <pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"tweet"</span><span class="special">);</span>
3759 <span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span><span class="string">"(?P<char>.)(?P=char)"</span><span class="special">);</span>
3760 <span class="identifier">smatch</span> <span class="identifier">what</span><span class="special">;</span>
3761 <span class="keyword">if</span><span class="special">(</span><span class="identifier">regex_search</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">rx</span><span class="special">))</span>
3762 <span class="special">{</span>
3763 <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"char = "</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="string">"char"</span><span class="special">]</span> <span class="special"><<</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
3764 <span class="special">}</span>
3767 The above code displays:
3769 <pre class="programlisting">char = e
3772 You can also refer back to a named capture from within a substitution string.
3773 The syntax for that is <code class="computeroutput"><span class="string">"\\g<xxx>"</span></code>.
3774 Below is some code that demonstrates how to use named captures when doing
3775 string substitution.
3777 <pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"tweet"</span><span class="special">);</span>
3778 <span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span><span class="string">"(?P<char>.)(?P=char)"</span><span class="special">);</span>
3779 <span class="identifier">str</span> <span class="special">=</span> <span class="identifier">regex_replace</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">rx</span><span class="special">,</span> <span class="string">"**\\g<char>**"</span><span class="special">,</span> <span class="identifier">regex_constants</span><span class="special">::</span><span class="identifier">format_perl</span><span class="special">);</span>
3780 <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">str</span> <span class="special"><<</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
Notice that you have to specify <code class="computeroutput"><span class="identifier">format_perl</span></code>
when using named captures in a substitution string. Only the Perl format syntax recognizes <code class="computeroutput"><span class="string">"\\g<xxx>"</span></code>. The above
code displays:
3787 <pre class="programlisting">tw**e**t
3790 <a name="boost_xpressive.user_s_guide.named_captures.h2"></a>
<span class="phrase"><a name="boost_xpressive.user_s_guide.named_captures.static_named_captures"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.named_captures.static_named_captures">Static
Named Captures</a>
3795 If you're using static regular expressions, creating and using named captures
3796 is even easier. You can use the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/mark_tag.html" title="Struct mark_tag">mark_tag</a></code></code>
3797 type to create a variable that you can use like <code class="computeroutput"><a class="link" href="../boost/xpressive/s1.html" title="Global s1">s1</a></code>, <code class="computeroutput"><a class="link" href="../boost/xpressive/s1.html" title="Global s1">s2</a></code> and friends, but with a
name that is more meaningful. Below is how the above example would look using
static regexes:
3801 <pre class="programlisting"><span class="identifier">mark_tag</span> <span class="identifier">char_</span><span class="special">(</span><span class="number">1</span><span class="special">);</span> <span class="comment">// char_ is now a synonym for s1</span>
3802 <span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">char_</span><span class="special">=</span> <span class="identifier">_</span><span class="special">)</span> <span class="special">>></span> <span class="identifier">char_</span><span class="special">;</span>
3805 After a match operation, you can use the <code class="computeroutput"><span class="identifier">mark_tag</span></code>
3806 to index into the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
3807 to access the named capture:
3809 <pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"tweet"</span><span class="special">);</span>
3810 <span class="identifier">mark_tag</span> <span class="identifier">char_</span><span class="special">(</span><span class="number">1</span><span class="special">);</span>
3811 <span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">char_</span><span class="special">=</span> <span class="identifier">_</span><span class="special">)</span> <span class="special">>></span> <span class="identifier">char_</span><span class="special">;</span>
3812 <span class="identifier">smatch</span> <span class="identifier">what</span><span class="special">;</span>
3813 <span class="keyword">if</span><span class="special">(</span><span class="identifier">regex_search</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">rx</span><span class="special">))</span>
3814 <span class="special">{</span>
3815 <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="identifier">char_</span><span class="special">]</span> <span class="special"><<</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
3816 <span class="special">}</span>
3819 The above code displays:
<pre class="programlisting">e
</pre>
When doing string substitutions with <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>,
you can use named captures to create <span class="emphasis"><em>format expressions</em></span>.
For instance:
3828 <pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"tweet"</span><span class="special">);</span>
3829 <span class="identifier">mark_tag</span> <span class="identifier">char_</span><span class="special">(</span><span class="number">1</span><span class="special">);</span>
3830 <span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">char_</span><span class="special">=</span> <span class="identifier">_</span><span class="special">)</span> <span class="special">>></span> <span class="identifier">char_</span><span class="special">;</span>
3831 <span class="identifier">str</span> <span class="special">=</span> <span class="identifier">regex_replace</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">rx</span><span class="special">,</span> <span class="string">"**"</span> <span class="special">+</span> <span class="identifier">char_</span> <span class="special">+</span> <span class="string">"**"</span><span class="special">);</span>
3832 <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">str</span> <span class="special"><<</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
3835 The above code displays:
3837 <pre class="programlisting">tw**e**t
3839 <div class="note"><table border="0" summary="Note">
3841 <td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
3842 <th align="left">Note</th>
3844 <tr><td align="left" valign="top"><p>
3845 You need to include <code class="literal"><boost/xpressive/regex_actions.hpp></code>
3846 to use format expressions.
3850 <div class="section">
3851 <div class="titlepage"><div><div><h3 class="title">
3852 <a name="boost_xpressive.user_s_guide.grammars_and_nested_matches"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches" title="Grammars and Nested Matches">Grammars
3853 and Nested Matches</a>
3854 </h3></div></div></div>
3856 <a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.h0"></a>
3857 <span class="phrase"><a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.overview"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.overview">Overview</a>
3860 One of the key benefits of representing regexes as C++ expressions is the
3861 ability to easily refer to other C++ code and data from within the regex.
3862 This enables programming idioms that are not possible with other regular
3863 expression libraries. Of particular note is the ability for one regex to
3864 refer to another regex, allowing you to build grammars out of regular expressions.
3865 This section describes how to embed one regex in another by value and by
3866 reference, how regex objects behave when they refer to other regexes, and
3867 how to access the tree of results after a successful parse.
3870 <a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.h1"></a>
3871 <span class="phrase"><a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.embedding_a_regex_by_value"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.embedding_a_regex_by_value">Embedding
3872 a Regex by Value</a>
3875 The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>
3876 object has value semantics. When a regex object appears on the right-hand
3877 side in the definition of another regex, it is as if the regex were embedded
3878 by value; that is, a copy of the nested regex is stored by the enclosing
3879 regex. The inner regex is invoked by the outer regex during pattern matching.
3880 The inner regex participates fully in the match, back-tracking as needed
3881 to make the match succeed.
3884 Consider a text editor that has a regex-find feature with a whole-word option.
3885 You can implement this with xpressive as follows:
3887 <pre class="programlisting"><span class="identifier">find_dialog</span> <span class="identifier">dlg</span><span class="special">;</span>
3888 <span class="keyword">if</span><span class="special">(</span> <span class="identifier">dialog_ok</span> <span class="special">==</span> <span class="identifier">dlg</span><span class="special">.</span><span class="identifier">do_modal</span><span class="special">()</span> <span class="special">)</span>
3889 <span class="special">{</span>
3890 <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">pattern</span> <span class="special">=</span> <span class="identifier">dlg</span><span class="special">.</span><span class="identifier">get_text</span><span class="special">();</span> <span class="comment">// the pattern the user entered</span>
3891 <span class="keyword">bool</span> <span class="identifier">whole_word</span> <span class="special">=</span> <span class="identifier">dlg</span><span class="special">.</span><span class="identifier">whole_word</span><span class="special">.</span><span class="identifier">is_checked</span><span class="special">();</span> <span class="comment">// did the user select the whole-word option?</span>
3893 <span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span> <span class="identifier">pattern</span> <span class="special">);</span> <span class="comment">// try to compile the pattern</span>
3895 <span class="keyword">if</span><span class="special">(</span> <span class="identifier">whole_word</span> <span class="special">)</span>
3896 <span class="special">{</span>
3897 <span class="comment">// wrap the regex in begin-word / end-word assertions</span>
3898 <span class="identifier">re</span> <span class="special">=</span> <span class="identifier">bow</span> <span class="special">>></span> <span class="identifier">re</span> <span class="special">>></span> <span class="identifier">eow</span><span class="special">;</span>
3899 <span class="special">}</span>
3901 <span class="comment">// ... use re ...</span>
3902 <span class="special">}</span>
3905 Look closely at this line:
<pre class="programlisting"><span class="comment">// wrap the regex in begin-word / end-word assertions</span>
<span class="identifier">re</span> <span class="special">=</span> <span class="identifier">bow</span> <span class="special">&gt;&gt;</span> <span class="identifier">re</span> <span class="special">&gt;&gt;</span> <span class="identifier">eow</span><span class="special">;</span>
</pre>
This line creates a new regex that embeds the old regex by value. The new
regex is then assigned back to the original regex object. Because the right-hand
side holds a copy of the old regex, this works as you might expect: the
new regex has the behavior of the old regex wrapped in begin- and end-word
assertions.
3917 <div class="note"><table border="0" summary="Note">
3919 <td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
3920 <th align="left">Note</th>
3922 <tr><td align="left" valign="top"><p>
3923 Note that <code class="computeroutput"><span class="identifier">re</span> <span class="special">=</span>
3924 <span class="identifier">bow</span> <span class="special">>></span>
3925 <span class="identifier">re</span> <span class="special">>></span>
3926 <span class="identifier">eow</span></code> does <span class="emphasis"><em>not</em></span>
3927 define a recursive regular expression, since regex objects embed by value
3928 by default. The next section shows how to define a recursive regular expression
3929 by embedding a regex by reference.
3933 <a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.h2"></a>
3934 <span class="phrase"><a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.embedding_a_regex_by_reference"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.embedding_a_regex_by_reference">Embedding
3935 a Regex by Reference</a>
3938 If you want to be able to build recursive regular expressions and context-free
3939 grammars, embedding a regex by value is not enough. You need to be able to
3940 make your regular expressions self-referential. Most regular expression engines
3941 don't give you that power, but xpressive does.
3943 <div class="tip"><table border="0" summary="Tip">
3945 <td rowspan="2" align="center" valign="top" width="25"><img alt="[Tip]" src="../../../doc/src/images/tip.png"></td>
3946 <th align="left">Tip</th>
3948 <tr><td align="left" valign="top"><p>
3949 The theoretical computer scientists out there will correctly point out
3950 that a self-referential regular expression is not "regular",
3951 so in the strict sense, xpressive isn't really a <span class="emphasis"><em>regular</em></span>
3952 expression engine at all. But as Larry Wall once said, "the term [regular expression] has
3953 grown with the capabilities of our pattern matching engines, so I'm not
3954 going to try to fight linguistic necessity here."
3958 Consider the following code, which uses the <code class="computeroutput"><span class="identifier">by_ref</span><span class="special">()</span></code> helper to define a recursive regular expression
3959 that matches balanced, nested parentheses:
3961 <pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">parentheses</span><span class="special">;</span>
3962 <span class="identifier">parentheses</span> <span class="comment">// A balanced set of parentheses ...</span>
3963 <span class="special">=</span> <span class="char">'('</span> <span class="comment">// is an opening parenthesis ...</span>
3964 <span class="special">>></span> <span class="comment">// followed by ...</span>
3965 <span class="special">*(</span> <span class="comment">// zero or more ...</span>
3966 <span class="identifier">keep</span><span class="special">(</span> <span class="special">+~(</span><span class="identifier">set</span><span class="special">=</span><span class="char">'('</span><span class="special">,</span><span class="char">')'</span><span class="special">)</span> <span class="special">)</span> <span class="comment">// of a bunch of things that are not parentheses ...</span>
3967 <span class="special">|</span> <span class="comment">// or ...</span>
3968 <span class="identifier">by_ref</span><span class="special">(</span><span class="identifier">parentheses</span><span class="special">)</span> <span class="comment">// a balanced set of parentheses</span>
3969 <span class="special">)</span> <span class="comment">// (ooh, recursion!) ...</span>
3970 <span class="special">>></span> <span class="comment">// followed by ...</span>
3971 <span class="char">')'</span> <span class="comment">// a closing parenthesis</span>
3972 <span class="special">;</span>
3975 Matching balanced, nested tags is an important text processing task, and
3976 it is one that "classic" regular expressions cannot do. The <code class="computeroutput"><span class="identifier">by_ref</span><span class="special">()</span></code>
3977 helper makes it possible. It allows one regex object to be embedded in another
3978 <span class="emphasis"><em>by reference</em></span>. Since the right-hand side holds <code class="computeroutput"><span class="identifier">parentheses</span></code> by reference, assigning the
3979 right-hand side back to <code class="computeroutput"><span class="identifier">parentheses</span></code>
3980 creates a cycle, which will execute recursively.
3983 <a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.h3"></a>
<span class="phrase"><a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.building_a_grammar"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.building_a_grammar">Building
a Grammar</a>
3988 Once we allow self-reference in our regular expressions, the genie is out
3989 of the bottle and all manner of fun things are possible. In particular, we
3990 can now build grammars out of regular expressions. Let's have a look at the
3991 text-book grammar example: the humble calculator.
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">group</span><span class="special">,</span> <span class="identifier">factor</span><span class="special">,</span> <span class="identifier">term</span><span class="special">,</span> <span class="identifier">expression</span><span class="special">;</span>

<span class="identifier">group</span> <span class="special">=</span> <span class="char">'('</span> <span class="special">&gt;&gt;</span> <span class="identifier">by_ref</span><span class="special">(</span><span class="identifier">expression</span><span class="special">)</span> <span class="special">&gt;&gt;</span> <span class="char">')'</span><span class="special">;</span>
<span class="identifier">factor</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">_d</span> <span class="special">|</span> <span class="identifier">group</span><span class="special">;</span>
<span class="identifier">term</span> <span class="special">=</span> <span class="identifier">factor</span> <span class="special">&gt;&gt;</span> <span class="special">*((</span><span class="char">'*'</span> <span class="special">&gt;&gt;</span> <span class="identifier">factor</span><span class="special">)</span> <span class="special">|</span> <span class="special">(</span><span class="char">'/'</span> <span class="special">&gt;&gt;</span> <span class="identifier">factor</span><span class="special">));</span>
<span class="identifier">expression</span> <span class="special">=</span> <span class="identifier">term</span> <span class="special">&gt;&gt;</span> <span class="special">*((</span><span class="char">'+'</span> <span class="special">&gt;&gt;</span> <span class="identifier">term</span><span class="special">)</span> <span class="special">|</span> <span class="special">(</span><span class="char">'-'</span> <span class="special">&gt;&gt;</span> <span class="identifier">term</span><span class="special">));</span>
</pre>
4001 The regex <code class="computeroutput"><span class="identifier">expression</span></code> defined
4002 above does something rather remarkable for a regular expression: it matches
4003 mathematical expressions. For example, if the input string were <code class="computeroutput"><span class="string">"foo 9*(10+3) bar"</span></code>, this pattern
4004 would match <code class="computeroutput"><span class="string">"9*(10+3)"</span></code>.
4005 It only matches well-formed mathematical expressions, where the parentheses
4006 are balanced and the infix operators have two arguments each. Don't try this
4007 with just any regular expression engine!
4010 Let's take a closer look at this regular expression grammar. Notice that
4011 it is cyclic: <code class="computeroutput"><span class="identifier">expression</span></code>
4012 is implemented in terms of <code class="computeroutput"><span class="identifier">term</span></code>,
4013 which is implemented in terms of <code class="computeroutput"><span class="identifier">factor</span></code>,
4014 which is implemented in terms of <code class="computeroutput"><span class="identifier">group</span></code>,
4015 which is implemented in terms of <code class="computeroutput"><span class="identifier">expression</span></code>,
4016 closing the loop. In general, the way to define a cyclic grammar is to forward-declare
4017 the regex objects and embed by reference those regular expressions that have
4018 not yet been initialized. In the above grammar, there is only one place where
4019 we need to reference a regex object that has not yet been initialized: the
4020 definition of <code class="computeroutput"><span class="identifier">group</span></code>. In that
4021 place, we use <code class="computeroutput"><span class="identifier">by_ref</span><span class="special">()</span></code>
4022 to embed <code class="computeroutput"><span class="identifier">expression</span></code> by reference.
In all other places, it is sufficient to embed the other regex objects by
value, since they have already been initialized and their values will not
change.
4027 <div class="tip"><table border="0" summary="Tip">
4029 <td rowspan="2" align="center" valign="top" width="25"><img alt="[Tip]" src="../../../doc/src/images/tip.png"></td>
4030 <th align="left">Tip</th>
4032 <tr><td align="left" valign="top"><p>
4033 <span class="bold"><strong>Embed by value if possible</strong></span> <br> <br>
4034 In general, prefer embedding regular expressions by value rather than by
4035 reference. It involves one less indirection, making your patterns match
4036 a little faster. Besides, value semantics are simpler and will make your
4037 grammars easier to reason about. Don't worry about the expense of "copying"
4038 a regex. Each regex object shares its implementation with all of its copies.
4042 <a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.h4"></a>
<span class="phrase"><a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.dynamic_regex_grammars"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.dynamic_regex_grammars">Dynamic
Regex Grammars</a>
4047 Using <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>,
4048 you can also build grammars out of dynamic regular expressions. You do that
4049 by creating named regexes, and referring to other regexes by name. Each
4050 <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>
instance keeps a mapping from names to regexes that have been created with
it.
4055 You can create a named dynamic regex by prefacing your regex with <code class="computeroutput"><span class="string">"(?$name=)"</span></code>, where <span class="emphasis"><em>name</em></span>
4056 is the name of the regex. You can refer to a named regex from another regex
4057 with <code class="computeroutput"><span class="string">"(?$name)"</span></code>. The
4058 named regex does not need to exist yet at the time it is referenced in another
4059 regex, but it must exist by the time you use the regex.
4062 Below is a code fragment that uses dynamic regex grammars to implement the
4063 calculator example from above.
4065 <pre class="programlisting"><span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>
4066 <span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">regex_constants</span><span class="special">;</span>
4068 <span class="identifier">sregex</span> <span class="identifier">expr</span><span class="special">;</span>
4070 <span class="special">{</span>
4071 <span class="identifier">sregex_compiler</span> <span class="identifier">compiler</span><span class="special">;</span>
4072 <span class="identifier">syntax_option_type</span> <span class="identifier">x</span> <span class="special">=</span> <span class="identifier">ignore_white_space</span><span class="special">;</span>
4074 <span class="identifier">compiler</span><span class="special">.</span><span class="identifier">compile</span><span class="special">(</span><span class="string">"(? $group = ) \\( (? $expr ) \\) "</span><span class="special">,</span> <span class="identifier">x</span><span class="special">);</span>
4075 <span class="identifier">compiler</span><span class="special">.</span><span class="identifier">compile</span><span class="special">(</span><span class="string">"(? $factor = ) \\d+ | (? $group ) "</span><span class="special">,</span> <span class="identifier">x</span><span class="special">);</span>
4076 <span class="identifier">compiler</span><span class="special">.</span><span class="identifier">compile</span><span class="special">(</span><span class="string">"(? $term = ) (? $factor )"</span>
4077 <span class="string">" ( \\* (? $factor ) | / (? $factor ) )* "</span><span class="special">,</span> <span class="identifier">x</span><span class="special">);</span>
4078 <span class="identifier">expr</span> <span class="special">=</span> <span class="identifier">compiler</span><span class="special">.</span><span class="identifier">compile</span><span class="special">(</span><span class="string">"(? $expr = ) (? $term )"</span>
4079 <span class="string">" ( \\+ (? $term ) | - (? $term ) )* "</span><span class="special">,</span> <span class="identifier">x</span><span class="special">);</span>
4080 <span class="special">}</span>
4082 <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"foo 9*(10+3) bar"</span><span class="special">);</span>
4083 <span class="identifier">smatch</span> <span class="identifier">what</span><span class="special">;</span>
4085 <span class="keyword">if</span><span class="special">(</span><span class="identifier">regex_search</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">expr</span><span class="special">))</span>
4086 <span class="special">{</span>
4087 <span class="comment">// This prints "9*(10+3)":</span>
4088 <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">]</span> <span class="special"><<</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
4089 <span class="special">}</span>
4092 As with static regex grammars, nested regex invocations create nested match
results (see <span class="emphasis"><em>Nested Results</em></span> below). The result is a
complete parse tree for the string that matched. Unlike static regexes, dynamic
4095 regexes are always embedded by reference, not by value.
4098 <a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.h5"></a>
4099 <span class="phrase"><a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.cyclic_patterns__copying_and_memory_management__oh_my_"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.cyclic_patterns__copying_and_memory_management__oh_my_">Cyclic
4100 Patterns, Copying and Memory Management, Oh My!</a>
The calculator examples above raise a number of very complicated memory-management
issues. The four regex objects all refer to one another, some directly
and some indirectly, some by value and some by reference. What if we were
to return one of them from a function and let the others go out of scope?
What becomes of the references? The answer is that the regex objects are
internally reference counted, such that they keep their referenced regex
objects alive as long as they need them. So passing a regex object by value
is never a problem, even if it refers to other regex objects that have gone
out of scope.
Those of you who have dealt with reference counting are probably familiar
with its Achilles' heel: cyclic references. If regex objects are reference
4116 counted, what happens to cycles like the one created in the calculator examples?
4117 Are they leaked? The answer is no, they are not leaked. The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>
4118 object has some tricky reference tracking code that ensures that even cyclic
4119 regex grammars are cleaned up when the last external reference goes away.
4120 So don't worry about it. Create cyclic grammars, pass your regex objects
4121 around and copy them all you want. It is fast and efficient and guaranteed
4122 not to leak or result in dangling references.
4125 <a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.h6"></a>
4126 <span class="phrase"><a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.nested_regexes_and_sub_match_scoping"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.nested_regexes_and_sub_match_scoping">Nested
4127 Regexes and Sub-Match Scoping</a>
Nested regular expressions raise the issue of sub-match scoping. If both
the inner and outer regexes wrote to and read from the same sub-match vector,
chaos would ensue: the inner regex would stomp on the sub-matches written
by the outer regex. For example, what does this do?
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">inner</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"(.)\\1"</span> <span class="special">);</span>
<span class="identifier">sregex</span> <span class="identifier">outer</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">s1</span><span class="special">=</span> <span class="identifier">_</span><span class="special">)</span> <span class="special">&gt;&gt;</span> <span class="identifier">inner</span> <span class="special">&gt;&gt;</span> <span class="identifier">s1</span><span class="special">;</span>
</pre>
The author probably didn't intend for the inner regex to overwrite the sub-match
written by the outer regex. The problem is particularly acute when the inner
regex is accepted from the user as input. The author has no way of knowing
whether the inner regex will stomp the sub-match vector or not. This is clearly
not acceptable.
4146 Instead, what actually happens is that each invocation of a nested regex
4147 gets its own scope. Sub-matches belong to that scope. That is, each nested
4148 regex invocation gets its own copy of the sub-match vector to play with,
4149 so there is no way for an inner regex to stomp on the sub-matches of an outer
regex. So, for example, the regex <code class="computeroutput"><span class="identifier">outer</span></code>
defined above would match <code class="computeroutput"><span class="string">"ABBA"</span></code>,
not <code class="computeroutput"><span class="string">"ABBB"</span></code>.
4155 <a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.h7"></a>
<span class="phrase"><a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.nested_results"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.nested_results">Nested
Results</a>
4160 If nested regexes have their own sub-matches, there should be a way to access
4161 them after a successful match. In fact, there is. After a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
4162 or <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>,
4163 the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
4164 struct behaves like the head of a tree of nested results. The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
4165 class provides a <code class="computeroutput"><span class="identifier">nested_results</span><span class="special">()</span></code> member function that returns an ordered
4166 sequence of <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
structures, representing the results of the nested regexes. The order of
the nested results is the same as the order in which the nested regex objects
matched.
4172 Take as an example the regex for balanced, nested parentheses we saw earlier:
4174 <pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">parentheses</span><span class="special">;</span>
4175 <span class="identifier">parentheses</span> <span class="special">=</span> <span class="char">'('</span> <span class="special">>></span> <span class="special">*(</span> <span class="identifier">keep</span><span class="special">(</span> <span class="special">+~(</span><span class="identifier">set</span><span class="special">=</span><span class="char">'('</span><span class="special">,</span><span class="char">')'</span><span class="special">)</span> <span class="special">)</span> <span class="special">|</span> <span class="identifier">by_ref</span><span class="special">(</span><span class="identifier">parentheses</span><span class="special">)</span> <span class="special">)</span> <span class="special">>></span> <span class="char">')'</span><span class="special">;</span>
4177 <span class="identifier">smatch</span> <span class="identifier">what</span><span class="special">;</span>
4178 <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span> <span class="string">"blah blah( a(b)c (c(e)f (g)h )i (j)6 )blah"</span> <span class="special">);</span>
4180 <span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_search</span><span class="special">(</span> <span class="identifier">str</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">parentheses</span> <span class="special">)</span> <span class="special">)</span>
4181 <span class="special">{</span>
4182 <span class="comment">// display the whole match</span>
4183 <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
4185 <span class="comment">// display the nested results</span>
4186 <span class="identifier">std</span><span class="special">::</span><span class="identifier">for_each</span><span class="special">(</span>
4187 <span class="identifier">what</span><span class="special">.</span><span class="identifier">nested_results</span><span class="special">().</span><span class="identifier">begin</span><span class="special">(),</span>
4188 <span class="identifier">what</span><span class="special">.</span><span class="identifier">nested_results</span><span class="special">().</span><span class="identifier">end</span><span class="special">(),</span>
4189 <span class="identifier">output_nested_results</span><span class="special">()</span> <span class="special">);</span>
4190 <span class="special">}</span>
4193 This program displays the following:
4195 <pre class="programlisting">( a(b)c (c(e)f (g)h )i (j)6 )
4203 Here you can see how the results are nested and that they are stored in the
4204 order in which they are found.
4206 <div class="tip"><table border="0" summary="Tip">
4208 <td rowspan="2" align="center" valign="top" width="25"><img alt="[Tip]" src="../../../doc/src/images/tip.png"></td>
4209 <th align="left">Tip</th>
4211 <tr><td align="left" valign="top"><p>
See the definition of <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.display_a_tree_of_nested_results">output_nested_results</a>
in the <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples" title="Examples">Examples</a>
section.
4218 <a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.h8"></a>
<span class="phrase"><a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.filtering_nested_results"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.filtering_nested_results">Filtering
Nested Results</a>
4223 Sometimes a regex will have several nested regex objects, and you want to
4224 know which result corresponds to which regex object. That's where <code class="computeroutput"><span class="identifier">basic_regex</span><span class="special"><>::</span><span class="identifier">regex_id</span><span class="special">()</span></code>
4225 and <code class="computeroutput"><span class="identifier">match_results</span><span class="special"><>::</span><span class="identifier">regex_id</span><span class="special">()</span></code>
come in handy. When iterating over the nested results, you can compare the
regex id from the results to the id of the regex object you're interested
in.
4231 To make this a bit easier, xpressive provides a predicate to make it simple
4232 to iterate over just the results that correspond to a certain nested regex.
4233 It is called <code class="computeroutput"><span class="identifier">regex_id_filter_predicate</span></code>,
4234 and it is intended to be used with <a href="../../../libs/iterator/doc/index.html" target="_top">Boost.Iterator</a>.
4235 You can use it as follows:
4237 <pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">name</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">alpha</span><span class="special">;</span>
4238 <span class="identifier">sregex</span> <span class="identifier">integer</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">_d</span><span class="special">;</span>
4239 <span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="special">*(</span> <span class="special">*</span><span class="identifier">_s</span> <span class="special">>></span> <span class="special">(</span> <span class="identifier">name</span> <span class="special">|</span> <span class="identifier">integer</span> <span class="special">)</span> <span class="special">);</span>
4241 <span class="identifier">smatch</span> <span class="identifier">what</span><span class="special">;</span>
4242 <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span> <span class="string">"marsha 123 jan 456 cindy 789"</span> <span class="special">);</span>
4244 <span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_match</span><span class="special">(</span> <span class="identifier">str</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">re</span> <span class="special">)</span> <span class="special">)</span>
4245 <span class="special">{</span>
4246 <span class="identifier">smatch</span><span class="special">::</span><span class="identifier">nested_results_type</span><span class="special">::</span><span class="identifier">const_iterator</span> <span class="identifier">begin</span> <span class="special">=</span> <span class="identifier">what</span><span class="special">.</span><span class="identifier">nested_results</span><span class="special">().</span><span class="identifier">begin</span><span class="special">();</span>
4247 <span class="identifier">smatch</span><span class="special">::</span><span class="identifier">nested_results_type</span><span class="special">::</span><span class="identifier">const_iterator</span> <span class="identifier">end</span> <span class="special">=</span> <span class="identifier">what</span><span class="special">.</span><span class="identifier">nested_results</span><span class="special">().</span><span class="identifier">end</span><span class="special">();</span>
4249 <span class="comment">// declare filter predicates to select just the names or the integers</span>
4250 <span class="identifier">sregex_id_filter_predicate</span> <span class="identifier">name_id</span><span class="special">(</span> <span class="identifier">name</span><span class="special">.</span><span class="identifier">regex_id</span><span class="special">()</span> <span class="special">);</span>
4251 <span class="identifier">sregex_id_filter_predicate</span> <span class="identifier">integer_id</span><span class="special">(</span> <span class="identifier">integer</span><span class="special">.</span><span class="identifier">regex_id</span><span class="special">()</span> <span class="special">);</span>
4253 <span class="comment">// iterate over only the results from the name regex</span>
4254 <span class="identifier">std</span><span class="special">::</span><span class="identifier">for_each</span><span class="special">(</span>
4255 <span class="identifier">boost</span><span class="special">::</span><span class="identifier">make_filter_iterator</span><span class="special">(</span> <span class="identifier">name_id</span><span class="special">,</span> <span class="identifier">begin</span><span class="special">,</span> <span class="identifier">end</span> <span class="special">),</span>
4256 <span class="identifier">boost</span><span class="special">::</span><span class="identifier">make_filter_iterator</span><span class="special">(</span> <span class="identifier">name_id</span><span class="special">,</span> <span class="identifier">end</span><span class="special">,</span> <span class="identifier">end</span> <span class="special">),</span>
4257 <span class="identifier">output_result</span>
4258 <span class="special">);</span>
4260 <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
4262 <span class="comment">// iterate over only the results from the integer regex</span>
4263 <span class="identifier">std</span><span class="special">::</span><span class="identifier">for_each</span><span class="special">(</span>
4264 <span class="identifier">boost</span><span class="special">::</span><span class="identifier">make_filter_iterator</span><span class="special">(</span> <span class="identifier">integer_id</span><span class="special">,</span> <span class="identifier">begin</span><span class="special">,</span> <span class="identifier">end</span> <span class="special">),</span>
4265 <span class="identifier">boost</span><span class="special">::</span><span class="identifier">make_filter_iterator</span><span class="special">(</span> <span class="identifier">integer_id</span><span class="special">,</span> <span class="identifier">end</span><span class="special">,</span> <span class="identifier">end</span> <span class="special">),</span>
4266 <span class="identifier">output_result</span>
4267 <span class="special">);</span>
4268 <span class="special">}</span>
where <code class="computeroutput"><span class="identifier">output_result</span></code> is a
4272 simple function that takes a <code class="computeroutput"><span class="identifier">smatch</span></code>
4273 and displays the full match. Notice how we use the <code class="computeroutput"><span class="identifier">regex_id_filter_predicate</span></code>
together with <code class="computeroutput"><span class="identifier">basic_regex</span><span class="special"><>::</span><span class="identifier">regex_id</span><span class="special">()</span></code> and <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">make_filter_iterator</span><span class="special">()</span></code> from the <a href="../../../libs/iterator/doc/index.html" target="_top">Boost.Iterator</a> library
4275 to select only those results corresponding to a particular nested regex.
4276 This program displays the following:
4278 <pre class="programlisting">marsha
4286 <div class="section">
4287 <div class="titlepage"><div><div><h3 class="title">
4288 <a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions" title="Semantic Actions and User-Defined Assertions">Semantic
4289 Actions and User-Defined Assertions</a>
4290 </h3></div></div></div>
4292 <a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.h0"></a>
4293 <span class="phrase"><a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.overview"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.overview">Overview</a>
4296 Imagine you want to parse an input string and build a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><></span></code>
4297 from it. For something like that, matching a regular expression isn't enough.
4298 You want to <span class="emphasis"><em>do something</em></span> when parts of your regular
4299 expression match. Xpressive lets you attach semantic actions to parts of
4300 your static regular expressions. This section shows you how.
4303 <a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.h1"></a>
<span class="phrase"><a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.semantic_actions"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.semantic_actions">Semantic
Actions</a>
4308 Consider the following code, which uses xpressive's semantic actions to parse
4309 a string of word/integer pairs and stuffs them into a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><></span></code>.
4310 It is described below.
4312 <pre class="programlisting"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">string</span><span class="special">></span>
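<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iostream</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">map</span><span class="special">></span>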
4314 <span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span>
4315 <span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">regex_actions</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span>
4316 <span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>
4318 <span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
4319 <span class="special">{</span>
4320 <span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="keyword">int</span><span class="special">></span> <span class="identifier">result</span><span class="special">;</span>
4321 <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"aaa=>1 bbb=>23 ccc=>456"</span><span class="special">);</span>
4323 <span class="comment">// Match a word and an integer, separated by =>,</span>
4324 <span class="comment">// and then stuff the result into a std::map<></span>
4325 <span class="identifier">sregex</span> <span class="identifier">pair</span> <span class="special">=</span> <span class="special">(</span> <span class="special">(</span><span class="identifier">s1</span><span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">)</span> <span class="special">>></span> <span class="string">"=>"</span> <span class="special">>></span> <span class="special">(</span><span class="identifier">s2</span><span class="special">=</span> <span class="special">+</span><span class="identifier">_d</span><span class="special">)</span> <span class="special">)</span>
4326 <span class="special">[</span> <span class="identifier">ref</span><span class="special">(</span><span class="identifier">result</span><span class="special">)[</span><span class="identifier">s1</span><span class="special">]</span> <span class="special">=</span> <span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">s2</span><span class="special">)</span> <span class="special">];</span>
4328 <span class="comment">// Match one or more word/integer pairs, separated</span>
4329 <span class="comment">// by whitespace.</span>
4330 <span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="identifier">pair</span> <span class="special">>></span> <span class="special">*(+</span><span class="identifier">_s</span> <span class="special">>></span> <span class="identifier">pair</span><span class="special">);</span>
4332 <span class="keyword">if</span><span class="special">(</span><span class="identifier">regex_match</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">rx</span><span class="special">))</span>
4333 <span class="special">{</span>
4334 <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">result</span><span class="special">[</span><span class="string">"aaa"</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
4335 <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">result</span><span class="special">[</span><span class="string">"bbb"</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
4336 <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">result</span><span class="special">[</span><span class="string">"ccc"</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
4337 <span class="special">}</span>
4339 <span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
4340 <span class="special">}</span>
4343 This program prints the following:
<pre class="programlisting">1
23
456
</pre>
4350 The regular expression <code class="computeroutput"><span class="identifier">pair</span></code>
4351 has two parts: the pattern and the action. The pattern says to match a word,
4352 capturing it in sub-match 1, and an integer, capturing it in sub-match 2,
4353 separated by <code class="computeroutput"><span class="string">"=>"</span></code>.
4354 The action is the part in square brackets: <code class="computeroutput"><span class="special">[</span>
4355 <span class="identifier">ref</span><span class="special">(</span><span class="identifier">result</span><span class="special">)[</span><span class="identifier">s1</span><span class="special">]</span> <span class="special">=</span>
4356 <span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">s2</span><span class="special">)</span> <span class="special">]</span></code>. It says
to take sub-match 1 and use it to index into the <code class="computeroutput"><span class="identifier">result</span></code>
4358 map, and assign to it the result of converting sub-match 2 to an integer.
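<p>
For comparison, the following sketch does by hand roughly what the action above accomplishes, using the same pattern with no action attached. The helper name <code class="computeroutput">parse_pairs</code> is hypothetical and not part of xpressive. The sketch also assumes that it is acceptable to search for each pair with <code class="computeroutput">sregex_iterator</code> instead of matching the whole string at once, so unlike the program above it does not reject stray text between pairs.
</p>
<pre class="programlisting">#include <map>
#include <string>
#include <boost/lexical_cast.hpp>
#include <boost/xpressive/xpressive.hpp>

// Hypothetical helper, not part of xpressive.
std::map<std::string, int> parse_pairs(std::string const &str)
{
    using namespace boost::xpressive;
    std::map<std::string, int> result;

    // The same pattern as above, but with no action attached.
    sregex pair = (s1= +_w) >> "=>" >> (s2= +_d);

    // Visit every match in turn and store it by hand:
    // sub-match 1 is the key, sub-match 2 is converted to an int.
    sregex_iterator cur(str.begin(), str.end(), pair), end;
    for(; cur != end; ++cur)
    {
        smatch const &what = *cur;
        result[what[1].str()] = boost::lexical_cast<int>(what[2].str());
    }
    return result;
}
</pre>
<p>
Calling <code class="computeroutput">parse_pairs("aaa=>1 bbb=>23 ccc=>456")</code> should give a map with three entries. The semantic action expresses the same bookkeeping inside the regex itself, so no loop over the matches is needed.
</p>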
4360 <div class="note"><table border="0" summary="Note">
4362 <td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
4363 <th align="left">Note</th>
4365 <tr><td align="left" valign="top"><p>
4366 To use semantic actions with your static regexes, you must <code class="computeroutput"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">regex_actions</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span></code>
How does this work? Just like the rest of the static regular expression, the
4371 part between brackets is an expression template. It encodes the action and
4372 executes it later. The expression <code class="computeroutput"><span class="identifier">ref</span><span class="special">(</span><span class="identifier">result</span><span class="special">)</span></code> creates a lazy reference to the <code class="computeroutput"><span class="identifier">result</span></code> object. The larger expression <code class="computeroutput"><span class="identifier">ref</span><span class="special">(</span><span class="identifier">result</span><span class="special">)[</span><span class="identifier">s1</span><span class="special">]</span></code>
is a lazy map index operation. Later, when this action is executed,
4374 <code class="computeroutput"><span class="identifier">s1</span></code> gets replaced with the
4375 first <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>.
4376 Likewise, when <code class="computeroutput"><span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">s2</span><span class="special">)</span></code> gets executed, <code class="computeroutput"><span class="identifier">s2</span></code>
4377 is replaced with the second <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>.
4378 The <code class="computeroutput"><span class="identifier">as</span><span class="special"><></span></code>
4379 action converts its argument to the requested type using Boost.Lexical_cast.
The effect of the whole action is to insert a new word/integer pair into
the map.
4383 <div class="note"><table border="0" summary="Note">
4385 <td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
4386 <th align="left">Note</th>
4388 <tr><td align="left" valign="top"><p>
4389 There is an important difference between the function <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">ref</span><span class="special">()</span></code> in <code class="computeroutput"><span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">ref</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span></code>
4390 and <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">::</span><span class="identifier">ref</span><span class="special">()</span></code>
4391 in <code class="computeroutput"><span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">regex_actions</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span></code>. The first returns a plain <code class="computeroutput"><span class="identifier">reference_wrapper</span><span class="special"><></span></code>
4392 which behaves in many respects like an ordinary reference. By contrast,
4393 <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">::</span><span class="identifier">ref</span><span class="special">()</span></code>
4394 returns a <span class="emphasis"><em>lazy</em></span> reference that you can use in expressions
4395 that are executed lazily. That is why we can say <code class="computeroutput"><span class="identifier">ref</span><span class="special">(</span><span class="identifier">result</span><span class="special">)[</span><span class="identifier">s1</span><span class="special">]</span></code>, even though <code class="computeroutput"><span class="identifier">result</span></code>
4396 doesn't have an <code class="computeroutput"><span class="keyword">operator</span><span class="special">[]</span></code>
4397 that would accept <code class="computeroutput"><span class="identifier">s1</span></code>.
4401 In addition to the sub-match placeholders <code class="computeroutput"><span class="identifier">s1</span></code>,
4402 <code class="computeroutput"><span class="identifier">s2</span></code>, etc., you can also use
4403 the placeholder <code class="computeroutput"><span class="identifier">_</span></code> within
4404 an action to refer back to the string matched by the sub-expression to which
4405 the action is attached. For instance, you can use the following regex to
4406 match a bunch of digits, interpret them as an integer and assign the result
4407 to a local variable:
4409 <pre class="programlisting"><span class="keyword">int</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span>
4410 <span class="comment">// Here, _ refers back to all the</span>
4411 <span class="comment">// characters matched by (+_d)</span>
4412 <span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">(+</span><span class="identifier">_d</span><span class="special">)[</span> <span class="identifier">ref</span><span class="special">(</span><span class="identifier">i</span><span class="special">)</span> <span class="special">=</span> <span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">_</span><span class="special">)</span> <span class="special">];</span>
4415 <a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.h2"></a>
4416 <span class="phrase"><a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.lazy_action_execution"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.lazy_action_execution">Lazy
4417 Action Execution</a>
4420 What does it mean, exactly, to attach an action to part of a regular expression
4421 and perform a match? When does the action execute? If the action is part
4422 of a repeated sub-expression, does the action execute once or many times?
4423 And if the sub-expression initially matches, but ultimately fails because
the rest of the regular expression fails to match, is the action executed
at all?
4428 The answer is that by default, actions are executed <span class="emphasis"><em>lazily</em></span>.
4429 When a sub-expression matches a string, its action is placed on a queue,
4430 along with the current values of any sub-matches to which the action refers.
4431 If the match algorithm must backtrack, actions are popped off the queue as
4432 necessary. Only after the entire regex has matched successfully are the actions
actually executed. They are executed all at once, in the order in which they
were added to the queue, as the last step before <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
returns.
For example, consider the following regex that increments a counter whenever
it finds a digit:
4441 <pre class="programlisting"><span class="keyword">int</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span>
4442 <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"1!2!3?"</span><span class="special">);</span>
4443 <span class="comment">// count the exciting digits, but not the</span>
4444 <span class="comment">// questionable ones.</span>
4445 <span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">i</span><span class="special">)</span> <span class="special">]</span> <span class="special">>></span> <span class="char">'!'</span> <span class="special">);</span>
4446 <span class="identifier">regex_search</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">rex</span><span class="special">);</span>
4447 <span class="identifier">assert</span><span class="special">(</span> <span class="identifier">i</span> <span class="special">==</span> <span class="number">2</span> <span class="special">);</span>
4450 The action <code class="computeroutput"><span class="special">++</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">i</span><span class="special">)</span></code>
4451 is queued three times: once for each found digit. But it is only <span class="emphasis"><em>executed</em></span>
4452 twice: once for each digit that precedes a <code class="computeroutput"><span class="char">'!'</span></code>
4453 character. When the <code class="computeroutput"><span class="char">'?'</span></code> character
is encountered, the match algorithm backtracks, removing the final action
from the queue.
4458 <a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.h3"></a>
4459 <span class="phrase"><a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.immediate_action_execution"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.immediate_action_execution">Immediate
4460 Action Execution</a>
4463 When you want semantic actions to execute immediately, you can wrap the sub-expression
4464 containing the action in a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/keep.html" title="Function template keep">keep()</a></code></code>.
4465 <code class="computeroutput"><span class="identifier">keep</span><span class="special">()</span></code>
4466 turns off back-tracking for its sub-expression, but it also causes any actions
4467 queued by the sub-expression to execute at the end of the <code class="computeroutput"><span class="identifier">keep</span><span class="special">()</span></code>. It is as if the sub-expression in the
4468 <code class="computeroutput"><span class="identifier">keep</span><span class="special">()</span></code>
4469 were compiled into an independent regex object, and matching the <code class="computeroutput"><span class="identifier">keep</span><span class="special">()</span></code>
4470 is like a separate invocation of <code class="computeroutput"><span class="identifier">regex_search</span><span class="special">()</span></code>. It matches characters and executes actions
4471 but never backtracks or unwinds. For example, imagine the above example had
4472 been written as follows:
4474 <pre class="programlisting"><span class="keyword">int</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span>
4475 <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"1!2!3?"</span><span class="special">);</span>
4476 <span class="comment">// count all the digits.</span>
4477 <span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">keep</span><span class="special">(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">i</span><span class="special">)</span> <span class="special">]</span> <span class="special">)</span> <span class="special">>></span> <span class="char">'!'</span> <span class="special">);</span>
4478 <span class="identifier">regex_search</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">rex</span><span class="special">);</span>
4479 <span class="identifier">assert</span><span class="special">(</span> <span class="identifier">i</span> <span class="special">==</span> <span class="number">3</span> <span class="special">);</span>
4482 We have wrapped the sub-expression <code class="computeroutput"><span class="identifier">_d</span>
4483 <span class="special">[</span> <span class="special">++</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">i</span><span class="special">)</span> <span class="special">]</span></code> in <code class="computeroutput"><span class="identifier">keep</span><span class="special">()</span></code>.
4484 Now, whenever this regex matches a digit, the action will be queued and then
4485 immediately executed before we try to match a <code class="computeroutput"><span class="char">'!'</span></code>
4486 character. In this case, the action executes three times.
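<p>
The two snippets above can be placed side by side as a pair of small functions. This is only a sketch: the names <code class="computeroutput">count_lazy</code> and <code class="computeroutput">count_immediate</code> are hypothetical, and <code class="computeroutput">ref</code> is written with an explicit namespace qualifier purely as a precaution against name clashes.
</p>
<pre class="programlisting">#include <string>
#include <boost/xpressive/xpressive.hpp>
#include <boost/xpressive/regex_actions.hpp>

namespace xp = boost::xpressive;

// Hypothetical helper: the action is queued and only runs for
// digits that end up in the final match.
int count_lazy(std::string const &str)
{
    using namespace boost::xpressive;
    int i = 0;
    sregex rex = +( _d [ ++xp::ref(i) ] >> '!' );
    regex_search(str, rex);
    return i;
}

// Hypothetical helper: keep() runs the action as soon as the
// digit has matched, before the '!' is tried.
int count_immediate(std::string const &str)
{
    using namespace boost::xpressive;
    int i = 0;
    sregex rex = +( keep( _d [ ++xp::ref(i) ] ) >> '!' );
    regex_search(str, rex);
    return i;
}
</pre>
<p>
For the input <code class="computeroutput">"1!2!3?"</code> used above, <code class="computeroutput">count_lazy</code> returns 2 and <code class="computeroutput">count_immediate</code> returns 3, in line with the two assertions shown earlier. Side effects performed inside <code class="computeroutput">keep()</code> are not undone when the surrounding expression later backtracks.
</p>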
4488 <div class="note"><table border="0" summary="Note">
4490 <td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
4491 <th align="left">Note</th>
4493 <tr><td align="left" valign="top"><p>
As with <code class="computeroutput"><span class="identifier">keep</span><span class="special">()</span></code>,
actions within <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/before.html" title="Function template before">before()</a></code></code>
and <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/after.html" title="Function template after">after()</a></code></code>
are executed early, as soon as their sub-expressions have matched.
4501 <a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.h4"></a>
<span class="phrase"><a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.lazy_functions"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.lazy_functions">Lazy
Functions</a>
4506 So far, we've seen how to write semantic actions consisting of variables
4507 and operators. But what if you want to be able to call a function from a
4508 semantic action? Xpressive provides a mechanism to do this.
4511 The first step is to define a function object type. Here, for instance, is
4512 a function object type that calls <code class="computeroutput"><span class="identifier">push</span><span class="special">()</span></code> on its argument:
4514 <pre class="programlisting"><span class="keyword">struct</span> <span class="identifier">push_impl</span>
4515 <span class="special">{</span>
4516 <span class="comment">// Result type, needed for tr1::result_of</span>
4517 <span class="keyword">typedef</span> <span class="keyword">void</span> <span class="identifier">result_type</span><span class="special">;</span>
4519 <span class="keyword">template</span><span class="special"><</span><span class="keyword">typename</span> <span class="identifier">Sequence</span><span class="special">,</span> <span class="keyword">typename</span> <span class="identifier">Value</span><span class="special">></span>
4520 <span class="keyword">void</span> <span class="keyword">operator</span><span class="special">()(</span><span class="identifier">Sequence</span> <span class="special">&</span><span class="identifier">seq</span><span class="special">,</span> <span class="identifier">Value</span> <span class="keyword">const</span> <span class="special">&</span><span class="identifier">val</span><span class="special">)</span> <span class="keyword">const</span>
4521 <span class="special">{</span>
4522 <span class="identifier">seq</span><span class="special">.</span><span class="identifier">push</span><span class="special">(</span><span class="identifier">val</span><span class="special">);</span>
4523 <span class="special">}</span>
4524 <span class="special">};</span>
4527 The next step is to use xpressive's <code class="computeroutput"><span class="identifier">function</span><span class="special"><></span></code> template to define a function object
4528 named <code class="computeroutput"><span class="identifier">push</span></code>:
4530 <pre class="programlisting"><span class="comment">// Global "push" function object.</span>
4531 <span class="identifier">function</span><span class="special"><</span><span class="identifier">push_impl</span><span class="special">>::</span><span class="identifier">type</span> <span class="keyword">const</span> <span class="identifier">push</span> <span class="special">=</span> <span class="special">{{}};</span>
4534 The initialization looks a bit odd, but this is because <code class="computeroutput"><span class="identifier">push</span></code>
4535 is being statically initialized. That means it doesn't need to be constructed
4536 at runtime. We can use <code class="computeroutput"><span class="identifier">push</span></code>
4537 in semantic actions as follows:
4539 <pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">stack</span><span class="special"><</span><span class="keyword">int</span><span class="special">></span> <span class="identifier">ints</span><span class="special">;</span>
4540 <span class="comment">// Match digits, cast them to an int</span>
4541 <span class="comment">// and push it on the stack.</span>
4542 <span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">(+</span><span class="identifier">_d</span><span class="special">)[</span><span class="identifier">push</span><span class="special">(</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">ints</span><span class="special">),</span> <span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">_</span><span class="special">))];</span>
4545 You'll notice that doing it this way causes member function invocations to
4546 look like ordinary function invocations. You can choose to write your semantic
action in a different way that makes it look a bit more like a member function
call:
4550 <pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">(+</span><span class="identifier">_d</span><span class="special">)[</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">ints</span><span class="special">)->*</span><span class="identifier">push</span><span class="special">(</span><span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">_</span><span class="special">))];</span>
4553 Xpressive recognizes the use of the <code class="computeroutput"><span class="special">->*</span></code>
4554 and treats this expression exactly the same as the one above.
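<p>
Putting the pieces together, a complete version might look like the sketch below. The names <code class="computeroutput">stack_push_impl</code>, <code class="computeroutput">stack_push</code>, <code class="computeroutput">collect_ints</code> and <code class="computeroutput">collect_ints_member_style</code> are hypothetical. The function object is deliberately not called <code class="computeroutput">push</code>, because <code class="computeroutput"><boost/xpressive/regex_actions.hpp></code> already provides an object of that name and a second global <code class="computeroutput">push</code> would be ambiguous after a using-directive. The sketch also assumes that the numbers in the input are separated by whitespace.
</p>
<pre class="programlisting">#include <stack>
#include <string>
#include <boost/xpressive/xpressive.hpp>
#include <boost/xpressive/regex_actions.hpp>

namespace xp = boost::xpressive;

// Hypothetical function object type that calls push() on its first argument.
struct stack_push_impl
{
    // Result type, needed for tr1::result_of
    typedef void result_type;

    template<typename Sequence, typename Value>
    void operator()(Sequence &seq, Value const &val) const
    {
        seq.push(val);
    }
};

// Statically initialized lazy function object (hypothetical name).
xp::function<stack_push_impl>::type const stack_push = {{}};

// Ordinary call syntax inside the action.
std::stack<int> collect_ints(std::string const &str)
{
    using namespace boost::xpressive;
    std::stack<int> ints;
    // ref is qualified so that argument-dependent lookup on the
    // std::stack argument cannot also pick up std::ref.
    sregex number = (+_d)[ stack_push(xp::ref(ints), as<int>(_)) ];
    sregex rx = number >> *(+_s >> number);
    regex_match(str, rx);
    return ints;
}

// The same action written with the ->* syntax.
std::stack<int> collect_ints_member_style(std::string const &str)
{
    using namespace boost::xpressive;
    std::stack<int> ints;
    sregex number = (+_d)[ xp::ref(ints)->*stack_push(as<int>(_)) ];
    sregex rx = number >> *(+_s >> number);
    regex_match(str, rx);
    return ints;
}
</pre>
<p>
With the input <code class="computeroutput">"1 23 456"</code>, both functions are expected to return a stack of three integers with 456 on top, since the actions run in the order in which they were queued.
</p>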
4557 When your function object must return a type that depends on its arguments,
4558 you can use a <code class="computeroutput"><span class="identifier">result</span><span class="special"><></span></code>
4559 member template instead of the <code class="computeroutput"><span class="identifier">result_type</span></code>
4560 typedef. Here, for example, is a <code class="computeroutput"><span class="identifier">first</span></code>
4561 function object that returns the <code class="computeroutput"><span class="identifier">first</span></code>
4562 member of a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">pair</span><span class="special"><></span></code>
4563 or <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>:
4565 <pre class="programlisting"><span class="comment">// Function object that returns the</span>
4566 <span class="comment">// first element of a pair.</span>
4567 <span class="keyword">struct</span> <span class="identifier">first_impl</span>
4568 <span class="special">{</span>
4569 <span class="keyword">template</span><span class="special"><</span><span class="keyword">typename</span> <span class="identifier">Sig</span><span class="special">></span> <span class="keyword">struct</span> <span class="identifier">result</span> <span class="special">{};</span>
4571 <span class="keyword">template</span><span class="special"><</span><span class="keyword">typename</span> <span class="identifier">This</span><span class="special">,</span> <span class="keyword">typename</span> <span class="identifier">Pair</span><span class="special">></span>
4572 <span class="keyword">struct</span> <span class="identifier">result</span><span class="special"><</span><span class="identifier">This</span><span class="special">(</span><span class="identifier">Pair</span><span class="special">)></span>
4573 <span class="special">{</span>
4574 <span class="keyword">typedef</span> <span class="keyword">typename</span> <span class="identifier">remove_reference</span><span class="special"><</span><span class="identifier">Pair</span><span class="special">></span>
4575 <span class="special">::</span><span class="identifier">type</span><span class="special">::</span><span class="identifier">first_type</span> <span class="identifier">type</span><span class="special">;</span>
4576 <span class="special">};</span>
4578 <span class="keyword">template</span><span class="special"><</span><span class="keyword">typename</span> <span class="identifier">Pair</span><span class="special">></span>
4579 <span class="keyword">typename</span> <span class="identifier">Pair</span><span class="special">::</span><span class="identifier">first_type</span>
4580 <span class="keyword">operator</span><span class="special">()(</span><span class="identifier">Pair</span> <span class="keyword">const</span> <span class="special">&</span><span class="identifier">p</span><span class="special">)</span> <span class="keyword">const</span>
4581 <span class="special">{</span>
4582 <span class="keyword">return</span> <span class="identifier">p</span><span class="special">.</span><span class="identifier">first</span><span class="special">;</span>
4583 <span class="special">}</span>
4584 <span class="special">};</span>
4586 <span class="comment">// OK, use as first(s1) to get the begin iterator</span>
4587 <span class="comment">// of the sub-match referred to by s1.</span>
4588 <span class="identifier">function</span><span class="special"><</span><span class="identifier">first_impl</span><span class="special">>::</span><span class="identifier">type</span> <span class="keyword">const</span> <span class="identifier">first</span> <span class="special">=</span> <span class="special">{{}};</span>
4591 <a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.h5"></a>
4592 <span class="phrase"><a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.referring_to_local_variables"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.referring_to_local_variables">Referring
4593 to Local Variables</a>
4596 As we've seen in the examples above, we can refer to local variables within
an action using <code class="computeroutput"><span class="identifier">xpressive</span><span class="special">::</span><span class="identifier">ref</span><span class="special">()</span></code>.
4598 Any such variables are held by reference by the regular expression, and care
4599 should be taken to avoid letting those references dangle. For instance, in
4600 the following code, the reference to <code class="computeroutput"><span class="identifier">i</span></code>
4601 is left to dangle when <code class="computeroutput"><span class="identifier">bad_voodoo</span><span class="special">()</span></code> returns:
4603 <pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">bad_voodoo</span><span class="special">()</span>
4604 <span class="special">{</span>
4605 <span class="keyword">int</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span>
4606 <span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">i</span><span class="special">)</span> <span class="special">]</span> <span class="special">>></span> <span class="char">'!'</span> <span class="special">);</span>
4607 <span class="comment">// ERROR! rex refers by reference to a local</span>
4608 <span class="comment">// variable, which will dangle after bad_voodoo()</span>
4609 <span class="comment">// returns.</span>
4610 <span class="keyword">return</span> <span class="identifier">rex</span><span class="special">;</span>
4611 <span class="special">}</span>
</pre>
<p>
When writing semantic actions, it is your responsibility to make sure that
none of the references dangle. One way to do that is to make the
variables shared pointers that the regex holds by value.
</p>
4618 <pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">good_voodoo</span><span class="special">(</span><span class="identifier">boost</span><span class="special">::</span><span class="identifier">shared_ptr</span><span class="special"><</span><span class="keyword">int</span><span class="special">></span> <span class="identifier">pi</span><span class="special">)</span>
4619 <span class="special">{</span>
4620 <span class="comment">// Use val() to hold the shared_ptr by value:</span>
4621 <span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++*</span><span class="identifier">val</span><span class="special">(</span><span class="identifier">pi</span><span class="special">)</span> <span class="special">]</span> <span class="special">>></span> <span class="char">'!'</span> <span class="special">);</span>
4622 <span class="comment">// OK, rex holds a reference count to the integer.</span>
4623 <span class="keyword">return</span> <span class="identifier">rex</span><span class="special">;</span>
4624 <span class="special">}</span>
</pre>
<p>
In the above code, we use <code class="computeroutput"><span class="identifier">xpressive</span><span class="special">::</span><span class="identifier">val</span><span class="special">()</span></code>
to hold the shared pointer by value. That is normally unnecessary, because
local variables appearing in actions are held by value by default, but here
it is required. Had we written the action as <code class="computeroutput"><span class="special">++*</span><span class="identifier">pi</span></code>, it would have executed immediately.
That's because <code class="computeroutput"><span class="special">++*</span><span class="identifier">pi</span></code>
is not an expression template, but <code class="computeroutput"><span class="special">++*</span><span class="identifier">val</span><span class="special">(</span><span class="identifier">pi</span><span class="special">)</span></code> is.
</p>
4635 It can be tedious to wrap all your variables in <code class="computeroutput"><span class="identifier">ref</span><span class="special">()</span></code> and <code class="computeroutput"><span class="identifier">val</span><span class="special">()</span></code> in your semantic actions. Xpressive provides
4636 the <code class="computeroutput"><span class="identifier">reference</span><span class="special"><></span></code>
4637 and <code class="computeroutput"><span class="identifier">value</span><span class="special"><></span></code>
4638 templates to make things easier. The following table shows the equivalencies:
4641 <a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.t0"></a><p class="title"><b>Table 46.12. reference<> and value<></b></p>
<div class="table-contents"><table class="table" summary="reference<> and value<>">
<colgroup>
<col>
<col>
</colgroup>
<thead><tr>
<th>
<p>
This ...
</p>
</th>
<th>
<p>
... is equivalent to this ...
</p>
</th>
</tr></thead>
<tbody>
<tr>
<td>
<pre class="programlisting"><span class="keyword">int</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span>
<span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">i</span><span class="special">)</span> <span class="special">]</span> <span class="special">>></span> <span class="char">'!'</span> <span class="special">);</span></pre>
</td>
<td>
<pre class="programlisting"><span class="keyword">int</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span>
<span class="identifier">reference</span><span class="special"><</span><span class="keyword">int</span><span class="special">></span> <span class="identifier">ri</span><span class="special">(</span><span class="identifier">i</span><span class="special">);</span>
<span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++</span><span class="identifier">ri</span> <span class="special">]</span> <span class="special">>></span> <span class="char">'!'</span> <span class="special">);</span></pre>
</td>
</tr>
<tr>
<td>
<pre class="programlisting"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">shared_ptr</span><span class="special"><</span><span class="keyword">int</span><span class="special">></span> <span class="identifier">pi</span><span class="special">(</span><span class="keyword">new</span> <span class="keyword">int</span><span class="special">(</span><span class="number">0</span><span class="special">));</span>
<span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++*</span><span class="identifier">val</span><span class="special">(</span><span class="identifier">pi</span><span class="special">)</span> <span class="special">]</span> <span class="special">>></span> <span class="char">'!'</span> <span class="special">);</span></pre>
</td>
<td>
<pre class="programlisting"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">shared_ptr</span><span class="special"><</span><span class="keyword">int</span><span class="special">></span> <span class="identifier">pi</span><span class="special">(</span><span class="keyword">new</span> <span class="keyword">int</span><span class="special">(</span><span class="number">0</span><span class="special">));</span>
<span class="identifier">value</span><span class="special"><</span><span class="identifier">boost</span><span class="special">::</span><span class="identifier">shared_ptr</span><span class="special"><</span><span class="keyword">int</span><span class="special">></span> <span class="special">></span> <span class="identifier">vpi</span><span class="special">(</span><span class="identifier">pi</span><span class="special">);</span>
<span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++*</span><span class="identifier">vpi</span> <span class="special">]</span> <span class="special">>></span> <span class="char">'!'</span> <span class="special">);</span></pre>
</td>
</tr>
</tbody>
</table></div>
4703 <br class="table-break"><p>
4704 As you can see, when using <code class="computeroutput"><span class="identifier">reference</span><span class="special"><></span></code>, you need to first declare a local
4705 variable and then declare a <code class="computeroutput"><span class="identifier">reference</span><span class="special"><></span></code> to it. These two steps can be combined
4706 into one using <code class="computeroutput"><span class="identifier">local</span><span class="special"><></span></code>.
4709 <a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.t1"></a><p class="title"><b>Table 46.13. local<> vs. reference<></b></p>
<div class="table-contents"><table class="table" summary="local<> vs. reference<>">
<colgroup>
<col>
<col>
</colgroup>
<thead><tr>
<th>
<p>
This ...
</p>
</th>
<th>
<p>
... is equivalent to this ...
</p>
</th>
</tr></thead>
<tbody><tr>
<td>
<pre class="programlisting"><span class="identifier">local</span><span class="special"><</span><span class="keyword">int</span><span class="special">></span> <span class="identifier">i</span><span class="special">(</span><span class="number">0</span><span class="special">);</span>
<span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++</span><span class="identifier">i</span> <span class="special">]</span> <span class="special">>></span> <span class="char">'!'</span> <span class="special">);</span></pre>
</td>
<td>
<pre class="programlisting"><span class="keyword">int</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span>
<span class="identifier">reference</span><span class="special"><</span><span class="keyword">int</span><span class="special">></span> <span class="identifier">ri</span><span class="special">(</span><span class="identifier">i</span><span class="special">);</span>
<span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++</span><span class="identifier">ri</span> <span class="special">]</span> <span class="special">>></span> <span class="char">'!'</span> <span class="special">);</span></pre>
</td>
</tr></tbody>
</table></div>
4749 <br class="table-break"><p>
4750 We can use <code class="computeroutput"><span class="identifier">local</span><span class="special"><></span></code>
4751 to rewrite the above example as follows:
4753 <pre class="programlisting"><span class="identifier">local</span><span class="special"><</span><span class="keyword">int</span><span class="special">></span> <span class="identifier">i</span><span class="special">(</span><span class="number">0</span><span class="special">);</span>
4754 <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"1!2!3?"</span><span class="special">);</span>
4755 <span class="comment">// count the exciting digits, but not the</span>
4756 <span class="comment">// questionable ones.</span>
4757 <span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++</span><span class="identifier">i</span> <span class="special">]</span> <span class="special">>></span> <span class="char">'!'</span> <span class="special">);</span>
4758 <span class="identifier">regex_search</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">rex</span><span class="special">);</span>
4759 <span class="identifier">assert</span><span class="special">(</span> <span class="identifier">i</span><span class="special">.</span><span class="identifier">get</span><span class="special">()</span> <span class="special">==</span> <span class="number">2</span> <span class="special">);</span>
4762 Notice that we use <code class="computeroutput"><span class="identifier">local</span><span class="special"><>::</span><span class="identifier">get</span><span class="special">()</span></code> to access the value of the local variable.
4763 Also, beware that <code class="computeroutput"><span class="identifier">local</span><span class="special"><></span></code>
4764 can be used to create a dangling reference, just as <code class="computeroutput"><span class="identifier">reference</span><span class="special"><></span></code> can.
4767 <a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.h6"></a>
4768 <span class="phrase"><a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.referring_to_non_local_variables"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.referring_to_non_local_variables">Referring
4769 to Non-Local Variables</a>
At the beginning of this section, we used a regex with a semantic action
to parse a string of word/integer pairs and stuff them into a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><></span></code>. That required that the map and the
regex be defined together and used before either could go out of scope. What
if we wanted to define the regex once and use it to fill lots of different
maps? We would rather pass the map into the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
algorithm than embed a reference to it directly in the regex object.
To do that, we can define a placeholder and use it in the semantic
action instead of the map itself. Later, when we call one of the regex algorithms,
we can bind the placeholder to an actual map object. The following code shows
how.
4783 <pre class="programlisting"><span class="comment">// Define a placeholder for a map object:</span>
4784 <span class="identifier">placeholder</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="keyword">int</span><span class="special">></span> <span class="special">></span> <span class="identifier">_map</span><span class="special">;</span>
4786 <span class="comment">// Match a word and an integer, separated by =>,</span>
4787 <span class="comment">// and then stuff the result into a std::map<></span>
4788 <span class="identifier">sregex</span> <span class="identifier">pair</span> <span class="special">=</span> <span class="special">(</span> <span class="special">(</span><span class="identifier">s1</span><span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">)</span> <span class="special">>></span> <span class="string">"=>"</span> <span class="special">>></span> <span class="special">(</span><span class="identifier">s2</span><span class="special">=</span> <span class="special">+</span><span class="identifier">_d</span><span class="special">)</span> <span class="special">)</span>
4789 <span class="special">[</span> <span class="identifier">_map</span><span class="special">[</span><span class="identifier">s1</span><span class="special">]</span> <span class="special">=</span> <span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">s2</span><span class="special">)</span> <span class="special">];</span>
4791 <span class="comment">// Match one or more word/integer pairs, separated</span>
4792 <span class="comment">// by whitespace.</span>
4793 <span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="identifier">pair</span> <span class="special">>></span> <span class="special">*(+</span><span class="identifier">_s</span> <span class="special">>></span> <span class="identifier">pair</span><span class="special">);</span>
4795 <span class="comment">// The string to parse</span>
4796 <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"aaa=>1 bbb=>23 ccc=>456"</span><span class="special">);</span>
4798 <span class="comment">// Here is the actual map to fill in:</span>
4799 <span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="keyword">int</span><span class="special">></span> <span class="identifier">result</span><span class="special">;</span>
4801 <span class="comment">// Bind the _map placeholder to the actual map</span>
4802 <span class="identifier">smatch</span> <span class="identifier">what</span><span class="special">;</span>
4803 <span class="identifier">what</span><span class="special">.</span><span class="identifier">let</span><span class="special">(</span> <span class="identifier">_map</span> <span class="special">=</span> <span class="identifier">result</span> <span class="special">);</span>
4805 <span class="comment">// Execute the match and fill in result map</span>
4806 <span class="keyword">if</span><span class="special">(</span><span class="identifier">regex_match</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">rx</span><span class="special">))</span>
4807 <span class="special">{</span>
4808 <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">result</span><span class="special">[</span><span class="string">"aaa"</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
4809 <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">result</span><span class="special">[</span><span class="string">"bbb"</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
4810 <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">result</span><span class="special">[</span><span class="string">"ccc"</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
4811 <span class="special">}</span>
</pre>
<p>
This program displays:
</p>
<pre class="programlisting">1
23
456
</pre>
4821 We use <code class="computeroutput"><span class="identifier">placeholder</span><span class="special"><></span></code>
4822 here to define <code class="computeroutput"><span class="identifier">_map</span></code>, which
4823 stands in for a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><></span></code>
4824 variable. We can use the placeholder in the semantic action as if it were
4825 a map. Then, we define a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
4826 struct and bind an actual map to the placeholder with "<code class="computeroutput"><span class="identifier">what</span><span class="special">.</span><span class="identifier">let</span><span class="special">(</span> <span class="identifier">_map</span> <span class="special">=</span> <span class="identifier">result</span> <span class="special">);</span></code>". The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
4827 call behaves as if the placeholder in the semantic action had been replaced
4828 with a reference to <code class="computeroutput"><span class="identifier">result</span></code>.
4830 <div class="note"><table border="0" summary="Note">
4832 <td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
4833 <th align="left">Note</th>
4835 <tr><td align="left" valign="top"><p>
4836 Placeholders in semantic actions are not <span class="emphasis"><em>actually</em></span>
4837 replaced at runtime with references to variables. The regex object is never
mutated in any way during any of the regex algorithms, so it is safe
to use from multiple threads.
4843 The syntax for late-bound action arguments is a little different if you are
4844 using <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_iterator.html" title="Struct template regex_iterator">regex_iterator<></a></code></code>
4845 or <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>.
4846 The regex iterators accept an extra constructor parameter for specifying
4847 the argument bindings. There is a <code class="computeroutput"><span class="identifier">let</span><span class="special">()</span></code> function that you can use to bind variables
4848 to their placeholders. The following code demonstrates how.
4850 <pre class="programlisting"><span class="comment">// Define a placeholder for a map object:</span>
4851 <span class="identifier">placeholder</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="keyword">int</span><span class="special">></span> <span class="special">></span> <span class="identifier">_map</span><span class="special">;</span>
4853 <span class="comment">// Match a word and an integer, separated by =>,</span>
4854 <span class="comment">// and then stuff the result into a std::map<></span>
4855 <span class="identifier">sregex</span> <span class="identifier">pair</span> <span class="special">=</span> <span class="special">(</span> <span class="special">(</span><span class="identifier">s1</span><span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">)</span> <span class="special">>></span> <span class="string">"=>"</span> <span class="special">>></span> <span class="special">(</span><span class="identifier">s2</span><span class="special">=</span> <span class="special">+</span><span class="identifier">_d</span><span class="special">)</span> <span class="special">)</span>
4856 <span class="special">[</span> <span class="identifier">_map</span><span class="special">[</span><span class="identifier">s1</span><span class="special">]</span> <span class="special">=</span> <span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">s2</span><span class="special">)</span> <span class="special">];</span>
4858 <span class="comment">// The string to parse</span>
4859 <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"aaa=>1 bbb=>23 ccc=>456"</span><span class="special">);</span>
4861 <span class="comment">// Here is the actual map to fill in:</span>
4862 <span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="keyword">int</span><span class="special">></span> <span class="identifier">result</span><span class="special">;</span>
4864 <span class="comment">// Create a regex_iterator to find all the matches</span>
4865 <span class="identifier">sregex_iterator</span> <span class="identifier">it</span><span class="special">(</span><span class="identifier">str</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">str</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">pair</span><span class="special">,</span> <span class="identifier">let</span><span class="special">(</span><span class="identifier">_map</span><span class="special">=</span><span class="identifier">result</span><span class="special">));</span>
4866 <span class="identifier">sregex_iterator</span> <span class="identifier">end</span><span class="special">;</span>
4868 <span class="comment">// step through all the matches, and fill in</span>
4869 <span class="comment">// the result map</span>
4870 <span class="keyword">while</span><span class="special">(</span><span class="identifier">it</span> <span class="special">!=</span> <span class="identifier">end</span><span class="special">)</span>
4871 <span class="special">++</span><span class="identifier">it</span><span class="special">;</span>
4873 <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">result</span><span class="special">[</span><span class="string">"aaa"</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
4874 <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">result</span><span class="special">[</span><span class="string">"bbb"</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
4875 <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">result</span><span class="special">[</span><span class="string">"ccc"</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
4878 This program displays:
4880 <pre class="programlisting">1
4885 <a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.h7"></a>
4886 <span class="phrase"><a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.user_defined_assertions"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.user_defined_assertions">User-Defined
4890 You are probably already familiar with regular expression <span class="emphasis"><em>assertions</em></span>.
4891 In Perl, some examples are the <code class="literal">^</code> and <code class="literal">$</code>
4892 assertions, which you can use to match the beginning and end of a string,
4893 respectively. Xpressive lets you define your own assertions. A custom assertion
is a condition which must be true at a point in the match in order for the
4895 match to succeed. You can check a custom assertion with xpressive's <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/check.html" title="Function template check">check()</a></code></code> function.
4898 There are a couple of ways to define a custom assertion. The simplest is
4899 to use a function object. Let's say that you want to ensure that a sub-expression
4900 matches a sub-string that is either 3 or 6 characters long. The following
4901 struct defines such a predicate:
4903 <pre class="programlisting"><span class="comment">// A predicate that is true IFF a sub-match is</span>
4904 <span class="comment">// either 3 or 6 characters long.</span>
4905 <span class="keyword">struct</span> <span class="identifier">three_or_six</span>
4906 <span class="special">{</span>
4907 <span class="keyword">bool</span> <span class="keyword">operator</span><span class="special">()(</span><span class="identifier">ssub_match</span> <span class="keyword">const</span> <span class="special">&</span><span class="identifier">sub</span><span class="special">)</span> <span class="keyword">const</span>
4908 <span class="special">{</span>
4909 <span class="keyword">return</span> <span class="identifier">sub</span><span class="special">.</span><span class="identifier">length</span><span class="special">()</span> <span class="special">==</span> <span class="number">3</span> <span class="special">||</span> <span class="identifier">sub</span><span class="special">.</span><span class="identifier">length</span><span class="special">()</span> <span class="special">==</span> <span class="number">6</span><span class="special">;</span>
4910 <span class="special">}</span>
4911 <span class="special">};</span>
4914 You can use this predicate within a regular expression as follows:
4916 <pre class="programlisting"><span class="comment">// match words of 3 characters or 6 characters.</span>
4917 <span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">bow</span> <span class="special">>></span> <span class="special">+</span><span class="identifier">_w</span> <span class="special">>></span> <span class="identifier">eow</span><span class="special">)[</span> <span class="identifier">check</span><span class="special">(</span><span class="identifier">three_or_six</span><span class="special">())</span> <span class="special">]</span> <span class="special">;</span>
4920 The above regular expression will find whole words that are either 3 or 6
4921 characters long. The <code class="computeroutput"><span class="identifier">three_or_six</span></code>
4922 predicate accepts a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>
4923 that refers back to the part of the string matched by the sub-expression
4924 to which the custom assertion is attached.
4926 <div class="note"><table border="0" summary="Note">
4928 <td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
4929 <th align="left">Note</th>
4931 <tr><td align="left" valign="top"><p>
4932 The custom assertion participates in determining whether the match succeeds
4933 or fails. Unlike actions, which execute lazily, custom assertions execute
4934 immediately while the regex engine is searching for a match.
4938 Custom assertions can also be defined inline using the same syntax as for
4939 semantic actions. Below is the same custom assertion written inline:
4941 <pre class="programlisting"><span class="comment">// match words of 3 characters or 6 characters.</span>
4942 <span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">bow</span> <span class="special">>></span> <span class="special">+</span><span class="identifier">_w</span> <span class="special">>></span> <span class="identifier">eow</span><span class="special">)[</span> <span class="identifier">check</span><span class="special">(</span><span class="identifier">length</span><span class="special">(</span><span class="identifier">_</span><span class="special">)==</span><span class="number">3</span> <span class="special">||</span> <span class="identifier">length</span><span class="special">(</span><span class="identifier">_</span><span class="special">)==</span><span class="number">6</span><span class="special">)</span> <span class="special">]</span> <span class="special">;</span>
4945 In the above, <code class="computeroutput"><span class="identifier">length</span><span class="special">()</span></code>
4946 is a lazy function that calls the <code class="computeroutput"><span class="identifier">length</span><span class="special">()</span></code> member function of its argument, and <code class="computeroutput"><span class="identifier">_</span></code> is a placeholder that receives the <code class="computeroutput"><span class="identifier">sub_match</span></code>.
4949 Once you get the hang of writing custom assertions inline, they can be very
4950 powerful. For example, you can write a regular expression that only matches
4951 valid dates (for some suitably liberal definition of the term <span class="quote">“<span class="quote">valid</span>”</span>).
4953 <pre class="programlisting"><span class="keyword">int</span> <span class="keyword">const</span> <span class="identifier">days_per_month</span><span class="special">[]</span> <span class="special">=</span>
<span class="special">{</span><span class="number">31</span><span class="special">,</span> <span class="number">29</span><span class="special">,</span> <span class="number">31</span><span class="special">,</span> <span class="number">30</span><span class="special">,</span> <span class="number">31</span><span class="special">,</span> <span class="number">30</span><span class="special">,</span> <span class="number">31</span><span class="special">,</span> <span class="number">31</span><span class="special">,</span> <span class="number">30</span><span class="special">,</span> <span class="number">31</span><span class="special">,</span> <span class="number">30</span><span class="special">,</span> <span class="number">31</span><span class="special">};</span>
4956 <span class="identifier">mark_tag</span> <span class="identifier">month</span><span class="special">(</span><span class="number">1</span><span class="special">),</span> <span class="identifier">day</span><span class="special">(</span><span class="number">2</span><span class="special">);</span>
4957 <span class="comment">// find a valid date of the form month/day/year.</span>
4958 <span class="identifier">sregex</span> <span class="identifier">date</span> <span class="special">=</span>
4959 <span class="special">(</span>
4960 <span class="comment">// Month must be between 1 and 12 inclusive</span>
4961 <span class="special">(</span><span class="identifier">month</span><span class="special">=</span> <span class="identifier">_d</span> <span class="special">>></span> <span class="special">!</span><span class="identifier">_d</span><span class="special">)</span> <span class="special">[</span> <span class="identifier">check</span><span class="special">(</span><span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">_</span><span class="special">)</span> <span class="special">>=</span> <span class="number">1</span>
4962 <span class="special">&&</span> <span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">_</span><span class="special">)</span> <span class="special"><=</span> <span class="number">12</span><span class="special">)</span> <span class="special">]</span>
4963 <span class="special">>></span> <span class="char">'/'</span>
4964 <span class="comment">// Day must be between 1 and 31 inclusive</span>
4965 <span class="special">>></span> <span class="special">(</span><span class="identifier">day</span><span class="special">=</span> <span class="identifier">_d</span> <span class="special">>></span> <span class="special">!</span><span class="identifier">_d</span><span class="special">)</span> <span class="special">[</span> <span class="identifier">check</span><span class="special">(</span><span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">_</span><span class="special">)</span> <span class="special">>=</span> <span class="number">1</span>
4966 <span class="special">&&</span> <span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">_</span><span class="special">)</span> <span class="special"><=</span> <span class="number">31</span><span class="special">)</span> <span class="special">]</span>
4967 <span class="special">>></span> <span class="char">'/'</span>
4968 <span class="comment">// Only consider years between 1970 and 2038</span>
4969 <span class="special">>></span> <span class="special">(</span><span class="identifier">_d</span> <span class="special">>></span> <span class="identifier">_d</span> <span class="special">>></span> <span class="identifier">_d</span> <span class="special">>></span> <span class="identifier">_d</span><span class="special">)</span> <span class="special">[</span> <span class="identifier">check</span><span class="special">(</span><span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">_</span><span class="special">)</span> <span class="special">>=</span> <span class="number">1970</span>
4970 <span class="special">&&</span> <span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">_</span><span class="special">)</span> <span class="special"><=</span> <span class="number">2038</span><span class="special">)</span> <span class="special">]</span>
4971 <span class="special">)</span>
4972 <span class="comment">// Ensure the month actually has that many days!</span>
4973 <span class="special">[</span> <span class="identifier">check</span><span class="special">(</span> <span class="identifier">ref</span><span class="special">(</span><span class="identifier">days_per_month</span><span class="special">)[</span><span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">month</span><span class="special">)-</span><span class="number">1</span><span class="special">]</span> <span class="special">>=</span> <span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">day</span><span class="special">)</span> <span class="special">)</span> <span class="special">]</span>
4974 <span class="special">;</span>
4976 <span class="identifier">smatch</span> <span class="identifier">what</span><span class="special">;</span>
4977 <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"99/99/9999 2/30/2006 2/28/2006"</span><span class="special">);</span>
4979 <span class="keyword">if</span><span class="special">(</span><span class="identifier">regex_search</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">date</span><span class="special">))</span>
4980 <span class="special">{</span>
4981 <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">]</span> <span class="special"><<</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
4982 <span class="special">}</span>
4985 The above program prints out the following:
4987 <pre class="programlisting">2/28/2006
4990 Notice how the inline custom assertions are used to range-check the values
4991 for the month, day and year. The regular expression doesn't match <code class="computeroutput"><span class="string">"99/99/9999"</span></code> or <code class="computeroutput"><span class="string">"2/30/2006"</span></code>
4992 because they are not valid dates. (There is no 99th month, and February doesn't
4996 <div class="section">
4997 <div class="titlepage"><div><div><h3 class="title">
4998 <a name="boost_xpressive.user_s_guide.symbol_tables_and_attributes"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.symbol_tables_and_attributes" title="Symbol Tables and Attributes">Symbol
4999 Tables and Attributes</a>
5000 </h3></div></div></div>
5002 <a name="boost_xpressive.user_s_guide.symbol_tables_and_attributes.h0"></a>
5003 <span class="phrase"><a name="boost_xpressive.user_s_guide.symbol_tables_and_attributes.overview"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.symbol_tables_and_attributes.overview">Overview</a>
5006 Symbol tables can be built into xpressive regular expressions with just a
5007 <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><></span></code>.
5008 The map keys are the strings to be matched and the map values are the data
5009 to be returned to your semantic action. Xpressive attributes, named <code class="computeroutput"><span class="identifier">a1</span></code>, <code class="computeroutput"><span class="identifier">a2</span></code>,
5010 through <code class="computeroutput"><span class="identifier">a9</span></code>, hold the value
5011 corresponding to a matching key so that it can be used in a semantic action.
5012 A default value can be specified for an attribute if a symbol is not found.
5015 <a name="boost_xpressive.user_s_guide.symbol_tables_and_attributes.h1"></a>
5016 <span class="phrase"><a name="boost_xpressive.user_s_guide.symbol_tables_and_attributes.symbol_tables"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.symbol_tables_and_attributes.symbol_tables">Symbol
5020 An xpressive symbol table is just a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><></span></code>,
5021 where the key is a string type and the value can be anything. For example,
5022 the following regular expression matches a key from map1 and assigns the
5023 corresponding value to the attribute <code class="computeroutput"><span class="identifier">a1</span></code>.
5024 Then, in the semantic action, it assigns the value stored in attribute <code class="computeroutput"><span class="identifier">a1</span></code> to an integer result.
5026 <pre class="programlisting"><span class="keyword">int</span> <span class="identifier">result</span><span class="special">;</span>
5027 <span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="keyword">int</span><span class="special">></span> <span class="identifier">map1</span><span class="special">;</span>
5028 <span class="comment">// ... (fill the map)</span>
5029 <span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="special">(</span> <span class="identifier">a1</span> <span class="special">=</span> <span class="identifier">map1</span> <span class="special">)</span> <span class="special">[</span> <span class="identifier">ref</span><span class="special">(</span><span class="identifier">result</span><span class="special">)</span> <span class="special">=</span> <span class="identifier">a1</span> <span class="special">];</span>
5032 Consider the following example code, which translates number names into integers.
5033 It is described below.
5035 <pre class="programlisting"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">string</span><span class="special">></span>
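<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iostream</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">map</span><span class="special">></span>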
5037 <span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span>
5038 <span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">regex_actions</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span>
5039 <span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>
5041 <span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
5042 <span class="special">{</span>
5043 <span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="keyword">int</span><span class="special">></span> <span class="identifier">number_map</span><span class="special">;</span>
5044 <span class="identifier">number_map</span><span class="special">[</span><span class="string">"one"</span><span class="special">]</span> <span class="special">=</span> <span class="number">1</span><span class="special">;</span>
5045 <span class="identifier">number_map</span><span class="special">[</span><span class="string">"two"</span><span class="special">]</span> <span class="special">=</span> <span class="number">2</span><span class="special">;</span>
5046 <span class="identifier">number_map</span><span class="special">[</span><span class="string">"three"</span><span class="special">]</span> <span class="special">=</span> <span class="number">3</span><span class="special">;</span>
5047 <span class="comment">// Match a string from number_map</span>
5048 <span class="comment">// and store the integer value in 'result'</span>
5049 <span class="comment">// if not found, store -1 in 'result'</span>
5050 <span class="keyword">int</span> <span class="identifier">result</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span>
5051 <span class="identifier">cregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="special">((</span><span class="identifier">a1</span> <span class="special">=</span> <span class="identifier">number_map</span> <span class="special">)</span> <span class="special">|</span> <span class="special">*</span><span class="identifier">_</span><span class="special">)</span>
5052 <span class="special">[</span> <span class="identifier">ref</span><span class="special">(</span><span class="identifier">result</span><span class="special">)</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">a1</span> <span class="special">|</span> <span class="special">-</span><span class="number">1</span><span class="special">)];</span>
5054 <span class="identifier">regex_match</span><span class="special">(</span><span class="string">"three"</span><span class="special">,</span> <span class="identifier">rx</span><span class="special">);</span>
5055 <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">result</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
5056 <span class="identifier">regex_match</span><span class="special">(</span><span class="string">"two"</span><span class="special">,</span> <span class="identifier">rx</span><span class="special">);</span>
5057 <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">result</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
5058 <span class="identifier">regex_match</span><span class="special">(</span><span class="string">"stuff"</span><span class="special">,</span> <span class="identifier">rx</span><span class="special">);</span>
5059 <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">result</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
5060 <span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
5061 <span class="special">}</span>
5064 This program prints the following:
5066 <pre class="programlisting">3
5071 First the program builds a number map, with number names as string keys and
5072 the corresponding integers as values. Then it constructs a static regular
5073 expression using an attribute <code class="computeroutput"><span class="identifier">a1</span></code>
5074 to represent the result of the symbol table lookup. In the semantic action,
5075 the attribute is assigned to an integer variable <code class="computeroutput"><span class="identifier">result</span></code>.
5076 If the symbol was not found, a default value of <code class="computeroutput"><span class="special">-</span><span class="number">1</span></code> is assigned to <code class="computeroutput"><span class="identifier">result</span></code>.
5077 A wildcard, <code class="computeroutput"><span class="special">*</span><span class="identifier">_</span></code>,
5078 makes sure the regex matches even if the symbol is not found.
5081 A more complete version of this example can be found in <code class="literal">libs/xpressive/example/numbers.cpp</code><a href="#ftn.boost_xpressive.user_s_guide.symbol_tables_and_attributes.f0" class="footnote" name="boost_xpressive.user_s_guide.symbol_tables_and_attributes.f0"><sup class="footnote">[37]</sup></a>. It translates number names up to "nine hundred ninety nine
5082 million nine hundred ninety nine thousand nine hundred ninety nine"
5083 along with some special number names like "dozen".
Symbol table matches are case-sensitive by default, but they can be made
5087 case-insensitive by enclosing the expression in <code class="computeroutput"><span class="identifier">icase</span><span class="special">()</span></code>.
5090 <a name="boost_xpressive.user_s_guide.symbol_tables_and_attributes.h2"></a>
5091 <span class="phrase"><a name="boost_xpressive.user_s_guide.symbol_tables_and_attributes.attributes"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.symbol_tables_and_attributes.attributes">Attributes</a>
5094 Up to nine attributes can be used in a regular expression. They are named
5095 <code class="computeroutput"><span class="identifier">a1</span></code>, <code class="computeroutput"><span class="identifier">a2</span></code>,
5096 ..., <code class="computeroutput"><span class="identifier">a9</span></code> in the <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span></code> namespace. The attribute type
5097 is the same as the second component of the map that is assigned to it. A
5098 default value for an attribute can be specified in a semantic action with
5099 the syntax <code class="computeroutput"><span class="special">(</span><span class="identifier">a1</span>
5100 <span class="special">|</span> <em class="replaceable"><code>default-value</code></em><span class="special">)</span></code>.
5103 Attributes are properly scoped, so you can do crazy things like: <code class="computeroutput"><span class="special">(</span> <span class="special">(</span><span class="identifier">a1</span><span class="special">=</span><span class="identifier">sym1</span><span class="special">)</span>
5104 <span class="special">>></span> <span class="special">(</span><span class="identifier">a1</span><span class="special">=</span><span class="identifier">sym2</span><span class="special">)[</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">x</span><span class="special">)=</span><span class="identifier">a1</span><span class="special">]</span> <span class="special">)[</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">y</span><span class="special">)=</span><span class="identifier">a1</span><span class="special">]</span></code>. The
5105 inner semantic action sees the inner <code class="computeroutput"><span class="identifier">a1</span></code>,
5106 and the outer semantic action sees the outer one. They can even have different
5109 <div class="note"><table border="0" summary="Note">
5111 <td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
5112 <th align="left">Note</th>
5114 <tr><td align="left" valign="top"><p>
5115 Xpressive builds a hidden ternary search trie from the map so it can search
5116 quickly. If BOOST_DISABLE_THREADS is defined, the hidden ternary search
5117 trie "self adjusts", so after each search it restructures itself
5118 to improve the efficiency of future searches based on the frequency of
5123 <div class="section">
5124 <div class="titlepage"><div><div><h3 class="title">
5125 <a name="boost_xpressive.user_s_guide.localization_and_regex_traits"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.localization_and_regex_traits" title="Localization and Regex Traits">Localization
5126 and Regex Traits</a>
5127 </h3></div></div></div>
5129 <a name="boost_xpressive.user_s_guide.localization_and_regex_traits.h0"></a>
5130 <span class="phrase"><a name="boost_xpressive.user_s_guide.localization_and_regex_traits.overview"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.localization_and_regex_traits.overview">Overview</a>
5133 Matching a regular expression against a string often requires locale-dependent
5134 information. For example, how are case-insensitive comparisons performed?
5135 The locale-sensitive behavior is captured in a traits class. xpressive provides
5136 three traits class templates: <code class="computeroutput"><span class="identifier">cpp_regex_traits</span><span class="special"><></span></code>, <code class="computeroutput"><span class="identifier">c_regex_traits</span><span class="special"><></span></code> and <code class="computeroutput"><span class="identifier">null_regex_traits</span><span class="special"><></span></code>. The first wraps a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span></code>,
5137 the second wraps the global C locale, and the third is a stub traits type
5138 for use when searching non-character data. All traits templates conform to
5139 the <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.concepts.traits_requirements">Regex
5143 <a name="boost_xpressive.user_s_guide.localization_and_regex_traits.h1"></a>
5144 <span class="phrase"><a name="boost_xpressive.user_s_guide.localization_and_regex_traits.setting_the_default_regex_trait"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.localization_and_regex_traits.setting_the_default_regex_trait">Setting
5145 the Default Regex Trait</a>
5148 By default, xpressive uses <code class="computeroutput"><span class="identifier">cpp_regex_traits</span><span class="special"><></span></code> for all patterns. This causes all
5149 regex objects to use the global <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span></code>.
5150 If you compile with <code class="computeroutput"><span class="identifier">BOOST_XPRESSIVE_USE_C_TRAITS</span></code>
5151 defined, then xpressive will use <code class="computeroutput"><span class="identifier">c_regex_traits</span><span class="special"><></span></code> by default.
5154 <a name="boost_xpressive.user_s_guide.localization_and_regex_traits.h2"></a>
5155 <span class="phrase"><a name="boost_xpressive.user_s_guide.localization_and_regex_traits.using_custom_traits_with_dynamic_regexes"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.localization_and_regex_traits.using_custom_traits_with_dynamic_regexes">Using
5156 Custom Traits with Dynamic Regexes</a>
5159 To create a dynamic regex that uses a custom traits object, you must use
5160 <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>.
5161 The basic steps are shown in the following example:
5163 <pre class="programlisting"><span class="comment">// Declare a regex_compiler that uses the global C locale</span>
5164 <span class="identifier">regex_compiler</span><span class="special"><</span><span class="keyword">char</span> <span class="keyword">const</span> <span class="special">*,</span> <span class="identifier">c_regex_traits</span><span class="special"><</span><span class="keyword">char</span><span class="special">></span> <span class="special">></span> <span class="identifier">crxcomp</span><span class="special">;</span>
5165 <span class="identifier">cregex</span> <span class="identifier">crx</span> <span class="special">=</span> <span class="identifier">crxcomp</span><span class="special">.</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"\\w+"</span> <span class="special">);</span>
5167 <span class="comment">// Declare a regex_compiler that uses a custom std::locale</span>
5168 <span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span> <span class="identifier">loc</span> <span class="special">=</span> <span class="comment">/* ... create a locale here ... */</span><span class="special">;</span>
5169 <span class="identifier">regex_compiler</span><span class="special"><</span><span class="keyword">char</span> <span class="keyword">const</span> <span class="special">*,</span> <span class="identifier">cpp_regex_traits</span><span class="special"><</span><span class="keyword">char</span><span class="special">></span> <span class="special">></span> <span class="identifier">cpprxcomp</span><span class="special">(</span><span class="identifier">loc</span><span class="special">);</span>
5170 <span class="identifier">cregex</span> <span class="identifier">cpprx</span> <span class="special">=</span> <span class="identifier">cpprxcomp</span><span class="special">.</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"\\w+"</span> <span class="special">);</span>
5173 The <code class="computeroutput"><span class="identifier">regex_compiler</span></code> objects
5174 act as regex factories. Once they have been imbued with a locale, every regex
5175 object they create will use that locale.
5178 <a name="boost_xpressive.user_s_guide.localization_and_regex_traits.h3"></a>
5179 <span class="phrase"><a name="boost_xpressive.user_s_guide.localization_and_regex_traits.using_custom_traits_with_static_regexes"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.localization_and_regex_traits.using_custom_traits_with_static_regexes">Using
5180 Custom Traits with Static Regexes</a>
5183 If you want a particular static regex to use a different set of traits, you
5184 can use the special <code class="computeroutput"><span class="identifier">imbue</span><span class="special">()</span></code> pattern modifier. For instance:
5186 <pre class="programlisting"><span class="comment">// Define a regex that uses the global C locale</span>
5187 <span class="identifier">c_regex_traits</span><span class="special"><</span><span class="keyword">char</span><span class="special">></span> <span class="identifier">ctraits</span><span class="special">;</span>
5188 <span class="identifier">sregex</span> <span class="identifier">crx</span> <span class="special">=</span> <span class="identifier">imbue</span><span class="special">(</span><span class="identifier">ctraits</span><span class="special">)(</span> <span class="special">+</span><span class="identifier">_w</span> <span class="special">);</span>
5190 <span class="comment">// Define a regex that uses a customized std::locale</span>
5191 <span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span> <span class="identifier">loc</span> <span class="special">=</span> <span class="comment">/* ... create a locale here ... */</span><span class="special">;</span>
5192 <span class="identifier">cpp_regex_traits</span><span class="special"><</span><span class="keyword">char</span><span class="special">></span> <span class="identifier">cpptraits</span><span class="special">(</span><span class="identifier">loc</span><span class="special">);</span>
5193 <span class="identifier">sregex</span> <span class="identifier">cpprx1</span> <span class="special">=</span> <span class="identifier">imbue</span><span class="special">(</span><span class="identifier">cpptraits</span><span class="special">)(</span> <span class="special">+</span><span class="identifier">_w</span> <span class="special">);</span>
5195 <span class="comment">// A shorthand for above</span>
5196 <span class="identifier">sregex</span> <span class="identifier">cpprx2</span> <span class="special">=</span> <span class="identifier">imbue</span><span class="special">(</span><span class="identifier">loc</span><span class="special">)(</span> <span class="special">+</span><span class="identifier">_w</span> <span class="special">);</span>
5199 The <code class="computeroutput"><span class="identifier">imbue</span><span class="special">()</span></code>
5200 pattern modifier must wrap the entire pattern. It is an error to <code class="computeroutput"><span class="identifier">imbue</span></code> only part of a static regex. For example:
5203 <pre class="programlisting"><span class="comment">// ERROR! Cannot imbue() only part of a regex</span>
5204 <span class="identifier">sregex</span> <span class="identifier">error</span> <span class="special">=</span> <span class="identifier">_w</span> <span class="special">>></span> <span class="identifier">imbue</span><span class="special">(</span><span class="identifier">loc</span><span class="special">)(</span> <span class="identifier">_w</span> <span class="special">);</span>
5207 <a name="boost_xpressive.user_s_guide.localization_and_regex_traits.h4"></a>
5208 <span class="phrase"><a name="boost_xpressive.user_s_guide.localization_and_regex_traits.searching_non_character_data_with__literal_null_regex_traits__literal_"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.localization_and_regex_traits.searching_non_character_data_with__literal_null_regex_traits__literal_">Searching
5209 Non-Character Data With <code class="literal">null_regex_traits</code></a>
5212 With xpressive static regexes, you are not limited to searching for patterns
5213 in character sequences. You can search for patterns in raw bytes, integers,
5214 or anything that conforms to the <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.concepts.chart_requirements">Char
5215 Concept</a>. The <code class="computeroutput"><span class="identifier">null_regex_traits</span><span class="special"><></span></code> makes it simple. It is a stub implementation
5216 of the <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.concepts.traits_requirements">Regex
5217 Traits Concept</a>. It recognizes no character classes and does no case-sensitive
5221 For example, with <code class="computeroutput"><span class="identifier">null_regex_traits</span><span class="special"><></span></code>, you can write a static regex to
5222 find a pattern in a sequence of integers as follows:
5224 <pre class="programlisting"><span class="comment">// some integral data to search</span>
5225 <span class="keyword">int</span> <span class="keyword">const</span> <span class="identifier">data</span><span class="special">[]</span> <span class="special">=</span> <span class="special">{</span><span class="number">0</span><span class="special">,</span> <span class="number">1</span><span class="special">,</span> <span class="number">2</span><span class="special">,</span> <span class="number">3</span><span class="special">,</span> <span class="number">4</span><span class="special">,</span> <span class="number">5</span><span class="special">,</span> <span class="number">6</span><span class="special">};</span>
5227 <span class="comment">// create a null_regex_traits<> object for searching integers ...</span>
5228 <span class="identifier">null_regex_traits</span><span class="special"><</span><span class="keyword">int</span><span class="special">></span> <span class="identifier">nul</span><span class="special">;</span>
5230 <span class="comment">// imbue a regex object with the null_regex_traits ...</span>
5231 <span class="identifier">basic_regex</span><span class="special"><</span><span class="keyword">int</span> <span class="keyword">const</span> <span class="special">*></span> <span class="identifier">rex</span> <span class="special">=</span> <span class="identifier">imbue</span><span class="special">(</span><span class="identifier">nul</span><span class="special">)(</span><span class="number">1</span> <span class="special">>></span> <span class="special">+((</span><span class="identifier">set</span><span class="special">=</span> <span class="number">2</span><span class="special">,</span><span class="number">3</span><span class="special">)</span> <span class="special">|</span> <span class="number">4</span><span class="special">)</span> <span class="special">>></span> <span class="number">5</span><span class="special">);</span>
5232 <span class="identifier">match_results</span><span class="special"><</span><span class="keyword">int</span> <span class="keyword">const</span> <span class="special">*></span> <span class="identifier">what</span><span class="special">;</span>
5234 <span class="comment">// search for the pattern in the array of integers ...</span>
5235 <span class="identifier">regex_search</span><span class="special">(</span><span class="identifier">data</span><span class="special">,</span> <span class="identifier">data</span> <span class="special">+</span> <span class="number">7</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">rex</span><span class="special">);</span>
5237 <span class="identifier">assert</span><span class="special">(</span><span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">].</span><span class="identifier">matched</span><span class="special">);</span>
5238 <span class="identifier">assert</span><span class="special">(*</span><span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">].</span><span class="identifier">first</span> <span class="special">==</span> <span class="number">1</span><span class="special">);</span>
5239 <span class="identifier">assert</span><span class="special">(*</span><span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">].</span><span class="identifier">second</span> <span class="special">==</span> <span class="number">6</span><span class="special">);</span>
5242 <div class="section">
5243 <div class="titlepage"><div><div><h3 class="title">
5244 <a name="boost_xpressive.user_s_guide.tips_n_tricks"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks" title="Tips 'N Tricks">Tips 'N Tricks</a>
5245 </h3></div></div></div>
5247 Squeeze the most performance out of xpressive with these tips and tricks.
5250 <a name="boost_xpressive.user_s_guide.tips_n_tricks.h0"></a>
5251 <span class="phrase"><a name="boost_xpressive.user_s_guide.tips_n_tricks.compile_patterns_once_and_reuse_them"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks.compile_patterns_once_and_reuse_them">Compile
5252 Patterns Once And Reuse Them</a>
5255 Compiling a regex (dynamic or static) is <span class="emphasis"><em>far</em></span> more expensive
5256 than executing a match or search. If you have the option, prefer to compile
5257 a pattern into a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>
5258 object once and reuse it rather than recreating it over and over.
5261 Since <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>
5262 objects are not mutated by any of the regex algorithms, they are completely
5263 thread-safe once their initialization (and that of any grammars of which
5264 they are members) completes. The easiest way to reuse your patterns is to
5265 simply make your <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>
5266 objects "static const".
5269 <a name="boost_xpressive.user_s_guide.tips_n_tricks.h1"></a>
5270 <span class="phrase"><a name="boost_xpressive.user_s_guide.tips_n_tricks.reuse__literal__classname_alt__boost__xpressive__match_results__match_results_lt__gt___classname___literal__objects"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks.reuse__literal__classname_alt__boost__xpressive__match_results__match_results_lt__gt___classname___literal__objects">Reuse
5271 match_results<> Objects</a>
5275 The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
5276 object caches dynamically allocated memory. For this reason, it is far better
5277 to reuse the same <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
5278 object if you have to do many regex searches.
5281 Caveat: <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
5282 objects are not thread-safe, so don't go wild reusing them across threads.
5285 <a name="boost_xpressive.user_s_guide.tips_n_tricks.h2"></a>
5286 <span class="phrase"><a name="boost_xpressive.user_s_guide.tips_n_tricks.prefer_algorithms_that_take_a__literal__classname_alt__boost__xpressive__match_results__match_results_lt__gt___classname___literal__object"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks.prefer_algorithms_that_take_a__literal__classname_alt__boost__xpressive__match_results__match_results_lt__gt___classname___literal__object">Prefer
5287 Algorithms That Take A match_results<> Object</a>
5291 This is a corollary to the previous tip. If you are doing multiple searches,
5292 you should prefer the regex algorithms that accept a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
5293 object over the ones that don't, and you should reuse the same <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
5294 object each time. If you don't provide a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
5295 object, a temporary one will be created for you and discarded when the algorithm
5296 returns. Any memory cached in the object will be deallocated and will have
5297 to be reallocated the next time.
5300 <a name="boost_xpressive.user_s_guide.tips_n_tricks.h3"></a>
5301 <span class="phrase"><a name="boost_xpressive.user_s_guide.tips_n_tricks.prefer_algorithms_that_accept_iterator_ranges_over_null_terminated_strings"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks.prefer_algorithms_that_accept_iterator_ranges_over_null_terminated_strings">Prefer
5302 Algorithms That Accept Iterator Ranges Over Null-Terminated Strings</a>
5305 xpressive provides overloads of the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
5306 and <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
5307 algorithms that operate on C-style null-terminated strings. You should prefer
5308 the overloads that take iterator ranges. When you pass a null-terminated
5309 string to a regex algorithm, the end iterator is calculated immediately by
5310 calling <code class="computeroutput"><span class="identifier">strlen</span></code>. If you already
5311 know the length of the string, you can avoid this overhead by calling the
5312 regex algorithms with a <code class="computeroutput"><span class="special">[</span><span class="identifier">begin</span><span class="special">,</span> <span class="identifier">end</span><span class="special">)</span></code> range.
5316 <a name="boost_xpressive.user_s_guide.tips_n_tricks.h4"></a>
5317 <span class="phrase"><a name="boost_xpressive.user_s_guide.tips_n_tricks.use_static_regexes"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks.use_static_regexes">Use Static Regexes</a>
5321 On average, static regexes execute about 10 to 15% faster than their dynamic
5322 counterparts. It's worth familiarizing yourself with the static regex dialect.
5325 <a name="boost_xpressive.user_s_guide.tips_n_tricks.h5"></a>
5326 <span class="phrase"><a name="boost_xpressive.user_s_guide.tips_n_tricks.understand__literal_syntax_option_type__optimize__literal_"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks.understand__literal_syntax_option_type__optimize__literal_">Understand
5327 <code class="literal">syntax_option_type::optimize</code></a>
5330 The <code class="computeroutput"><span class="identifier">optimize</span></code> flag tells the
5331 regex compiler to spend some extra time analyzing the pattern. It can cause
5332 some patterns to execute faster, but it increases the time to compile the
5333 pattern, and often increases the amount of memory consumed by the pattern.
5334 If you plan to reuse your pattern, <code class="computeroutput"><span class="identifier">optimize</span></code>
5335 is usually a win. If you will only use the pattern once, don't use <code class="computeroutput"><span class="identifier">optimize</span></code>.
5338 <a name="boost_xpressive.user_s_guide.tips_n_tricks.h6"></a>
5339 <span class="phrase"><a name="boost_xpressive.user_s_guide.tips_n_tricks.common_pitfalls"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks.common_pitfalls">Common Pitfalls</a>
5343 Keep the following tips in mind to avoid stepping in potholes with xpressive.
5346 <a name="boost_xpressive.user_s_guide.tips_n_tricks.h7"></a>
5347 <span class="phrase"><a name="boost_xpressive.user_s_guide.tips_n_tricks.create_grammars_on_a_single_thread"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks.create_grammars_on_a_single_thread">Create
5348 Grammars On A Single Thread</a>
5351 With static regexes, you can create grammars by nesting regexes inside one
5352 another. When compiling the outer regex, both the outer and inner regex objects,
5353 and all the regex objects to which they refer either directly or indirectly,
5354 are modified. For this reason, it's dangerous for global regex objects to
5355 participate in grammars. It's best to build regex grammars from a single
5356 thread. Once built, the resulting regex grammar can be executed from multiple
5357 threads without problems.
5360 <a name="boost_xpressive.user_s_guide.tips_n_tricks.h8"></a>
5361 <span class="phrase"><a name="boost_xpressive.user_s_guide.tips_n_tricks.beware_nested_quantifiers"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks.beware_nested_quantifiers">Beware
5362 Nested Quantifiers</a>
5365 This is a pitfall common to many regular expression engines. Some patterns
5366 can cause exponentially bad performance. Often these patterns involve one
5367 quantified term nested within another quantifier, such as <code class="computeroutput"><span class="string">"(a*)*"</span></code>, although in many cases,
5368 the problem is harder to spot. Beware of patterns that have nested quantifiers.
5371 <div class="section">
5372 <div class="titlepage"><div><div><h3 class="title">
5373 <a name="boost_xpressive.user_s_guide.concepts"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.concepts" title="Concepts">Concepts</a>
5374 </h3></div></div></div>
5376 <a name="boost_xpressive.user_s_guide.concepts.h0"></a>
5377 <span class="phrase"><a name="boost_xpressive.user_s_guide.concepts.chart_requirements"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.concepts.chart_requirements">CharT Requirements</a>
5381 If type <code class="computeroutput"><span class="identifier">BidiIterT</span></code> is used
5382 as a template argument to <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>,
5383 then <code class="computeroutput"><span class="identifier">CharT</span></code> is <code class="computeroutput"><span class="identifier">iterator_traits</span><span class="special"><</span><span class="identifier">BidiIterT</span><span class="special">>::</span><span class="identifier">value_type</span></code>. Type <code class="computeroutput"><span class="identifier">CharT</span></code>
5384 must have a trivial default constructor, copy constructor, assignment operator,
5385 and destructor. In addition, the following requirements must be met for objects
5386 <code class="computeroutput"><span class="identifier">c</span></code> of type <code class="computeroutput"><span class="identifier">CharT</span></code>,
5387 <code class="computeroutput"><span class="identifier">c1</span></code> and <code class="computeroutput"><span class="identifier">c2</span></code>
5388 of type <code class="computeroutput"><span class="identifier">CharT</span> <span class="keyword">const</span></code>,
5389 and <code class="computeroutput"><span class="identifier">i</span></code> of type <code class="computeroutput"><span class="keyword">int</span></code>:
5392 <a name="boost_xpressive.user_s_guide.concepts.t0"></a><p class="title"><b>Table 46.14. CharT Requirements</b></p>
5393 <div class="table-contents"><table class="table" summary="CharT Requirements">
5402 <span class="bold"><strong>Expression</strong></span>
5407 <span class="bold"><strong>Return type</strong></span>
5412 <span class="bold"><strong>Assertion / Note / Pre- / Post-condition</strong></span>
5420 <code class="computeroutput"><span class="identifier">CharT</span> <span class="identifier">c</span></code>
5425 <code class="computeroutput"><span class="identifier">CharT</span></code>
5430 Default constructor (must be trivial).
5437 <code class="computeroutput"><span class="identifier">CharT</span> <span class="identifier">c</span><span class="special">(</span><span class="identifier">c1</span><span class="special">)</span></code>
5442 <code class="computeroutput"><span class="identifier">CharT</span></code>
5447 Copy constructor (must be trivial).
5454 <code class="computeroutput"><span class="identifier">c1</span> <span class="special">=</span>
5455 <span class="identifier">c2</span></code>
5460 <code class="computeroutput"><span class="identifier">CharT</span></code>
5465 Assignment operator (must be trivial).
5472 <code class="computeroutput"><span class="identifier">c1</span> <span class="special">==</span>
5473 <span class="identifier">c2</span></code>
5478 <code class="computeroutput"><span class="keyword">bool</span></code>
5483 <code class="computeroutput"><span class="keyword">true</span></code> if <code class="computeroutput"><span class="identifier">c1</span></code> has the same value as <code class="computeroutput"><span class="identifier">c2</span></code>.
5490 <code class="computeroutput"><span class="identifier">c1</span> <span class="special">!=</span>
5491 <span class="identifier">c2</span></code>
5496 <code class="computeroutput"><span class="keyword">bool</span></code>
5501 <code class="computeroutput"><span class="keyword">true</span></code> if <code class="computeroutput"><span class="identifier">c1</span></code> and <code class="computeroutput"><span class="identifier">c2</span></code> are not equal.
5509 <code class="computeroutput"><span class="identifier">c1</span> <span class="special"><</span>
5510 <span class="identifier">c2</span></code>
5515 <code class="computeroutput"><span class="keyword">bool</span></code>
5520 <code class="computeroutput"><span class="keyword">true</span></code> if the value
5521 of <code class="computeroutput"><span class="identifier">c1</span></code> is less than
5522 <code class="computeroutput"><span class="identifier">c2</span></code>.
5529 <code class="computeroutput"><span class="identifier">c1</span> <span class="special">></span>
5530 <span class="identifier">c2</span></code>
5535 <code class="computeroutput"><span class="keyword">bool</span></code>
5540 <code class="computeroutput"><span class="keyword">true</span></code> if the value
5541 of <code class="computeroutput"><span class="identifier">c1</span></code> is greater
5542 than <code class="computeroutput"><span class="identifier">c2</span></code>.
5549 <code class="computeroutput"><span class="identifier">c1</span> <span class="special"><=</span>
5550 <span class="identifier">c2</span></code>
5555 <code class="computeroutput"><span class="keyword">bool</span></code>
5560 <code class="computeroutput"><span class="keyword">true</span></code> if <code class="computeroutput"><span class="identifier">c1</span></code> is less than or equal to
5561 <code class="computeroutput"><span class="identifier">c2</span></code>.
5568 <code class="computeroutput"><span class="identifier">c1</span> <span class="special">>=</span>
5569 <span class="identifier">c2</span></code>
5574 <code class="computeroutput"><span class="keyword">bool</span></code>
5579 <code class="computeroutput"><span class="keyword">true</span></code> if <code class="computeroutput"><span class="identifier">c1</span></code> is greater than or equal to
5580 <code class="computeroutput"><span class="identifier">c2</span></code>.
5587 <code class="computeroutput"><span class="identifier">intmax_t</span> <span class="identifier">i</span>
5588 <span class="special">=</span> <span class="identifier">c1</span></code>
5593 <code class="computeroutput"><span class="keyword">int</span></code>
5598 <code class="computeroutput"><span class="identifier">CharT</span></code> must be convertible
5599 to an integral type.
5606 <code class="computeroutput"><span class="identifier">CharT</span> <span class="identifier">c</span><span class="special">(</span><span class="identifier">i</span><span class="special">);</span></code>
5611 <code class="computeroutput"><span class="identifier">CharT</span></code>
5616 <code class="computeroutput"><span class="identifier">CharT</span></code> must be constructible
5617 from an integral type.
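<p>As a rough check of the requirements listed in the table, the sketch below defines a hypothetical user type, <code class="computeroutput">Symbol</code>, and verifies the triviality and conversion requirements at compile time. It uses only the standard library and assumes C++11; whether such a type works end to end with xpressive also depends on the traits class it is paired with:</p>

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical element type, not part of xpressive.
struct Symbol
{
    int value;

    Symbol() = default;                      // trivial default constructor
    explicit Symbol(int v) : value(v) {}     // CharT c(i)
    operator int() const { return value; }   // intmax_t i = c1
};

// Comparison operators required by the table.
inline bool operator==(Symbol a, Symbol b) { return a.value == b.value; }
inline bool operator!=(Symbol a, Symbol b) { return a.value != b.value; }
inline bool operator< (Symbol a, Symbol b) { return a.value <  b.value; }
inline bool operator> (Symbol a, Symbol b) { return a.value >  b.value; }
inline bool operator<=(Symbol a, Symbol b) { return a.value <= b.value; }
inline bool operator>=(Symbol a, Symbol b) { return a.value >= b.value; }

// Compile-time checks for the "must be trivial" rows and the conversions.
static_assert(std::is_trivially_default_constructible<Symbol>::value, "default constructor must be trivial");
static_assert(std::is_trivially_copy_constructible<Symbol>::value, "copy constructor must be trivial");
static_assert(std::is_trivially_copy_assignable<Symbol>::value, "assignment must be trivial");
static_assert(std::is_trivially_destructible<Symbol>::value, "destructor must be trivial");
static_assert(std::is_convertible<Symbol, std::intmax_t>::value, "must convert to an integral type");
static_assert(std::is_constructible<Symbol, int>::value, "must be constructible from an integral type");
```

<p>Built-in types such as <code class="computeroutput">char</code> and <code class="computeroutput">int</code> meet all of these requirements without any extra code.</p>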
5624 <br class="table-break"><h3>
5625 <a name="boost_xpressive.user_s_guide.concepts.h1"></a>
5626 <span class="phrase"><a name="boost_xpressive.user_s_guide.concepts.traits_requirements"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.concepts.traits_requirements">Traits Requirements</a>
5630 In the following table <code class="computeroutput"><span class="identifier">X</span></code>
5631 denotes a traits class defining types and functions for the character container
5632 type <code class="computeroutput"><span class="identifier">CharT</span></code>; <code class="computeroutput"><span class="identifier">u</span></code> is an object of type <code class="computeroutput"><span class="identifier">X</span></code>;
5633 <code class="computeroutput"><span class="identifier">v</span></code> is an object of type <code class="computeroutput"><span class="keyword">const</span> <span class="identifier">X</span></code>;
5634 <code class="computeroutput"><span class="identifier">p</span></code> is a value of type <code class="computeroutput"><span class="keyword">const</span> <span class="identifier">CharT</span><span class="special">*</span></code>; <code class="computeroutput"><span class="identifier">I1</span></code>
5635 and <code class="computeroutput"><span class="identifier">I2</span></code> are <code class="computeroutput"><span class="identifier">Input</span> <span class="identifier">Iterators</span></code>;
5636 <code class="computeroutput"><span class="identifier">c</span></code> is a value of type <code class="computeroutput"><span class="keyword">const</span> <span class="identifier">CharT</span></code>;
5637 <code class="computeroutput"><span class="identifier">s</span></code> is an object of type <code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">string_type</span></code>;
5638 <code class="computeroutput"><span class="identifier">cs</span></code> is an object of type
5639 <code class="computeroutput"><span class="keyword">const</span> <span class="identifier">X</span><span class="special">::</span><span class="identifier">string_type</span></code>;
5640 <code class="computeroutput"><span class="identifier">b</span></code> is a value of type <code class="computeroutput"><span class="keyword">bool</span></code>; <code class="computeroutput"><span class="identifier">i</span></code>
5641 is a value of type <code class="computeroutput"><span class="keyword">int</span></code>; <code class="computeroutput"><span class="identifier">F1</span></code> and <code class="computeroutput"><span class="identifier">F2</span></code>
5642 are values of type <code class="computeroutput"><span class="keyword">const</span> <span class="identifier">CharT</span><span class="special">*</span></code>; <code class="computeroutput"><span class="identifier">loc</span></code>
5643 is an object of type <code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">locale_type</span></code>; and <code class="computeroutput"><span class="identifier">ch</span></code>
5644 is an object of type <code class="computeroutput"><span class="keyword">const</span> <span class="keyword">char</span></code>.
5647 <a name="boost_xpressive.user_s_guide.concepts.t1"></a><p class="title"><b>Table 46.15. Traits Requirements</b></p>
5648 <div class="table-contents"><table class="table" summary="Traits Requirements">
5657 <span class="bold"><strong>Expression</strong></span>
5662 <span class="bold"><strong>Return type</strong></span>
5667 <span class="bold"><strong>Assertion / Note<br> Pre / Post condition</strong></span>
5675 <code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">char_type</span></code>
5680 <code class="computeroutput"><span class="identifier">CharT</span></code>
5685 The character container type used in the implementation of class
5686 template <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>.
5693 <code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">string_type</span></code>
5698 <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">basic_string</span><span class="special"><</span><span class="identifier">CharT</span><span class="special">></span></code>
5699 or <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special"><</span><span class="identifier">CharT</span><span class="special">></span></code>
5708 <code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">locale_type</span></code>
5713 <span class="emphasis"><em>Implementation defined</em></span>
5718 A copy constructible type that represents the locale used by the
5726 <code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">char_class_type</span></code>
5731 <span class="emphasis"><em>Implementation defined</em></span>
5736 A bitmask type representing a particular character classification.
5737 Multiple values of this type can be bitwise-or'ed together to obtain
5745 <code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">hash</span><span class="special">(</span><span class="identifier">c</span><span class="special">)</span></code>
5750 <code class="computeroutput"><span class="keyword">unsigned</span> <span class="keyword">char</span></code>
5755 Yields a value between <code class="computeroutput"><span class="number">0</span></code>
5756 and <code class="computeroutput"><span class="identifier">UCHAR_MAX</span></code> inclusive.
5763 <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">widen</span><span class="special">(</span><span class="identifier">ch</span><span class="special">)</span></code>
5768 <code class="computeroutput"><span class="identifier">CharT</span></code>
5773 Widens the specified <code class="computeroutput"><span class="keyword">char</span></code>
5774 and returns the resulting <code class="computeroutput"><span class="identifier">CharT</span></code>.
5781 <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">in_range</span><span class="special">(</span><span class="identifier">r1</span><span class="special">,</span>
5782 <span class="identifier">r2</span><span class="special">,</span>
5783 <span class="identifier">c</span><span class="special">)</span></code>
5788 <code class="computeroutput"><span class="keyword">bool</span></code>
5793 For any characters <code class="computeroutput"><span class="identifier">r1</span></code>
5794 and <code class="computeroutput"><span class="identifier">r2</span></code>, returns
5795 <code class="computeroutput"><span class="keyword">true</span></code> if <code class="computeroutput"><span class="identifier">r1</span> <span class="special"><=</span>
5796 <span class="identifier">c</span> <span class="special">&&</span>
5797 <span class="identifier">c</span> <span class="special"><=</span>
5798 <span class="identifier">r2</span></code>. Requires that <code class="computeroutput"><span class="identifier">r1</span> <span class="special"><=</span>
5799 <span class="identifier">r2</span></code>.
5806 <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">in_range_nocase</span><span class="special">(</span><span class="identifier">r1</span><span class="special">,</span>
5807 <span class="identifier">r2</span><span class="special">,</span>
5808 <span class="identifier">c</span><span class="special">)</span></code>
5813 <code class="computeroutput"><span class="keyword">bool</span></code>
5818 For characters <code class="computeroutput"><span class="identifier">r1</span></code>
5819 and <code class="computeroutput"><span class="identifier">r2</span></code>, returns
5820 <code class="computeroutput"><span class="keyword">true</span></code> if there is some
5821 character <code class="computeroutput"><span class="identifier">d</span></code> for
5822 which <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">translate_nocase</span><span class="special">(</span><span class="identifier">d</span><span class="special">)</span>
5823 <span class="special">==</span> <span class="identifier">v</span><span class="special">.</span><span class="identifier">translate_nocase</span><span class="special">(</span><span class="identifier">c</span><span class="special">)</span></code> and <code class="computeroutput"><span class="identifier">r1</span>
5824 <span class="special"><=</span> <span class="identifier">d</span>
5825 <span class="special">&&</span> <span class="identifier">d</span>
5826 <span class="special"><=</span> <span class="identifier">r2</span></code>.
5827 Requires that <code class="computeroutput"><span class="identifier">r1</span> <span class="special"><=</span> <span class="identifier">r2</span></code>.
5834 <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">translate</span><span class="special">(</span><span class="identifier">c</span><span class="special">)</span></code>
5839 <code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">char_type</span></code>
Returns a character such that, for any character <code class="computeroutput"><span class="identifier">d</span></code>
that is to be considered equivalent to <code class="computeroutput"><span class="identifier">c</span></code>,
<code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">translate</span><span class="special">(</span><span class="identifier">c</span><span class="special">)</span>
<span class="special">==</span> <span class="identifier">v</span><span class="special">.</span><span class="identifier">translate</span><span class="special">(</span><span class="identifier">d</span><span class="special">)</span></code>.
5854 <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">translate_nocase</span><span class="special">(</span><span class="identifier">c</span><span class="special">)</span></code>
5859 <code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">char_type</span></code>
For all characters <code class="computeroutput"><span class="identifier">C</span></code>
that are to be considered equivalent to <code class="computeroutput"><span class="identifier">c</span></code>
when comparisons are performed without regard to case,
<code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">translate_nocase</span><span class="special">(</span><span class="identifier">c</span><span class="special">)</span>
<span class="special">==</span> <span class="identifier">v</span><span class="special">.</span><span class="identifier">translate_nocase</span><span class="special">(</span><span class="identifier">C</span><span class="special">)</span></code>.
5875 <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">transform</span><span class="special">(</span><span class="identifier">F1</span><span class="special">,</span>
5876 <span class="identifier">F2</span><span class="special">)</span></code>
5881 <code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">string_type</span></code>
5886 Returns a sort key for the character sequence designated by the
5887 iterator range <code class="computeroutput"><span class="special">[</span><span class="identifier">F1</span><span class="special">,</span> <span class="identifier">F2</span><span class="special">)</span></code> such that if the character sequence
5888 <code class="computeroutput"><span class="special">[</span><span class="identifier">G1</span><span class="special">,</span> <span class="identifier">G2</span><span class="special">)</span></code> sorts before the character sequence
5889 <code class="computeroutput"><span class="special">[</span><span class="identifier">H1</span><span class="special">,</span> <span class="identifier">H2</span><span class="special">)</span></code> then <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">transform</span><span class="special">(</span><span class="identifier">G1</span><span class="special">,</span> <span class="identifier">G2</span><span class="special">)</span> <span class="special"><</span>
5890 <span class="identifier">v</span><span class="special">.</span><span class="identifier">transform</span><span class="special">(</span><span class="identifier">H1</span><span class="special">,</span>
5891 <span class="identifier">H2</span><span class="special">)</span></code>.
5898 <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">transform_primary</span><span class="special">(</span><span class="identifier">F1</span><span class="special">,</span>
5899 <span class="identifier">F2</span><span class="special">)</span></code>
5904 <code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">string_type</span></code>
5909 Returns a sort key for the character sequence designated by the
5910 iterator range <code class="computeroutput"><span class="special">[</span><span class="identifier">F1</span><span class="special">,</span> <span class="identifier">F2</span><span class="special">)</span></code> such that if the character sequence
5911 <code class="computeroutput"><span class="special">[</span><span class="identifier">G1</span><span class="special">,</span> <span class="identifier">G2</span><span class="special">)</span></code> sorts before the character sequence
5912 <code class="computeroutput"><span class="special">[</span><span class="identifier">H1</span><span class="special">,</span> <span class="identifier">H2</span><span class="special">)</span></code> when character case is not considered
5913 then <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">transform_primary</span><span class="special">(</span><span class="identifier">G1</span><span class="special">,</span>
5914 <span class="identifier">G2</span><span class="special">)</span>
5915 <span class="special"><</span> <span class="identifier">v</span><span class="special">.</span><span class="identifier">transform_primary</span><span class="special">(</span><span class="identifier">H1</span><span class="special">,</span> <span class="identifier">H2</span><span class="special">)</span></code>.
5922 <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">lookup_classname</span><span class="special">(</span><span class="identifier">F1</span><span class="special">,</span>
5923 <span class="identifier">F2</span><span class="special">)</span></code>
5928 <code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">char_class_type</span></code>
5933 Converts the character sequence designated by the iterator range
5934 <code class="computeroutput"><span class="special">[</span><span class="identifier">F1</span><span class="special">,</span><span class="identifier">F2</span><span class="special">)</span></code> into a bitmask type that can subsequently
5935 be passed to <code class="computeroutput"><span class="identifier">isctype</span></code>.
5936 Values returned from <code class="computeroutput"><span class="identifier">lookup_classname</span></code>
can safely be bitwise-or'ed together. Returns <code class="computeroutput"><span class="number">0</span></code>
5938 if the character sequence is not the name of a character class
5939 recognized by <code class="computeroutput"><span class="identifier">X</span></code>.
5940 The value returned shall be independent of the case of the characters
5948 <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">lookup_collatename</span><span class="special">(</span><span class="identifier">F1</span><span class="special">,</span>
5949 <span class="identifier">F2</span><span class="special">)</span></code>
5954 <code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">string_type</span></code>
5959 Returns a sequence of characters that represents the collating
5960 element consisting of the character sequence designated by the
5961 iterator range <code class="computeroutput"><span class="special">[</span><span class="identifier">F1</span><span class="special">,</span> <span class="identifier">F2</span><span class="special">)</span></code>. Returns an empty string if the
5962 character sequence is not a valid collating element.
5969 <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">isctype</span><span class="special">(</span><span class="identifier">c</span><span class="special">,</span>
5970 <span class="identifier">v</span><span class="special">.</span><span class="identifier">lookup_classname</span><span class="special">(</span><span class="identifier">F1</span><span class="special">,</span>
5971 <span class="identifier">F2</span><span class="special">))</span></code>
5976 <code class="computeroutput"><span class="keyword">bool</span></code>
5981 Returns <code class="computeroutput"><span class="keyword">true</span></code> if character
5982 <code class="computeroutput"><span class="identifier">c</span></code> is a member of
5983 the character class designated by the iterator range <code class="computeroutput"><span class="special">[</span><span class="identifier">F1</span><span class="special">,</span> <span class="identifier">F2</span><span class="special">)</span></code>, <code class="computeroutput"><span class="keyword">false</span></code>
5991 <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">value</span><span class="special">(</span><span class="identifier">c</span><span class="special">,</span>
5992 <span class="identifier">i</span><span class="special">)</span></code>
5997 <code class="computeroutput"><span class="keyword">int</span></code>
6002 Returns the value represented by the digit <code class="computeroutput"><span class="identifier">c</span></code>
6003 in base <code class="computeroutput"><span class="identifier">i</span></code> if the
6004 character <code class="computeroutput"><span class="identifier">c</span></code> is
6005 a valid digit in base <code class="computeroutput"><span class="identifier">i</span></code>;
6006 otherwise returns <code class="computeroutput"><span class="special">-</span><span class="number">1</span></code>.<br> [Note: the value of <code class="computeroutput"><span class="identifier">i</span></code> will only be <code class="computeroutput"><span class="number">8</span></code>, <code class="computeroutput"><span class="number">10</span></code>,
6007 or <code class="computeroutput"><span class="number">16</span></code>. -end note]
6014 <code class="computeroutput"><span class="identifier">u</span><span class="special">.</span><span class="identifier">imbue</span><span class="special">(</span><span class="identifier">loc</span><span class="special">)</span></code>
6019 <code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">locale_type</span></code>
6024 Imbues <code class="computeroutput"><span class="identifier">u</span></code> with the
6025 locale <code class="computeroutput"><span class="identifier">loc</span></code>, returns
6026 the previous locale used by <code class="computeroutput"><span class="identifier">u</span></code>.
6033 <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">getloc</span><span class="special">()</span></code>
6038 <code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">locale_type</span></code>
6043 Returns the current locale used by <code class="computeroutput"><span class="identifier">v</span></code>.
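<p>
The semantics that the table above gives for <code class="computeroutput">in_range</code>, <code class="computeroutput">in_range_nocase</code> and <code class="computeroutput">value</code> can be illustrated with a minimal sketch. The free functions below are hypothetical and are not part of Boost.Xpressive; they assume plain <code class="computeroutput">char</code>, an ASCII-compatible character set and the classic "C" locale. The name <code class="computeroutput">digit_value</code> is likewise an illustration of <code class="computeroutput">v.value(c, i)</code>, not a library function.
</p>

```cpp
#include <cassert>
#include <cctype>

// Hypothetical helpers (not Boost.Xpressive API) mirroring the table's
// wording, assuming plain char, ASCII and the classic "C" locale.

// in_range(r1, r2, c): true if r1 <= c && c <= r2. Requires r1 <= r2.
inline bool in_range(char r1, char r2, char c)
{
    assert(r1 <= r2); // precondition stated in the table
    return r1 <= c && c <= r2;
}

// in_range_nocase(r1, r2, c): true if some character d that equals c when
// case is ignored lies in [r1, r2]. Under the stated assumptions the only
// candidates for d are the lower-case and upper-case forms of c.
inline bool in_range_nocase(char r1, char r2, char c)
{
    assert(r1 <= r2);
    unsigned char const uc = static_cast<unsigned char>(c);
    char const lower = static_cast<char>(std::tolower(uc));
    char const upper = static_cast<char>(std::toupper(uc));
    return in_range(r1, r2, lower) || in_range(r1, r2, upper);
}

// digit_value(c, base): the value of digit c in the given base,
// or -1 if c is not a valid digit in that base (base is 8, 10 or 16).
inline int digit_value(char c, int base)
{
    int digit = -1;
    if (c >= '0' && c <= '9')      digit = c - '0';
    else if (c >= 'a' && c <= 'f') digit = c - 'a' + 10;
    else if (c >= 'A' && c <= 'F') digit = c - 'A' + 10;
    return (digit >= 0 && digit < base) ? digit : -1;
}
```

<p>
With these definitions, <code class="computeroutput">in_range('a', 'z', 'M')</code> is false while <code class="computeroutput">in_range_nocase('a', 'z', 'M')</code> is true, and <code class="computeroutput">digit_value('8', 8)</code> yields <code class="computeroutput">-1</code> because <code class="computeroutput">8</code> is not an octal digit.
</p>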
6050 <br class="table-break"><h3>
6051 <a name="boost_xpressive.user_s_guide.concepts.h2"></a>
6052 <span class="phrase"><a name="boost_xpressive.user_s_guide.concepts.acknowledgements"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.concepts.acknowledgements">Acknowledgements</a>
6055 This section is adapted from the equivalent page in the <a href="../../../libs/regex" target="_top">Boost.Regex</a>
6056 documentation and from the <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2003/n1429.htm" target="_top">proposal</a>
6057 to add regular expressions to the Standard Library.
6060 <div class="section">
6061 <div class="titlepage"><div><div><h3 class="title">
6062 <a name="boost_xpressive.user_s_guide.examples"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples" title="Examples">Examples</a>
6063 </h3></div></div></div>
6065 Below you can find six complete sample programs. <br>
6069 <a name="boost_xpressive.user_s_guide.examples.h0"></a>
6070 <span class="phrase"><a name="boost_xpressive.user_s_guide.examples.see_if_a_whole_string_matches_a_regex"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.see_if_a_whole_string_matches_a_regex">See
6071 if a whole string matches a regex</a>
6074 This is the example from the Introduction. It is reproduced here for your
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iostream</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span>

<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>

<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">hello</span><span class="special">(</span> <span class="string">"hello world!"</span> <span class="special">);</span>

    <span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"(\\w+) (\\w+)!"</span> <span class="special">);</span>
    <span class="identifier">smatch</span> <span class="identifier">what</span><span class="special">;</span>

    <span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_match</span><span class="special">(</span> <span class="identifier">hello</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">rex</span> <span class="special">)</span> <span class="special">)</span>
    <span class="special">{</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// whole match</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="number">1</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// first capture</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="number">2</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// second capture</span>
    <span class="special">}</span>

    <span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
<span class="special">}</span>
6100 This program outputs the following:
6102 <pre class="programlisting">hello world!
6107 <br> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples" title="Examples">top</a>
6111 <a name="boost_xpressive.user_s_guide.examples.h1"></a>
6112 <span class="phrase"><a name="boost_xpressive.user_s_guide.examples.see_if_a_string_contains_a_sub_string_that_matches_a_regex"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.see_if_a_string_contains_a_sub_string_that_matches_a_regex">See
6113 if a string contains a sub-string that matches a regex</a>
6116 Notice in this example how we use custom <code class="computeroutput"><span class="identifier">mark_tag</span></code>s
6117 to make the pattern more readable. We can use the <code class="computeroutput"><span class="identifier">mark_tag</span></code>s
6118 later to index into the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>.
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iostream</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span>

<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>

<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
    <span class="keyword">char</span> <span class="keyword">const</span> <span class="special">*</span><span class="identifier">str</span> <span class="special">=</span> <span class="string">"I was born on 5/30/1973 at 7am."</span><span class="special">;</span>

    <span class="comment">// define some custom mark_tags with names more meaningful than s1, s2, etc.</span>
    <span class="identifier">mark_tag</span> <span class="identifier">day</span><span class="special">(</span><span class="number">1</span><span class="special">),</span> <span class="identifier">month</span><span class="special">(</span><span class="number">2</span><span class="special">),</span> <span class="identifier">year</span><span class="special">(</span><span class="number">3</span><span class="special">),</span> <span class="identifier">delim</span><span class="special">(</span><span class="number">4</span><span class="special">);</span>

    <span class="comment">// this regex finds a date</span>
    <span class="identifier">cregex</span> <span class="identifier">date</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">month</span><span class="special">=</span> <span class="identifier">repeat</span><span class="special"><</span><span class="number">1</span><span class="special">,</span><span class="number">2</span><span class="special">>(</span><span class="identifier">_d</span><span class="special">))</span> <span class="comment">// find the month ...</span>
               <span class="special">>></span> <span class="special">(</span><span class="identifier">delim</span><span class="special">=</span> <span class="special">(</span><span class="identifier">set</span><span class="special">=</span> <span class="char">'/'</span><span class="special">,</span><span class="char">'-'</span><span class="special">))</span> <span class="comment">// followed by a delimiter ...</span>
               <span class="special">>></span> <span class="special">(</span><span class="identifier">day</span><span class="special">=</span> <span class="identifier">repeat</span><span class="special"><</span><span class="number">1</span><span class="special">,</span><span class="number">2</span><span class="special">>(</span><span class="identifier">_d</span><span class="special">))</span> <span class="special">>></span> <span class="identifier">delim</span> <span class="comment">// and a day followed by the same delimiter ...</span>
               <span class="special">>></span> <span class="special">(</span><span class="identifier">year</span><span class="special">=</span> <span class="identifier">repeat</span><span class="special"><</span><span class="number">1</span><span class="special">,</span><span class="number">2</span><span class="special">>(</span><span class="identifier">_d</span> <span class="special">>></span> <span class="identifier">_d</span><span class="special">));</span> <span class="comment">// and the year.</span>

    <span class="identifier">cmatch</span> <span class="identifier">what</span><span class="special">;</span>

    <span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_search</span><span class="special">(</span> <span class="identifier">str</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">date</span> <span class="special">)</span> <span class="special">)</span>
    <span class="special">{</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// whole match</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="identifier">day</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// the day</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="identifier">month</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// the month</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="identifier">year</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// the year</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="identifier">delim</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// the delimiter</span>
    <span class="special">}</span>

    <span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
<span class="special">}</span>
6153 This program outputs the following:
6155 <pre class="programlisting">5/30/1973
6162 <br> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples" title="Examples">top</a>
6166 <a name="boost_xpressive.user_s_guide.examples.h2"></a>
6167 <span class="phrase"><a name="boost_xpressive.user_s_guide.examples.replace_all_sub_strings_that_match_a_regex"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.replace_all_sub_strings_that_match_a_regex">Replace
6168 all sub-strings that match a regex</a>
6171 The following program finds dates in a string and marks them up with pseudo-HTML.
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iostream</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span>

<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>

<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span> <span class="string">"I was born on 5/30/1973 at 7am."</span> <span class="special">);</span>

    <span class="comment">// essentially the same regex as in the previous example, but using a dynamic regex</span>
    <span class="identifier">sregex</span> <span class="identifier">date</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"(\\d{1,2})([/-])(\\d{1,2})\\2((?:\\d{2}){1,2})"</span> <span class="special">);</span>

    <span class="comment">// As in Perl, $& is a reference to the sub-string that matched the regex</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">format</span><span class="special">(</span> <span class="string">"<date>$&</date>"</span> <span class="special">);</span>

    <span class="identifier">str</span> <span class="special">=</span> <span class="identifier">regex_replace</span><span class="special">(</span> <span class="identifier">str</span><span class="special">,</span> <span class="identifier">date</span><span class="special">,</span> <span class="identifier">format</span> <span class="special">);</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">str</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>

    <span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
<span class="special">}</span>
6195 This program outputs the following:
6197 <pre class="programlisting">I was born on <date>5/30/1973</date> at 7am.
6200 <br> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples" title="Examples">top</a>
6204 <a name="boost_xpressive.user_s_guide.examples.h3"></a>
6205 <span class="phrase"><a name="boost_xpressive.user_s_guide.examples.find_all_the_sub_strings_that_match_a_regex_and_step_through_them_one_at_a_time"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.find_all_the_sub_strings_that_match_a_regex_and_step_through_them_one_at_a_time">Find
6206 all the sub-strings that match a regex and step through them one at a time</a>
<p>
The following program finds the words in a wide-character string using
<code class="computeroutput"><span class="identifier">wsregex_iterator</span></code>. Note
that dereferencing a <code class="computeroutput"><span class="identifier">wsregex_iterator</span></code>
yields a <code class="computeroutput"><span class="identifier">wsmatch</span></code> object.
</p>
6214 <pre class="programlisting"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iostream</span><span class="special">></span>
6215 <span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span>
6217 <span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>
6219 <span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
6220 <span class="special">{</span>
6221 <span class="identifier">std</span><span class="special">::</span><span class="identifier">wstring</span> <span class="identifier">str</span><span class="special">(</span> <span class="identifier">L</span><span class="string">"This is his face."</span> <span class="special">);</span>
6223 <span class="comment">// find a whole word</span>
6224 <span class="identifier">wsregex</span> <span class="identifier">token</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">alnum</span><span class="special">;</span>
6226 <span class="identifier">wsregex_iterator</span> <span class="identifier">cur</span><span class="special">(</span> <span class="identifier">str</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">str</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">token</span> <span class="special">);</span>
6227 <span class="identifier">wsregex_iterator</span> <span class="identifier">end</span><span class="special">;</span>
6229 <span class="keyword">for</span><span class="special">(</span> <span class="special">;</span> <span class="identifier">cur</span> <span class="special">!=</span> <span class="identifier">end</span><span class="special">;</span> <span class="special">++</span><span class="identifier">cur</span> <span class="special">)</span>
6230 <span class="special">{</span>
6231 <span class="identifier">wsmatch</span> <span class="keyword">const</span> <span class="special">&</span><span class="identifier">what</span> <span class="special">=</span> <span class="special">*</span><span class="identifier">cur</span><span class="special">;</span>
6232 <span class="identifier">std</span><span class="special">::</span><span class="identifier">wcout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">]</span> <span class="special"><<</span> <span class="identifier">L</span><span class="char">'\n'</span><span class="special">;</span>
6233 <span class="special">}</span>
6235 <span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
6236 <span class="special">}</span>
</pre>
<p>
This program outputs the following:
</p>
<pre class="programlisting">This
is
his
face
</pre>
6247 <br> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples" title="Examples">top</a>
6251 <a name="boost_xpressive.user_s_guide.examples.h4"></a>
6252 <span class="phrase"><a name="boost_xpressive.user_s_guide.examples.split_a_string_into_tokens_that_each_match_a_regex"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.split_a_string_into_tokens_that_each_match_a_regex">Split
6253 a string into tokens that each match a regex</a>
6256 The following program finds race times in a string and displays first the
6257 minutes and then the seconds. It uses <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>.
6259 <pre class="programlisting"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iostream</span><span class="special">></span>
6260 <span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span>
6262 <span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>
6264 <span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
6265 <span class="special">{</span>
6266 <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span> <span class="string">"Eric: 4:40, Karl: 3:35, Francesca: 2:32"</span> <span class="special">);</span>
6268 <span class="comment">// find a race time</span>
6269 <span class="identifier">sregex</span> <span class="identifier">time</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"(\\d):(\\d\\d)"</span> <span class="special">);</span>
6271 <span class="comment">// for each match, the token iterator should first take the value of</span>
6272 <span class="comment">// the first marked sub-expression followed by the value of the second</span>
6273 <span class="comment">// marked sub-expression</span>
6274 <span class="keyword">int</span> <span class="keyword">const</span> <span class="identifier">subs</span><span class="special">[]</span> <span class="special">=</span> <span class="special">{</span> <span class="number">1</span><span class="special">,</span> <span class="number">2</span> <span class="special">};</span>
6276 <span class="identifier">sregex_token_iterator</span> <span class="identifier">cur</span><span class="special">(</span> <span class="identifier">str</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">str</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">time</span><span class="special">,</span> <span class="identifier">subs</span> <span class="special">);</span>
6277 <span class="identifier">sregex_token_iterator</span> <span class="identifier">end</span><span class="special">;</span>
6279 <span class="keyword">for</span><span class="special">(</span> <span class="special">;</span> <span class="identifier">cur</span> <span class="special">!=</span> <span class="identifier">end</span><span class="special">;</span> <span class="special">++</span><span class="identifier">cur</span> <span class="special">)</span>
6280 <span class="special">{</span>
6281 <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="special">*</span><span class="identifier">cur</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
6282 <span class="special">}</span>
6284 <span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
6285 <span class="special">}</span>
</pre>
<p>
This program outputs the following:
</p>
<pre class="programlisting">4
40
3
35
2
32
</pre>
6298 <br> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples" title="Examples">top</a>
6302 <a name="boost_xpressive.user_s_guide.examples.h5"></a>
6303 <span class="phrase"><a name="boost_xpressive.user_s_guide.examples.split_a_string_using_a_regex_as_a_delimiter"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.split_a_string_using_a_regex_as_a_delimiter">Split
6304 a string using a regex as a delimiter</a>
<p>
The following program takes some text that has been marked up with HTML and
strips out the markup. It uses a regex that matches an HTML tag and a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
that returns the parts of the string that do <span class="emphasis"><em>not</em></span> match
the regex.
</p>
6312 <pre class="programlisting"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iostream</span><span class="special">></span>
6313 <span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span>
6315 <span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>
6317 <span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
6318 <span class="special">{</span>
6319 <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span> <span class="string">"Now <bold>is the time <i>for all good men</i> to come to the aid of their</bold> country."</span> <span class="special">);</span>
    <span class="comment">// find an HTML tag</span>
6322 <span class="identifier">sregex</span> <span class="identifier">html</span> <span class="special">=</span> <span class="char">'<'</span> <span class="special">>></span> <span class="identifier">optional</span><span class="special">(</span><span class="char">'/'</span><span class="special">)</span> <span class="special">>></span> <span class="special">+</span><span class="identifier">_w</span> <span class="special">>></span> <span class="char">'>'</span><span class="special">;</span>
6324 <span class="comment">// the -1 below directs the token iterator to display the parts of</span>
6325 <span class="comment">// the string that did NOT match the regular expression.</span>
6326 <span class="identifier">sregex_token_iterator</span> <span class="identifier">cur</span><span class="special">(</span> <span class="identifier">str</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">str</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">html</span><span class="special">,</span> <span class="special">-</span><span class="number">1</span> <span class="special">);</span>
6327 <span class="identifier">sregex_token_iterator</span> <span class="identifier">end</span><span class="special">;</span>
6329 <span class="keyword">for</span><span class="special">(</span> <span class="special">;</span> <span class="identifier">cur</span> <span class="special">!=</span> <span class="identifier">end</span><span class="special">;</span> <span class="special">++</span><span class="identifier">cur</span> <span class="special">)</span>
6330 <span class="special">{</span>
6331 <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="char">'{'</span> <span class="special"><<</span> <span class="special">*</span><span class="identifier">cur</span> <span class="special"><<</span> <span class="char">'}'</span><span class="special">;</span>
6332 <span class="special">}</span>
6333 <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
6335 <span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
6336 <span class="special">}</span>
</pre>
<p>
This program outputs the following:
</p>
<pre class="programlisting">{Now }{is the time }{for all good men}{ to come to the aid of their}{ country.}
</pre>
6344 <br> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples" title="Examples">top</a>
6348 <a name="boost_xpressive.user_s_guide.examples.h6"></a>
6349 <span class="phrase"><a name="boost_xpressive.user_s_guide.examples.display_a_tree_of_nested_results"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.display_a_tree_of_nested_results">Display
6350 a tree of nested results</a>
<p>
Here is a helper class that demonstrates how you might display a tree of nested
results:
</p>
6356 <pre class="programlisting"><span class="comment">// Displays nested results to std::cout with indenting</span>
6357 <span class="keyword">struct</span> <span class="identifier">output_nested_results</span>
6358 <span class="special">{</span>
6359 <span class="keyword">int</span> <span class="identifier">tabs_</span><span class="special">;</span>
6361 <span class="identifier">output_nested_results</span><span class="special">(</span> <span class="keyword">int</span> <span class="identifier">tabs</span> <span class="special">=</span> <span class="number">0</span> <span class="special">)</span>
6362 <span class="special">:</span> <span class="identifier">tabs_</span><span class="special">(</span> <span class="identifier">tabs</span> <span class="special">)</span>
6363 <span class="special">{</span>
6364 <span class="special">}</span>
6366 <span class="keyword">template</span><span class="special"><</span> <span class="keyword">typename</span> <span class="identifier">BidiIterT</span> <span class="special">></span>
6367 <span class="keyword">void</span> <span class="keyword">operator</span> <span class="special">()(</span> <span class="identifier">match_results</span><span class="special"><</span> <span class="identifier">BidiIterT</span> <span class="special">></span> <span class="keyword">const</span> <span class="special">&</span><span class="identifier">what</span> <span class="special">)</span> <span class="keyword">const</span>
6368 <span class="special">{</span>
6369 <span class="comment">// first, do some indenting</span>
6370 <span class="keyword">typedef</span> <span class="keyword">typename</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">iterator_traits</span><span class="special"><</span> <span class="identifier">BidiIterT</span> <span class="special">>::</span><span class="identifier">value_type</span> <span class="identifier">char_type</span><span class="special">;</span>
6371 <span class="identifier">char_type</span> <span class="identifier">space_ch</span> <span class="special">=</span> <span class="identifier">char_type</span><span class="special">(</span><span class="char">' '</span><span class="special">);</span>
6372 <span class="identifier">std</span><span class="special">::</span><span class="identifier">fill_n</span><span class="special">(</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">ostream_iterator</span><span class="special"><</span><span class="identifier">char_type</span><span class="special">>(</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">),</span> <span class="identifier">tabs_</span> <span class="special">*</span> <span class="number">4</span><span class="special">,</span> <span class="identifier">space_ch</span> <span class="special">);</span>
6374 <span class="comment">// output the match</span>
6375 <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
6377 <span class="comment">// output any nested matches</span>
6378 <span class="identifier">std</span><span class="special">::</span><span class="identifier">for_each</span><span class="special">(</span>
6379 <span class="identifier">what</span><span class="special">.</span><span class="identifier">nested_results</span><span class="special">().</span><span class="identifier">begin</span><span class="special">(),</span>
6380 <span class="identifier">what</span><span class="special">.</span><span class="identifier">nested_results</span><span class="special">().</span><span class="identifier">end</span><span class="special">(),</span>
6381 <span class="identifier">output_nested_results</span><span class="special">(</span> <span class="identifier">tabs_</span> <span class="special">+</span> <span class="number">1</span> <span class="special">)</span> <span class="special">);</span>
6382 <span class="special">}</span>
6383 <span class="special">};</span>
6386 <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples" title="Examples">top</a>
6389 <div class="footnotes">
6390 <br><hr style="width:100; text-align:left;margin-left: 0">
<div id="ftn.boost_xpressive.user_s_guide.introduction.f0" class="footnote"><p><a href="#boost_xpressive.user_s_guide.introduction.f0" class="para"><sup class="para">[36] </sup></a>
See <a href="http://www.osl.iu.edu/~tveldhui/papers/Expression-Templates/exprtmpl.html" target="_top">Expression
Templates</a>
</p></div>
<div id="ftn.boost_xpressive.user_s_guide.symbol_tables_and_attributes.f0" class="footnote"><p><a href="#boost_xpressive.user_s_guide.symbol_tables_and_attributes.f0" class="para"><sup class="para">[37] </sup></a>
Many thanks to David Jenkins, who contributed this example.
</p></div>
6400 <table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr>
6401 <td align="left"></td>
6402 <td align="right"><div class="copyright-footer">Copyright © 2007 Eric Niebler<p>
6403 Distributed under the Boost Software License, Version 1.0. (See accompanying
6404 file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>)
6409 <div class="spirit-nav">
6410 <a accesskey="p" href="../xpressive.html"><img src="../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../xpressive.html"><img src="../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../index.html"><img src="../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="reference.html"><img src="../../../doc/src/images/next.png" alt="Next"></a>