From: thurston
Date: Wed, 26 Dec 2007 18:19:40 +0000 (+0000)
Subject: Improvements to the action ordering and duplicates section. Explicitly stated
X-Git-Tag: 2.0_alpha~205
X-Git-Url: http://review.tizen.org/git/?a=commitdiff_plain;h=87670e794292b6073d2469aa3d6a5d58d704b9d9;p=external%2Fragel.git

Improvements to the action ordering and duplicates section. Explicitly stated
that ragel does not compare action text. This tripped up a user.

git-svn-id: http://svn.complang.org/ragel/trunk@353 052ea7fc-9027-0410-9066-f65837a77df0
---

diff --git a/doc/ragel-guide.tex b/doc/ragel-guide.tex
index 3c8c15b..c9e14f2 100644
--- a/doc/ragel-guide.tex
+++ b/doc/ragel-guide.tex
@@ -1835,35 +1835,32 @@ main := (
 
 \section{Action Ordering and Duplicates}
 
-When building a parser by combining smaller expressions that themselves have
-embedded actions, it is often the case that transitions that need to
-execute a number of actions on one input character are made. For example when we leave
-an expression, we may execute the expression's pending out action and the
-subsequent expression's starting action on the same input character. We must
-therefore devise a method for ordering actions that is both intuitive and
-predictable for the user and repeatable by the state machine compiler. The
-determinization processes cannot simply order actions by the time at which they
-are introduced into a transition -- otherwise the programmer will be at the
-mercy of luck.
-
-We associate with the embedding of each action a distinct timestamp that is
+When combining expressions that have embedded actions it is often the case that
+a number of actions must be executed on a single input character. For example,
+following a concatenation the pending out action of the left expression and the
+entering action of the right expression will be embedded into one transition.
+This requires a method of ordering actions that is intuitive and
+predictable for the user, and repeatable for the compiler. 
+
+We associate with the embedding of each action a unique timestamp that is
 used to order actions that appear together on a single transition in the final
-compiled state machine. To accomplish this we traverse the parse tree of
-regular expressions and assign timestamps to action embeddings. This algorithm
-is recursive in nature and quite simple. When it visits a parse tree node it
-assigns timestamps to all {\em starting} action embeddings, recurses on the
-parse tree, then assigns timestamps to the remaining {\em all}, {\em
-finishing}, and {\em leaving} embeddings in the order in which they appear.
-
-Ragel does not permit actions (defined or unnamed) to appear multiple times in
-an action list. When the final machine has been created, actions that appear
-more than once in a single transition or EOF action list have their duplicates
-removed. The first appearance of the action is preserved. This is useful in a
-number of scenarios. First, it allows us to union machines with common
-prefixes without worrying about the action embeddings in the prefix being
-duplicated. Second, it prevents pending out actions from being transferred multiple times
-when a concatenation follows a kleene star and the two machines begin with a common
-character.
+state machine. To accomplish this we recursively traverse the parse tree of
+regular expressions and assign timestamps to action embeddings. References to
+machine definitions are followed in the traversal. When we visit a
+parse tree node we assign timestamps to all {\em entering} action embeddings,
+recurse on the parse tree, then assign timestamps to the remaining {\em all},
+{\em finishing}, and {\em leaving} embeddings in the order in which they
+appear.
+
+Ragel does not permit a single action to appear multiple times in an action
+list. When the final machine has been created, actions that appear more than
+once in a single transition or EOF action list have their duplicates removed. 
+The first appearance of the action is preserved. This is useful in a number of
+scenarios. First, it allows us to union machines with common prefixes without
+worrying about the action embeddings in the prefix being duplicated. Second, it
+prevents pending out actions from being transferred multiple times. This can
+happen when a machine is repeated, then followed with another machine that
+begins with a common character. For example:
 
 \verbspace
 \begin{verbatim}
@@ -1871,6 +1868,10 @@ word = [a-z]+ %act;
 main := word ( '\n' word )* '\n\n';
 \end{verbatim}
 
+Note that Ragel does not compare action bodies to determine if they have
+identical program text. It simply checks for duplicates using each action
+block's unique location in the program.
+
 \section{Values and Statements Available in Code Blocks}
 \label{vals}
 