Ragel 5.18 - Feb 5, 2007
========================
-Backend class structure reorganized to make it easier to add new code
generators without having to also modify the existing code generators.
-The Java code generation was then split out into its own executable to
allow it to freely diverge from the C/D-based code generation.
-The "large machines in Java" patch from Colin Fleming was committed. This
encodes data arrays as strings and decodes them at runtime to get around
limits on the size of static data. Currently only byte and short alphabet
types are supported. Since the default alphabet type for Java is "char," it
is now necessary to add "alphtype byte;" to Java programs.
-The Ruby code generation patch from Victor Hugo Borja was added. This is
highly experimental code.
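
The string-encoding idea from the "large machines in Java" entry above can be
sketched roughly as follows. This is an illustration in Python, not the code
from the patch: the helper names encode_shorts and decode_shorts are
hypothetical, and the encoding actually used by the Java backend may differ.

```python
def encode_shorts(values):
    """Pack signed 16-bit integers into a string, one character per value."""
    for v in values:
        if not -0x8000 <= v <= 0x7FFF:
            raise ValueError(f"{v} does not fit in a signed 16-bit short")
    # Map each value onto an unsigned 16-bit code point.
    return "".join(chr(v & 0xFFFF) for v in values)


def decode_shorts(text):
    """Recover the signed 16-bit integers from a packed string."""
    out = []
    for ch in text:
        code = ord(ch)
        # Codes with the high bit set represent negative values.
        out.append(code - 0x10000 if code >= 0x8000 else code)
    return out


table = [0, 1, -1, 300, -32768, 32767]
packed = encode_shorts(table)          # a single string constant
restored = decode_shorts(packed)       # rebuilt once at runtime
```

The table is stored as one string constant and expanded back into an array
when the program starts, which is the general shape of the workaround.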

Ragel 5.17 - Jan 28, 2007
=========================
-The scanners and parsers in both the frontend and backend programs were
completely rewritten using Ragel and Kelbt.
-The '%when condition' syntax was functioning like '$when condition'. This
-In the Vim syntax file, fixes to the matching of embedding operators were
made. Also, improvements to the sync patterns were made.
-Added pullscan.rl to the examples directory. It is an example of doing
pull-based scanning. Also, xmlscan.rl in rlcodegen is a pull scanner.
-The introduction chapter of the manual was improved. The manually-drawn
figures for the examples were replaced with Graphviz-drawn figures.

Ragel 5.16 - Nov 20, 2006
=========================
-Bug fix: the fhold and fexec directives did not function correctly in
scanner pattern actions. In this context manipulations of p may be lost or
made invalid. To fix this, fexec and fhold now manipulate tokend, which is
now always used to update p when the action terminates.

Ragel 5.15 - Oct 31, 2006
=========================
-A language-independent test harness was introduced. Test cases can be
written using a custom mini-language in the embedded actions. This
mini-language is then translated to C, D and Java when generating the
language-specific test cases.
-Several existing tests have been ported to the language-independent format
and a number of new language-independent test cases have been added.
-The state-based embedding operators which access states that are not the
start state and are not final (the 'middle' states) have changed. They
<@/ eof action into middle states
<@! error action into middle states
<@^ local error action into middle states
<@~ to-state action into middle states
<@* from-state action into middle states
<>/ eof action into middle states
<>! error action into middle states
<>^ local error action into middle states
<>~ to-state action into middle states
<>* from-state action into middle states
-The verbose form of embeddings using the <- operator has been removed.
This syntax was difficult to remember.
-A new verbose form of state-based embedding operators has been added.
These are like the symbol versions, except they replace the symbols:
with literal keywords:
-The following words have been promoted to keywords:
when eof err lerr to from
-The write statement now gets its own lexical scope in the scanner to ensure
that commands are passed through as is (not affected by keywords).
-Bug fix: in the code generation of fret in scanner actions the adjustment to
p that is needed in some cases (dependent on content of patterns) was not
-The fhold directive, which decrements p, cannot be permitted in the pattern
action of a scanner item because it will not behave consistently. At the end
of a pattern action p could be decremented, set to a new value or left
alone. This depends on the contents of the scanner's patterns. The user
cannot be expected to predict what will happen to p.
-Conditions in D code require a cast to the widec type when computing widec.
-Like Java, D code also needs if (true) branches for control flow in actions
in order to fool the unreachable code detector. This is now abstracted in
all code generators using the CTRL_FLOW() function.
-The NULL_ITEM value in Java code should be -1. This is needed for

Ragel 5.14 - Oct 1, 2006
========================
-Fixed the check for use of fcall in actions embedded within longest match
items. It was emitting an error if an item's longest-match action had an
fcall, which is allowed. This bug was introduced while fixing a segfault in
-A new minimization option was added: MinimizeMostOps (-l). This option
minimizes at every operation except on chains of expressions and chains of
terms (e.g., union and concat). On these chains it minimizes only at the last
operation. This makes test cases with many states compile faster, without
killing the performance on grammars like strings2.rl.
-The -l minimization option was made the default.
-Fixes to Java code: Use of the fc value did not work, now fixed. Static data
is now declared with the final keyword. Patch from Colin Fleming. Conditions
now work when generating Java code.
-The option -p was added to rlcodegen which causes printable characters to be
printed in Graphviz output. Patch from Colin Fleming.
-The "element" keyword no longer exists, removed from Vim syntax file.
Updated keyword highlighting.
-The host language selection is now made in the frontend.
-Native host language types are now used when specifying the alphtype.
Previously all languages used the set defined by C, and these were mapped to
the appropriate type in the backend.

Ragel 5.13 - Sep 7, 2006
========================
-Fixed a careless error which broke Java code generation.

Ragel 5.12 - Sep 7, 2006
========================
-The -o flag did not work in combination with -V. This was fixed.
-The split code generation format uses only the required number of digits
when writing out the number in the file name of each part.
-The -T0, -F0 and -G0 codegens should write out the action list iteration
variables only when there are regular, to-state or from-state actions. The
codegens should not use anyActions().
-If two states have the same EOF actions, they are written out in the finish
-The split and in-place goto formats would sometimes generate _out when it is
not needed. This was fixed.
-Improved the basic partitioning in the split codegen. The last partition
would sometimes be empty. This was fixed.
-Use of 'fcall *' was not causing top to be initialized. Fixed.
-Implemented a Java backend, specified with -J. Only the table-based format
-Implemented range compression in the frontend. This has no effect on the
generated code, however it reduces the work of the backend and any programs
that read the intermediate format.
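
As a rough illustration of what range compression means in the entry above,
the sketch below merges runs of consecutive keys that lead to the same target
into a single range. It is written in Python for brevity and assumes a simple
(key, target) representation; the function name compress_ranges is
hypothetical and this is not the frontend's actual data structure.

```python
def compress_ranges(transitions):
    """Merge consecutive keys with the same target into (low, high, target)."""
    ranges = []
    for key, target in sorted(transitions):
        if ranges and ranges[-1][2] == target and ranges[-1][1] + 1 == key:
            # Extend the previous range by one key.
            low, _, _ = ranges[-1]
            ranges[-1] = (low, key, target)
        else:
            # Start a new single-key range.
            ranges.append((key, key, target))
    return ranges


# Keys 97..99 all go to state 1, so they collapse into one range.
singles = [(97, 1), (98, 1), (99, 1), (100, 2), (102, 2)]
compressed = compress_ranges(singles)
```

Fewer, wider entries describe the same machine, which is why later stages
that read the intermediate format have less work to do.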

Ragel 5.11 - Aug 10, 2006
=========================
-Added a variable to the configure.in script which allows the building of
the parsers to be turned off (BUILD_PARSERS). Parser building is off by
default for released versions.
-Removed configure tests for bison defines header file. Use --defines=file
-Configure script doesn't test for bison, flex and gperf when building of the
parsers is turned off.
-Removed check for YYLTYPE structure from configure script. Since shipped
code will not build parsers by default, we don't need to be as accommodating
of other versions of bison.
-Added a missing include that showed up with g++ 2.95.3.
-Failed configure test for Objective-C compiler is now silent.

Ragel 5.10 - Jul 31, 2006
=========================
-Moved the check for error state higher in the table-based processing loop.
-Replaced naive implementations of condition searching with proper ones. In
the table-based formats the searching is also table-based. In the directly
executed formats the searching is also directly executable.
-The minimization process was made aware of conditions.
-A problem with the condition implementation was fixed. Previously we were
taking pointers to transitions and then using them after a call to
outTransCopy, which was a bad idea because they may be changed by the call.
-Added test mailbox3.rl which is based on mailbox2.rl but includes conditions
for restricting header and message body lengths.
-Eliminated the initial one-character backup of p just before resuming
-Added the -s option to the frontend for printing statistics. This currently
includes just the number of states.
-Sped up the generation of the in-place goto-driven (-G2) code style.
-Implemented a split version of in-place goto-driven code style. This code
generation style is suitable for producing fast implementations of very
large machines. Partitioning is currently naive. In the future a
high-quality partitioning program will be employed. The flag for accessing
this feature is -Pn, where n is the number of partitions.
-Converted mailbox1.rl, strings2.rl and cppscan1.rl tests to support the
split code generation.
-Fixes and updates were made to the runtests script: added -c for compiling
only, changed the -me option to -e, and added support for testing the split

Ragel 5.9 - Jul 19, 2006
========================
-Fixed a bug in the include system which caused malformed output from the
frontend when the include was made from a multi-line machine spec and the
included file ended in a single line spec (or vice versa).
-Static data is now const.
-Actions which referenced states but were not embedded caused the frontend to
-Manual now built with pdflatex.
-The manual was reorganized and expanded. Chapter sequence is now:
Introduction, Constructing Machines, Embedding Actions, Controlling
Nondeterminism and Interfacing to the Host program.

Ragel 5.8 - Jun 17, 2006
========================
-The internal representation of the alphabet type has been encapsulated
into a class and all operations on it have been defined as C++ operators.
-The condition implementation now supports range transitions. This allows
conditions to be embedded into arbitrary machines. Conditions are still
-More condition embedding operators were added
1. Isolate the start state and embed a condition into all transitions
2. Embed a condition into all transitions:
when cond OR $when cond OR $?cond
3. Embed a condition into pending out transitions:
-Improvements were made to the determinization process to support pending out
-The Vim syntax file was fixed so that :> doesn't cause the match of a label.
-The test suite was converted to a single-file format which uses less disk
space than the old directory-per-test format.

Ragel 5.7 - May 14, 2006
========================
-Conditions will not be embedded like actions because they involve a
manipulation of the state machine they are specified in. They have therefore
been taken out of the verbose action embedding form (using the <- compound
symbol). A new syntax for specifying conditions has been created:
m = '\n' when {i==4};
-Fixed a bug which prevented state machine commands like fcurs, fcall, fret,
etc, from being accounted for in from-state actions and to-state actions.
This prevented some necessary support code from being generated.
-Implemented condition testing in remaining code generators.
-Configure script now checks for gperf, which is required for building.
-Added support for case-insensitive literal strings (in addition to regexes).
A case-insensitive string is made by appending an 'i' to the literal, as in
-Fixed a bug which caused all or expressions inside of all regular
expressions to be case-insensitive. For example /[fo]o bar/ would make the
[fo] part case-insensitive even though no 'i' was given following the

Ragel 5.6 - Apr 1, 2006
=======================
-Added a left-guarded concatenation operator. This operator <: is equivalent
to ( expr1 $1 . expr2 >0 ). It is useful if you want to prefix a sequence
with a sequence of a subset of the characters it matches. For example, one
can consume leading whitespace before tokenizing a sequence of whitespace
separated words: ( ' '* <: ( ' '+ | [a-z]+ )** )
-Removed context embedding code, which has been dead since 5.0.

Ragel 5.5 - Mar 28, 2006
========================
-Implemented a case-insensitive option for regular expressions: /get/i.
-If no input file is given to the ragel program it reads from standard input.
-The label of the start state has been changed from START to IN to save on
required screen space.
-Bug fix: \0 was not working in literal strings, due to a change that reduced
memory usage by concatenating components of literal strings. Token data
length is now passed from the scanner to the parser so that we do not need to
rely on null termination.

Ragel 5.4 - Mar 12, 2006
========================
-Eliminated the default transition from the frontend implementation. This
default transition was a space-saving optimization that at best could reduce
the number of allocated transitions by one half. Unfortunately it
complicated the implementation and this stood in the way of introducing
conditionals. The default transition may be reintroduced in the future.
-Added entry-guarded concatenation. This operator :>, is syntactic sugar
for expr1 $0 . expr >1. This operator terminates the matching of the first
machine when a first character of the second machine is matched. For
example in any* . ';' we never leave the any* machine. If we use any* :> ';'
then the any* machine is terminated upon matching the semi-colon.
-Added finish-guarded concatenation. This operator :>>, is syntactic sugar
for expr1 $0 . expr @1. This operator is like entry-guarded concatenation
except the first machine is terminated when the second machine enters a
final state. This is useful for delaying the guard until a full pattern is
matched. For example as in '/*' any* :>> '*/'.
-Added strong subtraction. Where regular subtraction removes from the first
machine any strings that are matched by the second machine, strong
subtraction removes any strings from the first that contain any strings of
the second as a substring. Strong subtraction is syntactic sugar for
expr1 - ( any* expr2 any* ).
-Eliminated the use of priorities from the examples. Replaced with
subtraction, guarded concatenation and longest-match kleene star.
-Did some initial work on supporting conditional transitions. Far from
complete and very buggy. This code will only be active when conditionals are
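
The difference between regular and strong subtraction described in the 5.4
notes can be illustrated on finite sets of strings. This is only a sketch in
Python: real Ragel machines are automata, not sets, and the function names
subtract and strong_subtract are hypothetical.

```python
def subtract(first, second):
    """Regular subtraction: drop strings that the second set matches exactly."""
    return {s for s in first if s not in second}


def strong_subtract(first, second):
    """Strong subtraction: drop strings that contain any string of the second
    set as a substring, mirroring expr1 - ( any* expr2 any* )."""
    return {s for s in first if not any(t in s for t in second)}


first = {"ab", "abc", "xabcx", "xyz"}
second = {"abc"}

regular = subtract(first, second)         # "xabcx" survives
strong = strong_subtract(first, second)   # "xabcx" is removed as well
```

Regular subtraction only removes the exact match "abc", while strong
subtraction also removes "xabcx" because it contains "abc".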

Ragel 5.3 - Jan 27, 2006
========================
-Added missing semi-colons that cause the build to fail when using older
-Fix for D code: if the contents of an fexec is a single word, the generated
code will get interpreted as a C-style cast. Adding two brackets prevents
this. Can now eliminate the "access this.;" in cppscan5 that was used to
get around this problem.
-Improved some of the tag names in the intermediate format.
-Added unsigned long to the list of supported alphabet types.
-Added ids of actions and action lists to XML intermediate format. Makes it
-Updated to latest Aapl package.

Ragel 5.2 - Jan 6, 2006
=======================
-Ragel emits an error if the target of fentry, fcall, fgoto or fnext is inside
a longest match operator, or if an action embedding in a longest match
machine uses fcall. The fcall command can still be used in pattern actions.
-Made improvements to the clang, rlscan, awkemu and cppscan examples.
-Some fixes to generated label names: they should all be prefixed with _.
-A fix to the Vim syntax highlighting script was made.
-Many fixes and updates to the documentation. All important features and
concepts are now documented. A second chapter describing Ragel's use

Ragel 5.1 - Dec 22, 2005
========================
-Fixes to the matching of section delimiters in Vim syntax file.
-If there is a longest match machine, the tokend var is now initialized by
write init. This is not necessary for correct functionality, however
prevents compiler warnings.
-The rlscan example was ported to the longest match operator and changed to
-Fix to the error handling in the frontend: if there are errors in the lookup
of names at machine generation time then do not emit anything.
-If not compiling the full machine in the frontend (by using -M), avoid
errors and segfaults caused by names that are not part of the compiled
-Longest match bug fix: need to init tokstart when returning from fsm calls
that are inside longest match actions.
-In Graphviz drawing, the arrow into the start state is not a real
transition, do not draw to-state actions on the label.
-A bug fix to the handling of non-tag data within an XML tag was made.
-Backend exit value fixed: since the parser now accepts nothing so as to
avoid a redundant parse error when the frontend dies, we must force an
error. The backend should now be properly reporting errors.
-The longest match machine now has its start state set final. An LM machine
is in a final state when it has not matched anything, when it has matched
and accepted a token and is ready for another, and when it has matched a
token but is waiting for some lookahead before determining what to do about
it (similar to kleene star).
-Element statement removed from some tests.
-Entry point names are propagated to the backend and used to label the entry
point arrows in Graphviz output.

Ragel 5.0 - Dec 17, 2005
========================
(additional details in V5 release notes)
-Ragel has been split into two executables: A frontend which compiles
machines and emits them in an XML format, and a backend which generates code
or a Graphviz dot file from the XML input. The purpose of this split is to
allow Ragel to interface with other tools by means of the XML intermediate
format and to reduce complexity by strictly separating the previously
entangled phases. The intermediate format will provide a better platform for
inspecting compiled machines and for extending Ragel to support other host
-The host language interface has been reduced significantly. Ragel no longer
expects the machine to be implemented as a structure or class and does not
generate functions corresponding to initialization, execution and EOF.
Instead, Ragel just generates the code of these components, allowing all of
them to be placed in a single function if desired. The user specifies a
machine in the usual manner, then indicates at which place in the program
text the state machine code is to be generated. This is done using the write
statement. It is possible to specify to Ragel how it should access the
variables it needs (such as the current state) using the access statement.
-The host language embedding delimiters have been changed. Single line
machines start with '%%' and end at newline. Multiline machines start with
'%%{' and end with '}%%'. The machine name is given with the machine
statement at the very beginning of the specification. The purpose of this
change is to make it easier to separate Ragel code from the host language.
This will ease the addition of supported host languages.
-The structure and class parsing which was previously able to extract a
machine's name has been removed since this feature is dependent on the host
language and inhibits the move towards a more language-independent frontend.
-The init, element and interface statements have been made obsolete by the
new host language interface and have been removed.
-The fexec action statement has been changed to take only the new position to
move to. This statement is more useful for moving backwards and reparsing
input than for specifying a whole new buffer entirely and has been shifted
to this new use. Giving it only one argument also simplifies the parsing of
host code embedded in a Ragel specification. This will ease the addition of
supported host languages.
-Introduced the fbreak statement, which allows one to stop processing data
immediately. The machine ends up in the state that the current transition
was to go to. The current character is not changed.
-Introduced the noend option for writing the execute code. This inhibits
checking if we have reached pe. The machine will run until it goes into the
error state or fbreak is hit. This allows one to parse null-terminated
strings without first computing the length.
-The execute code now breaks out of the processing loop when it moves into
the error state. Previously it would run until pe was hit. Breaking out
makes the noend option useful when an error is encountered and allows
user code to determine where in the input the error occurred. It also
eliminates needlessly iterating the input buffer.
-Introduced the noerror, nofinal and noprefix options for writing the machine
data. The first two inhibit the writing of the error state and the
first-final state should they not be needed. The noprefix eliminates the
prefixing of the data items with the machine name.
-Support for the D language has been added. This is specified in the backend
-Since the new host language interface has been reduced considerably, Ragel
no longer needs to distinguish between C-based languages. Support for C, C++
and Objective-C has been folded into one option in the backend: -C
-The code generator has been made independent of the languages that it
supports by pushing the language dependent aspects down into the lower
levels of the code generator.
-Many improvements to the longest match construction were made. It is no
longer considered experimental. A longest match machine must appear at the
top level of a machine instantiation. Since it does not generate a pure
state machine (it may need to backtrack), it cannot be used as an operand to
-References to the current character and current state are now completely
banned in EOF actions.
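
The entry above about breaking out of the processing loop on the error state
can be pictured with a small table-driven loop. This is a Python sketch, not
the code Ragel generates: the ERROR constant, the dictionary-based transition
table and the run function are assumptions made for the illustration.

```python
ERROR = -1  # hypothetical id of the error state in this sketch


def run(transitions, start, data):
    """Feed data to a table-driven machine; return (state, position)."""
    state = start
    p = 0
    while p < len(data):
        # A missing transition sends the machine to the error state.
        state = transitions.get((state, data[p]), ERROR)
        if state == ERROR:
            break  # stop immediately so p marks the offending character
        p += 1
    return state, p


# Machine accepting repetitions of "ab": state 0 -a-> 1 -b-> 0.
table = {(0, "a"): 1, (1, "b"): 0}

ok_state, ok_pos = run(table, 0, "abab")     # consumes all input
bad_state, bad_pos = run(table, 0, "abxab")  # stops at the 'x'
```

Because the loop stops as soon as the error state is entered, the returned
position tells the caller where the input went wrong, and the rest of the
buffer is not scanned needlessly.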

Ragel 4.2 - Sep 16, 2005
========================
(additional details in V4 release notes)
-Fixed a bug in the longest match operator. In some states it's possible that
we either match a token or match nothing at all. In these states we need to
consult the LmSwitch on error so it must be prepared to execute an error
handler. We therefore need to init act to this error value (which is zero).
We can compute if we need to do this and the code generator emits the
initialization only if necessary.
-Changed the definition of the token end of longest match actions. It now
points to one past the last token. This makes computing the token length
easier because you don't have to add one. The longest match variables token
start, action identifier and token end are now properly initialized in
generated code. They don't need to be initialized in the user's code.
-Implemented to-state and from-state actions. These actions are executed on
transitions into the state (after the in transition's actions) and on
transitions out of the state (before the out transition's actions). See V4
release notes for more information.
-Since there are no longer any action embedding operators that embed both on
transitions and on EOF, any actions that exist in both places will be there
because the user has explicitly done so. Presuming this case is rare, and
with code duplication in the hands of the user, we therefore give the EOF
actions their own action switch in the finish() function. This is further
motivated by the fact that the best solution is to do the same for to-state
and from-state actions in the main loop.
-Longest match actions can now be specified using a named action. Since a
word following a longest match item conflicts with the concatenation of a
named machine, the => symbol must come immediately before a named action.
-The longest match operator permits action and machine definitions in the
middle of a longest match construction. These are parsed as if they came
before the machine definition they are contained in. Permitting action and
machine definitions in a longest match construction allows objects to be
defined closer to their use.
-The longest match operator can now handle longest match items with no
action, where previously Ragel segfaulted.
-Updated to Aapl post 2.12.
-Fixed a bug in epsilon transition name lookups. After doing a name lookup
the result was stored in the parse tree. This is wrong because if a machine
is used more than once, each time it may resolve to different targets,
however it will be stored in the same place. We now store name resolutions
in a separate data structure so that each walk of a parse tree uses the
name resolved during the corresponding walk in the name lookup pass.
-The operators used to embed context and actions into states have been
modified. The V4 release notes contain the full details.
-Added zlen builtin machine to represent the zero length machine. Eventually
the name "null" will be phased out in favour of zlen because it is unclear
whether null matches the zero length string or if it does not match any
string at all (as does the empty builtin).
-Added verbose versions of action, context and priority embedding. See the V4
release notes for the full details. A small example:
machine <- all exec { foo(); } <- final eof act1
-Bugfix for machines with epsilon ops, but no join operations. I had
wrongfully assumed that because epsilon ops can only increase connectivity,
that no states are ever merged and therefore a call to fillInStates() is not
necessary. In reality, epsilon transitions within one machine can induce the
merging of states. In the following, state 2 follows two paths on 'i':
main := 'h' -> i 'i h' i: 'i';
-Changed the license of the guide from a custom "do not propagate modified
versions of this document" license to the GPL.
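
The one-past-the-end convention for the token end, described in the 4.2 notes
above, can be shown with plain indices. This is a Python sketch; the names
tokstart and tokend are borrowed from later entries in this log, and the two
helper functions are hypothetical.

```python
def token_length(tokstart, tokend):
    """With tokend one past the last character, no '+ 1' is needed."""
    return tokend - tokstart


def token_text(data, tokstart, tokend):
    """Half-open slice [tokstart, tokend) yields exactly the token."""
    return data[tokstart:tokend]


data = "int x;"
tokstart, tokend = 0, 3   # the token "int" occupies indices 0, 1 and 2

length = token_length(tokstart, tokend)
text = token_text(data, tokstart, tokend)
```

Had the token end pointed at the last character instead, the length would be
tokend - tokstart + 1, which is the adjustment this change removes.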

Ragel 4.1 - Jun 26, 2005
========================
(additional details in V4 release notes)
-A bug in include processing was fixed. Surrounding code in an include file
was being passed through to the output when it should be ignored. Includes
are only for including portions of another machine into the current. This
went unnoticed because all tested includes were wrapped in #ifndef ...
#endif directives and so did not affect the compilation of the file making
-Fixes were made to Vim syntax highlighting file.
-Duplicate actions are now removed from action lists.
-The character-level negation operator ^ was added. This operator produces a
machine that matches single characters that are not matched by the machine
it is applied to. This unary prefix operator has the same precedence level
-The use of + to specify a positive literal number was discontinued.
-The parser now assigns the subtraction operator a higher precedence than
the negation of a literal number.
481 Ragel 4.0 - May 26, 2005
482 ========================
483 (additional details in V4 release notes)
484 -Operators now strictly embed into a machine either on a specific class of
485 characters or on EOF, but never both. This gives a cleaner association
486 between the operators and the physical state machine entitites they operate
487 on. This change is made up of several parts:
488 1. '%' operator embeds only into leaving characters.
489 2. All global and local error operators only embed on error character
490 transitions, their action will not be triggerend on EOF in non-final
492 3. EOF action embedding operators have been added for all classes of states
493 to make up for functionality removed from other operators. These are
495 4. Start transition operator '>' no longer implicitly embeds into leaving
496 transtions when start state is final.
497 -Ragel now emits warnings about the improper use of statements and values in
498 action code that is embedded as an EOF action. Warnings are emitted for fpc,
499 fc, fexec, fbuf and fblen.
500 -Added a longest match construction operator |* machine opt-action; ... *|.
501 This is for repetition where an ability to revert to a shorter, previously
502 matched item is required. This is the same behaviour as flex and re2c. The
503 longest match operator is not a pure FSM construction, it introduces
504 transitions that implicitly hold the current character or reset execution to
505 a previous location in the input. Use of this operator requires the caller
506 of the machine to occasionally hold onto data after a call to the exectute
507 routine. Use of machines generated with this operator as the input to other
508 operators may have undefined results. See examples/cppscan for an example.
509 This is very experimental code.
510 -Action ids are only assigned to actions that are referenced in the final
511 constructed machine, preventing gaps in the action id sequence. Previously
512 an action id was assigned if the action was referenced during parsing.
513 -Machine specifications now begin with %% and are followed with an optional
514 name and either a single Ragel statement or a sequence of statements
516 -Ragel no longer generates the FSM's structure or class. It is up to the user
517 to declare the structure and to give it a variable named curs of type
518 integer. If the machine uses the call stack the user must also declare a
519 array of integers named stack and an integer variable named top.
520 -In the case of Objective-C, Ragel no longer generates the interface or
521 implementation directives, allowing the user to declare additional methods.
522 -If a machine specification does not have a name then Ragel tries to find a
523 name for it by first checking if the specification is inside a struct, class
524 or interface. If it is not then it uses the name of the previous machine
525 specification. If still no name is found then this is an error.
526 -Fsm specifications now persist in memory and statements accumulate.
527 -Ragel now has an include statement for including the statements of a machine
528 spec in another file (perhaps because it is the corresponding header file).
529 The include statement can also be used to draw in the statements of another
530 fsm spec in the current file.
531 -The fstack statement is now obsolete and has been removed.
532 -A new statement, simply 'interface;', indicates that ragel should generate
533 the machine's interface. If Ragel sees the main machine it generates the
534 code sections of the machine. Previously, the header portion was generated
535 if the (now removed) struct statement was found and code was generated if
536 any machine definition was found.
537 -Fixed a bug in the resolution of fsm name references in actions. The name
538 resolution code did not recurse into inline code items with children
539 (fgoto*, fcall*, fnext*, and fexec), causing a segfault at code generation
541 -Cleaned up the code generators. FsmCodeGen was made into a virtual base
542 class allowing for the language/output-style specific classes to inherit
543 both a language specific and style-specific base class while retaining only
544 one copy of FsmCodeGen. Language specific output can now be moved into the
545 language specific code generators, requiring less duplication of code in the
546 language/output-style specific leaf classes.
547 -Fixed bugs in fcall* implementation of IpgGoto code generation.
548 -If the element type has not been defined Ragel now uses a constant version
549 of the alphtype, not the exact alphtype. In most cases the data pointer of
550 the execute routine should be const. A non-const element type can still be
551 defined with the element statement.
552 -The fc special value now uses getkey for retrieving the current char rather
553 than *_p, which is wrong if the element type is a structure.
554 -User guide converted to TeX and updated for new 4.0 syntax and semantics.
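Illustrative sketch for the first entry of this release (user-declared
structure), assuming C output. Only the member names curs, stack and top come
from the entry; the struct name, the stack depth and the helper functions are
hypothetical and are not generated by Ragel.

```c
/* User-declared machine structure (hypothetical name and stack depth).
 * Per the entry above, the members curs, stack and top are expected. */
#define MY_FSM_STACK_DEPTH 32

struct my_fsm {
    int curs;                        /* current state */
    int stack[MY_FSM_STACK_DEPTH];   /* call stack of saved states */
    int top;                         /* index of the next free stack slot */
};

/* Hypothetical helper: put the structure into a known starting condition.
 * The start state is simply passed in by the caller here. */
static void my_fsm_reset(struct my_fsm *fsm, int start_state)
{
    fsm->curs = start_state;
    fsm->top = 0;
}

/* Hypothetical helper: save the current state and jump to target. */
static int my_fsm_call(struct my_fsm *fsm, int target)
{
    if (fsm->top >= MY_FSM_STACK_DEPTH)
        return -1;                   /* stack full */
    fsm->stack[fsm->top++] = fsm->curs;
    fsm->curs = target;
    return 0;
}

/* Hypothetical helper: restore the most recently saved state. */
static int my_fsm_return(struct my_fsm *fsm)
{
    if (fsm->top == 0)
        return -1;                   /* nothing to return to */
    fsm->curs = fsm->stack[--fsm->top];
    return 0;
}
```

Because the user owns the structure, extra members can be added next to these
three without touching the generated code.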
556 Ragel 3.7 - Oct 31, 2004
557 ========================
558 -Bug fix: unreferenced machine instantiations causing segfault due to name
559 tree and parse tree walk falling out of synchronization.
560 -Rewrote representation of inline code blocks using a tree data structure.
561 This allows special keywords such as fbuf to be used as the operands of
563 -Documentation updates.
564 -When deciding whether or not to generate machine instantiations, search the
565 entire name tree beneath the instantiation for references, not just the
567 -Removed stray ';' in keller2.rl
568 -Added fexec for restarting the machine with new buffer data (state stays the
569 same), fbuf for retrieving the start of the buf, and fblen for
570 retrieving the orig buffer length.
571 -Implemented test/cppscan2 using fexec. This allows token emitting and restart
572 to stay inside the execute routine, instead of leaving and re-entering on
574 -Changed examples/cppscan to use fexec and thereby go much faster.
575 -Implemented flex and re2c versions of examples/cppscan. Ragel version
576 goes faster than flex version but not as fast as re2c version.
577 -Merged in Objective-C patch from Erich Ocean.
578 -Turned off syncing with stdio in C++ tests to make them go faster.
579 -Renamed C++ code generation classes with the Cpp prefix instead of CC to make
581 -In the finish function emit fbuf as 0 cast to a pointer to the element type
582 so its type is not interpreted as an integer.
583 -The number -128 underflows char alphabets on some architectures. Removed
585 -Disabled the keller2 test because it causes problems on many architectures
586 due to its large size and compilation requirements.
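Illustrative sketch for the -128 entry above. Whether plain char is signed is
implementation-defined in C, so a literal such as -128 only fits a char
alphabet where CHAR_MIN allows it. The function name is hypothetical and this
is not Ragel's own range check.

```c
#include <limits.h>

/* Returns 1 if value can be stored in a plain char on this platform.
 * CHAR_MIN is 0 where plain char is unsigned, so -128 is rejected there. */
static int fits_plain_char(long value)
{
    return value >= CHAR_MIN && value <= CHAR_MAX;
}
```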
588 Ragel 3.6 - Jul 10, 2004
589 ========================
590 -Many documentation updates.
591 -When resolving names, return a set of values so that a reference in an
592 action block that is embedded more than once won't report distinct entry
593 points that are actually the same.
594 -Implemented flat tables. Stores a linear array of indices into the
595 transition array and only a low and high key value. Faster than binary
596 searching for keys but not usable for large alphabets.
597 -Fixed bug in deleting of transitions left over from conversion from bst to
598 list implementation of transitions. Other code cleanup.
599 -In table based output calculate the cost of using an index. Don't use if
601 -Changed fstate() value available in init and action code to fentry() to
602 reflect the fact that the values returned are intended to be used as targets
603 in fgoto, fnext and fcall statements. The returned state is not a unique
604 state representing the label. There can be any number of states representing
606 -Added keller2 test, C++ scanning tests and C++ scanning example.
607 -In table based output split up transitions into targets and actions. This
608 allows actions to be omitted.
609 -Broke the components of the state array into separate arrays. Requires
610 adding some fields where they could previously be omitted, however allows
611 finer grained control over the sizes of items and an overall size reduction.
612 Also means that state numbers are not an offset into the state array but
613 instead a sequence of numbers, meaning the context array does not have any
615 -Action lists and transitions also have their types chosen to be the smallest
616 possible for accommodating the contained values.
617 -Changed curs state stored in fsm struct from _cs to curs. Keep fsm->curs ==
618 -1 while in machine. Added tests curs1 and curs2.
619 -Implemented the notion of context. Context can be embedded in states using
620 >:, $:, @: and %: operators. These embed a named context into start states,
621 all states, non-start/non-final and final states. If the context is declared
622 using a context statement
624 then the context can be queried for any state using fsm_name_ctx_name(state)
625 in C code and fsm_name::ctx_name(state) in C++ code. This feature makes it
626 possible to determine what "part" of the machine is currently active.
627 -Fixed crash on machine generation of graphs with no final state. If there
628 is no reference to a final state in a join operation, don't generate one.
629 -Updated Vim syntax: added labels to inline code, added various C++ keywords.
630 Don't highlight name separations as labels. Added switch labels, improved
631 alphtype, element and getkey.
632 -Fixed line info in error reporting of bad epsilon trans.
633 -Fixed fstate() for tab code gen.
634 -Removed references to malloc.h.
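Illustrative sketch for the flat tables entry above: a per-state lookup that
keeps only a low key, a high key and one index per key in between. The struct
layout, the names and the default_index member are assumptions made for the
example, not the layout Ragel emits.

```c
/* Hypothetical flat-table layout for one state. */
struct flat_state {
    int low_key;           /* smallest key with an entry */
    int high_key;          /* largest key with an entry */
    const int *indices;    /* high_key - low_key + 1 indices into the transition array */
    int default_index;     /* assumed fallback for keys outside [low_key, high_key] */
};

/* Constant-time lookup: one range check and one array access,
 * instead of a binary search over the keys. */
static int flat_lookup(const struct flat_state *st, int key)
{
    if (key < st->low_key || key > st->high_key)
        return st->default_index;
    return st->indices[key - st->low_key];
}
```

The indices array grows with high_key - low_key, which is why the entry calls
this approach unsuitable for large alphabets.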
636 Ragel 3.5 - May 29, 2004
637 ========================
638 -When parse errors occur, the partially generated output file is deleted and
639 a non-zero exit status is returned.
640 -Updated Vim syntax file.
641 -Implemented the setting of the element type that is passed to the execute
642 routine as well as a method for specifying how ragel should retrieve the key
643 from the element type. This lets ragel process arbitrary structures inside
644 of which is the key that is parsed.
645 element struct Element;
646 getkey fpc->character;
647 -The current state is now implemented with an int across all machines. This
648 simplifies working with current state variables. For example this allows a
649 call stack to be implemented in user code.
650 -Implemented a method for retrieving the current state, the target state, and
652 fcurs -retrieve the current state
653 ftargs -retrieve the target state
654 fstate(name) -retrieve a named state.
655 -Implemented a mechanism for jumping to and calling to a state stored in a
657 fgoto *<expr>; -goto the state returned by the C/C++ expression.
658 fcall *<expr>; -call the state returned by the C/C++ expression.
659 -Implemented a mechanism for specifying the next state without immediately
660 transferring control there (any code following the statement is executed).
661 fnext label; -set the state pointed to by label as the next state.
662 fnext *<expr>; -set the state returned by the C/C++ expression as the
664 -Action references are determined from the final machine instead of during
665 the parse tree walk. Some actions can be referenced in the parse tree but not
666 show up in the final machine. Machine analysis is now done based on this new
668 -Named state lookup now employs a breadth-first search in the lookup and
669 allows the user to fully qualify names, making it possible to specify
670 jumps/calls into parts of the machine deep in the name hierarchy. Each part
671 of the name (separated by ::) employs a breadth-first search from its starting
673 -Name references now must always refer to a single state. Since a reference to
674 multiple states is not normally intended, it no longer happens
675 automatically. This frees the programmer from thinking about whether or not
676 a state reference is unique. It also avoids the added complexity of
677 determining when to merge the targets of multiple references. The effect of
678 references to multiple states can be explicitly created using the join
679 operator and epsilon transitions.
680 -M option was split into -S and -M. -S specifies the machine spec to generate
681 for graphviz output and dumping. -M specifies the machine definition or
683 -Machine function parameters are now prefixed with an underscore to
684 avoid the hiding of class members.
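Illustrative sketch for the element/getkey entry above: the data passed to the
execute routine is an array of structures and the key is read through a single
access expression. The GETKEY macro, the extra line member and the function are
hypothetical stand-ins; only struct Element and fpc->character come from the
entry.

```c
/* Element type: the key to be parsed is one field of a larger structure. */
struct Element {
    int  line;        /* extra data carried with each element (illustrative) */
    char character;   /* the key, as in "getkey fpc->character;" */
};

/* Stand-in for the key-access expression; fpc points at the current element. */
#define GETKEY(fpc) ((fpc)->character)

/* Count the elements whose key is a decimal digit, reading keys only
 * through GETKEY so the rest of the loop never touches the struct layout. */
static int count_digit_keys(const struct Element *data, int len)
{
    const struct Element *fpc = data;
    const struct Element *end = data + len;
    int count = 0;

    for (; fpc < end; fpc++) {
        char key = GETKEY(fpc);
        if (key >= '0' && key <= '9')
            count++;
    }
    return count;
}
```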
686 Ragel 3.4 - May 8, 2004
687 =======================
688 -Added the longest match kleene star operator **, which is synonymous
689 with ( ( <machine> ) $0 %1 ) *.
690 -Epsilon operators distinguish between leaving transitions (going to
691 another expression in a comma-separated list) and non-leaving transitions.
692 Leaving actions and priorities are appropriately transferred.
693 -Relative priority of following ops changed to:
697 If label is done first then the isolation of the start state in > operators
698 will cause the label to point to the old start state that doesn't have the
700 -Merged >! and >~, @! and @~, %! and %~, and $! and $~ operators to have one
701 set of global error action operators (>!, @!, %! and $!) that are invoked on
702 error by unexpected characters as well as by unexpected EOF.
703 -Added the fpc keyword for use in action code. This is a pointer to the
704 current character. *fpc == fc. If an action is invoked on EOF then fpc == 0.
705 -Added >^, @^, %^, and $^ local error operators. Global error operators (>!,
706 @!, $!, and %!) cause actions to be invoked if the final machine fails.
707 Local error actions cause actions to be invoked if the current machine
709 -Changed error operators to mean embed global/local error actions in:
710 >! and >^ -the start state.
711 @! and @^ -states that are not the start state and are not final.
712 %! and %^ -final states.
713 $! and $^ -all states.
714 -Added >@! which is synonymous with >! then @!
715 -Added >@^ which is synonymous with >^ then @^
716 -Added @%! which is synonymous with @! then %!
717 -Added @%^ which is synonymous with @^ then %^
718 -FsmGraph representation of transition lists was changed from a mapping of
719 alphabet key -> transition objects using a BST to simply a list of
720 transition objects. Since the transitions are no longer divided by
721 single/range, the fast finding of transition objects by key is no longer
722 required functionality and can be eliminated. This new implementation uses
723 the same amount of memory but causes fewer allocations. It also makes more
724 sense for supporting error transitions with actions. Previously an error
725 transition was represented by a null value in the BST.
726 -Regular expression ranges are checked to ensure that lower <= upper.
727 -Added printf-like example.
728 -Added atoi2, erract2, and gotcallret to the test suite.
729 -Improved build test to support make -jN and simplified the compiling and
732 Ragel 3.3 - Mar 7, 2004
733 =======================
734 -Portability bug fixes were made. Minimum and maximum integer values are
735 now taken from the system. An alignment problem on 64bit systems
738 Ragel 3.2 - Feb 28, 2004
739 ========================
740 -Added a Vim syntax file.
741 -Eliminated length var from generated execute code in favour of an end
742 pointer. Using length requires two variables be read and written. Using an
743 end pointer requires one variable read and written and one read. Results in
744 more optimizable code.
745 -Minimization is now on by default.
746 -States are ordered in output by depth first search.
747 -Bug in minimization fixed. States were not being distinguished based on
749 -Added null and empty builtin machines.
750 -Added EOF error action operators. These are >~, @~, $~, and %~. EOF error
751 operators embed actions to take if the EOF is seen and interpreted as an
752 error. The operators correspond to the following states:
754 -any state with a transition to a final state
755 -any state with a transition out
757 -Fixed bug in generation of unreferenced machine vars using -M. Unreferenced
758 vars don't have a name tree built underneath when starting from
759 instantiations. Need to instead build the name tree starting at the var.
760 -Calls, returns, holds and references to fc in out action code are now
761 handled for ipgoto output.
762 -Only actions referenced by an instantiated machine expression are put into
763 the action index and written out.
764 -Added rlscan, an example that lexes Ragel input.
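Illustrative sketch for the end pointer entry above, using a plain summing
loop rather than generated code. The function names and the variables p and pe
are chosen for the example.

```c
/* Length-based loop: both the data pointer and the length
 * are read and written on every step. */
static int sum_with_length(const char *data, int dlen)
{
    int sum = 0;
    while (dlen > 0) {
        sum += *data;
        data++;
        dlen--;
    }
    return sum;
}

/* End-pointer loop: only p is written on each step;
 * pe is computed once and afterwards only read. */
static int sum_with_end_pointer(const char *data, int dlen)
{
    const char *p = data;
    const char *pe = data + dlen;
    int sum = 0;

    for (; p < pe; p++)
        sum += *p;
    return sum;
}
```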
766 Ragel 3.1 - Feb 18, 2004
767 ========================
768 -Duplicates in OR literals are removed and no longer cause an assertion
770 -Duplicate entry points used in goto and call statements are made into
771 deterministic entry points.
772 -Base FsmGraph code moved from aapl into ragel, as an increasing amount
773 of specialization is required. Too much time was spent attempting to
774 keep it as a general purpose template.
775 -FsmGraph code de-templatized and hierarchy squashed to a single class.
776 -Single transitions taken out of FsmGraph code. In the machine construction
777 stage, transitions are now implemented only with ranges and default
778 transitions. This reduces memory consumption, simplifies code and prevents
779 covered transitions. However it requires the automated selection of single
780 transitions to keep goto-driven code lean.
781 -Machine reduction completely rewritten to be in-place. As duplicate
782 transitions and actions are found and the machine is converted to a format
783 suitable for writing as C code or as GraphViz input, the memory allocated
784 for states and transitions is reused, instead of newly allocated.
785 -New reduction code consolidates ranges, selects a default transition, and
786 selects single transitions with the goal of joining ranges that are split by
787 any number of single characters.
788 -Line directive changed from "# <num> <file>" to the more common format
789 "#line <num> <file>".
790 -Operator :! changed to @!. This should have happened in last release.
791 -Added params example.
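Illustrative sketch for the range consolidation mentioned in the reduction
entries above. This is a deliberately simplified version that only merges
neighbouring ranges sharing a target; it does not select default or single
transitions, and it is not Ragel's actual reduction code. All names are
hypothetical.

```c
/* One range transition: keys low..high (inclusive) go to target. */
struct range_trans {
    int low;
    int high;
    int target;
};

/* Merge neighbouring ranges in place and return the new count.
 * Assumes the input is sorted by low and the ranges do not overlap. */
static int merge_adjacent_ranges(struct range_trans *r, int n)
{
    int out = 0;

    for (int i = 0; i < n; i++) {
        if (out > 0 &&
            r[out - 1].target == r[i].target &&
            r[out - 1].high < r[i].low &&        /* also guards the +1 below */
            r[out - 1].high + 1 == r[i].low) {
            r[out - 1].high = r[i].high;         /* extend the previous range */
        } else {
            r[out++] = r[i];                     /* keep as a separate range */
        }
    }
    return out;
}
```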
793 Ragel 3.0 - Jan 22, 2004
794 ========================
795 -Ragel now parses the contents of struct statements and action code.
796 -The keyword fc replaces the use of *p to reference the current character in
798 -Machine instantiations other than main are allowed.
799 -Call, jump and return statements are now available in action code. This
800 facility makes it possible to jump to an error handling machine, call a
801 sub-machine for parsing a field or to follow paths through a machine as
802 determined by arbitrary C code.
803 -Added labels to the language. Labels can be used anywhere in a machine
804 expression to define an entry point. Also references to machine definitions
805 cause the implicit creation of a label.
806 -Added epsilon transitions to the language. Epsilon operators may reference
807 labels in the current name scope resolved when join operators are evaluated
808 and at the root of the expression tree of machine assignment/instantiation.
809 -Added the comma operator, which joins machines together without drawing any
810 transitions between them. This operator is useful in combination with
811 labels, the epsilon operator and user code transitions for defining machines
812 using the named state and transition list paradigm. It is also useful for
813 invoking transitions based on some analysis of the input or on the
815 -Added >!, :!, $!, %! operators for specifying actions to take should the
816 machine fail. These operators embed actions to execute if the machine
819 -any state with a transition to a final state
820 -any state with a transition out
822 The general rule is that if an action embedding operator embeds an action
823 into a set of transitions T, then the error-counterpart with a ! embeds an
824 action into the error transition taken when any transition T is a candidate,
825 but does not match the input.
826 -The finishing augmentation operator ':' has been changed to '@'. This
827 frees the ':' symbol for machine labels and avoids hacks to the parser to
828 allow the use of ':' for both labels and finishing augmentations. The best
829 hack required that label names be distinct from machine definition names as
830 in main := word : word; This restriction is not good because labels are
831 local to the machine that they are used in whereas machine names are global
832 entities. Label name choices should not be restricted by the set of names
833 that are in use for machines.
834 -Named priority syntax now requires parentheses surrounding the name and
835 value pair. This avoids grammar ambiguities now that the ',' operator has
836 been introduced and makes it more clear that the name and value are an
838 -Backslashes are escaped in line directive paths.
840 Ragel 2.2 - Oct 6, 2003
841 =======================
842 -Added {n}, {,n}, {n,} and {n,m} repetition operators.
843 <expr> {n} -- exactly n repetitions
844 <expr> {,n} -- zero to n repetitions
845 <expr> {n,} -- n or more repetitions
846 <expr> {n,m} -- n to m repetitions
847 -Bug in binary search table in Aapl fixed. Fixes crashing on machines that
848 add to action tables that are implicitly shared among transitions.
849 -Tests using obsolete minimization algorithms are no longer built and run by
851 -Added atoi and concurrent from examples to the test suite.
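Illustrative sketch of the bounds implied by the four repetition operators
above, written as a count check in C. The function and the REP_UNBOUNDED
constant are hypothetical; Ragel builds these repetitions into the machine
itself rather than counting at run time.

```c
#define REP_UNBOUNDED (-1)   /* hypothetical marker for "no upper limit" */

/* Returns 1 if count repetitions satisfy the given bounds.
 *   {n}   -> min = n, max = n
 *   {,n}  -> min = 0, max = n
 *   {n,}  -> min = n, max = REP_UNBOUNDED
 *   {n,m} -> min = n, max = m                                  */
static int rep_count_ok(int count, int min, int max)
{
    if (count < min)
        return 0;
    return max == REP_UNBOUNDED || count <= max;
}
```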
853 Ragel 2.1 - Sep 22, 2003
854 ========================
855 -Bug in priority comparison code fixed. Segfaulted on some input with many
857 -Added two new examples.
859 Ragel 2.0 - Sep 7, 2003
860 =======================
861 -Optional (?), One or More (+) and Kleene Star (*) operators changed from
862 prefix to postfix. Rationale is that postfix version is far more common in
863 regular expression implementations and will be more readily understood.
864 -All priority values attached to transitions are now accompanied by a name.
865 Transitions no longer have default priority values of zero assigned
866 to them. Only transitions that have different priority values assigned
867 to the same name influence the NFA-DFA conversion. This scheme reduces
868 side-effects of priorities.
869 -Removed the %! statement for unsetting pending out priorities. With
870 named priorities, it is not necessary to clear the priorities of a
871 machine with $0 %! because non-colliding names can be used to avoid
873 -Removed the clear keyword, which was for removing actions from a machine.
874 Not required functionality and it is non-intuitive to have a language
875 feature that undoes previous definitions.
876 -Removed the ^ modifier to repetition and concatenation operators. This
877 undocumented feature prevented out transitions and out priorities from being
878 transferred from final states to transitions leaving machines. Not required
879 functionality and complicates the language unnecessarily.
880 -Keyword 'func' changed to 'action' as a part of the phasing out of the term
881 'function' in favour of 'action'. Rationale is that the term 'function'
882 implies that the code is called like a C function, which is not necessarily
883 the case. The term 'action' is far more common in state machine compiler
885 -Added the instantiation statement, which looks like a standard variable
886 assignment except := is used instead of =. Instantiations go into the
887 same graph dictionary as definitions. In the future, instantiations
888 will be used as the target for gotos and calls in action code.
889 -The main graph should now be explicitly instantiated. If it is not,
891 -Or literal basic machines ([] outside of regular expressions) now support
893 -C and C++ interfaces lowercased. In the C interface an underscore now
894 separates the fsm machine and the function name. Rationale is that lowercased
895 library and generated routines are more common.
897 int fsm_init( struct clang *fsm );
898 int fsm_execute( struct clang *fsm, char *data, int dlen );
899 int fsm_finish( struct clang *fsm );
902 int fsm::execute( char *data, int dlen );
904 -Init, execute and finish all return -1 if the machine is in the error state
905 and can never accept, 0 if the machine is in a non-accepting state that has a
906 path to a final state and 1 if the machine is in an accepting state.
907 -Accept routine eliminated. Determining whether or not the machine accepts is
908 done by examining the return value of the finish routine.
909 -In C output, fsm structure is no longer a typedef, so referencing requires
910 the struct keyword. This is to stay in line with C language conventions.
911 -In C++ output, constructor is no longer written by ragel. As a consequence,
912 init routine is not called automatically. Allows constructor to be supplied
913 by user as well as the return value of init to be examined without calling it
915 -Static start state and private structures are taken out of C++ classes.
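Illustrative sketch of the init/execute/finish return convention described in
this release (-1 error, 0 not accepting but still able to accept, 1
accepting). The machine below is a hand-written stand-in that accepts one or
more decimal digits; it is not Ragel output, and the struct layout and the
fsm_status helper are hypothetical.

```c
/* Hand-written stand-in for a generated machine. */
struct fsm {
    int state;   /* 0 = start, 1 = digits seen (final), -1 = error */
};

/* Map the internal state onto the -1 / 0 / 1 return convention. */
static int fsm_status(const struct fsm *fsm)
{
    if (fsm->state < 0)
        return -1;
    return fsm->state == 1 ? 1 : 0;
}

int fsm_init(struct fsm *fsm)
{
    fsm->state = 0;
    return fsm_status(fsm);
}

int fsm_execute(struct fsm *fsm, char *data, int dlen)
{
    int i;

    /* Stop consuming input once the error state is reached. */
    for (i = 0; i < dlen && fsm->state >= 0; i++)
        fsm->state = (data[i] >= '0' && data[i] <= '9') ? 1 : -1;
    return fsm_status(fsm);
}

int fsm_finish(struct fsm *fsm)
{
    return fsm_status(fsm);
}
```

With the accept routine gone, a caller decides acceptance by checking whether
fsm_finish returns 1.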
917 Ragel 1.5.4 - Jul 14, 2003
918 ==========================
919 -Workaround for building with bison 1.875, which produces an
920 optimization that doesn't build with newer versions of gcc.
922 Ragel 1.5.3 - Jul 10, 2003
923 ==========================
924 -Fixed building with versions of flex that recognize YY_NO_UNPUT.
925 -Fixed version numbers in ragel.spec file.
927 Ragel 1.5.2 - Jul 7, 2003
928 =========================
929 -Transition actions and out actions displayed in the graphviz output.
930 -Transitions on negative numbers handled in graphviz output.
931 -Warning generated when using bison 1.875 now squashed.
933 Ragel 1.5.1 - Jun 21, 2003
934 ==========================
935 -Bugs fixed: Don't delete the output objects when writing to standard out.
936 Copy mem into parser buffer with memcpy, not strcpy. Fixes buffer mem error.
937 -Fixes for compiling with Sun WorkShop 6 compilers.
939 Ragel 1.5.0 - Jun 10, 2003
940 ==========================
941 -Line directives written to the output so that errors in the action code
942 are properly reported in the ragel input file.
943 -Simple graphviz dot file output format is supported. Shows states and
944 transitions. Does not yet show actions.
945 -Options -p and -f dropped in favour of -d output format.
946 -Added option -M for specifying the machine to dump with -d or the graph to
948 -Error recovery implemented.
949 -Proper line and column number tracking implemented in the scanner.
950 -All action/function code is now embedded in the main Execute routine. Avoids
951 duplication of action code in the Finish routine and the need to call
952 ExecFuncs which resulted in huge code bloat. Will also allow actions to
953 modify cs when fsm goto, call and return is supported in action code.
954 -Fsm spec can have no statements; nothing will be generated.
955 -Bug fix: Don't accept ] as the opening of a .-. range in a reg exp.
956 -Regular expression or set ranges (ie /[0-9]/) are now handled by the parser
957 and consequently must be well-formed. The following now generates a parser
958 error: /[+-]/ and must be rewritten as /[+\-]/. Also fixes a bug whereby ]
959 might be accepted as the opening of a .-. range causing /[0-9]-[0-9]/ to
961 -\v, \f, and \r are now treated as whitespace in an fsm spec.
963 Ragel 1.4.1 - Nov 19, 2002
964 ==========================
965 -Compile fixes. The last release (integer alphabets) was so exciting
966 that usual portability checks got bypassed.
968 Ragel 1.4.0 - Nov 19, 2002
969 ==========================
970 -Arbitrary integer alphabets are now fully supported! A new language
972 'alphtype <type>' added for specifying the type of the alphabet. Default
973 is 'char'. Possible alphabet types are:
974 char, unsigned char, short, unsigned short, int, unsigned int
975 -Literal machines specified in decimal format can now be negative when the
976 alphabet is a signed type.
977 -Literal machines (strings, decimal and hex) have their values checked for
978 overflow/underflow against the size of the alphabet type.
979 -Table driven and goto driven output redesigned to support ranges. Table
980 driven uses a binary search for locating single characters and ranges. Goto
981 driven uses a switch statement for single characters and nested if blocks for
983 -Switch driven output removed due to a lack of consistent advantages. Most of
984 the time the switch driven FSM is of no use because the goto FSM makes
985 smaller and faster code. Under certain circumstances it can produce smaller
986 code than a goto driven fsm and be almost as fast, but some sporadic case
987 does not warrant maintaining it.
988 -Many warnings changed to errors.
989 -Added option -p for printing the final fsm before minimization. This lets
990 priorities be seen. Priorities are all reset to 0 before minimization. The
991 existing option -f prints the final fsm after minimization.
992 -Fixed a bug in the clang test and example that resulted in redundant actions
995 Ragel 1.3.4 - Nov 6, 2002
996 =========================
997 -Fixes to Chapter 1 of the guide.
998 -Brought back the examples and made them current.
999 -MSVC is no longer supported for compiling windows binaries because its
1000 support for the C++ standard is frustratingly inadequate, it will cost money
1001 to upgrade if it ever gets better, and MinGW is a much better alternative.
1002 -The build system now supports the --host= option for building ragel
1003 for another system (used for cross compiling a windows binary with MinGW).
1004 -Various design changes and fixes towards the goal of arbitrary integer
1005 alphabets and the handling of larger state machines were made.
1006 -The new shared vector class is now used for action lists in transitions and
1007 states to reduce memory allocations.
1008 -An avl tree is now used for the reduction of transitions and functions of an
1009 fsm graph before making the final machine. The tree allows better scalability
1010 and performance by not requiring consecutively larger heap allocations.
1011 -Final stages in the separation of fsm graph code from action embedding and
1012 priority assignment is complete. Makes the base graph leaner and easier to reuse
1013 in other projects (like Keller).
1015 Ragel 1.3.3 - Oct 22, 2002
1016 ==========================
1017 -More diagrams were added to section 1.7.1 of the user guide.
1018 -FSM Graph code was reworked to separate the regex/nfa/minimization graph
1019 algorithms from the manipulation of state and transition properties.
1020 -An rpm spec file from Cris Bailiff was added. This allows an rpm for ragel
1021 to be built with the command 'rpm -ta ragel-x.x.x.tar.gz'
1022 -Fixes to the build system and corresponding doc updates in the README.
1023 -Removed autil and included the one needed source file directly in the top
1024 level ragel directory.
1025 -Fixed a bug that nullified the 20 times speedup in large compilations
1026 claimed by the last version.
1027 -Removed awk from the doc build (it was added with the last release -- though
1028 not mentioned in the changelog).
1029 -Install of man page was moved to the doc dir. The install also installs the
1030 user guide to $(PREFIX)/share/doc/ragel/
1032 Ragel 1.3.2 - Oct 16, 2002
1033 ==========================
1034 -Added option -v (or --version) to show version information.
1035 -The subtract operator no longer removes transition data from the machine
1036 being subtracted. This is left up to the user for the purpose of making it
1037 possible to transfer transitions using subtract and also for speeding up the
1038 subtract routine. Note that it is possible to explicitly clear transition
1039 data before doing a subtract.
1040 -Rather severe typo bug fixed. Bug was related to transitions with higher
1041 priorities taking precedence. A wrong ptr was being returned. It appears to
1042 have worked most of the time because the old ptr was deleted and the new one
1043 allocated immediately after so the old ptr often pointed to the same space.
1045 -Bug in the removing of dead end paths was fixed. If the start state
1046 has in transitions then those paths were not followed when finding states to
1047 keep. Would result in non-dead end states being removed from the graph.
1048 -In lists and in ranges are no longer maintained as a bst with the key as the
1049 alphabet character and the value as a list of transitions coming in on that
1050 char. There is one list for each of inList, inRange and inDefault. Now that
1051 the required functionality of the graph is well known it is safe to remove
1052 these lists to gain in speed and footprint. They shouldn't be needed.
1053 -IsolateStartState() runs on modification of start data only if the start
1054 state is not already isolated, which is now possible with the new in list
1056 -Concat, Or and Star operators now use an approximation to
1057 removeUnreachableStates that does not require a traversal of the entire
1058 graph. This combined with an 'on-the-fly' management of final bits and final
1059 state status results in a dramatic speed increase when compiling machines
1060 that use those operators heavily. The strings2 test goes 20 times faster.
1061 -Before the final minimization, after all fsm operations are complete,
1062 priority data is reset which enables better minimization in cases where
1063 priorities would otherwise separate similar states.
1065 Ragel 1.3.1 - Oct 2, 2002
1066 =========================
1067 -Range transitions are now used to implement machines made with /[a-z]/ and
1068 the .. operator as well as most of the builtin machines. The ranges are not
1069 yet reflected in the output code, they are expanded as if they came from the
1070 regular single transitions. This is one step closer to arbitrary integer
1072 -The builtin machine 'any' was added. It is equiv to the builtin extend,
1073 matching any characters.
1074 -The builtin machine 'cntrl' now includes newline.
1075 -The builtin machine 'space' now includes newline.
1076 -The builtin machine 'ascii' is now the range 0-127, not all characters.
1077 -A man page was written.
1078 -A proper user guide was started. Chapter 1: Specifying Ragel Programs
1079 was written. It even has some diagrams :)
1081 Ragel 1.3.0 - Sept 4, 2002
1082 ==========================
1083 -NULL keyword no longer used in table output.
1084 -Though not yet in use, underlying graph structure changed to support range
1085 transitions. As a result, most of the code that walks transition lists is now
1086 implemented with an iterator that hides the complexity of the transition
1087 lists and ranges. Range transitions will be used to implement /[a-z]/ style
1088 machines and machines made with the .. operator. Previously a single
1089 transition would be used for each char in the range, which is very costly.
1090 Ranges eliminate much of the space complexity and allow for the .. operator
1091 to be used with very large (integer) alphabets.
1092 -New minimization similar to Hopcroft's alg. It does not require n^2 space and
1093 runs close to O(n*log(n)) (an exact analysis of the alg is very hard). It is
1094 much better than the stable and approx minimization and obsoletes them both.
1095 An exact implementation of Hopcroft's alg is desirable but not possible
1096 because the ragel implementation does not assume a finite alphabet, which
1097 Hopcroft's requires. Ragel will support arbitrary integer alphabets which
1098 must be treated as an infinite set for implementation considerations.
1099 -New option -m using above described minimization to replace all previous
1100 minimization options. Old options still work but are obsolete and not
1102 -Bug fixed in goto style output. The error exit set the current state to 0,
1103 which is actually a valid state. If the machine was entered again it would go
1104 into the first state, very wrong. If the first state happened to be final then
1105 an immediate finish would accept when in fact it should fail.
1106 -Slightly better fsm minimization now possible due to clearing of the
1107 transition ordering numbers just prior to minimization.
1109 Ragel 1.2.2 - May 25, 2002
1110 ==========================
1111 -Configuration option --prefix now works when installing.
1112 -cc file extension changed to cpp for better portability.
1113 -Unlink of output file upon error no longer happens, removes dependency on
1114 unlink system command.
1115 -All multiline strings removed: not standard c++.
1116 -Awk build dependency removed.
1117 -MSVC 6.0 added to the list of supported compilers (with some tweaking of
1118 bison and flex output).
1120 Ragel 1.2.1 - May 13, 2002
1121 ==========================
1122 -Automatic dependencies were fixed; they were not working correctly.
1123 -Updated AUTHORS file to reflect contributors.
1124 -Code is more C++ standards compliant: compiles with g++ 3.0
1125 -Fixed bugs that only showed up in g++ 3.0
1126 -Latest (unreleased) Aapl.
1127 -Configuration script bails out if bison++ is installed. Ragel will not
1128 compile with bison++ because it is coded in c++ and bison++ automatically
1129 generates a c++ parser. Ragel uses a c-style bison parser.
1131 Ragel 1.2.0 - May 3, 2002
1132 =========================
1133 -Underlying graph structure now supports default transitions. The result is
1134 that a transition does not need to be made for each char of the alphabet
1135 when making 'extend' or '/./' machines. Ragel compiles machines that
1136 use the aforementioned primitives WAY faster.
1137 -The ugly hacks needed to pick default transitions now go away due to
1138 the graph supporting default transitions directly.
1139 -If -e is given, but minimization is not turned on, print a warning.
1140 -Makefiles use automatic dependencies.
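The effect of default transitions in this release can be sketched as follows. This is a minimal illustration, not Ragel's graph code; the `State` class and the `any_char_machine` helper are hypothetical names. The point is that a machine such as '/./' needs one default edge instead of one explicit edge per character of the alphabet.

```python
class State:
    """A state with explicit transitions plus one optional default transition.

    Hypothetical illustration only; not taken from the Ragel sources.
    """

    def __init__(self, name):
        self.name = name
        self.explicit = {}   # key -> State, only for keys that differ from the default
        self.default = None  # State taken when no explicit transition matches

    def step(self, key):
        """Follow the explicit transition for key, else the default (may be None)."""
        return self.explicit.get(key, self.default)


def any_char_machine():
    """Build a machine that accepts any single symbol using one default edge."""
    start, final = State("start"), State("final")
    start.default = final
    return start, final
```

An explicit entry can still override the default for particular keys, so only the exceptions need to be stored.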
1142 Ragel 1.1.0 - April 15, 2002
1143 ============================
1144 -Added goto fsm: much faster than any other fsm style.
1145 -Default operator (if two machines are side by side with no operator
1146 between them) is concatenation. First showed up in 1.0.4.
1147 -The fsm machine no longer automatically builds the flat table for
1148 transition indices. Instead it keeps the key,ptr pair. In tabcodegen
1149 the flat table is produced. This way very large alphabets with sparse
1150 transitions will not consume large amounts of mem. This is also in prep
1151 for fsm graph getting a default transition.
1152 -Generated code contains a statement explicitly stating that ragel fsms
1153 are NOT covered by the GPL. Technically, Ragel copies part of itself
1154 to the output to make the generic fsm execution routine (for table driven
1155 fsms only) and so the output could be considered under the GPL. But this
1156 code is very trivial and could easily be rewritten. The actual fsm data
1157 is subject to the copyright of the source. To promote the use of Ragel,
1158 a special exception is made for the part of the output copied from Ragel:
1159 it may be used without restriction.
1160 -Much more elegant code generation scheme is employed. Code generation
1161 class members need only put the 'codegen' keyword after their 'void' type
1162 in order to be automatically registered to handle macros of the same name.
1163 An awk script recognises this keyword and generates an appropriate driver.
1164 -Ragel gets a test suite.
1165 -Postfunc and prefunc go away because they are not supported by non
1166 loop-driven fsms (goto, switch) and present duplicate functionality.
1167 Universal funcs can be implemented by using $ operator.
1168 -Automatic dependencies used in build system, no more make depend target.
1169 -Code generation section in docs.
1170 -Uses the latest aapl.
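The change from a flat transition table to key,ptr pairs in this release can be sketched as follows. This is an assumption-laden illustration, not the actual tabcodegen code: the function names and the use of -1 as an error target are hypothetical. The sparse form stores one entry per transition, while the flat form, built only when table output is generated, spans every key between the lowest and highest.

```python
from bisect import bisect_left


def sparse_lookup(keys, targets, key, error_target=-1):
    """Look up key in parallel sorted lists; memory is proportional to the
    number of transitions, not to the size of the alphabet."""
    i = bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return targets[i]
    return error_target


def flat_table(keys, targets, error_target=-1):
    """Expand the sparse form into a dense table covering keys[0]..keys[-1].

    Returns (low_key, table), where table[k - low_key] is the target for k.
    Assumes keys is sorted ascending, as in sparse_lookup.
    """
    if not keys:
        return 0, []
    low = keys[0]
    table = [error_target] * (keys[-1] - low + 1)
    for k, t in zip(keys, targets):
        table[k - low] = t
    return low, table
```

Three transitions on keys 10, 12 and 1000 take three entries in the sparse form but 991 slots in the flat table, which is why deferring the expansion helps with large, sparse alphabets.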
1172 Ragel 1.0.5 - March 3, 2002
1173 ===========================
1174 -Bugfix in SetErrorState that caused an assertion failure when compiling
1175 simple machines that did not have full transition tables (and thus did
1176 not show up on any example machines). Assertion failure did not occur
1177 when using the switch statement code as ragel does not call SetErrorState
1179 -Fixed some missing includes, now compiles on redhat.
1180 -Moved the FsmMachTrans Compare class out of FsmMachTrans. Some compilers
1181 don't deal with nested classes in templates too well.
1182 -Removed old unused BASEREF in fsmgraph and ragel now compiles using
1183 egcs-2.91.66 and presumably SUNWspro. The baseref is no longer needed
1184 because states do not support being elements in multiple lists. I would
1185 rather be able to support more compilers than have this feature.
1186 -Started a README with compilation notes. Started an AUTHORS file.
1187 -Started the user documentation. Describes basic machines and operators.
1189 Ragel 1.0.4 - March 1, 2002
1190 ===========================
1191 -Ported to the version of Aapl just after 2.2.0 release. See
1192 http://www.ragel.ca/aapl/ for details on aapl.
1193 -Fixed a bug in the clang example: the newline machine was not starred.
1194 -Added explanations to the clang and mailbox examples. This should
1195 help people that want to learn the language as the manual is far from
1198 Ragel 1.0.3 - Feb 2, 2002
1199 =========================
1200 -Added aapl to the ragel tree. No longer requires you to download
1201 and build aapl separately. Should avoid discouraging impatient users
1202 from compiling ragel.
1203 -Added the examples to the ragel tree.
1204 -Added configure script checks for bison and flex.
1205 -Fixed makefile so as not to die with newer versions of bison that
1206 write the header of the parser to a .hh file.
1207 -Started ChangeLog file.
1209 Ragel 1.0.2 - Jan 30, 2002
1210 ==========================
1211 -Bug fix in calculating highIndex for table based code. Was using
1212 the length of out transition table rather than the value at the
1214 -If high/low index are at the limits, output a define in their place,
1215 not the high/low values themselves so as not to cause compiler warnings.
1216 -If the resulting machines don't have any indices or functions, then
1217 omit the empty unreferenced static arrays so as not to cause compiler
1218 warnings about unused static vars.
1219 -Fixed variable sized indices support. The header cannot have any
1220 reference to INDEX_TYPE as that info is not known at the time the header
1221 data is written. Forces us to use a void * for pointers to indices. In
1222 the c++ versions we are forced to make much of the data non-member
1223 static data in the code portion for the same reason.
1225 Ragel 1.0.1 - Jan 28, 2002
1226 ==========================
1227 -Exe name change from reglang to ragel.
1228 -Added ftabcodegen output code style which uses a table for states and
1229 transitions but uses a switch statement for the function execution.
1230 -Reformatted options in usage dump to look better.
1231 -Support escape sequences in [] sections of regular expressions.
1233 Ragel 1.0 - Jan 25, 2002
1234 ========================