1 .\" $Id: yacc.1,v 1.42 2022/11/06 17:07:16 tom Exp $
3 .\" .TH YACC 1 "July\ 15,\ 1990"
18 .\" Escape single quotes in literal strings from groff's Unicode transform.
25 .\" Bulleted paragraph
30 .TH YACC 1 "November 6, 2022" "Berkeley Yacc" "User Commands"
32 \*N \- an LALR(1) parser generator
34 .B \*n [ \-BdghilLPrtvVy ] [ \-b
46 reads the grammar specification in the file
48 and generates an LALR(1) parser for it.
49 The parsers consist of a set of LALR(1) parsing tables and a driver routine
50 written in the C programming language.
52 normally writes the parse tables and the driver routine to the file
55 The following options are available:
57 \fB\-b \fIfile_prefix\fR
60 option changes the prefix prepended to the output file names to
63 The default prefix is the character
67 create a backtracking parser (compile-time configuration for \fBbtyacc\fP).
70 causes the header file
73 It contains #define's for the token identifiers.
76 print a usage message to the standard error.
78 \fB\-H \fIdefines_file\fR
79 causes #define's for the token identifiers
80 to be written to the given \fIdefines_file\fP rather
81 than the \fBy.tab.h\fP file used by the \fB\-d\fP option.
86 option causes a graphical description of the generated LALR(1) parser to
87 be written to the file
89 in graphviz format, ready to be processed by
93 The \fB\-i\fR option causes a supplementary header file
96 It contains extern declarations
97 and supplementary #define's as needed to map the conventional \fIyacc\fP
98 \fByy\fP-prefixed names to whatever the \fB\-p\fP option may specify.
99 The code file, e.g., \fBy.tab.c\fP, is modified to #include this file
100 as well as the \fBy.tab.h\fP file, enforcing consistent usage of the
101 symbols defined in those files.
103 The supplementary header file makes it simpler to compile
104 lex and yacc files separately.
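A minimal sketch of the kind of mapping such a header provides, assuming the parser was generated with \fB\-p calc_\fP.
The names \fBcalc_parse\fP and \fBcalc_lval\fP are hypothetical, and the exact set of #define's written by \fB\*n\fP may differ.

```c
#include <assert.h>

/* Assumed mapping for a hypothetical "-p calc_" run: the conventional
 * yy-prefixed names become aliases for the prefixed ones. */
#define yyparse calc_parse
#define yylval  calc_lval

/* Code written against the conventional names... */
int yylval = 7;

int
yyparse(void)
{
    assert(yylval == 7);
    return 0;
}

/* ...really defines calc_lval and calc_parse, so a second parser in the
 * same program can use another prefix without a link-time clash. */
```

A lexer compiled separately would include the same header and so refer to the same prefixed symbols.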
109 option is not specified,
111 will insert \fI#line\fP directives in the generated code.
112 The \fI#line\fP directives let the C compiler relate errors in the
113 generated code to the user's original code.
114 If the \fB\-l\fR option is specified,
116 will not insert the \fI#line\fP directives.
117 \&\fI#line\fP directives specified by the user will be retained.
120 enable position processing,
121 e.g., \*(``%locations\*('' (compile-time configuration for \fBbtyacc\fP).
123 \fB\-o \fIoutput_file\fR
124 specify the filename for the parser file.
125 If this option is not given, the output filename is
126 the file prefix concatenated with the file suffix, e.g., \fBy.tab.c\fP.
127 This overrides the \fB\-b\fP option.
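The naming rule for \fB\-o\fP and \fB\-b\fP can be sketched in C as follows.
The helper \fBparser_file_name\fP is hypothetical and is not taken from the \fB\*n\fP sources; the default prefix and the suffix are assumed from the \fBy.tab.c\fP example above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: choose the name of the parser file from the values
 * given to -o and -b (NULL when the option was not given). */
static const char *
parser_file_name(const char *o_name, const char *b_prefix,
                 char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    if (o_name != NULL)              /* -o overrides -b */
        return o_name;
    /* Otherwise: file prefix followed by file suffix.
     * "y" and ".tab.c" are assumptions based on the y.tab.c example. */
    snprintf(buf, size, "%s%s", b_prefix != NULL ? b_prefix : "y", ".tab.c");
    return buf;
}
```

With neither option the sketch yields y.tab.c, with a prefix of calc it yields calc.tab.c, and any name given to \fB\-o\fP is returned unchanged.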
129 \fB\-p \fIsymbol_prefix\fR
132 option changes the prefix prepended to yacc-generated symbols to
133 the string denoted by
135 The default prefix is the string
139 create a reentrant parser, e.g., \*(``%pure\-parser\*(''.
146 to produce separate files for code and tables.
147 The code file is named
149 and the tables file is named
151 The prefix \*(``\fIy.\fP\*('' can be overridden using the \fB\-b\fP option.
154 suppress \*(``\fB#define\fP\*('' statements generated for string literals in
155 a \*(``\fB%token\fP\*('' statement,
156 to more closely match original \fByacc\fP behavior.
158 Normally when \fB\*n\fP sees a line such as
163 it notices that the quoted \*(``ADD\*('' is a valid C identifier,
164 and generates a #define not only for OP_ADD,
173 The original \fByacc\fP does not generate the second \*(``\fB#define\fP\*(''.
174 The \fB\-s\fP option suppresses this \*(``\fB#define\fP\*(''.
176 POSIX (IEEE 1003.1 2004) documents only names and numbers
177 for \*(``\fB%token\fP\*('',
178 though original \fByacc\fP and bison also accept string literals.
183 option changes the preprocessor directives generated by
185 so that debugging statements will be incorporated in the compiled code.
187 \fB\*N\fR sends debugging output to the standard output
188 (compatible with both the original \fByacc\fP and \fBbtyacc\fP),
189 while \fBbtyacc\fP writes debugging output to the standard error
195 option causes a human-readable description of the generated parser to
196 be written to the file
200 print the version number to the standard output.
203 \fB\*n\fP ignores this option,
204 which bison supports for ostensible POSIX compatibility.
206 The \fIfilename\fP parameter is not optional.
207 However, \fB\*n\fP accepts a single \*(``\-\*('' to read the grammar
208 from the standard input.
209 A double \*(``\-\-\*('' marker denotes the end of options.
210 A single \fIfilename\fP parameter is expected after a \*(``\-\-\*('' marker.
213 provides some extensions for
214 compatibility with bison and other implementations of yacc.
215 It accepts several \fIlong options\fP which have equivalents in \*n.
216 The \fB%destructor\fP and \fB%locations\fP features are available
217 only if \fB\*n\fP has been configured and compiled to support the
218 back-tracking (\fBbtyacc\fP) functionality.
219 The remaining features are always available:
221 \fB %code\fP \fIkeyword\fP { \fIcode\fP }
222 Adds the indicated source \fIcode\fP at a given point in the output file.
223 The optional \fIkeyword\fP tells \fB\*n\fP where to insert the \fIcode\fP:
227 just after the version-definition in the generated code-file.
230 just after the declaration of public parser variables.
231 If the \fB\-d\fP option is given, the code is inserted at the
232 beginning of the defines-file.
235 just after the declaration of private parser variables.
236 If the \fB\-d\fP option is given, the code is inserted at the
237 end of the defines-file.
240 If no \fIkeyword\fP is given, the code is inserted at the
241 beginning of the section of code copied verbatim from the source file.
242 Multiple \fB%code\fP directives may be given;
243 \fB\*n\fP inserts those into the corresponding code- or defines-file
244 in the order that they appear in the source file.
247 This has the same effect as the \*(``\-t\*('' command-line option.
249 \fB %destructor\fP { \fIcode\fP } \fIsymbol+\fP
250 defines code that is invoked when a symbol is automatically
251 discarded during error recovery.
252 This code can be used to
253 reclaim dynamically allocated memory associated with the corresponding
254 semantic value for cases where user actions cannot manage the memory
257 On encountering a parse error, the generated parser
258 discards symbols on the stack and input tokens until it reaches a state
259 that will allow parsing to continue.
260 This error recovery approach results in a memory leak
261 if the \fBYYSTYPE\fP value is, or contains,
262 pointers to dynamically allocated memory.
264 The bracketed \fIcode\fP is invoked whenever the parser discards one of
266 Within \fIcode\fP, \*(``\fB$$\fP\*('' or
267 \*(``\fB$<\fItag\fB>$\fR\*('' designates the semantic value associated with the
268 discarded symbol, and \*(``\fB@$\fP\*('' designates its location (see
269 \fB%locations\fP directive).
271 A per-symbol destructor is defined by listing a grammar symbol
272 in \fIsymbol+\fP. A per-type destructor is defined by listing
273 a semantic type tag (e.g., \*(``<some_tag>\*('') in \fIsymbol+\fP; in this
274 case, the parser will invoke \fIcode\fP whenever it discards any grammar
275 symbol that has that semantic type tag, unless that symbol has its own
276 per-symbol destructor.
278 Two categories of default destructor are supported that are
279 invoked when discarding any grammar symbol that has no per-symbol and no
283 the code for \*(``\fB<*>\fP\*('' is used
284 for grammar symbols that have an explicitly declared semantic type tag
285 (via \*(``\fB%type\fP\*('');
287 the code for \*(``\fB<>\fP\*('' is used
288 for grammar symbols that have no declared semantic type tag.
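The order in which a destructor is chosen for a discarded symbol can be sketched in C.
The structure, the field names and the sample destructors below are hypothetical and are used only to show the precedence described above; they do not reflect the data structures of \fB\*n\fP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef void (*destructor_fn)(void *value);

/* Hypothetical description of one grammar symbol. */
struct symbol_info {
    destructor_fn per_symbol;  /* %destructor listing the symbol, or NULL */
    destructor_fn per_type;    /* %destructor listing its <tag>, or NULL  */
    int has_type_tag;          /* nonzero if a type tag is declared       */
};

/* Sample destructors so the sketch can be exercised. */
static void release_heap(void *value) { free(value); }
static void release_nothing(void *value) { (void)value; }

/* Precedence: per-symbol, then per-type, then the matching default.
 * The result may be NULL if no applicable destructor was declared. */
static destructor_fn
select_destructor(const struct symbol_info *sym,
                  destructor_fn typed_default,    /* code for <*> */
                  destructor_fn untyped_default)  /* code for <>  */
{
    assert(sym != NULL);
    if (sym->per_symbol != NULL)
        return sym->per_symbol;
    if (sym->has_type_tag && sym->per_type != NULL)
        return sym->per_type;
    return sym->has_type_tag ? typed_default : untyped_default;
}
```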
292 ignored by \fB\*n\fP.
294 \fB %expect\fP \fInumber\fP
295 tells \fB\*n\fP the expected number of shift/reduce conflicts.
296 \fB\*n\fP then reports the conflict count only if it differs from that number.
298 \fB %expect\-rr\fP \fInumber\fP
299 tells \fB\*n\fP the expected number of reduce/reduce conflicts.
300 \fB\*n\fP then reports the conflict count only if it differs from that number.
301 Unlike bison, \fB\*n\fP allows this directive for LALR parsers.
304 tells \fB\*n\fP to enable management of position information associated
305 with each token, provided by the lexer in the global variable \fByylloc\fP,
306 similar to management of semantic value information provided in \fByylval\fP.
308 As for semantic values, locations can be referenced within actions using
309 \fB@$\fP to refer to the location of the left hand side symbol, and \fB@\fIN\fR
310 (\fIN\fP an integer) to refer to the location of one of the right hand side
312 Also as for semantic values, when a rule is matched, a default
313 action is used to compute the location represented by \fB@$\fP as the
314 beginning of the first symbol and the end of the last symbol in the right
315 hand side of the rule.
316 This default computation can be overridden by
317 explicit assignment to \fB@$\fP in a rule action.
319 The type of \fByylloc\fP is \fBYYLTYPE\fP, which is defined by default as:
321 typedef struct YYLTYPE {
329 \fBYYLTYPE\fP can be redefined by the user
330 (\fBYYLTYPE_IS_DEFINED\fP must be defined, to inhibit the default)
331 in the declarations section of the specification file.
332 As in bison, the macro \fBYYLLOC_DEFAULT\fP is invoked
333 each time a rule is matched to calculate a position for the left hand side of
334 the rule, before the associated action is executed; this macro can be
335 redefined by the user.
337 This directive adds a \fBYYLTYPE\fP parameter to \fByyerror()\fP.
338 If the \fB%pure\-parser\fP directive is present,
339 a \fBYYLTYPE\fP parameter is added to \fByylex()\fP calls.
341 \fB %lex\-param\fP { \fIargument-declaration\fP }
342 By default, the lexer accepts no parameters, e.g., \fByylex()\fP.
343 Use this directive to add parameter declarations for your customized lexer.
345 \fB %parse\-param\fP { \fIargument-declaration\fP }
346 By default, the parser accepts no parameters, e.g., \fByyparse()\fP.
347 Use this directive to add parameter declarations for your customized parser.
350 Most variables (other than \fByydebug\fP and \fByynerrs\fP) are
351 allocated on the stack within \fByyparse\fP, making the parser reasonably
355 Make the parser's names for tokens available in the \fByytname\fP array.
358 does not predefine \*(``$end\*('', \*(``$error\*(''
359 or \*(``$undefined\*('' in this array.
361 According to Robert Corbett,
363 Berkeley Yacc is an LALR(1) parser generator. Berkeley Yacc
364 has been made as compatible as possible with AT&T Yacc.
365 Berkeley Yacc can accept any input specification that
366 conforms to the AT&T Yacc documentation. Specifications
367 that take advantage of undocumented features of AT&T Yacc
368 will probably be rejected.
373 http://pubs.opengroup.org/onlinepubs/9699919799/utilities/yacc.html
376 documents some features of AT&T yacc which are no longer required for POSIX
379 That said, you may be interested in reusing grammar files with some
380 other implementation which is not strictly compatible with AT&T yacc.
381 For instance, there is bison.
382 Here are a few differences:
384 \fBYacc\fP accepts an equals sign preceding the left curly brace
385 of an action (as in the original grammar file \fBftp.y\fP):
393 \fBYacc\fP and bison emit code in different order, and in particular bison
394 makes forward reference to common functions such as yylex, yyparse and
395 yyerror without providing prototypes.
397 Bison's support for \*(``%expect\*('' is broken in more than one release.
398 For best results using bison, delete that directive.
400 Bison has no equivalent for some of \fB\*n\fP's command-line options,
401 relying on directives embedded in the grammar file.
403 Bison's \*(``\fB\-y\fP\*('' option does not affect bison's lack of support for
404 features of AT&T yacc which were deemed obsolescent.
406 \fBYacc\fP accepts multiple parameters
407 with \fB%lex\-param\fP and \fB%parse\-param\fP in two forms
409 {type1 name1} {type2 name2} ...
410 {type1 name1, type2 name2 ...}
413 Bison accepts the latter (though undocumented), but depending on the
414 release may generate bad code.
416 Like bison, \fB\*n\fP will add parameters specified via \fB%parse\-param\fP
417 to \fByyparse\fP, \fByyerror\fP and (if configured for back-tracking)
418 to the destructor declared using \fB%destructor\fP.
419 Bison puts the additional parameters \fIfirst\fP for
420 \fByyparse\fP and \fByyerror\fP but \fIlast\fP for destructors.
421 \fBYacc\fP matches this behavior.
424 If there are rules that are never reduced, the number of such rules is
425 reported on standard error.
426 If there are any LALR(1) conflicts, the number of conflicts is reported