YACC is an automatic tool that generates the parser program.
As we have discussed YACC in the first unit of this tutorial so you can go through the concepts again to make things more clear.
YACC – Automatic Parser Generator
- YACC is a automatic tool that generates the parser program
- YACC stands for Yet Another Compiler Compiler. This program is available in UNIX OS
- The construction of LR parser requires lot of work for parsing the input string. Hence, the process must involve automation to achieve efficiency in parsing an input
- Basically YACC is a LALR parser generator that reports conflicts or uncertainties (if at all present) in the form of error messages
- The typical YACC translator can be represented as shown in the image
- YACC Specification
- Ambiguous Grammars in YACC
- Error Recovery in YACC
YACC Specification:
The YACC specification file consists of three parts
Declaration section: In this section, ordinary C declarations are inserted and grammar tokens are declared. The tokens should be declared between %{ and %}
Translation rule section
It includes the production rules of context free grammar with corresponding actions
Example:
Rule-1 action-1
Rule-2 action-2
:
:
Rule n action n
If there is more than one alternative to a single rule then those alternatives are separated by ‘|’ (pipe) character. The actions are typical C statements. If CFG is
LHS: alternative 1 | alternative 2 | …… alternative n
Then
LHS: alternative 1 {action 1}
| alternative 2 {action 1}
:
:
alternative n {action n}
C functions Section: this consists of one main function in which the routine yyparse() is called. And it also contains required C functions
The specification file comprising these sections can be written as:
%{
/ * declaration section */
%}
/* Translation rule section */
%%
/* Required C functions */
Example:
YACC Specification of a simple desk calculator:
%{
#include <ctype.h>
%}
%token DIGIT
%%
line: expr ‘\n’ { printf(“%d\n”, $1); }
;
expr : expr ‘+’ term { $$ = $1 + $3; }
| term
;
term : term ‘*’ factor { $$ = $1 * $3; }
| factor
;
factor : ‘(‘ expr ‘)’ { $$ = $2; }
| DIGIT
;
%%
yylex() {
int c;
c = getchar();
if(isdigit(c)
{
yylval = c-‘0’;
return DIGIT;
}
return c;
}
Ambiguous Grammars in YACC
YACC declarations resolve shift/reduce and reduce/reduce conflicts using operator precedence and operator associativity information
YACC has default methods for resolving conflicts. However, it is better to find out what conflicts arose and how they were resolved using the ‘-v’ option
The declarations provide a way to override YACCs defaults
Productions have the precedence of their rightmost terminal, unless otherwise specified by “%prec” element
The declaration keywords %left, %right and %nonassoc inform YACC that the following tokens are to be treated as left-associative (as binary + - * & / commonly are), right-associative (as exp often is), or non-associative (as binay < & > often are)
The order of declarations informs YACC that the tokens should be accorded increasing precedence
%left ‘+’ ‘-‘ | Effect is that * has higher precedence |
%left ‘*’ ‘I’ | Than +, so x+y*z is grouped like x+(y*z) |
YACC Specification for a more advanced desk calculator:
%{
#include <ctype.h>
#include <stdiio.h>
#define YYSTYPE double /* double type for YACC stack */
%}
%token NUMBER
%left ‘+’ ‘-‘
%left ‘*’ ‘/’
%right UMINUS
%%
lines : lines expr ‘\n’ { printf(“%g\n”, $2); }
| lines ‘\n’
| /* empty */
;
expr : expr ‘+’ expr { $$ = $1 + $3; }
| expr ‘-‘ expr { $$ = $1 - $3; }
| expr ‘*’ expr {$$ = $1 * $3; }
| expr ‘/’ expr {$$ = $1 / $3; }
| ‘(‘ expr ‘)’ {$$ = $2;}
| ‘-‘ expr %prec UMINUS { $$ = -$2; }
| NUMBER
;
%
yylex() {
int c;
while (( c = getchar() ) == ‘ ‘ );
if ((c == ‘.’) || (isdigit(c) ) {
unget(c, stdin);
scanf(“%1f”, &uulval);
return Number;
}
return c;
}
Error Recovery in YACC
In YACC, error recovery is performed with the help of error productions
To process errors at the level of a nonterminal A, add an error production A → error α. Here, error is a YACC reserved word, and α is a string of grammar symbols (may be empty)
If YACC encounters an error during parsing, it continues to pop symbols from its stack until it finds the topmost state on its stack, whose underlying set of items includes an item of the following form
A → .error α
The parser then shifts the false token error onto the stack as though it saw the token error of the input
When α is ε, reduction to A occurs immediately and the semantic action associated with the production A → error (which might be a user-specified error recovery routine) is invoked. The parser then discards the input symbols until it finds an input symbol on which normal parsing can proceed
If α is not empty, YACC skips the input and looks for a substring that can be “reduced” to α on the stack. Now, the parser has an error α on top of its stack, reducing it to A. The parser then resumes normal parsing
YACC specification contains a translation rule, for example
lines : error ‘\n’ { yyerror(“reenter previous line:”); yyerrok; }
Desk Calculator with Error Recovery:
%{
#include <ctype.h>
#include <stdiio.h>
#define YYSTYPE double /* double type for YACC stack */
%}
%token NUMBER
%left ‘+’ ‘-‘
%left ‘*’ ‘/’
%right UMINUS
%%
lines : lines expr ‘\n’ { printf(“%g\n”, $2); }
| lines ‘\n’
| /* empty */
| error ‘\n’ { yyerror(“reenter previous line”);
Yyerrok;
;
expr : expr ‘+’ expr { $$ = $1 + $3; }
| expr ‘-‘ expr { $$ = $1 - $3; }
| expr ‘*’ expr {$$ = $1 * $3; }
| expr ‘/’ expr {$$ = $1 / $3; }
| ‘(‘ expr ‘)’ {$$ = $2;}
| ‘-‘ expr %prec UMINUS { $$ = -$2; }
| NUMBER
;
%
#include “lex.yy.c”
0 Comments