Header Ads Widget

Why FIRST and FOLLOW in Compiler Design?

Why FIRST?
If the compiler would have come to know in advance, that what is the “first character of the string produced when a production rule is applied”, and comparing it to the current character or token in the input string it sees, it can wisely take decision on which production rule to apply.

Let’s take the grammar:

S -> cAd

A -> bc|a 

And the input string is “cad”. 

Thus, in the example above, if it knew that after reading character ‘c’ in the input string and applying S->cAd, next character in the input string is ‘a’, then it would have ignored the production rule A->bc (because ‘b’ is the first character of the string produced by this production rule, not ‘a’ ), and directly use the production rule A->a (because ‘a’ is the first character of the string produced by this production rule, and is same as the current character of the input string which is also ‘a’).
Hence it is validated that if the compiler/parser knows about first character of the string that can be obtained by applying a production rule, then it can wisely apply the correct production rule to get the correct syntax tree for the given input string.

Why FOLLOW?
The parser faces one more problem. Let us consider below grammar to understand this problem.

A -> aBb
B -> c | ε
And suppose the input string is “ab” to parse.

As the first character in the input is a, the parser applies the rule A->aBb.

A
/ | \
a B b

Now the parser checks for the second character of the input string which is b, and the Non-Terminal to derive is B, but the parser can’t get any string derivable from B that contains b as first character.
But the Grammar does contain a production rule B -> ε, if that is applied then B will vanish, and the parser gets the input “ab”, as shown below. But the parser can apply it only when it knows that the character that follows B in the production rule is same as the current character in the input.

In RHS of A -> aBb, b follows Non-Terminal B, i.e. FOLLOW(B) = {b}, and the current input character read is also b. Hence the parser applies this rule. And it is able to get the string “ab” from the given grammar.

A A
/ | \ / \
a B b => a b
|
ε

So FOLLOW can make a Non-terminal vanish out if needed to generate the string from the parse tree.

 

The conclusions is, we need to find FIRST and FOLLOW sets for a given grammar so that the parser can properly apply the needed rule at the correct position.

In the next article, we will discuss formal definitions of FIRST and FOLLOW, and some easy rules to compute these sets.

Post a Comment

0 Comments