4.4 Top-Down Parsing

Top-down parsing can be viewed as the problem of constructing a parse tree for the input string, starting from the root and creating the nodes of the parse tree in preorder (depth-first, as discussed in Section 2.3.4). Equivalently, top-down parsing can be viewed as finding a leftmost derivation for an input string.

Example 4.27: The sequence of parse trees in Fig. 4.12 for the input id + id * id is a top-down parse according to grammar (4.2), repeated here:

E → T E’

E’ → + T E’ | ϵ

T → F T’

T’ → * F T’ | ϵ

F → ( E ) | id

(4.28)

This sequence of trees corresponds to a leftmost derivation of the input. □

At each step of a top-down parse, the key problem is that of determining the production to be applied for a nonterminal, say A. Once an A-production is chosen, the rest of the parsing process consists of “matching” the terminal symbols in the production body with the input string.

The section begins with a general form of top-down parsing, called recursive-descent parsing, which may require backtracking to find the correct A-production to be applied. Section 2.4.2 introduced predictive parsing, a special case of recursive-descent parsing, where no backtracking is required. Predictive parsing chooses the correct A-production by looking ahead at the input a fixed number of symbols; typically, we look at only one symbol (that is, the next input symbol).

Figure 4.12: Top-down parse for id + id * id

For example, consider the top-down parse in Fig. 4.12, which constructs a tree with two nodes labeled E’. At the first E’ node (in preorder), the production E’ → +T E’ is chosen; at the second E’ node, the production E’ → ϵ is chosen. A predictive parser can choose between E’-productions by looking at the next input symbol.

The class of grammars for which we can construct predictive parsers looking k symbols ahead in the input is sometimes called the LL(k) class. We discuss the LL(1) class in Section 4.4.3, but introduce certain computations, called FIRST and FOLLOW, in a preliminary Section 4.4.2. From the FIRST and FOLLOW sets for a grammar, we shall construct “predictive parsing tables,” which make explicit the choice of production during top-down parsing. These sets are also useful during bottom-up parsing, as we shall see.

In Section 4.4.4 we give a non-recursive parsing algorithm that maintains a stack explicitly, rather than implicitly via recursive calls. Finally, in Section 4.4.5 we discuss error recovery during top-down parsing.

4.4.1 Recursive-Descent Parsing

void A() {
 1)     choose an A-production, A → X1 X2 … Xk;
 2)     for ( i = 1 to k ) {
 3)         if ( Xi is a nonterminal )
 4)             call procedure Xi();
 5)         else if ( Xi equals the current input symbol a )
 6)             advance the input to the next symbol;
 7)         else /* an error has occurred */;
        }
 }

Figure 4.13: A typical procedure for a nonterminal in a top-down parser

A recursive-descent parsing program consists of a set of procedures, one for each nonterminal. Execution begins with the procedure for the start symbol, which halts and announces success if its procedure body scans the entire input string. Pseudocode for a typical nonterminal appears in Fig. 4.13. Note that this pseudocode is nondeterministic, since it begins by choosing the A-production to apply in a manner that is not specified.

General recursive-descent may require backtracking; that is, it may require repeated scans over the input. However, backtracking is rarely needed to parse programming language constructs, so backtracking parsers are not seen frequently. Even for situations like natural language parsing, backtracking is not very efficient, and tabular methods such as the dynamic programming algorithm of Exercise 4.4.9 or the method of Earley (see the bibliographic notes) are preferred.

To allow backtracking, the code of Fig. 4.13 needs to be modified. First, we cannot choose a unique A-production at line (1), so we must try each of several productions in some order. Then, failure at line (7) is not ultimate failure, but suggests only that we need to return to line (1) and try another A-production. Only if there are no more A-productions to try do we declare that an input error has been found. In order to try another A-production, we need to be able to reset the input pointer to where it was when we first reached line (1). Thus, a local variable is needed to store this input pointer for future use.

Example 4.29: Consider the grammar

S → c A d

A → a b | a

To construct a parse tree top-down for the input string w = cad, begin with a tree consisting of a single node labeled S, and the input pointer pointing to c, the first symbol of w. S has only one production, so we use it to expand S and obtain the tree of Fig. 4.14(a). The leftmost leaf, labeled c, matches the first symbol of input w , so we advance the input pointer to a, the second symbol of w , and consider the next leaf, labeled A.

(a)

(b)

(c)

Figure 4.14: Steps in a top-down parse

Now, we expand A using the first alternative A → a b to obtain the tree of Fig. 4.14(b). We have a match for the second input symbol, a, so we advance the input pointer to d, the third input symbol, and compare d against the next leaf, labeled b. Since b does not match d, we report failure and go back to A to see whether there is another alternative for A that has not been tried, but that might produce a match.

In going back to A, we must reset the input pointer to position 2, the position it had when we first came to A, which means that the procedure for A must store the input pointer in a local variable.

The second alternative for A produces the tree of Fig. 4.14(c). The leaf a matches the second symbol of w and the leaf d matches the third symbol. Since we have produced a parse tree for w, we halt and announce successful completion of parsing. □
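The backtracking scheme of this example can be sketched concretely. The code below is an illustrative rendering of ours (class and method names are not from the text): each nonterminal gets a procedure, and the procedure for A saves the input pointer in a local variable, tries A → a b, and on failure resets the pointer and tries A → a, just as in the discussion above.

```python
# Backtracking recursive-descent parser for the grammar of Example 4.29:
#     S -> c A d
#     A -> a b | a
class Parser:
    def __init__(self, w):
        self.w = w      # input string
        self.pos = 0    # input pointer

    def match(self, t):
        # Consume terminal t if it is the current input symbol.
        if self.pos < len(self.w) and self.w[self.pos] == t:
            self.pos += 1
            return True
        return False

    def S(self):
        # S -> c A d
        return self.match('c') and self.A() and self.match('d')

    def A(self):
        save = self.pos                      # remember the input pointer
        if self.match('a') and self.match('b'):
            return True                      # A -> a b succeeded
        self.pos = save                      # backtrack, try A -> a
        return self.match('a')

def parse(w):
    p = Parser(w)
    return p.S() and p.pos == len(p.w)       # whole input must be consumed
```

Note that the order in which alternatives are tried matters: trying A → a first on input cabd would match a, fail on d against b, and only succeed after backtracking to A → a b.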

A left-recursive grammar can cause a recursive-descent parser, even one with backtracking, to go into an infinite loop. That is, when we try to expand a nonterminal A, we may eventually find ourselves again trying to expand A without having consumed any input.

4.4.2 FIRST and FOLLOW

The construction of both top-down and bottom-up parsers is aided by two functions, FIRST and FOLLOW, associated with a grammar G. During top-down parsing, FIRST and FOLLOW allow us to choose which production to apply, based on the next input symbol. During panic-mode error recovery, sets of tokens produced by FOLLOW can be used as synchronizing tokens.

Define FIRST (α), where α is any string of grammar symbols, to be the set of terminals that begin strings derived from α. If α *⇒ϵ, then ϵ is also in FIRST (α). For example, in Fig. 4.15, A *⇒cγ, so c is in FIRST (A).

For a preview of how FIRST can be used during predictive parsing, consider two A-productions A → α|β, where FIRST (α) and FIRST (β) are disjoint sets. We can then choose between these A-productions by looking at the next input symbol a, since a can be in at most one of FIRST (α) and FIRST (β), not both. For instance, if a is in FIRST (β) choose the production A → β. This idea will be explored when LL(1) grammars are defined in Section 4.4.3.

 

Figure 4.15: Terminal c is in FIRST(A) and a is in FOLLOW (A)

Define FOLLOW(A), for nonterminal A, to be the set of terminals a that can appear immediately to the right of A in some sentential form; that is, the set of terminals a such that there exists a derivation of the form S *⇒αAaβ , for some α and β , as in Fig. 4.15. Note that there may have been symbols between A and a, at some time during the derivation, but if so, they derived ϵ and disappeared. In addition, if A can be the rightmost symbol in some sentential form, then $ is in FOLLOW (A); recall that $ is a special “endmarker” symbol that is assumed not to be a symbol of any grammar.

To compute FIRST (X) for all grammar symbols X, apply the following rules until no more terminals or ϵ can be added to any FIRST set.

1.     If X is a terminal, then FIRST(X) = {X}.

2.     If X is a nonterminal and X → Y1 Y2 … Yk is a production for some k≥1, then place a in FIRST(X) if for some i, a is in FIRST (Yi), and ϵ is in all of FIRST(Y1), … , FIRST(Yi-1); that is, Y1 … Yi-1 *⇒ϵ. If ϵ is in FIRST (Yj) for all j = 1, 2, … , k, then add ϵ to FIRST (X). For example, everything in FIRST (Y1) is surely in FIRST(X). If Y1 does not derive ϵ, then we add nothing more to FIRST(X), but if Y1 *⇒ϵ, then we add FIRST (Y2), and so on.

3.     If X → ϵ is a production, then add ϵ to FIRST(X).

Now, we can compute FIRST for any string X1 X2 … Xn as follows. Add to FIRST (X1 X2 … Xn) all non-ϵ symbols of FIRST(X1). Also add the non-ϵ symbols of FIRST (X2), if ϵ is in FIRST (X1); the non-ϵ symbols of FIRST (X3), if ϵ is in both FIRST (X1) and FIRST(X2); and so on. Finally, add ϵ to FIRST (X1 X2 … Xn) if, for all i, ϵ is in FIRST (Xi).
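The rules for FIRST amount to a fixed-point iteration. The sketch below is ours, not the text's: it assumes a grammar represented as a dict from each nonterminal to a list of production bodies (tuples of symbols), with the empty tuple standing for an ϵ-body and the empty string '' standing for ϵ itself.

```python
EPS = ''   # stands for ϵ

def compute_first(grammar, terminals):
    """Apply rules (1)-(3) until no FIRST set changes."""
    first = {t: {t} for t in terminals}          # rule 1: FIRST(a) = {a}
    first.update({A: set() for A in grammar})
    changed = True
    while changed:
        changed = False
        for A, bodies in grammar.items():
            for body in bodies:
                before = len(first[A])
                if not body:
                    first[A].add(EPS)            # rule 3: A -> eps
                else:                            # rule 2: scan Y1 Y2 ... Yk
                    for Y in body:
                        first[A] |= first[Y] - {EPS}
                        if EPS not in first[Y]:
                            break
                    else:                        # every Yi derives eps
                        first[A].add(EPS)
                if len(first[A]) != before:
                    changed = True
    return first

# Grammar (4.28); each body is a tuple of symbols, () meaning an eps-body.
GRAMMAR = {
    'E':  [('T', "E'")],
    "E'": [('+', 'T', "E'"), ()],
    'T':  [('F', "T'")],
    "T'": [('*', 'F', "T'"), ()],
    'F':  [('(', 'E', ')'), ('id',)],
}
FIRST = compute_first(GRAMMAR, {'id', '+', '*', '(', ')'})
```

Running this on grammar (4.28) reproduces the FIRST sets of Example 4.30, e.g. FIRST(E) = {(, id} and FIRST(E’) = {+, ϵ}.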

To compute FOLLOW (A) for all nonterminals A, apply the following rules until nothing can be added to any FOLLOW set.

1.     Place $ in FOLLOW (S), where S is the start symbol, and $ is the input right endmarker.

2.     If there is a production A → αBβ, then everything in FIRST (β) except ϵ is in FOLLOW (B).

3.     If there is a production A → αB, or a production A → αBβ, where FIRST (β) contains ϵ, then everything in FOLLOW (A) is in FOLLOW (B).
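The FOLLOW rules likewise yield a fixed-point computation. In this sketch of ours (same grammar representation as before: bodies as tuples, '' for ϵ, '$' for the endmarker), the FIRST sets are taken as given and are hard-coded here for grammar (4.28).

```python
EPS = ''   # stands for ϵ

def first_of(first, symbols):
    """FIRST of a string of grammar symbols; the empty string gives {eps}."""
    result = set()
    for X in symbols:
        result |= first[X] - {EPS}
        if EPS not in first[X]:
            return result
    result.add(EPS)
    return result

def compute_follow(grammar, first, start):
    """Apply rules (1)-(3) until no FOLLOW set changes."""
    follow = {A: set() for A in grammar}
    follow[start].add('$')                            # rule 1
    changed = True
    while changed:
        changed = False
        for A, bodies in grammar.items():
            for body in bodies:
                for i, B in enumerate(body):
                    if B not in grammar:              # B must be a nonterminal
                        continue
                    before = len(follow[B])
                    beta_first = first_of(first, body[i + 1:])
                    follow[B] |= beta_first - {EPS}   # rule 2
                    if EPS in beta_first:             # rule 3: beta =>* eps
                        follow[B] |= follow[A]
                    if len(follow[B]) != before:
                        changed = True
    return follow

# Grammar (4.28), with its FIRST sets supplied directly.
GRAMMAR = {
    'E':  [('T', "E'")],
    "E'": [('+', 'T', "E'"), ()],
    'T':  [('F', "T'")],
    "T'": [('*', 'F', "T'"), ()],
    'F':  [('(', 'E', ')'), ('id',)],
}
FIRST = {'E': {'(', 'id'}, "E'": {'+', EPS}, 'T': {'(', 'id'},
         "T'": {'*', EPS}, 'F': {'(', 'id'},
         '+': {'+'}, '*': {'*'}, '(': {'('}, ')': {')'}, 'id': {'id'}}
FOLLOW = compute_follow(GRAMMAR, FIRST, 'E')
```

Note how first_of on an empty tail returns {ϵ}, so the A → αB case of rule (3) falls out of the A → αBβ case automatically.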

Example 4.30: Consider again the non-left-recursive grammar (4.28). Then:

1.     FIRST (F) = FIRST (T) = FIRST (E) = {(, id}. To see why, note that the two productions for F have bodies that start with these two terminal symbols, id and the left parenthesis. T has only one production, and its body starts with F. Since F does not derive ϵ, FIRST (T) must be the same as FIRST (F). The same argument covers FIRST (E).

2.     FIRST (E’) = {+, ϵ}. The reason is that one of the two productions for E’ has a body that begins with terminal +, and the other's body is ϵ. Whenever a nonterminal derives ϵ, we place ϵ in FIRST for that nonterminal.

3.     FIRST (T’) = {*, ϵ}. The reasoning is analogous to that for FIRST (E’).

4.     FOLLOW (E) = FOLLOW (E’) = {), $}. Since E is the start symbol, FOLLOW (E) must contain $. The production body (E) explains why the right parenthesis is in FOLLOW (E). For E’, note that this nonterminal appears only at the ends of bodies of E-productions. Thus, FOLLOW (E’) must be the same as FOLLOW (E).

5.     FOLLOW (T) = FOLLOW (T’) = {+, ), $}. Notice that T appears in bodies only followed by E’. Thus, everything except ϵ that is in FIRST (E’) must be in FOLLOW (T); that explains the symbol +. However, since FIRST (E’) contains ϵ (i.e., E’ *⇒ϵ), and E’ is the entire string following T in the bodies of the E-productions, everything in FOLLOW (E) must also be in FOLLOW (T). That explains the symbols $ and the right parenthesis. As for T’, since it appears only at the ends of the T -productions, it must be that FOLLOW (T’) = FOLLOW (T).

6.     FOLLOW (F) = {+, *, ), $}. The reasoning is analogous to that for T in point (5).

4.4.3 LL(1) Grammars

Predictive parsers, that is, recursive-descent parsers needing no backtracking, can be constructed for a class of grammars called LL(1). The first “L” in LL(1) stands for scanning the input from left to right, the second “L” for producing a leftmost derivation, and the “1” for using one input symbol of lookahead at each step to make parsing action decisions.

Transition Diagrams for Predictive Parsers

Transition diagrams are useful for visualizing predictive parsers. For example, the transition diagrams for nonterminals E and E’ of grammar (4.28) appear in Fig. 4.16(a). To construct the transition diagram from a grammar, first eliminate left recursion and then left factor the grammar. Then, for each nonterminal A,

1.     Create an initial and final (return) state.

2.     For each production A →X1 X2 … Xk, create a path from the initial to the final state, with edges labeled X1, X2, …, Xk. If A → ϵ, the path is an edge labeled ϵ.

Transition diagrams for predictive parsers differ from those for lexical analyzers. Parsers have one diagram for each nonterminal. The labels of edges can be tokens or nonterminals. A transition on a token (terminal) means that we take that transition if that token is the next input symbol. A transition on a nonterminal A is a call of the procedure for A.

With an LL(1) grammar, the ambiguity of whether or not to take an ϵ-edge can be resolved by making ϵ-transitions the default choice.

Transition diagrams can be simplified, provided the sequence of grammar symbols along paths is preserved. We may also substitute the diagram for a nonterminal A in place of an edge labeled A. The diagrams in Fig. 4.16(a) and (b) are equivalent: if we trace paths from E to an accepting state and substitute for E’, then, in both sets of diagrams, the grammar symbols along the paths make up strings of the form T + T + … + T. The diagram in (b) can be obtained from (a) by transformations akin to those in Section 2.5.4, where we used tail-recursion removal and substitution of procedure bodies to optimize the procedure for a nonterminal.

The class of LL(1) grammars is rich enough to cover most programming constructs, although care is needed in writing a suitable grammar for the source language. For example, no left-recursive or ambiguous grammar can be LL(1).

A grammar G is LL(1) if and only if whenever A → α|β are two distinct productions of G, the following conditions hold:

1. For no terminal a do both α and β derive strings beginning with a.

2. At most one of α and β can derive the empty string.

3. If β *⇒ϵ, then α does not derive any string beginning with a terminal in FOLLOW (A). Likewise, if α *⇒ϵ, then β does not derive any string beginning with a terminal in FOLLOW (A).

(a)

(b)

Figure 4.16: Transition diagrams for nonterminals E and E’ of grammar 4.28

The first two conditions are equivalent to the statement that FIRST (α) and FIRST (β) are disjoint sets. The third condition is equivalent to stating that if ϵ is in FIRST (β), then FIRST (α) and FOLLOW(A) are disjoint sets, and likewise if ϵ is in FIRST(α).
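These conditions translate directly into a check over pairs of alternative bodies for each nonterminal. The sketch below is ours; it assumes the same representation as before (bodies as tuples, '' for ϵ) and takes FIRST and FOLLOW as given, hard-coded here for grammar (4.28) and for the dangling-else grammar of Example 4.33.

```python
from itertools import combinations

EPS = ''   # stands for ϵ

def first_of(first, symbols):
    # FIRST of a string of grammar symbols
    result = set()
    for X in symbols:
        result |= first[X] - {EPS}
        if EPS not in first[X]:
            return result
    result.add(EPS)
    return result

def is_ll1(grammar, first, follow):
    for A, bodies in grammar.items():
        for alpha, beta in combinations(bodies, 2):
            fa, fb = first_of(first, alpha), first_of(first, beta)
            if (fa - {EPS}) & (fb - {EPS}):
                return False      # condition 1: FIRST sets overlap
            if EPS in fa and EPS in fb:
                return False      # condition 2: both derive eps
            if EPS in fb and fa & follow[A]:
                return False      # condition 3
            if EPS in fa and fb & follow[A]:
                return False      # condition 3, symmetric case
    return True

# Grammar (4.28) with FIRST/FOLLOW from Example 4.30:
G1 = {'E': [('T', "E'")], "E'": [('+', 'T', "E'"), ()],
      'T': [('F', "T'")], "T'": [('*', 'F', "T'"), ()],
      'F': [('(', 'E', ')'), ('id',)]}
F1 = {'E': {'(', 'id'}, "E'": {'+', EPS}, 'T': {'(', 'id'},
      "T'": {'*', EPS}, 'F': {'(', 'id'},
      '+': {'+'}, '*': {'*'}, '(': {'('}, ')': {')'}, 'id': {'id'}}
FO1 = {'E': {')', '$'}, "E'": {')', '$'}, 'T': {'+', ')', '$'},
       "T'": {'+', ')', '$'}, 'F': {'+', '*', ')', '$'}}

# Dangling-else grammar of Example 4.33:
G2 = {'S': [('i', 'E', 't', 'S', "S'"), ('a',)],
      "S'": [('e', 'S'), ()], 'E': [('b',)]}
F2 = {'S': {'i', 'a'}, "S'": {'e', EPS}, 'E': {'b'},
      'i': {'i'}, 'a': {'a'}, 't': {'t'}, 'e': {'e'}, 'b': {'b'}}
FO2 = {'S': {'e', '$'}, "S'": {'e', '$'}, 'E': {'t'}}
```

For the dangling-else grammar, the check fails at the pair S’ → eS and S’ → ϵ, since e is in both FIRST(eS) and FOLLOW(S’) — the same conflict that shows up as the multiply defined entry M[S’, e] in Fig. 4.18.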

Predictive parsers can be constructed for LL(1) grammars since the proper production to apply for a nonterminal can be selected by looking only at the current input symbol. Flow-of-control constructs, with their distinguishing keywords, generally satisfy the LL(1) constraints. For instance, if we have the productions

stmt

→ if ( expr ) stmt else stmt

| while ( expr ) stmt

| { stmt_list }

then the keywords if, while, and the symbol { tell us which alternative is the only one that could possibly succeed if we are to find a statement.

The next algorithm collects the information from FIRST and FOLLOW sets into a predictive parsing table M [A, a], a two-dimensional array, where A is a nonterminal, and a is a terminal or the symbol $, the input endmarker. The algorithm is based on the following idea: the production A → α is chosen if the next input symbol a is in FIRST (α). The only complication occurs when α = ϵ or, more generally, α *⇒ϵ. In this case, we should again choose A → α, if the current input symbol is in FOLLOW (A), or if the $ on the input has been reached and $ is in FOLLOW (A).

Algorithm 4.31: Construction of a predictive parsing table.

INPUT: Grammar G.

OUTPUT: Parsing table M.

METHOD: For each production A → α of the grammar, do the following:

1.     For each terminal a in FIRST (α), add A → α to M [A, a].

2.     If ϵ is in FIRST (α), then for each terminal b in FOLLOW (A), add A → α to M [A, b]. If ϵ is in FIRST (α) and $ is in FOLLOW (A), add A → α to M [A, $] as well.

If, after performing the above, there is no production at all in M [A, a], then set M [A, a] to error (which we normally represent by an empty entry in the table). □
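Algorithm 4.31 is short enough to sketch directly. In this illustrative version of ours, the table is a dict keyed by (nonterminal, terminal) pairs, FIRST and FOLLOW are taken as given (hard-coded from Example 4.30), absent keys play the role of empty (error) entries, and a multiply defined entry raises an exception.

```python
EPS = ''   # stands for ϵ

def first_of(first, symbols):
    # FIRST of a string of grammar symbols
    result = set()
    for X in symbols:
        result |= first[X] - {EPS}
        if EPS not in first[X]:
            return result
    result.add(EPS)
    return result

def build_table(grammar, first, follow):
    """Algorithm 4.31: map (A, a) to a body; absent entries mean error."""
    table = {}
    for A, bodies in grammar.items():
        for alpha in bodies:
            f = first_of(first, alpha)
            targets = f - {EPS}                  # step 1: a in FIRST(alpha)
            if EPS in f:
                targets |= follow[A]             # step 2; FOLLOW(A) holds $
            for a in targets:
                if (A, a) in table:              # multiply defined: not LL(1)
                    raise ValueError(f"conflict at M[{A}, {a}]")
                table[(A, a)] = alpha
    return table

# Grammar (4.28) with FIRST/FOLLOW from Example 4.30.
GRAMMAR = {
    'E':  [('T', "E'")],
    "E'": [('+', 'T', "E'"), ()],
    'T':  [('F', "T'")],
    "T'": [('*', 'F', "T'"), ()],
    'F':  [('(', 'E', ')'), ('id',)],
}
FIRST = {'E': {'(', 'id'}, "E'": {'+', EPS}, 'T': {'(', 'id'},
         "T'": {'*', EPS}, 'F': {'(', 'id'},
         '+': {'+'}, '*': {'*'}, '(': {'('}, ')': {')'}, 'id': {'id'}}
FOLLOW = {'E': {')', '$'}, "E'": {')', '$'}, 'T': {'+', ')', '$'},
          "T'": {'+', ')', '$'}, 'F': {'+', '*', ')', '$'}}
M = build_table(GRAMMAR, FIRST, FOLLOW)
```

On grammar (4.28) this fills in exactly the non-blank entries of Fig. 4.17.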

Example 4.32: For the expression grammar (4.28), Algorithm 4.31 produces the parsing table in Fig. 4.17. Blanks are error entries; non-blanks indicate a production with which to expand a nonterminal.

NON-        INPUT SYMBOL
TERMINAL    id          +            *            (           )         $

E           E → T E’                              E → T E’
E’                      E’ → +T E’                            E’ → ϵ    E’ → ϵ
T           T → F T’                              T → F T’
T’                      T’ → ϵ      T’ → *F T’                T’ → ϵ    T’ → ϵ
F           F → id                                F → (E)
Figure 4.17: Parsing table M for Example 4.32

Consider production E → T E’. Since

FIRST (T E’) = FIRST (T) = {(, id}

this production is added to M [E, (] and M [E, id]. Production E’→ +T E’ is added to M [E’, +] since FIRST (+T E’) = {+}. Since FOLLOW (E’) = {), $}, production E’→ ϵ is added to M [E’, )] and M [E’, $]. □

Algorithm 4.31 can be applied to any grammar G to produce a parsing table M. For every LL(1) grammar, each parsing-table entry uniquely identifies a production or signals an error. For some grammars, however, M may have some entries that are multiply defined. For example, if G is left-recursive or ambiguous, then M will have at least one multiply defined entry. Although left-recursion elimination and left factoring are easy to do, there are some grammars for which no amount of alteration will produce an LL(1) grammar.

The language in the following example has no LL(1) grammar at all.

Example 4.33: The following grammar, which abstracts the dangling-else problem, is repeated here from Example 4.22:

S → iEtSS’ | a

S’→ eS | ϵ

E → b

The parsing table for this grammar appears in Fig. 4.18. The entry for M [S’, e] contains both S’→ eS and S’→ ϵ.

The grammar is ambiguous and the ambiguity is manifested by a choice in what production to use when an e (else) is seen. We can resolve this ambiguity

NON-        INPUT SYMBOL
TERMINAL    a        b        e          i             t        $

S           S → a                        S → iEtSS’
S’                            S’ → ϵ                            S’ → ϵ
                              S’ → eS
E                    E → b
Figure 4.18: Parsing table M for Example 4.33

by choosing S’→ eS. This choice corresponds to associating an else with the closest previous then. Note that the choice S’→ ϵ would prevent e from ever being put on the stack or removed from the input, and is surely wrong. □

4.4.4 Nonrecursive Predictive Parsing

A nonrecursive predictive parser can be built by maintaining a stack explicitly, rather than implicitly via recursive calls. The parser mimics a leftmost derivation. If w is the input that has been matched so far, then the stack holds a sequence of grammar symbols α such that

S *lm⇒ wα

The table-driven parser in Fig. 4.19 has an input buffer, a stack containing a sequence of grammar symbols, a parsing table constructed by Algorithm 4.31, and an output stream. The input buffer contains the string to be parsed, followed by the endmarker $. We reuse the symbol $ to mark the bottom of the stack, which initially contains the start symbol of the grammar on top of $.

The parser is controlled by a program that considers X, the symbol on top of the stack, and a, the current input symbol. If X is a nonterminal, the parser chooses an X -production by consulting entry M [X, a] of the parsing table M.

(Additional code could be executed here, for example, code to construct a node in a parse tree.) Otherwise, it checks for a match between the terminal X and current input symbol a.

The behavior of the parser can be described in terms of its configurations, which give the stack contents and the remaining input. The next algorithm describes how configurations are manipulated.

Algorithm 4.34: Table-driven predictive parsing.

INPUT: A string w and a parsing table M for grammar G.

OUTPUT: If w is in L(G), a leftmost derivation of w; otherwise, an error indication.

Figure 4.19: Model of a table-driven predictive parser

METHOD: Initially, the parser is in a configuration with w$ in the input buffer and the start symbol S of G on top of the stack, above $. The program in Fig. 4.20 uses the predictive parsing table M to produce a predictive parse for the input. □

let a be the first symbol of w;
let X be the top stack symbol;
while ( X ≠ $ ) { /* stack is not empty */
    if ( X = a ) pop the stack and let a be the next symbol of w;
    else if ( X is a terminal ) error();
    else if ( M [X, a] is an error entry ) error();
    else if ( M [X, a] = X → Y1 Y2 … Yk ) {
        output the production X → Y1 Y2 … Yk;
        pop the stack;
        push Yk, Yk-1, … , Y1 onto the stack, with Y1 on top;
    }
    let X be the top stack symbol;
}

Figure 4.20: Predictive parsing algorithm
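The program of Fig. 4.20 can be sketched in runnable form. In this illustrative version of ours, the parsing table is hard-coded from Fig. 4.17 (bodies as tuples, the empty tuple for ϵ), the input is a list of tokens ending in '$', and the function returns the sequence of productions output.

```python
# Nonterminals and parsing table for grammar (4.28), from Fig. 4.17.
NONTERMS = {'E', "E'", 'T', "T'", 'F'}
TABLE = {
    ('E', 'id'): ('T', "E'"),    ('E', '('): ('T', "E'"),
    ("E'", '+'): ('+', 'T', "E'"),
    ("E'", ')'): (),             ("E'", '$'): (),
    ('T', 'id'): ('F', "T'"),    ('T', '('): ('F', "T'"),
    ("T'", '*'): ('*', 'F', "T'"),
    ("T'", '+'): (),             ("T'", ')'): (), ("T'", '$'): (),
    ('F', 'id'): ('id',),        ('F', '('): ('(', 'E', ')'),
}

def predictive_parse(tokens):
    """Algorithm 4.34: return the productions output, or raise on error."""
    stack = ['$', 'E']                  # start symbol on top of $
    output = []
    i = 0                               # input pointer
    X = stack[-1]
    while X != '$':                     # stack is not empty
        a = tokens[i]
        if X == a:                      # match a terminal
            stack.pop()
            i += 1
        elif X not in NONTERMS:
            raise SyntaxError(f"expected {X}, saw {a}")
        elif (X, a) not in TABLE:       # error entry
            raise SyntaxError(f"no entry M[{X}, {a}]")
        else:
            body = TABLE[(X, a)]
            output.append((X, body))    # output the production X -> body
            stack.pop()
            stack.extend(reversed(body))   # push so Y1 ends up on top
        X = stack[-1]
    return output
```

On tokens ['id', '+', 'id', '*', 'id', '$'] this outputs the eleven productions of Fig. 4.21, beginning with E → T E’ and ending with E’ → ϵ.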

Example 4.35: Consider grammar (4.28); we have already seen its parsing table in Fig. 4.17. On input id + id * id, the nonrecursive predictive parser of Algorithm 4.34 makes the sequence of moves in Fig. 4.21. These moves correspond to a leftmost derivation (see Fig. 4.12 for the full derivation):

E lm⇒ T E’ lm⇒ F T’E’ lm⇒ id T’E’ lm⇒ id E’ lm⇒ id + T E’ lm⇒ …

MATCHED         STACK        INPUT             ACTION

                E $          id + id * id$
                T E’$        id + id * id$     output E → T E’
                F T’E’$      id + id * id$     output T → F T’
                id T’E’$     id + id * id$     output F → id
id              T’E’$        + id * id$        match id
id              E’$          + id * id$        output T’→ ϵ
id              + T E’$      + id * id$        output E’→ + T E’
id +            T E’$        id * id$          match +
id +            F T’E’$      id * id$          output T → F T’
id +            id T’E’$     id * id$          output F → id
id + id         T’E’$        * id$             match id
id + id         * F T’E’$    * id$             output T’→ * F T’
id + id *       F T’E’$      id$               match *
id + id *       id T’E’$     id$               output F → id
id + id * id    T’E’$        $                 match id
id + id * id    E’$          $                 output T’→ ϵ
id + id * id    $            $                 output E’→ ϵ
Figure 4.21: Moves made by a predictive parser on input id + id * id

Note that the sentential forms in this derivation correspond to the input that has already been matched (in column MATCHED) followed by the stack contents. The matched input is shown only to highlight the correspondence. For the same reason, the top of the stack is to the left; when we consider bottom-up parsing, it will be more natural to show the top of the stack to the right. The input pointer points to the leftmost symbol of the string in the INPUT column. □

4.4.5 Error Recovery in Predictive Parsing

This discussion of error recovery refers to the stack of a table-driven predictive parser, since it makes explicit the terminals and nonterminals that the parser hopes to match with the remainder of the input; the techniques can also be used with recursive-descent parsing.

An error is detected during predictive parsing when the terminal on top of the stack does not match the next input symbol or when nonterminal A is on top of the stack, a is the next input symbol, and M [A, a] is error (i.e., the parsing-table entry is empty).

Panic Mode

Panic-mode error recovery is based on the idea of skipping over symbols on the input until a token in a selected set of synchronizing tokens appears. Its effectiveness depends on the choice of synchronizing set. The sets should be chosen so that the parser recovers quickly from errors that are likely to occur in practice. Some heuristics are as follows:

1.     As a starting point, place all symbols in FOLLOW (A) into the synchronizing set for nonterminal A. If we skip tokens until an element of FOLLOW (A) is seen and pop A from the stack, it is likely that parsing can continue.

2.     It is not enough to use FOLLOW (A) as the synchronizing set for A. For example, if semicolons terminate statements, as in C, then keywords that begin statements may not appear in the FOLLOW set of the nonterminal representing expressions. A missing semicolon after an assignment may therefore result in the keyword beginning the next statement being skipped. Often, there is a hierarchical structure on constructs in a language; for example, expressions appear within statements, which appear within blocks, and so on. We can add to the synchronizing set of a lower-level construct the symbols that begin higher-level constructs. For example, we might add keywords that begin statements to the synchronizing sets for the nonterminals generating expressions.

3.     If we add symbols in FIRST(A) to the synchronizing set for nonterminal A, then it may be possible to resume parsing according to A if a symbol in FIRST (A) appears in the input.

4.     If a nonterminal can generate the empty string, then the production deriving ϵ can be used as a default. Doing so may postpone some error detection, but cannot cause an error to be missed. This approach reduces the number of nonterminals that have to be considered during error recovery.

5.     If a terminal on top of the stack cannot be matched, a simple idea is to pop the terminal, issue a message saying that the terminal was inserted, and continue parsing. In effect, this approach takes the synchronizing set of a token to consist of all other tokens.

Example 4.36: Using FIRST and FOLLOW symbols as synchronizing tokens works reasonably well when expressions are parsed according to the usual grammar (4.28). The parsing table for this grammar in Fig. 4.17 is repeated in Fig. 4.22, with “synch” indicating synchronizing tokens obtained from the FOLLOW set of the nonterminal in question. The FOLLOW sets for the nonterminals are obtained from Example 4.30.

The table in Fig. 4.22 is to be used as follows. If the parser looks up entry M [A, a] and finds that it is blank, then the input symbol a is skipped. If the entry is “synch,” then the nonterminal on top of the stack is popped in an attempt to resume parsing. If a token on top of the stack does not match the input symbol, then we pop the token from the stack, as mentioned above.

NON-        INPUT SYMBOL
TERMINAL    id          +            *            (           )         $

E           E → T E’                              E → T E’    synch     synch
E’                      E’ → +T E’                            E’ → ϵ    E’ → ϵ
T           T → F T’    synch                     T → F T’    synch     synch
T’                      T’ → ϵ      T’ → *F T’                T’ → ϵ    T’ → ϵ
F           F → id      synch       synch         F → (E)     synch     synch
Figure 4.22: Synchronizing tokens added to the parsing table of Fig. 4.17

On the erroneous input ) id * + id, the parser and error recovery mechanism of Fig. 4.22 behave as in Fig. 4.23. □

STACK         INPUT            REMARK

E $           ) id * + id $    error, skip )
E $           id * + id $      id is in FIRST (E)
T E’$         id * + id $
F T’E’$       id * + id $
id T’E’$      id * + id $
T’E’$         * + id $
* F T’E’$     * + id $
F T’E’$       + id $           error, M [F, +] = synch
T’E’$         + id $           F has been popped
E’$           + id $
+ T E’$       + id $
T E’$         id $
F T’E’$       id $
id T’E’$      id $
T’E’$         $
E’$           $
$             $
Figure 4.23: Parsing and error recovery moves made by a predictive parser
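The recovery behavior traced above can be sketched on top of the table-driven parser. This is an illustrative rendering of ours, not the book's code: the table and synch sets are hard-coded from Fig. 4.22, a blank entry skips the input token, a synch entry pops the nonterminal, an unmatched terminal on top is popped ("inserted"), and — a heuristic of ours, to match the figure's first move — a nonterminal is not popped on synch when that would empty the stack; the token is skipped instead.

```python
NONTERMS = {'E', "E'", 'T', "T'", 'F'}
TABLE = {
    ('E', 'id'): ('T', "E'"),    ('E', '('): ('T', "E'"),
    ("E'", '+'): ('+', 'T', "E'"),
    ("E'", ')'): (),             ("E'", '$'): (),
    ('T', 'id'): ('F', "T'"),    ('T', '('): ('F', "T'"),
    ("T'", '*'): ('*', 'F', "T'"),
    ("T'", '+'): (),             ("T'", ')'): (), ("T'", '$'): (),
    ('F', 'id'): ('id',),        ('F', '('): ('(', 'E', ')'),
}
SYNCH = {'E': {')', '$'}, 'T': {'+', ')', '$'},
         'F': {'+', '*', ')', '$'}}          # from FOLLOW, per Fig. 4.22

def parse_with_recovery(tokens):
    """Parse, recovering from errors; return the recovery actions taken."""
    stack = ['$', 'E']
    errors = []
    i = 0
    while stack[-1] != '$':
        X, a = stack[-1], tokens[i]
        if X == a:                            # match a terminal
            stack.pop()
            i += 1
        elif X not in NONTERMS:
            errors.append(f'insert {X}')      # pop the unmatched terminal
            stack.pop()
        elif (X, a) in TABLE:                 # normal expansion
            stack.pop()
            stack.extend(reversed(TABLE[(X, a)]))
        elif a in SYNCH.get(X, set()) and len(stack) > 2:
            errors.append(f'synch, pop {X}')  # resume past X
            stack.pop()
        else:
            errors.append(f'skip {a}')        # blank entry: skip the token
            i += 1
    return errors
```

On the erroneous input ) id * + id $, this takes exactly the two recovery actions of Fig. 4.23: skipping the stray ) and popping F on the synch entry M [F, +]. (The sketch is not hardened; for instance, it assumes the token list ends in '$'.)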

The above discussion of panic-mode recovery does not address the important issue of error messages. The compiler designer must supply informative error messages that not only describe the error but also draw attention to where the error was discovered.

Phrase-level Recovery

Phrase-level error recovery is implemented by filling in the blank entries in the predictive parsing table with pointers to error routines. These routines may change, insert, or delete symbols on the input and issue appropriate error messages. They may also pop from the stack. Alteration of stack symbols or the pushing of new symbols onto the stack is questionable for several reasons. First, the steps carried out by the parser might then not correspond to the derivation of any word in the language at all. Second, we must ensure that there is no possibility of an infinite loop. Checking that any recovery action eventually results in an input symbol being consumed (or the stack being shortened if the end of the input has been reached) is a good way to protect against such loops.
