4.7.5 Efficient Construction of LALR Parsing Tables
There are several modifications we can make to Algorithm 4.59 to avoid constructing the full collection of sets of LR(1) items in the process of creating an LALR(1) parsing table.
First, we can represent any set of LR(0) or LR(1) items I by its kernel, that is, by those items that are either the initial item -- [S’→@S] or [S’→@S,$] -- or that have the dot somewhere other than at the beginning of the production body.
We can construct the LALR(1)-item kernels from the LR(0)-item kernels by a process of propagation and spontaneous generation of lookaheads, which we shall describe shortly.
If we have the LALR(1) kernels, we can generate the LALR(1) parsing table by closing each kernel, using the function CLOSURE of Fig. 4.40, and then computing table entries by Algorithm 4.56, as if the LALR(1) sets of items were canonical LR(1) sets of items.
Example 4.61: We shall use as an example of the efficient LALR(1) table-construction method the non-SLR grammar from Example 4.48, which we reproduce below in its augmented form:
S’→S
S→L = R | R
L→*R | id
R→L
The complete sets of LR(0) items for this grammar were shown in Fig. 4.39. The kernels of these items are shown in Fig. 4.44.
I0: S’→@S            I5: L→id@
I1: S’→S@            I6: S→L = @R
I2: S→L@ = R         I7: L→*R@
    R→L@
I3: S→R@             I8: R→L@
I4: L→*@R            I9: S→L = R@
Figure 4.44: Kernels of the sets of LR(0) items for grammar (4.49)
□
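The kernel test described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original text: the `Item` tuple, the `START` constant, and the function names are assumptions chosen for this sketch. The set `I4` below is the full LR(0) set obtained by closing its kernel item L→*@R.

```python
from typing import List, NamedTuple, Tuple


class Item(NamedTuple):
    head: str               # left-hand side of the production
    body: Tuple[str, ...]   # right-hand side symbols
    dot: int                # number of body symbols to the left of the dot


START = "S'"  # start symbol of the augmented grammar (assumed name)


def is_kernel(item: Item) -> bool:
    """An item is a kernel item if it is the initial item S' -> @S
    or if its dot is not at the left end of the body."""
    return item.dot > 0 or item.head == START


def kernel(items: List[Item]) -> List[Item]:
    """Keep only the kernel items of a set, preserving order."""
    return [item for item in items if is_kernel(item)]


# Full LR(0) set I4: the kernel item L -> *@R plus its closure items.
I4 = [
    Item("L", ("*", "R"), 1),
    Item("R", ("L",), 0),
    Item("L", ("*", "R"), 0),
    Item("L", ("id",), 0),
]

print(kernel(I4))
```

Only L→*@R survives the filter, which agrees with the entry for I4 in Fig. 4.44.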
Now we must attach the proper lookaheads to the LR(0) items in the kernels, to create the kernels of the sets of LALR(1) items. There are two ways a lookahead b can get attached to an LR(0) item B→γ@δ in some set of LALR(1) items J:
1. There is a set of items I, with a kernel item [A→α@β, a], and J = GOTO(I, X), and the construction of
GOTO(CLOSURE({[A→α@β, a]}), X)
as given in Fig. 4.40, contains [B→γ@δ, b], regardless of a. Such a lookahead b is said to be generated spontaneously for B→γ@δ. As a special case, lookahead $ is generated spontaneously for the item S’→@S in the initial set of items.
2. All is as in (1), but a = b, and GOTO(CLOSURE({[A→α@β, a]}), X), as given in Fig. 4.40, contains [B→γ@δ, b] only because A→α@β has b as one of its associated lookaheads. In such a case, we say that lookaheads propagate from A→α@β in the kernel of I to B→γ@δ in the kernel of J. Note that propagation does not depend on the particular lookahead symbol; either all lookaheads propagate from one item to another, or none do.
We need to determine the spontaneously generated lookaheads for each set of LR(0) items, and also to determine which items propagate lookaheads from which. The test is actually quite simple. Let # be a symbol not in the grammar at hand. Let A→α@β be a kernel LR(0) item in set I. Compute, for each X, J = GOTO(CLOSURE({[A→α@β, #]}), X).
For each kernel item in J, we examine its set of lookaheads. If # is a lookahead, then lookaheads propagate to that item from A→α@β. Any other lookahead is spontaneously generated. These ideas are made precise in the following algorithm, which also makes use of the fact that the only kernel items in J must have X immediately to the left of the dot; that is, they must be of the form B→γX@δ.
Algorithm 4.62: Determining lookaheads.
INPUT: The kernel K of a set of LR(0) items I and a grammar symbol X.
OUTPUT: The lookaheads spontaneously generated by items in I for kernel items in GOTO(I, X) and the items in I from which lookaheads are propagated to kernel items in GOTO(I, X).
METHOD: The algorithm is given in Fig. 4.45. □
for ( each item A→α@β in K ) {
    J := CLOSURE({[A→α@β, #]});
    if ( [B→γ@Xδ, a] is in J, and a is not # )
        conclude that lookahead a is generated spontaneously for item
        B→γX@δ in GOTO(I, X);
    if ( [B→γ@Xδ, #] is in J )
        conclude that lookaheads propagate from A→α@β in I to
        B→γX@δ in GOTO(I, X);
}
Figure 4.45: Discovering propagated and spontaneous lookaheads
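As a rough illustration of Algorithm 4.62, the following Python sketch runs the test on the grammar of Example 4.61. The data layout (items as tuples, the `GRAMMAR` dictionary, the function names) is an assumption made for this sketch. The `first` helper is deliberately simplified: it is valid only because this grammar has no productions with an empty body.

```python
GRAMMAR = {
    "S'": [("S",)],
    "S": [("L", "=", "R"), ("R",)],
    "L": [("*", "R"), ("id",)],
    "R": [("L",)],
}
DUMMY = "#"  # lookahead symbol that does not occur in the grammar


def first(symbol, visiting=frozenset()):
    """FIRST of a single symbol, assuming no production has an empty body."""
    if symbol not in GRAMMAR:          # terminal
        return {symbol}
    if symbol in visiting:             # already being expanded on this path
        return set()
    result = set()
    for body in GRAMMAR[symbol]:
        result |= first(body[0], visiting | {symbol})
    return result


def closure(items):
    """LR(1) closure; each item is (head, body, dot, lookahead)."""
    result, work = set(items), list(items)
    while work:
        head, body, dot, la = work.pop()
        if dot == len(body) or body[dot] not in GRAMMAR:
            continue                   # dot at the end or before a terminal
        rest = body[dot + 1:]
        lookaheads = first(rest[0]) if rest else {la}
        for production in GRAMMAR[body[dot]]:
            for b in lookaheads:
                new = (body[dot], production, 0, b)
                if new not in result:
                    result.add(new)
                    work.append(new)
    return result


def determine_lookaheads(kernel, X):
    """Return (spontaneous, propagated) for the kernel items of GOTO(I, X).

    spontaneous: set of (target item, lookahead)
    propagated:  set of (source kernel item, target item)
    LR(0) items are (head, body, dot).
    """
    spontaneous, propagated = set(), set()
    for source in kernel:
        for head, body, dot, la in closure({source + (DUMMY,)}):
            if dot < len(body) and body[dot] == X:
                target = (head, body, dot + 1)   # dot moved over X
                if la == DUMMY:
                    propagated.add((source, target))
                else:
                    spontaneous.add((target, la))
    return spontaneous, propagated


I0_KERNEL = [("S'", ("S",), 0)]
spont, prop = determine_lookaheads(I0_KERNEL, "*")
print(spont)  # lookaheads generated spontaneously for items in GOTO(I0, *)
print(prop)   # propagation links from I0 into GOTO(I0, *)
```

For X = `*`, the sketch reports `=` as spontaneous for L→*@R and a propagation link from S’→@S to L→*@R. A full implementation would need a general FIRST computation that handles empty productions.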
We are now ready to attach lookaheads to the kernels of the sets of LR(0) items to form the sets of LALR(1) items. First, we know that $ is a lookahead for S’→@S in the initial set of LR(0) items. Algorithm 4.62 gives us all the lookaheads generated spontaneously. After listing all those lookaheads, we must allow them to propagate until no further propagation is possible. There are many different approaches, all of which in some sense keep track of “new” lookaheads that have propagated into an item but which have not yet propagated out. The next algorithm describes one technique to propagate lookaheads to all items.
Algorithm 4.63: Efficient computation of the kernels of the LALR(1) collection of sets of items.
INPUT: An augmented grammar G’.
OUTPUT: The kernels of the LALR(1) collection of sets of items for G’.
METHOD:
1. Construct the kernels of the sets of LR(0) items for G. If space is not at a premium, the simplest way is to construct the LR(0) sets of items, as in Section 4.6.2, and then remove the nonkernel items. If space is severely constrained, we may wish instead to store only the kernel items for each set, and compute GOTO for a set of items I by first computing the closure of I.
2. Apply Algorithm 4.62 to the kernel of each set of LR(0) items and grammar symbol X to determine which lookaheads are spontaneously generated for kernel items in GOTO(I, X), and from which items in I lookaheads are propagated to kernel items in GOTO(I, X).
3. Initialize a table that gives, for each kernel item in each set of items, the associated lookaheads. Initially, each item has associated with it only those lookaheads that we determined in step (2) were generated spontaneously.
4. Make repeated passes over the kernel items in all sets. When we visit an item i, we look up the kernel items to which i propagates its lookaheads, using information tabulated in step (2). The current set of lookaheads for i is added to those already associated with each of the items to which i propagates its lookaheads. We continue making passes over the kernel items until no more new lookaheads are propagated.
□
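Steps (3) and (4) amount to a fixed-point iteration, which the following Python sketch illustrates. It is not a full implementation of Algorithm 4.63: the propagation links and spontaneous lookaheads are entered by hand, using the values that Example 4.64 derives for the grammar of Example 4.61. The labels, variable names, and the `propagate` function are assumptions for this sketch. Each pass reads the lookaheads as they stood at the start of the pass, which is one possible scheduling choice.

```python
# Kernel items are labelled (set, item); "@" marks the dot, as in the text.
I0 = ("I0", "S' -> @S")
I1 = ("I1", "S' -> S@")
I2a = ("I2", "S -> L@ = R")
I2b = ("I2", "R -> L@")
I3 = ("I3", "S -> R@")
I4 = ("I4", "L -> *@R")
I5 = ("I5", "L -> id@")
I6 = ("I6", "S -> L = @R")
I7 = ("I7", "L -> *R@")
I8 = ("I8", "R -> L@")
I9 = ("I9", "S -> L = R@")
ITEMS = [I0, I1, I2a, I2b, I3, I4, I5, I6, I7, I8, I9]

# Step (2) results, entered by hand: which item propagates to which.
LINKS = {
    I0: [I1, I2a, I2b, I3, I4, I5],
    I2a: [I6],
    I4: [I4, I5, I7, I8],
    I6: [I4, I5, I8, I9],
}

# Step (3): each item starts with its spontaneously generated lookaheads.
SPONTANEOUS = {I0: {"$"}, I4: {"="}, I5: {"="}}


def propagate(items, spontaneous, links):
    """Step (4): repeat passes until no new lookahead is propagated.

    Returns the final table and the number of passes that added something.
    """
    table = {item: set(spontaneous.get(item, ())) for item in items}
    passes = 0
    while True:
        # Freeze the lookaheads as they were at the start of this pass.
        before = {item: frozenset(las) for item, las in table.items()}
        changed = False
        for source, targets in links.items():
            for target in targets:
                new = before[source] - table[target]
                if new:
                    table[target] |= new
                    changed = True
        if not changed:
            return table, passes
        passes += 1


final, passes = propagate(ITEMS, SPONTANEOUS, LINKS)
for item in ITEMS:
    print(item[0], item[1], "/".join(sorted(final[item])))
```

A worklist that revisits only items whose lookahead sets changed would avoid rescanning every link on every pass; the pass-based loop is kept here because it mirrors the description in step (4).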
Example 4.64: Let us construct the kernels of the LALR(1) items for the grammar of Example 4.61. The kernels of the LR(0) items were shown in Fig. 4.44. When we apply Algorithm 4.62 to the kernel of set of items I0, we first compute CLOSURE({[S’→@S, #]}), which is
S’→@S, #
S→@L = R, #
S→@R, #
L→@*R, #/=
L→@id, #/=
R→@L, #
Among the items in the closure, we see two where the lookahead = has been generated spontaneously. The first of these is L→@*R. This item, with * to the right of the dot, gives rise to [L→*@R, =]. That is, = is a spontaneously generated lookahead for L→*@R, which is in set of items I4. Similarly, [L→@id, =] tells us that = is a spontaneously generated lookahead for L→id@ in I5.
As # is a lookahead for all six items in the closure, we determine that the item S’→@S in I0 propagates lookaheads to the following six items:
S’→S@ in I1         S→L@ = R in I2      S→R@ in I3
L→*@R in I4         L→id@ in I5         R→L@ in I2
FROM                 TO
I0: S’→@S            I1: S’→S@
                     I2: S→L@ = R
                     I2: R→L@
                     I3: S→R@
                     I4: L→*@R
                     I5: L→id@
I2: S→L@ = R         I6: S→L = @R
I4: L→*@R            I4: L→*@R
                     I5: L→id@
                     I7: L→*R@
                     I8: R→L@
I6: S→L = @R         I4: L→*@R
                     I5: L→id@
                     I8: R→L@
                     I9: S→L = R@
Figure 4.46: Propagation of lookaheads
In Fig. 4.47, we show steps (3) and (4) of Algorithm 4.63. The column labeled INIT shows the spontaneously generated lookaheads for each kernel item. These are only the two occurrences of = discussed earlier, and the spontaneous lookahead $ for the initial item S’→@S.
On the first pass, the lookahead $ propagates from S’→@S in I0 to the six items listed in Fig. 4.46. The lookahead = propagates from L→*@R in I4 to items L→*R@ in I7 and R→L@ in I8. It also propagates to itself and to L→id@ in I5, but these lookaheads are already present. In the second and third passes, the only new lookahead propagated is $, discovered for the successors of I2 and I4 on pass 2 and for the successor of I6 on pass 3. No new lookaheads are propagated on pass 4, so the final set of lookaheads is shown in the rightmost column of Fig. 4.47.
Note that the shift/reduce conflict found in Example 4.48 using the SLR method has disappeared with the LALR technique. The reason is that only lookahead $ is associated with R→L@ in I2, so there is no conflict with the parsing action of shift on = generated by item S→L@ = R in I2.
□
                         LOOKAHEADS
SET   ITEM           INIT   PASS 1   PASS 2   PASS 3
I0:   S’→@S          $      $        $        $
I1:   S’→S@                 $        $        $
I2:   S→L@ = R              $        $        $
      R→L@                  $        $        $
I3:   S→R@                  $        $        $
I4:   L→*@R          =      =/$      =/$      =/$
I5:   L→id@          =      =/$      =/$      =/$
I6:   S→L = @R                       $        $
I7:   L→*R@                 =        =/$      =/$
I8:   R→L@                  =        =/$      =/$
I9:   S→L = R@                                $
Figure 4.47: Computation of lookaheads
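The remark about the vanished shift/reduce conflict in I2 can be checked with a small Python sketch. Everything here is illustrative: the function names and the action strings are assumptions, and FOLLOW(R) = {=, $} is computed by hand from the grammar of Example 4.61 rather than taken from a table in the text. The sketch builds the ACTION entries of state I2 twice, once with the SLR reduce set and once with the LALR lookaheads from Fig. 4.47.

```python
def actions_for_I2(reduce_lookaheads):
    """ACTION entries for state I2, as terminal -> list of actions.

    I2 holds S -> L@ = R (shift on '=') and R -> L@ (reduce on each
    terminal in reduce_lookaheads).
    """
    actions = {"=": ["shift 6"]}   # GOTO(I2, =) is I6
    for terminal in sorted(reduce_lookaheads):
        actions.setdefault(terminal, []).append("reduce R -> L")
    return actions


def conflicts(actions):
    """Terminals that have more than one action."""
    return {t: acts for t, acts in actions.items() if len(acts) > 1}


FOLLOW_R = {"=", "$"}   # SLR reduces R -> L on FOLLOW(R), computed by hand
LALR_R = {"$"}          # lookaheads of R -> L@ in I2, from Fig. 4.47

slr_conflicts = conflicts(actions_for_I2(FOLLOW_R))
lalr_conflicts = conflicts(actions_for_I2(LALR_R))
print("SLR conflicts: ", slr_conflicts)
print("LALR conflicts:", lalr_conflicts)
```

With the SLR reduce set, the entry for `=` holds both a shift and a reduce; with the LALR lookaheads it holds only the shift.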