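Linghu Chong walked slowly toward him. The man trembled all over, his knees buckled, and he knelt in the snow. Linghu Chong said angrily, "You insulted my junior sister; I cannot spare you." With his sword pointed at the man's throat, a thought struck him. He stepped closer and asked in a low voice, "What were the words written on the snowman?"

  The man answered in a shaking voice, "It... it was... 'Though the seas... the seas... run dry and the rocks crumble, our... our love... shall never... never change.'" Since the words "Though the seas run dry and the rocks crumble, our love shall never change" first came into the world, this was surely the first time they had been spoken in such terror and despair.

  Linghu Chong froze for a moment and said, "Mm. Though the seas run dry and the rocks crumble, our love shall never change." With an ache in his heart, he thrust the sword forward into the man's throat.

(from 《笑傲江湖》, The Smiling, Proud Wanderer)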

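The fundamental reason semantic analysis is hard is that grammar is recursive: deep recursion makes the problem look very complicated to decompose. If the recursive problem can be turned into an iterative one, however, the model becomes much simpler. The key to converting recursion into iteration is to identify every feature of the innermost recursive structure and reduce that structure iteratively; the rest of the problem then falls into place.
  In general, people facing a complicated recursive problem do the same thing: guided by the grammar rules, they find the innermost structure, resolve it, and repeat step by step until the problem is solved. Human thinking is quite good at turning recursive problems into iterative ones, and learning can be seen as coming to understand and master all kinds of grammar rules.
  Recursion involving unary and binary operators is easy to turn into iteration; operators with more operands are somewhat more involved.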
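As a minimal sketch of this idea (not the parser developed below), the following Python snippet removes nesting iteratively: it repeatedly finds an innermost parenthesised group, that is, one containing no further parentheses, and replaces it with a placeholder until none are left. The function name `reduce_innermost` and the placeholder format `#k` are assumptions made for illustration only.

```python
import re

# An innermost group: "(" ... ")" with no parentheses in between.
INNERMOST = re.compile(r"\(([^()]*)\)")

def reduce_innermost(text):
    """Iteratively replace innermost "(...)" groups with placeholders.

    Returns the flattened text and the list of extracted groups;
    the placeholder "#k" stands for groups[k].
    """
    groups = []
    while True:
        match = INNERMOST.search(text)
        if match is None:
            return text, groups
        groups.append(match.group(1))
        placeholder = "#%d" % (len(groups) - 1)
        text = text[:match.start()] + placeholder + text[match.end():]

flat, groups = reduce_innermost("a*(b+(c-d))/(e+f)")
print(flat)    # a*#1/#2
print(groups)  # ['c-d', 'b+#0', 'e+f']
```

No function calls itself here: each pass of the loop handles one innermost structure, which is the pattern the rest of this article applies to token lists.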

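All operators and their precedence levels are shown in the figure below (the same levels are listed in the data-structure comment further down):

[Figure: operators and their precedence]

Operators such as typeof, address-of and pointer dereference are not implemented here. What is implemented: arithmetic expressions, logical expressions, function calls and parentheses. That is enough to understand how the analysis works.

For simple expressions that contain neither parentheses nor function calls, the analysis proceeds as follows:

[Figure: reduction steps for a simple expression]

Our data structures: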

'''
____________________________ Syntax Tree
Parenthesis:
    ["(",None]
    [")",None]
Operators (grouped by precedence):
  Unary :
     1   + - ! ~       ["+",None] ["-",None] ["!",None] ["~",None]
  Binary :
     2   * / %         ["*",None] ["/",None] ["%",None]
     3   + -           ["+",None] ["-",None]
     4   << >>         ["<<",None] [">>",None]
     5   > >= < <=     [">",None] [">=",None] ["<",None] ["<=",None]
     6   == !=         ["==",None] ["!=",None]
     7   &             ["&",None]
     8   ^             ["^",None]
     9   |             ["|",None]
    10   &&            ["&&",None]
    11   ||            ["||",None]
  Ternary :
    12   expr ? expr : expr     ["?",None] [":",None]
                                ["@expr","?:",listPtr0,listPtr1,listPtr2]
    13   expr , expr , expr...
Var, Num, Expr, Function:
    ["@var","varName"]
    ["@num","num_string"]
    ["@expr","Operator",listPtr,...]
    ["@func","funcName",listPtr1,...]
    ["@expr_list",["@var"|"@num"|"@expr"|"@func",...],...]
'''
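To make the node layout concrete, here is a small sketch that builds by hand the tree that `31 + 8 * 9` should reduce to and renders it back to text. The helper `to_source` is hypothetical (it is not part of the article's code) and only follows the node shapes listed above.

```python
def to_source(node):
    """Render a syntax-tree node (laid out as above) back to an expression string."""
    kind = node[0]
    if kind in ("@var", "@num"):
        return node[1]
    if kind == "@func":
        args = node[2:]
        # A call with several arguments holds a single "@expr_list" node.
        if len(args) == 1 and args[0][0] == "@expr_list":
            args = args[0][1:]
        return "%s(%s)" % (node[1], ", ".join(to_source(a) for a in args))
    if kind == "@expr_list":
        return ", ".join(to_source(a) for a in node[1:])
    if kind == "@expr":
        op, operands = node[1], node[2:]
        if op == "?:":
            return "(%s ? %s : %s)" % tuple(to_source(a) for a in operands)
        if len(operands) == 1:          # unary operator
            return "(%s%s)" % (op, to_source(operands[0]))
        return "(%s %s %s)" % (to_source(operands[0]), op, to_source(operands[1]))
    raise ValueError("unknown node kind: %r" % (kind,))

# 31 + 8 * 9: the multiplication binds tighter, so it sits deeper in the tree.
tree = ["@expr", "+",
        ["@num", "31"],
        ["@expr", "*", ["@num", "8"], ["@num", "9"]]]
print(to_source(tree))  # (31 + (8 * 9))
```

The explicit parentheses in the output make the grouping visible, which is a quick way to check that precedence was applied as intended.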

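This is the final module diagram of the code:

[Figure: code module diagram]

Functions are named module_x_y, where x is the precedence level of the operator and y is its position within that level, counting from zero. The comments in the code are fairly detailed; here is the source: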

######################################## global list
OperatorList=['+','-','!','~',
              '*','/','%',
              '+','-',
              '<<','>>',
              '>','>=','<','<=',
              '==','!=',
              '&',
              '^',
              '|',
              '&&',
              '||',
              '?',':',
              ',']

# sample token list for: 31 + 8 * 9
listToParse=[ ['@num','31'] , ['+',None] , ['@num','8'] , ['*',None] , ['@num','9'] ]

########### return value of every module_x_y(lis,i) :
############# 0 parsed some expressions
############# 1 did nothing, but no error happened
########### left, i and right are all indexes into lis

################# + =: ^+A... | ...Op+A...
def module_1_0(lis,i):
    left=i-1
    right=i+1
    # process: ^+A...
    if i==0 and len(lis)>=2:
        if lis[right][0][0]=='@':
            rightPtr=lis[right]
            del lis[0:2]
            lis.insert(0,["@expr","+",rightPtr])
            return 0
    # process: ...Op+A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[left][0] in OperatorList:
            if lis[right][0][0]=='@':
                rightPtr=lis[right]
                del lis[i:i+2]
                lis.insert(i,["@expr","+",rightPtr])
                return 0
    return 1

################# - =: ^-A... | ...Op-A...
def module_1_1(lis,i):
    left=i-1
    right=i+1
    # process: ^-A...
    if i==0 and len(lis)>=2:
        if lis[right][0][0]=='@':
            rightPtr=lis[right]
            del lis[0:2]
            lis.insert(0,["@expr","-",rightPtr])
            return 0
    # process: ...Op-A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[left][0] in OperatorList:
            if lis[right][0][0]=='@':
                rightPtr=lis[right]
                del lis[i:i+2]
                lis.insert(i,["@expr","-",rightPtr])
                return 0
    return 1

################# ! =: ...!A...
def module_1_2(lis,i):
    right=i+1
    if len(lis)>=2 and right<len(lis):
        if lis[right][0][0]=='@':
            rightPtr=lis[right]
            del lis[i:i+2]
            lis.insert(i,["@expr","!",rightPtr])
            return 0
    return 1

################# ~ =: ...~A...
def module_1_3(lis,i):
    right=i+1
    if len(lis)>=2 and right<len(lis):
        if lis[right][0][0]=='@':
            rightPtr=lis[right]
            del lis[i:i+2]
            lis.insert(i,["@expr","~",rightPtr])
            return 0
    return 1

################# * =: ...A*A...
def module_2_0(lis,i):
    left=i-1
    right=i+1
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","*",leftPtr,rightPtr])
            return 0
    return 1

################# / =: ...A/A...
def module_2_1(lis,i):
    left=i-1
    right=i+1
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","/",leftPtr,rightPtr])
            return 0
    return 1

################# % =: ...A%A...
def module_2_2(lis,i):
    left=i-1
    right=i+1
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","%",leftPtr,rightPtr])
            return 0
    return 1

################# + =: ...A+A...
def module_3_0(lis,i):
    left=i-1
    right=i+1
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","+",leftPtr,rightPtr])
            return 0
    return 1

################# - =: ...A-A...
def module_3_1(lis,i):
    left=i-1
    right=i+1
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","-",leftPtr,rightPtr])
            return 0
    return 1

################# << =: ...A<<A...
def module_4_0(lis,i):
    left=i-1
    right=i+1
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","<<",leftPtr,rightPtr])
            return 0
    return 1

################# >> =: ...A>>A...
def module_4_1(lis,i):
    left=i-1
    right=i+1
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr",">>",leftPtr,rightPtr])
            return 0
    return 1

################# > =: ...A>A...
def module_5_0(lis,i):
    left=i-1
    right=i+1
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr",">",leftPtr,rightPtr])
            return 0
    return 1

################# >= =: ...A>=A...
def module_5_1(lis,i):
    left=i-1
    right=i+1
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr",">=",leftPtr,rightPtr])
            return 0
    return 1

################# < =: ...A<A...
def module_5_2(lis,i):
    left=i-1
    right=i+1
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","<",leftPtr,rightPtr])
            return 0
    return 1

################# <= =: ...A<=A...
def module_5_3(lis,i):
    left=i-1
    right=i+1
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","<=",leftPtr,rightPtr])
            return 0
    return 1

################# == =: ...A==A...
def module_6_0(lis,i):
    left=i-1
    right=i+1
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","==",leftPtr,rightPtr])
            return 0
    return 1

################# != =: ...A!=A...
def module_6_1(lis,i):
    left=i-1
    right=i+1
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","!=",leftPtr,rightPtr])
            return 0
    return 1

################# & =: ...A&A...
def module_7_0(lis,i):
    left=i-1
    right=i+1
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","&",leftPtr,rightPtr])
            return 0
    return 1

################# ^ =: ...A^A...
def module_8_0(lis,i):
    left=i-1
    right=i+1
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","^",leftPtr,rightPtr])
            return 0
    return 1

################# | =: ...A|A...
def module_9_0(lis,i):
    left=i-1
    right=i+1
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","|",leftPtr,rightPtr])
            return 0
    return 1

################# && =: ...A&&A...
def module_10_0(lis,i):
    left=i-1
    right=i+1
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","&&",leftPtr,rightPtr])
            return 0
    return 1

################# || =: ...A||A...
def module_11_0(lis,i):
    left=i-1
    right=i+1
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","||",leftPtr,rightPtr])
            return 0
    return 1

################# ?: =: ...A?A:A...
#################              ^   (i is the index of ':')
def module_12_0(lis,i):
    first=i-3
    leftOp=i-2
    left=i-1
    right=i+1
    if i>=3 and len(lis)>=5 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' and\
           lis[leftOp][0]=='?' and lis[first][0][0]=='@':
            firstPtr=lis[first]
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[first:first+5]
            lis.insert(first,["@expr","?:",firstPtr,leftPtr,rightPtr])
            return 0
    return 1

################# , =: A,A,...A,A
# the whole list must have the form A,A,...,A ; the parameter i is ignored
def module_13_0(lis,i):
    if len(lis)==1 and lis[0][0][0]!='@':
        return 1
    if len(lis)==1 and lis[0][0][0]=='@':
        return 0
    if (len(lis)%2)==1 :
        i=1
        if lis[0][0][0]!='@':
            return 1
        # check the pattern: operand , operand , ... , operand
        while i<len(lis):
            if lis[i+1][0][0]=='@' and lis[i][0]==',':
                i=i+2
            else:
                return 1
        # collect the operands into a single @expr_list node
        ls=[['@expr_list']]
        i=0
        while i<len(lis):
            ls[0].append(lis[i])
            i=i+2
        del lis[:]
        lis[:]=ls[:]
        return 0
    return 1

The code above is long, but it is the simplest part. Its size could be reduced considerably with a few techniques, but time was limited.
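One possible way to compress it, sketched here under the assumption that every binary operator is reduced by exactly the same rule, is to generate the reducers from a factory function. The names `make_binary_module`, `BINARY_OPERATORS` and `binary_modules` are hypothetical and do not appear in the article's code.

```python
def make_binary_module(op):
    """Build a reducer for the pattern ...A op A... (same rule as module_2_0)."""
    def module(lis, i):
        left, right = i - 1, i + 1
        if i >= 1 and right < len(lis):
            if lis[left][0][0] == '@' and lis[right][0][0] == '@':
                # Replace the three tokens "A op A" with a single @expr node.
                lis[left:right + 1] = [["@expr", op, lis[left], lis[right]]]
                return 0    # parsed something
        return 1            # nothing done, no error
    return module

# Binary operators grouped by precedence level (2..11), as in the table above.
BINARY_OPERATORS = {
    2: ('*', '/', '%'), 3: ('+', '-'), 4: ('<<', '>>'),
    5: ('>', '>=', '<', '<='), 6: ('==', '!='),
    7: ('&',), 8: ('^',), 9: ('|',), 10: ('&&',), 11: ('||',),
}

# One dictionary per level, mapping operator -> generated reducer.
binary_modules = {
    pri: {op: make_binary_module(op) for op in ops}
    for pri, ops in BINARY_OPERATORS.items()
}

tokens = [['@num', '8'], ['*', None], ['@num', '9']]
status = binary_modules[2]['*'](tokens, 1)
print(status, tokens)
# 0 [['@expr', '*', ['@num', '8'], ['@num', '9']]]
```

The unary and ternary reducers have different shapes, so in this sketch they would stay hand-written or get factories of their own.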

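Next comes the analysis routine for unary, binary and ternary operators and the comma separator. It is one of the core pieces of code in this article: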

######################################## global list
# construct a module dictionary
# module_dic_tuple[priority]['Operator'](lis,i)
module_dic_tuple=({},
    { '+':module_1_0,'-':module_1_1,'!':module_1_2,'~':module_1_3 },
    { '*':module_2_0,'/':module_2_1,'%':module_2_2 },
    { '+':module_3_0,'-':module_3_1 },
    { '<<':module_4_0,'>>':module_4_1 },
    { '>':module_5_0,'>=':module_5_1,'<':module_5_2,'<=':module_5_3 },
    { '==':module_6_0,'!=':module_6_1 },
    { '&':module_7_0 },
    { '^':module_8_0 },
    { '|':module_9_0 },
    { '&&':module_10_0 },
    { '||':module_11_0 },
    { '?:':module_12_0 },
    { ',':module_13_0 } )

# operators of each priority; a one-element group needs the trailing comma,
# because ('&') is just the string '&' and `in` would be a substring test
operator_priority_tuple=( () , ('+', '-', '!', '~') , ('*','/','%'),
    ('+','-'),('<<','>>'),
    ('>','>=','<','<='),('==','!='),
    ('&',),('^',),('|',),('&&',),('||',),('?',':'),(',',) )

############################# parse: unary, binary, ternary, comma expr
########### return value :
############# 0 parsed successfully
############# 1 syntax error
def parse_simple_expr(lis):
    if len(lis)==0:
        return 1
    #if lis[len(lis)-1][0][0]!='@':
    #    return 1
    #if lis[0][0][0]!='@' and lis[0][0] not in ('+','-','!','~'):
    #    return 1
    for pri in range(1,12): # pri 1,2,3,4,5,6,7,8,9,10,11
        i=0
        while 1:
            if len(lis)==1 and lis[0][0][0]=='@':
                return 0
            if i>=len(lis):
                break
            if lis[i][0] in operator_priority_tuple[pri]:
                if module_dic_tuple[pri][lis[i][0]](lis,i)==0:
                    # something was reduced: rescan from the start
                    i=0
                    continue
                else:
                    i=i+1
                    continue
            else:
                i=i+1
    for pri in range(12,13): # pri 12 # parse ...A?A:A...
        i=0
        while 1:
            if len(lis)==1 and lis[0][0][0]=='@':
                return 0
            if i>=len(lis):
                break
            if lis[i][0]==':':
                if module_dic_tuple[pri]['?:'](lis,i)==0:
                    i=0
                    continue
                else:
                    i=i+1
                    continue
            else:
                i=i+1
    # pri 13: whatever is left must be a comma-separated list
    return module_dic_tuple[13][','](lis,0)

The code above uses a tuple of dictionaries of function references to keep this part of the code short.
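The lookup pattern can be seen in isolation in the sketch below, which uses a hypothetical one-operator table (`reduce_mul`, `table` and `operators` are illustration names, not the article's). It also shows a Python detail that matters when such tables are written by hand: parentheses alone do not create a tuple, so a one-operator level needs a trailing comma.

```python
def reduce_mul(lis, i):
    """Hypothetical stand-in for a module_x_y reducer (no bounds checks)."""
    lis[i - 1:i + 2] = [["@expr", "*", lis[i - 1], lis[i + 1]]]
    return 0

# table[priority][operator] -> reducer; indexes 0 and 1 are unused padding here.
table = ({}, {}, {'*': reduce_mul})
# Operators per priority. Note the trailing comma: ('*') would be the
# plain string '*', and `in` on a string is a substring test.
operators = ((), (), ('*',))

tokens = [['@var', 'a'], ['*', None], ['@var', 'b']]
pri, i = 2, 1
if tokens[i][0] in operators[pri]:
    table[pri][tokens[i][0]](tokens, i)
print(tokens)  # [['@expr', '*', ['@var', 'a'], ['@var', 'b']]]

print('&' in ('&&'))    # True: substring test on the string '&&'
print('&' in ('&&',))   # False: element test on a one-element tuple
```

With the string form, a leftover `&` token would pass the membership test at the `&&` level and then fail the dictionary lookup with a `KeyError`.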

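No verification run is shown for this part; the process is similar to the one described in the earlier article 《一个简单的语义分析算法:单步算法——Python实现》 (A Simple Semantic Analysis Algorithm: the Single-Step Algorithm, Implemented in Python).

With parse_simple_expr in place, analysing the remaining constructs, function calls and parentheses, becomes simpler. The process is as follows:

[Figure: reduction steps for parentheses and function calls]

Implementation: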

########### return value : [intStatusCode, indexOf'(', indexOf')']
############# intStatusCode
############# 0 successful
############# 1 no matching parentheses found
############# 2 list is empty :(
def module_parenthesis_place(lis):
    length=len(lis)
    err=0
    x=0
    y=0
    if length==0:
        return [2,None,None]
    # x: index of the first ')'
    try:
        x=lis.index([")",None])
    except ValueError:
        err=1
    # y: index of the nearest '(' to the left of x,
    # found by searching the reversed list
    lis.reverse()
    try:
        y=lis.index(["(",None],length-x-1)
    except ValueError:
        err=1
    lis.reverse()
    y=length-y-1
    if err==1:
        return [1,None,None]
    else:
        return [0,y,x]

############################# parse: unary, binary, ternary, parenthesis, function expr
########### return value :
############# 0 parsed successfully
############# 1 syntax error
############################# find first ')'
def parse_comp_expr(lis):
    while 1:
        if len(lis)==0:
            return 1
        if len(lis)==1:
            if lis[0][0][0]=='@':
                return 0
            else:
                return 1
        place=module_parenthesis_place(lis)
        if place[0]==0:
            # copy of the tokens between the innermost '(' and ')'
            mirror=lis[(place[1]+1):place[2]]
            if parse_simple_expr(mirror)==0:
                if place[1]>=1 and lis[place[1]-1][0]=='@var':
                    # function call: the '(' directly follows a name
                    funcName=lis[place[1]-1][1]
                    del lis[place[1]-1:(place[2]+1)]
                    lis.insert(place[1]-1,["@func",funcName,mirror[0]])
                else:
                    # plain parentheses: replace "( ... )" with the parsed node
                    del lis[place[1]:(place[2]+1)]
                    lis.insert(place[1],mirror[0])
            else:
                return 1
        else:
            # no parentheses left
            return parse_simple_expr(lis)
    return 1
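The search for the innermost pair can also be written as a single left-to-right scan, which avoids reversing the list twice. This is an alternative sketch, not the article's `module_parenthesis_place`: the name `find_innermost_pair` is hypothetical, and it returns `None` instead of a status code when no complete pair exists.

```python
def find_innermost_pair(tokens):
    """Return (open_index, close_index) of the first innermost "( ... )" pair.

    The first ")" in the list always closes the most recent "(" before it,
    and nothing between the two can be a parenthesis.
    """
    last_open = None
    for index, token in enumerate(tokens):
        if token[0] == "(":
            last_open = index
        elif token[0] == ")":
            if last_open is None:
                return None          # ")" without a matching "("
            return (last_open, index)
    return None                      # no ")" at all

# Tokens for: ( a + ( b ) )
tokens = [["(", None], ["@var", "a"], ["+", None],
          ["(", None], ["@var", "b"], [")", None], [")", None]]
print(find_innermost_pair(tokens))  # (3, 5)
```

Because the scan never modifies the list, it is also safe to call on a token list that is being inspected elsewhere.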

That completes the code.

The experimental result is shown below:

>>> ls=[['(',None],['@var','f'],['(',None],['@num',''],[',',None],['@num',''],[',',None],['@num',''],[',',None],['!',None],['-',None],['@var','x'],['?',None],['@var','y'],[':',None],['~',None],['@var','z'],[')',None],['-',None],['@num',''],[')',None],['/',None],['@num','']]
>>> ls
[['(', None], ['@var', 'f'], ['(', None], ['@num', ''], [',', None], ['@num', ''], [',', None], ['@num', ''], [',', None], ['!', None], ['-', None], ['@var', 'x'], ['?', None], ['@var', 'y'], [':', None], ['~', None], ['@var', 'z'], [')', None], ['-', None], ['@num', ''], [')', None], ['/', None], ['@num', '']]
>>> len(ls)
23
>>> parse_comp_expr(ls);ls
0
[['@expr', '/', ['@expr', '-', ['@func', 'f', ['@expr_list', ['@num', ''], ['@num', ''], ['@num', ''], ['@expr', '?:', ['@expr', '!', ['@expr', '-', ['@var', 'x']]], ['@var', 'y'], ['@expr', '~', ['@var', 'z']]]]], ['@num', '']], ['@num', '']]]
>>> len(ls)
1
>>>

附录:

本文的全部源代码如下:

 '''
____________________________Syntax & Syntax Tree
Parenthesis:
["(",None]
[")",None]
Operators(grouped by precedence):
Unary :
1 + - ! ~ ["+",None] ["-",None] ["!",None] ["~",None]
Binary :
2 * / % ["*",None] ["/",None] ["%",None]
3 + - ["+",None] ["-",None]
4 << >> ["<<",None] [">>",None]
5 > >= < <= [">",None] [">=",None] ["<",None] ["<=",None]
6 == != ["==",None] ["!=",None]
7 & ["&",None]
8 ^ ["^",None]
9 | ["|",None]
10 && ["&&",None]
11 || ["||",None]
Ternary :
12 expr ? expr : expr ["?",None] [":",None] ["@expr","?:",listPtr0,listPtr1,listPtr2]
13 expr , expr , expr...
Var,Num,Expr,Function:
["@var","varName"]
["@num","num_string"]
["@expr","Operator",listPtr,...]
["@func","funcName",listPtr1,...]
["@expr_list",["@var"|"@num"|"@expr"|"@func",...],...]
''' ######################################## global list
OperatorList=['+','-','!','~',\
'*','/','%',\
'+','-',\
'<<','>>',\
'>','>=','<','<=',\
'==','!=',\
'&',\
'^',\
'|',\
'&&',\
'||',\
'?',':'\
',']
''' 31 + 8 * 9 '''
listToParse=[ ['@num',''] , ['+',None] , ['@num',''] , ['*',None] , ['@num',''] ] ########### return value :
############# 0 parsed some expresions
############# 1 done nothing but no errors happened
################# + =: ^+A... | ...Op+A...
def module_1_0(lis,i): # left i right are both indexes :)
left=i-1
right=i+1 # process: ^+A...
if i==0 and len(lis)>=2:
if lis[right][0][0]=='@':
rightPtr=lis[right]
del lis[0:2]
lis.insert(0,["@expr","+",rightPtr])
return 0
# process: ...Op+A...
if i>=1 and len(lis)>=3 and right<len(lis):
if lis[left][0] in OperatorList:
if lis[right][0][0]=='@':
rightPtr=lis[right]
del lis[i:i+2]
lis.insert(i,["@expr","+",rightPtr])
return 0 return 1 ########### return value :
############# 0 parsed some expresions
############# 1 done nothing but no errors happened
################# - =: ^-A... | ...Op-A...
def module_1_1(lis,i): # left i right are both indexes :)
left=i-1
right=i+1 # process: ^-A...
if i==0 and len(lis)>=2:
if lis[right][0][0]=='@':
rightPtr=lis[right]
del lis[0:2]
lis.insert(0,["@expr","-",rightPtr])
return 0
# process: ...Op-A...
if i>=1 and len(lis)>=3 and right<len(lis):
if lis[left][0] in OperatorList:
if lis[right][0][0]=='@':
rightPtr=lis[right]
del lis[i:i+2]
lis.insert(i,["@expr","-",rightPtr])
return 0 return 1 ########### return value :
############# 0 parsed some expresions
############# 1 done nothing but no errors happened
################# ! =: ...!A...
def module_1_2(lis,i): # left i right are both indexes :)
left=i-1
right=i+1 # process: ...!A...
if len(lis)>=2 and right<len(lis):
if lis[right][0][0]=='@':
rightPtr=lis[right]
del lis[i:i+2]
lis.insert(i,["@expr","!",rightPtr])
return 0 return 1 ########### return value :
############# 0 parsed some expresions
############# 1 done nothing but no errors happened
################# ~ =: ...~A...
def module_1_3(lis,i): # left i right are both indexes :)
left=i-1
right=i+1 # process: ...~A...
if len(lis)>=2 and right<len(lis):
if lis[right][0][0]=='@':
rightPtr=lis[right]
del lis[i:i+2]
lis.insert(i,["@expr","~",rightPtr])
return 0 return 1 ########### return value :
############# 0 parsed some expresions
############# 1 done nothing but no errors happened
################# * =: ...A*A...
def module_2_0(lis,i): # left i right are both indexes :)
left=i-1
right=i+1 # process: ...A*A...
if i>=1 and len(lis)>=3 and right<len(lis):
if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
leftPtr=lis[left]
rightPtr=lis[right]
del lis[left:left+3]
lis.insert(left,["@expr","*",leftPtr,rightPtr])
return 0 return 1 ########### return value :
############# 0 parsed some expresions
############# 1 done nothing but no errors happened
################# / =: ...A/A...
def module_2_1(lis,i): # left i right are both indexes :)
left=i-1
right=i+1 # process: ...A/A...
if i>=1 and len(lis)>=3 and right<len(lis):
if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
leftPtr=lis[left]
rightPtr=lis[right]
del lis[left:left+3]
lis.insert(left,["@expr","/",leftPtr,rightPtr])
return 0 return 1 ########### return value :
############# 0 parsed some expresions
############# 1 done nothing but no errors happened
################# % =: ...A%A...
def module_2_2(lis,i): # left i right are both indexes :)
left=i-1
right=i+1 # process: ...A%A...
if i>=1 and len(lis)>=3 and right<len(lis):
if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
leftPtr=lis[left]
rightPtr=lis[right]
del lis[left:left+3]
lis.insert(left,["@expr","%",leftPtr,rightPtr])
return 0 return 1 ########### return value :
############# 0 parsed some expresions
############# 1 done nothing but no errors happened
################# + =: ...A+A...
def module_3_0(lis,i): # left i right are both indexes :)
left=i-1
right=i+1 # process: ...A+A...
if i>=1 and len(lis)>=3 and right<len(lis):
if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
leftPtr=lis[left]
rightPtr=lis[right]
del lis[left:left+3]
lis.insert(left,["@expr","+",leftPtr,rightPtr])
return 0 return 1 ########### return value :
############# 0 parsed some expresions
############# 1 done nothing but no errors happened
################# - =: ...A-A...
def module_3_1(lis,i): # left i right are both indexes :)
left=i-1
right=i+1 # process: ...A-A...
if i>=1 and len(lis)>=3 and right<len(lis):
if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
leftPtr=lis[left]
rightPtr=lis[right]
del lis[left:left+3]
lis.insert(left,["@expr","-",leftPtr,rightPtr])
return 0 return 1 ########### return value :
############# 0 parsed some expresions
############# 1 done nothing but no errors happened
################# << =: ...A<<A...
def module_4_0(lis,i): # left i right are both indexes :)
left=i-1
right=i+1 # process: ...A<<A...
if i>=1 and len(lis)>=3 and right<len(lis):
if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
leftPtr=lis[left]
rightPtr=lis[right]
del lis[left:left+3]
lis.insert(left,["@expr","<<",leftPtr,rightPtr])
return 0 return 1 ########### return value :
############# 0 parsed some expresions
############# 1 done nothing but no errors happened
################# >> =: ...A>>A...
def module_4_1(lis,i): # left i right are both indexes :)
left=i-1
right=i+1 # process: ...A>>A...
if i>=1 and len(lis)>=3 and right<len(lis):
if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
leftPtr=lis[left]
rightPtr=lis[right]
del lis[left:left+3]
lis.insert(left,["@expr",">>",leftPtr,rightPtr])
return 0 return 1 ########### return value :
############# 0 parsed some expresions
############# 1 done nothing but no errors happened
################# > =: ...A>A...
def module_5_0(lis,i): # left i right are both indexes :)
left=i-1
right=i+1 # process: ...A>A...
if i>=1 and len(lis)>=3 and right<len(lis):
if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
leftPtr=lis[left]
rightPtr=lis[right]
del lis[left:left+3]
lis.insert(left,["@expr",">",leftPtr,rightPtr])
return 0 return 1 ########### return value :
############# 0 parsed some expresions
############# 1 done nothing but no errors happened
################# >= =: ...A>=A...
def module_5_1(lis,i): # left i right are both indexes :)
left=i-1
right=i+1 # process: ...A>=A...
if i>=1 and len(lis)>=3 and right<len(lis):
if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
leftPtr=lis[left]
rightPtr=lis[right]
del lis[left:left+3]
lis.insert(left,["@expr",">=",leftPtr,rightPtr])
return 0 return 1 ########### return value :
############# 0 parsed some expresions
############# 1 done nothing but no errors happened
################# < =: ...A<A...
def module_5_2(lis,i): # left i right are both indexes :)
left=i-1
right=i+1 # process: ...A<A...
if i>=1 and len(lis)>=3 and right<len(lis):
if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
leftPtr=lis[left]
rightPtr=lis[right]
del lis[left:left+3]
lis.insert(left,["@expr","<",leftPtr,rightPtr])
return 0 return 1 ########### return value :
############# 0 parsed some expresions
############# 1 done nothing but no errors happened
################# <= =: ...A<=A...
def module_5_3(lis,i): # left i right are both indexes :)
left=i-1
right=i+1 # process: ...A<=A...
if i>=1 and len(lis)>=3 and right<len(lis):
if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
leftPtr=lis[left]
rightPtr=lis[right]
del lis[left:left+3]
lis.insert(left,["@expr","<=",leftPtr,rightPtr])
return 0 return 1 ########### return value :
############# 0 parsed some expresions
############# 1 done nothing but no errors happened
################# == =: ...A==A...
def module_6_0(lis,i): # left i right are both indexes :)
left=i-1
right=i+1 # process: ...A==A...
if i>=1 and len(lis)>=3 and right<len(lis):
if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
leftPtr=lis[left]
rightPtr=lis[right]
del lis[left:left+3]
lis.insert(left,["@expr","==",leftPtr,rightPtr])
return 0 return 1 ########### return value :
############# 0 parsed some expresions
############# 1 done nothing but no errors happened
################# != =: ...A!=A...
def module_6_1(lis,i): # left i right are both indexes :)
left=i-1
right=i+1 # process: ...A!=A...
if i>=1 and len(lis)>=3 and right<len(lis):
if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
leftPtr=lis[left]
rightPtr=lis[right]
del lis[left:left+3]
lis.insert(left,["@expr","!=",leftPtr,rightPtr])
return 0 return 1 ########### return value :
############# 0 parsed some expresions
############# 1 done nothing but no errors happened
################# & =: ...A&A...
def module_7_0(lis,i): # left i right are both indexes :)
left=i-1
right=i+1 # process: ...A&A...
if i>=1 and len(lis)>=3 and right<len(lis):
if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
leftPtr=lis[left]
rightPtr=lis[right]
del lis[left:left+3]
lis.insert(left,["@expr","&",leftPtr,rightPtr])
return 0 return 1 ########### return value :
############# 0 parsed some expresions
############# 1 done nothing but no errors happened
################# ^ =: ...A^A...
def module_8_0(lis,i): # left i right are both indexes :)
left=i-1
right=i+1 # process: ...A^A...
if i>=1 and len(lis)>=3 and right<len(lis):
if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
leftPtr=lis[left]
rightPtr=lis[right]
del lis[left:left+3]
lis.insert(left,["@expr","^",leftPtr,rightPtr])
return 0 return 1 ########### return value :
############# 0 parsed some expresions
############# 1 done nothing but no errors happened
################# | =: ...A|A...
def module_9_0(lis,i): # left i right are both indexes :)
left=i-1
right=i+1 # process: ...A|A...
if i>=1 and len(lis)>=3 and right<len(lis):
if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
leftPtr=lis[left]
rightPtr=lis[right]
del lis[left:left+3]
lis.insert(left,["@expr","|",leftPtr,rightPtr])
return 0 return 1 ########### return value :
############# 0 parsed some expresions
############# 1 done nothing but no errors happened
################# && =: ...A&&A...
def module_10_0(lis,i): # left i right are both indexes :)
left=i-1
right=i+1 # process: ...A&&A...
if i>=1 and len(lis)>=3 and right<len(lis):
if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
leftPtr=lis[left]
rightPtr=lis[right]
del lis[left:left+3]
lis.insert(left,["@expr","&&",leftPtr,rightPtr])
return 0 return 1 ########### return value :
############# 0 parsed some expresions
############# 1 done nothing but no errors happened
################# || =: ...A||A...
def module_11_0(lis,i): # left i right are both indexes :)
left=i-1
right=i+1 # process: ...A||A...
if i>=1 and len(lis)>=3 and right<len(lis):
if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
leftPtr=lis[left]
rightPtr=lis[right]
del lis[left:left+3]
lis.insert(left,["@expr","||",leftPtr,rightPtr])
return 0 return 1 ########### return value :
############# 0 parsed some expresions
############# 1 done nothing but no errors happened
################# ?: =: ...A?A:A...
################# ^
def module_12_0(lis,i): # left i right are both indexes :)
first=i-3
leftOp=i-2
left=i-1
right=i+1 # process: ...A?A:A...
# ^
if i>=3 and len(lis)>=5 and right<len(lis):
if lis[right][0][0]=='@' and lis[left][0][0]=='@' and\
lis[leftOp][0]=='?' and lis[first][0][0]=='@':
firstPtr=lis[first]
leftPtr=lis[left]
rightPtr=lis[right]
del lis[first:first+5]
lis.insert(first,["@expr","?:",firstPtr,leftPtr,rightPtr])
return 0 return 1 ########### return value :
############# 0 parsed some expresions
############# 1 done nothing but no errors happened
################# , =: A,A,...A,A
def module_13_0(lis,i): # process: A,A,...A,A
if len(lis)==1 and lis[0][0][0]!='@':
return 1
if len(lis)==1 and lis[0][0][0]=='@':
return 0
if (len(lis)%2)==1 :
i=1
if lis[0][0][0]!='@':
return 1
while i<len(lis):
if lis[i+1][0][0]=='@' and lis[i][0]==',':
i=i+2
else:
return 1
ls=[['@expr_list']]
i=0
while i<len(lis):
ls[0].append(lis[i])
i=i+2
del lis[:]
lis[:]=ls[:]
return 0
return 1 ######################################## global list
# construct a module dictionary
# module_dic_tuple[priority]['Operator'](lis,i)
module_dic_tuple=({}, { '+':module_1_0,'-':module_1_1,'!':module_1_2,'~':module_1_3 },\
{ '*':module_2_0,'/':module_2_1,'%':module_2_2 }, \
{ '+':module_3_0,'-':module_3_1 },\
{ '<<':module_4_0,'>>':module_4_1 },\
{ '>':module_5_0,'>=':module_5_1,'<':module_5_2,'<=':module_5_3 },\
{ '==':module_6_0,'!=':module_6_1 },\
{ '&':module_7_0 },\
{ '^':module_8_0 },\
{ '|':module_9_0 },\
{ '&&':module_10_0 },\
{ '||':module_11_0 },\
{ '?:':module_12_0 },\
{ ',':module_13_0 } ) operator_priority_tuple=( () , ('+', '-', '!', '~') , ('*','/','%'),\
('+','-'),('<<','>>'),\
('>','>=','<','<='),('==','!='),\
('&'),('^'),('|'),('&&'),('||'),('?',':'),(',') ) ############################# parse:unary,binary,ternary,comma expr
########### return value :
############# 0 parsed sucessfully
############# 1 syntax error
def parse_simple_expr(lis):
if len(lis)==0:
return 1
#if lis[len(lis)-1][0][0]!='@':
# return 1
#if lis[0][0][0]!='@' and lis[0][0] not in ('+','-','!','~'):
# return 1
for pri in range(1,12): # pri 1,2,3,4,5,6,7,8,9,10,11
i=0
while 1:
if len(lis)==1 and lis[0][0][0]=='@':
return 0
if i>=len(lis):
break
if lis[i][0] in operator_priority_tuple[pri]:
if module_dic_tuple[pri][lis[i][0]](lis,i)==0:
i=0
continue
else:
i=i+1
continue
else:
i=i+1
for pri in range(12,13): # pri 12 # parse ...A?A:A...
i=0
while 1:
if len(lis)==1 and lis[0][0][0]=='@':
return 0
if i>=len(lis):
break
if lis[i][0]==':':
if module_dic_tuple[pri]['?:'](lis,i)==0:
i=0
continue
else:
i=i+1
continue
else:
i=i+1
return module_dic_tuple[13][','](lis,0)
return 1 ########### return value :[intStatusCode,indexOf'(',indexOf')']
############# intStatusCode
############# 0 successfully
############# 1 no parenthesis matched
############# 2 list is null :(
def module_parenthesis_place(lis):
    length=len(lis)
    err=0
    x=0
    y=0
    if length==0:
        return [2,None,None]
    # x: index of the first ')'
    try:
        x=lis.index([")",None])
    except ValueError:
        err=1
    # search the reversed list to find the nearest '(' to the left of x
    lis.reverse()
    try:
        y=lis.index(["(",None],length-x-1)
    except ValueError:
        err=1
    lis.reverse()
    # map the index in the reversed list back to the original list
    y=length-y-1
    if err==1:
        return [1,None,None]
    else:
        return [0,y,x]

############################# parse:unary binary ternary parenthesis function expr
########### return value :
############# 0 parsed successfully
############# 1 syntax error
############################# find first ')'
def parse_comp_expr(lis):
    while 1:
        if len(lis)==0:
            return 1
        if len(lis)==1:
            if lis[0][0][0]=='@':
                return 0
            else:
                return 1
        place=module_parenthesis_place(lis)
        if place[0]==0:
            # parse a copy of the tokens between the innermost '(' and ')'
            mirror=lis[(place[1]+1):place[2]]
            if parse_simple_expr(mirror)==0:
                if place[1]>=1 and lis[place[1]-1][0]=='@var':
                    '''func'''
                    # '@var' directly before '(' means a function call
                    funcName=lis[place[1]-1][1]
                    del lis[place[1]-1:(place[2]+1)]
                    lis.insert(place[1]-1,["@func",funcName,mirror[0]])
                else:
                    # plain parentheses: replace '( ... )' with the parsed node
                    del lis[place[1]:(place[2]+1)]
                    lis.insert(place[1],mirror[0])
            else:
                return 1
        else:
            # no parentheses left: parse what remains as a simple expression
            return parse_simple_expr(lis)
    return 1 # never reached
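
The parenthesis step in parse_comp_expr can be tried on its own. The sketch below is a simplified stand-in for module_parenthesis_place, not the original function: the name innermost_parens and its tuple return value are assumptions made for illustration. It uses the same token format and the same rule, which is to take the first ')' and then the nearest '(' to its left, so the pair it returns never contains another parenthesis.

```python
def innermost_parens(tokens):
    """Return (open_index, close_index) of the first innermost pair, or None.

    Hypothetical helper for illustration; tokens use the article's
    ["(", None] / [")", None] format.
    """
    close = None
    for idx, tok in enumerate(tokens):
        if tok == [")", None]:
            close = idx          # first ')' in the list
            break
    if close is None:
        return None
    # walk left from the ')' to the nearest '('
    for idx in range(close - 1, -1, -1):
        if tokens[idx] == ["(", None]:
            return (idx, close)
    return None


# tokens for: f ( a , ( b ) )
tokens = [["@var", "f"], ["(", None], ["@var", "a"], [",", None],
          ["(", None], ["@var", "b"], [")", None], [")", None]]
print(innermost_parens(tokens))
```

Because the pair found this way is always innermost, the tokens between it can be handed to the simple-expression parser without further parenthesis handling, which is how the recursion turns into a loop.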

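Once a tree gets even slightly complex, inspecting its structure by hand takes a lot of time, so the next step is to build a simple tool that renders the tree structures in the code graphically.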
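
Until such a tool exists, an indented text dump can already make nested nodes easier to read. The following is only a minimal sketch under stated assumptions, not the graphical tool announced above: format_tree is a hypothetical name, and it assumes the node shapes listed in the data-structure comment (["@var", name], ["@num", text], ["@expr", operator, child, ...], ["@func", name, child, ...], ["@expr_list", child, ...]).

```python
def format_tree(node, depth=0):
    """Return the syntax tree as a list of indented text lines.

    Hypothetical helper; assumes the node shapes described in the
    article's data-structure comment.
    """
    pad = "  " * depth
    kind = node[0]
    if kind in ("@var", "@num"):
        # leaf: kind followed by the variable name or number text
        return [pad + kind + " " + str(node[1])]
    if kind in ("@expr", "@func"):
        # node[1] is the operator or function name, the rest are children
        lines = [pad + kind + " " + str(node[1])]
        children = node[2:]
    elif kind == "@expr_list":
        lines = [pad + kind]
        children = node[1:]
    else:
        raise ValueError("unknown node kind: %r" % (kind,))
    for child in children:
        lines.extend(format_tree(child, depth + 1))
    return lines


# hand-built tree for 31 + 8 * 9 (not produced by the parser above)
tree = ["@expr", "+", ["@num", "31"],
        ["@expr", "*", ["@num", "8"], ["@num", "9"]]]
print("\n".join(format_tree(tree)))
```

Each level of nesting adds two spaces, so the multiplication shows up one level below the addition, matching its higher precedence.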

If you have questions or suggestions, feel free to leave a comment :)
