(a) an if-then-else

(b) a while loop

(c) a natural loop with two exits, e.g. while with an if...break in the middle; non-structured but reducible

(d) an irreducible CFG: a loop with two entry points, e.g. goto into a while or for loop

 

A control flow graph (CFG) is one way of representing code.

From Wikipedia:

Definition

In a control flow graph each node in the graph represents a basic block, i.e. a straight-line piece of code without any jumps or jump targets; jump targets start a block, and jumps end a block. Directed edges are used to represent jumps in the control flow. There are, in most presentations, two specially designated blocks: the entry block, through which control enters into the flow graph, and the exit block, through which all control flow leaves.[3]

Because of its construction procedure, in a CFG, every edge A→B has the property that:

outdegree(A) > 1 or indegree(B) > 1 (or both).[4]

The CFG can thus be obtained, at least conceptually, by starting from the program's (full) flow graph—i.e. the graph in which every node represents an individual instruction—and performing an edge contraction for every edge that falsifies the predicate above, i.e. contracting every edge whose source has a single exit and whose destination has a single entry. This contraction-based algorithm is of no practical importance, except as a visualization aid for understanding the CFG construction, because the CFG can be more efficiently constructed directly from the program by scanning it for basic blocks.[4]
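The direct scan mentioned above can be sketched as follows. This is a minimal illustration, not a production algorithm: the instruction encoding (a tuple `(op, target)`) and the function name `find_basic_blocks` are assumptions made for this example. The idea is to mark "leaders" (the first instruction, every jump target, and every instruction following a jump) and cut the program at each leader.

```python
def find_basic_blocks(instrs):
    """Split a list of instructions into basic blocks.

    Each instruction is a hypothetical tuple (op, target), where `target`
    is the index of the jump destination for "goto"/"if" and None otherwise.
    Returns a list of (first_index, last_index) pairs, one per block.
    """
    n = len(instrs)
    leaders = {0} if n else set()
    for i, (op, target) in enumerate(instrs):
        if op in ("goto", "if"):
            leaders.add(target)        # a jump target starts a block
            if i + 1 < n:
                leaders.add(i + 1)     # the instruction after a jump starts a block
    starts = sorted(leaders)
    # Each block runs from its leader up to the instruction before the next leader.
    return [(s, e - 1) for s, e in zip(starts, starts[1:] + [n])]


# A six-instruction program with one conditional branch and one goto.
program = [
    ("read", None),   # 0
    ("if", 4),        # 1: branch to 4 when the condition fails
    ("print", None),  # 2
    ("goto", 5),      # 3
    ("print", None),  # 4
    ("end", None),    # 5
]
blocks = find_basic_blocks(program)
```

A single pass collects the leaders and a second pass cuts the blocks, so no edge contraction is needed.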

Example

Consider the following fragment of code:

0: (A) t0 = read_num
1: (A) if t0 mod 2 == 0
2: (B)   print t0 + " is even."
3: (B)   goto 5
4: (C) print t0 + " is odd."      [jump target]
5: (D) end program                [jump target]

In the above, we have 4 basic blocks:

A from 0 to 1,      [entry block]

B from 2 to 3,      

C at 4,

D at 5.                 [exit block]

A graph for this fragment has edges from A to B, A to C, B to D and C to D.
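As a small sketch, the graph for this fragment can be written as an adjacency map and checked against the edge property from the Definition section (outdegree(A) > 1 or indegree(B) > 1 for every edge A→B). The dictionary layout and the helper names `indegrees` and `edge_property_holds` are choices made for this illustration only.

```python
# Successor lists for the four basic blocks of the example.
cfg = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}


def indegrees(graph):
    """Count incoming edges for every node."""
    deg = {node: 0 for node in graph}
    for succs in graph.values():
        for s in succs:
            deg[s] += 1
    return deg


def edge_property_holds(graph):
    """True if every edge A->B has outdegree(A) > 1 or indegree(B) > 1."""
    indeg = indegrees(graph)
    return all(
        len(succs) > 1 or indeg[dst] > 1
        for succs in graph.values()
        for dst in succs
    )


ok = edge_property_holds(cfg)
```

A chain such as `{"X": ["Y"], "Y": []}` fails the check: X has a single exit and Y a single entry, so the two would be merged into one basic block.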

Reachability

Reachability is a graph property useful in optimization.

If a subgraph is not connected from the subgraph containing the entry block, that subgraph is unreachable during any execution, and so is unreachable code; under normal conditions it can be safely removed.

If the exit block is unreachable from the entry block, an infinite loop may exist. Not all infinite loops are detectable (see the halting problem). The program may also stop in such a region through an explicit halt instruction rather than by reaching the exit block.

Unreachable code and infinite loops are possible even if the programmer does not explicitly code them: optimizations like constant propagation and constant folding followed by jump threading can collapse multiple basic blocks into one, cause edges to be removed from a CFG, etc., thus possibly disconnecting parts of the graph.
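Both reachability checks can be sketched with a plain graph traversal. The example below is an illustration under stated assumptions: the CFG is an adjacency map, and the extra block `E` is added here only to show an unreachable block.

```python
def reachable(graph, entry):
    """Return the set of blocks reachable from `entry` (iterative DFS)."""
    seen = {entry}
    stack = [entry]
    while stack:
        node = stack.pop()
        for succ in graph[node]:
            if succ not in seen:
                seen.add(succ)
                stack.append(succ)
    return seen


# The example CFG plus a hypothetical block E that nothing jumps to.
cfg = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": [], "E": ["D"]}

live = reachable(cfg, "A")
unreachable = set(cfg) - live      # candidates for dead-code removal
exit_reachable = "D" in live       # False would hint at an infinite loop
```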

Domination relationship

A block M dominates a block N if every path from the entry that reaches block N has to pass through block M. The entry block dominates all blocks.

In the reverse direction, block M postdominates block N if every path from N to the exit has to pass through block M. The exit block postdominates all blocks.

A block M immediately dominates block N if M strictly dominates N (M dominates N and M is not N) and there is no intervening block P, distinct from M and N, such that M dominates P and P dominates N. In other words, M is the last dominator on all paths from entry to N. Each block except the entry block has a unique immediate dominator.

Similarly, there is a notion of immediate postdominator, analogous to immediate dominator.

The dominator tree is an ancillary data structure depicting the dominator relationships. There is an arc from block M to block N if M is the immediate dominator of N. This graph is a tree, since each block other than the entry block has a unique immediate dominator. The tree is rooted at the entry block. The dominator tree can be calculated efficiently using the Lengauer–Tarjan algorithm.

The postdominator tree is analogous to the dominator tree. This tree is rooted at the exit block.
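To make the definitions concrete, here is a minimal sketch that computes dominator sets with the simple iterative fixed-point method. It is not the Lengauer–Tarjan algorithm, which is faster but considerably longer. The function names are chosen for this illustration, and the code assumes every block is reachable from the entry. Postdominators are obtained by running the same routine on the reversed graph, starting from the exit block.

```python
def dominators(graph, entry):
    """Map each block to the set of blocks that dominate it.

    Assumes every block is reachable from `entry`.
    """
    nodes = set(graph)
    preds = {n: set() for n in graph}
    for n, succs in graph.items():
        for s in succs:
            preds[s].add(n)
    dom = {n: set(nodes) for n in nodes}   # start from "everything dominates"
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            # n is dominated by itself plus whatever dominates all predecessors
            new = set.intersection(*(dom[p] for p in preds[n])) | {n}
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def immediate_dominators(dom, entry):
    """Pick, for each block, its closest strict dominator."""
    idom = {}
    for n, ds in dom.items():
        if n != entry:
            # Dominators of n form a chain; the closest one has the most dominators.
            idom[n] = max(ds - {n}, key=lambda d: len(dom[d]))
    return idom


def reverse(graph):
    """Flip every edge; used to compute postdominators."""
    rev = {n: [] for n in graph}
    for n, succs in graph.items():
        for s in succs:
            rev[s].append(n)
    return rev


cfg = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
dom = dominators(cfg, "A")
idom = immediate_dominators(dom, "A")
postdom = dominators(reverse(cfg), "D")
```

The `idom` map is the dominator tree in parent-pointer form: each key is a block and its value is the parent in the tree.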

Special edges

A back edge is an edge that points to a block that has already been met during a depth-first (DFS) traversal of the graph and is still on the current DFS path, i.e. an ancestor of the edge's source in the DFS tree. Back edges are typical of loops.
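A minimal sketch of finding back edges with DFS follows. It treats an edge as a back edge when its target is still on the DFS stack (visited but not yet finished), which is the usual way to separate loop-forming edges from edges into blocks that were merely visited earlier. The block names in the sample graph are made up for this illustration, and the recursive form assumes a small graph.

```python
def back_edges(graph, entry):
    """Return the list of (source, target) back edges found by DFS from `entry`."""
    WHITE, GREY, BLACK = 0, 1, 2       # unvisited, on the DFS stack, finished
    color = {n: WHITE for n in graph}
    found = []

    def visit(node):
        color[node] = GREY
        for succ in graph[node]:
            if color[succ] == GREY:    # target is an ancestor: back edge
                found.append((node, succ))
            elif color[succ] == WHITE:
                visit(succ)
        color[node] = BLACK

    visit(entry)
    return found


# A while loop: the body jumps back to the header.
loop_cfg = {
    "entry": ["header"],
    "header": ["body", "exit"],
    "body": ["header"],
    "exit": [],
}
edges = back_edges(loop_cfg, "entry")
```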

A critical edge is an edge which is neither the only edge leaving its source block, nor the only edge entering its destination block. These edges must be split: a new block must be created in the middle of the edge, in order to insert computations on the edge without affecting any other edges.
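Splitting can be sketched as below. This is an illustration only: the naming scheme for the new blocks (`<src>_<dst>_split`) is hypothetical, and the code assumes there are no parallel edges between the same pair of blocks.

```python
def split_critical_edges(graph):
    """Return a copy of `graph` with a new empty block on every critical edge."""
    indeg = {n: 0 for n in graph}
    for succs in graph.values():
        for s in succs:
            indeg[s] += 1
    result = {n: list(succs) for n, succs in graph.items()}
    for src, succs in graph.items():
        if len(succs) < 2:
            continue                       # only edge leaving src: not critical
        for i, dst in enumerate(succs):
            if indeg[dst] > 1:             # dst also has another incoming edge
                mid = f"{src}_{dst}_split" # hypothetical name for the new block
                result[src][i] = mid
                result[mid] = [dst]
    return result


# A -> C is critical: A has two exits and C has two entries.
g = {"A": ["B", "C"], "B": ["C"], "C": []}
split = split_critical_edges(g)
```

After the split, code placed in `A_C_split` runs only when control flows from A directly to C, without affecting the path through B.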

An abnormal edge is an edge whose destination is unknown. Exception handling constructs can produce them. These edges tend to inhibit optimization.

An impossible edge (also known as a fake edge) is an edge which has been added to the graph solely to preserve the property that the exit block postdominates all blocks. It cannot ever be traversed.

Loop management

A loop header (sometimes called the entry point of the loop) is a dominator that is the target of a loop-forming back edge. The loop header dominates all blocks in the loop body. A block may be a loop header for more than one loop. A loop may have multiple entry points, in which case it has no "loop header".

Suppose block M is a dominator with several incoming edges, some of them being back edges (so M is a loop header). It is advantageous to several optimization passes to break M up into two blocks Mpre and Mloop. The contents of M and back edges are moved to Mloop, the rest of the edges are moved to point into Mpre, and a new edge from Mpre to Mloop is inserted (so that Mpre is the immediate dominator of Mloop). In the beginning, Mpre would be empty, but passes like loop-invariant code motion could populate it. Mpre is called the loop pre-header, and Mloop would be the loop header.
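The pre-header transformation can be sketched as follows. In this illustration the original block keeps its name and plays the role of Mloop, while the new block is named by appending `_pre`. That naming, the function name `insert_preheader`, and the requirement that the caller supplies the sources of the back edges are all assumptions of the sketch.

```python
def insert_preheader(graph, header, back_edge_sources):
    """Return a new graph in which non-back edges into `header` go to a pre-header.

    `back_edge_sources` is the set of blocks whose edge into `header`
    is a back edge; those edges keep pointing at the header itself.
    """
    pre = header + "_pre"                  # hypothetical name for Mpre
    new_graph = {}
    for node, succs in graph.items():
        if node in back_edge_sources:
            new_graph[node] = list(succs)  # back edges still target the header
        else:
            new_graph[node] = [pre if s == header else s for s in succs]
    new_graph[pre] = [header]              # the single edge Mpre -> Mloop
    return new_graph


loop_cfg = {
    "entry": ["M"],
    "M": ["body", "exit"],
    "body": ["M"],
    "exit": [],
}
with_pre = insert_preheader(loop_cfg, "M", {"body"})
```

The new block starts out empty. A pass such as loop-invariant code motion could then place hoisted computations in `M_pre`, where they run once before the loop instead of on every iteration.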

 

 
