Intel® Threading Building Blocks (Intel® TBB) Developer Guide notes: Parallelizing Data Flow and Dependence Graphs
https://www.threadingbuildingblocks.org/docs/help/index.htm
Parallelizing Data Flow and Dependence Graphs
In addition to loop parallelism, the Intel® Threading Building Blocks (Intel® TBB) library also supports graph parallelism. It's possible to create graphs that are highly scalable, but it is also possible to create graphs that are completely sequential.
In addition to loop parallelism, TBB also supports graph parallelism. This makes it possible to create highly scalable graphs, but also graphs that execute completely sequentially.
Using graph parallelism, computations are represented by nodes and the communication channels between these computations are represented by edges. When a node in the graph receives a message, a task is spawned to execute its body object on the incoming message. Messages flow through the graph across the edges that connect the nodes. The following sections present two examples of applications that can be expressed as graphs. For more information on tasks, see the See Also section below.
In graph parallelism, computations are represented as nodes and the communication channels between computations as edges. When a node receives a message, a task is spawned to execute its body on that message. Messages flow through the graph along the edges that connect the nodes. Two examples follow.
The following figure shows a streaming or data flow application where a sequence of values is processed as each value passes through the nodes in the graph. In this example, the sequence is created by a function F. For each value in the sequence, G squares the value and H cubes the value. J then takes each of the squared and cubed values and adds them to a global sum. After all values in the sequence are completely processed, sum is equal to the sum of the sequence of squares and cubes from 1 to 10. In a streaming or data flow graph, the values actually flow across the edges; the output of one node becomes the input of its successor(s).
The figure below shows a streaming, or data flow, application.
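To make this concrete, below is one possible way such a graph could be wired up with the flow graph interface described later in this guide. This is only a sketch, not the guide's own listing: it assumes a C++11 build of classic TBB, models F as a broadcast_node fed from a loop, uses function_nodes for G and H, and implements J as a serial function_node behind a join_node that pairs each squared value with the matching cubed value.

#include "tbb/flow_graph.h"
#include <iostream>
#include <tuple>
using namespace tbb::flow;

int main() {
    int sum = 0;
    graph g;

    // F: broadcasts each value of the sequence to both G and H
    broadcast_node< int > input( g );
    // G: squares each value
    function_node< int, int > squarer( g, unlimited, []( int v ) { return v * v; } );
    // H: cubes each value
    function_node< int, int > cuber( g, unlimited, []( int v ) { return v * v * v; } );
    // pairs a squared value with the matching cubed value
    join_node< std::tuple< int, int >, queueing > combine( g );
    // J: adds both results to the running sum; serial because it touches shared state
    function_node< std::tuple< int, int >, int > summer( g, serial,
        [&]( const std::tuple< int, int > &t ) -> int {
            sum += std::get<0>( t ) + std::get<1>( t );
            return sum;
        } );

    make_edge( input, squarer );
    make_edge( input, cuber );
    make_edge( squarer, input_port<0>( combine ) );
    make_edge( cuber, input_port<1>( combine ) );
    make_edge( combine, summer );

    for ( int i = 1; i <= 10; ++i ) input.try_put( i );
    g.wait_for_all();

    std::cout << "sum = " << sum << std::endl;   // 385 + 3025 = 3410
    return 0;
}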

The following graphic shows a different form of graph application. In this example, a dependence graph is used to establish a partial ordering among the steps for making a peanut butter and jelly sandwich. In this partial ordering, you must first get the bread before spreading the peanut butter or jelly on the bread. You must spread on the peanut butter before you put away the peanut butter jar, and likewise spread on the jelly before you put away the jelly jar. And, you need to spread on both the peanut butter and jelly before putting the two slices of bread together. This is a partial ordering because, for example, it doesn't matter if you spread on the peanut butter first or the jelly first. It also doesn't matter if you finish making the sandwich before putting away the jars.
The figure below shows another kind of graph application, in which a dependence graph expresses the ordering among the steps of a task.

While it can be inferred that resources, such as the bread, or the jelly jar, are shared between ordered steps, it is not explicit in the graph. Instead, only the required ordering of steps is explicit in a dependence graph. For example, you must "Put jelly on 1 slice" before you "Put away jelly jar".
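As a preview of how a partial ordering like this can be expressed, here is a minimal sketch built from continue_nodes (a predefined node type covered later in the guide). The step names are paraphrased from the sandwich example and the wiring is illustrative rather than code from the guide; each continue_node runs its body only after it has received a message from every predecessor.

#include "tbb/flow_graph.h"
#include <iostream>
using namespace tbb::flow;
using namespace std;

int main() {
    graph g;
    // each step is a continue_node; edges carry no data, only "my predecessor finished"
    continue_node< continue_msg > get_bread( g, []( const continue_msg & ) {
        cout << "Get 2 slices of bread\n"; } );
    continue_node< continue_msg > spread_pb( g, []( const continue_msg & ) {
        cout << "Put peanut butter on 1 slice\n"; } );
    continue_node< continue_msg > spread_jelly( g, []( const continue_msg & ) {
        cout << "Put jelly on 1 slice\n"; } );
    continue_node< continue_msg > put_away_pb( g, []( const continue_msg & ) {
        cout << "Put away peanut butter jar\n"; } );
    continue_node< continue_msg > put_away_jelly( g, []( const continue_msg & ) {
        cout << "Put away jelly jar\n"; } );
    continue_node< continue_msg > assemble( g, []( const continue_msg & ) {
        cout << "Put the two slices of bread together\n"; } );

    // only the required ordering is expressed; peanut butter vs. jelly order is left open
    make_edge( get_bread, spread_pb );
    make_edge( get_bread, spread_jelly );
    make_edge( spread_pb, put_away_pb );
    make_edge( spread_jelly, put_away_jelly );
    make_edge( spread_pb, assemble );
    make_edge( spread_jelly, assemble );

    get_bread.try_put( continue_msg() );
    g.wait_for_all();
    return 0;
}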
The flow graph interface in the Intel TBB library allows you to express data flow and dependence graphs such as these, as well as more complicated graphs that include cycles, conditionals, buffering and more. If you express your application using the flow graph interface, the runtime library spawns tasks to exploit the parallelism that is present in the graph. For example, in the first example above, perhaps two different values might be squared in parallel, or the same value might be squared and cubed in parallel. Likewise in the second example, the peanut butter might be spread on one slice of bread in parallel with the jelly being spread on the other slice. The interface expresses what is legal to execute in parallel, but allows the runtime library to choose at runtime what will be executed in parallel.
TBB lets you express data flow and dependence graphs such as these, as well as more complicated graphs that contain cycles, conditionals, buffering, and so on.
The support for graph parallelism is contained within the namespace tbb::flow and is defined in the flow_graph.h header file.
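In practice that means a translation unit that builds a flow graph starts roughly like this (a sketch; the unqualified names graph, function_node and so on in the snippets below assume the using-directive):

#include "tbb/flow_graph.h"   // tbb::flow::graph, the node types, make_edge, ...
using namespace tbb::flow;    // lets graph, function_node, etc. be used unqualified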
See Also
Basic Flow Graph Concepts
The basic concepts.
Flow Graph Basics: Graph Object
Conceptually a flow graph is a collection of nodes and edges. Each node belongs to exactly one graph and edges are made only between nodes in the same graph. In the flow graph interface, a graph object represents this collection of nodes and edges, and is used for invoking whole graph operations such as waiting for all tasks related to the graph to complete, resetting the state of all nodes in the graph, and canceling the execution of all nodes in the graph.
The code below creates a graph object and then waits for all tasks spawned by the graph to complete. The call to wait_for_all in this example returns immediately since this is a trivial graph with no nodes or edges, and therefore no tasks are spawned.
graph g;
g.wait_for_all();
Flow Graph Basics: Nodes
A node is a class that inherits from tbb::flow::graph_node and also typically inherits from tbb::flow::sender<T> , tbb::flow::receiver<T> or both. A node performs some operation, usually on an incoming message and may generate zero or more output messages. Some nodes require more than one input message or generate more than one output message.
Nodes are where the computation is done.
While it is possible to define your own node types by inheriting from graph_node, sender and receiver, it is more typical that predefined node types are used to construct a graph. The list of predefined nodes is available from the See Also section below.
A function_node is a predefined type available in flow_graph.h and represents a simple function with one input and one output. The constructor for a function_node takes three arguments:
template< typename Body> function_node(graph &g, size_t concurrency, Body body)
| Parameter | Description |
|---|---|
| Body | Type of the body object. |
| g | The graph the node belongs to. |
| concurrency | The concurrency limit for the node. You can use the concurrency limit to control how many invocations of the node are allowed to proceed concurrently, from 1 (serial) to an unlimited number. |
| body | User defined function object, or lambda expression, that is applied to the incoming message to generate the outgoing message. |
Below is code for creating a simple graph that contains a single function_node. In this example, a node n is constructed that belongs to graph g, and has a second argument of 1, which allows at most 1 invocation of the node to occur concurrently. The body is a lambda expression that prints each value v that it receives, spins for v seconds, prints the value again, and then returns v unmodified. The code for the function spin_for is not provided.
graph g;
function_node< int, int > n( g, 1, []( int v ) -> int {
cout << v;
spin_for( v );
cout << v;
return v;
} );
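The guide leaves spin_for undefined. A hypothetical stand-in, assumed here only so the snippets can be run, is a busy-wait that burns wall-clock time for roughly v seconds:

#include "tbb/tick_count.h"

// hypothetical helper: busy-wait for roughly the given number of seconds
void spin_for( double seconds ) {
    tbb::tick_count start = tbb::tick_count::now();
    while ( ( tbb::tick_count::now() - start ).seconds() < seconds ) {
        // spin; intentionally does no useful work
    }
}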
After the node is constructed in the example above, you can pass messages to it, either by connecting it to other nodes using edges or by invoking its function try_put. Using edges is described in the next section.
n.try_put( 1 );
n.try_put( 2 );
n.try_put( 3 );
You can then wait for the messages to be processed by calling wait_for_all on the graph object:
g.wait_for_all();
In the above example code, the function_node n was created with a concurrency limit of 1. When it receives the message sequence 1, 2 and 3, the node n will spawn a task to apply the body to the first input, 1. When that task is complete, it will then spawn another task to apply the body to 2. And likewise, the node will wait for that task to complete before spawning a third task to apply the body to 3. The calls to try_put do not block until a task is spawned; if a node cannot immediately spawn a task to process the message, the message will be buffered in the node. When it is legal, based on concurrency limits, a task will be spawned to process the next buffered message.
In the above graph, each message is processed sequentially. If however, you construct the node with a different concurrency limit, parallelism can be achieved:
function_node< int, int > n( g, tbb::flow::unlimited, []( int v ) -> int {
cout << v;
spin_for( v );
cout << v;
return v;
} );
You can use unlimited as the concurrency limit to instruct the library to spawn a task as soon as a message arrives, regardless of how many other tasks have been spawned. You can also use any specific value, such as 4 or 8, to limit concurrency to at most 4 or 8, respectively. It is important to remember that spawning a task does not mean creating a thread. So while a graph may spawn many tasks, only the number of threads available in the library's thread pool will be used to execute these tasks.
Suppose you use unlimited in the function_node constructor instead and call try_put on the node:
n.try_put( 1 );
n.try_put( 2 );
n.try_put( 3 );
g.wait_for_all();
The library spawns three tasks, each one applying n's lambda expression to one of the messages. If you have a sufficient number of threads available on your system, then all three invocations of the body will occur in parallel. If however, you have only one thread in the system, they execute sequentially.
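To make the difference between tasks and threads concrete, the fragment below (a sketch that reuses the spin_for helper from above) caps the node at 4 concurrent invocations and, using the classic-TBB task_scheduler_init, also caps the worker pool at 2 threads. The graph may have several tasks outstanding, but at most 2 body invocations can actually run at any instant.

#include "tbb/flow_graph.h"
#include "tbb/task_scheduler_init.h"
#include <iostream>
using namespace tbb::flow;
using namespace std;

int main() {
    // bound the thread pool: at most 2 worker threads, however many tasks are spawned
    tbb::task_scheduler_init init( 2 );

    graph g;
    // at most 4 invocations of the body may be in flight at once;
    // further messages are buffered inside the node until a slot frees up
    function_node< int, int > n( g, 4, []( int v ) -> int {
        cout << v;
        spin_for( v );   // same helper as in the earlier snippets
        cout << v;
        return v;
    } );

    for ( int i = 1; i <= 8; ++i )
        n.try_put( i );
    g.wait_for_all();
    return 0;
}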