data hazard in CPU pipeline
1, background info
5 stages in CPU pipeline: IF, ID, EX, MM, WB
IF – Instruction Fetch
ID – Instruction Decode
EX – Execute
MM – Memory
WB – Write Back
2, what is data hazard and how does it happen
Data hazards occur when instructions that exhibit data dependency modify data in different stages of a pipeline.
For example, we write a register and then read it. The write and read operations are dependent; they are related by the same register.
In a pipeline, if read happens before the write can finish, it’s very likely that it would not read back a correct value.
In this case, data hazard happens.
It’s not difficult to imagine that any 2 operations involving write may cause data hazard.
We can categorize data hazard into 3 kinds:
(1) read after write (RAW);
(2) write after write (WAW);
(3) write after read (WAR);
3, how to prevent data hazard
(1) pipeline bubbling
That is, stall the pipeline until hazard is resolved.
add r1, r2, r3
add r4, r1, r5 Cycles----->
________________________ Instructions
|_IF_|_ID_|_EX_|_MM_|_WB_|______________ add r1, r2, r3
|_IF_|_x_x_x_x_|_ID_|_EX_|_MM_|_WB_| add r4, r1, r5
stall cycles
Usually, NOP operationg is inserted during stall time.
As instructions are fetched, control logic determines whether a hazard could/will occur. If this is true, then the control logic insert NOPs into the pipeline.
If the number of NOPs equals the number of stages in the pipeline, the processor has been cleared of all instructions and can proceed free from hazards.
(2) out-of-order execution
This would be introduced heavily later.
(3) operand forwarding/bypass
See below examle.
Instruction 0: Register 1 = 6Instruction 1: Register 1 = 3Instruction 2: Register 2 = Register 1 + 7 = 10
Instruction 2 would need to use Register 1. If Instruction 2 is executed before Instruction 1 is finished, it may get a wrong result.
However, we can see that:
a) the output of Instruction 1 (which is 3) can be used by subsequent instructions before the value 3 is committed to/stored in Register 1.
b) there is no wait to commit/store the output of Instruction 1 in Register 1 (in this example, the output is 3) before making that output available to the subsequent instruction (in this case, Instruction 2).
So it can be done that:
Instruction 2 uses the correct (the more recent) value of Register 1: the commit/store was made immediately and not pipelined.
This is forwarding/bypass.
4, how does forwarding/bypss work
With forwarding enabled, the Instruction Decode/Execution (ID/EX) stage of the pipeline now has two inputs: the value read from the register specified (in this example, the value 6 from Register 1), and the new value of Register 1 (in this example, this value is 3) which is sent from the next stage Instruction Execute/Memory Access(EX/MM). Added control logic is used to determine which input to use.
A forwarding vs. no forwarding example as followed:
ADD A B C #A=B+C
SUB D C A #D=C-A
data hazard in CPU pipeline的更多相关文章
- [Machine Learning with Python] Data Preparation through Transformation Pipeline
In the former article "Data Preparation by Pandas and Scikit-Learn", we discussed about a ...
- antidependence and data hazard
See below example. ADDD F6, F0, F8 SUBD F8, F10, F14 Some article would say that “ There’s an ant ...
- SSIS Data Flow 的 Execution Tree 和 Data Pipeline
一,Execution Tree 执行树是数据流组件(转换和适配器)基于同步关系所建立的逻辑分组,每一个分组都是一个执行树的开始和结束,也可以将执行树理解为一个缓冲区的开始和结束,即缓冲区的整个生命周 ...
- verilog实现16位五级流水线的CPU带Hazard冲突处理
verilog实现16位五级流水线的CPU带Hazard冲突处理 该文是基于博主之前一篇博客http://www.cnblogs.com/wsine/p/4292869.html所增加的Hazard处 ...
- [DE] Pipeline for Data Engineering
How to build an ML pipeline for Data Science 垃圾信息分类 Ref:Develop a NLP Model in Python & Deploy I ...
- 以Excel 作为Data Source,将data导入db
将Excel作为数据源,将数据导入db,是SSIS的一个简单的应用,下图是示例Excel,数据列是code和name 第一部分,Excel中的数据类型是数值类型 1,使用SSDT创建一个package ...
- One EEG preprocessing pipeline - EEG-fMRI paradigm
The preprocessing pipeline of EEG data from EEG-fMRI paradigm differs from that of regular EEG data, ...
- Java底层实现 - CPU术语
1.内存屏障(memory barriers)是一组处理器指令,用于实现对内存操作的顺序限制 2.缓冲行(cache line)CPU高速缓存中可以分配的最小存储单位.处理器填写缓存行时 会加载整个缓 ...
- CPU和GPU性能对比
计算20000次10000点的fft,分别使用CPU和GPU,得 the running time of cpu is : 2.3696s the running time of gpu is : 0 ...
随机推荐
- Openstack组件实现原理 — OpenVswitch/Gre/vlan
目录 目录 前文提要 Neutron 管理的网络相关实体 OpenVswitchOVS OVS 的架构 VLan GRE 隧道 Compute Node 中的 Instance 通过 GRE 访问 P ...
- Zookeeper怎么实现分布式锁?
对访问资源 R1 的过程加锁,在操作 O1 结束对资源 R1 访问前,其他操作不允许访问资源 R1.以上算是对独占锁的简单定义了,那么这段定义在 Zookeeper 的"类 Unix/Lin ...
- sshpass批量分发ssh秘钥
首先安装sshpass: yum -y install sshpass 单条命令: sshpass -p“password” ssh-copy-id -i /root/.ssh/id_rsa.pub ...
- 8分钟带你深入浅出搞懂Nginx
Nginx是一款轻量级的Web服务器.反向代理服务器,由于它的内存占用少,启动极快,高并发能力强,在互联网项目中广泛应用. 图基本上说明了当下流行的技术架构,其中Nginx有点入口网关的味道. 反向代 ...
- 《转》python学习基础
学习的python本来想自己总结,但是发现了一篇不错的大牛的博客,拿来主义,,又被我实践了 关于前两篇如果总结的不详细,因此把他人的转载过来 http://www.cnblogs.com/BeginM ...
- fuzzy commitment 和fuzzy vault
Alice,这位令人惊异的魔术天才,正表演关于人类意念的神秘技巧.她将在Bob选牌之前猜中Bob将选的牌!注意Alice在一张纸上写出她的预测.Alice很神秘地将那张纸片装入信封中并封上.就在人们吃 ...
- 利用 Dockerfile 定制镜像
镜像的定制实际上就是定制每一层所添加的配置.文件. 如果我们可以把每一层修改.安装.构建.操作的命令都写入一个脚本,用这个脚本来构建.定制镜像, 那么之前提及的无法重复的问题.镜像构建透明性的问题.体 ...
- drop database出现1010
> drop database glc; ERROR (HY000): Error dropping database (can't rmdir './glc/', errno: 17) Fri ...
- [JZOJ2865]【集训队互测 2012】Attack
题目 题目大意 平面上有一堆带权值的点.两种操作:交换两个点的权值,查找一个矩形的第\(k\)小 \(N<=60000\) \(M<=10000\) \(10000ms\) 思考历程&am ...
- 【JZOJ4665】数列
description analysis 水法又\(n\)方二十万-- 可以先离散化,然后枚举起点,枚举向下扫 同一个数出现过或模数不相同就\(break\),注意\(k\)不够顶替还是有可能存在解不 ...
