cod-hw
COD hw 4
Xinglu Wang
3140102282
2016-12-27 21:28:01
5.3
5.3.1
The offset field is 5 bits wide, so one block holds $2^5 = 32$ bytes. Expressed in 4-byte words, the cache block size is ${2^{5 - 2}} = 8$ words.
5.3.2
The index field covers bits 9 to 5, so the cache has $2^{9-5+1}=32$ entries.
5.3.3
Data storage bits: $2^5 \times 8 \times 2^5 = 8192$
Cache implementation bits (22 tag bits and 1 valid bit per entry): $2^5 \times (22+1+8 \times 2^5) = 8928$
The ratio is $8928/8192 \approx 1.090$
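The arithmetic for 5.3.1 to 5.3.3 can be checked with a few lines of Python. This is only a sketch: the constant names are made up for illustration, and it assumes a 32-bit address split into tag, index (bits 9 to 5) and offset (bits 4 to 0), with one valid bit per entry.

```python
# Assumed address layout: 32-bit address, index = bits 9..5, offset = bits 4..0.
ADDRESS_BITS = 32
OFFSET_BITS = 5
INDEX_BITS = 9 - 5 + 1
TAG_BITS = ADDRESS_BITS - INDEX_BITS - OFFSET_BITS   # 22
VALID_BITS = 1

block_bytes = 2 ** OFFSET_BITS      # bytes per block
block_words = block_bytes // 4      # 4-byte words per block
entries = 2 ** INDEX_BITS           # number of cache entries

# Bits used for data only, and bits used by the whole cache (tag + valid + data).
data_bits = entries * block_bytes * 8
total_bits = entries * (TAG_BITS + VALID_BITS + block_bytes * 8)
ratio = total_bits / data_bits

print(block_words, entries, data_bits, total_bits, ratio)
```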
5.3.4
| Byte Address | Block Address | Index | Tag | Hit/Miss |
|---|---|---|---|---|
| 0 | 0 | 0 | 0 | miss |
| 4 | 4/32=0 | 4/32 mod 32 =0 | 4/32/32 =0 | hit |
| 16 | 16/32=0 | 16/32 mod 32 =0 | 16/32/32=0 | hit |
| 132 | 132/32=4 | 132/32 mod 32 =4 | 132/32/32=0 | miss |
| 232 | 7 | 7 | 0 | miss |
| 160 | 5 | 5 | 0 | miss |
| 1024 | 32 | 0 | 1 | miss replace |
| 30 | 0 | 0 | 0 | miss replace |
| 140 | 4 | 4 | 0 | hit |
| 3100 | 96 | 0 | 3 | miss replace |
| 180 | 5 | 5 | 0 | hit |
| 2180 | 68 | 4 | 2 | miss replace |
Thus, 4 blocks are replaced.
5.3.5
Hit ratio is $4/12=1/3$
5.3.6
$$\begin{align*}
&\langle \text{index}, \text{tag}, \text{data} \rangle \\\
&\langle 0, 3, \text{Mem}[3072] \sim \text{Mem}[3103] \rangle \\\
&\langle 4, 2, \text{Mem}[2176] \sim \text{Mem}[2207] \rangle \\\
&\langle 5, 0, \text{Mem}[160] \sim \text{Mem}[191] \rangle \\\
&\langle 7, 0, \text{Mem}[224] \sim \text{Mem}[255] \rangle
\end{align*}$$
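The trace in 5.3.4 to 5.3.6 can be replayed with a small direct-mapped cache model. The following is a minimal sketch, not part of the original solution; the function name `simulate_direct_mapped` and its parameters are hypothetical, and the cache is assumed to start empty.

```python
def simulate_direct_mapped(addresses, offset_bits=5, index_bits=5):
    """Replay byte addresses through a direct-mapped cache that starts empty."""
    cache = {}   # index -> tag of the block currently stored at that index
    log = []
    for addr in addresses:
        block = addr >> offset_bits            # block address = addr / 32
        index = block % (1 << index_bits)      # block address mod 32
        tag = block >> index_bits              # block address / 32
        if cache.get(index) == tag:
            outcome = "hit"
        elif index in cache:
            outcome = "miss replace"           # a valid block is evicted
        else:
            outcome = "miss"
        cache[index] = tag
        log.append((addr, block, index, tag, outcome))
    return log, cache


ADDRESSES = [0, 4, 16, 132, 232, 160, 1024, 30, 140, 3100, 180, 2180]
log, final = simulate_direct_mapped(ADDRESSES)
hits = sum(1 for row in log if row[4] == "hit")
replaced = sum(1 for row in log if row[4] == "miss replace")

for row in log:
    print(row)
print("hits:", hits, "replaced:", replaced)
for index, tag in sorted(final.items()):
    base = ((tag << 5) | index) << 5           # first byte address of the block
    print(index, tag, f"Mem[{base}]..Mem[{base + 31}]")
```

Counting the `"hit"` and `"miss replace"` rows in `log` gives the hit ratio and the number of replaced blocks, and `final` holds the valid entries left in the cache after the last access.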
5.4
5.4.1
A write buffer is needed between the L1 and L2 caches:
On an L1 write miss, the data has to be written directly to the L2 cache. Because an L2 write takes longer than an L1 access, it is better to store the data in a buffer first and write it to the L2 cache when the buffer fills up.
A buffer is also needed between the L2 cache and memory:
On an L2 write hit, the block in L2 becomes dirty, i.e. it is no longer consistent with memory.
On an L2 write miss, a block has to be allocated in the L2 cache. If the L2 cache is full, an old, dirty block is chosen for replacement and has to be written back to memory. Storing this evicted block in the buffer first reduces the miss penalty.
5.4.2
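Handling an L1 write miss:
The write goes straight through to L2. Writing in L2 falls into two cases:
L2 write hit: replace the corresponding data and update the flag bits to valid and dirty.
L2 write miss: write-allocate, which again falls into two cases:
- The L2 cache has free space: allocate directly.
- The L2 cache has no free space: choose an old/dirty block to replace according to the replacement policy; this dirty block is placed into the buffer at the same time.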
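The cases above can be sketched as a toy model. Everything in it is an assumption made for illustration: the class name `L2WriteBack`, a fully associative L2, LRU as the replacement policy, and a plain list standing in for the buffer between L2 and memory.

```python
from collections import OrderedDict


class L2WriteBack:
    """Toy L2 cache: write-back, write-allocate, fully associative, LRU (assumed)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()     # block address -> dirty flag, oldest first
        self.writeback_buffer = []     # dirty victims waiting to be written to memory

    def write(self, block):
        if block in self.lines:                    # L2 write hit
            self.lines.move_to_end(block)          # mark as most recently used
            self.lines[block] = True               # block is now valid and dirty
            return "hit"
        if len(self.lines) >= self.capacity:       # L2 write miss, no free space
            victim, dirty = self.lines.popitem(last=False)
            if dirty:
                self.writeback_buffer.append(victim)
            self.lines[block] = True
            return "miss, replaced"
        self.lines[block] = True                   # L2 write miss, free space
        return "miss, allocated"


def l1_write_miss(l2, block):
    """L1 is write-through and keeps no copy on a write miss: forward the write to L2."""
    return l2.write(block)


l2 = L2WriteBack(capacity=2)
results = [l1_write_miss(l2, block) for block in (10, 11, 10, 12)]
print(results, l2.writeback_buffer)
```

In this run the fourth write finds L2 full, evicts the least recently used block (11), and pushes it into the buffer because it is dirty.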
5.4.3
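L1 write miss: the data is written directly to L2, and no copy is kept in L1.
If data at the same address is read afterwards, it causes an L1 read miss: the data is read out of L2, the L2 entry is marked invalid, and the L2 copy of the data is written into the buffer between L2 and memory.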
5.4.4
CPI = 2 $\Rightarrow$ 0.5 instructions/cycle
- Instruction read bandwidth = $0.5 \times 0.3\% \times 64\ \text{bytes} = 0.096$ bytes/cycle
- Data read bandwidth = $0.5 \times \frac{250}{1000} \times 2\% \times 64\ \text{bytes} = 0.16$ bytes/cycle
- Total read bandwidth = $0.096 + 0.16 = 0.256$ bytes/cycle
- Data write bandwidth = $0.5 \times \frac{100}{1000} \times 4\ \text{bytes} = 0.2$ bytes/cycle, because a write-through cache writes directly to memory, one word (4 bytes) per store instruction.
- Total write bandwidth = $0.2$ bytes/cycle
5.4.5
- Total read bandwidth = $0.256$ bytes/cycle
- Data write bandwidth = $0.5 \times \frac{100}{1000} \times 30\% \times 64\ \text{bytes} = 0.96$ bytes/cycle
- Total write bandwidth = $0.96$ bytes/cycle
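The bandwidth figures in 5.4.4 and 5.4.5 follow the same pattern (instructions per cycle times events per instruction times bytes per event), so they can be recomputed in one place. A short sketch, using the numbers from the formulas above; the constant names are illustrative only.

```python
IPC = 1 / 2.0                     # CPI of 2 -> 0.5 instructions per cycle
BLOCK_BYTES = 64
WORD_BYTES = 4
I_MISS_RATE = 0.003               # instruction cache miss rate, 0.3%
D_MISS_RATE = 0.02                # data cache miss rate, 2%
READS_PER_INSTR = 250 / 1000
WRITES_PER_INSTR = 100 / 1000
DIRTY_FRACTION = 0.30             # 30% factor used in 5.4.5

# Read traffic: each miss fetches one 64-byte block.
instr_bw = IPC * I_MISS_RATE * BLOCK_BYTES
data_read_bw = IPC * READS_PER_INSTR * D_MISS_RATE * BLOCK_BYTES
read_bw = instr_bw + data_read_bw

# Write traffic, as computed in 5.4.4 (one word per store) and 5.4.5 (dirty blocks).
write_through_bw = IPC * WRITES_PER_INSTR * WORD_BYTES
write_back_bw = IPC * WRITES_PER_INSTR * DIRTY_FRACTION * BLOCK_BYTES

print(f"read: {read_bw:.3f} bytes/cycle")
print(f"write (5.4.4): {write_through_bw:.3f} bytes/cycle")
print(f"write (5.4.5): {write_back_bw:.3f} bytes/cycle")
```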
5.4.6
5.6
5.6.1
- P1 cycle time = P1 L1 hit time = $0.66$ ns, P1 clock rate = $1/0.66\ \text{ns} = 1.52$ GHz
- P2 cycle time = P2 L1 hit time = $0.90$ ns, P2 clock rate = $1/0.90\ \text{ns} = 1.11$ GHz
5.6.2
- P1 AMAT in ns = $0.66ns+8\% \times 70 ns=6.26ns$
- P2 AMAT in ns = $0.90ns+6\% \times 70 ns=5.1ns$
5.6.3
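First, express AMAT in cycles. A hit takes 1 cycle, and the 70 ns memory access is rounded up to whole cycles: $\lceil 70/0.66 \rceil = 107$ cycles for P1 and $\lceil 70/0.90 \rceil = 78$ cycles for P2.
- P1 AMAT in cycles = $1 + 8\% \times 107 = 9.56$ cycles
- P2 AMAT in cycles = $1 + 6\% \times 78 = 5.68$ cycles
Then, calculate CPI (every instruction is fetched, and 36% of instructions also access data):
- P1 CPI = $9.56+(9.56-1) \times 0.36 = 12.64$
- P2 CPI = $5.68+(5.68-1) \times 0.36 = 7.36$
Compare in ns (CPI $\times$ cycle time):
- P1 time per instruction: $12.6416 \times 0.66 \approx 8.34$ ns
- P2 time per instruction: $7.3648 \times 0.90 \approx 6.63$ ns
$\Rightarrow$ P2 is faster than P1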
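The CPI comparison can be reproduced in code. One assumption has to be stated: the cycle counts 9.56 and 5.68 come out when the 70 ns memory access is rounded up to a whole number of cycles (107 for P1, 78 for P2), so the sketch below does that. The function name `cpi_and_time` is hypothetical.

```python
import math


def cpi_and_time(hit_time_ns, miss_rate, mem_ns=70.0, data_fraction=0.36):
    """Return (miss penalty in cycles, AMAT in cycles, CPI, ns per instruction).

    Assumes the cycle time equals the L1 hit time and that the memory
    access time is rounded up to whole cycles.
    """
    penalty_cycles = math.ceil(mem_ns / hit_time_ns)
    amat_cycles = 1 + miss_rate * penalty_cycles
    # Every instruction pays the fetch AMAT; data accesses add their miss stalls.
    cpi = amat_cycles + (amat_cycles - 1) * data_fraction
    return penalty_cycles, amat_cycles, cpi, cpi * hit_time_ns


for name, hit_ns, miss_rate in (("P1", 0.66, 0.08), ("P2", 0.90, 0.06)):
    penalty, amat, cpi, ns = cpi_and_time(hit_ns, miss_rate)
    print(f"{name}: penalty={penalty} cycles, AMAT={amat:.2f} cycles, "
          f"CPI={cpi:.2f}, {ns:.2f} ns/inst")
```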
5.6.4
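P1 AMAT = $0.66+8\% \times 70 = 6.26$ ns
P1 with L2 AMAT = $0.66+8\% \times (5.62+95\% \times 70)=6.43$ ns
$\Rightarrow$ P1 with L2 becomes worse: the L2 miss rate is so high that almost every L1 miss pays the L2 hit time on top of the memory access.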
5.6.5
P1 with L2 AMAT in cycles: $\frac{0.66+8\% \times (5.62+95\% \times 70)}{0.66} \approx 9.74$ cycles
P1 with L2 CPI: $9.74+(9.74-1) \times 0.36 \approx 12.89$
5.6.6
P1 with L2 AMAT = $0.66+8\% \times (5.62+95\% \times 70)=6.43$ ns
P2 without L2 AMAT = $0.9+6\% \times 70=5.1$ ns
$\Rightarrow$ To match P2, the L1 miss rate $MR$ of P1 with L2 must satisfy:
$$5.1=0.66+MR \times (5.62+95\% \times 70) $$
$\Rightarrow MR=\frac{5.1-0.66}{72.12}=6.156\%$
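The two-level AMAT and the required miss rate in 5.6.4 to 5.6.6 can be verified numerically. A brief sketch using the figures from the formulas above; the variable names are illustrative.

```python
L1_HIT_NS = 0.66
L1_MISS_RATE = 0.08
L2_HIT_NS = 5.62
L2_MISS_RATE = 0.95
MEM_NS = 70.0
DATA_FRACTION = 0.36
P2_AMAT_NS = 0.90 + 0.06 * MEM_NS          # P2 without L2

# Penalty of an L1 miss: L2 hit time plus the memory access on an L2 miss.
l1_miss_penalty_ns = L2_HIT_NS + L2_MISS_RATE * MEM_NS
amat_with_l2_ns = L1_HIT_NS + L1_MISS_RATE * l1_miss_penalty_ns

# Convert to cycles (cycle time = L1 hit time) and derive CPI as in 5.6.5.
amat_cycles = amat_with_l2_ns / L1_HIT_NS
cpi_with_l2 = amat_cycles + (amat_cycles - 1) * DATA_FRACTION

# L1 miss rate at which P1 with L2 has the same AMAT as P2 without L2.
required_miss_rate = (P2_AMAT_NS - L1_HIT_NS) / l1_miss_penalty_ns

print(f"AMAT with L2: {amat_with_l2_ns:.2f} ns = {amat_cycles:.4f} cycles")
print(f"CPI with L2: {cpi_with_l2:.2f}")
print(f"required L1 miss rate: {required_miss_rate:.3%}")
```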
5.7
5.7.1
| Word Addr | Block Addr | Set Index | Tag | Hit/Miss |
|---|---|---|---|---|
| 3 | 3/2=1 | 1 mod 4=1 | 1/4 =0 | M |
| 180 | 90 | 2 | 22 | M |
| 43 | 21 | 1 | 5 | M |
| 2 | 1 | 1 | 0 | H |
| 191 | 95 | 3 | 23 | M |
| 88 | 44 | 0 | 11 | M |
| 190 | 95 | 3 | 23 | H |
| 14 | 7 | 3 | 1 | M |
| 181 | 90 | 2 | 22 | H |
| 44 | 22 | 2 | 5 | M |
| 186 | 93 | 1 | 23 | M |
| 253 | 126 | 2 | 31 | M |
| Contents after each access (rows: sets 0 to 3, columns: ways, T = tag, FFFF = empty) |
|---|
| $\begin{array}{*{20}{l}} {FFFF} & {FFFF} & {FFFF} \\\ T(Mem[2-3])=0 & {FFFF} & {FFFF}\\\ {FFFF} & {FFFF} & {FFFF}\\\ {FFFF} & {FFFF} & {FFFF}\end{array} $ |
| $\begin{array}{*{20}{l}} {FFFF} & {FFFF} & {FFFF} \\\ T(Mem[2-3])=0 & {FFFF} & {FFFF}\\\ T(Mem[180-181])=22 & {FFFF} & {FFFF}\\\ {FFFF} & {FFFF} & {FFFF}\end{array} $ |
| $\begin{array}{*{20}{l}} {FFFF} & {FFFF} & {FFFF} \\\ T(Mem[2-3])=0 & T(Mem[42-43])=5 & {FFFF}\\\ T(Mem[180-181])=22 & {FFFF} & {FFFF}\\\ {FFFF} & {FFFF} & {FFFF}\end{array} $ |
| the same as above |
| $\begin{array}{*{20}{l}} {FFFF} & {FFFF} & {FFFF} \\\ T(Mem[2-3])=0 & T(Mem[42-43])=5 & {FFFF}\\\ T(Mem[180-181])=22 & {FFFF} & {FFFF}\\\ T(Mem[190-191])=23 & {FFFF} & {FFFF}\end{array} $ |
| $\begin{array}{*{20}{l}} T(Mem[88-89])=11 & {FFFF} & {FFFF} \\\ T(Mem[2-3])=0 & T(Mem[42-43])=5 & {FFFF}\\\ T(Mem[180-181])=22 & {FFFF} & {FFFF}\\\ T(Mem[190-191])=23 & {FFFF} & {FFFF}\end{array} $ |
| the same as above |
| $\begin{array}{*{20}{l}} T(Mem[88-89])=11 & {FFFF} & {FFFF} \\\ T(Mem[2-3])=0 & T(Mem[42-43])=5 & {FFFF}\\\ T(Mem[180-181])=22 & {FFFF} & {FFFF}\\\ T(Mem[190-191])=23 & T(Mem[14-15])=1 & {FFFF}\end{array} $ |
| the same as above |
| $\begin{array}{*{20}{l}} T(Mem[88-89])=11 & {FFFF} & {FFFF} \\\ T(Mem[2-3])=0 & T(Mem[42-43])=5 & {FFFF}\\\ T(Mem[180-181])=22 & T(Mem[44-45])=5 & {FFFF}\\\ T(Mem[190-191])=23 & T(Mem[14-15])=1 & {FFFF}\end{array} $ |
| $\begin{array}{*{20}{l}} T(Mem[88-89])=11 & {FFFF} & {FFFF} \\\ T(Mem[2-3])=0 & T(Mem[42-43])=5 & T(Mem[186-187])=23\\\ T(Mem[180-181])=22 & T(Mem[44-45])=5 & {FFFF}\\\ T(Mem[190-191])=23 & T(Mem[14-15])=1 & {FFFF}\end{array} $ |
| $\begin{array}{*{20}{l}} T(Mem[88-89])=11 & {FFFF} & {FFFF} \\\ T(Mem[2-3])=0 & T(Mem[42-43])=5 & T(Mem[186-187])=23\\\ T(Mem[180-181])=22 & T(Mem[44-45])=5 & T(Mem[252-253])=31\\\ T(Mem[190-191])=23 & T(Mem[14-15])=1 & {FFFF}\end{array} $ |
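The hit/miss column and the tags stored in each set can be cross-checked with a small set-associative model. This is a sketch under assumptions: the function name `simulate_set_associative` is made up, and LRU is assumed as the replacement policy, although no set overflows in this trace so the policy is never exercised.

```python
def simulate_set_associative(word_addresses, ways=3, sets=4, block_words=2):
    """Replay word addresses through a set-associative cache that starts empty."""
    cache = [[] for _ in range(sets)]   # per set: tags, least recently used first
    outcomes = []
    for addr in word_addresses:
        block = addr // block_words     # block address
        index = block % sets            # set index
        tag = block // sets
        lines = cache[index]
        if tag in lines:
            lines.remove(tag)           # re-appended below as most recently used
            outcomes.append("H")
        else:
            if len(lines) == ways:      # set full: evict the LRU tag (assumed policy)
                lines.pop(0)
            outcomes.append("M")
        lines.append(tag)
    return outcomes, cache


WORDS = [3, 180, 43, 2, 191, 88, 190, 14, 181, 44, 186, 253]
outcomes, cache = simulate_set_associative(WORDS)
print(outcomes)
for index, tags in enumerate(cache):
    print("set", index, "tags", sorted(tags))
```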
5.7.2
| Word Addr (= Block Addr = Tag) | Set | Hit/Miss | Contents |
|---|---|---|---|
| 3 | 0 | miss | 3 FF FF FF FF FF FF FF |
| 180 | 0 | miss | 3 180 FF FF FF FF FF FF |
| 43 | 0 | miss | 3 180 43 FF FF FF FF FF |
| 2 | 0 | miss | 3 180 43 2 FF FF FF FF |
| 191 | 0 | miss | 3 180 43 2 191 FF FF FF |
| 88 | 0 | miss | 3 180 43 2 191 88 FF FF |
| 190 | 0 | miss | 3 180 43 2 191 88 190 FF |
| 14 | 0 | miss | 3 180 43 2 191 88 190 14 |
| 181 | 0 | miss replace | 181 180 43 2 191 88 190 14 |
| 44 | 0 | miss replace | 181 44 43 2 191 88 190 14 |
| 186 | 0 | miss replace | 181 44 186 2 191 88 190 14 |
| 253 | 0 | miss replace | 181 44 186 253 191 88 190 14 |
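The table above can be regenerated with a fully associative model that tracks slot positions and evicts the least recently used word. A minimal sketch, assuming LRU replacement, one-word blocks and an initially empty 8-entry cache; `simulate_fully_associative` is a hypothetical name.

```python
def simulate_fully_associative(word_addresses, capacity=8):
    """Replay word addresses through a fully associative LRU cache (one-word blocks)."""
    slots = []        # slot position -> word address, like the Contents column
    last_use = {}     # cached word address -> time of its most recent access
    outcomes = []
    for time, addr in enumerate(word_addresses):
        if addr in last_use:
            outcomes.append("hit")
        elif len(slots) < capacity:
            slots.append(addr)                     # fill the next empty slot
            outcomes.append("miss")
        else:
            # Evict the least recently used word and reuse its slot.
            victim = min(slots, key=lambda a: last_use[a])
            slots[slots.index(victim)] = addr
            del last_use[victim]
            outcomes.append("miss replace")
        last_use[addr] = time
    return outcomes, slots


WORDS = [3, 180, 43, 2, 191, 88, 190, 14, 181, 44, 186, 253]
outcomes, slots = simulate_fully_associative(WORDS)
print(outcomes)
print(slots)
```

Every address in this trace is distinct, so no access hits; once the eight slots are full, each further miss overwrites the oldest remaining entry.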