Running PARSEC 2.1 on gem5-gpu

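PARSEC is a benchmark suite for shared-memory multicore processors (CPUs). A detailed introduction is on the wiki: http://wiki.cs.princeton.edu/index.php/PARSEC. This tutorial mainly follows http://www.cs.utexas.edu/~cart/parsec_m5/. PARSEC and its inputs can be downloaded from http://parsec.cs.princeton.edu/download.htm.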

This tutorial assumes you have already set up the full-system simulation environment.

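First, obtain the PARSEC applications you need. This was already done when configuring full-system simulation: they are in the parsec folder at the root of the disk image. To recompile them, see http://www.cs.utexas.edu/~cart/parsec_m5/. Whether the sources downloaded from http://parsec.cs.princeton.edu/download.htm need gem5-gpu-specific changes is unknown and would have to be settled by experiment.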

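Second, download an rcS generator: http://www.cs.utexas.edu/~parsec_m5/writescripts.pl. It is a Perl script. Many Linux distributions install Perl by default, so you only need to give the file execute permission. Its usage is documented at the end of the script's source. The end of the PARSEC wiki page also gives usage for some individual programs.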

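Running the generator produces 5 rcS scripts, named in the format "benchName_threadsNumberc_input" (the thread count is followed by a literal c). The input part means: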

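test: the smallest possible input
dev: unclear to me (PARSEC's own documentation describes its simdev input as intended for simulator testing and development)
small: realistic input, run time about 1 s
medium: realistic input, run time about 5 s
large: realistic input, run time about 15 s
native: realistic input, run time about 15 min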
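The naming format above can be taken apart mechanically. The following is a minimal Python sketch, not part of the generator itself: the function name parse_rcs_name is hypothetical, and the optional ".rcS" extension is an assumption about how the generated files are named.

```python
import re

# Pattern for "benchName_threadsNumberc_input".
# Assumption: the file may or may not carry a ".rcS" extension.
_NAME_RE = re.compile(
    r"^(?P<bench>[A-Za-z0-9]+)_(?P<threads>\d+)c_(?P<input>[A-Za-z]+)(?:\.rcS)?$"
)


def parse_rcs_name(name):
    """Split an rcS script name into (benchmark, thread count, input size)."""
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognised rcS script name: {name!r}")
    return match.group("bench"), int(match.group("threads")), match.group("input")


print(parse_rcs_name("blackscholes_4c_small"))  # ('blackscholes', 4, 'small')
```

This is handy when collecting results from many runs, since the benchmark, thread count, and input size can be recovered from the script name alone.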

The generated rcS files are a useful reference for the command format. To see a program's other options, run ./bench --help-all.

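The usage strings are listed below; for the exact meaning of each argument, see the source file main.cpp. Each line gives the benchmark name followed by five semicolon-separated argument strings, apparently one per generated input size.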

blackscholes;<nthreads> <inputdir>in_4.txt <inputdir>prices.txt;<nthreads> <inputdir>in_16.txt <inputdir>prices.txt;<nthreads> <inputdir>in_4K.txt <inputdir>prices.txt;<nthreads> <inputdir>in_16K.txt <inputdir>prices.txt;<nthreads> <inputdir>in_64K.txt <inputdir>prices.txt

bodytrack;<inputdir>sequenceB_1 4 1 5 1 0 <nthreads>;<inputdir>sequenceB_1 4 1 100 3 0 <nthreads>;<inputdir>sequenceB_1 4 1 1000 5 0 <nthreads>;<inputdir>sequenceB_2 4 2 2000 5 0 <nthreads>;<inputdir>sequenceB_4 4 4 4000 5 0 <nthreads>

canneal;<nthreads> 5 100 <inputdir>10.nets 1;<nthreads> 100 300 <inputdir>100.nets 2;<nthreads> 10000 2000 <inputdir>100000.nets 32;<nthreads> 15000 2000 <inputdir>200000.nets 64;<nthreads> 15000 2000 <inputdir>400000.nets 128

dedup;-c -p -f -t <nthreads> -i <inputdir>test.dat -o <inputdir>output.dat.ddp;-c -p -f -t <nthreads> -i <inputdir>hamlet.dat -o <inputdir>output.dat.ddp;-c -p -f -t <nthreads> -i <inputdir>medias.dat -o <inputdir>output.dat.ddp;-c -p -f -t <nthreads> -i <inputdir>mediam.dat -o <inputdir>output.dat.ddp;-c -p -f -t <nthreads> -i <inputdir>medial.dat -o <inputdir>output.dat.ddp

facesim;-h;-timing -threads <nthreads>;-timing -threads <nthreads>;-timing -threads <nthreads>;-timing -threads <nthreads>

ferret;<inputdir>corelt lsh <inputdir>queriest 1 1 <nthreads> <inputdir>output.txt;<inputdir>coreld lsh <inputdir>queriesd 5 5 <nthreads> <inputdir>output.txt;<inputdir>corels lsh <inputdir>queriess 10 20 <nthreads> <inputdir>output.txt;<inputdir>corelm lsh <inputdir>queriesm 10 20 <nthreads> <inputdir>output.txt;<inputdir>corell lsh <inputdir>queriesl 10 20 <nthreads> <inputdir>output.txt

fluidanimate;<nthreads> 1 <inputdir>in_5K.fluid <inputdir>out.fluid;<nthreads> 3 <inputdir>in_15K.fluid <inputdir>out.fluid;<nthreads> 5 <inputdir>in_35K.fluid <inputdir>out.fluid;<nthreads> 5 <inputdir>in_100K.fluid <inputdir>out.fluid;<nthreads> 5 <inputdir>in_300K.fluid <inputdir>out.fluid

freqmine;<inputdir>T10I4D100K_3.dat 1;<inputdir>T10I4D100K_1k.dat 3;<inputdir>kosarak_250k.dat 220;<inputdir>kosarak_500k.dat 410;<inputdir>kosarak_990k.dat 790

rtview;<inputdir>octahedron.obj -nodisplay -automove -nthreads <nthreads> -frames 1 -res 1 1;<inputdir>bunny.obj -nodisplay -automove -nthreads <nthreads> -frames 1 -res 16 16;<inputdir>happy_buddha.obj -nodisplay -automove -nthreads <nthreads> -frames 3 -res 480 270;<inputdir>happy_buddha.obj -nodisplay -automove -nthreads <nthreads> -frames 3 -res 960 540;<inputdir>happy_buddha.obj -nodisplay -automove -nthreads <nthreads> -frames 3 -res 1920 1080

streamcluster;2 5 1 10 10 5 none <inputdir>output.txt <nthreads>;3 10 3 16 16 10 none <inputdir>output.txt <nthreads>;10 20 32 4096 4096 1000 none <inputdir>output.txt <nthreads>;10 20 64 8192 8192 1000 none <inputdir>output.txt <nthreads>;10 20 128 16384 16384 1000 none <inputdir>output.txt <nthreads>

swaptions;-ns 1 -sm 5 -nt <nthreads>;-ns 3 -sm 50 -nt <nthreads>;-ns 16 -sm 5000 -nt <nthreads>;-ns 32 -sm 10000 -nt <nthreads>;-ns 64 -sm 20000 -nt <nthreads>

vips;im_benchmark <inputdir>barbados_256x288.v <inputdir>output.v;im_benchmark <inputdir>barbados_256x288.v <inputdir>output.v;im_benchmark <inputdir>pomegranate_1600x1200.v <inputdir>output.v;im_benchmark <inputdir>vulture_2336x2336.v <inputdir>output.v;im_benchmark <inputdir>bigben_2662x5500.v <inputdir>output.v

x264;--quiet --qp 20 --partitions b8x8,i4x4 --ref 5 --direct auto --b-pyramid --weightb --mixed-refs --no-fast-pskip --me umh --subme 7 --analyse b8x8,i4x4 --threads <nthreads> -o <inputdir>eledream.264 <inputdir>eledream_32x18_1.y4m;--quiet --qp 20 --partitions b8x8,i4x4 --ref 5 --direct auto --b-pyramid --weightb --mixed-refs --no-fast-pskip --me umh --subme 7 --analyse b8x8,i4x4 --threads <nthreads> -o <inputdir>eledream.264 <inputdir>eledream_64x36_3.y4m;--quiet --qp 20 --partitions b8x8,i4x4 --ref 5 --direct auto --b-pyramid --weightb --mixed-refs --no-fast-pskip --me umh --subme 7 --analyse b8x8,i4x4 --threads <nthreads> -o <inputdir>eledream.264 <inputdir>eledream_640x360_8.y4m;--quiet --qp 20 --partitions b8x8,i4x4 --ref 5 --direct auto --b-pyramid --weightb --mixed-refs --no-fast-pskip --me umh --subme 7 --analyse b8x8,i4x4 --threads <nthreads> -o <inputdir>eledream.264 <inputdir>eledream_640x360_32.y4m;--quiet --qp 20 --partitions b8x8,i4x4 --ref 5 --direct auto --b-pyramid --weightb --mixed-refs --no-fast-pskip --me umh --subme 7 --analyse b8x8,i4x4 --threads <nthreads> -o <inputdir>eledream.264 <inputdir>eledream_640x360_128.y4m
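The usage lines above can be expanded into concrete command lines by substituting the two placeholders. The sketch below is an illustration only: expand_usage is a hypothetical helper, the input directory in the example is a made-up path, and the mapping of the five fields to test, dev, small, medium, and large is an assumption based on the order of the input sizes described earlier.

```python
# Assumed order of the five argument strings on each usage line.
INPUT_SIZES = ("test", "dev", "small", "medium", "large")


def expand_usage(line, nthreads, inputdir):
    """Turn one 'bench;args;args;...' line into {input size: command line}."""
    fields = line.strip().split(";")
    bench, arg_sets = fields[0], fields[1:]
    if len(arg_sets) != len(INPUT_SIZES):
        raise ValueError(f"expected {len(INPUT_SIZES)} argument strings, got {len(arg_sets)}")
    commands = {}
    for size, args in zip(INPUT_SIZES, arg_sets):
        # Fill in the <nthreads> and <inputdir> placeholders.
        args = args.replace("<nthreads>", str(nthreads)).replace("<inputdir>", inputdir)
        commands[size] = f"./{bench} {args}"
    return commands


line = ("swaptions;-ns 1 -sm 5 -nt <nthreads>;-ns 3 -sm 50 -nt <nthreads>;"
        "-ns 16 -sm 5000 -nt <nthreads>;-ns 32 -sm 10000 -nt <nthreads>;"
        "-ns 64 -sm 20000 -nt <nthreads>")
# The input directory is a hypothetical path; swaptions does not use it.
commands = expand_usage(line, 4, "/path/to/inputs/")
print(commands["small"])  # ./swaptions -ns 16 -sm 5000 -nt 4
```

Note that <inputdir> is concatenated directly with the file name in the usage lines, so the directory passed in should end with a slash.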

Benchmark	input	cpu_threads	start_from	sim_seconds	ROI time	cpu_threads	start_from	sim_seconds	ROI time
blackscholes	small	1		0.4799s	0.480s	2			0.24s
bodytrack				1.4s	1.4s
canneal				0.774s	0.772s
dedup				2.913s	2.912s
facesim
ferret
fluidanimate				2.597s	2.6s
freqmine				1.536s	1.54s
rtview
streamcluster				2.532s	2.532s
swaptions
vips
x264				0.4699s	0.468s
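As a worked example of reading the table, the two blackscholes entries can be compared directly. This sketch assumes the 0.480s value is the ROI time with 1 CPU thread and the 0.24s value is the ROI time with 2 CPU threads, which is how the column headers appear to line up; the helper names are hypothetical.

```python
def speedup(t_base, t_parallel):
    """Speedup of the parallel run relative to the baseline run."""
    return t_base / t_parallel


def efficiency(t_base, t_parallel, threads):
    """Parallel efficiency: speedup divided by the number of threads."""
    return speedup(t_base, t_parallel) / threads


# blackscholes, small input (values read from the table above)
roi_1_thread = 0.480  # seconds, 1 CPU thread
roi_2_threads = 0.24  # seconds, 2 CPU threads

s = speedup(roi_1_thread, roi_2_threads)
e = efficiency(roi_1_thread, roi_2_threads, threads=2)
print(f"speedup {s:.2f}, efficiency {e:.2f}")  # speedup 2.00, efficiency 1.00
```

Under that reading, blackscholes scales essentially linearly from 1 to 2 threads on the small input.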
