Some HPC material excerpted from the web
Source: https://computing.llnl.gov
Factors that determine a large-scale program's performance
* Application-related factors:
  * Algorithms
  * Dataset size
  * Memory usage pattern
  * Use of I/O
  * Communication patterns
  * Task granularity
  * Load balancing
  * Amdahl's Law

* Hardware factors:
  * Processor architecture
  * Memory hierarchy
  * I/O configuration
  * Network

* Software factors:
  * OS
  * Compiler
  * Preprocessor
  * Communication protocols
  * Libraries
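Amdahl's Law, listed among the application factors above, bounds the speedup a program can reach when only part of its runtime can be parallelized. The following is a minimal sketch, assuming the parallel part scales perfectly and the rest stays serial; the function name `amdahl_speedup` is only illustrative and does not come from the source.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup when `parallel_fraction` of the runtime
    scales perfectly across `workers` and the remainder stays serial."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if workers < 1:
        raise ValueError("workers must be >= 1")
    serial_fraction = 1.0 - parallel_fraction
    # Serial time is unchanged; parallel time shrinks by the worker count.
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    for n in (1, 8, 64, 1024):
        print(f"{n:5d} workers -> speedup <= {amdahl_speedup(0.95, n):.2f}")
```

With 95% of the runtime parallelizable, the bound approaches 20 however many workers are added, which is why the serial fraction matters so much at large scale.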
Performance analysis:
Timers, profilers, system statistics, and memory tools
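As a small illustration of the first two tool categories, the sketch below times a function with a wall-clock timer and then runs it under a profiler. It uses only the Python standard library (`time.perf_counter`, `cProfile`, `pstats`); the helper names and the `sum_of_squares` workload are hypothetical and stand in for real application code.

```python
import cProfile
import io
import pstats
import time


def timed(fn, *args, repeat: int = 3):
    """Return (best wall-clock seconds, result) over `repeat` runs."""
    best = float("inf")
    result = None
    for _ in range(repeat):
        start = time.perf_counter()
        result = fn(*args)
        best = min(best, time.perf_counter() - start)
    return best, result


def profile_report(fn, *args, top: int = 5) -> str:
    """Run `fn` under cProfile; return the `top` entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.runcall(fn, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


def sum_of_squares(n: int) -> int:
    # Hypothetical workload, present only so there is something to measure.
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    seconds, value = timed(sum_of_squares, 1_000_000)
    print(f"best of 3: {seconds:.4f} s, result = {value}")
    print(profile_report(sum_of_squares, 100_000))
```

Taking the best of several runs reduces noise from other activity on the machine; the profiler output then shows where the time goes inside the call.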
Learn a bit about hardware architecture:
* Intel Xeon 5500/5600
  * 4-core / 6-core
  * 2.4 / 2.8 GHz
  * Cache
    * L1 data: 32 KB, private
    * L1 instruction: 32 KB, private
    * L2: 256 KB, private
    * L3: 8 MB / 12 MB, shared
  * CPU-memory bandwidth: 32 GB/s
* Intel Xeon E5-2670
  * 8-core, 2.6 GHz
  * Cache
    * L1 data: 32 KB, private
    * L1 instruction: 32 KB, private
    * L2: 256 KB, private
    * L3: 20 MB, shared
  * CPU-memory bandwidth: 51.2 GB/s
* AMD processors
  * 2.2 GHz
  * Cache
    * L1 data: 64 KB (2-way)
    * L1 instruction: 64 KB (2-way)
    * L2: 512 KB, private
    * L3: 2 MB, shared
  * Direct Connect Architecture
  * CPU-memory bandwidth: 10.7 GB/s per Socket F
  * Socket-to-socket link bandwidth: 8 GB/s (2-way)
* 4x InfiniBand interconnect
  * SDR: 1.25 GB/s
  * DDR: 2.5 GB/s
  * QDR: 5 GB/s
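To put these numbers side by side, here is a rough back-of-the-envelope sketch. It assumes the bandwidth figures above are peak rates in gigabytes per second, which the notes do not state explicitly, and it ignores latency and protocol overhead, so the results are lower bounds on transfer time. The dictionary and function names are illustrative.

```python
# Peak rates copied from the notes above; the unit (GB/s) is an assumption.
PEAK_GB_PER_S = {
    "Xeon 5500/5600 CPU-memory": 32.0,
    "Xeon E5-2670 CPU-memory": 51.2,
    "AMD CPU-memory (per socket)": 10.7,
    "InfiniBand 4x SDR": 1.25,
    "InfiniBand 4x DDR": 2.5,
    "InfiniBand 4x QDR": 5.0,
}


def transfer_seconds(gigabytes: float, peak_gb_per_s: float) -> float:
    """Lower bound on the time to move `gigabytes` at the given peak rate."""
    if peak_gb_per_s <= 0:
        raise ValueError("peak_gb_per_s must be positive")
    return gigabytes / peak_gb_per_s


if __name__ == "__main__":
    payload_gb = 10.0  # hypothetical data volume
    for link, rate in PEAK_GB_PER_S.items():
        seconds = transfer_seconds(payload_gb, rate)
        print(f"{link:30s} {seconds:7.3f} s for {payload_gb:.0f} GB")
```

Under these assumptions, even the fastest interconnect listed is about an order of magnitude slower than local memory on the newer Xeon, which is one reason communication patterns appear among the performance factors.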
Learn something about NUMA
- Physical layout: each node has several (2-4) sockets, and each socket has several (4-8) CPU cores. Cores on the same socket share the L3 cache; socket-to-socket communication goes through the CPU-memory bus and is roughly 2x to 5x slower.
- Design considerations: CPU affinity (numactl --cpunodebind), a local memory allocation policy, and other compiler or runtime options (mpirun --bind-to-socket -bynode).
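CPU affinity can also be set from inside a program. The sketch below pins the calling process to a chosen set of cores with `os.sched_setaffinity`, which Python exposes on Linux only; the helper name `pin_to_cpus` is hypothetical. It covers CPU placement only and does not set a memory policy, for which a tool such as numactl remains the usual route.

```python
import os


def pin_to_cpus(cpus):
    """Pin the calling process to `cpus` where the OS supports it.

    Returns the resulting affinity set, or None when
    os.sched_setaffinity is unavailable (it is Linux-specific).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    allowed = os.sched_getaffinity(0)  # 0 means the calling process
    target = set(cpus) & allowed       # never request cores we may not use
    if not target:
        raise ValueError("none of the requested CPUs are available")
    os.sched_setaffinity(0, target)
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    # Assumption for illustration: cores 0-3 sit on the same socket.
    # The real mapping should be checked, e.g. with `numactl --hardware`.
    try:
        print("now running on CPUs:", pin_to_cpus({0, 1, 2, 3}))
    except ValueError as exc:
        print("could not pin:", exc)
```

Pinning early, before large allocations, tends to help because Linux by default allocates memory on the node of the core that first touches it.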
Finally and most importantly, a good algorithm.