By Ilya Grigorik on June 23, 2009

Measuring and optimizing IO performance is somewhat of a black art: the tools are there, the resources and discussions are plenty, but it is also incredibly easy to get lost in the forest. I speak from recent experience. After several false starts with filesystem optimization, RAID tweaking, and even app-level changes, it really helped to finally step back and revisit the basics. Many man pages and discussion threads later, a few useful realizations emerged: iostat is your best friend, but it can also be incredibly deceiving; refreshing your memory of disk latencies will go a long way; disks and filesystems are fast, but not that fast.

Monitoring IO Performance with iostat

If IO performance is suspect, iostat is your best friend. Having said that, the man pages are cryptic, so don't be surprised if you find yourself reading the source. To get started, identify the device in question and start a monitoring process:

# -d report device utilization
# -x output extended stats
# -k output rates in kB
# sample stats every 5 seconds for device /dev/sdi
$ iostat -dxk /dev/sdi 5

Next, allocate yourself a couple of hours to understand the output, or expect to find yourself down a wrong path in no time flat (been there, done that). iostat is a popular tool amongst the database crowd, so not surprisingly you'll find a lot of great discussions documenting its use. Depending on your application you will need to focus on different metrics, but as a gentle introduction let's take a look at await, svctm, and avgqu-sz:

  • await - The average time (in milliseconds) for I/O requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them.
  • svctm - The average service time (in milliseconds) for I/O requests that were issued to the device.
  • avgqu-sz - The average queue length of the requests that were issued to the device.

First off, await is a deceiving metric! Even though it claims to measure average time, it is better understood as an aggregate of the other metrics, so don't be misled by it: await = avgqu-sz * svctm / (%util/100). Ideally, await should be roughly equal to your svctm, which leads us to a corollary: your average queue size should ideally hover in the single digits. Understanding these variables alone can tell you volumes about the application generating the load.
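
To make the relationship concrete, here is a minimal Python sketch that derives await from the other three columns using the formula above. The helper name derived_await and the sample numbers are assumptions made up for illustration; they are not taken from a real iostat run.

```python
def derived_await(avgqu_sz: float, svctm_ms: float, util_pct: float) -> float:
    """Estimate await (ms) from avgqu-sz, svctm (ms), and %util.

    Hypothetical helper: applies await = avgqu-sz * svctm / (%util / 100).
    """
    if util_pct <= 0:
        raise ValueError("%util must be positive")
    return avgqu_sz * svctm_ms / (util_pct / 100.0)


# Made-up sample values, chosen only to illustrate the formula.
# A saturated device (100% util) with 4 requests queued and 5 ms service time:
busy = derived_await(avgqu_sz=4.0, svctm_ms=5.0, util_pct=100.0)
# A half-idle device with, on average, half a request in flight:
light = derived_await(avgqu_sz=0.5, svctm_ms=5.0, util_pct=50.0)

print(f"busy:  await = {busy:.1f} ms")
print(f"light: await = {light:.1f} ms")
```

In the second case await comes out equal to svctm, which is the ideal described above: requests are served as they arrive. In the first case each request spends most of its time waiting in the queue.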

Disk Latencies Refresher & EBS Performance

Disk access time is determined by the sum of several variables: spin-up, seek, rotational delay, and transfer time. Assuming your disk is not sleeping, we can discount the spin-up time, which leaves us with seek (time for the disk arm to find the track: ~10ms), rotational delay (time to get the right sector under the head: depends on RPM), and the actual transfer time. Hence, in the worst case we will take ~10ms to seek, 60s/7200RPM ~= 8ms in rotational delay, plus the read time. On average, for a 7.2k RPM disk this translates into roughly 5ms access time (~20ms in the worst case) to read the first byte!
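
The arithmetic behind those numbers can be sketched in a few lines of Python. The ~10ms seek figure comes from the text above; the function name and the half-rotation average are assumptions added for illustration, and transfer time is left out.

```python
def rotation_ms(rpm: int) -> float:
    """Time for one full platter rotation, in milliseconds."""
    return 60_000.0 / rpm


SEEK_WORST_MS = 10.0  # seek estimate quoted in the text (~10ms)

full_turn = rotation_ms(7200)          # worst case: wait a whole rotation
worst_case = SEEK_WORST_MS + full_turn
avg_rotational = full_turn / 2         # assumption: sector is half a turn away on average

print(f"full rotation at 7200 RPM: {full_turn:.2f} ms")
print(f"worst case seek + rotation: {worst_case:.2f} ms")
print(f"average rotational delay:   {avg_rotational:.2f} ms")
```

A full rotation works out to about 8.3ms, so the worst case lands just under the ~20ms quoted above before any data is transferred.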

Armed with this knowledge we can now put Amazon's EBS performance in context: on average our EBS mounts show 10~30ms svctm, which all things considered is not outrageous for a SAN. This number also dips into low single digits at night and on weekends, which points to the fact that, as with any shared resource, the performance of EBS degrades during the day. Having said that, a 6x performance difference based on time of day is definitely nothing to sneeze at, so let's hope Amazon is on top of this!

Average queue size (avgqu-sz) is a popular metric in DBA circles, but do be careful with it when running on a SAN or any multi-spindle device. Ideally, your queue size (avgqu-sz) for a single disk should be in single digits, which means that the underlying device is well matched to the IO load generated by the application. Conversely, if the queue size is artificially low, chances are your application code can benefit from some tuning: do less disk flushing, think about caching or buffering, or in other words, double check the assumption that IO is the bottleneck!

Disks, Filesystems and Facebook Case Study: Haystack

Average access time on our disks places some hard limits on the number of IOPS: at 5ms average, we get a very optimistic 200 req/s with no read time. Hence, if you're trying to store several hundred files a second, you might want to revisit the architecture or seriously think about switching to SSDs! Databases such as MySQL work around this constraint by minimizing the number of file handles, caching data, and using aggressive buffering techniques. Willing to potentially lose a little bit of data with InnoDB? Set innodb_flush_log_at_trx_commit to 2 to avoid flushing on every transaction in favor of a periodic one second flush. In similar fashion, you can tweak your MyISAM key buffers, or even place your index and data files on different drives.
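
The 200 req/s ceiling follows directly from the access time. A small Python sketch, assuming one request is served at a time and ignoring transfer time (the function name is made up for illustration):

```python
def max_requests_per_second(access_ms: float) -> float:
    """Upper bound on serial requests per second for a given access time."""
    return 1000.0 / access_ms


# 5 ms is the average quoted in the text; 10 and 20 ms are shown for comparison.
for access_ms in (5.0, 10.0, 20.0):
    limit = max_requests_per_second(access_ms)
    print(f"{access_ms:>4.0f} ms access time -> at most {limit:.0f} req/s")
```

At the 10~30ms service times seen on a shared device, the same calculation gives a ceiling well below 200 req/s, which is why batching and buffering writes matters so much.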

The Facebook team recently released the details of their Haystack photo storage system, which serves as a great case study of working around IO bottlenecks: over 15PB of photo storage, and ~360 new photos being uploaded every second as of April '09. To meet the requirements, they dropped the POSIX filesystem semantics and went for an append-only structure with a separate in-memory index which stores the direct inode offsets for each photo. As a result, each photo access is translated into a single IO request, which is a huge win. Read through it: fascinating stuff, and an illustrative example of optimizing for IO.
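
The general idea (an append-only data file plus an in-memory index of offsets) can be illustrated with a toy Python sketch. This is not Haystack's actual format or API; the class name, method names, and file name are all hypothetical, and it leaves out everything a real system needs, such as index recovery, deletes, and concurrency.

```python
import os
import tempfile


class AppendOnlyStore:
    """Toy append-only blob store with an in-memory offset index.

    Hypothetical illustration only; not Haystack's real design.
    """

    def __init__(self, path: str) -> None:
        self._file = open(path, "a+b")  # writes always land at the end
        self._index = {}                # key -> (offset, length), kept in memory

    def put(self, key: str, data: bytes) -> None:
        self._file.seek(0, os.SEEK_END)
        offset = self._file.tell()      # where this blob will start
        self._file.write(data)
        self._file.flush()
        self._index[key] = (offset, len(data))

    def get(self, key: str) -> bytes:
        offset, length = self._index[key]  # lookup costs no disk IO
        self._file.seek(offset)
        return self._file.read(length)     # one positioned read

    def close(self) -> None:
        self._file.close()


with tempfile.TemporaryDirectory() as tmp:
    store = AppendOnlyStore(os.path.join(tmp, "photos.dat"))
    store.put("cat.jpg", b"cat-bytes")
    store.put("dog.jpg", b"dog-bytes")
    assert store.get("cat.jpg") == b"cat-bytes"
    print(store.get("dog.jpg"))
    store.close()
```

Because the index lives in memory, a read never has to walk directory or metadata structures on disk: it seeks straight to the stored offset and reads the blob in one request.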


Ilya Grigorik is a web performance engineer and developer advocate on the Make The Web Fast team at Google, where he spends his days and nights on making the web fast and driving adoption of performance best practices.
