这两天在调优数据库性能的过程中需要降低操作系统文件Cache对数据库性能的影响，故调研了一些降低文件系统缓存大小的方法，其中一种是通过修改/proc/sys/vm/dirty_background_ration以及/proc/sys/vm/dirty_ratio两个参数的大小来实现。看了不少相关博文的介绍，不过一直弄不清楚这两个参数的区别在哪里，后来看了下面的一篇英文博客才大致了解了它们的不同。

vm.dirty_background_ratio:这个参数指定了当文件系统缓存脏页数量达到系统内存百分之多少时（如5%）就会触发pdflush/flush/kdmflush等后台回写进程运行，将一定缓存的脏页异步地刷入外存；

vm.dirty_ratio:而这个参数则指定了当文件系统缓存脏页数量达到系统内存百分之多少时（如10%），系统不得不开始处理缓存脏页（因为此时脏页数量已经比较多，为了避免数据丢失需要将一定脏页刷入外存）；在此过程中很多应用进程可能会因为系统转而处理文件IO而阻塞。

之前一直错误的一位dirty_ratio的触发条件不可能达到，因为每次肯定会先达到vm.dirty_background_ratio的条件，后来才知道自己理解错了。确实是先达到vm.dirty_background_ratio的条件然后触发flush进程进行异步的回写操作，但是这一过程中应用进程仍然可以进行写操作，如果多个应用进程写入的量大于flush进程刷出的量那自然会达到vm.dirty_ratio这个参数所设定的坎，此时操作系统会转入同步地处理脏页的过程，阻塞应用进程。

附上原文：

Better Linux Disk Caching & Performance with vm.dirty_ratio & vm.dirty_background_ratio

by BOB PLANKERS on DECEMBER 22, 2013

in BEST PRACTICES,CLOUD,SYSTEM ADMINISTRATION,VIRTUALIZATION

This is post #16 in my December 2013 series about Linux Virtual Machine Performance Tuning. For more, please see the tag “Linux VM Performance Tuning.”

In previous posts on vm.swappiness and using RAM disks we talked about how the memory on a Linux guest is used for the OS itself (the kernel, buffers, etc.), applications, and also for file cache. File caching is an important performance improvement, and read caching is a clear win in most cases, balanced against applications using the RAM directly. Write caching is trickier. The Linux kernel stages disk writes into cache, and over time asynchronously flushes them to disk. This has a nice effect of speeding disk I/O but it is risky. When data isn’t written to disk there is an increased chance of losing it.

There is also the chance that a lot of I/O will overwhelm the cache, too. Ever written a lot of data to disk all at once, and seen large pauses on the system while it tries to deal with all that data? Those pauses are a result of the cache deciding that there’s too much data to be written asynchronously (as a non-blocking background operation, letting the application process continue), and switches to writing synchronously (blocking and making the process wait until the I/O is committed to disk). Of course, a filesystem also has to preserve write order, so when it starts writing synchronously it first has to destage the cache. Hence the long pause.

The nice thing is that these are controllable options, and based on your workloads & data you can decide how you want to set them up. Let’s take a look:

$ sysctl -a | grep dirty vm.dirty_background_ratio = 10 vm.dirty_background_bytes = 0 vm.dirty_ratio = 20 vm.dirty_bytes = 0 vm.dirty_writeback_centisecs = 500 vm.dirty_expire_centisecs = 3000

vm.dirty_background_ratio is the percentage of system memory that can be filled with “dirty” pages — memory pages that still need to be written to disk — before the pdflush/flush/kdmflush background processes kick in to write it to disk. My example is 10%, so if my virtual server has 32 GB of memory that’s 3.2 GB of data that can be sitting in RAM before something is done.

vm.dirty_ratio is the absolute maximum amount of system memory that can be filled with dirty pages before everything must get committed to disk. When the system gets to this point all new I/O blocks until dirty pages have been written to disk. This is often the source of long I/O pauses, but is a safeguard against too much data being cached unsafely in memory.

vm.dirty_background_bytes and vm.dirty_bytes are another way to specify these parameters. If you set the _bytes version the _ratio version will become 0, and vice-versa.

vm.dirty_expire_centisecs is how long something can be in cache before it needs to be written. In this case it’s 30 seconds. When the pdflush/flush/kdmflush processes kick in they will check to see how old a dirty page is, and if it’s older than this value it’ll be written asynchronously to disk. Since holding a dirty page in memory is unsafe this is also a safeguard against data loss.

vm.dirty_writeback_centisecs is how often the pdflush/flush/kdmflush processes wake up and check to see if work needs to be done.

You can also see statistics on the page cache in /proc/vmstat:

$ cat /proc/vmstat | egrep "dirty|writeback" nr_dirty 878 nr_writeback 0 nr_writeback_temp 0

In my case I have 878 dirty pages waiting to be written to disk.

Approach 1: Decreasing the Cache

As with most things in the computer world, how you adjust these depends on what you’re trying to do. In many cases we have fast disk subsystems with their own big, battery-backed NVRAM caches, so keeping things in the OS page cache is risky. Let’s try to send I/O to the array in a more timely fashion and reduce the chance our local OS will, to borrow a phrase from the service industry, be “in the weeds.” To do this we lower vm.dirty_background_ratio and vm.dirty_ratio by adding new numbers to /etc/sysctl.conf and reloading with “sysctl –p”:

vm.dirty_background_ratio = 5 vm.dirty_ratio = 10

This is a typical approach on virtual machines, as well as Linux-based hypervisors. I wouldn’t suggest setting these parameters to zero, as some background I/O is nice to decouple application performance from short periods of higher latency on your disk array & SAN (“spikes”).

Approach 2: Increasing the Cache

There are scenarios where raising the cache dramatically has positive effects on performance. These situations are where the data contained on a Linux guest isn’t critical and can be lost, and usually where an application is writing to the same files repeatedly or in repeatable bursts. In theory, by allowing more dirty pages to exist in memory you’ll rewrite the same blocks over and over in cache, and just need to do one write every so often to the actual disk. To do this we raise the parameters:

vm.dirty_background_ratio = 50 vm.dirty_ratio = 80

Sometimes folks also increase the vm.dirty_expire_centisecs parameter to allow more time in cache. Beyond the increased risk of data loss, you also run the risk of long I/O pauses if that cache gets full and needs to destage, because on large VMs there will be a lot of data in cache.

Approach 3: Both Ways

There are also scenarios where a system has to deal with infrequent, bursty traffic to slow disk (batch jobs at the top of the hour, midnight, writing to an SD card on a Raspberry Pi, etc.). In that case an approach might be to allow all that write I/O to be deposited in the cache so that the background flush operations can deal with it asynchronously over time:

vm.dirty_background_ratio = 5 vm.dirty_ratio = 80

Here the background processes will start writing right away when it hits that 5% ceiling but the system won’t force synchronous I/O until it gets to 80% full. From there you just size your system RAM and vm.dirty_ratio to be able to consume all the written data. Again, there are tradeoffs with data consistency on disk, which translates into risk to data. Buy a UPS and make sure you can destage cache before the UPS runs out of power. :)

No matter the route you choose you should always be gathering hard data to support your changes and help you determine if you are improving things or making them worse. In this case you can get data from many different places, including the application itself, /proc/vmstat, /proc/meminfo, iostat, vmstat, and many of the things in /proc/sys/vm. Good luck!

文件系统缓存dirty_ratio与dirty_background_ratio两个参数区别的更多相关文章

Linux 文件系统缓存dirty_ratio与dirty_background_ratio两个参数区别
文件系统缓存dirty_ratio与dirty_background_ratio两个参数区别 (2014-03-16 17:54:32) 转载▼ 标签: linux 文件系统缓存 cache dirt ...
(转)文件系统缓存dirty_ratio与dirty_background_ratio两个参数区别
这两天在调优数据库性能的过程中需要降低操作系统文件Cache对数据库性能的影响,故调研了一些降低文件系统缓存大小的方法,其中一种是通过修改/proc/sys/vm/dirty_background_r ...
[转载]文件系统缓存dirty_ratio与dirty_background_ra
原文地址:文件系统缓存dirty_ratio与dirty_background_ratio两个参数区别作者:vincent 这两天在调优数据库性能的过程中需要降低操作系统文件Cache对数据库性能的影 ...
[转载]C#缓存absoluteExpiration、slidingExpiration两个参数的疑惑
看了很多资料终于搞明白cache中absoluteExpiration,slidingExpiration这两个参数的含义. absoluteExpiration:用于设置绝对过期时间,它表示只要时间 ...
C#缓存absoluteExpiration、slidingExpiration两个参数的疑惑
看了很多资料终于搞明白cache中absoluteExpiration,slidingExpiration这两个参数的含义. absoluteExpiration:用于设置绝对过期时间,它表示只要时间 ...
maven跳过单元测试的两个参数区别
maven在打包过程中需要执行单元测试.但有些时候单元测试已经通过只是想打包时,想跳过测试.maven提供了两个参数跳过测试:maven.test.skip=true 和skipTests. 例子 m ...
js中ajax连接服务器open函数的另外两个默认参数get请求和默认异步（open的post方式send函数带参数）（post请求和get请求区别：get：快、简单 post：安全，量大，不缓存）（服务器同步和异步区别：同步：等待服务器响应当中浏览器不能做别的事情）（ajax和jquery一起用的）
js中ajax连接服务器open函数的另外两个默认参数get请求和默认异步(open的post方式send函数带参数)(post请求和get请求区别:get:快.简单 post:安全,量大,不缓存)( ...
Better Linux Disk Caching & Performance with vm.dirty_ratio & vm.dirty_background_ratio
In previous posts on vm.swappiness and using RAM disks we talked about how the memory on a Linux gue ...
wParam和lParam两个参数到底是什么意思？
在Windows的消息函数中,有两个非常熟悉的参数:wParam,lParam. 这两个参数的字面意义对于现在的程序来说已经不重要了,因为它是16位系统的产物,为了保持程序的可移植性,就将它保存了下来 ...

随机推荐

远控软件VNC攻击案例研究
欢迎大家给我投票: http://2010blog.51cto.com/350944 本文出自 "李晨光原创技术博客" 博客,谢绝转载!
openstack-dbs
真正的服务器派生出线程和子进程处理多个连接当允许客户端加入聊天室,他发送的任何一条文本都将广播给聊天室中的每个用户,除非文本是服务器CLI当广播一条消息,消息前面将加上发送者的昵称以尖括号括住昵称 ...
第二百一十五、六天 how can I 坚持
昨天刷机刷到很晚,博客都忘写了,刷了个flyme,用着没什么感觉,今天打电话试了下还有破音,有点小后悔.不行过两天再刷回来. 今天.mysql ifnull函数. 两条熊猫鱼都死了,这两天雾霾那么严重 ...
TBluetoothLE
delphi 蓝牙技术 D:\Users\Public\Documents\Embarcadero\Studio\17.0\Samples\Object Pascal\Multi-Device Sam ...
Spark RDD概念学习系列之RDD的转换（十）
RDD的转换 Spark会根据用户提交的计算逻辑中的RDD的转换和动作来生成RDD之间的依赖关系,同时这个计算链也就生成了逻辑上的DAG.接下来以“Word Count”为例,详细描述这个DAG生成的 ...
C# 多线程参数的使用
一个参数: Thread.Start方法可以带一个参数: public static void Main() { Thread t = new Thread(new ParameterizedThre ...
Socket小项目的一些心得（鸣谢传智的教学视频）
Socket是一种封装了四层通信的整体抽象入口,通常也称作"套接字",这是常用的四层通信这是访问Socket的流程图,这个分为客户端和服务器端,其中服务器端有以下步骤去建立,前面的 ...
newlsip 检查磁盘分区使用情况
主要还是用df -k这个命令,然后将输出结果全部逐行解析,最后调用REST API,发送给服务器保存. 参考代码: #!/usr/bin/newlisp (set 'cur-path "/o ...
hdoj 5417 Victor and Machine
题目链接:http://acm.hdu.edu.cn/showproblem.php?pid=5417 水题,一开始题目读错了做了好久,每次停工以后都是重新计时. 需要注意的是,先除后乘注意加括号 # ...
sql with(lock) 与事务
sql select查询语句表后面携带 with(nolock) 会获取到在事务中已经执行但还未完成提交的记录即使表被锁住也能查询到当事务最终执行失败时查询到的记录可能没有啦不 ...

文件系统缓存dirty_ratio与dirty_background_ratio两个参数区别