Fixing hung_task_timeout_secs and "blocked for more than 120 seconds" on Linux

The Linux system stops responding, and /var/log/messages fills with errors of the form "task ... blocked for more than 120 seconds." together with the hint "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

Cause:

By default, older Linux kernels allow up to 40% of available memory to be used as file-system cache. Once that threshold is exceeded, the kernel flushes the cached data to disk, and subsequent I/O requests become synchronous. Meanwhile, the kernel's hung-task watchdog reports any task that stays blocked for more than 120 seconds (the default value of kernel.hung_task_timeout_secs). The message above therefore means the I/O subsystem is too slow to drain the dirty cache within 120 seconds. As I/O falls further behind, more and more requests pile up, memory fills completely, and the system stops responding.
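
To confirm you are hitting this watchdog rather than some other stall, you can check the watchdog setting and count the log entries (a quick sketch; /var/log/messages is the RHEL/CentOS default log path):

# Current hung-task watchdog timeout (120 by default):
sysctl kernel.hung_task_timeout_secs
# How often the error has been logged (path is distro-dependent):
grep -c "blocked for more than 120 seconds" /var/log/messages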

Solution:

Tune vm.dirty_ratio and vm.dirty_background_ratio to suit the application. For example, the following settings are a reasonable starting point and take effect immediately:
# sysctl -w vm.dirty_ratio=10
# sysctl -w vm.dirty_background_ratio=5

To make the change permanent, add the following two lines to /etc/sysctl.conf:
# vi /etc/sysctl.conf

vm.dirty_background_ratio = 5
vm.dirty_ratio = 10

Then run sysctl -p to load the file, or reboot, for the settings to take effect.
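
To verify the values currently in effect:

# Both parameters can be queried in one call:
sysctl vm.dirty_ratio vm.dirty_background_ratio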

About the Cache

The file cache is an important performance tool. Read caching is almost always a clear win (programs read data straight from RAM), while write caching is trickier. The Linux kernel stages disk writes in the cache and flushes them to disk asynchronously at intervals. This speeds up I/O, but it carries risk: data that has not yet reached disk can be lost.

The cache can also be overwhelmed. Writing a large amount of data to disk at once can stall the system: the kernel decides there is too much dirty data to write out asynchronously in time and switches to synchronous writes. (Asynchronous means the process keeps running while data is written; synchronous means the writing process must wait until the I/O completes.)

The good news is that the write cache is configurable to match your workload.
Take a look at these parameters:

[root@host ~]# sysctl -a | grep dirty
vm.dirty_background_ratio = 10
vm.dirty_background_bytes = 0
vm.dirty_ratio = 20
vm.dirty_bytes = 0
vm.dirty_writeback_centisecs = 500
vm.dirty_expire_centisecs = 3000

vm.dirty_background_ratio is the percentage of memory that can be filled with "dirty" pages, pages that still need to be written to disk, before the pdflush/flush/kdmflush background processes kick in to write them out. With the value 10 above and 32 GB of RAM, about 3.2 GB of dirty data can sit in memory before the background processes start cleaning it.
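
As a rough sketch, you can compute that threshold yourself from /proc/meminfo (the kernel's real calculation is based on "dirtyable" memory rather than MemTotal, so treat this as an approximation):

# Approximate background-writeback threshold in MiB:
mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
ratio=$(sysctl -n vm.dirty_background_ratio)
echo "background threshold ~ $(( mem_kb * ratio / 100 / 1024 )) MiB"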

vm.dirty_ratio is the absolute limit on dirty data: the percentage of memory filled with dirty pages may not exceed this value. When it is reached, new I/O requests block until dirty pages have been written to disk. This is a common cause of long I/O pauses, but it is also the safeguard that keeps too much dirty data from accumulating in memory.

vm.dirty_expire_centisecs is how long dirty data may live in the cache before it must be written out, 30 seconds here. When pdflush/flush/kdmflush wake up, they check for data older than this limit and write it to disk asynchronously; after all, data that sits in memory too long risks being lost.

vm.dirty_writeback_centisecs is how often the pdflush/flush/kdmflush processes wake up to check whether there is work to do.
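
Both values are in hundredths of a second, so the defaults above mean "expire after 30 s, wake the flushers every 5 s":

# Centiseconds to seconds: 3000 = 30 s, 500 = 5 s
sysctl vm.dirty_expire_centisecs vm.dirty_writeback_centisecs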

You can check how much dirty data is currently in memory like this:

[root@host ~]# cat /proc/vmstat | egrep "dirty|writeback"
nr_dirty 69
nr_writeback 0
nr_writeback_temp 0

This says I have 69 pages of dirty data waiting to be written to disk.
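
Since nr_dirty counts pages, here is a small sketch to convert it to KiB (the page size is 4096 bytes on most x86_64 systems):

# Convert the dirty-page count into KiB:
pages=$(awk '/^nr_dirty / {print $2}' /proc/vmstat)
echo "$(( pages * $(getconf PAGE_SIZE) / 1024 )) KiB dirty"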


Scenario 1: Decreasing the Cache

You can pick values to suit the task at hand.
In some setups there is a fast disk subsystem with its own big, battery-backed NVRAM cache, so holding data at the OS layer is comparatively risky. In that case we want the system to push data to disk more promptly.
Add the following two lines to /etc/sysctl.conf and run "sysctl -p":

vm.dirty_background_ratio = 5
vm.dirty_ratio = 10

This is typical for virtual machines. Don't set the parameters to zero, though; a little background I/O improves the performance of some programs.


Scenario 2: Increasing the Cache

In some scenarios a larger cache helps. For example, when the data is not critical and losing it would be acceptable, and a program reads and writes the same file repeatedly, allowing more cache lets you do more of that work in RAM and speeds it up.

vm.dirty_background_ratio = 50
vm.dirty_ratio = 80

Sometimes vm.dirty_expire_centisecs is raised as well, to let dirty data linger longer.


Scenario 3: Both Directions

Sometimes the system has to absorb sudden bursts of data that would bog down the disk (batch jobs at the top of every hour, for instance).
In that case, allow more dirty data to sit in memory and let the background processes drain it to disk asynchronously over time:

vm.dirty_background_ratio = 5
vm.dirty_ratio = 80

Here the background processes start flushing asynchronously as soon as dirty data reaches 5%, but the system does not force synchronous writes to disk until it reaches 80%. This makes the I/O much smoother.
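
One way to watch this behavior live during a burst (a sketch using standard tools; refreshes once a second, Ctrl-C to stop):

# Live view of dirty/writeback counters while a burst is running:
watch -n1 'grep -E "^nr_dirty|^nr_writeback" /proc/vmstat'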


You can find more information to guide tuning in /proc/vmstat, /proc/meminfo, and /proc/sys/vm.

While tuning database performance over the past couple of days, I needed to reduce the impact of the OS file cache on the database, so I looked into ways of shrinking the file-system cache. One method is to adjust the /proc/sys/vm/dirty_background_ratio and /proc/sys/vm/dirty_ratio parameters. I read plenty of blog posts about them but could never pin down the difference between the two, until the English post below made it clear.

vm.dirty_background_ratio: when dirty pages in the file-system cache reach this percentage of system memory (say 5%), the pdflush/flush/kdmflush background writeback processes are triggered and flush some of the dirty pages to storage asynchronously;
 
vm.dirty_ratio: when dirty pages reach this percentage of system memory (say 10%), the system must start dealing with them itself (by now there are so many dirty pages that some must be flushed to storage to avoid data loss); during this, many application processes may block while the system handles the file I/O.
 
I used to believe, wrongly, that the vm.dirty_ratio threshold could never be reached, since vm.dirty_background_ratio would always be hit first. It is indeed hit first, and the flush processes then begin writing back asynchronously, but applications can keep writing while that happens. If the applications dirty pages faster than the flushers clean them, the total eventually climbs to the bar set by vm.dirty_ratio, at which point the OS switches to handling the dirty pages synchronously and blocks the writing applications.
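
On reasonably modern kernels you can see both limits, expressed in pages, directly in /proc/vmstat (the numbers below are illustrative only):

grep -E "^nr_dirty" /proc/vmstat
# nr_dirty 69
# nr_dirty_threshold 524288              <- derived from vm.dirty_ratio
# nr_dirty_background_threshold 262144   <- derived from vm.dirty_background_ratio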
 

The original post:

Better Linux Disk Caching & Performance with vm.dirty_ratio & vm.dirty_background_ratio

by Bob Plankers, December 22, 2013, in Best Practices, Cloud, System Administration, Virtualization

 

This is post #16 in my December 2013 series about Linux Virtual Machine Performance Tuning. For more, please see the tag “Linux VM Performance Tuning.”

In previous posts on vm.swappiness and using RAM disks we talked about how the memory on a Linux guest is used for the OS itself (the kernel, buffers, etc.), applications, and also for file cache. File caching is an important performance improvement, and read caching is a clear win in most cases, balanced against applications using the RAM directly. Write caching is trickier. The Linux kernel stages disk writes into cache, and over time asynchronously flushes them to disk. This has a nice effect of speeding disk I/O but it is risky. When data isn’t written to disk there is an increased chance of losing it.

There is also the chance that a lot of I/O will overwhelm the cache, too. Ever written a lot of data to disk all at once, and seen large pauses on the system while it tries to deal with all that data? Those pauses are a result of the cache deciding that there’s too much data to be written asynchronously (as a non-blocking background operation, letting the application process continue), and switches to writing synchronously (blocking and making the process wait until the I/O is committed to disk). Of course, a filesystem also has to preserve write order, so when it starts writing synchronously it first has to destage the cache. Hence the long pause.

The nice thing is that these are controllable options, and based on your workloads & data you can decide how you want to set them up. Let’s take a look:

$ sysctl -a | grep dirty
vm.dirty_background_ratio = 10
vm.dirty_background_bytes = 0
vm.dirty_ratio = 20
vm.dirty_bytes = 0
vm.dirty_writeback_centisecs = 500
vm.dirty_expire_centisecs = 3000

vm.dirty_background_ratio is the percentage of system memory that can be filled with “dirty” pages — memory pages that still need to be written to disk — before the pdflush/flush/kdmflush background processes kick in to write it to disk. My example is 10%, so if my virtual server has 32 GB of memory that’s 3.2 GB of data that can be sitting in RAM before something is done.

vm.dirty_ratio is the absolute maximum amount of system memory that can be filled with dirty pages before everything must get committed to disk. When the system gets to this point all new I/O blocks until dirty pages have been written to disk. This is often the source of long I/O pauses, but is a safeguard against too much data being cached unsafely in memory.

vm.dirty_background_bytes and vm.dirty_bytes are another way to specify these parameters. If you set the _bytes version the _ratio version will become 0, and vice-versa.
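
A quick sketch of that mutual exclusion (run as root; 268435456 bytes = 256 MB, an arbitrary illustrative value):

# Setting the _bytes variant zeroes the corresponding _ratio:
sysctl -w vm.dirty_bytes=268435456
sysctl vm.dirty_ratio    # now reports vm.dirty_ratio = 0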

vm.dirty_expire_centisecs is how long something can be in cache before it needs to be written. In this case it’s 30 seconds. When the pdflush/flush/kdmflush processes kick in they will check to see how old a dirty page is, and if it’s older than this value it’ll be written asynchronously to disk. Since holding a dirty page in memory is unsafe this is also a safeguard against data loss.

vm.dirty_writeback_centisecs is how often the pdflush/flush/kdmflush processes wake up and check to see if work needs to be done.

You can also see statistics on the page cache in /proc/vmstat:

$ cat /proc/vmstat | egrep "dirty|writeback"
nr_dirty 878
nr_writeback 0
nr_writeback_temp 0

In my case I have 878 dirty pages waiting to be written to disk.

Approach 1: Decreasing the Cache

As with most things in the computer world, how you adjust these depends on what you’re trying to do. In many cases we have fast disk subsystems with their own big, battery-backed NVRAM caches, so keeping things in the OS page cache is risky. Let’s try to send I/O to the array in a more timely fashion and reduce the chance our local OS will, to borrow a phrase from the service industry, be “in the weeds.” To do this we lower vm.dirty_background_ratio and vm.dirty_ratio by adding new numbers to /etc/sysctl.conf and reloading with “sysctl -p”:

vm.dirty_background_ratio = 5
vm.dirty_ratio = 10

This is a typical approach on virtual machines, as well as Linux-based hypervisors. I wouldn’t suggest setting these parameters to zero, as some background I/O is nice to decouple application performance from short periods of higher latency on your disk array & SAN (“spikes”).

Approach 2: Increasing the Cache

There are scenarios where raising the cache dramatically has positive effects on performance. These situations are where the data contained on a Linux guest isn’t critical and can be lost, and usually where an application is writing to the same files repeatedly or in repeatable bursts. In theory, by allowing more dirty pages to exist in memory you’ll rewrite the same blocks over and over in cache, and just need to do one write every so often to the actual disk. To do this we raise the parameters:

vm.dirty_background_ratio = 50
vm.dirty_ratio = 80

Sometimes folks also increase the vm.dirty_expire_centisecs parameter to allow more time in cache. Beyond the increased risk of data loss, you also run the risk of long I/O pauses if that cache gets full and needs to destage, because on large VMs there will be a lot of data in cache.

Approach 3: Both Ways

There are also scenarios where a system has to deal with infrequent, bursty traffic to slow disk (batch jobs at the top of the hour, midnight, writing to an SD card on a Raspberry Pi, etc.). In that case an approach might be to allow all that write I/O to be deposited in the cache so that the background flush operations can deal with it asynchronously over time:

vm.dirty_background_ratio = 5
vm.dirty_ratio = 80

Here the background processes will start writing right away when it hits that 5% ceiling but the system won’t force synchronous I/O until it gets to 80% full. From there you just size your system RAM and vm.dirty_ratio to be able to consume all the written data. Again, there are tradeoffs with data consistency on disk, which translates into risk to data. Buy a UPS and make sure you can destage cache before the UPS runs out of power. :)

No matter the route you choose you should always be gathering hard data to support your changes and help you determine if you are improving things or making them worse. In this case you can get data from many different places, including the application itself, /proc/vmstat, /proc/meminfo, iostat, vmstat, and many of the things in /proc/sys/vm. Good luck!
