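When I got in this morning, I found that Oracle could not be reached.

On the host, only a single oracleora11g process was left; all the other processes were gone.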

Nov 14 23:33:30 hs-test-10-20-30-15 kernel: INFO: task sadc:14833 blocked for more than 120 seconds.
Nov 14 23:33:30 hs-test-10-20-30-15 kernel: Not tainted 2.6.32-431.el6.x86_64 #1
Nov 14 23:33:30 hs-test-10-20-30-15 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 14 23:33:30 hs-test-10-20-30-15 kernel: sadc D 0000000000000000 0 14833 14832 0x00000084
Nov 14 23:33:30 hs-test-10-20-30-15 kernel: ffff88061533bdc8 0000000000000086 0000000000000000 ffff88061533bde8
Nov 14 23:33:30 hs-test-10-20-30-15 kernel: ffff88061533bd88 ffffffff8111f3e0 ffff880528dab9d0 ffff88061533bde8
Nov 14 23:33:30 hs-test-10-20-30-15 kernel: ffff880614125af8 ffff88061533bfd8 000000000000fbc8 ffff880614125af8
Nov 14 23:33:30 hs-test-10-20-30-15 kernel: Call Trace:
Nov 14 23:33:30 hs-test-10-20-30-15 kernel: [<ffffffff8111f3e0>] ? find_get_pages_tag+0x40/0x130
Nov 14 23:33:30 hs-test-10-20-30-15 kernel: [<ffffffffa02b65a5>] jbd2_log_wait_commit+0xc5/0x140 [jbd2]
Nov 14 23:33:30 hs-test-10-20-30-15 kernel: [<ffffffff8109b2a0>] ? autoremove_wake_function+0x0/0x40
Nov 14 23:33:30 hs-test-10-20-30-15 kernel: [<ffffffff81134c91>] ? do_writepages+0x21/0x40
Nov 14 23:33:30 hs-test-10-20-30-15 kernel: [<ffffffffa02b6938>] jbd2_complete_transaction+0x68/0xb0 [jbd2]
Nov 14 23:33:30 hs-test-10-20-30-15 kernel: [<ffffffffa02d2231>] ext4_sync_file+0x121/0x1d0 [ext4]
Nov 14 23:33:30 hs-test-10-20-30-15 kernel: [<ffffffff811baa61>] vfs_fsync_range+0xa1/0x100
Nov 14 23:33:30 hs-test-10-20-30-15 kernel: [<ffffffff811bab2d>] vfs_fsync+0x1d/0x20
Nov 14 23:33:30 hs-test-10-20-30-15 kernel: [<ffffffff811bab6e>] do_fsync+0x3e/0x60
Nov 14 23:33:30 hs-test-10-20-30-15 kernel: [<ffffffff811baba3>] sys_fdatasync+0x13/0x20
Nov 14 23:33:30 hs-test-10-20-30-15 kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
Nov 15 00:01:29 hs-test-10-20-30-15 kernel: INFO: task NetworkManager:2081 blocked for more than 120 seconds.
Nov 15 00:01:29 hs-test-10-20-30-15 kernel: Not tainted 2.6.32-431.el6.x86_64 #1
Nov 15 00:01:29 hs-test-10-20-30-15 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 15 00:01:29 hs-test-10-20-30-15 kernel: NetworkManage D 0000000000000001 0 2081 1 0x00000080
Nov 15 00:01:29 hs-test-10-20-30-15 kernel: ffff880614185dc8 0000000000000082 0000000000000000 ffff880613b13e80
Nov 15 00:01:29 hs-test-10-20-30-15 kernel: 0000000000000000 ffff880612e5e0d0 0000000000000000 0000000000000000
Nov 15 00:01:29 hs-test-10-20-30-15 kernel: ffff88061464bab8 ffff880614185fd8 000000000000fbc8 ffff88061464bab8
Nov 15 00:01:29 hs-test-10-20-30-15 kernel: Call Trace:
Nov 15 00:01:29 hs-test-10-20-30-15 kernel: [<ffffffffa02b65a5>] jbd2_log_wait_commit+0xc5/0x140 [jbd2]
Nov 15 00:01:29 hs-test-10-20-30-15 kernel: [<ffffffff8109b2a0>] ? autoremove_wake_function+0x0/0x40
Nov 15 00:01:29 hs-test-10-20-30-15 kernel: [<ffffffff81134c91>] ? do_writepages+0x21/0x40
Nov 15 00:01:29 hs-test-10-20-30-15 kernel: [<ffffffffa02b6938>] jbd2_complete_transaction+0x68/0xb0 [jbd2]
Nov 15 00:01:29 hs-test-10-20-30-15 kernel: [<ffffffffa02d2231>] ext4_sync_file+0x121/0x1d0 [ext4]
Nov 15 00:01:29 hs-test-10-20-30-15 kernel: [<ffffffff811baa61>] vfs_fsync_range+0xa1/0x100
Nov 15 00:01:29 hs-test-10-20-30-15 kernel: [<ffffffff811bab2d>] vfs_fsync+0x1d/0x20
Nov 15 00:01:29 hs-test-10-20-30-15 kernel: [<ffffffff811bab6e>] do_fsync+0x3e/0x60
Nov 15 00:01:29 hs-test-10-20-30-15 kernel: [<ffffffff811babc0>] sys_fsync+0x10/0x20
Nov 15 00:01:29 hs-test-10-20-30-15 kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: INFO: task NetworkManager:2081 blocked for more than 120 seconds.
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: Not tainted 2.6.32-431.el6.x86_64 #1
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: NetworkManage D 0000000000000001 0 2081 1 0x00000080
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: ffff880614185dc8 0000000000000082 0000000000000000 ffff880613b13e80
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: 0000000000000000 ffff880612e5e0d0 0000000000000000 0000000000000000
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: ffff88061464bab8 ffff880614185fd8 000000000000fbc8 ffff88061464bab8
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: Call Trace:
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: [<ffffffffa02b65a5>] jbd2_log_wait_commit+0xc5/0x140 [jbd2]
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: [<ffffffff8109b2a0>] ? autoremove_wake_function+0x0/0x40
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: [<ffffffff81134c91>] ? do_writepages+0x21/0x40
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: [<ffffffffa02b6938>] jbd2_complete_transaction+0x68/0xb0 [jbd2]
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: [<ffffffffa02d2231>] ext4_sync_file+0x121/0x1d0 [ext4]
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: [<ffffffff811baa61>] vfs_fsync_range+0xa1/0x100
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: [<ffffffff811bab2d>] vfs_fsync+0x1d/0x20
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: [<ffffffff811bab6e>] do_fsync+0x3e/0x60
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: [<ffffffff811babc0>] sys_fsync+0x10/0x20
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: INFO: task sadc:15210 blocked for more than 120 seconds.
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: Not tainted 2.6.32-431.el6.x86_64 #1
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: sadc D 0000000000000000 0 15210 15209 0x00000084
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: ffff88091ed9bdc8 0000000000000082 0000000000000000 ffff88091ed9bde8
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: ffff88091ed9bd88 ffffffff8111f3e0 ffff88008f60a9d0 ffff88091ed9bde8
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: ffff88061439bab8 ffff88091ed9bfd8 000000000000fbc8 ffff88061439bab8
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: Call Trace:
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: [<ffffffff8111f3e0>] ? find_get_pages_tag+0x40/0x130
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: [<ffffffffa02b65a5>] jbd2_log_wait_commit+0xc5/0x140 [jbd2]
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: [<ffffffff8109b2a0>] ? autoremove_wake_function+0x0/0x40

Cause and troubleshooting approach:

Under heavy IO load on servers you may see something like:

INFO: task nfsd:2252 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

...probably followed by a call trace that mentions your filesystem, and probably io_schedule and sync_buffer.

This message is not an error.

It indicates that a program has had to wait for a very long time, and the call trace shows what it was doing while it waited. The trace says little about the cause, though: the real IO load often comes from another process.
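Since the first line of each report names the task, its PID and the time, it can help to tally the reports before reading individual traces. The following is only a minimal sketch: it assumes the syslog line layout seen in the log above (timestamp, hostname, `kernel:`), and the names `HUNG_RE`, `summarize_hung_tasks` and `SAMPLE` are made up for illustration.

```python
import re
from collections import Counter

# Matches the first line of a hung-task report in the syslog layout shown
# in the log above. Other syslog formats would need a different pattern.
HUNG_RE = re.compile(
    r"^\w{3}\s+\d+\s+(?P<hour>\d{2}):\d{2}:\d{2}\s+\S+\s+kernel:\s+"
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds\."
)


def summarize_hung_tasks(lines):
    """Count hung-task reports per task name and per hour of day."""
    by_task = Counter()
    by_hour = Counter()
    for line in lines:
        match = HUNG_RE.match(line)
        if match:
            by_task[match.group("comm")] += 1
            by_hour[match.group("hour")] += 1
    return by_task, by_hour


# Report header lines copied from the log above, plus one non-matching line.
SAMPLE = [
    "Nov 14 23:33:30 hs-test-10-20-30-15 kernel: INFO: task sadc:14833 blocked for more than 120 seconds.",
    "Nov 14 23:33:30 hs-test-10-20-30-15 kernel: Not tainted 2.6.32-431.el6.x86_64 #1",
    "Nov 15 00:01:29 hs-test-10-20-30-15 kernel: INFO: task NetworkManager:2081 blocked for more than 120 seconds.",
    "Nov 15 00:03:29 hs-test-10-20-30-15 kernel: INFO: task NetworkManager:2081 blocked for more than 120 seconds.",
    "Nov 15 00:03:29 hs-test-10-20-30-15 kernel: INFO: task sadc:15210 blocked for more than 120 seconds.",
]

if __name__ == "__main__":
    tasks, hours = summarize_hung_tasks(SAMPLE)
    print("reports per task:", dict(tasks))
    print("reports per hour:", dict(hours))
```

In practice the lines would come from the system log file rather than a hard-coded list. A cluster of reports in the same hour every night points towards a scheduled job rather than a steady overload.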

The code behind this sits in hung_task.c and was added somewhere around 2.6.30. It is a kernel thread that detects tasks that stay in the D state for a while (which typically means they are waiting for IO).

It complains when it sees that a process has been waiting on IO for so long that the whole process has not been scheduled for any CPU time for 120 seconds (the default).
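To see which processes are in the D state right now, one option is to read `/proc` directly. This is a rough sketch, not how the kernel detector itself works: it only looks at the state field of `/proc/<pid>/stat` (so it sees the main thread of each process, not every thread), and the function names are made up for illustration.

```python
import os


def parse_stat(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    left = stat_text.index("(")
    right = stat_text.rindex(")")
    pid = int(stat_text[:left].strip())
    comm = stat_text[left + 1:right]
    state = stat_text[right + 1:].split()[0]
    return pid, comm, state


def tasks_in_d_state(proc_root="/proc"):
    """List (pid, comm) for processes currently in uninterruptible sleep."""
    found = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return found  # no procfs here (e.g. not Linux)
    for name in entries:
        if not name.isdigit():
            continue
        try:
            path = os.path.join(proc_root, name, "stat")
            with open(path, errors="replace") as handle:
                pid, comm, state = parse_stat(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited during the scan, or unreadable entry
        if state == "D":
            found.append((pid, comm))
    return found


def hung_task_timeout(path="/proc/sys/kernel/hung_task_timeout_secs"):
    """Return the configured timeout in seconds, or None if unavailable."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    print("hung_task_timeout_secs:", hung_task_timeout())
    for pid, comm in tasks_in_d_state():
        print(pid, comm)
```

A single snapshot proves little, since processes pass through the D state briefly all the time. Running it a few times and watching for the same PID to stay listed is closer to what the detector reports.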

Notes:

  • if it happens constantly, your IO system is slower than your IO use
  • it is most likely to happen to a process that was ioniced into the idle class. In that case ionice is working as intended: the idle class is meant to be extremely polite, and the message just indicates that something else has been doing a lot of IO (for at least 120 seconds)
    e.g. updatedb (may be the victim if it was ioniced, the cause if not)
  • if it happens only nightly, look at your cron jobs
  • a thrashing system can cause this, and then it is purely a side effect of one program using too much RAM
  • being blocked by a desktop-class drive with bad sectors can cause this (because such drives retry for a long while)
  • NFS seems to be a common culprit, probably because it is good at filling the writeback cache, which implies blocking while writeback happens, and that is likely to block various things related to the same filesystem. (verify)
  • if it happens on a fileserver, you may want to consider spreading the load over more fileservers, or using a parallel filesystem
    if your load is fairly sequential, you may get some relief from using the noop io scheduler (instead of cfq), though note that this disables ionice
    if your load is relatively random, upping the queue depth may help
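Before changing the scheduler or queue depth it is worth checking what a device currently uses. The sketch below reads the block-layer settings from sysfs; it assumes that "queue depth" refers to the `nr_requests` value under `/sys/block/<device>/queue/`, and the function names and the `sda` device name are placeholders, not something taken from the host above.

```python
import os


def active_scheduler(text):
    """Return the active scheduler from a queue/scheduler file.

    The file lists the available schedulers and marks the active one with
    square brackets, e.g. "noop anticipatory deadline [cfq]".
    """
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


def read_queue_settings(device, sys_root="/sys/block"):
    """Read scheduler and nr_requests for one block device, if present."""
    queue_dir = os.path.join(sys_root, device, "queue")
    settings = {}
    for key in ("scheduler", "nr_requests"):
        try:
            with open(os.path.join(queue_dir, key)) as handle:
                settings[key] = handle.read().strip()
        except OSError:
            settings[key] = None  # device or attribute not available
    if settings["scheduler"] is not None:
        settings["active_scheduler"] = active_scheduler(settings["scheduler"])
    return settings


if __name__ == "__main__":
    # "sda" is only an example device name.
    print(read_queue_settings("sda"))
```

This only reads the values. Changing them means writing to the same sysfs files as root, and such a change does not persist across reboots unless it is also configured at boot.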
