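Yesterday I was in Nanjing tuning a customer's databases. One of them had just gone live, and while reviewing its alert.log I found that for some time it had been reporting messages like the following every few hours: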


Thu Aug 21 09:01:26 2014
WARNING: Heavy swapping observed on system in last 5 mins.
pct of memory swapped in [8.42%] pct of memory swapped out [2.16%].
Please make sure there is no memory pressure and the SGA and PGA
are configured correctly. Look at DBRM trace file for more details.

Thu Aug 21 14:56:27 2014
WARNING: Heavy swapping observed on system in last 5 mins.
pct of memory swapped in [5.40%] pct of memory swapped out [8.63%].
Please make sure there is no memory pressure and the SGA and PGA
are configured correctly. Look at DBRM trace file for more details.

......

Sat Oct 18 22:13:48 2014
WARNING: Heavy swapping observed on system in last 5 mins.
pct of memory swapped in [7.76%] pct of memory swapped out [0.33%].
Please make sure there is no memory pressure and the SGA and PGA
are configured correctly. Look at DBRM trace file for more details.
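
To see how often the warning fires and what percentages it reports, the alert.log entries above can be pulled out with a short script. This is only a sketch: it assumes the timestamp and message layout shown in the excerpt, and the function name `parse_swap_warnings` is made up for illustration.

```python
import re
from datetime import datetime

# Timestamp lines in the excerpt look like "Thu Aug 21 09:01:26 2014".
TS_FORMAT = "%a %b %d %H:%M:%S %Y"
SWAP_RE = re.compile(
    r"pct of memory swapped in \[(?P<swap_in>[\d.]+)%\] "
    r"pct of memory swapped out \[(?P<swap_out>[\d.]+)%\]"
)


def parse_swap_warnings(lines):
    """Return (timestamp, swap_in_pct, swap_out_pct) for each warning found."""
    results = []
    last_ts = None
    for raw in lines:
        line = raw.strip()
        try:
            # Remember the most recent timestamp line seen so far.
            last_ts = datetime.strptime(line, TS_FORMAT)
            continue
        except ValueError:
            pass  # not a timestamp line
        match = SWAP_RE.search(line)
        if match:
            results.append(
                (last_ts, float(match["swap_in"]), float(match["swap_out"]))
            )
    return results


if __name__ == "__main__":
    sample = [
        "Thu Aug 21 09:01:26 2014",
        "WARNING: Heavy swapping observed on system in last 5 mins.",
        "pct of memory swapped in [8.42%] pct of memory swapped out [2.16%].",
    ]
    for ts, swap_in, swap_out in parse_swap_warnings(sample):
        print(ts, swap_in, swap_out)
```

In practice the list of lines would come from reading the instance's alert.log. The weekday and month names are parsed according to the current locale, so the script assumes an English (C) locale.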

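The customer's environment is an IBM P570 running AIX 6.1 with a single-instance Oracle 11.2.0.3 database. The machine has 64 GB of physical memory, of which only 20 GB is allocated to the SGA, using automatic memory management.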

A search on MOS shows that this is a bug on the AIX platform. The relevant document is [1508575.1].

Applicable database and platform:

Oracle Database - Enterprise Edition - Version 11.2.0.3 to 11.2.0.3 [Release 11.2]

IBM AIX on POWER Systems (64-bit)


Symptoms:

There is new warning message in alert.log in 11.2.0.3 similar to

WARNING: Heavy swapping observed on system in last 5 mins.

pct of memory swapped in [2.08%] pct of memory swapped out [0.12%].

Please make sure there is no memory pressure and the SGA and PGA 

are configured correctly. Look at DBRM trace file for more details.

On AIX platform this message can be seen even when there is no virtual memory swapping at all.    -- physical memory is sufficient and swap space is not used at all

You may compare the vmstat from AIX level with DBRM trace file entries to see the differences.


Cause:

The issue is caused by unpublished Bug:14731911.

 

Swap usage messages are based on statistics that do not reflect the actual usage.

The v$osstat does not reflect proper stats for the swap space paging.

Solution:

Apply Patch:11801934 on top of your IBM AIX on POWER Systems (64-bit) platform.

P.S: Bug is port-specific.    -- this bug is specific to one platform port

The issue is fixed in patchset 11.2.0.4 and release 12.1. 
  -- The note says this is fixed in 12.1, but in practice 12.1 still has the problem and reports an ORA-700 error. See document [ID 1919850.1].

Let's look at the description of BUG:14731911:

Bug attributes

Type: B - Defect
Fixed in product version:
Severity: 2 - Severe Loss of Service
Product version: 11.2.0.3
Status: 96 - Closed, Duplicate Bug
Platform: 212 - IBM AIX on POWER Systems (64-bit)
Created: 2012-10-8
Platform version: 6.1
Updated: 2014-10-11
Base bug: 11801934
Database version: 11.2.0.3
Affects platforms: Port-Specific
Product source: Oracle

Related products

Product line: Oracle Database Products
Family: Oracle Database Suite
Area: Oracle Database
Product: 5 - Oracle Database - Enterprise Edition
Hdr: 14731911 11.2.0.3 RDBMS 11.2.0.3 VOS PRODID-5 PORTID-212 11801934
Abstract: FALSE SWAP WARNING MESSAGES PRINTED TO ALERT.LOG ON AIX *** 10/08/12 04:52 am ***
 
 
  BUG TYPE CHOSEN
  ===============
  Code
 
  SubComponent: Virtual Operating System
  ======================================
  DETAILED PROBLEM DESCRIPTION
  ============================
  Oracle process seems to check wrong OS local statistic (which include also
  FILESYSTEM caching etc.)
 
  Alert log shows WARNING: Heavy swapping observed on system in last 5 mins.
  pct of memory swapped in [2.08%] pct of memory swapped out [0.12%].
  Please make sure there is no memory pressure and the SGA and PGA
  are configured correctly. Look at DBRM trace file for more details.
 
  but this is not reflected at OS level.
 
  DIAGNOSTIC ANALYSIS
  ===================
  1. nmon shows virtual memory swapping does not occur at all - see attached file -- nmon observed no swap activity at all
 
  2. Oracle Database Server is 11.2.0.3 and contains fix for 10220118
 
  3. Server configuration
  real mem: 144GB
  lowest value of fre memory : 87,65 GB -- plenty of free memory remains
 
  4. DBRM seems to use a wrong OS statistics - trace file is attached
 
  WORKAROUND?
  ===========
  No
 
  TECHNICAL IMPACT
  ================
  Wrong diagnostic analyze.
  Message is bothering customer's DBA when in fact the warning message is
  misleading
 
  RELATED ISSUES (bugs, forums, RFAs)
  ===================================
  http://myforums.oracle.com/jive3/thread.jspa?threadID=1104581
  10220118
 
  HOW OFTEN DOES THE ISSUE REPRODUCE AT CUSTOMER SITE?
  ====================================================
  Always
 
  DOES THE ISSUE REPRODUCE INTERNALLY?
  ====================================
  No
 
  EXPLAIN WHY THE ISSUE WAS NOT TESTED INTERNALLY.
  ================================================
  Unavailable Data Volume
 
  IS A TESTCASE AVAILABLE?
  ========================
  No
 
  Link to IPS Package:
  ====================
  not available

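DBRM (Database Resource Manager) is a background process introduced in 11g. It reports in alert.log whether the operating system has seen heavy swap activity in the last 5 minutes. On the AIX platform, because of BUG:14731911, this Oracle process falsely reports swap-in and swap-out activity. As we know, swap (usually sized at twice the physical memory) is only used when physical memory is genuinely insufficient, and swapping is very expensive because it reads from and writes to physical disk.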

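Personally, though, I do not think this bug does much harm. It only writes a WARNING to alert.log and has no further negative effect on the database, so whether to patch up to 11.2.0.4 is a matter of judgment. If you want a quiet alert.log, apply the patch.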

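Of course, if the swap activity really is caused by memory pressure on the OS, it must be treated differently, because it then does seriously affect the database. To tell a genuine memory shortage from a false report, analyze the system with monitoring tools such as nmon, topas and vmstat (on Linux you can also use free).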
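
As a rough illustration of the vmstat check, the sketch below reads captured vmstat text and reports whether any sample shows paging-space page-ins (pi) or page-outs (po). It assumes the usual AIX column header containing `pi` and `po`. The function names and the sample figures in the usage example are made up for illustration and are not taken from the customer's system.

```python
def paging_samples(vmstat_output):
    """Return a list of (pi, po) pairs, one per data line of vmstat output."""
    pi_idx = po_idx = None
    samples = []
    for line in vmstat_output.splitlines():
        fields = line.split()
        # The column header line names the pi and po columns.
        if "pi" in fields and "po" in fields:
            pi_idx, po_idx = fields.index("pi"), fields.index("po")
            continue
        # Skip everything before the header and any line that is too short.
        if pi_idx is None or len(fields) <= max(pi_idx, po_idx):
            continue
        try:
            samples.append((int(fields[pi_idx]), int(fields[po_idx])))
        except ValueError:
            continue  # separator or other non-numeric line
    return samples


def is_really_paging(samples):
    """True if any sample shows paging-space reads or writes."""
    return any(pi > 0 or po > 0 for pi, po in samples)


if __name__ == "__main__":
    # Invented sample in the AIX vmstat layout, for demonstration only.
    captured = """\
kthr    memory              page              faults        cpu
----- ----------- ------------------------ ------------ -----------
 r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa
 1  0 4521345 9876543   0   0   0   0    0   0  25  310 180  1  1 98  0
 2  0 4521400 9876100   0   0   0   0    0   0  31  402 195  2  1 97  0
"""
    print(is_really_paging(paging_samples(captured)))
```

If pi and po stay at zero while alert.log keeps printing the warning, that points to the false report described above rather than real memory pressure.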


For the AIX platform there is actually one more bug involved, but it is an unpublished base bug rather than a port-specific bug:

AIX Platform

If your Platform is IBM-AIX then this is not the only possible reason for this alert log message. 

For IBM AIX on POWER Systems (64-bit), there is also next known port-specific bug:

Bug 14731911 - FALSE SWAP WARNING MESSAGES PRINTED TO ALERT.LOG ON AIX

with unpublished base bug:

Bug 11801934 : WRONG PAGE-IN AND PAGE-OUT OS VM STATS IN AIX.

On VMware, if this WARNING is not caused by a bug, it is very likely related to ORA-04031/ORA-04030, which is much more serious.

VMWare

Under VMWare, the messages may perhaps indicate a more serious issue, even when no memory related ORA-4031/ORA-4030 errors are reported. 

Under circumstances, an instance in a virtual machine may be simply terminated by PMON due to error 471 without further errors in the alert log.

The OS logs may in such case report an out of memory condition like below:

[root@vmh ~]# grep Kill /var/log/messages*

/var/log/messages-20140629:Jun 27 18:29:06 vmh-msfc-dodp02 kernel: [1895074.304941] Out of memory: Kill process 42094 (oracle) score 391 or sacrifice child

/var/log/messages-20140629:Jun 27 18:29:06 vmh-msfc-dodp02 kernel: [1895074.305203] Killed process 42094, UID 303, (oracle) total-vm:189081588kB, anon-rss:27412kB, file-rss:109612
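
The two kernel messages above can be matched programmatically to list which processes the OOM killer terminated. This is a minimal sketch that assumes the message wording shown in the excerpt; kernel log formats differ between versions, and the name `find_oom_victims` is made up for illustration.

```python
import re

# Patterns modelled on the two log lines quoted above.
KILL_RE = re.compile(r"Out of memory: Kill process (\d+) \(([^)]+)\) score (\d+)")
KILLED_RE = re.compile(r"Killed process (\d+),.*?\(([^)]+)\) total-vm:(\d+)kB")


def find_oom_victims(lines):
    """Return {pid: details} for every process the OOM killer reported."""
    victims = {}
    for line in lines:
        match = KILL_RE.search(line)
        if match:
            entry = victims.setdefault(int(match.group(1)), {})
            entry["name"] = match.group(2)
            entry["score"] = int(match.group(3))
            continue
        match = KILLED_RE.search(line)
        if match:
            entry = victims.setdefault(int(match.group(1)), {})
            entry["name"] = match.group(2)
            entry["total_vm_kb"] = int(match.group(3))
    return victims


if __name__ == "__main__":
    log = [
        "kernel: [1895074.304941] Out of memory: Kill process 42094 (oracle) "
        "score 391 or sacrifice child",
        "kernel: [1895074.305203] Killed process 42094, UID 303, (oracle) "
        "total-vm:189081588kB, anon-rss:27412kB, file-rss:109612",
    ]
    for pid, info in find_oom_victims(log).items():
        print(pid, info)
```

Any entry whose name is `oracle` is a strong hint that the instance was terminated by memory pressure on the host and did not fail by itself.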

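OS memory swap problems are usually addressed in one of the following ways:

1. Diagnose whether any process is leaking memory, and fix the leak
2. Tune the SGA/PGA to reduce Oracle's memory footprint
3. Use /proc/sys/vm/drop_caches to temporarily release some cache memory (Linux)
4. Adjust the system's VM memory management parameters, for example the following parameters in sysctl.conf on Linux: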



vm.min_free_kbytes: Raising the value in /proc/sys/vm/min_free_kbytes will cause the system to start reclaiming memory at an earlier time than it would have before.

vm.vfs_cache_pressure: At the default value of vfs_cache_pressure = 100 the kernel will attempt to reclaim dentries and inodes at a "fair" rate with respect to pagecache and swapcache reclaim. Decreasing vfs_cache_pressure causes the kernel to prefer to retain dentry and inode caches. Increasing vfs_cache_pressure beyond 100 causes the kernel to prefer to reclaim dentries and inodes.

vm.swappiness: The default is 60. /proc/sys/vm/swappiness on Red Hat Linux allows the admin to tune how aggressively the kernel swaps out processes' memory. Decreasing the swappiness setting may result in improved Directory performance as the kernel holds more of the server process in memory longer before swapping it out.

Set the following values to reduce the likelihood of OOM (Out Of Memory):

# Oracle-Validated setting for vm.min_free_kbytes is 51200 to avoid OOM killer

vm.min_free_kbytes = 51200

vm.swappiness = 40

vm.vfs_cache_pressure = 200
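
Before editing sysctl.conf it can help to see how the running kernel differs from these values. The sketch below reads the current settings from /proc/sys/vm on Linux and lists the ones that do not match. The function name `compare_vm_settings` and the `RECOMMENDED` table are illustrative names; the numbers are simply the three values listed above, not a universal recommendation.

```python
from pathlib import Path

# The three values suggested above.
RECOMMENDED = {
    "min_free_kbytes": 51200,
    "swappiness": 40,
    "vfs_cache_pressure": 200,
}


def compare_vm_settings(vm_dir="/proc/sys/vm", recommended=RECOMMENDED):
    """Return {name: (current, wanted)} for each setting that differs."""
    diffs = {}
    for name, wanted in recommended.items():
        path = Path(vm_dir) / name
        try:
            current = int(path.read_text().strip())
        except (OSError, ValueError):
            current = None  # file missing (non-Linux host) or unexpected content
        if current != wanted:
            diffs[name] = (current, wanted)
    return diffs


if __name__ == "__main__":
    for name, (current, wanted) in compare_vm_settings().items():
        print(f"vm.{name}: current={current} suggested={wanted}")
```

The script only reads values and changes nothing. Applying new settings would still be done through sysctl.conf as described above.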

Copyright notice: this is an original blog post and may not be reproduced without permission.
