ASM device error ORA-27041 ORA-15025 ORA-15081 (Doc ID 1487475.1)

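Summary:
The database alert log was flooded with ORA-27041, ORA-15025 and ORA-15081 errors. I first checked the ASM disk groups: their state, and the state and permissions of the underlying disk devices, were all normal, and the ASM alert log showed no new entries.
At first glance this looked like a disk permission problem. On reflection, though, if the disk permissions were really wrong the database itself should have gone down, yet the application sessions were all healthy. At that point I had no lead. I checked MOS, which in fact already had the answer, but I did not understand what the note meant.
Sticking to the rule of communicating once 30 minutes have passed without progress, I then asked whether any disk-related change had been made. The uid in the trace file pointed to the operating-system user dsg, but the application team had no clear answer either. Looking at that user's processes, I found a program that was being executed very frequently. Its log showed that its run times matched the times of the database errors and that it reported the same errors. Final conclusion: the dsg user lacked the group membership required for the disk group's devices, so the program got "Permission denied" when it accessed the disks.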

Details:
1.Environment:11.2.0.4
2.Symptoms:
1) Errors
Tue Sep 28 11:08:13 2021
Errors in file /XXX/app/oracle/diag/rdbms/xxx/xxx1/trace/xxx1_ora_9178000.trc:
ORA-15025: could not open disk "/dev/rhdisk12"
ORA-27041: unable to open file
IBM AIX RISC System/6000 Error: 13: Permission denied
Additional information: 3
Additional information: 4
Additional information: 138444804
Errors in file /XXX/app/oracle/diag/rdbms/xxx/xxx1/trace/xxx1_ora_9178000.trc:
ORA-15025: could not open disk "/dev/rhdisk12"
ORA-27041: unable to open file
IBM AIX RISC System/6000 Error: 13: Permission denied
Additional information: 3
Additional information: 4
Additional information: 138444804
WARNING: failed to read mirror side 1 of virtual extent 0 logical extent 0 of file 3103 in group [1.374353710] from disk DATA_0002 allocation unit 893192 reason error; if possible, will try another mirror side
Errors in file /XXX/app/oracle/diag/rdbms/xxx/xxx1/trace/xxx1_ora_9178000.trc:
ORA-00202: control file: '+DATA/xxx/controlfile/current.257.1062864093.bak'
ORA-15081: failed to submit an I/O operation to a disk
Tue Sep 28 11:08:14 2021

2) Disk-related checks
SQL> select DISK_NUMBER,B.NAME GROUP_NAME,a.name diskname,a.free_MB/1024 FREE_GB,a.TOTAL_MB/1024 TOTAL_GB,a.free_mb/a.total_mb*100 free_percentage ,a.path,A.STATE FROM V$ASM_DISK A,V$ASM_DISKGROUP B WHERE A.GROUP_NUMBER=B.GROUP_NUMBER order by b.name,DISK_NUMBER ;

DISK_NUMBER GROUP_NAME DISKNAME FREE_GB TOTAL_GB FREE_PERCENTAGE PATH STATE
----------- --------------- --------------- --------- --------- --------------- -------------------- ------------------------
0 DATA DATA_0000 27.23 1024.00 2.66 /dev/rhdisk10 NORMAL
1 DATA DATA_0001 27.12 1024.00 2.65 /dev/rhdisk11 NORMAL
2 DATA DATA_0002 27.05 1024.00 2.64 /dev/rhdisk12 NORMAL
……
0 OCR OCR_0000 9.62 10.00 96.21 /dev/rhdisk2 NORMAL
73 rows selected.

SQL> select group_number, name,state,type,total_MB / 1024 total_GB,free_mb / 1024 FREE_GB,free_mb / total_MB * 100 free_per,(case when free_mb / total_mb * 100 < 15 then '*' else '' end) care from V$ASM_DISKGROUP;

GROUP_NUMBER NAME STATE TYPE TOTAL_GB FREE_GB FREE_PER C
------------ ------------------------------ ----------- ------ ---------- ---------- ---------- -
1 DATA CONNECTED EXTERN 73728 1915.58496 2.59817839 *
2 OCR MOUNTED EXTERN 10 9.62109375 96.2109375

-bash-4.2# ls -l /dev/hdisk1*
brw------- 1 root system 13, 3 Dec 17 2020 /dev/hdisk1
brw------- 1 root system 13, 16 Dec 23 2020 /dev/hdisk10
brw------- 1 root system 13, 14 Dec 23 2020 /dev/hdisk11
brw------- 1 root system 13, 19 Dec 23 2020 /dev/hdisk12

-bash-4.2# ls -l /dev/rhdisk1*
crw------- 1 root system 13, 3 Dec 17 2020 /dev/rhdisk1
crw-rw---- 1 grid asmadmin 13, 16 Sep 28 11:17 /dev/rhdisk10
crw-rw---- 1 grid asmadmin 13, 14 Sep 28 11:17 /dev/rhdisk11
crw-rw---- 1 grid asmadmin 13, 19 Sep 28 11:17 /dev/rhdisk12
crw-rw---- 1 grid asmadmin 13, 8 Sep 28 11:17 /dev/rhdisk13
-bash-4.2# errpt

-bash-4.2# lsattr -El hdisk12 | grep reserve_policy
reserve_policy no_reserve Reserve Policy True
-bash-4.2# lsattr -El hdisk13 | grep reserve_policy
reserve_policy no_reserve Reserve Policy True

-bash-4.2# id oracle
uid=1101(oracle) gid=1000(oinstall) groups=1100(asmadmin),1200(dba),1300(asmdba)
-bash-4.2# id grid
uid=1100(grid) gid=1000(oinstall) groups=1100(asmadmin),1200(dba),1300(asmdba),1301(asmoper)
-bash-4.2#
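The ownership check on the raw devices above was done by eye. A minimal sketch of automating it on `ls -l` output follows, assuming the expected owner, group and mode are grid, asmadmin and crw-rw---- as in this listing. The function name flag_bad_devs is hypothetical, not part of any Oracle or AIX tooling.

```shell
#!/bin/sh
# flag_bad_devs OWNER GROUP: read `ls -l` lines on stdin and print the path
# of every entry that is not a character device with mode rw-rw---- owned
# by OWNER:GROUP. Exit status is 1 if anything was flagged, else 0.
flag_bad_devs() {
    awk -v owner="$1" -v group="$2" '
        NF == 0 { next }                      # skip blank lines
        $1 !~ /^crw-rw----/ || $3 != owner || $4 != group {
            print $NF                         # last field is the device path
            bad = 1
        }
        END { exit bad + 0 }
    '
}
```

For example, `ls -l /dev/rhdisk10 /dev/rhdisk11 /dev/rhdisk12 | flag_bad_devs grid asmadmin` prints nothing for the listing above. Feeding it `/dev/rhdisk1*` would also flag /dev/rhdisk1, which is owned by root:system here, so the input is best restricted to the paths reported by V$ASM_DISK.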

3. Root cause: the trace shows kfk_debug_get_user_groups: uid:207, euid:1101, gid:1000, egid:1000 (uid 207 is the dsg user, whose group is oinstall; it is not in any of the ASM groups)

Trace file /XXX/app/oracle/diag/rdbms/xxx/xxx1/trace/xxx1_ora_9178000.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
ORACLE_HOME = /XXX/app/oracle/product/11.2/db_2
System name: AIX
Node name: hxdg1
Release: 1
Version: 7
Machine: 00F6E6DC4C00
Instance name: xxx1
Redo thread mounted by this instance: 1
Oracle process number: 194
Unix process pid: 9178000, image: oracle@hxdg1 (TNS V1-V3)

*** 2021-09-28 11:08:13.726
*** SESSION ID:(988.19) 2021-09-28 11:08:13.726
*** CLIENT ID:() 2021-09-28 11:08:13.726
*** SERVICE NAME:(SYS$USERS) 2021-09-28 11:08:13.726
*** MODULE NAME:(oxad@hxdg1 (TNS V1-V3)) 2021-09-28 11:08:13.726
*** ACTION NAME:() 2021-09-28 11:08:13.726

WARNING: failed to open a disk[/dev/rhdisk12]
ORA-15025: could not open disk "/dev/rhdisk12"
ORA-27041: unable to open file
IBM AIX RISC System/6000 Error: 13: Permission denied
Additional information: 3
Additional information: 4
Additional information: 138444804
kfk_debug_get_user_groups: uid:207, euid:1101, gid:1000, egid:1000
WARNING: failed to open a disk[/dev/rhdisk12]
ORA-15025: could not open disk "/dev/rhdisk12"
ORA-27041: unable to open file
IBM AIX RISC System/6000 Error: 13: Permission denied
Additional information: 3
Additional information: 4
Additional information: 138444804
kfk_debug_get_user_groups: uid:207, euid:1101, gid:1000, egid:1000
WARNING: disk locally closed resulting in I/O error
WARNING: Read Failed. group:1 disk:2 AU:893192 offset:16384 size:16384
path:Unknown disk
incarnation:0x4560e025 synchronous result:'I/O error'
subsys:Unknown library iop:0x110db7ed0 bufp:0x110cfde00 osderr:0x0 osderr1:0x0
WARNING: failed to read mirror side 1 of virtual extent 0 logical extent 0 of file 3103 in group [1.374353710] from disk DATA_0002 allocation unit 893192 reason error; if possible, will try another mirror side
DDE rules only execution for: ORA 202
----- START Event Driven Actions Dump ----
---- END Event Driven Actions Dump ----
----- START DDE Actions Dump -----
Executing SYNC actions
----- START DDE Action: 'DB_STRUCTURE_INTEGRITY_CHECK' (Async) -----
DDE Action 'DB_STRUCTURE_INTEGRITY_CHECK' was flood controlled
----- END DDE Action: 'DB_STRUCTURE_INTEGRITY_CHECK' (FLOOD CONTROLLED, 1 csec) -----
Executing ASYNC actions
----- END DDE Actions Dump (total 0 csec) -----

*** 2021-09-28 11:08:13.731
dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x0, level=1, mask=0x0)
----- Error Stack Dump -----
ORA-00202: control file: '+DATA/xxx/controlfile/current.257.1062864093.bak'
ORA-15081: failed to submit an I/O operation to a disk
----- Current SQL Statement for this session (sql_id=amdwhucub5mzk) -----
select count(:"SYS_B_0") from v$dataguard_stats
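The trace above records the real uid of the failing process on each kfk_debug_get_user_groups line. A minimal sketch of pulling those uids out and resolving them to login names follows, assuming the line format shown in this trace and a standard colon-separated passwd file. The function names trace_uids and uid_to_user are hypothetical.

```shell
#!/bin/sh
# trace_uids FILE: print each distinct real uid found on
# kfk_debug_get_user_groups lines of the trace file, one per line.
trace_uids() {
    sed -n 's/^kfk_debug_get_user_groups: uid:\([0-9][0-9]*\),.*/\1/p' "$1" | sort -u
}

# uid_to_user UID [PASSWD_FILE]: print the login name whose third
# passwd field equals UID (PASSWD_FILE defaults to /etc/passwd).
uid_to_user() {
    awk -F: -v uid="$1" '$3 == uid { print $1 }' "${2:-/etc/passwd}"
}
```

Usage would look like `for u in $(trace_uids xxx1_ora_9178000.trc); do uid_to_user "$u"; done`. Comparing the third passwd field exactly avoids the false hits that a plain substring search for the number could return.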

cat /etc/passwd | grep 207
ps -ef |grep dsg

-bash-4.2# ps -ef | grep dsg
dsg 8455834 2033984 0 09:58:37 - 0:01 sshd: dsg@pts/15
dsg 11405170 11733628 0 12:33:23 - 0:00 /XXX/dsg/aiod/bin/aiod -n 127.0.0.1,45007 -flog /XXX/dsg/aiod/log/log.aiod
root 11536588 13501954 0 13:04:45 pts/28 0:00 grep dsg
root 2033984 3473680 0 09:58:36 - 0:00 sshd: dsg [priv]
dsg 6752538 8455834 0 09:58:37 pts/15 0:00 -ksh
dsg 11733628 1 0 12:33:23 - 0:00 /XXX/dsg/aiod/bin/aiod -n 127.0.0.1,45007 -flog /XXX/dsg/aiod/log/log.aiod

more /XXX/dsg/aiod/log/log.aiod
encrypt_pwd=n # input oracle password is encrypted?
oracle_pdb= # Oracle 12c Instance PDB name
sysdba=n # connect Oracle by SYSDBA?(Y|N)
sysasm=n # connect Oracle by SYSASM?(Y|N)
timeout=6 # recv oxad info timeout minutes. 0 - unused timeout, (0 ~ 254)
standby=A # A - auto check, Y - standby DB, N - not standby DB
[I] 2021-09-28:11:55:49 AOXD XP#5834878 loop startup ...
[I] 2021-09-28:11:55:49 asm#21366664 startup(edn:1,rawofs:0) blen(24:4:18)MB 16777216, 149ms
[I] 2021-09-28:11:55:49 ASM APIs startup success! used ASM APIs, 0.17s
[I] 2021-09-28:11:55:49 ASM#21366664 (0,1,255,0) 0, 0.00M, s(0.003s, )
[E] 2021-09-28:11:55:49 AOXD loop err:OXA-1000 OCI for Oracle error -1 occurred at api/sql/execute.c:112, sid xxx1, tns .
ORA-00204: error in reading (block 1, # blocks 1) of control file
ORA-00202: control file: '+DATA/xxx/controlfile/current.257.1062864093.bak'
ORA-15081: failed to submit an I/O operation to a disk
Error - OCI_ERROR select count(1) from v$dataguard_stats
[I] 2021-09-28:11:55:49 AOXD service shutdown.
[W] 2021-09-28:11:55:49 Task 0, pid 21366664 exit, (normal exit 0) sleep 30s, and restart.
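To compare the tool's failures with the database alert log, the timestamps of its error lines can be extracted first. This is a minimal sketch, assuming the `[E] YYYY-MM-DD:HH:MM:SS` prefix seen in the log excerpt above. The function name aiod_error_times is hypothetical.

```shell
#!/bin/sh
# aiod_error_times FILE: print "YYYY-MM-DD HH:MM:SS" for every line of the
# log that starts with the [E] (error) marker.
aiod_error_times() {
    sed -n 's/^\[E\] \([0-9-]*\):\([0-9:]*\) .*/\1 \2/p' "$1"
}
```

Running `aiod_error_times /XXX/dsg/aiod/log/log.aiod | sort -u` gives a short list of moments that can be checked against the timestamps in the alert log.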

Final fix: the dsg user's permissions were corrected and confirmed, and the errors no longer occur.
-bash-4.2# id dsg
uid=207(dsg) gid=1000(oinstall) groups=1100(asmadmin),1200(dba),1300(asmdba),1301(asmoper)
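Why the extra groups clear the error can be illustrated with a simplified model of the owner/group/other permission check. This is only a sketch, not how AIX or Oracle implement it: the function names has_group and may_open_rw are made up for illustration, and root privileges and ACLs are ignored. It also assumes, as the trace line uid:207, euid:1101 suggests, that the process runs with oracle as its effective user but with the group set of the dsg user who started it.

```shell
#!/bin/sh
# Simplified model of the classic owner/group/other permission check.
# Hypothetical helpers for illustration; ignores root and ACLs.

# has_group "GROUP LIST" GROUP: succeed if GROUP is one of the
# space-separated names in GROUP LIST (e.g. the output of `id -Gn user`).
has_group() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# may_open_rw OWNER GROUP MODE EUSER "GROUP LIST": succeed if a process with
# effective user EUSER and the given group list gets read+write on a file
# with that OWNER, GROUP and `ls -l` style MODE (e.g. crw-rw----).
may_open_rw() {
    if [ "$4" = "$1" ]; then
        rw_bits=$(printf '%s' "$3" | cut -c2-3)      # owner class
    elif has_group "$5" "$2"; then
        rw_bits=$(printf '%s' "$3" | cut -c5-6)      # group class
    else
        rw_bits=$(printf '%s' "$3" | cut -c8-9)      # other class
    fi
    [ "$rw_bits" = "rw" ]
}
```

With /dev/rhdisk12 listed as `crw-rw---- grid asmadmin`, `may_open_rw grid asmadmin crw-rw---- oracle "oinstall"` fails, which corresponds to the situation before the change. `may_open_rw grid asmadmin crw-rw---- oracle "oinstall asmadmin dba asmdba asmoper"` succeeds, matching the group list that `id dsg` reports after the fix.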
