ASM device error ORA-27041 ORA-15025 ORA-15081 (Doc ID 1487475.1)

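Summary:
The database alert log showed a large number of ORA-27041, ORA-15025, and ORA-15081 errors. I first checked the ASM disk groups: the state and permissions of the corresponding devices were all normal, and the ASM alert log showed no new entries.
At first glance this looked like a disk permission problem. On reflection, though, if the disk permissions were really wrong the database should have gone down, yet the application sessions were all normal. With no leads, I checked MOS. The answer was in fact already there, but I did not understand what the note meant at the time.
Then, following the rule of asking around after 30 minutes without progress, I asked whether there had been any disk-related changes. The uid in the trace file pointed to the operating system user dsg, but the application team had no clear answer either. Looking at that user's processes, I found a program that was being executed over and over. Its log showed that the execution times matched the times of the database errors and that the errors were the same. The final conclusion: the dsg user lacked the group membership required for the disk group's devices, so the program was denied permission when it accessed the disks.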

Details:
1. Environment: 11.2.0.4
2. Symptoms:
1) Errors
Tue Sep 28 11:08:13 2021
Errors in file /XXX/app/oracle/diag/rdbms/xxx/xxx1/trace/xxx1_ora_9178000.trc:
ORA-15025: could not open disk "/dev/rhdisk12"
ORA-27041: unable to open file
IBM AIX RISC System/6000 Error: 13: Permission denied
Additional information: 3
Additional information: 4
Additional information: 138444804
Errors in file /XXX/app/oracle/diag/rdbms/xxx/xxx1/trace/xxx1_ora_9178000.trc:
ORA-15025: could not open disk "/dev/rhdisk12"
ORA-27041: unable to open file
IBM AIX RISC System/6000 Error: 13: Permission denied
Additional information: 3
Additional information: 4
Additional information: 138444804
WARNING: failed to read mirror side 1 of virtual extent 0 logical extent 0 of file 3103 in group [1.374353710] from disk DATA_0002 allocation unit 893192 reason error; if possible, will try another mirror side
Errors in file /XXX/app/oracle/diag/rdbms/xxx/xxx1/trace/xxx1_ora_9178000.trc:
ORA-00202: control file: '+DATA/xxx/controlfile/current.257.1062864093.bak'
ORA-15081: failed to submit an I/O operation to a disk
Tue Sep 28 11:08:14 2021

2) Disk-related queries
SQL> select DISK_NUMBER,B.NAME GROUP_NAME,a.name diskname,a.free_MB/1024 FREE_GB,a.TOTAL_MB/1024 TOTAL_GB,a.free_mb/a.total_mb*100 free_percentage ,a.path,A.STATE FROM V$ASM_DISK A,V$ASM_DISKGROUP B WHERE A.GROUP_NUMBER=B.GROUP_NUMBER order by b.name,DISK_NUMBER ;

DISK_NUMBER GROUP_NAME DISKNAME FREE_GB TOTAL_GB FREE_PERCENTAGE PATH STATE
----------- --------------- --------------- --------- --------- --------------- -------------------- ------------------------
0 DATA DATA_0000 27.23 1024.00 2.66 /dev/rhdisk10 NORMAL
1 DATA DATA_0001 27.12 1024.00 2.65 /dev/rhdisk11 NORMAL
2 DATA DATA_0002 27.05 1024.00 2.64 /dev/rhdisk12 NORMAL
……
0 OCR OCR_0000 9.62 10.00 96.21 /dev/rhdisk2 NORMAL
73 rows selected.

SQL> select group_number, name,state,type,total_MB / 1024 total_GB,free_mb / 1024 FREE_GB,free_mb / total_MB * 100 free_per,(case when free_mb / total_mb * 100 < 15 then '*' else '' end) care from V$ASM_DISKGROUP;

GROUP_NUMBER NAME STATE TYPE TOTAL_GB FREE_GB FREE_PER C
------------ ------------------------------ ----------- ------ ---------- ---------- ---------- -
1 DATA CONNECTED EXTERN 73728 1915.58496 2.59817839 *
2 OCR MOUNTED EXTERN 10 9.62109375 96.2109375

-bash-4.2# ls -l /dev/hdisk1*
brw------- 1 root system 13, 3 Dec 17 2020 /dev/hdisk1
brw------- 1 root system 13, 16 Dec 23 2020 /dev/hdisk10
brw------- 1 root system 13, 14 Dec 23 2020 /dev/hdisk11
brw------- 1 root system 13, 19 Dec 23 2020 /dev/hdisk12

-bash-4.2# ls -l /dev/rhdisk1*
crw------- 1 root system 13, 3 Dec 17 2020 /dev/rhdisk1
crw-rw---- 1 grid asmadmin 13, 16 Sep 28 11:17 /dev/rhdisk10
crw-rw---- 1 grid asmadmin 13, 14 Sep 28 11:17 /dev/rhdisk11
crw-rw---- 1 grid asmadmin 13, 19 Sep 28 11:17 /dev/rhdisk12
crw-rw---- 1 grid asmadmin 13, 8 Sep 28 11:17 /dev/rhdisk13
-bash-4.2# errpt

-bash-4.2# lsattr -El hdisk12 | grep reserve_policy
reserve_policy no_reserve Reserve Policy True
-bash-4.2# lsattr -El hdisk13 | grep reserve_policy
reserve_policy no_reserve Reserve Policy True

-bash-4.2# id oracle
uid=1101(oracle) gid=1000(oinstall) groups=1100(asmadmin),1200(dba),1300(asmdba)
-bash-4.2# id grid
uid=1100(grid) gid=1000(oinstall) groups=1100(asmadmin),1200(dba),1300(asmdba),1301(asmoper)
-bash-4.2#

3. Root cause: the trace shows kfk_debug_get_user_groups: uid:207, euid:1101, gid:1000, egid:1000 (uid 207 is the dsg user, whose group is oinstall and which did not belong to any of the ASM-related groups)

Trace file /XXX/app/oracle/diag/rdbms/xxx/xxx1/trace/xxx1_ora_9178000.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
ORACLE_HOME = /XXX/app/oracle/product/11.2/db_2
System name: AIX
Node name: hxdg1
Release: 1
Version: 7
Machine: 00F6E6DC4C00
Instance name: xxx1
Redo thread mounted by this instance: 1
Oracle process number: 194
Unix process pid: 9178000, image: oracle@hxdg1 (TNS V1-V3)

*** 2021-09-28 11:08:13.726
*** SESSION ID:(988.19) 2021-09-28 11:08:13.726
*** CLIENT ID:() 2021-09-28 11:08:13.726
*** SERVICE NAME:(SYS$USERS) 2021-09-28 11:08:13.726
*** MODULE NAME:(oxad@hxdg1 (TNS V1-V3)) 2021-09-28 11:08:13.726
*** ACTION NAME:() 2021-09-28 11:08:13.726

WARNING: failed to open a disk[/dev/rhdisk12]
ORA-15025: could not open disk "/dev/rhdisk12"
ORA-27041: unable to open file
IBM AIX RISC System/6000 Error: 13: Permission denied
Additional information: 3
Additional information: 4
Additional information: 138444804
kfk_debug_get_user_groups: uid:207, euid:1101, gid:1000, egid:1000
WARNING: failed to open a disk[/dev/rhdisk12]
ORA-15025: could not open disk "/dev/rhdisk12"
ORA-27041: unable to open file
IBM AIX RISC System/6000 Error: 13: Permission denied
Additional information: 3
Additional information: 4
Additional information: 138444804
kfk_debug_get_user_groups: uid:207, euid:1101, gid:1000, egid:1000
WARNING: disk locally closed resulting in I/O error
WARNING: Read Failed. group:1 disk:2 AU:893192 offset:16384 size:16384
path:Unknown disk
incarnation:0x4560e025 synchronous result:'I/O error'
subsys:Unknown library iop:0x110db7ed0 bufp:0x110cfde00 osderr:0x0 osderr1:0x0
WARNING: failed to read mirror side 1 of virtual extent 0 logical extent 0 of file 3103 in group [1.374353710] from disk DATA_0002 allocation unit 893192 reason error; if possible, will try another mirror side
DDE rules only execution for: ORA 202
----- START Event Driven Actions Dump ----
---- END Event Driven Actions Dump ----
----- START DDE Actions Dump -----
Executing SYNC actions
----- START DDE Action: 'DB_STRUCTURE_INTEGRITY_CHECK' (Async) -----
DDE Action 'DB_STRUCTURE_INTEGRITY_CHECK' was flood controlled
----- END DDE Action: 'DB_STRUCTURE_INTEGRITY_CHECK' (FLOOD CONTROLLED, 1 csec) -----
Executing ASYNC actions
----- END DDE Actions Dump (total 0 csec) -----

*** 2021-09-28 11:08:13.731
dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x0, level=1, mask=0x0)
----- Error Stack Dump -----
ORA-00202: control file: '+DATA/xxx/controlfile/current.257.1062864093.bak'
ORA-15081: failed to submit an I/O operation to a disk
----- Current SQL Statement for this session (sql_id=amdwhucub5mzk) -----
select count(:"SYS_B_0") from v$dataguard_stats

cat /etc/passwd | grep 207
ps -ef |grep dsg
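
The lookup from the uid in the trace to an operating system account can also be scripted. The following is a minimal sketch, not the procedure actually used here: the function names trace_uids and uid_to_user are hypothetical, and it assumes the trace file contains kfk_debug_get_user_groups lines in the same format as the trace shown above. Matching on the third field of the passwd file avoids the false hits that a plain grep for a number can produce.

```shell
# Hypothetical helper: print the distinct real uids recorded in a trace file.
# $1: path to a trace file containing kfk_debug_get_user_groups lines
trace_uids() {
    sed -n 's/^kfk_debug_get_user_groups: uid:\([0-9][0-9]*\),.*/\1/p' "$1" | sort -u
}

# Hypothetical helper: resolve a numeric uid to a login name.
# $1: numeric uid
# $2: passwd-format file (assumed to default to /etc/passwd)
uid_to_user() {
    awk -F: -v uid="$1" '$3 == uid { print $1 }' "${2:-/etc/passwd}"
}
```

Run against the trace file above, trace_uids would be expected to print 207, and uid_to_user 207 would then return the account name that owns that uid on the host.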

-bash-4.2# ps -ef | grep dsg
dsg 8455834 2033984 0 09:58:37 - 0:01 sshd: dsg@pts/15
dsg 11405170 11733628 0 12:33:23 - 0:00 /XXX/dsg/aiod/bin/aiod -n 127.0.0.1,45007 -flog /XXX/dsg/aiod/log/log.aiod
root 11536588 13501954 0 13:04:45 pts/28 0:00 grep dsg
root 2033984 3473680 0 09:58:36 - 0:00 sshd: dsg [priv]
dsg 6752538 8455834 0 09:58:37 pts/15 0:00 -ksh
dsg 11733628 1 0 12:33:23 - 0:00 /XXX/dsg/aiod/bin/aiod -n 127.0.0.1,45007 -flog /XXX/dsg/aiod/log/log.aiod

more /XXX/dsg/aiod/log/log.aiod
encrypt_pwd=n # input oracle password is encrypted?
oracle_pdb= # Oracle 12c Instance PDB name
sysdba=n # connect Oracle by SYSDBA?(Y|N)
sysasm=n # connect Oracle by SYSASM?(Y|N)
timeout=6 # recv oxad info timeout minutes. 0 - unused timeout, (0 ~ 254)
standby=A # A - auto check, Y - standby DB, N - not standby DB
[I] 2021-09-28:11:55:49 AOXD XP#5834878 loop startup ...
[I] 2021-09-28:11:55:49 asm#21366664 startup(edn:1,rawofs:0) blen(24:4:18)MB 16777216, 149ms
[I] 2021-09-28:11:55:49 ASM APIs startup success! used ASM APIs, 0.17s
[I] 2021-09-28:11:55:49 ASM#21366664 (0,1,255,0) 0, 0.00M, s(0.003s, )
[E] 2021-09-28:11:55:49 AOXD loop err:OXA-1000 OCI for Oracle error -1 occurred at api/sql/execute.c:112, sid xxx1, tns .
ORA-00204: error in reading (block 1, # blocks 1) of control file
ORA-00202: control file: '+DATA/xxx/controlfile/current.257.1062864093.bak'
ORA-15081: failed to submit an I/O operation to a disk
Error - OCI_ERROR select count(1) from v$dataguard_stats
[I] 2021-09-28:11:55:49 AOXD service shutdown.
[W] 2021-09-28:11:55:49 Task 0, pid 21366664 exit, (normal exit 0) sleep 30s, and restart.

Resolution: after the permissions were granted and confirmed, the errors no longer occurred.
-bash-4.2# id dsg
uid=207(dsg) gid=1000(oinstall) groups=1100(asmadmin),1200(dba),1300(asmdba),1301(asmoper)
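
The same confirmation can be turned into a repeatable check. This is a minimal sketch under stated assumptions: the function name has_group is hypothetical, it only inspects text in the format printed by id, and it assumes asmadmin is the group that matters because that is the group owning the /dev/rhdisk* devices in the ls -l output above.

```shell
# Hypothetical helper: succeed if a group name appears in `id`-style output.
# $1: text as printed by `id <user>`
# $2: group name to look for, e.g. asmadmin
has_group() {
    # -F treats "(name)" as a literal string, -q suppresses output
    printf '%s\n' "$1" | grep -qF "($2)"
}

# Example usage (the user and group names are taken from the output above):
#   has_group "$(id dsg)" asmadmin && echo "dsg is in asmadmin"
```

Supplementary groups are normally picked up when a session or process starts, so a program that was already running under dsg may need to be restarted before a group change takes effect.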
