换了网线异常了,CRS无法正常启动,clssnmSendingThread: sending status msg to all nodes
同事换网线前我将节点2正常关闭了,换完网线告诉我,发现节点2死活起不来了,看上面的日志和一些帖子最后也没解决,尝试过重启、网线拔掉重新插上、查看过存储是否正常和存储重新挂载。。。。看过一个帖子说可能是OCR信息发生了改变,不过之前没备份,也没忘这方面深入考虑。
最后还是没搞定,主要是技术有限,没准确的定位出具体问题也不敢轻易乱动。。。
20xx-12-16 19:01:05.792: [ CSSD][3786819328]clssnmSendingThread: sending join msg to all nodes
20xx-12-16 19:01:05.792: [ CSSD][3786819328]clssnmSendingThread: sent 5 join msgs to all nodes
20xx-12-16 19:01:06.295: [GIPCHALO][3811858176] gipchaLowerProcessNode: no valid interfaces found to node for 7286464 ms, node 0x7fecd0028450 { host 'myrac1', haName 'CSS_myrac-cluster', srcLuid fac66ea4-f1a960af, dstLuid 00000000-00000000 numInf 0, contigSeq 0, lastAck 0, lastValidAck 0, sendSeq [249 : 249], createTime 7037424, sentRegister 1, localMonitor 1, flags 0x4 }
20xx-12-16 19:01:06.303: [ CSSD][3789973248]clssgmWaitOnEventValue: after CmInfo State val 3, eval 1 waited 0
20xx-12-16 19:01:06.420: [ CSSD][3799754496]clssnmvDHBValidateNcopy: node 1, myrac1, has a disk HB, but no network HB, DHB has rcfg 471981092, wrtcnt, 211618800, LATS 7286584, lastSeqNo 211618797, uniqueness 1576485880, timestamp 1576494065/8540734
20xx-12-16 19:01:06.435: [ CSSD][3804591872]clssnmvDHBValidateNcopy: node 1, myrac1, has a disk HB, but no network HB, DHB has rcfg 471981092, wrtcnt, 211618802, LATS 7286594, lastSeqNo 211618799, uniqueness 1576485880, timestamp 1576494066/8541524
20xx-12-16 19:01:07.304: [ CSSD][3789973248]clssgmWaitOnEventValue: after CmInfo State val 3, eval 1 waited 0
20xx-12-16 19:01:07.421: [ CSSD][3799754496]clssnmvDHBValidateNcopy: node 1, myrac1, has a disk HB, but no network HB, DHB has rcfg 471981092, wrtcnt, 211618803, LATS 7287584, lastSeqNo 211618800, uniqueness 1576485880, timestamp 1576494066/8541734
20xx-12-16 19:01:07.435: [ CSSD][3804591872]clssnmvDHBValidateNcopy: node 1, myrac1, has a disk HB, but no network HB, DHB has rcfg 471981092, wrtcnt, 211618805, LATS 7287604, lastSeqNo 211618802, uniqueness 1576485880, timestamp 1576494067/8542524
20xx-12-16 19:01:08.304: [ CSSD][3789973248]clssgmWaitOnEventValue: after CmInfo State val 3, eval 1 waited 0
20xx-12-16 19:01:08.422: [ CSSD][3799754496]clssnmvDHBValidateNcopy: node 1, myrac1, has a disk HB, but no network HB, DHB has rcfg 471981092, wrtcnt, 211618806, LATS 7288584, lastSeqNo 211618803, uniqueness 1576485880, timestamp 1576494067/8542734
20xx-12-16 19:01:08.436: [ CSSD][3804591872]clssnmvDHBValidateNcopy: node 1, myrac1, has a disk HB, but no network HB, DHB has rcfg 471981092, wrtcnt, 211618808, LATS 7288604, lastSeqNo 211618805, uniqueness 1576485880, timestamp 1576494068/8543524
20xx-12-16 19:01:09.304: [ CSSD][3789973248]clssgmWaitOnEventValue: after CmInfo State val 3, eval 1 waited 0
20xx-12-16 19:01:09.422: [ CSSD][3799754496]clssnmvDHBValidateNcopy: node 1, myrac1, has a disk HB, but no network HB, DHB has rcfg 471981092, wrtcnt, 211618809, LATS 7289584, lastSeqNo 211618806, uniqueness 1576485880, timestamp 1576494068/8543744
20xx-12-16 19:01:09.437: [ CSSD][3804591872]clssnmvDHBValidateNcopy: node 1, myrac1, has a disk HB, but no network HB, DHB has rcfg 471981092, wrtcnt, 211618811, LATS 7289604, lastSeqNo 211618808, uniqueness 1576485880, timestamp 1576494069/8544524
20xx-12-16 19:01:09.803: [ CSSD][3785242368]clssnmRcfgMgrThread: Local Join
20xx-12-16 19:01:09.803: [ CSSD][3785242368]clssnmLocalJoinEvent: begin on node(2), waittime 193000
20xx-12-16 19:01:09.803: [ CSSD][3785242368]clssnmLocalJoinEvent: set curtime (7289964) for my node
20xx-12-16 19:01:09.803: [ CSSD][3785242368]clssnmLocalJoinEvent: scanning 32 nodes
20xx-12-16 19:01:09.803: [ CSSD][3785242368]clssnmLocalJoinEvent: Node myrac1, number 1, is in an existing cluster with disk state 3
20xx-12-16 19:01:09.803: [ CSSD][3785242368]clssnmLocalJoinEvent: takeover aborted due to cluster member node found on disk
20xx-12-16 19:01:10.305: [ CSSD][3789973248]clssgmWaitOnEventValue: after CmInfo State val 3, eval 1 waited 0
20xx-12-16 19:01:10.423: [ CSSD][3799754496]clssnmvDHBValidateNcopy: node 1, myrac1, has a disk HB, but no network HB, DHB has rcfg 471981092, wrtcnt, 211618812, LATS 7290584, lastSeqNo 211618809, uniqueness 1576485880, timestamp 1576494069/8544744
20xx-12-16 19:01:10.437: [ CSSD][3804591872]clssnmvDHBValidateNcopy: node 1, myrac1, has a disk HB, but no network HB, DHB has rcfg 471981092, wrtcnt, 211618814, LATS 7290604, lastSeqNo 211618811, uniqueness 1576485880, timestamp 1576494070/8545524
20xx-12-16 19:01:10.794: [ CSSD][3786819328]clssnmSendingThread: sending join msg to all nodes
20xx-12-16 19:01:10.794: [ CSSD][3786819328]clssnmSendingThread: sent 5 join msgs to all nodes

20xx-12-16 20:36:02.919: [ CSSD][2756265728]clssgmUpdateGrpData: grock(CLSN.ONSNETPROC.MASTER), commissioner(-1/0)
20xx-12-16 20:36:02.919: [ CSSD][2756265728]clssgmHandleGrockRcfgUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(118), status(0), sendresp(1)
20xx-12-16 20:36:02.920: [ CSSD][2756265728]clssgmTestSetLastGrockUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(118) msgseq(119), lastupdt<0x7fbb58031e10>, ignoreseq(0)
20xx-12-16 20:36:02.920: [ CSSD][2756265728]clssgmGrockOpTagProcess: Request to commission member(1) using key(1) for grock(CLSN.ONSNETPROC.MASTER)
20xx-12-16 20:36:02.920: [ CSSD][2756265728]clssgmUpdateGrpData: grock(CLSN.ONSNETPROC.MASTER), commissioner(1/1)
20xx-12-16 20:36:02.920: [ CSSD][2756265728]clssgmHandleGrockRcfgUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(119), status(0), sendresp(1)
20xx-12-16 20:36:02.921: [ CSSD][2756265728]clssgmTestSetLastGrockUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(119) msgseq(120), lastupdt<0x7fbb5804d490>, ignoreseq(0)
20xx-12-16 20:36:02.921: [ CSSD][2756265728]clssgmUpdateGrpData: grock(CLSN.ONSNETPROC.MASTER), private data(2052), incarn(40)
20xx-12-16 20:36:02.921: [ CSSD][2756265728]clssgmHandleGrockRcfgUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(120), status(0), sendresp(1)
20xx-12-16 20:36:02.922: [ CSSD][2756265728]clssgmTestSetLastGrockUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(120) msgseq(121), lastupdt<0x7fbb5803dee0>, ignoreseq(0)
20xx-12-16 20:36:02.922: [ CSSD][2756265728]clssgmGrockOpTagProcess: Request to commission member(-1) using key(1) for grock(CLSN.ONSNETPROC.MASTER)
20xx-12-16 20:36:02.922: [ CSSD][2756265728]clssgmUpdateGrpData: grock(CLSN.ONSNETPROC.MASTER), commissioner(-1/0)
20xx-12-16 20:36:02.922: [ CSSD][2756265728]clssgmHandleGrockRcfgUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(121), status(0), sendresp(1)
20xx-12-16 20:36:05.064: [ CSSD][2753111808]clssnmSendingThread: sending status msg to all nodes
20xx-12-16 20:36:05.064: [ CSSD][2753111808]clssnmSendingThread: sent 5 status msgs to all nodes
20xx-12-16 20:36:09.065: [ CSSD][2753111808]clssnmSendingThread: sending status msg to all nodes
20xx-12-16 20:36:09.065: [ CSSD][2753111808]clssnmSendingThread: sent 4 status msgs to all nodes
20xx-12-16 20:36:14.066: [ CSSD][2753111808]clssnmSendingThread: sending status msg to all nodes
...

根据日志能判断出bond信息变了吗?我当时没发现也没分析出来,最后同事说改了bond!当时不是说只换根网线重新排下线吗?我说改回去试试,果然如此,重启一切正常了

胡乱重启了下,没起来。。。
[root@myrac2 bin]# ./crsctl query crs activeversion
Oracle Cluster Registry initialization failed accessing Oracle Cluster Registry device: PROC-26: Error while accessing the physical storage
ORA-15077: could not locate ASM instance serving a required diskgroup

[root@myrac2 bin]# ./ocrcheck
PROT-602: Failed to retrieve data from the cluster registry
PROC-26: Error while accessing the physical storage
ORA-15077: could not locate ASM instance serving a required diskgroup

[grid@myrac2 ~]$ cd /u01/app/11.2.0/grid/bin/
[grid@myrac2 bin]$ srvctl start nodeapps -n myrac2
PRCR-1070 : Failed to check if resource ora.gsd is registered
Cannot communicate with crsd
PRCR-1070 : Failed to check if resource ora.net1.network is registered
Cannot communicate with crsd
PRCR-1035 : Failed to look up CRS resource myrac2 for ora.cluster_vip.type
PRCR-1068 : Failed to query resources
Cannot communicate with crsd
PRCR-1070 : Failed to check if resource ora.ons is registered
Cannot communicate with crsd

[grid@myrac2 bin]$ srvctl start asm -n myrac2
PRCR-1070 : Failed to check if resource ora.asm is registered
Cannot communicate with crsd

[grid@myrac2 bin]$ srvctl start database -d testdb2
PRCD-1027 : Failed to retrieve database testdb2
PRCR-1115 : Failed to find entities of type resource that match filters ((NAME == ora.testdb2.db) && (TYPE == ora.database.type)) and contain attributes VERSION,ORACLE_HOME,DATABASE_TYPE
Cannot communicate with crsd
[grid@myrac2 bin]$

节点2被修改的bond,明显跟1不一样
[root@myrac2 11.2.0]# service network status
Configured devices:
lo bond0 bond1 em1 em2 em3 em4
Currently active devices:
lo em1 em2 em3 em4 bond0 bond1
[root@myrac2 11.2.0]#

节点1
[root@myrac1 ~]# service network status
Configured devices:
lo bond0 em1 em2 em3 em4 idrac
Currently active devices:
lo em1 em2 em3 bond0

抛开技术行不行先不说,单这件事来说,同事之间的合作有时候更重要。一不小心你就会给别人挖个坑或掉到别人给你挖的坑

换了网线异常了,CRS无法正常启动,clssnmSendingThread: sending status msg to all nodes的更多相关文章

  1. 异常System.Web.HttpException (0x80004005): Server cannot set status after HTTP headers have been sent.

    在用mvc 的AuthorizeAttribute做身份验证,重写HandleUnauthorizedRequest方法,在Application_Error方法里出现异常System.Web.Htt ...

  2. Linux异常关机后,Mysql启动出错ERROR 2002 (HY000)

    Linux异常关机后,Mysql启动或訪问时,出错: ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/ ...

  3. AIX下禁止crs随ha启动而启动

    /etc/init.crs enable /etc/init.crs disable 查看目前crs是enable还是disable状态 状态记录在一个文本文件里  /etc/oracle/scls_ ...

  4. 异常-CDH的service无法启动并抛出异常-org.apache.avro.AvroRemoteException: java.net.ConnectException: Connection refused (Connection refused)

    1 详细异常 org.apache.avro.AvroRemoteException: java.net.ConnectException: Connection refused (Connectio ...

  5. WPF App.xaml.cs常用模板,包括:异常捕获,App只能启动一次

    App.xaml.cs中的代码每次都差不多,故特地将其整理出来直接复用: using System; using System.Configuration; using System.Diagnost ...

  6. MyEclipse异常关闭导致Tomcat不能启动的问题

    由于MyEclipse的异常关闭从而导致Tomcat并没有关闭,所以再次启动Tomcat当然是无法启动的啦,解决方法:在任务管理器中关闭一个叫javaw.exe的进程,如果你这时已经启动了MyEcli ...

  7. 左右RAC CRS 自己主动启动

    左右CRS自己主动重新启动实验 一.检验ASM [root@rac1 ~]# /etc/init.d/oracleasm status Checking if ASM is loaded: yes C ...

  8. 联想ideapad 310s如何进BIOS,换固态硬盘SSD,配置U盘启动,重装Win10系统

    1. 如何进BIOS 关机情况下,捅一下Novo键,即可进入BIOS 2. 安装固态硬盘 Ideadpad 310S 本身自带的硬盘是5400转的机械硬盘,容量小速度慢.换的新的固态硬盘是SATA接口 ...

  9. 换了XCode版本之后,iOS应用启动时不占满全屏,上下有黑边

    原因是没有Retina4对应的启动图片,解决方法很简单,就是把Retina4对应的图片给补上就只可以了

随机推荐

  1. PHP自动发红包代码示例

    <?php header('Content-type:text'); define("TOKEN", "weixin"); $wechatObj = ne ...

  2. 基于Quartz.NET框架的任务计划管理工具

    最近接到一个小需求 ——可以定期同步20个Sql Server 7.0数据库里的数据(数据量会预计>10000),并保存为CSV格式文件 ——可以设置保存文件数据量 ——该应用需要用WinFor ...

  3. 多线程之NSOperation小结

    一.NSOperation 抽象类 NSOperation 是一个"抽象类",不能直接使用.抽象类的用处是定义子类共有的属性和方法. NSOperation 是基于 GCD 做的面 ...

  4. Tensorflow Serving Docker compose 部署服务细节(Ubuntu)

    [摘要] Tensorflow Serving 是tf模型持久化的重要工具,本篇介绍如何通过Docker compose搭建并调试TensorFlow Serving TensorFlow Servi ...

  5. 大型情感剧集Selenium:1_介绍 #华为云·寻找黑马程序员#

    学习selenium能做什么? 很多书籍.文章中是这么定义selenium的: Selenium 是开源的自动化测试工具,它主要是用于Web 应用程序的自动化测试,不只局限于此,同时支持所有基于web ...

  6. windows7 上安装python3.8步骤

    今天给小白们写一个在windows7 上安装python3.8的过程. 1.先到https://www.python.org/downloads/官网下载最新版的python, 不要到别的下载网站去下 ...

  7. luogu P1356 数列的整数性 |动态规划

    题目描述 对于任意一个整数数列,我们可以在每两个整数中间任意放一个符号'+'或'-',这样就可以构成一个表达式,也就可以计算出表达式的值.比如,现在有一个整数数列:17,5,-2,-15,那么就可以构 ...

  8. [TimLinux] Python nonlocal和global的作用

    1. 执行代码 以下实例都是通过执行以下代码,需要把以下执行代码放在后面实例代码的后面. a = outer_func()print("call a()") a() a() a() ...

  9. 2017 CCPC秦皇岛 M题 Safest Buildings

    PUBG is a multiplayer online battle royale video game. In the game, up to one hundred players parach ...

  10. 基于Pact的契约测试

    背景 如今,契约测试已经逐渐成为测试圈中一个炙手可热的话题,特别是在微服务大行其道的行业背景下,越来越多的团队开始关注服务之间的契约及其契约测试. 什么是契约测试     关于什么是契约测试这个问题, ...