Reboot-less node fencing in Oracle Clusterware 11g Release 2
在进行一次RAC的高可用性测试时,当private网卡的网线被拔掉之后,没有出现传说中的有一个节点被CRS强制重启,取而代之的是node2上面的ASM实例和RDBMS实例被关闭;当网线被重新插上时,node2上面的ASM实例和RDBMS实例自动重新启动。
基于上面的现象,在google上搜索,发现oracle在11.2.0.2版本之后引入了叫reboot-less node fencing的特性,即不使用重启节点的方式进行fencing。
下面引用oracle官方文档对reboot-less node fencing的介绍。
As mentioned, Oracle Clusterware uses a STONITH (Shoot The Other Node In The Head) comparable fencing algorithm to ensure data integrity in cases, in which cluster integrity is endangered and split-brain scenarios need to be prevented. In case of Oracle Clusterware, this means that a local process enforces the removal of one or more nodes from the cluster (fencing).
Until Oracle Clusterware 11g Release 2, Patch Set One (11.2.0.2) the fencing of a node was performed by a “fast reboot” of the respective server. A “fast reboot” in this context summarizes a shutdown and restart procedure that does not wait for any IO to finish or for file systems to synchronize on shutdown. With Oracle Clusterware 11g Release 2, Patch Set One (11.2.0.2) this mechanism has been changed in order to prevent such a reboot as much as possible.
Already with Oracle Clusterware 11g Release 2 this algorithm was improved so that failures of certain, Oracle RAC-required subcomponents in the cluster do not necessarily cause an immediate fencing (reboot) of a node. Instead, an attempt is made to clean up the failure within the cluster and to restart the failed subcomponent. Only, if a cleanup of the failed component appears to be unsuccessful, a node reboot is performed in order to force a cleanup.
With Oracle Clusterware 11g Release 2, Patch Set One (11.2.0.2) further improvements were made so that Oracle Clusterware will try to prevent a split-brain without rebooting the node. It thereby implements a standing requirement from those customers, who were requesting to preserve the node and to prevent a reboot, since the node runs applications not managed by Oracle Clusterware, which would otherwise be forcibly shut down by the reboot of a node.
With the new algorithm and when a decision is made to evict a node from the cluster, Oracle Clusterware will first attempt to shutdown all resources on the machine that was chosen to be the subject of an eviction. Especially IO generating processes are killed and it is ensured that those processes are completely stopped before continuing. If, for some reason, not all resources can be stopped or IO generating processes cannot be stopped completely, Oracle Clusterware will still perform a reboot or use IPMI to forcibly evict the node from the cluster.
If all resources can be stopped and all IO generating processes can be killed, Oracle Clusterware will shut itself down on the respective node, but will attempt to restart after the stack has been stopped. The restart is initiated by the Oracle High Availability Services Daemon, which has been introduced with Oracle Clusterware 11g Release 2.
Reboot-less node fencing in Oracle Clusterware 11g Release 2的更多相关文章
- Opatching PSU in Oracle Database 11g Release 2 RAC on RHEL6
Opatching PSU in Oracle Database 11g Release 2(11.2.0.4) RAC on RHEL6 1) 升级opatch工具 1.1) For GI home ...
- Oracle Database 11g Release 2 Standard Edition and Enterprise Edition Software Downloads
Oracle Database 11g Release 2 Standard Edition and Enterprise Edition Software DownloadsOracle 数据库 1 ...
- Oracle Database 11g Release 2(11.2.0.3.0) RAC On Redhat Linux 5.8 Using Vmware Workstation 9.0
一,简介 二,配置虚拟机 1,创建虚拟机 (1)添加三块儿网卡: 主节点 二节点 eth0: 公网 192.168.1.20/24 NAT eth0: 公网 192.168.1 ...
- 在redhat6.4下安装 Oracle® Database 11g Release 2
OS版本: 安装过程的相关信息: pdksh 安装好后根据需要设置oracle开机自启动http://www.cnblogs.com/softidea/p/3761671.html 设置环境变量NLS ...
- Oracle数据库11g基于rehl6.5的配置与安装
REDHAT6.5安装oracle11.2.4 ORACLE11G R2官档网址: http://docs.oracle.com/cd/E11882_01/install.112/e24326/toc ...
- Install Oracle 11G Release 2 (11.2) on Centos Linux 7
Install Oracle 11G Release 2 (11.2) on Centos Linux 7 This article presents how to install Oracle 11 ...
- Oracle Database 11G R2 标准版 企业版 下载地址(转)
转自:http://blog.itpub.net/628922/viewspace-759245/ 不需要注册,直接复制到迅雷或其他下载软件中即可下载. oracle 11.2.0.3 下载地址: L ...
- win10 安装Oracle 11g release 2
参考资料: Oracle Database 11g Release 2 安装详解 - WIN 10 系统 准备工作: 安装 Oracle 11g 之前,要确保在此操作系统上未安装过 Oracle,或者 ...
- Database Initialization Parameters for Oracle E-Business Suite Release 12 (文档 ID 396009.1)
In This Document Section 1: Common Database Initialization Parameters For All Releases Section 2: Re ...
随机推荐
- 莫烦tensorflow(7)-mnist
import tensorflow as tffrom tensorflow.examples.tutorials.mnist import input_data#number 1 to 10 dat ...
- spring boot 之 spring task(定时任务)
cron:通过表达式来配置任务执行时间cron表达式详解 一个cron表达式有至少6个(也可能7个)有空格分隔的时间元素.按顺序依次为: 秒(0~59)分钟(0~59)3 小时(0~23)4 天(0 ...
- 初窥async,await
首先是一道今日头条的面试题:(听说是今日头条的并且已经烂大街了) async function async1() { console.log( 'async1 start' ) await async ...
- ImageLorderUtil
import android.content.Context;import android.graphics.Bitmap;import android.os.Environment; import ...
- C++学习(三十八)(C语言部分)之 排序(冒泡 选择 插入 快排)
算法是解决一类问题的方法排序算法 根据元素大小关系排序 从小到大 从大到小冒泡 选择 插入 快排希尔排序 归并排序 堆排序 冒泡排序 从头到尾比较 每一轮将最大的数沉底 或者最小数字上浮 选择排序 1 ...
- 《DSP using MATLAB》Problem 7.3
- Python基础-socketserver
ocketserver通讯模块实现并发操作,基于select.epoll.socket.多线程,实现的正真多线程多并发 socketserver通讯模块底层调用的socket模块,只是它作了处理基于l ...
- requestmapping等相关知识
@responseBody注解的使用 1. @responseBody注解的作用是将controller的方法返回的对象通过适当的转换器转换为指定的格式之后,写入到response对象的body区 ...
- systemd service 设置limit,不生效问题
参考博文: http://smilejay.com/2016/06/centos-7-systemd-conf-limits/(解决方法参考此博文) 问题简述:Centos7下修改系统的最大文件打 ...
- 串口接收端verilog代码分析
串口接收端verilog代码分析 `timescale 1ns / 1ps ////////////////////////////////////////////////////////////// ...