OCFS2 Fencing

Posted on February 8, 2011 by Abdulhameed Basha

I am very excited to start writing about my experience with Oracle products
and solutions. In this blog post I am going to describe OCFS2 fencing in detail.

We are setting up a new data center for Oracle products,
for which we have Sun servers (X4170, X4270 and X4470),
Sun 6180 SAN storage, SAN switches and Fibre Channel
HBAs (Host Bus Adapters). We have configured the SAN storage and allocated
LUNs to the servers for the Oracle EBS R12.1, OBI and Hyperion applications.

We created the required partitions and OCFS2 file systems on the new devices
for our Oracle EBS R12.1 RAC database and the shared application file system.
OCFS2 is certified for a shared application file system, check ID .
After configuring and mounting the file systems, we started testing
various scenarios to verify the stability of the setup. To our surprise, we noticed
that a server rebooted whenever we removed all the HBA connections from that
server or powered off the SAN switch.

On analysis we found that this is expected behavior of OCFS2, which is called “Fencing”.

Fencing is the act of forcefully removing a node from a cluster.
A node with an OCFS2 file system mounted will fence itself when it
realizes that it does not have quorum in a degraded cluster.
It does this so that other nodes won’t be stuck trying to access its resources.
In earlier versions of OCFS2, users reported that nodes were hanging
during fencing. From OCFS2 1.2.5 onwards, Oracle no longer uses a
“panic” for fencing; instead it uses a “machine restart”.

Let us now see how exactly OCFS2 forces the kernel to restart on fencing.

* Once OCFS2 is configured and the O2CB cluster service is started, there is a heartbeat system file in which every node records its presence by writing a timestamp to its own block every 2 seconds.
* The block offset is equal to the node's global node number, that is, node 0 writes to the first block in the file, node 1 to the second block, and so on.
* All nodes read the heartbeat system file every 2 seconds.
* As long as a node's timestamp keeps changing, that node is considered alive.
* A node self-fences if it fails to update its timestamp for ((O2CB_HEARTBEAT_THRESHOLD - 1) * 2) seconds.
* The [o2hb-xx] kernel thread, after every timestamp write, sets a timer to panic the system after that duration.
* If the next timestamp is written within that duration, as it should be, the thread first cancels the old timer before setting up a new one.
* If for some reason the [o2hb-xx] kernel thread is unable to update the timestamp for O2CB_HEARTBEAT_THRESHOLD (default=7 or 31) loops, the node is deemed dead by the other nodes in the cluster and OCFS2 forces the kernel to restart.
* Once a node is deemed dead, the surviving node that manages the cluster locks the dead node’s journal and recovers it by replaying the journal.
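
The timer logic in the steps above can be sketched as a small simulation. This is only a toy model written for illustration, not the real o2hb kernel code: the class and function names (HeartbeatFile, Node, simulate) are made up here, time is counted in whole seconds, and a storage outage is modelled simply as "the write does not happen".

```python
# Toy model of the heartbeat rules described above; not the real o2hb code.
HEARTBEAT_INTERVAL_SECS = 2  # each node writes its timestamp every 2 seconds


class HeartbeatFile:
    """One block per node; the block index equals the global node number."""

    def __init__(self, node_count):
        self.blocks = [0] * node_count

    def write(self, node_number, timestamp):
        self.blocks[node_number] = timestamp


class Node:
    def __init__(self, node_number, threshold):
        self.node_number = node_number
        # Self-fence delay: (O2CB_HEARTBEAT_THRESHOLD - 1) * 2 seconds.
        self.fence_delay = (threshold - 1) * HEARTBEAT_INTERVAL_SECS
        self.fence_deadline = None  # no timer armed until the first write
        self.fenced = False

    def tick(self, now, heartbeat_file, storage_ok):
        """Run one heartbeat iteration at time `now` (in seconds)."""
        if self.fenced:
            return
        if self.fence_deadline is not None and now >= self.fence_deadline:
            self.fenced = True  # the real system restarts the machine here
            return
        if storage_ok:
            heartbeat_file.write(self.node_number, now)
            # Replace the old timer with a new one after each successful write.
            self.fence_deadline = now + self.fence_delay


def simulate(threshold, storage_fails_at, duration):
    """Return the time at which node 1 fences, or None if it never does."""
    heartbeat_file = HeartbeatFile(node_count=2)
    node = Node(node_number=1, threshold=threshold)
    for now in range(0, duration, HEARTBEAT_INTERVAL_SECS):
        node.tick(now, heartbeat_file, storage_ok=now < storage_fails_at)
        if node.fenced:
            return now
    return None


print(simulate(threshold=7, storage_fails_at=10, duration=60))   # 20
print(simulate(threshold=31, storage_fails_at=10, duration=60))  # None
```

With a threshold of 7 the last successful write happens at second 8, so the node fences 12 seconds later, at second 20. With a threshold of 31 the same outage is survived for the whole 60 second run.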

From the above steps it is evident that the parameter O2CB_HEARTBEAT_THRESHOLD = ((timeout in seconds) / 2) + 1 is very important in defining the timeline for restarting the server.
The default value is 31, which means that if the node does not update its timestamp in the heartbeat system file within 60 seconds, that node restarts. This value is quite low for a RAC environment.
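
As a quick check of this arithmetic, here is a minimal sketch that turns a threshold into the corresponding self-fence timeout. The function name fence_timeout_secs is made up for this example; the formula is the one given above.

```python
HEARTBEAT_INTERVAL_SECS = 2  # heartbeat write interval described above


def fence_timeout_secs(threshold):
    """Seconds without a heartbeat write before the node self-fences."""
    return (threshold - 1) * HEARTBEAT_INTERVAL_SECS


for threshold in (7, 31, 61):
    print(threshold, "->", fence_timeout_secs(threshold), "seconds")
# 7 -> 12 seconds
# 31 -> 60 seconds
# 61 -> 120 seconds
```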

Assume that we have a two-controller SAN storage array configured as active/passive, that is, at any particular time only one controller path is active and the other is passive and used for failover. OCFS2 forces the kernel to restart (after the timeout) when a cable to a SAN device is cut, even if the SAN configuration was going to perform a failover.

To overcome this we have to increase the value of O2CB_HEARTBEAT_THRESHOLD. If you want to increase the timeout to 120 seconds, the value should be 61.
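
Going the other way, from a desired timeout to a threshold value, could look like the sketch below. The function name threshold_for_timeout is hypothetical. Rounding odd timeouts up and clamping to the minimum of 7 (the lower bound shown by the configure prompt) are my own assumptions about what is sensible, not something the O2CB tools do for you.

```python
import math

HEARTBEAT_INTERVAL_SECS = 2
MIN_THRESHOLD = 7  # lower bound shown by the "service o2cb configure" prompt


def threshold_for_timeout(timeout_secs):
    """O2CB_HEARTBEAT_THRESHOLD needed to wait at least timeout_secs."""
    # threshold = (timeout / 2) + 1, rounded up so the wait is never shorter.
    threshold = math.ceil(timeout_secs / HEARTBEAT_INTERVAL_SECS) + 1
    return max(threshold, MIN_THRESHOLD)


print(threshold_for_timeout(120))  # 61
print(threshold_for_timeout(60))   # 31
```

For example, if the SAN failover were measured to take up to 90 seconds (a made-up figure), adding a 30 second margin gives threshold_for_timeout(90 + 30), which is 61.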

Steps to change O2CB_HEARTBEAT_THRESHOLD
=====================================
1. Stop the O2CB service on the server
# service o2cb stop

2. Update the O2CB configuration file with the required value for O2CB_HEARTBEAT_THRESHOLD

# service o2cb configure
Configuring the O2CB driver.

This will configure the on-boot properties of the O2CB driver.
The following questions will determine whether the driver is loaded on
boot. The current values will be shown in brackets ('[]'). Hitting
<ENTER> without typing an answer will keep that current value. Ctrl-C
will abort.

Load O2CB driver on boot (y/n) [y]:
Cluster stack backing O2CB [o2cb]:
Cluster to start on boot (Enter "none" to clear) [ocfs2]:
Specify heartbeat dead threshold (>=7) [31]: 61
Specify network idle timeout in ms (>=5000) [30000]:
Specify network keepalive delay in ms (>=1000) [2000]:
Specify network reconnect delay in ms (>=2000) [2000]:
Writing O2CB configuration: OK
Loading filesystem "configfs": OK
Mounting configfs filesystem at /sys/kernel/config: OK
Loading filesystem "ocfs2_dlmfs": OK
Mounting ocfs2_dlmfs filesystem at /dlm: OK
Starting O2CB cluster ocfs2: OK
#

An alternate way to update the value is to edit the file /etc/sysconfig/o2cb and then start the O2CB service.

3. Check the status of O2CB
# /etc/init.d/o2cb status
Driver for "configfs": Loaded
Filesystem "configfs": Mounted
Driver for "ocfs2_dlmfs": Loaded
Filesystem "ocfs2_dlmfs": Mounted
Checking O2CB cluster ocfs2: Online
Heartbeat dead threshold = 61
Network idle timeout: 30000
Network keepalive delay: 2000
Network reconnect delay: 2000
Checking O2CB heartbeat: Active
#
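
If you prefer the alternate route of editing /etc/sysconfig/o2cb, the change and the later status check can be scripted. The following is a rough sketch only. The two helper names are invented for this example, the functions work on text you pass in rather than touching the real file, and I am assuming the file holds a plain O2CB_HEARTBEAT_THRESHOLD=<value> line and that the status output contains the "Heartbeat dead threshold" line shown above.

```python
import re


def set_heartbeat_threshold(config_text, threshold):
    """Return config_text with O2CB_HEARTBEAT_THRESHOLD set to threshold."""
    line = f"O2CB_HEARTBEAT_THRESHOLD={threshold}"
    pattern = re.compile(r"^O2CB_HEARTBEAT_THRESHOLD=.*$", re.MULTILINE)
    if pattern.search(config_text):
        return pattern.sub(line, config_text)
    # Variable not present yet: append it on its own line.
    separator = "" if config_text.endswith("\n") or not config_text else "\n"
    return config_text + separator + line + "\n"


def parse_heartbeat_threshold(status_output):
    """Extract the number from the 'Heartbeat dead threshold' status line."""
    match = re.search(r"Heartbeat dead threshold\s*[=:]\s*(\d+)", status_output)
    return int(match.group(1)) if match else None


# Sample text for illustration; the real file contains more settings.
sample_config = "# sample content\nO2CB_HEARTBEAT_THRESHOLD=31\n"
print(set_heartbeat_threshold(sample_config, 61))

sample_status = "Checking O2CB cluster ocfs2: Online\nHeartbeat dead threshold = 61\n"
print(parse_heartbeat_threshold(sample_status))  # 61
```

To apply it for real you would read the file, write the returned text back as root, start the O2CB service, and feed the output of the status command to parse_heartbeat_threshold to confirm the new value is active.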

References
========
1. OCFS2: A Cluster File System for Linux – User’s Guide for Release 1.4
2. OCFS2 Kernel Panics on SAN Failover [ID ]
3. OCFS2 1.2 – FREQUENTLY ASKED QUESTIONS [ID ]
4. Heartbeat/Voting/Quorum Related Timeout Configuration for Linux, OCFS2, RAC Stack to Avoid Unnecessary Node Fencing, Panic and Reboot [ID ]

In my next blog post, I will explain the detailed steps for configuring multipath using the native device mapper (DM) in Linux.
