A Region Split request is triggered after a Region's MemStore flush:

boolean shouldCompact = region.flushcache();
// We just want to check the size
boolean shouldSplit = region.checkSplit() != null;
if (shouldSplit) {
  this.server.compactSplitThread.requestSplit(region);
} else if (shouldCompact) {
  server.compactSplitThread.requestCompaction(region, getName());
}
server.getMetrics().addFlush(region.getRecentFlushInfo());

Once the Region flush completes, checkSplit is evaluated. A non-null return value (the Region's SplitPoint) means the Region meets the conditions for a split, and the corresponding split request is issued.
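The branch taken after a flush can be summarized in a small, self-contained sketch. The names PostFlushDecision, PostFlushAction, and decide are hypothetical and are not part of HBase. The only behavior taken from the snippet above is that a non-null split point takes priority over a pending compaction.

```java
// Hypothetical sketch, not HBase code: models only the branch taken after a flush.
final class PostFlushDecision {
    enum PostFlushAction { REQUEST_SPLIT, REQUEST_COMPACTION, NONE }

    // splitPoint mirrors the result of region.checkSplit(); null means "do not split".
    static PostFlushAction decide(byte[] splitPoint, boolean shouldCompact) {
        if (splitPoint != null) {
            return PostFlushAction.REQUEST_SPLIT;       // a split wins over a compaction
        } else if (shouldCompact) {
            return PostFlushAction.REQUEST_COMPACTION;  // only when no split is requested
        }
        return PostFlushAction.NONE;
    }

    public static void main(String[] args) {
        System.out.println(decide(new byte[] {'m'}, true));
        System.out.println(decide(null, true));
        System.out.println(decide(null, false));
    }
}
```

In this model a compaction is requested only when the flush asked for one and no split point was found, which matches the if / else if structure of the quoted code.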

The checkSplit method is defined as follows:

/**
 * Return the splitpoint. null indicates the region isn't splittable. If the
 * splitpoint isn't explicitly specified, it will go over the stores to find
 * the best splitpoint. Currently the criteria of best splitpoint is based
 * on the size of the store.
 */
public byte[] checkSplit() {
  // Can't split ROOT/META
  if (this.regionInfo.isMetaTable()) {
    if (shouldForceSplit()) {
      LOG.warn("Cannot split root/meta regions in HBase 0.20 and above");
    }
    return null;
  }
  if (!splitPolicy.shouldSplit()) {
    return null;
  }
  byte[] ret = splitPolicy.getSplitPoint();
  if (ret != null) {
    try {
      checkRow(ret, "calculated split");
    } catch (IOException e) {
      LOG.error("Ignoring invalid split", e);
      return null;
    }
  }
  return ret;
}

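As the code shows, a Region that belongs to a catalog table (ROOT/META) is never split. For any other Region, its RegionSplitPolicy instance decides whether a split is needed, in two steps:

(1) whether the Region is allowed to split;

(2) given that it is allowed to split, whether a SplitPoint can be computed for it.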

RegionSplitPolicy shouldSplit

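Unless a policy is specified explicitly when the table is defined, the RegionSplitPolicy defaults to an instance of org.apache.hadoop.hbase.regionserver.IncreasingToUpperBoundRegionSplitPolicy. The configuration key is hbase.regionserver.region.split.policy.

The method code is as follows: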

@Override
protected boolean shouldSplit() {
  if (region.shouldForceSplit()) {
    return true;
  }
  boolean foundABigStore = false;
  // Get count of regions that have the same common table as this.region
  int tableRegionsCount = getCountOfCommonTableRegions();
  // Get size to check
  long sizeToCheck = getSizeToCheck(tableRegionsCount);
  for (Store store : region.getStores().values()) {
    // If any of the stores is unable to split (eg they contain
    // reference files) then don't split
    if ((!store.canSplit())) {
      return false;
    }
    // Mark if any store is big enough
    long size = store.getSize();
    if (size > sizeToCheck) {
      LOG.debug("ShouldSplit because " + store.getColumnFamilyName()
          + " size=" + size + ", sizeToCheck=" + sizeToCheck
          + ", regionsWithCommonTable=" + tableRegionsCount);
      foundABigStore = true;
      break;
    }
  }
  return foundABigStore;
}

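Execution flow:

(1) If a ForceSplit has been requested for the current Region, return true immediately;

(2) Compute the size threshold (sizeToCheck) that applies to each Store in the current Region;

(3) Loop over the Stores of the current Region and check whether any Store exceeds the threshold. As soon as one such Store is found, the loop ends early and true is returned.

A Store whose size is checked must be splittable, that is, it must not contain any Reference files. A Reference file in a Store means the Region was itself produced by an earlier split and still refers to its parent's files, so it cannot be split again yet. In that case false is returned immediately.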
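The flow above can be sketched with a simplified model. ShouldSplitSketch and StoreInfo are hypothetical stand-ins, not HBase classes. They keep only the two Store properties the loop inspects (whether the Store can split, and its size).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not HBase code: a simplified model of the shouldSplit loop.
final class ShouldSplitSketch {
    // Minimal stand-in for a Store.
    static final class StoreInfo {
        final boolean canSplit; // false when the store holds Reference files
        final long size;        // store size in bytes

        StoreInfo(boolean canSplit, long size) {
            this.canSplit = canSplit;
            this.size = size;
        }
    }

    static boolean shouldSplit(boolean forceSplit, List<StoreInfo> stores, long sizeToCheck) {
        if (forceSplit) {
            return true;                 // step (1): a forced split skips all size checks
        }
        for (StoreInfo store : stores) {
            if (!store.canSplit) {
                return false;            // an unsplittable store vetoes the split
            }
            if (store.size > sizeToCheck) {
                return true;             // step (3): first store above the threshold
            }
        }
        return false;                    // no store was large enough
    }

    public static void main(String[] args) {
        List<StoreInfo> stores = Arrays.asList(new StoreInfo(true, 100L), new StoreInfo(true, 900L));
        System.out.println(shouldSplit(false, stores, 500L));
    }
}
```

As in the quoted loop, an unsplittable Store only vetoes the split if it is visited before a Store that exceeds the threshold, because the loop exits at the first oversized Store.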

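The way this per-Store size threshold is computed deserves a closer look:

(1) Let t be the table the current Region belongs to. Count the online Regions of table t on the RegionServer that hosts this Region, and store the result in the variable tableRegionsCount;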

// Get count of regions that have the same common table as this.region
int tableRegionsCount = getCountOfCommonTableRegions();

The getCountOfCommonTableRegions method is as follows:

/**
 * @return Count of regions on this server that share the table this.region
 * belongs to
 */
private int getCountOfCommonTableRegions() {
  RegionServerServices rss = this.region.getRegionServerServices();
  // Can be null in tests
  if (rss == null) {
    return 0;
  }
  byte[] tablename = this.region.getTableDesc().getName();
  int tableRegionsCount = 0;
  try {
    List<HRegion> hri = rss.getOnlineRegions(tablename);
    tableRegionsCount = hri == null || hri.isEmpty() ? 0 : hri.size();
  } catch (IOException e) {
    LOG.debug("Failed getOnlineRegions " + Bytes.toString(tablename), e);
  }
  return tableRegionsCount;
}

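First, obtain the RegionServer instance that hosts the Region:

RegionServerServices rss = this.region.getRegionServerServices();

Then obtain the name of the table the Region belongs to:

byte[] tablename = this.region.getTableDesc().getName();

Finally, obtain the online Regions of table tablename on rss. The size of this list is the count:

List<HRegion> hri = rss.getOnlineRegions(tablename);

(2) Compute the threshold from tableRegionsCount: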

// Get size to check
long sizeToCheck = getSizeToCheck(tableRegionsCount);

The getSizeToCheck method is as follows:

/**
 * @return Region max size or
 * <code>count of regions squared * flushsize</code>, whichever is
 * smaller; guard against there being zero regions on this server.
 */
long getSizeToCheck(final int tableRegionsCount) {
  return tableRegionsCount == 0 ? getDesiredMaxFileSize() : Math.min(
      getDesiredMaxFileSize(),
      this.flushSize * (tableRegionsCount * tableRegionsCount));
}

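The computation has two cases, depending on the value of tableRegionsCount:

(1) When tableRegionsCount is 0, the result of getDesiredMaxFileSize is returned directly. Judging from getCountOfCommonTableRegions, this case can arise when getRegionServerServices() returns null (which the source comment says can happen in tests), when getOnlineRegions returns null or an empty list, or when it fails with an IOException. The value returned by getDesiredMaxFileSize can be specified when the table is created. Otherwise it is determined by the configuration key hbase.hregion.max.filesize, whose default is 10737418240 bytes (10 GB);

(2) When tableRegionsCount is not 0, the result is the smaller of getDesiredMaxFileSize() and this.flushSize * (tableRegionsCount * tableRegionsCount). flushSize can be specified when the table is created. Otherwise it is determined by the configuration key hbase.hregion.memstore.flush.size, whose default is 134217728 bytes (128 MB).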
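To make the two cases concrete, the following sketch reproduces the arithmetic with the default values quoted above. SizeToCheckSketch is a hypothetical class, not HBase code. The two constants stand in for getDesiredMaxFileSize() and flushSize, and the square is computed as a long, which is a small deviation from the quoted source.

```java
// Hypothetical sketch, not HBase code: the getSizeToCheck arithmetic with default values.
final class SizeToCheckSketch {
    static final long MAX_FILE_SIZE = 10737418240L; // default of hbase.hregion.max.filesize (10 GB)
    static final long FLUSH_SIZE = 134217728L;      // default of hbase.hregion.memstore.flush.size (128 MB)

    static long sizeToCheck(int tableRegionsCount) {
        if (tableRegionsCount == 0) {
            return MAX_FILE_SIZE; // case (1): no region count available
        }
        // case (2): flush size times the squared region count, capped at the max file size
        long squared = (long) tableRegionsCount * tableRegionsCount;
        return Math.min(MAX_FILE_SIZE, FLUSH_SIZE * squared);
    }

    public static void main(String[] args) {
        for (int count = 0; count <= 10; count++) {
            long megabytes = sizeToCheck(count) / (1024L * 1024L);
            System.out.println(count + " region(s) -> " + megabytes + " MB");
        }
    }
}
```

Under these defaults the threshold is 128 MB with one Region of the table on the server, 512 MB with two, and 8192 MB with eight. From nine Regions onward the product (81 * 128 MB = 10368 MB) exceeds 10 GB, so the threshold stays at the 10 GB cap.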

RegionSplitPolicy getSplitPoint

Reaching this step means the Region is allowed to split. The next step is to compute the Region's SplitPoint.

The method code is as follows:

/**
 * @return the key at which the region should be split, or null if it cannot
 * be split. This will only be called if shouldSplit previously
 * returned true.
 */
protected byte[] getSplitPoint() {
  byte[] explicitSplitPoint = this.region.getExplicitSplitPoint();
  if (explicitSplitPoint != null) {
    return explicitSplitPoint;
  }
  Map<byte[], Store> stores = region.getStores();
  byte[] splitPointFromLargestStore = null;
  long largestStoreSize = 0;
  for (Store s : stores.values()) {
    byte[] splitPoint = s.getSplitPoint();
    long storeSize = s.getSize();
    if (splitPoint != null && largestStoreSize < storeSize) {
      splitPointFromLargestStore = splitPoint;
      largestStoreSize = storeSize;
    }
  }
  return splitPointFromLargestStore;
}

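The execution flow is as follows:

(1) If a SplitPoint was explicitly specified when the ForceSplit was requested, return that value directly;

(2) Loop over the Region's Stores, obtaining the size and SplitPoint of each. The Region's SplitPoint is the SplitPoint of the largest Store among those that return a non-null SplitPoint.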
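The selection rule in the two steps above can be sketched as follows. RegionSplitPointSketch and StoreCandidate are hypothetical names, not HBase classes. Each candidate carries only a size and an optional split point.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not HBase code: picks the split point of the largest eligible store.
final class RegionSplitPointSketch {
    static final class StoreCandidate {
        final long size;         // store size in bytes
        final byte[] splitPoint; // null when the store cannot offer a split point

        StoreCandidate(long size, byte[] splitPoint) {
            this.size = size;
            this.splitPoint = splitPoint;
        }
    }

    static byte[] getSplitPoint(byte[] explicitSplitPoint, List<StoreCandidate> stores) {
        if (explicitSplitPoint != null) {
            return explicitSplitPoint; // step (1): an explicit split point always wins
        }
        byte[] best = null;
        long largestSize = 0;
        for (StoreCandidate s : stores) {
            // step (2): stores without a split point are skipped, however large they are
            if (s.splitPoint != null && largestSize < s.size) {
                best = s.splitPoint;
                largestSize = s.size;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<StoreCandidate> stores = Arrays.asList(
                new StoreCandidate(300L, "m".getBytes(StandardCharsets.UTF_8)),
                new StoreCandidate(900L, null),
                new StoreCandidate(500L, "t".getBytes(StandardCharsets.UTF_8)));
        System.out.println(new String(getSplitPoint(null, stores), StandardCharsets.UTF_8));
    }
}
```

In the example the 900-byte store is the largest but offers no split point, so the split point of the 500-byte store is chosen.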

The next question is how a Store's SplitPoint is computed.

Store getSplitPoint

/**
 * Determines if Store should be split
 *
 * @return byte[] if store should be split, null otherwise.
 */
public byte[] getSplitPoint() {
  this.lock.readLock().lock();
  try {
    // sanity checks
    if (this.storefiles.isEmpty()) {
      return null;
    }
    // Should already be enforced by the split policy!
    assert !this.region.getRegionInfo().isMetaRegion();
    // Not splitable if we find a reference store file present in the
    // store.
    long maxSize = 0L;
    StoreFile largestSf = null;
    for (StoreFile sf : storefiles) {
      if (sf.isReference()) {
        // Should already be enforced since we return false in this
        // case
        assert false : "getSplitPoint() called on a region that can't split!";
        return null;
      }
      StoreFile.Reader r = sf.getReader();
      if (r == null) {
        LOG.warn("Storefile " + sf + " Reader is null");
        continue;
      }
      long size = r.length();
      if (size > maxSize) {
        // This is the largest one so far
        maxSize = size;
        largestSf = sf;
      }
    }
    StoreFile.Reader r = largestSf.getReader();
    if (r == null) {
      LOG.warn("Storefile " + largestSf + " Reader is null");
      return null;
    }
    // Get first, last, and mid keys. Midkey is the key that starts block
    // in middle of hfile. Has column and timestamp. Need to return just
    // the row we want to split on as midkey.
    byte[] midkey = r.midkey();
    if (midkey != null) {
      KeyValue mk = KeyValue.createKeyValueFromKey(midkey, 0, midkey.length);
      byte[] fk = r.getFirstKey();
      KeyValue firstKey = KeyValue.createKeyValueFromKey(fk, 0, fk.length);
      byte[] lk = r.getLastKey();
      KeyValue lastKey = KeyValue.createKeyValueFromKey(lk, 0, lk.length);
      // if the midkey is the same as the first or last keys, then we
      // cannot (ever) split this region.
      if (this.comparator.compareRows(mk, firstKey) == 0
          || this.comparator.compareRows(mk, lastKey) == 0) {
        if (LOG.isDebugEnabled()) {
          LOG.debug("cannot split because midkey is the same as first or "
              + "last row");
        }
        return null;
      }
      return mk.getRow();
    }
  } catch (IOException e) {
    LOG.warn("Failed getting store size for " + this, e);
  } finally {
    this.lock.readLock().unlock();
  }
  return null;
}

Execution flow:

(1) Select the largest StoreFile among the Store's StoreFiles, largestSf;

long maxSize = 0L;
StoreFile largestSf = null;
for (StoreFile sf : storefiles) {
  if (sf.isReference()) {
    // Should already be enforced since we return false in this
    // case
    assert false : "getSplitPoint() called on a region that can't split!";
    return null;
  }
  StoreFile.Reader r = sf.getReader();
  if (r == null) {
    LOG.warn("Storefile " + sf + " Reader is null");
    continue;
  }
  long size = r.length();
  if (size > maxSize) {
    // This is the largest one so far
    maxSize = size;
    largestSf = sf;
  }
}

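(2) Obtain the MidKey, FirstKey, and LastKey of largestSf. If the row of MidKey equals the row of FirstKey or the row of LastKey, return null (the reason is explained after the code below). Otherwise return the row of MidKey.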

// Get first, last, and mid keys. Midkey is the key that starts block
// in middle of hfile. Has column and timestamp. Need to return just
// the row we want to split on as midkey.
byte[] midkey = r.midkey();
if (midkey != null) {
  KeyValue mk = KeyValue.createKeyValueFromKey(midkey, 0, midkey.length);
  byte[] fk = r.getFirstKey();
  KeyValue firstKey = KeyValue.createKeyValueFromKey(fk, 0, fk.length);
  byte[] lk = r.getLastKey();
  KeyValue lastKey = KeyValue.createKeyValueFromKey(lk, 0, lk.length);
  // if the midkey is the same as the first or last keys, then we
  // cannot (ever) split this region.
  if (this.comparator.compareRows(mk, firstKey) == 0
      || this.comparator.compareRows(mk, lastKey) == 0) {
    if (LOG.isDebugEnabled()) {
      LOG.debug("cannot split because midkey is the same as first or "
          + "last row");
    }
    return null;
  }
  return mk.getRow();
}

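A StoreFile is made up of multiple Blocks (these are not HDFS blocks), and the first key of each Block is recorded in a dedicated index area of the StoreFile. MidKey is therefore the first key of the Block in the middle of the file, as the code comment notes, while FirstKey and LastKey are the first and last keys of the file. These are full keys that include column and timestamp, which is why the code compares only their row parts with compareRows and returns mk.getRow().

A Region Split uses the row as its smallest unit: all data of one row always ends up in the same Region. If the row of MidKey equals the row of FirstKey, every row in the file is at or after the split point, so one side of the split would receive nothing from this file. If it equals the row of LastKey, the side starting at the split point would hold only that single row. Neither outcome divides the data in a meaningful way, so no split is performed when the MidKey row equals the FirstKey row or the LastKey row.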
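The row comparison described above can be illustrated with plain byte arrays. MidRowCheckSketch is a hypothetical class, not HBase code. It uses byte-wise equality as a stand-in for comparator.compareRows and assumes the row parts have already been extracted from the keys.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch, not HBase code: rejects a split row equal to the first or last row.
final class MidRowCheckSketch {
    // Returns the row to split on, or null when no useful split row exists.
    static byte[] splitRow(byte[] firstRow, byte[] midRow, byte[] lastRow) {
        if (midRow == null) {
            return null; // no middle key available
        }
        // Assumption: byte-wise equality stands in for comparator.compareRows(...) == 0.
        if (Arrays.equals(midRow, firstRow) || Arrays.equals(midRow, lastRow)) {
            return null; // splitting here would not divide the file's rows
        }
        return midRow;
    }

    private static byte[] row(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(splitRow(row("a"), row("m"), row("z")) != null); // usable split row
        System.out.println(splitRow(row("a"), row("a"), row("z")) != null); // middle row equals first row
        System.out.println(splitRow(row("a"), row("z"), row("z")) != null); // middle row equals last row
    }
}
```

A file in which one very wide row fills the first half or the second half of the Blocks is the typical situation in which the middle row coincides with the first or last row.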

In summary, if a Region meets the split conditions and a SplitPoint can be computed for it, a split request can be issued:

this.server.compactSplitThread.requestSplit(region);
