How Region Split works in Hbase
- A region is decided to be split when store file size goes above hbase.hregion.max.filesize or according to defined region split policy.
- At this point this region is divided into two by region server.
- Region server creates two reference files for these two daughter regions.
- These reference files are stored in a new directory called splits under parent directory.
- Exactly at this point, parent region is marked as closed or offline so no client tries to read or write to it.
- Now region server creates two new directories in splits directory for these daughter regions.
- If steps till 6 are completed successfully, Region server moves both daughter region directories under table directory.
- The META table is now informed of the creation of two new regions, along with an update in the entry of parent region that it has now been split and is offline. (OFFLINE=true , SPLIT=true)
- The reference files are very small files containing only the key at which the split happened and also whether it represents top half or bottom half of the parent region.
- There is a class called “HalfHFileReader”which then utilizes these two reference files to read the original data file of parent region and also to decide as which half of the file has to be read.
- Both regions are now brought online by region server and start serving requests to clients.
- As soon as the daughter regions come online, a compaction is scheduled which rewrites the HFile of parent region into two HFiles independent for both daughter regions.
- As this process in step 12 completes, both the HFiles cleanly replace their respective reference files. The compaction activity happens under .tmp directory of daughter regions.
- With the successful completion till step 13, the parent region is now removed from META and all its files and directories marked for deletion.
- Finally Master server is informed by this Region server about two new regions getting born. Master now decides the fate of the two regions as to let them run on same region server or have them travel to another one.
How Region Split works in Hbase的更多相关文章
- 「从零单排HBase 05」核心特性region split
HBase拥有出色的扩展性,其中最依赖的就是region的自动split机制. 1.split触发时机与策略 前面我们已经知道了,数据写入过程中,需要先写memstore,然后memstore满了以后 ...
- region split流程分析
region split流程分析 splitregion的发起主要通过client端调用regionserver.splitRegion或memstore.flsuh时检查并发起. Client通过r ...
- [HBase]region split流程
1. 简介 HBase 的最小管理单位为region,region会按照region 分裂策略进行分裂. 基于CDH5.4.2 2. 总览
- HBASE 基础命令总结
HBASE基础命令总结 一,概述 本文中介绍了hbase的基础命令,作者既有记录总结hbase基础命令的目的还有本着分享的精神,和广大读者一起进步.本文的hbase版本是:HBase 1.2.0-cd ...
- Hbase学习02
第2章 Apache HBase配置 本章在“入门”一章中进行了扩展,以进一步解释Apache HBase的配置. 请仔细阅读本章,特别是基本先决条件,确保您的HBase测试和部署顺利进行,并防止数据 ...
- hbase集群region数量和大小的影响
1.Region数量的影响 通常较少的region数量可使群集运行的更加平稳,官方指出每个RegionServer大约100个regions的时候效果最好,理由如下: 1)Hbase的一个特性MSLA ...
- HBase如何选取split point
hbase region split操作的一些细节,具体split步骤很多文档都有说明,本文主要关注regionserver如何选取split point 首先推荐web ui查看hbase regi ...
- Hbase split的三种方式和split的过程
在Hbase中split是一个很重要的功能,Hbase是通过把数据分配到一定数量的region来达到负载均衡的.一个table会被分配到一个或多个region中,这些region会被分配到一个或者多个 ...
- hbase日常运维管用命令,region管理
1 Hbase日常运维 1.1 监控Hbase运行状况 1.1.1 操作系统 1.1.1.1 IO 群集网络IO,磁盘IO,HDFS IO IO越大说明文件读 ...
随机推荐
- Html img 标签
Html img 标签 <html> <body> <!-- img 标签用于显示图片.src="xxx.jpg" 指定图片路径名称--> &l ...
- Mysql 集合链接查询
MySQL NULL 值处理 需求:我们已经知道MySQL使用 SQL SELECT 命令及 WHERE 子句来读取数据表中的数据,但是当提供的查询条件字段为 NULL 时,该命令可能就无法正常工作. ...
- Jenkins介绍和安装及配合GitLab代码自动部署
Jenkins是什么? 基于JAVA的开源的自动化系统平台 加速自动化CI,CD任务及流水线,所有类型的任务:构建,测试,部署等 丰富的插件生态系统支持功能扩展,1400+插件和SCM,测试,通知,报 ...
- 【CentOS 7】CentOS7与CentOS6 的区别
前言 centos7与6之间最大的差别就是初始化技术的不同,7采用的初始化技术是Systemd,并行的运行方式,除了这一点之外,服务启动.开机启动文件.网络命令方面等等,都说6有所不同. 一.系统初始 ...
- 论文笔记: Mutual Learning to Adapt for Joint Human Parsing and Pose Estimation
Mutual Learning to Adapt for Joint Human Parsing and Pose Estimation 2018-11-03 09:58:58 Paper: http ...
- 力扣(LeetCode)605. 种花问题
假设你有一个很长的花坛,一部分地块种植了花,另一部分却没有.可是,花卉不能种植在相邻的地块上,它们会争夺水源,两者都会死去. 给定一个花坛(表示为一个数组包含0和1,其中0表示没种植花,1表示种植了花 ...
- (15)线程---Condition条件
功能:也是通过阻塞控制线程数量.类似信号量\进程池\线程池的作用 语法:wait from threading import Condition con= Condition() conn.acq ...
- JVM垃圾回收(一)- 什么是垃圾回收
什么是垃圾回收? 垃圾回收是追踪所有正在被使用的对象,并标注剩余的为garbage.这里我们先从JVM的GC是如何实现的说起. 手动内存管理 在开始介绍垃圾回收之前,我们先复习一下手动内存管理.它是指 ...
- MySql 8.0 版本使用navicat连不上解决
先通过命令行进入mysql的root账户: 更改加密方式 ALTER USER 'root'@'localhost' IDENTIFIED BY 'password' PASSWORD EXPIRE ...
- 漏洞复现——Apache SSI远程命令执行
漏洞原理:当目标服务器开启了SSI与CGI支持,我们就可以上传shtml文件,利用<!--#exec cmd="id" -->语法执行命令. SSI:SSI(服务器端包 ...