Scan operation during a major compaction
Copyright notice: this is an original post by the author; do not repost without the author's permission.
https://blog.csdn.net/u014393917/article/details/24419355
When a major compaction is initiated, execution starts in CompactSplitThread.CompactionRunner.run
--> region.compact(compaction, store) --> store.compact(compaction) -->
CompactionContext.compact, which kicks off the compaction.
The CompactionContext instance is created by storeEngine.createCompaction() in HStore.
The store engine defaults to DefaultStoreEngine and is configured via hbase.hstore.engine.class;
its CompactionContext implementation is DefaultCompactionContext.
DefaultCompactionContext.compact ultimately delegates to the DefaultStoreEngine's compactor.
The compactor implementation is configured via hbase.hstore.defaultengine.compactor.class and defaults to DefaultCompactor.
The call thus becomes DefaultCompactor.compact(request), which does the following:
1. For each store file selected for compaction, build the corresponding list of StoreFileScanner instances.
   When these StoreFileScanner instances are created, each scanner's ScanQueryMatcher is null.
2. Create a StoreScanner instance, with ScanType set to ScanType.COMPACT_DROP_DELETES.
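The engine and compactor selection above is configuration-driven. A minimal sketch of that lookup pattern follows; the Config class here is an illustrative stand-in for HBase's Configuration, not the real API:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of config-driven class selection, as DefaultStoreEngine does for its
// compactor: look up a class name by key, falling back to a default when unset.
public class EngineConfigDemo {
  static class Config {
    private final Map<String, String> props = new HashMap<>();

    void set(String key, String value) { props.put(key, value); }

    // Return the configured value, or the default when the key is unset.
    String get(String key, String defaultValue) {
      return props.getOrDefault(key, defaultValue);
    }
  }

  public static void main(String[] args) {
    Config conf = new Config();
    // Nothing configured: both keys fall back to their defaults.
    System.out.println(conf.get("hbase.hstore.engine.class",
        "org.apache.hadoop.hbase.regionserver.DefaultStoreEngine"));
    System.out.println(conf.get("hbase.hstore.defaultengine.compactor.class",
        "org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor"));
    // An explicit setting overrides the default.
    conf.set("hbase.hstore.engine.class", "com.example.CustomStoreEngine");
    System.out.println(conf.get("hbase.hstore.engine.class",
        "org.apache.hadoop.hbase.regionserver.DefaultStoreEngine"));
  }
}
```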
private StoreScanner(Store store, ScanInfo scanInfo, Scan scan,
    List<? extends KeyValueScanner> scanners, ScanType scanType, long smallestReadPoint,
    long earliestPutTs, byte[] dropDeletesFromRow, byte[] dropDeletesToRow) throws IOException {
  this(store, false, scan, null, scanInfo.getTtl(), scanInfo.getMinVersions());
  if (dropDeletesFromRow == null) {
    // This branch runs here; the columns argument passed in is null.
    matcher = new ScanQueryMatcher(scan, scanInfo, null, scanType,
        smallestReadPoint, earliestPutTs, oldestUnexpiredTS);
  } else {
    matcher = new ScanQueryMatcher(scan, scanInfo, null, smallestReadPoint,
        earliestPutTs, oldestUnexpiredTS, dropDeletesFromRow, dropDeletesToRow);
  }
}
The ScanQueryMatcher constructor (called with columns == null):

public ScanQueryMatcher(Scan scan, ScanInfo scanInfo,
    NavigableSet<byte[]> columns, ScanType scanType,
    long readPointToUse, long earliestPutTs, long oldestUnexpiredTS) {
  // In tr, minStamp = 0 and maxStamp = Long.MAX_VALUE.
  this.tr = scan.getTimeRange();
  this.rowComparator = scanInfo.getComparator();
  // The delete (kv) information tracked here is cleared every time a new row is reached.
  this.deletes = new ScanDeleteTracker();
  this.stopRow = scan.getStopRow();
  this.startKey = KeyValue.createFirstDeleteFamilyOnRow(scan.getStartRow(),
      scanInfo.getFamily());
  // Get the filter instance.
  this.filter = scan.getFilter();
  this.earliestPutTs = earliestPutTs;
  this.maxReadPointToTrackVersions = readPointToUse;
  this.timeToPurgeDeletes = scanInfo.getTimeToPurgeDeletes();

  /* how to deal with deletes */
  // false here, since scanType is COMPACT_DROP_DELETES.
  this.isUserScan = scanType == ScanType.USER_SCAN;

  // false here: scanInfo.getKeepDeletedCells() defaults to false and can be set
  // via the KEEP_DELETED_CELLS attribute on the table's column family;
  // scan.isRaw() is driven by the scan's _raw_ attribute and defaults to false.
  // keep deleted cells: if compaction or raw scan
  this.keepDeletedCells = (scanInfo.getKeepDeletedCells() && !isUserScan) || scan.isRaw();

  // false here: this is a major compaction, so deleted data is not retained.
  // retain deletes: if minor compaction or raw scan
  this.retainDeletesInOutput = scanType == ScanType.COMPACT_RETAIN_DELETES || scan.isRaw();

  // false here.
  // seePastDeleteMarkers: user initiated scans
  this.seePastDeleteMarkers = scanInfo.getKeepDeletedCells() && isUserScan;

  // Maximum number of versions to read; here scan.getMaxVersions() and
  // scanInfo.getMaxVersions() hold the same value.
  int maxVersions = scan.isRaw() ? scan.getMaxVersions() :
      Math.min(scan.getMaxVersions(), scanInfo.getMaxVersions());

  // columns is null here, so this.columns becomes a ScanWildcardColumnTracker
  // and hasNullColumn is set to true.
  // Single branch to deal with two types of reads (columns vs all in family)
  if (columns == null || columns.size() == 0) {
    // there is always a null column in the wildcard column query.
    hasNullColumn = true;
    // The tracker's internal index records which column is currently being
    // compared; it is reset each time a new row is compared.
    // use a specialized scan for wildcard column tracker.
    this.columns = new ScanWildcardColumnTracker(
        scanInfo.getMinVersions(), maxVersions, oldestUnexpiredTS);
  } else {
    // This branch is not executed during a compaction.
    // whether there is null column in the explicit column query
    hasNullColumn = (columns.first().length == 0);
    // We can share the ExplicitColumnTracker, diff is we reset
    // between rows, not between storefiles.
    this.columns = new ExplicitColumnTracker(columns,
        scanInfo.getMinVersions(), maxVersions, oldestUnexpiredTS);
  }
}
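For a major compaction scan (ScanType.COMPACT_DROP_DELETES, no _raw_ attribute, KEEP_DELETED_CELLS off), all four delete-handling flags above come out false. A stand-alone sketch of that boolean logic — the flag names mirror the constructor, but the enum and method are re-declared here purely for illustration:

```java
public class DeleteFlagsDemo {
  enum ScanType { USER_SCAN, COMPACT_DROP_DELETES, COMPACT_RETAIN_DELETES }

  // Returns { isUserScan, keepDeletedCells, retainDeletesInOutput, seePastDeleteMarkers },
  // computed exactly as in the ScanQueryMatcher constructor.
  static boolean[] computeFlags(ScanType scanType, boolean keepDeletedCellsConf, boolean raw) {
    boolean isUserScan = scanType == ScanType.USER_SCAN;
    boolean keepDeletedCells = (keepDeletedCellsConf && !isUserScan) || raw;
    boolean retainDeletesInOutput = scanType == ScanType.COMPACT_RETAIN_DELETES || raw;
    boolean seePastDeleteMarkers = keepDeletedCellsConf && isUserScan;
    return new boolean[] { isUserScan, keepDeletedCells, retainDeletesInOutput, seePastDeleteMarkers };
  }

  public static void main(String[] args) {
    // Major compaction defaults: all four flags false, so delete markers get dropped.
    for (boolean f : computeFlags(ScanType.COMPACT_DROP_DELETES, false, false)) {
      System.out.println(f);
    }
    // Minor compaction: retainDeletesInOutput becomes true, so markers are kept.
    System.out.println(computeFlags(ScanType.COMPACT_RETAIN_DELETES, false, false)[2]);
  }
}
```

A raw scan (scan.setAttribute for _raw_) flips keepDeletedCells and retainDeletesInOutput to true through the same expressions.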
Analysis of ScanQueryMatcher.match, which decides whether a KeyValue is included

When StoreScanner.next(kvList, limit) fetches the KeyValues of the next row,
it calls ScanQueryMatcher.match to filter the data.
For how each MatchCode return value is acted on, see the following method in StoreScanner:
public boolean next(List<Cell> outResult, int limit) ...
public MatchCode match(KeyValue kv) throws IOException {
  // Call the filter's filterAllRemaining method; if it returns true, the scan is over.
  if (filter != null && filter.filterAllRemaining()) {
    return MatchCode.DONE_SCAN;
  }
  // Get the backing byte array of the kv.
  byte[] bytes = kv.getBuffer();
  // Start position of the KV within bytes.
  int offset = kv.getOffset();
  // Get the key length. A KeyValue is laid out as follows
  // (field widths in bytes; "~" means variable length):
  //
  // | 4      | 4    | 2      | ~   | 1     | ~  | ~      | 8         | 1      | ~     |
  // | keylen | vlen | rowlen | row | cflen | cf | column | timestamp | kvtype | value |
  int keyLength = Bytes.toInt(bytes, offset, Bytes.SIZEOF_INT);
  // Move to where the rowkey length is recorded (past keylen and vlen).
  offset += KeyValue.ROW_OFFSET;
  // Start position of the rowkey length field.
  int initialOffset = offset;
  // Get the rowkey length.
  short rowLength = Bytes.toShort(bytes, offset, Bytes.SIZEOF_SHORT);
  // Move to the start of the rowkey.
  offset += Bytes.SIZEOF_SHORT;
  // Compare the rowkey of the incoming kv with the rowkey of the first kv of the
  // current row, i.e. check whether this kv belongs to the same row.
  int ret = this.rowComparator.compareRows(row, this.rowOffset, this.rowLength,
      bytes, offset, rowLength);
  // If the incoming kv's rowkey is greater than the current row's, the kv belongs
  // to the next row: finish the current next() call (this does not end the scan,
  // only this next(), meaning all kvs of the current row have been found).
  if (ret <= -1) {
    return MatchCode.DONE;
  // Otherwise the incoming kv belongs to a previous row, and the scanner must be
  // moved forward one row.
  } else if (ret >= 1) {
    // could optimize this, if necessary?
    // Could also be called SEEK_TO_CURRENT_ROW, but this
    // should be rare/never happens.
    return MatchCode.SEEK_NEXT_ROW;
  }

  // Optimization: skip the logic below and move the scanner straight to the next
  // row. stickyNextRow is true when one of the following held earlier:
  // 1. ColumnTracker.done() returned true;
  // 2. ColumnTracker.checkColumn returned SEEK_NEXT_ROW;
  // 3. filter.filterKeyValue(kv) returned NEXT_ROW;
  // 4. ColumnTracker.checkVersions returned INCLUDE_AND_SEEK_NEXT_ROW.
  // The ColumnTracker implementation is ScanWildcardColumnTracker when the scan's
  // columns are null or this is a compaction scan, otherwise ExplicitColumnTracker.
  // optimize case.
  if (this.stickyNextRow)
    return MatchCode.SEEK_NEXT_ROW;

  // columns.done() returns false for a ScanWildcardColumnTracker; for an
  // ExplicitColumnTracker it returns whether the current kv is at or past the end
  // of the requested column list.
  if (this.columns.done()) {
    stickyNextRow = true;
    return MatchCode.SEEK_NEXT_ROW;
  }

  // Move to where the family length is recorded.
  // Passing rowLength
  offset += rowLength;
  // Get the family length.
  // Skipping family
  byte familyLength = bytes[offset];
  // Move to where the family name is recorded.
  offset += familyLength + 1;
  // Get the column (qualifier) length.
  int qualLength = keyLength - (offset - initialOffset) - KeyValue.TIMESTAMP_TYPE_SIZE;
  // Get the kv's timestamp.
  long timestamp = Bytes.toLong(bytes, initialOffset + keyLength - KeyValue.TIMESTAMP_TYPE_SIZE);
  // Check whether the timestamp is within the allowed range, mainly whether the
  // TTL has expired.
  // check for early out based on timestamp alone
  if (columns.isDone(timestamp)) {
    // If the kv's TTL has expired, ScanWildcardColumnTracker returns SEEK_NEXT_COL
    // directly (the default during a compaction); ExplicitColumnTracker returns
    // SEEK_NEXT_COL if there is a next column, otherwise SEEK_NEXT_ROW.
    return columns.getNextRowOrNextColumn(bytes, offset, qualLength);
  }
  /*
   * The delete logic is pretty complicated now.
   * This is corroborated by the following:
   * 1. The store might be instructed to keep deleted rows around.
   * 2. A scan can optionally see past a delete marker now.
   * 3. If deleted rows are kept, we have to find out when we can
   *    remove the delete markers.
   * 4. Family delete markers are always first (regardless of their TS)
   * 5. Delete markers should not be counted as version
   * 6. Delete markers affect puts of the *same* TS
   * 7. Delete marker need to be version counted together with puts
   *    they affect
   */
  // Get the kv's type.
  byte type = bytes[initialOffset + keyLength - 1];
  // If the kv is a delete marker:
  if (kv.isDelete()) {
    // By default keepDeletedCells is false, so this branch is entered.
    if (!keepDeletedCells) {
      // first ignore delete markers if the scanner can do so, and the
      // range does not include the marker
      //
      // during flushes and compactions also ignore delete markers newer
      // than the readpoint of any open scanner, this prevents deleted
      // rows that could still be seen by a scanner from being collected
      // true here, since the scan's TimeRange defaults to allTime == true.
      boolean includeDeleteMarker = seePastDeleteMarkers ?
          tr.withinTimeRange(timestamp) :
          tr.withinOrAfterTimeRange(timestamp);
      // Add the delete marker to the DeleteTracker; during a compaction the
      // implementation is ScanDeleteTracker.
      if (includeDeleteMarker && kv.getMvccVersion() <= maxReadPointToTrackVersions) {
        this.deletes.add(bytes, offset, qualLength, timestamp, type);
      }
      // Can't early out now, because DelFam come before any other keys
    }
    // The following branch is entered when this is not a major compaction,
    // or when this is a compaction scan and the current time minus the kv's
    // timestamp is still within hbase.hstore.time.to.purge.deletes (default 0),
    // or when the kv's mvcc value is greater than the current max read point.
    // We are in a major compaction scan here, so it is not entered.
    if (retainDeletesInOutput
        || (!isUserScan && (EnvironmentEdgeManager.currentTimeMillis() - timestamp) <= timeToPurgeDeletes)
        || kv.getMvccVersion() > maxReadPointToTrackVersions) {
      // always include or it is not time yet to check whether it is OK
      // to purge deletes or not
      if (!isUserScan) {
        // if this is not a user scan (compaction), we can filter this delete marker right here
        // otherwise (i.e. a "raw" scan) we fall through to normal version and timerange checking
        return MatchCode.INCLUDE;
      }
    } else if (keepDeletedCells) {
      // This check is normally not entered.
      if (timestamp < earliestPutTs) {
        // keeping delete rows, but there are no puts older than
        // this delete in the store files.
        return columns.getNextRowOrNextColumn(bytes, offset, qualLength);
      }
      // else: fall through and do version counting on the
      // delete markers
    } else {
      // The kv is a delete marker that was added to the DeleteTracker above:
      // return SKIP directly.
      return MatchCode.SKIP;
    }
    // note the following next else if...
    // delete marker are not subject to other delete markers
  } else if (!this.deletes.isEmpty()) {
    // The kv is not a delete marker: check whether the tracked delete markers
    // cover this kv's version.
    // a. If the tracked marker is a DeleteFamily and the kv's timestamp is smaller
    //    than the marker's timestamp, return FAMILY_DELETED.
    // b. If the kv's version was removed by a DeleteFamilyVersion (a delete that
    //    specified a timestamp), return FAMILY_VERSION_DELETED.
    // c. If the tracked marker is a DeleteColumn and its qualifier matches the
    //    incoming kv's qualifier, return COLUMN_DELETED; if the qualifiers match
    //    and the kv's timestamp equals the marker's timestamp (a delete of one
    //    specific version), return VERSION_DELETED.
    // d. Otherwise nothing was deleted: return NOT_DELETED.
    DeleteResult deleteResult = deletes.isDeleted(bytes, offset, qualLength,
        timestamp);
    switch (deleteResult) {
      case FAMILY_DELETED:
      case COLUMN_DELETED:
        return columns.getNextRowOrNextColumn(bytes, offset, qualLength);
      case VERSION_DELETED:
      case FAMILY_VERSION_DELETED:
        return MatchCode.SKIP;
      case NOT_DELETED:
        break;
      default:
        throw new RuntimeException("UNEXPECTED");
    }
  }
  // Check whether the kv's timestamp is within the scan's time range; by default
  // a scan includes all times.
  int timestampComparison = tr.compare(timestamp);
  // If the kv's timestamp is above the maximum, return SKIP.
  if (timestampComparison >= 1) {
    return MatchCode.SKIP;
  } else if (timestampComparison <= -1) {
    // If the kv's timestamp is below the minimum, return SEEK_NEXT_COL or SEEK_NEXT_ROW.
    return columns.getNextRowOrNextColumn(bytes, offset, qualLength);
  }
  // If the timestamp is in the normal range: for a compaction scan,
  // columns.checkColumn returns INCLUDE; for other cases see the
  // ExplicitColumnTracker implementation.
  // STEP 1: Check if the column is part of the requested columns
  MatchCode colChecker = columns.checkColumn(bytes, offset, qualLength, type);
  // This check is entered.
  if (colChecker == MatchCode.INCLUDE) {
    // Run the filter and map its response to a return value; the switch below
    // is straightforward to read.
    ReturnCode filterResponse = ReturnCode.SKIP;
    // STEP 2: Yes, the column is part of the requested columns. Check if filter is present
    if (filter != null) {
      // STEP 3: Filter the key value and return if it filters out
      filterResponse = filter.filterKeyValue(kv);
      switch (filterResponse) {
        case SKIP:
          return MatchCode.SKIP;
        case NEXT_COL:
          return columns.getNextRowOrNextColumn(bytes, offset, qualLength);
        case NEXT_ROW:
          stickyNextRow = true;
          return MatchCode.SEEK_NEXT_ROW;
        case SEEK_NEXT_USING_HINT:
          return MatchCode.SEEK_NEXT_USING_HINT;
        default:
          // It means it is either include or include and seek next
          break;
      }
    }
    /*
     * STEP 4: Reaching this step means the column is part of the requested columns and either
     * the filter is null or the filter has returned INCLUDE or INCLUDE_AND_NEXT_COL response.
     * Now check the number of versions needed. This method call returns SKIP, INCLUDE,
     * INCLUDE_AND_SEEK_NEXT_ROW, INCLUDE_AND_SEEK_NEXT_COL.
     *
     * FilterResponse             ColumnChecker              Desired behavior
     * INCLUDE                    SKIP                       row has already been included, SKIP.
     * INCLUDE                    INCLUDE                    INCLUDE
     * INCLUDE                    INCLUDE_AND_SEEK_NEXT_COL  INCLUDE_AND_SEEK_NEXT_COL
     * INCLUDE                    INCLUDE_AND_SEEK_NEXT_ROW  INCLUDE_AND_SEEK_NEXT_ROW
     * INCLUDE_AND_SEEK_NEXT_COL  SKIP                       row has already been included, SKIP.
     * INCLUDE_AND_SEEK_NEXT_COL  INCLUDE                    INCLUDE_AND_SEEK_NEXT_COL
     * INCLUDE_AND_SEEK_NEXT_COL  INCLUDE_AND_SEEK_NEXT_COL  INCLUDE_AND_SEEK_NEXT_COL
     * INCLUDE_AND_SEEK_NEXT_COL  INCLUDE_AND_SEEK_NEXT_ROW  INCLUDE_AND_SEEK_NEXT_ROW
     *
     * In all the above scenarios, we return the column checker return value except for
     * FilterResponse (INCLUDE_AND_SEEK_NEXT_COL) and ColumnChecker (INCLUDE)
     */
    colChecker = columns.checkVersions(bytes, offset, qualLength, timestamp, type,
        kv.getMvccVersion() > maxReadPointToTrackVersions);
    // Optimize with stickyNextRow
    stickyNextRow = colChecker == MatchCode.INCLUDE_AND_SEEK_NEXT_ROW ? true : stickyNextRow;
    return (filterResponse == ReturnCode.INCLUDE_AND_NEXT_COL &&
        colChecker == MatchCode.INCLUDE) ? MatchCode.INCLUDE_AND_SEEK_NEXT_COL
        : colChecker;
  }
  stickyNextRow = (colChecker == MatchCode.SEEK_NEXT_ROW) ? true
      : stickyNextRow;
  return colChecker;
}
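The KeyValue byte layout and the offset arithmetic at the top of match can be exercised stand-alone. The sketch below builds a minimal KeyValue-shaped byte array and parses the length fields back exactly as described above; it is a simplified illustration using plain java.nio, not HBase's real KeyValue or Bytes classes:

```java
import java.nio.ByteBuffer;

// Builds a minimal KeyValue-shaped byte array and parses it back, following the
// layout: keylen | vlen | rowlen | row | cflen | cf | column | timestamp | kvtype | value.
public class KeyValueLayoutDemo {
  static final int ROW_OFFSET = 8;           // keylen (4) + vlen (4)
  static final int TIMESTAMP_TYPE_SIZE = 9;  // timestamp (8) + kvtype (1)

  static byte[] build(byte[] row, byte[] cf, byte[] qualifier, long ts, byte type, byte[] value) {
    int keyLength = 2 + row.length + 1 + cf.length + qualifier.length + 8 + 1;
    ByteBuffer buf = ByteBuffer.allocate(ROW_OFFSET + keyLength + value.length);
    buf.putInt(keyLength).putInt(value.length);
    buf.putShort((short) row.length).put(row);
    buf.put((byte) cf.length).put(cf);
    buf.put(qualifier);
    buf.putLong(ts).put(type);
    buf.put(value);
    return buf.array();
  }

  // Returns { keyLength, rowLength, qualLength, (int) timestamp }, mirroring the
  // offset bookkeeping in ScanQueryMatcher.match.
  static int[] parse(byte[] bytes) {
    ByteBuffer buf = ByteBuffer.wrap(bytes);
    int keyLength = buf.getInt(0);
    int initialOffset = ROW_OFFSET;              // where rowlen is recorded
    short rowLength = buf.getShort(initialOffset);
    int offset = initialOffset + 2 + rowLength;  // skip rowlen + row
    byte familyLength = bytes[offset];
    offset += familyLength + 1;                  // skip cflen + cf
    int qualLength = keyLength - (offset - initialOffset) - TIMESTAMP_TYPE_SIZE;
    long timestamp = buf.getLong(initialOffset + keyLength - TIMESTAMP_TYPE_SIZE);
    return new int[] { keyLength, rowLength, qualLength, (int) timestamp };
  }

  public static void main(String[] args) {
    byte[] kv = build("r1".getBytes(), "f".getBytes(), "q1".getBytes(), 42L, (byte) 4, "v".getBytes());
    int[] parsed = parse(kv);
    System.out.println(parsed[0]);  // keyLength = 2 + 2 + 1 + 1 + 2 + 8 + 1 = 17
    System.out.println(parsed[1]);  // rowLength = 2
    System.out.println(parsed[2]);  // qualLength = 2
    System.out.println(parsed[3]);  // timestamp = 42
  }
}
```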
How major and minor compactions differ when writing the new store file

For a major compaction, when the writer is closed the file's metadata gets
MAJOR_COMPACTION_KEY = true. This value is later used to decide whether the
store file is selected when choosing files for a minor compaction.

if (writer != null) {
  writer.appendMetadata(fd.maxSeqId, request.isMajor());
  writer.close();
  newFiles.add(writer.getPath());
}
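The effect of that flag can be sketched with a toy model: each store file carries the MAJOR_COMPACTION_KEY value written at close time, and a selector can consult it later. Both the ToyStoreFile class and the selection rule below are purely illustrative; HBase's real minor-compaction selection policy considers much more than this one flag:

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the metadata written at writer close and a hypothetical consumer
// of it. Not HBase's actual compaction policy, just the flag's role in outline.
public class MajorFlagDemo {
  static class ToyStoreFile {
    final String path;
    final boolean majorCompacted;  // value of MAJOR_COMPACTION_KEY in file metadata
    ToyStoreFile(String path, boolean majorCompacted) {
      this.path = path;
      this.majorCompacted = majorCompacted;
    }
  }

  // Illustrative selector: keep only files not yet produced by a major compaction.
  static List<String> pickMinorCandidates(List<ToyStoreFile> files) {
    List<String> out = new ArrayList<>();
    for (ToyStoreFile f : files) {
      if (!f.majorCompacted) {
        out.add(f.path);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<ToyStoreFile> files = List.of(
        new ToyStoreFile("hfile-1", true),    // output of a major compaction
        new ToyStoreFile("hfile-2", false));  // output of a flush or minor compaction
    System.out.println(pickMinorCandidates(files));  // [hfile-2]
  }
}
```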