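Preface: This article describes how to use the HBase ValueFilter through both the Java and Shell APIs, with sample code for reference. ValueFilter filters on column values, so consider it whenever you need to filter HBase data by cell value. For the details and internals of comparators, see the earlier post "HBase Filter: Comparator Principles and Source Code Study" (HBase Filter 过滤器之比较器 Comparator 原理及源码学习).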

I. Java API

Header code

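import java.io.IOException;
import java.util.Iterator;
import java.util.LinkedList;

import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.*;
import org.apache.hadoop.hbase.util.Bytes;

/**
 * Demonstrates filtering on column values.
 * MyBase is a helper class from the article's demo project, not part of HBase.
 */
public class ValueFilterDemo {
    // Set to true to drop, recreate and repopulate the table before scanning.
    private static boolean isok = false;
    private static String tableName = "test";
    private static String[] cfs = new String[]{"f1", "f2"};
    // Sample rows in the form rowkey:family:qualifier:value.
    private static String[] data = new String[]{
            "row-1:f1:c1:abcdefg",
            "row-2:f1:c2:abc",
            "row-3:f2:c3:abc123456",
            "row-4:f2:c4:1234abc567"
    };

    public static void main(String[] args) throws IOException {
        MyBase myBase = new MyBase();
        Connection connection = myBase.createConnection();
        if (isok) {
            myBase.deleteTable(connection, tableName);
            myBase.createTable(connection, tableName, cfs);
            // Load the sample data.
            myBase.putRows(connection, tableName, data);
        }
        Table table = connection.getTable(TableName.valueOf(tableName));
        Scan scan = new Scan();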

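Middle code

Scroll right to see the output, shown in the comment at the end of each line. The declarations below are alternatives: enable only one of them per run.

1. Building the filter with BinaryComparator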

        ValueFilter valueFilter = new ValueFilter(CompareFilter.CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes("abc"))); // [row-2:f1:c2:abc]
        ValueFilter valueFilter = new ValueFilter(CompareFilter.CompareOp.NOT_EQUAL, new BinaryComparator(Bytes.toBytes("abc"))); // [row-1:f1:c1:abcdefg, row-3:f2:c3:abc123456, row-4:f2:c4:1234abc567]
        ValueFilter valueFilter = new ValueFilter(CompareFilter.CompareOp.GREATER, new BinaryComparator(Bytes.toBytes("abc"))); // [row-1:f1:c1:abcdefg, row-3:f2:c3:abc123456]
        ValueFilter valueFilter = new ValueFilter(CompareFilter.CompareOp.GREATER_OR_EQUAL, new BinaryComparator(Bytes.toBytes("abc1"))); // [row-1:f1:c1:abcdefg, row-3:f2:c3:abc123456]
        ValueFilter valueFilter = new ValueFilter(CompareFilter.CompareOp.LESS, new BinaryComparator(Bytes.toBytes("abc"))); // [row-4:f2:c4:1234abc567]
        ValueFilter valueFilter = new ValueFilter(CompareFilter.CompareOp.LESS_OR_EQUAL, new BinaryComparator(Bytes.toBytes("abc"))); // [row-2:f1:c2:abc, row-4:f2:c4:1234abc567]
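
The results above follow from byte-wise lexicographic ordering: "abcdefg" sorts after "abc" because it extends it, while "1234abc567" sorts before "abc" because '1' is a smaller byte than 'a'. The following is a minimal stand-alone sketch of that ordering, assuming UTF-8 encoded values; the class and method names (BinaryCompareSketch, compareBytes, greaterThan) are made up for illustration and are not part of HBase.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Stand-alone illustration; this is NOT HBase code and needs no HBase dependency.
final class BinaryCompareSketch {
    // The four cell values from the sample table, in row order.
    static final String[] VALUES = {"abcdefg", "abc", "abc123456", "1234abc567"};

    // Unsigned, byte-by-byte lexicographic comparison; an array that is a
    // proper prefix of the other one sorts first.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    // Keeps the sample values that are strictly greater than the target.
    static List<String> greaterThan(String target) {
        byte[] t = target.getBytes(StandardCharsets.UTF_8);
        List<String> kept = new ArrayList<>();
        for (String v : VALUES) {
            if (compareBytes(v.getBytes(StandardCharsets.UTF_8), t) > 0) {
                kept.add(v);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(greaterThan("abc")); // [abcdefg, abc123456]
    }
}
```

The kept values correspond to row-1 and row-3, matching the GREATER line above.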

2. Building the filter with BinaryPrefixComparator

        ValueFilter valueFilter = new ValueFilter(CompareFilter.CompareOp.EQUAL, new BinaryPrefixComparator(Bytes.toBytes("123"))); // [row-4:f2:c4:1234abc567]
        ValueFilter valueFilter = new ValueFilter(CompareFilter.CompareOp.NOT_EQUAL, new BinaryPrefixComparator(Bytes.toBytes("ab"))); // [row-4:f2:c4:1234abc567]
        ValueFilter valueFilter = new ValueFilter(CompareFilter.CompareOp.GREATER, new BinaryPrefixComparator(Bytes.toBytes("ab"))); // [] only the first prefix-length bytes are compared
        ValueFilter valueFilter = new ValueFilter(CompareFilter.CompareOp.GREATER_OR_EQUAL, new BinaryPrefixComparator(Bytes.toBytes("ab"))); // [row-1:f1:c1:abcdefg, row-2:f1:c2:abc, row-3:f2:c3:abc123456]
        ValueFilter valueFilter = new ValueFilter(CompareFilter.CompareOp.LESS, new BinaryPrefixComparator(Bytes.toBytes("abc"))); // [row-4:f2:c4:1234abc567]
        ValueFilter valueFilter = new ValueFilter(CompareFilter.CompareOp.LESS_OR_EQUAL, new BinaryPrefixComparator(Bytes.toBytes("abc"))); // [row-1:f1:c1:abcdefg, row-2:f1:c2:abc, row-3:f2:c3:abc123456, row-4:f2:c4:1234abc567]
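
The empty result for GREATER with prefix "ab" can look surprising. It follows from the fact that only the leading bytes of each value, up to the length of the prefix, take part in the comparison, so every value starting with "ab" compares as equal rather than greater. Below is a small stand-alone sketch of that rule under the same UTF-8 assumption; PrefixCompareSketch, comparePrefix and keep are illustrative names, not HBase APIs.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Stand-alone illustration; this is NOT HBase code and needs no HBase dependency.
final class PrefixCompareSketch {
    // The four cell values from the sample table, in row order.
    static final String[] VALUES = {"abcdefg", "abc", "abc123456", "1234abc567"};

    // Compares at most prefix.length leading bytes of the value with the prefix.
    // Result is negative, zero or positive, like Comparable.compareTo.
    static int comparePrefix(String value, String prefix) {
        byte[] v = value.getBytes(StandardCharsets.UTF_8);
        byte[] p = prefix.getBytes(StandardCharsets.UTF_8);
        int headLength = Math.min(v.length, p.length);
        for (int i = 0; i < headLength; i++) {
            int x = v[i] & 0xff;
            int y = p[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        // The compared head is never longer than the prefix.
        return headLength - p.length;
    }

    // Keeps the sample values whose comparison result satisfies the operator test.
    static List<String> keep(String prefix, IntPredicate operator) {
        List<String> kept = new ArrayList<>();
        for (String v : VALUES) {
            if (operator.test(comparePrefix(v, prefix))) {
                kept.add(v);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(keep("ab", c -> c > 0));  // []
        System.out.println(keep("ab", c -> c >= 0)); // [abcdefg, abc, abc123456]
    }
}
```

Passing the operator as a predicate on the comparison result keeps the sketch short; each lambda plays the role of one CompareOp constant.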

3. Building the filter with SubstringComparator

        ValueFilter valueFilter = new ValueFilter(CompareFilter.CompareOp.EQUAL, new SubstringComparator("123")); // [row-3:f2:c3:abc123456, row-4:f2:c4:1234abc567]
        ValueFilter valueFilter = new ValueFilter(CompareFilter.CompareOp.NOT_EQUAL, new SubstringComparator("def")); // [row-2:f1:c2:abc, row-3:f2:c3:abc123456, row-4:f2:c4:1234abc567]

4. Building the filter with RegexStringComparator

        ValueFilter valueFilter = new ValueFilter(CompareFilter.CompareOp.NOT_EQUAL, new RegexStringComparator("4[a-z]")); // [row-1:f1:c1:abcdefg, row-2:f1:c2:abc, row-3:f2:c3:abc123456]
        ValueFilter valueFilter = new ValueFilter(CompareFilter.CompareOp.EQUAL, new RegexStringComparator("4[a-z]")); // [row-4:f2:c4:1234abc567]
        ValueFilter valueFilter = new ValueFilter(CompareFilter.CompareOp.EQUAL, new RegexStringComparator("abc")); // [row-1:f1:c1:abcdefg, row-2:f1:c2:abc, row-3:f2:c3:abc123456, row-4:f2:c4:1234abc567]

Tail code

        scan.setFilter(valueFilter);
        ResultScanner scanner = table.getScanner(scan);
        Iterator<Result> iterator = scanner.iterator();
        LinkedList<String> keys = new LinkedList<>();
        while (iterator.hasNext()) {
            String key = "";
            Result result = iterator.next();
            for (Cell cell : result.rawCells()) {
                byte[] rowkey = CellUtil.cloneRow(cell);
                byte[] family = CellUtil.cloneFamily(cell);
                byte[] column = CellUtil.cloneQualifier(cell);
                byte[] value = CellUtil.cloneValue(cell);
                key = Bytes.toString(rowkey) + ":" + Bytes.toString(family) + ":" + Bytes.toString(column) + ":" + Bytes.toString(value);
                keys.add(key);
            }
        }
        System.out.println(keys);
        scanner.close();
        table.close();
        connection.close();
    }
}

II. Shell API

1. Building the filter with BinaryComparator

Method 1:

hbase(main):006:0> scan 'test',{FILTER=>"ValueFilter(=,'binary:abc')"}
ROW COLUMN+CELL
row-2 column=f1:c2, timestamp=1589453592471, value=abc
1 row(s) in 0.0240 seconds

Supported comparison operators: =, !=, >, >=, <, <=. They are not demonstrated one by one here.

Method 2:

import org.apache.hadoop.hbase.filter.CompareFilter
import org.apache.hadoop.hbase.filter.BinaryComparator
import org.apache.hadoop.hbase.filter.ValueFilter
hbase(main):010:0> scan 'test',{FILTER => ValueFilter.new(CompareFilter::CompareOp.valueOf('EQUAL'), BinaryComparator.new(Bytes.toBytes('abc')))}
ROW COLUMN+CELL
row-2 column=f1:c2, timestamp=1589453592471, value=abc
1 row(s) in 0.0230 seconds

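Supported comparison operators: LESS, LESS_OR_EQUAL, EQUAL, NOT_EQUAL, GREATER, GREATER_OR_EQUAL. They are not demonstrated one by one here.

Method 1 is recommended because it is shorter and more convenient.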
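
The filter string used in method 1 has the shape FilterName(operator,'comparatorType:value'), and each operator symbol corresponds to one of the CompareOp names used in method 2. The sketch below takes such a string apart to make that correspondence explicit. It is a simplified illustration under my own assumptions, not HBase's actual filter parser; FilterStringSketch, OPS and parse are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stand-alone illustration of the filter-string layout; NOT HBase's real parser.
final class FilterStringSketch {
    // Shell operator symbol -> CompareOp constant name.
    static final Map<String, String> OPS = new HashMap<>();
    static {
        OPS.put("=", "EQUAL");
        OPS.put("!=", "NOT_EQUAL");
        OPS.put(">", "GREATER");
        OPS.put(">=", "GREATER_OR_EQUAL");
        OPS.put("<", "LESS");
        OPS.put("<=", "LESS_OR_EQUAL");
    }

    // Two-character operators are listed first so the longer alternative is tried first.
    private static final Pattern FILTER = Pattern.compile(
            "(\\w+)\\s*\\(\\s*(<=|>=|!=|=|<|>)\\s*,\\s*'(\\w+):([^']*)'\\s*\\)");

    // Returns {filter name, CompareOp name, comparator type, comparator value}.
    static String[] parse(String filter) {
        Matcher m = FILTER.matcher(filter.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized filter string: " + filter);
        }
        return new String[] {m.group(1), OPS.get(m.group(2)), m.group(3), m.group(4)};
    }

    public static void main(String[] args) {
        System.out.println(String.join(" | ", parse("ValueFilter(=,'binary:abc')")));
        // ValueFilter | EQUAL | binary | abc
    }
}
```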

2. Building the filter with BinaryPrefixComparator

Method 1:

hbase(main):011:0> scan 'test',{FILTER=>"ValueFilter(=,'binaryprefix:ab')"}
ROW COLUMN+CELL
row-1 column=f1:c1, timestamp=1589453592471, value=abcdefg
row-2 column=f1:c2, timestamp=1589453592471, value=abc
row-3 column=f2:c3, timestamp=1589453592471, value=abc123456
3 row(s) in 0.0430 seconds

Method 2:

import org.apache.hadoop.hbase.filter.CompareFilter
import org.apache.hadoop.hbase.filter.BinaryPrefixComparator
import org.apache.hadoop.hbase.filter.ValueFilter
hbase(main):013:0> scan 'test',{FILTER => ValueFilter.new(CompareFilter::CompareOp.valueOf('EQUAL'), BinaryPrefixComparator.new(Bytes.toBytes('ab')))}
ROW COLUMN+CELL
row-1 column=f1:c1, timestamp=1589453592471, value=abcdefg
row-2 column=f1:c2, timestamp=1589453592471, value=abc
row-3 column=f2:c3, timestamp=1589453592471, value=abc123456
3 row(s) in 0.0440 seconds

Everything else works as in the previous section.

3. Building the filter with SubstringComparator

Method 1:

hbase(main):014:0> scan 'test',{FILTER=>"ValueFilter(=,'substring:123')"}
ROW COLUMN+CELL
row-3 column=f2:c3, timestamp=1589453592471, value=abc123456
row-4 column=f2:c4, timestamp=1589453592471, value=1234abc567
2 row(s) in 0.0340 seconds

Method 2:

import org.apache.hadoop.hbase.filter.CompareFilter
import org.apache.hadoop.hbase.filter.SubstringComparator
import org.apache.hadoop.hbase.filter.ValueFilter
hbase(main):016:0> scan 'test',{FILTER => ValueFilter.new(CompareFilter::CompareOp.valueOf('EQUAL'), SubstringComparator.new('123'))}
ROW COLUMN+CELL
row-3 column=f2:c3, timestamp=1589453592471, value=abc123456
row-4 column=f2:c4, timestamp=1589453592471, value=1234abc567
2 row(s) in 0.0240 seconds

Unlike the previous comparators, this one takes the string directly, and it supports only the EQUAL and NOT_EQUAL operators.
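
One way to see why only EQUAL and NOT_EQUAL make sense here: a substring test has just two outcomes, match or no match, so there is no ordering for GREATER or LESS to act on. The sketch below models that two-valued result. It assumes the comparison ignores case, which is my understanding of SubstringComparator's behavior and should be verified against your HBase version; SubstringCompareSketch and compare are illustrative names.

```java
import java.util.Locale;

// Stand-alone illustration; this is NOT HBase code and needs no HBase dependency.
final class SubstringCompareSketch {
    // Returns 0 when the value contains the substring, 1 otherwise.
    // Assumption: both sides are lowercased first, so the test ignores case.
    static int compare(String value, String substr) {
        String haystack = value.toLowerCase(Locale.ROOT);
        String needle = substr.toLowerCase(Locale.ROOT);
        return haystack.contains(needle) ? 0 : 1;
    }

    public static void main(String[] args) {
        System.out.println(compare("abc123456", "123")); // 0
        System.out.println(compare("abc", "123"));       // 1
    }
}
```

With a result that is only ever 0 or 1, EQUAL selects the matching cells and NOT_EQUAL selects the rest.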

4. Building the filter with RegexStringComparator

import org.apache.hadoop.hbase.filter.CompareFilter
import org.apache.hadoop.hbase.filter.RegexStringComparator
import org.apache.hadoop.hbase.filter.ValueFilter
hbase(main):018:0> scan 'test',{FILTER => ValueFilter.new(CompareFilter::CompareOp.valueOf('EQUAL'), RegexStringComparator.new('4[a-z]'))}
ROW COLUMN+CELL
row-4 column=f2:c4, timestamp=1589453592471, value=1234abc567
1 row(s) in 0.0290 seconds

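This comparator also takes the string directly and supports only the EQUAL and NOT_EQUAL operators. To use method 1, you can try passing regexstring as the comparator type; my HBase version is too old to support it, so it is not demonstrated here.

Note that a regex match here means containment: it corresponds to the underlying find() method, so the pattern only has to match somewhere inside the value.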
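
The difference between containment and a whole-string match is easy to check with plain java.util.regex, independent of HBase. In this small sketch, RegexFindSketch, contains and wholeMatch are illustrative names.

```java
import java.util.regex.Pattern;

// Stand-alone illustration of find() versus matches(); needs no HBase dependency.
final class RegexFindSketch {
    // True when the pattern matches anywhere inside the value (containment).
    static boolean contains(String regex, String value) {
        return Pattern.compile(regex).matcher(value).find();
    }

    // True only when the pattern matches the entire value.
    static boolean wholeMatch(String regex, String value) {
        return Pattern.compile(regex).matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(contains("4[a-z]", "1234abc567"));   // true
        System.out.println(wholeMatch("4[a-z]", "1234abc567")); // false
        System.out.println(contains("4[a-z]", "abc123456"));    // false
    }
}
```

To require a whole-value match under find() semantics, anchor the pattern explicitly, for example ^abc$.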

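ValueFilter does not support the LongComparator comparator, and the BitComparator and NullComparator comparators are rarely used, so they are not covered here.

The full source code for this article is available at the following GitHub address: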

https://github.com/zhoupengbo/demos-bigdata/blob/master/hbase/hbase-filters-demos/src/main/java/com/zpb/demos/ValueFilterDemo.java

Please credit the source when reposting. You are welcome to follow my WeChat official account 【HBase工作笔记】.
