1. Filters

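The query operations in the basic API fall short when faced with large volumes of data, so HBase provides a more advanced query mechanism: the Filter. Filters can select data by column family, column, version, and other conditions. Because HBase keeps its data sorted along three dimensions (row key, column, and version), these Filters can carry out the filtering efficiently. An RPC query that carries a Filter has the Filter distributed to each RegionServer, so the filtering happens on the server side, which also reduces the load on the network.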
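The three-dimensional ordering mentioned above can be pictured with a small stand-alone sketch. It does not use the HBase API: the Key class, its fields, and the sample data are all hypothetical, and strings stand in for byte arrays. It only illustrates the sort order, assuming row key ascending, then column ascending, then timestamp descending so that the newest version comes first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class CellOrderSketch {

    /** Hypothetical, simplified cell coordinate: row key, column, timestamp. */
    static final class Key {
        final String row;
        final String column;
        final long timestamp;

        Key(String row, String column, long timestamp) {
            this.row = row;
            this.column = column;
            this.timestamp = timestamp;
        }

        @Override
        public String toString() {
            return row + "/" + column + "/" + timestamp;
        }
    }

    // Row ascending, then column ascending, then timestamp descending (newest first).
    static final Comparator<Key> ORDER = Comparator
            .comparing((Key k) -> k.row)
            .thenComparing((Key k) -> k.column)
            .thenComparing(Comparator.comparingLong((Key k) -> k.timestamp).reversed());

    /** Returns a sorted copy and leaves the input list untouched. */
    static List<Key> sorted(List<Key> keys) {
        List<Key> copy = new ArrayList<>(keys);
        copy.sort(ORDER);
        return copy;
    }

    public static void main(String[] args) {
        List<Key> keys = Arrays.asList(
                new Key("95002", "info:name", 10L),
                new Key("95001", "info:name", 10L),
                new Key("95001", "info:name", 20L),
                new Key("95001", "info:age", 10L));
        for (Key key : sorted(keys)) {
            System.out.println(key);
        }
    }
}
```

Because the data is already laid out in this order, a filter on a row key range or a column name can skip whole stretches of cells instead of inspecting each one.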

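A comparison-based filter operation needs at least two parameters. The first is an abstract operator; HBase provides an enum for these: LESS, LESS_OR_EQUAL, EQUAL, NOT_EQUAL, and so on. The second is a concrete comparator (Comparator), which holds the actual comparison logic, such as byte-level comparison or string-level comparison. With these two parameters we can define the selection condition precisely and filter the data.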

1.1 Abstract operators (comparison operators)

LESS <

LESS_OR_EQUAL <=

EQUAL =

NOT_EQUAL <>

GREATER_OR_EQUAL >=

GREATER >

NO_OP excludes everything
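The way an operator turns a comparison result into a keep-or-drop decision can be sketched in plain Java. This is a stand-alone illustration and not HBase code: the Op enum, compareBytes, and keep are hypothetical names that only mirror the operators listed above, under the assumption that a cell is kept when "value OP reference" holds.

```java
import java.nio.charset.StandardCharsets;

class CompareOpSketch {

    /** Stand-in for the operator enum listed above; not the real HBase class. */
    enum Op { LESS, LESS_OR_EQUAL, EQUAL, NOT_EQUAL, GREATER_OR_EQUAL, GREATER, NO_OP }

    /** Unsigned, byte-by-byte lexicographic comparison of two byte arrays. */
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        // All shared bytes are equal, so the shorter array sorts first.
        return a.length - b.length;
    }

    /** True when "value OP reference" holds, i.e. the value would be kept. */
    static boolean keep(Op op, byte[] value, byte[] reference) {
        int c = compareBytes(value, reference);
        switch (op) {
            case LESS:             return c < 0;
            case LESS_OR_EQUAL:    return c <= 0;
            case EQUAL:            return c == 0;
            case NOT_EQUAL:        return c != 0;
            case GREATER_OR_EQUAL: return c >= 0;
            case GREATER:          return c > 0;
            default:               return false; // NO_OP keeps nothing
        }
    }

    /** Convenience overload that encodes both strings as UTF-8 first. */
    static boolean keep(Op op, String value, String reference) {
        return keep(op, value.getBytes(StandardCharsets.UTF_8),
                reference.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        System.out.println(keep(Op.GREATER, "95008", "95007"));
        System.out.println(keep(Op.GREATER, "95007", "95007"));
        System.out.println(keep(Op.NO_OP, "95008", "95007"));
    }
}
```

Note that the comparison works on bytes, not numbers: "9501" sorts before "95010" only because it is a shorter array with the same leading bytes.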

1.2 Comparators (specifying the comparison mechanism)

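BinaryComparator compares the given byte array byte by byte in index order, using Bytes.compareTo(byte[], byte[])

BinaryPrefixComparator same as above, but only compares the leading (left-most) bytes

NullComparator checks whether the given value is null

BitComparator performs a bitwise comparison

RegexStringComparator compares against a regular expression; supports only EQUAL and NOT_EQUAL

SubstringComparator checks whether the given substring occurs in the value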
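The prefix and substring comparators answer a yes-or-no question rather than producing an ordering, which is why they pair naturally with EQUAL and NOT_EQUAL. The following plain-Java sketch illustrates that idea. It is a simplified stand-in and not the HBase implementation: startsWith, containsSubstring, keep, and the sample values are all hypothetical.

```java
import java.nio.charset.StandardCharsets;

class ComparatorSketch {

    /** Prefix-style match: do the leading bytes of value equal prefix? */
    static boolean startsWith(byte[] value, byte[] prefix) {
        if (value.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (value[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    /** Substring-style match: does substring occur anywhere in value? */
    static boolean containsSubstring(String value, String substring) {
        return value.contains(substring);
    }

    /**
     * Combines a match result with the operator.
     * equalOp == true models EQUAL (keep matches);
     * equalOp == false models NOT_EQUAL (keep non-matches).
     */
    static boolean keep(boolean equalOp, boolean matched) {
        return equalOp == matched;
    }

    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(keep(true, startsWith(utf8("95011"), utf8("9501"))));
        System.out.println(keep(true, containsSubstring("Liu Chen", "Chen")));
        System.out.println(keep(false, containsSubstring("Li Yong", "Chen")));
    }
}
```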

2. Categories of HBase filters

2.1 Comparison filters

2.1.1 Row key filter: RowFilter

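Filter rowFilter = new RowFilter(CompareOp.GREATER, new BinaryComparator(Bytes.toBytes("95007")));
scan.setFilter(rowFilter);

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.BinaryComparator;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.RowFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class HbaseFilterTest {

    private static final String ZK_CONNECT_KEY = "hbase.zookeeper.quorum";
    private static final String ZK_CONNECT_VALUE = "hadoop1:2181,hadoop2:2181,hadoop3:2181";

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set(ZK_CONNECT_KEY, ZK_CONNECT_VALUE);

        // Keep only rows whose row key sorts after "95007"
        Scan scan = new Scan();
        Filter rowFilter = new RowFilter(CompareOp.GREATER, new BinaryComparator(Bytes.toBytes("95007")));
        scan.setFilter(rowFilter);

        // try-with-resources closes the scanner, table, and connection
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"));
             ResultScanner resultScanner = table.getScanner(scan)) {
            for (Result result : resultScanner) {
                for (Cell cell : result.listCells()) {
                    System.out.println(cell);
                }
            }
        }
    }
}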


2.1.2 Column family filter: FamilyFilter

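Filter familyFilter = new FamilyFilter(CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes("info")));
scan.setFilter(familyFilter);

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.BinaryComparator;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.FamilyFilter;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.util.Bytes;

public class HbaseFilterTest {

    private static final String ZK_CONNECT_KEY = "hbase.zookeeper.quorum";
    private static final String ZK_CONNECT_VALUE = "hadoop1:2181,hadoop2:2181,hadoop3:2181";

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set(ZK_CONNECT_KEY, ZK_CONNECT_VALUE);

        // Keep only cells that belong to the column family "info"
        Scan scan = new Scan();
        Filter familyFilter = new FamilyFilter(CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes("info")));
        scan.setFilter(familyFilter);

        // try-with-resources closes the scanner, table, and connection
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"));
             ResultScanner resultScanner = table.getScanner(scan)) {
            for (Result result : resultScanner) {
                for (Cell cell : result.listCells()) {
                    System.out.println(cell);
                }
            }
        }
    }
}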

2.1.3 Column (qualifier) filter: QualifierFilter

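Filter qualifierFilter = new QualifierFilter(CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes("name")));
scan.setFilter(qualifierFilter);

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.BinaryComparator;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.QualifierFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class HbaseFilterTest {

    private static final String ZK_CONNECT_KEY = "hbase.zookeeper.quorum";
    private static final String ZK_CONNECT_VALUE = "hadoop1:2181,hadoop2:2181,hadoop3:2181";

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set(ZK_CONNECT_KEY, ZK_CONNECT_VALUE);

        // Keep only cells whose column qualifier is exactly "name"
        Scan scan = new Scan();
        Filter qualifierFilter = new QualifierFilter(CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes("name")));
        scan.setFilter(qualifierFilter);

        // try-with-resources closes the scanner, table, and connection
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"));
             ResultScanner resultScanner = table.getScanner(scan)) {
            for (Result result : resultScanner) {
                for (Cell cell : result.listCells()) {
                    System.out.println(cell);
                }
            }
        }
    }
}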

2.1.4 Value filter: ValueFilter

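Filter valueFilter = new ValueFilter(CompareOp.EQUAL, new SubstringComparator("男"));
scan.setFilter(valueFilter);

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.SubstringComparator;
import org.apache.hadoop.hbase.filter.ValueFilter;

public class HbaseFilterTest {

    private static final String ZK_CONNECT_KEY = "hbase.zookeeper.quorum";
    private static final String ZK_CONNECT_VALUE = "hadoop1:2181,hadoop2:2181,hadoop3:2181";

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set(ZK_CONNECT_KEY, ZK_CONNECT_VALUE);

        // Keep only cells whose value contains "男" (male)
        Scan scan = new Scan();
        Filter valueFilter = new ValueFilter(CompareOp.EQUAL, new SubstringComparator("男"));
        scan.setFilter(valueFilter);

        // try-with-resources closes the scanner, table, and connection
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"));
             ResultScanner resultScanner = table.getScanner(scan)) {
            for (Result result : resultScanner) {
                for (Cell cell : result.listCells()) {
                    System.out.println(cell);
                }
            }
        }
    }
}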

2.1.5 Timestamp filter: TimestampsFilter

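List<Long> list = new ArrayList<>();
list.add(1522469029503L);
TimestampsFilter timestampsFilter = new TimestampsFilter(list);
scan.setFilter(timestampsFilter);

import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.TimestampsFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class HbaseFilterTest {

    private static final String ZK_CONNECT_KEY = "hbase.zookeeper.quorum";
    private static final String ZK_CONNECT_VALUE = "hadoop1:2181,hadoop2:2181,hadoop3:2181";

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set(ZK_CONNECT_KEY, ZK_CONNECT_VALUE);

        // Keep only cells whose timestamp is exactly one of the listed values
        Scan scan = new Scan();
        List<Long> list = new ArrayList<>();
        list.add(1522469029503L);
        TimestampsFilter timestampsFilter = new TimestampsFilter(list);
        scan.setFilter(timestampsFilter);

        // try-with-resources closes the scanner, table, and connection
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"));
             ResultScanner resultScanner = table.getScanner(scan)) {
            for (Result result : resultScanner) {
                for (Cell cell : result.listCells()) {
                    // CellUtil.cloneXxx replaces the deprecated Cell.getRow()/getFamily()/getQualifier()/getValue()
                    System.out.println(Bytes.toString(CellUtil.cloneRow(cell)) + "\t"
                            + Bytes.toString(CellUtil.cloneFamily(cell)) + "\t"
                            + Bytes.toString(CellUtil.cloneQualifier(cell)) + "\t"
                            + Bytes.toString(CellUtil.cloneValue(cell)) + "\t"
                            + cell.getTimestamp());
                }
            }
        }
    }
}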

2.2 Dedicated filters

2.2.1 Single column value filter: SingleColumnValueFilter

Returns the entire row for each row that satisfies the condition.

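SingleColumnValueFilter singleColumnValueFilter = new SingleColumnValueFilter(
        Bytes.toBytes("info"), // column family
        Bytes.toBytes("name"), // column qualifier
        CompareOp.EQUAL,
        new SubstringComparator("刘晨"));
// Unless this is set to true, rows that do not contain the specified column are returned as well
singleColumnValueFilter.setFilterIfMissing(true);
scan.setFilter(singleColumnValueFilter);

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.filter.SubstringComparator;
import org.apache.hadoop.hbase.util.Bytes;

public class HbaseFilterTest2 {

    private static final String ZK_CONNECT_KEY = "hbase.zookeeper.quorum";
    private static final String ZK_CONNECT_VALUE = "hadoop1:2181,hadoop2:2181,hadoop3:2181";

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set(ZK_CONNECT_KEY, ZK_CONNECT_VALUE);

        // Keep whole rows whose info:name value contains "刘晨"
        Scan scan = new Scan();
        SingleColumnValueFilter singleColumnValueFilter = new SingleColumnValueFilter(
                Bytes.toBytes("info"),
                Bytes.toBytes("name"),
                CompareOp.EQUAL,
                new SubstringComparator("刘晨"));
        // Also drop rows that have no info:name column at all
        singleColumnValueFilter.setFilterIfMissing(true);
        scan.setFilter(singleColumnValueFilter);

        // try-with-resources closes the scanner, table, and connection
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"));
             ResultScanner resultScanner = table.getScanner(scan)) {
            for (Result result : resultScanner) {
                for (Cell cell : result.listCells()) {
                    // CellUtil.cloneXxx replaces the deprecated Cell.getRow()/getFamily()/getQualifier()/getValue()
                    System.out.println(Bytes.toString(CellUtil.cloneRow(cell)) + "\t"
                            + Bytes.toString(CellUtil.cloneFamily(cell)) + "\t"
                            + Bytes.toString(CellUtil.cloneQualifier(cell)) + "\t"
                            + Bytes.toString(CellUtil.cloneValue(cell)) + "\t"
                            + cell.getTimestamp());
                }
            }
        }
    }
}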
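The effect of setFilterIfMissing is easy to get wrong, so here is a small stand-alone sketch of the row-level decision. It does not call HBase: the table is modelled as nested maps, and keepRow, scan, sampleTable, and the romanized sample data are all hypothetical. It assumes the behaviour described in the code comment above, namely that a row lacking the tested column passes the filter unless filterIfMissing is true.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SingleColumnValueSketch {

    /** Decides whether a whole row is kept, based on one column's value. */
    static boolean keepRow(Map<String, String> row, String column,
                           String substring, boolean filterIfMissing) {
        String value = row.get(column);
        if (value == null) {
            // Column absent: the row passes unless filterIfMissing is set.
            return !filterIfMissing;
        }
        return value.contains(substring);
    }

    /** Returns the row keys that pass the filter, in table order. */
    static List<String> scan(Map<String, Map<String, String>> table, String column,
                             String substring, boolean filterIfMissing) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> entry : table.entrySet()) {
            if (keepRow(entry.getValue(), column, substring, filterIfMissing)) {
                kept.add(entry.getKey());
            }
        }
        return kept;
    }

    /** Builds one row from alternating column/value arguments. */
    static Map<String, String> row(String... columnsAndValues) {
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i + 1 < columnsAndValues.length; i += 2) {
            row.put(columnsAndValues[i], columnsAndValues[i + 1]);
        }
        return row;
    }

    /** Hypothetical sample data; row 95003 has no info:name column. */
    static Map<String, Map<String, String>> sampleTable() {
        Map<String, Map<String, String>> table = new LinkedHashMap<>();
        table.put("95001", row("info:name", "Li Yong", "info:sex", "M"));
        table.put("95002", row("info:name", "Liu Chen", "info:sex", "F"));
        table.put("95003", row("info:sex", "F"));
        return table;
    }

    public static void main(String[] args) {
        System.out.println(scan(sampleTable(), "info:name", "Liu Chen", true));
        System.out.println(scan(sampleTable(), "info:name", "Liu Chen", false));
    }
}
```

With filterIfMissing set to false, row 95003 slips through even though it has no name at all, which is rarely what a query for a specific name intends.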

 

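2.2.2 Single column value exclude filter: SingleColumnValueExcludeFilter

Works like SingleColumnValueFilter, but the column used for the test is left out of the returned rows.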

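SingleColumnValueExcludeFilter singleColumnValueExcludeFilter = new SingleColumnValueExcludeFilter(
        Bytes.toBytes("info"),
        Bytes.toBytes("name"),
        CompareOp.EQUAL,
        new SubstringComparator("刘晨"));
singleColumnValueExcludeFilter.setFilterIfMissing(true);
scan.setFilter(singleColumnValueExcludeFilter);

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.SingleColumnValueExcludeFilter;
import org.apache.hadoop.hbase.filter.SubstringComparator;
import org.apache.hadoop.hbase.util.Bytes;

public class HbaseFilterTest2 {

    private static final String ZK_CONNECT_KEY = "hbase.zookeeper.quorum";
    private static final String ZK_CONNECT_VALUE = "hadoop1:2181,hadoop2:2181,hadoop3:2181";

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set(ZK_CONNECT_KEY, ZK_CONNECT_VALUE);

        // Select rows whose info:name contains "刘晨", but omit info:name from the output
        Scan scan = new Scan();
        SingleColumnValueExcludeFilter singleColumnValueExcludeFilter = new SingleColumnValueExcludeFilter(
                Bytes.toBytes("info"),
                Bytes.toBytes("name"),
                CompareOp.EQUAL,
                new SubstringComparator("刘晨"));
        // Also drop rows that have no info:name column at all
        singleColumnValueExcludeFilter.setFilterIfMissing(true);
        scan.setFilter(singleColumnValueExcludeFilter);

        // try-with-resources closes the scanner, table, and connection
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"));
             ResultScanner resultScanner = table.getScanner(scan)) {
            for (Result result : resultScanner) {
                for (Cell cell : result.listCells()) {
                    // CellUtil.cloneXxx replaces the deprecated Cell.getRow()/getFamily()/getQualifier()/getValue()
                    System.out.println(Bytes.toString(CellUtil.cloneRow(cell)) + "\t"
                            + Bytes.toString(CellUtil.cloneFamily(cell)) + "\t"
                            + Bytes.toString(CellUtil.cloneQualifier(cell)) + "\t"
                            + Bytes.toString(CellUtil.cloneValue(cell)) + "\t"
                            + cell.getTimestamp());
                }
            }
        }
    }
}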

2.2.3 Prefix filter: PrefixFilter (applies to row keys)

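PrefixFilter prefixFilter = new PrefixFilter(Bytes.toBytes("9501"));
scan.setFilter(prefixFilter);

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.PrefixFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class HbaseFilterTest2 {

    private static final String ZK_CONNECT_KEY = "hbase.zookeeper.quorum";
    private static final String ZK_CONNECT_VALUE = "hadoop1:2181,hadoop2:2181,hadoop3:2181";

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set(ZK_CONNECT_KEY, ZK_CONNECT_VALUE);

        // Keep only rows whose row key starts with "9501"
        Scan scan = new Scan();
        PrefixFilter prefixFilter = new PrefixFilter(Bytes.toBytes("9501"));
        scan.setFilter(prefixFilter);

        // try-with-resources closes the scanner, table, and connection
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"));
             ResultScanner resultScanner = table.getScanner(scan)) {
            for (Result result : resultScanner) {
                for (Cell cell : result.listCells()) {
                    // CellUtil.cloneXxx replaces the deprecated Cell.getRow()/getFamily()/getQualifier()/getValue()
                    System.out.println(Bytes.toString(CellUtil.cloneRow(cell)) + "\t"
                            + Bytes.toString(CellUtil.cloneFamily(cell)) + "\t"
                            + Bytes.toString(CellUtil.cloneQualifier(cell)) + "\t"
                            + Bytes.toString(CellUtil.cloneValue(cell)) + "\t"
                            + cell.getTimestamp());
                }
            }
        }
    }
}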

2.2.4 Column prefix filter: ColumnPrefixFilter

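ColumnPrefixFilter columnPrefixFilter = new ColumnPrefixFilter(Bytes.toBytes("name"));
scan.setFilter(columnPrefixFilter);

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.ColumnPrefixFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class HbaseFilterTest2 {

    private static final String ZK_CONNECT_KEY = "hbase.zookeeper.quorum";
    private static final String ZK_CONNECT_VALUE = "hadoop1:2181,hadoop2:2181,hadoop3:2181";

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set(ZK_CONNECT_KEY, ZK_CONNECT_VALUE);

        // Keep only cells whose column qualifier starts with "name"
        Scan scan = new Scan();
        ColumnPrefixFilter columnPrefixFilter = new ColumnPrefixFilter(Bytes.toBytes("name"));
        scan.setFilter(columnPrefixFilter);

        // try-with-resources closes the scanner, table, and connection
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"));
             ResultScanner resultScanner = table.getScanner(scan)) {
            for (Result result : resultScanner) {
                for (Cell cell : result.listCells()) {
                    // CellUtil.cloneXxx replaces the deprecated Cell.getRow()/getFamily()/getQualifier()/getValue()
                    System.out.println(Bytes.toString(CellUtil.cloneRow(cell)) + "\t"
                            + Bytes.toString(CellUtil.cloneFamily(cell)) + "\t"
                            + Bytes.toString(CellUtil.cloneQualifier(cell)) + "\t"
                            + Bytes.toString(CellUtil.cloneValue(cell)) + "\t"
                            + cell.getTimestamp());
                }
            }
        }
    }
}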
