HBase Client Java API

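The old HBase client API worked very differently from traditional JDBC. The new API follows a much more JDBC-like model and manages Connections more clearly.

In the old API, the client also had to configure when a Put was sent to the server: either each Put went out immediately, or Puts were accumulated and sent as a batch. New users were often unclear about which behavior they were getting.

The new API introduces BufferedMutator, which makes write operations both more efficient and more explicit.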

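Interface names in HBase 0.98 compared with HBase 1.0

As an example, here is a write operation using the old API:

And the same write operation using the new API:

In both cases you first establish a connection, then obtain a handle for the target table, and only then perform a series of operations. Both examples are synchronous writes.

Next, consider the interface for batched asynchronous writes: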

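org.apache.hadoop.hbase.client.BufferedMutator is used to communicate with a single HBase table. It plays a role similar to Table, but is meant for batched, asynchronous writes.

BufferedMutator replaces what HTable's setAutoFlush(false) used to do.

A BufferedMutator instance is obtained from a Connection instance. When you are done with it, call close() on the BufferedMutator, which also flushes any edits still in the buffer. BufferedMutator is configured through BufferedMutatorParams.

MapReduce jobs are the typical use case for BufferedMutator. A MapReduce job benefits from batched writes, but it has no natural point at which to call flush.

BufferedMutator receives the Puts sent by the MapReduce job and batches them according to a heuristic, such as the accumulated size of the Puts received so far. It submits these batches asynchronously, so the MapReduce job is not interrupted.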
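The size-based batching heuristic described above can be illustrated with a toy model in plain Java. This is only a sketch of the idea, not HBase's implementation: the class ToyWriteBuffer, its flushThresholdBytes parameter, and the sink callback are all hypothetical names invented for this illustration, and the toy flushes synchronously in the caller's thread, whereas BufferedMutator submits its batches asynchronously.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Toy model of size-triggered write buffering. NOT HBase's BufferedMutator. */
class ToyWriteBuffer implements AutoCloseable {
  // Hypothetical knob: flush once this many bytes have accumulated.
  private final long flushThresholdBytes;
  // Stands in for the batched request that would be sent to the server.
  private final Consumer<List<byte[]>> sink;
  private final List<byte[]> pending = new ArrayList<>();
  private long pendingBytes = 0;

  ToyWriteBuffer(long flushThresholdBytes, Consumer<List<byte[]>> sink) {
    this.flushThresholdBytes = flushThresholdBytes;
    this.sink = sink;
  }

  /** Buffer one edit; send a batch once the accumulated size reaches the threshold. */
  synchronized void mutate(byte[] edit) {
    pending.add(edit);
    pendingBytes += edit.length;
    if (pendingBytes >= flushThresholdBytes) {
      flush();
    }
  }

  /** Send everything buffered so far as a single batch. */
  synchronized void flush() {
    if (pending.isEmpty()) {
      return;
    }
    sink.accept(new ArrayList<>(pending));
    pending.clear();
    pendingBytes = 0;
  }

  /** Number of edits accepted but not yet handed to the sink. */
  synchronized int pendingCount() {
    return pending.size();
  }

  /** Closing flushes whatever is still buffered. */
  @Override
  public void close() {
    flush();
  }
}
```

The caller only ever calls mutate(); the buffer decides when a batch is large enough to send, which is why a job with no natural flush point can still write in batches.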

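BufferedMutator can also be used in more unusual circumstances. In MapReduce batch jobs, each thread has its own BufferedMutator.

A single BufferedMutator can also be used effectively in high-volume online systems to batch Puts. The caveat is that in extreme situations, such as a JVM crash or machine failure, edits that are still in the buffer may be lost.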
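One way to limit how much a crash can lose is to force a flush after a bounded number of edits. The sketch below is an assumption-laden illustration of that idea rather than anything HBase provides: FlushEveryN, editBuffered(), and atRisk() are hypothetical names, and the sink is any java.io.Flushable.

```java
import java.io.Flushable;
import java.io.IOException;

/**
 * Hypothetical helper: calls flush() on a sink after every N edits, so that at
 * most N of this writer's edits are sitting unflushed at any time.
 */
class FlushEveryN {
  private final Flushable sink;
  private final int maxUnflushed;
  private int sinceLastFlush = 0;

  FlushEveryN(Flushable sink, int maxUnflushed) {
    if (maxUnflushed < 1) {
      throw new IllegalArgumentException("maxUnflushed must be >= 1");
    }
    this.sink = sink;
    this.maxUnflushed = maxUnflushed;
  }

  /** Record that one edit was handed to the buffer; force a flush at the bound. */
  void editBuffered() throws IOException {
    sinceLastFlush++;
    if (sinceLastFlush >= maxUnflushed) {
      sink.flush();
      sinceLastFlush = 0;
    }
  }

  /** Upper bound on this writer's edits that have not been explicitly flushed. */
  int atRisk() {
    return sinceLastFlush;
  }
}
```

With HBase, the sink could presumably be a method reference such as mutator::flush, with editBuffered() called after each mutator.mutate(p). Forced flushes block the caller, so a smaller bound trades throughput for a smaller loss window.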

Path in the official source tree: /hbase-2.0.4/hbase-examples/src/main/java/org/apache/hadoop/hbase/client/example/BufferedMutatorExample.java

/**
 *
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.hadoop.hbase.client.example;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.BufferedMutator;
import org.apache.hadoop.hbase.client.BufferedMutatorParams;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
import org.apache.yetus.audience.InterfaceAudience;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * An example of using the {@link BufferedMutator} interface.
 */
@InterfaceAudience.Private
public class BufferedMutatorExample extends Configured implements Tool {

  private static final Logger LOG = LoggerFactory.getLogger(BufferedMutatorExample.class);

  private static final int POOL_SIZE = 10;
  private static final int TASK_COUNT = 100;
  private static final TableName TABLE = TableName.valueOf("foo");
  private static final byte[] FAMILY = Bytes.toBytes("f");

  @Override
  public int run(String[] args) throws InterruptedException, ExecutionException, TimeoutException {

    /** a callback invoked when an asynchronous write fails. */
    final BufferedMutator.ExceptionListener listener = new BufferedMutator.ExceptionListener() {
      @Override
      public void onException(RetriesExhaustedWithDetailsException e, BufferedMutator mutator) {
        for (int i = 0; i < e.getNumExceptions(); i++) {
          LOG.info("Failed to sent put " + e.getRow(i) + ".");
        }
      }
    };
    BufferedMutatorParams params = new BufferedMutatorParams(TABLE)
        .listener(listener);

    //
    // step 1: create a single Connection and a BufferedMutator, shared by all worker threads.
    //
    try (final Connection conn = ConnectionFactory.createConnection(getConf());
         final BufferedMutator mutator = conn.getBufferedMutator(params)) {

      /** worker pool that operates on BufferedTable instances */
      final ExecutorService workerPool = Executors.newFixedThreadPool(POOL_SIZE);
      List<Future<Void>> futures = new ArrayList<>(TASK_COUNT);

      for (int i = 0; i < TASK_COUNT; i++) {
        futures.add(workerPool.submit(new Callable<Void>() {
          @Override
          public Void call() throws Exception {
            //
            // step 2: each worker sends edits to the shared BufferedMutator instance. They all use
            // the same backing buffer, call-back "listener", and RPC executor pool.
            //
            Put p = new Put(Bytes.toBytes("someRow"));
            p.addColumn(FAMILY, Bytes.toBytes("someQualifier"), Bytes.toBytes("some value"));
            mutator.mutate(p);
            // do work... maybe you want to call mutator.flush() after many edits to ensure any of
            // this worker's edits are sent before exiting the Callable
            return null;
          }
        }));
      }

      //
      // step 3: clean up the worker pool, shut down.
      //
      for (Future<Void> f : futures) {
        f.get(5, TimeUnit.MINUTES);
      }
      workerPool.shutdown();
    } catch (IOException e) {
      // exception while creating/destroying Connection or BufferedMutator
      LOG.info("exception while creating/destroying Connection or BufferedMutator", e);
    } // BufferedMutator.close() ensures all work is flushed. Could be the custom listener is
      // invoked from here.
    return 0;
  }

  public static void main(String[] args) throws Exception {
    ToolRunner.run(new BufferedMutatorExample(), args);
  }
}
