HBase Beginner's Tutorial

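# Background

I recently noticed that one of our company's projects uses HBase, and I had been meaning to look into it for a while. As I understand it, HBase is a NoSQL database whose logical model looks somewhat like that of a relational database: a table has rows and columns. The difference is that columns are grouped into column families, and one column family can hold several fields (column qualifiers), roughly in the following layout: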

Table users

row | column (family:qualifier) | value | column (family:qualifier) | value

rows1 | info:name | zhangsan | ... | ...

rows1 | info:address | wudaokou | ... | ...
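
The layout above can be pictured as nested sorted maps: row key, then column family, then column qualifier, then value. The following is only a sketch of the logical model under that assumption; `LogicalModelSketch` is a hypothetical class, not part of HBase, and it leaves out the per-cell timestamps (versions) that real HBase also keeps.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

/** Toy picture of HBase's logical layout; a hypothetical helper, not an HBase class. */
class LogicalModelSketch {
    // row key -> column family -> column qualifier -> value, each level sorted by key
    private final NavigableMap<String, NavigableMap<String, NavigableMap<String, String>>> rows =
            new TreeMap<>();

    /** Store one cell, creating the row map and the family map on first use. */
    void put(String row, String family, String qualifier, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>())
            .computeIfAbsent(family, f -> new TreeMap<>())
            .put(qualifier, value);
    }

    /** Return the cell value, or null if the row, family, or qualifier is absent. */
    String get(String row, String family, String qualifier) {
        NavigableMap<String, NavigableMap<String, String>> families = rows.get(row);
        if (families == null) {
            return null;
        }
        NavigableMap<String, String> columns = families.get(family);
        return columns == null ? null : columns.get(qualifier);
    }

    public static void main(String[] args) {
        LogicalModelSketch users = new LogicalModelSketch();
        users.put("rows1", "info", "name", "zhangsan");
        users.put("rows1", "info", "address", "wudaokou");
        System.out.println(users.get("rows1", "info", "name")); // prints "zhangsan"
    }
}
```

Both cells live under the same row key and the same column family `info`; only the qualifier differs, which is why one family can hold many fields.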

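# Installation

HBase can be installed in three modes: standalone, pseudo-distributed, and fully distributed (cluster). I use standalone mode here. The official download page is https://hbase.apache.org/downloads.html

Download the release and unpack it.

Be sure to follow the official documentation for the installation steps; most blog posts on the topic are out of date.

1. Set JAVA_HOME in hbase-env.sh

2. Edit hbase-site.xml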

<configuration>

  <property>

    <name>hbase.rootdir</name>

    <value>file:///Users/gxf/hbase</value>

  </property>

  <property>

    <name>hbase.zookeeper.property.dataDir</name>

    <value>/Users/gxf/zookeeper</value>

  </property>

  <property>

    <name>hbase.unsafe.stream.capability.enforce</name>

    <value>false</value>

    <description>

      Controls whether HBase will check for stream capabilities (hflush/hsync).

      Disable this if you intend to run on LocalFileSystem, denoted by a rootdir

      with the 'file://' scheme, but be mindful of the NOTE below.

      WARNING: Setting this to false blinds you to potential data loss and

      inconsistent system state in the event of process and/or node failures. If

      HBase is complaining of an inability to use hsync or hflush it's most

      likely not a false positive.

    </description>

  </property>

</configuration>

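At this point, installation and standalone deployment are essentially complete.

Run $HBASE_HOME/bin/start-hbase.sh to start HBase.

If http://localhost:16010/master-status loads correctly, HBase has started successfully.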
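
The same check can be scripted instead of opening a browser. This is a minimal sketch, assuming the master web UI is reachable at the URL given above; the class name `MasterStatusCheck` and the 2-second timeouts are illustrative choices, not anything defined by HBase.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

/** Hypothetical helper that probes a status page over HTTP. */
class MasterStatusCheck {

    /** Returns true only if an HTTP GET on the URL answers with status 200. */
    static boolean isUp(String url) {
        try {
            HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
            conn.setConnectTimeout(2000); // assumed timeout, in milliseconds
            conn.setReadTimeout(2000);
            conn.setRequestMethod("GET");
            try {
                return conn.getResponseCode() == HttpURLConnection.HTTP_OK;
            } finally {
                conn.disconnect();
            }
        } catch (IOException e) {
            // Malformed URL, connection refused, timeout, and similar failures
            return false;
        }
    }

    public static void main(String[] args) {
        String url = args.length > 0 ? args[0] : "http://localhost:16010/master-status";
        System.out.println(url + " reachable: " + isUp(url));
    }
}
```

A `false` result only says the page did not answer with 200 within the timeout; the HBase logs remain the place to look for the actual cause.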

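# Usage

HBase ships with a command-line client (the HBase shell). With it we can create, drop, alter, and inspect tables, and insert, query, delete, and update records.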

$HBASE_HOME/bin/hbase shell

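This starts the client. The basic commands are also documented on the official site, and I recommend reading them there. I am only relaying them here to get familiar with them and to keep a memo for myself.

1. The list command lists all tables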

list

2. Create a user table with the column family info, which stores users' basic information

create 'user', 'info'

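3. To delete a table, disable it first and then drop it (run the create command from step 2 again before trying the steps that follow)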

disable 'user'

drop 'user'

4. Insert data: put 'tablename', 'row', 'cf:col', 'value'

put 'user', 'row1', 'info:name', 'guanxianseng'

5. Query data: scan 'tablename'

scan 'user'
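
The shell writes a column as a single 'cf:col' string, while the Java client used later takes the column family and the qualifier as separate arguments. The sketch below shows one way to split such a string; `ColumnSpec` is a hypothetical helper, and treating a string without a colon as "family only, empty qualifier" is an assumption of this sketch.

```java
/** Hypothetical holder for the two parts of a 'cf:col' column name. */
class ColumnSpec {
    final String family;
    final String qualifier;

    ColumnSpec(String family, String qualifier) {
        this.family = family;
        this.qualifier = qualifier;
    }

    /** Split at the first colon, so a qualifier may itself contain colons. */
    static ColumnSpec parse(String spec) {
        int idx = spec.indexOf(':');
        if (idx < 0) {
            // Assumption: no colon means the whole string is the family name
            return new ColumnSpec(spec, "");
        }
        return new ColumnSpec(spec.substring(0, idx), spec.substring(idx + 1));
    }

    public static void main(String[] args) {
        ColumnSpec c = ColumnSpec.parse("info:name");
        System.out.println(c.family + " / " + c.qualifier); // prints "info / name"
    }
}
```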

# Java client

pom.xml

<dependencies>

    <dependency>

      <groupId>org.apache.hbase</groupId>

      <artifactId>hbase-client</artifactId>

      <version>1.4.8</version>

    </dependency>

    <!-- log -->

    <dependency>

      <groupId>org.slf4j</groupId>

      <artifactId>slf4j-api</artifactId>

      <version>1.7.25</version>

    </dependency>

    <dependency>

      <groupId>org.apache.logging.log4j</groupId>

      <artifactId>log4j-slf4j-impl</artifactId>

      <version>2.11.1</version>

    </dependency>

    <dependency>

      <groupId>org.apache.logging.log4j</groupId>

      <artifactId>log4j-core</artifactId>

      <version>2.11.1</version>

    </dependency>

    <dependency>

      <groupId>org.apache.logging.log4j</groupId>

      <artifactId>log4j-api</artifactId>

      <version>2.11.1</version>

    </dependency>

  </dependencies>

I use logging here, so I added log4j and the related logging dependencies.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;

import org.apache.hadoop.hbase.Cell;

import org.apache.hadoop.hbase.CellUtil;

import org.apache.hadoop.hbase.HBaseConfiguration;

import org.apache.hadoop.hbase.HColumnDescriptor;

import org.apache.hadoop.hbase.HTableDescriptor;

import org.apache.hadoop.hbase.TableName;

import org.apache.hadoop.hbase.client.Admin;

import org.apache.hadoop.hbase.client.Connection;

import org.apache.hadoop.hbase.client.ConnectionFactory;

import org.apache.hadoop.hbase.client.Delete;

import org.apache.hadoop.hbase.client.Get;

import org.apache.hadoop.hbase.client.Put;

import org.apache.hadoop.hbase.client.Result;

import org.apache.hadoop.hbase.client.ResultScanner;

import org.apache.hadoop.hbase.client.Scan;

import org.apache.hadoop.hbase.client.Table;

import org.slf4j.Logger;

import org.slf4j.LoggerFactory;

public class HBaseTest {

  private static Configuration conf = null;

  private static Connection connection = null;

  private static Admin admin = null;

  private static Logger logger = LoggerFactory.getLogger(HBaseTest.class);

  static {

    // Configure the connection settings

    conf = HBaseConfiguration.create();

    conf.set("hbase.zookeeper.quorum", "localhost");

    conf.set("hbase.zookeeper.property.clientPort", "2181");

    conf.setInt("hbase.rpc.timeout", 2000);

    conf.setInt("hbase.client.operation.timeout", 3000);

    conf.setInt("hbase.client.scanner.timeout.period", 6000);

    try {

      connection = ConnectionFactory.createConnection(conf);

      admin = connection.getAdmin();

    } catch (Exception e) {

      logger.error("failed to create the HBase connection", e);

    }

  }

  public static void main(String[] args) throws Exception {

    String tableName = "test3";

    String[] colFam = new String[]{"colFam"};

//    createTable(tableName, colFam);

//    deleteTable(tableName);

//    listTables();

//    addData("users", "row3", "info", "name", "guanxianseng");

//    deleteData("users", "row1", "info", "name");

//    query("users", "row2", "info", "name");

    scan("users", "row1", "row2");

  }

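  /**

   * Scan the rows in [startRowKey, stopRowKey): the start row is included, the stop row is not

   * */

  public static void scan(String tableNameStr, String startRowKey, String stopRowKey)

      throws IOException {

    Scan scan = new Scan();

    // Restrict the scan to the requested row-key range

    scan.setStartRow(startRowKey.getBytes());

    scan.setStopRow(stopRowKey.getBytes());

    // try-with-resources closes the scanner and the table even if the loop throws

    try (Table table = connection.getTable(TableName.valueOf(tableNameStr));

         ResultScanner resultScanner = table.getScanner(scan)) {

      for (Result result : resultScanner) {

        showCell(result);

      }

    }

  }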

  /**

   * Read a single column (colFam:col) of one row

   * */

  public static void query(String tableNameStr, String rowkey, String colFam, String col)

      throws IOException {

    Get get = new Get(rowkey.getBytes());

    // Fetch only the requested column instead of the whole row

    get.addColumn(colFam.getBytes(), col.getBytes());

    try (Table table = connection.getTable(TableName.valueOf(tableNameStr))) {

      showCell(table.get(get));

    }

  }

  /**

   * Log every cell contained in a Result

   * */

  private static void showCell(Result result){

    for(Cell cell : result.rawCells()){

      logger.info("rowkey:{}, timestamp:{}, colFam:{}, colName:{}, value:{}", new String(CellUtil.cloneRow(cell)), cell.getTimestamp(),

                  new String(CellUtil.cloneFamily(cell)), new String(CellUtil.cloneQualifier(cell)), new String(CellUtil.cloneValue(cell)));

    }

  }

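  /**

   * Delete one column (colFam:col) of a row

   * */

  public static void deleteData(String tableNameStr, String row,  String colFam, String col) throws IOException {

    Delete delete = new Delete(row.getBytes());

    // Delete all versions of the given column only; without this line the whole row is removed

    delete.addColumns(colFam.getBytes(), col.getBytes());

    try (Table table = connection.getTable(TableName.valueOf(tableNameStr))) {

      table.delete(delete);

    }

    logger.info("delete tablename: {}, row:{}, colFam:{}, col:{}", tableNameStr, row, colFam, col);

  }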

  /**

   * Insert one value into a table

   * */

  public static void addData(String tableNameStr, String rowkey, String colFam, String col, String value)

      throws IOException {

    Put put = new Put(rowkey.getBytes());

    put.addColumn(colFam.getBytes(), col.getBytes(), value.getBytes());

    // try-with-resources closes the table even if put throws

    try (Table table = connection.getTable(TableName.valueOf(tableNameStr))) {

      table.put(put);

    }

    logger.info("put table:{}, rowkey:{}, colFam:{}, col:{}, value:{}", tableNameStr, rowkey, colFam, col, value);

  }

  /**

   * List all tables

   * */

  public static void listTables() throws IOException {

    HTableDescriptor[] hTableDescriptors = admin.listTables();

    for(HTableDescriptor hTableDescriptor : hTableDescriptors){

      logger.info("table :{}", hTableDescriptor.getTableName());

    }

  }

  /**

   * Create a table with the given column families

   */

  public static void createTable(String tableNameStr, String[] colFam) {

    try {

      TableName tableName = TableName.valueOf(tableNameStr);

      if (admin.tableExists(tableName)) {

        // The table already exists

        logger.info("table {} already exists", tableNameStr);

      } else {

        // The table does not exist yet, so build its descriptor and create it

        HTableDescriptor hTableDescriptor = new HTableDescriptor(tableName);

        for (String colStr : colFam) {

          HColumnDescriptor columnDescriptor = new HColumnDescriptor(colStr);

          hTableDescriptor.addFamily(columnDescriptor);

        }

        admin.createTable(hTableDescriptor);

        logger.info("create table success");

        // The shared admin stays open so that later calls can still use it

      }

    } catch (Exception e) {

      logger.error("failed to create table {}", tableNameStr, e);

    }

  }

  /**

   * Delete a table: 1. disable 2. delete

   */

  public static void deleteTable(String tableNameStr) throws Exception {

    TableName tableName = TableName.valueOf(tableNameStr);

    if (!admin.tableExists(tableName)) {

      logger.error("table :{} not exist", tableNameStr);

    } else {

      admin.disableTable(tableName);

      admin.deleteTable(tableName);

      logger.info("delete table:{}", tableNameStr);

    }

  }

}

This Java demo also draws on demos found online.
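
The scan method in the demo takes a start and a stop row key. HBase returns rows in lexicographic row-key order, and by default the start row is included while the stop row is excluded. The sketch below imitates that behaviour with a plain `TreeMap`, with no HBase involved; `ScanRangeSketch` and the sample keys are made up for illustration, and Java's string ordering only matches HBase's byte ordering for plain ASCII keys like these.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical stand-in for a table: a sorted map from row key to value. */
class ScanRangeSketch {

    /** Row keys in [start, stop): start inclusive, stop exclusive, in sorted order. */
    static List<String> rowsInRange(NavigableMap<String, String> table, String start, String stop) {
        return new ArrayList<>(table.subMap(start, true, stop, false).keySet());
    }

    public static void main(String[] args) {
        NavigableMap<String, String> table = new TreeMap<>();
        table.put("row1", "a");
        table.put("row2", "b");
        table.put("row3", "c");
        table.put("row10", "d");
        // "row10" sorts between "row1" and "row2" because keys compare character by character
        System.out.println(rowsInRange(table, "row1", "row2")); // prints [row1, row10]
    }
}
```

Two things follow for row-key design: a scan from "row1" to "row2" does not return "row2", and numeric suffixes need zero padding (for example row01, row02, row10) if they are meant to sort numerically.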
