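Installing HBase in Standalone Mode

In standalone mode, HBase runs as a single JVM process that hosts all of its daemons: the HMaster, a single HRegionServer, and the ZooKeeper service.

Download and Installation

Download the latest release here.

Download older releases here.

The steps below use version 2.4.14, the latest release at the time of writing, on Ubuntu 18.04 Server.

Extract the archive into a directory of your choice, for example /home/zhangsan/opt.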

$ tar xvf hbase-2.4.14-bin.tar.gz

$ cd hbase-2.4.14

Edit the configuration. The main change is setting the JAVA_HOME environment variable.

$ vim conf/hbase-env.sh

export JAVA_HOME=/home/zhangsan/opt/jdk-11.0.16.1

Start the service:

$ cd /home/zhangsan/opt/hbase-2.4.14

$ ./bin/start-hbase.sh

Check whether the service started successfully:

zhangsan@ubuntu18_server:~/opt/hbase-2.4.14$ jps

8926 HMaster

9359 Jps

If HBase started successfully in standalone mode, jps lists a process named HMaster.
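The same check can be scripted. The sketch below is one possible way to do it in Java, assuming jps is on the PATH. The class name JpsCheck and the method hasProcess are made up for this example and are not part of HBase.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

final class JpsCheck {
    /** Returns true if the given jps output lists a JVM named processName. */
    static boolean hasProcess(String jpsOutput, String processName) {
        for (String line : jpsOutput.split("\\R")) {
            // jps prints "<pid> <name>"; blank or malformed lines are skipped
            String[] parts = line.trim().split("\\s+");
            if (parts.length >= 2 && parts[1].equals(processName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        try {
            // Assumption: the jps tool from the JDK is available on the PATH
            Process p = new ProcessBuilder("jps").redirectErrorStream(true).start();
            String out = new String(p.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
            p.waitFor();
            System.out.println(hasProcess(out, "HMaster") ? "HMaster is running" : "HMaster not found");
        } catch (IOException e) {
            System.out.println("could not run jps: " + e.getMessage());
        }
    }
}
```

Only HMaster is checked because, in standalone mode, the region server and ZooKeeper run inside that same JVM.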

Stop the service:

$ cd /home/zhangsan/opt/hbase-2.4.14

$ ./bin/stop-hbase.sh

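Accessing HBase

Broadly, two kinds of clients can connect to and operate on HBase:

First: the command-line client bundled with HBase

Second: programmatic API clients

Command-Line Client

# Change to the HBase installation directory

$ cd /home/zhangsan/opt/hbase-2.4.14

# Connect to HBase with the command-line shell

$ ./bin/hbase shell

# Create a table

> create 'test', 'cf'

# Check that the table exists

> list 'test'

# Show the table's details

> describe 'test'

# Add data to the table

> put 'test', 'row1', 'cf:a', 'value1'

> put 'test', 'row2', 'cf:b', 'value2'

> put 'test', 'row3', 'cf:c', 'value3'

# Show all data in the table

> scan 'test'

# Get a single row from the table

> get 'test', 'row1'

# Disable the table

> disable 'test'

# Enable the table

> enable 'test'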

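# Drop the table

# Note: a table must be disabled before it can be dropped (if you re-enabled it above, run disable 'test' again first); otherwise the shell reports "ERROR: Table xxx is enabled. Disable it first."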

> drop 'test'

# List all tables (the sample output below comes from a session in which the test table still exists)

> list

TABLE

test

1 row(s)

Took 0.0557 seconds

=> ["test"]
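To make it clearer what put, get, and scan in the session above operate on, a table can be pictured as a sorted map from row key to columns. The sketch below is a plain-Java toy model of that idea, not HBase code: TinyTable is a hypothetical name, and it ignores timestamps, cell versions, and the fact that HBase stores byte arrays rather than strings.

```java
import java.util.Collections;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

final class TinyTable {
    // row key -> ("family:qualifier" -> value); TreeMap keeps keys sorted
    private final NavigableMap<String, NavigableMap<String, String>> rows = new TreeMap<>();

    /** Mirrors: put 'test', 'row1', 'cf:a', 'value1' */
    void put(String row, String column, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>()).put(column, value);
    }

    /** Mirrors: get 'test', 'row1' (returns an empty map for an unknown row) */
    Map<String, String> get(String row) {
        NavigableMap<String, String> cells = rows.get(row);
        if (cells == null) {
            return Collections.emptyMap();
        }
        return cells;
    }

    /** Mirrors: scan 'test' (all rows, in row-key order) */
    NavigableMap<String, NavigableMap<String, String>> scan() {
        return rows;
    }

    public static void main(String[] args) {
        TinyTable test = new TinyTable();
        test.put("row1", "cf:a", "value1");
        test.put("row2", "cf:b", "value2");
        test.put("row3", "cf:c", "value3");
        System.out.println(test.get("row1"));
        System.out.println(test.scan());
    }
}
```

The point of the model is only that rows come back ordered by row key and that each row holds its own set of columns, which is why row2 can have cf:b without having cf:a.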

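Programmatic Client

The most commonly used programmatic client is hbase-client, which ships with HBase. It is a fairly low-level API, so in practice it is usually wrapped in a higher-level layer.

For basic usage of the hbase-client API, see Apache HBase APIs.

For more detailed usage, see the following documents:

HBase Java API: hbase-client

Source Code Walkthrough (1): HBase Client Source Code

HBase (2): Working with HBase from Java

Note that creating an org.apache.hadoop.hbase.client.Connection object is very expensive, so avoid creating one frequently and reuse a single instance instead. See "Using HBase Connection Properly".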
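The "create once, reuse everywhere" advice can be sketched with a small lazy holder. To keep the example compilable without hbase-client on the classpath, it is written generically: SharedResource is a hypothetical helper, and in a real application T would presumably be Connection and the factory would call ConnectionFactory.createConnection(conf).

```java
import java.util.concurrent.Callable;

final class SharedResource<T extends AutoCloseable> implements AutoCloseable {
    private final Callable<T> factory;
    private T instance;

    SharedResource(Callable<T> factory) {
        this.factory = factory;
    }

    /** Creates the resource on first use and returns the same instance afterwards. */
    synchronized T get() throws Exception {
        if (instance == null) {
            instance = factory.call();
        }
        return instance;
    }

    /** Closes the resource if it was ever created; call once at shutdown. */
    @Override
    public synchronized void close() throws Exception {
        if (instance != null) {
            instance.close();
            instance = null;
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for an expensive connection; prints when it is actually created
        try (SharedResource<AutoCloseable> shared = new SharedResource<AutoCloseable>(() -> {
            System.out.println("creating resource");
            return (AutoCloseable) () -> System.out.println("closing resource");
        })) {
            Object first = shared.get();
            Object second = shared.get();
            System.out.println("same instance: " + (first == second));
        }
    }
}
```

Callable is used instead of Supplier so that the factory may throw a checked exception, as a connection factory typically does.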

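Allowing Remote Connections to HBase in Local Mode

A remote connection to HBase means accessing it over the network by address and port, as in the following example: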

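import org.apache.hadoop.conf.Configuration;

import org.apache.hadoop.hbase.HBaseConfiguration;

import org.apache.hadoop.hbase.client.Connection;

import org.apache.hadoop.hbase.client.ConnectionFactory;

// Connect to a remote HBase instance with hbase-client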

Configuration conf = HBaseConfiguration.create();

conf.set("hbase.zookeeper.quorum", "192.168.10.100");

conf.set("hbase.zookeeper.property.clientPort", "2181");

Connection connection = ConnectionFactory.createConnection(conf);

By default, when HBase is started in standalone mode, its ZooKeeper service cannot be reached remotely, because the port is bound to localhost only.
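A quick way to confirm this from a client machine is a plain TCP connect test against the ZooKeeper port. The sketch below uses only the Java standard library; PortCheck and isReachable are names invented for this example, and the default host and port in main are placeholders to replace with your own.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

final class PortCheck {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts, and unresolvable host names
            return false;
        }
    }

    public static void main(String[] args) {
        // Placeholders: pass the HBase host and ZooKeeper client port as arguments
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 2181;
        System.out.println(host + ":" + port + " reachable: " + isReachable(host, port, 2000));
    }
}
```

A successful connect only shows that something is listening on that port; it does not prove the listener is ZooKeeper or that HBase itself is healthy.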

The fix is to use an external ZooKeeper service (make sure this ZooKeeper is reachable from outside) and to adjust the corresponding HBase settings in hbase-site.xml, as shown below:

<configuration>

  <property>

    <name>hbase.cluster.distributed</name>

    <value>true</value>     <!-- must be set to true in order to use an external ZooKeeper -->

  </property>

  <property>

    <name>hbase.tmp.dir</name>

    <value>./tmp</value>

  </property>

  <property>

    <name>hbase.unsafe.stream.capability.enforce</name>

    <value>false</value>

  </property>

  <property>

    <name>hbase.rootdir</name>

    <value>file:///opt/hbase-2.3.4/data/hbase</value>

  </property>

  <property>

    <name>hbase.zookeeper.property.dataDir</name>

    <value>/tmp/zookeeper</value>

  </property>

  <property>

     <name>hbase.zookeeper.quorum</name>

     <value>localhost</value>  <!-- address of the external ZooKeeper service; an IP address or a host name -->

  </property>

  <property>

     <name>hbase.zookeeper.property.clientPort</name>

     <value>2181</value>       <!-- client port of the external ZooKeeper service -->

  </property>

</configuration>
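Because a typo in hbase-site.xml is easy to miss, it can help to read the file back and print the settings that matter here. The following is a minimal sketch using the JDK's built-in XML parser; HBaseSiteReader is a hypothetical helper, and the default path conf/hbase-site.xml assumes the program is run from the HBase installation directory.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class HBaseSiteReader {
    /** Reads every <property> element into a name -> value map, in file order. */
    static Map<String, String> read(InputStream in) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
        NodeList props = doc.getElementsByTagName("property");
        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < props.getLength(); i++) {
            Element property = (Element) props.item(i);
            String name = childText(property, "name");
            if (name != null) {
                result.put(name, childText(property, "value"));
            }
        }
        return result;
    }

    // Returns the trimmed text of the first child element with this tag, or null
    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        // Assumption: run from the HBase installation directory unless a path is given
        Path path = Paths.get(args.length > 0 ? args[0] : "conf/hbase-site.xml");
        if (!Files.exists(path)) {
            System.out.println("no such file: " + path);
            return;
        }
        try (InputStream in = Files.newInputStream(path)) {
            Map<String, String> props = read(in);
            System.out.println("hbase.cluster.distributed = " + props.get("hbase.cluster.distributed"));
            System.out.println("hbase.zookeeper.quorum = " + props.get("hbase.zookeeper.quorum"));
            System.out.println("hbase.zookeeper.property.clientPort = " + props.get("hbase.zookeeper.property.clientPort"));
        }
    }
}
```

This only reports what the file contains; it does not apply HBase's built-in defaults for properties that are missing.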

For details, see: hbase (local mode) remote access

References

https://hbase.apache.org/book.html#_preface Apache HBase Reference Guide

https://www.jianshu.com/p/1cf5ab260283 HBase Configuration

https://www.yiibai.com/hbase HBase Tutorial

https://blog.51cto.com/u_14286115/3703411 The Difference Between scan and get When Viewing Versioned Data in HBase

https://www.cnblogs.com/cc11001100/p/9911730.html HBase Notes: namespace

https://toboto.wang/2020/06/09/基于HBase的数据分析方案.html An HBase-Based Data Analysis Approach