HBase HTablePool

Instead of creating an HTable instance for every request from your client application, it makes much more sense to create one initially and subsequently reuse it.

The primary reason for doing so is that creating an HTable instance is a fairly expensive operation that takes a few seconds to complete. In a highly contended environment with thousands of requests per second, you would not be able to use this approach at all: creating the HTable instance would be too slow. You need to create the instance at startup and use it for the duration of your client's life cycle.

There is an additional issue with the HTable being reused by multiple threads within the same process.

The HTable class is not thread-safe, that is, the local write buffer is not guarded against concurrent modifications. Even if you were to use setAutoFlush(true) (which is the default currently; see “Client-side write buffer” on page 86), this is not advisable. Instead, you should use one instance of HTable for each thread you are running in your client application.
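As a minimal sketch of the one-instance-per-thread advice, the following self-contained example uses a ThreadLocal so that each thread lazily creates its own instance and then reuses it. ExpensiveTable and PerThreadTables are hypothetical stand-ins invented for illustration, not HBase classes; a real client would create an HTable where the sketch creates an ExpensiveTable.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a costly, non-thread-safe resource such as HTable.
class ExpensiveTable {
    static final AtomicInteger CREATED = new AtomicInteger();
    final int id = CREATED.incrementAndGet();
}

// Hypothetical helper: hands every thread its own lazily created instance.
class PerThreadTables {
    private static final ThreadLocal<ExpensiveTable> TABLE =
            ThreadLocal.withInitial(ExpensiveTable::new);

    static ExpensiveTable current() {
        return TABLE.get();
    }
}

class PerThreadTablesDemo {
    public static void main(String[] args) throws InterruptedException {
        ExpensiveTable first = PerThreadTables.current();
        ExpensiveTable again = PerThreadTables.current();

        // Ask for the instance from a second thread.
        ExpensiveTable[] fromWorker = new ExpensiveTable[1];
        Thread worker = new Thread(() -> fromWorker[0] = PerThreadTables.current());
        worker.start();
        worker.join();

        System.out.println(first == again);          // true: same thread reuses its instance
        System.out.println(first == fromWorker[0]);  // false: the worker got its own instance
        System.out.println(ExpensiveTable.CREATED.get()); // 2: one instance per thread
    }
}
```

The cost of creation is paid once per thread rather than once per request, and no instance is ever shared between threads.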

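To sum up, there are two main reasons for using HTablePool:

(1) Creating an HTable instance is relatively time-consuming;

(2) HTable instances are not thread-safe.

Before describing how HTablePool works in detail, we need to understand how its two supporting classes are implemented: PoolMap (a direct dependency) and Pool (an indirect dependency).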

Pool

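Pool is just an interface, described in the figure below:

[Figure: the Pool interface]

The interface has three implementing classes, each implemented differently, as shown in the figure below:

[Figure: the three Pool implementations]

(1) ReusablePool

This class extends ConcurrentLinkedQueue. As the name suggests, the pool is implemented internally as a queue: a resource is taken from the head of the queue (poll) when it is requested and, once the caller is done with it, added (add) back to the tail of the queue, provided the pool's quota is not exceeded. The core code is as follows: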

    @Override
    public R get() {
        return poll();
    }

    @Override
    public R put(R resource) {
        if (size() < maxSize) {
            add(resource);
        }
        return null;
    }

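Note: the number of resources in the pool never exceeds maxSize, but the particular instances held in the pool may change over time. Suppose that at some moment the pool is exhausted (every resource is in use by a caller and none has been returned yet). If a request arrives at this point, get returns null, and the caller will probably create a new resource right away. Because the moment at which resources are released and returned is unpredictable, a resource being returned (not necessarily the newly created one) may find that the pool has already reached its quota. That resource cannot re-enter the pool and may be discarded and reclaimed. The instances in the pool change as a result, but the number of resources in the pool stays within the limit.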
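The scenario in the note can be traced with a small self-contained sketch. SketchReusablePool is a hypothetical, simplified stand-in for ReusablePool (it is not the HBase class); unlike the excerpt above, its put returns a boolean so the demo can show whether the resource was kept or dropped.

```java
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical, simplified stand-in for the ReusablePool described above.
class SketchReusablePool<R> extends ConcurrentLinkedQueue<R> {
    private final int maxSize;

    SketchReusablePool(int maxSize) {
        this.maxSize = maxSize;
    }

    // Take a resource from the head of the queue; null means the pool is empty.
    R get() {
        return poll();
    }

    // Add a resource at the tail; returns false if it was dropped because the pool is full.
    boolean put(R resource) {
        if (size() < maxSize) {
            add(resource);
            return true;
        }
        return false;
    }
}

class ReusablePoolDemo {
    public static void main(String[] args) {
        SketchReusablePool<String> pool = new SketchReusablePool<>(1);
        pool.put("table-A");

        String first = pool.get();    // "table-A" leaves the pool
        String second = pool.get();   // pool is exhausted, so this is null
        if (second == null) {
            second = "table-B";       // the caller creates a replacement
        }

        boolean keptB = pool.put(second); // the pool has room, so table-B is kept
        boolean keptA = pool.put(first);  // the pool is full, so table-A is dropped

        System.out.println(keptB + " " + keptA + " " + pool.size()); // true false 1
    }
}
```

The pool ends up holding table-B rather than the original table-A: the instance changed, while the size stayed at the limit. Note that the size check and the add are two separate steps, so under concurrent access the limit is only approximate in this sketch.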

(2) RoundRobinPool

This class extends CopyOnWriteArrayList. As the name suggests, the resources in the list are accessed in round-robin order. The code is as follows:

    @Override
    public R get() {
        if (size() < maxSize) {
            return null;
        }
        nextResource %= size();
        R resource = get(nextResource++);
        return resource;
    }

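The pool maintains an internal index. When a resource is requested, the index is taken modulo the number of resources (size()), the resource at that position is returned, and the index is incremented by one (the resource is not removed from the pool). With round-robin access, the same resource may be used by several callers (threads) at the same time, so the thread safety of the resource needs attention.

The code above also shows that while the pool holds fewer resources than the configured limit, get returns null and the resources already in the pool are not reused.

The code for returning a resource is the same as for ReusablePool, so it is not repeated here.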
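To make the two observations concrete (nothing is handed out until the pool is full, and afterwards resources are shared in rotation), here is a small self-contained sketch. SketchRoundRobinPool is a hypothetical, simplified stand-in written for illustration; it is not the HBase class and, like the excerpt, it does not synchronize access to the index.

```java
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical, simplified stand-in for the RoundRobinPool described above.
class SketchRoundRobinPool<R> extends CopyOnWriteArrayList<R> {
    private final int maxSize;
    private int nextResource = 0;

    SketchRoundRobinPool(int maxSize) {
        this.maxSize = maxSize;
    }

    // Returns null until the pool is full, then cycles through the resources.
    R get() {
        if (size() < maxSize) {
            return null;
        }
        nextResource %= size();
        return get(nextResource++); // the resource stays in the list
    }

    // Adds the resource only while the pool is below its limit.
    void put(R resource) {
        if (size() < maxSize) {
            add(resource);
        }
    }
}

class RoundRobinPoolDemo {
    public static void main(String[] args) {
        SketchRoundRobinPool<String> pool = new SketchRoundRobinPool<>(2);

        String whileEmpty = pool.get();     // null: pool holds 0 of 2 resources
        pool.put("t1");
        String whileHalfFull = pool.get();  // null: pool holds 1 of 2 resources
        pool.put("t2");

        // The pool is now full, so get() rotates over t1, t2, t1, ...
        String a = pool.get();
        String b = pool.get();
        String c = pool.get();

        System.out.println(whileEmpty + " " + whileHalfFull + " " + a + " " + b + " " + c);
        // prints: null null t1 t2 t1
    }
}
```

Because get never removes anything, the third call hands out t1 again even if the first caller is still using it.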

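(3) ThreadLocalPool

This class extends ThreadLocal. As the name suggests, the pool is closely tied to threads: the number of resources in the pool depends on the number of threads that have added a resource to it, and each thread can use only the resource that belongs to it.

    private static final Map<ThreadLocalPool<?>, AtomicInteger> poolSizes = new HashMap<ThreadLocalPool<?>, AtomicInteger>();

poolSizes records the number of resources in each pool instance (a single program can have several different pools).

Acquiring and returning a resource relies mainly on the get and set methods of ThreadLocal. The code is as follows: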

    @Override
    public R put(R resource) {
        R previousResource = get();
        if (previousResource == null) {
            AtomicInteger poolSize = poolSizes.get(this);
            if (poolSize == null) {
                poolSizes.put(this, poolSize = new AtomicInteger(0));
            }
            poolSize.incrementAndGet();
        }
        this.set(resource);
        return previousResource;
    }

In my view, the code between the get and set calls serves only to keep track of the pool's size.
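The per-thread behaviour can be observed with a small self-contained sketch. SketchThreadLocalPool is a hypothetical, simplified stand-in (not the HBase class), and it makes one deliberate change: it counts resources in a per-instance AtomicInteger rather than in the static poolSizes map shown above.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical, simplified stand-in for the ThreadLocalPool described above.
class SketchThreadLocalPool<R> extends ThreadLocal<R> {
    // Assumption of this sketch: a per-instance counter replaces the static map.
    private final AtomicInteger size = new AtomicInteger();

    // Stores the resource for the calling thread and returns the one it replaced.
    R put(R resource) {
        R previous = get();
        if (previous == null) {
            size.incrementAndGet(); // first resource stored by this thread
        }
        set(resource);
        return previous;
    }

    int size() {
        return size.get();
    }
}

class ThreadLocalPoolDemo {
    public static void main(String[] args) throws InterruptedException {
        SketchThreadLocalPool<String> pool = new SketchThreadLocalPool<>();
        pool.put("table-for-main");

        String[] seenByWorker = new String[2];
        Thread worker = new Thread(() -> {
            seenByWorker[0] = pool.get();   // null: this thread has stored nothing yet
            pool.put("table-for-worker");
            seenByWorker[1] = pool.get();   // the worker's own resource
        });
        worker.start();
        worker.join();

        System.out.println(seenByWorker[0] + " " + seenByWorker[1]
                + " " + pool.get() + " " + pool.size());
        // prints: null table-for-worker table-for-main 2
    }
}
```

Each thread sees only what it stored itself, and the counter reflects the number of threads that have contributed a resource.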

PoolMap

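HTablePool organizes its pools by table name (String or byte[]); that is, it maintains a separate connection pool for each HBase table. This is why PoolMap sits on top of Pool: in practice, the key of a PoolMap is the table name.

PoolMap maintains an internal variable named pools:

    private Map<K, Pool<V>> pools = new ConcurrentHashMap<K, Pool<V>>();

The key is the table name and the value is the corresponding pool instance.

The code for requesting a resource is as follows: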

    @Override
    public V get(Object key) {
        Pool<V> pool = pools.get(key);
        return pool != null ? pool.get() : null;
    }

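It first looks up the pool instance for the key (the table name) and then requests a resource from that pool.

The code for adding a resource is as follows: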

    @Override
    public V put(K key, V value) {
        Pool<V> pool = pools.get(key);
        if (pool == null) {
            pools.put(key, pool = createPool());
        }
        return pool != null ? pool.put(value) : null;
    }

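It first looks up the pool instance for the key (the table name) and then adds the resource through that pool. If no pool instance exists for the key yet, one is created and registered under the key. The creation code is as follows: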

    protected Pool<V> createPool() {
        switch (poolType) {
        case Reusable:
            return new ReusablePool<V>(poolMaxSize);
        case RoundRobin:
            return new RoundRobinPool<V>(poolMaxSize);
        case ThreadLocal:
            return new ThreadLocalPool<V>();
        }
        return null;
    }

A pool of the matching type (ReusablePool, RoundRobinPool, or ThreadLocalPool) is created according to poolType.
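The idea of one pool per key can be sketched in a few lines. SketchPoolMap is a hypothetical, simplified stand-in (not the HBase class): it supports a single pool type, a bounded queue, and it creates pools with computeIfAbsent instead of the get-then-put sequence in the excerpt above.

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical, simplified stand-in for PoolMap: one bounded queue per key.
class SketchPoolMap<K, V> {
    private final Map<K, Queue<V>> pools = new ConcurrentHashMap<>();
    private final int poolMaxSize;

    SketchPoolMap(int poolMaxSize) {
        this.poolMaxSize = poolMaxSize;
    }

    // A missing pool and an empty pool both yield null.
    V get(K key) {
        Queue<V> pool = pools.get(key);
        return pool != null ? pool.poll() : null;
    }

    // Creates the pool on first use, then adds the value if there is room.
    boolean put(K key, V value) {
        Queue<V> pool = pools.computeIfAbsent(key, k -> new ConcurrentLinkedQueue<>());
        if (pool.size() < poolMaxSize) {
            return pool.add(value);
        }
        return false;
    }

    int size(K key) {
        Queue<V> pool = pools.get(key);
        return pool != null ? pool.size() : 0;
    }
}

class PoolMapDemo {
    public static void main(String[] args) {
        SketchPoolMap<String, String> tables = new SketchPoolMap<>(2);
        tables.put("users", "users-table-1");
        tables.put("orders", "orders-table-1");

        String first = tables.get("users");     // users-table-1
        String second = tables.get("users");    // null: the users pool is now empty
        String unknown = tables.get("missing"); // null: no pool exists for this key

        System.out.println(first + " " + second + " " + unknown + " " + tables.size("orders"));
        // prints: users-table-1 null null 1
    }
}
```

With computeIfAbsent, two threads that add to a new key at the same time end up sharing one pool. In the get-then-put version, each thread could create its own pool and one of them would be overwritten.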

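With Pool and PoolMap covered, we can now look at how HTablePool itself is implemented.

HTablePool

The class maintains two important fields:

    private final PoolMap<String, HTableInterface> tables;
    ......
    private final HTableInterfaceFactory tableFactory;

tables holds the connection resources (that is, the HTable instances) for each table, and tableFactory is used to create and release HTable instances.

HTableInterfaceFactory has one implementing class, HTableFactory. The code is as follows: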

    public class HTableFactory implements HTableInterfaceFactory {

        @Override
        public HTableInterface createHTableInterface(Configuration config,
                byte[] tableName) {
            try {
                return new HTable(config, tableName);
            } catch (IOException ioe) {
                throw new RuntimeException(ioe);
            }
        }

        @Override
        public void releaseHTableInterface(HTableInterface table) throws IOException {
            table.close();
        }
    }

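HTableFactory is straightforward: it creates and releases (closes) HTable instances.

The code for requesting a connection resource (an HTable instance) for a table is as follows: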

    public HTableInterface getTable(String tableName) {
        // call the old getTable implementation renamed to findOrCreateTable
        HTableInterface table = findOrCreateTable(tableName);
        // return a proxy table so when user closes the proxy, the actual table
        // will be returned to the pool
        return new PooledHTable(table);
    }

    private HTableInterface findOrCreateTable(String tableName) {
        HTableInterface table = tables.get(tableName);
        if (table == null) {
            table = createHTable(tableName);
        }
        return table;
    }

    protected HTableInterface createHTable(String tableName) {
        return this.tableFactory.createHTableInterface(config, Bytes.toBytes(tableName));
    }

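(1) An HTableInterface instance (HTable implements this interface) is looked up or created by table name; this is done by the findOrCreateTable method.

(2) If tables already holds an instance for the table name, it is returned directly; otherwise one is created by createHTable (that is, by tableFactory) and returned.

(3) The HTableInterface instance is wrapped using the proxy pattern, and the wrapped instance (a PooledHTable) is returned to the caller.

PooledHTable forwards every call to the underlying HTable instance, with one exception, the close method: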

    class PooledHTable implements HTableInterface {

        private HTableInterface table; // actual table implementation

        public PooledHTable(HTableInterface table) {
            this.table = table;
        }

        /**
         * Returns the actual table back to the pool
         *
         * @throws IOException
         */
        public void close() throws IOException {
            returnTable(table);
        }

        @Override
        public RowLock lockRow(byte[] row) throws IOException {
            return table.lockRow(row);
        }

        // ... the remaining methods delegate to table in the same way
    }

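The close method no longer closes the table through HTableInterface's close method; instead, it returns the table instance to the corresponding pool. The code is as follows: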

    private void returnTable(HTableInterface table) throws IOException {
        // this is the old putTable method renamed and made private
        String tableName = Bytes.toString(table.getTableName());
        if (tables.size(tableName) >= maxSize) {
            // release table instance since we're not reusing it
            this.tables.remove(tableName, table);
            this.tableFactory.releaseHTableInterface(table);
            return;
        }
        tables.put(tableName, table);
    }

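If the pool for the table name has already reached its quota, the resource is removed from the pool (some pool implementations never actually take resources out of the pool; see the analysis above) and released; otherwise the resource is simply put back into the pool for that table name.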
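The whole borrow-and-return cycle can be sketched without HBase. In the example below, every name (SketchTable, SketchRealTable, SketchTablePool, PooledTable) is a hypothetical stand-in invented for illustration, and the pool serves a single table to keep the code short. The point it illustrates is the one made above: callers receive a proxy whose close() hands the real table back to the pool, and the real table is closed only when the pool is already full.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for HTableInterface.
interface SketchTable {
    String name();
    void close();
}

// Hypothetical stand-in for HTable: close() would really free the resource.
class SketchRealTable implements SketchTable {
    private final String name;
    SketchRealTable(String name) { this.name = name; }
    @Override public String name() { return name; }
    @Override public void close() { /* a real table would release its resources here */ }
}

// Hypothetical single-table pool that hands out proxies.
class SketchTablePool {
    private final Deque<SketchTable> idle = new ArrayDeque<>();
    private final String tableName;
    private final int maxSize;
    private int created = 0;
    private int released = 0;

    SketchTablePool(String tableName, int maxSize) {
        this.tableName = tableName;
        this.maxSize = maxSize;
    }

    // Find an idle table or create one, then wrap it in a proxy.
    synchronized SketchTable getTable() {
        SketchTable table = idle.poll();
        if (table == null) {
            table = new SketchRealTable(tableName);
            created++;
        }
        return new PooledTable(table);
    }

    // Put the real table back, or really close it if the pool is full.
    private synchronized void returnTable(SketchTable table) {
        if (idle.size() >= maxSize) {
            table.close();
            released++;
            return;
        }
        idle.add(table);
    }

    synchronized int created() { return created; }
    synchronized int released() { return released; }
    synchronized int idleCount() { return idle.size(); }

    // Proxy: forwards calls to the real table, except that close() returns it to the pool.
    private final class PooledTable implements SketchTable {
        private final SketchTable table;
        PooledTable(SketchTable table) { this.table = table; }
        @Override public String name() { return table.name(); }
        @Override public void close() { returnTable(table); }
    }
}

class TablePoolDemo {
    public static void main(String[] args) {
        SketchTablePool pool = new SketchTablePool("users", 1);
        SketchTable a = pool.getTable();
        SketchTable b = pool.getTable(); // nothing idle, so a second real table is created
        a.close();                       // goes back into the pool
        b.close();                       // pool is full, so this one is really closed
        SketchTable c = pool.getTable(); // reuses the table returned by a
        System.out.println(pool.created() + " " + pool.released() + " " + pool.idleCount());
        // prints: 2 1 0
    }
}
```

The sketch does not guard against a caller closing the same proxy twice, which would return one real table to the pool two times.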
