[Hive - LanguageManual] Hive Concurrency Model (pending)
Hive Concurrency Model
Use Cases
Concurrency support (http://issues.apache.org/jira/browse/HIVE-1293) is a must in databases, and its use cases are well understood. At a minimum, we want to support concurrent readers and writers whenever possible. It would be useful to add a mechanism to discover the locks that are currently held. There is no immediate requirement to add an API to explicitly acquire any locks, so all locks would be acquired implicitly.
The following lock modes will be defined in Hive (note that an Intent lock is not needed).
- Shared (S)
- Exclusive (X)
As the names suggest, multiple shared locks can be held at the same time, whereas an X lock blocks all other locks.
The compatibility matrix is as follows:
| Requested lock \ Existing lock | S | X |
|---|---|---|
| S | True | False |
| X | False | False |
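The matrix can be read as a lookup keyed by (requested mode, existing mode). The following is a minimal sketch of that reading, not Hive's actual lock manager code; the names `COMPATIBLE` and `can_grant` are made up for illustration.

```python
# Compatibility matrix from the table above.
# Key: (requested mode, mode of a lock that is already held).
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}


def can_grant(requested, existing_locks):
    """Return True if `requested` is compatible with every lock already held."""
    return all(COMPATIBLE[(requested, held)] for held in existing_locks)


if __name__ == "__main__":
    print(can_grant("S", ["S", "S"]))  # readers can share
    print(can_grant("X", ["S"]))       # a writer is blocked by any reader
    print(can_grant("X", []))          # nothing held, so X can be granted
```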
For some operations, locks are hierarchical in nature -- for example for some partition operations, the table is also locked (to make sure that the table cannot be dropped while a new partition is being created).
The rationale behind the choice of lock mode is as follows:
For a non-partitioned table, the lock modes are fairly intuitive. When the table is being read, an S lock is acquired, whereas an X lock is acquired for all other operations (insert into the table, alter table of any kind, etc.).
For a partitioned table, the idea is as follows:
An 'S' lock on the table and the relevant partition is acquired when a read is being performed. For all other operations, an 'X' lock is taken on the partition. However, if the change is only applicable to the newer partitions, an 'S' lock is acquired on the table, whereas if the change is applicable to all partitions, an 'X' lock is acquired on the table. Thus, for example, older partitions can still be read and written while the newer partitions are being converted to RCFile. Whenever a partition is locked in any mode, all its parents are locked in 'S' mode.
Based on this, the lock acquired for an operation is as follows:
| Hive Command | Locks Acquired |
|---|---|
| select .. T1 partition P1 | S on T1, T1.P1 |
| insert into T2(partition P2) select .. T1 partition P1 | S on T2, T1, T1.P1 and X on T2.P2 |
| insert into T2(partition P.Q) select .. T1 partition P1 | S on T2, T2.P, T1, T1.P1 and X on T2.P.Q |
| alter table T1 rename T2 | X on T1 |
| alter table T1 add cols | X on T1 |
| alter table T1 replace cols | X on T1 |
| alter table T1 change cols | X on T1 |
| alter table T1 add partition P1 | S on T1, X on T1.P1 |
| alter table T1 drop partition P1 | S on T1, X on T1.P1 |
| alter table T1 touch partition P1 | S on T1, X on T1.P1 |
| alter table T1 set serdeproperties | S on T1 |
| alter table T1 set serializer | S on T1 |
| alter table T1 set file format | S on T1 |
| alter table T1 set tblproperties | X on T1 |
| drop table T1 | X on T1 |
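The partition rows of the table follow from one rule: lock the target object in the requested mode and every parent in 'S' mode. A small sketch of that rule is shown here; the function name `locks_for` and the dotted naming of nested partitions are illustrative assumptions, and the table-level `alter table` rows are not derived by it.

```python
def locks_for(table, partition_path=(), mode="S"):
    """Return {object name: lock mode} for locking a table or partition.

    The last object on the path gets `mode`; every parent gets 'S'.
    """
    names = [table]
    for part in partition_path:
        names.append(f"{names[-1]}.{part}")
    locks = {name: "S" for name in names[:-1]}  # parents are shared-locked
    locks[names[-1]] = mode                     # target gets the requested mode
    return locks


if __name__ == "__main__":
    # Write side of "insert into T2(partition P.Q) ..."
    print(locks_for("T2", ("P", "Q"), "X"))
    # Read side of "select .. T1 partition P1"
    print(locks_for("T1", ("P1",), "S"))
```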
To avoid deadlocks, a very simple scheme is proposed here. All the objects to be locked are sorted lexicographically, and each lock is then acquired in the required mode. Note that in some cases the list of objects may not be known. For example, with dynamic partitions the list of partitions being modified is not known at compile time, so the list is generated conservatively. Since the number of partitions may not be known, an exclusive lock is taken on the table, or on the prefix that is known.
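As an illustration of the ordering step, the sketch below sorts a set of lock requests by object name, so that any two queries touching the same objects request them in the same order. The helper name `acquisition_order` is hypothetical.

```python
def acquisition_order(requests):
    """Sort lock requests lexicographically by object name.

    `requests` maps an object name (table or table.partition) to 'S' or 'X'.
    """
    return sorted(requests.items())


if __name__ == "__main__":
    # Locks for "insert into T2(partition P2) select .. T1 partition P1"
    requests = {"T2.P2": "X", "T1.P1": "S", "T2": "S", "T1": "S"}
    for name, mode in acquisition_order(requests):
        print(mode, "on", name)
```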
Two new configurable parameters will be added to decide the number of retries for the lock and the wait time between each retry. If the number of retries is very high, it can lead to a livelock. Look at ZooKeeper recipes (http://hadoop.apache.org/zookeeper/docs/r3.1.2/recipes.html#sc_recipes_Locks) to see how read/write locks can be implemented using the ZooKeeper APIs. Note that instead of waiting, the lock request will be denied. The locks already acquired will be released, and all of them will be retried after the retry interval.
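The deny, release, and retry behavior described above can be sketched as a loop. This is only an outline under assumptions: `try_lock` and `unlock` are caller-supplied callbacks, and the parameter names `max_retries` and `retry_interval` are placeholders, not the names of the actual Hive configuration properties.

```python
import time


def acquire_all(requests, try_lock, unlock,
                max_retries=3, retry_interval=1.0, sleep=time.sleep):
    """Try to take every lock; on any denial release all and retry later.

    Returns True once all locks are held, False when retries are exhausted.
    """
    for attempt in range(max_retries + 1):
        held = []
        for name, mode in sorted(requests.items()):  # lexicographic order
            if try_lock(name, mode):
                held.append((name, mode))
            else:
                # Denied: do not wait, release everything acquired so far.
                for h_name, h_mode in reversed(held):
                    unlock(h_name, h_mode)
                break
        else:
            return True  # no denial happened in this attempt
        if attempt < max_retries:
            sleep(retry_interval)
    return False
```

A bounded retry count keeps a query from spinning forever when another query holds a conflicting lock for a long time.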
The recipe listed above will not work as specified, because of the hierarchical nature of locks.
The 'S' lock for table T is specified as follows:
- Call create( ) to create a node with pathname "/warehouse/T/read-". This is the lock node used later in the protocol. Make sure to set the sequence and ephemeral flags.
- Call getChildren( ) on the lock node without setting the watch flag.
- If there is a child with a pathname starting with "write-" and a lower sequence number than the one obtained, the lock cannot be acquired. Delete the node created in the first step and return.
- Otherwise the lock is granted.
The 'X' lock for table T is specified as follows:
- Call create( ) to create a node with pathname "/warehouse/T/write-". This is the lock node used later in the protocol. Make sure to set the sequence and ephemeral flags.
- Call getChildren( ) on the lock node without setting the watch flag.
- If there is a child with a pathname starting with "read-" or "write-" and a lower sequence number than the one obtained, the lock cannot be acquired. Delete the node created in the first step and return.
- Otherwise the lock is granted.
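The two procedures above can be tried out without a ZooKeeper cluster. The sketch below uses an in-memory stand-in for the children of "/warehouse/T"; the class `LockDir` and the function `try_lock` are hypothetical and do not call the real ZooKeeper client API. The ephemeral flag (automatic cleanup when a session dies) is not modeled.

```python
import itertools


class LockDir:
    """In-memory stand-in for the children of one node, e.g. /warehouse/T."""

    def __init__(self):
        self._seq = itertools.count()
        self.children = {}  # child name -> sequence number

    def create(self, prefix):
        # Mimics create() with the sequence flag: append a growing counter.
        seq = next(self._seq)
        name = f"{prefix}{seq:010d}"
        self.children[name] = seq
        return name

    def delete(self, name):
        del self.children[name]


def try_lock(lock_dir, mode):
    """Return the lock node name if granted, or None if denied (no waiting)."""
    prefix = "read-" if mode == "S" else "write-"
    # An S request is blocked only by earlier writers; X by any earlier node.
    blockers = ("write-",) if mode == "S" else ("read-", "write-")
    node = lock_dir.create(prefix)
    my_seq = lock_dir.children[node]
    # Snapshot of the children, like getChildren() without a watch.
    for child, seq in list(lock_dir.children.items()):
        if seq < my_seq and child.startswith(blockers):
            lock_dir.delete(node)  # give up: remove our own node
            return None
    return node
```

For example, two `try_lock(d, "S")` calls on a fresh `LockDir` both succeed, and a following `try_lock(d, "X")` returns None until both read nodes are deleted.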
The proposed scheme favors readers over writers: with long-running readers, writers may starve.
By default, Hive's behavior is unchanged and concurrency is not supported.
Turn Off Concurrency
You can turn off concurrency by setting the following variable to false: hive.support.concurrency.
Debugging
You can see the locks on a table by issuing one of the following commands:
- SHOW LOCKS <TABLE_NAME>;
- SHOW LOCKS <TABLE_NAME> EXTENDED;
- SHOW LOCKS <TABLE_NAME> PARTITION (<PARTITION_DESC>);
- SHOW LOCKS <TABLE_NAME> PARTITION (<PARTITION_DESC>) EXTENDED;
Configuration
Configuration properties for Hive locking are described in Locking.
Locking in Hive Transactions
Hive 0.13.0 adds transactions with row-level ACID semantics, using a new lock manager. For more information, see: