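Hive locks
HiveQL is a SQL dialect, but it lacks row-, column-, or query-level lock support for update and insert operations. Hadoop files are normally write-once (with limited support for appends). Because Hadoop and Hive are both multi-user systems, locking and coordination are very useful. All locks must be coordinated by a separate system.
Hive includes a locking facility that uses Apache ZooKeeper. ZooKeeper provides highly reliable distributed coordination, and its use is transparent to Hive users.
ZooKeeper (pronounced ['zukipɚ])
A ZooKeeper pseudo-cluster can be installed and configured as follows:
Download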
wget https://mirrors.tuna.tsinghua.edu.cn/apache/zookeeper/zookeeper-3.4.12/zookeeper-3.4.12.tar.gz
Extract
tar -zxf zookeeper-3.4.12.tar.gz -C /root/
[root@host zookeeper-3.4.12]# pwd
/root/zookeeper-3.4.12
Create a directory named serverlist containing three subdirectories
[root@host serverlist]# pwd
/root/zookeeper-3.4.12/serverlist
[root@host serverlist]# ls
server1 server2 server3
In each subdirectory of serverlist, create two directories, data and logs:
[root@host server1]# pwd
/root/zookeeper-3.4.12/serverlist/server1
[root@host server1]# ls
data logs
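The directory layout above can also be created programmatically. The following is a minimal sketch, not part of the original procedure: the function name `create_instance_dirs` is hypothetical, and the base path is a parameter so that it can be tried in a scratch directory before pointing it at /root/zookeeper-3.4.12.

```python
from pathlib import Path


def create_instance_dirs(base, count=3):
    """Create serverlist/serverN/{data,logs} under `base` for N = 1..count.

    Returns the list of serverN directories that now exist.
    """
    created = []
    for n in range(1, count + 1):
        server_dir = Path(base) / "serverlist" / f"server{n}"
        for sub in ("data", "logs"):
            # parents=True also creates serverlist/ and serverN/ on first use
            (server_dir / sub).mkdir(parents=True, exist_ok=True)
        created.append(server_dir)
    return created
```

Calling `create_instance_dirs("/root/zookeeper-3.4.12")` would reproduce the server1, server2, and server3 directories listed above; rerunning it is harmless because existing directories are left in place.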
Configuration
Make four copies of zoo_sample.cfg, named zoo.cfg, zoo1.cfg, zoo2.cfg, and zoo3.cfg
zoo1.cfg:
[root@host conf]# cat zoo1.cfg
tickTime=2000
dataDir=/root/zookeeper-3.4.12/serverlist/server1/data
dataLogDir=/root/zookeeper-3.4.12/serverlist/server1/logs
clientPort=2181
initLimit=5
syncLimit=2
server.1=192.168.53.122:2888:3888
server.2=192.168.53.122:4888:5888
server.3=192.168.53.122:6888:7888
zoo2.cfg:
[root@host conf]# cat zoo2.cfg
tickTime=2000
dataDir=/root/zookeeper-3.4.12/serverlist/server2/data
dataLogDir=/root/zookeeper-3.4.12/serverlist/server2/logs
clientPort=2182
initLimit=5
syncLimit=2
server.1=192.168.53.122:2888:3888
server.2=192.168.53.122:4888:5888
server.3=192.168.53.122:6888:7888
zoo3.cfg:
[root@host conf]# cat zoo3.cfg
tickTime=2000
dataDir=/root/zookeeper-3.4.12/serverlist/server3/data
dataLogDir=/root/zookeeper-3.4.12/serverlist/server3/logs
clientPort=2183
initLimit=5
syncLimit=2
server.1=192.168.53.122:2888:3888
server.2=192.168.53.122:4888:5888
server.3=192.168.53.122:6888:7888
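Since the three files differ only in the instance number, they can be generated from one template. The sketch below is an illustration under stated assumptions: the function name `render_zoo_cfg` is hypothetical, and the port formulas (client port 2180 + N, quorum port 2888 + 2000·(N-1), election port 3888 + 2000·(N-1)) are simply chosen to reproduce the port numbers used in the configs above.

```python
def render_zoo_cfg(instance, host, base_dir, count=3,
                   tick_time=2000, init_limit=5, sync_limit=2):
    """Return the text of zoo<instance>.cfg for a pseudo-cluster on one host."""
    lines = [
        f"tickTime={tick_time}",
        f"dataDir={base_dir}/serverlist/server{instance}/data",
        f"dataLogDir={base_dir}/serverlist/server{instance}/logs",
        f"clientPort={2180 + instance}",  # 2181, 2182, 2183
        f"initLimit={init_limit}",
        f"syncLimit={sync_limit}",
    ]
    # Every instance lists ALL members of the ensemble, itself included.
    for n in range(1, count + 1):
        quorum_port = 2888 + 2000 * (n - 1)    # 2888, 4888, 6888
        election_port = 3888 + 2000 * (n - 1)  # 3888, 5888, 7888
        lines.append(f"server.{n}={host}:{quorum_port}:{election_port}")
    return "\n".join(lines) + "\n"
```

The point of generating the files is that the server.N block comes out identical in every file, which is what the ensemble expects.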
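Notes:
tickTime: the basic time unit, in milliseconds. It is the heartbeat interval between ZooKeeper servers and between clients and servers.
dataDir: the location where snapshots of the in-memory database are stored, that is, the directory where ZooKeeper keeps its data. By default ZooKeeper also writes its transaction log to this directory unless dataLogDir is set.
clientPort: the port that clients use to connect to the ZooKeeper server. ZooKeeper listens on this port and accepts client requests.
initLimit: the maximum number of heartbeat intervals (ticks) ZooKeeper tolerates while an initial connection is being established. The "client" here is not a user client but a follower in the ensemble connecting to the leader. If the leader has received no response after initLimit ticks, the connection is considered failed. With the values above the total is 5 * 2000 ms = 10 s.
syncLimit: the maximum number of ticks allowed between a request and its response when the leader and a follower exchange messages. With the values above the total is 2 * 2000 ms = 4 s.
server.A=B:C:D: A is a number identifying the server; B is the server's IP address; C is the port this server uses to exchange information with the leader of the ensemble; D is the port the servers use to talk to each other during leader election, for example when the current leader fails and a new one must be chosen. In a pseudo-cluster B is the same for every instance, so each ZooKeeper instance must be given different port numbers.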
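As a worked check of the parameters described above, the following sketch parses a config in the zoo.cfg format, multiplies initLimit and syncLimit by tickTime, and looks for port collisions between server.A=B:C:D entries. The helper names (`parse_zoo_cfg`, `timeouts_ms`, `duplicate_endpoints`) are hypothetical, and the sample text is zoo1.cfg from this article.

```python
SAMPLE_CFG = """\
tickTime=2000
clientPort=2181
initLimit=5
syncLimit=2
server.1=192.168.53.122:2888:3888
server.2=192.168.53.122:4888:5888
server.3=192.168.53.122:6888:7888
"""


def parse_zoo_cfg(text):
    """Split config text into plain settings and server.A=B:C:D entries."""
    settings, servers = {}, {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key.startswith("server."):
            host, quorum_port, election_port = value.split(":")[:3]
            server_id = int(key.split(".", 1)[1])
            servers[server_id] = (host, int(quorum_port), int(election_port))
        else:
            settings[key] = value
    return settings, servers


def timeouts_ms(settings):
    """Convert initLimit and syncLimit from ticks to milliseconds."""
    tick = int(settings["tickTime"])
    return {
        "initLimit": int(settings["initLimit"]) * tick,
        "syncLimit": int(settings["syncLimit"]) * tick,
    }


def duplicate_endpoints(servers):
    """Return (host, port) pairs that appear more than once across all entries."""
    seen, dupes = set(), set()
    for host, quorum_port, election_port in servers.values():
        for endpoint in ((host, quorum_port), (host, election_port)):
            if endpoint in seen:
                dupes.add(endpoint)
            seen.add(endpoint)
    return dupes


settings, servers = parse_zoo_cfg(SAMPLE_CFG)
print(timeouts_ms(settings))         # {'initLimit': 10000, 'syncLimit': 4000}
print(duplicate_endpoints(servers))  # set()
```

An empty result from `duplicate_endpoints` is what a pseudo-cluster needs: all six quorum and election ports on the shared IP address are distinct.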
[root@host serverlist]# echo 1 > server1/data/myid
[root@host serverlist]# echo 2 > server2/data/myid
[root@host serverlist]# echo 3 > server3/data/myid
Start
[root@host serverlist]# ../bin/zkServer.sh start zoo1.cfg
[root@host serverlist]# ../bin/zkServer.sh start zoo2.cfg
[root@host serverlist]# ../bin/zkServer.sh start zoo3.cfg
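Check status (which instance becomes the leader varies from run to run)
Leader: handles client write requests
Follower: serves client read requests and takes part in leader election
Observer: a special kind of follower that can serve client read requests but does not take part in elections. It does not vote on writes and only syncs data from the leader, so adding observers increases capacity and read throughput.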
[root@host serverlist]# ../bin/zkServer.sh status zoo1.cfg
ZooKeeper JMX enabled by default
Using config: /root/zookeeper-3.4.12/bin/../conf/zoo1.cfg
Mode: follower
[root@host serverlist]# ../bin/zkServer.sh status zoo2.cfg
ZooKeeper JMX enabled by default
Using config: /root/zookeeper-3.4.12/bin/../conf/zoo2.cfg
Mode: leader
[root@host serverlist]# ../bin/zkServer.sh status zoo3.cfg
ZooKeeper JMX enabled by default
Using config: /root/zookeeper-3.4.12/bin/../conf/zoo3.cfg
Mode: follower
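The status output above can be checked mechanically, for instance to confirm that exactly one instance reports itself as leader. This is a small sketch that only parses the text printed by zkServer.sh status as shown in this article; the names `parse_modes` and `has_single_leader` are hypothetical.

```python
import re

SAMPLE_STATUS = """\
ZooKeeper JMX enabled by default
Using config: /root/zookeeper-3.4.12/bin/../conf/zoo1.cfg
Mode: follower
ZooKeeper JMX enabled by default
Using config: /root/zookeeper-3.4.12/bin/../conf/zoo2.cfg
Mode: leader
ZooKeeper JMX enabled by default
Using config: /root/zookeeper-3.4.12/bin/../conf/zoo3.cfg
Mode: follower
"""


def parse_modes(status_output):
    """Map each config file name to the mode reported right after it."""
    modes, current = {}, None
    for raw in status_output.splitlines():
        line = raw.strip()
        config = re.search(r"Using config: .*/([^/\s]+)$", line)
        if config:
            current = config.group(1)
            continue
        mode = re.match(r"Mode:\s*(\w+)", line)
        if mode and current:
            modes[current] = mode.group(1)
            current = None
    return modes


def has_single_leader(modes):
    """A healthy ensemble reports exactly one leader."""
    return list(modes.values()).count("leader") == 1


modes = parse_modes(SAMPLE_STATUS)
print(modes)  # {'zoo1.cfg': 'follower', 'zoo2.cfg': 'leader', 'zoo3.cfg': 'follower'}
print(has_single_leader(modes))  # True
```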
Connection test:
[root@host serverlist]# ../bin/zkCli.sh -server 192.168.53.122:2181
Connecting to 192.168.53.122:2181
2018-05-09 17:27:52,673 [myid:] - INFO [main:Environment@100] - Client environment:zookeeper.version=3.4.12-e5259e437540f349646870ea94dc2658c4e44b3b, built on 03/27/2018 03:55 GMT
2018-05-09 17:27:52,675 [myid:] - INFO [main:Environment@100] - Client environment:host.name=host
2018-05-09 17:27:52,675 [myid:] - INFO [main:Environment@100] - Client environment:java.version=1.8.0_101
2018-05-09 17:27:52,677 [myid:] - INFO [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
2018-05-09 17:27:52,677 [myid:] - INFO [main:Environment@100] - Client environment:java.home=/usr/java/jdk1.8.0_101/jre
2018-05-09 17:27:52,677 [myid:] - INFO [main:Environment@100] - Client environment:java.class.path=/root/zookeeper-3.4.12/bin/../build/classes:/root/zookeeper-3.4.12/bin/../build/lib/*.jar:/root/zookeeper-3.4.12/bin/../lib/slf4j-log4j12-1.7.25.jar:/root/zookeeper-3.4.12/bin/../lib/slf4j-api-1.7.25.jar:/root/zookeeper-3.4.12/bin/../lib/netty-3.10.6.Final.jar:/root/zookeeper-3.4.12/bin/../lib/log4j-1.2.17.jar:/root/zookeeper-3.4.12/bin/../lib/jline-0.9.94.jar:/root/zookeeper-3.4.12/bin/../lib/audience-annotations-0.5.0.jar:/root/zookeeper-3.4.12/bin/../zookeeper-3.4.12.jar:/root/zookeeper-3.4.12/bin/../src/java/lib/*.jar:/root/zookeeper-3.4.12/bin/../conf:
2018-05-09 17:27:52,677 [myid:] - INFO [main:Environment@100] - Client environment:java.library.path=/root/hadoop/hadoop-2.7.4/lib/native/::/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2018-05-09 17:27:52,677 [myid:] - INFO [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2018-05-09 17:27:52,677 [myid:] - INFO [main:Environment@100] - Client environment:java.compiler=<NA>
2018-05-09 17:27:52,678 [myid:] - INFO [main:Environment@100] - Client environment:os.name=Linux
2018-05-09 17:27:52,678 [myid:] - INFO [main:Environment@100] - Client environment:os.arch=amd64
2018-05-09 17:27:52,678 [myid:] - INFO [main:Environment@100] - Client environment:os.version=2.6.32-431.el6.x86_64
2018-05-09 17:27:52,678 [myid:] - INFO [main:Environment@100] - Client environment:user.name=root
2018-05-09 17:27:52,678 [myid:] - INFO [main:Environment@100] - Client environment:user.home=/root
2018-05-09 17:27:52,678 [myid:] - INFO [main:Environment@100] - Client environment:user.dir=/root/zookeeper-3.4.12/serverlist
2018-05-09 17:27:52,679 [myid:] - INFO [main:ZooKeeper@441] - Initiating client connection, connectString=192.168.53.122:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@799f7e29
Welcome to ZooKeeper!
2018-05-09 17:27:52,695 [myid:] - INFO [main-SendThread(192.168.53.122:2181):ClientCnxn$SendThread@1028] - Opening socket connection to server 192.168.53.122/192.168.53.122:2181. Will not attempt to authenticate using SASL (unknown error)
JLine support is enabled
2018-05-09 17:27:52,760 [myid:] - INFO [main-SendThread(192.168.53.122:2181):ClientCnxn$SendThread@878] - Socket connection established to 192.168.53.122/192.168.53.122:2181, initiating session
[zk: 192.168.53.122:2181(CONNECTING) 0] 2018-05-09 17:27:52,810 [myid:] - INFO [main-SendThread(192.168.53.122:2181):ClientCnxn$SendThread@1302] - Session establishment complete on server 192.168.53.122/192.168.53.122:2181, sessionid = 0x100877925070001, negotiated timeout = 30000
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
[zk: 192.168.53.122:2181(CONNECTED) 0]
[zk: 192.168.53.122:2181(CONNECTED) 0]
[root@host serverlist]# ../bin/zkCli.sh -server 192.168.53.122:2182
Connecting to 192.168.53.122:2182
2018-05-09 17:28:53,602 [myid:] - INFO [main:Environment@100] - Client environment:zookeeper.version=3.4.12-e5259e437540f349646870ea94dc2658c4e44b3b, built on 03/27/2018 03:55 GMT
2018-05-09 17:28:53,605 [myid:] - INFO [main:Environment@100] - Client environment:host.name=host
2018-05-09 17:28:53,605 [myid:] - INFO [main:Environment@100] - Client environment:java.version=1.8.0_101
2018-05-09 17:28:53,607 [myid:] - INFO [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
2018-05-09 17:28:53,607 [myid:] - INFO [main:Environment@100] - Client environment:java.home=/usr/java/jdk1.8.0_101/jre
2018-05-09 17:28:53,607 [myid:] - INFO [main:Environment@100] - Client environment:java.class.path=/root/zookeeper-3.4.12/bin/../build/classes:/root/zookeeper-3.4.12/bin/../build/lib/*.jar:/root/zookeeper-3.4.12/bin/../lib/slf4j-log4j12-1.7.25.jar:/root/zookeeper-3.4.12/bin/../lib/slf4j-api-1.7.25.jar:/root/zookeeper-3.4.12/bin/../lib/netty-3.10.6.Final.jar:/root/zookeeper-3.4.12/bin/../lib/log4j-1.2.17.jar:/root/zookeeper-3.4.12/bin/../lib/jline-0.9.94.jar:/root/zookeeper-3.4.12/bin/../lib/audience-annotations-0.5.0.jar:/root/zookeeper-3.4.12/bin/../zookeeper-3.4.12.jar:/root/zookeeper-3.4.12/bin/../src/java/lib/*.jar:/root/zookeeper-3.4.12/bin/../conf:
2018-05-09 17:28:53,607 [myid:] - INFO [main:Environment@100] - Client environment:java.library.path=/root/hadoop/hadoop-2.7.4/lib/native/::/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2018-05-09 17:28:53,608 [myid:] - INFO [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2018-05-09 17:28:53,608 [myid:] - INFO [main:Environment@100] - Client environment:java.compiler=<NA>
2018-05-09 17:28:53,608 [myid:] - INFO [main:Environment@100] - Client environment:os.name=Linux
2018-05-09 17:28:53,608 [myid:] - INFO [main:Environment@100] - Client environment:os.arch=amd64
2018-05-09 17:28:53,608 [myid:] - INFO [main:Environment@100] - Client environment:os.version=2.6.32-431.el6.x86_64
2018-05-09 17:28:53,608 [myid:] - INFO [main:Environment@100] - Client environment:user.name=root
2018-05-09 17:28:53,608 [myid:] - INFO [main:Environment@100] - Client environment:user.home=/root
2018-05-09 17:28:53,608 [myid:] - INFO [main:Environment@100] - Client environment:user.dir=/root/zookeeper-3.4.12/serverlist
2018-05-09 17:28:53,609 [myid:] - INFO [main:ZooKeeper@441] - Initiating client connection, connectString=192.168.53.122:2182 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@799f7e29
Welcome to ZooKeeper!
2018-05-09 17:28:53,625 [myid:] - INFO [main-SendThread(192.168.53.122:2182):ClientCnxn$SendThread@1028] - Opening socket connection to server 192.168.53.122/192.168.53.122:2182. Will not attempt to authenticate using SASL (unknown error)
JLine support is enabled
2018-05-09 17:28:53,687 [myid:] - INFO [main-SendThread(192.168.53.122:2182):ClientCnxn$SendThread@878] - Socket connection established to 192.168.53.122/192.168.53.122:2182, initiating session
[zk: 192.168.53.122:2182(CONNECTING) 0] 2018-05-09 17:28:53,736 [myid:] - INFO [main-SendThread(192.168.53.122:2182):ClientCnxn$SendThread@1302] - Session establishment complete on server 192.168.53.122/192.168.53.122:2182, sessionid = 0x200877925380002, negotiated timeout = 30000
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
[zk: 192.168.53.122:2182(CONNECTED) 0]
[zk: 192.168.53.122:2182(CONNECTED) 0]
[zk: 192.168.53.122:2182(CONNECTED) 0]
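Besides zkCli.sh, each instance can be probed without the interactive client by sending ZooKeeper's four-letter command ruok to the client port; a healthy server answers imok and closes the connection. The sketch below uses only the Python standard library. The function name `zk_is_ok` is hypothetical, and it assumes the ruok command is enabled on the server: newer ZooKeeper releases can restrict these commands through the 4lw.commands.whitelist setting, so a False result means "not confirmed" rather than "down".

```python
import socket


def zk_is_ok(host, port, timeout=3.0):
    """Send the four-letter command 'ruok' and report whether 'imok' came back."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            chunks = []
            while True:
                # The server closes the connection after replying,
                # which ends this loop with an empty read.
                data = sock.recv(64)
                if not data:
                    break
                chunks.append(data)
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
    return b"".join(chunks) == b"imok"
```

For the pseudo-cluster in this article, the check would be run once per client port, for example `zk_is_ok("192.168.53.122", 2181)`, and likewise for 2182 and 2183.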
Configure Hive by editing $HIVE_HOME/conf/hive-site.xml
<property>
<name>hive.support.concurrency</name>
<value>true</value>
<description>
Whether Hive supports concurrency control or not.
A ZooKeeper instance must be up and running when using zookeeper Hive lock manager
</description>
</property>
Because this machine runs a pseudo-distributed setup, a single address is enough
<property>
<name>hive.zookeeper.quorum</name>
<value>192.168.53.122</value>
<description>
List of ZooKeeper servers to talk to. This is needed for:
1. Read/write locks - when hive.lock.manager is set to
org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager,
2. When HiveServer2 supports service discovery via Zookeeper.
3. For delegation token storage if zookeeper store is used, if
hive.cluster.delegation.token.store.zookeeper.connectString is not set
4. LLAP daemon registry service
</description>
</property>
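Hand-edited XML is easy to leave unbalanced, so it can help to generate the two properties and let a parser confirm they are well-formed. This is a minimal sketch using Python's standard xml.etree.ElementTree; the function name `build_lock_properties` is hypothetical, and the output is a fragment to merge into an existing hive-site.xml, not a complete replacement for it.

```python
import xml.etree.ElementTree as ET


def build_lock_properties(quorum, concurrency=True):
    """Return a <configuration> fragment with the two lock-related properties."""
    root = ET.Element("configuration")
    pairs = (
        ("hive.support.concurrency", "true" if concurrency else "false"),
        ("hive.zookeeper.quorum", quorum),
    )
    for name, value in pairs:
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


fragment = build_lock_properties("192.168.53.122")
# Parsing the result back raises ParseError if any tag is left unclosed.
parsed = ET.fromstring(fragment)
print([p.findtext("name") for p in parsed.findall("property")])
# ['hive.support.concurrency', 'hive.zookeeper.quorum']
```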
Once these properties are set, Hive automatically acquires locks for certain queries
hive> show locks;
OK
Time taken: 0.006 seconds
hive> show locks tb_cust extended;
OK
tab_name mode
Time taken: 0.023 seconds
hive> show locks tb_cust;
OK
tab_name mode
Time taken: 0.022 seconds
hive> show locks tb_cust partition(city='beijing');
OK
tab_name mode
Time taken: 0.145 seconds
hive> show locks tb_cust partition(city='beijing') extended;
OK
tab_name mode
Time taken: 0.078 seconds
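Hive provides two types of locks, and both are used automatically once concurrency is enabled. Reading a table requires a shared lock, and multiple concurrent shared locks are allowed.
Operations that modify data require an exclusive lock, which blocks not only other modifications of the table but also queries from other processes.
As soon as one operation holds an exclusive lock on a table or partition, no other job can run concurrently on that table or partition.
When the table is partitioned, taking an exclusive lock on a partition also requires a shared lock on the table itself, to prevent incompatible changes.
LOAD triggers an exclusive lock, just as INSERT does.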
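The rules above amount to a small compatibility table: shared with shared is allowed, and every combination involving an exclusive lock is not. The sketch below models only what this article states; it is an illustration of the semantics, not Hive's lock manager, and the names `compatible`, `can_acquire`, and `locks_for_partition_write` are hypothetical.

```python
SHARED = "SHARED"
EXCLUSIVE = "EXCLUSIVE"


def compatible(held, requested):
    """Two locks on the same object can coexist only if both are shared."""
    return held == SHARED and requested == SHARED


def can_acquire(held_modes, requested):
    """True if `requested` is compatible with every lock already held."""
    return all(compatible(held, requested) for held in held_modes)


def locks_for_partition_write(table, partition):
    """Locks taken when modifying one partition of a partitioned table:
    a shared lock on the table plus an exclusive lock on the partition."""
    return [(table, SHARED), (f"{table}@{partition}", EXCLUSIVE)]


print(can_acquire([SHARED, SHARED], SHARED))  # True: concurrent readers
print(can_acquire([SHARED], EXCLUSIVE))       # False: a writer waits for readers
print(can_acquire([EXCLUSIVE], SHARED))       # False: readers wait for a writer
print(locks_for_partition_write("tb_cust", "city=beijing"))
# [('tb_cust', 'SHARED'), ('tb_cust@city=beijing', 'EXCLUSIVE')]
```

The last line shows why a partition write blocks whole-table modifications: the shared lock it holds on tb_cust is incompatible with any exclusive lock requested on the table.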
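Explicit locks and exclusive locks
After a table or partition has been locked exclusively, other processes wait; once it is unlocked, they continue.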
hive> lock table tb_cust exclusive; //lock the table
OK
Time taken: 0.123 seconds
hive> show locks tb_cust;
OK
tab_name mode
gamedw@tb_cust EXCLUSIVE
Time taken: 0.045 seconds, Fetched: 1 row(s)
hive> unlock table tb_cust;
OK
Time taken: 0.081 seconds
hive> show locks tb_cust;
OK
tab_name mode
Time taken: 0.019 seconds
hive> lock table tb_cust partition(city='beijing') exclusive; //lock a single partition; other partitions stay unlocked and can still be operated on, but whole-table operations are blocked
OK
Time taken: 0.184 seconds
hive> show locks tb_cust;
OK
Time taken: 0.023 seconds
hive> show locks tb_cust partition(city='beijing');
OK
gamedw@tb_cust@city=beijing EXCLUSIVE
Time taken: 0.106 seconds, Fetched: 1 row(s)
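The lock entries printed by show locks follow the pattern database@table or database@table@partition, followed by the lock mode. When that output is captured by a script, each line can be split into its parts. This is a small sketch based only on the output format visible in this article; the function name `parse_lock_line` is hypothetical.

```python
def parse_lock_line(line):
    """Parse one 'show locks' row such as 'gamedw@tb_cust@city=beijing EXCLUSIVE'.

    Returns None for lines that are not lock rows (header, OK, timing).
    """
    parts = line.split()
    if len(parts) != 2 or "@" not in parts[0]:
        return None
    locked_object, mode = parts
    # Split into at most three pieces: database, table, optional partition spec.
    database, table, *rest = locked_object.split("@", 2)
    return {
        "database": database,
        "table": table,
        "partition": rest[0] if rest else None,
        "mode": mode,
    }


print(parse_lock_line("gamedw@tb_cust@city=beijing EXCLUSIVE"))
# {'database': 'gamedw', 'table': 'tb_cust', 'partition': 'city=beijing', 'mode': 'EXCLUSIVE'}
print(parse_lock_line("tab_name mode"))  # None
```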