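HiveQL is a SQL dialect, but it lacks row-, column-, or query-level locking for UPDATE- and INSERT-style operations. Hadoop files are generally write-once (with limited support for appends). Because Hadoop and Hive are both multi-user systems, locking and coordination are very useful, and all locks must be coordinated by a separate system.

Hive includes a locking facility built on Apache ZooKeeper. ZooKeeper provides highly reliable distributed coordination, and it is transparent to Hive users.

ZooKeeper is pronounced ['zukipɚ].

A ZooKeeper pseudo-cluster is installed and configured as follows: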

Download

wget https://mirrors.tuna.tsinghua.edu.cn/apache/zookeeper/zookeeper-3.4.12/zookeeper-3.4.12.tar.gz

Extract

tar -zxf zookeeper-3.4.12.tar.gz -C /root/

[root@host zookeeper-3.4.12]# pwd
/root/zookeeper-3.4.12

Create a serverlist directory containing three subdirectories

[root@host serverlist]# pwd
/root/zookeeper-3.4.12/serverlist
[root@host serverlist]# ls
server1  server2  server3

In each subdirectory of serverlist, create two directories, data and logs:

[root@host server1]# pwd
/root/zookeeper-3.4.12/serverlist/server1
[root@host server1]# ls
data  logs
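The directory layout above can be created in one pass. This is a minimal sketch rather than the exact commands used in this walkthrough; ZK_HOME is an assumed variable name, and the scratch-directory fallback exists only so the snippet can be tried without touching a real installation.

```shell
#!/bin/sh
set -eu

# ZK_HOME is an assumed variable: point it at the unpacked ZooKeeper
# directory (for example /root/zookeeper-3.4.12). If it is unset, a
# scratch directory is used instead.
ZK_HOME="${ZK_HOME:-$(mktemp -d)}"

# Create serverlist/server1..3, each with a data and a logs directory.
for i in 1 2 3; do
  mkdir -p "$ZK_HOME/serverlist/server$i/data" \
           "$ZK_HOME/serverlist/server$i/logs"
done

# Show the six leaf directories that now exist.
find "$ZK_HOME/serverlist" -mindepth 2 -type d | sort
```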

Configuration

Make four copies of zoo_sample.cfg in the conf directory: zoo.cfg, zoo1.cfg, zoo2.cfg, and zoo3.cfg

zoo1.cfg looks like this:

[root@host conf]# cat zoo1.cfg
tickTime=2000
dataDir=/root/zookeeper-3.4.12/serverlist/server1/data
dataLogDir=/root/zookeeper-3.4.12/serverlist/server1/logs
clientPort=2181
initLimit=5
syncLimit=2
server.1=192.168.53.122:2888:3888
server.2=192.168.53.122:4888:5888
server.3=192.168.53.122:6888:7888

zoo2.cfg looks like this:

[root@host conf]# cat zoo2.cfg
tickTime=2000
dataDir=/root/zookeeper-3.4.12/serverlist/server2/data
dataLogDir=/root/zookeeper-3.4.12/serverlist/server2/logs
clientPort=2182
initLimit=5
syncLimit=2
server.1=192.168.53.122:2888:3888
server.2=192.168.53.122:4888:5888
server.3=192.168.53.122:6888:7888

zoo3.cfg looks like this:

[root@host conf]# cat zoo3.cfg
tickTime=2000
dataDir=/root/zookeeper-3.4.12/serverlist/server3/data
dataLogDir=/root/zookeeper-3.4.12/serverlist/server3/logs
clientPort=2183
initLimit=5
syncLimit=2
server.1=192.168.53.122:2888:3888
server.2=192.168.53.122:4888:5888
server.3=192.168.53.122:6888:7888
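The three files differ only in the instance number and the client port, so they can also be generated in a loop. The following is a rough sketch, not the procedure used in this walkthrough: ZK_HOME and HOST are assumed variable names, the scratch-directory fallback is for safe experimentation, and the ports mirror the listings above.

```shell
#!/bin/sh
set -eu

ZK_HOME="${ZK_HOME:-$(mktemp -d)}"   # assumed variable; scratch dir if unset
HOST="${HOST:-192.168.53.122}"       # address used in the listings above
mkdir -p "$ZK_HOME/conf"

for i in 1 2 3; do
  client_port=$((2180 + i))          # 2181, 2182, 2183
  cat > "$ZK_HOME/conf/zoo$i.cfg" <<EOF
tickTime=2000
dataDir=$ZK_HOME/serverlist/server$i/data
dataLogDir=$ZK_HOME/serverlist/server$i/logs
clientPort=$client_port
initLimit=5
syncLimit=2
server.1=$HOST:2888:3888
server.2=$HOST:4888:5888
server.3=$HOST:6888:7888
EOF
done

# Print the client port chosen for each instance.
grep '^clientPort' "$ZK_HOME"/conf/zoo[123].cfg
```

Every file lists the same three server.N entries, because each instance needs to know about the whole ensemble.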

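Notes:

tickTime: the basic time unit, in milliseconds. It is the heartbeat interval between ZooKeeper servers and between servers and clients.

dataDir: the directory where ZooKeeper stores snapshots of its in-memory database. By default the transaction log is written to this directory as well, unless dataLogDir is set (as it is in the files above).

clientPort: the port clients use to connect to the ZooKeeper server. ZooKeeper listens on this port and accepts client requests.

initLimit: the maximum number of ticks a follower may take to connect to the leader and complete its initial synchronization. The "client" here is a follower server in the ensemble, not an application client. If the leader has received no response after that many ticks, the connection attempt is treated as failed. With the values above the window is 5*2000 ms = 10 s; the zoo_sample.cfg default of 10 would give 10*2000 ms = 20 s.

syncLimit: the maximum number of ticks allowed for a request and its response between the leader and a follower. With the values above the window is 2*2000 ms = 4 s; the zoo_sample.cfg default of 5 would give 5*2000 ms = 10 s.

server.A=B:C:D: A is a number identifying the server; B is the server's IP address; C is the port the server uses to exchange information with the ensemble's leader; D is the port the servers use to talk to each other during leader election, for example when the current leader fails and a new one has to be chosen. In a pseudo-cluster B is the same for every instance, so each ZooKeeper instance must be given different port numbers.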
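Because initLimit and syncLimit are counted in ticks, the wall-clock windows follow directly from tickTime. A small sketch using the values from zoo1.cfg above (the variable names init_ms and sync_ms are illustrative):

```shell
#!/bin/sh
set -eu

# Values taken from zoo1.cfg above.
tickTime=2000   # milliseconds per tick
initLimit=5     # ticks allowed for a follower's initial connect and sync
syncLimit=2     # ticks allowed for a leader/follower request and response

init_ms=$((tickTime * initLimit))
sync_ms=$((tickTime * syncLimit))

echo "initLimit window: ${init_ms} ms"
echo "syncLimit window: ${sync_ms} ms"
```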

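Each instance also needs a myid file in its data directory containing its server number (the A in server.A):

[root@host serverlist]# echo 1 > server1/data/myid
[root@host serverlist]# echo 2 > server2/data/myid
[root@host serverlist]# echo 3 > server3/data/myid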

Start

[root@host serverlist]#  ../bin/zkServer.sh start zoo1.cfg

[root@host serverlist]#  ../bin/zkServer.sh start zoo2.cfg

[root@host serverlist]#  ../bin/zkServer.sh start zoo3.cfg

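Check status (which instance becomes the leader can vary from run to run)

Leader: handles clients' write requests

Follower: serves clients' read requests and takes part in leader election

Observer: a special kind of follower that can serve clients' read requests but does not take part in elections. It does not vote on writes and only synchronizes data from the leader, so adding observers increases the ensemble's read capacity.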

[root@host serverlist]#  ../bin/zkServer.sh status zoo1.cfg
ZooKeeper JMX enabled by default
Using config: /root/zookeeper-3.4.12/bin/../conf/zoo1.cfg
Mode: follower
[root@host serverlist]#  ../bin/zkServer.sh status zoo2.cfg
ZooKeeper JMX enabled by default
Using config: /root/zookeeper-3.4.12/bin/../conf/zoo2.cfg
Mode: leader
[root@host serverlist]#  ../bin/zkServer.sh status zoo3.cfg
ZooKeeper JMX enabled by default
Using config: /root/zookeeper-3.4.12/bin/../conf/zoo3.cfg
Mode: follower
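When checking several instances, only the Mode line matters. The sketch below extracts it from status output; mode_of is a hypothetical helper, and the sample text is copied from the zoo2.cfg output above rather than read from a live server.

```shell
#!/bin/sh
set -eu

# mode_of is a hypothetical helper: it reads `zkServer.sh status` output
# on stdin and prints the value of the "Mode:" line.
mode_of() {
  sed -n 's/^Mode: *//p'
}

# Sample input copied from the zoo2.cfg status output above.
sample='ZooKeeper JMX enabled by default
Using config: /root/zookeeper-3.4.12/bin/../conf/zoo2.cfg
Mode: leader'

mode=$(printf '%s\n' "$sample" | mode_of)
echo "zoo2.cfg -> $mode"
```

Against a running ensemble, the same helper could be fed from ../bin/zkServer.sh status zoo1.cfg in place of the sample text.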

Connection test:

[root@host serverlist]#  ../bin/zkCli.sh -server 192.168.53.122:2181

Connecting to 192.168.53.122:2181
2018-05-09 17:27:52,673 [myid:] - INFO  [main:Environment@100] - Client environment:zookeeper.version=3.4.12-e5259e437540f349646870ea94dc2658c4e44b3b, built on 03/27/2018 03:55 GMT
2018-05-09 17:27:52,675 [myid:] - INFO  [main:Environment@100] - Client environment:host.name=host
2018-05-09 17:27:52,675 [myid:] - INFO  [main:Environment@100] - Client environment:java.version=1.8.0_101
2018-05-09 17:27:52,677 [myid:] - INFO  [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
2018-05-09 17:27:52,677 [myid:] - INFO  [main:Environment@100] - Client environment:java.home=/usr/java/jdk1.8.0_101/jre
2018-05-09 17:27:52,677 [myid:] - INFO  [main:Environment@100] - Client environment:java.class.path=/root/zookeeper-3.4.12/bin/../build/classes:/root/zookeeper-3.4.12/bin/../build/lib/*.jar:/root/zookeeper-3.4.12/bin/../lib/slf4j-log4j12-1.7.25.jar:/root/zookeeper-3.4.12/bin/../lib/slf4j-api-1.7.25.jar:/root/zookeeper-3.4.12/bin/../lib/netty-3.10.6.Final.jar:/root/zookeeper-3.4.12/bin/../lib/log4j-1.2.17.jar:/root/zookeeper-3.4.12/bin/../lib/jline-0.9.94.jar:/root/zookeeper-3.4.12/bin/../lib/audience-annotations-0.5.0.jar:/root/zookeeper-3.4.12/bin/../zookeeper-3.4.12.jar:/root/zookeeper-3.4.12/bin/../src/java/lib/*.jar:/root/zookeeper-3.4.12/bin/../conf:
2018-05-09 17:27:52,677 [myid:] - INFO  [main:Environment@100] - Client environment:java.library.path=/root/hadoop/hadoop-2.7.4/lib/native/::/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2018-05-09 17:27:52,677 [myid:] - INFO  [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2018-05-09 17:27:52,677 [myid:] - INFO  [main:Environment@100] - Client environment:java.compiler=<NA>
2018-05-09 17:27:52,678 [myid:] - INFO  [main:Environment@100] - Client environment:os.name=Linux
2018-05-09 17:27:52,678 [myid:] - INFO  [main:Environment@100] - Client environment:os.arch=amd64
2018-05-09 17:27:52,678 [myid:] - INFO  [main:Environment@100] - Client environment:os.version=2.6.32-431.el6.x86_64
2018-05-09 17:27:52,678 [myid:] - INFO  [main:Environment@100] - Client environment:user.name=root
2018-05-09 17:27:52,678 [myid:] - INFO  [main:Environment@100] - Client environment:user.home=/root
2018-05-09 17:27:52,678 [myid:] - INFO  [main:Environment@100] - Client environment:user.dir=/root/zookeeper-3.4.12/serverlist
2018-05-09 17:27:52,679 [myid:] - INFO  [main:ZooKeeper@441] - Initiating client connection, connectString=192.168.53.122:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@799f7e29
Welcome to ZooKeeper!
2018-05-09 17:27:52,695 [myid:] - INFO  [main-SendThread(192.168.53.122:2181):ClientCnxn$SendThread@1028] - Opening socket connection to server 192.168.53.122/192.168.53.122:2181. Will not attempt to authenticate using SASL (unknown error)
JLine support is enabled
2018-05-09 17:27:52,760 [myid:] - INFO  [main-SendThread(192.168.53.122:2181):ClientCnxn$SendThread@878] - Socket connection established to 192.168.53.122/192.168.53.122:2181, initiating session
[zk: 192.168.53.122:2181(CONNECTING) 0] 2018-05-09 17:27:52,810 [myid:] - INFO  [main-SendThread(192.168.53.122:2181):ClientCnxn$SendThread@1302] - Session establishment complete on server 192.168.53.122/192.168.53.122:2181, sessionid = 0x100877925070001, negotiated timeout = 30000

WATCHER::

WatchedEvent state:SyncConnected type:None path:null

[zk: 192.168.53.122:2181(CONNECTED) 0]
[zk: 192.168.53.122:2181(CONNECTED) 0]

[root@host serverlist]#  ../bin/zkCli.sh -server 192.168.53.122:2182

Connecting to 192.168.53.122:2182
2018-05-09 17:28:53,602 [myid:] - INFO  [main:Environment@100] - Client environment:zookeeper.version=3.4.12-e5259e437540f349646870ea94dc2658c4e44b3b, built on 03/27/2018 03:55 GMT
2018-05-09 17:28:53,605 [myid:] - INFO  [main:Environment@100] - Client environment:host.name=host
2018-05-09 17:28:53,605 [myid:] - INFO  [main:Environment@100] - Client environment:java.version=1.8.0_101
2018-05-09 17:28:53,607 [myid:] - INFO  [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
2018-05-09 17:28:53,607 [myid:] - INFO  [main:Environment@100] - Client environment:java.home=/usr/java/jdk1.8.0_101/jre
2018-05-09 17:28:53,607 [myid:] - INFO  [main:Environment@100] - Client environment:java.class.path=/root/zookeeper-3.4.12/bin/../build/classes:/root/zookeeper-3.4.12/bin/../build/lib/*.jar:/root/zookeeper-3.4.12/bin/../lib/slf4j-log4j12-1.7.25.jar:/root/zookeeper-3.4.12/bin/../lib/slf4j-api-1.7.25.jar:/root/zookeeper-3.4.12/bin/../lib/netty-3.10.6.Final.jar:/root/zookeeper-3.4.12/bin/../lib/log4j-1.2.17.jar:/root/zookeeper-3.4.12/bin/../lib/jline-0.9.94.jar:/root/zookeeper-3.4.12/bin/../lib/audience-annotations-0.5.0.jar:/root/zookeeper-3.4.12/bin/../zookeeper-3.4.12.jar:/root/zookeeper-3.4.12/bin/../src/java/lib/*.jar:/root/zookeeper-3.4.12/bin/../conf:
2018-05-09 17:28:53,607 [myid:] - INFO  [main:Environment@100] - Client environment:java.library.path=/root/hadoop/hadoop-2.7.4/lib/native/::/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2018-05-09 17:28:53,608 [myid:] - INFO  [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2018-05-09 17:28:53,608 [myid:] - INFO  [main:Environment@100] - Client environment:java.compiler=<NA>
2018-05-09 17:28:53,608 [myid:] - INFO  [main:Environment@100] - Client environment:os.name=Linux
2018-05-09 17:28:53,608 [myid:] - INFO  [main:Environment@100] - Client environment:os.arch=amd64
2018-05-09 17:28:53,608 [myid:] - INFO  [main:Environment@100] - Client environment:os.version=2.6.32-431.el6.x86_64
2018-05-09 17:28:53,608 [myid:] - INFO  [main:Environment@100] - Client environment:user.name=root
2018-05-09 17:28:53,608 [myid:] - INFO  [main:Environment@100] - Client environment:user.home=/root
2018-05-09 17:28:53,608 [myid:] - INFO  [main:Environment@100] - Client environment:user.dir=/root/zookeeper-3.4.12/serverlist
2018-05-09 17:28:53,609 [myid:] - INFO  [main:ZooKeeper@441] - Initiating client connection, connectString=192.168.53.122:2182 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@799f7e29
Welcome to ZooKeeper!
2018-05-09 17:28:53,625 [myid:] - INFO  [main-SendThread(192.168.53.122:2182):ClientCnxn$SendThread@1028] - Opening socket connection to server 192.168.53.122/192.168.53.122:2182. Will not attempt to authenticate using SASL (unknown error)
JLine support is enabled
2018-05-09 17:28:53,687 [myid:] - INFO  [main-SendThread(192.168.53.122:2182):ClientCnxn$SendThread@878] - Socket connection established to 192.168.53.122/192.168.53.122:2182, initiating session
[zk: 192.168.53.122:2182(CONNECTING) 0] 2018-05-09 17:28:53,736 [myid:] - INFO  [main-SendThread(192.168.53.122:2182):ClientCnxn$SendThread@1302] - Session establishment complete on server 192.168.53.122/192.168.53.122:2182, sessionid = 0x200877925380002, negotiated timeout = 30000

WATCHER::

WatchedEvent state:SyncConnected type:None path:null

[zk: 192.168.53.122:2182(CONNECTED) 0]
[zk: 192.168.53.122:2182(CONNECTED) 0]
[zk: 192.168.53.122:2182(CONNECTED) 0]

Configure Hive by editing $HIVE_HOME/conf/hive-site.xml

<property>
    <name>hive.support.concurrency</name>
    <value>true</value>
    <description>
      Whether Hive supports concurrency control or not.
      A ZooKeeper instance must be up and running when using zookeeper Hive lock manager
    </description>
</property>

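Because this machine runs a pseudo-distributed setup, listing a single host is enough (the client port is taken from hive.zookeeper.client.port, which defaults to 2181)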

<property>
    <name>hive.zookeeper.quorum</name>
    <value>192.168.53.122</value>
    <description>
      List of ZooKeeper servers to talk to. This is needed for:
      1. Read/write locks - when hive.lock.manager is set to
      org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager,
      2. When HiveServer2 supports service discovery via Zookeeper.
      3. For delegation token storage if zookeeper store is used, if
      hive.cluster.delegation.token.store.zookeeper.connectString is not set
      4. LLAP daemon registry service
    </description>
</property>

Once these properties are set, Hive automatically acquires locks for certain queries

hive> show locks;
OK
Time taken: 0.006 seconds

hive> show locks tb_cust extended;
OK
tab_name        mode
Time taken: 0.023 seconds
hive> show locks tb_cust;
OK
tab_name        mode
Time taken: 0.022 seconds
hive> show locks tb_cust partition(city='beijing');
OK
tab_name        mode
Time taken: 0.145 seconds
hive> show locks tb_cust partition(city='beijing') extended;
OK
tab_name        mode
Time taken: 0.078 seconds

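Hive provides two types of locks, and both are enabled automatically once concurrency support is turned on. Reading a table requires a shared lock, and multiple concurrent shared locks are allowed.

Operations that modify data require an exclusive lock, which blocks not only other modifications of the table but also queries from other processes.

As soon as one operation holds an exclusive lock on a table or partition, no other job can run concurrently against that table or partition.

When a table is partitioned, taking an exclusive lock on one of its partitions also requires a shared lock on the table itself, to prevent incompatible changes such as dropping the table while a partition is being modified.

LOAD, like INSERT, triggers an exclusive lock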
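The rule described above (shared locks coexist, an exclusive lock excludes everything else) can be written out as a small table. This is only an illustration of the usual shared/exclusive compatibility rule, not Hive's implementation; compatible is a hypothetical helper.

```shell
#!/bin/sh
set -eu

# compatible HELD REQUESTED: hypothetical helper that succeeds only when a
# lock in mode REQUESTED can be granted while a lock in mode HELD exists.
# Only the shared/shared combination is compatible.
compatible() {
  [ "$1" = "SHARED" ] && [ "$2" = "SHARED" ]
}

# Print the full compatibility table.
for held in SHARED EXCLUSIVE; do
  for requested in SHARED EXCLUSIVE; do
    if compatible "$held" "$requested"; then
      echo "$held held, $requested requested: granted"
    else
      echo "$held held, $requested requested: waits"
    fi
  done
done
```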

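Explicit exclusive locks

Once a table or partition is exclusively locked, other processes wait; after it is unlocked, they continue. The following session locks the whole table, shows the lock, and then releases it:

hive> lock table tb_cust exclusive;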
OK
Time taken: 0.123 seconds
hive> show locks tb_cust;
OK
tab_name        mode
gamedw@tb_cust  EXCLUSIVE
Time taken: 0.045 seconds, Fetched: 1 row(s)
hive> unlock table tb_cust;
OK
Time taken: 0.081 seconds
hive> show locks tb_cust;
OK
tab_name        mode
Time taken: 0.019 seconds

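Locking a single partition leaves the other partitions unlocked and free for other operations, but operations on the table as a whole are blocked:

hive> lock table tb_cust partition(city='beijing') exclusive;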
OK
Time taken: 0.184 seconds

hive> show locks tb_cust;
OK
Time taken: 0.023 seconds
hive> show locks tb_cust partition(city='beijing');
OK
gamedw@tb_cust@city=beijing     EXCLUSIVE
Time taken: 0.106 seconds, Fetched: 1 row(s)
