Problem

The latest Drill release is 1.14. Starting with 1.13, the Hive version that Drill supports was upgraded to 2.3.2; see the 1.13 release notes:

The Hive client for Drill is updated to version 2.3.2. With the update, Drill supports queries on transactional (ACID) and non-transactional Hive bucketed ORC tables. The updated libraries are backward compatible with earlier versions of the Hive server and metastore. (DRILL-5978)

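Forcing Drill 1.14 to connect to Hive 2.1.1 fails because the metastore Thrift interface changed: the 2.3.2 client calls get_table_req, a method the Hive 2.1.1 metastore does not recognize. The visible symptom is that show tables returns nothing. The log shows the following error: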

2018-10-10 13:03:54,355 [244277c5-ba8c-b6c8-8f99-2cdde9f3c4d8:frag:0:0] WARN o.a.d.e.s.h.DrillHiveMetaStoreClient - Failure while attempting to get hive table. Retries once.
org.apache.thrift.TApplicationException: Invalid method name: 'get_table_req'
    at org.apache.thrift.TApplicationException.read(TApplicationException.java:111) ~[drill-hive-exec-shaded-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.store.hive.DrillHiveMetaStoreClient$TableLoader.load(DrillHiveMetaStoreClient.java:531) [drill-storage-hive-core-1.14.0.jar:1.14.0]
    at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3527) [guava-18.0.jar:na]
    at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2319) [guava-18.0.jar:na]
    at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2282) [guava-18.0.jar:na]
    at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2197) [guava-18.0.jar:na]
    at com.google.common.cache.LocalCache.get(LocalCache.java:3937) [guava-18.0.jar:na]
    at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3941) [guava-18.0.jar:na]
    at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4824) [guava-18.0.jar:na]
    at org.apache.drill.exec.store.hive.DrillHiveMetaStoreClient$HiveClientWithCaching.getHiveReadEntry(DrillHiveMetaStoreClient.java:495) [drill-storage-hive-core-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.store.hive.schema.HiveSchemaFactory$HiveSchema.getSelectionBaseOnName(HiveSchemaFactory.java:230) [drill-storage-hive-core-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.store.hive.schema.HiveSchemaFactory$HiveSchema.getDrillTable(HiveSchemaFactory.java:210) [drill-storage-hive-core-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.store.hive.schema.HiveDatabaseSchema.getTable(HiveDatabaseSchema.java:62) [drill-storage-hive-core-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.store.AbstractSchema.getTablesByNames(AbstractSchema.java:239) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.store.AbstractSchema.getTableNamesAndTypes(AbstractSchema.java:257) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.store.ischema.InfoSchemaRecordGenerator$Tables.visitTables(InfoSchemaRecordGenerator.java:301) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.store.ischema.InfoSchemaRecordGenerator.scanSchema(InfoSchemaRecordGenerator.java:216) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.store.ischema.InfoSchemaRecordGenerator.scanSchema(InfoSchemaRecordGenerator.java:209) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.store.ischema.InfoSchemaRecordGenerator.scanSchema(InfoSchemaRecordGenerator.java:209) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.store.ischema.InfoSchemaRecordGenerator.scanSchema(InfoSchemaRecordGenerator.java:196) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.store.ischema.InfoSchemaTableType.getRecordReader(InfoSchemaTableType.java:58) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.store.ischema.InfoSchemaBatchCreator.getBatch(InfoSchemaBatchCreator.java:34) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.store.ischema.InfoSchemaBatchCreator.getBatch(InfoSchemaBatchCreator.java:30) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:159) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:182) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:137) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:182) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.physical.impl.ImplCreator.getRootExec(ImplCreator.java:110) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.physical.impl.ImplCreator.getExec(ImplCreator.java:87) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:261) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.14.0.jar:1.14.0]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_60]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_60]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]
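
To confirm that a given Drillbit is hitting this particular incompatibility, the log can be searched for the Thrift error message from the trace. This is only a minimal sketch: the function name is made up for illustration, and the log path has to be supplied by the caller because it depends on the installation.

```shell
# Hypothetical helper: succeeds (exit status 0) if the given log file
# contains the metastore Thrift mismatch error shown in the trace.
has_get_table_req_error() {
  log_file="$1"
  grep -q "Invalid method name: 'get_table_req'" "$log_file"
}
```

It could be used as has_get_table_req_error /path/to/drillbit.log && echo "Hive client/metastore mismatch", where the path is a placeholder.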

Build

I therefore tried rebuilding Drill 1.14 with its Hive dependency downgraded to 2.1.1. First, download the source:

http://mirror.bit.edu.cn/apache/drill/drill-1.14.0/apache-drill-1.14.0-src.tar.gz

POM

Change the Hive version in the pom from

<hive.version>2.3.2</hive.version>

to

<hive.version>2.1.1</hive.version>
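
The property change can also be scripted, which helps when the build is repeated. The following is a sketch under assumptions: set_hive_version is a hypothetical helper name, and it simply rewrites whatever value sits inside the hive.version element of the pom file passed to it, keeping a .bak copy of the original.

```shell
# Hypothetical helper: rewrite the <hive.version> property in a pom file.
# Usage: set_hive_version <path-to-pom.xml> <new-version>
set_hive_version() {
  pom="$1"
  version="$2"
  # -i.bak edits in place and keeps the original as <pom>.bak
  sed -i.bak "s|<hive.version>[^<]*</hive.version>|<hive.version>${version}</hive.version>|" "$pom"
}
```

For example, set_hive_version pom.xml 2.1.1 in the root of the extracted source tree, followed by a rebuild (typically mvn clean install -DskipTests for Drill).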

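After rebuilding and repackaging, the problem was still there. On inspection, the version change had only affected three Hive-related jars under jars/3rdparty, which went from 2.3.2 to 2.1.1: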

hive-contrib-2.1.1.jar
hive-hbase-handler-2.1.1.jar
hive-metastore-2.1.1.jar
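
A quick way to see which Hive jars a build produced, and at what version, is to list them from the directory. This is a small sketch: list_hive_jar_versions is a hypothetical name, and it assumes jar files follow the usual name-version.jar pattern so that the version is whatever follows the last hyphen.

```shell
# Hypothetical helper: print "<artifact> <version>" for each hive-*.jar
# directly inside the given directory.
list_hive_jar_versions() {
  dir="$1"
  for jar in "$dir"/hive-*.jar; do
    [ -e "$jar" ] || continue          # no match: the glob stays literal
    name=$(basename "$jar" .jar)
    # hive-metastore-2.1.1 -> artifact "hive-metastore", version "2.1.1"
    printf '%s %s\n' "${name%-*}" "${name##*-}"
  done
}
```

Running it against jars/3rdparty of the rebuilt distribution should show the three jars above at 2.1.1. It says nothing about classes bundled inside other jars.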

The jar that raises the error is drill-hive-exec-shaded-1.14.0.jar, which bundles hive-exec and its dependencies:

<artifactId>maven-shade-plugin</artifactId>
<configuration>
  <artifactSet>
    <includes>
      <include>org.apache.hive:hive-exec</include>

Moreover, its hive-exec dependency does not use the configured hive.version:

<dependency>
  <groupId>org.apache.hive</groupId>
  <artifactId>hive-exec</artifactId>
  <scope>compile</scope>

As a result, the hive-exec packed into the shaded jar is still version 2.3.2. Add the hive.version setting to the dependency:

<dependency>
  <groupId>org.apache.hive</groupId>
  <artifactId>hive-exec</artifactId>
  <version>${hive.version}</version>
  <scope>compile</scope>

After repackaging, the problem is gone and show tables works normally.

Hadoop Location

The official documentation says:

Apache Drill users must tell Drill-on-YARN the location of your Hadoop install. Set the HADOOP_HOME environment variable in $DRILL_SITE/drill-env.sh to point to your Hadoop installation:
export HADOOP_HOME=/path/to/hadoop-home

Problems remained even after setting this:

1) Error:

Diagnostics: File file:/user/drill/site.tar.gz does not exist
java.io.FileNotFoundException: File file:/user/drill/site.tar.gz does not exist

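The file: scheme in the path indicates that the HDFS configuration was not picked up, so the path was resolved against the local filesystem. A symlink to core-site.xml needs to be added: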

ln -s $HADOOP_HOME/etc/hadoop/core-site.xml $DRILL_SITE/core-site.xml

2) Actual queries then fail because hdfs_name cannot be found; a second symlink is needed:

ln -s $HADOOP_HOME/etc/hadoop/hdfs-site.xml $DRILL_SITE/hdfs-site.xml
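
Both links can be created in one step. The sketch below is an illustration rather than anything from the Drill documentation: link_hadoop_conf is a hypothetical helper that takes the Hadoop configuration directory and the Drill site directory as arguments, and it refuses to create a link if the source file is missing.

```shell
# Hypothetical helper: symlink core-site.xml and hdfs-site.xml from a
# Hadoop config directory into the Drill site directory.
# Usage: link_hadoop_conf <hadoop-conf-dir> <drill-site-dir>
link_hadoop_conf() {
  hadoop_conf_dir="$1"
  drill_site="$2"
  for f in core-site.xml hdfs-site.xml; do
    if [ ! -f "$hadoop_conf_dir/$f" ]; then
      echo "missing: $hadoop_conf_dir/$f" >&2
      return 1
    fi
    # -f replaces an existing link so the function can be re-run safely
    ln -sf "$hadoop_conf_dir/$f" "$drill_site/$f"
  done
}
```

With the layout used above, the call would be link_hadoop_conf "$HADOOP_HOME/etc/hadoop" "$DRILL_SITE".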
