【原创】大叔经验分享(55)spark连接kudu报错
spark-2.4.2
kudu-1.7.0
开始尝试
1)自己手工将jar加到classpath
spark-2.4.2-bin-hadoop2.6
+
kudu-spark2_2.11-1.7.0-cdh5.16.1.jar
# bin/spark-shell
scala> val df = spark.read.options(Map("kudu.master" -> "master:7051", "kudu.table" -> "impala::test.tbl_test")).format("kudu").load
java.lang.ClassNotFoundException: Failed to find data source: kudu. Please find packages at http://spark.apache.org/third-party-projects.html
at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:660)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:194)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:167)
... 49 elided
Caused by: java.lang.ClassNotFoundException: kudu.DefaultSource
at scala.reflect.internal.util.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:72)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$lookupDataSource$5(DataSource.scala:634)
at scala.util.Try$.apply(Try.scala:213)
at org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$lookupDataSource$4(DataSource.scala:634)
at scala.util.Failure.orElse(Try.scala:224)
at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:634)
... 51 more
2)采用官方的方式(将kudu版本改为1.7.0)
spark-2.4.2-bin-hadoop2.6
# bin/spark-shell --packages org.apache.kudu:kudu-spark2_2.11:1.7.0
same error
3)采用官方的方式(不修改)
spark-2.4.2-bin-hadoop2.6
# bin/spark-shell --packages org.apache.kudu:kudu-spark2_2.11:1.9.0
scala> val df = spark.read.options(Map("kudu.master" -> "master:7051", "kudu.table" -> "impala::test.tbl_test")).format("kudu").load
java.lang.NoClassDefFoundError: scala/Product$class
at org.apache.kudu.spark.kudu.Upsert$.<init>(OperationType.scala:41)
at org.apache.kudu.spark.kudu.Upsert$.<clinit>(OperationType.scala)
at org.apache.kudu.spark.kudu.DefaultSource$$anonfun$getOperationType$2.apply(DefaultSource.scala:217)
at org.apache.kudu.spark.kudu.DefaultSource$$anonfun$getOperationType$2.apply(DefaultSource.scala:217)
at scala.Option.getOrElse(Option.scala:138)
at org.apache.kudu.spark.kudu.DefaultSource.getOperationType(DefaultSource.scala:217)
at org.apache.kudu.spark.kudu.DefaultSource.createRelation(DefaultSource.scala:104)
at org.apache.kudu.spark.kudu.DefaultSource.createRelation(DefaultSource.scala:87)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:318)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:223)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:167)
... 49 elided
Caused by: java.lang.ClassNotFoundException: scala.Product$class
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 61 more
看起来是scala版本冲突,到spark下载页面发现一句话:
Note that, Spark is pre-built with Scala 2.11 except version 2.4.2, which is pre-built with Scala 2.12.
4)kudu-spark改为scala2.12
spark-2.4.2-bin-hadoop2.6
# bin/spark-shell --packages org.apache.kudu:kudu-spark2_2.12:1.9.0
::::::::::::::::::::::::::::::::::::::::::::::
:: UNRESOLVED DEPENDENCIES ::
::::::::::::::::::::::::::::::::::::::::::::::
:: org.apache.kudu#kudu-spark2_2.12;1.9.0: not found
::::::::::::::::::::::::::::::::::::::::::::::
好吧,下载2.4.3
5)采用官方的方式(继续)
spark-2.4.3-bin-hadoop2.6
# bin/spark-shell --packages org.apache.kudu:kudu-spark2_2.11:1.9.0
scala> val df = spark.read.options(Map("kudu.master" -> "master:7051", "kudu.table" -> "impala::test.tbl_test")).format("kudu").load
df: org.apache.spark.sql.DataFrame = [order_no: string, id: bigint ... 28 more fields]
正常了
6)采用官方的方式(将kudu版本改为1.7.0)
spark-2.4.3-bin-hadoop2.6
# bin/spark-shell --packages org.apache.kudu:kudu-spark2_2.11:1.7.0
same error
看来spark连接kudu只能采用scala2.11+kudu-spark2_2.11:1.9.0
参考:
https://kudu.apache.org/docs/developing.html
http://spark.apache.org/downloads.html
【原创】大叔经验分享(55)spark连接kudu报错的更多相关文章
- 【原创】大叔经验分享(53)kudu报错unable to find SASL plugin: PLAIN
kudu安装后运行不正常,master中找不到任何tserver,查看tserver日志发现有很多报错: Failed to heartbeat to master:7051: Invalid arg ...
- 【原创】大叔经验分享(51)docker报错Exited (137)
docker container启动失败,报错:Exited (137) *** ago,比如 Exited (137) 16 seconds ago 这时通过docker logs查不到任何日志,从 ...
- 【原创】大叔经验分享(63)kudu vs parquet
一 对比 存储空间对比: 查询性能对比: 二 设计方案 将数据拆分为:历史数据(hdfs+parquet+snappy)+ 近期数据(kudu),可以兼具各种优点: 1)整体低于10%的磁盘占用: 2 ...
- 【原创】大叔经验分享(61)kudu rebalance报错
kudu rebalance命令报错 terminate called after throwing an instance of 'std::regex_error' what(): regex_e ...
- 【原创】大叔经验分享(62)kudu副本数量
kudu的副本数量是在表上设置,可以通过命令查看 # sudo -u kudu kudu cluster ksck $master ... Summary by table Name | RF | S ...
- 【原创】大叔经验分享(59)kudu查看table size
kudu并没有命令可以直接查看每个table占用的空间,可以从cloudera manager上间接查看 CM is scrapping and aggregating the /metrics pa ...
- 【原创】大叔经验分享(58)kudu写入压力大时报错
kudu写入压力大时报错 19/05/18 16:53:12 INFO AsyncKuduClient: Invalidating location fd52e4f930bc45458a8f29ed1 ...
- 【原创】大叔经验分享(38)beeline连接hiveserver2报错impersonate
beeline连接hiveserver2报错 Error: Could not open client transport with JDBC Uri: jdbc:hive2://localhost: ...
- 【原创】大叔问题定位分享(33)beeline连接presto报错
hive2.3.4 presto0.215 使用hive2.3.4的beeline连接presto报错 $ beeline -d com.facebook.presto.jdbc.PrestoDriv ...
随机推荐
- TCP层close系统调用的实现分析
在调用close系统调用关闭套接字时,如果套接字引用计数已经归零,则需继续向上层调用其close实现,tcp为tcp_close:本文仅介绍tcp部分,前置部分请参考本博关于close系统调用的文章: ...
- PHP冒泡排序原生代码
//冒泡排序 $arr=array(23,5,26,4,9,85,10,2,55,44,21,39,11,16,55,88,421,226,588); $n =count($arr); //echo ...
- 七十九:flask.Restful之flask-Restful蓝图与渲染模板
1.flask-Restful与蓝图结合使用如果要在蓝图中使用flask-Restful,那么在创建Api对象的时候,就不应该使用app,而是蓝图,如果有多个蓝图,则需在每一个蓝图里面创建一个Api对 ...
- jenkins+git+gitlab+ansible实现持续集成自动化部署
一.环境配置 192.168.42.8部署gitlab,节点一 192.168.42.9部署git,Jenkins,ansible服务器 192.168.42.10节点二 二.操作演示 ①gitlab ...
- RxJava2实战--第二章 RxJava基础知识
第二章 RxJava基础知识 1. Observable 1.1 RxJava的使用三步骤 创建Observable 创建Observer 使用subscribe()进行订阅 Observable.j ...
- Spring集成CXF获取HttpServletRequest,HttpServletResponse
最近的项目中,在Spring继承CXF中要用到request来获取IP,所以先要获取到HttpServletRequest对象,具体方法如下: 1.配置文件: <jaxrs:server id= ...
- django使用session来保存用户登录状态
先建好登录用的model,其次理解使用cookie和session的原理,一个在本机保存,一个在服务器保存 使用session好处,可以设置登录过期的时间, 编写views中login的函数 def ...
- 【Java】递归删除目录以及文件
public static void deleteDirectory(String path) { File pFile = new File(path); //若目录以及文件不存在,则终止继续执行方 ...
- XCTF (app2)
打开app,有两个输入框和一个按钮.点击按钮会跳转到新的页面显示Waiting for you. 打开JEB反编译. 如果两个输入框的长度都不为0,那么获取这两个值到v0和v1中,Log记录日志. 创 ...
- Leetcode之动态规划(DP)专题-63. 不同路径 II(Unique Paths II)
Leetcode之动态规划(DP)专题-63. 不同路径 II(Unique Paths II) 初级题目:Leetcode之动态规划(DP)专题-62. 不同路径(Unique Paths) 一个机 ...