Spark error: Caused by: java.lang.ClassCastException: scala.collection.mutable.WrappedArray$ofRef cannot be cast to [Lscala.collection.immutable.Map;
Background
I wrote a UDF that checks whether two columns are equal. The two columns have the same schema, but the structure is fairly complex:
|-- list1: array (nullable = true)
| |-- element: map (containsNull = true)
| | |-- key: string
| | |-- value: array (valueContainsNull = true)
| | | |-- element: struct (containsNull = true)
| | | | |-- Date: integer (nullable = true)
| | | | |-- Name: string (nullable = true)
|-- list2: array (nullable = true)
| |-- element: map (containsNull = true)
| | |-- key: string
| | |-- value: array (valueContainsNull = true)
| | | |-- element: struct (containsNull = true)
| | | | |-- Date: integer (nullable = true)
| | | | |-- Name: string (nullable = true)
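For anyone who wants to reproduce the setup locally, a DataFrame with the same shape can be built roughly like this (just a sketch with made-up values; it assumes a spark-shell session where spark.implicits._ is in scope):

import spark.implicits._

case class AppList(Date: Int, Name: String)

// Both columns end up with the schema array<map<string, array<struct<Date:int, Name:string>>>>.
val df = Seq(
  (Seq(Map("a" -> Seq(AppList(20180101, "app1")))),
   Seq(Map("a" -> Seq(AppList(20180101, "app1")))))
).toDF("list1", "list2")
df.printSchema()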
In other words, a Map is nested inside an Array, and another Array is nested inside that Map, so the comparison has to walk the structure level by level. The UDF I wrote looked like this:
import org.apache.spark.sql.functions.udf

case class AppList(Date: Int, Name: String)

// Two maps are equal if they have the same keys and equal values for every key.
def isMapEqual(map1: Map[String, Array[AppList]], map2: Map[String, Array[AppList]]): Boolean = {
  try {
    if (map1.size != map2.size) {
      return false
    } else {
      for (x <- map1.keys) {
        if (map1(x) != map2(x)) {
          return false
        }
      }
      return true
    }
  } catch {
    case e: Exception => false
  }
}

// Note: only the first map of each list is actually compared.
def isListEqual(list1: Array[Map[String, Array[AppList]]], list2: Array[Map[String, Array[AppList]]]): Boolean = {
  try {
    if (list1.length != list2.length) {
      return false
    } else if (list1.length == 0 || list2.length == 0) {
      return false
    } else {
      return isMapEqual(list1(0), list2(0))
    }
  } catch {
    case e: Exception => false
  }
}

val isColumnEqual = udf((list1: Array[Map[String, Array[AppList]]], list2: Array[Map[String, Array[AppList]]]) => {
  isListEqual(list1, list2)
})
I then pasted this into spark-shell and ran the following:
val dat = df.withColumn("equal", isColumnEqual($"list1", $"list2"))
dat.show()
At that point the following error appeared:
Caused by: org.apache.spark.SparkException: Failed to execute user defined function($anonfun$1: (array<map<string,array<struct<Date:int,Name:string>>>>, array<map<string,array<struct<Date:int,Name:string>>>>) => boolean)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:377)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:231)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:225)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:827)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:827)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:99)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassCastException: scala.collection.mutable.WrappedArray$ofRef cannot be cast to [Lscala.collection.immutable.Map;
at $anonfun$1.apply(<console>:42)
... 16 more
Solution
The "solution" was, of course, to google it… An answer I found (see the references below) says that changing Array to Seq in the UDF signature is enough. Somewhat sheepishly, I tried it, and sure enough it worked.
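Concretely, every Array in the UDF and helper signatures becomes Seq. A minimal sketch of the corrected version (the comparison logic from above, slightly condensed; only the parameter types really change):

// Spark hands ArrayType columns to a Scala UDF as Seq (a WrappedArray at runtime),
// so the parameters must be declared as Seq rather than Array.
def isMapEqual(map1: Map[String, Seq[AppList]], map2: Map[String, Seq[AppList]]): Boolean = {
  try {
    map1.size == map2.size && map1.keys.forall(x => map1(x) == map2(x))
  } catch {
    case e: Exception => false
  }
}

def isListEqual(list1: Seq[Map[String, Seq[AppList]]], list2: Seq[Map[String, Seq[AppList]]]): Boolean = {
  try {
    list1.nonEmpty && list1.length == list2.length && isMapEqual(list1(0), list2(0))
  } catch {
    case e: Exception => false
  }
}

val isColumnEqual = udf((list1: Seq[Map[String, Seq[AppList]]], list2: Seq[Map[String, Seq[AppList]]]) => {
  isListEqual(list1, list2)
})

As far as I can tell, the inner struct elements still arrive as Row objects rather than AppList instances at runtime, but because the Seq type parameter is erased and the values are only ever compared with ==, no further cast is attempted here.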
Why it happens
The linked answer (see the references below) explains:
So it looks like the ArrayType on Dataframe "idDF" is really a WrappedArray and not an Array - So the function call to "filterMapKeysWithSet" failed as it expected an Array but got a WrappedArray/ Seq instead (which doesn't implicitly convert to Array in Scala 2.8 and above).
In other words, the Array here is not a native Scala Array but a wrapped one (correct me if I'm wrong; I've barely written any Scala, so I'm a bit nervous here).
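A quick way to see this in spark-shell (just a sketch; the toy column and imports are assumptions) is a UDF that simply reports the runtime class of an ArrayType argument:

import org.apache.spark.sql.functions.udf
import spark.implicits._

// Report the concrete class Spark passes for an array<string> column.
val probe = udf((xs: Seq[String]) => xs.getClass.getName)
Seq(Tuple1(Seq("a", "b"))).toDF("xs").select(probe($"xs")).show(false)
// On Spark 2.x this typically prints scala.collection.mutable.WrappedArray$ofRef,
// which is exactly why a parameter declared as Array fails with the cast error above.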
References
- https://stackoverflow.com/questions/40199507/scala-collection-mutable-wrappedarrayofref-cannot-be-cast-to-integer
- https://stackoverflow.com/questions/40764957/spark-java-lang-classcastexception-scala-collection-mutable-wrappedarrayofref