问题:

Caused by: java.util.concurrent.ExecutionException: java.lang.IndexOutOfBoundsException: Index: 0
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1016)
... 65 more
Caused by: java.lang.IndexOutOfBoundsException: Index: 0
at java.util.Collections$EmptyList.get(Collections.java:4454)
at org.apache.hadoop.hive.ql.io.orc.OrcProto$Type.getSubtypes(OrcProto.java:12240)
at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.getColumnIndicesFromNames(ReaderImpl.java:651)
at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.getRawDataSizeOfColumns(ReaderImpl.java:634)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.populateAndCacheStripeDetails(OrcInputFormat.java:927)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.call(OrcInputFormat.java:836)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.call(OrcInputFormat.java:702)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
java.lang.RuntimeException: serious problem
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1021)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1048)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:200)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:46)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
at scala.Option.getOrElse(Option.scala:121)

原因:

sparksql生成的hive表有空文件,但是sparksql读取空文件的时候,因为表示orc格式的,导致sparksql解析orc文件出错。但是用hive却可以正常读取。

解决办法:

设置set spark.sql.hive.convertMetastoreOrc=true

单纯的设置以上参数还是会报错:

java.lang.IndexOutOfBoundsException
at java.nio.Buffer.checkIndex(Buffer.java:540)
at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:139)
at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.extractMetaInfoFromFooter(ReaderImpl.java:374)
at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.<init>(ReaderImpl.java:316)
at org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:187)
at org.apache.spark.sql.hive.orc.OrcFileOperator$$anonfun$getFileReader$2.apply(OrcFileOperator.scala:75)
at org.apache.spark.sql.hive.orc.OrcFileOperator$$anonfun$getFileReader$2.apply(OrcFileOperator.scala:73)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
at scala.collection.TraversableOnce$class.collectFirst(TraversableOnce.scala:145)
at scala.collection.AbstractIterator.collectFirst(Iterator.scala:1336)
at org.apache.spark.sql.hive.orc.OrcFileOperator$.getFileReader(OrcFileOperator.scala:86)
at org.apache.spark.sql.hive.orc.OrcFileOperator$$anonfun$readSchema$1.apply(OrcFileOperator.scala:95)
at org.apache.spark.sql.hive.orc.OrcFileOperator$$anonfun$readSchema$1.apply(OrcFileOperator.scala:95)
at scala.collection.immutable.Stream.flatMap(Stream.scala:489)

需要再设置set spark.sql.orc.impl=native

参考https://issues.apache.org/jira/browse/SPARK-19809

sparksql读取hive数据报错:java.lang.RuntimeException: serious problem的更多相关文章

  1. hive 报错 java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

    Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: Unable ...

  2. storm supervisor启动报错java.lang.RuntimeException: java.io.EOFException

    storm因机器断电或其他异常导致的supervisor意外终止,再次启动时报错: 1. 2013-09-24 09:15:44,361 INFO [main] daemon.supervisor ( ...

  3. Android Studio 首次安装报错 Java.lang.RuntimeException:java.lang.NullPointerException...错

    下次安装报:Java.lang.RuntimeException: java.lang.NullPointerException......错 只需在文件..\Android Studio\bin\i ...

  4. hive Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient

    Exception in thread "main" java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata. ...

  5. 我的Android进阶之旅------>Android中MediaRecorder.stop()报错 java.lang.RuntimeException: stop failed.【转】

    本文转载自:http://blog.csdn.net/ouyang_peng/article/details/48048975 今天在调用MediaRecorder.stop(),报错了,java.l ...

  6. maven报错 java.lang.RuntimeException: com.google.inject.CreationException: Unable to create injector, see the following errors

    2 errors java.lang.RuntimeException: com.google.inject.CreationException: Unable to create injector, ...

  7. java.lang.RuntimeException: Class "org.apache.maven.cli.MavenCli$CliRequest" not found

    IDEA版本:14.1 maven版本:apache-maven-3.3.9-bin IDEA的maven项目,在pom文件执行Maven--Reimport,引入jar包依赖,报错java.lang ...

  8. hive 报错FAILED: Error in metadata: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient FAILED: Execu

    使用hive一段时间以后,今天在使用的时候突然报错,如下: hive> show databases;FAILED: Error in metadata: java.lang.RuntimeEx ...

  9. Hive 报错:java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

    在配置好hive后启动报错信息如下: [walloce@bigdata-study- hive--cdh5.3.6]$ bin/hive Logging initialized using confi ...

随机推荐

  1. SQL on Hadoop技术综述

    一.系统架构 runtime framework v.s. mpp 在SQL on Hadoop系统中,有两种架构: 1.一种是基于某个运行时框架来构建查询引擎,典型案例是Hive: 2.另一种是仿照 ...

  2. spring boot整合websocket之使用自带tomcat启动项目报错记录

    项目中用到websocket,就将原来写好的websocket工具类直接拿来使用,发现前端建立连接的时候报404,经查找发现是因为原来用的是配置的外部tomcat启动,这次是spring boot自带 ...

  3. Octet和byte的差异(转)

    在不严谨的前提下,byte和octet都表示为8 bits,但是严格意义上来讲,octet才是严格意义上的8 bits,而历史上的byte其实可以表示为4 bits ~ 10 bits,只不过现在的计 ...

  4. 深入理解JVM虚拟机

    JVM平台上还可以运行其他语言,运行的是Class字节码.只要能翻译成Class的语言就OK了.挺强大的. JVM厂商很多 垃圾收集器.收集算法 JVM检测工具 关于类的加载: Java代码中,类型( ...

  5. a simple machine learning system demo, for ML study.

    Machine Learning System introduction This project is a full stack Django/React/Redux app that uses t ...

  6. elasticsearch中mapping的_source和store的笔记(转)

    原文地址: https://www.cnblogs.com/zklidd/p/6149120.html 0.故事引入 无意中看到了ES的mapping中有store字段,作为一个ES菜鸡,有必要对这个 ...

  7. 使用vue搭建应用一入门

    1.准备 安装nodejs,配置环境变量 安装了nodejs,也就安装了npm 安装webpack npm install webpack -g 安装vue脚手架项目初始化工具 vue-cli npm ...

  8. 如何在Debian 9上安装和使用Docker

    介绍 Docker是一个简化容器中应用程序进程管理过程的应用程序.容器允许您在资源隔离的进程中运行应用程序.它们与虚拟机类似,但容器更便携,更加资源友好,并且更依赖于主机操作系统. 在本教程中,您将在 ...

  9. 【转】解决深入学习PHP的瓶颈?

    转自:https://www.cnblogs.com/aksir/p/6774115.html PHP给学习者的感觉是:初学的时候很容易,但是学了2-3年,就深刻感觉遇到了瓶颈,很难深入,放弃又可惜. ...

  10. Fiddler抓包工具的简单使用

    Fiddler的官方网站:http://www.fiddler2.com Fiddler的官方帮助:http://docs.telerik.com/fiddler/knowledgebase/quic ...