Spark Error Handling

1. Problem: org.apache.spark.SparkException: Exception thrown in awaitResult

Analysis: This happens when Spark is started using a hostname, and clients then fail because DNS cannot resolve that hostname.

Solution:

Method 1: Make sure the master URL is spark://<server-ip>:7077 rather than spark://<hostname>:7077, and pass -h <ip-address> when starting the master (a client-side sketch follows Method 3).

Method 2: Add a resolution record for the host to the hosts file (recommended):

<ip-address>     <hostname>

Method 3: Set hive.metastore.try.direct.sql to false (in hive-site.xml).
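For Method 1, a minimal client-side sketch (the IP address and app name are placeholders) connects to the master by IP so no hostname lookup is needed:

import org.apache.spark.sql.SparkSession

// Use the master's IP address instead of a hostname the client cannot resolve.
val spark = SparkSession.builder()
  .appName("awaitresult-check")              // placeholder app name
  .master("spark://192.168.1.10:7077")       // placeholder IP, as in Method 1
  .getOrCreate()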

2. Using Hive with Spark 2.x, i.e. after copying hive-site.xml into Spark 2.x's conf directory.

Every time spark-sql (under Spark's bin directory) is launched, the terminal prints this warning:

Thu Jun 15 12:56:05 CST 2017 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.

Solution:

Edit the MySQL connection URL in hive-site.xml and add useSSL=false. Because hive-site.xml is an XML file, the parameters cannot be joined with a raw &; use the entity &amp; instead:

<value>jdbc:mysql://localhost:3306/metastore?createDatabaseIfNotExist=true&amp;useSSL=false</value>

 

Then restart Spark:

#../sbin/stop-all.sh
#../sbin/start-all.sh

 

 

3. Problem:

After Spark had been running for a while and the data volume grew, the following error appeared:

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
17/10/26 20:29:00 ERROR Executor: Exception in task 39.1 in stage 8.0 (TID 1122)
java.io.FileNotFoundException: /tmp/spark-2de5fa03-a7cb-47a2-9540-403de85d0371/executor-eebecccb-4cdb-4b85-80a3-73c4baa4c7bd/blockmgr-fc644c14-23e8-401c-aee8-00bc108bf607/2b/temp_shuffle_75eb7338-be41-41b4-bed4-5dcb0c1d0fdf (No space left on device)
    at java.io.FileOutputStream.open0(Native Method)
    at java.io.FileOutputStream.open(FileOutputStream.java:270)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
    at org.apache.spark.storage.DiskBlockObjectWriter.initialize(DiskBlockObjectWriter.scala:102)
    at org.apache.spark.storage.DiskBlockObjectWriter.open(DiskBlockObjectWriter.scala:115)
    at org.apache.spark.storage.DiskBlockObjectWriter.write(DiskBlockObjectWriter.scala:235)
    at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:151)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

 

The log says there is no space left on the device. By default Spark writes its temporary files under /tmp, so this needs to be changed to point at a location with plenty of storage:

 

Solution:

Edit spark-env.sh:

export SPARK_DRIVER_MEMORY=5g
export SPARK_LOCAL_DIRS=/data/sparktmp

 

Do not add this to spark-defaults.conf, because since Spark 1.0 the spark.local.dir parameter is deprecated there and gets overridden by the value set by the cluster manager.
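A quick way to confirm which setting will actually take effect, from spark-shell (a sketch that simply mirrors the precedence implemented in Utils.getConfiguredLocalDirs, analyzed below; sc is the SparkContext the shell provides):

// Which local-dir setting will Spark see? Run inside spark-shell.
val fromEnv  = sys.env.get("SPARK_LOCAL_DIRS")            // set via spark-env.sh
val fromConf = sc.getConf.getOption("spark.local.dir")    // deprecated since Spark 1.0
val fallback = System.getProperty("java.io.tmpdir")       // usually /tmp
println(s"SPARK_LOCAL_DIRS=$fromEnv, spark.local.dir=$fromConf, java.io.tmpdir=$fallback")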

 

Source code analysis:

(1) The following method in the DiskBlockManager class

The log messages point to this code path as where the error surfaces.

/**
 * Create local directories for storing block data. These directories are
 * located inside configured local directories and won't
 * be deleted on JVM exit when using the external shuffle service.
 */
private def createLocalDirs(conf: SparkConf): Array[File] = {
  Utils.getConfiguredLocalDirs(conf).flatMap { rootDir =>
    try {
      val localDir = Utils.createDirectory(rootDir, "blockmgr")
      logInfo(s"Created local directory at $localDir")
      Some(localDir)
    } catch {
      case e: IOException =>
        logError(s"Failed to create local dir in $rootDir. Ignoring this directory.", e)
        None
    }
  }
}

(2) The validateSettings method in SparkConf.scala

This method tells us that the spark.local.dir parameter configured in spark-defaults.conf has been deprecated since Spark 1.0.

/** Checks for illegal or deprecated config settings. Throws an exception for the former. Not
 * idempotent - may mutate this conf object to convert deprecated settings to supported ones. */
private[spark] def validateSettings() {
  if (contains("spark.local.dir")) {
    val msg = "In Spark 1.0 and later spark.local.dir will be overridden by the value set by " +
      "the cluster manager (via SPARK_LOCAL_DIRS in mesos/standalone and LOCAL_DIRS in YARN)."
    logWarning(msg)
  }
  val executorOptsKey = "spark.executor.extraJavaOptions"
  val executorClasspathKey = "spark.executor.extr
  ...
}

(3) The getConfiguredLocalDirs method in Utils.scala

Analyzing the code below, we can see that when SPARK_LOCAL_DIRS is not configured in spark-env.sh, the local directory is taken from conf.get("spark.local.dir", System.getProperty("java.io.tmpdir")).split(","), i.e. it falls back to /tmp, and the directories are then created under that path, which is what produced the error above. Setting SPARK_LOCAL_DIRS directly in spark-env.sh is therefore enough to fix it:

export SPARK_LOCAL_DIRS=/home/hadoop/data/sparktmp

/**
 * Return the configured local directories where Spark can write files. This
 * method does not create any directories on its own, it only encapsulates the
 * logic of locating the local directories according to deployment mode.
 */
def getConfiguredLocalDirs(conf: SparkConf): Array[String] = {
  val shuffleServiceEnabled = conf.getBoolean("spark.shuffle.service.enabled", false)
  if (isRunningInYarnContainer(conf)) {
    // If we are in yarn mode, systems can have different disk layouts so we must set it
    // to what Yarn on this system said was available. Note this assumes that Yarn has
    // created the directories already, and that they are secured so that only the
    // user has access to them.
    getYarnLocalDirs(conf).split(",")
  } else if (conf.getenv("SPARK_EXECUTOR_DIRS") != null) {
    conf.getenv("SPARK_EXECUTOR_DIRS").split(File.pathSeparator)
  } else if (conf.getenv("SPARK_LOCAL_DIRS") != null) {
    conf.getenv("SPARK_LOCAL_DIRS").split(",")
  } else if (conf.getenv("MESOS_DIRECTORY") != null && !shuffleServiceEnabled) {
    // Mesos already creates a directory per Mesos task. Spark should use that directory
    // instead so all temporary files are automatically cleaned up when the Mesos task ends.
    // Note that we don't want this if the shuffle service is enabled because we want to
    // continue to serve shuffle files after the executors that wrote them have already exited.
    Array(conf.getenv("MESOS_DIRECTORY"))
  } else {
    if (conf.getenv("MESOS_DIRECTORY") != null && shuffleServiceEnabled) {
      logInfo("MESOS_DIRECTORY available but not using provided Mesos sandbox because " +
        "spark.shuffle.service.enabled is enabled.")
    }
    // In non-Yarn mode (or for the driver in yarn-client mode), we cannot trust the user
    // configuration to point to a secure directory. So create a subdirectory with restricted
    // permissions under each listed directory.
    conf.get("spark.local.dir", System.getProperty("java.io.tmpdir")).split(",")
  }
}

4. Join condition is missing or trivial. Use the CROSS JOIN syntax to allow cartesian products between these relations.;

Solution:

spark.sql.crossJoin.enabled: true
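For reference, a sketch of the two options from spark-shell on Spark 2.x (the toy DataFrames are only there to make the snippet self-contained): either flip the setting for the current session, or write the cross join explicitly so no setting is needed.

// Allow cartesian products for this session only.
spark.conf.set("spark.sql.crossJoin.enabled", "true")

// Or make the intent explicit in the DataFrame API, as the error message suggests.
import spark.implicits._
val df1 = Seq(1, 2).toDF("a")        // toy data for the demonstration
val df2 = Seq("x", "y").toDF("b")
val product = df1.crossJoin(df2)     // explicit cross join, 4 rows
product.show()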

5. Caused by: org.codehaus.janino.JaninoRuntimeException: Code of method "eval(Lorg/apache/spark/sql/catalyst/InternalRow;)Z" of class "org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificPredicate" grows beyond 64 KB

Solution:

spark.sql.codegen.wholeStage: false
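A sketch of applying this to the current session only, from spark-shell (spark is the shell's SparkSession); disabling whole-stage codegen falls back to the interpreted execution path, avoiding the 64 KB generated-method limit at some performance cost:

// Disable whole-stage code generation for this session only.
spark.conf.set("spark.sql.codegen.wholeStage", "false")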

6. java.lang.OutOfMemoryError: Java heap space

Solution:

spark.driver.memory: 10g (raise to a higher value)

spark.sql.ui.retainedExecutions: 5 (lower this value)
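Note that spark.driver.memory only takes effect if it is set before the driver JVM starts, e.g. via spark-submit --driver-memory 10g or spark-defaults.conf. The UI setting can also be wired into the application itself; a sketch (the app name and the value 5 are illustrative):

import org.apache.spark.sql.SparkSession

// Retain fewer finished SQL executions in the web UI to reduce driver heap usage.
val spark = SparkSession.builder()
  .appName("oom-tuning-example")                     // placeholder app name
  .config("spark.sql.ui.retainedExecutions", "5")
  .getOrCreate()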
