II: Source walkthrough of how the Spark Worker launches the Driver

  case LaunchDriver(driverId, driverDesc) => {
    logInfo(s"Asked to launch driver $driverId")
    // The DriverRunner acts as a proxy that launches and manages the Driver process.
    val driver = new DriverRunner(
      conf,
      driverId,
      workDir,
      sparkHome,
      driverDesc.copy(command = Worker.maybeUpdateSSLSettings(driverDesc.command, conf)),
      self,
      workerUri,
      securityMgr)
    // drivers is a HashMap, not an array: the key is the driverId and the value is
    // the DriverRunner created for that driver.
    drivers(driverId) = driver
    driver.start()
    // Once the driver has been started, record the cores and memory it uses.
    coresUsed += driverDesc.cores
    memoryUsed += driverDesc.mem
  }

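Note: in cluster mode, if the Driver fails to start or crashes, and the supervise flag of its DriverDescription is set to true, the Driver is restarted automatically. The Worker is responsible for that restart.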
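The restart behaviour described in the note can be sketched as a retry loop. This is a minimal illustration, not the code from DriverRunner: the object name `SuperviseSketch`, the `maxAttempts` cap, the injectable `sleep` function, and the doubling backoff starting at one second are all assumptions made for the example, and the real implementation has additional conditions that are omitted here.

```scala
object SuperviseSketch {
  /** Runs `attempt` (which returns a process exit code) and, when `supervise`
    * is true, re-runs it after a non-zero exit code.
    * `maxAttempts` and `sleep` are hypothetical knobs added for this sketch.
    * Returns the last exit code and the number of attempts made. */
  def runSupervised(
      supervise: Boolean,
      maxAttempts: Int,
      sleep: Long => Unit = ms => Thread.sleep(ms))(attempt: () => Int): (Int, Int) = {
    var attempts = 0
    var exitCode = -1
    var backoffMs = 1000L
    var keepTrying = true
    while (keepTrying) {
      attempts += 1
      exitCode = attempt()
      // Retry only when supervision is on, the run failed, and the cap is not reached.
      keepTrying = supervise && exitCode != 0 && attempts < maxAttempts
      if (keepTrying) {
        sleep(backoffMs) // wait before relaunching
        backoffMs *= 2   // double the wait for the next failure
      }
    }
    (exitCode, attempts)
  }
}
```

With `supervise = false` the loop body runs exactly once, which matches the unsupervised case where a failed driver is simply reported as failed.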

  The DriverRunner class

private[deploy] class DriverRunner(
    conf: SparkConf,
    val driverId: String,
    val workDir: File,
    val sparkHome: File,
    val driverDesc: DriverDescription,
    val worker: RpcEndpointRef,
    val workerUrl: String,
    val securityManager: SecurityManager)
  extends Logging {

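This is the DriverRunner constructor; its parameters carry the configuration needed to launch the driver. The class provides a start method that spawns a new thread in which the driver is launched.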

/** Starts a thread to run and manage the driver. */
private[worker] def start() = {
  // A plain Java thread is used to launch the driver.
  new Thread("DriverRunner for " + driverId) {
    override def run() {
      try {
        // Create the driver's working directory.
        val driverDir = createWorkingDirectory()
        // Fetch the user's jar into that directory (a jar submitted to the
        // cluster is typically stored on HDFS).
        val localJarFilename = downloadUserJar(driverDir)

        def substituteVariables(argument: String): String = argument match {
          case "{{WORKER_URL}}" => workerUrl
          case "{{USER_JAR}}" => localJarFilename
          case other => other
        }

        // TODO: If we add ability to submit multiple jars they should also be added here
        // Build a ProcessBuilder and use it to launch the driver.
        val builder = CommandUtils.buildProcessBuilder(driverDesc.command, securityManager,
          driverDesc.mem, sparkHome.getAbsolutePath, substituteVariables)
        launchDriver(builder, driverDir, driverDesc.supervise)
      }
      catch {
        case e: Exception => finalException = Some(e)
      }

      val state =
        if (killed) {
          DriverState.KILLED
        } else if (finalException.isDefined) {
          DriverState.ERROR
        } else {
          finalExitCode match {
            case Some(0) => DriverState.FINISHED
            case _ => DriverState.FAILED
          }
        }

      finalState = Some(state)

      // Report the final state to the worker; any exception raised during
      // launch travels with the message.
      worker.send(DriverStateChanged(driverId, state, finalException))
    }
  }.start()
}
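The decision made at the end of run() can be isolated as a pure function, which makes the precedence of the checks easier to see: a kill wins over an exception, an exception wins over the exit code, and only exit code 0 counts as FINISHED. The sketch below is self-contained; `DriverStateSketch` and its local `DriverState` enumeration are stand-ins written for this example, not Spark's own definitions.

```scala
object DriverStateSketch {
  // Stand-in for Spark's DriverState; only the four values used in run() are modelled.
  object DriverState extends Enumeration {
    val KILLED, ERROR, FINISHED, FAILED = Value
  }

  /** Mirrors the if/else chain at the end of run(), with the three pieces of
    * mutable state passed in as arguments. */
  def finalState(
      killed: Boolean,
      finalException: Option[Exception],
      finalExitCode: Option[Int]): DriverState.Value = {
    if (killed) {
      DriverState.KILLED
    } else if (finalException.isDefined) {
      DriverState.ERROR
    } else {
      finalExitCode match {
        case Some(0) => DriverState.FINISHED // clean exit
        case _ => DriverState.FAILED         // non-zero exit code or no exit code at all
      }
    }
  }
}
```

Note that a missing exit code (`None`) falls into the FAILED branch, the same as a non-zero one.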

As the run method shows, the first step is to create the driver's working directory:

/**
 * Creates the working directory for this driver.
 * Will throw an exception if there are errors preparing the directory.
 */
private def createWorkingDirectory(): File = {
  val driverDir = new File(workDir, driverId)
  if (!driverDir.exists() && !driverDir.mkdirs()) {
    throw new IOException("Failed to create directory " + driverDir)
  }
  driverDir
}
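The `!exists() && !mkdirs()` check makes the method safe to call when the directory is already there, because `java.io.File.mkdirs()` returns false for a directory that already exists. A standalone version, with `workDir` and `driverId` passed as parameters instead of read from the enclosing class (the object name and the parameterisation are choices made for this sketch), shows the behaviour:

```scala
import java.io.{File, IOException}

object WorkDirSketch {
  /** Same logic as createWorkingDirectory, with its inputs made explicit. */
  def createWorkingDirectory(workDir: File, driverId: String): File = {
    val driverDir = new File(workDir, driverId)
    // mkdirs() is only attempted when the directory is missing, so an
    // existing directory is accepted instead of being treated as an error.
    if (!driverDir.exists() && !driverDir.mkdirs()) {
      throw new IOException("Failed to create directory " + driverDir)
    }
    driverDir
  }
}
```

Calling it twice with the same arguments returns the same directory both times without throwing.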

Next, CommandUtils.buildProcessBuilder assembles the ProcessBuilder that launchDriver uses to start the driver process:

def buildProcessBuilder(
    command: Command,
    securityMgr: SecurityManager,
    memory: Int,
    sparkHome: String,
    substituteArguments: String => String,
    classPaths: Seq[String] = Seq[String](),
    env: Map[String, String] = sys.env): ProcessBuilder = {
  val localCommand = buildLocalCommand(
    command, securityMgr, substituteArguments, classPaths, env)
  val commandSeq = buildCommandSeq(localCommand, memory, sparkHome)
  val builder = new ProcessBuilder(commandSeq: _*)
  val environment = builder.environment()
  for ((key, value) <- localCommand.environment) {
    environment.put(key, value)
  }
  builder
}
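The two ideas in this method, rewriting placeholder arguments through a `String => String` function and copying environment variables into the builder, can be tried out without Spark. The sketch below is a simplification: it applies the substitution to a whole command line and takes the environment as a plain map, whereas the real code goes through buildLocalCommand and buildCommandSeq, which are not shown in this post. The names `ProcessBuilderSketch`, `substitute`, and `build` are hypothetical.

```scala
object ProcessBuilderSketch {
  /** Returns a substitution function like DriverRunner's substituteVariables. */
  def substitute(workerUrl: String, userJar: String): String => String = {
    case "{{WORKER_URL}}" => workerUrl
    case "{{USER_JAR}}" => userJar
    case other => other
  }

  /** Applies the substitution to every argument, then copies the extra
    * environment variables into the builder's environment. */
  def build(
      rawCommand: Seq[String],
      extraEnv: Map[String, String],
      substituteArguments: String => String): ProcessBuilder = {
    val builder = new ProcessBuilder(rawCommand.map(substituteArguments): _*)
    val environment = builder.environment()
    for ((key, value) <- extraEnv) {
      environment.put(key, value)
    }
    builder
  }
}
```

Only arguments that are exactly equal to a placeholder are replaced; a placeholder embedded inside a longer string is left untouched, which is also how the pattern match in substituteVariables behaves.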

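What remains is exception handling, which is just the ordinary Java exception mechanism. The point worth noting is that the outcome, including a failed launch, is reported through a DriverStateChanged message: the DriverRunner sends it to the Worker, and the Master is notified in turn that the driver's state has changed.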

case class DriverStateChanged(
    driverId: String,
    state: DriverState,
    exception: Option[Exception])
  extends DeployMessage
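The Worker-side handler for this message is not shown in this post. A plausible sketch, assuming the handler simply undoes the bookkeeping done in LaunchDriver (remove the entry from the drivers map and give back its cores and memory), looks like this. `WorkerBookkeepingSketch`, `DriverInfo`, `onLaunch`, and `onStateChanged` are hypothetical names, and forwarding the message to the Master is left out.

```scala
import scala.collection.mutable

object WorkerBookkeepingSketch {
  // Hypothetical stand-in for the resources recorded in a DriverDescription.
  final case class DriverInfo(cores: Int, mem: Int)

  class Bookkeeping {
    val drivers = mutable.HashMap.empty[String, DriverInfo]
    var coresUsed = 0
    var memoryUsed = 0

    /** Mirrors the accounting in the LaunchDriver case. */
    def onLaunch(driverId: String, info: DriverInfo): Unit = {
      drivers(driverId) = info
      coresUsed += info.cores
      memoryUsed += info.mem
    }

    /** Assumed reaction to a terminal DriverStateChanged: free the resources.
      * An unknown driverId is ignored. */
    def onStateChanged(driverId: String): Unit = {
      drivers.remove(driverId).foreach { info =>
        coresUsed -= info.cores
        memoryUsed -= info.mem
      }
    }
  }
}
```

Keeping the increments and decrements symmetric is what lets the Worker report accurate free cores and memory after a driver ends.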

III: Source walkthrough of how the Worker launches an Executor

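The Worker launches an Executor in much the same way as it launches the Driver. In essence, the Driver is just another Executor-like process on the Worker (in cluster mode, that is). The source is included below without further discussion.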

case LaunchExecutor(masterUrl, appId, execId, appDesc, cores_, memory_) =>
  if (masterUrl != activeMasterUrl) {
    logWarning("Invalid Master (" + masterUrl + ") attempted to launch executor.")
  } else {
    try {
      logInfo("Asked to launch executor %s/%d for %s".format(appId, execId, appDesc.name))

      // Create the executor's working directory
      val executorDir = new File(workDir, appId + "/" + execId)
      if (!executorDir.mkdirs()) {
        throw new IOException("Failed to create directory " + executorDir)
      }

      // Create local dirs for the executor. These are passed to the executor via the
      // SPARK_EXECUTOR_DIRS environment variable, and deleted by the Worker when the
      // application finishes.
      val appLocalDirs = appDirectories.get(appId).getOrElse {
        Utils.getOrCreateLocalRootDirs(conf).map { dir =>
          val appDir = Utils.createDirectory(dir, namePrefix = "executor")
          Utils.chmod700(appDir)
          appDir.getAbsolutePath()
        }.toSeq
      }
      appDirectories(appId) = appLocalDirs
      val manager = new ExecutorRunner(
        appId,
        execId,
        appDesc.copy(command = Worker.maybeUpdateSSLSettings(appDesc.command, conf)),
        cores_,
        memory_,
        self,
        workerId,
        host,
        webUi.boundPort,
        publicAddress,
        sparkHome,
        executorDir,
        workerUri,
        conf,
        appLocalDirs, ExecutorState.RUNNING)
      executors(appId + "/" + execId) = manager
      manager.start()
      coresUsed += cores_
      memoryUsed += memory_
      sendToMaster(ExecutorStateChanged(appId, execId, manager.state, None, None))
    } catch {
      case e: Exception => {
        logError(s"Failed to launch executor $appId/$execId for ${appDesc.name}.", e)
        if (executors.contains(appId + "/" + execId)) {
          executors(appId + "/" + execId).kill()
          executors -= appId + "/" + execId
        }
        sendToMaster(ExecutorStateChanged(appId, execId, ExecutorState.FAILED,
          Some(e.toString), None))
      }
    }
  }
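The register-then-roll-back pattern in this handler can be reduced to a few lines. In the sketch below the executor is registered under the key `appId + "/" + execId` before it is started, and if starting throws, the entry is killed and removed so the map never holds a half-launched executor. The `Runner` trait, the `Registry` class, and the Boolean return value are assumptions made for the example; the real code reports the result to the Master with ExecutorStateChanged instead.

```scala
import scala.collection.mutable

object ExecutorLaunchSketch {
  // Hypothetical minimal interface standing in for ExecutorRunner.
  trait Runner {
    def start(): Unit
    def kill(): Unit
  }

  class Registry {
    val executors = mutable.HashMap.empty[String, Runner]

    /** Returns true if the runner started, false if the launch was rolled back. */
    def launch(appId: String, execId: Int, runner: Runner): Boolean = {
      val fullId = appId + "/" + execId
      try {
        executors(fullId) = runner
        runner.start()
        true
      } catch {
        case e: Exception =>
          // Undo the registration: kill the runner if it was recorded, then drop it.
          executors.remove(fullId).foreach(_.kill())
          false
      }
    }
  }
}
```

Because the registration happens before `start()`, the cleanup in the catch block is needed even when the runner never actually started.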
