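A SchedulerBackend has two jobs: acquiring resources, and executing and managing tasks.

SparkDeploySchedulerBackend is built on the actor model; its main work is to start and manage two actors:
Deploy.Client Actor: handles resource acquisition. It is created when SparkDeploySchedulerBackend is initialized. The Client then registers with the Master, which eventually leads to ExecutorBackends being created on the Workers (see Spark源码分析 – Deploy), and each of these ExecutorBackends registers itself with the Driver Actor.
Driver Actor: handles task execution.
Spark was originally built on Mesos and only later added Standalone mode for compatibility, which is why the interfaces in the Driver Actor are all Mesos-style. Under Mesos, resources are presumably requested dynamically and tasks are then run on them (a guess; I have not read that source yet).
For coarse-grained Mesos mode and Spark's standalone deploy mode, however, this step is simplified: the resources are allocated up front when the TaskScheduler is initialized, and the Driver Actor only schedules tasks onto those executors.
That is why the comment on makeOffers says "Make fake resource offers": no resources are really being offered here.
As for how the Driver Actor gets tasks to run, the key is scheduler.resourceOffers.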
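To give a feel for what a call like scheduler.resourceOffers does with a list of offers, here is a minimal sketch. It is not Spark's implementation: `Offer`, `Assigned` and `ToyScheduler` are hypothetical names, locality is ignored, and the round-robin policy is an assumption made for the example. The only thing taken from the source is the shape of the result, one inner sequence of tasks per offer.

```scala
// Hypothetical stand-ins for WorkerOffer and TaskDescription; not Spark classes.
final case class Offer(executorId: String, host: String, cores: Int)
final case class Assigned(taskId: Int, executorId: String)

object ToyScheduler {
  // Hand out pending task ids round-robin, one core per task,
  // until either the tasks or the offered cores run out.
  // The result has one inner Seq per offer, in the same order as `offers`.
  def resourceOffers(offers: Seq[Offer], pending: Seq[Int]): Seq[Seq[Assigned]] = {
    val remaining = offers.map(_.cores).toArray
    val result = Array.fill(offers.length)(Vector.empty[Assigned])
    val queue = scala.collection.mutable.Queue(pending: _*)
    var progressed = true
    while (queue.nonEmpty && progressed) {
      progressed = false
      for (i <- offers.indices if queue.nonEmpty && remaining(i) > 0) {
        result(i) = result(i) :+ Assigned(queue.dequeue(), offers(i).executorId)
        remaining(i) -= 1
        progressed = true
      }
    }
    result.toSeq
  }
}
```

With offers of 2 cores on one executor and 1 core on another, four pending tasks yield three assignments; the fourth task stays pending until a later round of offers.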

SchedulerBackend

package org.apache.spark.scheduler.cluster

/**
 * A backend interface for cluster scheduling systems that allows plugging in different ones under
 * ClusterScheduler. We assume a Mesos-like model where the application gets resource offers as
 * machines become available and can launch tasks on them.
 */
private[spark] trait SchedulerBackend {
  def start(): Unit
  def stop(): Unit
  def reviveOffers(): Unit
  def defaultParallelism(): Int

  // Memory used by each executor (in megabytes)
  protected val executorMemory: Int = SparkContext.executorMemoryRequested

  // TODO: Probably want to add a killTask too
}
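The contract of the trait is small enough to try out in isolation. The sketch below is an assumption-laden toy, not Spark code: `MiniSchedulerBackend` copies only the four method signatures (the real trait is private[spark]), and `FixedCoresBackend` is a hypothetical backend whose resources are fixed at construction time, in the spirit of the "fake offers" described earlier.

```scala
// Simplified copy of the trait's method signatures, for illustration only.
trait MiniSchedulerBackend {
  def start(): Unit
  def stop(): Unit
  def reviveOffers(): Unit
  def defaultParallelism(): Int
}

// Hypothetical backend with a fixed number of cores.
// `onOffer` stands in for whatever consumes the offers (the scheduler).
class FixedCoresBackend(cores: Int, onOffer: Int => Unit) extends MiniSchedulerBackend {
  private var running = false

  def start(): Unit = { running = true }
  def stop(): Unit = { running = false }

  // A revive simply re-offers every core; a real backend would track which ones are free.
  def reviveOffers(): Unit = { if (running) onOffer(cores) }

  def defaultParallelism(): Int = cores
}
```

Calling reviveOffers before start or after stop does nothing in this toy, which keeps the lifecycle explicit.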

 

StandaloneSchedulerBackend

Used for coarse-grained Mesos mode and Spark's standalone deploy mode.

As you can see, its main purpose is to create and maintain the driverActor.

Most of the logic lives in the driverActor.

/**
 * A standalone scheduler backend, which waits for standalone executors to connect to it through
 * Akka. These may be executed in a variety of ways, such as Mesos tasks for the coarse-grained
 * Mesos mode or standalone processes for Spark's standalone deploy mode (spark.deploy.*).
 */
private[spark]
class StandaloneSchedulerBackend(scheduler: ClusterScheduler, actorSystem: ActorSystem)
  extends SchedulerBackend with Logging
{
  // Use an atomic variable to track total number of cores in the cluster for simplicity and speed
  var totalCoreCount = new AtomicInteger(0)

  class DriverActor(sparkProperties: Seq[(String, String)]) extends Actor {
    // ... analyzed later
  }

  var driverActor: ActorRef = null
  val taskIdsOnSlave = new HashMap[String, HashSet[String]]

  override def start() {
    val properties = new ArrayBuffer[(String, String)]
    val iterator = System.getProperties.entrySet.iterator
    while (iterator.hasNext) {
      val entry = iterator.next
      val (key, value) = (entry.getKey.toString, entry.getValue.toString)
      if (key.startsWith("spark.") && !key.equals("spark.hostPort")) {
        properties += ((key, value))
      }
    }
    // The key step: create the driverActor
    driverActor = actorSystem.actorOf(
      Props(new DriverActor(properties)), name = StandaloneSchedulerBackend.ACTOR_NAME)
  }

  private val timeout = Duration.create(System.getProperty("spark.akka.askTimeout", "10").toLong, "seconds")

  override def stop() {
    try {
      if (driverActor != null) {
        val future = driverActor.ask(StopDriver)(timeout) // shut down the driverActor
        Await.result(future, timeout)
      }
    } catch {
      case e: Exception =>
        throw new SparkException("Error stopping standalone scheduler's driver actor", e)
    }
  }

  override def reviveOffers() {
    driverActor ! ReviveOffers // send a ReviveOffers event to the driverActor
  }

  override def defaultParallelism() = Option(System.getProperty("spark.default.parallelism"))
      .map(_.toInt).getOrElse(math.max(totalCoreCount.get(), 2))

  // Called by subclasses when notified of a lost worker
  def removeExecutor(executorId: String, reason: String) {
    try {
      val future = driverActor.ask(RemoveExecutor(executorId, reason))(timeout)
      Await.result(future, timeout)
    } catch {
      case e: Exception =>
        throw new SparkException("Error notifying standalone scheduler's driver actor", e)
    }
  }
}
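Two small pieces of logic in this class are easy to check on their own: the property filter in start() and the fallback in defaultParallelism(). The sketch below restates both as pure functions. `BackendHelpers` and its method names are hypothetical, and the sort is added only so the example has a deterministic order; the conditions themselves are the ones in the excerpt.

```scala
object BackendHelpers {
  // Keep only spark.* settings, except spark.hostPort, as the loop in start() does.
  // Sorting by key is an addition for this example, to make the output order stable.
  def sparkProperties(all: Map[String, String]): Seq[(String, String)] =
    all.toSeq
      .filter { case (key, _) => key.startsWith("spark.") && key != "spark.hostPort" }
      .sortBy(_._1)

  // An explicit spark.default.parallelism wins; otherwise use the number of
  // registered cores, but never less than 2.
  def defaultParallelism(configured: Option[String], totalCores: Int): Int =
    configured.map(_.toInt).getOrElse(math.max(totalCores, 2))
}
```

The floor of 2 matters early on: before any executor has registered, totalCoreCount is still 0, and the default parallelism would otherwise be 0.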

DriverActor

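The key function is makeOffers, which launches tasks on the executors. When is it called?

On RegisterExecutor;

on a task StatusUpdate, once the task has finished;

on receiving a ReviveOffers event, which is sent when a new task is submitted and when the periodic timer for delay scheduling fires (once per second by default).

As for delay scheduling, the periodic revive appears to be there for liveness: even when nothing changes state, tasks still need a chance to launch, for example a task whose locality wait has expired and which can now accept a less local executor.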
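Why a timer is needed can be seen in a toy model with a manual clock. Everything here is hypothetical (`DelayedTask`, `ReviveDemo`, and the wait value used in the usage note); it is not Spark's task set logic, only the idea that a task prefers one host and accepts any host after a wait, so something has to re-offer resources as time passes.

```scala
// Hypothetical toy model of delay scheduling; not Spark's implementation.
final class DelayedTask(val preferredHost: String, val submittedAt: Long, val localityWaitMs: Long) {
  // Accept the preferred host at once; accept any other host only after the wait has elapsed.
  def accepts(host: String, now: Long): Boolean =
    host == preferredHost || now - submittedAt >= localityWaitMs
}

object ReviveDemo {
  // Re-offer `host` every `intervalMs` (must be positive), starting at time 0,
  // and return the first time at which the task accepts the offer, if any.
  def firstLaunchTime(task: DelayedTask, host: String, intervalMs: Long, untilMs: Long): Option[Long] =
    (0L to untilMs by intervalMs).find(now => task.accepts(host, now))
}
```

For a task that prefers hostA with a 3000 ms wait, offering only hostB once per second launches it at the 3000 ms tick. Without the periodic re-offer, and with no registrations or status updates arriving, nothing would ever present hostB to the task again.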

  class DriverActor(sparkProperties: Seq[(String, String)]) extends Actor {
    private val executorActor = new HashMap[String, ActorRef] // tracks the ActorRef of every executor
    private val executorAddress = new HashMap[String, Address]
    private val executorHost = new HashMap[String, String]
    private val freeCores = new HashMap[String, Int]
    private val actorToExecutorId = new HashMap[ActorRef, String]
    private val addressToExecutorId = new HashMap[Address, String]

    override def preStart() {
      // Listen for remote client disconnection events, since they don't go through Akka's watch()
      context.system.eventStream.subscribe(self, classOf[RemoteClientLifeCycleEvent])

      // Periodically revive offers to allow delay scheduling to work
      val reviveInterval = System.getProperty("spark.scheduler.revive.interval", "1000").toLong
      context.system.scheduler.schedule(0.millis, reviveInterval.millis, self, ReviveOffers)
    }

    def receive = {
      // RegisterExecutor is sent by StandaloneExecutorBackend
      case RegisterExecutor(executorId, hostPort, cores) =>
        Utils.checkHostPort(hostPort, "Host port expected " + hostPort)
        if (executorActor.contains(executorId)) {
          sender ! RegisterExecutorFailed("Duplicate executor ID: " + executorId)
        } else {
          logInfo("Registered executor: " + sender + " with ID " + executorId)
          sender ! RegisteredExecutor(sparkProperties)
          context.watch(sender) // watch the executor actor
          executorActor(executorId) = sender
          executorHost(executorId) = Utils.parseHostPort(hostPort)._1
          freeCores(executorId) = cores
          executorAddress(executorId) = sender.path.address
          actorToExecutorId(sender) = executorId
          addressToExecutorId(sender.path.address) = executorId
          totalCoreCount.addAndGet(cores)
          makeOffers()
        }

      case StatusUpdate(executorId, taskId, state, data) =>
        scheduler.statusUpdate(taskId, state, data.value)
        if (TaskState.isFinished(state)) {
          freeCores(executorId) += 1
          makeOffers(executorId)
        }

      // ReviveOffers is sent by StandaloneSchedulerBackend
      case ReviveOffers =>
        makeOffers()

      case StopDriver =>
        sender ! true
        context.stop(self)

      case RemoveExecutor(executorId, reason) =>
        removeExecutor(executorId, reason)
        sender ! true

      case Terminated(actor) =>
        actorToExecutorId.get(actor).foreach(removeExecutor(_, "Akka actor terminated"))

      case RemoteClientDisconnected(transport, address) =>
        addressToExecutorId.get(address).foreach(removeExecutor(_, "remote Akka client disconnected"))

      case RemoteClientShutdown(transport, address) =>
        addressToExecutorId.get(address).foreach(removeExecutor(_, "remote Akka client shutdown"))
    }

    // Make fake resource offers on all executors
    def makeOffers() {
      launchTasks(scheduler.resourceOffers(
        executorHost.toArray.map {case (id, host) => new WorkerOffer(id, host, freeCores(id))}))
    }

    // Make fake resource offers on just one executor
    // Note that the WorkerOffers passed to scheduler.resourceOffers are generated statically from
    // the executors that were already allocated, rather than obtained dynamically. With Mesos, the
    // offers would presumably be obtained dynamically and then passed to scheduler.resourceOffers.
    def makeOffers(executorId: String) {
      launchTasks(scheduler.resourceOffers(
        Seq(new WorkerOffer(executorId, executorHost(executorId), freeCores(executorId)))))
    }

    // Launch tasks returned by a set of resource offers
    def launchTasks(tasks: Seq[Seq[TaskDescription]]) {
      for (task <- tasks.flatten) {
        freeCores(task.executorId) -= 1
        // "launching" just means sending a LaunchTask event to the executor actor
        executorActor(task.executorId) ! LaunchTask(task)
      }
    }

    // Remove a disconnected slave from the cluster
    def removeExecutor(executorId: String, reason: String) {
      if (executorActor.contains(executorId)) {
        logInfo("Executor " + executorId + " disconnected, so removing it")
        val numCores = freeCores(executorId)
        actorToExecutorId -= executorActor(executorId)
        addressToExecutorId -= executorAddress(executorId)
        executorActor -= executorId
        executorHost -= executorId
        freeCores -= executorId
        totalCoreCount.addAndGet(-numCores)
        scheduler.executorLost(executorId, SlaveLost(reason))
      }
    }
  }
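The core accounting is the part of the DriverActor that is easiest to get wrong, so here is an Akka-free sketch of just that bookkeeping. `CoreLedger` is a hypothetical class, not part of Spark; it models three events from the code above (register, launch, task finished) and nothing else.

```scala
import scala.collection.mutable

// Hypothetical, Akka-free model of the DriverActor's free-core bookkeeping.
final class CoreLedger {
  private val freeCores = mutable.HashMap.empty[String, Int]
  private var totalCores = 0

  // Mirrors RegisterExecutor: a duplicate executor ID is rejected.
  def register(executorId: String, cores: Int): Boolean =
    if (freeCores.contains(executorId)) false
    else {
      freeCores(executorId) = cores
      totalCores += cores
      true
    }

  // Mirrors launchTasks: each launched task takes one core from its executor.
  def launch(executorId: String): Unit = freeCores(executorId) -= 1

  // Mirrors StatusUpdate with a finished state: the core is given back.
  def finished(executorId: String): Unit = freeCores(executorId) += 1

  def free(executorId: String): Int = freeCores.getOrElse(executorId, 0)
  def total: Int = totalCores
}
```

The sketch deliberately leaves out removal. In the excerpt, removeExecutor subtracts the executor's free cores at that moment rather than the number it registered with, so totalCoreCount can be left too high if tasks were still running when the executor was lost.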

 

SparkDeploySchedulerBackend

The key here is creating and managing the Driver and Client actors.

private[spark] class SparkDeploySchedulerBackend(
    scheduler: ClusterScheduler,
    sc: SparkContext,
    master: String,
    appName: String)
  extends StandaloneSchedulerBackend(scheduler, sc.env.actorSystem)
  with ClientListener
  with Logging {

  var client: Client = null
  // maxCores, stopping and shutdownCallback are used below; their declarations are not part of this excerpt

  override def start() {
    super.start() // calls StandaloneSchedulerBackend.start, which creates the DriverActor

    // The endpoint for executors to talk to us
    val driverUrl = "akka://spark@%s:%s/user/%s".format(
      System.getProperty("spark.driver.host"), System.getProperty("spark.driver.port"),
      StandaloneSchedulerBackend.ACTOR_NAME)
    val args = Seq(driverUrl, "{{EXECUTOR_ID}}", "{{HOSTNAME}}", "{{CORES}}")
    // Build the command that the ExecutorRunner on each worker executes;
    // in effect it runs StandaloneExecutorBackend
    val command = Command(
      "org.apache.spark.executor.StandaloneExecutorBackend", args, sc.executorEnvs)
    val sparkHome = sc.getSparkHome().getOrElse(null)
    // Build the application description
    val appDesc = new ApplicationDescription(appName, maxCores, executorMemory, command, sparkHome,
      "http://" + sc.ui.appUIAddress)

    // Create the Client actor and start it
    client = new Client(sc.env.actorSystem, master, appDesc, this)
    client.start()
  }

  override def stop() {
    stopping = true
    super.stop()
    client.stop()
    if (shutdownCallback != null) {
      shutdownCallback(this)
    }
  }
}
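The args passed to Command contain three {{...}} placeholders, which only make sense once a concrete executor exists. The sketch below shows one plausible way to build the driver URL and fill the placeholders in. `LaunchArgs` and its methods are hypothetical, and the assumption that substitution happens on the worker side just before the process is launched is mine; only the format string and the placeholder names come from the excerpt.

```scala
object LaunchArgs {
  // Same format string as in start(); host, port and actor name are passed in
  // instead of being read from system properties.
  def driverUrl(host: String, port: String, actorName: String): String =
    "akka://spark@%s:%s/user/%s".format(host, port, actorName)

  // Replace the {{...}} placeholders with the values of one concrete executor.
  // Arguments without placeholders (such as the driver URL) pass through unchanged.
  def substitute(arg: String, executorId: String, hostname: String, cores: Int): String =
    arg.replace("{{EXECUTOR_ID}}", executorId)
      .replace("{{HOSTNAME}}", hostname)
      .replace("{{CORES}}", cores.toString)
}
```

Keeping the placeholders in the application description means one Command can describe every executor of the application, with the per-executor values filled in late.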
