SchedulerBackend, 两个任务, 申请资源和task执行和管理

对于SparkDeploySchedulerBackend, 基于actor模式, 主要就是启动和管理两个actor
Deploy.Client Actor, 负责资源申请, 在SparkDeploySchedulerBackend初始化的时候就会被创建, 然后Client会去到Master上注册, 最终完成在Worker上的ExecutorBackend的创建(参考, Spark源码分析 – Deploy), 并且这些ExecutorBackend都会被注册到Driver Actor上
Driver Actor, 负责task的执行
由于Spark是原先基于Mesos的, 然后为了兼容性才提供Standalone模式, 所以你可以看到Driver Actor中的接口都是mesos风格的, 在mesos的情况下应该是动态的申请资源, 然后执行task (猜测, 还没有看源码)
但对于coarse-grained Mesos mode和Spark's standalone deploy mode, 这步被简化成当TaskScheduler初始化的时候, 直接就将资源分配好了, 然后Driver Actor只是负责调度task在这些executor上执行
所以在makeOffers的注释上, 写的是Make fake resource offers, 因为这里其实没有真正的offer resources
关于Driver Actor如何调用task去执行, 关键在scheduler.resourceOffers

SchedulerBackend

package org.apache.spark.scheduler.cluster
/**
* A backend interface for cluster scheduling systems that allows plugging in different ones under
* ClusterScheduler. We assume a Mesos-like model where the application gets resource offers as
* machines become available and can launch tasks on them.
*/
private[spark] trait SchedulerBackend {
def start(): Unit
def stop(): Unit
def reviveOffers(): Unit
def defaultParallelism(): Int // Memory used by each executor (in megabytes)
protected val executorMemory: Int = SparkContext.executorMemoryRequested // TODO: Probably want to add a killTask too
}

 

StandaloneSchedulerBackend

用于coarse-grained Mesos mode和Spark's standalone deploy mode

可用看到主要目的, 就是创建并维护driverActor

主要的逻辑都在driverActor 中

/**
* A standalone scheduler backend, which waits for standalone executors to connect to it through
* Akka. These may be executed in a variety of ways, such as Mesos tasks for the coarse-grained
* Mesos mode or standalone processes for Spark's standalone deploy mode (spark.deploy.*).
*/
private[spark]
class StandaloneSchedulerBackend(scheduler: ClusterScheduler, actorSystem: ActorSystem)
extends SchedulerBackend with Logging
{
// Use an atomic variable to track total number of cores in the cluster for simplicity and speed
var totalCoreCount = new AtomicInteger(0) class DriverActor(sparkProperties: Seq[(String, String)]) extends Actor {
  // ……后面分析
} var driverActor: ActorRef = null
val taskIdsOnSlave = new HashMap[String, HashSet[String]] override def start() {
val properties = new ArrayBuffer[(String, String)]
val iterator = System.getProperties.entrySet.iterator
while (iterator.hasNext) {
val entry = iterator.next
val (key, value) = (entry.getKey.toString, entry.getValue.toString)
if (key.startsWith("spark.") && !key.equals("spark.hostPort")) {
properties += ((key, value))
}
}
driverActor = actorSystem.actorOf( // 关键就是创建driverActor
Props(new DriverActor(properties)), name = StandaloneSchedulerBackend.ACTOR_NAME)
} private val timeout = Duration.create(System.getProperty("spark.akka.askTimeout", "10").toLong, "seconds") override def stop() {
try {
if (driverActor != null) {
val future = driverActor.ask(StopDriver)(timeout) // 关闭driverActor
Await.result(future, timeout)
}
} catch {
case e: Exception =>
throw new SparkException("Error stopping standalone scheduler's driver actor", e)
}
} override def reviveOffers() {
driverActor ! ReviveOffers // 发送ReviveOffers event给driverActor
} override def defaultParallelism() = Option(System.getProperty("spark.default.parallelism"))
.map(_.toInt).getOrElse(math.max(totalCoreCount.get(), 2)) // Called by subclasses when notified of a lost worker
def removeExecutor(executorId: String, reason: String) {
try {
val future = driverActor.ask(RemoveExecutor(executorId, reason))(timeout)
Await.result(future, timeout)
} catch {
case e: Exception =>
throw new SparkException("Error notifying standalone scheduler's driver actor", e)
}
}
}

DriverActor

关键的函数, makeOffers, 在executors上launch tasks, 什么时候调用?

RegisterExecutor的时候,

Task StatusUpdate的时候,

收到ReviveOffers event的时候, 新的task被submit的时候, delay scheduling被触发的时候(per second)

关于delay scheduling, 应该是为了保持活度, 当没有任何状态变化时, 仍然需要继续保持launch tasks

  class DriverActor(sparkProperties: Seq[(String, String)]) extends Actor {
private val executorActor = new HashMap[String, ActorRef] // track所有executorActor Ref
private val executorAddress = new HashMap[String, Address]
private val executorHost = new HashMap[String, String]
private val freeCores = new HashMap[String, Int]
private val actorToExecutorId = new HashMap[ActorRef, String]
private val addressToExecutorId = new HashMap[Address, String] override def preStart() {
// Listen for remote client disconnection events, since they don't go through Akka's watch()
context.system.eventStream.subscribe(self, classOf[RemoteClientLifeCycleEvent]) // Periodically revive offers to allow delay scheduling to work
val reviveInterval = System.getProperty("spark.scheduler.revive.interval", "1000").toLong
context.system.scheduler.schedule(0.millis, reviveInterval.millis, self, ReviveOffers)
} def receive = {
case RegisterExecutor(executorId, hostPort, cores) => // 接收从StandaloneExecutorBackend发来的RegisterExecutor
Utils.checkHostPort(hostPort, "Host port expected " + hostPort)
if (executorActor.contains(executorId)) {
sender ! RegisterExecutorFailed("Duplicate executor ID: " + executorId)
} else {
logInfo("Registered executor: " + sender + " with ID " + executorId)
sender ! RegisteredExecutor(sparkProperties)
context.watch(sender) // watch executor actor
executorActor(executorId) = sender
executorHost(executorId) = Utils.parseHostPort(hostPort)._1
freeCores(executorId) = cores
executorAddress(executorId) = sender.path.address
actorToExecutorId(sender) = executorId
addressToExecutorId(sender.path.address) = executorId
totalCoreCount.addAndGet(cores)
makeOffers()
} case StatusUpdate(executorId, taskId, state, data) =>
scheduler.statusUpdate(taskId, state, data.value)
if (TaskState.isFinished(state)) {
freeCores(executorId) += 1
makeOffers(executorId)
} case ReviveOffers => // 接收从StandaloneSchedulerBackend发来的ReviveOffers
makeOffers() case StopDriver =>
sender ! true
context.stop(self) case RemoveExecutor(executorId, reason) =>
removeExecutor(executorId, reason)
sender ! true case Terminated(actor) =>
actorToExecutorId.get(actor).foreach(removeExecutor(_, "Akka actor terminated")) case RemoteClientDisconnected(transport, address) =>
addressToExecutorId.get(address).foreach(removeExecutor(_, "remote Akka client disconnected")) case RemoteClientShutdown(transport, address) =>
addressToExecutorId.get(address).foreach(removeExecutor(_, "remote Akka client shutdown"))
} // Make fake resource offers on all executors
def makeOffers() {
launchTasks(scheduler.resourceOffers(
executorHost.toArray.map {case (id, host) => new WorkerOffer(id, host, freeCores(id))}))
} // Make fake resource offers on just one executor
// 可以看到这里传给scheduler.resourceOffers的WorkOffer,是根据之前已经分布好的executor静态生成的
// 而不是动态得到的workeroffer, 如果用mesos, 这里应该是动态获取workeroffer, 然后传给scheduler.resourceOffers
def makeOffers(executorId: String) {
launchTasks(scheduler.resourceOffers(
Seq(new WorkerOffer(executorId, executorHost(executorId), freeCores(executorId)))))
} // Launch tasks returned by a set of resource offers
def launchTasks(tasks: Seq[Seq[TaskDescription]]) {
for (task <- tasks.flatten) {
freeCores(task.executorId) -= 1
executorActor(task.executorId) ! LaunchTask(task) // launch就是给executorActor发送LaunchTask event
}
} // Remove a disconnected slave from the cluster
def removeExecutor(executorId: String, reason: String) {
if (executorActor.contains(executorId)) {
logInfo("Executor " + executorId + " disconnected, so removing it")
val numCores = freeCores(executorId)
actorToExecutorId -= executorActor(executorId)
addressToExecutorId -= executorAddress(executorId)
executorActor -= executorId
executorHost -= executorId
freeCores -= executorId
totalCoreCount.addAndGet(-numCores)
scheduler.executorLost(executorId, SlaveLost(reason))
}
}
}

 

SparkDeploySchedulerBackend

关键就是创建和管理Driver and Client Actor

private[spark] class SparkDeploySchedulerBackend(
scheduler: ClusterScheduler,
sc: SparkContext,
master: String,
appName: String)
extends StandaloneSchedulerBackend(scheduler, sc.env.actorSystem)
with ClientListener
with Logging {
var client: Client = null
  override def start() {
super.start() // 调用StandaloneSchedulerBackend的start,创建DriverActor // The endpoint for executors to talk to us
val driverUrl = "akka://spark@%s:%s/user/%s".format(
System.getProperty("spark.driver.host"), System.getProperty("spark.driver.port"),
StandaloneSchedulerBackend.ACTOR_NAME)
val args = Seq(driverUrl, "{{EXECUTOR_ID}}", "{{HOSTNAME}}", "{{CORES}}")
val command = Command( // 生成worker中ExecutorRunner中执行的command, 其实就是运行StandaloneExecutorBackend
"org.apache.spark.executor.StandaloneExecutorBackend", args, sc.executorEnvs)
val sparkHome = sc.getSparkHome().getOrElse(null)
val appDesc = new ApplicationDescription(appName, maxCores, executorMemory, command, sparkHome,
"http://" + sc.ui.appUIAddress) // 生成application description client = new Client(sc.env.actorSystem, master, appDesc, this) // 创建Client Actor, 并start
client.start()
} override def stop() {
stopping = true
super.stop()
client.stop()
if (shutdownCallback != null) {
shutdownCallback(this)
}
}
}

Spark源码分析 – SchedulerBackend的更多相关文章

  1. Spark源码分析 – 汇总索引

    http://jerryshao.me/categories.html#architecture-ref http://blog.csdn.net/pelick/article/details/172 ...

  2. Spark源码分析 -- TaskScheduler

    Spark在设计上将DAGScheduler和TaskScheduler完全解耦合, 所以在资源管理和task调度上可以有更多的方案 现在支持, LocalSheduler, ClusterSched ...

  3. Spark源码分析(三)-TaskScheduler创建

    原创文章,转载请注明: 转载自http://www.cnblogs.com/tovin/p/3879151.html 在SparkContext创建过程中会调用createTaskScheduler函 ...

  4. Spark源码分析:多种部署方式之间的区别与联系(转)

    原文链接:Spark源码分析:多种部署方式之间的区别与联系(1) 从官方的文档我们可以知道,Spark的部署方式有很多种:local.Standalone.Mesos.YARN.....不同部署方式的 ...

  5. Spark源码分析之七:Task运行(一)

    在Task调度相关的两篇文章<Spark源码分析之五:Task调度(一)>与<Spark源码分析之六:Task调度(二)>中,我们大致了解了Task调度相关的主要逻辑,并且在T ...

  6. Spark源码分析之五:Task调度(一)

    在前四篇博文中,我们分析了Job提交运行总流程的第一阶段Stage划分与提交,它又被细化为三个分阶段: 1.Job的调度模型与运行反馈: 2.Stage划分: 3.Stage提交:对应TaskSet的 ...

  7. spark 源码分析之四 -- TaskScheduler的创建和启动过程

    在 spark 源码分析之二 -- SparkContext 的初始化过程 中,第 14 步 和 16 步分别描述了 TaskScheduler的 初始化 和 启动过程. 话分两头,先说 TaskSc ...

  8. spark 源码分析之十九 -- Stage的提交

    引言 上篇 spark 源码分析之十九 -- DAG的生成和Stage的划分 中,主要介绍了下图中的前两个阶段DAG的构建和Stage的划分. 本篇文章主要剖析,Stage是如何提交的. rdd的依赖 ...

  9. spark 源码分析之二十一 -- Task的执行流程

    引言 在上两篇文章 spark 源码分析之十九 -- DAG的生成和Stage的划分 和 spark 源码分析之二十 -- Stage的提交 中剖析了Spark的DAG的生成,Stage的划分以及St ...

随机推荐

  1. Java的==与equals之辨,简单解释,很清楚

    "=="和equals方法究竟有什么区别? (单独把一个东西说清楚,然后再说清楚另一个,这样,它们的区别自然就出来了,混在一起说,则很难说清楚) ==操作符专门用来比较两个变量的值 ...

  2. 导出word功能,用html代码在word中插入分页符

    <span lang=EN-US style="font-size:10.5pt;mso-bidi-font-size:12.0pt;font-family:" mce_st ...

  3. Django接受ajax传过来的数组

    $.ajax({ cache: false, type: "POST", url: "/userdelete/", traditional:true, //加上 ...

  4. scp采用无密码在两台linux服务器之间传输数据

    一.root用户: 1. 在主机A上执行如下命令来生成配对密钥: ssh-keygen -t rsa 按照提示操作,注意,不要输入passphrase.提示信息如下 Generating public ...

  5. 重启php

    注意这是重启php,不是重启apache service php-fpm restart

  6. C++ 匿名对象初始化新对象

    //c++中匿名对象初始化新对象 #include<iostream> using namespace std; class Point{ public: Point(){ cout &l ...

  7. python 人脸识别

    """Performs face alignment and calculates L2 distance between the embeddings of image ...

  8. 第二百七十一节,Tornado框架-CSRF防止跨站post请求伪造

    Tornado框架-CSRF防止跨站post请求伪造 CSRF是什么 CSRF是用来在post请求时做请求验证的,防止跨站post请求伪造 当用户访问一个表单页面时,会自动在表单添加一个隐藏的inpu ...

  9. java-I/O File类(5)-Reader和Writer、OutputStreamWriter 、BufferedWriter、字节流和字符流的区别

      标签: outputstreamwriterreader字符file方法 2015-05-14 23:06 469人阅读 评论(0) 收藏 举报  分类: 孙鑫-java基础(16)  I-O(4 ...

  10. 让div也出现滑动框。

    <div id="box" style="height: 300px; width: 200px; border:1px solid #CCC; overflow: ...