ExecutorBackend

A very simple interface.

package org.apache.spark.executor

import java.nio.ByteBuffer
import org.apache.spark.TaskState.TaskState

/**
 * A pluggable interface used by the Executor to send updates to the cluster scheduler.
 */
private[spark] trait ExecutorBackend {
  def statusUpdate(taskId: Long, state: TaskState, data: ByteBuffer)
}
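To see what "pluggable" means here, the following is a minimal sketch of an alternative backend. It assumes a simplified stand-in for Spark's TaskState enumeration and a copy of the trait without the private[spark] modifier; RecordingBackend and RecordingBackendDemo are hypothetical names that do not exist in Spark.

```scala
import java.nio.ByteBuffer
import scala.collection.mutable.ArrayBuffer

// Simplified stand-in for Spark's TaskState enumeration (assumption for this sketch).
object TaskState extends Enumeration {
  type TaskState = Value
  val LAUNCHING, RUNNING, FINISHED, FAILED, KILLED, LOST = Value
}
import TaskState.TaskState

// Same shape as the trait above, minus the private[spark] modifier.
trait ExecutorBackend {
  def statusUpdate(taskId: Long, state: TaskState, data: ByteBuffer): Unit
}

// Hypothetical backend: records updates in memory instead of sending them to a driver.
class RecordingBackend extends ExecutorBackend {
  val updates = ArrayBuffer.empty[(Long, TaskState)]

  override def statusUpdate(taskId: Long, state: TaskState, data: ByteBuffer): Unit =
    synchronized { updates += ((taskId, state)) }
}

object RecordingBackendDemo {
  def main(args: Array[String]): Unit = {
    val backend = new RecordingBackend
    backend.statusUpdate(1L, TaskState.RUNNING, ByteBuffer.allocate(0))
    backend.statusUpdate(1L, TaskState.FINISHED, ByteBuffer.wrap(Array[Byte](42)))
    backend.updates.foreach { case (id, state) => println("task " + id + " -> " + state) }
  }
}
```

Because the Executor only ever calls statusUpdate, it does not need to know whether the update ends up in an actor message, a log, or a test buffer like this one.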

 

StandaloneExecutorBackend

Maintains the executor, and is responsible for registering it and for the communication between the executor and the driver.

private[spark] class StandaloneExecutorBackend(
    driverUrl: String,
    executorId: String,
    hostPort: String,
    cores: Int)
  extends Actor
  with ExecutorBackend
  with Logging {

  var executor: Executor = null
  var driver: ActorRef = null

  override def preStart() {
    logInfo("Connecting to driver: " + driverUrl)
    driver = context.actorFor(driverUrl) // Create the driver actor ref used to talk to the driver
    driver ! RegisterExecutor(executorId, hostPort, cores) // Register this executor with the driver
  }

  override def receive = {
    case RegisteredExecutor(sparkProperties) =>
      logInfo("Successfully registered with driver")
      // Make this host instead of hostPort ?
      // Once registration succeeds, create the Executor
      executor = new Executor(executorId, Utils.parseHostPort(hostPort)._1, sparkProperties)

    case RegisterExecutorFailed(message) =>
      logError("Slave registration failed: " + message)
      System.exit(1)

    case LaunchTask(taskDesc) =>
      logInfo("Got assigned task " + taskDesc.taskId)
      if (executor == null) {
        logError("Received launchTask but executor was null")
        System.exit(1)
      } else {
        // Call executor.launchTask to start the task
        executor.launchTask(this, taskDesc.taskId, taskDesc.serializedTask)
      }

    case Terminated(_) | RemoteClientDisconnected(_, _) | RemoteClientShutdown(_, _) =>
      logError("Driver terminated or disconnected! Shutting down.")
      System.exit(1)
  }

  override def statusUpdate(taskId: Long, state: TaskState, data: ByteBuffer) {
    // Whenever a task's state changes, report it to the driver actor
    driver ! StatusUpdate(executorId, taskId, state, data)
  }
}
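The registration handshake is easier to follow without the actor plumbing. Below is a rough sketch that models the receive block as a plain function from message to action. It assumes simplified messages (LaunchTask carries only a task id here), and BackendModel, Action, CreateExecutor, RunTask, Exit and HandshakeDemo are hypothetical names introduced for illustration.

```scala
// Simplified messages; the real ones carry more data (assumption for this sketch).
sealed trait Message
case class RegisterExecutor(executorId: String, hostPort: String, cores: Int) extends Message
case class RegisteredExecutor(sparkProperties: Seq[(String, String)]) extends Message
case class RegisterExecutorFailed(message: String) extends Message
case class LaunchTask(taskId: Long) extends Message

// What the backend decides to do in response to a message (hypothetical types).
sealed trait Action
case object CreateExecutor extends Action
case class RunTask(taskId: Long) extends Action
case class Exit(reason: String) extends Action

// Hypothetical pure model of preStart and receive, with no actors involved.
class BackendModel(executorId: String, hostPort: String, cores: Int) {
  private var executorCreated = false

  // preStart: the first thing the backend does is ask the driver to register it.
  def preStart(): Message = RegisterExecutor(executorId, hostPort, cores)

  def handle(msg: Message): Action = msg match {
    case RegisteredExecutor(_) =>
      executorCreated = true
      CreateExecutor
    case RegisterExecutorFailed(m) =>
      Exit("Slave registration failed: " + m)
    case LaunchTask(id) if executorCreated =>
      RunTask(id)
    case LaunchTask(_) =>
      Exit("Received launchTask but executor was null")
    case other =>
      Exit("Unexpected message: " + other)
  }
}

object HandshakeDemo {
  def main(args: Array[String]): Unit = {
    val backend = new BackendModel("exec-1", "host:1234", 4)
    println(backend.preStart())
    println(backend.handle(RegisteredExecutor(Nil)))
    println(backend.handle(LaunchTask(1L)))
  }
}
```

The point of the model is the ordering constraint: a LaunchTask that arrives before RegisteredExecutor is treated as fatal, exactly like the executor == null check above.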

Executor

The Executor maintains a thread pool, so it can run several tasks at once; how many depends on the number of cores.

launchTask therefore just hands a TaskRunner to a thread in that pool.

private[spark] class Executor(
    executorId: String,
    slaveHostname: String,
    properties: Seq[(String, String)])
  extends Logging
{
  // Initialize Spark environment (using system properties read above)
  val env = SparkEnv.createFromSystemProperties(executorId, slaveHostname, 0, false, false)
  SparkEnv.set(env)

  // Start worker thread pool
  val threadPool = new ThreadPoolExecutor(
    1, 128, 600, TimeUnit.SECONDS, new SynchronousQueue[Runnable])

  def launchTask(context: ExecutorBackend, taskId: Long, serializedTask: ByteBuffer) {
    threadPool.execute(new TaskRunner(context, taskId, serializedTask))
  }
}
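The pool is built on a SynchronousQueue, which never holds tasks: every execute call either hands the task to an idle thread, starts a new thread, or is rejected once the maximum is reached. Here is a small sketch of that behavior using only java.util.concurrent; PoolDemo and its run method are hypothetical, and the maximum is a parameter (instead of Spark's 128) so the limit is easy to hit.

```scala
import java.util.concurrent.{CountDownLatch, RejectedExecutionException, SynchronousQueue, ThreadPoolExecutor, TimeUnit}

object PoolDemo {
  // Fills the pool with blocking tasks, then tries one more.
  // Returns (pool size once full, whether the extra task was rejected).
  def run(maxThreads: Int): (Int, Boolean) = {
    val pool = new ThreadPoolExecutor(
      1, maxThreads, 600, TimeUnit.SECONDS, new SynchronousQueue[Runnable])
    val release = new CountDownLatch(1)
    // Each task parks its thread until the latch is released.
    val blocker = new Runnable { override def run(): Unit = release.await() }
    try {
      // No thread is idle, so every submission starts a new thread.
      (1 to maxThreads).foreach(_ => pool.execute(blocker))
      val size = pool.getPoolSize
      // The queue cannot buffer the task and no thread can be added: rejected.
      val rejected =
        try { pool.execute(blocker); false }
        catch { case _: RejectedExecutionException => true }
      (size, rejected)
    } finally {
      release.countDown()
      pool.shutdown()
    }
  }

  def main(args: Array[String]): Unit = {
    println(run(2))
  }
}
```

So with this configuration the pool itself does not queue work; tasks run immediately on their own thread, up to the configured maximum.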

 

TaskRunner

  // Note: ser, attemptedTask, taskStart and startGCTime are defined in code elided from this excerpt
  class TaskRunner(context: ExecutorBackend, taskId: Long, serializedTask: ByteBuffer)
    extends Runnable {

    override def run() {
      try {
        SparkEnv.set(env)
        Accumulators.clear()
        // Deserialize the dependency lists and the raw task bytes
        val (taskFiles, taskJars, taskBytes) = Task.deserializeWithDependencies(serializedTask)
        updateDependencies(taskFiles, taskJars)
        // Deserialize the task itself
        val task = ser.deserialize[Task[Any]](taskBytes, Thread.currentThread.getContextClassLoader)
        attemptedTask = Some(task)
        logInfo("Its epoch is " + task.epoch)
        env.mapOutputTracker.updateEpoch(task.epoch)
        taskStart = System.currentTimeMillis()
        val value = task.run(taskId.toInt) // Call task.run to execute the actual logic
        val taskFinish = System.currentTimeMillis()
        val accumUpdates = Accumulators.values
        // Build the TaskResult and serialize it
        val result = new TaskResult(value, accumUpdates, task.metrics.getOrElse(null))
        val serializedResult = ser.serialize(result)
        logInfo("Serialized size of result for " + taskId + " is " + serializedResult.limit)
        // Report task completion, together with the TaskResult, to the driver via statusUpdate
        context.statusUpdate(taskId, TaskState.FINISHED, serializedResult)
        logInfo("Finished task ID " + taskId)
      } catch { // Handle the various failures; these are also reported to the driver via statusUpdate
        case ffe: FetchFailedException => {
          val reason = ffe.toTaskEndReason
          context.statusUpdate(taskId, TaskState.FAILED, ser.serialize(reason))
        }

        case t: Throwable => {
          val serviceTime = (System.currentTimeMillis() - taskStart).toInt
          val metrics = attemptedTask.flatMap(t => t.metrics)
          for (m <- metrics) {
            m.executorRunTime = serviceTime
            m.jvmGCTime = getTotalGCTime - startGCTime
          }
          val reason = ExceptionFailure(t.getClass.getName, t.toString, t.getStackTrace, metrics)
          context.statusUpdate(taskId, TaskState.FAILED, ser.serialize(reason))

          // TODO: Should we exit the whole executor here? On the one hand, the failed task may
          // have left some weird state around depending on when the exception was thrown, but on
          // the other hand, maybe we could detect that when future tasks fail and exit then.
          logError("Exception in task ID " + taskId, t)
          //System.exit(1)
        }
      }
    }
  }
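Stripped of deserialization, metrics and accumulators, run() boils down to "execute, serialize, report", with every outcome going back through statusUpdate. The sketch below shows just that skeleton. It is not Spark code: states are plain strings, results use Java serialization rather than Spark's serializer, and TaskRunnerSketch, Backend, serialize and SimpleTaskRunner are hypothetical names.

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}
import java.nio.ByteBuffer
import scala.collection.mutable.ArrayBuffer

object TaskRunnerSketch {
  // Simplified stand-in for ExecutorBackend (assumption: states are plain strings).
  trait Backend {
    def statusUpdate(taskId: Long, state: String, data: ByteBuffer): Unit
  }

  // Assumption: plain Java serialization instead of Spark's closure serializer.
  def serialize(value: AnyRef): ByteBuffer = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(value)
    out.close()
    ByteBuffer.wrap(bytes.toByteArray)
  }

  // Hypothetical runner with the same try/catch shape as TaskRunner.run.
  class SimpleTaskRunner(backend: Backend, taskId: Long, body: () => AnyRef) extends Runnable {
    override def run(): Unit = {
      try {
        val value = body() // the "real work" of the task
        backend.statusUpdate(taskId, "FINISHED", serialize(value))
      } catch {
        case t: Throwable =>
          // Failures are reported through the same channel as successes.
          backend.statusUpdate(taskId, "FAILED", serialize(t.toString))
      }
    }
  }

  def main(args: Array[String]): Unit = {
    val seen = ArrayBuffer.empty[(Long, String)]
    val backend = new Backend {
      def statusUpdate(taskId: Long, state: String, data: ByteBuffer): Unit =
        seen += ((taskId, state))
    }
    new SimpleTaskRunner(backend, 1L, () => "ok").run()
    new SimpleTaskRunner(backend, 2L, () => throw new RuntimeException("boom")).run()
    seen.foreach(println)
  }
}
```

As in the real TaskRunner, the runner never throws out of run(), so one failing task does not take down the pool thread or the executor.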
