Spark 1.1.0 source code reading - taskScheduler
1. Setting up createTaskScheduler in SparkContext
case "yarn-standalone" | "yarn-cluster" =>
if (master == "yarn-standalone") {
logWarning(
"\"yarn-standalone\" is deprecated as of Spark 1.0. Use \"yarn-cluster\" instead.")
}
val scheduler = try {
val clazz = Class.forName("org.apache.spark.scheduler.cluster.YarnClusterScheduler")
val cons = clazz.getConstructor(classOf[SparkContext])
cons.newInstance(sc).asInstanceOf[TaskSchedulerImpl]
} catch {
// TODO: Enumerate the exact reasons why it can fail
// But irrespective of it, it means we cannot proceed !
case e: Exception => {
throw new SparkException("YARN mode not available ?", e)
}
}
val backend = new CoarseGrainedSchedulerBackend(scheduler, sc.env.actorSystem)
  scheduler.initialize(backend) // call the implementation class's initialize method
  scheduler
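YarnClusterScheduler is created by reflection here so that spark-core does not need a compile-time dependency on the YARN module; the class is only resolved when a YARN master is requested. A hypothetical helper showing the same pattern in isolation (the name tryInstantiate is made up for illustration):

// Hypothetical helper: instantiate a scheduler class only if it is actually on
// the classpath, instead of referencing it at compile time.
def tryInstantiate[T](className: String, sc: SparkContext): Option[T] = {
  try {
    val clazz = Class.forName(className)
    val cons = clazz.getConstructor(classOf[SparkContext])
    Some(cons.newInstance(sc).asInstanceOf[T])
  } catch {
    case _: ClassNotFoundException => None
  }
}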
In TaskSchedulerImpl.scala:
def initialize(backend: SchedulerBackend) {
  this.backend = backend
  // temporarily set rootPool name to empty
  rootPool = new Pool("", schedulingMode, 0, 0)
  schedulableBuilder = {
    schedulingMode match {
      case SchedulingMode.FIFO =>
        new FIFOSchedulableBuilder(rootPool)
      case SchedulingMode.FAIR =>
        new FairSchedulableBuilder(rootPool, conf)
    }
  }
  schedulableBuilder.buildPools()
}
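The schedulingMode matched on above is read from the spark.scheduler.mode configuration, which defaults to FIFO. Roughly, TaskSchedulerImpl derives it like this (a paraphrased sketch, not a verbatim copy of the 1.1.0 source):

// Paraphrased sketch: schedulingMode is parsed from spark.scheduler.mode (default "FIFO")
private val schedulingModeConf = conf.get("spark.scheduler.mode", "FIFO")
val schedulingMode: SchedulingMode = try {
  SchedulingMode.withName(schedulingModeConf.toUpperCase)
} catch {
  case e: NoSuchElementException =>
    throw new SparkException("Unrecognized spark.scheduler.mode: " + schedulingModeConf)
}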
2. submitTasks
override def submitTasks(taskSet: TaskSet) {
  val tasks = taskSet.tasks
  logInfo("Adding task set " + taskSet.id + " with " + tasks.length + " tasks")
  this.synchronized {
    val manager = new TaskSetManager(this, taskSet, maxTaskFailures)
    activeTaskSets(taskSet.id) = manager
    schedulableBuilder.addTaskSetManager(manager, manager.taskSet.properties)
    if (!isLocal && !hasReceivedTask) {
      starvationTimer.scheduleAtFixedRate(new TimerTask() {
        override def run() {
          if (!hasLaunchedTask) {
            logWarning("Initial job has not accepted any resources; " +
              "check your cluster UI to ensure that workers are registered " +
              "and have sufficient memory")
          } else {
            this.cancel()
          }
        }
      }, STARVATION_TIMEOUT, STARVATION_TIMEOUT)
    }
    hasReceivedTask = true
  }
  backend.reviveOffers()
}
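For context, submitTasks is not called by user code directly: any RDD action in a driver program triggers DAGScheduler, which builds one TaskSet per stage and hands it to the task scheduler. A minimal hypothetical driver program (the master, e.g. yarn-cluster, is supplied via spark-submit):

import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical minimal driver. count() is an action, so DAGScheduler submits a
// TaskSet per stage, which ends up in TaskSchedulerImpl.submitTasks shown above.
object SubmitTasksDemo {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("SubmitTasksDemo"))
    // 4 partitions -> the TaskSet for this stage contains 4 tasks
    println(sc.parallelize(1 to 100, 4).count())
    sc.stop()
  }
}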
3. reviveOffers in CoarseGrainedSchedulerBackend
override def reviveOffers() {
  driverActor ! ReviveOffers // send the ReviveOffers message to CoarseGrainedSchedulerBackend's driverActor
}
case ReviveOffers =>
  makeOffers()

// Make fake resource offers on all executors
def makeOffers() {
  launchTasks(scheduler.resourceOffers(
    executorHost.toArray.map {case (id, host) => new WorkerOffer(id, host, freeCores(id))}))
}
/**
 * Represents free resources available on an executor.
 */
private[spark]
case class WorkerOffer(executorId: String, host: String, cores: Int)
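As a hypothetical illustration of what makeOffers() builds: with two registered executors that currently have 4 and 2 free cores, the offers passed to resourceOffers would look like this (executor ids and hosts are made up):

// Hypothetical offers for two registered executors
val offers = Seq(
  new WorkerOffer("executor-1", "host-a", 4),
  new WorkerOffer("executor-2", "host-b", 2))
// scheduler.resourceOffers(offers) returns one Seq[TaskDescription] per offer,
// i.e. the tasks to launch on that executor (possibly empty).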
/**
 * Called by cluster manager to offer resources on slaves. We respond by asking our active task
 * sets for tasks in order of priority. We fill each node with tasks in a round-robin manner so
 * that tasks are balanced across the cluster.
 */
def resourceOffers(offers: Seq[WorkerOffer]): Seq[Seq[TaskDescription]] = synchronized {
  SparkEnv.set(sc.env)

  // Mark each slave as alive and remember its hostname
  for (o <- offers) {
    executorIdToHost(o.executorId) = o.host
    if (!executorsByHost.contains(o.host)) {
      executorsByHost(o.host) = new HashSet[String]()
      executorAdded(o.executorId, o.host)
    }
  }

  // Randomly shuffle offers to avoid always placing tasks on the same set of workers.
  val shuffledOffers = Random.shuffle(offers)
  // Build a list of tasks to assign to each worker.
  val tasks = shuffledOffers.map(o => new ArrayBuffer[TaskDescription](o.cores))
  val availableCpus = shuffledOffers.map(o => o.cores).toArray
  val sortedTaskSets = rootPool.getSortedTaskSetQueue
  for (taskSet <- sortedTaskSets) {
    logDebug("parentName: %s, name: %s, runningTasks: %s".format(
      taskSet.parent.name, taskSet.name, taskSet.runningTasks))
  }

  // Take each TaskSet in our scheduling order, and then offer it each node in increasing order
  // of locality levels so that it gets a chance to launch local tasks on all of them.
  var launchedTask = false
  for (taskSet <- sortedTaskSets; maxLocality <- TaskLocality.values) {
    do {
      launchedTask = false
      for (i <- 0 until shuffledOffers.size) {
        val execId = shuffledOffers(i).executorId
        val host = shuffledOffers(i).host
        if (availableCpus(i) >= CPUS_PER_TASK) {
          for (task <- taskSet.resourceOffer(execId, host, maxLocality)) {
            tasks(i) += task
            val tid = task.taskId
            taskIdToTaskSetId(tid) = taskSet.taskSet.id
            taskIdToExecutorId(tid) = execId
            activeExecutorIds += execId
            executorsByHost(host) += execId
            availableCpus(i) -= CPUS_PER_TASK
            assert(availableCpus(i) >= 0)
            launchedTask = true
          }
        }
      }
    } while (launchedTask)
  }

  if (tasks.size > 0) {
    hasLaunchedTask = true
  }
  return tasks
}
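The nested loops above can be hard to follow. Below is a simplified, self-contained sketch (not Spark code) of the same core-accounting idea: visit the offers round-robin, hand out at most one task per offer per pass, and stop once a full pass launches nothing; tryTask stands in for taskSet.resourceOffer, which may return None (e.g. for locality reasons).

import scala.collection.mutable.ArrayBuffer

// Simplified stand-in for WorkerOffer
case class Offer(executorId: String, host: String, cores: Int)

// Round-robin assignment sketch: one "bucket" of assigned tasks per offer.
def assignRoundRobin(
    offers: Seq[Offer],
    cpusPerTask: Int,
    tryTask: (String, String) => Option[String]): Seq[Seq[String]] = {
  val buckets = offers.map(_ => new ArrayBuffer[String])
  val availableCpus = offers.map(_.cores).toArray
  var launchedTask = true
  while (launchedTask) {
    launchedTask = false
    for (i <- offers.indices if availableCpus(i) >= cpusPerTask) {
      tryTask(offers(i).executorId, offers(i).host).foreach { task =>
        buckets(i) += task              // task assigned to offer i
        availableCpus(i) -= cpusPerTask // charge its cores
        launchedTask = true
      }
    }
  }
  buckets
}

For example, with cpusPerTask = 1, two offers of 2 cores each, and a tryTask that always returns a task, each offer ends up with 2 tasks, assigned alternately rather than filling one executor first.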
4. launchTasks
// Launch tasks returned by a set of resource offers
def launchTasks(tasks: Seq[Seq[TaskDescription]]) {
  for (task <- tasks.flatten) {
    freeCores(task.executorId) -= scheduler.CPUS_PER_TASK
    executorActor(task.executorId) ! LaunchTask(task)
  }
}
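freeCores is only decremented here; the cores come back when the executor reports the task as finished. Roughly, the DriverActor's StatusUpdate handling looks like this (a paraphrased sketch, not a verbatim copy of the 1.1.0 source):

// Paraphrased sketch: when a task reaches a terminal state, return its cores
// to that executor's budget and offer them out again.
case StatusUpdate(executorId, taskId, state, data) =>
  scheduler.statusUpdate(taskId, state, data.value)
  if (TaskState.isFinished(state)) {
    if (executorActor.contains(executorId)) {
      freeCores(executorId) += scheduler.CPUS_PER_TASK
      makeOffers(executorId)
    }
  }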
class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: ActorSystem)
  extends SchedulerBackend with Logging {

  // Use an atomic variable to track total number of cores in the cluster for simplicity and speed
  var totalCoreCount = new AtomicInteger(0)
  val conf = scheduler.sc.conf
  private val timeout = AkkaUtils.askTimeout(conf)

  class DriverActor(sparkProperties: Seq[(String, String)]) extends Actor {
    private val executorActor = new HashMap[String, ActorRef]
    private val executorAddress = new HashMap[String, Address]
    private val executorHost = new HashMap[String, String]
    private val freeCores = new HashMap[String, Int]
    private val totalCores = new HashMap[String, Int]
    private val addressToExecutorId = new HashMap[Address, String]

// Driver to executors
case class LaunchTask(task: TaskDescription) extends CoarseGrainedClusterMessage

private[spark] class TaskDescription(
    val taskId: Long,
    val executorId: String,
    val name: String,
    val index: Int, // Index within this task's TaskSet
    _serializedTask: ByteBuffer)
  extends Serializable {

  // Because ByteBuffers are not serializable, wrap the task in a SerializableBuffer
  private val buffer = new SerializableBuffer(_serializedTask)

  def serializedTask: ByteBuffer = buffer.value

  override def toString: String = "TaskDescription(TID=%d, index=%d)".format(taskId, index)
}
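SerializableBuffer is Spark's small wrapper that lets a ByteBuffer survive Java serialization by writing its length and contents by hand. A simplified sketch of the idea (not the actual org.apache.spark.util.SerializableBuffer):

import java.io.{ObjectInputStream, ObjectOutputStream}
import java.nio.ByteBuffer

// Simplified sketch: a ByteBuffer is not Serializable, so write its length and
// bytes manually and rebuild the buffer on deserialization.
class SerializableBufferSketch(@transient var buffer: ByteBuffer) extends Serializable {
  def value: ByteBuffer = buffer

  private def writeObject(out: ObjectOutputStream): Unit = {
    val bytes = new Array[Byte](buffer.remaining())
    buffer.duplicate().get(bytes) // copy without disturbing the buffer's position
    out.writeInt(bytes.length)
    out.write(bytes)
  }

  private def readObject(in: ObjectInputStream): Unit = {
    val bytes = new Array[Byte](in.readInt())
    in.readFully(bytes)
    buffer = ByteBuffer.wrap(bytes)
  }
}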
5. After receiving an executor's registration, CoarseGrainedSchedulerBackend records the executor
def receive = {
  case RegisterExecutor(executorId, hostPort, cores) =>
    Utils.checkHostPort(hostPort, "Host port expected " + hostPort)
    if (executorActor.contains(executorId)) {
      sender ! RegisterExecutorFailed("Duplicate executor ID: " + executorId)
    } else {
      logInfo("Registered executor: " + sender + " with ID " + executorId)
      sender ! RegisteredExecutor(sparkProperties)
      executorActor(executorId) = sender
      executorHost(executorId) = Utils.parseHostPort(hostPort)._1
      totalCores(executorId) = cores
      freeCores(executorId) = cores
      executorAddress(executorId) = sender.path.address
      addressToExecutorId(sender.path.address) = executorId
      totalCoreCount.addAndGet(cores)
      makeOffers()
    }
The executor first registers itself with CoarseGrainedSchedulerBackend, and CoarseGrainedSchedulerBackend then sends (serialized) tasks to that executor.
6. CoarseGrainedExecutorBackend communicates with CoarseGrainedSchedulerBackend.
private[spark] class CoarseGrainedExecutorBackend(
    driverUrl: String,
    executorId: String,
    hostPort: String,
    cores: Int,
    sparkProperties: Seq[(String, String)])
  extends Actor with ActorLogReceive with ExecutorBackend with Logging {

  Utils.checkHostPort(hostPort, "Expected hostport")

  var executor: Executor = null
  var driver: ActorSelection = null

  override def preStart() {
    logInfo("Connecting to driver: " + driverUrl)
    driver = context.actorSelection(driverUrl)
    driver ! RegisterExecutor(executorId, hostPort, cores) // register with the driver
    context.system.eventStream.subscribe(self, classOf[RemotingLifecycleEvent])
  }

  override def receiveWithLogging = {
    case RegisteredExecutor =>
      logInfo("Successfully registered with driver")
      // Make this host instead of hostPort ?
      executor = new Executor(executorId, Utils.parseHostPort(hostPort)._1, sparkProperties,
        false)

    case RegisterExecutorFailed(message) =>
      logError("Slave registration failed: " + message)
      System.exit(1)

    case LaunchTask(data) => // a task has been received
      if (executor == null) {
        logError("Received LaunchTask command but executor was null")
        System.exit(1)
      } else {
        val ser = SparkEnv.get.closureSerializer.newInstance()
        val taskDesc = ser.deserialize[TaskDescription](data.value)
        logInfo("Got assigned task " + taskDesc.taskId)
        executor.launchTask(this, taskDesc.taskId, taskDesc.name, taskDesc.serializedTask)
      }
7. executor.launchTask
def launchTask(
    context: ExecutorBackend, taskId: Long, taskName: String, serializedTask: ByteBuffer) {
  val tr = new TaskRunner(context, taskId, taskName, serializedTask)
  runningTasks.put(taskId, tr)
  threadPool.execute(tr)
}
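launchTask only registers the TaskRunner in runningTasks and hands it to threadPool, so it returns immediately and the task runs asynchronously on an executor thread. In the Executor, threadPool is roughly a cached thread pool whose threads are daemons; a minimal sketch of that kind of pool (not the Executor's exact construction):

import java.util.concurrent.{Executors, ThreadFactory}

// Minimal sketch: a cached thread pool with daemon threads, so the pool's
// worker threads do not keep the executor JVM alive on their own.
val threadPool = Executors.newCachedThreadPool(new ThreadFactory {
  override def newThread(r: Runnable): Thread = {
    val t = new Thread(r)
    t.setDaemon(true)
    t
  }
})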
To be continued in the next post.