Spark 3.0 Application Scheduling Algorithm Explained

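The application scheduling algorithm differs noticeably between Spark versions. From Spark 1.3.0 to Spark 1.6.1, then Spark 2.0, and now the latest Spark 3.0, it has been revised several times. Let's walk through the application scheduling mechanism of the latest version, Spark 3.0, as implemented in the standalone Master.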

private def startExecutorsOnWorkers(): Unit = {
  // Right now this is a very simple FIFO scheduler. We keep trying to fit in the first app
  // in the queue, then the second app, etc.
  for (app <- waitingApps) {
    // If the number of CPU cores per executor was specified at spark-submit time,
    // each executor is allocated that many cores;
    // otherwise 1 is used here as the minimum allocation unit.
    val coresPerExecutor = app.desc.coresPerExecutor.getOrElse(1)
    // If the cores left is less than the coresPerExecutor, the cores left will not be allocated
    // The cores this app still needs must not be fewer than the cores required to start one executor
    if (app.coresLeft >= coresPerExecutor) {
      // Filter out workers that don't have enough resources to launch an executor
      // Keep only workers that are ALIVE and can still launch an executor,
      // sorted by free CPU cores in descending order
      val usableWorkers = workers.toArray.filter(_.state == WorkerState.ALIVE)
        .filter(canLaunchExecutor(_, app.desc))
        .sortBy(_.coresFree).reverse
      if (waitingApps.length == 1 && usableWorkers.isEmpty) {
        logWarning(s"App ${app.id} requires more resource than any of Workers could have.")
      }

      // spreadOutApps is enabled by default: the executors an application needs
      // are spread across as many workers as possible

      val assignedCores = scheduleExecutorsOnWorkers(app, usableWorkers, spreadOutApps)

      // Now that we've decided how many cores to allocate on each worker, let's allocate them
      for (pos <- 0 until usableWorkers.length if assignedCores(pos) > 0) {
        allocateWorkerResourceToExecutors(
          app, assignedCores(pos), app.desc.coresPerExecutor, usableWorkers(pos))
      }
    }
  }
}
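The worker selection step above can be tried in isolation. The following is a minimal sketch, not Spark's code: `SimpleWorker` and `UsableWorkers.select` are hypothetical names introduced for illustration, and only cores and memory are modelled (custom resources are ignored).

```scala
// Hypothetical, simplified stand-in for Spark's WorkerInfo (illustration only).
final case class SimpleWorker(id: String, alive: Boolean, coresFree: Int, memoryFreeMb: Int)

object UsableWorkers {
  // Keep ALIVE workers that can hold at least one executor,
  // ordered by free cores, largest first.
  def select(
      workers: Seq[SimpleWorker],
      coresPerExecutor: Int,
      memoryPerExecutorMb: Int): Seq[SimpleWorker] =
    workers
      .filter(_.alive)
      .filter(w => w.coresFree >= coresPerExecutor && w.memoryFreeMb >= memoryPerExecutorMb)
      .sortBy(_.coresFree)
      .reverse
}
```

With workers a (4 free cores), b (16 free cores but not alive), c (8 free cores) and d (1 free core), and an executor size of 2 cores, only c and a remain, in that order.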
Checking whether a worker can launch an executor:

private def canLaunchExecutor(worker: WorkerInfo, desc: ApplicationDescription): Boolean = {
  canLaunch(
    worker,
    desc.memoryPerExecutorMB,
    desc.coresPerExecutor.getOrElse(1),
    desc.resourceReqsPerExecutor)
}
Let's look at the canLaunch method it calls:

private def canLaunch(
    worker: WorkerInfo,
    memoryReq: Int,
    coresReq: Int,
    resourceRequirements: Seq[ResourceRequirement])
  : Boolean = {
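  // Free memory on the worker must be at least the requested memory
  val enoughMem = worker.memoryFree >= memoryReq
  // Free cores on the worker must be at least the requested cores
  val enoughCores = worker.coresFree >= coresReq
  // The worker's free custom resources must satisfy what the executor requests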
  val enoughResources = ResourceUtils.resourcesMeetRequirements(
    worker.resourcesAmountFree, resourceRequirements)
  enoughMem && enoughCores && enoughResources
}
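The custom resource check is delegated to ResourceUtils.resourcesMeetRequirements, whose body is not shown here. As a rough sketch of what such a check could look like (an assumption for illustration, the actual Spark implementation may differ), every requirement must be covered by the free amount of the resource with the same name. `ResourceReq` and `ResourceCheck.meets` are hypothetical names.

```scala
// Hypothetical stand-in for Spark's ResourceRequirement (illustration only).
final case class ResourceReq(resourceName: String, amount: Int)

object ResourceCheck {
  // True when every requested resource is available in at least the requested amount.
  // A resource that the worker does not report at all counts as 0 free.
  def meets(free: Map[String, Int], reqs: Seq[ResourceReq]): Boolean =
    reqs.forall(r => free.getOrElse(r.resourceName, 0) >= r.amount)
}
```

For example, a worker with 2 free GPUs satisfies a request for 1 GPU but not a request for 4, and an application with no custom resource requirements is always satisfied.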

Back to scheduleExecutorsOnWorkers from above:

private def scheduleExecutorsOnWorkers(
    app: ApplicationInfo,
    usableWorkers: Array[WorkerInfo],
    spreadOutApps: Boolean): Array[Int] = {
  val coresPerExecutor = app.desc.coresPerExecutor
  val minCoresPerExecutor = coresPerExecutor.getOrElse(1)
  // By default oneExecutorPerWorker is in effect, i.e. only one executor is launched per worker.
  // If coresPerExecutor is set at spark-submit time, multiple executors can be launched
  // on each worker as long as the worker has enough resources.
  val oneExecutorPerWorker = coresPerExecutor.isEmpty
  val memoryPerExecutor = app.desc.memoryPerExecutorMB
  val resourceReqsPerExecutor = app.desc.resourceReqsPerExecutor
  val numUsable = usableWorkers.length
  val assignedCores = new Array[Int](numUsable) // Number of cores to give to each worker
  val assignedExecutors = new Array[Int](numUsable) // Number of new executors on each worker
  var coresToAssign = math.min(app.coresLeft, usableWorkers.map(_.coresFree).sum)

  // Check whether the worker at this position can launch an executor for this app
  def canLaunchExecutorForApp(pos: Int): Boolean = {

    val keepScheduling = coresToAssign >= minCoresPerExecutor
    val enoughCores = usableWorkers(pos).coresFree - assignedCores(pos) >= minCoresPerExecutor
    val assignedExecutorNum = assignedExecutors(pos)

    // If we allow multiple executors per worker, then we can always launch new executors.
    // Otherwise, if there is already an executor on this worker, just give it more cores.

    // A new executor is launched if coresPerExecutor was set at spark-submit time,
    // or if this worker has not yet been assigned an executor for this application
    val launchingNewExecutor = !oneExecutorPerWorker || assignedExecutorNum == 0
    // A new executor can be launched
    if (launchingNewExecutor) {
      val assignedMemory = assignedExecutorNum * memoryPerExecutor
      val enoughMemory = usableWorkers(pos).memoryFree - assignedMemory >= memoryPerExecutor
      val assignedResources = resourceReqsPerExecutor.map {
        req => req.resourceName -> req.amount * assignedExecutorNum
      }.toMap
      val resourcesFree = usableWorkers(pos).resourcesAmountFree.map {
        case (rName, free) => rName -> (free - assignedResources.getOrElse(rName, 0))
      }
      val enoughResources = ResourceUtils.resourcesMeetRequirements(
        resourcesFree, resourceReqsPerExecutor)
      val underLimit = assignedExecutors.sum + app.executors.size < app.executorLimit
      keepScheduling && enoughCores && enoughMemory && enoughResources && underLimit
    } else {
      // We're adding cores to an existing executor, so no need
      // to check memory and executor limits
      // No new executor is launched here; more cores are added to the existing executor
      keepScheduling && enoughCores
    }
  }

  // Keep launching executors until no more workers can accommodate any
  // more executors, or if we have reached this application's limits

  var freeWorkers = (0 until numUsable).filter(canLaunchExecutorForApp)
  while (freeWorkers.nonEmpty) {
    freeWorkers.foreach { pos =>
      var keepScheduling = true
      while (keepScheduling && canLaunchExecutorForApp(pos)) {
        coresToAssign -= minCoresPerExecutor
        assignedCores(pos) += minCoresPerExecutor

        // If we are launching one executor per worker, then every iteration assigns 1 core
        // to the executor. Otherwise, every iteration assigns cores to a new executor.
        if (oneExecutorPerWorker) {
          // One executor per worker: every iteration adds minCoresPerExecutor cores
          // (1 in this case, since coresPerExecutor is unset) to that single executor
          assignedExecutors(pos) = 1
        } else {
          // Multiple executors per worker: every iteration assigns minCoresPerExecutor cores
          // (the coresPerExecutor value configured via spark-submit) to a new executor
          assignedExecutors(pos) += 1
        }

        // Spreading out an application means spreading out its executors across as
        // many workers as possible. If we are not spreading out, then we should keep
        // scheduling executors on this worker until we use all of its resources.
        // Otherwise, just move on to the next worker.
        if (spreadOutApps) {
          // Setting keepScheduling = false means this worker receives only one allocation
          // per pass; scheduling then moves on to the next worker until the pass is complete
          keepScheduling = false
        }
      }
    }
    freeWorkers = freeWorkers.filter(canLaunchExecutorForApp)
  }
  // Return the number of CPU cores assigned on each worker
  assignedCores
}
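To see the effect of spreadOutApps, the core loop can be reduced to a small simulation. This is a simplified sketch rather than Spark's implementation: `SpreadOutDemo.schedule` is a hypothetical helper that tracks cores only, and the memory, custom resource and executor limit checks of the real method are deliberately left out.

```scala
object SpreadOutDemo {
  // coresFree: free cores per usable worker; coresWanted: cores the app still needs.
  // Returns the number of cores assigned on each worker.
  def schedule(
      coresFree: Array[Int],
      coresWanted: Int,
      coresPerExecutor: Option[Int],
      spreadOut: Boolean): Array[Int] = {
    val minCores = coresPerExecutor.getOrElse(1)
    val assigned = new Array[Int](coresFree.length)
    var coresToAssign = math.min(coresWanted, coresFree.sum)

    // Cores-only version of canLaunchExecutorForApp.
    def canLaunch(pos: Int): Boolean =
      coresToAssign >= minCores && coresFree(pos) - assigned(pos) >= minCores

    var free = coresFree.indices.filter(canLaunch)
    while (free.nonEmpty) {
      free.foreach { pos =>
        var keepScheduling = true
        while (keepScheduling && canLaunch(pos)) {
          coresToAssign -= minCores
          assigned(pos) += minCores
          // Spreading out: one allocation per worker per pass, then move on.
          if (spreadOut) keepScheduling = false
        }
      }
      free = free.filter(canLaunch)
    }
    assigned
  }
}
```

With three workers of 8 free cores each and an application that needs 12 cores, `schedule(Array(8, 8, 8), 12, None, spreadOut = true)` yields 4 cores on every worker, while `spreadOut = false` fills the first worker before touching the next and yields 8, 4 and 0.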

Next, let's analyze allocateWorkerResourceToExecutors:

private def allocateWorkerResourceToExecutors(
    app: ApplicationInfo,
    assignedCores: Int,
    coresPerExecutor: Option[Int],
    worker: WorkerInfo): Unit = {
  // If the number of cores per executor is specified, we divide the cores assigned
  // to this worker evenly among the executors with no remainder.
  // Otherwise, we launch a single executor that grabs all the assignedCores on this worker.
  val numExecutors = coresPerExecutor.map { assignedCores / _ }.getOrElse(1)
  val coresToAssign = coresPerExecutor.getOrElse(assignedCores)
  for (i <- 1 to numExecutors) {
    val allocated = worker.acquireResources(app.desc.resourceReqsPerExecutor)
    // Register one more executor for this application
    val exec = app.addExecutor(worker, coresToAssign, allocated)
    // Tell the worker to launch the executor
    launchExecutor(worker, exec)
    app.state = ApplicationState.RUNNING
  }
}
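The arithmetic in the first two lines of this method is easy to check by hand. The sketch below reproduces only that calculation; `ExecutorSplit.split` is a hypothetical helper introduced for illustration and does not exist in Spark.

```scala
object ExecutorSplit {
  // Returns (number of executors to launch on the worker, cores given to each one).
  def split(assignedCores: Int, coresPerExecutor: Option[Int]): (Int, Int) = {
    // With coresPerExecutor set, the assigned cores are divided evenly among executors;
    // otherwise a single executor takes all cores assigned on this worker.
    val numExecutors = coresPerExecutor.map(assignedCores / _).getOrElse(1)
    val coresEach = coresPerExecutor.getOrElse(assignedCores)
    (numExecutors, coresEach)
  }
}
```

For a worker that was assigned 12 cores, `split(12, Some(4))` gives 3 executors with 4 cores each, while `split(12, None)` gives a single executor holding all 12 cores.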
OK, that wraps up the analysis of the application scheduling algorithm in Spark 3.0, the latest Spark version!
