BlockManagerMaster

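This class merely maintains a set of interfaces to BlockManagerMasterActor; every call goes through tell and askDriverWithReply to fetch data from BlockManagerMasterActor.
As a result, the class adds little value of its own.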

private[spark] class BlockManagerMaster(var driverActor: ActorRef) extends Logging {
  /** Remove a dead executor from the driver actor. This is only called on the driver side. */
  def removeExecutor(execId: String)

  /**
   * Send the driver actor a heart beat from the slave. Returns true if everything works out,
   * false if the driver does not know about the given block manager, which means the block
   * manager should re-register.
   */
  def sendHeartBeat(blockManagerId: BlockManagerId): Boolean

  /** Register the BlockManager's id with the driver. */
  def registerBlockManager(blockManagerId: BlockManagerId, maxMemSize: Long, slaveActor: ActorRef)

  def updateBlockInfo(
      blockManagerId: BlockManagerId,
      blockId: String,
      storageLevel: StorageLevel,
      memSize: Long,
      diskSize: Long): Boolean

  /** Get locations of the blockId from the driver */
  def getLocations(blockId: String): Seq[BlockManagerId]

  /** Get locations of multiple blockIds from the driver */
  def getLocations(blockIds: Array[String]): Seq[Seq[BlockManagerId]]

  /** Get ids of other nodes in the cluster from the driver */
  def getPeers(blockManagerId: BlockManagerId, numPeers: Int): Seq[BlockManagerId]

  /**
   * Remove a block from the slaves that have it. This can only be used to remove
   * blocks that the driver knows about.
   */
  def removeBlock(blockId: String)

  /**
   * Remove all blocks belonging to the given RDD.
   */
  def removeRdd(rddId: Int, blocking: Boolean)

  /**
   * Return the memory status for each block manager, in the form of a map from
   * the block manager's id to two long values. The first value is the maximum
   * amount of memory allocated for the block manager, while the second is the
   * amount of remaining memory.
   */
  def getMemoryStatus: Map[BlockManagerId, (Long, Long)]

  def getStorageStatus: Array[StorageStatus]

  /** Stop the driver actor, called only on the Spark driver node */
  def stop() {
    if (driverActor != null) {
      tell(StopBlockManagerMaster)
      driverActor = null
      logInfo("BlockManagerMaster stopped")
    }
  }

  /** Send a one-way message to the master actor, to which we expect it to reply with true. */
  private def tell(message: Any) {
    if (!askDriverWithReply[Boolean](message)) {
      throw new SparkException("BlockManagerMasterActor returned false, expected true.")
    }
  }

  /**
   * Send a message to the driver actor and get its result within a default timeout, or
   * throw a SparkException if this fails.
   */
  private def askDriverWithReply[T](message: Any): T = {
    // TODO: Consider removing multiple attempts
    if (driverActor == null) {
      throw new SparkException("Error sending message to BlockManager as driverActor is null " +
        "[message = " + message + "]")
    }
    var attempts = 0
    var lastException: Exception = null
    while (attempts < AKKA_RETRY_ATTEMPTS) {
      attempts += 1
      try {
        val future = driverActor.ask(message)(timeout)
        val result = Await.result(future, timeout)
        if (result == null) {
          throw new SparkException("BlockManagerMaster returned null")
        }
        return result.asInstanceOf[T]
      } catch {
        case ie: InterruptedException => throw ie
        case e: Exception =>
          lastException = e
          logWarning("Error sending message to BlockManagerMaster in " + attempts + " attempts", e)
      }
      Thread.sleep(AKKA_RETRY_INTERVAL_MS)
    }
    throw new SparkException(
      "Error sending message to BlockManagerMaster [message = " + message + "]", lastException)
  }
}
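
The retry-then-fail behaviour of tell and askDriverWithReply can be tried out without Akka. The following is only a minimal sketch under stated assumptions: the `send` function stands in for the `ask` plus `Await.result` round trip, and `RetrySketch`, `askWithRetry`, `maxAttempts` and `retryIntervalMs` are hypothetical names, not part of Spark.

```scala
object RetrySketch {
  // Hypothetical stand-ins for AKKA_RETRY_ATTEMPTS and AKKA_RETRY_INTERVAL_MS.
  val maxAttempts = 3
  val retryIntervalMs = 10L

  /** Call `send` up to maxAttempts times and return the first non-null reply. */
  def askWithRetry[T](send: () => Any): T = {
    var attempts = 0
    var lastException: Exception = null
    while (attempts < maxAttempts) {
      attempts += 1
      try {
        val result = send()
        if (result == null) {
          throw new IllegalStateException("reply was null")
        }
        return result.asInstanceOf[T]
      } catch {
        // An interrupt must propagate immediately instead of being retried.
        case ie: InterruptedException => throw ie
        case e: Exception => lastException = e
      }
      Thread.sleep(retryIntervalMs)
    }
    throw new RuntimeException("gave up after " + attempts + " attempts", lastException)
  }

  /** Mirrors tell: a "one-way" message still expects true as an acknowledgement. */
  def tell(send: () => Any): Unit = {
    if (!askWithRetry[Boolean](send)) {
      throw new RuntimeException("expected true, got false")
    }
  }
}
```

A reply of false is not retried: it is a valid reply, so tell fails at once, while exceptions and null replies consume attempts.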

 

BlockManagerInfo

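The BlockManagerMasterActor companion object exists mainly to define BlockManagerInfo.

BlockManagerInfo tracks the BlockStatus of every block held by one BlockManager, together with that BlockManager's heartbeat, and handles updates and removals.

Why is it defined here, of all places?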

private[spark]
object BlockManagerMasterActor {
  case class BlockStatus(storageLevel: StorageLevel, memSize: Long, diskSize: Long)

  class BlockManagerInfo(
      val blockManagerId: BlockManagerId,
      timeMs: Long,
      val maxMem: Long,
      val slaveActor: ActorRef)
    extends Logging {
    private var _remainingMem: Long = maxMem  // Memory size of the BlockManager
    private var _lastSeenMs: Long = timeMs    // Heartbeat of the BlockManager, updated continually

    // Mapping from block id to its status.
    private val _blocks = new JHashMap[String, BlockStatus] // Caches the BlockStatus of every block

    // memSize here defaults to 0 and stands for droppedMemorySize
    def updateBlockInfo(blockId: String, storageLevel: StorageLevel, memSize: Long, diskSize: Long) {
      if (_blocks.containsKey(blockId)) {
        // The block exists on the slave already.
        val originalLevel: StorageLevel = _blocks.get(blockId).storageLevel
        if (originalLevel.useMemory) {
          _remainingMem += memSize
        }
      }

      if (storageLevel.isValid) {
        // isValid means it is either stored in-memory or on-disk.
        _blocks.put(blockId, BlockStatus(storageLevel, memSize, diskSize))
        if (storageLevel.useMemory) {
          _remainingMem -= memSize
        }
      } else if (_blocks.containsKey(blockId)) {
        // If isValid is not true, drop the block.
        val blockStatus: BlockStatus = _blocks.get(blockId)
        _blocks.remove(blockId)
        if (blockStatus.storageLevel.useMemory) {
          _remainingMem += blockStatus.memSize
        }
      }
    }

    def removeBlock(blockId: String) {
      if (_blocks.containsKey(blockId)) {
        _remainingMem += _blocks.get(blockId).memSize
        _blocks.remove(blockId)
      }
    }
  }
}
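
To see how _remainingMem moves, the accounting in updateBlockInfo can be replayed on a stripped-down model. This is a sketch, not Spark code: `Level`, `Status` and `InfoSketch` are hypothetical stand-ins, `isValid` is reduced to "in memory or on disk" as the source comment describes it, and the second call assumes the caller reports the dropped size as memSize, which is how the droppedMemorySize comment above reads.

```scala
import scala.collection.mutable

// Simplified stand-in for StorageLevel: only the flags the accounting reads.
case class Level(useMemory: Boolean, useDisk: Boolean) {
  def isValid: Boolean = useMemory || useDisk
}

case class Status(level: Level, memSize: Long, diskSize: Long)

class InfoSketch(maxMem: Long) {
  private var _remainingMem: Long = maxMem
  private val _blocks = mutable.HashMap.empty[String, Status]

  def remainingMem: Long = _remainingMem
  def contains(blockId: String): Boolean = _blocks.contains(blockId)

  def update(blockId: String, level: Level, memSize: Long, diskSize: Long): Unit = {
    // Step 1: if the block was previously in memory, credit back memSize.
    _blocks.get(blockId).foreach { old =>
      if (old.level.useMemory) _remainingMem += memSize
    }
    if (level.isValid) {
      // Step 2a: record the new status and charge memory if it is used.
      _blocks(blockId) = Status(level, memSize, diskSize)
      if (level.useMemory) _remainingMem -= memSize
    } else {
      // Step 2b: an invalid level means the block is gone; release its memory.
      _blocks.remove(blockId).foreach { old =>
        if (old.level.useMemory) _remainingMem += old.memSize
      }
    }
  }
}

object InfoSketchDemo {
  def main(args: Array[String]): Unit = {
    val info = new InfoSketch(100L)
    info.update("rdd_0_0", Level(useMemory = true, useDisk = false), 30L, 0L)
    println(info.remainingMem) // cached in memory
    info.update("rdd_0_0", Level(useMemory = false, useDisk = true), 30L, 30L)
    println(info.remainingMem) // dropped to disk, memSize carries the dropped size
    info.update("rdd_0_0", Level(useMemory = false, useDisk = false), 0L, 0L)
    println(info.contains("rdd_0_0")) // removed entirely
  }
}
```

Note that step 1 credits the incoming memSize rather than the stored one, so the numbers only balance when the caller passes the size that was actually dropped.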

 

BlockManagerMasterActor

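It maintains the BlockManagerInfo of every slave, plus the locations of every block (that is, which BlockManagers hold it).
Its core job is to manage and update this metadata in response to the following: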

RegisterBlockManager

updateBlockInfo

heartBeat

Removal of an RDD, an executor (that is, its BlockManager), or a block

/**
 * BlockManagerMasterActor is an actor on the master node to track statuses of
 * all slaves' block managers.
 */
private[spark]
class BlockManagerMasterActor(val isLocal: Boolean) extends Actor with Logging {
  // Mapping from block manager id to the block manager's information.
  // Caches the info of every BlockManager.
  private val blockManagerInfo =
    new mutable.HashMap[BlockManagerId, BlockManagerMasterActor.BlockManagerInfo]

  // Mapping from executor ID to block manager ID.
  private val blockManagerIdByExecutor = new mutable.HashMap[String, BlockManagerId]

  // Mapping from block id to the set of block managers that have the block.
  // Caches block locations. A BlockManagerId is used as the location because
  // the corresponding executor can be derived from it.
  private val blockLocations = new JHashMap[String, mutable.HashSet[BlockManagerId]]

  def receive = {
    case RegisterBlockManager(blockManagerId, maxMemSize, slaveActor) =>
      register(blockManagerId, maxMemSize, slaveActor)
      sender ! true // BlockManagerMaster.tell requires a reply of true
    // ... the remaining cases mirror the BlockManagerMaster interface and are omitted here
  }

  // Handles the RegisterBlockManager event, which a slave uses to register its BlockManager with the master.
  // It mainly records the slave's BlockManagerInfo on the master.
  private def register(id: BlockManagerId, maxMemSize: Long, slaveActor: ActorRef) {
    if (id.executorId == "<driver>" && !isLocal) {
      // The driver itself does not need to register.
      // Got a register message from the master node; don't register it
    } else if (!blockManagerInfo.contains(id)) {
      // If the id were already present, it would have been registered before.
      blockManagerIdByExecutor.get(id.executorId) match {
        case Some(manager) =>
          // An executor should have only one BlockManager, so reaching this case
          // means the executor has already registered one.
          // A block manager of the same executor already exists.
          // This should never happen. Let's just quit.
          logError("Got two different block manager registrations on " + id.executorId)
          System.exit(1)
        case None =>
          blockManagerIdByExecutor(id.executorId) = id
      }
      // Create a new BlockManagerInfo and cache it in blockManagerInfo.
      blockManagerInfo(id) = new BlockManagerMasterActor.BlockManagerInfo(
        id, System.currentTimeMillis(), maxMemSize, slaveActor)
    }
  }

 

  // Handles updateBlockInfo
  private def updateBlockInfo(
      blockManagerId: BlockManagerId,
      blockId: String,
      storageLevel: StorageLevel,
      memSize: Long,
      diskSize: Long) {

    if (!blockManagerInfo.contains(blockManagerId)) {
      // blockManagerInfo does not contain this blockManagerId.
      if (blockManagerId.executorId == "<driver>" && !isLocal) {
        // We intentionally do not register the master (except in local mode),
        // so we should not indicate failure.
        sender ! true
      } else {
        sender ! false
      }
      return
    }

    // Delegate to BlockManagerInfo.updateBlockInfo.
    blockManagerInfo(blockManagerId).updateBlockInfo(blockId, storageLevel, memSize, diskSize)

    var locations: mutable.HashSet[BlockManagerId] = null
    if (blockLocations.containsKey(blockId)) {
      locations = blockLocations.get(blockId)
    } else {
      locations = new mutable.HashSet[BlockManagerId]
      blockLocations.put(blockId, locations) // Cache the location info of this block
    }

    if (storageLevel.isValid) {
      locations.add(blockManagerId)
    } else {
      locations.remove(blockManagerId)
    }

    // Remove the block from master tracking if it has been removed on all slaves.
    if (locations.size == 0) {
      blockLocations.remove(blockId)
    }

    sender ! true
  }

  // Handles removeRdd: removes an RDD
  private def removeRdd(rddId: Int): Future[Seq[Int]] = {
    // First remove the metadata for the given RDD, and then asynchronously remove the
    // blocks from the slaves.
    val prefix = "rdd_" + rddId + "_"

    // Find all blocks for the given RDD, remove the block from both blockLocations and
    // the blockManagerInfo that is tracking the blocks.
    // Step 1: pick out of blockLocations every block that belongs to this RDD.
    val blocks = blockLocations.keySet().filter(_.startsWith(prefix))
    // Step 2: drop those blocks from both blockManagerInfo and blockLocations.
    blocks.foreach { blockId =>
      val bms: mutable.HashSet[BlockManagerId] = blockLocations.get(blockId)
      bms.foreach(bm => blockManagerInfo.get(bm).foreach(_.removeBlock(blockId)))
      blockLocations.remove(blockId)
    }

    // Ask the slaves to remove the RDD, and put the result in a sequence of Futures.
    // The dispatcher is used as an implicit argument into the Future sequence construction.
    import context.dispatcher
    val removeMsg = RemoveRdd(rddId)
    // Future.sequence transforms a Traversable[Future[A]] into a Future[Traversable[A]].
    Future.sequence(blockManagerInfo.values.map { bm =>
      // Send the RemoveRdd message to every slave actor.
      bm.slaveActor.ask(removeMsg)(akkaTimeout).mapTo[Int]
    }.toSeq)
  }
  // Handles removeExecutor.
  // It actually removes the BlockManager that lives on the executor; the name is poorly chosen.
  private def removeExecutor(execId: String) {
    logInfo("Trying to remove executor " + execId + " from BlockManagerMaster.")
    blockManagerIdByExecutor.get(execId).foreach(removeBlockManager)
  }

  private def removeBlockManager(blockManagerId: BlockManagerId) {
    val info = blockManagerInfo(blockManagerId)

    // Remove the block manager from blockManagerIdByExecutor.
    blockManagerIdByExecutor -= blockManagerId.executorId

    // Remove it from blockManagerInfo and remove all the blocks.
    blockManagerInfo.remove(blockManagerId)
    val iterator = info.blocks.keySet.iterator
    while (iterator.hasNext) {
      val blockId = iterator.next
      val locations = blockLocations.get(blockId)
      locations -= blockManagerId
      if (locations.size == 0) {
        // blockLocations is keyed by block id, so remove by blockId. The excerpt
        // passed `locations` here, which matches no key and removes nothing.
        blockLocations.remove(blockId)
      }
    }
  }

  // Handles sendHeartBeat.
  // A BlockManager's heartbeat is represented by lastSeenMs in its BlockManagerInfo.
  private def heartBeat(blockManagerId: BlockManagerId): Boolean = {
    if (!blockManagerInfo.contains(blockManagerId)) {
      blockManagerId.executorId == "<driver>" && !isLocal
    } else {
      blockManagerInfo(blockManagerId).updateLastSeenMs()
      true
    }
  }

  // Handles removeBlock.
  // Remove a block from the slaves that have it. This can only be used to remove
  // blocks that the master knows about.
  private def removeBlockFromWorkers(blockId: String) {
    val locations = blockLocations.get(blockId)
    if (locations != null) {
      locations.foreach { blockManagerId: BlockManagerId =>
        val blockManager = blockManagerInfo.get(blockManagerId)
        if (blockManager.isDefined) {
          // Remove the block from the slave's BlockManager.
          // Doesn't actually wait for a confirmation and the message might get lost.
          // If message loss becomes frequent, we should add retry logic here.
          blockManager.get.slaveActor ! RemoveBlock(blockId)
        }
      }
    }
  }
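
The blockLocations bookkeeping done by updateBlockInfo and removeRdd can be sketched on its own, without actors. This is an illustrative model only: `LocationsSketch` is a hypothetical class, block manager ids are plain strings instead of BlockManagerId, and no messages are sent to slaves.

```scala
import scala.collection.mutable

class LocationsSketch {
  // block id -> ids of the block managers that currently hold the block
  private val blockLocations =
    mutable.HashMap.empty[String, mutable.HashSet[String]]

  /** Same shape as the tail of updateBlockInfo: add or drop one location. */
  def update(blockId: String, managerId: String, valid: Boolean): Unit = {
    val locations = blockLocations.getOrElseUpdate(blockId, mutable.HashSet.empty[String])
    if (valid) locations += managerId else locations -= managerId
    // Stop tracking the block once no block manager holds it.
    if (locations.isEmpty) blockLocations.remove(blockId)
  }

  def locationsOf(blockId: String): Set[String] =
    blockLocations.get(blockId).map(_.toSet).getOrElse(Set.empty[String])

  /** Metadata half of removeRdd: drop every block whose id starts with "rdd_<id>_". */
  def removeRdd(rddId: Int): Seq[String] = {
    val prefix = "rdd_" + rddId + "_"
    // Materialise the matches first so the map is not mutated while iterating.
    val doomed = blockLocations.keys.filter(_.startsWith(prefix)).toList
    doomed.foreach(id => blockLocations.remove(id))
    doomed.sorted
  }
}
```

The trailing underscore in the prefix matters: with it, removing RDD 1 leaves a block named rdd_10_0 untouched.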

 

BlockManagerSlaveActor

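The master can send only two kinds of messages to a slave, so this actor is very simple, arguably too simple.

The reason is that it handles only events sent by the master; most of the actual data reads and writes are implemented directly in BlockManager.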

/**
 * An actor to take commands from the master to execute options. For example,
 * this is used to remove blocks from the slave's BlockManager.
 */
class BlockManagerSlaveActor(blockManager: BlockManager) extends Actor {
  override def receive = {
    case RemoveBlock(blockId) =>
      blockManager.removeBlock(blockId)
    case RemoveRdd(rddId) =>
      val numBlocksRemoved = blockManager.removeRdd(rddId)
      sender ! numBlocksRemoved
  }
}
