Spark Source Code Analysis -- BlockStore
BlockStore
An abstract base class; the key get and put operations each come in two flavors:
serialized: putBytes, getBytes
deserialized: putValues, getValues
putValues returns a PutResult, whose data field is either an Iterator or a ByteBuffer:
private[spark] case class PutResult(size: Long, data: Either[Iterator[_], ByteBuffer])
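To make the two return shapes concrete, here is a small, hedged sketch of how a caller might branch on the Either inside a PutResult. It is a standalone illustration, not Spark code: PutResult is redefined locally, and replicateBlock / consumeValues are made-up helper names.

import java.nio.ByteBuffer

// Standalone sketch of consuming a PutResult. The case class mirrors the
// definition above; replicateBlock and consumeValues are hypothetical helpers.
object PutResultExample {
  case class PutResult(size: Long, data: Either[Iterator[_], ByteBuffer])

  def replicateBlock(bytes: ByteBuffer): Unit =
    println(s"would replicate ${bytes.remaining()} bytes")

  def consumeValues(iter: Iterator[_]): Unit =
    println(s"consumed ${iter.size} values")

  def handlePut(result: PutResult): Unit = result.data match {
    case Right(bytes) => replicateBlock(bytes.duplicate()) // serialized path
    case Left(iter)   => consumeValues(iter)               // deserialized path
    case null         => ()                                // returnValues was false, nothing returned
  }

  def main(args: Array[String]): Unit = {
    handlePut(PutResult(3L, Left(Iterator(1, 2, 3))))
    handlePut(PutResult(4L, Right(ByteBuffer.wrap(Array[Byte](1, 2, 3, 4)))))
  }
}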
/**
 * Abstract class to store blocks
 */
private[spark]
abstract class BlockStore(val blockManager: BlockManager) extends Logging {

  def putBytes(blockId: String, bytes: ByteBuffer, level: StorageLevel)

  /**
   * Put in a block and, possibly, also return its content as either bytes or another Iterator.
   * This is used to efficiently write the values to multiple locations (e.g. for replication).
   *
   * @return a PutResult that contains the size of the data, as well as the values put if
   *         returnValues is true (if not, the result's data field can be null)
   */
  def putValues(blockId: String, values: ArrayBuffer[Any], level: StorageLevel,
    returnValues: Boolean): PutResult

  /**
   * Return the size of a block in bytes.
   */
  def getSize(blockId: String): Long

  def getBytes(blockId: String): Option[ByteBuffer]

  def getValues(blockId: String): Option[Iterator[Any]]

  /**
   * Remove a block, if it exists.
   * @param blockId the block to remove.
   * @return True if the block was found and removed, False otherwise.
   */
  def remove(blockId: String): Boolean

  def contains(blockId: String): Boolean

  def clear() { }
}
DiskStore
DiskStore is quite simple: it just opens the corresponding file and reads or writes it.
/**
 * Stores BlockManager blocks on disk.
 */
private class DiskStore(blockManager: BlockManager, rootDirs: String)
  extends BlockStore(blockManager) with Logging {

  override def putBytes(blockId: String, _bytes: ByteBuffer, level: StorageLevel) {
    // So that we do not modify the input offsets !
    // duplicate does not copy buffer, so inexpensive
    val bytes = _bytes.duplicate()
    val file = createFile(blockId)
    val channel = new RandomAccessFile(file, "rw").getChannel()
    while (bytes.remaining > 0) {
      channel.write(bytes)
    }
    channel.close()
  }

  override def putValues(
      blockId: String,
      values: ArrayBuffer[Any],
      level: StorageLevel,
      returnValues: Boolean)
    : PutResult = {
    val file = createFile(blockId)
    val fileOut = blockManager.wrapForCompression(blockId,
      new FastBufferedOutputStream(new FileOutputStream(file)))
    val objOut = blockManager.defaultSerializer.newInstance().serializeStream(fileOut)
    objOut.writeAll(values.iterator)
    objOut.close()
    val length = file.length()
    if (returnValues) {
      // Return a byte buffer for the contents of the file
      val buffer = getFileBytes(file)
      PutResult(length, Right(buffer))
    } else {
      PutResult(length, null)
    }
  }

  override def getBytes(blockId: String): Option[ByteBuffer] = {
    val file = getFile(blockId)
    val bytes = getFileBytes(file)
    Some(bytes)
  }

  override def getValues(blockId: String): Option[Iterator[Any]] = {
    getBytes(blockId).map(bytes => blockManager.dataDeserialize(blockId, bytes))
  }
}
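Both putValues and getBytes above rely on getFileBytes, which this excerpt does not show. A minimal sketch of what such a helper might look like, assuming the block file is simply memory-mapped via NIO (the real Spark implementation may differ, for example by reading small files directly into a heap buffer):

import java.io.{File, RandomAccessFile}
import java.nio.ByteBuffer
import java.nio.channels.FileChannel.MapMode

// Hedged sketch, not the actual Spark code: a helper that would live inside
// DiskStore and read a block file back as a ByteBuffer by memory-mapping it,
// avoiding a copy into the JVM heap.
def getFileBytes(file: File): ByteBuffer = {
  val channel = new RandomAccessFile(file, "r").getChannel()
  try {
    channel.map(MapMode.READ_ONLY, 0, file.length())
  } finally {
    channel.close() // the mapping stays valid after the channel is closed
  }
}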
MemoryStore
MemoryStore is a bit more involved.
It organizes the store as a LinkedHashMap (an iterable HashMap) keyed by blockId, with an Entry as the value.
The Entry abstraction represents the contents of a block.
A put may also have to free memory first, which is handled by ensureFreeSpace; a short sketch of the LinkedHashMap access-order behavior it relies on follows below.
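The third constructor argument of LinkedHashMap (true) enables access order, so iterating the map visits the least recently used blocks first; this is what makes the eviction loop in ensureFreeSpace roughly LRU. A small standalone sketch of that behavior (plain Java collections, not Spark code; block IDs are made up):

import java.util.LinkedHashMap

// Standalone sketch: with accessOrder = true, iteration order follows
// access recency, so the first entries returned are the LRU candidates.
object AccessOrderExample {
  def main(args: Array[String]): Unit = {
    val m = new LinkedHashMap[String, Long](32, 0.75f, true)
    m.put("rdd_0_0", 10L)
    m.put("rdd_0_1", 20L)
    m.put("rdd_1_0", 30L)

    m.get("rdd_0_0") // touching a key moves it to the most-recently-used end

    val it = m.entrySet().iterator()
    while (it.hasNext) {
      val e = it.next()
      println(s"${e.getKey} -> ${e.getValue}") // prints rdd_0_1, rdd_1_0, rdd_0_0
    }
  }
}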
/**
 * Stores blocks in memory, either as ArrayBuffers of deserialized Java objects or as
 * serialized ByteBuffers.
 */
private class MemoryStore(blockManager: BlockManager, maxMemory: Long)
  extends BlockStore(blockManager) {

  // An Entry represents the contents of one block
  case class Entry(value: Any, size: Long, deserialized: Boolean, var dropPending: Boolean = false)

  // The whole MemoryStore is organized as a LinkedHashMap
  private val entries = new LinkedHashMap[String, Entry](32, 0.75f, true)
  private var currentMemory = 0L

  // Object used to ensure that only one thread is putting blocks and if necessary, dropping
  // blocks from the memory store.
  private val putLock = new Object() // the HashMap is not thread-safe, so access must be synchronized

  override def putBytes(blockId: String, _bytes: ByteBuffer, level: StorageLevel) {
    // Work on a duplicate - since the original input might be used elsewhere.
    val bytes = _bytes.duplicate()
    bytes.rewind() // it is best to rewind an NIO ByteBuffer before using it
    if (level.deserialized) { // the storage level asks for deserialized objects
      val values = blockManager.dataDeserialize(blockId, bytes) // so deserialize the bytes first
      val elements = new ArrayBuffer[Any]
      elements ++= values
      val sizeEstimate = SizeEstimator.estimate(elements.asInstanceOf[AnyRef])
      tryToPut(blockId, elements, sizeEstimate, true)
    } else {
      tryToPut(blockId, bytes, bytes.limit, false)
    }
  }
  // What putValues returns depends on the storage level: an Iterator if deserialized, a ByteBuffer otherwise
  override def putValues(
      blockId: String,
      values: ArrayBuffer[Any],
      level: StorageLevel,
      returnValues: Boolean)
    : PutResult = {
    if (level.deserialized) {
      val sizeEstimate = SizeEstimator.estimate(values.asInstanceOf[AnyRef])
      tryToPut(blockId, values, sizeEstimate, true)
      PutResult(sizeEstimate, Left(values.iterator))
    } else {
      val bytes = blockManager.dataSerialize(blockId, values.iterator)
      tryToPut(blockId, bytes, bytes.limit, false)
      PutResult(bytes.limit(), Right(bytes.duplicate()))
    }
  }

  override def getBytes(blockId: String): Option[ByteBuffer] = {
    val entry = entries.synchronized {
      entries.get(blockId)
    }
    if (entry == null) {
      None
    } else if (entry.deserialized) {
      Some(blockManager.dataSerialize(blockId, entry.value.asInstanceOf[ArrayBuffer[Any]].iterator))
    } else {
      Some(entry.value.asInstanceOf[ByteBuffer].duplicate()) // Doesn't actually copy the data
    }
  }

  override def getValues(blockId: String): Option[Iterator[Any]] = {
    val entry = entries.synchronized {
      entries.get(blockId)
    }
    if (entry == null) {
      None
    } else if (entry.deserialized) {
      Some(entry.value.asInstanceOf[ArrayBuffer[Any]].iterator)
    } else {
      val buffer = entry.value.asInstanceOf[ByteBuffer].duplicate() // Doesn't actually copy data
      Some(blockManager.dataDeserialize(blockId, buffer))
    }
  }
  /**
   * Try to put in a set of values, if we can free up enough space. The value should either be
   * an ArrayBuffer if deserialized is true or a ByteBuffer otherwise. Its (possibly estimated)
   * size must also be passed by the caller.
   *
   * Locks on the object putLock to ensure that all the put requests and its associated block
   * dropping is done by only one thread at a time. Otherwise while one thread is dropping
   * blocks to free memory for one block, another thread may use up the freed space for
   * another block.
   */
  private def tryToPut(blockId: String, value: Any, size: Long, deserialized: Boolean): Boolean = {
    // TODO: Its possible to optimize the locking by locking entries only when selecting blocks
    // to be dropped. Once the to-be-dropped blocks have been selected, and lock on entries has been
    // released, it must be ensured that those to-be-dropped blocks are not double counted for
    // freeing up more space for another block that needs to be put. Only then the actually dropping
    // of blocks (and writing to disk if necessary) can proceed in parallel.
    putLock.synchronized {
      if (ensureFreeSpace(blockId, size)) { // enough memory is (or can be made) available for this block
        val entry = new Entry(value, size, deserialized)
        entries.synchronized { entries.put(blockId, entry) }
        currentMemory += size
        true
      } else { // the block does not fit in memory; if the level allows disk, dropFromMemory will spill it to disk
        // Tell the block manager that we couldn't put it in memory so that it can drop it to
        // disk if the block allows disk storage.
        val data = if (deserialized) {
          Left(value.asInstanceOf[ArrayBuffer[Any]])
        } else {
          Right(value.asInstanceOf[ByteBuffer].duplicate())
        }
        blockManager.dropFromMemory(blockId, data)
        false
      }
    }
  }

  /**
   * Tries to free up a given amount of space to store a particular block, but can fail and return
   * false if either the block is bigger than our memory or it would require replacing another
   * block from the same RDD (which leads to a wasteful cyclic replacement pattern for RDDs that
   * don't fit into memory that we want to avoid).
   *
   * Assumes that a lock is held by the caller to ensure only one thread is dropping blocks.
   * Otherwise, the freed space may fill up before the caller puts in their new value.
   */
  private def ensureFreeSpace(blockIdToAdd: String, space: Long): Boolean = {
    if (space > maxMemory) {
      logInfo("Will not store " + blockIdToAdd + " as it is larger than our memory limit")
      return false
    }

    if (maxMemory - currentMemory < space) {
      val rddToAdd = getRddId(blockIdToAdd)
      val selectedBlocks = new ArrayBuffer[String]()
      var selectedMemory = 0L

      // This is synchronized to ensure that the set of entries is not changed
      // (because of getValue or getBytes) while traversing the iterator, as that
      // can lead to exceptions.
      entries.synchronized {
        val iterator = entries.entrySet().iterator() // keep selecting existing blocks for dropping until the new block fits
        while (maxMemory - (currentMemory - selectedMemory) < space && iterator.hasNext) {
          val pair = iterator.next()
          val blockId = pair.getKey
          if (rddToAdd != null && rddToAdd == getRddId(blockId)) {
            logInfo("Will not store " + blockIdToAdd + " as it would require dropping another " +
              "block from the same RDD")
            return false
          }
          selectedBlocks += blockId
          selectedMemory += pair.getValue.size
        }
      }

      if (maxMemory - (currentMemory - selectedMemory) >= space) {
        logInfo(selectedBlocks.size + " blocks selected for dropping")
        for (blockId <- selectedBlocks) { // drop the selected blocks to free their space
          val entry = entries.synchronized { entries.get(blockId) }
          // This should never be null as only one thread should be dropping
          // blocks and removing entries. However the check is still here for
          // future safety.
          if (entry != null) {
            val data = if (entry.deserialized) {
              Left(entry.value.asInstanceOf[ArrayBuffer[Any]])
            } else {
              Right(entry.value.asInstanceOf[ByteBuffer].duplicate())
            }
            blockManager.dropFromMemory(blockId, data)
          }
        }
        return true
      } else {
        return false
      }
    }
    return true
  }
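To make the eviction arithmetic concrete, here is a small self-contained model of the selection loop above (block IDs and sizes are made up for illustration, not taken from Spark): with maxMemory = 100 and currentMemory = 90, storing a block of size 30 keeps selecting blocks in LRU order until maxMemory - (currentMemory - selectedMemory) >= space holds.

import scala.collection.mutable.ArrayBuffer

// Simplified standalone model of ensureFreeSpace's selection loop.
// All names and numbers here are illustrative, not taken from Spark.
object EvictionExample {
  def main(args: Array[String]): Unit = {
    val maxMemory = 100L
    val currentMemory = 90L
    val space = 30L // size of the block we want to store

    // Existing blocks in LRU order (least recently used first), with their sizes.
    val lruBlocks = Seq("rdd_1_0" -> 15L, "rdd_1_1" -> 10L, "rdd_2_0" -> 25L)

    val selectedBlocks = new ArrayBuffer[String]()
    var selectedMemory = 0L
    val it = lruBlocks.iterator
    while (maxMemory - (currentMemory - selectedMemory) < space && it.hasNext) {
      val (blockId, size) = it.next()
      selectedBlocks += blockId
      selectedMemory += size
    }

    // Selects rdd_1_0 (15) and rdd_1_1 (10): 100 - (90 - 25) = 35 >= 30, so the loop stops.
    println(s"would drop: ${selectedBlocks.mkString(", ")} freeing $selectedMemory bytes")
  }
}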