The Spark driver is responsible for submitting the user's application.

1. SparkConf

SparkConf is responsible for loading SparkContext's configuration parameters; it maintains all `spark.*` properties in a ConcurrentHashMap.

    class SparkConf(loadDefaults: Boolean) extends Cloneable with Logging with Serializable {

      import SparkConf._

      /** Create a SparkConf that loads defaults from system properties and the classpath */
      def this() = this(true)

      /** A ConcurrentHashMap that holds every Spark configuration entry. */
      private val settings = new ConcurrentHashMap[String, String]()

      @transient private lazy val reader: ConfigReader = {
        val _reader = new ConfigReader(new SparkConfigProvider(settings))
        _reader.bindEnv(new ConfigProvider {
          override def get(key: String): Option[String] = Option(getenv(key))
        })
        _reader
      }

      if (loadDefaults) {
        loadFromSystemProperties(false)
      }

      /**
       * Load all spark.* configuration entries from the JVM system properties.
       * @param silent
       * @return
       */
      private[spark] def loadFromSystemProperties(silent: Boolean): SparkConf = {
        // Load any spark.* system properties; keys without the "spark." prefix are skipped
        for ((key, value) <- Utils.getSystemProperties if key.startsWith("spark.")) {
          set(key, value, silent)
        }
        this
      }
    }
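
To see how these `spark.*` entries flow through the map, here is a small usage sketch built only on SparkConf's public API (the app name and values are placeholders):

    import org.apache.spark.SparkConf

    // Entries set programmatically land in the same ConcurrentHashMap as the
    // -Dspark.* system properties picked up by loadFromSystemProperties.
    val conf = new SparkConf()                // loadDefaults = true
      .setAppName("demo-app")                 // shorthand for set("spark.app.name", ...)
      .setMaster("local[2]")                  // shorthand for set("spark.master", ...)
      .set("spark.executor.memory", "2g")

    assert(conf.get("spark.app.name") == "demo-app")
    assert(conf.get("spark.executor.memory", "1g") == "2g") // second argument is the default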

2. SparkContext

2.1 Creating the Spark execution environment (SparkEnv)

SparkEnv is Spark's execution environment object; it holds the many objects that Executor execution depends on.

It is created mainly through SparkEnv.createSparkEnv; during SparkContext initialization, only the driver-side SparkEnv is built.

    def isLocal: Boolean = Utils.isLocalMaster(_conf)

    // An asynchronous listener bus for Spark events.
    // The listener pattern is used to dispatch the various event types.
    private[spark] val listenerBus = new LiveListenerBus(this)

    // This function allows components created by SparkEnv to be mocked in unit tests:
    private[spark] def createSparkEnv(
        conf: SparkConf,
        isLocal: Boolean,
        listenerBus: LiveListenerBus): SparkEnv = {
      // Create the driver-side SparkEnv
      SparkEnv.createDriverEnv(conf, isLocal, listenerBus, SparkContext.numDriverCores(master))
    }

Stepping into createDriverEnv, we find that it delegates to the create method, which builds a SparkEnv for either the driver or an executor.
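
For reference, here is an abridged sketch of createDriverEnv (based on the Spark 2.x source; the exact parameter list and the assertions elided here vary by version). It resolves the driver's bind/advertise address and port from the conf and hands everything to the shared create():

    // Abridged from SparkEnv.scala (Spark 2.x); ioEncryptionKey handling simplified.
    private[spark] def createDriverEnv(
        conf: SparkConf,
        isLocal: Boolean,
        listenerBus: LiveListenerBus,
        numCores: Int,
        mockOutputCommitCoordinator: Option[OutputCommitCoordinator] = None): SparkEnv = {
      val bindAddress = conf.get(DRIVER_BIND_ADDRESS)
      val advertiseAddress = conf.get(DRIVER_HOST_ADDRESS)
      val port = conf.get("spark.driver.port").toInt
      create(
        conf,
        SparkContext.DRIVER_IDENTIFIER,   // executorId is "driver" on the driver side
        bindAddress,
        advertiseAddress,
        port,
        isLocal,
        numCores,
        ioEncryptionKey = None,           // real code derives this from spark.io.encryption.enabled
        listenerBus = listenerBus,
        mockOutputCommitCoordinator = mockOutputCommitCoordinator)
    }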

Clicking into createExecutorEnv shows that it is called by CoarseGrainedExecutorBackend.

Next, let's walk through what create() actually does.

2.1.1 Creating the SecurityManager

    // Create the SecurityManager
    val securityManager = new SecurityManager(conf, ioEncryptionKey)
    ioEncryptionKey.foreach { _ =>
      if (!securityManager.isSaslEncryptionEnabled()) {
        logWarning("I/O encryption enabled without RPC encryption: keys will be visible on the " +
          "wire.")
      }
    }
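
What the SecurityManager actually enforces is driven entirely by the conf. As a rough illustration (keys from the Spark security documentation; values are placeholders), shared-secret authentication and I/O encryption are switched on like this:

    // Illustrative settings only; check the security docs for your Spark version.
    val conf = new SparkConf()
      .set("spark.authenticate", "true")           // enable shared-secret authentication
      .set("spark.authenticate.secret", "secret")  // the shared secret (standalone/local modes)
      .set("spark.io.encryption.enabled", "true")  // makes ioEncryptionKey non-empty above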

2.1.2 Creating the RpcEnv

    val systemName = if (isDriver) driverSystemName else executorSystemName
    val rpcEnv = RpcEnv.create(systemName, bindAddress, advertiseAddress, port, conf,
      securityManager, clientMode = !isDriver)

2.1.3 Creating the serializer via reflection (JavaSerializer by default)

    // Create an instance of the class with the given name, possibly initializing it with our conf
    def instantiateClass[T](className: String): T = {
      val cls = Utils.classForName(className)
      // Look for a constructor taking a SparkConf and a boolean isDriver, then one taking just
      // SparkConf, then one taking no arguments
      try {
        cls.getConstructor(classOf[SparkConf], java.lang.Boolean.TYPE)
          .newInstance(conf, new java.lang.Boolean(isDriver))
          .asInstanceOf[T]
      } catch {
        case _: NoSuchMethodException =>
          try {
            cls.getConstructor(classOf[SparkConf]).newInstance(conf).asInstanceOf[T]
          } catch {
            case _: NoSuchMethodException =>
              cls.getConstructor().newInstance().asInstanceOf[T]
          }
      }
    }

    // Create an instance of the class named by the given SparkConf property, or defaultClassName
    // if the property is not set, possibly initializing it with our conf
    def instantiateClassFromConf[T](propertyName: String, defaultClassName: String): T = {
      instantiateClass[T](conf.get(propertyName, defaultClassName))
    }

    val serializer = instantiateClassFromConf[Serializer](
      "spark.serializer", "org.apache.spark.serializer.JavaSerializer")
    logDebug(s"Using serializer: ${serializer.getClass}")
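
Because the class name is read from `spark.serializer`, swapping in Kryo is a pure configuration change:

    // KryoSerializer ships with Spark; instantiateClassFromConf will reflectively
    // construct it instead of JavaSerializer.
    val conf = new SparkConf()
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")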

2.1.4 Creating the SerializerManager

    val serializerManager = new SerializerManager(serializer, conf, ioEncryptionKey)

    // Closures are always serialized with Java serialization, regardless of spark.serializer
    val closureSerializer = new JavaSerializer(conf)

2.1.5 Creating the BroadcastManager

    val broadcastManager = new BroadcastManager(isDriver, conf, securityManager)
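
The BroadcastManager is the machinery behind the user-facing SparkContext.broadcast API. A minimal usage sketch (standard public API, run under a local master):

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setAppName("bc-demo").setMaster("local[2]"))
    // The broadcast value is shipped to executors once per node, not once per task.
    val lookup = sc.broadcast(Map("a" -> 1, "b" -> 2))
    val counts = sc.parallelize(Seq("a", "b", "c")).map(x => lookup.value.getOrElse(x, 0)).collect()
    sc.stop()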

2.1.6 Creating the MapOutputTracker

    // Register the endpoint on the driver, or look up a reference to it from an executor
    def registerOrLookupEndpoint(
        name: String, endpointCreator: => RpcEndpoint): RpcEndpointRef = {
      if (isDriver) {
        logInfo("Registering " + name)
        rpcEnv.setupEndpoint(name, endpointCreator)
      } else {
        RpcUtils.makeDriverRef(name, conf, rpcEnv)
      }
    }

    // Create the MapOutputTracker; the driver and executors use different implementations
    val mapOutputTracker = if (isDriver) {
      // The driver-side master needs the BroadcastManager to broadcast map output statuses
      new MapOutputTrackerMaster(conf, broadcastManager, isLocal)
    } else {
      new MapOutputTrackerWorker(conf)
    }

    // Have to assign trackerEndpoint after initialization as MapOutputTrackerEndpoint
    // requires the MapOutputTracker itself
    mapOutputTracker.trackerEndpoint = registerOrLookupEndpoint(MapOutputTracker.ENDPOINT_NAME,
      new MapOutputTrackerMasterEndpoint(
        rpcEnv, mapOutputTracker.asInstanceOf[MapOutputTrackerMaster], conf))

2.1.7 Creating the ShuffleManager

    // Let the user specify short names for shuffle managers
    val shortShuffleMgrNames = Map(
      "sort" -> classOf[org.apache.spark.shuffle.sort.SortShuffleManager].getName,
      "tungsten-sort" -> classOf[org.apache.spark.shuffle.sort.SortShuffleManager].getName)
    val shuffleMgrName = conf.get("spark.shuffle.manager", "sort")
    val shuffleMgrClass = shortShuffleMgrNames.getOrElse(shuffleMgrName.toLowerCase, shuffleMgrName)
    val shuffleManager = instantiateClass[ShuffleManager](shuffleMgrClass)
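
Note that both short names resolve to SortShuffleManager (the legacy hash-based manager was removed in Spark 2.x), and because the fallback is the raw class name, a custom implementation can be plugged in the same way (the class below is hypothetical):

    // com.example.shuffle.MyShuffleManager is a hypothetical custom ShuffleManager;
    // any class implementing org.apache.spark.shuffle.ShuffleManager can be named here.
    val conf = new SparkConf()
      .set("spark.shuffle.manager", "com.example.shuffle.MyShuffleManager")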

2.1.8 Creating the BlockManager

    val useLegacyMemoryManager = conf.getBoolean("spark.memory.useLegacyMode", false)
    val memoryManager: MemoryManager =
      if (useLegacyMemoryManager) {
        new StaticMemoryManager(conf, numUsableCores)
      } else {
        UnifiedMemoryManager(conf, numUsableCores)
      }

    val blockManagerPort = if (isDriver) {
      conf.get(DRIVER_BLOCK_MANAGER_PORT)
    } else {
      conf.get(BLOCK_MANAGER_PORT)
    }

    val blockTransferService =
      new NettyBlockTransferService(conf, securityManager, bindAddress, advertiseAddress,
        blockManagerPort, numUsableCores)

    val blockManagerMaster = new BlockManagerMaster(registerOrLookupEndpoint(
      BlockManagerMaster.DRIVER_ENDPOINT_NAME,
      new BlockManagerMasterEndpoint(rpcEnv, isLocal, conf, listenerBus)),
      conf, isDriver)

    // NB: blockManager is not valid until initialize() is called later.
    val blockManager = new BlockManager(executorId, rpcEnv, blockManagerMaster,
      serializerManager, conf, memoryManager, mapOutputTracker, shuffleManager,
      blockTransferService, securityManager, numUsableCores)
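
As the NB comment warns, the BlockManager cannot be used yet. On the driver, initialize() is called later in SparkContext, once the task scheduler has produced an application ID (abridged from SparkContext.scala, Spark 2.x):

    // Abridged from SparkContext initialization (Spark 2.x); ordering varies by version.
    _applicationId = _taskScheduler.applicationId()
    _conf.set("spark.app.id", _applicationId)
    _env.blockManager.initialize(_applicationId)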

2.1.9 Creating the MetricsSystem

    val metricsSystem = if (isDriver) {
      // Don't start metrics system right now for Driver.
      // We need to wait for the task scheduler to give us an app ID.
      // Then we can start the metrics system.
      MetricsSystem.createMetricsSystem("driver", conf, securityManager)
    } else {
      // We need to set the executor ID before the MetricsSystem is created because sources and
      // sinks specified in the metrics configuration file will want to incorporate this executor's
      // ID into the metrics they report.
      conf.set("spark.executor.id", executorId)
      val ms = MetricsSystem.createMetricsSystem("executor", conf, securityManager)
      ms.start()
      ms
    }
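
The driver's deferred start happens back in SparkContext, right after spark.app.id is set (abridged, Spark 2.x):

    // Abridged from SparkContext (Spark 2.x): the driver's MetricsSystem starts only
    // after spark.app.id is available, then its servlet handlers attach to the web UI.
    _env.metricsSystem.start()
    _env.metricsSystem.getServletHandlers.foreach(handler => ui.foreach(_.attachHandler(handler)))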

2.1.10 Creating the SparkEnv instance

    val envInstance = new SparkEnv(
      executorId,
      rpcEnv,
      serializer,
      closureSerializer,
      serializerManager,
      mapOutputTracker,
      shuffleManager,
      broadcastManager,
      blockManager,
      securityManager,
      metricsSystem,
      memoryManager,
      outputCommitCoordinator,
      conf)

2.1.11 Creating the driver's temporary directory

    // Add a reference to tmp dir created by driver, we will delete this tmp dir when stop() is
    // called, and we only need to do it for driver. Because driver may run as a service, and if we
    // don't delete this tmp dir when sc is stopped, then will create too many tmp dirs.
    if (isDriver) {
      val sparkFilesDir = Utils.createTempDir(Utils.getLocalDir(conf), "userFiles").getAbsolutePath
      envInstance.driverTmpDir = Some(sparkFilesDir)
    }
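
The matching cleanup lives in SparkEnv.stop(); abridged from the Spark 2.x source:

    // Abridged from SparkEnv.stop() (Spark 2.x): only the driver deletes the tmp dir.
    driverTmpDir match {
      case Some(path) =>
        try {
          Utils.deleteRecursively(new File(path))
        } catch {
          case e: Exception =>
            logWarning(s"Exception while deleting Spark temp dir: $path", e)
        }
      case None => // executors have nothing to clean up here
    }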
