A Walkthrough of the spark-submit Script

#!/usr/bin/env bash

#

# Licensed to the Apache Software Foundation (ASF) under one or more

# contributor license agreements.  See the NOTICE file distributed with

# this work for additional information regarding copyright ownership.

# The ASF licenses this file to You under the Apache License, Version 2.0

# (the "License"); you may not use this file except in compliance with

# the License.  You may obtain a copy of the License at

#

#    http://www.apache.org/licenses/LICENSE-2.0

#

# Unless required by applicable law or agreed to in writing, software

# distributed under the License is distributed on an "AS IS" BASIS,

# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

# See the License for the specific language governing permissions and

# limitations under the License.

#

# NOTE: Any changes in this file must be reflected in SparkSubmitDriverBootstrapper.scala!

# Spark installation directory (the parent of the directory containing this script)

export SPARK_HOME="$(cd `dirname $0`/..; pwd)"

# Save the original arguments in ORIG_ARGS as an array

ORIG_ARGS=("$@")

# Walk through the arguments; for each recognized option, store the value that follows it in the matching variable

while (($#)); do

  if [ "$1" = "--deploy-mode" ]; then

    SPARK_SUBMIT_DEPLOY_MODE=$2

  elif [ "$1" = "--properties-file" ]; then

    SPARK_SUBMIT_PROPERTIES_FILE=$2

  elif [ "$1" = "--driver-memory" ]; then

    export SPARK_SUBMIT_DRIVER_MEMORY=$2

  elif [ "$1" = "--driver-library-path" ]; then

    export SPARK_SUBMIT_LIBRARY_PATH=$2

  elif [ "$1" = "--driver-class-path" ]; then

    export SPARK_SUBMIT_CLASSPATH=$2

  elif [ "$1" = "--driver-java-options" ]; then

    export SPARK_SUBMIT_OPTS=$2

  fi

  shift

done

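# Define default values; values supplied by the user on the command line take precedence

# ${VAR:-default} works like SQL's NVL: it yields the default when VAR is unset or empty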

DEFAULT_PROPERTIES_FILE="$SPARK_HOME/conf/spark-defaults.conf"

export SPARK_SUBMIT_DEPLOY_MODE=${SPARK_SUBMIT_DEPLOY_MODE:-"client"}

export SPARK_SUBMIT_PROPERTIES_FILE=${SPARK_SUBMIT_PROPERTIES_FILE:-"$DEFAULT_PROPERTIES_FILE"}

# For client mode, the driver will be launched in the same JVM that launches

# SparkSubmit, so we may need to read the properties file for any extra class

# paths, library paths, java options and memory early on. Otherwise, it will

# be too late by the time the driver JVM has started.

# Check whether the properties file (spark-defaults.conf by default) sets any "spark.driver.extra*" or "spark.driver.memory" option, skipping commented-out lines

if [[ "$SPARK_SUBMIT_DEPLOY_MODE" == "client" && -f "$SPARK_SUBMIT_PROPERTIES_FILE" ]]; then

  # Parse the properties file only if the special configs exist

  contains_special_configs=$(

    grep -e "spark.driver.extra*\|spark.driver.memory" "$SPARK_SUBMIT_PROPERTIES_FILE" | \

    grep -v "^[[:space:]]*#"

  )

  if [ -n "$contains_special_configs" ]; then

    export SPARK_SUBMIT_BOOTSTRAP_DRIVER=1

  fi

fi

# Pass the original arguments on to spark-class
# exec replaces the current shell process with the command that follows, so nothing after this line runs

exec $SPARK_HOME/bin/spark-class org.apache.spark.deploy.SparkSubmit "${ORIG_ARGS[@]}"
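
The two less obvious idioms in the script, the argument scan combined with the `${VAR:-default}` fallback and the two-stage grep over the properties file, can be tried in isolation. The following is only a minimal sketch: the function names `scan_deploy_mode` and `has_special_configs` are hypothetical helpers made up for illustration and are not part of Spark.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; both function names are hypothetical, not Spark's.

# Scan the arguments the same way the script's while loop does:
# remember the value that follows a recognized flag, skip everything else.
scan_deploy_mode() {
  local mode=""
  while (($#)); do
    if [ "$1" = "--deploy-mode" ]; then
      mode=$2
    fi
    shift
  done
  # ${mode:-client} falls back to "client" when mode is unset or empty.
  echo "${mode:-client}"
}

# Succeed only if the given properties file has an uncommented
# spark.driver.extra* or spark.driver.memory entry, mirroring the
# two grep calls in the script.
has_special_configs() {
  local file=$1
  local hits
  hits=$(
    grep -e "spark.driver.extra*\|spark.driver.memory" "$file" | \
    grep -v "^[[:space:]]*#"
  )
  [ -n "$hits" ]
}
```

For example, `scan_deploy_mode --master yarn --deploy-mode cluster app.jar` prints `cluster`, while the same call without `--deploy-mode` prints `client`. `has_special_configs` returns success for a file containing the line `spark.driver.memory 2g` and failure when that line is commented out with `#`. In the pattern itself, `extra*` means "extr" followed by any number of "a" characters, so it effectively matches on the `spark.driver.extr` prefix rather than acting as a shell-style wildcard.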
