spark-submit (Spark 2.4.2)
Official spark-submit documentation: http://spark.apache.org/docs/latest/submitting-applications.html
Official Spark properties documentation: http://spark.apache.org/docs/latest/configuration.html
Launching Applications with spark-submit
./bin/spark-submit \
--class <main-class> \
--master <master-url> \
--deploy-mode <deploy-mode> \
--conf <key>=<value> \
... # other options
<application-jar> \
[application-arguments]
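As a concrete illustration (adapted from the examples in the official documentation), the following submits the bundled SparkPi example to a standalone cluster in cluster deploy mode. The master host is a placeholder, and the Scala version suffix on the examples jar depends on your build (2.4.2 prebuilt packages default to Scala 2.12):

./bin/spark-submit \
--class org.apache.spark.examples.SparkPi \
--master spark://<master-host>:7077 \
--deploy-mode cluster \
--executor-memory 2G \
examples/jars/spark-examples_2.12-2.4.2.jar \
1000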
The Spark shell and the spark-submit tool support two ways to load configuration dynamically. The first is command-line options, such as --master, as shown above. spark-submit can accept any Spark property using the --conf flag, but it uses dedicated flags for properties that play a part in launching the Spark application. Running ./bin/spark-submit --help will show the complete list of these options.
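For example, the two submissions below set spark.executor.memory in equivalent ways; the class name com.example.MyApp, the jar myapp.jar, and the file my.conf are illustrative placeholders, not real artifacts:

# Pass the property directly on the command line:
./bin/spark-submit --master yarn --conf spark.executor.memory=4g \
--class com.example.MyApp myapp.jar

# Or load it from a properties file (if --properties-file is omitted,
# spark-submit falls back to conf/spark-defaults.conf); here my.conf
# contains the line "spark.executor.memory 4g":
./bin/spark-submit --master yarn --properties-file my.conf \
--class com.example.MyApp myapp.jar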
Usage: spark-submit [options] <app jar | python file | R file> [app arguments]
Usage: spark-submit --kill [submission ID] --master [spark://...]
Usage: spark-submit --status [submission ID] --master [spark://...]
Usage: spark-submit run-example [options] example-class [example args]
Some of the commonly used options are:
Options:
  --master MASTER_URL         spark://host:port, mesos://host:port, yarn,
                              k8s://https://host:port, or local (Default: local[*]).
  --deploy-mode DEPLOY_MODE   Whether to launch the driver program locally ("client") or
                              on one of the worker machines inside the cluster ("cluster")
                              (Default: client).
  --class CLASS_NAME          Your application's main class (for Java / Scala apps).
  --name NAME                 A name of your application.
  --jars JARS                 Comma-separated list of jars to include on the driver
                              and executor classpaths.
  --packages                  Comma-separated list of maven coordinates of jars to include
                              on the driver and executor classpaths. Will search the local
                              maven repo, then maven central and any additional remote
                              repositories given by --repositories. The format for the
                              coordinates should be groupId:artifactId:version.
  --exclude-packages          Comma-separated list of groupId:artifactId, to exclude while
                              resolving the dependencies provided in --packages to avoid
                              dependency conflicts.
  --repositories              Comma-separated list of additional remote repositories to
                              search for the maven coordinates given with --packages.
  --py-files PY_FILES         Comma-separated list of .zip, .egg, or .py files to place
                              on the PYTHONPATH for Python apps.
  --files FILES               Comma-separated list of files to be placed in the working
                              directory of each executor. File paths of these files
                              in executors can be accessed via SparkFiles.get(fileName).

  --conf PROP=VALUE           Arbitrary Spark configuration property.
  --properties-file FILE      Path to a file from which to load extra properties. If not
                              specified, this will look for conf/spark-defaults.conf.

  --driver-memory MEM         Memory for driver (e.g. 1000M, 2G) (Default: 1024M).
  --driver-java-options       Extra Java options to pass to the driver.
  --driver-library-path       Extra library path entries to pass to the driver.
  --driver-class-path         Extra class path entries to pass to the driver. Note that
                              jars added with --jars are automatically included in the
                              classpath.

  --executor-memory MEM       Memory per executor (e.g. 1000M, 2G) (Default: 1G).

  --proxy-user NAME           User to impersonate when submitting the application.
                              This argument does not work with --principal / --keytab.

  --help, -h                  Show this help message and exit.
  --verbose, -v               Print additional debug output.
  --version,                  Print the version of current Spark.

 Cluster deploy mode only:
  --driver-cores NUM          Number of cores used by the driver, only in cluster mode
                              (Default: 1).

 Spark standalone or Mesos with cluster deploy mode only:
  --supervise                 If given, restarts the driver on failure.
  --kill SUBMISSION_ID        If given, kills the driver specified.
  --status SUBMISSION_ID      If given, requests the status of the driver specified.

 Spark standalone and Mesos only:
  --total-executor-cores NUM  Total cores for all executors.

 Spark standalone and YARN only:
  --executor-cores NUM        Number of cores per executor. (Default: 1 in YARN mode,
                              or all available cores on the worker in standalone mode)

 YARN-only:
  --queue QUEUE_NAME          The YARN queue to submit to (Default: "default").
  --num-executors NUM         Number of executors to launch (Default: 2).
                              If dynamic allocation is enabled, the initial number of
                              executors will be at least NUM.
  --archives ARCHIVES         Comma separated list of archives to be extracted into the
                              working directory of each executor.
  --principal PRINCIPAL       Principal to be used to login to KDC, while running on
                              secure HDFS.
  --keytab KEYTAB             The full path to the file that contains the keytab for the
                              principal specified above. This keytab will be copied to
                              the node running the Application Master via the Secure
                              Distributed Cache, for renewing the login tickets and the
                              delegation tokens periodically.
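Putting several of the options above together, a typical YARN cluster-mode submission might look like the following sketch. The queue, resource sizes, paths, class, and jar are illustrative placeholders, not recommended settings:

./bin/spark-submit \
--master yarn \
--deploy-mode cluster \
--class com.example.MyApp \
--queue default \
--num-executors 4 \
--executor-cores 2 \
--executor-memory 4G \
--driver-memory 2G \
--jars /path/to/dep.jar \
--files /path/to/app.properties \
myapp.jar arg1 arg2

In cluster mode the driver runs inside the YARN cluster, so cluster-only flags such as --driver-cores apply; in client mode the driver stays on the submitting machine and its output appears in your terminal.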