spark-submit (Spark version 2.4.2)
spark-submit official documentation: http://spark.apache.org/docs/latest/submitting-applications.html
Spark properties official documentation: http://spark.apache.org/docs/latest/configuration.html
Launching Applications with spark-submit
./bin/spark-submit \
--class <main-class> \
--master <master-url> \
--deploy-mode <deploy-mode> \
--conf <key>=<value> \
... # other options
<application-jar> \
[application-arguments]
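For example, a submission of a jar to a standalone cluster in client mode might look like the following sketch; the main class com.example.MyApp, the master host, and the jar path are placeholder values for illustration, not taken from an actual cluster:
./bin/spark-submit \
  --class com.example.MyApp \
  --master spark://master-host:7077 \
  --deploy-mode client \
  --executor-memory 2G \
  --total-executor-cores 8 \
  /path/to/my-app.jar \
  arg1 arg2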
The Spark shell and the spark-submit tool support two ways to load configurations dynamically. The first is command-line options, such as --master, as shown above. spark-submit can accept any Spark property using the --conf flag, but it uses special flags for properties that play a part in launching the Spark application. Running ./bin/spark-submit --help will show the complete list of these options.
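As a sketch of the --conf form (assuming a placeholder jar my-app.jar; spark.eventLog.enabled and spark.executor.extraJavaOptions are standard Spark properties), arbitrary properties are passed as key=value pairs, and values that contain spaces must be wrapped in quotes:
./bin/spark-submit \
  --class com.example.MyApp \
  --master yarn \
  --conf spark.eventLog.enabled=false \
  --conf "spark.executor.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCTimeStamps" \
  /path/to/my-app.jar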
Usage: spark-submit [options] <app jar | python file | R file> [app arguments]
Usage: spark-submit --kill [submission ID] --master [spark://...]
Usage: spark-submit --status [submission ID] --master [spark://...]
Usage: spark-submit run-example [options] example-class [example args]
Some of the commonly used options are:
Options:
  --master MASTER_URL          spark://host:port, mesos://host:port, yarn,
                               k8s://https://host:port, or local (Default: local[*]).
  --deploy-mode DEPLOY_MODE    Whether to launch the driver program locally ("client") or
                               on one of the worker machines inside the cluster ("cluster")
                               (Default: client).
  --class CLASS_NAME           Your application's main class (for Java / Scala apps).
  --name NAME                  A name of your application.
  --jars JARS                  Comma-separated list of jars to include on the driver
                               and executor classpaths.
  --packages                   Comma-separated list of maven coordinates of jars to include
                               on the driver and executor classpaths. Will search the local
                               maven repo, then maven central and any additional remote
                               repositories given by --repositories. The format for the
                               coordinates should be groupId:artifactId:version.
  --exclude-packages           Comma-separated list of groupId:artifactId, to exclude while
                               resolving the dependencies provided in --packages to avoid
                               dependency conflicts.
  --repositories               Comma-separated list of additional remote repositories to
                               search for the maven coordinates given with --packages.
  --py-files PY_FILES          Comma-separated list of .zip, .egg, or .py files to place
                               on the PYTHONPATH for Python apps.
  --files FILES                Comma-separated list of files to be placed in the working
                               directory of each executor. File paths of these files
                               in executors can be accessed via SparkFiles.get(fileName).

  --conf PROP=VALUE            Arbitrary Spark configuration property.
  --properties-file FILE       Path to a file from which to load extra properties. If not
                               specified, this will look for conf/spark-defaults.conf.

  --driver-memory MEM          Memory for driver (e.g. 1000M, 2G) (Default: 1024M).
  --driver-java-options        Extra Java options to pass to the driver.
  --driver-library-path        Extra library path entries to pass to the driver.
  --driver-class-path          Extra class path entries to pass to the driver. Note that
                               jars added with --jars are automatically included in the
                               classpath.

  --executor-memory MEM        Memory per executor (e.g. 1000M, 2G) (Default: 1G).

  --proxy-user NAME            User to impersonate when submitting the application.
                               This argument does not work with --principal / --keytab.

  --help, -h                   Show this help message and exit.
  --verbose, -v                Print additional debug output.
  --version                    Print the version of current Spark.

 Cluster deploy mode only:
  --driver-cores NUM           Number of cores used by the driver, only in cluster mode
                               (Default: 1).

 Spark standalone or Mesos with cluster deploy mode only:
  --supervise                  If given, restarts the driver on failure.
  --kill SUBMISSION_ID         If given, kills the driver specified.
  --status SUBMISSION_ID       If given, requests the status of the driver specified.

 Spark standalone and Mesos only:
  --total-executor-cores NUM   Total cores for all executors.

 Spark standalone and YARN only:
  --executor-cores NUM         Number of cores per executor. (Default: 1 in YARN mode,
                               or all available cores on the worker in standalone mode)

 YARN-only:
  --queue QUEUE_NAME           The YARN queue to submit to (Default: "default").
  --num-executors NUM          Number of executors to launch (Default: 2).
                               If dynamic allocation is enabled, the initial number of
                               executors will be at least NUM.
  --archives ARCHIVES          Comma separated list of archives to be extracted into the
                               working directory of each executor.
  --principal PRINCIPAL        Principal to be used to login to KDC, while running on
                               secure HDFS.
  --keytab KEYTAB              The full path to the file that contains the keytab for the
                               principal specified above. This keytab will be copied to
                               the node running the Application Master via the Secure
                               Distributed Cache, for renewing the login tickets and the
                               delegation tokens periodically.
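Combining several of the options above, a YARN cluster-mode submission could look roughly like this; the class name, queue, file paths, and resource sizes are placeholder values for illustration, not recommendations:
./bin/spark-submit \
  --class com.example.MyApp \
  --master yarn \
  --deploy-mode cluster \
  --queue default \
  --num-executors 4 \
  --executor-cores 2 \
  --executor-memory 2G \
  --driver-memory 1G \
  --files /path/to/app.conf \
  /path/to/my-app.jar \
  arg1 arg2
A driver submitted in cluster mode on a standalone master can later be checked or stopped with the --status and --kill forms from the usage section, passing the submission ID reported when the application was submitted.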