Submitting Spark Jobs in the Various Deployment Modes
Preface
Parts of this article are translated from:
http://spark.apache.org/docs/latest/submitting-applications.html
Submitting Applications
The spark-submit script in Spark's bin directory is used to launch applications on a cluster. It can use all of Spark's supported cluster managers through a uniform interface, so you don't have to configure your application specially for each one.
Bundling Your Application's Dependencies
If your code depends on other projects, you will need to package them alongside your application in order to distribute the code to a Spark cluster. To do this, create an assembly jar (or "uber" jar) containing your code and its dependencies. Both sbt and Maven have assembly plugins. When creating the assembly jar, list Spark and Hadoop as provided dependencies; these need not be bundled since they are provided by the cluster manager at runtime. Once you have an assembled jar, you can call the bin/spark-submit script, passing your jar. For Python, you can use spark-submit's --py-files argument to add .py, .zip or .egg files to be distributed with your application. If you depend on multiple Python files, we recommend packaging them into a .zip or .egg.
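As a minimal sketch of that workflow (the project layout, jar name and main class below are hypothetical placeholders, not taken from this article, and assume the sbt assembly plugin is already configured with Spark and Hadoop marked "provided"):

# Build an assembly ("uber") jar; Spark and Hadoop are not bundled into it
sbt assembly

# Submit the assembled jar; adjust the class and jar path to your project
./bin/spark-submit \
  --class com.example.MyApp \
  --master local[4] \
  target/scala-2.11/my-app-assembly-1.0.jar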
Launching Applications with spark-submit
Once a user application is bundled, it can be launched using the bin/spark-submit script. This script takes care of setting up the classpath with Spark and its dependencies, and can support the different cluster managers and deploy modes that Spark supports:
./bin/spark-submit \
  --class <main-class> \
  --master <master-url> \
  --deploy-mode <deploy-mode> \
  --conf <key>=<value> \
  ... # other options
  <application-jar> \
  [application-arguments]
Some of the commonly used options are listed below (illustrative invocations follow the list):
--class: the entry point for your application (e.g. org.apache.spark.examples.SparkPi).
--master: the master URL for the cluster (e.g. spark://23.195.26.187:7077).
--deploy-mode: whether to deploy your driver on the worker nodes (cluster) or locally as an external client (client) (default: client).
--conf: an arbitrary Spark configuration property in key=value format. For values that contain spaces, wrap "key=value" in quotes (as shown).
application-jar: path to a bundled jar including your application and all dependencies. The URL must be globally visible inside the cluster, for instance an hdfs:// path or a file:// path that is present on all nodes.
application-arguments: arguments passed to the main method of your main class, if any.
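A few illustrative invocations, in the spirit of the official documentation (the master address and resource sizes below are placeholders):

# Run the application locally on 8 cores
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master local[8] \
  examples/jars/spark-examples_2.11-2.4.0.jar \
  100

# Run on a Spark standalone cluster in client deploy mode
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://207.184.161.138:7077 \
  --executor-memory 2G \
  --total-executor-cores 10 \
  examples/jars/spark-examples_2.11-2.4.0.jar \
  100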
A common deployment strategy is to submit your application from a gateway machine that is physically co-located with your worker machines (e.g. the master node in a standalone EC2 cluster). In this setup, client mode is appropriate: the driver is launched directly within the spark-submit process, which acts as a client to the cluster, and the application's input and output are attached to the console. This mode is therefore especially suitable for applications that involve a REPL (e.g. the Spark shell).
Alternatively, if your application is submitted from a machine far from the worker machines (e.g. locally on your laptop), it is common to use cluster mode to minimize network latency between the driver and the executors. Currently, standalone mode does not support cluster mode for Python applications.
For Python applications, simply pass a .py file in place of <application-jar>, and add Python .zip, .egg or .py files to the search path with --py-files.
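For instance (my_app.py and deps.zip are hypothetical file names):

./bin/spark-submit \
  --master local[2] \
  --py-files deps.zip \
  my_app.py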
There are also a few options specific to the cluster manager being used. For example, with a Spark standalone cluster in cluster deploy mode, you can specify --supervise to make sure the driver is automatically restarted if it fails with a non-zero exit code. To enumerate all options available to spark-submit, run it with --help, as sketched below.
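A sketch of both (the master address matches the standalone cluster used later in this article; the jar is the bundled SparkPi example):

# Enumerate all spark-submit options
./bin/spark-submit --help

# Standalone cluster deploy mode with automatic driver restart on failure
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://192.168.217.201:7077 \
  --deploy-mode cluster \
  --supervise \
  examples/jars/spark-examples_2.11-2.4.0.jar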
Running Spark jobs in the various modes
Local mode
[root@hadoop1 spark--bin-hadoop2.]# ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master local examples/jars/spark-examples_2.11-2.4.0.jar
-- :: WARN NativeCodeLoader: - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
-- :: INFO SparkContext: - Running Spark version
-- :: INFO SparkContext: - Submitted application: Spark Pi
-- :: INFO SecurityManager: - Changing view acls to: root
-- :: INFO SecurityManager: - Changing modify acls to: root
-- :: INFO SecurityManager: - Changing view acls groups to:
-- :: INFO SecurityManager: - Changing modify acls groups to:
-- :: INFO SecurityManager: - SecurityManager: authentication disabled; ui acls disabled; users with view permiss(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
-- :: INFO Utils: - Successfully started service .
-- :: INFO SparkEnv: - Registering MapOutputTracker
-- :: INFO SparkEnv: - Registering BlockManagerMaster
-- :: INFO BlockManagerMasterEndpoint: - Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
-- :: INFO BlockManagerMasterEndpoint: - BlockManagerMasterEndpoint up
-- :: INFO DiskBlockManager: - Created local directory at /tmp/blockmgr-4ddfef66---b029-05332cfa70a9
-- :: INFO MemoryStore: - MemoryStore started with capacity 413.9 MB
-- :: INFO SparkEnv: - Registering OutputCommitCoordinator
-- :: INFO log: - Logging initialized @9713ms
-- :: INFO Server: - jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
-- :: INFO Server: - Started @9891ms
-- :: INFO AbstractConnector: - Started ServerConnector@40e4ea87{HTTP/}
-- :: INFO Utils: - Successfully started service .
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@1a38ba58{/jobs,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@24b52d3e{/jobs/json,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@15deb1dc{/jobs/job,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@57a4d5ee{/jobs/job/json,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@5af5def9{/stages,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@3a45c42a{/stages/json,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@36dce7ed{/stages/stage,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@27a0a5a2{/stages/stage/json,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@7692cd34{/stages/pool,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@33aa93c{/stages/pool/json,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@32c0915e{/storage,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@106faf11{/storage/json,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@70f43b45{/storage/rdd,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@26d10f2e{/storage/rdd/json,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@10ad20cb{/environment,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@7dd712e8{/environment/json,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@2c282004{/executors,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@22ee2d0{/executors/json,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@7bfc3126{/executors/threadDump,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@3e792ce3{/executors/threadDump/json,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@53bc1328{/static,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@e041f0c{/,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@6a175569{/api,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@4102b1b1{/jobs/job/kill,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@61a5b4ae{/stages/stage/kill,null,AVAILABLE,@Spark}
-- :: INFO SparkUI: - Bound SparkUI to 192.168.217.201, and started at http://hadoop1.org.cn:4040
-- :: INFO SparkContext: - Added JAR file:/usr/hdp/spark--bin-hadoop2./examples/jars/spark-examples_2.11-2.4.0.jar at spark://hadoop1.org.cn:48468/jars/spark-examples_2.11-2.4.0.jar with timestamp 1550758442663
-- :: INFO Executor: - Starting executor ID driver on host localhost
-- :: INFO Utils: - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on por
-- :: INFO NettyBlockTransferService: - Server created on hadoop1.org.cn:
-- :: INFO BlockManager: - Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
-- :: INFO BlockManagerMaster: - Registering BlockManager BlockManagerId(driver, hadoop1.org.cn, , None)
-- :: INFO BlockManagerMasterEndpoint: - Registering block manager hadoop1.org.cn: with , None)
-- :: INFO BlockManagerMaster: - Registered BlockManager BlockManagerId(driver, hadoop1.org.cn, , None)
-- :: INFO BlockManager: - Initialized BlockManager: BlockManagerId(driver, hadoop1.org.cn, , None)
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@61a91912{/metrics/json,null,AVAILABLE,@Spark}
-- :: INFO SparkContext: - Starting job: reduce at SparkPi.scala:
-- :: INFO DAGScheduler: - Got job (reduce at SparkPi.scala:) with output partitions
-- :: INFO DAGScheduler: - Final stage: ResultStage (reduce at SparkPi.scala:)
-- :: INFO DAGScheduler: - Parents of final stage: List()
-- :: INFO DAGScheduler: - Missing parents: List()
-- :: INFO DAGScheduler: - Submitting ResultStage (MapPartitionsRDD[] at map at SparkPi.scala:), which has no missing parents
-- :: INFO MemoryStore: - Block broadcast_0 stored as values in memory (estimated size 1936.0 B, free 413.9 MB)
-- :: INFO MemoryStore: - Block broadcast_0_piece0 stored as bytes in memory (estimated size 1256.0 B, free 413.9
-- :: INFO BlockManagerInfo: - Added broadcast_0_piece0 (size:
-- :: INFO SparkContext: - Created broadcast
-- :: INFO DAGScheduler: - Submitting missing tasks (MapPartitionsRDD[] at map at SparkPi.scala:) (first tasks are , ))
-- :: INFO TaskSchedulerImpl: - Adding task tasks
-- :: INFO TaskSetManager: - Starting task , localhost, executor driver, partition , PROCE bytes)
-- :: INFO Executor: - Running task )
-- :: INFO Executor: - Fetching spark://hadoop1.org.cn:48468/jars/spark-examples_2.11-2.4.0.jar with timestamp 1553
-- :: INFO TransportClientFactory: - Successfully created connection to hadoop1.org.cn/ afte( ms spent in bootstraps)
-- :: INFO Utils: - Fetching spark://hadoop1.org.cn:48468/jars/spark-examples_2.11-2.4.0.jar to /tmp/spark-e9e2c8bda-9d3d-4a4f9671b0d9/userFiles-e2c1980d-6d11-48f1-8422-2b637ce7a1fb/fetchFileTemp701584033085131304.tmp
-- :: INFO Executor: - Adding file:/tmp/spark-e9e2c8b5-0a08-4dda-9d3d-4a4f9671b0d9/userFiles-e2c1980d-6d11-48f1-8422-2b637ce7a1fb/spark-examples_2.11-2.4.0.jar to class loader
-- :: INFO Executor: - Finished task ). bytes result sent to driver
-- :: INFO TaskSetManager: - Starting task , localhost, executor driver, partition , PROCE bytes)
-- :: INFO Executor: - Running task )
-- :: INFO Executor: - Finished task ). bytes result sent to driver
-- :: INFO TaskSetManager: - Finished task ) ms on localhost (executor driver) (/
-- :: INFO TaskSetManager: - Finished task ) ms on localhost (executor driver) (/)
-- :: INFO TaskSchedulerImpl: - Removed TaskSet 0.0, whose tasks have all completed, from pool
-- :: INFO DAGScheduler: - ResultStage (reduce at SparkPi.scala:) finished in 3.577 s
-- :: INFO DAGScheduler: - Job finished: reduce at SparkPi.scala:, took 4.506013 s
Pi is roughly 3.142475712378562
-- :: INFO AbstractConnector: - Stopped Spark@40e4ea87{HTTP/}
-- :: INFO SparkUI: - Stopped Spark web UI at http://hadoop1.org.cn:4040
-- :: INFO MapOutputTrackerMasterEndpoint: - MapOutputTrackerMasterEndpoint stopped!
-- :: INFO MemoryStore: - MemoryStore cleared
-- :: INFO BlockManager: - BlockManager stopped
-- :: INFO BlockManagerMaster: - BlockManagerMaster stopped
-- :: INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: - OutputCommitCoordinator stopped!
-- :: INFO SparkContext: - Successfully stopped SparkContext
-- :: INFO ShutdownHookManager: - Shutdown hook called
-- :: INFO ShutdownHookManager: - Deleting directory /tmp/spark-3f8eab55-786c--9f99-ca779610ee0d
-- :: INFO ShutdownHookManager: - Deleting directory /tmp/spark-e9e2c8b5-0a08-4dda-9d3d-4a4f9671b0d9
Standalone mode
[root@hadoop1 spark--bin-hadoop2.]# ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://192.168.217.201:7077 examples/jars/spark-examples_2.11-2.4.0.jar
-- :: WARN NativeCodeLoader: - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
-- :: INFO SparkContext: - Running Spark version
-- :: INFO SparkContext: - Submitted application: Spark Pi
-- :: INFO SecurityManager: - Changing view acls to: root
-- :: INFO SecurityManager: - Changing modify acls to: root
-- :: INFO SecurityManager: - Changing view acls groups to:
-- :: INFO SecurityManager: - Changing modify acls groups to:
-- :: INFO SecurityManager: - SecurityManager: authentication disabled; ui acls disabled; users with view permiss(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
-- :: INFO Utils: - Successfully started service .
-- :: INFO SparkEnv: - Registering MapOutputTracker
-- :: INFO SparkEnv: - Registering BlockManagerMaster
-- :: INFO BlockManagerMasterEndpoint: - Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
-- :: INFO BlockManagerMasterEndpoint: - BlockManagerMasterEndpoint up
-- :: INFO DiskBlockManager: - Created local directory at /tmp/blockmgr--9ee3---79c2902a5a4d
-- :: INFO MemoryStore: - MemoryStore started with capacity 413.9 MB
-- :: INFO SparkEnv: - Registering OutputCommitCoordinator
-- :: INFO log: - Logging initialized @9124ms
-- :: INFO Server: - jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
-- :: INFO Server: - Started @9269ms
-- :: INFO AbstractConnector: - Started ServerConnector@3a7b503d{HTTP/}
-- :: INFO Utils: - Successfully started service .
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@6058e535{/jobs,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@6e9c413e{/jobs/json,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@57a4d5ee{/jobs/job,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@3a45c42a{/jobs/job/json,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@36dce7ed{/stages,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@47a64f7d{/stages/json,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@33d05366{/stages/stage,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@33aa93c{/stages/stage/json,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@32c0915e{/stages/pool,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@106faf11{/stages/pool/json,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@70f43b45{/storage,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@26d10f2e{/storage/json,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@10ad20cb{/storage/rdd,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@7dd712e8{/storage/rdd/json,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@2c282004{/environment,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@22ee2d0{/environment/json,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@7bfc3126{/executors,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@3e792ce3{/executors/json,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@53bc1328{/executors/threadDump,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@26f143ed{/executors/threadDump/json,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@3c1e3314{/static,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@{/,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@3f3c966c{/api,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@3a71c100{/jobs/job/kill,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@5b69fd74{/stages/stage/kill,null,AVAILABLE,@Spark}
-- :: INFO SparkUI: - Bound SparkUI to 192.168.217.201, and started at http://hadoop1.org.cn:4040
-- :: INFO SparkContext: - Added JAR file:/usr/hdp/spark--bin-hadoop2./examples/jars/spark-examples_2.11-2.4.0.jar at spark://hadoop1.org.cn:40178/jars/spark-examples_2.11-2.4.0.jar with timestamp 1550758781088
-- :: INFO StandaloneAppClient$ClientEndpoint: - Connecting to master spark://192.168.217.201:7077...
-- :: INFO TransportClientFactory: - Successfully created connection to / after ms ( ms sootstraps)
-- :: INFO StandaloneSchedulerBackend: - Connected to Spark cluster with app ID app--
-- :: INFO Utils: - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on por
-- :: INFO NettyBlockTransferService: - Server created on hadoop1.org.cn:
-- :: INFO BlockManager: - Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
-- :: INFO StandaloneAppClient$ClientEndpoint: - Executor added: app--/ on worker-- () with core(s)
-- :: INFO StandaloneSchedulerBackend: - Granted executor ID app--/ on hostPort .202th core(s), 1024.0 MB RAM
-- :: INFO StandaloneAppClient$ClientEndpoint: - Executor added: app--/ on worker-- () with core(s)
-- :: INFO StandaloneSchedulerBackend: - Granted executor ID app--/ on hostPort .203th core(s), 1024.0 MB RAM
-- :: INFO StandaloneAppClient$ClientEndpoint: - Executor added: app--/ on worker-- () with core(s)
-- :: INFO StandaloneSchedulerBackend: - Granted executor ID app--/ on hostPort .201th core(s), 1024.0 MB RAM
-- :: INFO StandaloneAppClient$ClientEndpoint: - Executor updated: app--/ is now RUNNING
-- :: INFO StandaloneAppClient$ClientEndpoint: - Executor updated: app--/ is now RUNNING
-- :: INFO BlockManagerMaster: - Registering BlockManager BlockManagerId(driver, hadoop1.org.cn, , None)
-- :: INFO BlockManagerMasterEndpoint: - Registering block manager hadoop1.org.cn: with , None)
-- :: INFO StandaloneAppClient$ClientEndpoint: - Executor updated: app--/ is now RUNNING
-- :: INFO BlockManagerMaster: - Registered BlockManager BlockManagerId(driver, hadoop1.org.cn, , None)
-- :: INFO BlockManager: - Initialized BlockManager: BlockManagerId(driver, hadoop1.org.cn, , None)
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@726a17c4{/metrics/json,null,AVAILABLE,@Spark}
-- :: INFO StandaloneSchedulerBackend: - SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
-- :: INFO CoarseGrainedSchedulerBackend$DriverEndpoint: - Registered executor NettyRpcEndpointRef(spark-client:// (192.168.217.202:38810) with ID 0
-- :: INFO CoarseGrainedSchedulerBackend$DriverEndpoint: - Registered executor NettyRpcEndpointRef(spark-client:// (192.168.217.203:47346) with ID 1
-- :: INFO BlockManagerMasterEndpoint: - Registering block manager with , None)
-- :: INFO BlockManagerMasterEndpoint: - Registering block manager with , None)
-- :: INFO SparkContext: - Starting job: reduce at SparkPi.scala:
-- :: INFO DAGScheduler: - Got job (reduce at SparkPi.scala:) with output partitions
-- :: INFO DAGScheduler: - Final stage: ResultStage (reduce at SparkPi.scala:)
-- :: INFO DAGScheduler: - Parents of final stage: List()
-- :: INFO DAGScheduler: - Missing parents: List()
-- :: INFO DAGScheduler: - Submitting ResultStage (MapPartitionsRDD[] at map at SparkPi.scala:), which has no missing parents
-- :: INFO MemoryStore: - Block broadcast_0 stored as values in memory (estimated size 1936.0 B, free 413.9 MB)
-- :: INFO MemoryStore: - Block broadcast_0_piece0 stored as bytes in memory (estimated size 1256.0 B, free 413.9
-- :: INFO BlockManagerInfo: - Added broadcast_0_piece0 (size:
-- :: INFO SparkContext: - Created broadcast
-- :: INFO DAGScheduler: - Submitting missing tasks (MapPartitionsRDD[] at map at SparkPi.scala:) (first tasks are , ))
-- :: INFO TaskSchedulerImpl: - Adding task tasks
-- :: INFO TaskSetManager: - Starting task , , partition , PROC, bytes)
-- :: INFO TaskSetManager: - Starting task , , partition , PROC, bytes)
-- :: INFO BlockManagerInfo: - Added broadcast_0_piece0 (size: 1256.0 B, free:
-- :: INFO BlockManagerInfo: - Added broadcast_0_piece0 (size: 1256.0 B, free:
-- :: INFO TaskSetManager: - Finished task ) ms on ) (/
-- :: INFO TaskSetManager: - Finished task ) ms on ) (/
-- :: INFO DAGScheduler: - ResultStage (reduce at SparkPi.scala:) finished in 16.998 s
-- :: INFO TaskSchedulerImpl: - Removed TaskSet 0.0, whose tasks have all completed, from pool
-- :: INFO DAGScheduler: - Job finished: reduce at SparkPi.scala:, took 20.610491 s
Pi is roughly 3.1427357136785683
-- :: INFO AbstractConnector: - Stopped Spark@3a7b503d{HTTP/}
-- :: INFO SparkUI: - Stopped Spark web UI at http://hadoop1.org.cn:4040
-- :: INFO StandaloneSchedulerBackend: - Shutting down all executors
-- :: INFO CoarseGrainedSchedulerBackend$DriverEndpoint: - Asking each executor to shut down
-- :: INFO StandaloneAppClient$ClientEndpoint: - Executor updated: app--/ )
-- :: INFO StandaloneSchedulerBackend: - Executor app--/ removed: Command exited with code
-- :: INFO StandaloneAppClient$ClientEndpoint: - Executor added: app--/ on worker-- () with core(s)
-- :: INFO StandaloneSchedulerBackend: - Granted executor ID app--/ on hostPort .202th core(s), 1024.0 MB RAM
-- :: INFO StandaloneAppClient$ClientEndpoint: - Executor updated: app--/ is now RUNNING
-- :: INFO MapOutputTrackerMasterEndpoint: - MapOutputTrackerMasterEndpoint stopped!
-- :: INFO MemoryStore: - MemoryStore cleared
-- :: INFO BlockManager: - BlockManager stopped
-- :: INFO BlockManagerMaster: - BlockManagerMaster stopped
-- :: INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: - OutputCommitCoordinator stopped!
-- :: INFO SparkContext: - Successfully stopped SparkContext
-- :: INFO ShutdownHookManager: - Shutdown hook called
-- :: INFO ShutdownHookManager: - Deleting directory /tmp/spark-e9155121-edfe-4f64-b917-be3f9f62220a
-- :: INFO ShutdownHookManager: - Deleting directory /tmp/spark-90e9b57b-c3d4--8eec-906f121d6b98
[root@hadoop1 spark--bin-hadoop2.]#
yarn-cluster mode
The so-called yarn cluster mode hands the Spark job to YARN, which executes it. You therefore need to add export HADOOP_CONF_DIR=/usr/hdp/hadoop-2.8.3/etc/hadoop to the spark-env.sh file first, and then submit the job:
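That is, append the export to conf/spark-env.sh before submitting (the Hadoop path is the one used on this cluster):

echo 'export HADOOP_CONF_DIR=/usr/hdp/hadoop-2.8.3/etc/hadoop' >> conf/spark-env.sh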
[root@hadoop1 spark--bin-hadoop2.]# ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster examples/jars/spark-examples_2.11-2.4.0.jar
-- :: WARN NativeCodeLoader: - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
-- :: INFO RMProxy: - Connecting to ResourceManager at hadoop1/
-- :: INFO Client: - Requesting a new application from cluster with NodeManagers
-- :: INFO Client: - Verifying our application has not requested more than the maximum memory capability of the cluster ( MB per container)
-- :: INFO Client: - Will allocate AM container, with MB memory including MB overhead
-- :: INFO Client: - Setting up container launch context for our AM
-- :: INFO Client: - Setting up the launch environment for our AM container
-- :: INFO Client: - Preparing resources for our AM container
-- :: WARN Client: - Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
-- :: INFO Client: - Uploading resource file:/tmp/spark-2b75f51c-ce24--aa38-6d3262b1c7cb/__spark_libs__7092311691544510332.zip -> hdfs://hadoop1:9000/user/root/.sparkStaging/application_1550757972410_0001/__spark_libs__7092311691544510332.zip
-- :: INFO Client: - Uploading resource file:/usr/hdp/spark--bin-hadoop2./examples/jars/spark-examples_2.-.jar -> hdfs://hadoop1:9000/user/root/.sparkStaging/application_1550757972410_0001/spark-examples_2.11-2.4.0.jar
-- :: INFO Client: - Uploading resource file:/tmp/spark-2b75f51c-ce24--aa38-6d3262b1c7cb/__spark_conf__4473735302996115715.zip -> hdfs://hadoop1:9000/user/root/.sparkStaging/application_1550757972410_0001/__spark_conf__.zip
-- :: INFO SecurityManager: - Changing view acls to: root
-- :: INFO SecurityManager: - Changing modify acls to: root
-- :: INFO SecurityManager: - Changing view acls groups to:
-- :: INFO SecurityManager: - Changing modify acls groups to:
-- :: INFO SecurityManager: - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
-- :: INFO Client: - Submitting application application_1550757972410_0001 to ResourceManager
-- :: INFO YarnClientImpl: - Submitted application application_1550757972410_0001
-- :: INFO Client: - Application report for application_1550757972410_0001 (state: ACCEPTED)
-- :: INFO Client: -
client token: N/A
diagnostics: [Fri Feb :: + ] Scheduler has assigned a container for AM, waiting for AM container to be launched
ApplicationMaster host: N/A
ApplicationMaster RPC port: -
queue: default
start time:
final status: UNDEFINED
tracking URL: http://hadoop1:8088/proxy/application_1550757972410_0001/
user: root
(... the ACCEPTED application report line repeats while YARN schedules the application ...)
-- :: INFO Client: - Application report for application_1550757972410_0001 (state: RUNNING)
-- :: INFO Client: -
client token: N/A
diagnostics: N/A
ApplicationMaster host: hadoop2.org.cn
ApplicationMaster RPC port:
queue: default
start time:
final status: UNDEFINED
tracking URL: http://hadoop1:8088/proxy/application_1550757972410_0001/
user: root
(... the RUNNING application report line repeats while the job executes ...)
-- :: INFO Client: - Application report for application_1550757972410_0001 (state: FINISHED)
-- :: INFO Client: -
client token: N/A
diagnostics: N/A
ApplicationMaster host: hadoop2.org.cn
ApplicationMaster RPC port:
queue: default
start time:
final status: SUCCEEDED
tracking URL: http://hadoop1:8088/proxy/application_1550757972410_0001/
user: root
-- :: INFO ShutdownHookManager: - Shutdown hook called
-- :: INFO ShutdownHookManager: - Deleting directory /tmp/spark-afb05931-e273-4e9f-b38a-d3ca234dfb34
-- :: INFO ShutdownHookManager: - Deleting directory /tmp/spark-2b75f51c-ce24--aa38-6d3262b1c7cb
[root@hadoop1 spark--bin-hadoop2.]#
Running Python applications and running on Kubernetes work analogously and are not covered again here.
yarn-client mode
[root@hadoop1 spark--bin-hadoop2.]# ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-client examples/jars/spark-examples_2.11-2.4.0.jar
Warning: Master yarn-client is deprecated since 2.0. Please use master "yarn" with specified deploy mode instead.
-- :: WARN NativeCodeLoader: - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
-- :: INFO SparkContext: - Running Spark version
-- :: INFO SparkContext: - Submitted application: Spark Pi
-- :: INFO SecurityManager: - Changing view acls to: root
-- :: INFO SecurityManager: - Changing modify acls to: root
-- :: INFO SecurityManager: - Changing view acls groups to:
-- :: INFO SecurityManager: - Changing modify acls groups to:
-- :: INFO SecurityManager: - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
-- :: INFO Utils: - Successfully started service .
-- :: INFO SparkEnv: - Registering MapOutputTracker
-- :: INFO SparkEnv: - Registering BlockManagerMaster
-- :: INFO BlockManagerMasterEndpoint: - Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
-- :: INFO BlockManagerMasterEndpoint: - BlockManagerMasterEndpoint up
-- :: INFO DiskBlockManager: - Created local directory at /tmp/blockmgr-f9baa979-e964-46a9-b034-475ba5148562
-- :: INFO MemoryStore: - MemoryStore started with capacity 413.9 MB
-- :: INFO SparkEnv: - Registering OutputCommitCoordinator
-- :: INFO log: - Logging initialized @9175ms
-- :: INFO Server: - jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
-- :: INFO Server: - Started @9359ms
-- :: INFO AbstractConnector: - Started ServerConnector@47a64f7d{HTTP/}
-- :: INFO Utils: - Successfully started service .
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@49ef32e0{/jobs,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@3be8821f{/jobs/json,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@64b31700{/jobs/job,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@bae47a0{/jobs/job/json,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@74a9c4b0{/stages,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@85ec632{/stages/json,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@1c05a54d{/stages/stage,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@214894fc{/stages/stage/json,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@{/stages/pool,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@e362c57{/stages/pool/json,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@1c4ee95c{/storage,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@79c4715d{/storage/json,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@5aa360ea{/storage/rdd,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@6548bb7d{/storage/rdd/json,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@e27ba81{/environment,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@54336c81{/environment/json,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@1556f2dd{/executors,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@35e52059{/executors/json,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@62577d6{/executors/threadDump,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@49bd54f7{/executors/threadDump/json,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@6b5f8707{/static,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@17ae98d7{/,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@59221b97{/api,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@704b2127{/jobs/job/kill,null,AVAILABLE,@Spark}
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@3ee39da0{/stages/stage/kill,null,AVAILABLE,@Spark}
-- :: INFO SparkUI: - Bound SparkUI to 192.168.217.201, and started at http://hadoop1.org.cn:4040
-- :: INFO SparkContext: - Added JAR file:/usr/hdp/spark--bin-hadoop2./examples/jars/spark-examples_2.-.jar at spark://hadoop1.org.cn:34169/jars/spark-examples_2.11-2.4.0.jar with timestamp 1550766679506
-- :: INFO RMProxy: - Connecting to ResourceManager at hadoop1/
-- :: INFO Client: - Requesting a new application from cluster with NodeManagers
-- :: INFO Client: - Verifying our application has not requested more than the maximum memory capability of the cluster ( MB per container)
-- :: INFO Client: - Will allocate AM container, with MB memory including MB overhead
-- :: INFO Client: - Setting up container launch context for our AM
-- :: INFO Client: - Setting up the launch environment for our AM container
-- :: INFO Client: - Preparing resources for our AM container
-- :: WARN Client: - Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
-- :: INFO Client: - Uploading resource file:/tmp/spark-bc395b60-843f-4e24-841c-1fb09330b89f/__spark_libs__4384247224971462772.zip -> hdfs://hadoop1:9000/user/root/.sparkStaging/application_1550757972410_0006/__spark_libs__4384247224971462772.zip
-- :: INFO Client: - Uploading resource file:/tmp/spark-bc395b60-843f-4e24-841c-1fb09330b89f/__spark_conf__7312670304741942310.zip -> hdfs://hadoop1:9000/user/root/.sparkStaging/application_1550757972410_0006/__spark_conf__.zip
-- :: INFO SecurityManager: - Changing view acls to: root
-- :: INFO SecurityManager: - Changing modify acls to: root
-- :: INFO SecurityManager: - Changing view acls groups to:
-- :: INFO SecurityManager: - Changing modify acls groups to:
-- :: INFO SecurityManager: - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
-- :: INFO Client: - Submitting application application_1550757972410_0006 to ResourceManager
-- :: INFO YarnClientImpl: - Submitted application application_1550757972410_0006
-- :: INFO SchedulerExtensionServices: - Starting Yarn extension services with app application_1550757972410_0006 and attemptId None
-- :: INFO Client: - Application report for application_1550757972410_0006 (state: ACCEPTED)
-- :: INFO Client: -
client token: N/A
diagnostics: AM container is launched, waiting for AM container to Register with RM
ApplicationMaster host: N/A
ApplicationMaster RPC port: -
queue: default
start time:
final status: UNDEFINED
tracking URL: http://hadoop1:8088/proxy/application_1550757972410_0006/
user: root
(... the ACCEPTED application report line repeats while YARN schedules the application ...)
-- :: INFO Client: - Application report for application_1550757972410_0006 (state: RUNNING)
-- :: INFO Client: -
client token: N/A
diagnostics: N/A
ApplicationMaster host: 192.168.217.203
ApplicationMaster RPC port: -
queue: default
start time:
final status: UNDEFINED
tracking URL: http://hadoop1:8088/proxy/application_1550757972410_0006/
user: root
-- :: INFO YarnClientSchedulerBackend: - Application application_1550757972410_0006 has started running.
-- :: INFO Utils: - Successfully started service .
-- :: INFO NettyBlockTransferService: - Server created on hadoop1.org.cn:
-- :: INFO BlockManager: - Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
-- :: INFO BlockManagerMaster: - Registering BlockManager BlockManagerId(driver, hadoop1.org.cn, , None)
-- :: INFO BlockManagerMasterEndpoint: - Registering block manager hadoop1.org.cn: with , None)
-- :: INFO BlockManagerMaster: - Registered BlockManager BlockManagerId(driver, hadoop1.org.cn, , None)
-- :: INFO BlockManager: - Initialized BlockManager: BlockManagerId(driver, hadoop1.org.cn, , None)
-- :: INFO ContextHandler: - Started o.s.j.s.ServletContextHandler@1788cb61{/metrics/json,null,AVAILABLE,@Spark}
-- :: INFO YarnClientSchedulerBackend: - Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> hadoop1, PROXY_URI_BASES -> http://hadoop1:8088/proxy/application_1550757972410_0006), /proxy/application_1550757972410_0006
-- :: INFO JettyUtils: - Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /jobs, /jobs/json, /jobs/job, /jobs/job/json, /stages, /stages/json, /stages/stage, /stages/stage/json, /stages/pool, /stages/pool/json, /storage, /storage/json, /storage/rdd, /storage/rdd/json, /environment, /environment/json, /executors, /executors/json, /executors/threadDump, /executors/threadDump/json, /static, /, /api, /jobs/job/kill, /stages/stage/kill, /metrics/json.
-- :: INFO YarnClientSchedulerBackend: - SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: (ms)
-- :: INFO YarnSchedulerBackend$YarnSchedulerEndpoint: - ApplicationMaster registered as NettyRpcEndpointRef(spark-client://YarnAM)
-- :: INFO SparkContext: - Starting job: reduce at SparkPi.scala:
-- :: INFO DAGScheduler: - Got job (reduce at SparkPi.scala:) with output partitions
-- :: INFO DAGScheduler: - Final stage: ResultStage (reduce at SparkPi.scala:)
-- :: INFO DAGScheduler: - Parents of final stage: List()
-- :: INFO DAGScheduler: - Missing parents: List()
-- :: INFO DAGScheduler: - Submitting ResultStage (MapPartitionsRDD[] at map at SparkPi.scala:), which has no missing parents
-- :: INFO MemoryStore: - Block broadcast_0 stored as values in memory (estimated size 1936.0 B, free 413.9 MB)
-- :: INFO MemoryStore: - Block broadcast_0_piece0 stored as bytes in memory (estimated size 1256.0 B, free 413.9 MB)
-- :: INFO BlockManagerInfo: - Added broadcast_0_piece0 (size: 1256.0 B, free: 413.9 MB)
-- :: INFO SparkContext: - Created broadcast
-- :: INFO DAGScheduler: - Submitting missing tasks (MapPartitionsRDD[] at map at SparkPi.scala:) (first tasks are , ))
-- :: INFO YarnScheduler: - Adding task tasks
-- :: WARN YarnScheduler: - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
-- :: INFO YarnSchedulerBackend$YarnDriverEndpoint: - Registered executor NettyRpcEndpointRef(spark-client://Executor) (192.168.217.202:43729) with ID 1
-- :: INFO TaskSetManager: - Starting task , hadoop2.org.cn, executor , partition , PROCESS_LOCAL, bytes)
-- :: INFO BlockManagerMasterEndpoint: - Registering block manager hadoop2.org.cn: with , hadoop2.org.cn, , None)
-- :: INFO BlockManagerInfo: - Added broadcast_0_piece0 (size: 1256.0 B, free: 413.9 MB)
-- :: INFO TaskSetManager: - Starting task , hadoop2.org.cn, executor , partition , PROCESS_LOCAL, bytes)
-- :: INFO TaskSetManager: - Finished task ) ms on hadoop2.org.cn (executor ) (/)
-- :: INFO TaskSetManager: - Finished task ) ms on hadoop2.org.cn (executor ) (/)
-- :: INFO DAGScheduler: - ResultStage (reduce at SparkPi.scala:) finished in 34.651 s
-- :: INFO YarnScheduler: - Removed TaskSet 0.0, whose tasks have all completed, from pool
-- :: INFO DAGScheduler: - Job finished: reduce at SparkPi.scala:, took 37.594449 s
Pi is roughly 3.14281571407857
-- :: INFO AbstractConnector: - Stopped Spark@47a64f7d{HTTP/}
-- :: INFO SparkUI: - Stopped Spark web UI at http://hadoop1.org.cn:4040
-- :: INFO YarnClientSchedulerBackend: - Interrupting monitor thread
-- :: INFO YarnClientSchedulerBackend: - Shutting down all executors
-- :: INFO YarnSchedulerBackend$YarnDriverEndpoint: - Asking each executor to shut down
-- :: INFO SchedulerExtensionServices: - Stopping SchedulerExtensionServices
(serviceOption=None,
services=List(),
started=false)
-- :: INFO YarnClientSchedulerBackend: - Stopped
-- :: INFO MapOutputTrackerMasterEndpoint: - MapOutputTrackerMasterEndpoint stopped!
-- :: INFO MemoryStore: - MemoryStore cleared
-- :: INFO BlockManager: - BlockManager stopped
-- :: INFO BlockManagerMaster: - BlockManagerMaster stopped
-- :: INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: - OutputCommitCoordinator stopped!
-- :: INFO SparkContext: - Successfully stopped SparkContext
-- :: INFO ShutdownHookManager: - Shutdown hook called
-- :: INFO ShutdownHookManager: - Deleting directory /tmp/spark-ea671cae-988b-4f5b-a85f-184ed2dba58d
-- :: INFO ShutdownHookManager: - Deleting directory /tmp/spark-bc395b60-843f-4e24-841c-1fb09330b89f
Master URLs
The master URL passed to Spark can take one of the following formats (illustrative submissions follow the list):
- local: run Spark locally with one worker thread (i.e. no parallelism at all).
- local[K]: run Spark locally with K worker threads (ideally, set this to the number of cores on your machine).
- local[K,F]: run Spark locally with K worker threads and F maxFailures (see spark.task.maxFailures for an explanation of this variable).
- local[*]: run Spark locally with as many worker threads as logical cores on your machine.
- local[*,F]: run Spark locally with as many worker threads as logical cores on your machine, and F maxFailures.
- spark://HOST:PORT: connect to the given Spark standalone cluster master. The port must be whichever one your master is configured to use, which is 7077 by default.
- spark://HOST1:PORT1,HOST2:PORT2: connect to the given Spark standalone cluster with standby masters with Zookeeper. The list must include all the master hosts in the high availability cluster set up with Zookeeper. The port must be whichever each master is configured to use, which is 7077 by default.
- mesos://HOST:PORT: connect to the given Mesos cluster. The port must be whichever one you have configured to use, which is 5050 by default. Or, for a Mesos cluster using ZooKeeper, use mesos://zk://.... To submit with --deploy-mode cluster, the HOST:PORT should be configured to connect to the MesosClusterDispatcher.
- yarn: connect to a YARN cluster in client or cluster mode depending on the value of --deploy-mode. The cluster location will be found based on the HADOOP_CONF_DIR or YARN_CONF_DIR variable.
- k8s://HOST:PORT: connect to a Kubernetes cluster in cluster mode. Client mode is currently unsupported and will be supported in future releases. HOST and PORT refer to the [Kubernetes API Server](https://kubernetes.io/docs/reference/generated/kube-apiserver/). It connects using TLS by default. In order to force it to use an unsecured connection, you can use k8s://http://HOST:PORT.
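For illustration, submissions against the last three master types might look like this (all host names, ports and the container image are placeholders, not values verified against any particular cluster):

# YARN: the cluster location is read from HADOOP_CONF_DIR / YARN_CONF_DIR
./bin/spark-submit --class org.apache.spark.examples.SparkPi \
  --master yarn --deploy-mode cluster \
  examples/jars/spark-examples_2.11-2.4.0.jar

# Mesos behind ZooKeeper
./bin/spark-submit --class org.apache.spark.examples.SparkPi \
  --master mesos://zk://zk1:2181,zk2:2181/mesos \
  examples/jars/spark-examples_2.11-2.4.0.jar

# Kubernetes (cluster mode only); the image must contain Spark, and the
# jar path uses the local: scheme, i.e. it exists inside the image
./bin/spark-submit --class org.apache.spark.examples.SparkPi \
  --master k8s://https://k8s-apiserver.example.com:6443 --deploy-mode cluster \
  --conf spark.kubernetes.container.image=registry.example.com/spark:2.4.0 \
  local:///opt/spark/examples/jars/spark-examples_2.11-2.4.0.jar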
Loading Configuration from a File
The spark-submit script can load default Spark configuration values from a properties file and pass them on to your application. By default, it reads options from conf/spark-defaults.conf in the Spark directory. For more detail, see the section on loading default configurations.
Loading default Spark configurations this way can obviate the need for certain flags to spark-submit. For instance, if the spark.master property is set, you can safely omit the --master flag from spark-submit. In general, configuration values explicitly set on a SparkConf take the highest precedence, then flags passed to spark-submit, then values in the defaults file.
If you are ever unclear where configuration options are coming from, you can print out fine-grained debugging information by running spark-submit with the --verbose option.
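As a minimal sketch, a conf/spark-defaults.conf along these lines (the values are illustrative; the master address matches the standalone cluster above) lets you drop --master from the command line:

spark.master                     spark://192.168.217.201:7077
spark.executor.memory            1g
spark.serializer                 org.apache.spark.serializer.KryoSerializer

# --master is now optional; --verbose prints where each setting came from
./bin/spark-submit --class org.apache.spark.examples.SparkPi --verbose examples/jars/spark-examples_2.11-2.4.0.jar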
Advanced Dependency Management
When using spark-submit, the application jar along with any jars included with the --jars option will be automatically transferred to the cluster. URLs supplied after --jars must be separated by commas. That list is included on the driver and executor classpaths. Directory expansion does not work with --jars.
Spark uses the following URL schemes to allow different strategies for disseminating jars:
file: - absolute paths and file:// URIs are served by the driver's HTTP file server, and every executor pulls the file from the driver's HTTP server.
hdfs:, http:, https:, ftp: - these pull down files and JARs from the URI as expected.
local: - a URI starting with local:/ is expected to exist as a local file on each worker node. This means no network IO is incurred, and it works well for large files/JARs that are pushed to each worker or shared via NFS, GlusterFS, etc.
Note that JARs and files are copied to the working directory for each SparkContext on the executor nodes. This can use up a significant amount of space over time and will need to be cleaned up. With YARN, cleanup is handled automatically; with Spark standalone, automatic cleanup can be configured with the spark.worker.cleanup.appDataTtl property.
Users may also include any other dependencies by supplying a comma-delimited list of Maven coordinates with --packages. All transitive dependencies will be handled when using this command. Additional repositories (or resolvers in SBT) can be added in a comma-delimited fashion with the flag --repositories. (Note that credentials for password-protected repositories can in some cases be supplied in the repository URI, e.g. in https://user:password@host/.... Be careful when supplying credentials this way.) These commands can be used with pyspark, spark-shell, and spark-submit to include Spark Packages.
For Python, the equivalent --py-files option can be used to distribute .egg, .zip and .py libraries to executors.
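A sketch combining these options in one submission (the extra jar paths, deps.zip and my_app.py are placeholders; the Maven coordinate follows the standard groupId:artifactId:version form):

# Extra jars, a Maven package (with its transitive dependencies resolved),
# and Python files shipped to the executors
./bin/spark-submit \
  --master yarn \
  --jars file:///opt/libs/extra1.jar,file:///opt/libs/extra2.jar \
  --packages org.apache.spark:spark-avro_2.11:2.4.0 \
  --py-files deps.zip \
  my_app.py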
More Information
Once you have deployed your application, the cluster mode overview describes the components involved in distributed execution, and how to monitor and debug applications.