A Flink job submission failure: symptoms and root-cause investigation
- Exception encountered
2019-12-24 16:49:59,019 INFO org.apache.flink.yarn.YarnClusterClient - Starting client actor system.
2019-12-24 16:49:59,033 INFO org.apache.flink.yarn.YarnClusterClient - Trying to start actor system at 10-30-63-28_uf.cluster.ds.mosaic.com:0
2019-12-24 16:49:59,686 INFO org.apache.flink.yarn.YarnClusterClient - Actor system started at akka.tcp://flink@10-30-63-28_uf.cluster.ds.mosaic.com:33557
------------------------------------------------------------
The program finished with the following exception:
java.lang.RuntimeException: Unable to tell application master to stop once the specified job has been finised
at org.apache.flink.yarn.YarnClusterClient.stopAfterJob(YarnClusterClient.java:129)
at org.apache.flink.yarn.YarnClusterClient.submitJob(YarnClusterClient.java:154)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:486)
at org.apache.flink.client.program.DetachedEnvironment.finalizeExecute(DetachedEnvironment.java:77)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:432)
at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:816)
at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:290)
at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:216)
at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1053)
at org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1129)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1727)
at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1129)
Caused by: org.apache.flink.util.FlinkException: Could not connect to the leading JobManager. Please check that the JobManager is running.
at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:956)
at org.apache.flink.yarn.YarnClusterClient.stopAfterJob(YarnClusterClient.java:124)
... 14 more
Caused by: org.apache.flink.runtime.leaderretrieval.LeaderRetrievalException: Could not retrieve the leader gateway.
at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:83)
at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:951)
... 15 more
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:223)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:227)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:190)
at scala.concurrent.Await.result(package.scala)
at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:81)
... 16 more
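When reading a chained Java stack trace like the one above, the innermost `Caused by:` line is usually the most specific hint (here, the 10-second timeout while retrieving the leader gateway). The following is a minimal sketch of pulling that line out of a captured trace. The function name `root_cause` and the shortened `sample` text are illustrative assumptions, not part of Flink.

```python
def root_cause(trace: str) -> str:
    """Return the text of the innermost 'Caused by:' line.

    If the trace has no cause chain, fall back to the first line that is
    not a stack frame ('at ...') or an elision marker ('... N more').
    """
    causes = [
        line.strip()[len("Caused by:"):].strip()
        for line in trace.splitlines()
        if line.strip().startswith("Caused by:")
    ]
    if causes:
        return causes[-1]
    for line in trace.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith(("at ", "...")):
            return stripped
    return ""


# Shortened copy of the trace above, used only as sample input.
sample = """java.lang.RuntimeException: Unable to tell application master to stop once the specified job has been finised
at org.apache.flink.yarn.YarnClusterClient.stopAfterJob(YarnClusterClient.java:129)
Caused by: org.apache.flink.util.FlinkException: Could not connect to the leading JobManager. Please check that the JobManager is running.
... 14 more
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]
... 16 more"""

print(root_cause(sample))
```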
- Anomaly found:
- The hostname is not fully resolved in the shell
[d1_mosaic_bigdata_pa@10-30-63-28_uf ol-mpr]$ hostname
10-30-63-28_uf.cluster.ds.mosaic.com
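The name returned by `hostname` contains an underscore, which the host name syntax in RFC 952 / RFC 1123 does not allow (only letters, digits and hyphens per label). The post does not state that this is the cause of the failure, so the sketch below is only one possible sanity check. The helper name `invalid_labels` is an assumption made for illustration.

```python
import re

# One DNS label per RFC 952 / RFC 1123: letters, digits and hyphens,
# 1 to 63 characters, not starting or ending with a hyphen.
_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def invalid_labels(hostname: str) -> list:
    """Return the dot-separated labels of `hostname` that break the rule."""
    name = hostname.rstrip(".")  # tolerate a trailing root dot
    return [label for label in name.split(".") if not _LABEL.match(label)]


print(invalid_labels("10-30-63-28_uf.cluster.ds.mosaic.com"))
print(invalid_labels("host10306328"))
```

An empty result means every label is syntactically valid; it says nothing about whether the name actually resolves.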
- After changing the hostname
2019-12-24 17:13:10,044 INFO org.apache.flink.yarn.YarnClusterClient - Starting client actor system.
2019-12-24 17:13:10,060 INFO org.apache.flink.yarn.YarnClusterClient - Trying to start actor system at host10306328:0
2019-12-24 17:13:10,706 INFO org.apache.flink.yarn.YarnClusterClient - Actor system started at akka.tcp://flink@host10306328:41187
java.lang.RuntimeException: Unable to tell application master to stop once the specified job has been finised
at org.apache.flink.yarn.YarnClusterClient.stopAfterJob(YarnClusterClient.java:129)
at org.apache.flink.yarn.YarnClusterClient.submitJob(YarnClusterClient.java:154)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:486)
at org.apache.flink.client.program.DetachedEnvironment.finalizeExecute(DetachedEnvironment.java:77)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:432)
at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:816)
at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:290)
at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:216)
at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1053)
at org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1129)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1727)
at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1129)
Caused by: org.apache.flink.util.FlinkException: Could not find out our own hostname by connecting to the leading JobManager. Please make sure that the Flink cluster has been started.
at org.apache.flink.client.program.ClusterClient$LazyActorSystemLoader.get(ClusterClient.java:276)
at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:953)
at org.apache.flink.yarn.YarnClusterClient.stopAfterJob(YarnClusterClient.java:124)
... 14 more
Caused by: org.apache.flink.runtime.leaderretrieval.LeaderRetrievalException: Could not find the connecting address by connecting to the current leader.
at org.apache.flink.runtime.util.LeaderRetrievalUtils.findConnectingAddress(LeaderRetrievalUtils.java:182)
at org.apache.flink.runtime.util.LeaderRetrievalUtils.findConnectingAddress(LeaderRetrievalUtils.java:163)
at org.apache.flink.client.program.ClusterClient$LazyActorSystemLoader.get(ClusterClient.java:272)
... 16 more
Caused by: org.apache.flink.runtime.leaderretrieval.LeaderRetrievalException: Could not retrieve the connecting address to the current leader with the akka URL akka.tcp://flink@emr-worker-190.cluster-40699:45716/user/jobmanager.
at org.apache.flink.runtime.net.ConnectionUtils$LeaderConnectingAddressListener.findConnectingAddress(ConnectionUtils.java:472)
at org.apache.flink.runtime.net.ConnectionUtils$LeaderConnectingAddressListener.findConnectingAddress(ConnectionUtils.java:361)
at org.apache.flink.runtime.util.LeaderRetrievalUtils.findConnectingAddress(LeaderRetrievalUtils.java:180)
... 18 more
Caused by: java.net.UnknownHostException: host10306328: host10306328: Name or service not known
at java.net.InetAddress.getLocalHost(InetAddress.java:1505)
at org.apache.flink.runtime.net.ConnectionUtils.tryLocalHostBeforeReturning(ConnectionUtils.java:190)
at org.apache.flink.runtime.net.ConnectionUtils.findAddressUsingStrategy(ConnectionUtils.java:276)
at org.apache.flink.runtime.net.ConnectionUtils.access$100(ConnectionUtils.java:51)
at org.apache.flink.runtime.net.ConnectionUtils$LeaderConnectingAddressListener.findConnectingAddress(ConnectionUtils.java:413)
... 20 more
Caused by: java.net.UnknownHostException: host10306328: Name or service not known
at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
at java.net.InetAddress.getLocalHost(InetAddress.java:1500)
... 24 more
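The innermost exception above comes from `InetAddress.getLocalHost`, which means the machine could not resolve its own new hostname. A rough equivalent of that check can be run before submitting the job. This is a sketch in Python rather than the JVM call itself, and the function name `check_resolves` and its injectable `resolver` parameter are assumptions made so the logic can be exercised without touching DNS.

```python
import socket


def check_resolves(hostname, resolver=socket.gethostbyname):
    """Return (True, ip) if `hostname` resolves, else (False, error text)."""
    try:
        return True, resolver(hostname)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return False, str(exc)


if __name__ == "__main__":
    # Check the name this machine reports for itself.
    name = socket.gethostname()
    ok, detail = check_resolves(name)
    print(name, "->", detail if ok else "UNRESOLVED: " + detail)
```

If the local name is reported as unresolved, the JVM lookup is likely to fail in the same way until DNS or the hosts file knows the name.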
- Add a hosts entry
10.30.63.28 host10306328
2019-12-24 17:11:30,151 INFO org.apache.flink.yarn.YarnClusterClient - Starting client actor system.
2019-12-24 17:11:30,168 INFO org.apache.flink.yarn.YarnClusterClient - Trying to start actor system at 10-30-63-28_uf.cluster.ds.mosaic.com:0
2019-12-24 17:11:30,811 INFO org.apache.flink.yarn.YarnClusterClient - Actor system started at akka.tcp://flink@10-30-63-28_uf.cluster.ds.mosaic.com:13182
------------------------------------------------------------
The program finished with the following exception:
java.lang.RuntimeException: Unable to tell application master to stop once the specified job has been finised
at org.apache.flink.yarn.YarnClusterClient.stopAfterJob(YarnClusterClient.java:129)
at org.apache.flink.yarn.YarnClusterClient.submitJob(YarnClusterClient.java:154)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:486)
at org.apache.flink.client.program.DetachedEnvironment.finalizeExecute(DetachedEnvironment.java:77)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:432)
at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:816)
at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:290)
at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:216)
at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1053)
at org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1129)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1727)
at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1129)
Caused by: org.apache.flink.util.FlinkException: Could not connect to the leading JobManager. Please check that the JobManager is running.
at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:956)
at org.apache.flink.yarn.YarnClusterClient.stopAfterJob(YarnClusterClient.java:124)
... 14 more
Caused by: org.apache.flink.runtime.leaderretrieval.LeaderRetrievalException: Could not retrieve the leader gateway.
at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:83)
at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:951)
... 15 more
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:223)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:227)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:190)
at scala.concurrent.Await.result(package.scala)
at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:81)
... 16 more
- Comment out the incorrect hosts entry
# 10.30.63.28 10-30-63-28_uf.cluster.ds.mosaic.com
- Submission succeeded
+++ dirname yarn_offltrain.sh
++ cd .
++ pwd
+ BASE_PATH=/data0/d1_mosaic_bigdata_test/mosaic/mosaicx/ol-mpr/offline_train_mainpage_mosaic_formosaic
+ HADOOP_USER_NAME=feed_mosaic
+ FLINK_RUN_MODE=yarn-cluster
+ mosaic_JOB_NAME=offlinetrain-beta_mainpage_mosaic6_base_prepage_v1-auc1
+ FLINK_TASK_MANAGER_NUMBER=15
+ FLINK_TASK_MANAGER_SLOT=5
+ FLINK_TASK_MANAGER_MEMORY=20000
+ FLINK_JOB_MANAGER_MEMORY=20000
+ JAR=mosaic-runtime-2.0.0.jar
+ XML=mosaic_offlinetrain_weiflow.xml
+ NODE=offline_training
+ FEATURE_CONF=feature_prepage.conf
+ export FLINK_LOG_DIR=/tmp
+ FLINK_LOG_DIR=/tmp
+ export FLINK_LOG_DIR=/tmp
+ FLINK_LOG_DIR=/tmp
++ hadoop classpath
+ export 'HADOOP_CLASSPATH=/data0/rsync_data/mosaic/ccConfs/yarn-setting/EMR-118-conf:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/common/lib/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/common/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/hdfs:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/hdfs/lib/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/hdfs/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/yarn/lib/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/yarn/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/mapreduce/lib/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/mapreduce/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/contrib/capacity-scheduler/*.jar'
+ HADOOP_CLASSPATH='/data0/rsync_data/mosaic/ccConfs/yarn-setting/EMR-118-conf:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/common/lib/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/common/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/hdfs:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/hdfs/lib/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/hdfs/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/yarn/lib/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/yarn/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/mapreduce/lib/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/mapreduce/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/contrib/capacity-scheduler/*.jar'
+ /usr/lib/flink-current/bin/flink run -d -m yarn-cluster -yD env.java.opts=-Djava.util.Arrays.useLegacyMergeSort=true -yD web.timeout=1000000 -yD 'env.java.opts=-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=128M' -yD metrics.reporter.monitor._FLINK_CLUSTER_NAME=offlinetrain-beta_mainpage_mosaic6_base_prepage_v1-auc1 -yD metrics.reporters=monitor -yD metrics.reporter.monitor.class=com.mosaic.datasys.mosaic.metrics.WeiboKafkaReporter -yD metrics.reporter.monitor.kafka.bootstrap.servers=10.85.184.204:9092,10.85.184.205:9092 -yD metrics.reporter.monitor.topicName=metrics-topic -yjm 20000 -yn 15 -ytm 20000 -ys 5 -ynm offlinetrain-beta_mainpage_mosaic6_base_prepage_v1-auc1 -c com.mosaic.datasys.mosaic.framework.common.parser.FlowBuilder mosaic-runtime-2.0.0.jar mosaic_offlinetrain_weiflow.xml offline_training
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/data0/rsync_data/mosaic/ccConfs/yarn-setting/mosaic/flink-1.6.2-1.0.0-bin-weiclient/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2019-12-24 17:12:20,400 INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://emr-header-1.cluster-40699:8188/ws/v1/timeline/
2019-12-24 17:12:20,729 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.LegacyYarnClusterDescriptor to locate the jar
2019-12-24 17:12:20,729 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.LegacyYarnClusterDescriptor to locate the jar
2019-12-24 17:12:20,839 INFO org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm2
2019-12-24 17:12:21,011 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: ClusterSpecification{masterMemoryMB=20000, taskManagerMemoryMB=20000, numberTaskManagers=15, slotsPerTaskManager=5}
2019-12-24 17:12:21,376 WARN org.apache.flink.yarn.AbstractYarnClusterDescriptor - The configuration directory ('/data0/rsync_data/mosaic/ccConfs/yarn-setting/mosaic/flink-1.6.2-1.0.0-bin-weiclient/conf') contains both LOG4J and Logback configuration files. Please delete or rename one of them.
2019-12-24 17:13:00,539 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting application master application_1547542611290_3184536
2019-12-24 17:13:00,589 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1547542611290_3184536
2019-12-24 17:13:00,590 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Waiting for the cluster to be allocated
2019-12-24 17:13:00,598 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deploying cluster, current state ACCEPTED
2019-12-24 17:13:07,826 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - YARN application has been deployed successfully.
2019-12-24 17:13:07,828 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - The Flink YARN client has been started in detached mode. In order to stop Flink on YARN, use the following command or a YARN web interface to stop it:
yarn application -kill application_1547542611290_3184536
Please also note that the temporary files of the YARN session in the home directory will not be removed.
Using the parallelism provided by the remote cluster (75). To use another parallelism, set it at the ./bin/flink client.
Starting execution of program
2019-12-24 17:13:07,863 INFO org.apache.flink.yarn.YarnClusterClient - Starting program in interactive mode (detached: true)
============================= Flink Job Name is :offlinetrain-mosaic6-mosaic-base-prepage-auc
============================= task info =============================
== task name: hiveInput1
2019-12-24 17:13:08,300 WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.metastore.local does not exist
2019-12-24 17:13:08,453 WARN org.apache.hadoop.security.UserGroupInformation - No groups available for user feed_mosaic
database:default, table:model_frequence_30_days, partitionFilter:dt=test, fieldSize:49
Schema:[is_act, is_click, is_video_play, ma_id, index_type, m_midtextveccos, fm_1082, fm_10169, fm_10170, fm_10171, fm_1033, fm_1086, fm_1089, fu_210, fu_207, fu_211, fu_2117, fu_2135, fu_215, fu_216, fuu_403, fuu_407, fuu_400, fuu_401, fuu_4019, fuu_402, fuu_405, fuu_409, m_1082, m_1011, m_10148, m_10169, m_10170, m_10171, m_1030, m_1032, m_1040, m_1041, m_1042, m_1063, m_1086, fu_uid, fm_uid, fm_mid, m_mid, m_uid, expo_time, pre_page, dt]
== task name: featureProcess
Read Feature File:
== task name: libsvmProcess
Read Feature File: feature_prepage.conf
== task name: trainProcess
============================= task info =============================
2019-12-24 17:13:10,044 INFO org.apache.flink.yarn.YarnClusterClient - Starting client actor system.
2019-12-24 17:13:10,060 INFO org.apache.flink.yarn.YarnClusterClient - Trying to start actor system at host10306328:0
2019-12-24 17:13:10,706 INFO org.apache.flink.yarn.YarnClusterClient - Actor system started at akka.tcp://flink@host10306328:41187
2019-12-24 17:13:11,050 INFO org.apache.flink.yarn.YarnClusterClient - Waiting until all TaskManagers have connected
Waiting until all TaskManagers have connected
2019-12-24 17:13:11,111 INFO org.apache.flink.yarn.YarnClusterClient - TaskManager status (0/15)
TaskManager status (0/15)
2019-12-24 17:13:13,222 INFO org.apache.flink.yarn.YarnClusterClient - TaskManager status (2/15)
TaskManager status (2/15)
2019-12-24 17:13:13,504 INFO org.apache.flink.yarn.YarnClusterClient - TaskManager status (4/15)
TaskManager status (4/15)
2019-12-24 17:13:13,807 INFO org.apache.flink.yarn.YarnClusterClient - TaskManager status (9/15)
TaskManager status (9/15)
2019-12-24 17:13:14,084 INFO org.apache.flink.yarn.YarnClusterClient - TaskManager status (13/15)
TaskManager status (13/15)
2019-12-24 17:13:14,427 INFO org.apache.flink.yarn.YarnClusterClient - TaskManager status (15/15)
TaskManager status (15/15)
2019-12-24 17:13:14,427 INFO org.apache.flink.yarn.YarnClusterClient - All TaskManagers are connected
All TaskManagers are connected
2019-12-24 17:13:14,448 INFO org.apache.flink.yarn.YarnClusterClient - Submitting Job with JobID: a05e47246348d02d5c4fe5e322c8544d. Returning after job submission.
Submitting Job with JobID: a05e47246348d02d5c4fe5e322c8544d. Returning after job submission.
Job has been submitted with JobID a05e47246348d02d5c4fe5e322c8544d
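Because the fix came down to which hosts entries were active for 10.30.63.28, it can help to list the names an IP currently maps to, ignoring commented-out lines. The sketch below parses hosts-file text passed in as a string. The function name `names_for_ip` is an assumption, and the `localhost` line in the sample is hypothetical; the other two lines are the entries quoted in this post.

```python
def names_for_ip(hosts_text: str, ip: str) -> list:
    """Return the hostnames that active (non-comment) lines map to `ip`."""
    names = []
    for line in hosts_text.splitlines():
        # Drop everything after '#', then split into whitespace fields.
        entry = line.split("#", 1)[0].split()
        if len(entry) >= 2 and entry[0] == ip:
            names.extend(entry[1:])
    return names


# Sample hosts content; the first line is a hypothetical placeholder.
hosts = """\
127.0.0.1 localhost
# 10.30.63.28 10-30-63-28_uf.cluster.ds.mosaic.com
10.30.63.28 host10306328
"""

print(names_for_ip(hosts, "10.30.63.28"))
```

On a real machine the text would typically be read from `/etc/hosts`; that path is the usual Linux location and is not mentioned in the post.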