A Flink job submission failure: symptoms and troubleshooting
- Exception encountered
2019-12-24 16:49:59,019 INFO org.apache.flink.yarn.YarnClusterClient - Starting client actor system.
2019-12-24 16:49:59,033 INFO org.apache.flink.yarn.YarnClusterClient - Trying to start actor system at 10-30-63-28_uf.cluster.ds.mosaic.com:0
2019-12-24 16:49:59,686 INFO org.apache.flink.yarn.YarnClusterClient - Actor system started at akka.tcp://flink@10-30-63-28_uf.cluster.ds.mosaic.com:33557
------------------------------------------------------------
The program finished with the following exception:
java.lang.RuntimeException: Unable to tell application master to stop once the specified job has been finised
at org.apache.flink.yarn.YarnClusterClient.stopAfterJob(YarnClusterClient.java:129)
at org.apache.flink.yarn.YarnClusterClient.submitJob(YarnClusterClient.java:154)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:486)
at org.apache.flink.client.program.DetachedEnvironment.finalizeExecute(DetachedEnvironment.java:77)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:432)
at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:816)
at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:290)
at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:216)
at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1053)
at org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1129)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1727)
at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1129)
Caused by: org.apache.flink.util.FlinkException: Could not connect to the leading JobManager. Please check that the JobManager is running.
at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:956)
at org.apache.flink.yarn.YarnClusterClient.stopAfterJob(YarnClusterClient.java:124)
... 14 more
Caused by: org.apache.flink.runtime.leaderretrieval.LeaderRetrievalException: Could not retrieve the leader gateway.
at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:83)
at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:951)
... 15 more
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:223)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:227)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:190)
at scala.concurrent.Await.result(package.scala)
at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:81)
... 16 more
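With nested traces like the one above, the useful information is usually in the last "Caused by:" line rather than in the top-level RuntimeException. The following is a minimal sketch for pulling that chain out of a pasted trace. The helper names cause_chain and root_cause are hypothetical and not part of Flink, and SAMPLE is an abbreviated copy of the trace above.

```python
import re
from typing import List, Optional, Tuple

# Matches lines such as "Caused by: some.pkg.SomeException: message".
CAUSED_BY = re.compile(r"^\s*Caused by: (?P<cls>[\w.$]+)(?:: (?P<msg>.*))?$")


def cause_chain(trace: str) -> List[Tuple[str, str]]:
    """Return (exception class, message) for each 'Caused by:' line, outermost first."""
    chain = []
    for line in trace.splitlines():
        match = CAUSED_BY.match(line)
        if match:
            chain.append((match.group("cls"), match.group("msg") or ""))
    return chain


def root_cause(trace: str) -> Optional[Tuple[str, str]]:
    """Return the innermost cause, or None if the text has no 'Caused by:' line."""
    chain = cause_chain(trace)
    return chain[-1] if chain else None


# Abbreviated copy of the first trace in this post.
SAMPLE = """\
java.lang.RuntimeException: Unable to tell application master to stop once the specified job has been finised
at org.apache.flink.yarn.YarnClusterClient.stopAfterJob(YarnClusterClient.java:129)
Caused by: org.apache.flink.util.FlinkException: Could not connect to the leading JobManager. Please check that the JobManager is running.
Caused by: org.apache.flink.runtime.leaderretrieval.LeaderRetrievalException: Could not retrieve the leader gateway.
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:223)
"""

if __name__ == "__main__":
    print(root_cause(SAMPLE))
```

For this first trace the innermost cause is only a timeout while retrieving the leader gateway, which says the client could not reach the JobManager but not why.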
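- Anomaly found:
- The hostname appears incomplete in the shell: the prompt shows only 10-30-63-28_uf, while the hostname command returns the full name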
[d1_mosaic_bigdata_pa@10-30-63-28_uf ol-mpr]$ hostname
10-30-63-28_uf.cluster.ds.mosaic.com
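A quick way to look at the same thing outside the shell is to ask the operating system for the host name and then try to resolve it. The sketch below does that with Python's socket module. check_hostname is a hypothetical helper, and the resolver is injectable only so that the function can be exercised without touching real DNS. The short name is simply the part before the first dot, which is presumably what the shell prompt displays.

```python
import socket
from typing import Callable, Optional, Tuple


def check_hostname(
    name: Optional[str] = None,
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> Tuple[str, str, Optional[str]]:
    """Return (full name, short name, resolved IPv4 address or None)."""
    # Fall back to the name the operating system reports for this machine.
    full = name if name is not None else socket.gethostname()
    # The short name is everything before the first dot.
    short = full.split(".", 1)[0]
    try:
        address = resolve(full)
    except OSError:
        # socket.gaierror is a subclass of OSError: the name does not resolve.
        address = None
    return full, short, address


if __name__ == "__main__":
    print(check_hostname())
```

If the third element is None on the submitting machine, the local host name does not resolve there, which is worth ruling out before looking at the cluster side.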
- After changing the hostname
2019-12-24 17:13:10,044 INFO org.apache.flink.yarn.YarnClusterClient - Starting client actor system.
2019-12-24 17:13:10,060 INFO org.apache.flink.yarn.YarnClusterClient - Trying to start actor system at host10306328:0
2019-12-24 17:13:10,706 INFO org.apache.flink.yarn.YarnClusterClient - Actor system started at akka.tcp://flink@host10306328:41187
java.lang.RuntimeException: Unable to tell application master to stop once the specified job has been finised
at org.apache.flink.yarn.YarnClusterClient.stopAfterJob(YarnClusterClient.java:129)
at org.apache.flink.yarn.YarnClusterClient.submitJob(YarnClusterClient.java:154)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:486)
at org.apache.flink.client.program.DetachedEnvironment.finalizeExecute(DetachedEnvironment.java:77)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:432)
at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:816)
at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:290)
at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:216)
at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1053)
at org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1129)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1727)
at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1129)
Caused by: org.apache.flink.util.FlinkException: Could not find out our own hostname by connecting to the leading JobManager. Please make sure that the Flink cluster has been started.
at org.apache.flink.client.program.ClusterClient$LazyActorSystemLoader.get(ClusterClient.java:276)
at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:953)
at org.apache.flink.yarn.YarnClusterClient.stopAfterJob(YarnClusterClient.java:124)
... 14 more
Caused by: org.apache.flink.runtime.leaderretrieval.LeaderRetrievalException: Could not find the connecting address by connecting to the current leader.
at org.apache.flink.runtime.util.LeaderRetrievalUtils.findConnectingAddress(LeaderRetrievalUtils.java:182)
at org.apache.flink.runtime.util.LeaderRetrievalUtils.findConnectingAddress(LeaderRetrievalUtils.java:163)
at org.apache.flink.client.program.ClusterClient$LazyActorSystemLoader.get(ClusterClient.java:272)
... 16 more
Caused by: org.apache.flink.runtime.leaderretrieval.LeaderRetrievalException: Could not retrieve the connecting address to the current leader with the akka URL akka.tcp://flink@emr-worker-190.cluster-40699:45716/user/jobmanager.
at org.apache.flink.runtime.net.ConnectionUtils$LeaderConnectingAddressListener.findConnectingAddress(ConnectionUtils.java:472)
at org.apache.flink.runtime.net.ConnectionUtils$LeaderConnectingAddressListener.findConnectingAddress(ConnectionUtils.java:361)
at org.apache.flink.runtime.util.LeaderRetrievalUtils.findConnectingAddress(LeaderRetrievalUtils.java:180)
... 18 more
Caused by: java.net.UnknownHostException: host10306328: host10306328: Name or service not known
at java.net.InetAddress.getLocalHost(InetAddress.java:1505)
at org.apache.flink.runtime.net.ConnectionUtils.tryLocalHostBeforeReturning(ConnectionUtils.java:190)
at org.apache.flink.runtime.net.ConnectionUtils.findAddressUsingStrategy(ConnectionUtils.java:276)
at org.apache.flink.runtime.net.ConnectionUtils.access$100(ConnectionUtils.java:51)
at org.apache.flink.runtime.net.ConnectionUtils$LeaderConnectingAddressListener.findConnectingAddress(ConnectionUtils.java:413)
... 20 more
Caused by: java.net.UnknownHostException: host10306328: Name or service not known
at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
at java.net.InetAddress.getLocalHost(InetAddress.java:1500)
... 24 more
- Add a hosts entry
10.30.63.28 host10306328
2019-12-24 17:11:30,151 INFO org.apache.flink.yarn.YarnClusterClient - Starting client actor system.
2019-12-24 17:11:30,168 INFO org.apache.flink.yarn.YarnClusterClient - Trying to start actor system at 10-30-63-28_uf.cluster.ds.mosaic.com:0
2019-12-24 17:11:30,811 INFO org.apache.flink.yarn.YarnClusterClient - Actor system started at akka.tcp://flink@10-30-63-28_uf.cluster.ds.mosaic.com:13182
------------------------------------------------------------
The program finished with the following exception:
java.lang.RuntimeException: Unable to tell application master to stop once the specified job has been finised
at org.apache.flink.yarn.YarnClusterClient.stopAfterJob(YarnClusterClient.java:129)
at org.apache.flink.yarn.YarnClusterClient.submitJob(YarnClusterClient.java:154)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:486)
at org.apache.flink.client.program.DetachedEnvironment.finalizeExecute(DetachedEnvironment.java:77)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:432)
at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:816)
at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:290)
at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:216)
at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1053)
at org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1129)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1727)
at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1129)
Caused by: org.apache.flink.util.FlinkException: Could not connect to the leading JobManager. Please check that the JobManager is running.
at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:956)
at org.apache.flink.yarn.YarnClusterClient.stopAfterJob(YarnClusterClient.java:124)
... 14 more
Caused by: org.apache.flink.runtime.leaderretrieval.LeaderRetrievalException: Could not retrieve the leader gateway.
at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:83)
at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:951)
... 15 more
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:223)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:227)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:190)
at scala.concurrent.Await.result(package.scala)
at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:81)
... 16 more
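Since the steps here revolve around which names the hosts file maps to 10.30.63.28, it can help to list the active entries programmatically instead of reading the file by eye. The following is a minimal sketch that parses hosts-file text, assuming the usual "address name [alias ...]" layout with # comments. parse_hosts is a hypothetical helper, and HOSTS_SAMPLE is illustrative text built from the entries mentioned in this post, not a copy of the real file.

```python
from typing import Dict, List


def parse_hosts(text: str) -> Dict[str, List[str]]:
    """Map every host name in hosts-file text to the addresses listed for it."""
    table: Dict[str, List[str]] = {}
    for raw in text.splitlines():
        # Drop comments (whole-line or trailing) and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            table.setdefault(name, []).append(address)
    return table


# Illustrative content only; the real /etc/hosts on the machine may differ.
HOSTS_SAMPLE = """\
# 10.30.63.28 10-30-63-28_uf.cluster.ds.mosaic.com
10.30.63.28 host10306328
"""

if __name__ == "__main__":
    for name, addresses in parse_hosts(HOSTS_SAMPLE).items():
        print(name, addresses)
```

On a real machine the input would come from reading /etc/hosts. A commented-out line contributes no entry, so only names that are actually in effect show up in the result.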
- Comment out the incorrect hosts entry
- Submission succeeds
# 10.30.63.28 10-30-63-28_uf.cluster.ds.mosaic.com
+++ dirname yarn_offltrain.sh
++ cd .
++ pwd
+ BASE_PATH=/data0/d1_mosaic_bigdata_test/mosaic/mosaicx/ol-mpr/offline_train_mainpage_mosaic_formosaic
+ HADOOP_USER_NAME=feed_mosaic
+ FLINK_RUN_MODE=yarn-cluster
+ mosaic_JOB_NAME=offlinetrain-beta_mainpage_mosaic6_base_prepage_v1-auc1
+ FLINK_TASK_MANAGER_NUMBER=15
+ FLINK_TASK_MANAGER_SLOT=5
+ FLINK_TASK_MANAGER_MEMORY=20000
+ FLINK_JOB_MANAGER_MEMORY=20000
+ JAR=mosaic-runtime-2.0.0.jar
+ XML=mosaic_offlinetrain_weiflow.xml
+ NODE=offline_training
+ FEATURE_CONF=feature_prepage.conf
+ export FLINK_LOG_DIR=/tmp
+ FLINK_LOG_DIR=/tmp
+ export FLINK_LOG_DIR=/tmp
+ FLINK_LOG_DIR=/tmp
++ hadoop classpath
+ export 'HADOOP_CLASSPATH=/data0/rsync_data/mosaic/ccConfs/yarn-setting/EMR-118-conf:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/common/lib/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/common/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/hdfs:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/hdfs/lib/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/hdfs/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/yarn/lib/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/yarn/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/mapreduce/lib/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/mapreduce/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/contrib/capacity-scheduler/*.jar'
+ HADOOP_CLASSPATH='/data0/rsync_data/mosaic/ccConfs/yarn-setting/EMR-118-conf:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/common/lib/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/common/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/hdfs:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/hdfs/lib/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/hdfs/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/yarn/lib/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/yarn/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/mapreduce/lib/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/mapreduce/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/contrib/capacity-scheduler/*.jar'
+ /usr/lib/flink-current/bin/flink run -d -m yarn-cluster -yD env.java.opts=-Djava.util.Arrays.useLegacyMergeSort=true -yD web.timeout=1000000 -yD 'env.java.opts=-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=128M' -yD metrics.reporter.monitor._FLINK_CLUSTER_NAME=offlinetrain-beta_mainpage_mosaic6_base_prepage_v1-auc1 -yD metrics.reporters=monitor -yD metrics.reporter.monitor.class=com.mosaic.datasys.mosaic.metrics.WeiboKafkaReporter -yD metrics.reporter.monitor.kafka.bootstrap.servers=10.85.184.204:9092,10.85.184.205:9092 -yD metrics.reporter.monitor.topicName=metrics-topic -yjm 20000 -yn 15 -ytm 20000 -ys 5 -ynm offlinetrain-beta_mainpage_mosaic6_base_prepage_v1-auc1 -c com.mosaic.datasys.mosaic.framework.common.parser.FlowBuilder mosaic-runtime-2.0.0.jar mosaic_offlinetrain_weiflow.xml offline_training
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/data0/rsync_data/mosaic/ccConfs/yarn-setting/mosaic/flink-1.6.2-1.0.0-bin-weiclient/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2019-12-24 17:12:20,400 INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://emr-header-1.cluster-40699:8188/ws/v1/timeline/
2019-12-24 17:12:20,729 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.LegacyYarnClusterDescriptor to locate the jar
2019-12-24 17:12:20,729 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.LegacyYarnClusterDescriptor to locate the jar
2019-12-24 17:12:20,839 INFO org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm2
2019-12-24 17:12:21,011 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: ClusterSpecification{masterMemoryMB=20000, taskManagerMemoryMB=20000, numberTaskManagers=15, slotsPerTaskManager=5}
2019-12-24 17:12:21,376 WARN org.apache.flink.yarn.AbstractYarnClusterDescriptor - The configuration directory ('/data0/rsync_data/mosaic/ccConfs/yarn-setting/mosaic/flink-1.6.2-1.0.0-bin-weiclient/conf') contains both LOG4J and Logback configuration files. Please delete or rename one of them.
2019-12-24 17:13:00,539 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting application master application_1547542611290_3184536
2019-12-24 17:13:00,589 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1547542611290_3184536
2019-12-24 17:13:00,590 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Waiting for the cluster to be allocated
2019-12-24 17:13:00,598 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deploying cluster, current state ACCEPTED
2019-12-24 17:13:07,826 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - YARN application has been deployed successfully.
2019-12-24 17:13:07,828 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - The Flink YARN client has been started in detached mode. In order to stop Flink on YARN, use the following command or a YARN web interface to stop it:
yarn application -kill application_1547542611290_3184536
Please also note that the temporary files of the YARN session in the home directory will not be removed.
Using the parallelism provided by the remote cluster (75). To use another parallelism, set it at the ./bin/flink client.
Starting execution of program
2019-12-24 17:13:07,863 INFO org.apache.flink.yarn.YarnClusterClient - Starting program in interactive mode (detached: true)
============================= Flink Job Name is :offlinetrain-mosaic6-mosaic-base-prepage-auc
============================= task info =============================
== task name: hiveInput1
2019-12-24 17:13:08,300 WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.metastore.local does not exist
2019-12-24 17:13:08,453 WARN org.apache.hadoop.security.UserGroupInformation - No groups available for user feed_mosaic
database:default, table:model_frequence_30_days, partitionFilter:dt=test, fieldSize:49
Schema:[is_act, is_click, is_video_play, ma_id, index_type, m_midtextveccos, fm_1082, fm_10169, fm_10170, fm_10171, fm_1033, fm_1086, fm_1089, fu_210, fu_207, fu_211, fu_2117, fu_2135, fu_215, fu_216, fuu_403, fuu_407, fuu_400, fuu_401, fuu_4019, fuu_402, fuu_405, fuu_409, m_1082, m_1011, m_10148, m_10169, m_10170, m_10171, m_1030, m_1032, m_1040, m_1041, m_1042, m_1063, m_1086, fu_uid, fm_uid, fm_mid, m_mid, m_uid, expo_time, pre_page, dt]
== task name: featureProcess
Read Feature File:
== task name: libsvmProcess
Read Feature File: feature_prepage.conf
== task name: trainProcess
============================= task info =============================
2019-12-24 17:13:10,044 INFO org.apache.flink.yarn.YarnClusterClient - Starting client actor system.
2019-12-24 17:13:10,060 INFO org.apache.flink.yarn.YarnClusterClient - Trying to start actor system at host10306328:0
2019-12-24 17:13:10,706 INFO org.apache.flink.yarn.YarnClusterClient - Actor system started at akka.tcp://flink@host10306328:41187
2019-12-24 17:13:11,050 INFO org.apache.flink.yarn.YarnClusterClient - Waiting until all TaskManagers have connected
Waiting until all TaskManagers have connected
2019-12-24 17:13:11,111 INFO org.apache.flink.yarn.YarnClusterClient - TaskManager status (0/15)
TaskManager status (0/15)
2019-12-24 17:13:13,222 INFO org.apache.flink.yarn.YarnClusterClient - TaskManager status (2/15)
TaskManager status (2/15)
2019-12-24 17:13:13,504 INFO org.apache.flink.yarn.YarnClusterClient - TaskManager status (4/15)
TaskManager status (4/15)
2019-12-24 17:13:13,807 INFO org.apache.flink.yarn.YarnClusterClient - TaskManager status (9/15)
TaskManager status (9/15)
2019-12-24 17:13:14,084 INFO org.apache.flink.yarn.YarnClusterClient - TaskManager status (13/15)
TaskManager status (13/15)
2019-12-24 17:13:14,427 INFO org.apache.flink.yarn.YarnClusterClient - TaskManager status (15/15)
TaskManager status (15/15)
2019-12-24 17:13:14,427 INFO org.apache.flink.yarn.YarnClusterClient - All TaskManagers are connected
All TaskManagers are connected
2019-12-24 17:13:14,448 INFO org.apache.flink.yarn.YarnClusterClient - Submitting Job with JobID: a05e47246348d02d5c4fe5e322c8544d. Returning after job submission.
Submitting Job with JobID: a05e47246348d02d5c4fe5e322c8544d. Returning after job submission.
Job has been submitted with JobID a05e47246348d02d5c4fe5e322c8544d
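One detail in the flink run command logged above is unrelated to the hostname problem but easy to miss: -yD env.java.opts is passed twice with different values. The logs do not show which value Flink ends up using. If the key holds a single value, which is an assumption here, one of the two settings is silently dropped. The sketch below flags such repeated keys. duplicated_dynamic_properties is a hypothetical helper, and COMMAND is a shortened copy of the logged command.

```python
import shlex
from collections import Counter
from typing import Dict


def duplicated_dynamic_properties(command: str, flag: str = "-yD") -> Dict[str, int]:
    """Return {key: count} for every key passed more than once via the given flag."""
    # shlex.split honours shell quoting, so a quoted value stays one token.
    tokens = shlex.split(command)
    keys = [
        tokens[i + 1].split("=", 1)[0]
        for i, token in enumerate(tokens[:-1])
        if token == flag
    ]
    return {key: count for key, count in Counter(keys).items() if count > 1}


# Shortened copy of the command from the log above.
COMMAND = (
    "flink run -d -m yarn-cluster "
    "-yD env.java.opts=-Djava.util.Arrays.useLegacyMergeSort=true "
    "-yD web.timeout=1000000 "
    "-yD 'env.java.opts=-verbose:gc -XX:+PrintGCDetails' "
    "-yjm 20000 -yn 15"
)

if __name__ == "__main__":
    print(duplicated_dynamic_properties(COMMAND))
```

If both sets of JVM options are wanted, merging them into a single env.java.opts value avoids depending on which occurrence wins.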