A Flink job submission failure: symptoms and root-cause investigation
- Exception encountered
2019-12-24 16:49:59,019 INFO org.apache.flink.yarn.YarnClusterClient - Starting client actor system.
2019-12-24 16:49:59,033 INFO org.apache.flink.yarn.YarnClusterClient - Trying to start actor system at 10-30-63-28_uf.cluster.ds.mosaic.com:0
2019-12-24 16:49:59,686 INFO org.apache.flink.yarn.YarnClusterClient - Actor system started at akka.tcp://flink@10-30-63-28_uf.cluster.ds.mosaic.com:33557
------------------------------------------------------------
The program finished with the following exception:
java.lang.RuntimeException: Unable to tell application master to stop once the specified job has been finised
at org.apache.flink.yarn.YarnClusterClient.stopAfterJob(YarnClusterClient.java:129)
at org.apache.flink.yarn.YarnClusterClient.submitJob(YarnClusterClient.java:154)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:486)
at org.apache.flink.client.program.DetachedEnvironment.finalizeExecute(DetachedEnvironment.java:77)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:432)
at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:816)
at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:290)
at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:216)
at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1053)
at org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1129)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1727)
at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1129)
Caused by: org.apache.flink.util.FlinkException: Could not connect to the leading JobManager. Please check that the JobManager is running.
at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:956)
at org.apache.flink.yarn.YarnClusterClient.stopAfterJob(YarnClusterClient.java:124)
... 14 more
Caused by: org.apache.flink.runtime.leaderretrieval.LeaderRetrievalException: Could not retrieve the leader gateway.
at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:83)
at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:951)
... 15 more
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:223)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:227)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:190)
at scala.concurrent.Await.result(package.scala)
at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:81)
... 16 more
- Anomaly found:
- The hostname is not fully resolved in the shell
[d1_mosaic_bigdata_pa@10-30-63-28_uf ol-mpr]$ hostname
10-30-63-28_uf.cluster.ds.mosaic.com
- After changing the hostname
2019-12-24 17:13:10,044 INFO org.apache.flink.yarn.YarnClusterClient - Starting client actor system.
2019-12-24 17:13:10,060 INFO org.apache.flink.yarn.YarnClusterClient - Trying to start actor system at host10306328:0
2019-12-24 17:13:10,706 INFO org.apache.flink.yarn.YarnClusterClient - Actor system started at akka.tcp://flink@host10306328:41187
java.lang.RuntimeException: Unable to tell application master to stop once the specified job has been finised
at org.apache.flink.yarn.YarnClusterClient.stopAfterJob(YarnClusterClient.java:129)
at org.apache.flink.yarn.YarnClusterClient.submitJob(YarnClusterClient.java:154)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:486)
at org.apache.flink.client.program.DetachedEnvironment.finalizeExecute(DetachedEnvironment.java:77)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:432)
at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:816)
at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:290)
at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:216)
at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1053)
at org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1129)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1727)
at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1129)
Caused by: org.apache.flink.util.FlinkException: Could not find out our own hostname by connecting to the leading JobManager. Please make sure that the Flink cluster has been started.
at org.apache.flink.client.program.ClusterClient$LazyActorSystemLoader.get(ClusterClient.java:276)
at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:953)
at org.apache.flink.yarn.YarnClusterClient.stopAfterJob(YarnClusterClient.java:124)
... 14 more
Caused by: org.apache.flink.runtime.leaderretrieval.LeaderRetrievalException: Could not find the connecting address by connecting to the current leader.
at org.apache.flink.runtime.util.LeaderRetrievalUtils.findConnectingAddress(LeaderRetrievalUtils.java:182)
at org.apache.flink.runtime.util.LeaderRetrievalUtils.findConnectingAddress(LeaderRetrievalUtils.java:163)
at org.apache.flink.client.program.ClusterClient$LazyActorSystemLoader.get(ClusterClient.java:272)
... 16 more
Caused by: org.apache.flink.runtime.leaderretrieval.LeaderRetrievalException: Could not retrieve the connecting address to the current leader with the akka URL akka.tcp://flink@emr-worker-190.cluster-40699:45716/user/jobmanager.
at org.apache.flink.runtime.net.ConnectionUtils$LeaderConnectingAddressListener.findConnectingAddress(ConnectionUtils.java:472)
at org.apache.flink.runtime.net.ConnectionUtils$LeaderConnectingAddressListener.findConnectingAddress(ConnectionUtils.java:361)
at org.apache.flink.runtime.util.LeaderRetrievalUtils.findConnectingAddress(LeaderRetrievalUtils.java:180)
... 18 more
Caused by: java.net.UnknownHostException: host10306328: host10306328: Name or service not known
at java.net.InetAddress.getLocalHost(InetAddress.java:1505)
at org.apache.flink.runtime.net.ConnectionUtils.tryLocalHostBeforeReturning(ConnectionUtils.java:190)
at org.apache.flink.runtime.net.ConnectionUtils.findAddressUsingStrategy(ConnectionUtils.java:276)
at org.apache.flink.runtime.net.ConnectionUtils.access$100(ConnectionUtils.java:51)
at org.apache.flink.runtime.net.ConnectionUtils$LeaderConnectingAddressListener.findConnectingAddress(ConnectionUtils.java:413)
... 20 more
Caused by: java.net.UnknownHostException: host10306328: Name or service not known
at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
at java.net.InetAddress.getLocalHost(InetAddress.java:1500)
... 24 more
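The innermost cause in the trace above is `InetAddress.getLocalHost()` failing because the machine's own hostname cannot be resolved. The same condition can be checked before submitting a job. The following is a minimal Python sketch, not part of the original investigation; the function name `check_local_hostname` and its `resolver` parameter are hypothetical names chosen for illustration.

```python
import socket
from typing import Callable, Optional, Tuple


def check_local_hostname(
    hostname: Optional[str] = None,
    resolver: Callable[[str], str] = socket.gethostbyname,
) -> Tuple[bool, str]:
    """Try to resolve `hostname` (default: this machine's own name).

    Returns (ok, detail). `resolver` is injectable so the check can be
    exercised without touching real DNS or /etc/hosts.
    """
    name = hostname if hostname is not None else socket.gethostname()
    try:
        address = resolver(name)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return False, f"{name}: {exc}"
    return True, f"{name} -> {address}"
```

Calling `check_local_hostname()` with no arguments on the submitting machine reports whether the name returned by `hostname` resolves at all; a `False` result corresponds to the `UnknownHostException` shown above.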
- Add a hosts entry
10.30.63.28 host10306328
2019-12-24 17:11:30,151 INFO org.apache.flink.yarn.YarnClusterClient - Starting client actor system.
2019-12-24 17:11:30,168 INFO org.apache.flink.yarn.YarnClusterClient - Trying to start actor system at 10-30-63-28_uf.cluster.ds.mosaic.com:0
2019-12-24 17:11:30,811 INFO org.apache.flink.yarn.YarnClusterClient - Actor system started at akka.tcp://flink@10-30-63-28_uf.cluster.ds.mosaic.com:13182
------------------------------------------------------------
The program finished with the following exception:
java.lang.RuntimeException: Unable to tell application master to stop once the specified job has been finised
at org.apache.flink.yarn.YarnClusterClient.stopAfterJob(YarnClusterClient.java:129)
at org.apache.flink.yarn.YarnClusterClient.submitJob(YarnClusterClient.java:154)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:486)
at org.apache.flink.client.program.DetachedEnvironment.finalizeExecute(DetachedEnvironment.java:77)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:432)
at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:816)
at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:290)
at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:216)
at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1053)
at org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1129)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1727)
at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1129)
Caused by: org.apache.flink.util.FlinkException: Could not connect to the leading JobManager. Please check that the JobManager is running.
at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:956)
at org.apache.flink.yarn.YarnClusterClient.stopAfterJob(YarnClusterClient.java:124)
... 14 more
Caused by: org.apache.flink.runtime.leaderretrieval.LeaderRetrievalException: Could not retrieve the leader gateway.
at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:83)
at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:951)
... 15 more
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:223)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:227)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:190)
at scala.concurrent.Await.result(package.scala)
at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:81)
... 16 more
- Comment out the incorrect hosts entry
# 10.30.63.28 10-30-63-28_uf.cluster.ds.mosaic.com
- The submission succeeds
+++ dirname yarn_offltrain.sh
++ cd .
++ pwd
+ BASE_PATH=/data0/d1_mosaic_bigdata_test/mosaic/mosaicx/ol-mpr/offline_train_mainpage_mosaic_formosaic
+ HADOOP_USER_NAME=feed_mosaic
+ FLINK_RUN_MODE=yarn-cluster
+ mosaic_JOB_NAME=offlinetrain-beta_mainpage_mosaic6_base_prepage_v1-auc1
+ FLINK_TASK_MANAGER_NUMBER=15
+ FLINK_TASK_MANAGER_SLOT=5
+ FLINK_TASK_MANAGER_MEMORY=20000
+ FLINK_JOB_MANAGER_MEMORY=20000
+ JAR=mosaic-runtime-2.0.0.jar
+ XML=mosaic_offlinetrain_weiflow.xml
+ NODE=offline_training
+ FEATURE_CONF=feature_prepage.conf
+ export FLINK_LOG_DIR=/tmp
+ FLINK_LOG_DIR=/tmp
+ export FLINK_LOG_DIR=/tmp
+ FLINK_LOG_DIR=/tmp
++ hadoop classpath
+ export 'HADOOP_CLASSPATH=/data0/rsync_data/mosaic/ccConfs/yarn-setting/EMR-118-conf:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/common/lib/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/common/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/hdfs:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/hdfs/lib/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/hdfs/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/yarn/lib/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/yarn/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/mapreduce/lib/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/mapreduce/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/contrib/capacity-scheduler/*.jar'
+ HADOOP_CLASSPATH='/data0/rsync_data/mosaic/ccConfs/yarn-setting/EMR-118-conf:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/common/lib/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/common/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/hdfs:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/hdfs/lib/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/hdfs/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/yarn/lib/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/yarn/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/mapreduce/lib/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/mapreduce/*:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/contrib/capacity-scheduler/*.jar'
+ /usr/lib/flink-current/bin/flink run -d -m yarn-cluster -yD env.java.opts=-Djava.util.Arrays.useLegacyMergeSort=true -yD web.timeout=1000000 -yD 'env.java.opts=-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=128M' -yD metrics.reporter.monitor._FLINK_CLUSTER_NAME=offlinetrain-beta_mainpage_mosaic6_base_prepage_v1-auc1 -yD metrics.reporters=monitor -yD metrics.reporter.monitor.class=com.mosaic.datasys.mosaic.metrics.WeiboKafkaReporter -yD metrics.reporter.monitor.kafka.bootstrap.servers=10.85.184.204:9092,10.85.184.205:9092 -yD metrics.reporter.monitor.topicName=metrics-topic -yjm 20000 -yn 15 -ytm 20000 -ys 5 -ynm offlinetrain-beta_mainpage_mosaic6_base_prepage_v1-auc1 -c com.mosaic.datasys.mosaic.framework.common.parser.FlowBuilder mosaic-runtime-2.0.0.jar mosaic_offlinetrain_weiflow.xml offline_training
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/data0/rsync_data/mosaic/ccConfs/yarn-setting/mosaic/flink-1.6.2-1.0.0-bin-weiclient/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data0/rsync_data/mosaic/ccConfs/yarn-setting/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2019-12-24 17:12:20,400 INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://emr-header-1.cluster-40699:8188/ws/v1/timeline/
2019-12-24 17:12:20,729 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.LegacyYarnClusterDescriptor to locate the jar
2019-12-24 17:12:20,729 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.LegacyYarnClusterDescriptor to locate the jar
2019-12-24 17:12:20,839 INFO org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm2
2019-12-24 17:12:21,011 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: ClusterSpecification{masterMemoryMB=20000, taskManagerMemoryMB=20000, numberTaskManagers=15, slotsPerTaskManager=5}
2019-12-24 17:12:21,376 WARN org.apache.flink.yarn.AbstractYarnClusterDescriptor - The configuration directory ('/data0/rsync_data/mosaic/ccConfs/yarn-setting/mosaic/flink-1.6.2-1.0.0-bin-weiclient/conf') contains both LOG4J and Logback configuration files. Please delete or rename one of them.
2019-12-24 17:13:00,539 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting application master application_1547542611290_3184536
2019-12-24 17:13:00,589 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1547542611290_3184536
2019-12-24 17:13:00,590 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Waiting for the cluster to be allocated
2019-12-24 17:13:00,598 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deploying cluster, current state ACCEPTED
2019-12-24 17:13:07,826 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - YARN application has been deployed successfully.
2019-12-24 17:13:07,828 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - The Flink YARN client has been started in detached mode. In order to stop Flink on YARN, use the following command or a YARN web interface to stop it:
yarn application -kill application_1547542611290_3184536
Please also note that the temporary files of the YARN session in the home directory will not be removed.
Using the parallelism provided by the remote cluster (75). To use another parallelism, set it at the ./bin/flink client.
Starting execution of program
2019-12-24 17:13:07,863 INFO org.apache.flink.yarn.YarnClusterClient - Starting program in interactive mode (detached: true)
============================= Flink Job Name is :offlinetrain-mosaic6-mosaic-base-prepage-auc
============================= task info =============================
== task name: hiveInput1
2019-12-24 17:13:08,300 WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.metastore.local does not exist
2019-12-24 17:13:08,453 WARN org.apache.hadoop.security.UserGroupInformation - No groups available for user feed_mosaic
database:default, table:model_frequence_30_days, partitionFilter:dt=test, fieldSize:49
Schema:[is_act, is_click, is_video_play, ma_id, index_type, m_midtextveccos, fm_1082, fm_10169, fm_10170, fm_10171, fm_1033, fm_1086, fm_1089, fu_210, fu_207, fu_211, fu_2117, fu_2135, fu_215, fu_216, fuu_403, fuu_407, fuu_400, fuu_401, fuu_4019, fuu_402, fuu_405, fuu_409, m_1082, m_1011, m_10148, m_10169, m_10170, m_10171, m_1030, m_1032, m_1040, m_1041, m_1042, m_1063, m_1086, fu_uid, fm_uid, fm_mid, m_mid, m_uid, expo_time, pre_page, dt]
== task name: featureProcess
Read Feature File:
== task name: libsvmProcess
Read Feature File: feature_prepage.conf
== task name: trainProcess
============================= task info =============================
2019-12-24 17:13:10,044 INFO org.apache.flink.yarn.YarnClusterClient - Starting client actor system.
2019-12-24 17:13:10,060 INFO org.apache.flink.yarn.YarnClusterClient - Trying to start actor system at host10306328:0
2019-12-24 17:13:10,706 INFO org.apache.flink.yarn.YarnClusterClient - Actor system started at akka.tcp://flink@host10306328:41187
2019-12-24 17:13:11,050 INFO org.apache.flink.yarn.YarnClusterClient - Waiting until all TaskManagers have connected
Waiting until all TaskManagers have connected
2019-12-24 17:13:11,111 INFO org.apache.flink.yarn.YarnClusterClient - TaskManager status (0/15)
TaskManager status (0/15)
2019-12-24 17:13:13,222 INFO org.apache.flink.yarn.YarnClusterClient - TaskManager status (2/15)
TaskManager status (2/15)
2019-12-24 17:13:13,504 INFO org.apache.flink.yarn.YarnClusterClient - TaskManager status (4/15)
TaskManager status (4/15)
2019-12-24 17:13:13,807 INFO org.apache.flink.yarn.YarnClusterClient - TaskManager status (9/15)
TaskManager status (9/15)
2019-12-24 17:13:14,084 INFO org.apache.flink.yarn.YarnClusterClient - TaskManager status (13/15)
TaskManager status (13/15)
2019-12-24 17:13:14,427 INFO org.apache.flink.yarn.YarnClusterClient - TaskManager status (15/15)
TaskManager status (15/15)
2019-12-24 17:13:14,427 INFO org.apache.flink.yarn.YarnClusterClient - All TaskManagers are connected
All TaskManagers are connected
2019-12-24 17:13:14,448 INFO org.apache.flink.yarn.YarnClusterClient - Submitting Job with JobID: a05e47246348d02d5c4fe5e322c8544d. Returning after job submission.
Submitting Job with JobID: a05e47246348d02d5c4fe5e322c8544d. Returning after job submission.
Job has been submitted with JobID a05e47246348d02d5c4fe5e322c8544d
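The fix came down to which hosts entries were active: the entry for `host10306328` was added and the entry for `10-30-63-28_uf.cluster.ds.mosaic.com` was commented out. A quick way to see which names a hosts file actually maps is to parse it and ignore comments. The sketch below is an illustration under assumptions: `parse_hosts` is a hypothetical helper, and `SAMPLE` is assembled only from the two hosts lines quoted in this write-up, since the full `/etc/hosts` is not shown.

```python
from typing import Dict, List


def parse_hosts(text: str) -> Dict[str, List[str]]:
    """Map each hostname or alias to the IPs of the active (uncommented) entries."""
    mapping: Dict[str, List[str]] = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # an address without any name
        ip, names = fields[0], fields[1:]
        for name in names:
            mapping.setdefault(name.lower(), []).append(ip)
    return mapping


# Assumed sample: only the two hosts lines quoted in the text.
SAMPLE = """\
# 10.30.63.28 10-30-63-28_uf.cluster.ds.mosaic.com
10.30.63.28 host10306328
"""

print(parse_hosts(SAMPLE))  # {'host10306328': ['10.30.63.28']}
```

A name that maps to more than one address in the result, or a name that is missing entirely, is worth examining before resubmitting the job.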