hadoop job -kill 调用的是CLI.java里面的job.killJob(); 这里会分几种情况,如果是能查询到状态是RUNNING的话,是直接向AppMaster发送kill请求的。
YARNRunner.java
@Override
public void killJob(JobID arg0) throws IOException, InterruptedException {
/* check if the status is not running, if not send kill to RM */
JobStatus status = clientCache.getClient(arg0).getJobStatus(arg0);
ApplicationId appId = TypeConverter.toYarn(arg0).getAppId();
// get status from RM and return
if (status == null) {
killUnFinishedApplication(appId);
return;
}
if (status.getState() != JobStatus.State.RUNNING) {
killApplication(appId);
return;
}
try {
/* send a kill to the AM */
clientCache.getClient(arg0).killJob(arg0);
long currentTimeMillis = System.currentTimeMillis();
long timeKillIssued = currentTimeMillis;
while ((currentTimeMillis < timeKillIssued + 10000L)
&& !isJobInTerminalState(status)) {
try {
Thread.sleep(1000L);
} catch (InterruptedException ie) {
/** interrupted, just break */
break;
}
currentTimeMillis = System.currentTimeMillis();
status = clientCache.getClient(arg0).getJobStatus(arg0);
if (status == null) {
killUnFinishedApplication(appId);
return;
}
}
} catch(IOException io) {
LOG.debug("Error when checking for application status", io);
}
if (status != null && !isJobInTerminalState(status)) {
killApplication(appId);
}
}
|
MRClientService.java
@SuppressWarnings("unchecked")
@Override
public KillJobResponse killJob(KillJobRequest request)
throws IOException {
JobId jobId = request.getJobId();
UserGroupInformation callerUGI = UserGroupInformation.getCurrentUser();
String message = "Kill job " + jobId + " received from " + callerUGI
+ " at " + Server.getRemoteAddress();
LOG.info(message);
verifyAndGetJob(jobId, JobACL.MODIFY_JOB);
appContext.getEventHandler().handle(
new JobDiagnosticsUpdateEvent(jobId, message));
appContext.getEventHandler().handle(
new JobEvent(jobId, JobEventType.JOB_KILL));
KillJobResponse response =
recordFactory.newRecordInstance(KillJobResponse.class);
return response;
}
|
yarn application -kill 使用的是ApplicationCLI.java,是向RM发送kill请求
/**
* Kills the application with the application id as appId
*
* @param applicationId
* @throws YarnException
* @throws IOException
*/
private void killApplication(String applicationId) throws YarnException,
IOException {
ApplicationId appId = ConverterUtils.toApplicationId(applicationId);
ApplicationReport appReport = null;
try {
appReport = client.getApplicationReport(appId);
} catch (ApplicationNotFoundException e) {
sysout.println("Application with id '" + applicationId +
"' doesn't exist in RM.");
throw e;
}
if (appReport.getYarnApplicationState() == YarnApplicationState.FINISHED
|| appReport.getYarnApplicationState() == YarnApplicationState.KILLED
|| appReport.getYarnApplicationState() == YarnApplicationState.FAILED) {
sysout.println("Application " + applicationId + " has already finished ");
} else {
sysout.println("Killing application " + applicationId);
client.killApplication(appId);
}
}
|
YarnClientImpl.java
@Override
public void killApplication(ApplicationId applicationId)
throws YarnException, IOException {
KillApplicationRequest request =
Records.newRecord(KillApplicationRequest.class);
request.setApplicationId(applicationId);
try {
int pollCount = 0;
long startTime = System.currentTimeMillis();
while (true) {
KillApplicationResponse response =
rmClient.forceKillApplication(request);
if (response.getIsKillCompleted()) {
LOG.info("Killed application " + applicationId);
break;
}
long elapsedMillis = System.currentTimeMillis() - startTime;
if (enforceAsyncAPITimeout() &&
elapsedMillis >= this.asyncApiPollTimeoutMillis) {
throw new YarnException("Timed out while waiting for application " +
applicationId + " to be killed.");
}
if (++pollCount % 10 == 0) {
LOG.info("Waiting for application " + applicationId + " to be killed.");
}
Thread.sleep(asyncApiPollIntervalMillis);
}
} catch (InterruptedException e) {
LOG.error("Interrupted while waiting for application " + applicationId
+ " to be killed.");
}
}
|
- hadoop job -kill 与 yarn application -kii(作业卡了或作业重复提交或MapReduce任务运行到running job卡住)
问题详情 解决办法 [hadoop@master ~]$ hadoop job -kill job_1493782088693_0001 DEPRECATED: Use of this script ...
- yarn application -kill application_id yarn kill 超时任务脚本
需求:kill 掉yarn上超时的任务,实现不同队列不同超时时间的kill机制,并带有任务名的白名单功能 此为python脚本,可配置crontab使用 # _*_ coding=utf-8 _*_ ...
- yarn application命令介绍
yarn application 1.-list 列出所有 application 信息 示例:yarn application -list 2.-appStates <Stat ...
- Linux kill、kill-15、kill-9区别
进程状态转换图 kill和kill -9,两个命令在linux中都有杀死进程的效果,然而两命令的执行过程却大有不同,在程序中如果用错了,可能会造成莫名其妙的现象. 执行kill(不加 -* 默认kil ...
- Hadoop 新 MapReduce 框架 Yarn 详解【转】
[转自:http://www.ibm.com/developerworks/cn/opensource/os-cn-hadoop-yarn/] 简介: 本文介绍了 Hadoop 自 0.23.0 版本 ...
- spark-shell启动报错:Yarn application has already ended! It might have been killed or unable to launch application master
spark-shell不支持yarn cluster,以yarn client方式启动 spark-shell --master=yarn --deploy-mode=client 启动日志,错误信息 ...
- yarn application ID 增长达到10000后
Job, Task, and Task Attempt IDs In Hadoop 2, MapReduce job IDs are generated from YARN application I ...
- spark利用yarn提交任务报:YARN application has exited unexpectedly with state UNDEFINED
spark用yarn提交任务会报ERROR cluster.YarnClientSchedulerBackend: YARN application has exited unexpectedly w ...
- 【深入浅出 Yarn 架构与实现】3-1 Yarn Application 流程与编写方法
本篇学习 Yarn Application 编写方法,将带你更清楚的了解一个任务是如何提交到 Yarn ,在运行中的交互和任务停止的过程.通过了解整个任务的运行流程,帮你更好的理解 Yarn 运作方式 ...
随机推荐
- 1.单机部署hadoop测试环境
之前看了很多理论上的知识,感觉云里雾里的,所以赶紧着手搭建个单机版的hadoop跑一跑,开启自学大数据技术的第一步~~ 1.在开源的世界里,我就是个土豪,要啥有啥,所以首先你得有个jdk,有钱所以用最 ...
- maven 引用本地jar
1.添加lib文件夹在src文件夹中.2.拷贝所需要的test.jar包到lib文件夹.3.在pom文件加入如下依赖 <!--添加本地私有包--><dependency> &l ...
- [GO]通过结构体生成json
package main import ( "encoding/json" "fmt" ) type IT struct { //一定要注意这里的成员变量的名字 ...
- Mac Android8.0源码编译笔记
原因:内存不够 办法:添加限制,输入如下命令:export JACK_SERVER_VM_ARGUMENTS="-Dfile.encoding=UTF-8 -XX:+TieredCompil ...
- 深入理解java虚拟机(一)虚拟机内存划分
Java虚拟机在执行Java程序时,会把它管理的内存划分为若干个不同的数据区.这些区域有不同的特性,起不同的作用.它们有各自的创建时间,销毁时间.有的区域随着进程的启动而创建,随着进程结束而销毁,有的 ...
- 【小梅哥SOPC学习笔记】sof与NIOS II的elf固件合并jic得到文件
sof与NIOS II的elf固件合并jic得到文件 注意,本方法已经有更加简便的方法,小梅哥提供相应的脚本文件,可以一键生成所需文件,脚本请前往芯航线FPGA技术支持群获取. 7.1 为什么需要将S ...
- [LeetCode 题解]:Candy
There are N children standing in a line. Each child is assigned a rating value. You are giving candi ...
- Re:从零开始的Spring Security Oauth2(三)
上一篇文章中我们介绍了获取token的流程,这一篇重点分析一下,携带token访问受限资源时,内部的工作流程. @EnableResourceServer与@EnableAuthorizationSe ...
- mysql 命令备份还原
备份 mysqldump -h localhost -uroot -p123456 springbootdb > e:/springbootdb.sql 还原 mysql -h localhos ...
- Mysql数据操作《一》数据的增删改
插入数据INSERT 1. 插入完整数据(顺序插入) 语法一: INSERT INTO 表名(字段1,字段2,字段3…字段n) VALUES(值1,值2,值3…值n); 语法二: INSERT INT ...