hadoop job -kill 调用的是CLI.java里面的job.killJob(); 这里会分几种情况,如果是能查询到状态是RUNNING的话,是直接向AppMaster发送kill请求的。
YARNRunner.java
@Override
public void killJob(JobID arg0) throws IOException, InterruptedException {
/* check if the status is not running, if not send kill to RM */
JobStatus status = clientCache.getClient(arg0).getJobStatus(arg0);
ApplicationId appId = TypeConverter.toYarn(arg0).getAppId();
// get status from RM and return
if (status == null) {
killUnFinishedApplication(appId);
return;
}
if (status.getState() != JobStatus.State.RUNNING) {
killApplication(appId);
return;
}
try {
/* send a kill to the AM */
clientCache.getClient(arg0).killJob(arg0);
long currentTimeMillis = System.currentTimeMillis();
long timeKillIssued = currentTimeMillis;
while ((currentTimeMillis < timeKillIssued + 10000L)
&& !isJobInTerminalState(status)) {
try {
Thread.sleep(1000L);
} catch (InterruptedException ie) {
/** interrupted, just break */
break;
}
currentTimeMillis = System.currentTimeMillis();
status = clientCache.getClient(arg0).getJobStatus(arg0);
if (status == null) {
killUnFinishedApplication(appId);
return;
}
}
} catch(IOException io) {
LOG.debug("Error when checking for application status", io);
}
if (status != null && !isJobInTerminalState(status)) {
killApplication(appId);
}
}
|
MRClientService.java
@SuppressWarnings("unchecked")
@Override
public KillJobResponse killJob(KillJobRequest request)
throws IOException {
JobId jobId = request.getJobId();
UserGroupInformation callerUGI = UserGroupInformation.getCurrentUser();
String message = "Kill job " + jobId + " received from " + callerUGI
+ " at " + Server.getRemoteAddress();
LOG.info(message);
verifyAndGetJob(jobId, JobACL.MODIFY_JOB);
appContext.getEventHandler().handle(
new JobDiagnosticsUpdateEvent(jobId, message));
appContext.getEventHandler().handle(
new JobEvent(jobId, JobEventType.JOB_KILL));
KillJobResponse response =
recordFactory.newRecordInstance(KillJobResponse.class);
return response;
}
|
yarn application -kill 使用的是ApplicationCLI.java,是向RM发送kill请求
/**
* Kills the application with the application id as appId
*
* @param applicationId
* @throws YarnException
* @throws IOException
*/
private void killApplication(String applicationId) throws YarnException,
IOException {
ApplicationId appId = ConverterUtils.toApplicationId(applicationId);
ApplicationReport appReport = null;
try {
appReport = client.getApplicationReport(appId);
} catch (ApplicationNotFoundException e) {
sysout.println("Application with id '" + applicationId +
"' doesn't exist in RM.");
throw e;
}
if (appReport.getYarnApplicationState() == YarnApplicationState.FINISHED
|| appReport.getYarnApplicationState() == YarnApplicationState.KILLED
|| appReport.getYarnApplicationState() == YarnApplicationState.FAILED) {
sysout.println("Application " + applicationId + " has already finished ");
} else {
sysout.println("Killing application " + applicationId);
client.killApplication(appId);
}
}
|
YarnClientImpl.java
@Override
public void killApplication(ApplicationId applicationId)
throws YarnException, IOException {
KillApplicationRequest request =
Records.newRecord(KillApplicationRequest.class);
request.setApplicationId(applicationId);
try {
int pollCount = 0;
long startTime = System.currentTimeMillis();
while (true) {
KillApplicationResponse response =
rmClient.forceKillApplication(request);
if (response.getIsKillCompleted()) {
LOG.info("Killed application " + applicationId);
break;
}
long elapsedMillis = System.currentTimeMillis() - startTime;
if (enforceAsyncAPITimeout() &&
elapsedMillis >= this.asyncApiPollTimeoutMillis) {
throw new YarnException("Timed out while waiting for application " +
applicationId + " to be killed.");
}
if (++pollCount % 10 == 0) {
LOG.info("Waiting for application " + applicationId + " to be killed.");
}
Thread.sleep(asyncApiPollIntervalMillis);
}
} catch (InterruptedException e) {
LOG.error("Interrupted while waiting for application " + applicationId
+ " to be killed.");
}
}
|
- hadoop job -kill 与 yarn application -kii(作业卡了或作业重复提交或MapReduce任务运行到running job卡住)
问题详情 解决办法 [hadoop@master ~]$ hadoop job -kill job_1493782088693_0001 DEPRECATED: Use of this script ...
- yarn application -kill application_id yarn kill 超时任务脚本
需求:kill 掉yarn上超时的任务,实现不同队列不同超时时间的kill机制,并带有任务名的白名单功能 此为python脚本,可配置crontab使用 # _*_ coding=utf-8 _*_ ...
- yarn application命令介绍
yarn application 1.-list 列出所有 application 信息 示例:yarn application -list 2.-appStates <Stat ...
- Linux kill、kill-15、kill-9区别
进程状态转换图 kill和kill -9,两个命令在linux中都有杀死进程的效果,然而两命令的执行过程却大有不同,在程序中如果用错了,可能会造成莫名其妙的现象. 执行kill(不加 -* 默认kil ...
- Hadoop 新 MapReduce 框架 Yarn 详解【转】
[转自:http://www.ibm.com/developerworks/cn/opensource/os-cn-hadoop-yarn/] 简介: 本文介绍了 Hadoop 自 0.23.0 版本 ...
- spark-shell启动报错:Yarn application has already ended! It might have been killed or unable to launch application master
spark-shell不支持yarn cluster,以yarn client方式启动 spark-shell --master=yarn --deploy-mode=client 启动日志,错误信息 ...
- yarn application ID 增长达到10000后
Job, Task, and Task Attempt IDs In Hadoop 2, MapReduce job IDs are generated from YARN application I ...
- spark利用yarn提交任务报:YARN application has exited unexpectedly with state UNDEFINED
spark用yarn提交任务会报ERROR cluster.YarnClientSchedulerBackend: YARN application has exited unexpectedly w ...
- 【深入浅出 Yarn 架构与实现】3-1 Yarn Application 流程与编写方法
本篇学习 Yarn Application 编写方法,将带你更清楚的了解一个任务是如何提交到 Yarn ,在运行中的交互和任务停止的过程.通过了解整个任务的运行流程,帮你更好的理解 Yarn 运作方式 ...
随机推荐
- Swig在Mac OS X上的安装
网上有很多类似文章介绍Swig怎么在Mac OS X上安装和配置,一般来说就是: 下载pcre,configure & make & make install 下载swig,confi ...
- 【转】OJ提交题目中的语言选项里G++与C++的区别
原文链接:http://blog.polossk.com/201405/c-plus-plus-g-plus-plus G++? 首先更正一个概念,C++是一门计算机编程语言,G++不是语言,是一款编 ...
- OSG图形设备接口GraphicsContext
1.图形设备与相机 在Camera类的成员函数中,setGraphicContext()函数的工作是设置相机对应的图形设备对象,换句话说,下面要介绍的GraphicsContext类就是图形设备对象的 ...
- php 导出csv表格文件
1.数据库取出数据,存放在二维数组中 $conn=new mysqli('localhost','root','root','myDBPDO'); $result=$conn->query('s ...
- 设计模式18:Observer 观察者模式(行为型模式)
Observer 观察者模式(行为型模式) 动机(Motivation) 在软件构建过程中,我们需要为某些对象建立一种“通知依赖关系”——一个对象(目标对象)的状态发生改变,所有依赖对象(观察者对象) ...
- 【Head First Java 读书笔记】(七)继承
继承与多态 了解继承 继承的关系意味着子类继承了父类的实例变量和方法.父类比较抽象,子类比较具体. 继承层次的设计 找出具有共同属性和行为的对象(用继承来防止子类中出现重复的程序代码) 设计代表共同状 ...
- Subsequence——POJ3061
题目:http://poj.org/problem?id=3061 尺取法解题 import java.util.Scanner;; public class Main { public static ...
- .net 任务(Task)
1. Task (任务): 很容易调用 ThreadPool.QueueUserWorkItem 实现异步操作,但是这个技术有许多 .net 引入Task类型来使用任务. 如下几种方式都是实现异步的方 ...
- c3p0-数据库连接池原理
一直用c3p0很久了,但也没时间或没主动去研究过,直到最近频频在出现一些莫名其妙的问题,觉得还是有必要了解和研究一下. c3p0是什么 c3p0的出现,是为了大大提高应用程序和数据库之间访问效率的. ...
- PMBOK项目管理相关知识梳理
该次梳理,依据PMBOK(第五版),罗列项目管理十三章节重要的知识点. 一.引论1.项目的定义与举例:2.项目.项目组合.项目集与项目组织管理:3.范进本质是风资(范围.进度.成本.质量.风险.资源) ...