YARN 启动后失败退出——没有请求资源——Invalid resource request, no resources request
在ambari-server中修改了yarn的配置,重新启动服务,结果RM启动失败,错误也很奇怪,“不合理的资源请求,没有请求任何资源”!详细如下:
-- ::, FATAL resourcemanager.ResourceManager (ResourceManager.java:main(1495)) - Error starting ResourceManager
org.apache.hadoop.service.ServiceStateException: org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, no resources requested
at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$.run(ResourceManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$.run(ResourceManager.java:)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:)
Caused by: org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, no resources requested
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:)
... more
-- ::, INFO zookeeper.ZooKeeper (ZooKeeper.java:close(684)) - Session: 0x36546c044dc0113 closed
-- ::, INFO zookeeper.ClientCnxn (ClientCnxn.java:run()) - EventThread shut down
-- ::, INFO resourcemanager.ResourceManager (LogAdapter.java:info()) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down ResourceManager at ep-bd01/192.168.58.11
网上多方搜索无解,最后无奈重新启动主机,重启所有服务,结果成功! 再次重启RM,失败,原因同上。
一、配置RM HA,这次启动了,但是配置的两个RM节点都是standby状态! 期间再次修改配置文件无数次,无效,错误信息依然。
二、手工激活一台主机上的RM,失败,错误原因相同
[root@ep-bd01 zookeeper]# yarn rmadmin -transitionToActive --forceactive --forcemanual rm1
You have specified the --forcemanual flag. This flag is dangerous, as it can induce a split-brain scenario that WILL CORRUPT your HDFS namespace, possibly irrecoverably. It is recommended not to use this flag, but instead to shut down the cluster and disable automatic failover if you prefer to manually manage your HA state. You may abort safely by answering 'n' or hitting ^C now. Are you sure you want to continue? (Y or N) y
......
......
// :: WARN ha.ActiveStandbyElector: Exception handling the winning of election
org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:)
at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:)
at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:)
Caused by: org.apache.hadoop.ha.ServiceFailedException: Error when transitioning to Active mode
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:)
... more
Caused by: org.apache.hadoop.service.ServiceStateException: org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, no resources requested
at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$.run(ResourceManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$.run(ResourceManager.java:)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:)
... more
At Last! 经过好几天的网上搜索以及思考,这个错误可能是HDP3.0的新错误信息,和网上搜索到的一个问题有些类似,现象同样是RM启动成功后马上挂掉! 其中提到可能是RM回复application的状态引起的故障,急忙实验一下。
简而言之,使用zookeeper命令删除 /rmstore/ZKRMStateRoot/RMAppRoot 下面的所有子目录。
然后重启RM,没想到困扰几天的问题就这么解决了,具体请看输出吧(容我乐一会儿先)。
[root@ep-bd03 pg_log]# sudo -u zookeeper /usr/hdp/3.0.0.0-1634/zookeeper/bin/zkCli.sh Connecting to localhost:
-- ::, - INFO [main:Environment@] - Client environment:zookeeper.version=3.4.---, built on // : GMT
-- ::, - INFO [main:Environment@] - Client environment:host.name=ep-bd03
-- ::, - INFO [main:Environment@] - Client environment:java.version=1.8.0_181
-- ::, - INFO [main:Environment@] - Client environment:java.vendor=Oracle Corporation
-- ::, - INFO [main:Environment@] - Client environment:java.home=/usr/java/jdk1..0_181-amd64/jre
-- ::, - INFO [main:Environment@] - Client environment:java.class.path=/usr/hdp/3.0.0.0-/zookeeper/bin/../build/classes:/usr/hdp/3.0.0.0-/zookeeper/bin/../build/lib/*.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/xercesMinimal-1.9.6.2.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/wagon-provider-api-2.4.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/wagon-http-shared4-2.4.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/wagon-http-shared-1.0-beta-6.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/wagon-http-lightweight-1.0-beta-6.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/wagon-http-2.4.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/wagon-file-1.0-beta-6.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/slf4j-log4j12-1.6.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/slf4j-api-1.6.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/plexus-utils-3.0.8.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/plexus-interpolation-1.11.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/plexus-container-default-1.0-alpha-9-stable-1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/netty-3.10.5.Final.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/nekohtml-1.9.6.2.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-settings-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-repository-metadata-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-project-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-profile-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-plugin-registry-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-model-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-error-diagnostics-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-artifact-manager-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-artifact-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-ant-tasks-2.1.3.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/log4j-1.2.16.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/jsoup-1.7.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/jline-0.9.94.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/commons-logging-1.1.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/commons-io-2.2.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/commons-codec-1.6.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/classworlds-1.1-alpha-2.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/backport-util-concurrent-3.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/ant-launcher-1.8.0.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/ant-1.8.0.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../zookeeper-3.4.6.3.0.0.0-1634.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../src/java/lib/*.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../conf::/usr/share/zookeeper/*
2018-08-29 15:04:02,399 - INFO [main:Environment@100] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2018-08-29 15:04:02,399 - INFO [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2018-08-29 15:04:02,399 - INFO [main:Environment@100] - Client environment:java.compiler=<NA>
2018-08-29 15:04:02,399 - INFO [main:Environment@100] - Client environment:os.name=Linux
2018-08-29 15:04:02,399 - INFO [main:Environment@100] - Client environment:os.arch=amd64
2018-08-29 15:04:02,399 - INFO [main:Environment@100] - Client environment:os.version=3.10.0-862.6.3.el7.x86_64
2018-08-29 15:04:02,399 - INFO [main:Environment@100] - Client environment:user.name=zookeeper
2018-08-29 15:04:02,399 - INFO [main:Environment@100] - Client environment:user.home=/var/lib/zookeeper
2018-08-29 15:04:02,400 - INFO [main:Environment@100] - Client environment:user.dir=/tmp/hsperfdata_zookeeper
2018-08-29 15:04:02,401 - INFO [main:ZooKeeper@438] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@6438a396
Welcome to ZooKeeper!
2018-08-29 15:04:02,417 - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1019] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
JLine support is enabled
2018-08-29 15:04:02,461 - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@864] - Socket connection established, initiating session, client: /127.0.0.1:7637, server: localhost/127.0.0.1:2181
[zk: localhost:2181(CONNECTING) 0] 2018-08-29 15:04:02,484 - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1279] - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x3658450e5f202da, negotiated timeout = 30000 WATCHER:: WatchedEvent state:SyncConnected type:None path:null [zk: localhost:2181(CONNECTED) 0] ls /rmstore
[ZKRMStateRoot]
[zk: localhost:2181(CONNECTED) 1] ls /rmstore/ZKRMStateRoot
[ReservationSystemRoot, RMAppRoot, AMRMTokenSecretManagerRoot, EpochNode, RMDTSecretManagerRoot, RMVersionNode]
[zk: localhost:2181(CONNECTED) 6] ls /rmstore/ZKRMStateRoot/RMAppRoot
[application_1534904073745_0001, HIERARCHIES, application_1534904073745_0003, application_1534904073745_0002]
[zk: localhost:2181(CONNECTED) 3] rmr /rmstore/ZKRMStateRoot/RMAppRoot/application_1534904073745_0001
[zk: localhost:2181(CONNECTED) 4] rmr /rmstore/ZKRMStateRoot/RMAppRoot/HIERARCHIES
[zk: localhost:2181(CONNECTED) 5] rmr /rmstore/ZKRMStateRoot/RMAppRoot/application_1534904073745_0003
[zk: localhost:2181(CONNECTED) 5] rmr /rmstore/ZKRMStateRoot/RMAppRoot/application_1534904073745_0002 [zk: localhost:2181(CONNECTED) 7] ls /rmstore/ZKRMStateRoot/RMAppRoot
[]
[zk: localhost:2181(CONNECTED) 8]
YARN 启动后失败退出——没有请求资源——Invalid resource request, no resources request的更多相关文章
- springboot启动后自动退出
有时新建的springboot启动后自动退出运行,如图所示: 此种情况大都数是因为pom文件加入了tomcat的依赖,与springboot内嵌的tomcat冲突导致,所以只需将pom文件中的tomc ...
- docker 容器启动后立马退出的解决方法
原因: 容器同时只能管理一个进程,如果这个进程结束了容器就退出了,但是不表示容器只能运行一个进程(其他进程可在后台运行),但是要使容器不退出必须要有一个进程在前台执行. 解决方案: 启动脚本最后一 ...
- 记录一次OracleJDK开发的项目发部到Linux中使用OpenJDK启动后失败的错误的解决方案
一.现象 基于JAVA SpringBoot2.0.4的项目,发部后项目发部后,放到OpenJDK环境中运行时,提示下列错误: 2019-10-22 10:03:55 [main] WARN o.s ...
- 在nginx启动后,如果我们要操作nginx,要怎么做呢 别增加无谓的上下文切换 异步非阻塞的方式来处理请求 worker的个数为cpu的核数 红黑树
nginx平台初探(100%) — Nginx开发从入门到精通 http://ten 众所周知,nginx性能高,而nginx的高性能与其架构是分不开的.那么nginx究竟是怎么样的呢?这一节我们先来 ...
- SQL Server(MSSQLSERVER)启动失败,提示“请求失败或服务未及时响应
1.SQL Server(MSSQLSERVER)启动失败,提示“请求失败或服务未及时响应. --------------------------- SQL Server 配置管理器 -------- ...
- 【技术贴】第二篇 :解决使用maven jetty启动后无法加载修改过后的静态资源
之前写过第一篇:[技术贴]解决使用maven jetty启动后无法加载修改过后的静态资源 一直用着挺舒服的,直到今天,出现了又不能修改静态js,jsp等资源的现象.很是苦闷. 经过调错处理之后,发现是 ...
- Servlet访问路径的两种方式、Servlet生命周期特点、计算服务启动后的访问次数、Get请求、Post请求
Servlet访问路径的两种方式: 1:注解 即在Servlet里写一个@WebServlet @WebServlet("/myServlet") 2:配置web.xml < ...
- Spark(四十九):Spark On YARN启动流程源码分析(一)
引导: 该篇章主要讲解执行spark-submit.sh提交到将任务提交给Yarn阶段代码分析. spark-submit的入口函数 一般提交一个spark作业的方式采用spark-submit来提交 ...
- Spark On YARN启动流程源码分析(一)
本文主要参考: a. https://www.cnblogs.com/yy3b2007com/p/10934090.html 0. 说明 a. 关于spark源码会不定期的更新与补充 b. 对于spa ...
随机推荐
- 修改Chrome启动参数解决跨域问题
这个做法仅仅是针对自己本机,只是一个权宜方案 --disable-web-security --user-data-dir=本地用户信息目录 之后启动Chrome浏览器即可
- 浏览器JS报错Uncaught RangeError Maximum call stack size exceeded
JavaScript错误:Uncaught RangeError: Maximum call stack size exceeded 堆栈溢出 原因:有小类到大类的递归查询导致溢出 解决方法思想: A ...
- 3ds max学习笔记(十)-- 实例操作(镜像和对齐)
1,镜像 选择物体对象然后点击: 偏移:新对象距离轴心所在的直线的距离: 2.对齐 栗子: 选择小球,点击[对齐];鼠标放置在图种位置,点击鼠标左键 出现弹框 调整位置: 先选择对齐位置-->当 ...
- sublime text3中emmet插件的使用
首先,想要快速编码需 要在编辑器中安装常用插件,下面是emmet插件的使用: html5文档结构的生成方式: 1).!+tab键 2).html:5 +tab键 头部head中meta字符集的生成: ...
- Java泛型之Type体系
Type是java类型信息体系中的顶级接口,其中Class就是Type的一个直接实现类.此外,Type还有有四个直接子接口:ParameterizedType,TypeVariable,Wildcar ...
- pointer-net
Pointer network 主要用在解决组合优化类问题(TSP, Convex Hull等等),实际上是Sequence to Sequence learning中encoder RNN和deco ...
- CSS元素定位
使用 CSS 选择器定位元素 CSS可以通过元素的id.class.标签(input)这三个常规属性直接定位到,而这三种编写方式,在HTML中编写style的时候,可以进行标识如: #su ...
- python之socket编程2
1 套接字发展史及发展 套接字起源于 20 世纪 70 年代加利福尼亚大学伯克利分校版本的 Unix,即人们所说的 BSD Unix. 因此,有时人们也把套接字称为“伯克利套接字”或“BSD 套接字” ...
- poj3468 A Simple Problem with Integers(线段树区间更新)
https://vjudge.net/problem/POJ-3468 线段树区间更新(lazy数组)模板题 #include<iostream> #include<cstdio&g ...
- [转载]理解 Git 分支管理最佳实践
原文 理解 Git 分支管理最佳实践 Git 分支有哪些 在进行分支管理讲解之前,我们先来对分支进行一个简单的分类,并明确每一类分支的用途. 分支分类 根据生命周期区分 主分支:master,deve ...