在ambari-server中修改了yarn的配置,重新启动服务,结果RM启动失败,错误也很奇怪,“不合理的资源请求,没有请求任何资源”!详细如下:

-- ::, FATAL resourcemanager.ResourceManager (ResourceManager.java:main(1495)) - Error starting ResourceManager
org.apache.hadoop.service.ServiceStateException: org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, no resources requested
at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$.run(ResourceManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$.run(ResourceManager.java:)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:)
Caused by: org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, no resources requested
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:)
... more
-- ::, INFO zookeeper.ZooKeeper (ZooKeeper.java:close(684)) - Session: 0x36546c044dc0113 closed
-- ::, INFO zookeeper.ClientCnxn (ClientCnxn.java:run()) - EventThread shut down
-- ::, INFO resourcemanager.ResourceManager (LogAdapter.java:info()) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down ResourceManager at ep-bd01/192.168.58.11

网上多方搜索无解,最后无奈重新启动主机,重启所有服务,结果成功! 再次重启RM,失败,原因同上。

一、配置RM HA,这次启动了,但是配置的两个RM节点都是standby状态! 期间再次修改配置文件无数次,无效,错误信息依然。

二、手工激活一台主机上的RM,失败,错误原因相同

[root@ep-bd01 zookeeper]# yarn rmadmin -transitionToActive --forceactive --forcemanual rm1
You have specified the --forcemanual flag. This flag is dangerous, as it can induce a split-brain scenario that WILL CORRUPT your HDFS namespace, possibly irrecoverably. It is recommended not to use this flag, but instead to shut down the cluster and disable automatic failover if you prefer to manually manage your HA state. You may abort safely by answering 'n' or hitting ^C now. Are you sure you want to continue? (Y or N) y
......
......
// :: WARN ha.ActiveStandbyElector: Exception handling the winning of election
org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:)
at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:)
at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:)
Caused by: org.apache.hadoop.ha.ServiceFailedException: Error when transitioning to Active mode
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:)
... more
Caused by: org.apache.hadoop.service.ServiceStateException: org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, no resources requested
at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$.run(ResourceManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$.run(ResourceManager.java:)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:)
... more

At Last! 经过好几天的网上搜索以及思考,这个错误可能是HDP3.0的新错误信息,和网上搜索到的一个问题有些类似,现象同样是RM启动成功后马上挂掉! 其中提到可能是RM回复application的状态引起的故障,急忙实验一下。

简而言之,使用zookeeper命令删除 /rmstore/ZKRMStateRoot/RMAppRoot 下面的所有子目录。

然后重启RM,没想到困扰几天的问题就这么解决了,具体请看输出吧(容我乐一会儿先)。

[root@ep-bd03 pg_log]# sudo -u zookeeper /usr/hdp/3.0.0.0-1634/zookeeper/bin/zkCli.sh

Connecting to localhost:
-- ::, - INFO [main:Environment@] - Client environment:zookeeper.version=3.4.---, built on // : GMT
-- ::, - INFO [main:Environment@] - Client environment:host.name=ep-bd03
-- ::, - INFO [main:Environment@] - Client environment:java.version=1.8.0_181
-- ::, - INFO [main:Environment@] - Client environment:java.vendor=Oracle Corporation
-- ::, - INFO [main:Environment@] - Client environment:java.home=/usr/java/jdk1..0_181-amd64/jre
-- ::, - INFO [main:Environment@] - Client environment:java.class.path=/usr/hdp/3.0.0.0-/zookeeper/bin/../build/classes:/usr/hdp/3.0.0.0-/zookeeper/bin/../build/lib/*.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/xercesMinimal-1.9.6.2.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/wagon-provider-api-2.4.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/wagon-http-shared4-2.4.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/wagon-http-shared-1.0-beta-6.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/wagon-http-lightweight-1.0-beta-6.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/wagon-http-2.4.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/wagon-file-1.0-beta-6.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/slf4j-log4j12-1.6.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/slf4j-api-1.6.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/plexus-utils-3.0.8.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/plexus-interpolation-1.11.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/plexus-container-default-1.0-alpha-9-stable-1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/netty-3.10.5.Final.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/nekohtml-1.9.6.2.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-settings-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-repository-metadata-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-project-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-profile-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-plugin-registry-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-model-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-error-diagnostics-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-artifact-manager-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-artifact-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-ant-tasks-2.1.3.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/log4j-1.2.16.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/jsoup-1.7.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/jline-0.9.94.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/commons-logging-1.1.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/commons-io-2.2.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/commons-codec-1.6.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/classworlds-1.1-alpha-2.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/backport-util-concurrent-3.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/ant-launcher-1.8.0.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/ant-1.8.0.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../zookeeper-3.4.6.3.0.0.0-1634.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../src/java/lib/*.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../conf::/usr/share/zookeeper/*
2018-08-29 15:04:02,399 - INFO [main:Environment@100] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2018-08-29 15:04:02,399 - INFO [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2018-08-29 15:04:02,399 - INFO [main:Environment@100] - Client environment:java.compiler=<NA>
2018-08-29 15:04:02,399 - INFO [main:Environment@100] - Client environment:os.name=Linux
2018-08-29 15:04:02,399 - INFO [main:Environment@100] - Client environment:os.arch=amd64
2018-08-29 15:04:02,399 - INFO [main:Environment@100] - Client environment:os.version=3.10.0-862.6.3.el7.x86_64
2018-08-29 15:04:02,399 - INFO [main:Environment@100] - Client environment:user.name=zookeeper
2018-08-29 15:04:02,399 - INFO [main:Environment@100] - Client environment:user.home=/var/lib/zookeeper
2018-08-29 15:04:02,400 - INFO [main:Environment@100] - Client environment:user.dir=/tmp/hsperfdata_zookeeper
2018-08-29 15:04:02,401 - INFO [main:ZooKeeper@438] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@6438a396
Welcome to ZooKeeper!
2018-08-29 15:04:02,417 - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1019] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
JLine support is enabled
2018-08-29 15:04:02,461 - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@864] - Socket connection established, initiating session, client: /127.0.0.1:7637, server: localhost/127.0.0.1:2181
[zk: localhost:2181(CONNECTING) 0] 2018-08-29 15:04:02,484 - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1279] - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x3658450e5f202da, negotiated timeout = 30000 WATCHER:: WatchedEvent state:SyncConnected type:None path:null [zk: localhost:2181(CONNECTED) 0] ls /rmstore
[ZKRMStateRoot]
[zk: localhost:2181(CONNECTED) 1] ls /rmstore/ZKRMStateRoot
[ReservationSystemRoot, RMAppRoot, AMRMTokenSecretManagerRoot, EpochNode, RMDTSecretManagerRoot, RMVersionNode]

[zk: localhost:2181(CONNECTED) 6] ls /rmstore/ZKRMStateRoot/RMAppRoot
[application_1534904073745_0001, HIERARCHIES, application_1534904073745_0003, application_1534904073745_0002]

[zk: localhost:2181(CONNECTED) 3] rmr /rmstore/ZKRMStateRoot/RMAppRoot/application_1534904073745_0001
[zk: localhost:2181(CONNECTED) 4] rmr /rmstore/ZKRMStateRoot/RMAppRoot/HIERARCHIES
[zk: localhost:2181(CONNECTED) 5] rmr /rmstore/ZKRMStateRoot/RMAppRoot/application_1534904073745_0003
[zk: localhost:2181(CONNECTED) 5] rmr /rmstore/ZKRMStateRoot/RMAppRoot/application_1534904073745_0002

[zk: localhost:2181(CONNECTED) 7] ls /rmstore/ZKRMStateRoot/RMAppRoot
[]
[zk: localhost:2181(CONNECTED) 8] 

YARN 启动后失败退出——没有请求资源——Invalid resource request, no resources request的更多相关文章

  1. springboot启动后自动退出

    有时新建的springboot启动后自动退出运行,如图所示: 此种情况大都数是因为pom文件加入了tomcat的依赖,与springboot内嵌的tomcat冲突导致,所以只需将pom文件中的tomc ...

  2. docker 容器启动后立马退出的解决方法

    原因: 容器同时只能管理一个进程,如果这个进程结束了容器就退出了,但是不表示容器只能运行一个进程(其他进程可在后台运行),但是要使容器不退出必须要有一个进程在前台执行.   解决方案: 启动脚本最后一 ...

  3. 记录一次OracleJDK开发的项目发部到Linux中使用OpenJDK启动后失败的错误的解决方案

    一.现象 基于JAVA SpringBoot2.0.4的项目,发部后项目发部后,放到OpenJDK环境中运行时,提示下列错误: 2019-10-22 10:03:55 [main] WARN  o.s ...

  4. 在nginx启动后,如果我们要操作nginx,要怎么做呢 别增加无谓的上下文切换 异步非阻塞的方式来处理请求 worker的个数为cpu的核数 红黑树

    nginx平台初探(100%) — Nginx开发从入门到精通 http://ten 众所周知,nginx性能高,而nginx的高性能与其架构是分不开的.那么nginx究竟是怎么样的呢?这一节我们先来 ...

  5. SQL Server(MSSQLSERVER)启动失败,提示“请求失败或服务未及时响应

    1.SQL Server(MSSQLSERVER)启动失败,提示“请求失败或服务未及时响应. --------------------------- SQL Server 配置管理器 -------- ...

  6. 【技术贴】第二篇 :解决使用maven jetty启动后无法加载修改过后的静态资源

    之前写过第一篇:[技术贴]解决使用maven jetty启动后无法加载修改过后的静态资源 一直用着挺舒服的,直到今天,出现了又不能修改静态js,jsp等资源的现象.很是苦闷. 经过调错处理之后,发现是 ...

  7. Servlet访问路径的两种方式、Servlet生命周期特点、计算服务启动后的访问次数、Get请求、Post请求

    Servlet访问路径的两种方式: 1:注解 即在Servlet里写一个@WebServlet @WebServlet("/myServlet") 2:配置web.xml < ...

  8. Spark(四十九):Spark On YARN启动流程源码分析(一)

    引导: 该篇章主要讲解执行spark-submit.sh提交到将任务提交给Yarn阶段代码分析. spark-submit的入口函数 一般提交一个spark作业的方式采用spark-submit来提交 ...

  9. Spark On YARN启动流程源码分析(一)

    本文主要参考: a. https://www.cnblogs.com/yy3b2007com/p/10934090.html 0. 说明 a. 关于spark源码会不定期的更新与补充 b. 对于spa ...

随机推荐

  1. ACM知识点总结

    1 枚举 2 模拟 3 构造 4 位运算的应用 5 查找 5.1 二分查找 5.2 分块查找 5.3 哈希查找HASH 5.3.1 线性探测法 5.3.2 字符串与哈希 6 搜索 6.1 深度优先搜索 ...

  2. vmware提示请卸载干净再重新安装的解决办法

    结论:删掉   HKEY_LOCAL_MACHINE\\SOFTWARE\Wow6432Node\VMware, Inc.    就可以了. ----------------------------- ...

  3. jmeter接口测试实例1-添加学生信息

    jmeter实例1:添加学生信息 进入jmeter,添加线程组改名称为添加学生信息(为了好区分接口),添加http请求,输入IP,方法,路径,在body data中输入json串,同上面postman ...

  4. ACM-ICPC 2018 南京赛区网络预赛 E题

    ACM-ICPC 2018 南京赛区网络预赛 E题 题目链接: https://nanti.jisuanke.com/t/30994 Dlsj is competing in a contest wi ...

  5. Java知识回顾 (10) 线程

    再次声明,正如(1)中所描述的,本资料来自于runoob,略有修改. 一条线程指的是进程中一个单一顺序的控制流,一个进程中可以并发多个线程,每条线程并行执行不同的任务. Java 给多线程编程提供了内 ...

  6. UML建模——用例图(Use Case Diagram)

    用例图主要用来描述角色以及角色与用例之间的连接关系.说明的是谁要使用系统,以及他们使用该系统可以做些什么.一个用例图包含了多个模型元素,如系统.参与者和用例,并且显示这些元素之间的各种关系,如泛化.关 ...

  7. centos7 快速安装 mariadb(mysql)

    从最新版本的linux系统开始,默认的是 Mariadb而不是mysql! 使用系统自带的repos安装很简单: yum install mariadb mariadb-server systemct ...

  8. std::vector push_back报错Access violation

    C/C++ code   ? 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 #include < ...

  9. Project with Match in aggregate not working in mongodb

    [问题] 2down votefavorite I am trying to fetch data based on some match condition. First I've tried th ...

  10. Fiddler Composer 模拟post请求

    在模拟post请求的时候,发现服务器端无法接收post参数 发现原来的请求表头的设置问题加上表头 Content-Type: application/x-www-form-urlencoded 后正常