在ambari-server中修改了yarn的配置,重新启动服务,结果RM启动失败,错误也很奇怪,“不合理的资源请求,没有请求任何资源”!详细如下:

-- ::, FATAL resourcemanager.ResourceManager (ResourceManager.java:main(1495)) - Error starting ResourceManager
org.apache.hadoop.service.ServiceStateException: org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, no resources requested
at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$.run(ResourceManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$.run(ResourceManager.java:)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:)
Caused by: org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, no resources requested
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:)
... more
-- ::, INFO zookeeper.ZooKeeper (ZooKeeper.java:close(684)) - Session: 0x36546c044dc0113 closed
-- ::, INFO zookeeper.ClientCnxn (ClientCnxn.java:run()) - EventThread shut down
-- ::, INFO resourcemanager.ResourceManager (LogAdapter.java:info()) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down ResourceManager at ep-bd01/192.168.58.11

网上多方搜索无解,最后无奈重新启动主机,重启所有服务,结果成功! 再次重启RM,失败,原因同上。

一、配置RM HA,这次启动了,但是配置的两个RM节点都是standby状态! 期间再次修改配置文件无数次,无效,错误信息依然。

二、手工激活一台主机上的RM,失败,错误原因相同

[root@ep-bd01 zookeeper]# yarn rmadmin -transitionToActive --forceactive --forcemanual rm1
You have specified the --forcemanual flag. This flag is dangerous, as it can induce a split-brain scenario that WILL CORRUPT your HDFS namespace, possibly irrecoverably. It is recommended not to use this flag, but instead to shut down the cluster and disable automatic failover if you prefer to manually manage your HA state. You may abort safely by answering 'n' or hitting ^C now. Are you sure you want to continue? (Y or N) y
......
......
// :: WARN ha.ActiveStandbyElector: Exception handling the winning of election
org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:)
at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:)
at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:)
Caused by: org.apache.hadoop.ha.ServiceFailedException: Error when transitioning to Active mode
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:)
... more
Caused by: org.apache.hadoop.service.ServiceStateException: org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, no resources requested
at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$.run(ResourceManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$.run(ResourceManager.java:)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:)
... more

At Last! 经过好几天的网上搜索以及思考,这个错误可能是HDP3.0的新错误信息,和网上搜索到的一个问题有些类似,现象同样是RM启动成功后马上挂掉! 其中提到可能是RM回复application的状态引起的故障,急忙实验一下。

简而言之,使用zookeeper命令删除 /rmstore/ZKRMStateRoot/RMAppRoot 下面的所有子目录。

然后重启RM,没想到困扰几天的问题就这么解决了,具体请看输出吧(容我乐一会儿先)。

[root@ep-bd03 pg_log]# sudo -u zookeeper /usr/hdp/3.0.0.0-1634/zookeeper/bin/zkCli.sh

Connecting to localhost:
-- ::, - INFO [main:Environment@] - Client environment:zookeeper.version=3.4.---, built on // : GMT
-- ::, - INFO [main:Environment@] - Client environment:host.name=ep-bd03
-- ::, - INFO [main:Environment@] - Client environment:java.version=1.8.0_181
-- ::, - INFO [main:Environment@] - Client environment:java.vendor=Oracle Corporation
-- ::, - INFO [main:Environment@] - Client environment:java.home=/usr/java/jdk1..0_181-amd64/jre
-- ::, - INFO [main:Environment@] - Client environment:java.class.path=/usr/hdp/3.0.0.0-/zookeeper/bin/../build/classes:/usr/hdp/3.0.0.0-/zookeeper/bin/../build/lib/*.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/xercesMinimal-1.9.6.2.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/wagon-provider-api-2.4.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/wagon-http-shared4-2.4.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/wagon-http-shared-1.0-beta-6.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/wagon-http-lightweight-1.0-beta-6.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/wagon-http-2.4.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/wagon-file-1.0-beta-6.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/slf4j-log4j12-1.6.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/slf4j-api-1.6.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/plexus-utils-3.0.8.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/plexus-interpolation-1.11.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/plexus-container-default-1.0-alpha-9-stable-1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/netty-3.10.5.Final.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/nekohtml-1.9.6.2.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-settings-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-repository-metadata-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-project-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-profile-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-plugin-registry-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-model-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-error-diagnostics-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-artifact-manager-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-artifact-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-ant-tasks-2.1.3.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/log4j-1.2.16.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/jsoup-1.7.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/jline-0.9.94.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/commons-logging-1.1.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/commons-io-2.2.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/commons-codec-1.6.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/classworlds-1.1-alpha-2.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/backport-util-concurrent-3.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/ant-launcher-1.8.0.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/ant-1.8.0.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../zookeeper-3.4.6.3.0.0.0-1634.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../src/java/lib/*.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../conf::/usr/share/zookeeper/*
2018-08-29 15:04:02,399 - INFO [main:Environment@100] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2018-08-29 15:04:02,399 - INFO [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2018-08-29 15:04:02,399 - INFO [main:Environment@100] - Client environment:java.compiler=<NA>
2018-08-29 15:04:02,399 - INFO [main:Environment@100] - Client environment:os.name=Linux
2018-08-29 15:04:02,399 - INFO [main:Environment@100] - Client environment:os.arch=amd64
2018-08-29 15:04:02,399 - INFO [main:Environment@100] - Client environment:os.version=3.10.0-862.6.3.el7.x86_64
2018-08-29 15:04:02,399 - INFO [main:Environment@100] - Client environment:user.name=zookeeper
2018-08-29 15:04:02,399 - INFO [main:Environment@100] - Client environment:user.home=/var/lib/zookeeper
2018-08-29 15:04:02,400 - INFO [main:Environment@100] - Client environment:user.dir=/tmp/hsperfdata_zookeeper
2018-08-29 15:04:02,401 - INFO [main:ZooKeeper@438] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@6438a396
Welcome to ZooKeeper!
2018-08-29 15:04:02,417 - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1019] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
JLine support is enabled
2018-08-29 15:04:02,461 - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@864] - Socket connection established, initiating session, client: /127.0.0.1:7637, server: localhost/127.0.0.1:2181
[zk: localhost:2181(CONNECTING) 0] 2018-08-29 15:04:02,484 - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1279] - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x3658450e5f202da, negotiated timeout = 30000 WATCHER:: WatchedEvent state:SyncConnected type:None path:null [zk: localhost:2181(CONNECTED) 0] ls /rmstore
[ZKRMStateRoot]
[zk: localhost:2181(CONNECTED) 1] ls /rmstore/ZKRMStateRoot
[ReservationSystemRoot, RMAppRoot, AMRMTokenSecretManagerRoot, EpochNode, RMDTSecretManagerRoot, RMVersionNode]

[zk: localhost:2181(CONNECTED) 6] ls /rmstore/ZKRMStateRoot/RMAppRoot
[application_1534904073745_0001, HIERARCHIES, application_1534904073745_0003, application_1534904073745_0002]

[zk: localhost:2181(CONNECTED) 3] rmr /rmstore/ZKRMStateRoot/RMAppRoot/application_1534904073745_0001
[zk: localhost:2181(CONNECTED) 4] rmr /rmstore/ZKRMStateRoot/RMAppRoot/HIERARCHIES
[zk: localhost:2181(CONNECTED) 5] rmr /rmstore/ZKRMStateRoot/RMAppRoot/application_1534904073745_0003
[zk: localhost:2181(CONNECTED) 5] rmr /rmstore/ZKRMStateRoot/RMAppRoot/application_1534904073745_0002

[zk: localhost:2181(CONNECTED) 7] ls /rmstore/ZKRMStateRoot/RMAppRoot
[]
[zk: localhost:2181(CONNECTED) 8] 

YARN 启动后失败退出——没有请求资源——Invalid resource request, no resources request的更多相关文章

  1. springboot启动后自动退出

    有时新建的springboot启动后自动退出运行,如图所示: 此种情况大都数是因为pom文件加入了tomcat的依赖,与springboot内嵌的tomcat冲突导致,所以只需将pom文件中的tomc ...

  2. docker 容器启动后立马退出的解决方法

    原因: 容器同时只能管理一个进程,如果这个进程结束了容器就退出了,但是不表示容器只能运行一个进程(其他进程可在后台运行),但是要使容器不退出必须要有一个进程在前台执行.   解决方案: 启动脚本最后一 ...

  3. 记录一次OracleJDK开发的项目发部到Linux中使用OpenJDK启动后失败的错误的解决方案

    一.现象 基于JAVA SpringBoot2.0.4的项目,发部后项目发部后,放到OpenJDK环境中运行时,提示下列错误: 2019-10-22 10:03:55 [main] WARN  o.s ...

  4. 在nginx启动后,如果我们要操作nginx,要怎么做呢 别增加无谓的上下文切换 异步非阻塞的方式来处理请求 worker的个数为cpu的核数 红黑树

    nginx平台初探(100%) — Nginx开发从入门到精通 http://ten 众所周知,nginx性能高,而nginx的高性能与其架构是分不开的.那么nginx究竟是怎么样的呢?这一节我们先来 ...

  5. SQL Server(MSSQLSERVER)启动失败,提示“请求失败或服务未及时响应

    1.SQL Server(MSSQLSERVER)启动失败,提示“请求失败或服务未及时响应. --------------------------- SQL Server 配置管理器 -------- ...

  6. 【技术贴】第二篇 :解决使用maven jetty启动后无法加载修改过后的静态资源

    之前写过第一篇:[技术贴]解决使用maven jetty启动后无法加载修改过后的静态资源 一直用着挺舒服的,直到今天,出现了又不能修改静态js,jsp等资源的现象.很是苦闷. 经过调错处理之后,发现是 ...

  7. Servlet访问路径的两种方式、Servlet生命周期特点、计算服务启动后的访问次数、Get请求、Post请求

    Servlet访问路径的两种方式: 1:注解 即在Servlet里写一个@WebServlet @WebServlet("/myServlet") 2:配置web.xml < ...

  8. Spark(四十九):Spark On YARN启动流程源码分析(一)

    引导: 该篇章主要讲解执行spark-submit.sh提交到将任务提交给Yarn阶段代码分析. spark-submit的入口函数 一般提交一个spark作业的方式采用spark-submit来提交 ...

  9. Spark On YARN启动流程源码分析(一)

    本文主要参考: a. https://www.cnblogs.com/yy3b2007com/p/10934090.html 0. 说明 a. 关于spark源码会不定期的更新与补充 b. 对于spa ...

随机推荐

  1. VS Code打造一个完美的Springboot开发环境

    对于使用Springboot环境开发java应用,首选IDE还是IntelliJ IDEA(2018),当前版本已经很流畅了,现在开发用的电脑配置基本都能够很6的跑起来,IDEA用起来真心爽啊,比Ec ...

  2. BZOJ3457 : Ring

    根据Polya定理: \[ans=\frac{\sum_{d|n}\varphi(d)cal(\frac{n}{d})}{n}\] 其中$cal(n)$表示长度为$n$的无限循环后包含$S$的串的数量 ...

  3. css3 学习 重点 常用

    1 -webkit-    -moz-   -o-浏览器兼容 2 box-sizing:border-box;   两个近乎一样的div一样的样式 平分一个div  定义:属性允许您以确切的方式定义适 ...

  4. C C++ 数字后面加 LL是什么意思

    long long类型,在赋初值的时候,如果大于2的31次方-1,那么后面需要加上LL

  5. 如何实现织梦dedecms表单提交时发送邮箱功能【已解决】

    我们通过织梦系统制作网站时,很多客户需要有在线留言功能,这时就会用到自定义表单.但是很多用户觉得经常登陆后台查看留言信息太麻烦了,于是想能否在提交留言是直接把内容发送到指定邮箱.网站经过测试终于实现了 ...

  6. SpringMVC知识点

    一.SpringMVC 1.HelloWorld案例 ①步骤: 加jar包 在web.xml文件中配置DispatcherServlet 加入SpringMVC的配置文件 编写处理请求的处理器,并标识 ...

  7. Spring源码分析 之浅谈设计模式

    一直想专门写个Spring源码的博客,工作了,可以全身性的投入到互联网行业中.虽然加班很严重,但是依然很开心.趁着凌晨有时间,总结总结. 首先spring,相信大家都很熟悉了. 1.轻量级  零配置, ...

  8. java springboot2 jquery 抽奖项目源码

    java+springboot2+jquery+jdk8   实现的多种抽奖效果! 体验抽奖地址: http://47.98.175.6:8091/ 赞助获得源码!!!

  9. 使用ScriptableObject创建.asset文件

    .asset一般用来存储一些配置,比如SDK初始化的相关参数. using System.Collections.Generic; using UnityEngine; namespace XXX { ...

  10. Chart:Grafana

    ylbtech-Chart:Grafana 1.返回顶部 1-1. 2.返回顶部   3.返回顶部   4.返回顶部   5.返回顶部 0. https://grafana.com/ 1. http: ...