hbase报错ERROR: org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: Server is not running yet 采坑记
1、错误异常信息:
Exception in thread "main" java.lang.IllegalArgumentException: Failed to find metadata store by url: kylin_metadata@hbase
at org.apache.kylin.common.persistence.ResourceStore.createResourceStore(ResourceStore.java:99)
at org.apache.kylin.common.persistence.ResourceStore.getStore(ResourceStore.java:111)
at org.apache.kylin.rest.service.AclTableMigrationTool.checkIfNeedMigrate(AclTableMigrationTool.java:99)
at org.apache.kylin.tool.AclTableMigrationCLI.main(AclTableMigrationCLI.java:43)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.kylin.common.persistence.ResourceStore.createResourceStore(ResourceStore.java:92)
... 3 more
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=2, exceptions:
Wed Aug 04 11:08:45 CST 2021, RpcRetryingCaller{globalStartTime=1628046524833, pause=100, maxAttempts=2}, org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=2, exceptions:
Wed Aug 04 11:08:45 CST 2021, RpcRetryingCaller{globalStartTime=1628046524855, pause=100, maxAttempts=2}, org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: Server recbd5.hwwt2.com,16020,1628044355182 is not running yet
at org.apache.hadoop.hbase.regionserver.RSRpcServices.checkOpen(RSRpcServices.java:1501)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2440)
at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:41998)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304) Wed Aug 04 11:08:45 CST 2021, RpcRetryingCaller{globalStartTime=1628046524855, pause=100, maxAttempts=2}, org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: Server recbd5.hwwt2.com,16020,1628044355182 is not running yet
at org.apache.hadoop.hbase.regionserver.RSRpcServices.checkOpen(RSRpcServices.java:1501)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2440)
at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:41998)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304) Wed Aug 04 11:08:45 CST 2021, RpcRetryingCaller{globalStartTime=1628046524833, pause=100, maxAttempts=2}, org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=2, exceptions:
Wed Aug 04 11:08:45 CST 2021, RpcRetryingCaller{globalStartTime=1628046525453, pause=100, maxAttempts=2}, org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: Server recbd5.hwwt2.com,16020,1628044355182 is not running yet
at org.apache.hadoop.hbase.regionserver.RSRpcServices.checkOpen(RSRpcServices.java:1501)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2440)
at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:41998)
2、解决方式:
1)、检查发现此时hadoop 处于安全模式,需要让hadoop退出安全模式
hadoop dfsadmin -safemode leave
2) 重启hbase ,发现Hbase还是服务不能正常使用,Hmaster异常,Regionserver异常,异常日志如下:
Hmaster关键异常日志
2018-05-25 10:19:12,737 DEBUG[hadoop001:60000.activeMasterManager] wal.WALProcedureStore: Opening state-log:FileStatus{path=hdfs://beh/hbase/MasterProcWALs/state-00000000000000036689.log;isDirectory=false; length=45760804; replication=3; blocksize=536870912;modification_time=1527123981127; access_time=1527165673882; owner=hadoop;group=hadoop; permission=rw-rw-r--; isSymlink=false} 2018-05-25 10:19:12,742 INFO [hadoop001:60000.activeMasterManager]util.FSHDFSUtils: Recover lease on dfs filehdfs://beh/hbase/MasterProcWALs/state-00000000000000036690.log 2018-05-25 10:19:12,742 INFO [hadoop001:60000.activeMasterManager]util.FSHDFSUtils: Recovered lease, attempt=0 onfile=hdfs://beh/hbase/MasterProcWALs/state-00000000000000036690.log after 0ms 2018-05-25 10:19:12,742 DEBUG[hadoop001:60000.activeMasterManager] wal.WALProcedureStore: Opening state-log:FileStatus{path=hdfs://beh/hbase/MasterProcWALs/state-00000000000000036690.log;isDirectory=false; length=45761668; replication=3; blocksize=536870912;modification_time=1527123982242; access_time=1527165673883; owner=hadoop;group=hadoop; permission=rw-rw-r--; isSymlink=false} 2018-05-25 10:19:12,767 INFO [hadoop001:60000.activeMasterManager]util.FSHDFSUtils: Recover lease on dfs filehdfs://beh/hbase/MasterProcWALs/state-00000000000000036691.log 2018-05-25 10:19:12,768 INFO [hadoop001:60000.activeMasterManager]util.FSHDFSUtils: Recovered lease, attempt=0 onfile=hdfs://beh/hbase/MasterProcWALs/state-00000000000000036691.log after 1ms . . . 2018-05-25 10:29:29,656 DEBUG[B.defaultRpcServer.handler=31,queue=13,port=60000] ipc.RpcServer: B.defaultRpcServer.handler=31,queue=13,port=60000:callId: 301 service: RegionServerStatusService methodName: RegionServerStartupsize: 46 connection: 172.33.2.22:38698 org.apache.hadoop.hbase.ipc.ServerNotRunningYetException:Server is not running yet
at org.apache.hadoop.hbase.master.HMaster.checkServiceStarted(HMaster.java:2296)
atorg.apache.hadoop.hbase.master.MasterRpcServices.regionServerStartup(MasterRpcServices.java:361)
atorg.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:8615)
atorg.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2170)
atorg.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:109)
atorg.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
atorg.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
atjava.lang.Thread.run(Thread.java:745)
Regionserver关键异常日志
20180525 10:31:14,446 WARN [regionserver/hadoo031730 regionserver .HRegionServer: reportForDuty failed; sleeping and then retrying.
201805-25 10:31:17 446 INFO [regionserver/hadop030 regionserver .HRegionServer: reportForDuty to master=hadoop001 , 60o00, 1527214458906 with port=60025, startcode=1527214459823
20180525 10:31:17 447 DEBUG [regionserver/hadoo03730 regionserver .HRegionServer: Master is not running yet
20180525 10:31:17 447 WARN [regionserver/hadoo03730 regionserver .HRegionServer: reportForDuty failed
sleeping and then retrying
20180525 10:31:20,447 INFO [regionserver/hadoo031730 regionserver .HRegionServer: reportForDuty to master=hadoop001 60000, 1527214458906 with port60025, startcode1527214459823
20180525 10:31:20,448 DEBUG [regionserver/hadoop003173 regionserver .HRegionServer: Master is not running yet
20180525 10:31:20,448 WARN [ regionserver/hadoop003173 regionserver .HRegionServer: reportForDuty failed
sleeping and then retrying.
201805-25 10:31:23,448 INFO [regionserver/hadop030 regionserver .HRegionServer: reportForDuty to master=hadoop001 , 60000,1527214458906 with port=60025, startcode=1527214459823
20180525 10:31:23,449 DEBUG [regionserver/hadoop003/173 regionserver .HRegionServer: Master is not running yet
Datanode关键异常日志
2018-05-25 11:04:20,540 INFOorg.apache.hadoop.hdfs.server.datanode.DataNode: Likely the client has stoppedreading, disconnecting it (hadoop028:50010:DataXceiver error processingREAD_BLOCK operation src: /172.33.2.17:39882dst: /172.33.2.44:50010);
java.net.SocketTimeoutException: 600000 millistimeout while waiting for channel to be ready for write. ch :java.nio.channels.SocketChannel[connected local=/172.33.2.44:50010remote=/172.33.2.17:39882] 2018-05-25 11:04:20,652 INFOorg.apache.hadoop.hdfs.server.datanode.DataNode: Likely the client has stoppedreading, disconnecting it (hadoop028:50010:DataXceiver error processingREAD_BLOCK operation src:/172.33.2.17:39930 dst: /172.33.2.44:50010);
java.net.SocketTimeoutException:600000 millis timeout while waiting for channel to be ready for write. ch :java.nio.channels.SocketChannel[connected local=/172.33.2.44:50010remote=/172.33.2.17:39930] 2018-05-25 11:04:21,088 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:Likely the client has stopped reading, disconnecting it(hadoop028:50010:DataXceiver error processing READ_BLOCK operation src: /172.33.2.17:40038 dst:/172.33.2.44:50010);
java.net.SocketTimeoutException: 600000 millis timeoutwhile waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connectedlocal=/172.33.2.44:50010 remote=/172.33.2.17:40038]
3)、问题分析
解决前以排除hdfs问题,datanode异常信息是由hbase Hmaster不能正常启动导致,172.33.2.17是active(zk确定)Hmaster节点;
根据Reginserver和Hmaster的日志org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: Server is notrunning yet
Master is not running yet
确定是Hmaster服务不能正常启动导致;
根据Hmaster异常日志:2018-05-25 10:19:59,868 WARN [hadoop001:60000.activeMasterManager] wal.WALProcedureStore: Unable toread tracker for hdfs://beh/hbase/MasterProcWALs/state-00000000000000040786.log- Missing trailer: size=11 startPos=11查看目录hdfs://beh/hbase/MasterProcWALs,该目录总大小为1.3T大小
Ø 原因:Hmaster状态变为active状态,它就会有许多不同的日志来recover, lease, read;但是日志量巨大,是给了namenode很大压力,耗尽了tcp缓冲空间,导致服务恢复时间超长。
4)、解决方式: 删除hdfs://beh/hbase/MasterProcWALs目录下的日志文件 ,然后重启hbase集群
hbase报错ERROR: org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: Server is not running yet 采坑记的更多相关文章
- hbase shell中执行list命令报错:ERROR: org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
问题描述: 今天在测试环境中,搭建hbase环境,执行list命令之后,报错: hbase(main):001:0> list TABLE ERROR: org.apache.hadoop.hb ...
- Eclipse连接HBase 报错:org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
在eclipse中连接到HBase报错org.apache.hadoop.hbase.PleaseHoldException: Master is initializing,搜索了好久,网上其它人说的 ...
- 安装hbase分布式集群出现的报错- ERROR:org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
可能的原因如下: 1. 时间没有同步 HBase需要结点间的时间必须是同步的,可以使用date命令在Linux查看时间(同步时间命令:ntpdate 1.cn.pool.ntp.org) 2. 底层采 ...
- 使用IDEA操作Hbase API 报错:org.apache.hadoop.hbase.client.RetriesExhaustedException的解决方法:
使用IDEA操作Hbase API 报错:org.apache.hadoop.hbase.client.RetriesExhaustedException的解决方法: 1.错误详情: Excepti ...
- hbase运行时ERROR:org.apache.hadoop.hbase.PleaseHoldException:Master is initializing的解决方法
最终解决了,其实我心中有一句MMP. 版本: hadoop 2.6.4 + hbase0.98 第一个问题,端口问题8020 hadoop默认的namenode 资源子接口是8020 端口,然后我这接 ...
- 通过phoenix创建hbase表失败,创建语句卡住,hbase-hmaster报错:exception=org.apache.hadoop.hbase.TableExistsException: SYNC_BUSINESS_INFO_BYDAY_EFFECT
问题描述: 前几天一个同事来说,通过phoenix创建表失败了,一直报表存在的错误,删除也报错,然后就针对这个问题找下解决方案. 问题分析: 1.通过phoenix创建表,一直卡住不动了.创建语句如下 ...
- Hbase 配置问题(ERROR: org.apache.hadoop.hbase.PleaseHoldException: org.apache.hadoop.hbase.PleaseHoldEx)
ERROR: org.apache.hadoop.hbase.PleaseHoldException: org.apache.hadoop.hbase.PleaseHoldException: Mas ...
- 启动hbase shell报错:org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: Server is not running yet
查看日志发现:Waiting for dfs to exit safe mode 这说明HDFS目前处于安全模式,需要退出才行,于是进入Namdenode节点,执行命令: hdfs dfsadmin ...
- HBase启动报错:ERROR: org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: Server is not running yet
今天进入hbase shell中输入命令报错:ERROR: org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: Server is no ...
- hbase(ERROR: org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: Server is not running yet)
今天启动clouder manager集群时候hbase list出现 (ERROR: org.apache.hadoop.hbase.ipc.ServerNotRunningYetException ...
随机推荐
- Python 潮流周刊#12:Python 中如何调试死锁问题?
查看全文: https://pythoncat.top/posts/2023-07-22-weekly 文章&教程 1.使用 PyStack 调试 Python 中的崩溃和死锁 (英) 2.介 ...
- 【三】强化学习之PaddlePaddlle-Notebook、&pdb、ipdb 调试---及PARL框架
相关文章: [一]飞桨paddle[GPU.CPU]安装以及环境配置+python入门教学 [二]-Parl基础命令 [三]-Notebook.&pdb.ipdb 调试 [四]-强化学习入门简 ...
- LyScript 插件实现UPX寻找入口
LyScript 插件可实现对压缩壳的快速脱壳操作,目前支持两种脱壳方式,一种是运用API接口自己编写脱壳过程,另一种是直接加载现有的脱壳脚本运行脱壳. 插件地址:https://github.com ...
- 英特尔发布酷睿Ultra移动处理器:Intel 4制程工艺、AI性能飙升
英特尔今日发布了第一代酷睿Ultra移动处理器,是首款基于Intel 4制程工艺打造的处理器. 据了解,英特尔酷睿Ultra采用了英特尔首个用于客户端的片上AI加速器"神经网络处理单元(NP ...
- GoodSync(最好的文件同步软件)
GoodSync是一款使用创新的最好的文件同步软件,可以在你的台式机.笔记本.USB外置驱动器等设备直接进行数据文件同步软件. GoodSync将高度稳定的可靠性和极其简单的易用性完美结合起来,可以实 ...
- Git企业开发控制理论和实操-从入门到深入(六)|多人协作开发
前言 那么这里博主先安利一些干货满满的专栏了! 首先是博主的高质量博客的汇总,这个专栏里面的博客,都是博主最最用心写的一部分,干货满满,希望对大家有帮助. 高质量博客汇总 然后就是博主最近最花时间的一 ...
- [Java]HashMap与ConcurrentHashMap的一些总结
HashMap与ConcurrentHashMap的一些总结 HashMap底层数据结构 JDK7:数组+链表 JDK8:数组+链表+红黑树 JDK8中的HashMap什么时候将链表转为红黑树? 当发 ...
- 【SpringBootStarter】自定义全局加解密组件
[SpringBootStarter] 目的 了解SpringBoot Starter相关概念以及开发流程 实现自定义SpringBoot Starter(全局加解密) 了解测试流程 优化 最终引用的 ...
- 初步体验通过 Semantic Kernel 与自己部署的通义千问开源大模型进行对话
春节之前被 Semantic Kernel 所吸引,开始了解它,学习它. 在写这篇博文之前读了一些英文博文,顺便在这里分享一下: Intro to Semantic Kernel – Part One ...
- 多线程系列(三) -synchronized 关键字使用详解
一.简介 在之前的线程系列文章中,我们介绍了线程创建的几种方式以及常用的方法介绍. 今天我们接着聊聊多线程线程安全的问题,以及解决办法. 实际上,在多线程环境中,难免会出现多个线程对一个对象的实例变量 ...