【原创】大叔问题定位分享(25)ambari metrics collector内置standalone hbase启动失败
ambari metrics collector内置hbase目录位于
/usr/lib/ams-hbase
配置位于
/etc/ams-hbase/conf
通过ruby启动
/usr/lib/ams-hbase/bin/hirb.rb
实际的启动命令为
/usr/lib/ams-hbase/bin/hbase-daemon.sh --config /etc/ams-hbase/conf foreground_start master
但是启动一段时间报错:
java.lang.RuntimeException: Master not initialized after 200000ms
at org.apache.hadoop.hbase.util.JVMClusterUtil.waitForEvent(JVMClusterUtil.java:229)
at org.apache.hadoop.hbase.util.JVMClusterUtil.startup(JVMClusterUtil.java:197)
at org.apache.hadoop.hbase.LocalHBaseCluster.startup(LocalHBaseCluster.java:413)
at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:232)
at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:140)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:149)
at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:3100)
查看日志发现第一个错:
2019-01-16 16:20:29,295 WARN [StoreOpener-1588230740-1] util.CommonFSUtils: FileSystem doesn't support setStoragePolicy; HDFS-6584, HDFS-9345 not available. This is normal and expected on earlier Hadoop versions.
java.lang.NoSuchMethodException: org.apache.hadoop.fs.LocalFileSystem.setStoragePolicy(org.apache.hadoop.fs.Path, java.lang.String)
at java.lang.Class.getDeclaredMethod(Class.java:2130)
at org.apache.hadoop.hbase.util.CommonFSUtils.invokeSetStoragePolicy(CommonFSUtils.java:533)
at org.apache.hadoop.hbase.util.CommonFSUtils.setStoragePolicy(CommonFSUtils.java:514)
at org.apache.hadoop.hbase.util.CommonFSUtils.setStoragePolicy(CommonFSUtils.java:482)
at org.apache.hadoop.hbase.regionserver.HRegionFileSystem.setStoragePolicy(HRegionFileSystem.java:193)
at org.apache.hadoop.hbase.regionserver.HStore.<init>(HStore.java:258)
at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:5571)
at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1025)
at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1022)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
这个错可以忽略,然后是第二个错:
2019-01-17 15:49:08,730 ERROR [Thread-24] master.HMaster: Failed to become active master
java.lang.IllegalStateException: The procedure WAL relies on the ability to hsync for proper operation during component failures, but the underlying filesystem does not support doing so. Please check the config value of 'hbase.procedure.store.wal.use.hsync' to set the desired level of robustness and ensure the config value of 'hbase.wal.dir' points to a FileSystem mount that can provide it.
at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.rollWriter(WALProcedureStore.java:1082)
at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.recoverLease(WALProcedureStore.java:423)
at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.init(ProcedureExecutor.java:714)
at org.apache.hadoop.hbase.master.HMaster.createProcedureExecutor(HMaster.java:1398)
at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:857)
at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2225)
at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:568)
at java.lang.Thread.run(Thread.java:748)
这个可以通过设置hbase.unsafe.stream.capability.enforce=false解决,详见 https://www.cnblogs.com/barneywill/p/10283076.html
最后发现很多warn:
2019-01-17 18:22:12,657 WARN [Thread-24] master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740 state=OPEN, ts=1547720528009, server=test-server-01,16020,1547717775199}; ServerCrashProcedures=false. Master startup cannot progress, in holding-pattern until region onlined.
2019-01-17 18:22:13,660 WARN [Thread-24] master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740 state=OPEN, ts=1547720528009, server=test-server-01,16020,1547717775199}; ServerCrashProcedures=false. Master startup cannot progress, in holding-pattern until region onlined.
2019-01-17 18:22:15,660 WARN [Thread-24] master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740 state=OPEN, ts=1547720528009, server=test-server-01,16020,1547717775199}; ServerCrashProcedures=false. Master startup cannot progress, in holding-pattern until region onlined.
2019-01-17 18:22:19,661 WARN [Thread-24] master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740 state=OPEN, ts=1547720528009, server=test-server-01,16020,1547717775199}; ServerCrashProcedures=false. Master startup cannot progress, in holding-pattern until region onlined.
2019-01-17 18:22:27,661 WARN [Thread-24] master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740 state=OPEN, ts=1547720528009, server=test-server-01,16020,1547717775199}; ServerCrashProcedures=false. Master startup cannot progress, in holding-pattern until region onlined.
2019-01-17 18:22:43,662 WARN [Thread-24] master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740 state=OPEN, ts=1547720528009, server=test-server-01,16020,1547717775199}; ServerCrashProcedures=false. Master startup cannot progress, in holding-pattern until region onlined.
清空hbase.rootdir,hbase.tmp.dir,hbase.zookeeper.property.dataDir这3个目录,同时设置好用户权限,然后重启hbase即可;
【原创】大叔问题定位分享(25)ambari metrics collector内置standalone hbase启动失败的更多相关文章
- 【原创】大叔问题定位分享(24)hbase standalone方式启动报错
hbase 2.0.2 hbase standalone方式启动报错: 2019-01-17 15:49:08,730 ERROR [Thread-24] master.HMaster: Failed ...
- 【原创】大叔问题定位分享(23)Ambari安装向导点击下一步卡住
ambari安装第一步是输入集群name,点击next时页面卡住不动,如下图: 注意到其中一个接口请求结果异常,http://ambari.server:8080/api/v1/version_def ...
- 【原创】大叔问题定位分享(21)spark执行insert overwrite非常慢,比hive还要慢
最近把一些sql执行从hive改到spark,发现执行更慢,sql主要是一些insert overwrite操作,从执行计划看到,用到InsertIntoHiveTable spark-sql> ...
- 【原创】大叔问题定位分享(20)hdfs文件create写入正常,append写入报错
最近在hdfs写文件的时候发现一个问题,create写入正常,append写入报错,每次都能重现,代码示例如下: FileSystem fs = FileSystem.get(conf); Outpu ...
- 【原创】大叔问题定位分享(13)HBase Region频繁下线
问题现象:hive执行sql报错 select count(*) from test_hive_table; 报错 Error: java.io.IOException: org.apache.had ...
- 【原创】大叔问题定位分享(11)Spark中对大表子查询加limit为什么会报Broadcast超时错误
当两个表需要join时,如果一个是大表,一个是小表,正常的map-reduce流程需要shuffle,这会导致大表数据在节点间网络传输,常见的优化方式是将小表读到内存中并广播到大表处理,避免shuff ...
- 【原创】大叔问题定位分享(6)Dubbo monitor服务iowait高,负载高
一 问题 Dubbo monitor所在服务器状态异常,iowait一直很高,load也一直很高,监控如下: iowait如图: load如图: 二 分析 通过iotop命令可以查看当前系统中磁盘io ...
- 【原创】大叔问题定位分享(3)Kafka集群broker进程逐个报错退出
kafka0.8.1 一 问题现象 生产环境kafka服务器134.135.136分别在10月11号.10月13号挂掉: 134日志 [2014-10-13 16:45:41,902] FATAL [ ...
- 【原创】大叔问题定位分享(2)spark任务一定几率报错java.lang.NoSuchFieldError: HIVE_MOVE_FILES_THREAD_COUNT
最近用yarn cluster方式提交spark任务时,有时会报错,报错几率是40%,报错如下: 18/03/15 21:50:36 116 ERROR ApplicationMaster91: Us ...
随机推荐
- 【学习总结】GirlsInAI ML-diary day-10-if条件执行
[学习总结]GirlsInAI ML-diary 总 原博github链接-day10 认识if条件执行 一般条件执行 分支执行 链式条件执行 嵌套条件执行 1-if一般条件执行 格式:如果(满足这个 ...
- Github经理和员工开发
Git简介 Git是目前世界上最先进的分布式版本控制系统 git的两大特点: 版本控制:可以解决多人同时开发的代码问题,也可以解决找回历史代码的问题 分布式:Git是分布式版本控制系统,同一个Git仓 ...
- Linux 默认连接数
Linux 默认连接数 - 国内版 Binghttps://cn.bing.com/search?FORM=U227DF&PC=U227&q=Linux+%E9%BB%98%E8%AE ...
- vue2.0里的路由钩子
路由钩子 在某些情况下,当路由跳转前或跳转后.进入.离开某一个路由前.后,需要做某些操作,就可以使用路由钩子来监听路由的变化 全局路由钩子: router.beforeEach((to, from, ...
- Spring boot 的自动配置
Xml 配置文件 日志 Spring Boot对各种日志框架都做了支持,我们可以通过配置来修改默认的日志的配置: #设置日志级别 logging.level.org.springframework=D ...
- DAY16、模块和包
一.模块 1.模块的加载顺序:内存 =>内置 =>sys.path(一系列自定义模块) 2.sys.path:环境变量,存放文件路径的列表 重点:默认列表第一个元素就是当前被执行文件所在的 ...
- Linux程序宕掉后如何通过gdb查看出错信息
我们在编写服务端程序的时候,由于多线程并且环境复杂,程序可能在不确定条件的情况下宕掉,还不好重新,这是我们如何获取程序的出错信息,一种方法通过打日志,有时候一些错误日志也不能体现出来,这时就用到我们的 ...
- fullpage.js参数参考
fullpage函数里面的参数: //Navigationmenu: false,//绑定菜单,设定的相关属性与anchors的值对应后,菜单可以控制滚动,默认为false.anchors:['fir ...
- linux中去掉^M的方法
转:https://blog.csdn.net/sty945/article/details/80347901 (1)是用VI的命令: 在命令模式下运行命令 :%s/^M//g 回车 注意:手动输入该 ...
- 小数点保留n位有效数字
char *psf = "183.0000000000000001"; ]; sprintf(chBuff, "%.2lf", atof(psf)); doub ...