主节点间歇性报错其他没有问题 ,SNN的NN没有问题,相关的journalNode也都在,就是主节点的NN会停止。

查看hadoop主节点的NN日志。

2016-11-21 22:36:40,908 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 19822 ms (timeout=20000 ms) for a response for sendEdits. No responses yet.
2016-11-21 22:36:41,088 FATAL org.apache.hadoop.hdfs.server.namenode.FSEditLog: Error: flush failed for required journal (JournalAndStream(mgr=QJM to [192.168.58.183:8485, 192.168.58.181:8485, 192.168.58.182:8485], stream=QuorumOutputStream starting at txid 24533))
java.io.IOException: Timed out waiting 20000ms for a quorum of nodes to respond.
at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:137)
at org.apache.hadoop.hdfs.qjournal.client.QuorumOutputStream.flushAndSync(QuorumOutputStream.java:107)
at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:113)
at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:107)
at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream$8.apply(JournalSet.java:533)
at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:393)
at org.apache.hadoop.hdfs.server.namenode.JournalSet.access$100(JournalSet.java:57)
at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream.flush(JournalSet.java:529)
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:639)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2645)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2520)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:579)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:394)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:975)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2040)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2036)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2034)
2016-11-21 22:36:41,089 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Aborting QuorumOutputStream starting at txid 24533
2016-11-21 22:36:41,113 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2016-11-21 22:36:41,122 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Slave2/192.168.58.182:8485. Already tried 0 time(s); maxRetries=45
2016-11-21 22:36:41,123 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Slave1/192.168.58.181:8485. Already tried 0 time(s); maxRetries=45
2016-11-21 22:36:41,123 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: StandByNameNode/192.168.58.183:8485. Already tried 0 time(s); maxRetries=45
2016-11-21 22:36:41,137 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Took 20050ms to send a batch of 1 edits (218 bytes) to remote journal 192.168.58.182:8485
2016-11-21 22:36:41,137 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Took 20052ms to send a batch of 1 edits (218 bytes) to remote journal 192.168.58.181:8485
2016-11-21 22:36:41,137 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Took 20065ms to send a batch of 1 edits (218 bytes) to remote journal 192.168.58.183:8485
2016-11-21 22:36:41,145 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at CentOSMaster/192.168.58.180
************************************************************/

  首先保证设置dfs.namenode.edits.dir和dfs.journalnode.edits.dir,然后设置在hdfs-site.xml中超时时间如下:

<property>
<name>dfs.qjournal.start-segment.timeout.ms</name>
<value>600000000</value>
</property> <property>
<name>dfs.qjournal.prepare-recovery.timeout.ms</name>
<value>600000000</value>
</property> <property>
<name>dfs.qjournal.accept-recovery.timeout.ms</name>
<value>600000000</value>
</property>
<property>
<name>dfs.qjournal.prepare-recovery.timeout.ms</name>
<value>600000000</value>
</property> <property>
<name>dfs.qjournal.accept-recovery.timeout.ms</name>
<value>600000000</value>
</property> <property>
<name>dfs.qjournal.finalize-segment.timeout.ms</name>
<value>600000000</value>
</property> <property>
<name>dfs.qjournal.select-input-streams.timeout.ms</name>
<value>600000000</value>
</property> <property>
<name>dfs.qjournal.get-journal-state.timeout.ms</name>
<value>600000000</value>
</property> <property>
<name>dfs.qjournal.new-epoch.timeout.ms</name>
<value>600000000</value>
</property> <property>
<name>dfs.qjournal.write-txns.timeout.ms</name>
<value>600000000</value>
</property>

  貌似解决了,至今今天早上没出问题。

Namenode主节点停止报错 Error: flush failed for required journal的更多相关文章

  1. Jenkins之发布报错“error: RPC failed; curl 18 transfer closed with outstanding read data remaining”

    报错信息: error: RPC failed; curl transfer closed with outstanding read data remaining fatal: The remote ...

  2. pod lib create ObjcName 时候报错error: RPC failed; curl 56 LibreSSL SSL_read: SSL_ERROR_SYSCALL, errno 54

    众所周知 pod lib create ObjcName 需要从git 上边克隆模版 :https://github.com/CocoaPods/pod-template.git 然后有时候会很慢报错 ...

  3. 安卓中运行报错Error:Execution failed for task ':app:transformClassesWithDexForDebug'解决

    在androidstuio中运行我的未完项目,报错: Error:Execution failed for task ':app:transformClassesWithDexForDebug'.&g ...

  4. git报错error: RPC failed; HTTP 500 curl 22 The requested URL returned error: 500

    报错 $ git push; Enumerating objects: 1002, done. Counting objects: 100% (1002/1002), done. Delta comp ...

  5. 使用spark streaming报错ERROR DFSClient: Failed to close inode xxxx

    转载自:http://blog.csdn.net/xiaolixiaoyi/article/details/45875101 好几个Spark streaming的程序同时运行,发现spark报出了如 ...

  6. Android studio中的一次编译报错’Error:Execution failed for task ':app:transformClassesWithDexForDebug‘,困扰了两天

    先说下背景:随着各种第三方框架的使用,studio在编译打包成apk时,在dex如果发现有相同的jar包,不能创建dalvik虚拟机.一个apk,就是一个运行在linux上的一个虚拟机. 上图就是一直 ...

  7. 使用git克隆github上的项目失败,报错error: RPC failed; curl 56 OpenSSL SSL_read: SSL_ERROR_SYSCALL, errno 10054

    错误描述 今天在github上使用 git clone 某个项目代码的时, git clone https://github.com/XXXX/xxx-blog.git 下载速度很慢,然后下载一段时间 ...

  8. git clone报错error: RPC failed; curl 18 transfer closed with outstanding read data remaining

    具体错误信息如下图: error: RPC failed; curl 18 transfer closed with outstanding read data remaining    fatal: ...

  9. mac M1通过homebrew安装python3报错Error: Command failed with exit 128: git

    fatal: not in a git directoryError: Command failed with exit 128: git 只需要运行 git config --global --ad ...

随机推荐

  1. sublime自定义快键键不行,

    其实sublime自身就有格式化命令,就不再安装插件,位置在[Edit]->[Line]->[Reindent]但这个默认的命令没有快捷键,就重新定义了一下,想用习惯了的eclipse快捷 ...

  2. JavaScript Date对象 日期获取函数

    JavaScript Date对象使用小例子: 运行结果: 总结: 1.尽管我们认为12月是第12个月份,但是JavaScript从0开始计算月份,所以月份11表示12月: 2.nowDate.set ...

  3. IE8中给HTML标签负值报错问题

    当通过JS给一个HTML标签设置高宽为负值的时候,会爆出一个“参数无效”的错误 chrome下不会报错,但是元素不会做任何关于负值的改变

  4. SQL Server中的索引结构与疑惑

    说实话我从没有在实际项目中使用过索引,仅知道索引是一个相当重要的技术点,因此我也看了不少文章知道了索引的区别.分类.优缺点以及如何使用索引.但关于索引它最本质的是什么笔者一直没明白,本文是笔者带着这些 ...

  5. denounce函数:Javascript中如何应对高频触发事件

    在DOM Event的世界中,以scroll.resize.mouseover等为代表的高频触发事件显得有些与众不同.通常,DOM事件只有在明确的时间点才会被触发,比如被点击,比如XMLHttpReq ...

  6. 【NDK开发】android-ndk r10环境搭建

    1)打开Android开发者的官网http://developer.android.com/找到Develop点击.如果页面打不开,通过代理来访问. 2)进入后再点击Tools 3)进入后在左侧找到N ...

  7. Ext.NET-布局篇

    概述 前一篇介绍了Ext.NET基础知识,并对Ext.NET布局进行了简要的说明,本文中我们用一个完整的示例代码来看看Ext.NET的布局. 示例代码下载地址>>>>> ...

  8. [CF#286 Div2 D]Mr. Kitayuta's Technology(结论题)

    题目:http://codeforces.com/contest/505/problem/D 题目大意:就是给你一个n个点的图,然后你要在图中加入尽量少的有向边,满足所有要求(x,y),即从x可以走到 ...

  9. 根本没有“JSON“对象这回事(读汤姆大叔博文记录)

    1.字面量 (1)他们是固定的值,不是变量,让你从“字面上”理解脚本. (2)字符串字面量是由双引号("")或单引号('')包围起来的零个或多个字符串组成的. (3)对象字面量是由 ...

  10. SD卡状态广播

    SD状态发生改变的时候会对外发送广播.SD卡的状态一般有挂载.未挂载和无SD卡. 清单文件 一个广播接受者可以接受多条广播.这里在意图过滤器中添加是data属性是因为广播也需要进行匹配的.对方发送的广 ...