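Hadoop: NameNode crashes frequently with "IPC's epoch 9 is less than the last promised epoch 10"
As the title says, the NameNode keeps going down with the error "IPC's epoch 9 is less than the last promised epoch 10". The NameNode log leading up to the shutdown: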
2019-01-03 05:36:14,774 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocate blk_1073741949_1131{UCState=UNDER_CONSTRUCTION, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-4d1a4a8a-a442-4db7-80d2-16699acc814c:NORMAL:192.168.41.88:50010|RBW], ReplicaUC[[DISK]DS-0e124439-ff5d-49d7-879f-e54f191c4578:NORMAL:192.168.41.89:50010|RBW], ReplicaUC[[DISK]DS-2bb9dc09-913e-4ce4-a274-9b0bedc6bfc1:NORMAL:192.168.41.90:50010|RBW]]} for /hbase/WALs/hdp,16020,1546464960046/hdp%2C16020%2C1546464960046..meta.1546464974748.meta
2019-01-03 05:36:14,812 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* fsync: /hbase/WALs/hdp,16020,1546464960046/hdp%2C16020%2C1546464960046..meta.1546464974748.meta for DFSClient_NONMAPREDUCE_-874241219_1
2019-01-03 05:36:15,565 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocate blk_1073741950_1132{UCState=UNDER_CONSTRUCTION, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-4d1a4a8a-a442-4db7-80d2-16699acc814c:NORMAL:192.168.41.88:50010|RBW], ReplicaUC[[DISK]DS-0e124439-ff5d-49d7-879f-e54f191c4578:NORMAL:192.168.41.89:50010|RBW], ReplicaUC[[DISK]DS-2bb9dc09-913e-4ce4-a274-9b0bedc6bfc1:NORMAL:192.168.41.90:50010|RBW]]} for /hbase/data/hbase/meta/1588230740/.tmp/b45aafd78d7e4888a301aee2751d9038
2019-01-03 05:36:15,714 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 192.168.41.90:50010 is added to blk_1073741950_1132{UCState=UNDER_CONSTRUCTION, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-4d1a4a8a-a442-4db7-80d2-16699acc814c:NORMAL:192.168.41.88:50010|RBW], ReplicaUC[[DISK]DS-0e124439-ff5d-49d7-879f-e54f191c4578:NORMAL:192.168.41.89:50010|RBW], ReplicaUC[[DISK]DS-2bb9dc09-913e-4ce4-a274-9b0bedc6bfc1:NORMAL:192.168.41.90:50010|RBW]]} size 0
2019-01-03 05:36:15,727 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 192.168.41.89:50010 is added to blk_1073741950_1132{UCState=UNDER_CONSTRUCTION, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-4d1a4a8a-a442-4db7-80d2-16699acc814c:NORMAL:192.168.41.88:50010|RBW], ReplicaUC[[DISK]DS-0e124439-ff5d-49d7-879f-e54f191c4578:NORMAL:192.168.41.89:50010|RBW], ReplicaUC[[DISK]DS-2bb9dc09-913e-4ce4-a274-9b0bedc6bfc1:NORMAL:192.168.41.90:50010|RBW]]} size 0
2019-01-03 05:36:15,732 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 192.168.41.88:50010 is added to blk_1073741950_1132 size 5422
2019-01-03 05:36:15,735 INFO org.apache.hadoop.hdfs.StateChange: DIR* completeFile: /hbase/data/hbase/meta/1588230740/.tmp/b45aafd78d7e4888a301aee2751d9038 is closed by DFSClient_NONMAPREDUCE_-874241219_1
2019-01-03 05:36:15,958 INFO BlockStateChange: BLOCK* addToInvalidates: blk_1073741948_1130 192.168.41.88:50010 192.168.41.89:50010 192.168.41.90:50010
2019-01-03 05:36:16,013 INFO org.apache.hadoop.hdfs.StateChange: DIR* completeFile: /hbase/data/hbase/meta/1588230740/recovered.edits/53.seqid is closed by DFSClient_NONMAPREDUCE_-874241219_1
2019-01-03 05:36:16,133 INFO BlockStateChange: BLOCK* BlockManager: ask 192.168.41.89:50010 to delete [blk_1073741948_1130]
2019-01-03 05:36:16,354 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocate blk_1073741951_1133{UCState=UNDER_CONSTRUCTION, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-4d1a4a8a-a442-4db7-80d2-16699acc814c:NORMAL:192.168.41.88:50010|RBW], ReplicaUC[[DISK]DS-2bb9dc09-913e-4ce4-a274-9b0bedc6bfc1:NORMAL:192.168.41.90:50010|RBW], ReplicaUC[[DISK]DS-0e124439-ff5d-49d7-879f-e54f191c4578:NORMAL:192.168.41.89:50010|RBW]]} for /hbase/data/hbase/meta/1588230740/.tmp/c79ddf66ee7c4ba69a0c6c428e817c8d
2019-01-03 05:36:16,418 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 192.168.41.89:50010 is added to blk_1073741951_1133{UCState=UNDER_CONSTRUCTION, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-4d1a4a8a-a442-4db7-80d2-16699acc814c:NORMAL:192.168.41.88:50010|RBW], ReplicaUC[[DISK]DS-2bb9dc09-913e-4ce4-a274-9b0bedc6bfc1:NORMAL:192.168.41.90:50010|RBW], ReplicaUC[[DISK]DS-0e124439-ff5d-49d7-879f-e54f191c4578:NORMAL:192.168.41.89:50010|RBW]]} size 0
2019-01-03 05:36:16,434 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 192.168.41.90:50010 is added to blk_1073741951_1133{UCState=UNDER_CONSTRUCTION, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-4d1a4a8a-a442-4db7-80d2-16699acc814c:NORMAL:192.168.41.88:50010|RBW], ReplicaUC[[DISK]DS-2bb9dc09-913e-4ce4-a274-9b0bedc6bfc1:NORMAL:192.168.41.90:50010|RBW], ReplicaUC[[DISK]DS-0e124439-ff5d-49d7-879f-e54f191c4578:NORMAL:192.168.41.89:50010|RBW]]} size 0
2019-01-03 05:36:16,440 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 192.168.41.88:50010 is added to blk_1073741951_1133{UCState=UNDER_CONSTRUCTION, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-4d1a4a8a-a442-4db7-80d2-16699acc814c:NORMAL:192.168.41.88:50010|RBW], ReplicaUC[[DISK]DS-2bb9dc09-913e-4ce4-a274-9b0bedc6bfc1:NORMAL:192.168.41.90:50010|RBW], ReplicaUC[[DISK]DS-0e124439-ff5d-49d7-879f-e54f191c4578:NORMAL:192.168.41.89:50010|RBW]]} size 0
2019-01-03 05:36:16,452 INFO org.apache.hadoop.hdfs.StateChange: DIR* completeFile: /hbase/data/hbase/meta/1588230740/.tmp/c79ddf66ee7c4ba69a0c6c428e817c8d is closed by DFSClient_NONMAPREDUCE_-874241219_1
2019-01-03 05:36:16,853 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocate blk_1073741952_1134{UCState=UNDER_CONSTRUCTION, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-4d1a4a8a-a442-4db7-80d2-16699acc814c:NORMAL:192.168.41.88:50010|RBW], ReplicaUC[[DISK]DS-2bb9dc09-913e-4ce4-a274-9b0bedc6bfc1:NORMAL:192.168.41.90:50010|RBW], ReplicaUC[[DISK]DS-0e124439-ff5d-49d7-879f-e54f191c4578:NORMAL:192.168.41.89:50010|RBW]]} for /hbase/MasterProcWALs/state-00000000000000000022.log
2019-01-03 05:36:16,995 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* fsync: /hbase/MasterProcWALs/state-00000000000000000022.log for DFSClient_NONMAPREDUCE_-586038503_1
2019-01-03 05:36:17,010 INFO BlockStateChange: BLOCK* addToInvalidates: blk_1073741932_1112 192.168.41.90:50010 192.168.41.89:50010 192.168.41.88:50010
2019-01-03 05:36:17,296 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: recoverLease: [Lease. Holder: DFSClient_NONMAPREDUCE_1637648643_1, pendingcreates: 1], src=/hbase/WALs/hdp2,16020,1546464401321-splitting/hdp2%2C16020%2C1546464401321.default.1546464428803 from client DFSClient_NONMAPREDUCE_1637648643_1
2019-01-03 05:36:17,296 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Recovering [Lease. Holder: DFSClient_NONMAPREDUCE_1637648643_1, pendingcreates: 1], src=/hbase/WALs/hdp2,16020,1546464401321-splitting/hdp2%2C16020%2C1546464401321.default.1546464428803
2019-01-03 05:36:17,296 INFO BlockStateChange: BLOCK* blk_1073741944_1124{UCState=UNDER_RECOVERY, truncateBlock=null, primaryNodeIndex=1, replicas=[ReplicaUC[[DISK]DS-2bb9dc09-913e-4ce4-a274-9b0bedc6bfc1:NORMAL:192.168.41.90:50010|RBW], ReplicaUC[[DISK]DS-4d1a4a8a-a442-4db7-80d2-16699acc814c:NORMAL:192.168.41.88:50010|RBW]]} recovery started, primary=ReplicaUC[[DISK]DS-4d1a4a8a-a442-4db7-80d2-16699acc814c:NORMAL:192.168.41.88:50010|RBW]
2019-01-03 05:36:17,296 WARN org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.internalReleaseLease: File /hbase/WALs/hdp2,16020,1546464401321-splitting/hdp2%2C16020%2C1546464401321.default.1546464428803 has not been closed. Lease recovery is in progress. RecoveryId = 1135 for block blk_1073741944_1124{UCState=UNDER_RECOVERY, truncateBlock=null, primaryNodeIndex=1, replicas=[ReplicaUC[[DISK]DS-2bb9dc09-913e-4ce4-a274-9b0bedc6bfc1:NORMAL:192.168.41.90:50010|RBW], ReplicaUC[[DISK]DS-4d1a4a8a-a442-4db7-80d2-16699acc814c:NORMAL:192.168.41.88:50010|RBW]]}
2019-01-03 05:36:17,632 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: recoverLease: [Lease. Holder: DFSClient_NONMAPREDUCE_-877858473_1, pendingcreates: 1], src=/hbase/WALs/hdp,16020,1546464395339-splitting/hdp%2C16020%2C1546464395339.default.1546464427866 from client DFSClient_NONMAPREDUCE_-877858473_1
2019-01-03 05:36:17,632 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Recovering [Lease. Holder: DFSClient_NONMAPREDUCE_-877858473_1, pendingcreates: 1], src=/hbase/WALs/hdp,16020,1546464395339-splitting/hdp%2C16020%2C1546464395339.default.1546464427866
2019-01-03 05:36:17,633 INFO BlockStateChange: BLOCK* blk_1073741941_1121{UCState=UNDER_RECOVERY, truncateBlock=null, primaryNodeIndex=2, replicas=[ReplicaUC[[DISK]DS-2bb9dc09-913e-4ce4-a274-9b0bedc6bfc1:NORMAL:192.168.41.90:50010|RBW], ReplicaUC[[DISK]DS-0e124439-ff5d-49d7-879f-e54f191c4578:NORMAL:192.168.41.89:50010|RBW], ReplicaUC[[DISK]DS-4d1a4a8a-a442-4db7-80d2-16699acc814c:NORMAL:192.168.41.88:50010|RBW]]} recovery started, primary=ReplicaUC[[DISK]DS-4d1a4a8a-a442-4db7-80d2-16699acc814c:NORMAL:192.168.41.88:50010|RBW]
2019-01-03 05:36:17,633 WARN org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.internalReleaseLease: File /hbase/WALs/hdp,16020,1546464395339-splitting/hdp%2C16020%2C1546464395339.default.1546464427866 has not been closed. Lease recovery is in progress. RecoveryId = 1136 for block blk_1073741941_1121{UCState=UNDER_RECOVERY, truncateBlock=null, primaryNodeIndex=2, replicas=[ReplicaUC[[DISK]DS-2bb9dc09-913e-4ce4-a274-9b0bedc6bfc1:NORMAL:192.168.41.90:50010|RBW], ReplicaUC[[DISK]DS-0e124439-ff5d-49d7-879f-e54f191c4578:NORMAL:192.168.41.89:50010|RBW], ReplicaUC[[DISK]DS-4d1a4a8a-a442-4db7-80d2-16699acc814c:NORMAL:192.168.41.88:50010|RBW]]}
2019-01-03 05:36:19,133 INFO BlockStateChange: BLOCK* BlockManager: ask 192.168.41.90:50010 to delete [blk_1073741932_1112, blk_1073741948_1130]
2019-01-03 05:36:20,015 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 192.168.41.90:50010 is added to blk_1073741944_1124{UCState=UNDER_RECOVERY, truncateBlock=null, primaryNodeIndex=1, replicas=[ReplicaUC[[DISK]DS-2bb9dc09-913e-4ce4-a274-9b0bedc6bfc1:NORMAL:192.168.41.90:50010|RBW], ReplicaUC[[DISK]DS-4d1a4a8a-a442-4db7-80d2-16699acc814c:NORMAL:192.168.41.88:50010|RBW]]} size 83
2019-01-03 05:36:20,016 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 192.168.41.88:50010 is added to blk_1073741944_1124{UCState=UNDER_RECOVERY, truncateBlock=null, primaryNodeIndex=1, replicas=[ReplicaUC[[DISK]DS-2bb9dc09-913e-4ce4-a274-9b0bedc6bfc1:NORMAL:192.168.41.90:50010|RBW], ReplicaUC[[DISK]DS-4d1a4a8a-a442-4db7-80d2-16699acc814c:NORMAL:192.168.41.88:50010|RBW]]} size 83
2019-01-03 05:36:20,017 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: commitBlockSynchronization(oldBlock=BP-310132273-192.168.41.88-1546299030273:blk_1073741944_1124, newgenerationstamp=1135, newlength=83, newtargets=[192.168.41.90:50010, 192.168.41.88:50010], closeFile=true, deleteBlock=false)
2019-01-03 05:36:20,023 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: commitBlockSynchronization(oldBlock=BP-310132273-192.168.41.88-1546299030273:blk_1073741944_1124, file=/hbase/WALs/hdp2,16020,1546464401321-splitting/hdp2%2C16020%2C1546464401321.default.1546464428803, newgenerationstamp=1135, newlength=83, newtargets=[192.168.41.90:50010, 192.168.41.88:50010]) successful
2019-01-03 05:36:20,033 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 192.168.41.90:50010 is added to blk_1073741941_1121{UCState=UNDER_RECOVERY, truncateBlock=null, primaryNodeIndex=2, replicas=[ReplicaUC[[DISK]DS-2bb9dc09-913e-4ce4-a274-9b0bedc6bfc1:NORMAL:192.168.41.90:50010|RBW], ReplicaUC[[DISK]DS-0e124439-ff5d-49d7-879f-e54f191c4578:NORMAL:192.168.41.89:50010|RBW], ReplicaUC[[DISK]DS-4d1a4a8a-a442-4db7-80d2-16699acc814c:NORMAL:192.168.41.88:50010|RBW]]} size 83
2019-01-03 05:36:20,036 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 192.168.41.89:50010 is added to blk_1073741941_1121{UCState=UNDER_RECOVERY, truncateBlock=null, primaryNodeIndex=2, replicas=[ReplicaUC[[DISK]DS-2bb9dc09-913e-4ce4-a274-9b0bedc6bfc1:NORMAL:192.168.41.90:50010|RBW], ReplicaUC[[DISK]DS-0e124439-ff5d-49d7-879f-e54f191c4578:NORMAL:192.168.41.89:50010|RBW], ReplicaUC[[DISK]DS-4d1a4a8a-a442-4db7-80d2-16699acc814c:NORMAL:192.168.41.88:50010|RBW]]} size 83
2019-01-03 05:36:20,039 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 192.168.41.88:50010 is added to blk_1073741941_1121{UCState=UNDER_RECOVERY, truncateBlock=null, primaryNodeIndex=2, replicas=[ReplicaUC[[DISK]DS-2bb9dc09-913e-4ce4-a274-9b0bedc6bfc1:NORMAL:192.168.41.90:50010|RBW], ReplicaUC[[DISK]DS-0e124439-ff5d-49d7-879f-e54f191c4578:NORMAL:192.168.41.89:50010|RBW], ReplicaUC[[DISK]DS-4d1a4a8a-a442-4db7-80d2-16699acc814c:NORMAL:192.168.41.88:50010|RBW]]} size 83
2019-01-03 05:36:20,040 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: commitBlockSynchronization(oldBlock=BP-310132273-192.168.41.88-1546299030273:blk_1073741941_1121, newgenerationstamp=1136, newlength=83, newtargets=[192.168.41.90:50010, 192.168.41.89:50010, 192.168.41.88:50010], closeFile=true, deleteBlock=false)
2019-01-03 05:36:20,046 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: commitBlockSynchronization(oldBlock=BP-310132273-192.168.41.88-1546299030273:blk_1073741941_1121, file=/hbase/WALs/hdp,16020,1546464395339-splitting/hdp%2C16020%2C1546464395339.default.1546464427866, newgenerationstamp=1136, newlength=83, newtargets=[192.168.41.90:50010, 192.168.41.89:50010, 192.168.41.88:50010]) successful
2019-01-03 05:36:21,950 INFO org.apache.hadoop.hdfs.StateChange: DIR* completeFile: /hbase/data/default/user_info/7faad63bbfc155cb40501710a76b1afa/recovered.edits/16.seqid is closed by DFSClient_NONMAPREDUCE_-1652540829_1
2019-01-03 05:36:22,135 INFO BlockStateChange: BLOCK* ask 192.168.41.88:50010 to replicate blk_1073741944_1135 to datanode(s) 192.168.41.89:50010
2019-01-03 05:36:22,135 INFO BlockStateChange: BLOCK* BlockManager: ask 192.168.41.88:50010 to delete [blk_1073741932_1112, blk_1073741948_1130]
2019-01-03 05:36:22,217 INFO org.apache.hadoop.hdfs.StateChange: DIR* completeFile: /hbase/data/hbase/namespace/f755ccf016e1b9ba0a237525c8d23d57/recovered.edits/26.seqid is closed by DFSClient_NONMAPREDUCE_-1652540829_1
2019-01-03 05:36:23,048 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 192.168.41.89:50010 is added to blk_1073741944_1135 size 83
2019-01-03 05:36:25,135 INFO BlockStateChange: BLOCK* BlockManager: ask 192.168.41.89:50010 to delete [blk_1073741932_1112]
2019-01-03 05:36:34,026 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 192.168.41.89:50010 is added to blk_1073741952_1134{UCState=UNDER_CONSTRUCTION, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-4d1a4a8a-a442-4db7-80d2-16699acc814c:NORMAL:192.168.41.88:50010|RBW], ReplicaUC[[DISK]DS-2bb9dc09-913e-4ce4-a274-9b0bedc6bfc1:NORMAL:192.168.41.90:50010|RBW], ReplicaUC[[DISK]DS-0e124439-ff5d-49d7-879f-e54f191c4578:NORMAL:192.168.41.89:50010|RBW]]} size 123
2019-01-03 05:36:34,031 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 192.168.41.90:50010 is added to blk_1073741952_1134{UCState=UNDER_CONSTRUCTION, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-4d1a4a8a-a442-4db7-80d2-16699acc814c:NORMAL:192.168.41.88:50010|RBW], ReplicaUC[[DISK]DS-2bb9dc09-913e-4ce4-a274-9b0bedc6bfc1:NORMAL:192.168.41.90:50010|RBW], ReplicaUC[[DISK]DS-0e124439-ff5d-49d7-879f-e54f191c4578:NORMAL:192.168.41.89:50010|RBW]]} size 123
2019-01-03 05:36:34,046 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 192.168.41.88:50010 is added to blk_1073741952_1134 size 2034
2019-01-03 05:36:34,046 INFO org.apache.hadoop.hdfs.StateChange: DIR* completeFile: /hbase/MasterProcWALs/state-00000000000000000022.log is closed by DFSClient_NONMAPREDUCE_-586038503_1
2019-01-03 05:36:34,059 INFO BlockStateChange: BLOCK* addToInvalidates: blk_1073741952_1134 192.168.41.90:50010 192.168.41.89:50010 192.168.41.88:50010
2019-01-03 05:36:34,140 INFO BlockStateChange: BLOCK* BlockManager: ask 192.168.41.89:50010 to delete [blk_1073741952_1134]
2019-01-03 05:36:37,141 INFO BlockStateChange: BLOCK* BlockManager: ask 192.168.41.88:50010 to delete [blk_1073741952_1134]
2019-01-03 05:36:40,142 INFO BlockStateChange: BLOCK* BlockManager: ask 192.168.41.90:50010 to delete [blk_1073741952_1134]
2019-01-03 05:36:55,450 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 192.168.41.89
2019-01-03 05:36:55,450 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Rolling edit logs
2019-01-03 05:36:55,450 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Ending log segment 1896
2019-01-03 05:36:55,466 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 101 Total time for transactions(ms): 24 Number of transactions batched in Syncs: 21 Number of syncs: 79 SyncTimes(ms): 611 101
2019-01-03 05:36:55,475 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /root/hdpdata/name/current/edits_inprogress_0000000000000001896 -> /root/hdpdata/name/current/edits_0000000000000001896-0000000000000001996
2019-01-03 05:36:55,476 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 1997
2019-01-03 05:38:54,483 WARN org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 64530ms
No GCs detected
2019-01-03 05:38:54,591 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 1 to reach 3 (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=false) For more information, please enable DEBUG log level on org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
2019-01-03 05:38:54,592 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 1 to reach 3 (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=false) For more information, please enable DEBUG log level on org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
2019-01-03 05:38:54,592 WARN org.apache.hadoop.hdfs.protocol.BlockStoragePolicy: Failed to place enough replicas: expected size is 1 but only 0 storage types can be selected (replication=3, selected=[], unavailable=[DISK, ARCHIVE], removed=[DISK], policy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
2019-01-03 05:38:54,592 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 1 to reach 3 (unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=false) All required storage types are unavailable: unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
2019-01-03 05:38:54,599 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 1 to reach 3 (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=false) For more information, please enable DEBUG log level on org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
2019-01-03 05:38:54,599 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 1 to reach 3 (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=false) For more information, please enable DEBUG log level on org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
2019-01-03 05:38:54,599 WARN org.apache.hadoop.hdfs.protocol.BlockStoragePolicy: Failed to place enough replicas: expected size is 1 but only 0 storage types can be selected (replication=3, selected=[], unavailable=[DISK, ARCHIVE], removed=[DISK], policy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
2019-01-03 05:38:54,599 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 1 to reach 3 (unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=false) All required storage types are unavailable: unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
2019-01-03 05:38:55,507 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocate blk_1073741953_1137{UCState=UNDER_CONSTRUCTION, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-4d1a4a8a-a442-4db7-80d2-16699acc814c:NORMAL:192.168.41.88:50010|RBW], ReplicaUC[[DISK]DS-2bb9dc09-913e-4ce4-a274-9b0bedc6bfc1:NORMAL:192.168.41.90:50010|RBW], ReplicaUC[[DISK]DS-0e124439-ff5d-49d7-879f-e54f191c4578:NORMAL:192.168.41.89:50010|RBW]]} for /hbase/MasterProcWALs/state-00000000000000000023.log
2019-01-03 05:38:55,507 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 4 Total time for transactions(ms): 1 Number of transactions batched in Syncs: 0 Number of syncs: 1 SyncTimes(ms): 6 22
2019-01-03 05:38:55,524 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Remote journal 192.168.41.89:8485 failed to write txns 1998-2000. Will try to write to this JN again after the next log roll.
org.apache.hadoop.ipc.RemoteException(java.io.IOException): IPC's epoch 9 is less than the last promised epoch 10
at org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:418)
at org.apache.hadoop.hdfs.qjournal.server.Journal.checkWriteRequest(Journal.java:446)
at org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:341)
at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.journal(JournalNodeRpcServer.java:148)
at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.journal(QJournalProtocolServerSideTranslatorPB.java:158)
at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25421)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
at org.apache.hadoop.ipc.Client.call(Client.java:1475)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy11.journal(Unknown Source)
at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolTranslatorPB.journal(QJournalProtocolTranslatorPB.java:167)
at org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$7.call(IPCLoggerChannel.java:385)
at org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$7.call(IPCLoggerChannel.java:378)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2019-01-03 05:38:55,526 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Remote journal 192.168.41.90:8485 failed to write txns 1998-2000. Will try to write to this JN again after the next log roll.
org.apache.hadoop.ipc.RemoteException(java.io.IOException): IPC's epoch 9 is less than the last promised epoch 10
at org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:418)
at org.apache.hadoop.hdfs.qjournal.server.Journal.checkWriteRequest(Journal.java:446)
at org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:341)
at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.journal(JournalNodeRpcServer.java:148)
at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.journal(QJournalProtocolServerSideTranslatorPB.java:158)
at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25421)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
at org.apache.hadoop.ipc.Client.call(Client.java:1475)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy11.journal(Unknown Source)
at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolTranslatorPB.journal(QJournalProtocolTranslatorPB.java:167)
at org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$7.call(IPCLoggerChannel.java:385)
at org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$7.call(IPCLoggerChannel.java:378)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2019-01-03 05:38:55,527 FATAL org.apache.hadoop.hdfs.server.namenode.FSEditLog: Error: flush failed for required journal (JournalAndStream(mgr=QJM to [192.168.41.88:8485, 192.168.41.89:8485, 192.168.41.90:8485], stream=QuorumOutputStream starting at txid 1997))
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Got too many exceptions to achieve quorum size 2/3. 1 successful responses:
192.168.41.88:8485: null [success]
2 exceptions thrown:
192.168.41.90:8485: IPC's epoch 9 is less than the last promised epoch 10
at org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:418)
at org.apache.hadoop.hdfs.qjournal.server.Journal.checkWriteRequest(Journal.java:446)
at org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:341)
at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.journal(JournalNodeRpcServer.java:148)
at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.journal(QJournalProtocolServerSideTranslatorPB.java:158)
at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25421)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
192.168.41.89:8485: IPC's epoch 9 is less than the last promised epoch 10
at org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:418)
at org.apache.hadoop.hdfs.qjournal.server.Journal.checkWriteRequest(Journal.java:446)
at org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:341)
at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.journal(JournalNodeRpcServer.java:148)
at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.journal(QJournalProtocolServerSideTranslatorPB.java:158)
at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25421)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
at org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81)
at org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:223)
at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:142)
at org.apache.hadoop.hdfs.qjournal.client.QuorumOutputStream.flushAndSync(QuorumOutputStream.java:107)
at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:113)
at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:107)
at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream$8.apply(JournalSet.java:533)
at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:393)
at org.apache.hadoop.hdfs.server.namenode.JournalSet.access$100(JournalSet.java:57)
at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream.flush(JournalSet.java:529)
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:647)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.storeAllocatedBlock(FSNamesystem.java:3164)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3038)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:725)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:492)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
2019-01-03 05:38:55,528 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Aborting QuorumOutputStream starting at txid 1997
2019-01-03 05:38:55,530 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2019-01-03 05:38:55,531 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hdp/192.168.41.88
************************************************************/
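The rejection in the log can be pictured with a small toy model. This is only a sketch, assuming that each JournalNode remembers the highest epoch it has promised and refuses writes from any writer with a lower epoch. The class `ToyJournalNode` and the function `write_with_quorum` are hypothetical names for illustration, not Hadoop's actual code.

```python
class ToyJournalNode:
    """Toy stand-in for a JournalNode; not Hadoop's real implementation."""

    def __init__(self, name, promised_epoch=0):
        self.name = name
        self.promised_epoch = promised_epoch

    def new_epoch(self, epoch):
        # A NameNode that becomes active asks for a promise on a higher epoch.
        if epoch <= self.promised_epoch:
            raise IOError(
                f"Proposed epoch {epoch} <= last promise {self.promised_epoch}"
            )
        self.promised_epoch = epoch

    def journal(self, writer_epoch, txids):
        # Writes from a writer holding an older epoch are refused (fencing).
        if writer_epoch < self.promised_epoch:
            raise IOError(
                f"IPC's epoch {writer_epoch} is less than "
                f"the last promised epoch {self.promised_epoch}"
            )
        return True


def write_with_quorum(nodes, writer_epoch, txids):
    """Try the write on every node; succeed only if a majority accepts it."""
    successes = 0
    errors = []
    for node in nodes:
        try:
            node.journal(writer_epoch, txids)
            successes += 1
        except IOError as exc:
            errors.append(f"{node.name}: {exc}")
    quorum = len(nodes) // 2 + 1
    return successes >= quorum, successes, errors


if __name__ == "__main__":
    # Three journal nodes that all promised epoch 9 to the old active NameNode.
    nodes = [ToyJournalNode(f"192.168.41.{h}:8485", 9) for h in (88, 89, 90)]
    # Assumed scenario: the other NameNode takes over and gets epoch 10
    # promised by two of the three nodes.
    nodes[1].new_epoch(10)
    nodes[2].new_epoch(10)
    # The old active NameNode resumes and tries to write txns 1998-2000.
    ok, successes, errors = write_with_quorum(nodes, 9, range(1998, 2001))
    print(ok, successes)
    for err in errors:
        print(err)
```

In this toy run only one node accepts the write, which is below the quorum of 2 out of 3. This is the same shape as the "Got too many exceptions to achieve quorum size 2/3" message in the log.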
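Solution
Just before the failure, JvmPauseMonitor in the log above reports a pause of about 64.5 seconds. A pause that long can make the ZKFC health check time out and trigger a failover. The JournalNodes then promise epoch 10 to the new active NameNode and reject writes from the old one, which is still at epoch 9. To make ZKFC tolerate such pauses, raise ha.health-monitor.rpc-timeout.ms in core-site.xml, which lengthens the timeout of the ZKFC health check.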
<property>
  <name>ha.health-monitor.rpc-timeout.ms</name>
  <value>180000</value>
</property>
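To choose a timeout, it can help to check how long the pauses reported in the NameNode log actually are. The following is a minimal sketch that assumes the log lines have the same format as the JvmPauseMonitor warning shown earlier. The function name `find_long_pauses` is hypothetical. The 45000 ms used for comparison is what I believe to be the shipped default for this setting, so verify it against the core-default.xml of your Hadoop version.

```python
import re

# Matches lines such as:
# 2019-01-03 05:38:54,483 WARN org.apache.hadoop.util.JvmPauseMonitor:
#   Detected pause in JVM or host machine (eg GC): pause of approximately 64530ms
PAUSE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} WARN "
    r"org\.apache\.hadoop\.util\.JvmPauseMonitor: "
    r".*pause of approximately (?P<ms>\d+)ms"
)


def find_long_pauses(lines, timeout_ms):
    """Return (timestamp, pause_ms) for each reported pause above timeout_ms."""
    hits = []
    for line in lines:
        match = PAUSE_RE.search(line)
        if match and int(match.group("ms")) > timeout_ms:
            hits.append((match.group("ts"), int(match.group("ms"))))
    return hits


if __name__ == "__main__":
    # Sample lines copied from the log above.
    sample = [
        "2019-01-03 05:38:54,483 WARN org.apache.hadoop.util.JvmPauseMonitor: "
        "Detected pause in JVM or host machine (eg GC): "
        "pause of approximately 64530ms",
        "No GCs detected",
    ]
    print(find_long_pauses(sample, 45000))   # assumed default timeout
    print(find_long_pauses(sample, 180000))  # value from the config above
```

With the sample line, the pause exceeds 45000 ms but not 180000 ms, which fits the idea that the larger timeout lets ZKFC ride out a pause of this length. Since the log says "No GCs detected", the pause probably comes from the host rather than from garbage collection, so it is still worth finding out what stalls the machine.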