启动之后发现slave上正常启动了DataNode,DataManager,但是过了几秒后发现DataNode被关闭

以slave1上错误日期为例查看错误信息:

more /opt/hadoop-2.9./logs/hadoop-spark-datanode-slave1.log

找到错误信息:

-- ::, WARN org.apache.hadoop.hdfs.server.common.Storage: Failed to add storage directory [DISK]file:/opt/hadoop-2.9./dfs/data/
java.io.IOException: Incompatible clusterIDs in /opt/hadoop-2.9./dfs/data: namenode clusterID = CID-f1195fc7-ca7c-4a2a-b32f-211131a5d699; datanode clusterID = CID-292293a6-9c34-4de7-aecd-d72657a26dd5
at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadStorageDirectory(DataStorage.java:)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadDataStorage(DataStorage.java:)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:)
at java.lang.Thread.run(Thread.java:)
-- ::, ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid f4badff3-7a0b-4db0-bd77-83b370f67eed) service to master/
.168.0.:. Exiting.
java.io.IOException: All specified directories have failed to load.
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:)
at java.lang.Thread.run(Thread.java:)
-- ::, WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool <registering> (Datanode Uuid f4badff3-7a0b-4db0-bd77-83b370f67eed) service to master
/192.168.0.120:
-- ::, INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool <registering> (Datanode Uuid f4badff3-7a0b-4db0-bd77-83b370f67eed)
-- ::, WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
-- ::, INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at slave1/192.168.0.121
************************************************************/

解决方案

错误问题原因:多次格式化导致的。

1)在master执行sbin/stop-all.sh,关闭hadoop:

cd /opt/hadoop-2.9.
sbin/stop-all.sh

2)依次在master,slave1,slave2,slave3上执行以下命令:

cd /opt/hadoop-2.9.
rm -r dfs
rm -r logs
rm -r tmp

3)在master上重新格式化hadoop,重新启动hadoop

cd /opt/hadoop-2.9.     #进入hadoop目录
bin/hadoop namenode -format #格式化namenode
sbin/start-all.sh #启动dfs
[spark@master hadoop-2.9.]$ cd /opt/hadoop-2.9. #进入hadoop目录
[spark@master hadoop-2.9.]$ bin/hadoop namenode -format #格式化namenode
sbin/start-all.sh #启动dfs
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it. // :: INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = master/192.168.0.120
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 2.9.0
STARTUP_MSG: classpath = /opt/hadoop-2.9.0/etc/hadoop:/opt/hadoop-2.9.0/share/hadoop/common/lib/nimbus-jose-jwt-3.9.jar:/opt/hadoop-2.9.0/share/hadoop/common/lib/java-xmlbuilder-0.4.jar:/opt/hadoop-...
STARTUP_MSG: build = https://git-wip-us.apache.org/repos/asf/hadoop.git -r 756ebc8394e473ac25feac05fa493f6d612e6c50; compiled by 'arsuresh' on 2017-11-13T23:15Z
STARTUP_MSG: java = 1.8.0_171
************************************************************/
// :: INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
// :: INFO namenode.NameNode: createNameNode [-format]
Formatting using clusterid: CID-d4e2f108-de3c--9eeb-abbbb1024fe8
// :: INFO namenode.FSEditLog: Edit logging is async:true
// :: INFO namenode.FSNamesystem: KeyProvider: null
// :: INFO namenode.FSNamesystem: fsLock is fair: true
// :: INFO namenode.FSNamesystem: Detailed lock hold time metrics enabled: false
// :: INFO namenode.FSNamesystem: fsOwner = spark (auth:SIMPLE)
// :: INFO namenode.FSNamesystem: supergroup = supergroup
// :: INFO namenode.FSNamesystem: isPermissionEnabled = true
// :: INFO namenode.FSNamesystem: HA Enabled: false
// :: INFO common.Util: dfs.datanode.fileio.profiling.sampling.percentage set to . Disabling file IO profiling
// :: INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit: configured=, counted=, effected=
// :: INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
// :: INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to :::00.000
// :: INFO blockmanagement.BlockManager: The block deletion will start around Jun ::
// :: INFO util.GSet: Computing capacity for map BlocksMap
// :: INFO util.GSet: VM type = -bit
// :: INFO util.GSet: 2.0% max memory MB = 17.8 MB
// :: INFO util.GSet: capacity = ^ = entries
// :: INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
// :: WARN conf.Configuration: No unit for dfs.namenode.safemode.extension() assuming MILLISECONDS
// :: INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
// :: INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.min.datanodes =
// :: INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.extension =
// :: INFO blockmanagement.BlockManager: defaultReplication =
// :: INFO blockmanagement.BlockManager: maxReplication =
// :: INFO blockmanagement.BlockManager: minReplication =
// :: INFO blockmanagement.BlockManager: maxReplicationStreams =
// :: INFO blockmanagement.BlockManager: replicationRecheckInterval =
// :: INFO blockmanagement.BlockManager: encryptDataTransfer = false
// :: INFO blockmanagement.BlockManager: maxNumBlocksToLog =
// :: INFO namenode.FSNamesystem: Append Enabled: true
// :: INFO util.GSet: Computing capacity for map INodeMap
// :: INFO util.GSet: VM type = -bit
// :: INFO util.GSet: 1.0% max memory MB = 8.9 MB
// :: INFO util.GSet: capacity = ^ = entries
// :: INFO namenode.FSDirectory: ACLs enabled? false
// :: INFO namenode.FSDirectory: XAttrs enabled? true
// :: INFO namenode.NameNode: Caching file names occurring more than times
// :: INFO snapshot.SnapshotManager: Loaded config captureOpenFiles: falseskipCaptureAccessTimeOnlyChange: false
// :: INFO util.GSet: Computing capacity for map cachedBlocks
// :: INFO util.GSet: VM type = -bit
// :: INFO util.GSet: 0.25% max memory MB = 2.2 MB
// :: INFO util.GSet: capacity = ^ = entries
// :: INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets =
// :: INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users =
// :: INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = ,,
// :: INFO namenode.FSNamesystem: Retry cache on namenode is enabled
// :: INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is millis
// :: INFO util.GSet: Computing capacity for map NameNodeRetryCache
// :: INFO util.GSet: VM type = -bit
// :: INFO util.GSet: 0.029999999329447746% max memory MB = 273.1 KB
// :: INFO util.GSet: capacity = ^ = entries
// :: INFO namenode.FSImage: Allocated new BlockPoolId: BP--192.168.0.120-
// :: INFO common.Storage: Storage directory /opt/hadoop-2.9./dfs/name has been successfully formatted.
// :: INFO namenode.FSImageFormatProtobuf: Saving image file /opt/hadoop-2.9./dfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
// :: INFO namenode.FSImageFormatProtobuf: Image file /opt/hadoop-2.9./dfs/name/current/fsimage.ckpt_0000000000000000000 of size bytes saved in seconds.
// :: INFO namenode.NNStorageRetentionManager: Going to retain images with txid >=
// :: INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/192.168.0.120
************************************************************/
[spark@master hadoop-2.9.]$ sbin/start-all.sh #启动dfs
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [master]
master: starting namenode, logging to /opt/hadoop-2.9./logs/hadoop-spark-namenode-master.out
slave1: starting datanode, logging to /opt/hadoop-2.9./logs/hadoop-spark-datanode-slave1.out
slave3: starting datanode, logging to /opt/hadoop-2.9./logs/hadoop-spark-datanode-slave3.out
slave2: starting datanode, logging to /opt/hadoop-2.9./logs/hadoop-spark-datanode-slave2.out
Starting secondary namenodes [master]
master: starting secondarynamenode, logging to /opt/hadoop-2.9./logs/hadoop-spark-secondarynamenode-master.out
starting yarn daemons
starting resourcemanager, logging to /opt/hadoop-2.9./logs/yarn-spark-resourcemanager-master.out
slave2: starting nodemanager, logging to /opt/hadoop-2.9./logs/yarn-spark-nodemanager-slave2.out
slave3: starting nodemanager, logging to /opt/hadoop-2.9./logs/yarn-spark-nodemanager-slave3.out
slave1: starting nodemanager, logging to /opt/hadoop-2.9./logs/yarn-spark-nodemanager-slave1.out

4)过30s后,查看master,slave1,slave2,slave3是否启动成功

查看master是否启动成功:

[spark@master hadoop-2.9.]$ jps
Jps
ResourceManager
NameNode
SecondaryNameNode
[spark@master hadoop-2.9.]$

在slave1,slave2,slave3分别jps查看是否都启动了DataNode,DataManager进程:
以slave1为例:

[spark@slave1 hadoop-2.9.]$ jps
Jps
NodeManager
DataNode
[spark@slave1 hadoop-2.9.]$

参考《https://blog.csdn.net/magggggic/article/details/52503502》

Kafka:ZK+Kafka+Spark Streaming集群环境搭建(五)针对hadoop2.9.0启动之后发现slave上正常启动了DataNode,DataManager,但是过了几秒后发现DataNode被关闭的更多相关文章

  1. Kafka:ZK+Kafka+Spark Streaming集群环境搭建(二十一)NIFI1.7.1安装

    一.nifi基本配置 1. 修改各节点主机名,修改/etc/hosts文件内容. 192.168.0.120 master 192.168.0.121 slave1 192.168.0.122 sla ...

  2. Kafka:ZK+Kafka+Spark Streaming集群环境搭建(十三)kafka+spark streaming打包好的程序提交时提示虚拟内存不足(Container is running beyond virtual memory limits. Current usage: 119.5 MB of 1 GB physical memory used; 2.2 GB of 2.1 G)

    异常问题:Container is running beyond virtual memory limits. Current usage: 119.5 MB of 1 GB physical mem ...

  3. Kafka:ZK+Kafka+Spark Streaming集群环境搭建(十二)VMW安装四台CentOS,并实现本机与它们能交互,虚拟机内部实现可以上网。

    Centos7出现异常:Failed to start LSB: Bring up/down networking. 按照<Kafka:ZK+Kafka+Spark Streaming集群环境搭 ...

  4. Kafka:ZK+Kafka+Spark Streaming集群环境搭建(十一)定制一个arvo格式文件发送到kafka的topic,通过Structured Streaming读取kafka的数据

    将arvo格式数据发送到kafka的topic 第一步:定制avro schema: { "type": "record", "name": ...

  5. Kafka:ZK+Kafka+Spark Streaming集群环境搭建(十)安装hadoop2.9.0搭建HA

    如何搭建配置centos虚拟机请参考<Kafka:ZK+Kafka+Spark Streaming集群环境搭建(一)VMW安装四台CentOS,并实现本机与它们能交互,虚拟机内部实现可以上网.& ...

  6. Kafka:ZK+Kafka+Spark Streaming集群环境搭建(九)安装kafka_2.11-1.1.0

    如何搭建配置centos虚拟机请参考<Kafka:ZK+Kafka+Spark Streaming集群环境搭建(一)VMW安装四台CentOS,并实现本机与它们能交互,虚拟机内部实现可以上网.& ...

  7. Kafka:ZK+Kafka+Spark Streaming集群环境搭建(八)安装zookeeper-3.4.12

    如何搭建配置centos虚拟机请参考<Kafka:ZK+Kafka+Spark Streaming集群环境搭建(一)VMW安装四台CentOS,并实现本机与它们能交互,虚拟机内部实现可以上网.& ...

  8. Kafka:ZK+Kafka+Spark Streaming集群环境搭建(三)安装spark2.2.1

    如何搭建配置centos虚拟机请参考<Kafka:ZK+Kafka+Spark Streaming集群环境搭建(一)VMW安装四台CentOS,并实现本机与它们能交互,虚拟机内部实现可以上网.& ...

  9. Kafka:ZK+Kafka+Spark Streaming集群环境搭建(二)安装hadoop2.9.0

    如何搭建配置centos虚拟机请参考<Kafka:ZK+Kafka+Spark Streaming集群环境搭建(一)VMW安装四台CentOS,并实现本机与它们能交互,虚拟机内部实现可以上网.& ...

随机推荐

  1. 使用 IntraWeb (12) - 基本控件之 TIWGradButton、TIWImageButton

    TIWGradButton.TIWImageButton 分别是有颜色梯度变化按钮和图像按钮. TIWGradButton 所在单元及继承链: IWCompGradButton.TIWGradButt ...

  2. 使用 IntraWeb (2) - Hello IntraWeb

    IntraWeb 比我相像中的更贴近 VCL, 传统的非可视组件在这里大都可用(其内部很多复合属性是 TStringList 类型的), 它的诸多可视控件也是从 TControl 继承下来的. 这或许 ...

  3. jtagger Versatile multiprogrammer for FPGAs, MCUs, etc.

    jtagger Versatile multiprogrammer for FPGAs, MCUs, etc. Well, it's not really just a jtagger, but I' ...

  4. 使用Edge模式通知Internet Explorer以最高级别的可用模式显示内容

    一.EasyUI$的window('open')在IE8下兼容性问题 今天在公司使用EasyUI的$('#win').window('open');方法打开一个window窗体时发现EaysUI的脚本 ...

  5. 使用 MVVMLight 命令绑定(转)

    继上一篇文章的项目,我们实现了数据绑定到界面中.这篇文章我们将实现把命令绑定到按钮上,在XAML中的Button之类的都会有个Command属性可以让我们来绑定命令使用. 首先我们要实现的目标是,在界 ...

  6. C# 连接Oracle数据库,免安装oracle客户端

    一.方案1 首先下面的内容,有待我的进一步测试和证实.18.12.20 被证实了,还需要安装Oracle客户端,或者本机上安装oracle数据库软件. 18.12.20 1.下载Oracle.Mana ...

  7. 1300多万条数据30G论坛大数据优化实战经验小结

    最近由于某大型网站社区论坛运行效率比较低用户反馈论坛有些卡需要对系统进行优化,论坛性能影响了公司的形象还有网站的流量,当然这也会影响到公司的收入,而且后期还需要长期维护网站的社区论坛服务. 1:并发访 ...

  8. Quartz 2.3.0 升级感受

    Quartz 2.3.0 发布,Quartz是一个开源的作业调度框架,它完全由Java写成,并设计用于J2SE和J2EE应用中.它提供了巨大的灵 活性而不牺牲简单性.你能够用它来为执行一个作业而创建简 ...

  9. JKS和PFX文件相互转换方法

    JKS(JavaKeysotre)格式和PFX(PKCS12)格式,是最常见的SSL证书格式文件,可以包含完整的证书密钥对,证书链和信任证书信息.PFX常用于Windows IIS服务器,JKS常用语 ...

  10. iTunes Connect开发者指南中的一个疑问

    iTunes Connect Developer Guide     避免app版本出现在iClound中,我的疑问是对已经上架的版本不能设置,那么这个功能的真正意义在哪里? 大部分用户去应用页面下载 ...