After starting the cluster, the slave nodes initially bring up the DataNode and NodeManager processes normally, but a few seconds later the DataNode shuts itself down.

Take slave1 as an example and look at the error log:

more /opt/hadoop-2.9.0/logs/hadoop-spark-datanode-slave1.log

The relevant error message:

-- ::, WARN org.apache.hadoop.hdfs.server.common.Storage: Failed to add storage directory [DISK]file:/opt/hadoop-2.9.0/dfs/data/
java.io.IOException: Incompatible clusterIDs in /opt/hadoop-2.9.0/dfs/data: namenode clusterID = CID-f1195fc7-ca7c-4a2a-b32f-211131a5d699; datanode clusterID = CID-292293a6-9c34-4de7-aecd-d72657a26dd5
at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadStorageDirectory(DataStorage.java:)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadDataStorage(DataStorage.java:)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:)
at java.lang.Thread.run(Thread.java:)
-- ::, ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid f4badff3-7a0b-4db0-bd77-83b370f67eed) service to master/192.168.0.120:. Exiting.
java.io.IOException: All specified directories have failed to load.
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:)
at java.lang.Thread.run(Thread.java:)
-- ::, WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool <registering> (Datanode Uuid f4badff3-7a0b-4db0-bd77-83b370f67eed) service to master/192.168.0.120:
-- ::, INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool <registering> (Datanode Uuid f4badff3-7a0b-4db0-bd77-83b370f67eed)
-- ::, WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
-- ::, INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at slave1/192.168.0.121
************************************************************/

Solution

Root cause: the NameNode has been formatted more than once. Each format generates a new clusterID in the NameNode's metadata, while the DataNodes still hold the clusterID from the previous format, so the registration handshake fails with "Incompatible clusterIDs" and the DataNode exits.
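You can confirm the mismatch before wiping anything by comparing the clusterID recorded in the VERSION files on both sides. A quick check, assuming the name/data directory layout shown in the logs above:

# on master: clusterID written by the most recent format
grep clusterID /opt/hadoop-2.9.0/dfs/name/current/VERSION
# on a slave, e.g. slave1: clusterID the DataNode still carries
grep clusterID /opt/hadoop-2.9.0/dfs/data/current/VERSION

If you need to keep existing HDFS data, a common alternative to the full reformat below is to copy the NameNode's clusterID value into each DataNode's VERSION file and restart the DataNodes; the steps here assume the data is disposable.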

1) On master, run sbin/stop-all.sh to shut down Hadoop:

cd /opt/hadoop-2.9.0
sbin/stop-all.sh

2) On master, slave1, slave2, and slave3 in turn, run the following commands (note that this deletes all HDFS data, logs, and temporary files; a loop sketch for running it over SSH follows the commands):

cd /opt/hadoop-2.9.0
rm -r dfs
rm -r logs
rm -r tmp
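If passwordless SSH is already set up from master to the slaves (it normally is, since start-all.sh relies on it), a small loop saves logging in to each node by hand. This is only a sketch, not part of the original steps:

# run from master; assumes the same user and /opt/hadoop-2.9.0 path on every node
for host in master slave1 slave2 slave3; do
    ssh "$host" 'cd /opt/hadoop-2.9.0 && rm -rf dfs logs tmp'
done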

3) On master, reformat the NameNode and restart Hadoop:

cd /opt/hadoop-2.9.0            # enter the hadoop directory
bin/hadoop namenode -format     # format the namenode
sbin/start-all.sh               # start HDFS and YARN (output shown below)

[spark@master hadoop-2.9.0]$ cd /opt/hadoop-2.9.0            # enter the hadoop directory
[spark@master hadoop-2.9.0]$ bin/hadoop namenode -format     # format the namenode
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
// :: INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = master/192.168.0.120
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 2.9.0
STARTUP_MSG: classpath = /opt/hadoop-2.9.0/etc/hadoop:/opt/hadoop-2.9.0/share/hadoop/common/lib/nimbus-jose-jwt-3.9.jar:/opt/hadoop-2.9.0/share/hadoop/common/lib/java-xmlbuilder-0.4.jar:/opt/hadoop-...
STARTUP_MSG: build = https://git-wip-us.apache.org/repos/asf/hadoop.git -r 756ebc8394e473ac25feac05fa493f6d612e6c50; compiled by 'arsuresh' on 2017-11-13T23:15Z
STARTUP_MSG: java = 1.8.0_171
************************************************************/
// :: INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
// :: INFO namenode.NameNode: createNameNode [-format]
Formatting using clusterid: CID-d4e2f108-de3c--9eeb-abbbb1024fe8
// :: INFO namenode.FSEditLog: Edit logging is async:true
// :: INFO namenode.FSNamesystem: KeyProvider: null
// :: INFO namenode.FSNamesystem: fsLock is fair: true
// :: INFO namenode.FSNamesystem: Detailed lock hold time metrics enabled: false
// :: INFO namenode.FSNamesystem: fsOwner = spark (auth:SIMPLE)
// :: INFO namenode.FSNamesystem: supergroup = supergroup
// :: INFO namenode.FSNamesystem: isPermissionEnabled = true
// :: INFO namenode.FSNamesystem: HA Enabled: false
// :: INFO common.Util: dfs.datanode.fileio.profiling.sampling.percentage set to . Disabling file IO profiling
// :: INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit: configured=, counted=, effected=
// :: INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
// :: INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to :::00.000
// :: INFO blockmanagement.BlockManager: The block deletion will start around Jun ::
// :: INFO util.GSet: Computing capacity for map BlocksMap
// :: INFO util.GSet: VM type = -bit
// :: INFO util.GSet: 2.0% max memory MB = 17.8 MB
// :: INFO util.GSet: capacity = ^ = entries
// :: INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
// :: WARN conf.Configuration: No unit for dfs.namenode.safemode.extension() assuming MILLISECONDS
// :: INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
// :: INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.min.datanodes =
// :: INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.extension =
// :: INFO blockmanagement.BlockManager: defaultReplication =
// :: INFO blockmanagement.BlockManager: maxReplication =
// :: INFO blockmanagement.BlockManager: minReplication =
// :: INFO blockmanagement.BlockManager: maxReplicationStreams =
// :: INFO blockmanagement.BlockManager: replicationRecheckInterval =
// :: INFO blockmanagement.BlockManager: encryptDataTransfer = false
// :: INFO blockmanagement.BlockManager: maxNumBlocksToLog =
// :: INFO namenode.FSNamesystem: Append Enabled: true
// :: INFO util.GSet: Computing capacity for map INodeMap
// :: INFO util.GSet: VM type = -bit
// :: INFO util.GSet: 1.0% max memory MB = 8.9 MB
// :: INFO util.GSet: capacity = ^ = entries
// :: INFO namenode.FSDirectory: ACLs enabled? false
// :: INFO namenode.FSDirectory: XAttrs enabled? true
// :: INFO namenode.NameNode: Caching file names occurring more than times
// :: INFO snapshot.SnapshotManager: Loaded config captureOpenFiles: falseskipCaptureAccessTimeOnlyChange: false
// :: INFO util.GSet: Computing capacity for map cachedBlocks
// :: INFO util.GSet: VM type = -bit
// :: INFO util.GSet: 0.25% max memory MB = 2.2 MB
// :: INFO util.GSet: capacity = ^ = entries
// :: INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets =
// :: INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users =
// :: INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = ,,
// :: INFO namenode.FSNamesystem: Retry cache on namenode is enabled
// :: INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is millis
// :: INFO util.GSet: Computing capacity for map NameNodeRetryCache
// :: INFO util.GSet: VM type = -bit
// :: INFO util.GSet: 0.029999999329447746% max memory MB = 273.1 KB
// :: INFO util.GSet: capacity = ^ = entries
// :: INFO namenode.FSImage: Allocated new BlockPoolId: BP--192.168.0.120-
// :: INFO common.Storage: Storage directory /opt/hadoop-2.9.0/dfs/name has been successfully formatted.
// :: INFO namenode.FSImageFormatProtobuf: Saving image file /opt/hadoop-2.9.0/dfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
// :: INFO namenode.FSImageFormatProtobuf: Image file /opt/hadoop-2.9.0/dfs/name/current/fsimage.ckpt_0000000000000000000 of size bytes saved in seconds.
// :: INFO namenode.NNStorageRetentionManager: Going to retain images with txid >=
// :: INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/192.168.0.120
************************************************************/
[spark@master hadoop-2.9.0]$ sbin/start-all.sh     # start HDFS and YARN
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [master]
master: starting namenode, logging to /opt/hadoop-2.9.0/logs/hadoop-spark-namenode-master.out
slave1: starting datanode, logging to /opt/hadoop-2.9.0/logs/hadoop-spark-datanode-slave1.out
slave3: starting datanode, logging to /opt/hadoop-2.9.0/logs/hadoop-spark-datanode-slave3.out
slave2: starting datanode, logging to /opt/hadoop-2.9.0/logs/hadoop-spark-datanode-slave2.out
Starting secondary namenodes [master]
master: starting secondarynamenode, logging to /opt/hadoop-2.9.0/logs/hadoop-spark-secondarynamenode-master.out
starting yarn daemons
starting resourcemanager, logging to /opt/hadoop-2.9.0/logs/yarn-spark-resourcemanager-master.out
slave2: starting nodemanager, logging to /opt/hadoop-2.9.0/logs/yarn-spark-nodemanager-slave2.out
slave3: starting nodemanager, logging to /opt/hadoop-2.9.0/logs/yarn-spark-nodemanager-slave3.out
slave1: starting nodemanager, logging to /opt/hadoop-2.9.0/logs/yarn-spark-nodemanager-slave1.out
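Both commands above print deprecation warnings (visible in the output): hadoop namenode -format and start-all.sh still work on 2.9.0, but the non-deprecated equivalents the warnings point to are:

bin/hdfs namenode -format     # same format operation via the hdfs command
sbin/start-dfs.sh             # start the HDFS daemons
sbin/start-yarn.sh            # start the YARN daemons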

4) After about 30 seconds, check whether the daemons started successfully on master, slave1, slave2, and slave3.

Check the processes on master:

[spark@master hadoop-2.9.0]$ jps
Jps
ResourceManager
NameNode
SecondaryNameNode
[spark@master hadoop-2.9.0]$

Run jps on slave1, slave2, and slave3 to confirm that the DataNode and NodeManager processes are running on each node. Using slave1 as an example:

[spark@slave1 hadoop-2.9.0]$ jps
Jps
NodeManager
DataNode
[spark@slave1 hadoop-2.9.0]$
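A DataNode showing up in jps does not by itself prove it registered with the NameNode, so it can be worth one extra check from master that all three DataNodes report in. This is an additional check not in the original write-up:

# expect "Live datanodes (3)" followed by the hostnames slave1, slave2, slave3
bin/hdfs dfsadmin -report | grep -E 'Live datanodes|Hostname'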

Reference: https://blog.csdn.net/magggggic/article/details/52503502
