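After an afternoon of trial and error, I finally got this cluster up and running. Now that it is done I am not sure it was really necessary, but I will count it as a learning exercise and as groundwork for building a real environment later.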

What follows is the deployment of an HA + Federation HDFS cluster together with YARN.

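  First, my setup.

  The four nodes run the following roles:

  1. qiang117: active namenode

  2. qiang118: standby namenode, journalnode, datanode

  3. qiang119: active namenode, journalnode, datanode

  4. qiang120: standby namenode, journalnode, datanode

  The only reason for this layout is that my computer cannot handle that many virtual machines; in a real deployment all of these roles should sit on separate servers. In short, 117 and 119 act as active namenodes, 118 and 120 act as standby namenodes, and 118, 119 and 120 each also run a datanode and a journalnode.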

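I will skip the (very long) configuration details here. With everything configured, these are the steps I ran and the problems I hit along the way: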

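1. Start the journalnodes. To be honest, I do not fully understand yet what a journalnode is for; I will look into that later. Start a journalnode on each of the journalnode hosts: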

[qiang@qiang118 hadoop-2.6.]$ sbin/hadoop-daemon.sh start journalnode
starting journalnode, logging to /home/qiang/hadoop-2.6./logs/hadoop-qiang-journalnode-qiang118.qiang.out
[qiang@qiang118 hadoop-2.6.]$ jps
JournalNode
Jps
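Rather than logging in to each host by hand, the same start command can be looped over SSH. The following is only a sketch: it assumes passwordless SSH is already working, that Hadoop is installed at the same path on every node, and it uses a placeholder `HADOOP_HOME` and a `RUN` switch that are my own additions, not part of the original setup.

```shell
#!/usr/bin/env bash
# Sketch: start a journalnode on every journalnode host over SSH.
# Assumptions (not from the original post): HADOOP_HOME is the same on all
# nodes and passwordless SSH is set up. /path/to/hadoop is a placeholder.
HADOOP_HOME="${HADOOP_HOME:-/path/to/hadoop}"
JN_HOSTS="${JN_HOSTS:-qiang118 qiang119 qiang120}"

# Prints the commands by default; set RUN=1 to actually execute them.
start_journalnodes() {
    for host in $JN_HOSTS; do
        cmd="ssh $host $HADOOP_HOME/sbin/hadoop-daemon.sh start journalnode"
        if [ "${RUN:-0}" = "1" ]; then
            $cmd
        else
            echo "$cmd"
        fi
    done
}

start_journalnodes
```

Running it once without `RUN=1` shows exactly what would be executed, which is a cheap way to catch a wrong path or hostname before touching the cluster.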

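2. Formatting the namenode failed with the error below. (The cause turned out to be that I had not turned off the firewall. Passwordless SSH login does not mean you can leave the firewall on.)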

// :: INFO ipc.Client: Retrying connect to server: qiang119/192.168.75.119:. Already tried  time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=, sleepTime= MILLISECONDS)
// :: INFO ipc.Client: Retrying connect to server: qiang118/192.168.75.118:. Already tried time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=, sleepTime= MILLISECONDS)
// :: INFO ipc.Client: Retrying connect to server: qiang120/192.168.75.120:. Already tried time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=, sleepTime= MILLISECONDS)
// :: INFO ipc.Client: Retrying connect to server: qiang119/192.168.75.119:. Already tried time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=, sleepTime= MILLISECONDS)
// :: WARN namenode.NameNode: Encountered exception during format:
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Unable to check if JNs are ready for formatting. exceptions thrown:
192.168.75.120:: No Route to Host from 43.49.49.59.broad.ty.sx.dynamic.163data.com.cn/59.49.49.43 to qiang120: failed on socket timeout exception: java.net.NoRouteToHostException: No route to host; For more details see: http://wiki.apache.org/hadoop/NoRouteToHost
192.168.75.119:: No Route to Host from 43.49.49.59.broad.ty.sx.dynamic.163data.com.cn/59.49.49.43 to qiang119: failed on socket timeout exception: java.net.NoRouteToHostException: No route to host; For more details see: http://wiki.apache.org/hadoop/NoRouteToHost
at org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:)
at org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:)
at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.hasSomeData(QuorumJournalManager.java:)
at org.apache.hadoop.hdfs.server.common.Storage.confirmFormat(Storage.java:)
at org.apache.hadoop.hdfs.server.namenode.FSImage.confirmFormat(FSImage.java:)
at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:)
// :: INFO ipc.Client: Retrying connect to server: qiang118/192.168.75.118:. Already tried time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=, sleepTime= MILLISECONDS)
// :: FATAL namenode.NameNode: Failed to start namenode.
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Unable to check if JNs are ready for formatting. exceptions thrown:
192.168.75.120:: No Route to Host from 43.49.49.59.broad.ty.sx.dynamic.163data.com.cn/59.49.49.43 to qiang120: failed on socket timeout exception: java.net.NoRouteToHostException: No route to host; For more details see: http://wiki.apache.org/hadoop/NoRouteToHost
192.168.75.119:: No Route to Host from 43.49.49.59.broad.ty.sx.dynamic.163data.com.cn/59.49.49.43 to qiang119: failed on socket timeout exception: java.net.NoRouteToHostException: No route to host; For more details see: http://wiki.apache.org/hadoop/NoRouteToHost
at org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:)
at org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:)
at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.hasSomeData(QuorumJournalManager.java:)
at org.apache.hadoop.hdfs.server.common.Storage.confirmFormat(Storage.java:)
at org.apache.hadoop.hdfs.server.namenode.FSImage.confirmFormat(FSImage.java:)
at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:)
// :: INFO util.ExitUtil: Exiting with status
// :: INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at 43.49.49.59.broad.ty.sx.dynamic.163data.com.cn/59.49.49.43
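A quick way to tell this kind of failure apart from a Hadoop misconfiguration is to test whether the journalnode port is reachable at all before formatting. The sketch below assumes the journalnodes listen on 8485, the Hadoop default RPC port for journalnodes; if `dfs.journalnode.rpc-address` is overridden in your hdfs-site.xml, set `JN_PORT` accordingly. The function names are made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: check that each journalnode's RPC port accepts TCP connections.
# Assumption: 8485 is the journalnode RPC port (the Hadoop default).
JN_PORT="${JN_PORT:-8485}"

# Return 0 if a TCP connection to host ($1) and port ($2) succeeds within 3s.
# Uses bash's /dev/tcp pseudo-device, so it needs bash and coreutils timeout.
port_open() {
    timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Report reachability for every host given as an argument.
check_journalnodes() {
    for host in "$@"; do
        if port_open "$host" "$JN_PORT"; then
            echo "$host:$JN_PORT reachable"
        else
            echo "$host:$JN_PORT NOT reachable (firewall on? journalnode down?)"
        fi
    done
}

check_journalnodes "$@"
```

Saved as, say, `check_jn.sh`, it would be called as `bash check_jn.sh qiang118 qiang119 qiang120`. If a host is reported as not reachable while `jps` on that host does show a JournalNode process, the firewall is the likely suspect.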

With the firewall turned off, the format succeeded:

[qiang@qiang117 hadoop-2.6.]$ bin/hdfs namenode -format -clusterId hadoop-cluster
// :: INFO namenode.FSNamesystem: Append Enabled: true
// :: INFO util.GSet: Computing capacity for map INodeMap
// :: INFO util.GSet: VM type = -bit
// :: INFO util.GSet: 1.0% max memory MB = 8.9 MB
// :: INFO util.GSet: capacity = ^ = entries
// :: INFO namenode.NameNode: Caching file names occuring more than times
// :: INFO util.GSet: Computing capacity for map cachedBlocks
// :: INFO util.GSet: VM type = -bit
// :: INFO util.GSet: 0.25% max memory MB = 2.2 MB
// :: INFO util.GSet: capacity = ^ = entries
// :: INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
// :: INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes =
// :: INFO namenode.FSNamesystem: dfs.namenode.safemode.extension =
// :: INFO namenode.FSNamesystem: Retry cache on namenode is enabled
// :: INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is millis
// :: INFO util.GSet: Computing capacity for map NameNodeRetryCache
// :: INFO util.GSet: VM type = -bit
// :: INFO util.GSet: 0.029999999329447746% max memory MB = 273.1 KB
// :: INFO util.GSet: capacity = ^ = entries
// :: INFO namenode.NNConf: ACLs enabled? false
// :: INFO namenode.NNConf: XAttrs enabled? true
// :: INFO namenode.NNConf: Maximum size of an xattr:
// :: INFO namenode.FSImage: Allocated new BlockPoolId: BP--192.168.75.117-
// :: INFO common.Storage: Storage directory /home/qiang/hadoop/hdfs/name has been successfully formatted.
// :: INFO namenode.NNStorageRetentionManager: Going to retain images with txid >=
// :: INFO util.ExitUtil: Exiting with status
// :: INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at qiang117/192.168.75.117
************************************************************/

3. Start the namenode:

[qiang@qiang117 hadoop-2.6.]$ sbin/hadoop-daemon.sh start namenode
starting namenode, logging to /home/qiang/hadoop-2.6./logs/hadoop-qiang-namenode-qiang117.out
[qiang@qiang117 hadoop-2.6.]$ jps
NameNode
Jps

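4. Format (bootstrap) the standby namenode. In this run that is qiang119, which the output below reports as nn2, copying its image from nn1 on qiang117: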

[qiang@qiang119 hadoop-2.6.]$ bin/hdfs namenode -bootstrapStandby
// :: INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = qiang119/192.168.75.119
STARTUP_MSG: args = [-bootstrapStandby]
STARTUP_MSG: version = 2.6.0
.....
.....
STARTUP_MSG: build = https://git-wip-us.apache.org/repos/asf/hadoop.git -r e3496499ecb8d220fba99dc5ed4c99c8f9e33bb1; compiled by 'jenkins' on 2014-11-13T21:10Z
STARTUP_MSG: java = 1.8.0_51
************************************************************/
// :: INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
// :: INFO namenode.NameNode: createNameNode [-bootstrapStandby]
=====================================================
About to bootstrap Standby ID nn2 from:
Nameservice ID: hadoop-cluster1
Other Namenode ID: nn1
Other NN's HTTP address: http://qiang117:50070
Other NN's IPC address: qiang117/192.168.75.117:8020
Namespace ID:
Block pool ID: BP--192.168.75.117-
Cluster ID: hadoop-cluster
Layout version: -
=====================================================
// :: INFO common.Storage: Storage directory /home/qiang/hadoop/hdfs/name has been successfully formatted.
// :: INFO namenode.TransferFsImage: Opening connection to http://qiang117:50070/imagetransfer?getimage=1&txid=0&storageInfo=-60:1244139539:0:hadoop-cluster
// :: INFO namenode.TransferFsImage: Image Transfer timeout configured to milliseconds
// :: INFO namenode.TransferFsImage: Transfer took .01s at 0.00 KB/s
// :: INFO namenode.TransferFsImage: Downloaded file fsimage.ckpt_0000000000000000000 size bytes.
// :: INFO util.ExitUtil: Exiting with status
// :: INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at qiang119/192.168.75.119
************************************************************/

5. Start the standby namenode:

[qiang@qiang119 hadoop-2.6.]$ sbin/hadoop-daemon.sh start namenode
starting namenode, logging to /home/qiang/hadoop-2.6./logs/hadoop-qiang-namenode-qiang119.out
[qiang@qiang119 hadoop-2.6.]$ jps
JournalNode
NameNode
Jps

In the web UI, both namenodes show up in standby state at this point.

Use this command to switch nn1 to active:

bin/hdfs haadmin -ns hadoop-cluster1 -transitionToActive nn1

The other two namenodes are handled the same way.
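To make the "same way" concrete, here is a small sketch that issues the transition for one namenode per nameservice and then queries its state with `-getServiceState`. Only `hadoop-cluster1` and `nn1` come from the output above; `hadoop-cluster2` and `nn3` are hypothetical stand-ins for the second nameservice, so replace them with whatever your hdfs-site.xml defines. The `RUN` switch is also my own addition.

```shell
#!/usr/bin/env bash
# Sketch: make one namenode per nameservice active, then print its HA state.
# hadoop-cluster1/nn1 are taken from the post; hadoop-cluster2/nn3 are
# HYPOTHETICAL IDs for the second nameservice, which the post does not show.
HDFS="${HDFS:-bin/hdfs}"

# Prints the commands by default; set RUN=1 to actually execute them.
activate() {
    ns="$1"
    nn="$2"
    for action in "-transitionToActive $nn" "-getServiceState $nn"; do
        cmd="$HDFS haadmin -ns $ns $action"
        if [ "${RUN:-0}" = "1" ]; then
            $cmd
        else
            echo "$cmd"
        fi
    done
}

activate hadoop-cluster1 nn1
activate hadoop-cluster2 nn3   # hypothetical IDs, adjust to your config
```

Checking the state right after the transition saves a trip to the web UI.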

Start all the datanodes. This only works once passwordless SSH login has been set up; for reference, see http://www.cnblogs.com/qiangweikang/p/4740936.html

[qiang@qiang117 hadoop-2.6.]$ sbin/hadoop-daemons.sh start datanode

Three of them came up, the ones planned from the start: 192.168.75.118, 192.168.75.119 and 192.168.75.120.

Start YARN:

[qiang@qiang117 hadoop-2.6.]$ sbin/start-yarn.sh 
starting yarn daemons
starting resourcemanager, logging to /home/qiang/hadoop-2.6./logs/yarn-qiang-resourcemanager-qiang117.out
qiang118: nodemanager running as process . Stop it first.
qiang120: nodemanager running as process . Stop it first.
qiang119: nodemanager running as process . Stop it first.
[qiang@qiang117 hadoop-2.6.]$ jps
NameNode
Jps
ResourceManager

Here, too, the same three nodes show up (qiang118, qiang119 and qiang120).

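To wrap up: if you are teaching yourself big data, a simple deployment is enough, as long as the programs you write can run against HDFS. A cluster like this one is best left for the end, or studied in detail when you actually need it. Better to move on to the next stage quickly.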
