1. First, add the host entries to /etc/hosts

vim /etc/hosts

192.168.0.1 MSJTVL-DSJC-H01
192.168.0.2 MSJTVL-DSJC-H03
192.168.0.3 MSJTVL-DSJC-H05
192.168.0.4 MSJTVL-DSJC-H02
192.168.0.5 MSJTVL-DSJC-H04
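The same entries can be appended non-interactively with a here-document. This is a sketch, to be run as root on every node; the IPs are this guide's sample addresses, so substitute your own:

```shell
# Append the cluster's host entries to /etc/hosts in one shot.
# Run as root on every node; replace the sample IPs with your own.
cat >> /etc/hosts <<'EOF'
192.168.0.1 MSJTVL-DSJC-H01
192.168.0.2 MSJTVL-DSJC-H03
192.168.0.3 MSJTVL-DSJC-H05
192.168.0.4 MSJTVL-DSJC-H02
192.168.0.5 MSJTVL-DSJC-H04
EOF
```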

2. Set up passwordless SSH trust between the machines

Setup passphraseless ssh

Now check that you can ssh to the localhost without a passphrase:

  $ ssh localhost
If you cannot ssh to localhost without a passphrase, execute the following commands:

  $ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
  $ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

Copy the public key files from the other machines into the authorized_keys file on MSJTVL-DSJC-H01:

[hadoop@MSJTVL-DSJC-H01 .ssh]$ scp hadoop@MSJTVL-DSJC-H02:/hadoop/.ssh/id_dsa.pub ./id_dsa.pub2
[hadoop@MSJTVL-DSJC-H01 .ssh]$ scp hadoop@MSJTVL-DSJC-H03:/hadoop/.ssh/id_dsa.pub ./id_dsa.pub3
[hadoop@MSJTVL-DSJC-H01 .ssh]$ scp hadoop@MSJTVL-DSJC-H04:/hadoop/.ssh/id_dsa.pub ./id_dsa.pub4
[hadoop@MSJTVL-DSJC-H01 .ssh]$ scp hadoop@MSJTVL-DSJC-H05:/hadoop/.ssh/id_dsa.pub ./id_dsa.pub5
[hadoop@MSJTVL-DSJC-H01 .ssh]$ cat ~/.ssh/id_dsa.pub2 >> ~/.ssh/authorized_keys
[hadoop@MSJTVL-DSJC-H01 .ssh]$ cat ~/.ssh/id_dsa.pub3 >> ~/.ssh/authorized_keys
[hadoop@MSJTVL-DSJC-H01 .ssh]$ cat ~/.ssh/id_dsa.pub4 >> ~/.ssh/authorized_keys
[hadoop@MSJTVL-DSJC-H01 .ssh]$ cat ~/.ssh/id_dsa.pub5 >> ~/.ssh/authorized_keys

The steps above let MSJTVL-DSJC-H02 through MSJTVL-DSJC-H05 log in to MSJTVL-DSJC-H01 without a password.

To make all five machines (MSJTVL-DSJC-H01 through H05) trust each other, copy the authorized_keys file from MSJTVL-DSJC-H01 to the other machines:

[hadoop@MSJTVL-DSJC-H02 ~]$ scp hadoop@MSJTVL-DSJC-H01:/hadoop/.ssh/authorized_keys /hadoop/.ssh/authorized_keys
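Rather than repeating the scp on each host, the distribution can be scripted. This is a sketch, not part of the original procedure; it assumes the hadoop user's home is /hadoop on every node, as in the commands above. With DRY_RUN=1 it only prints the commands it would run:

```shell
# Distribute the merged authorized_keys from MSJTVL-DSJC-H01 to the rest
# of the cluster. With DRY_RUN=1 the commands are only printed; set
# DRY_RUN=0 to actually copy.
DRY_RUN=1
for host in MSJTVL-DSJC-H02 MSJTVL-DSJC-H03 MSJTVL-DSJC-H04 MSJTVL-DSJC-H05; do
  cmd="scp /hadoop/.ssh/authorized_keys hadoop@${host}:/hadoop/.ssh/authorized_keys"
  if [ "$DRY_RUN" = "1" ]; then
    echo "$cmd"
  else
    $cmd
  fi
done
```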
Download the Hadoop tarball:

wget http://apache.fayea.com/hadoop/common/hadoop-2.6.4/hadoop-2.6.4.tar.gz

Unpack the tarball and create a symlink to it:

[hadoop@MSJTVL-DSJC-H01 ~]$ tar -zxvf hadoop-2.6.4.tar.gz
[hadoop@MSJTVL-DSJC-H01 ~]$ ln -sf hadoop-2.6.4 hadoop

Go into Hadoop's configuration directory and edit hadoop-env.sh:

[hadoop@MSJTVL-DSJC-H01 ~]$ cd hadoop/etc/hadoop/
[hadoop@MSJTVL-DSJC-H01 hadoop]$ vim hadoop-env.sh

Set the JAVA_HOME variable in hadoop-env.sh.
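For example, replace the default with an explicit path. The JDK location below is only an assumption; point it at wherever your JDK actually lives:

```shell
# In hadoop-env.sh: replace "export JAVA_HOME=${JAVA_HOME}" with an
# explicit path. /usr/java/jdk1.7.0_79 is a placeholder, not a requirement.
export JAVA_HOME=/usr/java/jdk1.7.0_79
```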

Next, edit hdfs-site.xml. The settings below follow http://hadoop.apache.org/docs/r2.6.4/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html

First configure the logical name of the HDFS name service, dfs.nameservices.

[hadoop@MSJTVL-DSJC-H01 hadoop]$ vim hdfs-site.xml
<configuration>
  <!-- Logical name of the name service; change it as needed -->
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>
  <!-- IDs of the NameNodes under mycluster (must match the name service
       above); nn1 and nn2 are arbitrary labels -->
  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2</value>
  </property>
  <!-- RPC address and port of each NameNode; replace the host names with
       your own (MSJTVL-DSJC-H01 and MSJTVL-DSJC-H02 are the two NameNode
       hosts here) -->
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn1</name>
    <value>MSJTVL-DSJC-H01:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn2</name>
    <value>MSJTVL-DSJC-H02:8020</value>
  </property>
  <!-- HTTP address and port of each NameNode -->
  <property>
    <name>dfs.namenode.http-address.mycluster.nn1</name>
    <value>MSJTVL-DSJC-H01:50070</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.nn2</name>
    <value>MSJTVL-DSJC-H02:50070</value>
  </property>
  <!-- URI of the JournalNode group that stores the shared edit log -->
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://MSJTVL-DSJC-H03:8485;MSJTVL-DSJC-H04:8485;MSJTVL-DSJC-H05:8485/mycluster</value>
  </property>
  <!-- Fixed class that HDFS clients use to find the active NameNode
       (change the service name in the property name if you changed it
       above) -->
  <property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <!-- sshfence: SSH to the active NameNode and kill the process. The
       private key is the one generated earlier under the hadoop user's
       .ssh directory -->
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/hadoop/.ssh/id_dsa</value>
  </property>
  <!-- Working directory of the JournalNodes -->
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/hadoop/jn/data</value>
  </property>
  <!-- Enable automatic NameNode failover -->
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
</configuration>

Next, edit core-site.xml:

<!-- Entry point for HDFS clients; the service name must match
     dfs.nameservices in hdfs-site.xml -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://mycluster</value>
</property>
<!-- The ZooKeeper ensemble -->
<property>
  <name>ha.zookeeper.quorum</name>
  <value>MSJTVL-DSJC-H03:2181,MSJTVL-DSJC-H04:2181,MSJTVL-DSJC-H05:2181</value>
</property>
<!-- Hadoop's temporary/working directory -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/hadoop/tmp</value>
</property>

Configure the slaves file:

MSJTVL-DSJC-H03
MSJTVL-DSJC-H04
MSJTVL-DSJC-H05
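As a sketch, the file can be written in one command. The path assumes the hadoop symlink created earlier, relative to the hadoop user's home:

```shell
# Write etc/hadoop/slaves in one go. HADOOP_CONF is the config directory
# under the symlink created earlier; mkdir -p is a no-op on a real install.
HADOOP_CONF=hadoop/etc/hadoop
mkdir -p "$HADOOP_CONF"
cat > "$HADOOP_CONF/slaves" <<'EOF'
MSJTVL-DSJC-H03
MSJTVL-DSJC-H04
MSJTVL-DSJC-H05
EOF
```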

Install ZooKeeper

Simply unpack the tarball.

Then edit its configuration file:

[zookeeper@MSJTVL-DSJC-H03 conf]$ vim zoo.cfg

Set dataDir as below (do not leave it under /tmp) and list the three servers:

dataDir=/opt/zookeeper/data
#autopurge.purgeInterval=1
server.1=MSJTVL-DSJC-H03:2888:3888
server.2=MSJTVL-DSJC-H04:2888:3888
server.3=MSJTVL-DSJC-H05:2888:3888

Under /opt/zookeeper/data, create a file named myid containing the same number as this host's server.N entry.
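A minimal sketch of the myid step (run on MSJTVL-DSJC-H03 with ID=1, on H04 with ID=2, on H05 with ID=3; the directory must match dataDir in zoo.cfg):

```shell
# Create the myid file that tells ZooKeeper which server.N this host is.
ZK_DATA_DIR=/opt/zookeeper/data   # must match dataDir in zoo.cfg
ID=1                              # 1 on H03, 2 on H04, 3 on H05
mkdir -p "$ZK_DATA_DIR"
echo "$ID" > "$ZK_DATA_DIR/myid"
```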

Start ZooKeeper on each of the three hosts (zkServer.sh start) and check the processes with jps.

Start the HA cluster

1. First start the JournalNodes. On each JournalNode host, go to the sbin directory and run ./hadoop-daemon.sh start journalnode:

[hadoop@MSJTVL-DSJC-H03 sbin]$ ./hadoop-daemon.sh start journalnode
starting journalnode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-journalnode-MSJTVL-DSJC-H03.out
[hadoop@MSJTVL-DSJC-H03 sbin]$ jps
3204 JournalNode
3252 Jps
[hadoop@MSJTVL-DSJC-H03 sbin]$

2. Format HDFS on one of the NameNodes:

[hadoop@MSJTVL-DSJC-H01 bin]$ ./hdfs namenode -format

Formatting creates the initial metadata files under /hadoop/tmp/dfs/name/current:

[hadoop@MSJTVL-DSJC-H01 ~]$ cd tmp/
[hadoop@MSJTVL-DSJC-H01 tmp]$ ll
total 4
drwxr-xr-x. 3 hadoop hadoop 4096 Sep  6 16:54 dfs
[hadoop@MSJTVL-DSJC-H01 tmp]$ cd dfs/
[hadoop@MSJTVL-DSJC-H01 dfs]$ ll
total 4
drwxr-xr-x. 3 hadoop hadoop 4096 Sep  6 16:54 name
[hadoop@MSJTVL-DSJC-H01 dfs]$ cd name/
[hadoop@MSJTVL-DSJC-H01 name]$ ll
total 4
drwxr-xr-x. 2 hadoop hadoop 4096 Sep  6 16:54 current
[hadoop@MSJTVL-DSJC-H01 name]$ cd current/
[hadoop@MSJTVL-DSJC-H01 current]$ ll
total 16
-rw-r--r--. 1 hadoop hadoop 352 Sep  6 16:54 fsimage_0000000000000000000
-rw-r--r--. 1 hadoop hadoop  62 Sep  6 16:54 fsimage_0000000000000000000.md5
-rw-r--r--. 1 hadoop hadoop   2 Sep  6 16:54 seen_txid
-rw-r--r--. 1 hadoop hadoop 201 Sep  6 16:54 VERSION
[hadoop@MSJTVL-DSJC-H01 current]$ pwd
/hadoop/tmp/dfs/name/current
[hadoop@MSJTVL-DSJC-H01 current]$

3. Copy the initial metadata files to the other NameNode. Before copying, start the NameNode that was just formatted:

[hadoop@MSJTVL-DSJC-H01 sbin]$ ./hadoop-daemon.sh start namenode
starting namenode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-namenode-MSJTVL-DSJC-H01.out
[hadoop@MSJTVL-DSJC-H01 sbin]$ jps
3324 NameNode
3396 Jps
[hadoop@MSJTVL-DSJC-H01 sbin]$

Then run hdfs namenode -bootstrapStandby on the NameNode that was not formatted. When it finishes, check that its metadata files match those on the first NameNode; identical files mean the bootstrap succeeded.

[hadoop@MSJTVL-DSJC-H02 bin]$ hdfs namenode -bootstrapStandby
4. Initialize ZKFC: on any one of the machines, run hdfs zkfc -formatZK to initialize the failover state in ZooKeeper.

5. Restart the whole HDFS cluster:

[hadoop@MSJTVL-DSJC-H01 sbin]$ ./start-dfs.sh
16/09/06 17:10:25 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [MSJTVL-DSJC-H01 MSJTVL-DSJC-H02]
MSJTVL-DSJC-H02: starting namenode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-namenode-MSJTVL-DSJC-H02.out
MSJTVL-DSJC-H01: starting namenode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-namenode-MSJTVL-DSJC-H01.out
MSJTVL-DSJC-H03: starting datanode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-datanode-MSJTVL-DSJC-H03.out
MSJTVL-DSJC-H04: starting datanode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-datanode-MSJTVL-DSJC-H04.out
MSJTVL-DSJC-H05: starting datanode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-datanode-MSJTVL-DSJC-H05.out
Starting journal nodes [MSJTVL-DSJC-H03 MSJTVL-DSJC-H04 MSJTVL-DSJC-H05]
MSJTVL-DSJC-H03: starting journalnode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-journalnode-MSJTVL-DSJC-H03.out
MSJTVL-DSJC-H04: starting journalnode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-journalnode-MSJTVL-DSJC-H04.out
MSJTVL-DSJC-H05: starting journalnode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-journalnode-MSJTVL-DSJC-H05.out
16/09/06 17:10:43 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting ZK Failover Controllers on NN hosts [MSJTVL-DSJC-H01 MSJTVL-DSJC-H02]
MSJTVL-DSJC-H02: starting zkfc, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-zkfc-MSJTVL-DSJC-H02.out
MSJTVL-DSJC-H01: starting zkfc, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-zkfc-MSJTVL-DSJC-H01.out
[hadoop@MSJTVL-DSJC-H01 sbin]$ jps
4345 Jps
4279 DFSZKFailoverController
3993 NameNode

6. Create a directory and upload a file:

./hdfs dfs -mkdir -p /usr/file
./hdfs dfs -put /hadoop/tian.txt /usr/file

After uploading a file, you can browse to it in the NameNode web UI.

MapReduce (MR) high availability

Configure yarn-site.xml:

<configuration>
  <!-- Enable ResourceManager HA -->
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <!-- Identifier for the RM cluster -->
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>rm-cluster</value>
  </property>
  <!-- IDs of the two ResourceManagers -->
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <!-- Automatic RM failover -->
  <property>
    <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <!-- Recover RM state after a failover -->
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
  <!-- RM host 1 -->
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>MSJTVL-DSJC-H01</value>
  </property>
  <!-- RM host 2 -->
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>MSJTVL-DSJC-H02</value>
  </property>
  <!-- How RM state is stored: in memory (MemStore) or in ZooKeeper
       (ZKStore) -->
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>
  <!-- ZooKeeper ensemble that stores the RM state -->
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>MSJTVL-DSJC-H03:2181,MSJTVL-DSJC-H04:2181,MSJTVL-DSJC-H05:2181</value>
  </property>
  <!-- Scheduler addresses -->
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm1</name>
    <value>MSJTVL-DSJC-H01:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm2</name>
    <value>MSJTVL-DSJC-H02:8030</value>
  </property>
  <!-- Addresses the NodeManagers use to talk to the RMs -->
  <property>
    <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
    <value>MSJTVL-DSJC-H01:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
    <value>MSJTVL-DSJC-H02:8031</value>
  </property>
  <!-- Addresses clients use to submit applications -->
  <property>
    <name>yarn.resourcemanager.address.rm1</name>
    <value>MSJTVL-DSJC-H01:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address.rm2</name>
    <value>MSJTVL-DSJC-H02:8032</value>
  </property>
  <!-- Addresses administrators use to send management commands -->
  <property>
    <name>yarn.resourcemanager.admin.address.rm1</name>
    <value>MSJTVL-DSJC-H01:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address.rm2</name>
    <value>MSJTVL-DSJC-H02:8033</value>
  </property>
  <!-- RM web UI addresses, for viewing cluster information -->
  <property>
    <name>yarn.resourcemanager.webapp.address.rm1</name>
    <value>MSJTVL-DSJC-H01:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm2</name>
    <value>MSJTVL-DSJC-H02:8088</value>
  </property>
</configuration>
Configure mapred-site.xml:

<!-- Run MapReduce on the YARN framework -->
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>

The standby ResourceManager must be started by hand:

[hadoop@MSJTVL-DSJC-H02 sbin]$ yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /hadoop/hadoop-2.6.4/logs/yarn-hadoop-resourcemanager-MSJTVL-DSJC-H02.out
[hadoop@MSJTVL-DSJC-H02 sbin]$ jps
3000 ResourceManager
2812 NameNode
3055 Jps
2922 DFSZKFailoverController
[hadoop@MSJTVL-DSJC-H02 sbin]$
