Software environment:

Linux: CentOS 6.7
Hadoop: 2.6.5
ZooKeeper: 3.4.8

## Host configuration:
###### Five hosts in total (m1, m2, m3, m4, m5); the user name on every host is centos
```
192.168.179.201: m1
192.168.179.202: m2
192.168.179.203: m3
192.168.179.204: m4
192.168.179.205: m5

m1: NameNode, ResourceManager (YARN)
m2: NameNode, ResourceManager (YARN)
m3: ZooKeeper, DataNode, NodeManager, JournalNode
m4: ZooKeeper, DataNode, NodeManager, JournalNode
m5: ZooKeeper, DataNode, NodeManager, JournalNode
```

<br>
## Preliminary setup
#### 1. Configure a static IP on each host:

```
sudo vi /etc/sysconfig/network-scripts/ifcfg-eth0
```
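On m1 the file might look like the sketch below; BOOTPROTO, NETMASK, and GATEWAY are assumptions for this subnet and must match your own network:

```
DEVICE=eth0
TYPE=Ethernet
ONBOOT=yes
BOOTPROTO=static              ## assumption: static addressing
IPADDR=192.168.179.201        ## 201..205 for m1..m5
NETMASK=255.255.255.0         ## assumption
GATEWAY=192.168.179.1         ## assumption: adjust to your gateway
```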


#### 2. Configure the hostname:

```
sudo vi /etc/sysconfig/network
```
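A minimal sketch of the file on m1 (use the matching hostname on each of the other hosts):

```
NETWORKING=yes
HOSTNAME=m1
```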


#### 3. Map hostnames to IPs:

```
sudo vi /etc/hosts
```
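Append the five mappings listed above on every host:

```
192.168.179.201 m1
192.168.179.202 m2
192.168.179.203 m3
192.168.179.204 m4
192.168.179.205 m5
```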


#### 4. Disable the firewall
(1) Stop it for the current session:

```
service iptables stop
service iptables status
```

(2) Keep it off across reboots:

```
chkconfig iptables off
chkconfig iptables --list
```



---
<br>
<br>
<br>
## Setup steps:
## I. Install and configure the ZooKeeper cluster (on the three hosts m3, m4, m5)
#### 1. Unpack

```
tar -zxvf zookeeper-3.4.8.tar.gz -C /home/centos/soft/
mv /home/centos/soft/zookeeper-3.4.8 /home/centos/soft/zookeeper    ## rename so the path matches ZK_HOME below
```



----
<br>
#### 2. Configure environment variables

```
sudo vi /etc/profile
```

Append:

```
# ZooKeeper
export ZK_HOME=/home/centos/soft/zookeeper
export CLASSPATH=$CLASSPATH:$ZK_HOME/lib
export PATH=$PATH:$ZK_HOME/sbin:$ZK_HOME/bin
```

```
source /etc/profile
```



----
<br>
#### 3. Edit the configuration
(1) Create the zoo.cfg file

```
cd /home/centos/soft/zookeeper/conf/
cp zoo_sample.cfg zoo.cfg
vi zoo.cfg
```

Change the dataDir entry:

```
dataDir=/home/centos/soft/zookeeper/tmp
```

Add the following three entries:

```
server.1=m3:2888:3888
server.2=m4:2888:3888
server.3=m5:2888:3888
```
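Put together, the resulting zoo.cfg should look roughly like this (tickTime, initLimit, syncLimit, and clientPort keep the zoo_sample.cfg defaults):

```
tickTime=2000
initLimit=10
syncLimit=5
clientPort=2181
dataDir=/home/centos/soft/zookeeper/tmp
server.1=m3:2888:3888
server.2=m4:2888:3888
server.3=m5:2888:3888
```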


(2) Create the tmp directory

```
mkdir /home/centos/soft/zookeeper/tmp
```

(3) Write the myid file

```
touch /home/centos/soft/zookeeper/tmp/myid
echo 1 > /home/centos/soft/zookeeper/tmp/myid    ## myid=1 on host m3
```



---
<br>
#### 4. Configure where ZooKeeper stores its logs
1. Edit the ```zkEnv.sh``` file

```
vi /home/centos/soft/zookeeper/bin/zkEnv.sh
```

Change this block:

```
if [ "x${ZOO_LOG_DIR}" = "x" ]
then
    ZOO_LOG_DIR="/home/centos/soft/zookeeper/logs"    ## change this line
fi
```
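Since zkEnv.sh only sets ZOO_LOG_DIR when it is empty, an equivalent approach (an alternative to editing the script) is to export the variable in /etc/profile:

```
export ZOO_LOG_DIR=/home/centos/soft/zookeeper/logs
```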



---
<br>
2. Create the ```logs``` directory

```
mkdir /home/centos/soft/zookeeper/logs
```



---
#### 5. Copy to the other hosts and adjust myid
(1) Copy to the other hosts

```
scp -r /home/centos/soft/zookeeper/ m4:/home/centos/soft/
scp -r /home/centos/soft/zookeeper/ m5:/home/centos/soft/
```

(2) Adjust myid

```
echo 2 > /home/centos/soft/zookeeper/tmp/myid    ## on host m4
echo 3 > /home/centos/soft/zookeeper/tmp/myid    ## on host m5
```
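A quick way to confirm every host got the right id (a sketch; it assumes the passwordless SSH set up in section III is already in place):

```
for h in m3 m4 m5; do
  ssh $h 'echo -n "$(hostname): "; cat /home/centos/soft/zookeeper/tmp/myid'
done
```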



----
<br>
<br>
<br>
<br>
## II. Install and configure the Hadoop cluster
#### 1. Unpack

```
tar -zxvf hadoop-2.6.5.tar.gz -C /home/centos/soft/
mv /home/centos/soft/hadoop-2.6.5 /home/centos/soft/hadoop    ## rename so the path matches HADOOP_HOME below
```



---
<br>
#### 2. Add Hadoop to the environment variables

```
sudo vi /etc/profile
```

Append:

```
# Java
export JAVA_HOME=/home/centos/soft/jdk
export CLASSPATH=$CLASSPATH:$JAVA_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin

# Hadoop
export HADOOP_USER_NAME=centos
export HADOOP_HOME=/home/centos/soft/hadoop
export CLASSPATH=$CLASSPATH:$HADOOP_HOME/lib
export PATH=$PATH:$HADOOP_HOME/bin
```

```
source /etc/profile
```
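A quick sanity check that the variables took effect:

```
hadoop version    ## should report Hadoop 2.6.5
java -version
```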



---

#### 3. Edit the Hadoop configuration files
(1) Edit the hadoop-env.sh file

```
export JAVA_HOME=/home/centos/soft/jdk
```


---
(2) Edit the core-site.xml file

```
<configuration>
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://ns1</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/centos/soft/hadoop/tmp</value>
</property>
<property>
  <name>ha.zookeeper.quorum</name>
  <value>m3:2181,m4:2181,m5:2181</value>
</property>
<property>
  <name>hadoop.proxyuser.centos.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.centos.groups</name>
  <value>*</value>
</property>
</configuration>
```


(3) Edit the hdfs-site.xml file

```
<configuration>
<property>
  <name>dfs.nameservices</name>
  <value>ns1</value>
</property>
<property>
  <name>dfs.ha.namenodes.ns1</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.ns1.nn1</name>
  <value>m1:9000</value>
</property>
<property>
  <name>dfs.namenode.http-address.ns1.nn1</name>
  <value>m1:50070</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.ns1.nn2</name>
  <value>m2:9000</value>
</property>
<property>
  <name>dfs.namenode.http-address.ns1.nn2</name>
  <value>m2:50070</value>
</property>
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://m3:8485;m4:8485;m5:8485/ns1</value>
</property>
<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/home/centos/soft/hadoop/journal</value>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>/home/centos/soft/hadoop/tmp/dfs/name</value>
</property>
<property>
 <name>dfs.datanode.data.dir</name>
 <value>/home/centos/soft/hadoop/tmp/dfs/data</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
<property>
  <name>dfs.webhdfs.enabled</name>
  <value>true</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.ns1</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>
    sshfence
    shell(/bin/true)
  </value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/home/centos/.ssh/id_rsa</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.connect-timeout</name>
  <value>30000</value>
</property>
<property>
  <name>dfs.permissions</name>
  <value>false</value>
</property>
<property>
  <name>heartbeat.recheck.interval</name>
  <value>2000</value>
</property>
<property>
  <name>dfs.heartbeat.interval</name>
<value>1</value>
</property>
<property>
<name>dfs.blockreport.intervalMsec</name>
<value>3600000</value>
<description>Determines block reporting interval in milliseconds.</description>
</property>
</configuration>
```

(4) Edit the mapred-site.xml file (copy mapred-site.xml.template to mapred-site.xml first if the file does not exist yet)

```
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>0.0.0.0:10020</value>
<description>MapReduce JobHistory Server IPC host:port</description>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>0.0.0.0:19888</value>
<description>MapReduce JobHistory Server Web UI host:port</description>
</property>
<property>
<name>mapreduce.task.io.sort.mb</name>
<value>1</value>
</property>
<property>
<name>yarn.app.mapreduce.am.staging-dir</name>
<value>/user</value>
</property>
<property>
<name>mapreduce.jobhistory.intermediate-done-dir</name>
<value>/user/history/done_intermediate</value>
</property>
<property>
<name>mapreduce.jobhistory.done-dir</name>
<value>/user/history</value>
</property>
</configuration>
```

(5) Edit the yarn-site.xml file

```
<configuration>
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>yrc</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>m1</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>m2</value>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>m3:2181,m4:2181,m5:2181</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle,spark_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>2048</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>4096</value>
</property>
<property>
<name>yarn.nodemanager.log-dirs</name>
<value>/home/centos/soft/hadoop/logs</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
<value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
<property>
<name>yarn.nodemanager.pmem-check-enabled</name>
<value>false</value>
<description>Whether to launch a thread that checks the physical memory each task is using and kills the task if it exceeds its allocation; defaults to true</description>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
<description>Whether to launch a thread that checks the virtual memory each task is using and kills the task if it exceeds its allocation; defaults to true</description>
</property>
<property>
<name>spark.shuffle.service.port</name>
<value>7337</value>
</property>
</configuration>
```

Note: because this registers the spark_shuffle auxiliary service, the NodeManagers will only start if the Spark YARN shuffle jar (spark-<version>-yarn-shuffle.jar) is on the NodeManager classpath, e.g. under $HADOOP_HOME/share/hadoop/yarn/lib/; remove spark_shuffle from yarn.nodemanager.aux-services if Spark is not used.

(6) Edit the slaves file

The slaves file lists the worker nodes: the DataNodes for HDFS and the NodeManagers for YARN. Adjust it to your actual layout; here:

```
m3
m4
m5
```









## III. Initialize Hadoop
#### 1. Set up passwordless SSH between the hosts
(1) Generate a key pair on m1
```
ssh-keygen -t rsa
```

(2) Copy the public key to every node, including this host

```
ssh-copy-id 127.0.0.1
ssh-copy-id localhost
ssh-copy-id m1
ssh-copy-id m2
ssh-copy-id m3
ssh-copy-id m4
ssh-copy-id m5
```
(3) Repeat steps (1) and (2) on each of the other hosts; the loop sketch below can save some typing.
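A compact way to run step (2), assuming the key pair from step (1) already exists on the current host:

```
for h in 127.0.0.1 localhost m1 m2 m3 m4 m5; do
  ssh-copy-id centos@$h    ## prompts once for the centos password on each host
done
```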




#### 2. Copy the configured hadoop directory to the other nodes
```
scp -r /home/centos/soft/hadoop m2:/home/centos/soft/
scp -r /home/centos/soft/hadoop m3:/home/centos/soft/
scp -r /home/centos/soft/hadoop m4:/home/centos/soft/
scp -r /home/centos/soft/hadoop m5:/home/centos/soft/
```




#### Note: follow the steps below in strict order
#### 3. Start the ZooKeeper cluster (start zk on m3, m4, and m5 separately)
(1) Start the ZooKeeper service

```
cd /home/centos/soft/zookeeper/bin/
./zkServer.sh start
```

(2) Check the status: one leader, two followers

```
./zkServer.sh status
```
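The last line printed on each host should read roughly as follows (which host wins the leader election may vary):

```
Mode: leader      ## on exactly one of m3/m4/m5
Mode: follower    ## on the other two
```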

#### 4. Start the JournalNodes (run on m3, m4, and m5; this must happen before HDFS is formatted, or the format step fails)

(1) Start the JournalNode service

```
cd /home/centos/soft/hadoop
sbin/hadoop-daemon.sh start journalnode
```

(2) Verify with jps that a JournalNode process is now running on m3, m4, and m5

```
jps
```
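On m3, m4, and m5 the output should contain at least these two daemons (PIDs will differ):

```
QuorumPeerMain    ## the ZooKeeper server
JournalNode
```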



#### 5. Format HDFS (on m1 only)
(1) On m1, run:
```
hdfs namenode -format
```

(2) Formatting creates the directory named by hadoop.tmp.dir in core-site.xml, /home/centos/soft/hadoop/tmp here. Copy that tmp directory from m1 to /home/centos/soft/hadoop on m2 so the standby NameNode starts from the same metadata:

```
scp -r /home/centos/soft/hadoop/tmp/ m2:/home/centos/soft/hadoop/
```
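Alternatively (a standard substitute for the scp above), the standby can pull the metadata itself; run this on m2 once the freshly formatted NameNode on m1 has been started (sbin/hadoop-daemon.sh start namenode):

```
hdfs namenode -bootstrapStandby
```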

#### 6. Format the failover state in ZooKeeper (on m1)

```
hdfs zkfc -formatZK
```

#### 7. Start HDFS (on m1)

```
sbin/start-dfs.sh
```

#### 8. Start YARN
On m1:

```
sbin/start-yarn.sh
```

On m2, start the standby ResourceManager separately (start-yarn.sh only brings up the ResourceManager on the host it runs on):

```
sbin/yarn-daemon.sh start resourcemanager
```
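After everything is up, jps on each host should list roughly the following daemons (a sketch based on the role layout above):

```
## m1, m2     : NameNode, DFSZKFailoverController, ResourceManager
## m3, m4, m5 : QuorumPeerMain, JournalNode, DataNode, NodeManager
```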

#### At this point, the Hadoop 2.6.5 configuration is complete!!!














## IV. Verify that the Hadoop cluster works
###### 0. On Windows, edit the hosts file to map the hostnames to IPs (this step can be skipped)
```
C:\Windows\System32\drivers\etc\hosts

192.168.179.201 m1
192.168.179.202 m2
192.168.179.203 m3
192.168.179.204 m4
192.168.179.205 m5
```



----
###### 1. Check the NameNode web UIs in a browser:

http://m1:50070 should show: NameNode 'm1:9000' (active)

http://m2:50070 should show: NameNode 'm2:9000' (standby)


---
###### 2. Verify HDFS HA
1. First upload a file to HDFS

```
hadoop fs -put /etc/profile /profile
```

2. Check that it arrived on HDFS

```
hadoop fs -ls /
```

3. Then kill the active NameNode (on m1); find its PID with jps first

```
kill -9 <NameNode-PID>
```

4. Open http://m2:50070 in a browser

NameNode 'm2:9000' (active)    ## the NameNode on m2 has become active

5. Run:

```
hadoop fs -ls /    ## the file uploaded via m1 should still be there!!!
```

6. Manually restart the killed NameNode on m1

```
sbin/hadoop-daemon.sh start namenode
```

7. Open http://m1:50070 in a browser

NameNode 'm1:9000' (standby)



---
###### 3. Verify YARN:
1. Open http://m1:8088 in a browser and check that the NodeManagers have registered
2. Run the WordCount demo that ships with Hadoop; on Linux, execute:

```
hadoop jar /home/centos/soft/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.5.jar wordcount InputParameter OutputParameter
```

#### If an application shows up and completes on http://m1:8088, YARN is working
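A concrete run, using hypothetical HDFS paths (/wordcount/in and /wordcount/out are illustrative; the output directory must not exist beforehand):

```
hadoop fs -mkdir -p /wordcount/in
hadoop fs -put /etc/profile /wordcount/in/
hadoop jar /home/centos/soft/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.5.jar wordcount /wordcount/in /wordcount/out
hadoop fs -cat /wordcount/out/part-r-00000    ## word counts for /etc/profile
```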

---
<br>
#### OK, all done!!! <br>
<br>
<br>
