HA Distributed Cluster (1): Hadoop + ZooKeeper
I. Advantages of the HA configuration:
1. Prevents the whole cluster from failing when a single NameNode goes down
2. Meets the demands of production use
II. HA installation steps:
1. Install the virtual machines
Versions: VMware_workstation_full_12.5.0.11529.exe; Linux image: CentOS-7-x86_64-DVD-1611.iso
Notes:
1. Bridged networking was chosen (keeps the route from changing); on a desktop or server it is best to give the host machine a static IP
2. The "Infrastructure Server" profile was chosen during installation (reduces memory use while still providing a basic environment)
3. Username: root, password: root
4. The network was configured manually with a fixed IPv4 address (static IP)
2. Basic Linux environment setup (all operations as root):
1. Verify networking: from the VM, ping <host IP>; from the host, ping <VM IP>; ping www.baidu.com — all OK
Back up the interface config: cp /etc/sysconfig/network-scripts/ifcfg-ens33 /etc/sysconfig/network-scripts/ifcfg-ens33.bak
2. Firewall: stop and disable it
Stop the firewall: systemctl stop firewalld.service (CentOS 7 uses firewalld rather than the iptables service of earlier releases)
Disable the firewall: systemctl disable firewalld.service
Check firewall state: firewall-cmd --state
3. Set hostname, hosts, and network:
vim /etc/hostname
ha1
vim /etc/hosts
192.168.1.116 ha1
192.168.1.117 ha2
192.168.1.118 ha3
192.168.1.119 ha4
vim /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=ha1
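The host mappings above can be scripted so that re-running setup never duplicates entries. A minimal sketch — it writes to a scratch file here for safety; point HOSTS_FILE at /etc/hosts on the real nodes:

```shell
#!/bin/sh
# Append a hostname mapping only if it is not already present (idempotent).
# HOSTS_FILE is a scratch file for illustration; use /etc/hosts on the nodes.
HOSTS_FILE="${HOSTS_FILE:-./hosts.demo}"
: > "$HOSTS_FILE"   # start from an empty demo file

add_host() {
    ip="$1"; name="$2"
    # Only append when the hostname is not already mapped.
    grep -q "[[:space:]]$name\$" "$HOSTS_FILE" || echo "$ip $name" >> "$HOSTS_FILE"
}

add_host 192.168.1.116 ha1
add_host 192.168.1.117 ha2
add_host 192.168.1.118 ha3
add_host 192.168.1.119 ha4
add_host 192.168.1.116 ha1   # running again adds nothing
```

The repeated ha1 call demonstrates the idempotency: the file still ends up with exactly four lines.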
4. Install some commonly needed packages (not necessarily exhaustive):
yum install -y chkconfig
yum install -y python
yum install -y bind-utils
yum install -y psmisc
yum install -y libxslt
yum install -y zlib
yum install -y sqlite
yum install -y cyrus-sasl-plain
yum install -y cyrus-sasl-gssapi
yum install -y fuse
yum install -y portmap
yum install -y fuse-libs
yum install -y redhat-lsb
5. Install Java and Scala
Java version: jdk-8u111-linux-x64.tar.gz
Scala version: scala-2.11.6.tgz
Check whether Java is already installed:
rpm -qa | grep java    # nothing found
tar -zxf jdk-8u111-linux-x64.tar.gz
tar -zxf scala-2.11.6.tgz
mv jdk1.8.0_111 /usr/java
mv scala-2.11.6 /usr/scala
Configure environment variables:
vim /etc/profile
export JAVA_HOME=/usr/java
export SCALA_HOME=/usr/scala
export PATH=$JAVA_HOME/bin:$SCALA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
6. Reboot and verify all of the above. After the reboot, take a VM snapshot named: "init ok — java, scala, hostname, firewall, ip"
3. Hadoop + ZooKeeper cluster configuration
1. Prepare the cluster machines
Linked clones: clone ha2, ha3, ha4 from ha1
On ha2, ha3, ha4, adjust the IP address, hostname/network, and firewall:
vim /etc/sysconfig/network-scripts/ifcfg-ens33   # change 116 to 117/118/119
service network restart
vim /etc/hostname
vim /etc/sysconfig/network
systemctl disable firewalld.service
Reboot ha2, ha3, ha4 and verify IP, networking, and firewall; snapshot each of the three machines as: "init ok — java, scala, hostname, firewall, ip"
2. Cluster layout (RM = ResourceManager, DM = NodeManager; reconstructed from the jps listings below)

| Node | NameNode | DataNode | ZooKeeper | ZkFC | JournalNode | RM | DM |
|------|----------|----------|-----------|------|-------------|----|----|
| ha1  | 1        |          | 1         | 1    |             | 1  |    |
| ha2  | 1        | 1        | 1         | 1    | 1           |    | 1  |
| ha3  |          | 1        | 1         |      | 1           |    | 1  |
| ha4  |          | 1        |           |      | 1           |    | 1  |
3. SSH communication (snapshot as "ssh ok" when done)
On all four machines:
ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
On ha1:
scp ~/.ssh/* root@ha2:~/.ssh/
scp ~/.ssh/* root@ha3:~/.ssh/
scp ~/.ssh/* root@ha4:~/.ssh/
Verify:
ssh ha2 / ssh ha3 / ssh ha4
4. ZooKeeper cluster configuration
1. Configure environment variables
Install ZooKeeper:
tar -zxf zookeeper-3.4.8.tar.gz
mv zookeeper-3.4.8 /usr/zookeeper-3.4.8
Edit /etc/profile:
export ZK_HOME=/usr/zookeeper-3.4.8
scp /etc/profile root@ha2:/etc/
scp /etc/profile root@ha3:/etc/
source /etc/profile   # on each machine
2. zoo.cfg configuration (the changed lines are dataDir, dataLogDir, and the server.N entries; the rest keeps the sample defaults)
cd /usr/zookeeper-3.4.8/conf
cp zoo_sample.cfg zoo.cfg
Contents:
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/opt/zookeeper/datas
dataLogDir=/opt/zookeeper/logs
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.1=ha1:2888:3888
server.2=ha2:2888:3888
server.3=ha3:2888:3888
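Each server.N line must agree with the myid file created later on that node. Rather than typing the ids by hand, they can be derived from zoo.cfg itself; a sketch, working on a local copy of the config for illustration (on the nodes, read /usr/zookeeper-3.4.8/conf/zoo.cfg):

```shell
#!/bin/sh
# Derive a node's myid from the server.N= lines in zoo.cfg, so the id can
# never drift out of sync with the ensemble definition.
cat > ./zoo.cfg.demo <<'EOF'
server.1=ha1:2888:3888
server.2=ha2:2888:3888
server.3=ha3:2888:3888
EOF

myid_for() {
    # Split on '.', '=', ':' so "server.2=ha2:2888:3888" yields
    # $2 = id, $3 = host; print the id whose host matches $1.
    awk -F'[.=:]' -v h="$1" '/^server\./ && $3 == h { print $2 }' ./zoo.cfg.demo
}

myid_for ha2   # prints 2
```

On a real node the result would be written out with something like `myid_for "$(hostname)" > /opt/zookeeper/datas/myid`.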
3. Start the ZooKeeper cluster:
# On the three machines ha1, ha2, ha3
Create the data directories:
mkdir -p /opt/zookeeper/datas
mkdir -p /opt/zookeeper/logs
cd /opt/zookeeper/datas
vim myid    # write 1 on ha1, 2 on ha2, 3 on ha3
# Distribute to ha2 and ha3 (note: ha4 does not run ZooKeeper)
cd /usr
scp -r zookeeper-3.4.8 root@ha2:/usr
scp -r zookeeper-3.4.8 root@ha3:/usr
# Start (on all three machines)
cd $ZK_HOME/bin
./zkServer.sh start
./zkServer.sh status   # expect one leader and two followers
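A healthy three-node ensemble shows exactly one leader and two followers. The check can be scripted by counting the "Mode:" lines that `zkServer.sh status` prints; sample output is hard-coded below so the sketch runs anywhere:

```shell
#!/bin/sh
# Count leader/follower roles from zkServer.sh status output.
# On a live cluster, collect the lines with something like:
#   for h in ha1 ha2 ha3; do ssh "$h" "$ZK_HOME/bin/zkServer.sh status"; done
statuses="Mode: follower
Mode: leader
Mode: follower"

leaders=$(printf '%s\n' "$statuses" | grep -c '^Mode: leader$')
followers=$(printf '%s\n' "$statuses" | grep -c '^Mode: follower$')
echo "leaders=$leaders followers=$followers"   # prints: leaders=1 followers=2
```

Anything other than one leader and two followers means the ensemble has not formed a quorum.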
5. Hadoop cluster configuration
1. Configure environment variables:
Version: hadoop-2.7.3.tar.gz
tar -zxf hadoop-2.7.3.tar.gz
mv hadoop-2.7.3 /usr/hadoop-2.7.3
In /etc/profile:
export JAVA_HOME=/usr/java
export SCALA_HOME=/usr/scala
export HADOOP_HOME=/usr/hadoop-2.7.3
export PATH=$JAVA_HOME/bin:$SCALA_HOME/bin:$HADOOP_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
source /etc/profile
2. hadoop-env.sh configuration:
export JAVA_HOME=/usr/java
source hadoop-env.sh
hadoop version   # verify OK
3. hdfs-site.xml configuration (after any later change, redistribute it, e.g. scp hdfs-site.xml root@ha4:/usr/hadoop-2.7.3/etc/hadoop/)
vim hdfs-site.xml
<configuration>
<property>
<name>dfs.nameservices</name>
<value>mycluster</value>
</property>
<property>
<name>dfs.ha.namenodes.mycluster</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn1</name>
<value>ha1:9000</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn2</name>
<value>ha2:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn1</name>
<value>ha1:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn2</name>
<value>ha2:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://ha2:8485;ha3:8485;ha4:8485/mycluster</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.mycluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/opt/jn/data</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
</configuration>
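Before starting anything it is worth sanity-checking that values like dfs.nameservices landed correctly. A sketch that pulls one property value out of the XML with sed — it works on a trimmed stand-in file here; on the nodes, point it at $HADOOP_HOME/etc/hadoop/hdfs-site.xml (once the daemons can load the config, `hdfs getconf -confKey` does the same job):

```shell
#!/bin/sh
# Extract a <value> from a Hadoop *-site.xml by the <name> that precedes it.
# Assumes the usual layout where <name> and <value> sit on adjacent lines.
cat > ./hdfs-site.demo.xml <<'EOF'
<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2</value>
  </property>
</configuration>
EOF

get_prop() {
    # On the line matching <name>$1</name>, advance to the next line
    # and print what is between <value> and </value>.
    sed -n "/<name>$1<\/name>/{ n; s/.*<value>\(.*\)<\/value>.*/\1/p; }" ./hdfs-site.demo.xml
}

get_prop dfs.nameservices   # prints: mycluster
```

The same helper works for any of the properties configured in this section.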
4. core-site.xml configuration
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://mycluster</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>ha1:2181,ha2:2181,ha3:2181</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/hadoop2</value>
<description>A base for other temporary directories.</description>
</property>
</configuration>
5. yarn-site.xml configuration
vim yarn-site.xml
<configuration> <!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>ha1</value>
</property>
</configuration>
6. mapred-site.xml configuration
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
7. slaves configuration:
vim slaves
ha2
ha3
ha4
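The slaves file can also drive the distribution step that follows, so the scp targets never drift out of sync with the worker list. A sketch with a DRY_RUN guard (on by default) and a local stand-in for etc/hadoop/slaves:

```shell
#!/bin/sh
# Distribute the Hadoop tree to every host listed in the slaves file.
# DRY_RUN=1 (the default here) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"
printf 'ha2\nha3\nha4\n' > ./slaves.demo   # stand-in for etc/hadoop/slaves

while read -r host; do
    [ -n "$host" ] || continue             # skip blank lines
    if [ "$DRY_RUN" = "1" ]; then
        echo "scp -r /usr/hadoop-2.7.3 root@$host:/usr/"
    else
        scp -r /usr/hadoop-2.7.3 "root@$host:/usr/"
    fi
done < ./slaves.demo
```

With three workers listed, the dry run prints three scp commands, matching the manual distribution shown in the next step.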
8. Distribute and start:
# Distribute
scp -r hadoop-2.7.3 root@ha2:/usr/
scp -r hadoop-2.7.3 root@ha3:/usr/
scp -r hadoop-2.7.3 root@ha4:/usr/
# Start the JournalNodes (on ha2, ha3, ha4)
cd sbin
./hadoop-daemon.sh start journalnode
[root@ha2 sbin]# jps
2646 JournalNode
2695 Jps
2287 QuorumPeerMain    # the ZooKeeper process
# Format the NameNode on ha1
cd bin
./hdfs namenode -format
# Format the ZooKeeper failover state (ZKFC)
./hdfs zkfc -formatZK
# Inspect /opt/hadoop2 to confirm the metadata was formatted correctly
# Sync the NameNode metadata to ha2:
1. Start the NameNode on ha1 first:
./hadoop-daemon.sh start namenode
2. Then on ha2:
./hdfs namenode -bootstrapStandby
9. Verify at http://192.168.1.116:50070/ — OK. Snapshot: "hadoop+zookeeper under HA installed ok"
# HDFS cluster verification
[root@ha1 sbin]# ./stop-dfs.sh
Stopping namenodes on [ha1 ha2]
ha2: no namenode to stop
ha1: stopping namenode
ha2: no datanode to stop
ha3: no datanode to stop
ha4: no datanode to stop
Stopping journal nodes [ha2 ha3 ha4]
ha3: stopping journalnode
ha4: stopping journalnode
ha2: stopping journalnode
Stopping ZK Failover Controllers on NN hosts [ha1 ha2]
ha2: no zkfc to stop
ha1: no zkfc to stop
[root@ha1 sbin]# ./start-dfs.sh
On ha1:
[root@ha1 sbin]# jps
3506 Jps
3140 NameNode
2255 QuorumPeerMain
3439 DFSZKFailoverController
[root@ha2 dfs]# jps
31264 NameNode
31556 DFSZKFailoverController
31638 Jps
31336 DataNode
31436 JournalNode
2287 QuorumPeerMain
[root@ha3 sbin]# jps
2290 QuorumPeerMain
31074 DataNode
31173 JournalNode
31242 Jps
[root@ha4 sbin]# jps
31153 Jps
30986 DataNode
31084 JournalNode
After configuring YARN and MapReduce, jps shows:
[root@ha1 sbin]# jps
4614 NameNode
4920 DFSZKFailoverController
5356 Jps
2255 QuorumPeerMain
5103 ResourceManager
[root@ha2 hadoop]# jps
32320 DataNode
32243 NameNode
32548 DFSZKFailoverController
32423 JournalNode
32713 NodeManager
32813 Jps
2287 QuorumPeerMain
[root@ha3 ~]# jps
2290 QuorumPeerMain
31495 DataNode
31736 NodeManager
31885 Jps
31598 JournalNode
[root@ha4 ~]# jps
31505 JournalNode
31641 NodeManager
31404 DataNode
31790 Jps
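The jps listings above can be turned into an automated health check: compare a node's running processes against the roles planned for it. Sample jps output is hard-coded below so the sketch runs anywhere; on a live node collect it with `jps | awk '{print $2}'`:

```shell
#!/bin/sh
# Check that every expected daemon appears in a node's jps output.
# Expected set below is ha1's (NameNode, ZooKeeper, ZKFC, ResourceManager).
actual="NameNode
QuorumPeerMain
DFSZKFailoverController
ResourceManager"

missing=""
for proc in NameNode QuorumPeerMain DFSZKFailoverController ResourceManager; do
    # -x: match the whole line, so "NameNode" does not match "SecondaryNameNode"
    printf '%s\n' "$actual" | grep -qx "$proc" || missing="$missing $proc"
done

if [ -z "$missing" ]; then
    echo "ha1: all expected processes up"
else
    echo "ha1: missing$missing"
fi
```

Running the same check on ha2, ha3, ha4 with their respective expected sets catches a daemon that silently failed to start.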