Copyright notice: this is an original article by the author and may not be reproduced without the author's permission.

I. Environment

1. Machines: one physical machine and one virtual machine

2. Linux version: [spark@S1PA11 ~]$ cat /etc/issue
Red Hat Enterprise Linux Server release 5.4 (Tikanga)

3. JDK: [spark@S1PA11 ~]$ java -version
java version "1.6.0_27"
Java(TM) SE Runtime Environment (build 1.6.0_27-b07)
Java HotSpot(TM) 64-Bit Server VM (build 20.2-b06, mixed mode)

4. Cluster nodes: two, S1PA11 (Master) and S1PA222 (Slave)

II. Preparation

1. Install the Java JDK, as described in my previous article: http://blog.csdn.net/stark_summer/article/details/42391531

2. Set up passwordless SSH login: http://blog.csdn.net/stark_summer/article/details/42393053

3. Download Hadoop: http://mirror.bit.edu.cn/apache/hadoop/common/

III. Installing Hadoop

The downloaded archive is hadoop-2.6.0.tar.gz.

1. Extract it: tar -xzvf hadoop-2.6.0.tar.gz

2. Move it to the target directory: [spark@S1PA11 software]$ mv hadoop-2.6.0 ~/opt/

3. Enter the hadoop directory: [spark@S1PA11 opt]$ cd hadoop-2.6.0/
[spark@S1PA11 hadoop-2.6.0]$ ls
bin  dfs  etc  include  input  lib  libexec  LICENSE.txt  logs  NOTICE.txt  README.txt  sbin  share  tmp

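Before configuring, create the following directories on the local file system: tmp, dfs/data, and dfs/name under the Hadoop installation directory (~/opt/hadoop-2.6.0 here, which is where the paths in the configuration files point). Seven configuration files are involved, all under etc/hadoop in the installation directory (abbreviated as ~/hadoop in the list that follows); you can edit them with gedit.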

~/hadoop/etc/hadoop/hadoop-env.sh
~/hadoop/etc/hadoop/yarn-env.sh
~/hadoop/etc/hadoop/slaves
~/hadoop/etc/hadoop/core-site.xml
~/hadoop/etc/hadoop/hdfs-site.xml
~/hadoop/etc/hadoop/mapred-site.xml
~/hadoop/etc/hadoop/yarn-site.xml
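
The directory creation mentioned above can be sketched as a small shell function. This is only an illustration: the function name make_hadoop_dirs is made up here, and the directory layout is taken from the paths used in core-site.xml and hdfs-site.xml in this article.

```shell
# Create the local directories that the configuration files point to.
# $1 = Hadoop installation directory (this article uses ~/opt/hadoop-2.6.0).
# The function name is hypothetical; it is not part of Hadoop.
make_hadoop_dirs() {
  hadoop_home=$1
  # -p creates missing parents and does not fail if a directory already exists
  mkdir -p "$hadoop_home/tmp" "$hadoop_home/dfs/name" "$hadoop_home/dfs/data"
}

# Example call (commented out so that sourcing this file has no side effects):
# make_hadoop_dirs "$HOME/opt/hadoop-2.6.0"
```

Because mkdir -p is idempotent, the function can be run again safely after the directories exist.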

4. Go to the Hadoop configuration directory

[spark@S1PA11 hadoop-2.6.0]$ cd etc/hadoop/
[spark@S1PA11 hadoop]$ ls
capacity-scheduler.xml  hadoop-env.sh               httpfs-env.sh            kms-env.sh            mapred-env.sh               ssl-client.xml.example
configuration.xsl       hadoop-metrics2.properties  httpfs-log4j.properties  kms-log4j.properties  mapred-queues.xml.template  ssl-server.xml.example
container-executor.cfg  hadoop-metrics.properties   httpfs-signature.secret  kms-site.xml          mapred-site.xml             yarn-env.cmd
core-site.xml           hadoop-policy.xml           httpfs-site.xml          log4j.properties      mapred-site.xml.template    yarn-env.sh
hadoop-env.cmd          hdfs-site.xml               kms-acls.xml             mapred-env.cmd        slaves                      yarn-site.xml

4.1. Configure hadoop-env.sh: set JAVA_HOME

# The java implementation to use.
export JAVA_HOME=/home/spark/opt/java/jdk1.6.0_37

4.2. Configure yarn-env.sh: set JAVA_HOME

# some Java parameters

export JAVA_HOME=/home/spark/opt/java/jdk1.6.0_37
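
Instead of editing both files by hand, the JAVA_HOME change from steps 4.1 and 4.2 could be scripted roughly as follows. This is a sketch under assumptions: the function name set_java_home is hypothetical, and it assumes each file already contains an "export JAVA_HOME=..." line (possibly commented out with "#"). If no such line exists, the file is left unchanged.

```shell
# Point JAVA_HOME at the given JDK in hadoop-env.sh and yarn-env.sh.
# $1 = Hadoop configuration directory, $2 = JDK installation path.
# Assumption: the JDK path contains neither "|" nor "&", which sed would
# treat specially in the replacement below.
set_java_home() {
  conf_dir=$1
  java_home=$2
  for f in hadoop-env.sh yarn-env.sh; do
    # Rewrite the existing (optionally commented-out) export line.
    # A temporary file is used so that this works with any POSIX sed.
    sed "s|^#\{0,1\}[[:space:]]*export JAVA_HOME=.*|export JAVA_HOME=$java_home|" \
      "$conf_dir/$f" > "$conf_dir/$f.tmp" \
      && cat "$conf_dir/$f.tmp" > "$conf_dir/$f" \
      && rm -f "$conf_dir/$f.tmp"
  done
}

# Example call with the paths used in this article (commented out):
# set_java_home "$HOME/opt/hadoop-2.6.0/etc/hadoop" /home/spark/opt/java/jdk1.6.0_37
```

Writing the result back with cat rather than mv keeps the original file's permissions.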

4.3. Configure the slaves file: add the slave node

S1PA222

4.4. Configure core-site.xml: add the core Hadoop settings (the HDFS port is 9000; the temporary directory is file:/home/spark/opt/hadoop-2.6.0/tmp)

<configuration>
 <property>
  <name>fs.defaultFS</name>
  <value>hdfs://S1PA11:9000</value>
 </property>

<property>
  <name>io.file.buffer.size</name>
  <value>131072</value>
 </property>
 <property>
  <name>hadoop.tmp.dir</name>
  <value>file:/home/spark/opt/hadoop-2.6.0/tmp</value>
  <description>A base for other temporary directories.</description>
 </property>
 <property>
  <name>hadoop.proxyuser.spark.hosts</name>
  <value>*</value>
 </property>
<property>
  <name>hadoop.proxyuser.spark.groups</name>
  <value>*</value>
 </property>
</configuration>

4.5. Configure hdfs-site.xml: add the HDFS settings (SecondaryNameNode HTTP address, NameNode and DataNode storage directories, replication factor, WebHDFS)

<configuration>
 <property>
  <name>dfs.namenode.secondary.http-address</name>
  <value>S1PA11:9001</value>
 </property>

<property>
   <name>dfs.namenode.name.dir</name>
   <value>file:/home/spark/opt/hadoop-2.6.0/dfs/name</value>
 </property>

<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/home/spark/opt/hadoop-2.6.0/dfs/data</value>
  </property>

<property>
  <name>dfs.replication</name>
  <value>3</value>
 </property>

<property>
  <name>dfs.webhdfs.enabled</name>
  <value>true</value>
 </property>

</configuration>
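
One thing worth checking at this point: dfs.replication is set to 3, but the slaves file from step 4.3 lists a single DataNode host, so HDFS cannot place all three replicas and blocks will be reported as under-replicated. A rough consistency check could look like this (the function name check_replication is hypothetical, and the replication value is passed in by hand rather than read from the XML):

```shell
# Compare a replication factor with the number of hosts in the slaves file.
# $1 = path to the slaves file, $2 = value of dfs.replication.
check_replication() {
  slaves_file=$1
  replication=$2
  # Count lines that are neither blank nor comments.
  nodes=$(grep -c -v -E '^[[:space:]]*(#|$)' "$slaves_file")
  if [ "$replication" -gt "$nodes" ]; then
    echo "WARN: dfs.replication=$replication but only $nodes DataNode host(s) listed"
    return 1
  fi
  echo "OK: dfs.replication=$replication, $nodes DataNode host(s) listed"
}

# Example call (commented out):
# check_replication "$HOME/opt/hadoop-2.6.0/etc/hadoop/slaves" 3
```

For a two-machine test cluster like this one, lowering dfs.replication to 1 or adding more DataNodes would both make the warning go away.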

4.6. Configure mapred-site.xml: add the MapReduce settings (run on the YARN framework; JobHistory server address and web address)

<configuration>
  <property>
   <name>mapreduce.framework.name</name>
   <value>yarn</value>
 </property>
 <property>
  <name>mapreduce.jobhistory.address</name>
  <value>S1PA11:10020</value>
 </property>
 <property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>S1PA11:19888</value>
 </property>
</configuration>

4.7. Configure yarn-site.xml: add the YARN settings

<configuration>
  <property>
   <name>yarn.nodemanager.aux-services</name>
   <value>mapreduce_shuffle</value>
  </property>
  <property>
   <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
   <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
   <name>yarn.resourcemanager.address</name>
   <value>S1PA11:8032</value>
  </property>
  <property>
   <name>yarn.resourcemanager.scheduler.address</name>
   <value>S1PA11:8030</value>
  </property>
  <property>
   <name>yarn.resourcemanager.resource-tracker.address</name>
   <value>S1PA11:8035</value>
  </property>
  <property>
   <name>yarn.resourcemanager.admin.address</name>
   <value>S1PA11:8033</value>
  </property>
  <property>
   <name>yarn.resourcemanager.webapp.address</name>
   <value>S1PA11:8088</value>
  </property>

</configuration>
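
After editing the four XML files, it can help to read a value back and confirm it is what you intended. The helper below is a minimal, text-based sketch: the name get_prop is made up, it is not an XML parser, and it assumes each name and value element sits on a single line, as in the files shown in this article.

```shell
# Print the value of one property from a Hadoop *-site.xml file.
# $1 = path to the XML file, $2 = property name.
get_prop() {
  awk -v want="$2" '
    /<name>/ {
      line = $0
      sub(/.*<name>[ \t]*/, "", line)
      sub(/[ \t]*<\/name>.*/, "", line)
      found = (line == want)      # remember whether this is the property we want
    }
    found && /<value>/ {
      line = $0
      sub(/.*<value>[ \t]*/, "", line)
      sub(/[ \t]*<\/value>.*/, "", line)
      print line
      exit
    }
  ' "$1"
}

# Example call (commented out):
# get_prop "$HOME/opt/hadoop-2.6.0/etc/hadoop/core-site.xml" fs.defaultFS
```

If the property is not present, the function prints nothing.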

5. Copy the configured hadoop directory to the slave machine

[spark@S1PA11 opt]$ scp -r hadoop-2.6.0/ spark@10.126.34.43:~/opt/
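
With more than one slave, the same copy has to be repeated per host. The sketch below only prints the scp commands, one per host in the slaves file, so they can be reviewed before running. The function name print_copy_commands is hypothetical, and it assumes the host names in the slaves file resolve from the master (the command above uses an IP address instead).

```shell
# Print (do not run) one scp command per host listed in the slaves file.
# $1 = slaves file, $2 = local directory to copy,
# $3 = remote user, $4 = remote target directory.
print_copy_commands() {
  # Strip comments and blank lines, then emit one command per remaining host.
  sed 's/#.*$//;/^[[:space:]]*$/d' "$1" | while read -r host; do
    printf 'scp -r %s %s@%s:%s\n' "$2" "$3" "$host" "$4"
  done
}

# Example call (commented out); quote the remote path so the local shell
# does not expand "~":
# print_copy_commands etc/hadoop/slaves hadoop-2.6.0/ spark '~/opt/'
```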

IV. Verification

1. Format the NameNode:

[spark@S1PA11 opt]$ cd hadoop-2.6.0/
[spark@S1PA11 hadoop-2.6.0]$ ls
bin  dfs  etc  include  input  lib  libexec  LICENSE.txt  logs  NOTICE.txt  README.txt  sbin  share  tmp
[spark@S1PA11 hadoop-2.6.0]$ ./bin/hdfs namenode -format

[spark@S1PA222 .ssh]$ cd ~/opt/hadoop-2.6.0
[spark@S1PA222 hadoop-2.6.0]$ ./bin/hdfs  namenode -format

2. Start HDFS:

[spark@S1PA11 hadoop-2.6.0]$ ./sbin/start-dfs.sh 
15/01/05 16:41:04 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [S1PA11]
S1PA11: starting namenode, logging to /home/spark/opt/hadoop-2.6.0/logs/hadoop-spark-namenode-S1PA11.out
S1PA222: starting datanode, logging to /home/spark/opt/hadoop-2.6.0/logs/hadoop-spark-datanode-S1PA222.out
Starting secondary namenodes [S1PA11]
S1PA11: starting secondarynamenode, logging to /home/spark/opt/hadoop-2.6.0/logs/hadoop-spark-secondarynamenode-S1PA11.out
15/01/05 16:41:21 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[spark@S1PA11 hadoop-2.6.0]$ jps
22230 Master
30889 Jps
22478 Worker
30498 NameNode
30733 SecondaryNameNode
19781 ResourceManager
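
Rather than reading the jps listing by eye, the check can be scripted. This is a small sketch, not part of Hadoop: the function name missing_daemons is made up, and it simply reports which of the expected daemon names do not appear in jps output read from standard input. On the master in this setup, NameNode and SecondaryNameNode are expected after start-dfs.sh; the DataNode runs on S1PA222.

```shell
# Report expected daemons that are absent from `jps` output.
# stdin = output of jps; arguments = daemon names that should be running.
missing_daemons() {
  jps_out=$(cat)
  for name in "$@"; do
    # jps prints "<pid> <name>"; compare the second field exactly so that
    # "NameNode" does not match "SecondaryNameNode".
    if ! printf '%s\n' "$jps_out" | awk -v n="$name" '$2 == n { ok = 1 } END { exit !ok }'; then
      echo "$name"
    fi
  done
}

# Example call on the master (commented out):
# jps | missing_daemons NameNode SecondaryNameNode
```

Empty output means every expected daemon was found.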

3. Stop HDFS:

[spark@S1PA11 hadoop-2.6.0]$ ./sbin/stop-dfs.sh 
15/01/05 16:40:28 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Stopping namenodes on [S1PA11]
S1PA11: stopping namenode
S1PA222: stopping datanode
Stopping secondary namenodes [S1PA11]
S1PA11: stopping secondarynamenode
15/01/05 16:40:48 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[spark@S1PA11 hadoop-2.6.0]$ jps
30336 Jps
22230 Master
22478 Worker
19781 ResourceManager

4. Start YARN:

[spark@S1PA11 hadoop-2.6.0]$ ./sbin/start-yarn.sh 
starting yarn daemons
starting resourcemanager, logging to /home/spark/opt/hadoop-2.6.0/logs/yarn-spark-resourcemanager-S1PA11.out
S1PA222: starting nodemanager, logging to /home/spark/opt/hadoop-2.6.0/logs/yarn-spark-nodemanager-S1PA222.out
[spark@S1PA11 hadoop-2.6.0]$ jps
31233 ResourceManager
22230 Master
22478 Worker
30498 NameNode
30733 SecondaryNameNode
31503 Jps

5. Stop YARN:

[spark@S1PA11 hadoop-2.6.0]$ ./sbin/stop-yarn.sh 
stopping yarn daemons
stopping resourcemanager
S1PA222: stopping nodemanager
no proxyserver to stop
[spark@S1PA11 hadoop-2.6.0]$ jps
31167 Jps
22230 Master
22478 Worker
30498 NameNode
30733 SecondaryNameNode

6. Check the cluster status:

[spark@S1PA11 hadoop-2.6.0]$ ./bin/hdfs dfsadmin -report
15/01/05 16:44:50 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Configured Capacity: 52101857280 (48.52 GB)
Present Capacity: 45749510144 (42.61 GB)
DFS Remaining: 45748686848 (42.61 GB)
DFS Used: 823296 (804 KB)
DFS Used%: 0.00%
Under replicated blocks: 10
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Live datanodes (1):

Name: 10.126.45.56:50010 (S1PA222)
Hostname: S1PA209
Decommission Status : Normal
Configured Capacity: 52101857280 (48.52 GB)
DFS Used: 823296 (804 KB)
Non DFS Used: 6352347136 (5.92 GB)
DFS Remaining: 45748686848 (42.61 GB)
DFS Used%: 0.00%
DFS Remaining%: 87.81%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Mon Jan 05 16:44:50 CST 2015
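
The two numbers most worth watching in this report are the count of live DataNodes and the under-replicated block count. The sketch below pulls both out of the report text read from standard input. The function name summarize_report is hypothetical, and the parsing relies on the line formats shown above, which may differ in other Hadoop versions.

```shell
# Summarize `hdfs dfsadmin -report` output read from stdin.
summarize_report() {
  awk '
    /^Live datanodes/           { n = $0; gsub(/[^0-9]/, "", n); live = n }
    /^Under replicated blocks:/ { under = $NF }
    END { printf "live_datanodes=%s under_replicated=%s\n", live, under }
  '
}

# Example call (commented out):
# ./bin/hdfs dfsadmin -report | summarize_report
```

Here the under-replicated blocks are most likely a consequence of dfs.replication being 3 while only one DataNode is live.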

7. Browse HDFS in the web UI: http://10.58.44.47:50070/
