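3. Hadoop Fully Distributed Setup

1. Fully distributed setup

  1. Configuration (keep the stock settings as local, copy them to full, and point the hadoop symlink at full)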

    #cd /soft/hadoop/etc/
    #mv hadoop local
    #cp -r local full
    #ln -s full hadoop
    #cd hadoop
  2. Edit the core-site.xml configuration file

    #vim core-site.xml
    [core-site.xml contents]
    <?xml version="1.0"?>
    <configuration>
    <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop-1</value>
    </property>
    </configuration>
  3. Edit the hdfs-site.xml configuration file

    #vim hdfs-site.xml
    [hdfs-site.xml contents]
    <?xml version="1.0"?>
    <configuration>
    <property>
    <name>dfs.replication</name>
    <value>3</value>
    </property>
    <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>hadoop-2:50090</value>
    </property>
    </configuration>
  4. Edit the mapred-site.xml configuration file

    #cp mapred-site.xml.template mapred-site.xml
    #vim mapred-site.xml
    [mapred-site.xml contents]
    <?xml version="1.0"?>
    <configuration>
    <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
    </property>
    </configuration>
  5. Edit the yarn-site.xml configuration file

    #vim yarn-site.xml
    [yarn-site.xml contents]
    <?xml version="1.0"?>
    <configuration>
    <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop-1</value>
    </property>
    <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
    </property>
    </configuration>
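
A stray or missing tag in any of the XML files edited above is easy to miss by eye and is likely to stop the daemons from loading their configuration. The sketch below is a crude balance check that can be run before syncing. It is not an XML parser, and the function names `tag_balanced` and `check_config` are hypothetical helpers introduced here for illustration.

```shell
# Crude tag-balance check for a Hadoop XML config file.
# This is not an XML parser: it only compares how many lines contain
# the opening tag with how many lines contain the closing tag.
tag_balanced() {
  local file=$1 tag=$2 opens closes
  # grep -c exits non-zero when the count is 0, hence "|| true".
  opens=$(grep -c "<$tag>" "$file" || true)
  closes=$(grep -c "</$tag>" "$file" || true)
  [ "$opens" -eq "$closes" ]
}

# Report every unbalanced tag among those commonly used in these files.
check_config() {
  local file=$1 tag status=0
  for tag in configuration property name value description; do
    if ! tag_balanced "$file" "$tag"; then
      echo "$file: unbalanced <$tag>"
      status=1
    fi
  done
  return $status
}

# Example call (path taken from step 1):
# check_config /soft/hadoop/etc/full/hdfs-site.xml
```

The check assumes one tag per line, as in the listings above. A real parser such as xmllint would be stricter where it is installed.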
  6. Edit the slaves configuration file

    #vim slaves
    [slaves]
    hadoop-2
    hadoop-3
    hadoop-4
    hadoop-5
  7. Sync to the other nodes

    #scp -r /soft/hadoop/etc/full hadoop-2:/soft/hadoop/etc/
    #scp -r /soft/hadoop/etc/full hadoop-3:/soft/hadoop/etc/
    #scp -r /soft/hadoop/etc/full hadoop-4:/soft/hadoop/etc/
    #scp -r /soft/hadoop/etc/full hadoop-5:/soft/hadoop/etc/
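    #ssh hadoop-2 ln -s /soft/hadoop/etc/full /soft/hadoop/etc/hadoop
    #ssh hadoop-3 ln -s /soft/hadoop/etc/full /soft/hadoop/etc/hadoop
    #ssh hadoop-4 ln -s /soft/hadoop/etc/full /soft/hadoop/etc/hadoop
    #ssh hadoop-5 ln -s /soft/hadoop/etc/full /soft/hadoop/etc/hadoop
    Note: if /soft/hadoop/etc/hadoop already exists as a real directory on a worker, rename it first (as in step 1); otherwise ln -s creates the link inside that directory instead of replacing it.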
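
The eight sync commands differ only in the hostname, so they can be generated from the worker list. The following is a minimal dry-run sketch: it only prints the commands so they can be reviewed first. The variable names and the function `print_sync_commands` are hypothetical; the hosts and paths are the ones used in the steps above.

```shell
# Worker hostnames, matching the slaves file from step 6.
WORKERS="hadoop-2 hadoop-3 hadoop-4 hadoop-5"
# Parent directory of the configuration directories.
CONF_DIR=/soft/hadoop/etc

# Print the copy and symlink commands for every worker (dry run only).
print_sync_commands() {
  local host
  for host in $WORKERS; do
    echo "scp -r $CONF_DIR/full $host:$CONF_DIR/"
    echo "ssh $host ln -s $CONF_DIR/full $CONF_DIR/hadoop"
  done
}

print_sync_commands
```

Once the printed commands look right, piping the output to `sh` would execute them. This assumes hadoop-1 can reach each worker over SSH without a password prompt.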
  8. Format the HDFS distributed file system

    #hdfs namenode -format
  9. Start the services

    [root@hadoop-1 hadoop]# start-all.sh
    This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
    Starting namenodes on [hadoop-1]
    hadoop-1: starting namenode, logging to /soft/hadoop-2.7.3/logs/hadoop-root-namenode-hadoop-1.out
    hadoop-2: starting datanode, logging to /soft/hadoop-2.7.3/logs/hadoop-root-datanode-hadoop-2.out
    hadoop-3: starting datanode, logging to /soft/hadoop-2.7.3/logs/hadoop-root-datanode-hadoop-3.out
    hadoop-4: starting datanode, logging to /soft/hadoop-2.7.3/logs/hadoop-root-datanode-hadoop-4.out
    hadoop-5: starting datanode, logging to /soft/hadoop-2.7.3/logs/hadoop-root-datanode-hadoop-5.out
    Starting secondary namenodes [hadoop-2]
    hadoop-2: starting secondarynamenode, logging to /soft/hadoop-2.7.3/logs/hadoop-root-secondarynamenode-hadoop-2.out
    starting yarn daemons
    starting resourcemanager, logging to /soft/hadoop-2.7.3/logs/yarn-root-resourcemanager-hadoop-1.out
    hadoop-3: starting nodemanager, logging to /soft/hadoop-2.7.3/logs/yarn-root-nodemanager-hadoop-3.out
    hadoop-4: starting nodemanager, logging to /soft/hadoop-2.7.3/logs/yarn-root-nodemanager-hadoop-4.out
    hadoop-2: starting nodemanager, logging to /soft/hadoop-2.7.3/logs/yarn-root-nodemanager-hadoop-2.out
    hadoop-5: starting nodemanager, logging to /soft/hadoop-2.7.3/logs/yarn-root-nodemanager-hadoop-5.out
  10. Check that the services are running

    [root@hadoop-1 hadoop]# jps
    16358 ResourceManager
    12807 NodeManager
    16011 NameNode
    16204 SecondaryNameNode
    16623 Jps
    hadoop-5 | SUCCESS | rc=0 >>
    16993 NodeManager
    16884 DataNode
    17205 Jps
    hadoop-1 | SUCCESS | rc=0 >>
    28520 ResourceManager
    28235 NameNode
    29003 Jps
    hadoop-2 | SUCCESS | rc=0 >>
    17780 Jps
    17349 DataNode
    17529 NodeManager
    17453 SecondaryNameNode
    hadoop-4 | SUCCESS | rc=0 >>
    17105 Jps
    16875 NodeManager
    16766 DataNode
    hadoop-3 | SUCCESS | rc=0 >>
    16769 DataNode
    17121 Jps
    16878 NodeManager
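
With the configuration above, hadoop-1 should run NameNode and ResourceManager, hadoop-2 should run DataNode, NodeManager and SecondaryNameNode, and hadoop-3 through hadoop-5 should run DataNode and NodeManager. Comparing jps listings by eye gets tedious, so here is a small sketch that checks a listing against the expected daemon names. The function `check_daemons` is a hypothetical helper, and the sample listing is copied from the hadoop-2 output above.

```shell
# Check that every expected daemon name appears in a jps listing.
# Usage: check_daemons "<jps output>" Daemon1 Daemon2 ...
check_daemons() {
  local listing=$1; shift
  local daemon missing=0
  for daemon in "$@"; do
    # Column 2 of jps output is the class name; -x requires a whole-line
    # match, so "NameNode" does not match "SecondaryNameNode".
    if ! printf '%s\n' "$listing" | awk '{print $2}' | grep -qx "$daemon"; then
      echo "missing: $daemon"
      missing=1
    fi
  done
  return $missing
}

# Sample listing for hadoop-2, taken from the output above.
sample='17780 Jps
17349 DataNode
17529 NodeManager
17453 SecondaryNameNode'

check_daemons "$sample" DataNode NodeManager SecondaryNameNode && echo "hadoop-2 OK"
```

On a live cluster the first argument could be captured with something like `"$(ssh hadoop-2 jps)"`, assuming jps is on the PATH of the remote non-interactive shell.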
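  11. Check the web UIs

    In Hadoop 2.7.3 the NameNode UI is served at http://hadoop-1:50070 by default, and the ResourceManager UI at http://hadoop-1:8088.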

2. Word count on the fully distributed cluster

  1. Prepare input for the word count example bundled with Hadoop

    #mkdir /input
    #cd /input/
    #echo "hello world" > file1.txt
    #echo "hello world" > file2.txt
    #echo "hello world" > file3.txt
    #echo "hello hadoop" > file4.txt
    #echo "hello hadoop" > file5.txt
    #echo "hello mapreduce" > file6.txt
    #echo "hello mapreduce" > file7.txt
    #hdfs dfs -mkdir /input
    #hdfs dfs -ls /
    #hadoop fs -ls /
    #hadoop fs -put /input/* /input
    #hadoop fs -ls /input
  2. Run the job

    [root@hadoop-1 ~]# hadoop jar /soft/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount /input/ /out/put
    17/05/14 23:01:07 INFO client.RMProxy: Connecting to ResourceManager at hadoop-1/10.31.133.19:8032
    17/05/14 23:01:09 INFO input.FileInputFormat: Total input paths to process : 7
    17/05/14 23:01:10 INFO mapreduce.JobSubmitter: number of splits:7
    17/05/14 23:01:10 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1494773207391_0001
    17/05/14 23:01:10 INFO impl.YarnClientImpl: Submitted application application_1494773207391_0001
    17/05/14 23:01:11 INFO mapreduce.Job: The url to track the job: http://hadoop-1:8088/proxy/application_1494773207391_0001/
    17/05/14 23:01:11 INFO mapreduce.Job: Running job: job_1494773207391_0001
    17/05/14 23:01:23 INFO mapreduce.Job: Job job_1494773207391_0001 running in uber mode : false
    17/05/14 23:01:23 INFO mapreduce.Job: map 0% reduce 0%
    17/05/14 23:01:56 INFO mapreduce.Job: map 43% reduce 0%
    17/05/14 23:01:57 INFO mapreduce.Job: map 100% reduce 0%
    17/05/14 23:02:04 INFO mapreduce.Job: map 100% reduce 100%
    17/05/14 23:02:05 INFO mapreduce.Job: Job job_1494773207391_0001 completed successfully
    17/05/14 23:02:05 INFO mapreduce.Job: Counters: 50
    File System Counters
    FILE: Number of bytes read=184
    FILE: Number of bytes written=949365
    FILE: Number of read operations=0
    FILE: Number of large read operations=0
    FILE: Number of write operations=0
    HDFS: Number of bytes read=801
    HDFS: Number of bytes written=37
    HDFS: Number of read operations=24
    HDFS: Number of large read operations=0
    HDFS: Number of write operations=2
    Job Counters
    Killed map tasks=1
    Launched map tasks=7
    Launched reduce tasks=1
    Data-local map tasks=7
    Total time spent by all maps in occupied slots (ms)=216289
    Total time spent by all reduces in occupied slots (ms)=4827
    Total time spent by all map tasks (ms)=216289
    Total time spent by all reduce tasks (ms)=4827
    Total vcore-milliseconds taken by all map tasks=216289
    Total vcore-milliseconds taken by all reduce tasks=4827
    Total megabyte-milliseconds taken by all map tasks=221479936
    Total megabyte-milliseconds taken by all reduce tasks=4942848
    Map-Reduce Framework
    Map input records=7
    Map output records=14
    Map output bytes=150
    Map output materialized bytes=220
    Input split bytes=707
    Combine input records=14
    Combine output records=14
    Reduce input groups=4
    Reduce shuffle bytes=220
    Reduce input records=14
    Reduce output records=4
    Spilled Records=28
    Shuffled Maps =7
    Failed Shuffles=0
    Merged Map outputs=7
    GC time elapsed (ms)=3616
    CPU time spent (ms)=3970
    Physical memory (bytes) snapshot=1528823808
    Virtual memory (bytes) snapshot=16635846656
    Total committed heap usage (bytes)=977825792
    Shuffle Errors
    BAD_ID=0
    CONNECTION=0
    IO_ERROR=0
    WRONG_LENGTH=0
    WRONG_MAP=0
    WRONG_REDUCE=0
    File Input Format Counters
    Bytes Read=94
    File Output Format Counters
    Bytes Written=37
  3. View the results

    [root@hadoop-1 ~]# hadoop fs -ls /out/put
    Found 2 items
    -rw-r--r-- 3 root supergroup 0 2017-05-14 23:02 /out/put/_SUCCESS
    -rw-r--r-- 3 root supergroup 37 2017-05-14 23:02 /out/put/part-r-00000
    [root@hadoop-1 ~]# hadoop fs -cat /out/put/part-r-00000
    hadoop 2
    hello 7
    mapreduce 2
    world 3
    [root@hadoop-1 ~]#
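
As a sanity check, the same counts can be reproduced locally without the cluster. The sketch below feeds the seven sample lines from step 1 through a plain shell pipeline that splits on spaces, sorts, and counts. The function names `sample_lines` and `count_words` are hypothetical; the pipeline only mirrors what the bundled wordcount example computes for this input and is not how MapReduce works internally.

```shell
# The seven one-line input files created in step 1, as a single stream.
sample_lines() {
  printf '%s\n' \
    "hello world" "hello world" "hello world" \
    "hello hadoop" "hello hadoop" \
    "hello mapreduce" "hello mapreduce"
}

# Split on spaces, sort in byte order, count duplicates, and print
# "word<TAB>count" in the same layout as part-r-00000.
count_words() {
  tr -s ' ' '\n' | LC_ALL=C sort | uniq -c | awk '{print $2 "\t" $1}'
}

sample_lines | count_words
```

The four lines it prints should agree with the contents of part-r-00000 shown above.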
