目的:

初步感受一下hadoop mapreduce

环境:

hadoop 2.6.4

1 准备输入文件

paper.txt 内容一般为英文文章,随便弄点什么进去
hadoop@ssmaster:~$ hadoop fs -mkdir /input
hadoop@ssmaster:~$ ls
Desktop Documents Downloads examples.desktop hadoop-2.6..tar.gz Music paper.txt Pictures Public Templates Videos
hadoop@ssmaster:~$ hadoop fs -put paper.txt /input
hadoop@ssmaster:~$ hadoop fs -ls /input
Found items
-rw-r--r-- hadoop supergroup -- : /input/paper.txt

注意:输出目录/output 不用提前创建,程序会自动做这一步

2  执行

hadoop@ssmaster:~$ hadoop jar /opt/hadoop-2.6./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6..jar  wordcount /input /output
// :: INFO client.RMProxy: Connecting to ResourceManager at ssmaster/192.168.249.144:
// :: INFO input.FileInputFormat: Total input paths to process :
// :: INFO mapreduce.JobSubmitter: number of splits:
// :: INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1477208120905_0001
// :: INFO impl.YarnClientImpl: Submitted application application_1477208120905_0001
// :: INFO mapreduce.Job: The url to track the job: http://ssmaster:8088/proxy/application_1477208120905_0001/
// :: INFO mapreduce.Job: Running job: job_1477208120905_0001
// :: INFO mapreduce.Job: Job job_1477208120905_0001 running in uber mode : false
// :: INFO mapreduce.Job: map % reduce %

6/10/23 00:51:38 INFO mapreduce.Job: map 0% reduce 0%
16/10/23 00:52:17 INFO mapreduce.Job: map 100% reduce 0%
16/10/23 00:52:39 INFO mapreduce.Job: map 100% reduce 100%
16/10/23 00:52:41 INFO mapreduce.Job: Job job_1477208120905_0001 completed successfully
16/10/23 00:52:41 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=2061
FILE: Number of bytes written=217797
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=1863
HDFS: Number of bytes written=1425
HDFS: Number of read operations=6
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=1
Launched reduce tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=35792
Total time spent by all reduces in occupied slots (ms)=18540
Total time spent by all map tasks (ms)=35792
Total time spent by all reduce tasks (ms)=18540
Total vcore-milliseconds taken by all map tasks=35792
Total vcore-milliseconds taken by all reduce tasks=18540
Total megabyte-milliseconds taken by all map tasks=36651008
Total megabyte-milliseconds taken by all reduce tasks=18984960
Map-Reduce Framework
Map input records=11
Map output records=303
Map output bytes=2969
Map output materialized bytes=2061
Input split bytes=101
Combine input records=303
Combine output records=158
Reduce input groups=158
Reduce shuffle bytes=2061
Reduce input records=158
Reduce output records=158
Spilled Records=316
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=1093
CPU time spent (ms)=5550
Physical memory (bytes) snapshot=442781696
Virtual memory (bytes) snapshot=1448112128
Total committed heap usage (bytes)=276299776
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=1762
File Output Format Counters
Bytes Written=1425

可以从Web监控页面查看执行状态

http://ssmaster:8088/cluster

Cluster Metrics

Apps Submitted Apps Pending Apps Running Apps Completed Containers Running Memory Used Memory Total Memory Reserved VCores Used VCores Total VCores Reserved Active Nodes Decommissioned Nodes Lost Nodes Unhealthy Nodes Rebooted Nodes
1 0 1 0 2 3 GB 8 GB 0 B 2 8 0 1 0 0 0 0
Show 
20
40
60
80
100

entries

Search: 
 
ID
User
Name
Application Type
Queue
StartTime
FinishTime
State
FinalStatus
Progress
Tracking UI
Blacklisted Nodes
application_1477208120905_0001 hadoop word count MAPREDUCE default Sun, 23 Oct 2016 07:51:13 GMT N/A RUNNING UNDEFINED   ApplicationMaster 0

3 查看输出结果

hadoop@ssmaster:~$ hadoop fs -ls /output
Found items
-rw-r--r-- hadoop supergroup -- : /output/_SUCCESS
-rw-r--r-- hadoop supergroup -- : /output/part-r-
hadoop@ssmaster:~$ hadoop fs -cat /output/part-r-
Always
Dream
There
a
all
along
always
...........
...........

Q 总结

非常简单,没什么感觉。

后续:

  • 自己编写mapreduce wordcount 程序
  • 搭建一个纯分布式,同样的程序处理一个大文件,观察一下速度

[b0004] Hadoop 版hello word mapreduce wordcount 运行的更多相关文章

  1. [b0013] Hadoop 版hello word mapreduce wordcount 运行(三)

    目的: 不用任何IDE,直接在linux 下输入代码.调试执行 环境: Linux  Ubuntu Hadoop 2.6.4 相关: [b0012] Hadoop 版hello word mapred ...

  2. [b0012] Hadoop 版hello word mapreduce wordcount 运行(二)

    目的: 学习Hadoop mapreduce 开发环境eclipse windows下的搭建 环境: Winows 7 64 eclipse 直接连接hadoop运行的环境已经搭建好,结果输出到ecl ...

  3. Hadoop版Helloworld之wordcount运行示例

    1.编写一个统计单词数量的java程序,并命名为wordcount.java,代码如下: import java.io.IOException; import java.util.StringToke ...

  4. Hadoop集群WordCount运行详解(转)

    原文链接:Hadoop集群(第6期)_WordCount运行详解 1.MapReduce理论简介 1.1 MapReduce编程模型 MapReduce采用"分而治之"的思想,把对 ...

  5. hadoop 2.7.3本地环境运行官方wordcount

    hadoop 2.7.3本地环境运行官方wordcount 基本环境: 系统:win7 虚机环境:virtualBox 虚机:centos 7 hadoop版本:2.7.3 本次先以独立模式(本地模式 ...

  6. Hadoop学习历程(四、运行一个真正的MapReduce程序)

    上次的程序只是操作文件系统,本次运行一个真正的MapReduce程序. 运行的是官方提供的例子程序wordcount,这个例子类似其他程序的hello world. 1. 首先确认启动的正常:运行 s ...

  7. (三)配置Hadoop1.2.1+eclipse(Juno版)开发环境,并运行WordCount程序

    配置Hadoop1.2.1+eclipse(Juno版)开发环境,并运行WordCount程序 一.   需求部分 在ubuntu上用Eclipse IDE进行hadoop相关的开发,需要在Eclip ...

  8. hadoop笔记之MapReduce的运行流程

    MapReduce的运行流程 MapReduce的运行流程 基本概念: Job&Task:要完成一个作业(Job),就要分成很多个Task,Task又分为MapTask和ReduceTask ...

  9. Hadoop(六)MapReduce的入门与运行原理

    一 MapReduce入门 1.1 MapReduce定义 Mapreduce是一个分布式运算程序的编程框架,是用户开发“基于hadoop的数据分析应用”的核心框架: Mapreduce核心功能是将用 ...

随机推荐

  1. WinForms项目升级.Net Core 3.0之后,没有WinForm设计器?

    目录 .NET Conf 2019 Window Forms 设计器 .NET Conf 2019 2019 9.23-9.25召开了 .NET Conf 2019 大会,大会宣布了 .Net Cor ...

  2. spark SQL、RDD、Dataframe总结

  3. 【Web前端】VS code 快捷键tips 【陆续记录】

    学习资料为:chuanzhiheima培训资料,freecodecamp300小时基础前端,<精编CSS第三版>,<Node.js 开发指南>(BYvoid编著,淘宝买的二手书 ...

  4. URL.createObjectURL()的使用方法

    URL.createObjectURL() 静态方法会创建一个 DOMString,其中包含一个表示参数中给出的对象的URL.这个 URL 的生命周期和创建它的窗口中的 document 绑定.这个新 ...

  5. flutter_inner_drawer 使用

    版本: flutter_inner_drawer: "^0.2.2" github:  https://github.com/Dn-a/flutter_inner_drawer 这 ...

  6. ios中仿蚂蚁森林动画效果

    参考链接:https://www.jianshu.com/p/0ba9d80f8e77 demo下载链接:https://gitee.com/ovix/TreeWithRandomFruitBtn

  7. groupid公司名,artifactid模块名,version版本

  8. windows下切换Python运行环境。

    1.首先确保你的系统里已经安装了Conda,打开命令行窗口,执行命令:conda --version 2.查看你的系统当前已有的Python环境,执行命令:conda info --envs,从图中我 ...

  9. 页面一刷新让文本框自动获取焦点-- 和自定义v-focus指令

    <body> <div id="app"> <input type="text" value="" id=&q ...

  10. SpringCloud学习笔记(二、SpringCloud Config)

    目录: 配置中心简介 SpringCloud Config服务端 SpringCloud Config客户端 动态配置属性bean 一些补充(源码分析):Spring事件监听.健康检查health() ...