Platform: Hadoop 2.6.3

Mode: fully distributed

1. Prepare the text to be counted. This example uses a short passage saved as eg.txt:

The Project Gutenberg EBook of War and Peace, by Leo Tolstoy

This eBook is for the use of anyone anywhere at no cost and with almost
no restrictions whatsoever. You may copy it, give it away or re-use it
under the terms of the Project Gutenberg License included with this
eBook or online at www.gutenberg.org Title: War and Peace Author: Leo Tolstoy

2. Upload the text to HDFS from the shell

hadoop fs -put ./eg.txt /

3. Change to the share/hadoop/mapreduce directory and start the word count job

hadoop jar hadoop-mapreduce-examples-2.6.3.jar wordcount /eg.txt /out
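As a rough mental model of what the wordcount example computes, the sketch below imitates its map and reduce phases in plain Python. It is not the Hadoop implementation: `map_phase` and `reduce_phase` are names made up for illustration, and `str.split()` is assumed to be a close enough stand-in for the whitespace tokenization the example job applies to each input line.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every whitespace-separated token."""
    for line in lines:
        for token in line.split():
            yield token, 1


def reduce_phase(pairs):
    """Group the pairs by word and sum the counts for each word."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    # The job's output is ordered by key; sorted() mimics that for ASCII words.
    return dict(sorted(totals.items()))


# Two lines taken from eg.txt, used here as a small stand-in for the full file.
sample = [
    "The Project Gutenberg EBook of War and Peace, by Leo Tolstoy",
    "under the terms of the Project Gutenberg License included with this",
]
counts = reduce_phase(map_phase(sample))
print(counts["the"], counts["Project"], counts["Peace,"])
```

Because tokens are split on whitespace only, punctuation stays attached to the word and case is preserved, so "Peace," and "Peace", or "The" and "the", are counted as different words.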

4. The job prints the following output to the console:

16/03/29 21:30:26 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
16/03/29 21:30:30 INFO input.FileInputFormat: Total input paths to process : 1
16/03/29 21:30:30 INFO mapreduce.JobSubmitter: number of splits:1
16/03/29 21:30:31 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1459233715960_0004
16/03/29 21:30:31 INFO impl.YarnClientImpl: Submitted application application_1459233715960_0004
16/03/29 21:30:31 INFO mapreduce.Job: The url to track the job: http://m1.fredlab.org:8088/proxy/application_1459233715960_0004/
16/03/29 21:30:31 INFO mapreduce.Job: Running job: job_1459233715960_0004
16/03/29 21:30:47 INFO mapreduce.Job: Job job_1459233715960_0004 running in uber mode : false
16/03/29 21:30:47 INFO mapreduce.Job: map 0% reduce 0%
16/03/29 21:30:57 INFO mapreduce.Job: map 100% reduce 0%
16/03/29 21:31:09 INFO mapreduce.Job: map 100% reduce 100%
16/03/29 21:31:10 INFO mapreduce.Job: Job job_1459233715960_0004 completed successfully
16/03/29 21:31:11 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=547
        FILE: Number of bytes written=213761
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=453
        HDFS: Number of bytes written=361
        HDFS: Number of read operations=6
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=7594
        Total time spent by all reduces in occupied slots (ms)=9087
        Total time spent by all map tasks (ms)=7594
        Total time spent by all reduce tasks (ms)=9087
        Total vcore-milliseconds taken by all map tasks=7594
        Total vcore-milliseconds taken by all reduce tasks=9087
        Total megabyte-milliseconds taken by all map tasks=7776256
        Total megabyte-milliseconds taken by all reduce tasks=9305088
    Map-Reduce Framework
        Map input records=11
        Map output records=62
        Map output bytes=598
        Map output materialized bytes=547
        Input split bytes=98
        Combine input records=62
        Combine output records=45
        Reduce input groups=45
        Reduce shuffle bytes=547
        Reduce input records=45
        Reduce output records=45
        Spilled Records=90
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=310
        CPU time spent (ms)=2010
        Physical memory (bytes) snapshot=273182720
        Virtual memory (bytes) snapshot=4122341376
        Total committed heap usage (bytes)=137498624
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=355
    File Output Format Counters
        Bytes Written=361
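The record counters can be cross-checked against each other. The sketch below parses a few of the `name=value` lines copied from the log above and checks that the combiner consumed every map output record and that the reducer received exactly what the combiner produced. `parse_counters` is a hypothetical helper written for this note, not part of Hadoop.

```python
def parse_counters(log_text):
    """Turn 'name=value' lines into a dict of integer counters."""
    counters = {}
    for line in log_text.splitlines():
        name, sep, value = line.strip().rpartition("=")
        if sep and value.isdigit():
            counters[name.strip()] = int(value)
    return counters


# Lines copied from the "Map-Reduce Framework" group of the log.
log_excerpt = """
Map output records=62
Combine input records=62
Combine output records=45
Reduce input groups=45
Reduce input records=45
Reduce output records=45
"""

c = parse_counters(log_excerpt)
# Every record emitted by the mapper passes through the combiner.
print(c["Map output records"] == c["Combine input records"])
# With a single map task, the combiner leaves one record per distinct word,
# so the reducer reads and writes the same number of records.
print(c["Combine output records"] == c["Reduce input records"] == c["Reduce output records"])
```

Read this way, the log says that the 62 tokens in eg.txt collapsed into 45 distinct words, which is also the number of lines in the result file.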

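5. The results are written to the /out directory on HDFS. If the job succeeds, a _SUCCESS marker file appears in that directory and the word counts are stored in the file part-r-00000.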

Author:	1
EBook 1
Gutenberg 2
Leo 2
License 1
Peace 1
Peace, 1
Project 2
The 1
This 1
Title: 1
Tolstoy 2
War 2
You 1
almost 1
and 3
anyone 1
anywhere 1
at 2
away 1
by 1
copy 1
cost 1
eBook 2
for 1
give 1
included 1
is 1
it 2
it, 1
may 1
no 2
of 3
online 1
or 2
re-use 1
restrictions 1
terms 1
the 3
this 1
under 1
use 1
whatsoever. 1
with 2
www.gutenberg.org 1
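To work with the result outside Hadoop, the file can be copied to the local disk (for example with `hadoop fs -get /out/part-r-00000 .`) and parsed. The sketch below assumes each line has the form word, tab, count, which is the default layout of the job's text output. `parse_output` and `top_words` are illustrative names, and the inline sample merely stands in for the real file.

```python
def parse_output(lines):
    """Parse 'word<TAB>count' lines into (word, count) tuples."""
    result = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        word, count = line.rsplit("\t", 1)
        result.append((word, int(count)))
    return result


def top_words(pairs, n):
    """Return the n most frequent words, ties broken alphabetically."""
    return sorted(pairs, key=lambda p: (-p[1], p[0]))[:n]


# A few lines from the result above, standing in for part-r-00000.
sample_lines = ["Gutenberg\t2\n", "and\t3\n", "of\t3\n", "the\t3\n", "use\t1\n"]
print(top_words(parse_output(sample_lines), 3))
```

For the real file, `parse_output` can be given an open file object instead of the sample list, since it only iterates over lines.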

More books in plain-text format can be downloaded from http://www.gutenberg.org/ for further practice.
