Wordcount -- MapReduce example -- Reducer

Reducer receives (key, values) pairs and aggregate values to a desired format, then write produced (key, value) pairs back into HDFS.

E.g.

Input: (term, [1, 1, 1, 1])

Output: (term, 4)

Reducer Class Prototype:

Reducer<Text, IntWritable, Text, IntWritable>

// Text:: INPUT_KEY

// IntWritable:: INPUT_VALUE

// Text:: OUTPUT_KEY

// IntWritable:: OUTPUT_VALUE

Reduce Method for Mapper

Method header

public void reduce(Text key, Iterable<IntWritable> values,

                     Context context

                     ) throws IOException, InterruptedException

// Text key:: Declare data type of input key;

// Iterable<IntWritable> values:: Declare data type of input values; (Note: Received values from mapper should be in a list)

// Context context:: Declare data type of output. Context is often used for output data collection.

Aggregate Values

// Iterate through all the values wrt the key:

int sum = 0;

for (IntWritable val : values) {

  sum += val.get();

}

Building (key, value) pairs

// Convert built-in int into IntWritable

result.set(sum);

// build (key, value) pair into Context and emit:

context.write(key, result);

Reducer Class Summary

Reducer class produces Reducer.Context object and serialize obtained (key, value) pair into HDFS.

Overview of Reducer Class

public static class IntSumReducer

     extends Reducer<Text,IntWritable,Text,IntWritable> {

  private IntWritable result = new IntWritable();

  public void reduce(Text key, Iterable<IntWritable> values,

                     Context context

                     ) throws IOException, InterruptedException {

    int sum = 0;

    for (IntWritable val : values) {

      sum += val.get();

    }

    result.set(sum);

    context.write(key, result);

  }

}

Written with StackEdit.

Wordcount -- MapReduce example -- Reducer的更多相关文章

Wordcount -- MapReduce example -- Mapper
Mapper maps input key/value pairs into intermediate key/value pairs. E.g. Input: (docID, doc) Output ...
Hadoop（十七）之MapReduce作业配置与Mapper和Reducer类
前言前面一篇博文写的是Combiner优化MapReduce执行,也就是使用Combiner在map端执行减少reduce端的计算量. 一.作业的默认配置 MapReduce程序的默认配置 1)概述 ...
MapReduce原理与设计思想
简单解释 MapReduce 算法一个有趣的例子你想数出一摞牌中有多少张黑桃.直观方式是一张一张检查并且数出有多少张是黑桃? MapReduce方法则是: 给在座的所有玩家中分配这摞牌让每个玩家 ...
hadoop2.2.0的WordCount程序
package com.my.hadoop.mapreduce.wordcount; import java.io.IOException; import org.apache.hadoop.conf ...
MapReduce极简教程
一个有趣的例子你想数出一摞牌中有多少张黑桃.直观方式是一张一张检查并且数出有多少张是黑桃? MapReduce方法则是: 给在座的所有玩家中分配这摞牌让每个玩家数自己手中的牌有几张是黑桃,然后 ...
大数据 --> MapReduce原理与设计思想
MapReduce原理与设计思想简单解释 MapReduce 算法一个有趣的例子:你想数出一摞牌中有多少张黑桃.直观方式是一张一张检查并且数出有多少张是黑桃? MapReduce方法则是: 给在座 ...
如何在Windows下面运行hadoop的MapReduce程序
在Windows下面运行hadoop的MapReduce程序的方法: 1.下载hadoop的安装包,这里使用的是"hadoop-2.6.4.tar.gz": 2.将安装包直接解压到 ...
【Hadoop】Hadoop mr wordcount基础
1.基本概念 2.Mapper package com.ares.hadoop.mr.wordcount; import java.io.IOException; import java.util.S ...
转：MapReduce原理与设计思想
转自:http://www.cnblogs.com/wuyudong/p/mapreduce-principle.html 简单解释 MapReduce 算法一个有趣的例子你想数出一摞牌中有多少张 ...

随机推荐

apache Rewrite配置（转）
1.Rewrite规则简介: Rewirte主要的功能就是实现URL的跳转,它的正则表达式是基于Perl语言.可基于服务器级的(httpd.conf)和目录级的 (.htaccess)两种方式.如果要 ...
LeetCode21.合并两个有序链表 JavaScript
将两个有序链表合并为一个新的有序链表并返回.新链表是通过拼接给定的两个链表的所有节点组成的. 示例: 输入:1->2->4, 1->3->4 输出:1->1->2- ...
逍遥云天微信小程序开发之获取用户手机号码——使用简单php接口demo进行加密数据解密
后边要做一个微信小程序,并要能获取用户微信绑定的手机号码.而小程序开发文档上边提供的获取手机号码的接口(getPhoneNumber())返回的是密文,需要服务器端进行解密,但是官方提供的开发文档一如 ...
c#聊聊文件数据库kv
现在有很多KV嵌入式存储,或者已经增加的.leveldb,RaptorDB等,都是相对比较好的存储.基本存储,一般配置.大概在6w/s左右.当然还有缓存等设置问题.这些基本是字符串和int的存储,对于 ...
echarts 点击方法总结，点任意一点获取点击数据，举例说明:在多图联动中点击绘制标线
关于点击(包括左击,双击,右击等)echarts图形任意一点,获取相关的图形数据,尤其是多图,我想部分人遇到这个问题一直很头大.下面我用举例说明,如何在多图联动基础上,我们点击任意一个图上任意一点,在 ...
node.js使用Sequelize 操作mysql
Sequelize就是Node上的ORM框架 ,相当于java端的Hibernate 是一个基于 promise 的 Node.js ORM, 目前支持 Postgres, MySQL, SQLite ...
Ubuntu12.04下zxing源码编译
1.下载zxing源码 git clone https://github.com/15903016222/zxing-cpp.git 2.安装依赖工具cmake sudo apt-get instal ...
CentOS6升级Python2.6到3.7，错误处理[No module named '_ctypes']
CentOS6升级Python2.6到3.7,错误处理[No module named '_ctypes'] 因开发需要,在CentOS 6 服务器将Python2进行升级到Python3.由于工作中 ...
python在lxml中使用XPath语法进行#数据解析
在lxml中使用XPath语法: 获取所有li标签: from lxml import etree html = etree.parse('hello.html') print type(html) ...
python 自定义函数表达式拟合求系数
https://docs.scipy.org/doc/scipy/reference/tutorial/integrate.html https://docs.scipy.org/doc/scipy/ ...

Wordcount -- MapReduce example -- Reducer

Reducer Class Prototype:

Reduce Method for Mapper

Method header

Aggregate Values

Building (key, value) pairs

Reducer Class Summary

Overview of Reducer Class

Wordcount -- MapReduce example -- Reducer的更多相关文章

随机推荐

热门专题