Hadoop in Practice: max, min, and avg Statistics with MapReduce (Part 6)

1. Data preparation (one "name,age" record per line):

Mike,35

Steven,40

Ken,28

Cindy,32

2. Expected result

Max	40

Min	28

Avg	33 (the sum 135 divided by the count 4 is 33.75; integer division truncates it to 33)
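
The expected values can be checked locally before running the job. The following is a minimal sketch in plain Java with no Hadoop dependency; the class name `ExpectedResultCheck` and its `summarize` helper are illustrative names that do not appear in the original code. It also shows that the exact mean is 33.75, which integer division truncates to 33.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;

public class ExpectedResultCheck {

    // Parses "name,age" records and collects min, max, sum, and count of the ages.
    public static IntSummaryStatistics summarize(List<String> records) {
        return records.stream()
                .mapToInt(line -> Integer.parseInt(line.split(",")[1].trim()))
                .summaryStatistics();
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("Mike,35", "Steven,40", "Ken,28", "Cindy,32");
        IntSummaryStatistics stats = summarize(records);

        System.out.println("Max\t" + stats.getMax());                       // 40
        System.out.println("Min\t" + stats.getMin());                       // 28
        // Integer division, the same kind of sum / count expression the reducer uses.
        System.out.println("Avg\t" + (stats.getSum() / stats.getCount()));  // 33
        System.out.println("Exact mean\t" + stats.getAverage());            // 33.75
    }
}
```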

3. The MapReduce code is as follows:

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class AgeMapReduce {

    // Emits every age under one fixed key so that a single reduce call sees all of them.
    public static class WordCountMapper extends Mapper<Object, Text, Text, Text> {

        private final Text nameKey = new Text("only you");
        private final Text ageValue = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            // Each whitespace-separated token is expected to be one "name,age" record.
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                String[] nameAndAge = itr.nextToken().split(",");
                if (nameAndAge.length < 2) {
                    continue; // skip malformed records
                }
                ageValue.set(nameAndAge[1]);
                context.write(nameKey, ageValue);
            }
        }
    }

    // Computes max, min, and the integer average of all ages received under the fixed key.
    public static class WordCountReduce extends Reducer<Text, Text, Text, Text> {

        @Override
        public void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            // Local accumulators, so no state leaks between reduce calls.
            int min = Integer.MAX_VALUE;
            int max = Integer.MIN_VALUE;
            int sum = 0;
            int count = 0;
            for (Text tmpAge : values) {
                int age = Integer.parseInt(tmpAge.toString().trim());
                if (age < min) {
                    min = age;
                }
                if (age > max) {
                    max = age;
                }
                sum += age;
                count++;
            }
            // The original wrote min under "Max" and max under "Min"; the labels now match.
            context.write(new Text("Max"), new Text(String.valueOf(max)));
            context.write(new Text("Min"), new Text(String.valueOf(min)));
            // Integer division: for the sample data 135 / 4 is written as 33.
            context.write(new Text("Avg"), new Text(String.valueOf(sum / count)));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: AgeMapReduce <in> <out>");
            System.exit(2);
        }

        // Job.getInstance replaces the Job constructor, which is deprecated in Hadoop 2.x.
        Job job = Job.getInstance(conf, "Age max min avg");
        job.setJarByClass(AgeMapReduce.class);
        job.setMapperClass(WordCountMapper.class);
        // No combiner is set: the reducer emits Max/Min/Avg records,
        // which cannot be fed back into reduce() as input ages.
        job.setReducerClass(WordCountReduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        // Input path, e.g. user/joe/wordcount/input
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        // Output path, e.g. user/joe/wordcount/output (must not exist yet)
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
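
To follow the data flow without a cluster, the job can be imitated in memory. This is a rough sketch in plain Java, not Hadoop code: `LocalFlowSketch`, `mapPhase`, and `reducePhase` are hypothetical names, and a `TreeMap` stands in for the shuffle step that groups values by key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class LocalFlowSketch {

    // Map phase: every "name,age" record is emitted under the same fixed key.
    public static Map<String, List<Integer>> mapPhase(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>(); // stands in for the shuffle
        for (String line : lines) {
            String[] nameAndAge = line.split(",");
            int age = Integer.parseInt(nameAndAge[1].trim());
            grouped.computeIfAbsent("only you", k -> new ArrayList<>()).add(age);
        }
        return grouped;
    }

    // Reduce phase: one pass over all ages that arrived under the key.
    public static Map<String, Integer> reducePhase(List<Integer> ages) {
        if (ages.isEmpty()) {
            throw new IllegalArgumentException("at least one age is required");
        }
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        int sum = 0;
        for (int age : ages) {
            min = Math.min(min, age);
            max = Math.max(max, age);
            sum += age;
        }
        Map<String, Integer> out = new LinkedHashMap<>(); // keeps Max, Min, Avg order
        out.put("Max", max);
        out.put("Min", min);
        out.put("Avg", sum / ages.size()); // integer division
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("Mike,35", "Steven,40", "Ken,28", "Cindy,32");
        Map<String, List<Integer>> grouped = mapPhase(lines);
        // Only one key exists, so reducePhase runs exactly once.
        for (List<Integer> ages : grouped.values()) {
            reducePhase(ages).forEach((k, v) -> System.out.println(k + "\t" + v));
        }
    }
}
```

Because the map phase produces a single key, the grouped map has one entry and the reduce step runs once over all four ages.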

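4. Note

Because the result does not depend on the key, the map phase only needs to emit every record under one fixed key. All values then arrive in the same reduce call, which can compute the three statistics in a single pass.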
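
One consequence of the single fixed key is that every value travels to one reduce call. Pre-aggregating on the map side needs care, because an average of averages is generally not the overall average. A common workaround is to carry (min, max, sum, count) as a partial result that can be merged in any order. The sketch below illustrates the idea in plain Java; `AgePartial` and its methods are hypothetical names that are not part of the original job, and wiring this into Hadoop would additionally require a Writable value type.

```java
public class AgePartial {

    public final int min;
    public final int max;
    public final long sum;
    public final long count;

    public AgePartial(int min, int max, long sum, long count) {
        this.min = min;
        this.max = max;
        this.sum = sum;
        this.count = count;
    }

    // A partial result that covers exactly one age.
    public static AgePartial of(int age) {
        return new AgePartial(age, age, age, 1);
    }

    // Merging is associative and commutative, so partials can be combined in any order.
    public AgePartial merge(AgePartial other) {
        return new AgePartial(
                Math.min(min, other.min),
                Math.max(max, other.max),
                sum + other.sum,
                count + other.count);
    }

    // Integer average, computed only once from the merged sum and count.
    public long average() {
        return sum / count;
    }

    public static void main(String[] args) {
        // Pretend the four records were processed by two separate map tasks.
        AgePartial splitA = AgePartial.of(35).merge(AgePartial.of(40));
        AgePartial splitB = AgePartial.of(28).merge(AgePartial.of(32));
        AgePartial total = splitA.merge(splitB);

        System.out.println("Max\t" + total.max);        // 40
        System.out.println("Min\t" + total.min);        // 28
        System.out.println("Avg\t" + total.average());  // 33
    }
}
```

Averaging the two per-split integer averages (37 and 30) would also give 33 here, but only because both splits have the same size; with uneven splits the result can differ, which is why the sum and count are carried separately.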
