MapReduce Unit Test

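I used to skip unit tests when writing MR programs in Java; even when debugging, I would just run the program on a small data set. Yesterday at work I hit a bug that I could not track down no matter how long I looked, and I suspected an error in the business logic. With no other option, I wrote a unit test, stepped through it, and found the problem almost immediately. The lesson: it pays to be diligent at work.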

 import static org.junit.Assert.assertEquals;

 import java.io.IOException;
 import java.util.ArrayList;
 import java.util.List;

 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.io.LongWritable;
 import org.apache.hadoop.io.Text;
 import org.apache.hadoop.mrunit.mapreduce.MapDriver;
 import org.apache.hadoop.mrunit.mapreduce.MapReduceDriver;
 import org.apache.hadoop.mrunit.mapreduce.ReduceDriver;
 import org.apache.hadoop.mrunit.types.Pair;
 import org.junit.Before;
 import org.junit.Test;

 import com.wanda.predict.GenerateCustomerNatureFeature.NatureFeatureMappper;
 import com.wanda.predict.GenerateCustomerNatureFeature.NatureReducer;
 import com.wanda.predict.pojo.Settings;

 /**
  * Template for MapReduce unit tests. Requires JUnit (junit.jar),
  * mrunit.jar, and mockito.jar on the classpath.
  */
 public class MapperReducerUnitTest {

     // Job configuration, set up as in a normal MR program. Here it mainly carries
     // the parameters the job reads; leave performance tuning out of unit tests.
     Configuration conf = new Configuration();

     // Test driver for the Mapper class
     MapDriver<LongWritable, Text, Text, Text> mapDriver;

     // Test driver for the Reducer class
     ReduceDriver<Text, Text, Text, Text> reduceDriver;

     // Test driver for the full flow, with the Mapper and Reducer chained together
     MapReduceDriver<LongWritable, Text, Text, Text, Text, Text> mapReduceDriver;

     @Before
     public void setUp() {
         // The mapper and reducer under test
         NatureFeatureMappper mapper = new NatureFeatureMappper();
         NatureReducer reducer = new NatureReducer();

         // Register the mapper under test
         mapDriver = MapDriver.newMapDriver(mapper);

         // Register the reducer under test
         reduceDriver = ReduceDriver.newReduceDriver(reducer);

         // Register the mapper and the reducer together
         mapReduceDriver = MapReduceDriver.newMapReduceDriver(mapper, reducer);

         // Configuration parameters for the test
         conf.setInt(Settings.TestDataSize.getName(), 1);
         conf.setInt(Settings.TrainDataSize.getName(), 6);

         // Drivers are independent of each other: set conf on each driver that needs it
         reduceDriver.setConfiguration(conf);
         mapReduceDriver.setConfiguration(conf);
     }

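     @Test
     public void testMapper() throws IOException {
         mapDriver.withInput(new LongWritable(), new Text("map input"));
         mapDriver.withOutput(new Text("expected key"), new Text("expected value"));

         // Print the actual output.
         // Some MRUnit versions do not allow a driver to be run twice; if runTest()
         // below then throws IllegalStateException, remove this run() call.
         List<Pair<Text, Text>> result = mapDriver.run();
         for (Pair<Text, Text> kv : result) {
             System.out.println("mapper : " + kv.getFirst());
             System.out.println("mapper : " + kv.getSecond());
         }

         // Run the test case: compare the actual output against the expected output
         mapDriver.runTest();
     }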

     @Test
     public void testReducer() throws IOException {
         List<Text> values = new ArrayList<Text>();
         values.add(new Text("input value"));
         reduceDriver.withInput(new Text("input key"), values);
         reduceDriver.withOutput(new Text("expected key"), new Text("expected value"));
         reduceDriver.runTest();
     }

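     @Test
     public void testMapperReducer() throws IOException {
         mapReduceDriver.withInput(new LongWritable(), new Text("input"));
         mapReduceDriver.withOutput(new Text("expected key"), new Text("expected value"));

         // Print the actual output.
         // Some MRUnit versions do not allow a driver to be run twice; if runTest()
         // below then throws IllegalStateException, remove this run() call.
         List<Pair<Text, Text>> list = mapReduceDriver.run();
         System.out.println("mapreducedriver size:" + list.size());
         for (Pair<Text, Text> kv : list) {
             System.out.println(kv.getFirst());
             System.out.println(kv.getSecond());
         }

         // Run the test case: compare the actual output against the expected output
         mapReduceDriver.runTest();
     }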

     @Test
     public void testMapperCount() throws IOException {
         mapDriver.withInput(new LongWritable(), new Text("input"));
         mapDriver.withOutput(new Text("expected key"), new Text("expected value"));
         mapDriver.runTest();

         // Check that the counter incremented in the mapper matches the expected value
         assertEquals("Expected 1 counter increment", 1,
                 mapDriver.getCounters().findCounter("data", "suc").getValue());
     }
 }
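
To make it clearer what the chained map-and-reduce test exercises, here is a minimal plain-Java sketch of the map, shuffle, reduce flow. It is only an illustration of the idea and is not how MRUnit is implemented: the class `MiniMapReduce` and its word-count style `map` and `reduce` methods are hypothetical stand-ins, not the `NatureFeatureMappper` and `NatureReducer` from the template above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiConsumer;

/** Plain-Java illustration of map -> shuffle -> reduce; hypothetical, not MRUnit code. */
class MiniMapReduce {

    /** Stand-in mapper: emits (word, "1") for each whitespace-separated token. */
    static void map(String line, BiConsumer<String, String> emit) {
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                emit.accept(word, "1");
            }
        }
    }

    /** Stand-in reducer: emits (key, number of values seen for that key). */
    static void reduce(String key, List<String> values, BiConsumer<String, String> emit) {
        emit.accept(key, String.valueOf(values.size()));
    }

    /** Maps every line, groups the emitted values by key in sorted order, then reduces each group. */
    static List<String[]> run(List<String> lines) {
        // "Shuffle": a TreeMap keeps the keys sorted, one value list per key.
        Map<String, List<String>> shuffled = new TreeMap<>();
        for (String line : lines) {
            map(line, (k, v) -> shuffled.computeIfAbsent(k, unused -> new ArrayList<>()).add(v));
        }
        List<String[]> output = new ArrayList<>();
        for (Map.Entry<String, List<String>> group : shuffled.entrySet()) {
            reduce(group.getKey(), group.getValue(), (k, v) -> output.add(new String[] {k, v}));
        }
        return output;
    }

    public static void main(String[] args) {
        // Print each reduced (key, value) pair, tab-separated.
        for (String[] kv : run(Arrays.asList("a b a", "b c"))) {
            System.out.println(kv[0] + "\t" + kv[1]);
        }
    }
}
```

For the two input lines above, the sketch yields the pairs (a, 2), (b, 2), (c, 1). A mapper-only test checks just the pairs emitted by `map`, a reducer-only test feeds one key with a hand-built value list to `reduce`, and the chained test checks the final pairs after grouping.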
