1. WordCount (counting words): a classic example of the MapReduce programming model. 1.1 Description: given a series of words/data items, output the number of occurrences of each one. 1.2 Sample a is b is not c b is a is not d 1.3 Output a: b: c: d: not: 1.4 Solution /** * Licensed under the Apache License, Version 2.0 (the "License"); * you may n…
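The WordCount task described above can be sketched in plain Python to show the three MapReduce stages. This is only a minimal single-process illustration, not the Hadoop solution the post goes on to list; the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical, and the sample is assumed to be one whitespace-separated line.

```python
from collections import defaultdict


def map_phase(line):
    # Map: emit a (word, 1) pair for every whitespace-separated token.
    return [(word, 1) for word in line.split()]


def shuffle(pairs):
    # Shuffle: group all emitted values by their key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    # Reduce: sum the counts collected for one word.
    return key, sum(values)


def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# The sample input from the post, assumed here to be a single line.
counts = word_count(["a is b is not c b is a is not d"])
for word in sorted(counts):
    print(f"{word}: {counts[word]}")
```

In a real cluster the map and reduce calls would run on different machines and the shuffle would be done by the framework; here all three run in one process so the data flow is easy to follow.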
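Inverted index (the result shown with key and value swapped). I. Requirement: below are users' music playback records; determine which users have played each song. tom LittleApple jack YesterdayOnceMore Rose MyHeartWillGoOn jack LittleApple John MyHeartWillGoOn kissinger LittleApple kissinger YesterdayOnceMore II. Final result Littl…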
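The key/value swap behind the inverted index can be sketched in a few lines of Python. This is a hedged illustration rather than the post's own solution: the names `map_record` and `build_inverted_index` are hypothetical, and the records are assumed to be one `user song` pair per line, since the excerpt shows them flattened.

```python
from collections import defaultdict

# Playback records from the post, assumed to be one "user song" pair per line.
RECORDS = """\
tom LittleApple
jack YesterdayOnceMore
Rose MyHeartWillGoOn
jack LittleApple
John MyHeartWillGoOn
kissinger LittleApple
kissinger YesterdayOnceMore
"""


def map_record(line):
    # Map: swap the pair so the song becomes the key and the user the value.
    user, song = line.split()
    return song, user


def build_inverted_index(lines):
    # Reduce: collect the users of each song, keeping first-seen order
    # and skipping a user who played the same song more than once.
    index = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue
        song, user = map_record(line)
        if user not in index[song]:
            index[song].append(user)
    return dict(index)


index = build_inverted_index(RECORDS.splitlines())
for song, users in index.items():
    print(song, ", ".join(users))
```

The exact output layout of the original post is cut off, so the print format here is only one possible choice.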
public class TopK extends Configured implements Tool { public static class TopKMapper extends Mapper<Object, Text, NullWritable, LongWritable> { public static final int K = 100; private TreeMap<Long, Long> tm = new TreeMap<Long, Long>(…
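The excerpt above is cut off, but a mapper that holds a bounded sorted structure of size `K` suggests the usual two-stage top-K pattern: each mapper keeps only its local K largest values, and a single reducer merges those candidates. The Python sketch below illustrates that pattern under this assumption; `local_top_k` and `merge_top_k` are hypothetical names, `K` is set to 3 instead of the excerpt's 100 to keep the example small, and unlike a `TreeMap` this version does not collapse duplicate values.

```python
import heapq

K = 3  # the Java excerpt uses K = 100; a small value keeps the example readable


def local_top_k(values, k=K):
    # "Mapper" side: keep only the k largest values seen in one input split.
    return heapq.nlargest(k, values)


def merge_top_k(partials, k=K):
    # "Reducer" side: merge every mapper's candidates and keep the global top k.
    return heapq.nlargest(k, (v for part in partials for v in part))


# Three hypothetical input splits, each handled by its own mapper.
splits = [[5, 1, 9, 3], [7, 2, 8], [4, 6]]
result = merge_top_k(local_top_k(split) for split in splits)
print(result)
```

Because each mapper forwards at most K values, the reducer sees at most K times the number of mappers, regardless of the input size.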
public class GroupComparator implements RawComparator<MyBinaryKey> { @Override public int compare(MyBinaryKey o1, MyBinaryKey o2) { return o1.toString().compareTo(o2.toString()); } @Override public int compare(byte[] b1, int s1, int l1, byte[] b2, i…
Implementing MapReduce in Python. The following uses the MapReduce pattern to implement a simple program that counts how often each word appears in a log: from functools import reduce from multiprocessing import Pool from collections import Counter def read_inputs(file): for line in file: line = line.strip() yield line.split() def count(file_name…