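I first ran into MapReduce (MR) while getting started on Lab 1 of MIT 6.824. I recently reread the original paper and came away with a few new thoughts, so here are some brief notes.

MapReduce is a key component of Google's distributed systems. The core of the paper is the parallelism abstracted behind the map/reduce operations: once the interface is given, programmers writing applications no longer need to deal much with the underlying mechanisms. In essence, MR is only a distributed parallel computing framework; the distributed infrastructure it actually relies on underneath is still GFS.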

MapReduce: Simplified Data Processing on Large Clusters

Programming Model

K/V pairs

original task --map--> intermediate K/V pairs --shuffle--> --reduce--> result

shuffle: group all intermediate values that share the same key into one list

the result can be a multiset of values; the merging is done by the user-supplied reduce function

Here is a word count app from MIT 6.824 (Go):

// The map function is called once for each file of input. The first
// argument is the name of the input file, and the second is the
// file's complete contents. You should ignore the input file name,
// and look only at the contents argument. The return value is a slice
// of key/value pairs.
func Map(filename string, contents string) []mr.KeyValue {
	// function to detect word separators.
	ff := func(r rune) bool { return !unicode.IsLetter(r) }

	// split contents into an array of words.
	words := strings.FieldsFunc(contents, ff)

	kva := []mr.KeyValue{}
	for _, w := range words {
		kv := mr.KeyValue{w, "1"}
		kva = append(kva, kv)
	}
	return kva
}

// The reduce function is called once for each key generated by the
// map tasks, with a list of all the values created for that key by
// any map task.
func Reduce(key string, values []string) string {
	// return the number of occurrences of this word.
	return strconv.Itoa(len(values))
}

map (k1,v1) → list(k2,v2)

reduce (k2,list(v2)) → list(v2)
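To make the map → shuffle → reduce pipeline concrete, here is a minimal single-process sketch. `RunSequential` is a hypothetical driver written for these notes (it is not part of the lab or the paper), and `KeyValue` is redeclared locally in place of the lab's `mr.KeyValue` so the example compiles on its own.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
	"unicode"
)

// KeyValue stands in for the lab's mr.KeyValue (assumption: same two string fields).
type KeyValue struct {
	Key   string
	Value string
}

// Map emits (word, "1") for every word in contents.
func Map(filename string, contents string) []KeyValue {
	words := strings.FieldsFunc(contents, func(r rune) bool { return !unicode.IsLetter(r) })
	kva := make([]KeyValue, 0, len(words))
	for _, w := range words {
		kva = append(kva, KeyValue{Key: w, Value: "1"})
	}
	return kva
}

// Reduce returns the number of values collected for key.
func Reduce(key string, values []string) string {
	return strconv.Itoa(len(values))
}

// RunSequential is a hypothetical driver: map every input, sort the
// intermediate pairs by key (the "shuffle"), then call Reduce once per
// distinct key with the list of all its values.
func RunSequential(inputs map[string]string) map[string]string {
	var intermediate []KeyValue
	for name, contents := range inputs {
		intermediate = append(intermediate, Map(name, contents)...)
	}
	sort.Slice(intermediate, func(i, j int) bool {
		return intermediate[i].Key < intermediate[j].Key
	})

	out := make(map[string]string)
	for i := 0; i < len(intermediate); {
		j := i
		var values []string
		// Collect the run of pairs that share intermediate[i].Key.
		for j < len(intermediate) && intermediate[j].Key == intermediate[i].Key {
			values = append(values, intermediate[j].Value)
			j++
		}
		out[intermediate[i].Key] = Reduce(intermediate[i].Key, values)
		i = j
	}
	return out
}

func main() {
	result := RunSequential(map[string]string{
		"a.txt": "the quick brown fox",
		"b.txt": "the lazy dog and the fox",
	})
	fmt.Println(result["the"], result["fox"], result["dog"]) // prints: 3 2 1
}
```

Sorting and then scanning runs of equal keys is one simple way to build the list(v2) for each k2; a real deployment does the same grouping per reduce partition rather than over the whole data set.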


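Workflow

  1. Split the input files into M pieces.
  2. The master (coordinator) assigns map tasks and reduce tasks to idle workers.
  3. Each map worker runs Map on its split and generates intermediate k/v pairs.
  4. The intermediate k/v pairs are written to the map worker's local disk, divided into R partitions; their locations are reported to the master.
  5. Each reduce worker reads its partition from the map machines via RPC, sorts it by key, and runs Reduce once per key. (The sort happens on the reduce worker, not on the master.)
  6. The final output is written to GFS.
  7. When all tasks have completed, the MapReduce call returns to the user program.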
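Step 4 above (dividing the intermediate pairs into R partitions) can be sketched as follows. The `ihash` helper follows the FNV-based hash used in the 6.824 lab skeleton as I remember it; `Partition` and the local `KeyValue` type are names made up for this illustration.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// KeyValue stands in for the lab's mr.KeyValue (assumption).
type KeyValue struct {
	Key   string
	Value string
}

// ihash maps a key to a non-negative int using 32-bit FNV-1a.
func ihash(key string) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() & 0x7fffffff)
}

// Partition splits one map task's output into nReduce buckets so that
// every pair with the same key lands in the same bucket (hash(key) mod R).
func Partition(kva []KeyValue, nReduce int) [][]KeyValue {
	buckets := make([][]KeyValue, nReduce)
	for _, kv := range kva {
		r := ihash(kv.Key) % nReduce
		buckets[r] = append(buckets[r], kv)
	}
	return buckets
}

func main() {
	kva := []KeyValue{{"apple", "1"}, {"pear", "1"}, {"apple", "1"}, {"fig", "1"}}
	for r, b := range Partition(kva, 3) {
		fmt.Printf("reduce task %d gets %d pairs\n", r, len(b))
	}
}
```

Because the bucket depends only on the key, reduce task r can fetch bucket r from every map worker and be sure it has seen all values for its keys.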

After successful completion, the output of the mapreduce execution is available in the R output files (one per reduce task, with file names as specified by the user). Typically, users do not need to combine these R output files into one file – they often pass these files as input to another MapReduce call, or use them from another distributed application that is able to deal with input that is partitioned into multiple files.

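In practical programming, atomic operations matter here (regardless of C++, Go, or any other language): the paper relies on atomic commits of task output, e.g. a reduce task writes to a temporary file and atomically renames it to the final output file once it is done.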

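Fault Tolerance

heartbeat: the master pings every worker periodically; a worker that does not respond within a certain amount of time is marked as failed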

Completed map tasks are re-executed on a failure because their output is stored on the local disk(s) of the failed machine and is therefore inaccessible. Completed reduce tasks do not need to be re-executed since their output is stored in a global file system.

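A reduce task that has not finished reading from a failed map machine does not start over: its RPC to that machine fails, and it reads the data from the worker that re-executes the map task instead.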
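The rescheduling rule quoted above can be sketched as a small state update on the master. The `Task` struct, the state names, and `HandleWorkerFailure` are hypothetical; the paper describes the behavior, not this data layout.

```go
package main

import "fmt"

type State int

const (
	Idle State = iota
	InProgress
	Completed
)

func (s State) String() string {
	return [...]string{"idle", "in-progress", "completed"}[s]
}

type Kind int

const (
	MapTask Kind = iota
	ReduceTask
)

// Task is the master's bookkeeping record for one map or reduce task.
type Task struct {
	Kind   Kind
	State  State
	Worker string // worker the task is (or was) assigned to
}

// HandleWorkerFailure resets tasks after a worker stops answering pings:
//   - in-progress tasks of either kind go back to idle;
//   - completed map tasks go back to idle too, because their output
//     lived on the failed machine's local disk;
//   - completed reduce tasks are left alone, since their output is
//     already in the global file system.
func HandleWorkerFailure(tasks []Task, failed string) {
	for i := range tasks {
		t := &tasks[i]
		if t.Worker != failed {
			continue
		}
		if t.State == InProgress || (t.Kind == MapTask && t.State == Completed) {
			t.State = Idle
			t.Worker = ""
		}
	}
}

func main() {
	tasks := []Task{
		{MapTask, Completed, "w1"},
		{ReduceTask, Completed, "w1"},
		{ReduceTask, InProgress, "w1"},
		{MapTask, Completed, "w2"},
	}
	HandleWorkerFailure(tasks, "w1")
	for i, t := range tasks {
		fmt.Printf("task %d: %v\n", i, t.State)
	}
}
```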

Task Granularity

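M, R >> number of worker machines -> better dynamic load balancing, and faster recovery when a worker fails

in practice: M > R; R is kept smaller to limit the number of final output files (one per reduce task)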
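A small back-of-the-envelope sketch of these sizes. The 64 MB split size and the "about one byte of master state per map/reduce task pair" figure come from the paper; the 10 TiB input is a made-up number, and both helper functions are hypothetical.

```go
package main

import "fmt"

// NumMapTasks returns ceil(inputBytes / splitBytes), i.e. M when every
// map task handles one split.
func NumMapTasks(inputBytes, splitBytes int64) int64 {
	return (inputBytes + splitBytes - 1) / splitBytes
}

// MasterStateBytes estimates the O(M*R) part of the master's memory,
// given a per-pair cost in bytes.
func MasterStateBytes(m, r, bytesPerPair int64) int64 {
	return m * r * bytesPerPair
}

func main() {
	const MiB = int64(1) << 20
	const TiB = int64(1) << 40

	// Hypothetical job: 10 TiB of input in 64 MiB splits.
	fmt.Println(NumMapTasks(10*TiB, 64*MiB)) // prints: 163840

	// The paper's example sizes: M = 200,000 and R = 5,000.
	fmt.Println(MasterStateBytes(200000, 5000, 1)) // prints: 1000000000
}
```

So even with M and R far larger than the 2,000 worker machines in the paper's example, the master's per-pair state stays around 1 GB.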

Backup Tasks

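To deal with stragglers: When a MapReduce operation is close to completion, the master schedules backup executions of the remaining in-progress tasks. The task is marked as completed whenever either the primary or the backup execution completes.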
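One way to sketch the "close to completion" check is shown below. The paper does not say how close is close enough, so the `threshold` parameter is an assumption, as are the `TaskState` type and the `BackupCandidates` name.

```go
package main

import "fmt"

type TaskState int

const (
	Idle TaskState = iota
	InProgress
	Completed
)

// BackupCandidates returns the indices of in-progress tasks once at
// least the given fraction of all tasks has completed. Before that
// point it returns nil, meaning no backups are scheduled yet.
func BackupCandidates(states []TaskState, threshold float64) []int {
	if len(states) == 0 {
		return nil
	}
	done := 0
	for _, s := range states {
		if s == Completed {
			done++
		}
	}
	if float64(done)/float64(len(states)) < threshold {
		return nil
	}
	var out []int
	for i, s := range states {
		if s == InProgress {
			out = append(out, i)
		}
	}
	return out
}

func main() {
	states := []TaskState{
		Completed, Completed, Completed, Completed, Completed,
		Completed, Completed, InProgress, Completed, Completed,
	}
	// 9 of 10 tasks are done, which clears an assumed 80% threshold.
	fmt.Println(BackupCandidates(states, 0.8)) // prints: [7]
}
```

Each returned index would be handed to a second worker; whichever copy finishes first marks the task completed and the other result is ignored.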


Refinements

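  1. Partitioning function: the user specifies the number of output files R; the default partitioner is hash(key) mod R.
  2. Ordering guarantees: within a partition, the intermediate k/v pairs are processed in increasing key order (pre-sorted).
  3. Combiner: partially merge pairs with the same key on the map side, e.g. in the word count map task, sum the counts of identical words before they are sent over the network.
  4. Input/output types: different readers (by line, by offset, ...); a database or in-memory data can also serve as the source.
  5. Side effects: auxiliary files written by a task should be made atomic and idempotent by the application writer.
  6. Bugs in user code: optionally skip the bad records that cause deterministic crashes.
  7. Sequential execution on a local machine (helps debugging).
  8. Status information: display the task status (command line or GUI) -> data collection and analysis.
  9. Counters (provided by the library) for counting occurrences of various events.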
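A minimal sketch of the combiner from item 3 for word count. One caveat: once a combiner pre-sums the counts, the reduce function has to add up the values instead of taking `len(values)` as the lab's `Reduce` does. `Combine`, `ReduceSum`, and the local `KeyValue` type are illustrative names.

```go
package main

import (
	"fmt"
	"strconv"
)

// KeyValue stands in for the lab's mr.KeyValue (assumption).
type KeyValue struct {
	Key   string
	Value string
}

// Combine runs on the map side: it merges pairs with the same key by
// summing their counts, keeping keys in first-seen order.
func Combine(kva []KeyValue) []KeyValue {
	counts := make(map[string]int)
	var order []string
	for _, kv := range kva {
		n, err := strconv.Atoi(kv.Value)
		if err != nil {
			continue // skip malformed counts in this sketch
		}
		if _, seen := counts[kv.Key]; !seen {
			order = append(order, kv.Key)
		}
		counts[kv.Key] += n
	}
	out := make([]KeyValue, 0, len(order))
	for _, k := range order {
		out = append(out, KeyValue{Key: k, Value: strconv.Itoa(counts[k])})
	}
	return out
}

// ReduceSum adds up partial counts, so it works with or without a combiner.
func ReduceSum(key string, values []string) string {
	total := 0
	for _, v := range values {
		if n, err := strconv.Atoi(v); err == nil {
			total += n
		}
	}
	return strconv.Itoa(total)
}

func main() {
	mapOutput := []KeyValue{{"the", "1"}, {"fox", "1"}, {"the", "1"}, {"the", "1"}}
	fmt.Println(Combine(mapOutput)) // prints: [{the 3} {fox 1}]

	// Partial counts for "the" from two different map tasks.
	fmt.Println(ReduceSum("the", []string{"3", "2"})) // prints: 5
}
```

Here four intermediate pairs shrink to two before anything crosses the network, which is the whole point of the refinement.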

Discussion

In some cases we could also store the intermediate files in the global file system. Then a map task would not need to be re-executed when its machine shuts down, and the reduce-side RPC read would simply become a GFS read.

Bandwidth can be the essential bottleneck; a p2p network could decrease the master's I/O pressure.

Open source version of MapReduce: Hadoop (Yahoo/Apache)

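Middle step: shuffle. Is Reduce run once per key, or more than once?

Reduce is called once per key either way; what changes is how much data it receives. So do we need a combiner?

  • with a combiner: each map task pre-merges its values for a key, so the reducer receives at most one value per key from each map task
  • without a combiner: the reducer receives every raw value for the key from every map task

But where does the combiner run?

map worker's local disk?: yes, it combines locally on the map side

master?: no, the master does not do any computation

before reduce?: that is the shuffle

Shuffle and combine can be the bottleneck.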

task failure: restart the task

node failure: restart its tasks on a new node; completed map tasks from that node are re-run as well, because their intermediate files are lost
