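I first came across MapReduce (mr) in lab 1 of MIT 6.824 when I was just getting started. I recently reread the original paper and came away with some new ideas, so here are some brief notes.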

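MapReduce is a key component of Google's distributed systems. The core of the paper is the parallelism hidden behind the map/reduce abstraction: once the interface is given, application programmers no longer have to deal much with the underlying mechanisms. In essence, mr only implements a distributed parallel-computation framework; the distributed infrastructure it actually relies on underneath is still GFS.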

MapReduce: Simplified Data Processing on Large Clusters

Programming Model

K/V pairs

original task --map--> intermediate K/V pairs --shuffle--> grouped (key, list of values) --reduce--> result

shuffle: group all values that share the same key into one list

the result can be a multiset of values; the merge work is done by the user-defined reduce function
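The shuffle step can be sketched in a few lines of Go. This is only an in-memory illustration under my own assumptions: the `KeyValue` and `group` types and the `shuffle` function are hypothetical names, and a real implementation reads the pairs from intermediate files instead of a slice.

```go
package main

import (
	"fmt"
	"sort"
)

// KeyValue mirrors the intermediate pair emitted by a map function.
type KeyValue struct {
	Key   string
	Value string
}

// group is one reduce input: a key plus every value emitted for it.
type group struct {
	Key    string
	Values []string
}

// shuffle sorts the intermediate pairs by key and collects the values
// of equal keys into one list, so reduce can be called once per key.
func shuffle(kvs []KeyValue) []group {
	sorted := make([]KeyValue, len(kvs))
	copy(sorted, kvs)
	sort.SliceStable(sorted, func(i, j int) bool { return sorted[i].Key < sorted[j].Key })

	var groups []group
	for i := 0; i < len(sorted); {
		j := i
		var values []string
		// Walk forward while the key stays the same.
		for j < len(sorted) && sorted[j].Key == sorted[i].Key {
			values = append(values, sorted[j].Value)
			j++
		}
		groups = append(groups, group{Key: sorted[i].Key, Values: values})
		i = j
	}
	return groups
}

func main() {
	kvs := []KeyValue{{"b", "1"}, {"a", "1"}, {"b", "1"}}
	for _, g := range shuffle(kvs) {
		fmt.Println(g.Key, len(g.Values))
	}
}
```

Each `group` is exactly what one call to the reduce function receives.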

Here is a word count app from MIT 6.824 (Go):

// Imports needed by this file: "strconv", "strings", "unicode", plus the
// lab's mr package (its import path differs between course years).

// The map function is called once for each file of input. The first
// argument is the name of the input file, and the second is the
// file's complete contents. You should ignore the input file name,
// and look only at the contents argument. The return value is a slice
// of key/value pairs.
func Map(filename string, contents string) []mr.KeyValue {
	// function to detect word separators.
	ff := func(r rune) bool { return !unicode.IsLetter(r) }

	// split contents into an array of words.
	words := strings.FieldsFunc(contents, ff)

	// emit one (word, "1") pair per occurrence.
	kva := []mr.KeyValue{}
	for _, w := range words {
		kv := mr.KeyValue{w, "1"}
		kva = append(kva, kv)
	}
	return kva
}

// The reduce function is called once for each key generated by the
// map tasks, with a list of all the values created for that key by
// any map task.
func Reduce(key string, values []string) string {
	// return the number of occurrences of this word.
	return strconv.Itoa(len(values))
}

map (k1,v1) → list(k2,v2)

reduce (k2,list(v2)) → list(v2)


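Workflow

  1. split the input files into M pieces
  2. the master (coordinator) assigns map tasks and reduce tasks to idle workers
  3. a map worker runs map and generates intermediate k/v pairs
  4. the intermediate k/v pairs are written to the map worker's local disk, partitioned into R regions
  5. a reduce worker reads the intermediate files from the map machines via RPC, sorts them by key, then runs reduce (the master only passes along the file locations, it does not sort)
  6. the final output is written to GFS
  7. return to the user program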

After successful completion, the output of the mapreduce execution is available in the R output files (one per reduce task, with file names as specified by the user). Typically, users do not need to combine these R output files into one file – they often pass these files as input to another MapReduce call, or use them from another distributed application that is able to deal with input that is partitioned into multiple files.

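In practical programming, atomic operations are important (regardless of C++ or Go or ...): the paper relies on the file system's atomic rename so that each reduce output file holds the data of exactly one execution.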
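One common way to get this kind of atomic commit is to write to a temporary file and rename it when it is complete. The sketch below assumes a local POSIX-like file system, where a rename inside one directory is atomic; `commitOutput`, `demo` and the file name `mr-out-0` are names chosen for illustration, not taken from the paper.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// commitOutput writes data to a temporary file in the target directory
// and then renames it to finalPath, so readers never observe a
// half-written output file.
func commitOutput(finalPath string, data []byte) error {
	tmp, err := os.CreateTemp(filepath.Dir(finalPath), "mr-tmp-*")
	if err != nil {
		return err
	}
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		os.Remove(tmp.Name())
		return err
	}
	if err := tmp.Close(); err != nil {
		os.Remove(tmp.Name())
		return err
	}
	// The rename is the commit point.
	return os.Rename(tmp.Name(), finalPath)
}

// demo commits one output file into a scratch directory and reads it back.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "mr-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	out := filepath.Join(dir, "mr-out-0")
	if err := commitOutput(out, []byte("a 1\n")); err != nil {
		return "", err
	}
	got, err := os.ReadFile(out)
	return string(got), err
}

func main() {
	got, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Print(got)
}
```

If two executions of the same task both reach the rename, the final file still contains the complete output of exactly one of them.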

Fault Tolerance

heartbeat: the master pings every worker periodically; a worker that does not answer within a timeout is marked as failed

Completed map tasks are re-executed on a failure because their output is stored on the local disk(s) of the failed machine and is therefore inaccessible. Completed reduce tasks do not need to be re-executed since their output is stored in a global file system.

a reduce task that has not finished reading from a failed map machine (its RPC would fail) reads the data from the worker that re-executes that map task
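The failure rule above can be sketched as a small piece of bookkeeping. Everything here is hypothetical: the `coordinator`, `task`, `heartbeat` and `reap` names, the 10 second timeout, and the single-threaded design (a real master would need locking and real RPCs).

```go
package main

import (
	"fmt"
	"time"
)

type kind int

const (
	mapTask kind = iota
	reduceTask
)

type state int

const (
	idle state = iota
	inProgress
	completed
)

type task struct {
	Kind   kind
	State  state
	Worker string // worker the task was assigned to, "" if none
}

// coordinator is a single-threaded stand-in for the master.
type coordinator struct {
	timeout  time.Duration
	lastSeen map[string]time.Time
	tasks    []*task
}

// heartbeat records that a worker answered a ping at time now.
func (c *coordinator) heartbeat(worker string, now time.Time) {
	c.lastSeen[worker] = now
}

// reap marks silent workers as failed and returns how many of their
// tasks were reset to idle for rescheduling.
func (c *coordinator) reap(now time.Time) int {
	reset := 0
	for w, seen := range c.lastSeen {
		if now.Sub(seen) <= c.timeout {
			continue
		}
		delete(c.lastSeen, w)
		for _, t := range c.tasks {
			if t.Worker != w {
				continue
			}
			// In-progress work is lost, and completed map output lived on
			// the failed machine's local disk. Completed reduce output is
			// already in the global file system, so it is kept.
			if t.State == inProgress || (t.State == completed && t.Kind == mapTask) {
				t.State, t.Worker = idle, ""
				reset++
			}
		}
	}
	return reset
}

// demo lets worker w1 go silent while w2 keeps answering.
func demo() (int, []*task) {
	t0 := time.Unix(0, 0)
	c := &coordinator{
		timeout:  10 * time.Second,
		lastSeen: map[string]time.Time{},
		tasks: []*task{
			{Kind: mapTask, State: completed, Worker: "w1"},
			{Kind: reduceTask, State: completed, Worker: "w1"},
			{Kind: reduceTask, State: inProgress, Worker: "w2"},
		},
	}
	c.heartbeat("w1", t0)
	c.heartbeat("w2", t0.Add(9*time.Second))
	return c.reap(t0.Add(11 * time.Second)), c.tasks
}

func main() {
	n, tasks := demo()
	fmt.Println("rescheduled:", n)
	for _, t := range tasks {
		fmt.Println(t.Kind, t.State, t.Worker)
	}
}
```

In the demo only the completed map task of w1 goes back to idle; the completed reduce task on the same worker is left alone.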

Task Granularity

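M, R >> number of worker machines -> better load balancing and faster recovery when a worker fails

common: M > R, since each reduce task produces one output file and a smaller R means fewer final files (the paper cites M = 200,000 and R = 5,000 on 2,000 workers)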
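As a rough worked example of how M falls out of the input size: the paper picks M so that each map task handles roughly 16 MB to 64 MB of input. The helper name `numMapTasks` and the 1 TiB input below are my own illustrative choices.

```go
package main

import "fmt"

// numMapTasks returns how many splits are needed to cover inputBytes
// when each split holds at most splitBytes (ceiling division).
func numMapTasks(inputBytes, splitBytes int64) int64 {
	return (inputBytes + splitBytes - 1) / splitBytes
}

func main() {
	const mib = int64(1) << 20
	const tib = int64(1) << 40

	// 1 TiB of input cut into 64 MiB splits.
	fmt.Println(numMapTasks(tib, 64*mib)) // 16384

	// One extra byte needs one extra split.
	fmt.Println(numMapTasks(tib+1, 64*mib)) // 16385

	// The paper's example figures: 200,000 map tasks on 2,000 workers.
	fmt.Println(200000 / 2000) // 100 map tasks per worker on average
}
```

With around 100 map tasks per worker, a slow or failed machine only delays a small slice of the job.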

Backup Tasks

handling stragglers: When a MapReduce operation is close to completion, the master schedules backup executions of the remaining in-progress tasks. The task is marked as completed whenever either the primary or the backup execution completes.
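A minimal sketch of the "first finisher wins" rule follows. The `tracker` type, the `shouldBackup` helper and the 5% threshold are assumptions for illustration; the paper only says "close to completion" and does not give a number.

```go
package main

import "fmt"

// tracker is a hypothetical record the master keeps for one task.
type tracker struct {
	done   bool
	winner string // worker whose result was accepted
}

// report is called when a primary or backup execution finishes.
// Only the first report is accepted; later ones are ignored.
func (t *tracker) report(worker string) bool {
	if t.done {
		return false
	}
	t.done, t.winner = true, worker
	return true
}

// shouldBackup reports whether few enough tasks remain that backup
// executions should be scheduled for the ones still running.
func shouldBackup(remaining, total int, threshold float64) bool {
	return total > 0 && float64(remaining)/float64(total) <= threshold
}

func main() {
	// 3 of 100 tasks still running: within the assumed 5% threshold.
	fmt.Println(shouldBackup(3, 100, 0.05))

	var t tracker
	fmt.Println(t.report("backup-worker"))  // first result is accepted
	fmt.Println(t.report("primary-worker")) // the straggler is ignored
	fmt.Println(t.winner)
}
```

Because map and reduce functions are expected to be deterministic, it does not matter which of the two executions wins.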


Refinements

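  1. Partitioning function: the user fixes the number of output files R in advance; the default partitioner is hash(key) mod R
  2. Ordering guarantee (pre-sort): within a partition, the intermediate k/v pairs are processed in increasing key order
  3. Combiner: e.g. in the wc map task, merge the pairs that share the same key locally before they are sent over the network
  4. Input/output types: different file types (read by line or by offset); databases and in-memory data are also possible sources
  5. Side effects: extra files written by a task must be made atomic and idempotent by the application itself
  6. Errors in user code: skipping bad records (optional)
  7. Sequential execution on the local machine (helps debugging)
  8. Status information: display the task status (command line or GUI) -> data collection and analysis
  9. Counters (provided by the library) for counting occurrences of events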
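The default partitioning rule from item 1 can be written down directly. The `ihash` helper below follows the shape of the one used in the 6.824 lab as I remember it (FNV-1a from the standard library); `partition` is a hypothetical wrapper name.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// ihash maps a key to a non-negative int using 32-bit FNV-1a.
func ihash(key string) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	// Clear the top bit so the result is non-negative even where int is 32 bits.
	return int(h.Sum32() & 0x7fffffff)
}

// partition picks which of the nReduce regions (and thus which reduce
// task and output file) an intermediate key belongs to.
func partition(key string, nReduce int) int {
	return ihash(key) % nReduce
}

func main() {
	const nReduce = 10
	for _, w := range []string{"apple", "banana", "apple"} {
		fmt.Println(w, "->", partition(w, nReduce))
	}
}
```

The same key always lands in the same region, which is what lets a single reduce task see every value for that key.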

Discussion

in some cases we can also store the intermediate files in the global file system,

so that a completed map task does not need to be re-executed if its machine goes down,

and the reduce-side RPC read simply becomes a GFS read.

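network bandwidth can be the essential bottleneck; reduce workers already pull intermediate data directly from the map machines (peer to peer), so the master only carries metadata and is kept out of the data path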

open-source version of MapReduce: Hadoop (Yahoo/Apache)

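middle step: shuffle. Is reduce run once per key? Yes: the shuffle gathers the values for a key from every map task, so reduce is called once per key with or without a combiner.

so why do we need a combiner?

  • with a combiner: each map task sends one pre-merged value per key
  • without a combiner: each map task sends one value for every occurrence of the key

but where does the combiner run?

on the map machine: yes, it is a local combine over that map task's output

on the master: no, the master does not do any data computation

before reduce: no, that step is the shuffle

shuffle & combine could be the bottleneck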
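For word count, a local combine can be sketched as below. The names `combine` and `sumReduce` are my own. One detail worth noting: once values are pre-merged, the reduce side has to sum the values instead of counting them with `len(values)`.

```go
package main

import (
	"fmt"
	"strconv"
)

// KeyValue mirrors the intermediate pair emitted by a map function.
type KeyValue struct {
	Key   string
	Value string
}

// combine merges the output of one map task locally: all pairs with
// the same key collapse into a single (key, count) pair. Keys keep the
// order in which they were first seen.
func combine(kvs []KeyValue) ([]KeyValue, error) {
	counts := map[string]int{}
	var order []string
	for _, kv := range kvs {
		n, err := strconv.Atoi(kv.Value)
		if err != nil {
			return nil, err
		}
		if _, seen := counts[kv.Key]; !seen {
			order = append(order, kv.Key)
		}
		counts[kv.Key] += n
	}
	out := make([]KeyValue, 0, len(order))
	for _, k := range order {
		out = append(out, KeyValue{Key: k, Value: strconv.Itoa(counts[k])})
	}
	return out, nil
}

// sumReduce adds up partial counts, so it works both on raw "1" values
// and on values that a combiner has already merged.
func sumReduce(key string, values []string) (string, error) {
	total := 0
	for _, v := range values {
		n, err := strconv.Atoi(v)
		if err != nil {
			return "", err
		}
		total += n
	}
	return strconv.Itoa(total), nil
}

func main() {
	mapOutput := []KeyValue{{"a", "1"}, {"b", "1"}, {"a", "1"}}
	combined, err := combine(mapOutput)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(mapOutput), "pairs before,", len(combined), "pairs after")

	// Partial counts for "a" arriving from two different map tasks.
	total, err := sumReduce("a", []string{"2", "1"})
	if err != nil {
		panic(err)
	}
	fmt.Println("a", total)
}
```

This only works because addition is associative and commutative; a combiner for an arbitrary reduce function is not always possible.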

task failure: restart the task

node failure: restart its tasks on another node; the map tasks that had already finished on that node are re-run as well, because their intermediate files are lost
