MIT 6.824 Study Notes 4: Lab 1
Now we are ready to tackle the first assignment, Lab 1.
wjk is also working through 6.824, and his notes are a useful reference: https://github.com/zzzyyyxxxmmm/MIT6824_Distribute_System
Part I
The Map/Reduce implementation you are given is missing some pieces. Before you can write your first Map/Reduce function pair, you will need to fix the sequential implementation. In particular, the code we give you is missing two crucial pieces: the function that divides up the output of a map task, and the function that gathers all the inputs for a reduce task. These tasks are carried out by the doMap() function in common_map.go, and the doReduce() function in common_reduce.go respectively. The comments in those files should point you in the right direction.
doMap() applies the map operation to one input file and writes nReduce intermediate files. Many map tasks run (Part I is sequential, so locking and similar issues can wait), and each one is identified by its jobName and mapTask number. To hide most of the details, the following functions are available inside doMap():
mapF(): takes the input file's name and contents and returns the data as a slice of key/value pairs. mapF() is the map function provided by the application. The first argument should be the input file name, though the map function typically ignores it. The second argument should be the entire input file contents. mapF() returns a slice containing the key/value pairs for reduce; see common.go for the definition of KeyValue.
reduceName(): generates the intermediate file names by a fixed rule. The file numbered r stores the key/value pairs for which ihash(key) mod nReduce == r. There is one intermediate file per reduce task. The file name includes both the map task number and the reduce task number. Use the filename generated by reduceName(jobName, mapTask, r) as the intermediate file for reduce task r. Call ihash() (see below) on each key, mod nReduce, to pick r for a key/value pair.
So the job of doMap() is to read the file, pass its contents to mapF(), then walk over the returned key/value pairs and send each one to its intermediate file.
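The partitioning step can be sketched in isolation. This is only a minimal sketch, not the lab's actual doMap(): partition() is a hypothetical helper that keeps the buckets in memory instead of writing files, KeyValue is redeclared locally rather than taken from common.go, and ihash() is written from memory of the lab skeleton (FNV-1a), so treat its exact form as an assumption.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// KeyValue is a local stand-in for the pair type the lab defines in common.go.
type KeyValue struct {
	Key   string
	Value string
}

// ihash maps a key to a non-negative int using FNV-1a.
// Assumption: this matches the helper shipped with the lab skeleton.
func ihash(s string) int {
	h := fnv.New32a()
	h.Write([]byte(s))
	return int(h.Sum32() & 0x7fffffff)
}

// partition is a hypothetical helper: bucket r receives every pair whose
// key satisfies ihash(key) % nReduce == r.
func partition(kvs []KeyValue, nReduce int) [][]KeyValue {
	buckets := make([][]KeyValue, nReduce)
	for _, kv := range kvs {
		r := ihash(kv.Key) % nReduce
		buckets[r] = append(buckets[r], kv)
	}
	return buckets
}

func main() {
	kvs := []KeyValue{{"apple", "1"}, {"banana", "1"}, {"apple", "1"}, {"cherry", "1"}}
	for r, b := range partition(kvs, 3) {
		fmt.Printf("reduce task %d gets %d pairs\n", r, len(b))
	}
}
```

In a real doMap(), each bucket would be written to the file named by reduceName(jobName, mapTask, r) rather than kept in memory. As far as I recall, the comments in common_map.go suggest JSON encoding for that. Because the bucket depends only on the key, every occurrence of a key lands in the same reduce task no matter which map task produced it.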
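doReduce() is the mirror image: it reads the key/value pairs from the nMap intermediate files, calls the user-specified reduce function (at this point the values that share a key have been appended together), and writes the result to outFile. Many reduce tasks run as well (again, Part I is sequential, so locking and similar issues can wait), and each one is identified by its jobName and reduceTask number. doReduce() also has several functions available: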
reduceName():reduceName(jobName, m, reduceTask) yields the file name from map task m.
reduceF():reduceF() is the application's reduce function. You should call it once per distinct key, with a slice of all the values for that key. reduceF() returns the reduced value for that key.
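The job of doReduce() is to merge all the key/value pairs from the intermediate files produced by the map tasks into one large map, then write the results to the output file, sorted by key.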
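The merge-then-sort step might look like the following. Again this is a sketch under assumptions: reduceAll() and joinValues() are hypothetical names, the pairs are passed in as a slice instead of being decoded from the intermediate files, and the results are returned rather than written to outFile.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// KeyValue is a local stand-in for the pair type the lab defines in common.go.
type KeyValue struct {
	Key   string
	Value string
}

// reduceAll is a hypothetical helper: it groups values by key in one big map,
// then calls reduceF once per distinct key, in sorted key order.
func reduceAll(kvs []KeyValue, reduceF func(key string, values []string) string) []KeyValue {
	grouped := make(map[string][]string)
	for _, kv := range kvs {
		grouped[kv.Key] = append(grouped[kv.Key], kv.Value)
	}

	// Map iteration order is random in Go, so sort the keys explicitly.
	keys := make([]string, 0, len(grouped))
	for k := range grouped {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	out := make([]KeyValue, 0, len(keys))
	for _, k := range keys {
		out = append(out, KeyValue{Key: k, Value: reduceF(k, grouped[k])})
	}
	return out
}

// joinValues is a toy reduce function used only for this demo.
func joinValues(key string, values []string) string {
	return strings.Join(values, ",")
}

func main() {
	kvs := []KeyValue{{"b", "2"}, {"a", "1"}, {"b", "3"}}
	for _, kv := range reduceAll(kvs, joinValues) {
		fmt.Println(kv.Key, kv.Value)
	}
}
```

The explicit sort matters because ranging over a Go map gives no ordering guarantee, so writing the map out directly would produce output in a different order on each run.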
Code: https://github.com/pentium3/mit6824/tree/master/src/mapreduce
The Part I code is in src/mapreduce/common_map.go and src/mapreduce/common_reduce.go
Part II
The task in this step is to build a word count. Now you will implement word count, a simple Map/Reduce example. Look in main/wc.go; you'll find empty mapF() and reduceF() functions. Your job is to insert code so that wc.go reports the number of occurrences of each word in its input. A word is any contiguous sequence of letters, as determined by unicode.IsLetter.
Review Section 2 of the MapReduce paper. Your mapF() and reduceF() functions will differ a bit from those in the paper's Section 2.1. Your mapF() will be passed the name of a file, as well as that file's contents; it should split the contents into words, and return a Go slice of mapreduce.KeyValue. While you can choose what to put in the keys and values for the mapF output, for word count it only makes sense to use words as the keys. Your reduceF() will be called once for each key, with a slice of all the values generated by mapF() for that key. It must return a string containing the total number of occurrences of the key.
mapF(): The map function is called once for each file of input. The first argument is the name of the input file, and the second is the file's complete contents. You should ignore the input file name, and look only at the contents argument. The return value is a slice of key/value pairs. In other words, it splits the contents into words and returns the word-count key/value pairs (how many times each word occurs in the current contents).
reduceF():The reduce function is called once for each key generated by the map tasks, with a list of all the values created for that key by any map task.
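A possible shape for the two functions is sketched below. It is not the code from the linked repository. KeyValue is redeclared locally (in wc.go it would be mapreduce.KeyValue), and this version emits one "1" per occurrence instead of pre-counting per file. Because reduceF() sums whatever values it receives, it works with either choice.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"unicode"
)

// KeyValue is a local stand-in for mapreduce.KeyValue.
type KeyValue struct {
	Key   string
	Value string
}

// mapF splits contents into words (maximal runs of letters) and emits
// one ("word", "1") pair per occurrence. The file name is ignored.
func mapF(filename string, contents string) []KeyValue {
	words := strings.FieldsFunc(contents, func(r rune) bool {
		return !unicode.IsLetter(r)
	})
	kvs := make([]KeyValue, 0, len(words))
	for _, w := range words {
		kvs = append(kvs, KeyValue{Key: w, Value: "1"})
	}
	return kvs
}

// reduceF sums the numeric values collected for one key and returns
// the total as a string.
func reduceF(key string, values []string) string {
	total := 0
	for _, v := range values {
		n, err := strconv.Atoi(v)
		if err != nil {
			continue // skip anything that is not a number
		}
		total += n
	}
	return strconv.Itoa(total)
}

func main() {
	kvs := mapF("ignored.txt", "the cat, the hat")
	fmt.Println(len(kvs), reduceF("the", []string{"1", "1"}))
}
```

Splitting on !unicode.IsLetter treats digits and punctuation as separators, which matches the lab's definition of a word.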
(Honestly, this is almost identical to Lab 1 of 5105...)
Code: https://github.com/pentium3/mit6824/tree/master/src/main
The Part II code is in src/main/wc.go
Part III
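Everything so far has been sequential on a single machine; this time we make it parallel.
A few Go notes
The code uses make heavily to create slices. For how make works, see https://www.cnblogs.com/pdev/p/10928735.html
- a good read on Go strings is the Go Blog on strings.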
- you can use strings.FieldsFunc to split a string into components.
- the strconv package (http://golang.org/pkg/strconv/) is handy to convert strings to integers etc.
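On the make point above, the two forms that come up most often are worth a small illustration. The helper names zeroed() and collect() are made up for this sketch.

```go
package main

import "fmt"

// zeroed uses make with a length only: the slice already holds n zero
// values, so it should be filled by index, not by append.
func zeroed(n int) []int {
	return make([]int, n)
}

// collect uses make with length 0 and capacity n: the slice starts
// empty, and append fills it without reallocating for the first n items.
func collect(n int) []int {
	out := make([]int, 0, n)
	for i := 0; i < n; i++ {
		out = append(out, i*i)
	}
	return out
}

func main() {
	z := zeroed(3)
	c := collect(3)
	fmt.Println(len(z), cap(z), z)
	fmt.Println(len(c), cap(c), c)
}
```

A common slip is to write make([]T, n) and then append to it, which leaves n zero values at the front. When building a slice with append, the zero-length form with a capacity is usually what is wanted.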
Ref:
https://zhuanlan.zhihu.com/p/36158168
https://www.cnblogs.com/a1225234/p/10886410.html
http://nil.csail.mit.edu/6.824/2018/labs/lab-1.html