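One of Google's "troika": Bigtable explained (English version)
This article focuses on the system-design content; the concrete applications in the second half of the paper are mostly not covered here. It was adapted from personal notes, which is why it is written in English.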
Bigtable: A Distributed Storage System for Structured Data
Data model: not a relational data model
A Bigtable is a sparse, distributed, persistent multidimensional sorted map. (Section 2 of the paper)
How is the map indexed?
(row:string, column:string, time:int64) → string
It resembles a nested JSON object, e.g.:
table{
  // ...
  "aaaaa" : { //row
    "A:foo" : { //col
      15 : "y", //timestamp
      4 : "m"
    },
    "A:bar" : { //col
      15 : "d"
    },
    "B:" : { //col
      6 : "w",
      3 : "o",
      1 : "w"
    }
  },
  // ...
}
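To make the indexing concrete, here is a minimal Python sketch of the same map as nested dictionaries. The `read` helper and its "newest version at or before a timestamp" behaviour are assumptions chosen for illustration, not the paper's API.

```python
from typing import Dict, Optional

# row -> column -> {timestamp: value}; a toy stand-in for one table.
Table = Dict[str, Dict[str, Dict[int, str]]]

table: Table = {
    "aaaaa": {
        "A:foo": {15: "y", 4: "m"},
        "A:bar": {15: "d"},
        "B:": {6: "w", 3: "o", 1: "w"},
    },
}


def read(table: Table, row: str, column: str,
         timestamp: Optional[int] = None) -> Optional[str]:
    """Return the newest value written at or before `timestamp`.

    With timestamp=None the newest version overall is returned.
    (Hypothetical helper; the lookup rule is an assumption.)
    """
    versions = table.get(row, {}).get(column, {})
    candidates = [ts for ts in versions if timestamp is None or ts <= timestamp]
    if not candidates:
        return None
    return versions[max(candidates)]
```

For example, `read(table, "aaaaa", "A:foo")` picks the version at timestamp 15, while passing `timestamp=10` falls back to the version at timestamp 4.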
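a particular table: Webtable
- row: reversed URL (e.g. com.cnn.www); a contiguous range of rows is called a tablet
  concurrency: every read or write under a single row key is atomic
  rows are kept in lexicographic order by row key
- col: grouped into column families, e.g. contents
  key syntax: family:qualifier
  access control and both disk and memory accounting are performed at the column-family level
- timestamp: 64-bit integer, assigned by Bigtable or by the client
  to avoid collisions, clients that assign timestamps must generate unique ones; versions are stored in decreasing timestamp order, so the newest is read first
  garbage collection: set per column family, e.g. keep only the last n versions, or only versions written in the last few days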
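Two of the points above can be sketched in a few lines of Python: reversing the hostname so that pages of one domain sort next to each other, and a "keep the last n versions" garbage-collection rule. Both function names are hypothetical and only illustrate the idea.

```python
from typing import Dict


def reverse_host(url: str) -> str:
    """Reverse the hostname components of a URL that has no scheme.

    'maps.google.com/index.html' -> 'com.google.maps/index.html'
    """
    host, sep, path = url.partition("/")
    return ".".join(reversed(host.split("."))) + sep + path


def keep_last_n(versions: Dict[int, str], n: int) -> Dict[int, str]:
    """Keep only the n newest versions of a cell, newest first."""
    newest = sorted(versions, reverse=True)[:n]
    return {ts: versions[ts] for ts in newest}
```

Sorting the reversed keys places `com.google.mail` and `com.google.maps` next to each other, which is what makes row-range scans over one domain cheap.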
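API
C++ client API: create and delete tables and column families, read and write rows, scan row ranges
MapReduce + Bigtable: a Bigtable can be both the input source and the output target of MapReduce jobs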
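Building Blocks
Google File System (GFS): the distributed file system that stores Bigtable's log and data files
Google SSTable file format: stores Bigtable data
K-V map: a persistent, ordered, immutable map from keys to values; supports lookups and iteration over the key/value pairs in a specified key range
- a sequence of blocks (typically 64KB each, configurable)
- a block index, stored at the end of the SSTable and loaded into memory when the SSTable is opened
disk seek or memory seek? A lookup needs a single disk seek: binary-search the in-memory block index, then read the matching block from disk.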
Optionally, SSTable can be completely mapped into memory, which allows us to perform lookups and scans without touching disk.
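The block-plus-index layout can be sketched as follows. This is a toy in-memory model, assuming tiny blocks and a counter in place of real disk reads; `ToySSTable` and its method names are invented for illustration.

```python
import bisect
from typing import Dict, Iterator, Optional, Tuple


class ToySSTable:
    """Immutable sorted key/value map split into small blocks."""

    def __init__(self, items: Dict[str, int], block_size: int = 2) -> None:
        ordered = sorted(items.items())
        self.blocks = [ordered[i:i + block_size]
                       for i in range(0, len(ordered), block_size)]
        # Block index: the last key of each block; kept in memory.
        self.index = [block[-1][0] for block in self.blocks]
        self.block_reads = 0  # stands in for disk seeks

    def _read_block(self, i: int):
        self.block_reads += 1
        return self.blocks[i]

    def get(self, key: str) -> Optional[int]:
        i = bisect.bisect_left(self.index, key)  # in-memory binary search
        if i == len(self.index):
            return None  # key is past the last block: no block read needed
        for k, v in self._read_block(i):
            if k == key:
                return v
        return None

    def scan(self, start: str, end: str) -> Iterator[Tuple[str, int]]:
        """Yield pairs with start <= key < end, in key order."""
        i = bisect.bisect_left(self.index, start)
        while i < len(self.blocks):
            for k, v in self._read_block(i):
                if k >= end:
                    return
                if k >= start:
                    yield k, v
            i += 1
```

A point lookup touches at most one block, which mirrors the single-disk-seek claim; a range scan reads consecutive blocks until it passes the end key.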
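Chubby: a highly available, persistent distributed lock service
5 active replicas: 1 elected master that serves requests, 4 other replicas
Paxos algorithm: keeps the replicas consistent in the face of failure
namespace: directories and small files; each can be used as a lock, and reads and writes to a file are atomic
session: when a client's session expires, it loses its locks and open handles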
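Implementation
Three major components:
- a library that is linked into every client
- 1 master server (assigns tablets to tablet servers, balances load, garbage-collects files in GFS, handles schema changes)
- many tablet servers (each manages roughly 10-1000 tablets; servers can be added or removed dynamically)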
As with many single-master distributed storage systems, client data does not move through the master: clients communicate directly with tablet servers for reads and writes.
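Tablet location: a three-level hierarchy, analogous to a B+-tree
Chubby file -> Root tablet -> other METADATA tablets -> UserTables
METADATA: each row stores the location of one tablet, keyed by an encoding of the table identifier and the tablet's end row; it also holds secondary information such as a log of events for each tablet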
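The three-level lookup can be sketched with sorted lists and binary search. Everything named here is an assumption for illustration: the `/bigtable/root` path, the `ts-N:9000` addresses, the `(table, end_row)` tuple used as the METADATA key, and the `LAST` sentinel for the final tablet of a table.

```python
import bisect

LAST = "\uffff"  # sentinel end row for the last tablet of a table (assumption)


class TabletIndex:
    """Sorted (end_key, target) entries; finds the tablet covering a key."""

    def __init__(self, entries):
        self.entries = sorted(entries, key=lambda e: e[0])
        self.end_keys = [end for end, _ in self.entries]

    def locate(self, key):
        # First tablet whose end key is >= the requested key.
        i = bisect.bisect_left(self.end_keys, key)
        if i == len(self.entries):
            raise KeyError(key)
        return self.entries[i][1]


# Level 3: METADATA tablets map (table, end_row) -> tablet server address.
meta1 = TabletIndex([(("users", "m"), "ts-1:9000"),
                     (("users", LAST), "ts-2:9000")])
meta2 = TabletIndex([(("webtable", "com.google"), "ts-3:9000"),
                     (("webtable", LAST), "ts-1:9000")])
# Level 2: the root tablet maps the last key of each METADATA tablet to it.
root = TabletIndex([(("users", LAST), meta1), (("webtable", LAST), meta2)])
# Level 1: a Chubby file holds the location of the root tablet.
chubby = {"/bigtable/root": root}


def find_tablet_server(table: str, row: str) -> str:
    root_tablet = chubby["/bigtable/root"]
    metadata_tablet = root_tablet.locate((table, row))
    return metadata_tablet.locate((table, row))
```

Because each level is sorted by end key, every hop is a binary search, which is where the B+-tree analogy comes from.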
Master: schedule & manage
Each tablet is assigned to one tablet server at a time. Bigtable uses Chubby to keep track of tablet servers. When a tablet server starts, it creates, and acquires an exclusive lock on, a uniquely-named file in a specific Chubby directory. The master monitors this directory (the servers directory) to discover tablet servers.
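The registration protocol can be sketched with a toy in-memory lock service. This is a strong simplification, assuming that a file exists exactly while its lock is held (real Chubby files outlive a lost lock); `ToyChubby`, `SERVERS_DIR` and the server names are hypothetical.

```python
class ToyChubby:
    """In-memory stand-in for a lock service: one exclusive lock per file."""

    def __init__(self) -> None:
        self.holders = {}  # file path -> owner currently holding the lock

    def try_acquire(self, path: str, owner: str) -> bool:
        if self.holders.get(path, owner) != owner:
            return False  # someone else already holds the lock
        self.holders[path] = owner
        return True

    def expire_session(self, owner: str) -> None:
        # An expired session loses every lock it held.
        self.holders = {p: o for p, o in self.holders.items() if o != owner}

    def list_dir(self, prefix: str):
        return sorted(p for p in self.holders if p.startswith(prefix))


SERVERS_DIR = "/bigtable/servers/"  # hypothetical directory name


def live_tablet_servers(chubby: ToyChubby):
    """What the master sees when it scans the servers directory."""
    return [p[len(SERVERS_DIR):] for p in chubby.list_dir(SERVERS_DIR)]
```

When a server's session expires, it disappears from the master's view, and the master can then reassign its tablets.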
The essential point for a distributed database: locking.
Bigtable itself only provides the operations; the real data is stored in GFS (as SSTables).
Tablet Representation
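memtable: recently committed updates are stored in memory in a sorted buffer; older updates live in a sequence of SSTables
reconstruct: to recover a tablet, the server reads the SSTable list and redo points from METADATA, then replays the commit-log updates committed since the redo points to rebuild the memtable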
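A minimal sketch of the write and recovery paths, assuming the commit log is a Python list, each SSTable is a plain dict, and the redo point is an index into the log. `ToyTablet` is a hypothetical name.

```python
class ToyTablet:
    def __init__(self, commit_log=None, sstables=None):
        self.commit_log = commit_log if commit_log is not None else []
        self.sstables = sstables if sstables is not None else []  # oldest first
        self.memtable = {}

    def write(self, key, value):
        self.commit_log.append((key, value))  # log first, then apply
        self.memtable[key] = value

    def read(self, key):
        # Check the memtable, then SSTables from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for sst in reversed(self.sstables):
            if key in sst:
                return sst[key]
        return None

    @classmethod
    def recover(cls, commit_log, sstables, redo_point=0):
        """Rebuild the memtable by replaying the log from the redo point."""
        tablet = cls(commit_log, sstables)
        for key, value in commit_log[redo_point:]:
            tablet.memtable[key] = value
        return tablet
```

Updates before the redo point are assumed to be already persisted in an SSTable, so only the tail of the log is replayed.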
Compactions
As write operations execute, the size of the memtable increases. When the memtable size reaches a threshold, the memtable is frozen, a new memtable is created, and the frozen memtable is converted to an SSTable and written to GFS.
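minor compaction: memtable -> new SSTable
merging compaction: a few SSTables + the memtable -> one new SSTable
major compaction: rewrites all SSTables into exactly one, which contains no deletion entries or deleted data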
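The two ends of the compaction spectrum can be sketched with dicts standing in for SSTables. The `TOMBSTONE` marker and both function names are assumptions for illustration.

```python
TOMBSTONE = object()  # marks a deleted key (hypothetical representation)


def minor_compaction(memtable, sstables):
    """Freeze the memtable into a new sorted SSTable.

    Returns (fresh empty memtable, SSTable list with the new one appended).
    """
    frozen = dict(sorted(memtable.items()))
    return {}, sstables + [frozen]


def major_compaction(sstables):
    """Merge all SSTables (oldest first) into one without deletion entries."""
    merged = {}
    for sst in sstables:
        merged.update(sst)  # newer SSTables override older ones
    live = {k: v for k, v in sorted(merged.items()) if v is not TOMBSTONE}
    return [live]
```

Only the major compaction may drop tombstones, because only then is it certain that no older SSTable still holds the deleted value.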
Refinements
This section describes portions of the implementation in more detail in order to highlight these refinements.
locality group
Clients can group multiple column families together into a locality group. A separate SSTable is generated for each locality group in each tablet.
A locality group can be declared in-memory; its SSTables are then loaded lazily into the tablet server's memory.
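storage: compression (the format is chosen by the client and applied to each SSTable block separately)
read performance: caching (Scan Cache for key-value pairs, Block Cache for SSTable blocks read from GFS)
Bloom filters: tell whether an SSTable might contain data for a given row/column pair, so most lookups for absent data never touch disk
commit-log: one log per tablet server, shared by all of its tablets
Speeding up tablet recovery: the source server runs minor compactions before unloading a tablet, so the new server has no log entries to replay
Exploiting immutability: SSTables are immutable, so reads need no synchronization; only the memtable is mutable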
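The Bloom filter refinement listed above can be sketched as follows. The bit-array size, the number of hashes and the SHA-256-based hashing are assumptions picked for brevity, not what Bigtable uses.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, key: str):
        # Derive num_hashes positions by salting the key with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key: str) -> bool:
        # False means definitely absent; True means possibly present.
        return all(self.bits[pos] for pos in self._positions(key))
```

A tablet server would consult such a filter before reading an SSTable: a negative answer skips the disk access entirely, while a positive answer may still turn out to be a false positive.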
Performance Evaluation
Lessons
- large distributed systems are vulnerable to many types of failures
- it is important to delay adding new features until it is clear how the new features will be used
- the importance of proper system-level monitoring
- the value of simple designs