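One of Google's troika: notes on Bigtable (English version)
This post focuses on the system-design content; the concrete applications covered in the second half of the paper are not discussed much here. It is adapted from my personal notes, which is why it is written in English.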
Bigtable: A Distributed Storage System for Structured Data
Data model: not a relational data model
A Bigtable is a sparse, distributed, persistent multidimensional sorted map. (Section 2)
How is the map indexed?
(row:string, column:string, time:int64) → string
Conceptually it looks like nested JSON, e.g.:
table{
// ...
"aaaaa" : { //row
"A:foo" : { //col
15 : "y", //timestamp
4 : "m"
},
"A:bar" : { //col
15 : "d",
},
"B:" : { //col
6 : "w",
3 : "o",
1 : "w"
}
},
// ...
}
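The nested map above can be mimicked with a few lines of Python. This is only a toy sketch of the data model, not Bigtable's real API: the class `TinyTable` and its methods `put`, `get` and `scan` are names made up for illustration, and everything lives in one process's memory.

```python
from collections import defaultdict


class TinyTable:
    """Toy model of the (row, column, timestamp) -> value map (hypothetical API)."""

    def __init__(self):
        # row key -> column key -> {timestamp: value}
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, timestamp, value):
        self._rows[row][column][timestamp] = value

    def get(self, row, column, timestamp=None):
        """Return the newest value at or before `timestamp` (newest overall if None)."""
        versions = self._rows.get(row, {}).get(column, {})
        candidates = [t for t in versions if timestamp is None or t <= timestamp]
        return versions[max(candidates)] if candidates else None

    def scan(self, start_row, end_row):
        """Yield (row, columns) for start_row <= row < end_row in lexicographic order."""
        for row in sorted(self._rows):
            if start_row <= row < end_row:
                yield row, self._rows[row]


# Same cells as the "aaaaa" row in the example above.
table = TinyTable()
table.put("aaaaa", "A:foo", 15, "y")
table.put("aaaaa", "A:foo", 4, "m")
table.put("aaaaa", "A:bar", 15, "d")
table.put("aaaaa", "B:", 6, "w")
```

Reading `("aaaaa", "A:foo")` without a timestamp returns the newest version, and passing a timestamp returns the newest version that is not later than it.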
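An example table: Webtable
- row: reversed URL (e.g. com.cnn.www), so pages from the same domain sort next to each other
concurrency: every read or write under a single row key is atomic
lexicographic order; a contiguous row range is called a tablet
- col: column families, e.g. contents
family:qualifier
access control and both disk and memory accounting are performed at the column-family level
- timestamp
avoid collisions: the application must generate unique timestamps; versions are stored in decreasing timestamp order, so the newest is read first
garbage-collection mechanism (e.g. keep only the last n versions, or only versions newer than a given age)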
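Two of the Webtable conventions are easy to show in code: reversing the hostname so that one domain's pages become neighbouring rows, and a "keep the last n versions" garbage-collection rule. The helper names `reverse_host` and `keep_last_n` are my own, and the sketch only handles bare hostnames, not full URLs.

```python
def reverse_host(url_host):
    """'maps.google.com' -> 'com.google.maps', so one domain's rows sort together."""
    return ".".join(reversed(url_host.split(".")))


def keep_last_n(versions, n):
    """versions: {timestamp: value}. Return the n newest as (timestamp, value), newest first."""
    newest_first = sorted(versions.items(), key=lambda kv: kv[0], reverse=True)
    return newest_first[:n]


hosts = ["maps.google.com", "www.cnn.com", "mail.google.com"]
row_keys = sorted(reverse_host(h) for h in hosts)  # lexicographic row order
```

After sorting, the two google.com hosts end up adjacent, which is the point of storing the URL reversed.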
API
C++ read/write
MapReduce + Bigtable
Building Blocks
Google File System (GFS): the distributed file system that stores Bigtable's log and data files
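Google SSTable file format: stores Bigtable data
K-V map (persistent, ordered, immutable): look up a key, or iterate over the key/value pairs in a specified key range
- a sequence of blocks (typically 64KB each, configurable)
- a block index, stored at the end of the SSTable and loaded into memory when the SSTable is opened
disk seek or memory seek? One disk seek per lookup: binary-search the in-memory block index, then read the matching block from disk.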
Optionally, SSTable can be completely mapped into memory, which allows us to perform lookups and scans without touching disk.
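The block-plus-index layout can be sketched as follows. This is a simplified in-memory model, assuming unique keys and tiny blocks; `ToySSTable` and its `block_reads` counter (which stands in for disk seeks) are hypothetical names, not part of any real SSTable library.

```python
import bisect


class ToySSTable:
    """Immutable sorted key/value file modelled as small blocks plus a block index."""

    def __init__(self, items, block_size=2):
        items = sorted(items)
        # Each block is a sorted run of (key, value) pairs; a real block is ~64KB on disk.
        self.blocks = [items[i:i + block_size] for i in range(0, len(items), block_size)]
        # The index holds the last key of every block and stays in memory.
        self.index = [block[-1][0] for block in self.blocks]
        self.block_reads = 0  # counts simulated disk seeks

    def _read_block(self, i):
        self.block_reads += 1
        return self.blocks[i]

    def get(self, key):
        i = bisect.bisect_left(self.index, key)  # binary search in the in-memory index
        if i == len(self.index):
            return None  # key is larger than every stored key, no block read needed
        for k, v in self._read_block(i):  # exactly one block read
            if k == key:
                return v
        return None

    def scan(self, start, end):
        """Yield (key, value) pairs with start <= key < end."""
        i = bisect.bisect_left(self.index, start)
        while i < len(self.blocks):
            for k, v in self._read_block(i):
                if k >= end:
                    return
                if k >= start:
                    yield k, v
            i += 1


sst = ToySSTable([("a", "1"), ("c", "3"), ("b", "2"), ("e", "5"), ("d", "4")])
```

A point lookup touches the index in memory and then at most one block, which is why the paper can promise a single disk seek per lookup.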
Chubby: distributed lock service
5 active replicas: one is elected master and actively serves requests
Paxos algorithm: keeps the replicas consistent in the face of failure
namespace: consists of directories and small files; each read or write to a file is atomic
session: when a client's session expires, it loses any locks and open handles
Implementation
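Three major components:
- a library that is linked into every client
- 1 master server (assigns tablets to tablet servers, balances load, garbage-collects files in GFS, handles schema changes)
- many tablet servers (each typically manages between ten and a thousand tablets)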
As with many single-master distributed storage systems, client data does not move through the master: clients communicate directly with tablet servers for reads and writes.
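Tablet location: a three-level hierarchy analogous to a B+-tree
Chubby file -> Root tablet -> other METADATA tablets -> user tables
METADATA: stores the location of each tablet under a row key that encodes the tablet's table identifier and its end row; it also holds secondary information such as a log of events for each tablet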
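The three-level lookup can be sketched like this. Everything here is an assumption made for illustration: the `TABLETS` dictionary, the tablet names `meta-1` and `meta-2`, the server names, and the use of a `(table, end_row)` tuple as the METADATA key instead of the paper's encoded string.

```python
import bisect


def find_entry(tablet, key):
    """tablet: sorted list of (end_key, location). Return the first location with end_key >= key."""
    end_keys = [end for end, _ in tablet]
    i = bisect.bisect_left(end_keys, key)
    if i == len(tablet):
        raise KeyError(key)
    return tablet[i][1]


# Level 1: a Chubby file stores the location of the root tablet (hypothetical value).
CHUBBY_FILE = "root"

# Levels 2 and 3: the root tablet points at METADATA tablets, which point at user tablets.
TABLETS = {
    "root": [(("webtable", "m"), "meta-1"), (("webtable", "\xff"), "meta-2")],
    "meta-1": [(("webtable", "f"), "server-A"), (("webtable", "m"), "server-B")],
    "meta-2": [(("webtable", "t"), "server-C"), (("webtable", "\xff"), "server-D")],
}


def locate(table, row):
    """Return the (made-up) tablet server that holds `row` of `table`."""
    key = (table, row)
    meta_tablet = find_entry(TABLETS[CHUBBY_FILE], key)  # search the root tablet
    return find_entry(TABLETS[meta_tablet], key)         # search one METADATA tablet
```

Keying each entry by the tablet's end row means a single "first entry not smaller than my key" search finds the tablet that covers a row. In the real system the client library caches these locations, so most reads skip the lookup entirely.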
Master: schedule & manage
Each tablet is assigned to one tablet server at a time. Bigtable uses Chubby to keep track of tablet servers. When a tablet server starts, it creates, and acquires an exclusive lock on, a uniquely-named file in a specific Chubby directory. The master monitors this directory (the servers directory) to discover tablet servers.
The essential building block for this distributed database: locks (Chubby)
Bigtable itself only coordinates operations; the actual data is stored in GFS as SSTables and commit logs.
Tablet Representation
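memtable: recently committed updates are stored in memory in a sorted buffer; older updates are stored in a sequence of SSTables
recovery: the tablet server reads the SSTable list and the redo points (pointers into the commit logs) from METADATA, then rebuilds the memtable by applying the updates committed since the redo points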
Compactions
As write operations execute, the size of the memtable increases. When the memtable size reaches a threshold, the memtable is frozen, a new memtable is created, and the frozen memtable is converted to an SSTable and written to GFS.
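minor compaction (memtable -> new SSTable) -> merging compaction (a few SSTables + memtable -> one SSTable) -> major compaction (all SSTables -> exactly one SSTable)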
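A rough sketch of the write path and two of the compaction kinds, under heavy simplifications: the "threshold" is a number of keys rather than bytes, there is no commit log, deletions are not modelled, and the major compaction here ignores the memtable. `ToyTablet` and its method names are invented for this example.

```python
class ToyTablet:
    """Toy tablet: one mutable memtable plus a stack of immutable SSTables."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}   # newest writes, mutable
        self.sstables = []   # immutable sorted lists of (key, value), newest first
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.minor_compaction()

    def minor_compaction(self):
        """Freeze the memtable, start a new one, and turn the frozen one into an SSTable."""
        frozen, self.memtable = self.memtable, {}
        self.sstables.insert(0, sorted(frozen.items()))

    def read(self, key):
        """Merged view: memtable first, then SSTables from newest to oldest."""
        if key in self.memtable:
            return self.memtable[key]
        for sstable in self.sstables:
            for k, v in sstable:
                if k == key:
                    return v
        return None

    def major_compaction(self):
        """Rewrite all SSTables into exactly one, keeping the newest value per key."""
        merged = {}
        for sstable in reversed(self.sstables):  # oldest first, so newer values win
            merged.update(sstable)
        self.sstables = [sorted(merged.items())]
```

Reads have to consult every SSTable until a major compaction runs, which is why the number of SSTables is kept bounded in the real system.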
Refinement
This section describes portions of the implementation in more detail in order to highlight these refinements.
locality group
Clients can group multiple column families together into a locality group. A separate SSTable is generated for each locality group in each tablet.
in-memory locality groups are loaded lazily
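storage: compression (clients choose whether and how the SSTable blocks of each locality group are compressed)
read performance: caching (Scan Cache for key-value pairs, Block Cache for SSTable blocks read from GFS)
Bloom filters: tell whether an SSTable might contain data for a given row/column pair, which saves disk seeks
commit-log: one commit log per tablet server rather than one per tablet
Speeding up tablet recovery: the source tablet server does minor compactions before a tablet is moved
Exploiting immutability: SSTables are immutable, so reading them needs no synchronization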
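Of the refinements listed, the Bloom filter is the easiest to show in isolation. The sketch below is a generic textbook Bloom filter, not Bigtable's implementation; the sizes, the use of SHA-256 for hashing, and the `row/column` key format are all assumptions.

```python
import hashlib


class BloomFilter:
    """Generic Bloom filter: no false negatives, a small chance of false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, key):
        # Derive num_hashes bit positions by hashing the key with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        """False means definitely absent; True means the SSTable must be checked."""
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))


# One filter per SSTable, filled with the row/column pairs that the SSTable holds.
bf = BloomFilter()
bf.add("com.cnn.www/contents:")
```

A tablet server would consult `might_contain` before touching an SSTable, so lookups for rows or columns that do not exist mostly avoid disk.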
Performance Evaluation
Lessons
- large distributed systems are vulnerable to many types of failures
- it is important to delay adding new features until it is clear how the new features will be used
- the importance of proper system-level monitoring
- the value of simple designs