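This post focuses on the system-design parts of the paper; the concrete applications in its second half are only touched on briefly. It was adapted from personal notes, which is why it is written in English.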

Bigtable: A Distributed Storage System for Structured Data

Data model: not a relational data model

A Bigtable is a sparse, distributed, persistent multidimensional sorted map. (Section 2 of the paper)

How is the map indexed?

(row:string, column:string, time:int64) → string

Conceptually it looks like nested JSON, e.g.:

table {
  // ...
  "aaaaa" : {      // row
    "A:foo" : {    // col
      15 : "y",    // timestamp
      4 : "m"
    },
    "A:bar" : {    // col
      15 : "d"
    },
    "B:" : {       // col
      6 : "w",
      3 : "o",
      1 : "w"
    }
  },
  // ...
}
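The nested map above can be mimicked in a few lines of Python. This is only a sketch to make the indexing concrete: the class name `TinyTable` and its methods are made up for illustration and are not Bigtable's API, and a real implementation keeps the data sorted on disk rather than sorting on every read.

```python
from collections import defaultdict


class TinyTable:
    """Toy in-memory model of the (row, column, timestamp) -> value map."""

    def __init__(self):
        # row key -> column key -> {timestamp: value}
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, timestamp, value):
        self._rows[row][column][timestamp] = value

    def get(self, row, column, timestamp=None):
        """Return the newest value at or before `timestamp` (newest overall if None)."""
        versions = self._rows.get(row, {}).get(column, {})
        for ts in sorted(versions, reverse=True):  # decreasing timestamp order
            if timestamp is None or ts <= timestamp:
                return versions[ts]
        return None

    def scan(self, start_row, end_row):
        """Yield (row, column, timestamp, value) for start_row <= row < end_row."""
        for row in sorted(self._rows):  # lexicographic row order
            if start_row <= row < end_row:
                for column in sorted(self._rows[row]):
                    versions = self._rows[row][column]
                    for ts in sorted(versions, reverse=True):
                        yield row, column, ts, versions[ts]


table = TinyTable()
table.put("aaaaa", "A:foo", 15, "y")
table.put("aaaaa", "A:foo", 4, "m")
table.put("aaaaa", "B:", 6, "w")
```

With this data, `table.get("aaaaa", "A:foo")` returns the newest version and `table.get("aaaaa", "A:foo", timestamp=10)` returns the version written at timestamp 4.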

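Example table: Webtable

  • row: reversed URL

    a dynamically partitioned range of rows is called a tablet

    concurrency: every read or write of data under a single row key is atomic

    rows are kept in lexicographic order by row key

  • col: grouped into column families, e.g. contents: and anchor:

    column key = family:qualifier

    access control and both disk and memory accounting are performed at the column-family level

  • timestamp

    to avoid collisions, applications must generate unique timestamps; versions of a cell are stored in decreasing timestamp order, so the most recent one is read first

    garbage collection: per-column-family settings, e.g. keep only the last n versions, or only versions that are recent enough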
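Two of the points in this list are easy to show in code: why Webtable reverses the hostname in the row key, and what a "keep the last n versions" garbage-collection setting does to a cell. The helper names `webtable_row_key` and `keep_last_n` are hypothetical, and the sketch ignores ports, query strings, and other URL details.

```python
from urllib.parse import urlsplit


def webtable_row_key(url):
    """Reverse the hostname components so pages of one domain sort together."""
    parts = urlsplit(url)
    reversed_host = ".".join(reversed(parts.hostname.split(".")))
    return reversed_host + parts.path


def keep_last_n(versions, n):
    """Keep only the n newest versions of a cell; `versions` maps timestamp -> value."""
    newest = sorted(versions, reverse=True)[:n]
    return {ts: versions[ts] for ts in newest}


urls = [
    "http://maps.google.com/index.html",
    "http://www.example.org/a",
    "http://mail.google.com/inbox",
]
row_keys = sorted(webtable_row_key(u) for u in urls)
```

After sorting, the two `com.google.*` keys are adjacent, which is what makes scans over a single domain cheap in a table ordered by row key.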


API

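C++ client API: create and delete tables and column families, write or delete cells, look up individual rows, scan row ranges

MapReduce + Bigtable: wrappers allow a Bigtable to be used both as an input source and as an output target for MapReduce jobs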


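Building Blocks

Google File System (GFS): a distributed file system that stores log and data files

Google SSTable file format: stores Bigtable data

A persistent, ordered, immutable K-V map: look up a key, or iterate over key/value pairs in a specified key range

  • a sequence of blocks (typically 64KB, configurable)
  • a block index, stored at the end of the SSTable
disk seek or memory seek? The block index is loaded into memory when the SSTable is opened, so a lookup is a binary search in memory followed by a single disk seek to read the block.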

Optionally, SSTable can be completely mapped into memory, which allows us to perform lookups and scans without touching disk.
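A minimal sketch of the block-index idea, assuming a toy layout in which each index entry is the last key of a block. The class `ToySSTable`, its `block_size` parameter, and the `block_reads` counter are all illustrative; the real file format is not described at this level of detail in the notes.

```python
import bisect


class ToySSTable:
    """Immutable sorted key/value file split into blocks, plus an in-memory block index."""

    def __init__(self, items, block_size=2):
        items = sorted(items)
        # `blocks` stands in for on-disk data; each block is a sorted list of pairs.
        self.blocks = [items[i:i + block_size] for i in range(0, len(items), block_size)]
        # The index holds the last key of every block and is kept in memory.
        self.index = [block[-1][0] for block in self.blocks]
        self.block_reads = 0  # counts simulated disk seeks

    def _read_block(self, i):
        self.block_reads += 1
        return self.blocks[i]

    def get(self, key):
        i = bisect.bisect_left(self.index, key)  # in-memory binary search
        if i == len(self.index):
            return None  # key is larger than every key in the file
        for k, v in self._read_block(i):  # one simulated disk seek
            if k == key:
                return v
        return None

    def scan(self, start, end):
        """Yield pairs with start <= key < end."""
        i = bisect.bisect_left(self.index, start)
        while i < len(self.blocks):
            for k, v in self._read_block(i):
                if k >= end:
                    return
                if k >= start:
                    yield k, v
            i += 1


sst = ToySSTable([("a", 1), ("c", 2), ("e", 3), ("g", 4), ("i", 5)])
value = sst.get("e")
reads_after_get = sst.block_reads
```

A point lookup touches exactly one block, and a key beyond the last index entry is rejected without reading any block at all.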

Chubby: distributed lock service

5 active replicas: one is elected master and actively serves requests

Paxos algorithm: keeps the replicas consistent in the face of failure

namespace: directories and small files; each can be used as a lock, and reads and writes to a file are atomic

session: when it expires, the client loses its locks and open handles


Implementation

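Three major components:

  1. a library that is linked into every client
  2. one master server (assigns tablets to tablet servers, balances load, garbage-collects files in GFS, handles schema changes)
  3. many tablet servers (each manages roughly ten to a thousand tablets)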

As with many single-master distributed storage systems, client data does not move through the master: clients communicate directly with tablet servers for reads and writes.

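Tablet location: a three-level hierarchy analogous to a B+-tree

Chubby file -> Root tablet -> other METADATA tablets -> UserTables

METADATA: besides tablet locations it also stores secondary information, e.g. a log of events for each tablet, which helps with debugging and performance analysis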
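The paper gives a capacity figure for this hierarchy: with roughly 1 KB per METADATA row and a 128 MB limit on METADATA tablets, three levels can address 2^34 tablets. The sketch below redoes that arithmetic and then walks a tiny made-up hierarchy. The layout (`root`, `meta_1`, `meta_2`), the server names, and the use of plain end-row keys are assumptions for illustration; real METADATA row keys also encode the table identifier.

```python
import bisect

# Capacity arithmetic, using the figures quoted in the paper.
METADATA_TABLET_BYTES = 128 * 2**20   # 128 MB limit per METADATA tablet
METADATA_ROW_BYTES = 2**10            # roughly 1 KB per METADATA row
rows_per_metadata_tablet = METADATA_TABLET_BYTES // METADATA_ROW_BYTES
addressable_tablets = rows_per_metadata_tablet ** 2
addressable_bytes = addressable_tablets * METADATA_TABLET_BYTES


def find(level, key):
    """Pick the entry whose end key is the first one >= key.

    `level` is a sorted list of (end_row_key, target) pairs, a hypothetical
    stand-in for the rows of one tablet in the location hierarchy.
    """
    end_keys = [end for end, _ in level]
    return level[bisect.bisect_left(end_keys, key)][1]


# Hypothetical layout: root tablet -> METADATA tablets -> user tablets.
meta_1 = [("g", "tabletserver-1"), ("m", "tabletserver-2")]
meta_2 = [("t", "tabletserver-3"), ("\uffff", "tabletserver-1")]
root = [("m", meta_1), ("\uffff", meta_2)]


def locate(row_key):
    metadata_tablet = find(root, row_key)   # root tablet picks a METADATA tablet
    return find(metadata_tablet, row_key)   # METADATA tablet picks the tablet server
```

Here `rows_per_metadata_tablet` works out to 2^17, so two levels of METADATA fan-out give 2^17 × 2^17 = 2^34 tablets, or 2^61 bytes at 128 MB per tablet.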

Master: schedule & manage

Each tablet is assigned to one tablet server at a time. Bigtable uses Chubby to keep track of tablet servers. When a tablet server starts, it creates, and acquires an exclusive lock on, a uniquely-named file in a specific Chubby directory. The master monitors this directory (the servers directory) to discover tablet servers.
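To make the lock-based membership idea concrete, here is a toy model in which a dictionary plays the role of the Chubby servers directory. `ToyLockService` and `ToyMaster` are invented names and do not reflect Chubby's or Bigtable's real interfaces; the actual master also probes the lock itself before reassigning tablets, which this sketch skips.

```python
class ToyLockService:
    """In-memory stand-in for a directory of exclusive lock files."""

    def __init__(self):
        self._holders = {}  # file name -> session holding the exclusive lock

    def create_and_lock(self, name, session):
        if name in self._holders:
            return False
        self._holders[name] = session
        return True

    def expire_session(self, session):
        # An expired session loses all of its locks.
        for name in [n for n, s in self._holders.items() if s == session]:
            del self._holders[name]

    def list_locked(self):
        return sorted(self._holders)


class ToyMaster:
    def __init__(self, locks):
        self.locks = locks
        self.assignment = {}  # tablet -> tablet server

    def live_servers(self):
        return self.locks.list_locked()

    def assign(self, tablet):
        # Give the tablet to the least-loaded live server.
        load = {server: 0 for server in self.live_servers()}
        for server in self.assignment.values():
            if server in load:
                load[server] += 1
        chosen = min(load, key=lambda s: (load[s], s))
        self.assignment[tablet] = chosen
        return chosen

    def reassign_orphans(self):
        live = set(self.live_servers())
        for tablet, server in list(self.assignment.items()):
            if server not in live:
                self.assign(tablet)


locks = ToyLockService()
locks.create_and_lock("ts-a", "session-a")
locks.create_and_lock("ts-b", "session-b")
master = ToyMaster(locks)
master.assign("t1")
master.assign("t2")
locks.expire_session("session-a")  # ts-a loses its lock
master.reassign_orphans()
```

Once `ts-a` loses its session, its lock file disappears from the directory and the tablet it was serving moves to the remaining server.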

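The essential point: liveness is tied to the lock. A tablet server stops serving its tablets if it loses its exclusive lock, and the master then reassigns those tablets.

Bigtable servers mainly coordinate operations; the persistent state of a tablet (commit log and SSTables) is stored in GFS.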

Tablet Representation

memtable: the recently committed updates are stored in memory in a sorted buffer

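recovery: the tablet server reads the tablet's SSTable list and redo points from METADATA, loads the SSTable indices into memory, and rebuilds the memtable by applying every update committed since the redo points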
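Replaying a log from a redo point can be sketched as follows. The log entry shape `(sequence_number, key, value)` and the function name `recover_memtable` are assumptions made for this example only.

```python
def recover_memtable(commit_log, redo_point):
    """Rebuild a memtable by replaying log entries at or after `redo_point`.

    `commit_log` is a list of (sequence_number, key, value) tuples in commit
    order; entries before the redo point are assumed to be in SSTables already.
    """
    memtable = {}
    for seq, key, value in commit_log:
        if seq >= redo_point:
            memtable[key] = value  # later entries overwrite earlier ones
    return dict(sorted(memtable.items()))  # keep the buffer in key order


log = [(1, "b", "old"), (2, "a", "x"), (3, "b", "new"), (4, "c", "y")]
memtable = recover_memtable(log, redo_point=2)
```

Entry 1 is skipped because it precedes the redo point, and the later write to `"b"` wins over any earlier one.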

Compactions

As write operations execute, the size of the memtable increases. When the memtable size reaches a threshold, the memtable is frozen, a new memtable is created, and the frozen memtable is converted to an SSTable and written to GFS.

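minor compaction: a frozen memtable becomes a new SSTable

merging compaction: a few SSTables and the memtable are merged into one new SSTable

major compaction: a merging compaction that rewrites all SSTables into exactly one, which contains no deletion entries or deleted data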
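The write path and the two ends of the compaction spectrum fit in a short sketch. `ToyTablet` is a made-up class; it triggers minor compactions by entry count rather than by memtable size in bytes, keeps "SSTables" as in-memory tuples instead of GFS files, and omits merging compactions.

```python
class ToyTablet:
    """Toy tablet: a mutable memtable plus a list of immutable sorted runs."""

    TOMBSTONE = object()  # deletion marker

    def __init__(self, memtable_limit=2):
        self.memtable_limit = memtable_limit
        self.commit_log = []   # stand-in for the commit log kept in GFS
        self.memtable = {}
        self.sstables = []     # newest first; each is a tuple of sorted (key, value) pairs

    def write(self, key, value):
        self.commit_log.append((key, value))  # log first, then apply
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.minor_compaction()

    def delete(self, key):
        self.write(key, self.TOMBSTONE)

    def minor_compaction(self):
        """Freeze the memtable and turn it into a new SSTable."""
        frozen, self.memtable = self.memtable, {}
        self.sstables.insert(0, tuple(sorted(frozen.items())))

    def major_compaction(self):
        """Rewrite everything into one SSTable and drop deleted entries."""
        if self.memtable:
            self.minor_compaction()
        merged = {}
        for sstable in reversed(self.sstables):  # oldest first, newer values overwrite
            merged.update(sstable)
        live = {k: v for k, v in merged.items() if v is not self.TOMBSTONE}
        self.sstables = [tuple(sorted(live.items()))]

    def read(self, key):
        """Check the memtable first, then SSTables from newest to oldest."""
        for source in [self.memtable] + [dict(s) for s in self.sstables]:
            if key in source:
                value = source[key]
                return None if value is self.TOMBSTONE else value
        return None


tablet = ToyTablet(memtable_limit=2)
tablet.write("a", 1)
tablet.write("b", 2)      # threshold reached: first minor compaction
tablet.write("a", 3)
tablet.delete("b")        # threshold reached: second minor compaction
runs_before = len(tablet.sstables)
tablet.major_compaction()
```

Before the major compaction a read may have to consult two runs; afterwards there is a single run and the deletion marker for `"b"` is gone.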


Refinements

This section of the paper describes the refinements the implementation needed to reach the required performance, availability, and reliability.

locality group

Clients can group multiple column families together into a locality group. A separate SSTable is generated for each locality group in each tablet.

A locality group can be declared in-memory; its SSTables are then loaded lazily into the tablet server's memory.

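storage: compression is chosen per locality group and applied to each SSTable block separately, so a small part of an SSTable can be read without decompressing the whole file

read performance: two levels of caching. The Scan Cache holds key-value pairs returned by the SSTable interface; the Block Cache holds SSTable blocks read from GFS.

Bloom filters: created per locality group; they tell whether an SSTable might contain data for a given row/column pair, so most lookups for non-existent rows or columns never touch disk

commit-log: one commit log per tablet server, with mutations for different tablets co-mingled in the same file; during recovery the entries are sorted by (table, row name, log sequence number)

Speeding up tablet recovery: before a tablet is moved, the source tablet server performs a minor compaction, stops serving the tablet, and performs a second minor compaction, so the new server needs no log recovery

Exploiting immutability: SSTables are immutable, so reads need no synchronization; only the memtable is mutable (rows are copy-on-write); obsolete SSTables are removed by mark-and-sweep garbage collection; child tablets can share the parent's SSTables after a split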
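Of the refinements listed here, the Bloom filter is the easiest to show in isolation. The sketch below is a generic Bloom filter, not the one Bigtable uses: the bit-array size, the number of hash functions, the SHA-256 based hashing, and the `row/column` key encoding are all arbitrary choices for illustration.

```python
import hashlib


class BloomFilter:
    """Probabilistic set: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))


def lookup(sstables, row, column):
    """Each SSTable is a (bloom_filter, data_dict) pair; skip those that cannot match."""
    key = f"{row}/{column}"
    for bloom, data in sstables:
        if bloom.might_contain(key) and key in data:
            return data[key]
    return None


bloom = BloomFilter()
bloom.add("com.google.maps/contents:")
sstables = [(bloom, {"com.google.maps/contents:": "<html>...</html>"})]
```

In a real system `key in data` would be a disk read, so the value of the filter is that a negative answer from `might_contain` avoids that read entirely.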


Performance Evaluation


Lessons

  1. large distributed systems are vulnerable to many types of failures
  2. it is important to delay adding new features until it is clear how the new features will be used
  3. the importance of proper system-level monitoring
  4. the value of simple designs
