Facebook MyRocks at MariaDB

Recently my colleague Rasmus Johansson announced that MariaDB is adding support for the Facebook MyRocks storage engine. Today I’m going to share a bit more on what that means for MariaDB users. Members of the Facebook Database Engineering team helped us answer some questions we think our community will have about MyRocks.

Benefits of MariaDB Server’s Extensible Architecture

Before discussing specifics of MyRocks, new readers may benefit from a description of MariaDB Serverarchitecture, which is extensible at every layerincluding the storage layer. This means users and the community can add functionality to meet unique needs. Community contributions are one of MariaDB’s greatest advantages over other databases, and a big reason for us becoming the fastest growing open source database in the marketplace.

Openness in the storage layer is especially important because being able to use the right storage engine for the right use case ensures better performance optimization. Both mysql and MariaDB support InnoDB - a well known, general purpose storage engine. But InnoDB is not suited to every use case, so the MariaDB engineering team is extending support for additional storage engines, including Facebook’s MyRocks for workloads requiring greater compression and IO efficiency, and MariaDBColumnStore (currently inbeta), which will provide faster time-to-insight with Massively Parallel Execution (MPP).

Facebook MyRocks for MariaDB

When searching for a storage engine that could give greater performance for web scale type applications, MyRocks was an obvious choice because of its superior handling of data compression and IO efficiency. Besides that, its LSM architecture allows for very efficient data ingestion, like read-free replication slaves, or fast bulk data loading.

As we add support for new storage engines, many of our current users may ask, “What happens to MariaDB’s support for InnoDB? Do I have to migrate?” Of course not! We have no plans to abandon InnoDB. InnoDB is a proven storage engine and we expect it to continue to be used by MariaDB users. But we do expect that deployments that need highest possible efficiency will opt for MyRocks because of its performance gains and IO efficiency. Over time, as MyRocks matures we expect it will become appropriate for even more use cases.

The first MariaDB version of MyRocks will be available in a release candidate of MariaDB Server 10.2 coming this winter. Our goal is for MyRocks to work with all MariaDB features, but some of them, like optimistic parallel replication, may not work in the first release. MariaDB is an open source project that follows the "release often, release early" approach, so our goal is to first make a release that meets core requirements, and then add support for special cases in subsequent releases.

Now let’s move onto my discussion with Facebook’s Database Engineering team!

Can you tell us a bit about the history of RocksDB at Facebook? In 2012, we started to build an embedded storage engine optimized for flash-based SSD, by forking LevelDB. The fork became RocksDB, which was open-sourced on November 2013 [1] . After RocksDB proved to be an effective persistent key-value store for SSD, we enhanced RocksDB for other platforms. We improved its performance on DRAM in 2014 and on hard drives in 2015, two platforms with production use cases now.

Over the past few years, we've introduced numerous features and improvements. To name a few, we built compaction filter and merge operator in 2013, backup and column families in 2014, transactions and bulk loading in 2015, and persistent cache in 2016. See the list of features that are not in LevelDB: https://github.com/facebook/rocksdb/wiki/Features-Not-in-LevelDB .

Early RocksDB adopters at Facebook such as the distributed key-value store ZippyDB [2], Laser [2] and Dragon [3] went into production in early 2013. Since then, many more new or existing services at Facebook started to use RocksDB every year. Now RocksDB is used in a number of services across multiple hardware platforms at Facebook. [1] https://code.facebook.com/posts/666746063357648/under-the-hood-building-and-open-sourcing-rocksdb/ and http://rocksdb.blogspot.com/2013/11/the-history-of-rocksdb.html [2] https://research.facebook.com/publications/realtime-data-processing-at-facebook/ [3] https://code.facebook.com/posts/1737605303120405/dragon-a-distributed-graph-query-engine/ Why did FB go down the RocksDB path for MySQL?

MySQL is a popular storage solution at Facebook because we have a great team dedicated to running MySQL at scale that provides a high quality of service. The MySQL tiers store many petabytes of data that have been compressed with InnoDB table compression. We are always looking for ways to improve compression and the LSM algorithm used by RocksDB has several advantages over the B-Tree used by InnoDB. This led us to MyRocks: RocksDB is a key-value storage engine. MyRocks implements that MySQL storage engine API to make RocksDB work with MySQL and provide SQL functionality. Our initial goal was to get 2x more compression from MyRocks than from compressed InnoDB without affecting read performance. We exceeded our goal. In addition to getting 2x better compression, we also got much lower write rates to storage, faster database loads, and better performance.

Lower write rates enable the use of lower endurance flash, and faster loads simplify the migration from MySQL on InnoDB to MySQL on RocksDB. While we don't expect better performance for all workloads, the way in which we operate the database tier for the initial MyRocks deployment favors RocksDB more than InnoDB. Finally, there are features unique to an LSM that we expect to support in the future, including the merge operator and compaction filters. MyRocks can be helpful to the MySQL community because of efficiency and innovation.

We considered multiple write-optimized database engines. We chose RocksDB because it has excellent performance and efficiency and because we work directly with the team. The MyRocks effort has benefited greatly from being able to collaborate on a daily basis with the RocksDB team. We appreciate that the RocksDB team treats us like a very important customer. They move fast to make RocksDB better for MyRocks.

How was MyRocks developed? MyRocks is developed by engineers from several locations across the globe. The team had the privilege to work with Sergey Petrunia right from the beginning, and he is based in Russia. At Facebook's Menlo Park campus, Siying Dong leads RocksDB development and Yoshinori Matsunobu leads the collaboration with MySQL infrastructure and data performance teams. From the Seattle office, Herman Lee worked on the initial validation of MyRocks that gave the team the confidence to proceed with MyRocks for our user databases as well as led the MyRocks feature development. In Oregon, Mark Callaghan has been benchmarking all aspects of MyRocks and

Let’s start by creating a table, where we will store the CIDR, split in two columns: one for the IPv6 address and one for the network length. The most compact way of storing IPv6 values is to use the binary(16) .

CREATETABLE `cidr` ( `id` int(11) NOT NULL AUTO_INCREMENT, `ip_address` binary(16) NOT NULL, `net_len` int(11) NOT NULL, PRIMARYKEY (`id`) ) ENGINE=InnoDB;

In order to generate some random data, I will use a stored procedure which will insert 100,000 IP addresses in the previously created table. The addresses will be generated by inserting the first 16 bytes of the current time-stamp’s SHA value.

DELIMITER // CREATEPROCEDUREgenerate_ips(no_ipsINT) BEGIN DECLARE x INT DEFAULT 0; DECLARE ipbinary(16); DECLARE rand_net_lenINT; REPEAT SET x = x + 1; # Generate a random ip address SETip= substring(unhex(sha(RAND())), 1, 16); # Generate a random network length between 64 and 123 SETrand_net_len= 64 + FLOOR(RAND() * 60); INSERTINTOcidr(ip_address, net_len) values (ip, rand_net_len); UNTIL x > no_ipsEND REPEAT; END // DELIMITER ; # Generate the IPs CALLgenerate_ips(100000);

Ifyou want to insert the values from a human-readable string, MySQL provides the INET6_ATON("2001:0db8:85a3:0000:0000:8a2e:0370:7334") function which will strip the colons (“:”)and convertthe 32 characters to a binary(16) string.

Next Iwill select the first 3 IP address. In order to do thatI will use the INET6_NTOA function, which will convert from the binary format to the human-readable format.

mysql> SELECTid, INET6_NTOA(ip_address), net_lenFROMcidrLIMIT 3; +----+-----------------------------------------+---------+ | id | INET6_NTOA(ip_address)| net_len | +----+-----------------------------------------+---------+ |1 | 200d:31c4:1905:9eb2:3c7f:c45c:de78:42cd |97 | |2 | 59b0:c4d6:48b4:3717:f031:d05b:705d:6c65 |95 | |3 | 788e:3f48:e62b:c3bb:da10:6a03:f987:7a16 |110 | +----+-----------------------------------------+---------+ 3 rowsin set (0,01 sec)

Next, I would like to select the network mask, host mask, network address and also generate the network range intervals. For this I will create a few helper functions:

# Returns the net mask based on the network length DELIMITER // CREATEFUNCTION net_mask(net_lenint) RETURNSbinary(16) DETERMINISTIC BEGIN RETURN (~INET6_ATON('::') << (128 - net_len)); END // # Returns the network address using an IP and the network length CREATEFUNCTION subnet(ipBINARY(16), net_lenint) RETURNSbinary(16) DETERMINISTIC BEGIN RETURN ip & ((~INET6_ATON('::') << (128 - net_len))); END // # Returns the host mask CREATEFUNCTION host_mask(net_lenint) RETURNSbinary(16) DETERMINISTIC BEGIN RETURN (~INET6_ATON('::') >> net_len); END // DELIMITER ;

Facebook MyRocks at MariaDB的更多相关文章

MySQL与MariaDB核心特性比较详细版v1.0（覆盖mysql 8.0/mariadb 10.3，包括优化、功能及维护）
注:本文严禁任何形式的转载,原文使用word编写,为了大家阅读方便,提供pdf版下载. MySQL与MariaDB主要特性比较详细版v1.0(不含HA).pdf 链接:https://pan.baid ...
【巨杉数据库SequoiaDB】巨杉Tech | 巨杉数据库的并发 malloc 实现
本文由巨杉数据库北美实验室资深数据库架构师撰写,主要介绍巨杉数据库的并发malloc实现与架构设计.原文为英文撰写,我们提供了中文译本在英文之后. SequoiaDB Concurrent mallo ...
数据库对比：选择MariaDB还是MySQL？
作者 | EverSQL 译者 | 无明这篇文章的目的主要是比较 MySQL 和 MariaDB 之间的主要相似点和不同点.我们将从性能.安全性和主要功能方面对这两个数据库展开对比,并列出在选择数据 ...
myrocks复制中断问题排查
背景 mysql可以支持多种不同的存储引擎,innodb由于其高效的读写性能,并且支持事务特性,使得它成为mysql存储引擎的代名词,使用非常广泛.随着SSD逐渐普及,硬件存储成本越来越高,面向写优化 ...
AliOS编译安装MyRocks
MyRocks是facabook版将自主研发的MySQL分支,其源码位于为:https://github.com/facebook/mysql-5.6/ 首先需要安装以下: sudo yum inst ...
MyRocks简介
RocksDB是facebook基于LevelDB实现的,目前为facebook内部大量业务提供服务.经过facebook大量工作,将RocksDB为MySQL的一个存储引擎移植到MySQL,称之为M ...
初识MariaDB存储引擎
在看MariaDB的存储引擎之前,可以先了解MySQL存储引擎. MySQL常用的存储引擎: MyISAM存储引擎:是MySQL的默认存储引擎.MyISAM不支持事务.也不支持外键,但其访问速度快,对 ...
【转】哦，mysql 的其它发行版本Percona， mariadb
原文:http://geek.csdn.net/news/detail/130146 2016年11月25日,沃趣科技"智慧应用数据先行"2016产品发布会暨新三板挂牌庆祝会在杭 ...
MyRocks DDL原理
最近一个日常实例在做DDL过程中,直接把数据库给干趴下了,问题还是比较严重的,于是赶紧排查问题,撸了下crash堆栈和alert日志,发现是在去除唯一约束的场景下,MyRocks存在一个严重的bug, ...

随机推荐

AutoMappeer自动映射
AutoMapper是用来解决对象之间映射转换的类库.对于我们开发人员来说,写对象之间互相转换的代码是一件极其浪费生命的事情,AutoMapper能够帮助我们节省不少时间.
shellinabox安装
Shell In A Box(发音是shellinabox)是一款基于Web的终端模仿器,由Markus Gutschke开辟而成.它有内置的Web办事器,在指定的端口上作为一个基于Web的SSH客户 ...
SWIFT Button的基本用法
import UIKit @UIApplicationMain class AppDelegate: UIResponder, UIApplicationDelegate { var window: ...
iOS UICollectionView之二（垂直滚动）
#import <UIKit/UIKit.h> @interface AppDelegate : UIResponder <UIApplicationDelegate> @pr ...
虚拟化技术比较 PV HVM
很多人看到同样配置的VPS价格相差很大,甚是不理解,其实VPS使用的虚拟技术种类有很多,如OpenVZ.Xen.KVM.Xen和HVM与PV.在XEN中pv是半虚拟化,hvm是全虚拟化,pv只能用于L ...
ajax get和post请求后端接收并返回数据
get请求$(function(){ //alert("23"); var x = "#page"; var y = "${ctx!}/static/ ...
Java基础之访问文件与目录——列出目录内容（ListDirectoryContents）
控制台程序,列出目录的全部内容并使用过滤器来选择特定的条目. import java.nio.file.*; import java.io.IOException; public class List ...
java collections读书笔记（10) Set
aaarticlea/png;base64,iVBORw0KGgoAAAANSUhEUgAAAVgAAADbCAIAAACnXR7VAAAgAElEQVR4nOx9d1hVV9Y3880zb2YmM3 ...
CSS 中文字体的英文名称
宋体 SimSun 黑体 SimHei 微软雅黑 Microsoft YaHei 微软正黑体 Microsoft JhengHei 新宋体 NSimSun 新细明体 PMingLiU 细明体 Ming ...
Lintcode: Segment Tree Build
The structure of Segment Tree is a binary tree which each node has two attributes start and end deno ...

Facebook MyRocks at MariaDB

Facebook MyRocks at MariaDB的更多相关文章

随机推荐

热门专题