MySQL binlog 组提交与 XA(两阶段提交)
InnoDB storage engine. The MySQL XA implementation is based on the X/Open CAE document Distributed Transaction Processing: The XA Specification.
Currently, among the MySQL Connectors, MySQL Connector/J 5.0.0 and higher supports XA directly, by means of a class interface that handles the XA SQL statement interface for you.
XA supports distributed transactions, that is, the ability to permit multiple separate transactional resources to participate in a global transaction. Transactional resources often are RDBMSs but may be other kinds of resources.
A global transaction involves several actions that are transactional in themselves, but that all must either complete successfully as a group, or all be rolled back as a group. In essence, this extends ACID properties “up a level” so that multiple ACID transactions can be executed in concert as components of a global operation that also has ACID properties. (However, for a distributed transaction, you must use the SERIALIZABLE isolation level to achieve ACID properties. It is enough to use REPEATABLE READ for a nondistributed transaction, but not for a distributed transaction.)
最重要的一点:使用MySQL中的XA实现分布式事务时必须使用serializable隔离级别。
The MySQL implementation of XA MySQL enables a MySQL server to act as a Resource Manager that handles XA transactions within a global transaction. A client program that connects to the MySQL server acts as the Transaction Manager.
The process for executing a global transaction uses two-phase commit (2PC). This takes place after the actions performed by the branches of the global transaction have been executed.
In the first phase, all branches are prepared. That is, they are told by the TM to get ready to commit. Typically, this means each RM that manages a branch records the actions for the branch in stable storage. The branches indicate whether they are able to do this, and these results are used for the second phase.
In the second phase, the TM tells the RMs whether to commit or roll back. If all branches indicated when they were prepared that they will be able to commit, all branches are told to commit. If any branch indicated when it was prepared that it will not be able to commit, all branches are told to roll back.
第一阶段:为prepare阶段,TM向RM发出prepare指令,RM进行操作,然后返回成功与否的信息给TM;
第二阶段:为事务提交或者回滚阶段,如果TM收到所有RM的成功消息,则TM向RM发出提交指令;不然则发出回滚指令;
XA transaction support is limited to the InnoDB storage engine.(只有innodb支持XA分布式事务)
For "external XA" a MySQL server acts as a Resource Manager and client programs act as Transaction Managers. For "Internal XA", storage engines within a MySQL server act as RMs, and the server itself acts as a TM. Internal XA support is limited by the capabilities of individual storage engines. Internal XA is required for handling XA transactions that involve more than one storage engine. The implementation of internal XA requires that a storage engine support two-phase commit at the table handler level, and currently this is true only for InnoDB.
MySQL中的XA实现分为:外部XA和内部XA;前者是指我们通常意义上的分布式事务实现;后者是指单台MySQL服务器中,Server层作为TM(事务协调者),而服务器中的多个数据库实例作为RM,而进行的一种分布式事务,也就是MySQL跨库事务;也就是一个事务涉及到同一条MySQL服务器中的两个innodb数据库(因为其它引擎不支持XA)。
3. 内部XA的额外功能
XA 将事务的提交分为两个阶段,而这种实现,解决了 binlog 和 redo log的一致性问题,这就是MySQL内部XA的第三种功能。
MySQL为了兼容其它非事物引擎的复制,在server层面引入了 binlog, 它可以记录所有引擎中的修改操作,因而可以对所有的引擎使用复制功能;MySQL在4.x 的时候放弃redo的复制策略而引入binlog的复制(淘宝丁奇)。
2> InnoDB维持了状态为Prepare的事务链表,将这些事务的xid和Binlog中记录的xid做比较,如果在Binlog中存在,则提交,否则回滚事务。
将Binlog Group Commit的过程拆分成了三个阶段:
1> flush stage 将各个线程的binlog从cache写到文件中;
2> sync stage 对binlog做fsync操作(如果需要的话;最重要的就是这一步,对多个线程的binlog合并写入磁盘);
3> commit stage 为各个线程做引擎层的事务commit(这里不用写redo log,在prepare阶段已写)。每个stage同时只有一个线程在操作。(分成三个阶段,每个阶段的任务分配给一个专门的线程,这是典型的并发优化)
这种实现的优势在于三个阶段可以并发执行,从而提升效率。注意prepare阶段没有变,还是write/sync redo log.
淘宝对binlog group commit进行了进一步的优化,其原理如下:
从XA恢复的逻辑我们可以知道,只要保证InnoDB Prepare的redo日志在写Binlog前完成write/sync即可。因此我们对Group Commit的第一个stage的逻辑做了些许修改,大概描述如下:
Step1. InnoDB Prepare,记录当前的LSN到thd中;
Step2. 进入Group Commit的flush stage;Leader搜集队列,同时算出队列中最大的LSN。
Step3. 将InnoDB的redo log write/fsync到指定的LSN (注:这一步就是redo log的组写入。因为小于等于LSN的redo log被一次性写入到ib_logfile[0|1])
Step4. 写Binlog并进行随后的工作(sync Binlog, InnoDB commit , etc)
也就是将 redo log的write/sync延迟到了 binlog group commit的 flush stage 之后,sync binlog之前。
通过延迟写redo log的方式,显式的为redo log做了一次组写入(redo log group write),并减少了(redo log) log_sys->mutex的竞争。
也就是将 binlog group commit 对应的redo log也进行了 group write. 这样binlog 和 redo log都进行了优化。
官方MySQL在5.7.6的代码中引入了淘宝的优化,对应的Release Note如下:
When using InnoDB with binary logging enabled, concurrent transactions written in the InnoDB redo log are now grouped together before synchronizing to disk when innodb_flush_log_at_trx_commit is set to 1, which reduces the amount of synchronization operations. This can lead to improved performance.
5. XA参数 innodb_support_xa
| Command-Line Format | --innodb_support_xa |
||
| System Variable | Name | innodb_support_xa |
|
| Variable Scope | Global, Session | ||
| Dynamic Variable | Yes | ||
| Permitted Values | Type | boolean |
|
| Default | TRUE |
||
Enables InnoDB support for two-phase commit(2PC) in XA transactions, causing an extra disk flush for transaction preparation. This setting is the default. The XA mechanism is used internally and is essential for any server that has its binary log turned on and is accepting changes to its data from more than one thread. If you turn it off, transactions can be written to the binary log in a different order from the one in which the live database is committing them. This can produce different data when the binary log is replayed in disaster recovery or on a replication slave. Do not turn it off on a replication master server unless you have an unusual setup where only one thread is able to change data.
For a server that is accepting data changes from only one thread, it is safe and recommended to turn off this option to improve performance forInnoDB tables. For example, you can turn it off on replication slaves where only the replication SQL thread is changing data.
You can also turn off this option if you do not need it for safe binary logging or replication, and you also do not use an external XA transaction manager.
参数innodb_support_xa默认为true,表示启用XA,虽然它会导致一次额外的磁盘flush(prepare阶段flush redo log). 但是我们必须启用,而不能关闭它。因为关闭会导致binlog写入的顺序和实际的事务提交顺序不一致,会导致崩溃恢复和slave复制时发生数据错误。如果启用了log-bin参数,并且不止一个线程对数据库进行修改,那么就必须启用innodb_support_xa参数。
参考:
1. http://www.csdn.net/article/2015-01-16/2823591 (淘宝丁奇:怎么跳出MySQL的10个大坑)
2. http://dev.mysql.com/doc/refman/5.6/en/innodb-parameters.html#sysvar_innodb_support_xa
http://dev.mysql.com/doc/refman/5.6/en/xa.html
http://dev.mysql.com/doc/refman/5.6/en/xa-restrictions.html
MySQL binlog 组提交与 XA(两阶段提交)的更多相关文章
- MySQL binlog 组提交与 XA(两阶段提交)--1
参考了网上几篇比较靠谱的文章 http://www.linuxidc.com/Linux/2015-11/124942.htm http://blog.csdn.net/woqutechteam/ar ...
- 全网最牛X的!!! MySQL两阶段提交串讲
目录 一.吹个牛 二.事务及它的特性 三.简单看下两阶段提交的流程 四.两阶段写日志用意? 五.加餐:sync_binlog = 1 问题 六.如何判断binlog和redolog是否达成了一致 七. ...
- 聊一聊 MySQL 中的数据编辑过程中涉及的两阶段提交
MySQL 数据库中的两阶段提交,不知道您知道不?这篇文章就简单的聊一聊 MySQL 数据库中的两阶段提交,两阶段提交发生在数据变更期间(更新.删除.新增等),两阶段提交过程中涉及到了 MySQL 数 ...
- count(*)实现原理+两阶段提交总结
count(*)实现原理 不同引擎的实现: MyISAM引擎把表的总行数存在了磁盘上,执行COUNT(*)就会直接返回,效率很高: InnoDB在count(*)时,需要把数据一行一行的从引擎里面取出 ...
- 分布式事务(一)两阶段提交及JTA
原创文章,同步发自作者个人博客 http://www.jasongj.com/big_data/two_phase_commit/ 分布式事务 分布式事务简介 分布式事务是指会涉及到操作多个数据库(或 ...
- 两阶段提交及JTA
两阶段提交及JTA 分布式事务 分布式事务简介 分布式事务是指会涉及到操作多个数据库(或者提供事务语义的系统,如JMS)的事务.其实就是将对同一数据库事务的概念扩大到了对多个数据库的事务.目的是为了保 ...
- 分布式:分布式事务(CAP、两阶段提交、三阶段提交)
1 关于分布式系统 1.1 介绍 我们常见的单体结构的集中式系统,一般整个项目就是一个独立的应用,所有的模块都聚合在一起.明显的弊端就是不易扩展.发布冗重.服务治理不好做. 所以我们把整个系统拆分成若 ...
- MySQL binlog 组提交与 XA(分布式事务、两阶段提交)【转】
概念: XA(分布式事务)规范主要定义了(全局)事务管理器(TM: Transaction Manager)和(局部)资源管理器(RM: Resource Manager)之间的接口.XA为了实现分布 ...
- 使用golang理解mysql的两阶段提交
使用golang理解mysql的两阶段提交 文章源于一个问题:如果我们现在有两个mysql实例,在我们要尽量简单地完成分布式事务,怎么处理? 场景重现 比如我们现在有两个数据库,mysql3306和m ...
随机推荐
- Apache+Mod_Python配置
我其实不是个适合做编程的人,因为喜欢折腾,不喜欢日复一日的重复同样的事情.感觉挺适合做网管(运维)的. 经常在摆弄一些小众的程序员不怎么会关心的东西,不走寻常路.有时也挺纠结的,折腾这些东西的过程中, ...
- Ibatis中常见错误解决方案
在Ibatis 的sqlMap或者sqlMapConfig配置文件中如果出现以下错误信息: Referenced file contains errors (http://www.ibatis.com ...
- 习题:codevs 1035 火车停留解题报告
本蒟蒻又来写解题报告了.这次的题目是codevs 1035 火车停留. 题目大意就是给m个火车的到达时间.停留时间和车载货物的价值,车站有n个车道,而火车停留一次车站就会从车载货物价值中获得1%的利润 ...
- php ci 2.0框架 ORM
很早知道ci出了2.0版本了.这几天正好有项目要用ci开发 虽然项目不大.不过也从开发项目的过程中熟悉了CI框架 因为是个电商项目 本来想用个YII2 的. 封装的虽然厉害不过功能强大 因为另个兄弟坚 ...
- Linux更改主机名-适用于abuntu
今天复制了个ubuntu虚拟机,于是想更改下主机名以作区别.这是搜到的比较完整的资料,适用abuntu,不过其他linux系统还有待求证. 1.查看主机名 在Ubuntu系统中,快速查看主机名有多种方 ...
- ASP.NET根据URL生成网页缩略图示例程序(C#语言)
工作中可能马上要用到根据URL生成网页缩略图功能,提前做好准备. 在网上找了份源码,但是有错误:当前线程不在单线程单元中,因此无法实例化 ActiveX 控件“8856f961-340a-11d0-a ...
- Sequence.js 实现带有视差滚动特效的图片滑块
Sequence.js 功能齐全,除了能实现之前分享过的现代的图片滑动效果,还可以融合当前非常流行的视差滚动(Parallax Scrolling)效果.让多层背景以不同的速度移动,形成立体的运动效果 ...
- 高性能javascript学习笔记系列(3) -DOM编程
参考 高性能javascript 文档对象模型(DOM)是独立于语言的,用于操作XML和HTML文档的程序接口API,在浏览器中主要通过DOM提供的API与HTML进行交互,浏览器通常会把DOM和ja ...
- [js开源组件开发]ajax分页组件
ajax分页组件 我以平均每一周出一个开源的js组件为目标行动着,虽然每个组件并不是很庞大,它只完成某一个较小部分的工作,但相信,只要有付出,总会得到回报的.这个组件主要完成分页的工作. 这张图里显示 ...
- 从零开始,做一个NodeJS博客(二):实现首页-加载文章列表和详情
标签: NodeJS 0 这个伪系列的第二篇,不过和之前的几篇是同一天写的.三分钟热度貌似还没过. 1 静态资源代理 上一篇,我们是通过判断请求的路径来直接返回结果的.简单粗暴,缺点明显:如果url后 ...