It is everywhere in the world of MySQL that if your replication is broken because an event caused a duplicate key or a row was not found and it cannot be updated or deleted, then you can use ‘STOP SLAVE; SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1; START SLAVE; ‘ and be done with it. In some cases this is fine and you can repair the offending row or statements later on. But what if the statement is part of a multi-statement transaction? Well, then it becomes more interesting, because skipping the offending statement will cause the whole transaction to be skipped. This is well documented in the manual by the way. So here’s a quick example.

3 rows on the master:

 
 
 
 
 

Shell

 
1
2
3
4
5
6
7
8
9
master> select * from t;
+----+-----+
| id | pid |
+----+-----+
|  1 |   1 |
|  2 |   2 |
|  3 |   3 |
+----+-----+
3 rows in set (0.00 sec)

2 on the slave:

 
 
 
 
 

Shell

 
1
2
3
4
5
6
7
8
slave> select * from t;
+----+-----+
| id | pid |
+----+-----+
|  1 |   1 |
|  3 |   3 |
+----+-----+
2 rows in set (0.00 sec)

Execute a transaction on the master to break replication:

 
 
 
 
 

Shell

 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
master> BEGIN;
Query OK, 0 rows affected (0.00 sec)
 
master> DELETE FROM t WHERE id = 1;
Query OK, 1 row affected (0.00 sec)
 
master> DELETE FROM t WHERE id = 2;
Query OK, 1 row affected (0.00 sec)
 
master> DELETE FROM t WHERE id = 3;
Query OK, 1 row affected (0.00 sec)
 
master> COMMIT;
Query OK, 0 rows affected (0.01 sec)

Broken slave:

 
 
 
 
 
 

Shell

 
1
2
3
4
5
6
7
slave> show slave status G
*************************** 1. row ***************************
...
Last_SQL_Errno: 1032
Last_SQL_Error: Could not execute Delete_rows event on table test.t; Can't find record in 't', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000002, end_log_pos 333
...
1 row in set (0.00 sec)

An attempt to fix replication only caused bigger inconsistencies on slave:

 
 
 
 
 

Shell

 
1
2
3
4
5
6
7
8
9
10
11
slave> STOP SLAVE; SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1; START SLAVE;
Query OK, 0 rows affected (0.00 sec)
 
slave> select * from t;
+----+-----+
| id | pid |
+----+-----+
|  1 |   1 |
|  3 |   3 |
+----+-----+
2 rows in set (0.00 sec)

This happens because the replication honors transaction boundaries, and is definitely something you should consider the next time you try to use this workaround on a broken slave. Of course, there is pt-table-checksum and pt-table-sync to rescue you when inconsistencies occur, however, prevention is always better than cure. Make sure to put safeguards in place to prevent your slaves from drifting.

Lastly, the example above is for ROW-based replication as my colleague pointed out, but can similarly happen with STATEMENT for example with a duplicate key error.  You can optionally fix the error above by temporarily setting slave_exec_mode to IDEMPOTENT so errors because of missing rows are skipped, but then again, it does not apply in all cases like an UPDATE statement that cannot be applied because the row on the slave is missing.

Here is a demonstration of the problem with STATEMENT-based replication:

 
 
 
 
 
 

Shell

 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
master> select * from t;
+----+-----+
| id | pid |
+----+-----+
|  4 |   1 |
|  6 |   3 |
+----+-----+
2 rows in set (0.00 sec)
 
slave> select * from t;
+----+-----+
| id | pid |
+----+-----+
|  4 |   1 |
|  5 |   2 |
|  6 |   3 |
+----+-----+
3 rows in set (0.00 sec)
 
master> BEGIN;
Query OK, 0 rows affected (0.00 sec)
 
master> delete from t where id = 4;
Query OK, 1 row affected (0.00 sec)
 
master> insert into t values (5,2);
Query OK, 1 row affected (0.00 sec)
 
master> delete from t where id = 6;
Query OK, 1 row affected (0.00 sec)
 
master> COMMIT;
Query OK, 0 rows affected (0.15 sec)
 
slave> show slave status G
*************************** 1. row ***************************
...
               Last_SQL_Errno: 1062    
               Last_SQL_Error: Error 'Duplicate entry '5' for key 'PRIMARY'' on query. Default database: 'test'. Query: 'insert into t values (5,2)'
...
1 row in set (0.00 sec)
 
slave> stop slave; set global sql_slave_skip_counter = 1; start slave;
Query OK, 0 rows affected (0.05 sec)
 
slave> select * from t;
+----+-----+
| id | pid |
+----+-----+
|  4 |   1 |
|  5 |   2 |
|  6 |   3 |
+----+-----+
3 rows in set (0.00 sec)

https://www.percona.com/blog/2013/07/23/another-reason-why-sql_slave_skip_counter-is-bad-in-mysql/

Another reason why SQL_SLAVE_SKIP_COUNTER is bad in MySQL的更多相关文章

  1. 查看mysql主从配置的状态及修正 slave不启动问题

    1.查看master的状态 mysql> show master status;  //Position不应该为0 mysql> show processlist;  //state状态应 ...

  2. MySQL Replication 优化和技巧、常见故障解决方法

    MySQL 主从同步错误(error)解决(转) sql_slave_skip_counter参数 附: 一些错误信息的处理,主从服务器上的命令,及状态信息. 在从服务器上使用show slave s ...

  3. mysql主从数据库复制

    http://blog.csdn.net/lgh1117/article/details/8786274 http://blog.csdn.net/libraworm/article/details/ ...

  4. MySQL show slave status命令参数

    ? Slave_IO_State SHOW PROCESSLIST输出的State字段的拷贝.SHOW PROCESSLIST用于从属I/O线程.如果线程正在试图连接到主服务器,正在等待来自主服务器的 ...

  5. canal mysql slave

    [mysqld] log-bin=mysql-bin #添加这一行就ok binlog-format=ROW #选择row模式 server_id=1 #配置mysql replaction需要定义, ...

  6. Mysql主从同步问题汇总

    data-1-1主机是master,data-1-2是slave Last_IO_Errno: 1236 slave查看show slave status\G; 显示Last_IO_Errno: 12 ...

  7. Mysql主从---删除master.info和relya-log.info实验

    relay-log.info, master.info 这连个文件时在建立复制时产生的,现在主要说明以下问题: 1.如果修改删除master.info文件,复制会中断么? 不会,如果stop slav ...

  8. MySQL面试宝典

    ==============================================# 参数==============================================auto ...

  9. Mysql 主从主主复制总结(详细)

    环境:Centos 6.9,Mysql 8.0 配置准备:1.为Centos配置好网络,使用远程工具连接. 2.安装mysql,途径不限.mysql8.0和5.7区别不大,8.0在配置主从的时候默认开 ...

随机推荐

  1. 使用Nagios打造专业的业务状态监控

    想必各个公司都有部署zabbix之类的监控系统来监控服务器的资源使用情况.各服务的运行状态,是否这种监控就足够了呢?有没有遇到监控系统一切正常确发现项目无法正常对外提供服务的情况呢?本篇文章聊聊我们如 ...

  2. Nginx图片防盗链【实战】

    访问我的博客 前言 博主目前在一家原创小说网站公司工作,由于站内的作品全部是原创,于是乎不可避免地会被一些盗版网站爬取盗版,对于防盗版一直没有很好的对策,让公司很是苦恼. 最近去一些盗版网站上搜索我们 ...

  3. SpringCloud入门之eclipse新建maven子项目和聚合项目

    一.new maven project :  next 二.勾选 create a simple project  :  next 三.Group Id:项目的包路径 如com.test,之后创建的C ...

  4. 【详解】GrantedAuthority(已授予的权限)

    前言 这篇是很久之前学习Spring Security整理的博客,发现浏览量都1000多了,一个赞都没有,那说明写得确实不怎么样,哈哈.应该很多初学者对这个接口存在疑问,特别是如果学习这个框架之前还了 ...

  5. B 树、B+ 树、B* 树

    B 树.B+ 树.B* 树 作者:July.weedge.Frankie.编程艺术室出品. 说明:本文从B树开始谈起,然后论述B+树.B*树,最后谈到R 树.其中B树.B+树及B*树部分由weedge ...

  6. Font Awesome(一套很棒的图标库)

    Font Awesome 是一个非常方便的图标库.这些图标都是矢量图形,被保存在 .svg 的文件格式中.这些图标就和字体一样,你可以通过像素单位指定它们的大小,它们将会继承其父HTML元素的字体大小 ...

  7. Git Windows客户端保存用户名和密码

    解决Git Windows客户端保存用户名和密码的方法,至于为什么,就不想说了. 1. 添加一个HOME环境变量,值为%USERPROFILE% 2. 开始菜单中,点击“运行”,输入“%Home%”并 ...

  8. windows开机提示文件损坏

    今早按部就班的开机,然后准备吃热干面,很多时候事情都是同步进行的... 然后眼前出现这样一个界面 心情果断灰暗下来,按照提示一步步操作,点enter进入高级选项,试过了安全模式启动.最后一次正确配置启 ...

  9. BestCoder Round #29——A--GTY's math problem(快速幂(对数法))、B--GTY's birthday gift(矩阵快速幂)

    GTY's math problem Time Limit: 1000/1000 MS (Java/Others)    Memory Limit: 65536/65536 K (Java/Other ...

  10. Factorial Problem in Base K(zoj3621)

    Factorial Problem in Base K Time Limit: 2 Seconds Memory Limit: 65536 KB How many zeros are there in ...