It is everywhere in the world of MySQL that if your replication is broken because an event caused a duplicate key or a row was not found and it cannot be updated or deleted, then you can use ‘STOP SLAVE; SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1; START SLAVE; ‘ and be done with it. In some cases this is fine and you can repair the offending row or statements later on. But what if the statement is part of a multi-statement transaction? Well, then it becomes more interesting, because skipping the offending statement will cause the whole transaction to be skipped. This is well documented in the manual by the way. So here’s a quick example.

3 rows on the master:

 
 
 
 
 

Shell

 
1
2
3
4
5
6
7
8
9
master> select * from t;
+----+-----+
| id | pid |
+----+-----+
|  1 |   1 |
|  2 |   2 |
|  3 |   3 |
+----+-----+
3 rows in set (0.00 sec)

2 on the slave:

 
 
 
 
 

Shell

 
1
2
3
4
5
6
7
8
slave> select * from t;
+----+-----+
| id | pid |
+----+-----+
|  1 |   1 |
|  3 |   3 |
+----+-----+
2 rows in set (0.00 sec)

Execute a transaction on the master to break replication:

 
 
 
 
 

Shell

 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
master> BEGIN;
Query OK, 0 rows affected (0.00 sec)
 
master> DELETE FROM t WHERE id = 1;
Query OK, 1 row affected (0.00 sec)
 
master> DELETE FROM t WHERE id = 2;
Query OK, 1 row affected (0.00 sec)
 
master> DELETE FROM t WHERE id = 3;
Query OK, 1 row affected (0.00 sec)
 
master> COMMIT;
Query OK, 0 rows affected (0.01 sec)

Broken slave:

 
 
 
 
 
 

Shell

 
1
2
3
4
5
6
7
slave> show slave status G
*************************** 1. row ***************************
...
Last_SQL_Errno: 1032
Last_SQL_Error: Could not execute Delete_rows event on table test.t; Can't find record in 't', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000002, end_log_pos 333
...
1 row in set (0.00 sec)

An attempt to fix replication only caused bigger inconsistencies on slave:

 
 
 
 
 

Shell

 
1
2
3
4
5
6
7
8
9
10
11
slave> STOP SLAVE; SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1; START SLAVE;
Query OK, 0 rows affected (0.00 sec)
 
slave> select * from t;
+----+-----+
| id | pid |
+----+-----+
|  1 |   1 |
|  3 |   3 |
+----+-----+
2 rows in set (0.00 sec)

This happens because the replication honors transaction boundaries, and is definitely something you should consider the next time you try to use this workaround on a broken slave. Of course, there is pt-table-checksum and pt-table-sync to rescue you when inconsistencies occur, however, prevention is always better than cure. Make sure to put safeguards in place to prevent your slaves from drifting.

Lastly, the example above is for ROW-based replication as my colleague pointed out, but can similarly happen with STATEMENT for example with a duplicate key error.  You can optionally fix the error above by temporarily setting slave_exec_mode to IDEMPOTENT so errors because of missing rows are skipped, but then again, it does not apply in all cases like an UPDATE statement that cannot be applied because the row on the slave is missing.

Here is a demonstration of the problem with STATEMENT-based replication:

 
 
 
 
 
 

Shell

 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
master> select * from t;
+----+-----+
| id | pid |
+----+-----+
|  4 |   1 |
|  6 |   3 |
+----+-----+
2 rows in set (0.00 sec)
 
slave> select * from t;
+----+-----+
| id | pid |
+----+-----+
|  4 |   1 |
|  5 |   2 |
|  6 |   3 |
+----+-----+
3 rows in set (0.00 sec)
 
master> BEGIN;
Query OK, 0 rows affected (0.00 sec)
 
master> delete from t where id = 4;
Query OK, 1 row affected (0.00 sec)
 
master> insert into t values (5,2);
Query OK, 1 row affected (0.00 sec)
 
master> delete from t where id = 6;
Query OK, 1 row affected (0.00 sec)
 
master> COMMIT;
Query OK, 0 rows affected (0.15 sec)
 
slave> show slave status G
*************************** 1. row ***************************
...
               Last_SQL_Errno: 1062    
               Last_SQL_Error: Error 'Duplicate entry '5' for key 'PRIMARY'' on query. Default database: 'test'. Query: 'insert into t values (5,2)'
...
1 row in set (0.00 sec)
 
slave> stop slave; set global sql_slave_skip_counter = 1; start slave;
Query OK, 0 rows affected (0.05 sec)
 
slave> select * from t;
+----+-----+
| id | pid |
+----+-----+
|  4 |   1 |
|  5 |   2 |
|  6 |   3 |
+----+-----+
3 rows in set (0.00 sec)

https://www.percona.com/blog/2013/07/23/another-reason-why-sql_slave_skip_counter-is-bad-in-mysql/

Another reason why SQL_SLAVE_SKIP_COUNTER is bad in MySQL的更多相关文章

  1. 查看mysql主从配置的状态及修正 slave不启动问题

    1.查看master的状态 mysql> show master status;  //Position不应该为0 mysql> show processlist;  //state状态应 ...

  2. MySQL Replication 优化和技巧、常见故障解决方法

    MySQL 主从同步错误(error)解决(转) sql_slave_skip_counter参数 附: 一些错误信息的处理,主从服务器上的命令,及状态信息. 在从服务器上使用show slave s ...

  3. mysql主从数据库复制

    http://blog.csdn.net/lgh1117/article/details/8786274 http://blog.csdn.net/libraworm/article/details/ ...

  4. MySQL show slave status命令参数

    ? Slave_IO_State SHOW PROCESSLIST输出的State字段的拷贝.SHOW PROCESSLIST用于从属I/O线程.如果线程正在试图连接到主服务器,正在等待来自主服务器的 ...

  5. canal mysql slave

    [mysqld] log-bin=mysql-bin #添加这一行就ok binlog-format=ROW #选择row模式 server_id=1 #配置mysql replaction需要定义, ...

  6. Mysql主从同步问题汇总

    data-1-1主机是master,data-1-2是slave Last_IO_Errno: 1236 slave查看show slave status\G; 显示Last_IO_Errno: 12 ...

  7. Mysql主从---删除master.info和relya-log.info实验

    relay-log.info, master.info 这连个文件时在建立复制时产生的,现在主要说明以下问题: 1.如果修改删除master.info文件,复制会中断么? 不会,如果stop slav ...

  8. MySQL面试宝典

    ==============================================# 参数==============================================auto ...

  9. Mysql 主从主主复制总结(详细)

    环境:Centos 6.9,Mysql 8.0 配置准备:1.为Centos配置好网络,使用远程工具连接. 2.安装mysql,途径不限.mysql8.0和5.7区别不大,8.0在配置主从的时候默认开 ...

随机推荐

  1. VS和Eclipse的调试功能哪个更强大?

    以前一直用VS 2012来调试C/C++代码,F5.F10.F11用起来甚是顺手,前面也写过一篇关于VS最好用的快捷键:Visual Studio最好用的快捷键(你最喜欢哪个), 所以对于调试C/C+ ...

  2. NHibernate 有好几种数据库查询方式

    NHibernate 有好几种数据库查询方式 1.原生SQL var employeeQuery = Database.Session .CreateSQLQuery("select * f ...

  3. Hibernate杂问

    1 谈谈你对ORM框架的基本思想的了解? 首先 ORM是 对象关系映射,是为了解决类似于JDBC实现对象持久化的问题开发的. 框架的基本特征:完成面向对象的编程语言到关系数据库之间的映射. 他的映射分 ...

  4. Windows Mobile设备操作演示准备工作小记

    公司最近为PDA开发了一款作业程序,我在工作中常常需要将操作过程通过电脑上设影出来为客户讲解使用方法.本文记录了相关的准备工作. 1. 微软嵌入式操作系统体系 RTOS: Embedded Real ...

  5. 扒一扒HTTPS网站的内幕

    215年6月,维基媒体基金会发布公告,旗下所有网站将默认开启HTTPS,这些网站中最为人所知的当然是全球最大的在线百科-维基百科.而更早时候的3月,百度已经发布公告,百度全站默认开启HTTPS.淘宝也 ...

  6. 状态压缩·一(状态压缩DP)

    描述 小Hi和小Ho在兑换到了喜欢的奖品之后,便继续起了他们的美国之行,思来想去,他们决定乘坐火车前往下一座城市——那座城市即将举行美食节! 但是不幸的是,小Hi和小Ho并没有能够买到很好的火车票—— ...

  7. MVC应用程序实现上传文件(续)

    前几天,有练习了<MVC应用程序实现上传文件>http://www.cnblogs.com/insus/p/3590907.html 那只是把文档上传至MVC应用程序下的某一目录之中. 其 ...

  8. Unity3D学习笔记第一课

    第一课程:1.Unity类名必须与文件名保持一致2.讲属性设置为public可以在Unity中访问 public float speed; // Use this for initialization ...

  9. Maven为不同环境配置打包

    在开发过程中经常要遇到为不同的环境打包,这里面最主要的问题在于,不同环境的配置是不一样的,如果为不同环境打包每次都手工修改配置,那不但工作量大,而且很容易出错.如果用ant的话,用变量加上replac ...

  10. 【Java基础】5、java中的匿名内部类

    匿名内部类也就是没有名字的内部类 正因为没有名字,所以匿名内部类只能使用一次,它通常用来简化代码编写 但使用匿名内部类还有个前提条件:必须继承一个父类或实现一个接口 实例1:使用匿名内部类来实现抽象方 ...