Preface
 
    In my last test of pt-heartbeat, both the master and the slave ran out of disk space, and the mysql client hung. To resolve the issue, I tried to repair the replication environment without using mysqldump to rebuild the slave. Let's see the details.
 
Procedure
 
I dropped the test tables in the "sysbench" database to free disk space on the master.
 [root@zlm2 :: /data/mysql/mysql3306/logs]
#sysbench oltp_read_write.lua --mysql-host=192.168.1.101 --mysql-port= --mysql-user=zlm --mysql-password=zlmzlm --mysql-db=sysbench --tables= --table-size= --mysql-storage-engine=innodb cleanup
sysbench 1.0. (using bundled LuaJIT 2.1.-beta2)

Dropping table 'sbtest1'...

(zlm@192.168.1.101 )[(none)]>use sysbench;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
(zlm@192.168.1.101 )[sysbench]>show tables;
+--------------------+
| Tables_in_sysbench |
+--------------------+
| hb |
| sbtest2 |
| sbtest3 |
| sbtest4 |
| sbtest5 |
+--------------------+
rows in set (0.00 sec) //Only sbtest1 had been dropped, which was not enough.

[root@zlm2 :: ~/sysbench-1.0/src/lua]
#sysbench oltp_read_write.lua --mysql-host=192.168.1.101 --mysql-port= --mysql-user=zlm --mysql-password=zlmzlm --mysql-db=sysbench --tables= --table-size= --mysql-storage-engine=innodb cleanup
sysbench 1.0. (using bundled LuaJIT 2.1.-beta2)

Dropping table 'sbtest1'...
Dropping table 'sbtest2'...
Dropping table 'sbtest3'...
Dropping table 'sbtest4'...
Dropping table 'sbtest5'...

[root@zlm2 :: ~/sysbench-1.0/src/lua]
#df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/centos-root .4G .9G .5G % / //I now had 27% free space.
devtmpfs 488M 488M % /dev
tmpfs 497M 497M % /dev/shm
tmpfs 497M 6.6M 491M % /run
tmpfs 497M 497M % /sys/fs/cgroup
/dev/sda1 497M 118M 379M % /boot
none 87G 80G .6G % /vagrant

(zlm@192.168.1.101 )[(none)]>drop database sysbench;
Query OK, row affected (0.04 sec) //Furthermore, I dropped the "sysbench" database itself.

The slave was still hung and its disk was full.
 [root@zlm3 :: ~]
#df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/centos-root .4G .4G 20K % /
devtmpfs 488M 488M % /dev
tmpfs 497M 497M % /dev/shm
tmpfs 497M 6.5M 491M % /run
tmpfs 497M 497M % /sys/fs/cgroup
/dev/sda1 497M 118M 379M % /boot
none 87G 80G .6G % /vagrant

(zlm@192.168.1.102 )[(none)]>show slave status\G
^C^C -- query aborted
^Z
[]+ Stopped mysql

[root@zlm3 :: ~]
#pkill mysqld

[root@zlm3 :: ~]
#./mysqld.sh

[root@zlm3 :: ~]
#mysql
ERROR (HY000): Can't connect to MySQL server on '192.168.1.102' (111)

[root@zlm3 :: ~]
#cd /data/mysql/mysql3306/data

[root@zlm3 :: /data/mysql/mysql3306/data]
#cat error.log |tail -n
--19T08::02.581937+: [Note] InnoDB: Log scan progressed past the checkpoint lsn
--19T08::02.581958+: [Note] InnoDB: Doing recovery: scanned up to log sequence number
--19T08::02.581963+: [Note] InnoDB: Database was not shutdown normally!
--19T08::02.581965+: [Note] InnoDB: Starting crash recovery.
--19T08::02.696292+: [Note] InnoDB: Transaction was in the XA prepared state.
--19T08::02.700688+: [Note] InnoDB: Transaction was in the XA prepared state.
--19T08::02.700814+: [Note] InnoDB: transaction(s) which must be rolled back or cleaned up in total row operations to undo
--19T08::02.700821+: [Note] InnoDB: Trx id counter is
--19T08::02.701719+: [Note] InnoDB: Last MySQL binlog file position , file name mysql-bin.
--19T08::02.805965+: [Note] InnoDB: Ignoring tablespace `zlm`.`sbtest2` because the DISCARD flag is set .
--19T08::02.806462+: [Note] InnoDB: Creating shared tablespace for temporary tables
--19T08::02.807316+: [Note] InnoDB: Setting file './ibtmp1' size to MB. Physically writing the file full; Please wait ...
--19T08::02.807568+: [Note] InnoDB: Starting in background the rollback of uncommitted transactions
--19T08::02.807594+: [Note] InnoDB: Rollback of non-prepared transactions completed
--19T08::02.871396+: [Warning] InnoDB: Retry attempts for writing partial data failed.
--19T08::02.871423+: [ERROR] InnoDB: Write to file ./ibtmp1failed at offset , bytes should have been written, only were written. Operating system error number . Check that your OS and file system support files of this size. Check also that the disk is not full or a disk quota exceeded.
--19T08::02.871441+: [ERROR] InnoDB: Error number means 'No space left on device'
--19T08::02.871446+: [Note] InnoDB: Some operating system error numbers are described at http://dev.mysql.com/doc/refman/5.7/en/operating-system-error-codes.html
--19T08::02.871451+: [ERROR] InnoDB: Could not set the file size of './ibtmp1'. Probably out of disk space
--19T08::02.871456+: [ERROR] InnoDB: Unable to create the shared innodb_temporary
--19T08::02.871459+: [ERROR] InnoDB: Plugin initialization aborted with error Generic error
--19T08::03.273011+: [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1"
--19T08::03.273029+: [ERROR] Plugin 'InnoDB' init function returned error.
--19T08::03.273033+: [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
--19T08::03.273037+: [ERROR] Failed to initialize builtin plugins.
--19T08::03.273040+: [ERROR] Aborting

--19T08::03.273046+: [Note] Binlog end
--19T08::03.273389+: [Note] mysqld: Shutdown complete //mysqld could not start again because there was no free disk space.

I decided to delete all the binlogs on the slave to free disk space.
 [root@zlm3 :: /data/mysql/mysql3306]
#cd logs

[root@zlm3 :: /data/mysql/mysql3306/logs]
#ls -l
total
-rw-r----- mysql mysql Jul : mysql-bin.
-rw-r----- mysql mysql Jul : mysql-bin.
-rw-r----- mysql mysql Jul : mysql-bin.
-rw-r----- mysql mysql Jul : mysql-bin.
-rw-r----- mysql mysql Jul : mysql-bin.
-rw-r----- mysql mysql Jul : mysql-bin.
-rw-r----- mysql mysql Jul : mysql-bin.
-rw-r----- mysql mysql Jul : mysql-bin.
-rw-r----- mysql mysql Jul : mysql-bin.
-rw-r----- mysql mysql Jul : mysql-bin.
-rw-r----- mysql mysql Jul : mysql-bin.
-rw-r----- mysql mysql Jul : mysql-bin.
-rw-r----- mysql mysql Jul : mysql-bin.
-rw-r----- mysql mysql Jul : mysql-bin.
-rw-r----- mysql mysql Jul : mysql-bin.
-rw-r----- mysql mysql Jul : mysql-bin.
-rw-r----- mysql mysql Jul : mysql-bin.index

[root@zlm3 :: /data/mysql/mysql3306/logs]
#rm -f *

[root@zlm3 :: /data/mysql/mysql3306/logs]
#ls -l
total

[root@zlm3 :: ~]
#df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/centos-root .4G .5G .0G % / //Free disk space was now at 47%.
devtmpfs 488M 488M % /dev
tmpfs 497M 497M % /dev/shm
tmpfs 497M 6.5M 491M % /run
tmpfs 497M 497M % /sys/fs/cgroup
/dev/sda1 497M 118M 379M % /boot
none 87G 80G .6G % /vagrant
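
Before removing binlogs by hand as above, it can be worth checking how much space they actually hold. The following is only a minimal sketch: it assumes the binlog basename is mysql-bin, as in the listing above, and the function name binlog_total_bytes is hypothetical rather than part of any MySQL tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: sum the sizes, in bytes, of all files named
# <dir>/<basename>.* (binlog files plus the .index file).
# The default basename "mysql-bin" is an assumption taken from the listing above.
binlog_total_bytes() {
    local dir="$1" base="${2:-mysql-bin}" total=0 f size
    for f in "$dir"/"$base".*; do
        [ -f "$f" ] || continue      # skip when the glob matches nothing
        size=$(wc -c < "$f")         # file size in bytes
        total=$((total + size))
    done
    echo "$total"
}
```

Calling binlog_total_bytes /data/mysql/mysql3306/logs prints the total in bytes. When mysqld is still able to run, the PURGE BINARY LOGS statement is usually the safer way to free this space because the server keeps its index file consistent; in this case the server could not start, so that was not an option.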
I started mysqld again and dropped the "sysbench" database on the slave.
 [root@zlm3 :: /data/mysql/mysql3306/logs]
#sh /root/mysqld.sh

[root@zlm3 :: /data/mysql/mysql3306/logs]
#ps aux|grep mysqld
mysql 7.0 17.8 pts/ Sl : : mysqld --defaults-file=/data/mysql/mysql3306/my.cnf
root 0.0 0.0 pts/ R+ : : grep --color=auto mysqld

[root@zlm3 :: /data/mysql/mysql3306/logs]
#mysql
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is
Server version: 5.7.-log MySQL Community Server (GPL)

Copyright (c) , , Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

(zlm@192.168.1.102 )[(none)]>drop database sysbench;
Query OK, rows affected (0.11 sec)

I started the replication threads on the slave.

 (zlm@192.168.1.102 )[(none)]>start slave;
Query OK, rows affected (0.00 sec)

(zlm@192.168.1.102 )[(none)]>show slave status\G
*************************** . row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.1.101
Master_User: repl
Master_Port:
Connect_Retry:
Master_Log_File: mysql-bin.
Read_Master_Log_Pos:
Relay_Log_File: relay-bin.
Relay_Log_Pos:
Relay_Master_Log_File: mysql-bin.
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno:
Last_Error: Error executing row event: 'Table 'sysbench.sbtest1' doesn't exist'
Skip_Counter:
Exec_Master_Log_Pos:
Relay_Log_Space:
Until_Condition: None
Until_Log_File:
Until_Log_Pos:
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno:
Last_IO_Error:
Last_SQL_Errno:
Last_SQL_Error: Error executing row event: 'Table 'sysbench.sbtest1' doesn't exist' //Since the database had been dropped, this error was to be expected.
Replicate_Ignore_Server_Ids:
Master_Server_Id:
Master_UUID: 1b7181ee-6eaf-11e8-998e-080027de0e0e
Master_Info_File: mysql.slave_master_info
SQL_Delay:
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State:
Master_Retry_Count:
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp: ::
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 1b7181ee-6eaf-11e8-998e-080027de0e0e:- //It was stuck on transaction 3714549 (the one that raised the error).
Executed_Gtid_Set: 1b7181ee-6eaf-11e8-998e-080027de0e0e:-,
5c77c31b-4add-11e8-81e2-080027de0e0e:-
Auto_Position:
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
row in set (0.00 sec)

[root@zlm3 :: ~]
#perror
MySQL error code (ER_NO_SUCH_TABLE): Table '%-.192s.%-.192s' doesn't exist

//Error 1146 indicates that table "sbtest1" is missing from the "sysbench" database.
//Evidently the slave was replaying operations the master had performed on this table, but the table, and indeed the whole database, had already been dropped.
//What should I do next? Did I have to take a new mysqldump and rebuild the slave?
//One thing I was fairly sure of: no transactions had been generated during the whole test other than operations on the "sysbench" database.
//Since I had dropped the "sysbench" database on both the master and the slave, maybe I could fix the issue easily.

I checked the Executed_Gtid_Set on the master.

 (zlm@192.168.1.101 )[(none)]>show master status;
+------------------+-----------+--------------+------------------+------------------------------------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+-----------+--------------+------------------+------------------------------------------------+
| mysql-bin. | | | | 1b7181ee-6eaf-11e8-998e-080027de0e0e:- |
+------------------+-----------+--------------+------------------+------------------------------------------------+
row in set (0.00 sec) //The executed GTIDs went up to 3730021.
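
To see exactly which GTIDs the slave has not applied, the master's set can be compared with the slave's using MySQL's GTID_SUBTRACT() function. This is a rough sketch, not what I ran at the time: the function name gtid_gap_query and the column alias are hypothetical, and the two sets must be copied from "show master status" and "show slave status" by hand.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a query that lists the GTIDs present in the
# master's executed set but absent from the slave's executed set.
# GTID_SUBTRACT(set1, set2) is a built-in MySQL function; everything else
# here (function name, alias) is an assumption made for illustration.
gtid_gap_query() {
    local master_set="$1" slave_set="$2"
    printf "SELECT GTID_SUBTRACT('%s', '%s') AS missing_on_slave;\n" \
        "$master_set" "$slave_set"
}
```

The printed statement can be piped into the mysql client on either server. If the result contains only transactions you know you can live without (here, everything that touched the dropped "sysbench" database), skipping them is at least defensible.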

I tried to fix the replica.

 (zlm@192.168.1.102 )[(none)]>show slave status\G
*************************** . row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.1.101
Master_User: repl
Master_Port:
Connect_Retry:
Master_Log_File: mysql-bin.
Read_Master_Log_Pos:
Relay_Log_File: relay-bin.
Relay_Log_Pos:
Relay_Master_Log_File: mysql-bin.
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno:
Last_Error: Error executing row event: 'Table 'sysbench.sbtest1' doesn't exist'
Skip_Counter:
Exec_Master_Log_Pos:
Relay_Log_Space:
Until_Condition: None
Until_Log_File:
Until_Log_Pos:
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno:
Last_IO_Error:
Last_SQL_Errno:
Last_SQL_Error: Error executing row event: 'Table 'sysbench.sbtest1' doesn't exist'
Replicate_Ignore_Server_Ids:
Master_Server_Id:
Master_UUID: 1b7181ee-6eaf-11e8-998e-080027de0e0e
Master_Info_File: mysql.slave_master_info
SQL_Delay:
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State:
Master_Retry_Count:
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp: ::
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 1b7181ee-6eaf-11e8-998e-080027de0e0e:-
Executed_Gtid_Set:
Auto_Position:
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
row in set (0.00 sec)

(zlm@192.168.1.102 )[(none)]>reset master;
Query OK, rows affected (0.02 sec)

(zlm@192.168.1.102 )[(none)]>set @@global.gtid_purged='1b7181ee-6eaf-11e8-998e-080027de0e0e:1-3730021';
Query OK, rows affected (0.00 sec)

//Because I was certain there were no other transactions, I set "gtid_purged" to the master's "gtid_executed" value.
//This makes the slave behave as if it had already replayed every transaction generated on the master, so it can go on to retrieve new GTIDs from that point.

(zlm@192.168.1.102 )[(none)]>start slave sql_thread;
Query OK, rows affected (0.02 sec)

(zlm@192.168.1.102 )[(none)]>show slave status\G
*************************** . row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.1.101
Master_User: repl
Master_Port:
Connect_Retry:
Master_Log_File: mysql-bin.
Read_Master_Log_Pos:
Relay_Log_File: relay-bin.
Relay_Log_Pos:
Relay_Master_Log_File: mysql-bin.
Slave_IO_Running: Yes
Slave_SQL_Running: Yes //The SQL thread was now running.
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno:
Last_Error:
Skip_Counter:
Exec_Master_Log_Pos:
Relay_Log_Space:
Until_Condition: None
Until_Log_File:
Until_Log_Pos:
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master:
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno:
Last_IO_Error:
Last_SQL_Errno:
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id:
Master_UUID: 1b7181ee-6eaf-11e8-998e-080027de0e0e
Master_Info_File: mysql.slave_master_info
SQL_Delay:
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count:
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 1b7181ee-6eaf-11e8-998e-080027de0e0e:-
Executed_Gtid_Set: 1b7181ee-6eaf-11e8-998e-080027de0e0e:- //The slave had skipped the master's GTIDs that raised error 1146 and was waiting for newer ones. Replication was repaired.
Auto_Position:
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
row in set (0.00 sec)
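
The statements used in this step can be generated from the master's GTID set so that nothing is mistyped. This is only a sketch under assumptions: the function name build_gtid_skip_sql is hypothetical, and the explicit STOP SLAVE at the start is my own precaution, not something shown in the session above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the statements to run ON THE SLAVE ONLY.
# RESET MASTER empties the slave's gtid_executed, after which gtid_purged
# can be set to the master's executed set so those GTIDs count as applied.
# STOP SLAVE is an added precaution (an assumption, not in the original session).
build_gtid_skip_sql() {
    local master_executed="$1"
    if [ -z "$master_executed" ]; then
        echo "usage: build_gtid_skip_sql <master_gtid_executed>" >&2
        return 1
    fi
    cat <<EOF
STOP SLAVE;
RESET MASTER;
SET @@GLOBAL.gtid_purged='${master_executed}';
START SLAVE;
EOF
}
```

For example, build_gtid_skip_sql '1b7181ee-6eaf-11e8-998e-080027de0e0e:1-3730021' prints the four statements for review before they are pasted into the mysql client on the slave.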

Summary

  • In MySQL 5.7, the variable "gtid_purged" can be set only when "gtid_executed" is empty, which is why "reset master" had to be run on the slave first.
  • Caution: run "reset master" only on the slave. Never run it on the master, because it deletes all of the master's binary logs and clears its GTID history.
  • Follow this approach only in a test environment, because you cannot guarantee that every transaction has really been replayed on the slave.
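
When only one or a few transactions are known to be safe to skip, a narrower alternative to overwriting "gtid_purged" is to commit an empty transaction under each of those GTIDs on the slave, with the SQL thread stopped. This is not the method used above, just a sketch of that technique; the function name build_empty_trx_sql is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print statements that commit an empty transaction
# for every GTID uuid:first .. uuid:last, then restore automatic GTIDs.
# Run the output on the slave while its SQL thread is stopped.
build_empty_trx_sql() {
    local uuid="$1" first="$2" last="$3" n
    n="$first"
    while [ "$n" -le "$last" ]; do
        printf "SET gtid_next='%s:%s'; BEGIN; COMMIT;\n" "$uuid" "$n"
        n=$((n + 1))
    done
    echo "SET gtid_next='AUTOMATIC';"
}
```

This keeps the rest of the slave's GTID history intact, so it skips only what was named explicitly. For a gap of many thousands of transactions, as in this test, it produces one statement line per GTID, which is why setting "gtid_purged" was the quicker route here.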
