Preface
 
    In my last test of pt-heartbeat,both of master and slave were out of disk.And the mysql client was hang.In order to resolve the issue,I've tryed to fix the replicaiton environment without using mysqldump to reconfigure the slave.Let's see the details.
 
Procedure
 
I dropped test tables in database "sysbench" to release the disk space on master.
 [root@zlm2 :: /data/mysql/mysql3306/logs]
#sysbench oltp_read_write.lua --mysql-host=192.168.1.101 --mysql-port= --mysql-user=zlm --mysql-password=zlmzlm --mysql-db=sysbench --tables= --table-size= --mysql-storage-engine=innodb cleanup
sysbench 1.0. (using bundled LuaJIT 2.1.-beta2) Dropping table 'sbtest1'... (zlm@192.168.1.101 )[(none)]>use sysbench;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A Database changed
(zlm@192.168.1.101 )[sysbench]>show tables;
+--------------------+
| Tables_in_sysbench |
+--------------------+
| hb |
| sbtest2 |
| sbtest3 |
| sbtest4 |
| sbtest5 |
+--------------------+
rows in set (0.00 sec) //Only sbtest1 was deleted.It's not enough. [root@zlm2 :: ~/sysbench-1.0/src/lua]
#sysbench oltp_read_write.lua --mysql-host=192.168.1.101 --mysql-port= --mysql-user=zlm --mysql-password=zlmzlm --mysql-db=sysbench --tables= --table-size= --mysql-storage-engine=innodb cleanup
sysbench 1.0. (using bundled LuaJIT 2.1.-beta2) Dropping table 'sbtest1'...
Dropping table 'sbtest2'...
Dropping table 'sbtest3'...
Dropping table 'sbtest4'...
Dropping table 'sbtest5'... [root@zlm2 :: ~/sysbench-1.0/src/lua]
#df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/centos-root .4G .9G .5G % / //I'd got 27% free space.
devtmpfs 488M 488M % /dev
tmpfs 497M 497M % /dev/shm
tmpfs 497M 6.6M 491M % /run
tmpfs 497M 497M % /sys/fs/cgroup
/dev/sda1 497M 118M 379M % /boot
none 87G 80G .6G % /vagrant (zlm@192.168.1.101 )[(none)]>drop database sysbench;
Query OK, row affected (0.04 sec) //Further more,I dropped the "sysbench".
The slave hung still and disk space was full.
 [root@zlm3 :: ~]
#df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/centos-root .4G .4G 20K % /
devtmpfs 488M 488M % /dev
tmpfs 497M 497M % /dev/shm
tmpfs 497M 6.5M 491M % /run
tmpfs 497M 497M % /sys/fs/cgroup
/dev/sda1 497M 118M 379M % /boot
none 87G 80G .6G % /vagrant (zlm@192.168.1.102 )[(none)]>show slave status\G
^C^C -- query aborted ^Z
[]+ Stopped mysql [root@zlm3 :: ~]
#pkill mysqld [root@zlm3 :: ~]
#./mysqld.sh [root@zlm3 :: ~]
#mysql
ERROR (HY000): Can't connect to MySQL server on '192.168.1.102' (111) [root@zlm3 :: ~]
#cd /data/mysql/mysql3306/data [root@zlm3 :: /data/mysql/mysql3306/data]
#cat error.log |tail -n
--19T08::02.581937+: [Note] InnoDB: Log scan progressed past the checkpoint lsn
--19T08::02.581958+: [Note] InnoDB: Doing recovery: scanned up to log sequence number
--19T08::02.581963+: [Note] InnoDB: Database was not shutdown normally!
--19T08::02.581965+: [Note] InnoDB: Starting crash recovery.
--19T08::02.696292+: [Note] InnoDB: Transaction was in the XA prepared state.
--19T08::02.700688+: [Note] InnoDB: Transaction was in the XA prepared state.
--19T08::02.700814+: [Note] InnoDB: transaction(s) which must be rolled back or cleaned up in total row operations to undo
--19T08::02.700821+: [Note] InnoDB: Trx id counter is
--19T08::02.701719+: [Note] InnoDB: Last MySQL binlog file position , file name mysql-bin.
--19T08::02.805965+: [Note] InnoDB: Ignoring tablespace `zlm`.`sbtest2` because the DISCARD flag is set .
--19T08::02.806462+: [Note] InnoDB: Creating shared tablespace for temporary tables
--19T08::02.807316+: [Note] InnoDB: Setting file './ibtmp1' size to MB. Physically writing the file full; Please wait ...
--19T08::02.807568+: [Note] InnoDB: Starting in background the rollback of uncommitted transactions
--19T08::02.807594+: [Note] InnoDB: Rollback of non-prepared transactions completed
--19T08::02.871396+: [Warning] InnoDB: Retry attempts for writing partial data failed.
--19T08::02.871423+: [ERROR] InnoDB: Write to file ./ibtmp1failed at offset , bytes should have been written, only were written. Operating system error number . Check that your OS and file system support files of this size. Check also that the disk is not full or a disk quota exceeded.
--19T08::02.871441+: [ERROR] InnoDB: Error number means 'No space left on device'
--19T08::02.871446+: [Note] InnoDB: Some operating system error numbers are described at http://dev.mysql.com/doc/refman/5.7/en/operating-system-error-codes.html
--19T08::02.871451+: [ERROR] InnoDB: Could not set the file size of './ibtmp1'. Probably out of disk space
--19T08::02.871456+: [ERROR] InnoDB: Unable to create the shared innodb_temporary
--19T08::02.871459+: [ERROR] InnoDB: Plugin initialization aborted with error Generic error
--19T08::03.273011+: [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1"
--19T08::03.273029+: [ERROR] Plugin 'InnoDB' init function returned error.
--19T08::03.273033+: [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
--19T08::03.273037+: [ERROR] Failed to initialize builtin plugins.
--19T08::03.273040+: [ERROR] Aborting --19T08::03.273046+: [Note] Binlog end
--19T08::03.273389+: [Note] mysqld: Shutdown complete //The mysqld process could not run again because of no free disk space.
I decided to drop all the binlogs on slave to release the disk space.
 [root@zlm3 :: /data/mysql/mysql3306]
#cd logs [root@zlm3 :: /data/mysql/mysql3306/logs]
#ls -l
total
-rw-r----- mysql mysql Jul : mysql-bin.
-rw-r----- mysql mysql Jul : mysql-bin.
-rw-r----- mysql mysql Jul : mysql-bin.
-rw-r----- mysql mysql Jul : mysql-bin.
-rw-r----- mysql mysql Jul : mysql-bin.
-rw-r----- mysql mysql Jul : mysql-bin.
-rw-r----- mysql mysql Jul : mysql-bin.
-rw-r----- mysql mysql Jul : mysql-bin.
-rw-r----- mysql mysql Jul : mysql-bin.
-rw-r----- mysql mysql Jul : mysql-bin.
-rw-r----- mysql mysql Jul : mysql-bin.
-rw-r----- mysql mysql Jul : mysql-bin.
-rw-r----- mysql mysql Jul : mysql-bin.
-rw-r----- mysql mysql Jul : mysql-bin.
-rw-r----- mysql mysql Jul : mysql-bin.
-rw-r----- mysql mysql Jul : mysql-bin.
-rw-r----- mysql mysql Jul : mysql-bin.index [root@zlm3 :: /data/mysql/mysql3306/logs]
#rm -f * [root@zlm3 :: /data/mysql/mysql3306/logs]
#ls -l
total [root@zlm3 :: ~]
#df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/centos-root .4G .5G .0G % / //The free disk space had been reduced to 47%.
devtmpfs 488M 488M % /dev
tmpfs 497M 497M % /dev/shm
tmpfs 497M 6.5M 491M % /run
tmpfs 497M 497M % /sys/fs/cgroup
/dev/sda1 497M 118M 379M % /boot
none 87G 80G .6G % /vagrant
Ran the mysqld again and dropped the database "sysbench" on slave.
 [root@zlm3 :: /data/mysql/mysql3306/logs]
#sh /root/mysqld.sh [root@zlm3 :: /data/mysql/mysql3306/logs]
#ps aux|grep mysqld
mysql 7.0 17.8 pts/ Sl : : mysqld --defaults-file=/data/mysql/mysql3306/my.cnf
root 0.0 0.0 pts/ R+ : : grep --color=auto mysqld [root@zlm3 :: /data/mysql/mysql3306/logs]
#mysql
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is
Server version: 5.7.-log MySQL Community Server (GPL) Copyright (c) , , Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. (zlm@192.168.1.102 )[(none)]>drop database sysbench;
Query OK, rows affected (0.11 sec)

Started the replication threads of slave.

 (zlm@192.168.1.102 )[(none)]>start slave;
Query OK, rows affected (0.00 sec) (zlm@192.168.1.102 )[(none)]>show slave status\G
*************************** . row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.1.101
Master_User: repl
Master_Port:
Connect_Retry:
Master_Log_File: mysql-bin.
Read_Master_Log_Pos:
Relay_Log_File: relay-bin.
Relay_Log_Pos:
Relay_Master_Log_File: mysql-bin.
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno:
Last_Error: Error executing row event: 'Table 'sysbench.sbtest1' doesn't exist'
Skip_Counter:
Exec_Master_Log_Pos:
Relay_Log_Space:
Until_Condition: None
Until_Log_File:
Until_Log_Pos:
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno:
Last_IO_Error:
Last_SQL_Errno:
Last_SQL_Error: Error executing row event: 'Table 'sysbench.sbtest1' doesn't exist' //Since the database had been droppted.This error was notable.
Replicate_Ignore_Server_Ids:
Master_Server_Id:
Master_UUID: 1b7181ee-6eaf-11e8-998e-080027de0e0e
Master_Info_File: mysql.slave_master_info
SQL_Delay:
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State:
Master_Retry_Count:
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp: ::
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 1b7181ee-6eaf-11e8-998e-080027de0e0e:- //It was stuck on transaction 3714549(which contained error).
Executed_Gtid_Set: 1b7181ee-6eaf-11e8-998e-080027de0e0e:-,
5c77c31b-4add-11e8-81e2-080027de0e0e:-
Auto_Position:
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
row in set (0.00 sec) [root@zlm3 :: ~]
#perror
MySQL error code (ER_NO_SUCH_TABLE): Table '%-.192s.%-.192s' doesn't exist //Error 1146 indicated the absence of table "sbtest1" in "sysbench" database.
//Obviously,the slave was replaying the operations relevant to this table on master.The table even the database had been dropped.
//How could I do next step?Do I have to generate a new mysqldump file and reconfigure the slave again?
//There's One thing I'm rather sure that there were no other transactions generated in the whole course except the operations on "sysbench" database.
//Since I'd drop "sysbentch" database on both master and slave.Maybe I can fix the issue easily.

Checked the Executed_Gtid_Set on master.

 (zlm@192.168.1.101 )[(none)]>show master status;
+------------------+-----------+--------------+------------------+------------------------------------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+-----------+--------------+------------------+------------------------------------------------+
| mysql-bin. | | | | 1b7181ee-6eaf-11e8-998e-080027de0e0e:- |
+------------------+-----------+--------------+------------------+------------------------------------------------+
row in set (0.00 sec) //The executed gtid was upto "3730021".

Tryed to fix the replica of master.

 (zlm@192.168.1.102 )[(none)]>show slave status\G
*************************** . row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.1.101
Master_User: repl
Master_Port:
Connect_Retry:
Master_Log_File: mysql-bin.
Read_Master_Log_Pos:
Relay_Log_File: relay-bin.
Relay_Log_Pos:
Relay_Master_Log_File: mysql-bin.
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno:
Last_Error: Error executing row event: 'Table 'sysbench.sbtest1' doesn't exist'
Skip_Counter:
Exec_Master_Log_Pos:
Relay_Log_Space:
Until_Condition: None
Until_Log_File:
Until_Log_Pos:
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno:
Last_IO_Error:
Last_SQL_Errno:
Last_SQL_Error: Error executing row event: 'Table 'sysbench.sbtest1' doesn't exist'
Replicate_Ignore_Server_Ids:
Master_Server_Id:
Master_UUID: 1b7181ee-6eaf-11e8-998e-080027de0e0e
Master_Info_File: mysql.slave_master_info
SQL_Delay:
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State:
Master_Retry_Count:
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp: ::
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 1b7181ee-6eaf-11e8-998e-080027de0e0e:-
Executed_Gtid_Set:
Auto_Position:
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
row in set (0.00 sec) (zlm@192.168.1.102 )[(none)]>reset master;
Query OK, rows affected (0.02 sec) (zlm@192.168.1.102 )[(none)]>set @@global.gtid_purged='1b7181ee-6eaf-11e8-998e-080027de0e0e:1-3730021';
Query OK, rows affected (0.00 sec) //On account of surely knowing there were no other transactions at all.I set the "gtid_purged" variable to the value of "gtid_executed" on master.
//It means I guised that all the transactions generated on master had been replayed on slave already.The slave could retrieve new GTID at the moment. (zlm@192.168.1.102 )[(none)]>start slave sql_thread;
Query OK, rows affected (0.02 sec) (zlm@192.168.1.102 )[(none)]>show slave status\G
*************************** . row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.1.101
Master_User: repl
Master_Port:
Connect_Retry:
Master_Log_File: mysql-bin.
Read_Master_Log_Pos:
Relay_Log_File: relay-bin.
Relay_Log_Pos:
Relay_Master_Log_File: mysql-bin.
Slave_IO_Running: Yes
Slave_SQL_Running: Yes //The sql_thread became "Yes".
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno:
Last_Error:
Skip_Counter:
Exec_Master_Log_Pos:
Relay_Log_Space:
Until_Condition: None
Until_Log_File:
Until_Log_Pos:
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master:
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno:
Last_IO_Error:
Last_SQL_Errno:
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id:
Master_UUID: 1b7181ee-6eaf-11e8-998e-080027de0e0e
Master_Info_File: mysql.slave_master_info
SQL_Delay:
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count:
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 1b7181ee-6eaf-11e8-998e-080027de0e0e:-
Executed_Gtid_Set: 1b7181ee-6eaf-11e8-998e-080027de0e0e:- //The slave had skipped those GTID(which contained error 1146) of master and waited for newer GTID.The replica had been fixed up.
Auto_Position:
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
row in set (0.00 sec)

Summary

  • The variable "gtid_purged" cannot be set if "gtid_executed" is not empty.
  • Caution,"reset master" can only be used on slave.Keep in mind that don't do it on master anytime.
  • This case can be followed only in test environment 'cause you cannot guarantee whether all the transactions are really replayed on slave.

GTID环境中手动修复主从故障一例(Error 1146)的更多相关文章

  1. GTID环境中手动修复主从故障一例(Error 1236/Error 1396)

      Preface       I got an replication error 1236 when I modified the password of a user without start ...

  2. SqlServer 禁止架构更改的复制中手动修复使发布和订阅中分别增加的字段同步

    原文:SqlServer 禁止架构更改的复制中手动修复使发布和订阅中分别增加的字段同步 由于之前的需要,禁止了复制架构更改,以至在发布中添加一个字段,并不会同步到订阅中,而现在又在订阅中添加了一个同名 ...

  3. 企业运维 | MySQL关系型数据库在Docker与Kubernetes容器环境中快速搭建部署主从实践

    [点击 关注「 WeiyiGeek」公众号 ] 设为「️ 星标」每天带你玩转网络安全运维.应用开发.物联网IOT学习! 希望各位看友[关注.点赞.评论.收藏.投币],助力每一个梦想. 本章目录 目录 ...

  4. 业务零影响!如何在Online环境中巧用MySQL传统复制技术【转】

    业务零影响!如何在Online环境中巧用MySQL传统复制技术 这篇文章我并不会介绍如何部署一个MySQL复制环境或keepalived+双主环境,因为此类安装搭建的文章已经很多,大家也很熟悉.在这篇 ...

  5. .NET 环境中使用RabbitMQ RabbitMQ与Redis队列对比 RabbitMQ入门与使用篇

    .NET 环境中使用RabbitMQ   在企业应用系统领域,会面对不同系统之间的通信.集成与整合,尤其当面临异构系统时,这种分布式的调用与通信变得越发重要.其次,系统中一般会有很多对实时性要求不高的 ...

  6. 5.7 并行复制配置 基于GTID 搭建中从 基于GTID的备份与恢复,同步中断处理

    5.7 并行复制配置 基于GTID 搭建中从 基于GTID的备份与恢复,同步中断处理 这个文章包含三个部分 1:gtid的多线程复制2:同步中断处理3:GTID的备份与恢复 下面文字相关的东西 大部分 ...

  7. 生产环境中的kubernetes 优先级与抢占

    kubernetes 中的抢占功能是调度器比较重要的feature,但是真正使用起来还是比较危险,否则很容易把低优先级的pod给无辜kill.为了提高GPU集群的资源利用率,决定勇于尝试一番该feat ...

  8. Redis 哨兵模式实现主从故障互切换

    200 ? "200px" : this.width)!important;} --> 介绍 Redis Sentinel 是一个分布式系统, 你可以在一个架构中运行多个 S ...

  9. 生产环境中nginx既做web服务又做反向代理

    一.写对于初入博客园的感想 众所周知,nginx是一个高性能的HTTP和反向代理服务器,在以前工作中要么实现http要么做反向代理或者负载均衡.尚未在同一台nginx或者集群上同时既实现HTTP又实现 ...

随机推荐

  1. 【JavaScript 封装库】BETA 4.0 测试版发布!

    /* 源码作者: 石不易(Louis Shi) 联系方式: http://www.shibuyi.net =============================================== ...

  2. http协议,tcp协议,ip协议,dns服务之前的关系和区别

    长期以来都有一个问题,大家都在说http协议,tcp协议,ip协议,他们之间到底什么区别,有什么用,没人告诉我,最近看了这本<图解http>明白了一些,以下图片摘自这本书 一.理解一个传输 ...

  3. P1036 选数

    题目描述 已知 nn 个整数 x_1,x_2,…,x_nx1​,x2​,…,xn​,以及11个整数kk(k<nk<n).从nn个整数中任选kk个整数相加,可分别得到一系列的和.例如当n=4 ...

  4. 关于win10深度学习安装配置 CUDA9.0+VS2017+Cudnn7.4.1.5+Anaconda3(cupy安装包)+python3.7+pycharm

    0 查看电脑系统版本(非常重要) WIN+R->输入winver, 系统版本号必须高于1703,否则CUDA9.0难以运行!!!! 1 安装 NVIDIA 显卡驱动程序 下载地址:驱动程序 选择 ...

  5. Thinkphp 5 使用DOMDocument

    每一个载入浏览器都会生成一个 Document 对象. Document 对象使我们可以从脚本中对 HTML 页面中的所有元素进行访问.Document 对象是 Window 对象的一部分. 我们项目 ...

  6. apache配置局域网访问

    1.配置vhost.conf NameVirtualHost 192.168.2.74:80 <VirtualHost 192.168.2.74:80> DocumentRoot /var ...

  7. Data Warehouse 简介

    数据仓库定义 数据仓库之父Bill Inmon在1991年出版的“Building the Data Warehouse”一书中所提出的定义被广泛接受:数据仓库(Data Warehouse)是一个面 ...

  8. mybatis两级缓存原理剖析

    https://blog.csdn.net/zhurhyme/article/details/81064108 对于mybatis的缓存认识一直有一个误区,所以今天写一篇文章帮自己订正一下.mybat ...

  9. mysql架构和历史

    存储引擎 查看: show table status like 'bigcourse'; 结果: +-----------+--------+---------+------------+------ ...

  10. Linux文件系统与目录结构

    在Linux系统中,目录被组织成一个:单根倒置树结构,文件系统从根目录开始,用/来表示.文件名称区分大小写( 大小写敏感还需要看具体的文件系统格式 ),以.开头的为隐藏文件,路径用/来进行分割(win ...