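I. MySQL Group Replication has to deal with two problems by design:

  1. How to recover the primary node after it crashes.

  2. How the remaining nodes keep serving traffic when a majority of the nodes are offline.

  Only the first problem is discussed here: after the primary node crashes, how do we add it back into the high-availability group? This splits into two cases:

    1. Mild damage: the primary node's data is intact, and the binlogs written by the other nodes during the outage are still available.

          In this case, restarting MySQL Group Replication on the node fixes the problem.

    2. Total loss: the primary node's data is gone.

          In this case, restore the crashed node from a backup of one of the remaining nodes, then restart MySQL Group Replication.

  The detailed recovery steps are shown in the examples below.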
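The choice between the two cases can be written down as a small decision function. This is only a sketch: the names `RecoveryPlan` and `choose_recovery_plan` are made up for illustration, and treating "data intact but binlogs already purged on the other nodes" as a restore case is an assumption, since the text only describes the two extremes.

```python
from enum import Enum


class RecoveryPlan(Enum):
    # Case 1: restart mysqld, then START GROUP_REPLICATION.
    RESTART_GROUP_REPLICATION = "restart"
    # Case 2: restore from a backup of another member, then START GROUP_REPLICATION.
    RESTORE_THEN_RESTART = "restore"


def choose_recovery_plan(data_intact: bool, binlogs_retained: bool) -> RecoveryPlan:
    """Pick a recovery path for a crashed member (hypothetical helper).

    data_intact:      the crashed node still has its data directory.
    binlogs_retained: the other members still hold the binlogs written
                      while the node was down.
    """
    if data_intact and binlogs_retained:
        return RecoveryPlan.RESTART_GROUP_REPLICATION
    # Assumption: anything short of "data and binlogs both available"
    # is handled by restoring from a backup first.
    return RecoveryPlan.RESTORE_THEN_RESTART
```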

II. Environment:

  Overview

Hostname      IP address        MGR role

mtls17        10.186.19.17      primary

mtls18        10.186.19.18      secondary

mtls19        10.186.19.19      secondary

  Group status:

mysql> select * from replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 12b6f8d9-d655-11e7-936a-9a17854b700d | mtls17 | 3306 | ONLINE |
| group_replication_applier | 12bfe200-d655-11e7-a264-1e1b3511358e | mtsl18 | 3306 | ONLINE |
| group_replication_applier | 1453bcac-d655-11e7-a503-8a7c439b72d9 | mtls19 | 3306 | ONLINE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
3 rows in set (0.00 sec)

mysql> show global status like 'group_replication_primary_member';
+----------------------------------+--------------------------------------+
| Variable_name | Value |
+----------------------------------+--------------------------------------+
| group_replication_primary_member | 12b6f8d9-d655-11e7-936a-9a17854b700d |
+----------------------------------+--------------------------------------+
1 row in set (0.00 sec)

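  Notes:

  The output above shows that the MySQL instance on mtls17 is the current primary of the group and that every member is ONLINE. (The second host shows up as mtsl18 in command output; it is the machine called mtls18 in the text.)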
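Reading these two result sets by eye works for three nodes, but the same check is easy to script. The sketch below parses the text that the mysql client prints and answers two questions: is every member ONLINE, and which host owns the primary's MEMBER_ID. The function names are hypothetical, and the sample text is copied from the output above.

```python
from typing import Dict, List, Optional

SAMPLE = """
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 12b6f8d9-d655-11e7-936a-9a17854b700d | mtls17 | 3306 | ONLINE |
| group_replication_applier | 12bfe200-d655-11e7-a264-1e1b3511358e | mtsl18 | 3306 | ONLINE |
| group_replication_applier | 1453bcac-d655-11e7-a503-8a7c439b72d9 | mtls19 | 3306 | ONLINE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
"""


def parse_members(table_text: str) -> List[Dict[str, str]]:
    """Turn the client's ASCII table into one dict per member row."""
    header: Optional[List[str]] = None
    rows: List[Dict[str, str]] = []
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders and "N rows in set" lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first "|" line is the column header
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def all_online(members: List[Dict[str, str]]) -> bool:
    """True only if there is at least one member and all are ONLINE."""
    return bool(members) and all(m["MEMBER_STATE"] == "ONLINE" for m in members)


def primary_host(members: List[Dict[str, str]], primary_id: str) -> Optional[str]:
    """Map the group_replication_primary_member UUID to a host name."""
    for m in members:
        if m["MEMBER_ID"] == primary_id:
            return m["MEMBER_HOST"]
    return None
```

For example, `primary_host(parse_members(SAMPLE), "12b6f8d9-d655-11e7-936a-9a17854b700d")` gives the host that currently holds the primary role.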

III. Case 1: simulating the failure and recovering:

  1. Simulate a crash of the mtls17 node

ps -ef | grep mysql
mysql : ? :: /usr/local/mysql/bin/mysqld --defaults-file=/etc/my.cnf
root : pts/ :: grep --color=auto mysql
[root@mtls17 data]# kill -
[root@mtls17 data]# ps -ef | grep mysql
root : pts/ :: grep --color=auto mysql

  

  2. Check the two remaining nodes

mysql> select * from replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 12bfe200-d655-11e7-a264-1e1b3511358e | mtsl18 | 3306 | ONLINE |
| group_replication_applier | 1453bcac-d655-11e7-a503-8a7c439b72d9 | mtls19 | 3306 | ONLINE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
2 rows in set (0.00 sec)

mysql> show global status like 'group_replication_primary_member';
+----------------------------------+--------------------------------------+
| Variable_name | Value |
+----------------------------------+--------------------------------------+
| group_replication_primary_member | 12bfe200-d655-11e7-a264-1e1b3511358e |
+----------------------------------+--------------------------------------+
1 row in set (0.00 sec)

  The output above shows that after the mysqld on mtls17 was killed, the two remaining nodes formed a new group, and the MySQL instance on mtls18 became the primary.
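The two survivors could carry on and elect a new primary because they are still a majority of the original three members; Group Replication needs more than half of the group to agree before it can make progress. A minimal sketch of that arithmetic (the function name `has_majority` is made up for illustration):

```python
def has_majority(group_size: int, reachable: int) -> bool:
    """Return True if `reachable` members are more than half of the group."""
    if not 0 <= reachable <= group_size:
        raise ValueError("reachable must be between 0 and group_size")
    # Strictly more than half: 2 of 3 is enough, 2 of 4 is not.
    return reachable > group_size // 2
```

With the three-node group used here, losing one node leaves `has_majority(3, 2)` true, while losing two nodes would not, which is the second problem named at the start of the article.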

  

  3. Bring the crashed primary back

systemctl start mysql
[root@mtls17 data]# mysql -uroot -pmtls0352
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is
Server version: 5.7.-log MySQL Community Server (GPL)

Copyright (c) , , Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> start group_replication;
Query OK, rows affected (4.03 sec)

mysql>

  4. Verify that the problem is resolved

select * from replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 12b6f8d9-d655-11e7-936a-9a17854b700d | mtls17 | 3306 | ONLINE |
| group_replication_applier | 12bfe200-d655-11e7-a264-1e1b3511358e | mtsl18 | 3306 | ONLINE |
| group_replication_applier | 1453bcac-d655-11e7-a503-8a7c439b72d9 | mtls19 | 3306 | ONLINE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
3 rows in set (0.00 sec)

mysql> show global status like 'group_replication_primary_member';
+----------------------------------+--------------------------------------+
| Variable_name | Value |
+----------------------------------+--------------------------------------+
| group_replication_primary_member | 12bfe200-d655-11e7-a264-1e1b3511358e |
+----------------------------------+--------------------------------------+
1 row in set (0.00 sec)

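  Conclusion: after the former primary crashed, restarting the mysqld service and then restarting MySQL Group Replication brought it back into the group. As the output shows, it rejoined as a secondary; mtls18 remained the primary.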

IV. Case 2: recovering the node when the data on the primary has been lost:

  1. Stop the service and delete the data

[root@mtsl18 ~]# ps -ef | grep mysql
mysql : ? :: /usr/local/mysql/bin/mysqld --defaults-file=/etc/my.cnf
root : pts/ :: grep --color=auto mysql
[root@mtsl18 ~]# kill -
[root@mtsl18 ~]# rm -rf /database/mysql/data/
[root@mtsl18 ~]# ps -ef | grep mysql
root : pts/ :: grep --color=auto mysql

  This experiment continues from Case 1, so the primary is on mtls18; that is why the service is stopped and the data deleted on mtls18.

  2. Check the group status:

mysql> select * from replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 12b6f8d9-d655-11e7-936a-9a17854b700d | mtls17 | 3306 | ONLINE |
| group_replication_applier | 1453bcac-d655-11e7-a503-8a7c439b72d9 | mtls19 | 3306 | ONLINE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
2 rows in set (0.00 sec)

mysql> show global status like 'group_replication_primary_member';
+----------------------------------+--------------------------------------+
| Variable_name | Value |
+----------------------------------+--------------------------------------+
| group_replication_primary_member | 12b6f8d9-d655-11e7-936a-9a17854b700d |
+----------------------------------+--------------------------------------+
1 row in set (0.01 sec)

  Note: after mtls18 went down, the primary role moved from mtls18 to mtls17.

  3. Back up mtls19 with MEB (MySQL Enterprise Backup) to restore the crashed mtls18

mysqlbackup --defaults-file=/etc/my.cnf --with-timestamp \
--host=localhost --user=root --password=mtls0352 \
--backup-dir=/tmp/ --backup-image=/tmp/2017-12-01T12:30:00.mbi --no-history-logging \
backup-to-image

MySQL Enterprise Backup version 4.1. Linux-2.6.-400.215..el5uek-x86_64 [//]
Copyright (c) , , Oracle and/or its affiliates. All Rights Reserved.

:: MAIN INFO: A thread created with Id ''
:: MAIN INFO: Starting with following command line ...
mysqlbackup --defaults-file=/etc/my.cnf --with-timestamp --host=localhost
--user=root --password=xxxxxxxx --backup-dir=/tmp/
--backup-image=/tmp/--01T12::.mbi --no-history-logging
backup-to-image

:: MAIN INFO:
:: MAIN INFO: MySQL server version is '5.7.20-log'
.......
........
:: MAIN INFO: Full Image Backup operation completed successfully.
:: MAIN INFO: Backup image created successfully.
:: MAIN INFO: Image Path = /tmp/--01T12::.mbi
:: MAIN INFO: MySQL binlog position: filename mysql-bin., position

-------------------------------------------------------------
Parameters Summary
-------------------------------------------------------------
Start LSN :
End LSN :
-------------------------------------------------------------

mysqlbackup completed OK!
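If the backup has to be scripted, the command line above can be assembled as an argument list instead of a hand-typed string. The sketch below uses only the options that appear in the command above; the function name `build_backup_command` and its defaults are made up for illustration, and the password is deliberately left out and assumed to be supplied some other way rather than on the command line.

```python
import shlex
from typing import List


def build_backup_command(image_path: str,
                         defaults_file: str = "/etc/my.cnf",
                         host: str = "localhost",
                         user: str = "root",
                         backup_dir: str = "/tmp/") -> List[str]:
    """Assemble the mysqlbackup argv for a single-file image backup.

    Hypothetical helper: it only mirrors the options used in this article.
    """
    return [
        "mysqlbackup",
        f"--defaults-file={defaults_file}",
        "--with-timestamp",
        f"--host={host}",
        f"--user={user}",
        f"--backup-dir={backup_dir}",
        f"--backup-image={image_path}",
        "--no-history-logging",
        "backup-to-image",  # the operation comes last, after all options
    ]


if __name__ == "__main__":
    # Print a shell-quoted version for review before running it.
    print(shlex.join(build_backup_command("/tmp/2017-12-01T12:30:00.mbi")))
```

Passing the list to `subprocess.run(cmd, check=True)` would avoid shell quoting problems with the colons in the image file name.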

  4. Copy the backup to mtls18

scp /tmp/2017-12-01T12:30:00.mbi mtls18:/tmp/

  5. Restore the backup

mysqlbackup --defaults-file=/etc/my.cnf --backup-image=/tmp/2017-12-01T12:30:00.mbi \
> --backup-dir=/tmp/ --datadir=/database/mysql/data/3306/ \
> copy-back-and-apply-log
MySQL Enterprise Backup version 4.1. Linux-2.6.-400.215..el5uek-x86_64 [//]
Copyright (c) , , Oracle and/or its affiliates. All Rights Reserved.

:: MAIN INFO: A thread created with Id ''
:: MAIN INFO: Starting with following command line ...
mysqlbackup --defaults-file=/etc/my.cnf
--backup-image=/tmp/--01T12::.mbi --backup-dir=/tmp/
--datadir=/database/mysql/data// copy-back-and-apply-log

:: MAIN INFO:
IMPORTANT: Please check that mysqlbackup run completes successfully.
.....
.....
:: PCR1 INFO: The first data file is '/database/mysql/data/3306/ibdata1'
and the new created log files are at '/database/mysql/data/3306/'
:: MAIN INFO: MySQL server version is '5.7.20-log'
:: MAIN INFO: Restoring ...5.7.-log version
:: MAIN INFO: Apply-log operation completed successfully.
:: MAIN INFO: Full Backup has been restored successfully.

mysqlbackup completed OK!

  6. Restart MySQL on mtls18

[root@mtsl18 tmp]# chown -R mysql:mysql /database/mysql/data/
[root@mtsl18 tmp]# systemctl start mysql
[root@mtsl18 tmp]# ps -ef | grep mysql
mysql : ? :: /usr/local/mysql/bin/mysqld --defaults-file=/etc/my.cnf
root : pts/ :: grep --color=auto mysql

  7. Restart MySQL Group Replication

mysql -uroot -pmtls0352
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 4
Server version: 5.7.20-log MySQL Community Server (GPL)

Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> reset master;
Query OK, 0 rows affected (0.10 sec)

mysql> reset slave;
Query OK, 0 rows affected (0.00 sec)

mysql> set sql_log_bin=0;
Query OK, 0 rows affected (0.00 sec)

mysql> source /database/mysql/data/3306/backup_gtid_executed.sql ;
Query OK, 0 rows affected (0.10 sec)

mysql> set sql_log_bin=1;
Query OK, 0 rows affected (0.00 sec)

mysql> change master to
-> master_user='mgr_usr',
-> master_password='mgr10352'
-> for channel 'group_replication_recovery';
Query OK, 0 rows affected, 2 warnings (0.21 sec)

mysql> start group_replication;
Query OK, 0 rows affected (3.46 sec)
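The order of the statements in this session matters: the GTID state is reset first, then the GTID set recorded by the backup is loaded with binary logging switched off, and only then are the recovery channel credentials set and Group Replication started. As a rough sketch, the same sequence can be generated from a few parameters. The function name `rejoin_statements` and the quoting helper are made up for illustration, and the explanation of what `backup_gtid_executed.sql` does (presumably setting `gtid_purged` to what the backup contains) is an assumption about that file rather than something shown above.

```python
from typing import List


def _quote(value: str) -> str:
    """Very small string-literal quoting helper (illustrative only)."""
    return "'" + value.replace("\\", "\\\\").replace("'", "''") + "'"


def rejoin_statements(gtid_sql_path: str,
                      recovery_user: str,
                      recovery_password: str) -> List[str]:
    """Return the statements from step 7 in the order they must run.

    Note: SOURCE is a mysql client command, so this list is meant to be
    fed to the mysql command-line client, not sent through a driver.
    """
    return [
        "RESET MASTER",          # clear local binlogs and GTID state
        "RESET SLAVE",           # clear any old replication metadata
        "SET sql_log_bin = 0",   # do not log the GTID bookkeeping itself
        f"SOURCE {gtid_sql_path}",
        "SET sql_log_bin = 1",
        "CHANGE MASTER TO "
        f"MASTER_USER={_quote(recovery_user)}, "
        f"MASTER_PASSWORD={_quote(recovery_password)} "
        "FOR CHANNEL 'group_replication_recovery'",
        "START GROUP_REPLICATION",
    ]
```

Joining the list with `";\n"` and piping it to the mysql client reproduces the session above without retyping it on every restored node.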

  8. Verify that the group status is back to normal

mysql> select * from replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 12b6f8d9-d655-11e7-936a-9a17854b700d | mtls17 | 3306 | ONLINE |
| group_replication_applier | 1453bcac-d655-11e7-a503-8a7c439b72d9 | mtls19 | 3306 | ONLINE |
| group_replication_applier | 85f82fce-d65e-11e7-9e92-1e1b3511358e | mtsl18 | 3306 | ONLINE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
3 rows in set (0.01 sec)

mysql> show global status like 'group_replication_primary_member';
+----------------------------------+--------------------------------------+
| Variable_name | Value |
+----------------------------------+--------------------------------------+
| group_replication_primary_member | 12b6f8d9-d655-11e7-936a-9a17854b700d |
+----------------------------------+--------------------------------------+
1 row in set (0.01 sec)

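V. Summary:

  Recovering from the two kinds of primary failure:

    1. Data and binlogs both intact: just restart MySQL Group Replication, and the node catches up automatically.

    2. Data lost: restore from a backup first, then restart MySQL Group Replication.

  On the operational complexity of MySQL Group Replication:

    Overall, MySQL Group Replication is fairly friendly to DBAs: a handful of small steps is enough to recover a failed group.

VI. My other articles on MySQL Group Replication

  1. MySQL Group Replication installation and configuration in detail: http://www.cnblogs.com/JiangLe/p/6727281.html#3849996

  2. MySQL Group Replication availability report on mysql-5.7.20: http://www.cnblogs.com/JiangLe/p/7809229.html

  3. MySQL Group Replication primary node crash recovery: https://i.cnblogs.com/EditPosts.aspx?postid=7941929

  4. MySQL Group Replication recovery after losing a majority of nodes

  5. My open-source tool for fully automated mysql-group-replication installation: https://github.com/Neeky/mysqltools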
