MySQL系列:高可用架构之MHA
前言
从11年毕业到现在,工作也好些年头,入坑mysql也有近四年的时间,也捣鼓过像mongodb、redis、cassandra、neo4j等Nosql数据库。其实一直想写博客分享下工作上的零零碎碎(日记倒是有坚持写),无奈比较懒,也就一直搁着了;近来心血来潮,着手整理之前的笔记就顺便把写博客的事补上,也算是温故知新,巩固之前学过用过的知识。从这篇《MySQL系列:高可用架构之MHA》开始后面会陆陆续续的分享一些关于MySQL高可用架构的东西。
MHA介绍
MHA(Master High Availability)目前在MySQL高可用方面是一个相对成熟的解决方案,它由日本DeNA公司youshimaton(现就职于Facebook公司)开发,是一套优秀的作为MySQL高可用性环境下故障切换和主从提升的高可用软件。在MySQL故障切换过程中,MHA能做到在0~30秒之内自动完成数据库的故障切换操作,并且在进行故障切换的过程中,MHA能在最大程度上保证数据的一致性,是强一致性主从工具。
MHA由两部分组成:
MHA Manager(管理节点):MHA Manager可以单独部署在一台独立的服务器,也可以和MySQL部署在同一台服务器上。MHA Manager会定时探测集群中的master节点,当master出现故障时,它可以自动将最新数据的slave提升为新的master,然后将所有其他的slave重新指向新的master。整个故障转移过程对应用程序完全透明;MHA Manager可以管理多组主从复制。
包含工具集:
masterha_check_ssh 检查MHA的SSH配置状况
masterha_check_repl 检查MySQL复制状况
masterha_manger 启动MHA
masterha_check_status 检测当前MHA运行状态
masterha_master_monitor 检测master是否宕机
masterha_master_switch 控制故障转移(自动或者手动)
masterha_conf_host 添加或删除配置的server信息
MHA Node(数据节点):MHA Node运行在每台MySQL服务器上。
包含工具集:
save_binary_logs 保存和复制master的二进制日志
apply_diff_relay_logs 识别差异的中继日志事件并将其差异的事件应用于其他的slave
filter_mysqlbinlog 去除不必要的ROLLBACK事件(MHA已不再使用这个工具)
purge_relay_logs 清除中继日志(不会阻塞SQL线程)
原理:
1. 从宕机崩溃的Master保存二进制日志事件(binlogevent)
2. 识别含有最新更新的Slave
3. 应用差异的中继日志(relaylog)到其他Slave
4. 应用从Master保存的二进制日志事件
5. 提升一个Slave为新的Master
6. 使其他的Slave连接新的Master进行复制
配置安装
我的“一主二从”已经搭建好了,这里不在赘述。
192.168.3.126: master (mha node)
192.168.3.127: slave (mha node)
192.168.3.128: slave (mha manager/node)
192.168.3.130 VIP
注:集群复制开启GTID模式。
用户MHA监控的账户授权如下:
GRANT ALL PRIVILEGES ON *.* TO 'mha'@'192.168.3.%' identified by '';
配置无秘钥登录:
[root@mha-maxscale- ~]# ssh-keygen -t rsa
[root@mha-maxscale- ~]# ssh-copy-id 192.168.3.126
[root@mha-maxscale- ~]# ssh-copy-id 192.168.3.127
[root@mha-maxscale- ~]# ssh-copy-id 192.168.3.128
下载MHA:
git clone https://github.com/yoshinorim/mha4mysql-manager.git
git clone https://github.com/yoshinorim/mha4mysql-node.git
安装依赖包:
wget http://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
rpm -ivh epel-release-latest-.noarch.rpm
yum install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes
安装MHA Manager(192.168.3.129):
[root@mha-maxscale- mha-manager]# perl Makefile.PL
2 [root@mha-maxscale- mha-manager]# make && make install
安装MHA Node(192.168.3.126|127|128):
[root@mha-maxscale- mha-manager]# perl Makefile.PL
[root@mha-maxscale- mha-manager]# make && make install
MHA配置文件:
######################################################################
# git clone https://github.com/yoshinorim/mha4mysql-manager.git
# git clone https://github.com/yoshinorim/mha4mysql-node.git
# wget http://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
# rpm -ivh epel-release-latest-7.noarch.rpm
# yum install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes
# cd mha-manager; perl Makefile.PL; make && make install
# cd mha-node; perl Makefile.PL; make && make install
# ssh-keygen -t rsa
# ssh-copy-id 192.168.3.126
# ssh-copy-id 192.168.3.127
# ssh-copy-id 192.168.3.128
###################################################################### [server default] # mysql监控账户
user=mha
password=123456
port=3306 # ssh免密钥登录的帐号名
ssh_user=root # 设置manager的工作目录
manager_workdir=/usr/local/masterha # 设置manager的日志
manager_log=/usr/local/masterha/app1/manager.log # 设置远端mysql在发生切换时binlog的保存位置
remote_workdir=/usr/local/masterha # 设置master 保存binlog的位置
master_binlog_dir=/labdata/3306/log # 设置自动failover时候的切换脚本
master_ip_failover_script= /usr/local/bin/master_ip_failover # 设置手动切换时候的切换脚本
master_ip_online_change_script= /usr/local/bin/master_ip_online_change # 只有当两个manager都ping不通才会切换,防止数据不一致
# secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.3.129 -s 192.168.3.131 --user=root --master_host=192.168.3.126 --master_ip=192.168.3.126 --master_port=3306 # 设置故障发生后关闭故障主机脚本(该脚本的主要作用是关闭主机,防止发生脑裂)
shutdown_script="" # 设置发生切换后发送的报警的脚本
report_script="" # mysql复制帐号,用来在主从机之间同步二进制日志等
repl_user=replicator
repl_password=123456 # ping间隔,用来检测master是否正常
ping_interval= 10 [server1]
hostname=192.168.3.126
port=3306
master_binlog_dir=/labdata/3306/log #候选人,master挂掉时候优先让它顶
candidate_master=1 [server2]
hostname=192.168.3.127
port=3306
master_binlog_dir=/labdata/3306/log #候选人,master挂掉时候优先让它顶
candidate_master=1 [server3]
hostname=192.168.3.128
port=3306
master_binlog_dir=/labdata/3306/log
candidate_master=1 # 不升级为主
# no_master=1 [binlog1]
hostname=192.168.3.126
# hostname=192.168.3.127
# hostname=192.168.3.128
注:对于GTID模式,要注意,如果数据库没有执行过一条事务,那么show slave status中即没有执行过任何Executed_Gtid_Set,MHA会认为是非GTID模式。另外,对于GTID模式需要配置binlog server。如果不配置,即使在old master SSH可达的情况下,它也不会去save binlog,所以GTID和non-GTID模式的区别比较大。解决的方案就是: 配置Binlog Server
master_ip_failover脚本
[root@mha-maxscale- ~]# cat /usr/local/bin/master_ip_failover
#!/usr/bin/env perl
use strict;
use warnings FATAL =>'all'; use Getopt::Long; my (
$command, $ssh_user, $orig_master_host, $orig_master_ip,
$orig_master_port, $new_master_host, $new_master_ip, $new_master_port
); my $vip = '192.168.3.130/24'; # Virtual IP
my $key = "1";
my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";
my $exit_code = ; GetOptions(
'command=s' => \$command,
'ssh_user=s' => \$ssh_user,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
); exit &main(); sub main { #print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n"; if ( $command eq "stop" || $command eq "stopssh" ) { # $orig_master_host, $orig_master_ip, $orig_master_port are passed.
# If you manage master ip address at global catalog database,
# invalidate orig_master_ip here.
my $exit_code = ;
eval {
print "\n\n\n***************************************************************\n";
print "Disabling the VIP - $vip on old master: $orig_master_host\n";
print "***************************************************************\n\n\n\n";
&stop_vip();
$exit_code = ;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) { # all arguments are passed.
# If you manage master ip address at global catalog database,
# activate new_master_ip here.
# You can also grant write access (create user, set read_only=0, etc) here.
my $exit_code = ;
eval {
print "\n\n\n***************************************************************\n";
print "Enabling the VIP - $vip on new master: $new_master_host \n";
print "***************************************************************\n\n\n\n";
&start_vip();
$exit_code = ;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
`ssh $ssh_user\@$orig_master_host \" $ssh_start_vip \"`;
exit 0;
}
else {
&usage();
exit 1;
}
} # A simple system call that enable the VIP on the new master
sub start_vip() {
`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
# A simple system call that disable the VIP on the old_master
sub stop_vip() {
`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
} sub usage {
"Usage: master_ip_failover –command=start|stop|stopssh|status –orig_master_host=host –orig_master_ip=ip –orig_master_port=port –new_master_host=host –new_master_ip=ip –new_master_port=port\n";
}
master_ip_online_change脚本:
[root@mha-maxscale- ~]# cat /usr/local/bin/master_ip_online_change
#!/usr/bin/env perl
use strict;
use warnings FATAL =>'all'; use Getopt::Long; my $vip = '192.168.3.130/24'; # Virtual IP
my $key = "1";
my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";
my $exit_code = ; my (
$command, $orig_master_is_new_slave, $orig_master_host,
$orig_master_ip, $orig_master_port, $orig_master_user,
$orig_master_password, $orig_master_ssh_user, $new_master_host,
$new_master_ip, $new_master_port, $new_master_user,
$new_master_password, $new_master_ssh_user,
);
GetOptions(
'command=s' => \$command,
'orig_master_is_new_slave' => \$orig_master_is_new_slave,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'orig_master_user=s' => \$orig_master_user,
'orig_master_password=s' => \$orig_master_password,
'orig_master_ssh_user=s' => \$orig_master_ssh_user,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
'new_master_user=s' => \$new_master_user,
'new_master_password=s' => \$new_master_password,
'new_master_ssh_user=s' => \$new_master_ssh_user,
); exit &main(); sub main { #print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n"; if ( $command eq "stop" || $command eq "stopssh" ) { # $orig_master_host, $orig_master_ip, $orig_master_port are passed.
# If you manage master ip address at global catalog database,
# invalidate orig_master_ip here.
my $exit_code = ;
eval {
print "\n\n\n***************************************************************\n";
print "Disabling the VIP - $vip on old master: $orig_master_host\n";
print "***************************************************************\n\n\n\n";
&stop_vip();
$exit_code = ;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) { # all arguments are passed.
# If you manage master ip address at global catalog database,
# activate new_master_ip here.
# You can also grant write access (create user, set read_only=0, etc) here.
my $exit_code = ;
eval {
print "\n\n\n***************************************************************\n";
print "Enabling the VIP - $vip on new master: $new_master_host \n";
print "***************************************************************\n\n\n\n";
&start_vip();
$exit_code = ;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
`ssh $orig_master_ssh_user\@$orig_master_host \" $ssh_start_vip \"`;
exit 0;
}
else {
&usage();
exit 1;
}
} # A simple system call that enable the VIP on the new master
sub start_vip() {
`ssh $new_master_ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
# A simple system call that disable the VIP on the old_master
sub stop_vip() {
`ssh $orig_master_ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
} sub usage {
"Usage: master_ip_failover –command=start|stop|stopssh|status –orig_master_host=host –orig_master_ip=ip –orig_master_port=port –new_master_host=host –new_master_ip=ip –new_master_port=port\n";
}
设置relay log的清除方式(每个slave节点上):
set global relay_log_purge=0;
注:MySQL数据库主从复制在缺省情况下从库的relay logs会在SQL线程执行完毕后被自动删除,但是对于MHA场景下,对于某些滞后从库的恢复依赖于其他从库的relay log,因此采取禁用自动删除功能以及定期清理的办法。对于清理过多过大的relay log需要注意引起的复制延迟资源开销等。MHA可通过purge_relay_logs脚本及配合cronjob来完成此项任务,具体描述如下:
1、purge_relay_logs的功能
a、为relay日志创建硬链接(最小化批量删除大文件导致的性能问题)
b、SET GLOBAL relay_log_purge=1; FLUSH LOGS; SET GLOBAL relay_log_purge=0;
c、删除relay log(rm –f /path/to/archive_dir/*)
2、purge_relay_logs的用法及相关参数
用法:
purge_relay_logs --user=root --password=rootpass --host=127.0.0.1
参数描述
--user mysql 用户名,缺省为root
--password mysql 密码
--port 端口号
--host 主机名,缺省为127.0.0.1
--workdir 指定创建relay log的硬链接的位置,默认是/var/tmp,成功执行脚本后,硬链接的中继日志文件被删除;由于系统不同分区创建硬链接文件会失败,故需要执行硬链接具体位置,建议指定为relay log相同的分区
--disable_relay_log_purge 默认情况下,参数relay_log_purge=1,脚本不做任何处理,自动退出;设定该参数,脚本会将relay_log_purge设置为0,当清理relay log之后,最后将参数设置为OFF(0)
3、定制清理relay log cronjob
pureg_relay_logs脚本在不阻塞SQL线程的情况下自动清理relay log。对于不断产生的relay log直接将该脚本部署到crontab以实现按天或按小时定期清理。
[root@mha-maxscale-1 ~]# crontab -l
# purge relay logs at 5am
0 5 * * * /usr/local/bin/purge_relay_logs --user=mha --password=123456 --disable_relay_log_purge >> / usr/local/masterha/purge_relay_logs.log 2>&1
检查ssh配置:
[root@mha-maxscale- ~]# masterha_check_ssh --conf=/etc/masterha/app1.cnf
Sat Dec :: - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Dec :: - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Sat Dec :: - [info] Reading server configuration from /etc/masterha/app1.cnf..
Sat Dec :: - [info] Starting SSH connection tests..
Sat Dec :: - [debug]
Sat Dec :: - [debug] Connecting via SSH from root@192.168.3.126(192.168.3.126:) to root@192.168.3.127(192.168.3.127:)..
Sat Dec :: - [debug] ok.
Sat Dec :: - [debug] Connecting via SSH from root@192.168.3.126(192.168.3.126:) to root@192.168.3.128(192.168.3.128:)..
Sat Dec :: - [debug] ok.
Sat Dec :: - [debug]
Sat Dec :: - [debug] Connecting via SSH from root@192.168.3.128(192.168.3.128:) to root@192.168.3.126(192.168.3.126:)..
Sat Dec :: - [debug] ok.
Sat Dec :: - [debug] Connecting via SSH from root@192.168.3.128(192.168.3.128:) to root@192.168.3.127(192.168.3.127:)..
Sat Dec :: - [debug] ok.
Sat Dec :: - [debug]
Sat Dec :: - [debug] Connecting via SSH from root@192.168.3.127(192.168.3.127:) to root@192.168.3.126(192.168.3.126:)..
Sat Dec :: - [debug] ok.
Sat Dec :: - [debug] Connecting via SSH from root@192.168.3.127(192.168.3.127:) to root@192.168.3.128(192.168.3.128:)..
Sat Dec :: - [debug] ok.
Sat Dec :: - [info] All SSH connection tests passed successfully.
检查主从复制状态:
[root@mha-maxscale- ~]# masterha_check_ssh --conf=/etc/masterha/app1.cnf
Sat Dec :: - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Dec :: - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Sat Dec :: - [info] Reading server configuration from /etc/masterha/app1.cnf..
Sat Dec :: - [info] Starting SSH connection tests..
Sat Dec :: - [debug]
Sat Dec :: - [debug] Connecting via SSH from root@192.168.3.126(192.168.3.126:) to root@192.168.3.127(192.168.3.127:)..
Sat Dec :: - [debug] ok.
Sat Dec :: - [debug] Connecting via SSH from root@192.168.3.126(192.168.3.126:) to root@192.168.3.128(192.168.3.128:)..
Sat Dec :: - [debug] ok.
Sat Dec :: - [debug]
Sat Dec :: - [debug] Connecting via SSH from root@192.168.3.128(192.168.3.128:) to root@192.168.3.126(192.168.3.126:)..
Sat Dec :: - [debug] ok.
Sat Dec :: - [debug] Connecting via SSH from root@192.168.3.128(192.168.3.128:) to root@192.168.3.127(192.168.3.127:)..
Sat Dec :: - [debug] ok.
Sat Dec :: - [debug]
Sat Dec :: - [debug] Connecting via SSH from root@192.168.3.127(192.168.3.127:) to root@192.168.3.126(192.168.3.126:)..
Sat Dec :: - [debug] ok.
Sat Dec :: - [debug] Connecting via SSH from root@192.168.3.127(192.168.3.127:) to root@192.168.3.128(192.168.3.128:)..
Sat Dec :: - [debug] ok.
Sat Dec :: - [info] All SSH connection tests passed successfully.
[root@mha-maxscale- ~]# masterha_check_repl --conf=/etc/masterha/app1.cnf
Sat Dec :: - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Dec :: - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Sat Dec :: - [info] Reading server configuration from /etc/masterha/app1.cnf..
Sat Dec :: - [info] MHA::MasterMonitor version 0.56.
Sat Dec :: - [info] GTID failover mode =
Sat Dec :: - [info] Dead Servers:
Sat Dec :: - [info] Alive Servers:
Sat Dec :: - [info] 192.168.3.126(192.168.3.126:)
Sat Dec :: - [info] 192.168.3.127(192.168.3.127:)
Sat Dec :: - [info] 192.168.3.128(192.168.3.128:)
Sat Dec :: - [info] Alive Slaves:
Sat Dec :: - [info] 192.168.3.127(192.168.3.127:) Version=5.7.-log (oldest major version between slaves) log-bin:enabled
Sat Dec :: - [info] GTID ON
Sat Dec :: - [info] Replicating from 192.168.3.126(192.168.3.126:)
Sat Dec :: - [info] Primary candidate for the new Master (candidate_master is set)
Sat Dec :: - [info] 192.168.3.128(192.168.3.128:) Version=5.7.-log (oldest major version between slaves) log-bin:enabled
Sat Dec :: - [info] GTID ON
Sat Dec :: - [info] Replicating from 192.168.3.126(192.168.3.126:)
Sat Dec :: - [info] Current Alive Master: 192.168.3.126(192.168.3.126:)
Sat Dec :: - [info] Checking slave configurations..
Sat Dec :: - [info] read_only= is not set on slave 192.168.3.127(192.168.3.127:).
Sat Dec :: - [info] read_only= is not set on slave 192.168.3.128(192.168.3.128:).
Sat Dec :: - [info] Checking replication filtering settings..
Sat Dec :: - [info] binlog_do_db= , binlog_ignore_db=
Sat Dec :: - [info] Replication filtering check ok.
Sat Dec :: - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Sat Dec :: - [info] HealthCheck: SSH to 192.168.3.126 is reachable.
Sat Dec :: - [info] Binlog server 192.168.3.126 is reachable.
Sat Dec :: - [info] Checking recovery script configurations on 192.168.3.126(192.168.3.126:)..
Sat Dec :: - [info] Executing command: save_binary_logs --command=test --start_pos= --binlog_dir=/labdata//log --output_file=/usr/local/masterha/save_binary_logs_test --manager_version=0.56 --start_file=mysql-bin.
Sat Dec :: - [info] Connecting to root@192.168.3.126(192.168.3.126:)..
Creating /usr/local/masterha if not exists.. ok.
Checking output directory is accessible or not..
ok.
Binlog found at /labdata//log, up to mysql-bin.
Sat Dec :: - [info] Binlog setting check done.
Sat Dec :: - [info] Checking SSH publickey authentication settings on the current master..
Sat Dec :: - [info] HealthCheck: SSH to 192.168.3.126 is reachable.
Sat Dec :: - [info]
192.168.3.126(192.168.3.126:) (current master)
+--192.168.3.127(192.168.3.127:)
+--192.168.3.128(192.168.3.128:) Sat Dec :: - [info] Checking replication health on 192.168.3.127..
Sat Dec :: - [info] ok.
Sat Dec :: - [info] Checking replication health on 192.168.3.128..
Sat Dec :: - [info] ok.
Sat Dec :: - [info] Checking master_ip_failover_script status:
Sat Dec :: - [info] /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=192.168.3.126 --orig_master_ip=192.168.3.126 --orig_master_port=
Checking the Status of the script.. OK
Sat Dec :: - [info] OK.
Sat Dec :: - [warning] shutdown_script is not defined.
Sat Dec :: - [info] Got exit code (Not master dead). MySQL Replication Health is OK.
启动MHA:
[root@mha-maxscale- ~]# nohup masterha_manager --conf=/etc/masterha/app1.cnf ---remove_dead_master_conf --ignore_last_failover &
注:参数介绍:
--remove_dead_master_conf 该参数代表当发生主从切换后,老的主库的ip将会从配置文件中移除。
--ignore_last_failover 在缺省情况下,如果MHA检测到连续发生宕机,且两次宕机间隔不足8小时的话,则不会进行Failover,之所以这样限制是为了避免ping-pong效应。该参数代表忽略上次MHA触发切换产生的文件,默认情况下,MHA发生切换后会在日志目录,也就是上面我设置的/usr/local/masterha产生app1.failover.complete文件,下次再次切换的时候如果发现该目录下存在该文件将不允许触发切换,除非在第一次切换后收到删除该文件,为了方便,这里设置为--ignore_last_failover。
启动日志:
Sat Dec 16 00:31:29 2017 - [info] MHA::MasterMonitor version 0.56.
Sat Dec 16 00:31:30 2017 - [info] GTID failover mode = 1
Sat Dec 16 00:31:30 2017 - [info] Dead Servers:
Sat Dec 16 00:31:30 2017 - [info] Alive Servers:
Sat Dec 16 00:31:30 2017 - [info] 192.168.3.126(192.168.3.126:3306)
Sat Dec 16 00:31:30 2017 - [info] 192.168.3.127(192.168.3.127:3306)
Sat Dec 16 00:31:30 2017 - [info] 192.168.3.128(192.168.3.128:3306)
Sat Dec 16 00:31:30 2017 - [info] Alive Slaves:
Sat Dec 16 00:31:30 2017 - [info] 192.168.3.127(192.168.3.127:3306) Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Sat Dec 16 00:31:30 2017 - [info] GTID ON
Sat Dec 16 00:31:30 2017 - [info] Replicating from 192.168.3.126(192.168.3.126:3306)
Sat Dec 16 00:31:30 2017 - [info] Primary candidate for the new Master (candidate_master is set)
Sat Dec 16 00:31:30 2017 - [info] 192.168.3.128(192.168.3.128:3306) Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Sat Dec 16 00:31:30 2017 - [info] GTID ON
Sat Dec 16 00:31:30 2017 - [info] Replicating from 192.168.3.126(192.168.3.126:3306)
Sat Dec 16 00:31:30 2017 - [info] Current Alive Master: 192.168.3.126(192.168.3.126:3306)
Sat Dec 16 00:31:30 2017 - [info] Checking slave configurations..
Sat Dec 16 00:31:30 2017 - [info] read_only=1 is not set on slave 192.168.3.127(192.168.3.127:3306).
Sat Dec 16 00:31:30 2017 - [info] read_only=1 is not set on slave 192.168.3.128(192.168.3.128:3306).
Sat Dec 16 00:31:30 2017 - [info] Checking replication filtering settings..
Sat Dec 16 00:31:30 2017 - [info] binlog_do_db= , binlog_ignore_db=
Sat Dec 16 00:31:30 2017 - [info] Replication filtering check ok.
Sat Dec 16 00:31:30 2017 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Sat Dec 16 00:31:30 2017 - [info] HealthCheck: SSH to 192.168.3.126 is reachable.
Sat Dec 16 00:31:30 2017 - [info] Binlog server 192.168.3.126 is reachable.
Sat Dec 16 00:31:30 2017 - [info] Checking recovery script configurations on 192.168.3.126(192.168.3.126:3306)..
Sat Dec 16 00:31:30 2017 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/labdata/3306/log --output_file=/usr/local/masterha/save_binary_logs_test --manager_version=0.56 --start_file=mysql-bin.000008
Sat Dec 16 00:31:30 2017 - [info] Connecting to root@192.168.3.126(192.168.3.126:22)..
Creating /usr/local/masterha if not exists.. ok.
Checking output directory is accessible or not..
ok.
Binlog found at /labdata/3306/log, up to mysql-bin.000008
Sat Dec 16 00:31:30 2017 - [info] Binlog setting check done.
Sat Dec 16 00:31:30 2017 - [info] Checking SSH publickey authentication settings on the current master..
Sat Dec 16 00:31:30 2017 - [info] HealthCheck: SSH to 192.168.3.126 is reachable.
Sat Dec 16 00:31:30 2017 - [info]
192.168.3.126(192.168.3.126:3306) (current master)
+--192.168.3.127(192.168.3.127:3306)
+--192.168.3.128(192.168.3.128:3306) Sat Dec 16 00:31:30 2017 - [info] Checking master_ip_failover_script status:
Sat Dec 16 00:31:30 2017 - [info] /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=192.168.3.126 --orig_master_ip=192.168.3.126 --orig_master_port=3306
Checking the Status of the script.. OK
Sat Dec 16 00:31:31 2017 - [info] OK.
Sat Dec 16 00:31:31 2017 - [warning] shutdown_script is not defined.
Sat Dec 16 00:31:31 2017 - [info] Set master ping interval 10 seconds.
Sat Dec 16 00:31:31 2017 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.
Sat Dec 16 00:31:31 2017 - [info] Starting ping health check on 192.168.3.126(192.168.3.126:3306)..
Sat Dec 16 00:31:31 2017 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..
测试MHA
一、测试自动failover
使用sysbench生成测试数据,在主库(192.168.3.126)上生成数据:
mysql > create database zow;
生成数据(192.168.3.126):
sysbench /usr/share/sysbench/oltp_read_only.lua \
--mysql-host=127.0.0.1 \
--mysql-port= \
--mysql-user=mha \
--mysql-password= \
--mysql-socket=/labdata//mysql.sock \
--mysql-db=zow \
--db-driver=mysql \
--tables= \
--table-size= \
--report-interval= \
--threads= \
--time= \
prepare
停止其中一台slave复制(192.168.3.127)
mysql > stop slave;
mysql> select count(*) from sbtest1;
+----------+
| count(*) |
+----------+
| 204288 |
+----------+
1 row in set (0.10 sec)
在同步完master数据之前,停止另一台slave(192.168.3.128)
mysql > stop slave;
mysql> select count(*) from sbtest1;
+----------+
| count(*) |
+----------+
| 887072 |
+----------+
1 row in set (0.10 sec)
关闭主库,模拟主库发生故障:
[root@mha-maxscale- ~]# /etc/init.d/mysql stop
注:从上面可以看出,slave 3.128比3.127新,且两台机器都是备选主库,所以3.128将会成为新的主库。
切换日志如下:
[root@mha-maxscale-1 ~]# cat /usr/local/masterha/app1/manager.log
Mon Dec 18 18:30:27 2017 - [warning] Got error on MySQL select ping: 2006 (MySQL server has gone away)
Mon Dec 18 18:30:27 2017 - [info] Executing SSH check script: exit 0
Mon Dec 18 18:30:28 2017 - [info] HealthCheck: SSH to 192.168.3.126 is reachable.
Mon Dec 18 18:30:37 2017 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.3.126' (111))
Mon Dec 18 18:30:37 2017 - [warning] Connection failed 2 time(s)..
Mon Dec 18 18:30:47 2017 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.3.126' (111))
Mon Dec 18 18:30:47 2017 - [warning] Connection failed 3 time(s)..
Mon Dec 18 18:30:57 2017 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.3.126' (111))
Mon Dec 18 18:30:57 2017 - [warning] Connection failed 4 time(s)..
Mon Dec 18 18:30:57 2017 - [warning] Master is not reachable from health checker!
Mon Dec 18 18:30:57 2017 - [warning] Master 192.168.3.126(192.168.3.126:3306) is not reachable!
Mon Dec 18 18:30:57 2017 - [warning] SSH is reachable.
Mon Dec 18 18:30:57 2017 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/masterha/app1.cnf again, and trying to connect to all servers to check server status..
Mon Dec 18 18:30:57 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Mon Dec 18 18:30:57 2017 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Mon Dec 18 18:30:57 2017 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Mon Dec 18 18:30:58 2017 - [warning] SQL Thread is stopped(no error) on 192.168.3.127(192.168.3.127:3306)
Mon Dec 18 18:30:58 2017 - [warning] SQL Thread is stopped(no error) on 192.168.3.128(192.168.3.128:3306)
Mon Dec 18 18:30:58 2017 - [info] GTID failover mode = 1
Mon Dec 18 18:30:58 2017 - [info] Dead Servers:
Mon Dec 18 18:30:58 2017 - [info] 192.168.3.126(192.168.3.126:3306)
Mon Dec 18 18:30:58 2017 - [info] Alive Servers:
Mon Dec 18 18:30:58 2017 - [info] 192.168.3.127(192.168.3.127:3306)
Mon Dec 18 18:30:58 2017 - [info] 192.168.3.128(192.168.3.128:3306)
Mon Dec 18 18:30:58 2017 - [info] Alive Slaves:
Mon Dec 18 18:30:58 2017 - [info] 192.168.3.127(192.168.3.127:3306) Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Mon Dec 18 18:30:58 2017 - [info] GTID ON
Mon Dec 18 18:30:58 2017 - [info] Replicating from 192.168.3.126(192.168.3.126:3306)
Mon Dec 18 18:30:58 2017 - [info] Primary candidate for the new Master (candidate_master is set)
Mon Dec 18 18:30:58 2017 - [info] 192.168.3.128(192.168.3.128:3306) Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Mon Dec 18 18:30:58 2017 - [info] GTID ON
Mon Dec 18 18:30:58 2017 - [info] Replicating from 192.168.3.126(192.168.3.126:3306)
Mon Dec 18 18:30:58 2017 - [info] Primary candidate for the new Master (candidate_master is set)
Mon Dec 18 18:30:58 2017 - [info] Checking slave configurations..
Mon Dec 18 18:30:58 2017 - [info] read_only=1 is not set on slave 192.168.3.127(192.168.3.127:3306).
Mon Dec 18 18:30:58 2017 - [info] read_only=1 is not set on slave 192.168.3.128(192.168.3.128:3306).
Mon Dec 18 18:30:58 2017 - [info] Checking replication filtering settings..
Mon Dec 18 18:30:58 2017 - [info] Replication filtering check ok.
Mon Dec 18 18:30:58 2017 - [info] Master is down!
Mon Dec 18 18:30:58 2017 - [info] Terminating monitoring script.
Mon Dec 18 18:30:58 2017 - [info] Got exit code 20 (Master dead).
Mon Dec 18 18:30:58 2017 - [info] MHA::MasterFailover version 0.56.
Mon Dec 18 18:30:58 2017 - [info] Starting master failover.
Mon Dec 18 18:30:58 2017 - [info]
Mon Dec 18 18:30:58 2017 - [info] * Phase 1: Configuration Check Phase..
Mon Dec 18 18:30:58 2017 - [info]
Mon Dec 18 18:30:59 2017 - [info] HealthCheck: SSH to 192.168.3.126 is reachable.
Mon Dec 18 18:30:59 2017 - [info] Binlog server 192.168.3.126 is reachable.
Mon Dec 18 18:31:00 2017 - [warning] SQL Thread is stopped(no error) on 192.168.3.127(192.168.3.127:3306)
Mon Dec 18 18:31:00 2017 - [warning] SQL Thread is stopped(no error) on 192.168.3.128(192.168.3.128:3306)
Mon Dec 18 18:31:00 2017 - [info] GTID failover mode = 1
Mon Dec 18 18:31:00 2017 - [info] Dead Servers:
Mon Dec 18 18:31:00 2017 - [info] 192.168.3.126(192.168.3.126:3306)
Mon Dec 18 18:31:00 2017 - [info] Checking master reachability via MySQL(double check)...
Mon Dec 18 18:31:00 2017 - [info] ok.
Mon Dec 18 18:31:00 2017 - [info] Alive Servers:
Mon Dec 18 18:31:00 2017 - [info] 192.168.3.127(192.168.3.127:3306)
Mon Dec 18 18:31:00 2017 - [info] 192.168.3.128(192.168.3.128:3306)
Mon Dec 18 18:31:00 2017 - [info] Alive Slaves:
Mon Dec 18 18:31:00 2017 - [info] 192.168.3.127(192.168.3.127:3306) Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Mon Dec 18 18:31:00 2017 - [info] GTID ON
Mon Dec 18 18:31:00 2017 - [info] Replicating from 192.168.3.126(192.168.3.126:3306)
Mon Dec 18 18:31:00 2017 - [info] Primary candidate for the new Master (candidate_master is set)
Mon Dec 18 18:31:00 2017 - [info] 192.168.3.128(192.168.3.128:3306) Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Mon Dec 18 18:31:00 2017 - [info] GTID ON
Mon Dec 18 18:31:00 2017 - [info] Replicating from 192.168.3.126(192.168.3.126:3306)
Mon Dec 18 18:31:00 2017 - [info] Primary candidate for the new Master (candidate_master is set)
Mon Dec 18 18:31:00 2017 - [info] Starting SQL thread on 192.168.3.127(192.168.3.127:3306) ..
Mon Dec 18 18:31:00 2017 - [info] done.
Mon Dec 18 18:31:00 2017 - [info] Starting SQL thread on 192.168.3.128(192.168.3.128:3306) ..
Mon Dec 18 18:31:00 2017 - [info] done.
Mon Dec 18 18:31:00 2017 - [info] Starting GTID based failover.
Mon Dec 18 18:31:00 2017 - [info]
Mon Dec 18 18:31:00 2017 - [info] ** Phase 1: Configuration Check Phase completed.
Mon Dec 18 18:31:00 2017 - [info]
Mon Dec 18 18:31:00 2017 - [info] * Phase 2: Dead Master Shutdown Phase..
Mon Dec 18 18:31:00 2017 - [info]
Mon Dec 18 18:31:00 2017 - [info] Forcing shutdown so that applications never connect to the current master..
Mon Dec 18 18:31:00 2017 - [info] Executing master IP deactivation script:
Mon Dec 18 18:31:00 2017 - [info] /usr/local/bin/master_ip_failover --orig_master_host=192.168.3.126 --orig_master_ip=192.168.3.126 --orig_master_port=3306 --command=stopssh --ssh_user=root ***************************************************************
Disabling the VIP - 192.168.3.130/24 on old master: 192.168.3.126
*************************************************************** Mon Dec 18 18:31:00 2017 - [info] done.
Mon Dec 18 18:31:00 2017 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Mon Dec 18 18:31:00 2017 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Mon Dec 18 18:31:00 2017 - [info]
Mon Dec 18 18:31:00 2017 - [info] * Phase 3: Master Recovery Phase..
Mon Dec 18 18:31:00 2017 - [info]
Mon Dec 18 18:31:00 2017 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Mon Dec 18 18:31:00 2017 - [info]
Mon Dec 18 18:31:00 2017 - [info] The latest binary log file/position on all slaves is mysql-bin.000010:190904631
Mon Dec 18 18:31:00 2017 - [info] Retrieved Gtid Set: 16a3fea3-db39-11e7-9862-000c29d98525:780-1155
Mon Dec 18 18:31:00 2017 - [info] Latest slaves (Slaves that received relay log files to the latest):
Mon Dec 18 18:31:00 2017 - [info] 192.168.3.127(192.168.3.127:3306) Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Mon Dec 18 18:31:00 2017 - [info] GTID ON
Mon Dec 18 18:31:00 2017 - [info] Replicating from 192.168.3.126(192.168.3.126:3306)
Mon Dec 18 18:31:00 2017 - [info] Primary candidate for the new Master (candidate_master is set)
Mon Dec 18 18:31:00 2017 - [info] The oldest binary log file/position on all slaves is mysql-bin.000010:155094918
Mon Dec 18 18:31:00 2017 - [info] Retrieved Gtid Set: 16a3fea3-db39-11e7-9862-000c29d98525:780-1083
Mon Dec 18 18:31:00 2017 - [info] Oldest slaves:
Mon Dec 18 18:31:00 2017 - [info] 192.168.3.128(192.168.3.128:3306) Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Mon Dec 18 18:31:00 2017 - [info] GTID ON
Mon Dec 18 18:31:00 2017 - [info] Replicating from 192.168.3.126(192.168.3.126:3306)
Mon Dec 18 18:31:00 2017 - [info] Primary candidate for the new Master (candidate_master is set)
Mon Dec 18 18:31:00 2017 - [info]
Mon Dec 18 18:31:00 2017 - [info] * Phase 3.3: Determining New Master Phase..
Mon Dec 18 18:31:00 2017 - [info]
Mon Dec 18 18:31:00 2017 - [warning] Slave 192.168.3.127(192.168.3.127:3306) SQL Thread delays too much. Latest log file:mysql-bin.000010:190904631, Current log file:mysql-bin.000010:39513351. This server is not selected as a new master because recovery will take long time.
Mon Dec 18 18:31:00 2017 - [info] Searching new master from slaves..
Mon Dec 18 18:31:00 2017 - [info] Candidate masters from the configuration file:
Mon Dec 18 18:31:00 2017 - [info] 192.168.3.127(192.168.3.127:3306) Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Mon Dec 18 18:31:00 2017 - [info] GTID ON
Mon Dec 18 18:31:00 2017 - [info] Replicating from 192.168.3.126(192.168.3.126:3306)
Mon Dec 18 18:31:00 2017 - [info] Primary candidate for the new Master (candidate_master is set)
Mon Dec 18 18:31:00 2017 - [info] 192.168.3.128(192.168.3.128:3306) Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Mon Dec 18 18:31:00 2017 - [info] GTID ON
Mon Dec 18 18:31:00 2017 - [info] Replicating from 192.168.3.126(192.168.3.126:3306)
Mon Dec 18 18:31:00 2017 - [info] Primary candidate for the new Master (candidate_master is set)
Mon Dec 18 18:31:00 2017 - [info] Non-candidate masters:
Mon Dec 18 18:31:00 2017 - [info] 192.168.3.127(192.168.3.127:3306) Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Mon Dec 18 18:31:00 2017 - [info] GTID ON
Mon Dec 18 18:31:00 2017 - [info] Replicating from 192.168.3.126(192.168.3.126:3306)
Mon Dec 18 18:31:00 2017 - [info] Primary candidate for the new Master (candidate_master is set)
Mon Dec 18 18:31:00 2017 - [info] Searching from candidate_master slaves which have received the latest relay log events..
Mon Dec 18 18:31:00 2017 - [info] Not found.
Mon Dec 18 18:31:00 2017 - [info] Searching from all candidate_master slaves..
Mon Dec 18 18:31:00 2017 - [info] New master is 192.168.3.128(192.168.3.128:3306)
Mon Dec 18 18:31:00 2017 - [info] Starting master failover..
Mon Dec 18 18:31:00 2017 - [info]
From:
192.168.3.126(192.168.3.126:3306) (current master)
+--192.168.3.127(192.168.3.127:3306)
+--192.168.3.128(192.168.3.128:3306) To:
192.168.3.128(192.168.3.128:3306) (new master)
+--192.168.3.127(192.168.3.127:3306)
Mon Dec 18 18:31:00 2017 - [info]
Mon Dec 18 18:31:00 2017 - [info] * Phase 3.3: New Master Recovery Phase..
Mon Dec 18 18:31:00 2017 - [info]
Mon Dec 18 18:31:00 2017 - [info] Waiting all logs to be applied..
Mon Dec 18 18:31:00 2017 - [info] done.
Mon Dec 18 18:31:00 2017 - [info] Replicating from the latest slave 192.168.3.127(192.168.3.127:3306) and waiting to apply..
Mon Dec 18 18:31:00 2017 - [info] Waiting all logs to be applied on the latest slave..
Mon Dec 18 18:31:22 2017 - [info] Resetting slave 192.168.3.128(192.168.3.128:3306) and starting replication from the new master 192.168.3.127(192.168.3.127:3306)..
Mon Dec 18 18:31:23 2017 - [info] Executed CHANGE MASTER.
Mon Dec 18 18:31:23 2017 - [info] Slave started.
Mon Dec 18 18:31:23 2017 - [info] Waiting to execute all relay logs on 192.168.3.128(192.168.3.128:3306)..
Mon Dec 18 18:31:36 2017 - [info] master_pos_wait(mysql-bin.000008:258136066) completed on 192.168.3.128(192.168.3.128:3306). Executed 73 events.
Mon Dec 18 18:31:36 2017 - [info] done.
Mon Dec 18 18:31:36 2017 - [info] done.
Mon Dec 18 18:31:36 2017 - [info] -- Saving binlog from host 192.168.3.126 started, pid: 105211
Mon Dec 18 18:31:37 2017 - [info]
Mon Dec 18 18:31:37 2017 - [info] Log messages from 192.168.3.126 ...
Mon Dec 18 18:31:37 2017 - [info]
Mon Dec 18 18:31:36 2017 - [info] Fetching binary logs from binlog server 192.168.3.126..
Mon Dec 18 18:31:36 2017 - [info] Executing binlog save command: save_binary_logs --command=save --start_file=mysql-bin.000010 --start_pos=190904631 --output_file=/usr/local/masterha/saved_binlog_binlog1_20171218183058.binlog --handle_raw_binlog=0 --skip_filter=1 --disable_log_bin=0 --manager_version=0.56 --oldest_version=5.7.18-log --binlog_dir=/labdata/3306/log
Creating /usr/local/masterha if not exists.. ok.
Concat binary/relay logs from mysql-bin.000010 pos 190904631 to mysql-bin.000011 EOF into /usr/local/masterha/saved_binlog_binlog1_20171218183058.binlog ..
Concat succeeded.
Mon Dec 18 18:31:37 2017 - [info] scp from root@192.168.3.126:/usr/local/masterha/saved_binlog_binlog1_20171218183058.binlog to local:/usr/local/masterha/saved_binlog_192.168.3.126_binlog1_20171218183058.binlog succeeded.
Mon Dec 18 18:31:37 2017 - [info] End of log messages from 192.168.3.126.
Mon Dec 18 18:31:37 2017 - [info] Saved mysqlbinlog size from 192.168.3.126 is 1807 bytes.
Mon Dec 18 18:31:37 2017 - [info] Applying differential binlog /usr/local/masterha/saved_binlog_192.168.3.126_binlog1_20171218183058.binlog ..
Mon Dec 18 18:31:37 2017 - [info] Differential log apply from binlog server succeeded.
Mon Dec 18 18:31:37 2017 - [info] Getting new master's binlog name and position..
Mon Dec 18 18:31:37 2017 - [info] mysql-bin.000005:258135406
Mon Dec 18 18:31:37 2017 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.3.128', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='replicator', MASTER_PASSWORD='xxx';
Mon Dec 18 18:31:37 2017 - [info] Master Recovery succeeded. File:Pos:Exec_Gtid_Set: mysql-bin.000005, 258135406, 16a3fea3-db39-11e7-9862-000c29d98525:1-1155,
29c9acf3-db39-11e7-823c-000c29e5c084:1-15
Mon Dec 18 18:31:37 2017 - [info] Executing master IP activate script:
Mon Dec 18 18:31:37 2017 - [info] /usr/local/bin/master_ip_failover --command=start --ssh_user=root --orig_master_host=192.168.3.126 --orig_master_ip=192.168.3.126 --orig_master_port=3306 --new_master_host=192.168.3.128 --new_master_ip=192.168.3.128 --new_master_port=3306 --new_master_user='mha' --new_master_password='123456'
Unknown option: new_master_user
Unknown option: new_master_password ***************************************************************
Enabling the VIP - 192.168.3.130/24 on new master: 192.168.3.128
*************************************************************** Mon Dec 18 18:31:40 2017 - [info] OK.
Mon Dec 18 18:31:40 2017 - [info] ** Finished master recovery successfully.
Mon Dec 18 18:31:40 2017 - [info] * Phase 3: Master Recovery Phase completed.
Mon Dec 18 18:31:40 2017 - [info]
Mon Dec 18 18:31:40 2017 - [info] * Phase 4: Slaves Recovery Phase..
Mon Dec 18 18:31:40 2017 - [info]
Mon Dec 18 18:31:40 2017 - [info]
Mon Dec 18 18:31:40 2017 - [info] * Phase 4.1: Starting Slaves in parallel..
Mon Dec 18 18:31:40 2017 - [info]
Mon Dec 18 18:31:40 2017 - [info] -- Slave recovery on host 192.168.3.127(192.168.3.127:3306) started, pid: 105235. Check tmp log /usr/local/masterha/192.168.3.127_3306_20171218183058.log if it takes time..
Mon Dec 18 18:31:41 2017 - [info]
Mon Dec 18 18:31:41 2017 - [info] Log messages from 192.168.3.127 ...
Mon Dec 18 18:31:41 2017 - [info]
Mon Dec 18 18:31:40 2017 - [info] Resetting slave 192.168.3.127(192.168.3.127:3306) and starting replication from the new master 192.168.3.128(192.168.3.128:3306)..
Mon Dec 18 18:31:40 2017 - [info] Executed CHANGE MASTER.
Mon Dec 18 18:31:40 2017 - [info] Slave started.
Mon Dec 18 18:31:40 2017 - [info] gtid_wait(16a3fea3-db39-11e7-9862-000c29d98525:1-1155,
29c9acf3-db39-11e7-823c-000c29e5c084:1-15) completed on 192.168.3.127(192.168.3.127:3306). Executed 0 events.
Mon Dec 18 18:31:41 2017 - [info] End of log messages from 192.168.3.127.
Mon Dec 18 18:31:41 2017 - [info] -- Slave on host 192.168.3.127(192.168.3.127:3306) started.
Mon Dec 18 18:31:41 2017 - [info] All new slave servers recovered successfully.
Mon Dec 18 18:31:41 2017 - [info]
Mon Dec 18 18:31:41 2017 - [info] * Phase 5: New master cleanup phase..
Mon Dec 18 18:31:41 2017 - [info]
Mon Dec 18 18:31:41 2017 - [info] Resetting slave info on the new master..
Mon Dec 18 18:31:42 2017 - [info] 192.168.3.128: Resetting slave info succeeded.
Mon Dec 18 18:31:42 2017 - [info] Master failover to 192.168.3.128(192.168.3.128:3306) completed successfully.
Mon Dec 18 18:31:42 2017 - [info] ----- Failover Report ----- app1: MySQL Master failover 192.168.3.126(192.168.3.126:3306) to 192.168.3.128(192.168.3.128:3306) succeeded Master 192.168.3.126(192.168.3.126:3306) is down! Check MHA Manager logs at mha-maxscale-1:/usr/local/masterha/app1/manager.log for details. Started automated(non-interactive) failover.
Invalidated master IP address on 192.168.3.126(192.168.3.126:3306)
Selected 192.168.3.128(192.168.3.128:3306) as a new master.
192.168.3.128(192.168.3.128:3306): OK: Applying all logs succeeded.
192.168.3.128(192.168.3.128:3306): OK: Activated master IP address.
192.168.3.127(192.168.3.127:3306): OK: Slave started, replicating from 192.168.3.128(192.168.3.128:3306)
192.168.3.128(192.168.3.128:3306): Resetting slave info succeeded.
Master failover to 192.168.3.128(192.168.3.128:3306) completed successfully.
从切换日志中可以看到整个切换过程:
1. 检查配置文件;
2. 处理宕机的master,vip摘除;
3. 从宕机的master中保留binlog日志到/usr/loca/masterha/;
4. 识别含有最新更新slave;
5. 应用差异的中继日志(relaylog)到其他Slave;
5. 应该从宕机的master中保存的binlog日志;
6. 提升一个slave为master;
7. 使其他的Slave连接新的Master进行复制
二、测试手动切换
[root@mha-maxscale- masterha]# masterha_stop --conf=/etc/masterha/app1.cnf
[root@mha-maxscale- masterha]# masterha_master_switch --conf=/etc/masterha/app1.cnf --master_state=alive --new_master_host=192.168.3.126 --new_master_port= --orig_master_is_new_slave --running_updates_limit=
各MHA参数说明请见:http://wubx.net/mha-parameters/
MySQL系列:高可用架构之MHA的更多相关文章
- mysql实现高可用架构之MHA
一.简介 MHA(Master HA)是一款开源的 MySQL 的高可用程序,它为 MySQL 主从复制架构提供了 automating master failover 功能.MHA 在监控到 mas ...
- mysql高可用架构之MHA,haproxy实现读写分离详解
MySQL高可用架构之MHA 一.运维人员需要掌握的MySQL技术: 1.基本SQL语句 2.基本的管理[库表数据的管理 权限的管理] 3.容灾 保证数据不丢失. 二.工作中MySQ ...
- 美团点评MySQL数据库高可用架构从MMM到MHA+Zebra以及MHA+Proxy的演进
本文介绍最近几年美团点评MySQL数据库高可用架构的演进过程,以及我们在开源技术基础上做的一些创新.同时,也和业界其它方案进行综合对比,了解业界在高可用方面的进展,和未来我们的一些规划和展望. MMM ...
- (转)MySQL高可用架构之MHA
MySQL高可用架构之MHA 原文:http://www.cnblogs.com/gomysql/p/3675429.html 简介: MHA(Master High Availability)目前 ...
- mysql复制(高可用架构方案的基础)
mysql复制:把一个数据库实例上所有改变复制到另外一个数据库库服务器实例的过程特点:1.没有改变就无所谓复制 ;改变是复制的根本与数据源2.所有的改变:是指可以复制全部改变,也可以复制部分改变 可以 ...
- Oracle Compute云快速搭建MySQL Keepalived高可用架构
最近有个客户在测试Oracle Compute云,他们的应用需要使用MySQL数据库,由于是企业级应用一定要考虑高可用架构,因此有需求要在Oracle Compute云上搭建MySQL高可用集群.客户 ...
- MySQL高可用架构之MHA 原理与实践
MHA简介 关于MHA MHA(Master HA)是一款开源的MySQL的高可用程序,它为MySQL主从复制架构提供了automating master failover 功能.MHA在监控到mas ...
- 【MySQL】MySQL高可用架构之MHA
一.关于MHA MHA(Master HA)是一款开源的MySQL的高可用程序,它为MySQL主从复制架构提供了automating master failover 功能.MHA在监控到master节 ...
- (5.1)mysql高可用系列——高可用架构方案概述
关键词:mysql高可用概述,mysql高可用架构 常用高可用方案 20190918 现在业内常用的MySQL高可用方案有哪些?目前来说,用的比较多的开源方案分内置高可用与外部实现,内置高可用有如下: ...
随机推荐
- windy数(数位DP)
windy数Crawling in process... Crawling failed Time Limit:1000MS Memory Limit:165888KB 64bit I ...
- Codeforces 376A. Night at the Museum
A. Night at the Museum time limit per test 1 second memory limit per test 256 megabytes input standa ...
- Basic Data Structure
Basic Data Structure Time Limit: 7000/3500 MS (Java/Others) Memory Limit: 65536/65536 K (Java/Oth ...
- javascript 之变量对象-09
变量对象 变量对象:每个执行环境(执行上下文)都有一个对应的变量对象(variable object),环境中(执行上下文中)定义的所有变量.函数都保存在这个对象中. 在上篇中说到,当执行流执行一个函 ...
- 使用PostgreSQL进行全文检索
* { color: #3e3e3e } body { font-family: "Helvetica Neue", Helvetica, "Hiragino Sans ...
- <template> 标签
<template> 元素,用于描述一个标准的以 DOM 为基础的方案来实现客户端模板.该模板允许你定义一段可以被转为 HTML 的标记,在页面加载时不生效,但可以在后续进行动态实例化.( ...
- Red Hat 7.0 DNS服务配置笔记
先挂载镜像,然后配置yum,然后安装yum install -y bind 配置静态 IP.DNS就是他本身的IP地址. 修改DNS的配置文件,在后面加入区域配置信息.vim /etc/named.c ...
- Python爬虫入门:Urllib库的基本使用
1.分分钟扒一个网页下来 怎样扒网页呢?其实就是根据URL来获取它的网页信息,虽然我们在浏览器中看到的是一幅幅优美的画面,但是其实是由浏览器解释才呈现出来的,实质它 是一段HTML代码,加 JS.CS ...
- zip 安装mysql 和遇到的坑
在官网下载了mysql 社区版的,官方网址:https://dev.mysql.com/downloads/mysql/ 解压后发现里面没有安装快捷方式,才知道是zip解压,dos窗口安装.这就比界面 ...
- express的学习,与使用
最近在学习vue的一个实战项目,碰到一个express,当时很萌,就随便看了看................ expres是基于node 的一个web框架, 首先可以找到它的官网照着学习 这里只讲一 ...