MySQL MHA: Online Master Switch
Preconditions for an online master switch
1. All nodes must be running normally, whether they are the original master, the new master, or any other slave.
    # Abort if any configured server is down.
    if ( $#dead_servers >= 0 ) {
        $log->error(
            "Switching master should not be started if one or more servers is down."
        );
        $log->info("Dead Servers:");
        $_server_manager->print_dead_servers();
        croak;
    }
2. The master must be healthy, and its details (such as the server ID and the binlog position) must be retrievable.
    # Abort if the current master cannot be identified.
    $orig_master = $_server_manager->get_current_alive_master();
    if ( !$orig_master ) {
        $log->error(
            "Failed to get current master. Maybe replication is misconfigured or configuration file is broken?"
        );
        croak;
    }
3. MHA Manager/Monitor must be stopped.
    # A true return value means the advisory lock could not be taken,
    # i.e. a monitor is still running against the current master.
    $log->info("Checking MHA is not monitoring or doing failover..");
    if ( $orig_master->get_monitor_advisory_lock() ) {
        $log->error(
            "Getting advisory lock failed on the current master. MHA Monitor runs on the current master. Stop MHA Manager/Monitor and try again."
        );
        croak;
    }
4. There must be no long-running updates on the current master and no long-running queries on the new master (running_updates_limit defaults to 1, in seconds).
    # Current master: abort if any update statement has been running too long.
    my @threads =
      $orig_master->get_running_update_threads( $g_running_updates_limit + 1 );
    if ( $#threads >= 0 ) {
        $log->error(
            sprintf(
                "We should not start online master switch when one of connections are running long updates on the current master(%s). Currently %d update thread(s) are running.",
                $orig_master->get_hostinfo(),
                $#threads + 1
            )
        );
        MHA::DBHelper::print_threads_util( \@threads, );
        croak;
    }

    # Separate excerpt: the same kind of check against the new master,
    # using running_seconds_limit.
    my @threads = $new_master->get_running_threads($g_running_seconds_limit);
    if ( $#threads >= 0 ) {
        $log->error(
            sprintf(
                "We should not start online master switch when one of connections are running long queries on the new master(%s). Currently %d thread(s) are running.",
                $new_master->get_hostinfo(),
                $#threads + 1
            )
        );
        MHA::DBHelper::print_threads_util( \@threads, );
        croak;
    }
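The four preconditions can be summarised as one function over a snapshot of the cluster. This is only a rough sketch in Python, not MHA code: `ClusterState` and all of its fields are hypothetical names made up for illustration, and unlike MHA (which aborts with `croak` at the first failed check) it collects every failure so they can be reported together.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ClusterState:
    """Hypothetical snapshot of the facts the four checks look at."""
    dead_servers: List[str] = field(default_factory=list)
    current_master: Optional[str] = None
    monitor_lock_held: bool = False        # MHA Manager/Monitor still running
    long_updates_on_master: int = 0        # updates exceeding running_updates_limit
    long_queries_on_new_master: int = 0    # queries exceeding running_seconds_limit


def precheck_failures(state: ClusterState) -> List[str]:
    """Return one message per violated precondition (empty list = safe to switch)."""
    failures: List[str] = []
    if state.dead_servers:
        failures.append("servers down: " + ", ".join(state.dead_servers))
    if state.current_master is None:
        failures.append("current master could not be identified")
    if state.monitor_lock_held:
        failures.append("MHA Manager/Monitor is still running")
    if state.long_updates_on_master > 0:
        failures.append(
            "%d long update(s) on the current master" % state.long_updates_on_master
        )
    if state.long_queries_on_new_master > 0:
        failures.append(
            "%d long query(ies) on the new master" % state.long_queries_on_new_master
        )
    return failures
```

An empty result corresponds to the case where none of the `croak` branches above would fire.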
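Steps of an online master switch
1. Configuration check (Configuration Check Phase)
  - Check the replication topology and which servers are alive.
  - Run FLUSH NO_WRITE_TO_BINLOG TABLES on the master to close open tables.
  - Check that replication is healthy.
  - Pick the new master and verify that it meets the requirements.
  - Check the replication filtering rules of the current master, temporarily configuring it as a slave of a dummy host to do so.
2. Block writes on the current master (Rejecting updates Phase)
  - Try to invoke master_ip_online_change_script against the current master. It is recommended that this script block writes on the master and take down the VIP.
  - Run FLUSH TABLES WITH READ LOCK to block all writes on the current master, then record its latest binlog position.
3. Enable the new master (switch_master)
  - Wait for the new master to catch up with replication, then record its latest binlog position.
  - Try to invoke master_ip_online_change_script against the new master. It is recommended that this script grant write access on the new master (the former slave) and bring up the VIP.
  - Turn off READ_ONLY on the new master.
4. Switch all slaves in parallel (Switching slaves in parallel)
  - Using the final position of the original master recorded in step 2, wait until each slave has applied all binlogs.
  - Using the starting position of the new master recorded in step 3, repoint every slave to the new master.
5. Reset the original master (Starting orig master as a new slave)
  - Run UNLOCK TABLES on the original master to release the lock.
  - Using the starting position of the new master recorded in step 3, reconfigure the original master as a new slave.
6. Clean up replication info on the new master (New master cleanup phase)
  - Run STOP SLAVE to stop replication.
  - On version 5.5, run CHANGE MASTER TO MASTER_HOST='' to clear the old replication info.
  - Run RESET SLAVE /*! ALL */ to reset replication.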
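To make the ordering easier to see, the six steps can be laid out as a flat list of (host, statement) pairs. This is a rough sketch, not what MHA executes verbatim: `online_switch_plan` is a hypothetical helper, the angle-bracket placeholders stand for values only known at run time, SHOW MASTER STATUS is assumed as the way a binlog position is read, the optional master_ip_online_change_script calls are left out, and step 4 is shown sequentially even though MHA switches slaves in parallel.

```python
from typing import List, Tuple

Step = Tuple[str, str]  # (host, SQL statement or placeholder)

WAIT = "SELECT MASTER_POS_WAIT(<orig_file>, <orig_pos>, 0)"
REPOINT = "CHANGE MASTER TO <new master coordinates>"


def online_switch_plan(orig_master: str, new_master: str,
                       other_slaves: List[str]) -> List[Step]:
    """Return the statements of steps 1-6 in the order described above."""
    plan: List[Step] = [
        # Step 1: close open tables on the current master.
        (orig_master, "FLUSH NO_WRITE_TO_BINLOG TABLES"),
        # Step 2: block writes, then record the final binlog position.
        (orig_master, "FLUSH TABLES WITH READ LOCK"),
        (orig_master, "SHOW MASTER STATUS"),
        # Step 3: new master catches up, records its position, accepts writes.
        (new_master, WAIT),
        (new_master, "SHOW MASTER STATUS"),
        (new_master, "SET GLOBAL read_only = 0"),
    ]
    # Step 4: every remaining slave catches up and is repointed.
    for slave in other_slaves:
        plan += [(slave, WAIT), (slave, REPOINT), (slave, "START SLAVE")]
    # Step 5: release the lock and turn the original master into a slave.
    plan += [
        (orig_master, "UNLOCK TABLES"),
        (orig_master, REPOINT),
        (orig_master, "START SLAVE"),
    ]
    # Step 6: the new master forgets its old replication settings.
    plan += [(new_master, "STOP SLAVE"), (new_master, "RESET SLAVE ALL")]
    return plan
```

The point worth noticing is that the read lock on the original master is held from step 2 until step 5, so writes are rejected for the whole time the other servers are being repointed.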
Output log of an online master switch:
[root@DBproxy app1]# masterha_master_switch --conf=/data/masterha/app1/app1.cnf --master_state=alive --new_master_host=192.168.0.60 --orig_master_is_new_slave --running_updates_limit= --interactive=
Sat Jul :: - [info] MHA::MasterRotate version 0.56.
Sat Jul :: - [info] Starting online master switch..
Sat Jul :: - [info]
Sat Jul :: - [info] * Phase : Configuration Check Phase..
Sat Jul :: - [info]
Sat Jul :: - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Jul :: - [info] Reading application default configuration from /data/masterha/app1/app1.cnf..
Sat Jul :: - [info] Reading server configuration from /data/masterha/app1/app1.cnf..
Sat Jul :: - [info] GTID failover mode =
Sat Jul :: - [info] Current Alive Master: 192.168.0.50(192.168.0.50:)
Sat Jul :: - [info] Alive Slaves:
Sat Jul :: - [info] 192.168.0.60(192.168.0.60:) Version=5.6.-log (oldest major version between slaves) log-bin:enabled
Sat Jul :: - [info] Replicating from 192.168.0.50(192.168.0.50:)
Sat Jul :: - [info] Primary candidate for the new Master (candidate_master is set)
Sat Jul :: - [info] Executing FLUSH NO_WRITE_TO_BINLOG TABLES. This may take long time..
Sat Jul :: - [info] ok.
Sat Jul :: - [info] Checking MHA is not monitoring or doing failover..
Sat Jul :: - [info] Checking replication health on 192.168.0.60..
Sat Jul :: - [info] ok.
Sat Jul :: - [info] 192.168.0.60 can be new master.
Sat Jul :: - [info]
From:
192.168.0.50(192.168.0.50:) (current master)
 +--192.168.0.60(192.168.0.60:)

To:
192.168.0.60(192.168.0.60:) (new master)
 +--192.168.0.50(192.168.0.50:)
Sat Jul :: - [info] Checking whether 192.168.0.60(192.168.0.60:) is ok for the new master..
Sat Jul :: - [info] ok.
Sat Jul :: - [info] 192.168.0.50(192.168.0.50:): SHOW SLAVE STATUS returned empty result. To check replication filtering rules, temporarily executing CHANGE MASTER to a dummy host.
Sat Jul :: - [info] 192.168.0.50(192.168.0.50:): Resetting slave pointing to the dummy host.
Sat Jul :: - [info] ** Phase : Configuration Check Phase completed.
Sat Jul :: - [info]
Sat Jul :: - [info] * Phase : Rejecting updates Phase..
Sat Jul :: - [info]
Sat Jul :: - [warning] master_ip_online_change_script is not defined. Skipping disabling writes on the current master.
Sat Jul :: - [info] Locking all tables on the orig master to reject updates from everybody (including root):
Sat Jul :: - [info] Executing FLUSH TABLES WITH READ LOCK..
Sat Jul :: - [info] ok.
Sat Jul :: - [info] Orig master binlog:pos is mysql-bin.:.
Sat Jul :: - [info] Waiting to execute all relay logs on 192.168.0.60(192.168.0.60:)..
Sat Jul :: - [info] master_pos_wait(mysql-bin.:) completed on 192.168.0.60(192.168.0.60:). Executed events.
Sat Jul :: - [info] done.
Sat Jul :: - [info] Getting new master's binlog name and position..
Sat Jul :: - [info] mysql-bin.:
Sat Jul :: - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.0.60', MASTER_PORT=, MASTER_LOG_FILE='mysql-bin.000008', MASTER_LOG_POS=, MASTER_USER='repl', MASTER_PASSWORD='xxx';
Sat Jul :: - [info]
Sat Jul :: - [info] * Switching slaves in parallel..
Sat Jul :: - [info]
Sat Jul :: - [info] Unlocking all tables on the orig master:
Sat Jul :: - [info] Executing UNLOCK TABLES..
Sat Jul :: - [info] ok.
Sat Jul :: - [info] Starting orig master as a new slave..
Sat Jul :: - [info] Resetting slave 192.168.0.50(192.168.0.50:) and starting replication from the new master 192.168.0.60(192.168.0.60:)..
Sat Jul :: - [info] Executed CHANGE MASTER.
Sat Jul :: - [info] Slave started.
Sat Jul :: - [info] All new slave servers switched successfully.
Sat Jul :: - [info]
Sat Jul :: - [info] * Phase : New master cleanup phase..
Sat Jul :: - [info]
Sat Jul :: - [info] 192.168.0.60: Resetting slave info succeeded.
Sat Jul :: - [info] Switching master to 192.168.0.60(192.168.0.60:) completed successfully.
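The log above is excerpted from: https://www.cnblogs.com/polestar/p/5737121.html
Switching in GTID and non-GTID modes
Both "original master becomes a new slave" and "original slave becomes a new slave" go through the change_master_and_start_slave method in /MHA/ServerManager.pm: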
    # GTID auto-positioning when enabled (and the target is not MariaDB),
    # otherwise classic binlog file/position coordinates.
    if ( $self->is_gtid_auto_pos_enabled() && !$target->{is_mariadb} ) {
        $dbhelper->change_master_gtid( $addr, $master->{port},
            $master->{repl_user}, $master->{repl_password} );
    }
    else {
        $dbhelper->change_master( $addr,
            $master->{port}, $master_log_file, $master_log_pos,
            $master->{repl_user}, $master->{repl_password} );
    }
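The switch follows the replication mode that was already in use: if a slave was replicating with GTID before the switch, it keeps using GTID replication afterwards.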
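The difference between the two branches comes down to which coordinates end up in the CHANGE MASTER TO statement. The sketch below is a hypothetical Python helper (`build_change_master` is not an MHA function) that builds the statement text for either mode. It assumes MySQL's MASTER_AUTO_POSITION=1 option for the GTID case, and it interpolates values directly into the string purely for illustration, which would not be safe for untrusted input.

```python
from typing import Optional


def build_change_master(host: str, port: int, user: str, password: str,
                        use_gtid: bool,
                        log_file: Optional[str] = None,
                        log_pos: Optional[int] = None) -> str:
    """Build a CHANGE MASTER TO statement for GTID or file/position mode."""
    common = (
        f"MASTER_HOST='{host}', MASTER_PORT={port}, "
        f"MASTER_USER='{user}', MASTER_PASSWORD='{password}'"
    )
    if use_gtid:
        # GTID mode: the slave works out where to start by itself.
        return f"CHANGE MASTER TO {common}, MASTER_AUTO_POSITION=1"
    if log_file is None or log_pos is None:
        raise ValueError("file/position mode needs log_file and log_pos")
    # Classic mode: explicit binlog coordinates of the new master.
    return (
        f"CHANGE MASTER TO {common}, "
        f"MASTER_LOG_FILE='{log_file}', MASTER_LOG_POS={log_pos}"
    )
```

In file/position mode the coordinates are the new master's position recorded during the switch, which is why that position has to be captured before any slave is repointed.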
To decide whether replication has caught up, MHA uses the master_pos_wait method in /MHA/DBHelper.pm:
    # Timeout 0 means "wait without a time limit".
    use constant Master_Pos_Wait_NoTimeout_SQL =>
      "SELECT MASTER_POS_WAIT(?,?,0) AS Result";

    sub master_pos_wait($$$) {
        my $self        = shift;
        my $binlog_file = shift;
        my $binlog_pos  = shift;
        my $sth = $self->{dbh}->prepare(Master_Pos_Wait_NoTimeout_SQL);
        $sth->execute( $binlog_file, $binlog_pos );
        my $href = $sth->fetchrow_hashref;
        return $href->{Result};
    }
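MHA relies on MySQL's MASTER_POS_WAIT function to make sure that all logs from the original master have been applied. Executed_Gtid_Set is not compared anywhere in this process.
The MASTER_POS_WAIT function
Syntax: SELECT MASTER_POS_WAIT(file, pos[, timeout]).
The file and pos arguments are the master binlog coordinates to reach. The function blocks until the current slave has applied events up to that position, then returns the number of log events it had to wait for.
The timeout argument is optional. If it is omitted the function waits indefinitely, and timeout <= 0 behaves the same as omitting it. If it is positive, the function waits at most that many seconds and returns -1 on timeout.
Other return values: NULL if the slave SQL thread is not started or is stopped while waiting; 0 if the specified position had already been reached.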
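The return values described above map onto four outcomes. The function below is a small sketch of how a caller might classify them; `interpret_master_pos_wait` and the outcome labels are hypothetical names, with Python's `None` standing in for SQL NULL.

```python
from typing import Optional


def interpret_master_pos_wait(result: Optional[int]) -> str:
    """Classify the value returned by SELECT MASTER_POS_WAIT(...)."""
    if result is None:
        # NULL: SQL thread not started, or stopped during the wait.
        return "error"
    if result == -1:
        # Only possible when a positive timeout was given.
        return "timeout"
    if result == 0:
        # The slave was already at or past the requested position.
        return "already_reached"
    if result > 0:
        # Number of log events applied while waiting.
        return "caught_up"
    raise ValueError("unexpected MASTER_POS_WAIT result: %r" % (result,))
```

Because MHA passes a timeout of 0 in Master_Pos_Wait_NoTimeout_SQL, the "timeout" outcome should never appear there, and a NULL result is the case that signals a problem.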
References:
https://www.cnblogs.com/xiaoboluo768/p/5210820.html