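MHA SSH check, replication check, and online switch log analysis
I. SSH check log analysis
Execution steps and the corresponding log:
1. Read the configuration file on the MHA manager node.
2. Use the configuration file to obtain the details of each host, then run the SSH check host by host.
3. Each host connects over SSH to every other host except itself.
4. The SSH check passes once every host can log in to every other host over passwordless SSH.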
[root@A2 app1]# masterha_check_ssh --conf=/etc/masterha/app1.conf
Sun Jun :: - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sun Jun :: - [info] Reading application default configuration from /etc/masterha/app1.conf..
Sun Jun :: - [info] Reading server configuration from /etc/masterha/app1.conf..
Sun Jun :: - [info] Starting SSH connection tests..
Sun Jun :: - [debug]
Sun Jun :: - [debug] Connecting via SSH from root@172.16.13.15(172.16.13.15:) to root@172.16.15.3(172.16.15.3:)..
Sun Jun :: - [debug] ok.
Sun Jun :: - [debug] Connecting via SSH from root@172.16.13.15(172.16.13.15:) to root@172.16.15.2(172.16.15.2:)..
Sun Jun :: - [debug] ok.
Sun Jun :: - [debug]
Sun Jun :: - [debug] Connecting via SSH from root@172.16.15.3(172.16.15.3:) to root@172.16.13.15(172.16.13.15:)..
Sun Jun :: - [debug] ok.
Sun Jun :: - [debug] Connecting via SSH from root@172.16.15.3(172.16.15.3:) to root@172.16.15.2(172.16.15.2:)..
Sun Jun :: - [debug] ok.
Sun Jun :: - [debug]
Sun Jun :: - [debug] Connecting via SSH from root@172.16.15.2(172.16.15.2:) to root@172.16.13.15(172.16.13.15:)..
Sun Jun :: - [debug] ok.
Sun Jun :: - [debug] Connecting via SSH from root@172.16.15.2(172.16.15.2:) to root@172.16.15.3(172.16.15.3:)..
Sun Jun :: - [debug] ok.
Sun Jun :: - [info] All SSH connection tests passed successfully.
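The pairwise pattern in the log above (every host connects to every other host, but never to itself) can be sketched as follows. This is only an illustration of the ordering, not MHA's own code; the function name `ssh_check_pairs` is hypothetical, and no SSH connection is actually made.

```python
from itertools import permutations

def ssh_check_pairs(hosts):
    """Return every ordered (source, target) pair where source != target."""
    return list(permutations(hosts, 2))

# Hosts in the order they appear in the log above.
hosts = ["172.16.13.15", "172.16.15.3", "172.16.15.2"]
pairs = ssh_check_pairs(hosts)

for src, dst in pairs:
    # Only prints the planned connection; nothing is executed.
    print(f"Connecting via SSH from root@{src} to root@{dst}")
```

With three hosts this gives 3 x 2 = 6 connections, which matches the six "Connecting via SSH" lines in the log.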
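II. Replication check log analysis
1. Read the configuration file and, based on it, check the state of every host, the MHA version, and whether GTID-based replication is in use, then derive the current replication topology.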
[root@A2 app1]# masterha_check_repl --conf=/etc/masterha/app1.conf
Sun Jun :: - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sun Jun :: - [info] Reading application default configuration from /etc/masterha/app1.conf..
Sun Jun :: - [info] Reading server configuration from /etc/masterha/app1.conf..
Sun Jun :: - [info] MHA::MasterMonitor version 0.56.
Sun Jun :: - [info] GTID failover mode =
Sun Jun :: - [info] Dead Servers:
Sun Jun :: - [info] Alive Servers:
Sun Jun :: - [info] 172.16.13.15(172.16.13.15:)
Sun Jun :: - [info] 172.16.15.3(172.16.15.3:)
Sun Jun :: - [info] 172.16.15.2(172.16.15.2:)
Sun Jun :: - [info] Alive Slaves:
Sun Jun :: - [info] 172.16.13.15(172.16.13.15:) Version=5.7.-log (oldest major version between slaves) log-bin:enabled
Sun Jun :: - [info] Replicating from 172.16.15.3(172.16.15.3:)
Sun Jun :: - [info] Primary candidate for the new Master (candidate_master is set)
Sun Jun :: - [info] 172.16.15.2(172.16.15.2:) Version=5.7.-log (oldest major version between slaves) log-bin:enabled
Sun Jun :: - [info] Replicating from 172.16.15.3(172.16.15.3:)
Sun Jun :: - [info] Current Alive Master: 172.16.15.3(172.16.15.3:)
2. Check the slave configurations, the replication filtering settings, and whether GTID replication is supported.
Sun Jun :: - [info] Checking slave configurations..
Sun Jun :: - [info] Checking replication filtering settings..
Sun Jun :: - [info] binlog_do_db= , binlog_ignore_db=
Sun Jun :: - [info] Replication filtering check ok.
Sun Jun :: - [info] GTID (with auto-pos) is not supported
3. Run the SSH connection tests and check the MHA Node version.
Sun Jun :: - [info] Starting SSH connection tests..
Sun Jun :: - [info] All SSH connection tests passed successfully.
Sun Jun :: - [info] Checking MHA Node version..
Sun Jun :: - [info] Version check ok.
4. Check the SSH configuration on the master, test that the recovery script (save_binary_logs) is usable, and check the binlog settings.
Sun Jun :: - [info] Checking SSH publickey authentication settings on the current master..
Sun Jun :: - [info] HealthCheck: SSH to 172.16.15.3 is reachable.
Sun Jun :: - [info] Master MHA Node version is 0.56.
Sun Jun :: - [info] Checking recovery script configurations on 172.16.15.3(172.16.15.3:)..
Sun Jun :: - [info] Executing command: save_binary_logs --command=test --start_pos= --binlog_dir=/usr/local/mysql/data --output_file=/tmp/save_binary_logs_test --manager_version=0.56 --start_file=mysql_bin.
Sun Jun :: - [info] Connecting to root@172.16.15.3(172.16.15.3:)..
Creating /tmp if not exists.. ok.
Checking output directory is accessible or not..
ok.
Binlog found at /usr/local/mysql/data, up to mysql_bin.
Sun Jun :: - [info] Binlog setting check done.
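5. Check the SSH configuration on the slaves, test that the differential relay log script (apply_diff_relay_logs) is usable, check each slave's recovery environment and relay logs, test the MySQL connection and privileges, and clean up the test files.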
Sun Jun :: - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Sun Jun :: - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=172.16.13.15 --slave_ip=172.16.13.15 --slave_port= --workdir=/tmp --target_version=5.7.-log --manager_version=0.56 --relay_log_info=/usr/local/mysql/data/relay-log.info --relay_dir=/usr/local/mysql/data/ --slave_pass=xxx
Sun Jun :: - [info] Connecting to root@172.16.13.15(172.16.13.15:)..
Checking slave recovery environment settings..
Opening /usr/local/mysql/data/relay-log.info ... ok.
Relay log found at /usr/local/mysql/data, up to mysqlserver-relay-bin.
Temporary relay log file is /usr/local/mysql/data/mysqlserver-relay-bin.
Testing mysql connection and privileges.. done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Sun Jun :: - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=172.16.15.2 --slave_ip=172.16.15.2 --slave_port= --workdir=/tmp --target_version=5.7.-log --manager_version=0.56 --relay_log_info=/usr/local/mysql/data/relay-log.info --relay_dir=/usr/local/mysql/data/ --slave_pass=xxx
Sun Jun :: - [info] Connecting to root@172.16.15.2(172.16.15.2:)..
Checking slave recovery environment settings..
Opening /usr/local/mysql/data/relay-log.info ... ok.
Relay log found at /usr/local/mysql/data, up to A2-relay-bin.
Temporary relay log file is /usr/local/mysql/data/A2-relay-bin.
Testing mysql connection and privileges.. done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Sun Jun :: - [info] Slaves settings check done.
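6. Print the current replication topology, check the replication health of each slave, and check the status of the failover-related scripts. This completes the replication check.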
Sun Jun :: - [info]
172.16.15.3(172.16.15.3:) (current master)
+--172.16.13.15(172.16.13.15:)
+--172.16.15.2(172.16.15.2:)
Sun Jun :: - [info] Checking replication health on 172.16.13.15..
Sun Jun :: - [info] ok.
Sun Jun :: - [info] Checking replication health on 172.16.15.2..
Sun Jun :: - [info] ok.
Sun Jun :: - [info] Checking master_ip_failover_script status:
Sun Jun :: - [info] /var/log/masterha/scripts/master_ip_failover --command=status --ssh_user=root --orig_master_host=172.16.15.3 --orig_master_ip=172.16.15.3 --orig_master_port=
Checking the Status of the script.. OK
Sun Jun :: - [info] OK.
Sun Jun :: - [warning] shutdown_script is not defined.
Sun Jun :: - [info] Got exit code (Not master dead). MySQL Replication Health is OK.
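When this check runs unattended, the output has to be reduced to a verdict. The sketch below is one possible way to do that, assuming the log format shown above: it counts lines per log level and looks for the final "MySQL Replication Health is OK." message. The function name `summarize_check_repl` is hypothetical, and treating `[error]` as a level that can appear is an assumption, since the log above contains only `[info]` and `[warning]` lines.

```python
import re

# Matches the level tag printed on each log line, e.g. "[info]".
# "debug" and "error" are included as assumptions; the log above
# for masterha_check_repl shows only "info" and "warning".
LEVEL_RE = re.compile(r"\[(debug|info|warning|error)\]")
OK_MARKER = "MySQL Replication Health is OK."

def summarize_check_repl(log_text):
    """Return (healthy, counts): the verdict and the line count per level."""
    counts = {"debug": 0, "info": 0, "warning": 0, "error": 0}
    for line in log_text.splitlines():
        match = LEVEL_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return OK_MARKER in log_text, counts

# The last few lines of the log above, copied verbatim.
sample = """\
Sun Jun :: - [info] Checking replication health on 172.16.15.2..
Sun Jun :: - [info] ok.
Sun Jun :: - [warning] shutdown_script is not defined.
Sun Jun :: - [info] Got exit code (Not master dead). MySQL Replication Health is OK.
"""
healthy, counts = summarize_check_repl(sample)
print(healthy, counts)
```

Warnings such as the missing global configuration file or the undefined shutdown_script do not change the verdict, so they are counted separately rather than treated as failures.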
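III. Online switch log analysis
1. Read the configuration file, check whether GTID replication is supported, and determine the current replication topology.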
[root@A2 app1]# masterha_master_switch --conf=/etc/masterha/app1.conf --master_state=alive --new_master_host=172.16.13.15 --new_master_port= --orig_master_is_new_slave --running_updates_limit=
Sun Jun :: - [info] MHA::MasterRotate version 0.56.
Sun Jun :: - [info] Starting online master switch..
Sun Jun :: - [info]
Sun Jun :: - [info] * Phase : Configuration Check Phase..
Sun Jun :: - [info]
Sun Jun :: - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sun Jun :: - [info] Reading application default configuration from /etc/masterha/app1.conf..
Sun Jun :: - [info] Reading server configuration from /etc/masterha/app1.conf..
Sun Jun :: - [info] GTID failover mode =
Sun Jun :: - [info] Current Alive Master: 172.16.15.3(172.16.15.3:)
Sun Jun :: - [info] Alive Slaves:
Sun Jun :: - [info] 172.16.13.15(172.16.13.15:) Version=5.7.-log (oldest major version between slaves) log-bin:enabled
Sun Jun :: - [info] Replicating from 172.16.15.3(172.16.15.3:)
Sun Jun :: - [info] Primary candidate for the new Master (candidate_master is set)
Sun Jun :: - [info] 172.16.15.2(172.16.15.2:) Version=5.7.-log (oldest major version between slaves) log-bin:enabled
Sun Jun :: - [info] Replicating from 172.16.15.3(172.16.15.3:)
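2. After confirmation, execute FLUSH NO_WRITE_TO_BINLOG TABLES on the master. This closes all open tables, and the FLUSH statement itself is not written to the binlog. Then check replication health and confirm which host will be the new master.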
It is better to execute FLUSH NO_WRITE_TO_BINLOG TABLES on the master before switching. Is it ok to execute on 172.16.15.3(172.16.15.3:)? (YES/no): yes
Sun Jun :: - [info] Executing FLUSH NO_WRITE_TO_BINLOG TABLES. This may take long time..
Sun Jun :: - [info] ok.
Sun Jun :: - [info] Checking MHA is not monitoring or doing failover..
Sun Jun :: - [info] Checking replication health on 172.16.13.15..
Sun Jun :: - [info] ok.
Sun Jun :: - [info] Checking replication health on 172.16.15.2..
Sun Jun :: - [info] ok.
Sun Jun :: - [info] 172.16.13.15 can be new master.
Sun Jun :: - [info]
From:
172.16.15.3(172.16.15.3:) (current master)
+--172.16.13.15(172.16.13.15:)
+--172.16.15.2(172.16.15.2:)
To:
172.16.13.15(172.16.13.15:) (new master)
+--172.16.15.2(172.16.15.2:)
+--172.16.15.3(172.16.15.3:)
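The From/To topology above follows directly from the command-line options: the chosen slave becomes the master, the remaining slaves stay slaves, and the old master is added as a slave because --orig_master_is_new_slave was given. A minimal sketch of that derivation, with hypothetical function names (`switched_topology`, `render`) and no claim to mirror MHA's internals:

```python
def switched_topology(old_master, slaves, new_master, orig_master_is_new_slave=True):
    """Return (master, slaves) as they should look after an online switch."""
    if new_master not in slaves:
        raise ValueError(f"{new_master} is not a slave of {old_master}")
    # The remaining slaves keep their role and are repointed at the new master.
    new_slaves = [s for s in slaves if s != new_master]
    if orig_master_is_new_slave:
        # With --orig_master_is_new_slave the old master rejoins as a slave.
        new_slaves.append(old_master)
    return new_master, new_slaves

def render(master, slaves, label):
    """Format a topology in the same tree style the log uses."""
    lines = [f"{master} ({label})"]
    lines += [f" +--{s}" for s in slaves]
    return "\n".join(lines)

master, slaves = switched_topology(
    old_master="172.16.15.3",
    slaves=["172.16.13.15", "172.16.15.2"],
    new_master="172.16.13.15",
)
print(render(master, slaves, "new master"))
```

Passing `orig_master_is_new_slave=False` leaves the old master out of the new topology in this sketch.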
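3. Start the switch from the old master to the new master and check whether the new master is eligible. To check the replication filtering rules, the old master is temporarily pointed at a dummy host with CHANGE MASTER TO and then reset.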
Starting master switch from 172.16.15.3(172.16.15.3:) to 172.16.13.15(172.16.13.15:)? (yes/NO): yes
Sun Jun :: - [info] Checking whether 172.16.13.15(172.16.13.15:) is ok for the new master..
Sun Jun :: - [info] ok.
Sun Jun :: - [info] 172.16.15.3(172.16.15.3:): SHOW SLAVE STATUS returned empty result. To check replication filtering rules, temporarily executing CHANGE MASTER to a dummy host.
Sun Jun :: - [info] 172.16.15.3(172.16.15.3:): Resetting slave pointing to the dummy host.
Sun Jun :: - [info] ** Phase : Configuration Check Phase completed.
Sun Jun :: - [info]
4. Run master_ip_online_change with --command=stop to disable the virtual IP on the old master, then execute FLUSH TABLES WITH READ LOCK on the old master to take a global read lock.
Sun Jun :: - [info] * Phase : Rejecting updates Phase..
Sun Jun :: - [info]
Sun Jun :: - [info] Executing master ip online change script to disable write on the current master:
Sun Jun :: - [info] /var/log/masterha/scripts/master_ip_online_change --command=stop --orig_master_host=172.16.15.3 --orig_master_ip=172.16.15.3 --orig_master_port= --orig_master_user='root' --orig_master_password='' --new_master_host=172.16.13.15 --new_master_ip=172.16.13.15 --new_master_port= --new_master_user='root' --new_master_password='' --orig_master_ssh_user=root --new_master_ssh_user=root --orig_master_is_new_slave
***************************************************************
Disabling the VIP - 172.16.13.141/ on old master: 172.16.15.3
Disabled the VIP successfully
***************************************************************
Sun Jun :: - [info] ok.
Sun Jun :: - [info] Locking all tables on the orig master to reject updates from everybody (including root):
Sun Jun :: - [info] Executing FLUSH TABLES WITH READ LOCK..
Sun Jun :: - [info] ok.
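5. Record the old master's binlog position and wait for the new master to apply its relay logs up to that position. Then read the new master's binlog position, which the other slaves use later in CHANGE MASTER TO, enable the virtual IP on the new master, and set read_only=0.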
Sun Jun :: - [info] Orig master binlog:pos is mysql_bin.:.
Sun Jun :: - [info] Waiting to execute all relay logs on 172.16.13.15(172.16.13.15:)..
Sun Jun :: - [info] master_pos_wait(mysql_bin.:) completed on 172.16.13.15(172.16.13.15:). Executed events.
Sun Jun :: - [info] done.
Sun Jun :: - [info] Getting new master's binlog name and position..
Sun Jun :: - [info] mysql_bin.:
Sun Jun :: - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='172.16.13.15', MASTER_PORT=, MASTER_LOG_FILE='mysql_bin.000058', MASTER_LOG_POS=, MASTER_USER='root', MASTER_PASSWORD='xxx';
Sun Jun :: - [info] Executing master ip online change script to allow write on the new master:
Sun Jun :: - [info] /var/log/masterha/scripts/master_ip_online_change --command=start --orig_master_host=172.16.15.3 --orig_master_ip=172.16.15.3 --orig_master_port= --orig_master_user='root' --orig_master_password='' --new_master_host=172.16.13.15 --new_master_ip=172.16.13.15 --new_master_port= --new_master_user='root' --new_master_password='' --orig_master_ssh_user=root --new_master_ssh_user=root --orig_master_is_new_slave
***************************************************************
Enabling the VIP - 172.16.13.141/ on new master: 172.16.13.15
Enabled the VIP successfully
***************************************************************
Sun Jun :: - [info] ok.
Sun Jun :: - [info] Setting read_only= on 172.16.13.15(172.16.13.15:)..
Sun Jun :: - [info] ok.
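The two SQL statements at the heart of this step can be sketched as plain string builders. MASTER_POS_WAIT() and CHANGE MASTER TO are standard MySQL 5.7 features, but the helper names below are hypothetical. The port (3306) and both binlog positions (154) are placeholder values, because the real numbers are missing from the log above, and using `mysql_bin.000058` as the old master's binlog file is likewise an assumption made for illustration.

```python
def master_pos_wait_sql(log_file, log_pos):
    """Statement a slave runs to block until it has applied events up to log_file:log_pos."""
    return f"SELECT MASTER_POS_WAIT('{log_file}', {log_pos});"

def change_master_sql(host, port, log_file, log_pos, user, password):
    """Statement that repoints a slave at the new master's binlog coordinates."""
    return (
        f"CHANGE MASTER TO MASTER_HOST='{host}', MASTER_PORT={port}, "
        f"MASTER_LOG_FILE='{log_file}', MASTER_LOG_POS={log_pos}, "
        f"MASTER_USER='{user}', MASTER_PASSWORD='{password}';"
    )

# Placeholder values: port 3306 and position 154 are NOT taken from the log.
wait_stmt = master_pos_wait_sql("mysql_bin.000058", 154)
change_stmt = change_master_sql(
    host="172.16.13.15",
    port=3306,
    log_file="mysql_bin.000058",
    log_pos=154,
    user="root",
    password="xxx",
)
print(wait_stmt)
print(change_stmt)
```

The wait must finish before the repointing: a slave that runs CHANGE MASTER TO before it has applied everything from the old master would skip those events.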
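6. Switch the slaves in parallel: each slave applies its relay logs up to the old master's binlog position and then runs CHANGE MASTER TO against the new master. The old master is released with UNLOCK TABLES and repointed at the new master in the same way. At this point the slave switch is complete.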
Sun Jun :: - [info]
Sun Jun :: - [info] * Switching slaves in parallel..
Sun Jun :: - [info]
Sun Jun :: - [info] -- Slave switch on host 172.16.15.2(172.16.15.2:) started, pid:
Sun Jun :: - [info]
Sun Jun :: - [info] Log messages from 172.16.15.2 ...
Sun Jun :: - [info]
Sun Jun :: - [info] Waiting to execute all relay logs on 172.16.15.2(172.16.15.2:)..
Sun Jun :: - [info] master_pos_wait(mysql_bin.:) completed on 172.16.15.2(172.16.15.2:). Executed events.
Sun Jun :: - [info] done.
Sun Jun :: - [info] Resetting slave 172.16.15.2(172.16.15.2:) and starting replication from the new master 172.16.13.15(172.16.13.15:)..
Sun Jun :: - [info] Executed CHANGE MASTER.
Sun Jun :: - [info] Slave started.
Sun Jun :: - [info] End of log messages from 172.16.15.2 ...
Sun Jun :: - [info]
Sun Jun :: - [info] -- Slave switch on host 172.16.15.2(172.16.15.2:) succeeded.
Sun Jun :: - [info] Unlocking all tables on the orig master:
Sun Jun :: - [info] Executing UNLOCK TABLES..
Sun Jun :: - [info] ok.
Sun Jun :: - [info] Starting orig master as a new slave..
Sun Jun :: - [info] Resetting slave 172.16.15.3(172.16.15.3:) and starting replication from the new master 172.16.13.15(172.16.13.15:)..
Sun Jun :: - [info] Executed CHANGE MASTER.
Sun Jun :: - [info] Slave started.
Sun Jun :: - [info] All new slave servers switched successfully.
Sun Jun :: - [info]
7. Clean up the new master by resetting its slave information.
Sun Jun :: - [info] * Phase : New master cleanup phase..
Sun Jun :: - [info]
Sun Jun :: - [info] 172.16.13.15: Resetting slave info succeeded.
Sun Jun :: - [info] Switching master to 172.16.13.15(172.16.13.15:) completed successfully.
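Put together, the online switch traced in this section is an ordered sequence of actions against specific hosts. The sketch below lists that sequence as data so the ordering is easy to inspect. It is a reading of the log above, not MHA's implementation; the function name `online_switch_plan` is hypothetical, and the action strings are descriptive labels rather than commands to execute.

```python
def online_switch_plan(old_master, new_master, other_slaves):
    """Return the ordered (host, action) steps seen in the online switch log."""
    plan = [
        (old_master, "FLUSH NO_WRITE_TO_BINLOG TABLES"),
        (old_master, "master_ip_online_change --command=stop (disable VIP)"),
        (old_master, "FLUSH TABLES WITH READ LOCK"),
        (new_master, "wait until relay logs are applied up to the old master position"),
        (new_master, "master_ip_online_change --command=start (enable VIP)"),
        (new_master, "set read_only=0"),
    ]
    # MHA switches the remaining slaves in parallel; they are listed
    # sequentially here only to keep the sketch simple.
    for slave in other_slaves:
        plan.append((slave, "wait until relay logs are applied up to the old master position"))
        plan.append((slave, "CHANGE MASTER TO new master, start slave"))
    plan += [
        (old_master, "UNLOCK TABLES"),
        (old_master, "CHANGE MASTER TO new master, start slave"),
        (new_master, "reset slave info"),
    ]
    return plan

plan = online_switch_plan("172.16.15.3", "172.16.13.15", ["172.16.15.2"])
for number, (host, action) in enumerate(plan, start=1):
    print(f"{number:2d}. {host}: {action}")
```

The key ordering property is that writes are blocked on the old master (VIP removed, global read lock taken) before the new master is opened for writes, and the lock is released only after the other slaves have been repointed.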