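KingbaseES V8R6 Cluster Operations Series: Analysis of Shutting Down a Cluster with sys_monitor.sh stop

Case description:

A KingbaseES V8R6 cluster is shut down as a whole by running 'sys_monitor.sh stop'. This case examines which shutdown mode the databases use after 'sys_monitor.sh stop' is run, and how the shutdown affects client access to the database.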

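KingbaseES database shutdown modes (sys_ctl):

The -m option selects the shutdown mode: smart, fast, or immediate.

smart: waits for active transactions to commit and for clients to disconnect on their own, then shuts the database down.

fast: rolls back all active transactions and forcibly disconnects clients, then shuts the database down.

immediate: terminates all server processes at once; the next time the database starts it first goes through recovery. This mode is generally not recommended.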
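The three modes map directly onto the -m option of sys_ctl. As a minimal sketch, a wrapper can validate the mode before any command is issued. The helper name build_stop_command and the default path ./sys_ctl are assumptions made for illustration; the function only builds the argument list and does not run anything.

```python
from typing import List

# Shutdown modes described above. "fast" is the default here only because
# it is the mode observed in this case study.
SHUTDOWN_MODES = ("smart", "fast", "immediate")


def build_stop_command(data_dir: str, mode: str = "fast",
                       sys_ctl: str = "./sys_ctl") -> List[str]:
    """Return the argv for `sys_ctl stop` (hypothetical helper, nothing is executed)."""
    if mode not in SHUTDOWN_MODES:
        raise ValueError(f"unknown shutdown mode: {mode!r}")
    return [sys_ctl, "-m", mode, "stop", "-D", data_dir]


# Prints the same command line that is used later in this article.
print(" ".join(build_stop_command("/data/kingbase/v8r6_054/data")))
```

Rejecting an unknown mode up front avoids passing a typo to a command that stops a production database.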

Applicable version:

KingbaseES V8R6

I. Cluster shutdown test

1. Check the control file before shutting down the database

[kingbase@node101 bin]$ ./sys_controldata -D /data/kingbase/r6ha/data

sys_control version number:            1201

Catalog version number:               202112261

Database system identifier:           7080367334319169673

Database cluster state:               in production

sys_control last modified:             Fri 23 Dec 2022 11:01:58 AM CST

Latest checkpoint location:           4/F5000028

Latest checkpoint's REDO location:    4/F5000028

Latest checkpoint's REDO WAL file:    0000002300000004000000F5

Latest checkpoint's TimeLineID:       35

Latest checkpoint's PrevTimeLineID:   35

......

2. Shut down the cluster with sys_monitor.sh

[kingbase@node101 bin]$ ./sys_monitor.sh stop

.......

2022-12-23 11:04:53 begin to stop DB on "[192.168.1.102]".

waiting for server to shut down.... done

server stopped

2022-12-23 11:04:55 DB on "[192.168.1.102]" stop success.

2022-12-23 11:04:55 begin to stop DB on "[192.168.1.101]".

waiting for server to shut down............ done

server stopped

2022-12-23 11:05:06 DB on "[192.168.1.101]" stop success.

2022-12-23 11:05:06 Done.

3. Check the sys_log logs on the primary and the standby

# Primary

2022-12-23 11:04:57.625 CST,,,10977,,63a51a26.2ae1,3,,2022-12-23 11:01:58 CST,,0,LOG,00000,"received fast shutdown request",,,,,,,,,""

2022-12-23 11:04:57.638 CST,,,10977,,63a51a26.2ae1,4,,2022-12-23 11:01:58 CST,,0,LOG,00000,"aborting any active transactions",,,,,,,,,""

2022-12-23 11:04:57.643 CST,,,10977,,63a51a26.2ae1,5,,2022-12-23 11:01:58 CST,,0,LOG,00000,"background worker ""kwr collector"" (PID 10988) exited with exit code 1",,,,,,,,,""

2022-12-23 11:04:57.643 CST,,,10977,,63a51a26.2ae1,6,,2022-12-23 11:01:58 CST,,0,LOG,00000,"background worker ""logical replication launcher"" (PID 10989) exited with exit code 1",,,,,,,,,""

2022-12-23 11:04:57.643 CST,"system","prod",12318,"192.168.1.102:58496",63a51a8e.301e,1,"INSERT",2022-12-23 11:03:42 CST,10/118,3274,FATAL,57P01,"terminating connection due to administrator command",,,,,,"insert into t1 select * from t1;",,,"kingbase_*&+_"

2022-12-23 11:04:58.876 CST,,,10980,,63a51a26.2ae4,2,,2022-12-23 11:01:58 CST,,0,LOG,00000,"checkpoint complete: wrote 15484 buffers (94.5%); 1 WAL file(s) added, 0 removed, 0 recycled; write=12.169 s, sync=0.426 s, total=13.518 s; sync files=6, longest=0.224 s, average=0.071 s; distance=694028 kB, estimate=694028 kB",,,,,,,,,""

2022-12-23 11:04:58.876 CST,,,10980,,63a51a26.2ae4,3,,2022-12-23 11:01:58 CST,,0,LOG,00000,"shutting down",,,,,,,,,""

2022-12-23 11:04:58.913 CST,,,10980,,63a51a26.2ae4,4,,2022-12-23 11:01:58 CST,,0,LOG,00000,"checkpoint starting: shutdown immediate",,,,,,,,,""

2022-12-23 11:04:59.494 CST,,,10980,,63a51a26.2ae4,5,,2022-12-23 11:01:58 CST,,0,LOG,00000,"checkpoint complete: wrote 4108 buffers (25.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.130 s, sync=0.301 s, total=0.615 s; sync files=2, longest=0.271 s, average=0.150 s; distance=223475 kB, estimate=646973 kB",,,,,,,,,""

2022-12-23 11:05:06.494 CST,,,10977,,63a51a26.2ae1,7,,2022-12-23 11:01:58 CST,,0,LOG,00000,"database system is shut down",,,,,,,,,""

# Standby

2022-12-22 17:18:09.565 CST,,,1887,,63a4206f.75f,3,,2022-12-22 17:16:31 CST,,0,LOG,00000,"received fast shutdown request",,,,,,,,,""

2022-12-22 17:18:09.587 CST,,,1887,,63a4206f.75f,4,,2022-12-22 17:16:31 CST,,0,LOG,00000,"aborting any active transactions",,,,,,,,,""

2022-12-22 17:18:09.598 CST,,,1909,,63a42079.775,2,,2022-12-22 17:16:41 CST,,0,FATAL,57P01,"terminating walreceiver process due to administrator command",,,,,,,,,""

2022-12-22 17:18:09.599 CST,,,1906,,63a4206f.772,1,,2022-12-22 17:16:31 CST,,0,LOG,00000,"shutting down",,,,,,,,,""

2022-12-22 17:18:09.599 CST,,,1906,,63a4206f.772,2,,2022-12-22 17:16:31 CST,,0,LOG,00000,"restartpoint starting: shutdown immediate",,,,,,,,,""

2022-12-22 17:18:11.179 CST,,,1906,,63a4206f.772,3,,2022-12-22 17:16:31 CST,,0,LOG,00000,"restartpoint complete: wrote 44628 buffers (68.1%); 0 WAL file(s) added, 0 removed, 35 recycled; write=1.014 s, sync=0.458 s, total=1.580 s; sync files=6, longest=0.174 s, average=0.076 s; distance=573439 kB, estimate=573439 kB",,,,,,,,,""

2022-12-22 17:18:11.179 CST,,,1906,,63a4206f.772,4,,2022-12-22 17:16:31 CST,,0,LOG,00000,"recovery restart point at 1/CD000028","Last completed transaction was at log time 2022-12-22 17:17:06.024532+08.",,,,,,,,""

2022-12-22 17:18:11.241 CST,,,1887,,63a4206f.75f,5,,2022-12-22 17:16:31 CST,,0,LOG,00000,"database system is shut down",,,,,,,,,""

.......

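As the logs above show:

1) The database receives the shutdown request "received fast shutdown request".

2) The database terminates and rolls back the transactions in progress and disconnects the clients.

3) A checkpoint is performed before the database shuts down ("checkpoint starting: shutdown immediate" on the primary, "restartpoint starting: shutdown immediate" on the standby).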
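The three observations above can also be extracted from the log programmatically. The sketch below parses a few of the primary's csvlog lines (copied from the excerpt above) with Python's csv module. The column positions are an assumption: they follow the PostgreSQL csvlog column order, which these excerpts appear to match. The function names are hypothetical.

```python
import csv
import io
import re

# Column positions, assumed to follow the PostgreSQL csvlog column order.
COL_TIME, COL_PID, COL_SEVERITY, COL_MESSAGE = 0, 3, 11, 13

# Four records copied from the primary's sys_log excerpt above.
SAMPLE = '''2022-12-23 11:04:57.625 CST,,,10977,,63a51a26.2ae1,3,,2022-12-23 11:01:58 CST,,0,LOG,00000,"received fast shutdown request",,,,,,,,,""
2022-12-23 11:04:57.643 CST,"system","prod",12318,"192.168.1.102:58496",63a51a8e.301e,1,"INSERT",2022-12-23 11:03:42 CST,10/118,3274,FATAL,57P01,"terminating connection due to administrator command",,,,,,"insert into t1 select * from t1;",,,"kingbase_*&+_"
2022-12-23 11:04:58.913 CST,,,10980,,63a51a26.2ae4,4,,2022-12-23 11:01:58 CST,,0,LOG,00000,"checkpoint starting: shutdown immediate",,,,,,,,,""
2022-12-23 11:05:06.494 CST,,,10977,,63a51a26.2ae1,7,,2022-12-23 11:01:58 CST,,0,LOG,00000,"database system is shut down",,,,,,,,,""
'''


def parse_csvlog(text):
    """Yield (time, pid, severity, message) for each csvlog record in text."""
    for record in csv.reader(io.StringIO(text)):
        if len(record) > COL_MESSAGE:  # skips blank or truncated lines
            yield (record[COL_TIME], record[COL_PID],
                   record[COL_SEVERITY], record[COL_MESSAGE])


def summarize_shutdown(text):
    """Report the shutdown mode, terminated sessions, and logged shutdown checkpoint."""
    summary = {"mode": None, "checkpoint_logged": False, "terminated_sessions": 0}
    for _time, _pid, severity, message in parse_csvlog(text):
        request = re.match(r"received (\w+) shutdown request", message)
        if request:
            summary["mode"] = request.group(1)
        elif re.match(r"(checkpoint|restartpoint) starting: shutdown", message):
            summary["checkpoint_logged"] = True
        elif severity == "FATAL" and message.startswith("terminating connection"):
            summary["terminated_sessions"] += 1
    return summary


print(summarize_shutdown(SAMPLE))
```

Using the csv module rather than splitting on commas matters here, because the message and query fields are quoted and may themselves contain commas or doubled quotes.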

4. Client connection test

# Connect to the database from a client

[kingbase@node102 bin]$ ./ksql -h 192.168.1.101 -U system test

ksql (V8.0)

Type "help" for help.

test=# \c prod

You are now connected to database "prod" as user "system".

# Row count of table t1 before the shutdown

prod=# select count(*) from  t1;

  count

----------

 16000000

(1 row)

# Run an insert; when the database shuts down, the transaction is rolled back and the client connection is dropped

prod=# insert into t1 select * from t1;

FATAL:  terminating connection due to administrator command

server closed the connection unexpectedly

        This probably means the server terminated abnormally

        before or while processing the request.

5. Check the control file after the database service has been shut down

[kingbase@node101 bin]$ ./sys_controldata -D /data/kingbase/r6ha/data

sys_control version number:            1201

Catalog version number:               202112261

Database system identifier:           7080367334319169673

Database cluster state:               shut down

sys_control last modified:             Fri 23 Dec 2022 11:04:59 AM CST

Latest checkpoint location:           5/2D000028

Latest checkpoint's REDO location:    5/2D000028

Latest checkpoint's REDO WAL file:    00000023000000050000002D

Latest checkpoint's TimeLineID:       35

Latest checkpoint's PrevTimeLineID:   35

.......

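--- As shown above, a new checkpoint was written when the database was shut down: the latest checkpoint location moved from 4/F5000028 to 5/2D000028, and the cluster state changed from "in production" to "shut down".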
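Comparing the two sys_controldata outputs by eye is error-prone, so the comparison can be scripted. The sketch below parses two snapshots (the values are copied from steps 1 and 5 above) and lists the fields that changed. The function names are hypothetical, and the parser only assumes the "name: value" layout visible in the output above.

```python
BEFORE = """\
Database cluster state:               in production
sys_control last modified:             Fri 23 Dec 2022 11:01:58 AM CST
Latest checkpoint location:           4/F5000028
Latest checkpoint's REDO location:    4/F5000028
Latest checkpoint's REDO WAL file:    0000002300000004000000F5
Latest checkpoint's TimeLineID:       35
"""

AFTER = """\
Database cluster state:               shut down
sys_control last modified:             Fri 23 Dec 2022 11:04:59 AM CST
Latest checkpoint location:           5/2D000028
Latest checkpoint's REDO location:    5/2D000028
Latest checkpoint's REDO WAL file:    00000023000000050000002D
Latest checkpoint's TimeLineID:       35
"""


def parse_controldata(text):
    """Parse 'name: value' lines into a dict; lines without a value are skipped."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, since timestamps contain colons too.
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def changed_fields(before, after):
    """Return {field: (old, new)} for fields whose value differs between snapshots."""
    old, new = parse_controldata(before), parse_controldata(after)
    return {key: (old[key], new[key])
            for key in old if key in new and old[key] != new[key]}


for name, (old_value, new_value) in changed_fields(BEFORE, AFTER).items():
    print(f"{name}: {old_value} -> {new_value}")
```

On a live system the two snapshots could be captured by running sys_controldata -D with the data directory before and after the shutdown and passing the captured text to changed_fields.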

6. Query the data after the database service is restarted (the transaction was rolled back)

[kingbase@node102 bin]$ ./ksql -h 192.168.1.101 -U system test

ksql (V8.0)

Type "help" for help.

test=# \c prod

You are now connected to database "prod" as user "system".

prod=# select count(*) from  t1;

  count

----------

 16000000

(1 row)
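Because the row count is unchanged, the interrupted INSERT left nothing behind, so a client can safely run the statement again once the cluster is back. A minimal sketch of such retry logic follows, keyed on the SQLSTATE 57P01 that the primary's log recorded for the terminated session. ConnectionLost and run_with_retry are hypothetical names; a real database driver raises its own exception type, and the operation passed in would have to reconnect before re-running the statement.

```python
import time

# SQLSTATE logged with "terminating connection due to administrator command".
ADMIN_SHUTDOWN = "57P01"


class ConnectionLost(Exception):
    """Hypothetical stand-in for a driver error that carries an SQLSTATE."""

    def __init__(self, sqlstate):
        super().__init__(f"connection lost (SQLSTATE {sqlstate})")
        self.sqlstate = sqlstate


def run_with_retry(operation, attempts=3, delay_seconds=5.0, sleep=time.sleep):
    """Call operation(); retry only when the server ended the session with 57P01."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ConnectionLost as exc:
            # Other errors, or running out of attempts, are passed to the caller.
            if exc.sqlstate != ADMIN_SHUTDOWN or attempt == attempts:
                raise
            sleep(delay_seconds)  # give the cluster time to come back up
```

Retrying is only appropriate when the whole unit of work was rolled back, as it was in this test; a multi-statement workflow that committed some steps before the shutdown needs its own recovery logic.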

II. sys_ctl fast stop shutdown test

1. Check the checkpoint before shutting down

sys_control version number:            1201

Catalog version number:               202202151

Database system identifier:           7100823070678531859

Database cluster state:               in production

sys_control last modified:             Thu 22 Dec 2022 05:30:20 PM CST

Latest checkpoint location:           0/242CFEF0

Latest checkpoint's REDO location:    0/242CFEF0

Latest checkpoint's REDO WAL file:    000000010000000000000024

Latest checkpoint's TimeLineID:       1

Latest checkpoint's PrevTimeLineID:   1

.......

2. Shut down the database with sys_ctl fast stop

[kingbase@node2 bin]$ ./sys_ctl -m fast stop -D /data/kingbase/v8r6_054/data

waiting for server to shut down.... done

server stopped

3. Check the sys_log log

[kingbase@node2 sys_log]$ tail -f kingbase-2022-12-22_165801.log

2022-12-22 16:58:01.351 CST [31158] LOG:  database system was shut down at 2022-12-22 16:56:55 CST

2022-12-22 16:58:01.361 CST [31156] LOG:  database system is ready to accept connections

2022-12-22 16:59:16.314 CST [31156] LOG:  received fast shutdown request

2022-12-22 16:59:16.320 CST [31156] LOG:  aborting any active transactions

2022-12-22 16:59:16.320 CST [31876] FATAL:  terminating connection due to administrator command

2022-12-22 16:59:16.329 CST [31156] LOG:  background worker "kwr collector" (PID 31166) exited with exit code 1

2022-12-22 16:59:16.329 CST [31156] LOG:  background worker "logical replication launcher" (PID 31167) exited with exit code 1

2022-12-22 16:59:16.329 CST [31159] LOG:  shutting down

2022-12-22 16:59:16.534 CST [31156] LOG:  database system is shut down

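As the log above shows:

1) sys_ctl fast stop interrupts the transactions in progress (they are rolled back).

2) Client connections to the database are terminated.

3) The log records no checkpoint message before the shutdown.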

4. Check the checkpoint information after shutting down

[kingbase@node2 bin]$ ./sys_controldata  -D /data/kingbase/v8r6_054/data/

sys_control version number:            1201

Catalog version number:               202202151

Database system identifier:           7100823070678531859

Database cluster state:               shut down

sys_control last modified:             Thu 22 Dec 2022 05:32:03 PM CST

Latest checkpoint location:           0/242E5298

Latest checkpoint's REDO location:    0/242E5298

Latest checkpoint's REDO WAL file:    000000010000000000000024

Latest checkpoint's TimeLineID:       1

Latest checkpoint's PrevTimeLineID:   1

........

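-- As shown above, after sys_ctl fast stop the latest checkpoint is still in the same WAL file (000000010000000000000024), and sys_log contains no message about a checkpoint being triggered. Note, however, that the checkpoint location itself did advance, from 0/242CFEF0 to 0/242E5298, and that the cluster state is "shut down". This suggests a shutdown checkpoint was still written; the missing log lines may simply mean that checkpoint logging (the log_checkpoints parameter) was not enabled on this instance.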
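The checkpoint locations printed before and after each shutdown can be compared numerically. The sketch below assumes the PostgreSQL-style LSN notation, in which the part before the slash is the high 32 bits and the part after it the low 32 bits, both in hexadecimal; the sys_controldata output above appears to follow it. The function names are hypothetical.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert an LSN such as '4/F5000028' to an absolute byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lsn_distance(start: str, end: str) -> int:
    """Bytes of WAL between two LSNs (negative if end precedes start)."""
    return lsn_to_int(end) - lsn_to_int(start)


# Standalone instance, sys_ctl fast stop (section II).
print(lsn_distance("0/242CFEF0", "0/242E5298"))  # 86952
# Cluster primary, sys_monitor.sh stop (section I).
print(lsn_distance("4/F5000028", "5/2D000028"))  # 939524096
```

The standalone instance advanced by about 85 KiB, which is why the checkpoint stays inside the same WAL file. The cluster primary advanced by 896 MiB, roughly the sum of the two "distance=" values (694028 kB and 223475 kB) reported by the checkpoint messages in its sys_log.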

Note: running sys_ctl stop without specifying a shutdown mode is equivalent to 'fast stop'.

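III. Summary

When the database services are shut down with sys_monitor.sh stop in a cluster, or with sys_ctl fast stop:

1) Both use the 'fast shutdown' mode to shut down the database.

2) The database terminates and rolls back the transactions in progress and disconnects the clients.

3) With sys_monitor.sh stop, sys_log shows a checkpoint being performed before the database shuts down. In the sys_ctl fast stop test, sys_log shows no checkpoint message, although the checkpoint location in the control file still advanced (from 0/242CFEF0 to 0/242E5298).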
