KingbaseES V8R6 Cluster Operations Series -- trusted_server

Case description:

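In KingbaseES V8R3 and early V8R6 releases, a read/write-splitting cluster shuts down entirely when the gateway address becomes unreachable, and the database service can no longer be accessed. Later releases reduce this dependency on the gateway: when the gateway address is unreachable, some high-availability functions such as failover are affected, but the cluster continues to serve database connections normally.

[Figure not reproduced]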

Applicable version:

KingbaseES V8R6

Cluster gateway configuration:

[kingbase@node101 bin]$ cat ../etc/repmgr.conf |grep trust

trusted_servers='192.168.1.1'

running_under_failure_trusted_servers='on'
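The two settings above can also be read programmatically, for example in a pre-check script. The following is a minimal sketch, assuming repmgr.conf uses simple `key='value'` lines with `#` comments; the function name `parse_trust_settings` is hypothetical and not part of KingbaseES.

```python
def parse_trust_settings(conf_text):
    """Extract the trusted-server settings from repmgr.conf-style text.

    Assumption: one key='value' pair per line, '#' starts a comment.
    """
    settings = {}
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip("'\"")
    # trusted_servers may hold several comma-separated addresses
    servers = [s.strip() for s in settings.get("trusted_servers", "").split(",") if s.strip()]
    run_flag = settings.get("running_under_failure_trusted_servers", "")
    return servers, run_flag


sample = """
trusted_servers='192.168.1.1'
running_under_failure_trusted_servers='on'
"""
servers, flag = parse_trust_settings(sample)
print(servers, flag)  # ['192.168.1.1'] on
```

Judging by its name and by the behavior shown in this case, `running_under_failure_trusted_servers='on'` appears to let the database keep running when the trusted servers cannot be reached; check the product documentation for the exact semantics in your build.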

I. Check the cluster node status

[kingbase@node101 bin]$ ./repmgr cluster show

 ID | Name  | Role    | Status    | Upstream | Location | Priority | Timeline | LSN_Lag | Connection string

----+-------+---------+-----------+----------+----------+----------+----------+---------+---------------------------------------------------------------------------------------------------------------------------------------------------

 1  | node1 | standby |   running | node2    | default  | 100      | 4        | 0 bytes | host=192.168.1.102 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3

 2  | node2 | primary | * running |          | default  | 100      | 4        |         | host=192.168.1.101 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3

II. Simulate a gateway failure

[kingbase@node101 ~]$ ping 192.168.1.1

PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.

From 192.168.1.101 icmp_seq=10 Destination Host Unreachable

From 192.168.1.101 icmp_seq=11 Destination Host Unreachable

From 192.168.1.101 icmp_seq=12 Destination Host Unreachable

.....

--- As shown above, none of the cluster nodes can ping the gateway address any longer.

III. Check the cluster status after the gateway failure

1. Cluster node status

[kingbase@node101 bin]$ ./repmgr cluster show

 ID | Name  | Role    | Status    | Upstream | Location | Priority | Timeline | LSN_Lag | Connection string

----+-------+---------+-----------+----------+----------+----------+----------+---------+---------------------------------------------------------------------------------------------------------------------------------------------------

 1  | node1 | standby |   running | node2    | default  | 100      | 4        | 0 bytes | host=192.168.1.102 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3

 2  | node2 | primary | * running |          | default  | 100      | 4        |         | host=192.168.1.101 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3

2. Database connection test

[kingbase@node102 bin]$ ./ksql -U system test

ksql (V8.0)

Type "help" for help.

                                                       version

----------------------------------------------------------------------------------------------------------------------

 KingbaseES V008R006C007B0012 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit

(1 row)

--- As shown above, after the gateway becomes unreachable, both the cluster node status and the database service remain normal.

3. Check the kbha.log log

Tips:

In a KingbaseES V8R6 cluster, the kbha process tests gateway connectivity once every three seconds.

[2023-04-10 15:57:30] [WARNING] ping host"192.168.1.1" failed

[2023-04-10 15:57:31] [NOTICE] PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.

--- 192.168.1.1 ping statistics ---

2 packets transmitted, 0 received, +2 errors, 100% packet loss, time 999ms

pipe 2

[2023-04-10 15:57:31] [WARNING] ping host"192.168.1.1" failed

[2023-04-10 15:57:31] [DETAIL] average RTT value is not greater than zero

[2023-04-10 15:57:31] [DEBUG] ping process end early. usleep(994400)

--- As shown above, kbha.log records the failed attempts to reach the gateway address.
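To spot these failures without reading the log by hand, the warning lines can be filtered with a regular expression. This is a minimal sketch that assumes the log format matches the excerpt above; `ping_failures` is a hypothetical helper, not a KingbaseES tool.

```python
import re

# Matches lines such as: [2023-04-10 15:57:30] [WARNING] ping host"192.168.1.1" failed
PING_FAIL = re.compile(
    r'^\[(?P<ts>[^\]]+)\] \[WARNING\] ping host\s*"(?P<host>[^"]+)" failed'
)


def ping_failures(log_text):
    """Return (timestamp, host) pairs for every failed-ping warning."""
    hits = []
    for line in log_text.splitlines():
        m = PING_FAIL.match(line.strip())
        if m:
            hits.append((m.group("ts"), m.group("host")))
    return hits


sample = '''[2023-04-10 15:57:30] [WARNING] ping host"192.168.1.1" failed
[2023-04-10 15:57:31] [NOTICE] PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
[2023-04-10 15:57:31] [WARNING] ping host"192.168.1.1" failed
'''
for ts, host in ping_failures(sample):
    print(ts, host)
```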

IV. Cluster failover test

1. Stop the database service on the primary

[kingbase@node101 bin]$ ./sys_ctl stop -D ../../data

2. Check the hamgr.log log on the standby

[2023-04-10 16:13:41] [DEBUG] monitoring node in degraded state for 640 seconds

[2023-04-10 16:13:43] [DEBUG] connecting to: "user=esrep connect_timeout=10 dbname=esrep host=192.168.1.101 port=54321 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3 fallback_application_name=repmgr options=-csearch_path="

[2023-04-10 16:13:43] [DEBUG] monitoring node in degraded state for 642 seconds

[2023-04-10 16:13:45] [DEBUG] connecting to: "user=esrep connect_timeout=10 dbname=esrep host=192.168.1.101 port=54321 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3 fallback_application_name=repmgr options=-csearch_path="

[2023-04-10 16:13:45] [DEBUG] monitoring node in degraded state for 644 seconds

[2023-04-10 16:13:47] [DEBUG] connecting to: "user=esrep connect_timeout=10 dbname=esrep host=192.168.1.101 port=54321 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3 fallback_application_name=repmgr options=-csearch_path="

[2023-04-10 16:13:47] [DEBUG] monitoring node in degraded state for 646 seconds

[2023-04-10 16:13:49] [DEBUG] connecting to: "user=esrep connect_timeout=10 dbname=esrep host=192.168.1.101 port=54321 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3 fallback_application_name=repmgr options=-csearch_path="

[2023-04-10 16:13:49] [DEBUG] monitoring node in degraded state for 648 seconds

--- As shown above, the standby detects that it cannot connect to the primary, but no primary/standby switchover is triggered.
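The "degraded state" messages carry a running counter, which makes it easy to see how long the standby has been monitoring without failing over. A minimal sketch, assuming the message text matches the excerpt above; `degraded_seconds` is a hypothetical helper.

```python
import re

DEGRADED = re.compile(r"monitoring node in degraded state for (\d+) seconds")


def degraded_seconds(log_text):
    """Return the largest degraded-state duration found, or None if absent."""
    values = [int(n) for n in DEGRADED.findall(log_text)]
    return max(values) if values else None


sample = """[2023-04-10 16:13:41] [DEBUG] monitoring node in degraded state for 640 seconds
[2023-04-10 16:13:43] [DEBUG] monitoring node in degraded state for 642 seconds
"""
print(degraded_seconds(sample))  # 642
```

A duration that keeps growing while the primary stays down is a sign that automatic failover is not going to happen by itself.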

3. Check the cluster node status

[kingbase@node102 bin]$ ./repmgr cluster show

 ID | Name  | Role    | Status        | Upstream | Location | Priority | Timeline | LSN_Lag | Connection string

----+-------+---------+---------------+----------+----------+----------+----------+---------+---------------------------------------------------------------------------------------------------------------------------------------------------

 1  | node1 | standby |   running     | ? node2  | default  | 100      | 4        | ?       | host=192.168.1.102 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3

 2  | node2 | primary | ? unreachable | ?        | default  | 100      |          |         | host=192.168.1.101 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3

[WARNING] following issues were detected

  - unable to connect to node "node1" (ID: 1)'s upstream node "node2" (ID: 2)

  - unable to determine if node "node1" (ID: 1) is attached to its upstream node "node2" (ID: 2)

  - unable to connect to node "node2" (ID: 2)

  - node "node2" (ID: 2) is registered as an active primary but is unreachable

[HINT] execute with --verbose option to see connection error messages

[Figure: the primary is in an unreachable state and no failover has taken place]
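The same condition can be detected from the `repmgr cluster show` table in a monitoring script. The sketch below assumes the pipe-separated layout shown above (connection strings are shortened in the sample); `parse_cluster_show` and `unreachable_primaries` are hypothetical helper names.

```python
def parse_cluster_show(output):
    """Parse the pipe-separated table printed by `repmgr cluster show`."""
    rows = []
    header = None
    for line in output.splitlines():
        if "|" not in line:  # skips the ----+---- separator and warning lines
            continue
        cells = [c.strip() for c in line.split("|")]
        if header is None:
            header = cells  # first table line holds the column names
            continue
        rows.append(dict(zip(header, cells)))
    return rows


def unreachable_primaries(rows):
    """Names of nodes registered as primary whose status is unreachable."""
    return [r["Name"] for r in rows
            if r["Role"] == "primary" and "unreachable" in r["Status"]]


sample = """
 ID | Name  | Role    | Status        | Upstream | Location | Priority | Timeline | LSN_Lag | Connection string
----+-------+---------+---------------+----------+----------+----------+----------+---------+------------------
 1  | node1 | standby |   running     | ? node2  | default  | 100      | 4        | ?       | host=192.168.1.102
 2  | node2 | primary | ? unreachable | ?        | default  | 100      |          |         | host=192.168.1.101
[WARNING] following issues were detected
"""
print(unreachable_primaries(parse_cluster_show(sample)))  # ['node2']
```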

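V. Summary

KingbaseES cluster nodes ping the gateway address to test network connectivity between the nodes. A gateway failure affects normal cluster operation: in this test the database stayed available, but failover was not triggered. To keep the gateway check itself highly available, configure multiple gateway addresses in the cluster.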

[kingbase@node101 bin]$ cat ../etc/repmgr.conf |grep trust

trusted_servers='192.168.1.1,192.168.1.254'

running_under_failure_trusted_servers='on'
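The benefit of listing several addresses can be illustrated with a small model. This is a sketch only: it assumes the network is considered healthy when at least one trusted server responds, which is an assumption about the check and not something confirmed by the logs in this case. `network_ok` and the `probe` callback are hypothetical names; a real probe would ping the host.

```python
def network_ok(trusted_servers, probe):
    """Return True if at least one trusted server answers the probe.

    Assumption: any single reachable address is enough.
    """
    return any(probe(host) for host in trusted_servers)


# Simulated reachability: 192.168.1.1 is down, 192.168.1.254 is up.
reachable = {"192.168.1.254"}


def probe(host):
    return host in reachable


print(network_ok(["192.168.1.1"], probe))                   # False
print(network_ok(["192.168.1.1", "192.168.1.254"], probe))  # True
```

Under this assumption, a single configured gateway is a single point of failure for the connectivity check, while a second address keeps the check passing when the first one is down.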
