使用见证服务器:

见证服务器是一个正常的KingbaseES实例,不是流复制群集的一部分; 其目的是,如果发生故障转移情况,则提供证明它是主服务器本身不可用的证据,而不是例如在不同物理位置之间的网络分裂。见证服务器的典型用例是双节点流复制设置,其中主要和备用服务器位于不同的位置(数据中心)。通过在与主服务器相同的位置(数据中心)中创建见证服务器,如果主服务器变得不可用,则备用服务器可以决定是否可以在不“脑裂”情况的情况下提升为主:如果它无法看到见证人或主服务器,它可能存在网络级中断,它不应该提升为主。如果它可以看到见证人但不能看到主节点,这证明没有网络中断且主本身不可用,因此它可以提升自己为主。

对于更复杂的复制方案,例如使用多个数据中心,最好使用基于位置的故障转移,这可确保只有与主服务器位于同一位置的节点才能成为主节点。

要创建见证服务器,请在与群集的主服务器位于同一物理位置的服务器上设置普通的PostgreSQL实例。不应该在与主服务器同一个物理主机创建见证服务器,否则如果主服务器由于硬件问题失败,见证服务器会失效。

数据库版本:

test=# select version();
version
----------------------------------------------------------------------------------------------------------------------
KingbaseES V008R006C003B0010 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit
(1 row)

repmgr cluster原架构:

[kingbase@node2 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+----------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
1 | node248 | primary | * running | | default | 100 | 18 | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
3 | node243 | standby | running | node248 | default | 100 | 18 | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
5 | node243B | standby | running | node248 | default | 100 | 18 | host=192.168.7.243 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2

一、创建witness服务器

=注意:witness服务器,应该是一个独立的主机节点不能和主库或备库在同一个主机上,并且witness和其他主机之间不构成流复制,所以witness是一个独立的primary实例,其数据库systemID,不应该和其他数据库一致,需要单独initdb一个实例,不能是通过clone或copy生成数据库。=

1)初始化实例(node2节点)

=将cluster其他节点的软件安装文件,拷贝到witness节点,然后重新初始一个实例=

 [kingbase@node2 bin]$ ./initdb -D /home/kingbase/cluster/R6HA/KHA/kingbase/data -E utf8 -U system -W
......

配置repmgr extension:

启动数据库服务:

[kingbase@node2 bin]$ ./sys_ctl -D /home/kingbase/cluster/R6HA/KHA/kingbase/data start

......

server started

2)创建repmgr元数据库和schema

[kingbase@node2 bin]$ ./ksql -U system test
ksql (V8.0)
Type "help" for help. # 创建esrep用户
test=# create user esrp with superuser;
CREATE ROLE
test=# alter user esrep with password 'Kingbaseha110';
ALTER ROLE #创建esrep数据库
test=# create database esrep owner esrep;
CREATE DATABASE
test=# \c esrep esrep
You are now connected to database "esrep" as user "esrep".
esrep=# \d
List of relations
Schema | Name | Type | Owner
--------+---------------------+------+--------
public | sys_stat_statements | view | system
(1 row) # 创建repmgr schema
esrep=# create schema repmgr;
CREATE SCHEMA

二、将witness加入repmgr cluster

1)配置repmgr.conf文件

[kingbase@node2 etc]$ cat repmgr.conf
on_bmj=off
node_id=2
node_name=node249
promote_command='/home/kingbase/cluster/R6HA/KHA/kingbase/bin/repmgr standby promote -f /home/kingbase/cluster/R6HA/KHA/kingbase/etc/repmgr.conf'
follow_command='/home/kingbase/cluster/R6HA/KHA/kingbase/bin/repmgr standby follow -f /home/kingbase/cluster/R6HA/KHA/kingbase/etc/repmgr.conf -W --upstream-node-id=%n'
conninfo='host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2' log_file='/home/kingbase/cluster/R6HA/KHA/kingbase/hamgr.log'
data_directory='/home/kingbase/cluster/R6HA/KHA/kingbase/data'
sys_bindir='/home/kingbase/cluster/R6HA/KHA/kingbase/bin'
ssh_options='-q -o ConnectTimeout=10 -o StrictHostKeyChecking=no -o ServerAliveInterval=2 -o ServerAliveCountMax=5 -p 22'
reconnect_attempts=2
reconnect_interval=3
failover='automatic'
recovery='automatic'
monitoring_history='no'
trusted_servers='192.168.7.1'
virtual_ip='192.168.7.240/24'
net_device='enp0s3'
ipaddr_path='/sbin'
arping_path='/sbin'
synchronous='quorum'
repmgrd_pid_file='/home/kingbase/cluster/R6HA/KHA/kingbase/hamgrd.pid'
ping_path='/usr/bin'
#priority=0

2)注册witness到repmgr cluster

[kingbase@node2 bin]$ ./repmgr witness register -h 192.168.7.248
# -h 指向主库节点ip
INFO: connecting to witness node "node249" (ID: 2)
INFO: connecting to primary node
NOTICE: attempting to install extension "repmgr"
NOTICE: "repmgr" extension successfully installed
INFO: witness registration complete
NOTICE: witness node "node249" (ID: 2) successfully registered [kingbase@node2 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+----------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
1 | node248 | primary | * running | | default | 100 | 18 | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
2 | node249 | witness | * running | node248 | default | 0 | 1 | host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
3 | node243 | standby | running | node248 | default | 100 | 18 | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
5 | node243B | standby | running | node248 | default | 100 | 18 | host=192.168.7.243 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2

3)查看witness元数据库数据信息

=witness注册到repmgr cluster后,自动在esrep数据库的repmgr schema下创建repmgr元数据对象=

 [kingbase@node2 bin]$ ./ksql -U esrep esrep
ksql (V8.0)
Type "help" for help. esrep=# \d repmgr.*
Table "repmgr.events"
Column | Type | Collation | Nullable | Default
-----------------+--------------------------+-----------+----------+-------------------
node_id | integer | | not null |
event | text | | not null |
successful | boolean | | not null | true
event_timestamp | timestamp with time zone | | not null | CURRENT_TIMESTAMP
details | text | | | Index "repmgr.idx_monitoring_history_time"
Column | Type | Key? | Definition
-------------------+--------------------------+------+-------------------
last_monitor_time | timestamp with time zone | yes | last_monitor_time
standby_node_id | integer | yes | standby_node_id
btree, for table "repmgr.monitoring_history" Table "repmgr.monitoring_history"
Column | Type | Collation | Nullable | Default
---------------------------+--------------------------+-----------+----------+---------
primary_node_id | integer | | not null |
standby_node_id | integer | | not null |
last_monitor_time | timestamp with time zone | | not null |
last_apply_time | timestamp with time zone | | |
last_wal_primary_location | pg_lsn | | not null |
last_wal_standby_location | pg_lsn | | |
replication_lag | bigint | | not null |
apply_lag | bigint | | not null |
Indexes:
"idx_monitoring_history_time" btree (last_monitor_time, standby_node_id) Table "repmgr.nodes"
Column | Type | Collation | Nullable | Default
------------------+----------------------------+-----------+----------+-----------------
node_id | integer | | not null |
upstream_node_id | integer | | |
active | boolean | | not null | true
node_name | text | | not null |
type | text | | not null |
location | text | | not null | 'default'::text
priority | integer | | not null | 100
conninfo | text | | not null |
repluser | character varying(63 char) | | not null |
slot_name | text | | |
config_file | text | | not null |
Indexes:
"nodes_pkey" PRIMARY KEY, btree (node_id)
Check constraints:
"nodes_type_check" CHECK (type = ANY (ARRAY['primary'::text, 'standby'::text, 'witness'::text, 'bdr'::text]))
Foreign-key constraints:
"nodes_upstream_node_id_fkey" FOREIGN KEY (upstream_node_id) REFERENCES repmgr.nodes(node_id) DEFERRABLE
Referenced by:
TABLE "repmgr.nodes" CONSTRAINT "nodes_upstream_node_id_fkey" FOREIGN KEY (upstream_node_id) REFERENCES repmgr.nodes(node_id) DEFERRABLE Index "repmgr.nodes_pkey"
Column | Type | Key? | Definition
---------+---------+------+------------
node_id | integer | yes | node_id
primary key, btree, for table "repmgr.nodes" View "repmgr.replication_status"
Column | Type | Collation | Nullable | Default
---------------------------+--------------------------+-----------+----------+---------
primary_node_id | integer | | |
standby_node_id | integer | | |
standby_name | text | | |
node_type | text | | |
active | boolean | | |
last_monitor_time | timestamp with time zone | | |
last_wal_primary_location | pg_lsn | | |
last_wal_standby_location | pg_lsn | | |
replication_lag | text | | |
replication_time_lag | interval | | |
apply_lag | text | | |
communication_time_lag | interval | | | View "repmgr.show_nodes"
Column | Type | Collation | Nullable | Default
--------------------+---------+-----------+----------+---------
node_id | integer | | |
node_name | text | | |
active | boolean | | |
upstream_node_id | integer | | |
upstream_node_name | text | | |
type | text | | |
priority | integer | | |
conninfo | text | | | Table "repmgr.voting_term"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
term | integer | | not null |
Indexes:
"voting_term_restrict" UNIQUE, btree ((true))
Rules:
voting_term_delete AS
ON DELETE TO repmgr.voting_term DO INSTEAD NOTHING Index "repmgr.voting_term_restrict"
Column | Type | Key? | Definition
--------+---------+------+------------
bool | boolean | yes | (true)
unique, btree, for table "repmgr.voting_term"

三、witness节点注册故障分析

=如下所示,witness在其他节点的状态为“? unreachable ”。=

[kingbase@node1 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+----------+---------+---------------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
1 | node248 | primary | * running | | default | 100 | 18 | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
2 | node249 | witness | ? unreachable | node248 | default | 0 | ? | host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
3 | node243 | standby | running | node248 | default | 100 | 18 | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
5 | node243B | standby | running | node248 | default | 100 | 18 | host=192.168.7.243 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2 WARNING: following issues were detected
- unable to connect to node "node249" (ID: 2)

1)测试ksql到witness节点的连接(连接失败)

[kingbase@node1 bin]$ ./ksql -h 192.168.7.249 -U esrep esrep
ksql: error: could not connect to server: could not connect to server: No route to host
Is the server running on host "192.168.7.249" and accepting
TCP/IP connections on port 54321?
# 节点ping
[kingbase@node1 bin]$ ping 192.168.7.249
PING 192.168.7.249 (192.168.7.249) 56(84) bytes of data.
64 bytes from 192.168.7.249: icmp_seq=1 ttl=64 time=0.513 ms
64 bytes from 192.168.7.249: icmp_seq=2 ttl=64 time=0.390 ms
64 bytes from 192.168.7.249: icmp_seq=3 ttl=64 time=0.478 ms
^C
--- 192.168.7.249 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2001ms
rtt min/avg/max/mdev = 0.390/0.460/0.513/0.054 ms

2)查看witness服务器防火墙配置

[root@node2 shell]# iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source destination
ACCEPT udp -- anywhere anywhere udp dpt:domain
ACCEPT tcp -- anywhere anywhere tcp dpt:domain
ACCEPT udp -- anywhere anywhere udp dpt:bootps
ACCEPT tcp -- anywhere anywhere tcp dpt:bootps
ACCEPT all -- anywhere anywhere ctstate RELATED,ESTABLISHED
ACCEPT all -- anywhere anywhere
INPUT_direct all -- anywhere anywhere
INPUT_ZONES_SOURCE all -- anywhere anywhere
INPUT_ZONES all -- anywhere anywhere
ACCEPT icmp -- anywhere anywhere
REJECT all -- anywhere anywhere reject-with icmp-host-prohibited Chain FORWARD (policy ACCEPT)
target prot opt source destination
ACCEPT all -- anywhere bogon/24 ctstate RELATED,ESTABLISHED
ACCEPT all -- 192.168.122.0/24 anywhere
ACCEPT all -- anywhere anywhere
REJECT all -- anywhere anywhere reject-with icmp-port-unreachable
REJECT all -- anywhere anywhere reject-with icmp-port-unreachable
ACCEPT all -- anywhere anywhere ctstate RELATED,ESTABLISHED
ACCEPT all -- anywhere anywhere
FORWARD_direct all -- anywhere anywhere
FORWARD_IN_ZONES_SOURCE all -- anywhere anywhere
FORWARD_IN_ZONES all -- anywhere anywhere
FORWARD_OUT_ZONES_SOURCE all -- anywhere anywhere
FORWARD_OUT_ZONES all -- anywhere anywhere
ACCEPT icmp -- anywhere anywhere
REJECT all -- anywhere anywhere reject-with icmp-host-prohibited
......

=== 有以上可知,witness服务器节点防火墙被启动===

3)清理witness主机防火墙规则

[root@node2 shell]# iptables -F

4)测试witness主机数据库连接

[kingbase@node1 bin]$ ./ksql -h 192.168.7.249 -U system test
ksql (V8.0)
Type "help" for help.

5)查看集群节点状态

[kingbase@node1 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+----------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
1 | node248 | primary | * running | | default | 100 | 18 | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
2 | node249 | witness | * running | node248 | default | 0 | 1 | host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
3 | node243 | standby | running | node248 | default | 100 | 18 | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
5 | node243B | standby | running | node248 | default | 100 | 18 | host=192.168.7.243 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2

四、集群failover 切换后

1)查看集群节点状态

[kingbase@node2 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+----------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
1 | node248 | primary | * running | | default | 100 | 18 | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
2 | node249 | witness | * running | node248 | default | 0 | 1 | host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
3 | node243 | standby | running | node248 | default | 100 | 18 | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
5 | node243B | standby | running | node248 | default | 100 | 18 | host=192.168.7.243 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2

2)集群主备切换后,witness重新注册连接新的主库

[kingbase@node2 bin]$ ./repmgr witness register --force -h 192.168.7.243
INFO: connecting to witness node "node249" (ID: 2)
INFO: connecting to primary node
INFO: "repmgr" extension is already installed
INFO: witness registration complete
NOTICE: witness node "node249" (ID: 2) successfully registered [kingbase@node2 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+----------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
1 | node248 | standby | running | node243 | default | 100 | 18 | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
2 | node249 | witness | * running | node243 | default | 0 | 1 | host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
3 | node243 | primary | * running | | default | 100 | 19 | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
5 | node243B | standby | running | node243 | default | 100 | 18 | host=192.168.7.243 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2[kingbase@node2 bin]$ ./repmgr witness register --force -h 192.168.7.243
INFO: connecting to witness node "node249" (ID: 2)
INFO: connecting to primary node
INFO: "repmgr" extension is already installed
INFO: witness registration complete
NOTICE: witness node "node249" (ID: 2) successfully registered [kingbase@node2 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+----------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
1 | node248 | standby | running | node243 | default | 100 | 18 | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
2 | node249 | witness | * running | node243 | default | 0 | 1 | host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
3 | node243 | primary | * running | | default | 100 | 19 | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
5 | node243B | standby | running | node243 | default | 100 | 18 | host=192.168.7.243 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2

新主库hamgr.log日志:

 [2021-03-01 12:49:05] [WARNING] unable to ping "host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2"
[2021-03-01 12:49:05] [DETAIL] PQping() returned "PQPING_REJECT"
[2021-03-01 12:49:05] [WARNING] unable to connect to upstream node "node248" (ID: 1)
[2021-03-01 12:49:05] [INFO] sleeping 3 seconds until next reconnection attempt
[2021-03-01 12:49:08] [INFO] checking state of node 1, 1 of 2 attempts
[2021-03-01 12:49:08] [WARNING] unable to ping "user=esrep connect_timeout=10 dbname=esrep host=192.168.7.248 port=54321 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2 fallback_application_name=repmgr"
[2021-03-01 12:49:08] [DETAIL] PQping() returned "PQPING_REJECT"
[2021-03-01 12:49:08] [INFO] sleeping 3 seconds until next reconnection attempt
[2021-03-01 12:49:11] [INFO] checking state of node 1, 2 of 2 attempts
[2021-03-01 12:49:11] [WARNING] unable to ping "user=esrep connect_timeout=10 dbname=esrep host=192.168.7.248 port=54321 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2 fallback_application_name=repmgr"
[2021-03-01 12:49:11] [DETAIL] PQping() returned "PQPING_NO_RESPONSE"
[2021-03-01 12:49:11] [WARNING] unable to reconnect to node 1 after 2 attempts
[2021-03-01 12:49:11] [NOTICE] setting "wal_retrieve_retry_interval" to 86405000 milliseconds
[2021-03-01 12:49:12] [WARNING] wal receiver not running
[2021-03-01 12:49:12] [NOTICE] WAL receiver disconnected on all sibling nodes
[2021-03-01 12:49:12] [INFO] WAL receiver disconnected on all 2 sibling nodes
[2021-03-01 12:49:12] [INFO] 2 active sibling nodes registered
[2021-03-01 12:49:12] [INFO] primary and this node have the same location ("default")
[2021-03-01 12:49:12] [INFO] local node's last receive lsn: 5/640000A0
[2021-03-01 12:49:12] [INFO] checking state of sibling node "node249" (ID: 2)
[2021-03-01 12:49:12] [INFO] node "node249" (ID: 2) reports its upstream is node 1, last seen 7 second(s) ago
[2021-03-01 12:49:12] [INFO] node 2 last saw primary node 7 second(s) ago
[2021-03-01 12:49:12] [INFO] checking state of sibling node "node243B" (ID: 5)
[2021-03-01 12:49:12] [WARNING] repmgrd not running on node "node243B" (ID: 5), skipping
[2021-03-01 12:49:12] [INFO] visible nodes: 3; total nodes: 3; no nodes have seen the primary within the last 4 seconds
[2021-03-01 12:49:12] [NOTICE] promotion candidate is "node243" (ID: 3)
[2021-03-01 12:49:12] [NOTICE] setting "wal_retrieve_retry_interval" to 5000 ms
[2021-03-01 12:49:12] [NOTICE] this node is the winner, will now promote itself and inform other nodes
[2021-03-01 12:49:12] [INFO] try to ping the trusted_servers "192.168.7.1" before execute promote_command
[2021-03-01 12:49:14] [NOTICE] PING 192.168.7.1 (192.168.7.1) 56(84) bytes of data. --- 192.168.7.1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 2.450/2.460/2.471/0.050 ms
A
[2021-03-01 12:49:14] [NOTICE] successfully ping one or more of the trusted_servers "192.168.7.1"
[2021-03-01 12:49:14] [NOTICE] try to stop old primary db (host: "192.168.7.248")
ERROR: connection to database failed
DETAIL:
could not connect to server: Connection refused
Is the server running on host "192.168.7.248" and accepting
TCP/IP connections on port 54321? DETAIL: attempted to connect using:
user=esrep connect_timeout=10 dbname=esrep host=192.168.7.248 port=54321 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2 fallback_application_name=repmgr
[2021-03-01 12:49:16] [NOTICE] PING 192.168.7.240 (192.168.7.240) 56(84) bytes of data. --- 192.168.7.240 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1002ms
rtt min/avg/max/mdev = 0.357/0.365/0.374/0.020 ms [2021-03-01 12:49:16] [WARNING] the virtual ip is already on other host, try to release it on old primary node (host: "192.168.7.248")
[2021-03-01 12:49:16] [INFO] SSH connection to host "192.168.7.248" succeeded, ready to release vip on it
[2021-03-01 12:49:17] [NOTICE] old primary node (host: "192.168.7.248") release the virtual ip 192.168.7.240/24 success
[2021-03-01 12:49:17] [NOTICE] will acquire the virtual ip again
[2021-03-01 12:49:18] [NOTICE] PING 192.168.7.240 (192.168.7.240) 56(84) bytes of data. --- 192.168.7.240 ping statistics ---
2 packets transmitted, 0 received, +1 errors, 100% packet loss, time 999ms [2021-03-01 12:49:18] [WARNING] ping host"192.168.7.240" failed
[2021-03-01 12:49:18] [DETAIL] average RTT value is not greater than zero
[2021-03-01 12:49:19] [NOTICE] new primary node (ID: 3) acquire the virtual ip 192.168.7.240/24 success
[2021-03-01 12:49:19] [INFO] promote_command is:
"/home/kingbase/cluster/R6HA/KHA/kingbase/bin/repmgr standby promote -f /home/kingbase/cluster/R6HA/KHA/kingbase/etc/repmgr.conf"
WARNING: 2 sibling nodes found, but option "--siblings-follow" not specified
DETAIL: these nodes will remain attached to the current primary:
node249 (node ID: 2, witness server)
node243B (node ID: 5)
NOTICE: promoting standby to primary
DETAIL: promoting server "node243" (ID: 3) using sys_promote()
NOTICE: waiting up to 60 seconds (parameter "promote_check_timeout") for promotion to complete
INFO: SET synchronous TO "async" on primary host
NOTICE: STANDBY PROMOTE successful
DETAIL: server "node243" (ID: 3) was successfully promoted to primary

KingbaseES R6 集群repmgr witness 手工配置案例的更多相关文章

  1. KingbaseES R6 集群repmgr.conf参数'recovery'测试案例(一)

    KingbaseES R6集群repmgr.conf参数'recovery'测试案例(一) 案例说明: 在KingbaseES R6集群中,主库节点出现宕机(如重启或关机),会产生主备切换,但是当主库 ...

  2. KingbaseES R6 集群repmgr.conf参数'recovery'测试案例(三)

    案例三:测试'recovery = manual' 1.查看集群节点状态信息: [kingbase@node1 bin]$ ./repmgr cluster show ID | Name | Role ...

  3. KingbaseES R6 集群repmgr.conf参数'recovery'测试案例(二)

    案例二:测试'recovery = automatic' 1.查看集群节点状态信息: [kingbase@node1 bin]$ ./repmgr cluster show ID | Name | R ...

  4. KingbaseES R6集群归档备份故障分析解决案例

    案例说明: 在使用ps工具查看主库进程,发现主库'archiver'进程失败,检查sys_log日志可以发现归档失败的信息.通过sys_log日志提取归档语句手工执行归档操作,提示"当前数据 ...

  5. KingbaseES R6 集群主库网卡down测试案例

    数据库版本: test=# select version(); version ------------------------------------------------------------ ...

  6. KingbaseES R6 集群“双主”故障解决案例

    实际工作中,可能会碰到集群脑裂的情况,在脑裂时,会出现双 primary情况.这时,需要用户介入,人工判断哪个节点的数据最新,减少数据丢失. 一.测试环境信息 操作系统: [kingbase@node ...

  7. KingbaseES R6 集群 recovery 参数对切换的影响

    案例说明:在KingbaseES R6集群中,主库节点出现宕机(如重启或关机),会产生主备切换,但是当主库节点系统恢复正常后,如何对原主库节点进行处理,保证集群数据的一致性和安全,可以通过对repmg ...

  8. KingbaseES R6 集群修改data目录

    案例说明: 本案例是在部署完成KingbaseES R6集群后,由于业务的需求,集群需要修改data(数据存储)目录的测试.本案例分两种修改方式,第一种是离线修改data目录,即关闭整个集群后,修改数 ...

  9. KingbaseES R6 集群修改物理IP和VIP案例

    在用户的实际环境里,可能有时需要修改主机的IP,这就涉及到集群的配置修改.以下以例子的方式,介绍下KingbaseES R6集群如何修改IP. 一.案例测试环境 操作系统: [KINGBASE@nod ...

随机推荐

  1. FS2K人脸素描属性识别

    人脸素描属性识别 代码:https://github.com/linkcao/FS2K_extract 问题分析 需要根据FS2K数据集进行训练和测试,实现输入一张图片,输出该图片的属性特征信息,提取 ...

  2. IDEA插件配置之Eclipse Code Formatte

    1.下载 在idea中的Plugins中下载插件 Eclipse Code Formatte,下载过之后重启. 2.配置 将自己下载的xml文件加载进来即可. 这个xml文件可自行在网上找找,有需要的 ...

  3. Tapdata Cloud 版本上新!新增TiDB等数据源支持,连接和任务功能增强,体验更优

    Tapdata Cloud cloud.tapdata.net Tapdata Cloud 是国内首家异构数据库实时同步云平台,目前支持Oracle.MySQL.PG.SQL Server.Mongo ...

  4. JDBCToolsV2:利用ThreadLocal保证当前线程操作同一个数据库连接对象。

    JDBCToolsV2:     利用ThreadLocal保证当前线程操作同一个数据库连接对象. package com.dgd.test; import com.alibaba.druid.poo ...

  5. private关键字的作用及使用和this关键字的作用

    封装的操作--private关键字 private的含义 1. private是一个权限修饰符,代表最小权限. 2. 可以修饰成员变量和成员方法. 3. 被private修饰后的成员变量和成员方法,只 ...

  6. 【C++】学生管理系统

    [C++]学生管理系统 一道非常经典的C语言题目,用C++实现   题目如下: 输入功能:由键盘输入10个学生的学号.姓名.三科成绩,并计算出平均成绩和总成绩,然后将它存入文件stud.dat. 插入 ...

  7. java枚举和注解

    枚举 一.枚举(enumeration) 是一组常量的集合,可以理解为:枚举属于一种特殊的类,里面只包含一组有限的特定的对象,构造方法默认为private. 二.枚举的两种实现方式 1.自定义实现枚举 ...

  8. javaScript去重的11种方法

    前言 去重是开发和面试中经常遇到的问题,下面是总结的 11 种去重方法 方法 示例数组 var arr = [1, 2, 4, 5, 5, 2, 1, 1, 4, 6] set + 解构赋值 这种方法 ...

  9. 转换流的原理和OutputStreamWriter介绍&代码实现

    转换流的原理 OutputStreamWriter介绍&代码实现 package com.yang.Test.ReverseStream; import java.io.FileNotFoun ...

  10. python在执行命令时添加环境变量或指定执行路径

    cwd: 命令的执行路径,相当于os.chdir('/home')提前切换到对应路径 env: 环境变量,某些执行路径需要添加必须的环境变量,例如fastboot依赖与adb路径下的环境变量 impo ...