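KingbaseES R6 Cluster: Manually Configuring a repmgr Witness (Case Study)
Using a witness server:
A witness server is a normal KingbaseES instance that is not part of the streaming replication cluster. Its purpose, when a failover situation arises, is to provide evidence that the primary server itself is unavailable, rather than, for example, a network split between different physical locations. The typical use case is a two-node streaming replication setup in which the primary and the standby sit in different locations (data centers). With a witness server created in the same location (data center) as the primary, a standby can decide, when the primary becomes unavailable, whether it can promote itself without risking a "split-brain" situation. If it can see neither the witness nor the primary, there is probably a network-level outage and it should not promote itself. If it can see the witness but not the primary, this shows that the network is intact and the primary itself is down, so it can promote itself to primary.
For more complex replication setups, such as those spanning several data centers, location-based failover is preferable, since it ensures that only nodes in the same location as the primary can become the new primary.
To create a witness server, set up an ordinary database instance (KingbaseES in this case) on a server in the same physical location as the cluster's primary server. Do not create the witness on the same physical host as the primary; otherwise, if the primary fails because of a hardware problem, the witness is lost with it.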
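The promotion rule described above can be written down as a small decision function. This is only a simplified model for illustration, not the actual election code in repmgrd; the names `Action` and `decide` are hypothetical.

```python
from enum import Enum


class Action(Enum):
    STAY_STANDBY = "stay standby"
    PROMOTE = "promote to primary"


def decide(primary_visible: bool, witness_visible: bool) -> Action:
    """Model of the standby's choice, based on which nodes it can reach."""
    if primary_visible:
        # The primary is reachable from this standby: nothing to do.
        return Action.STAY_STANDBY
    if witness_visible:
        # The witness (same site as the primary) is reachable but the primary
        # is not: the network is intact, so the primary itself is down.
        return Action.PROMOTE
    # Neither node is reachable: probably a network split, and promoting
    # now would risk a split-brain situation.
    return Action.STAY_STANDBY


# Print the full truth table for the two inputs.
for primary in (True, False):
    for witness in (True, False):
        print(primary, witness, decide(primary, witness).value)
```

Only the combination "primary not visible, witness visible" leads to a promotion; every other combination leaves the standby as it is.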
Database version:
test=# select version();
                                                       version
----------------------------------------------------------------------------------------------------------------------
 KingbaseES V008R006C003B0010 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit
(1 row)
Original repmgr cluster topology:
[kingbase@node2 bin]$ ./repmgr cluster show
 ID | Name     | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string
----+----------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
 1  | node248  | primary | * running |          | default  | 100      | 18       | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
 3  | node243  | standby |   running | node248  | default  | 100      | 18       | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
 5  | node243B | standby |   running | node248  | default  | 100      | 18       | host=192.168.7.243 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
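I. Create the witness server
=Note: the witness server should run on its own host, not on the same host as the primary or a standby. The witness does not stream-replicate with any other node, so it is an independent primary instance whose database system identifier must differ from that of the other databases. Create it with a separate initdb run; do not produce it by cloning or copying an existing database.=
1) Initialize the instance (on node2)
=Copy the software installation files from another cluster node to the witness node, then initialize a fresh instance=
[kingbase@node2 bin]$ ./initdb -D /home/kingbase/cluster/R6HA/KHA/kingbase/data -E utf8 -U system -W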
......
Configure the repmgr extension:

Start the database service:
[kingbase@node2 bin]$ ./sys_ctl -D /home/kingbase/cluster/R6HA/KHA/kingbase/data start
......
server started
2) Create the repmgr metadata database and schema
[kingbase@node2 bin]$ ./ksql -U system test
ksql (V8.0)
Type "help" for help.
# Create the esrep user
test=# create user esrep with superuser;
CREATE ROLE
test=# alter user esrep with password 'Kingbaseha110';
ALTER ROLE
# Create the esrep database
test=# create database esrep owner esrep;
CREATE DATABASE
test=# \c esrep esrep
You are now connected to database "esrep" as user "esrep".
esrep=# \d
              List of relations
 Schema |        Name         | Type | Owner
--------+---------------------+------+--------
 public | sys_stat_statements | view | system
(1 row)
# Create the repmgr schema
esrep=# create schema repmgr;
CREATE SCHEMA
II. Add the witness to the repmgr cluster
1) Configure the repmgr.conf file
[kingbase@node2 etc]$ cat repmgr.conf
on_bmj=off
node_id=2
node_name=node249
promote_command='/home/kingbase/cluster/R6HA/KHA/kingbase/bin/repmgr  standby promote -f /home/kingbase/cluster/R6HA/KHA/kingbase/etc/repmgr.conf'
follow_command='/home/kingbase/cluster/R6HA/KHA/kingbase/bin/repmgr  standby follow  -f /home/kingbase/cluster/R6HA/KHA/kingbase/etc/repmgr.conf -W --upstream-node-id=%n'
conninfo='host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2'
log_file='/home/kingbase/cluster/R6HA/KHA/kingbase/hamgr.log'
data_directory='/home/kingbase/cluster/R6HA/KHA/kingbase/data'
sys_bindir='/home/kingbase/cluster/R6HA/KHA/kingbase/bin'
ssh_options='-q -o ConnectTimeout=10 -o StrictHostKeyChecking=no -o ServerAliveInterval=2 -o ServerAliveCountMax=5 -p 22'
reconnect_attempts=2
reconnect_interval=3
failover='automatic'
recovery='automatic'
monitoring_history='no'
trusted_servers='192.168.7.1'
virtual_ip='192.168.7.240/24'
net_device='enp0s3'
ipaddr_path='/sbin'
arping_path='/sbin'
synchronous='quorum'
repmgrd_pid_file='/home/kingbase/cluster/R6HA/KHA/kingbase/hamgrd.pid'
ping_path='/usr/bin'
#priority=0
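Before registering the node, it can be worth sanity-checking the file. The sketch below parses a repmgr.conf-style text and checks that `node_id` does not collide with the IDs already in the cluster (1, 3 and 5 in this case) and that `conninfo` carries the basic connection keys. It assumes the simple `key=value` layout shown above; the function names are hypothetical and not part of repmgr.

```python
import shlex

# Shortened copy of the file above, for illustration.
SAMPLE_CONF = """\
node_id=2
node_name=node249
conninfo='host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10'
failover='automatic'
#priority=0
"""


def parse_repmgr_conf(text: str) -> dict:
    """Parse simple key=value lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip("'")
    return settings


def parse_conninfo(conninfo: str) -> dict:
    """Split a 'key=value key=value' connection string into a dict."""
    return dict(item.split("=", 1) for item in shlex.split(conninfo))


def check_witness_conf(settings: dict, existing_node_ids: set) -> list:
    """Return a list of problems found (empty if none)."""
    problems = []
    node_id = int(settings.get("node_id", "0"))
    if node_id in existing_node_ids:
        problems.append(f"node_id {node_id} is already used in the cluster")
    conn = parse_conninfo(settings.get("conninfo", ""))
    for key in ("host", "user", "dbname", "port"):
        if key not in conn:
            problems.append(f"conninfo is missing '{key}'")
    return problems


settings = parse_repmgr_conf(SAMPLE_CONF)
print(check_witness_conf(settings, existing_node_ids={1, 3, 5}))
```

An empty list means neither of these two checks found a problem; it says nothing about the other parameters in the file.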

2) Register the witness with the repmgr cluster
[kingbase@node2 bin]$ ./repmgr witness register -h 192.168.7.248
# -h specifies the IP address of the primary node
INFO: connecting to witness node "node249" (ID: 2)
INFO: connecting to primary node
NOTICE: attempting to install extension "repmgr"
NOTICE: "repmgr" extension successfully installed
INFO: witness registration complete
NOTICE: witness node "node249" (ID: 2) successfully registered
[kingbase@node2 bin]$ ./repmgr cluster show
 ID | Name     | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string
----+----------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
 1  | node248  | primary | * running |          | default  | 100      | 18       | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
 2  | node249  | witness | * running | node248  | default  | 0        | 1        | host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
 3  | node243  | standby |   running | node248  | default  | 100      | 18       | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
 5  | node243B | standby |   running | node248  | default  | 100      | 18       | host=192.168.7.243 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
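3) Inspect the witness metadata database
=Once the witness is registered with the repmgr cluster, the repmgr metadata objects are created automatically in the repmgr schema of the esrep database=
[kingbase@node2 bin]$ ./ksql -U esrep esrep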
ksql (V8.0)
Type "help" for help.
esrep=# \d repmgr.*
                                 Table "repmgr.events"
     Column      |           Type           | Collation | Nullable |      Default
-----------------+--------------------------+-----------+----------+-------------------
 node_id         | integer                  |           | not null |
 event           | text                     |           | not null |
 successful      | boolean                  |           | not null | true
 event_timestamp | timestamp with time zone |           | not null | CURRENT_TIMESTAMP
 details         | text                     |           |          | 
               Index "repmgr.idx_monitoring_history_time"
      Column       |           Type           | Key? |    Definition
-------------------+--------------------------+------+-------------------
 last_monitor_time | timestamp with time zone | yes  | last_monitor_time
 standby_node_id   | integer                  | yes  | standby_node_id
btree, for table "repmgr.monitoring_history"
                           Table "repmgr.monitoring_history"
          Column           |           Type           | Collation | Nullable | Default
---------------------------+--------------------------+-----------+----------+---------
 primary_node_id           | integer                  |           | not null |
 standby_node_id           | integer                  |           | not null |
 last_monitor_time         | timestamp with time zone |           | not null |
 last_apply_time           | timestamp with time zone |           |          |
 last_wal_primary_location | pg_lsn                   |           | not null |
 last_wal_standby_location | pg_lsn                   |           |          |
 replication_lag           | bigint                   |           | not null |
 apply_lag                 | bigint                   |           | not null |
Indexes:
    "idx_monitoring_history_time" btree (last_monitor_time, standby_node_id)
                                  Table "repmgr.nodes"
      Column      |            Type            | Collation | Nullable |     Default
------------------+----------------------------+-----------+----------+-----------------
 node_id          | integer                    |           | not null |
 upstream_node_id | integer                    |           |          |
 active           | boolean                    |           | not null | true
 node_name        | text                       |           | not null |
 type             | text                       |           | not null |
 location         | text                       |           | not null | 'default'::text
 priority         | integer                    |           | not null | 100
 conninfo         | text                       |           | not null |
 repluser         | character varying(63 char) |           | not null |
 slot_name        | text                       |           |          |
 config_file      | text                       |           | not null |
Indexes:
    "nodes_pkey" PRIMARY KEY, btree (node_id)
Check constraints:
    "nodes_type_check" CHECK (type = ANY (ARRAY['primary'::text, 'standby'::text, 'witness'::text, 'bdr'::text]))
Foreign-key constraints:
    "nodes_upstream_node_id_fkey" FOREIGN KEY (upstream_node_id) REFERENCES repmgr.nodes(node_id) DEFERRABLE
Referenced by:
    TABLE "repmgr.nodes" CONSTRAINT "nodes_upstream_node_id_fkey" FOREIGN KEY (upstream_node_id) REFERENCES repmgr.nodes(node_id) DEFERRABLE
       Index "repmgr.nodes_pkey"
 Column  |  Type   | Key? | Definition
---------+---------+------+------------
 node_id | integer | yes  | node_id
primary key, btree, for table "repmgr.nodes"
                           View "repmgr.replication_status"
          Column           |           Type           | Collation | Nullable | Default
---------------------------+--------------------------+-----------+----------+---------
 primary_node_id           | integer                  |           |          |
 standby_node_id           | integer                  |           |          |
 standby_name              | text                     |           |          |
 node_type                 | text                     |           |          |
 active                    | boolean                  |           |          |
 last_monitor_time         | timestamp with time zone |           |          |
 last_wal_primary_location | pg_lsn                   |           |          |
 last_wal_standby_location | pg_lsn                   |           |          |
 replication_lag           | text                     |           |          |
 replication_time_lag      | interval                 |           |          |
 apply_lag                 | text                     |           |          |
 communication_time_lag    | interval                 |           |          | 
                   View "repmgr.show_nodes"
       Column       |  Type   | Collation | Nullable | Default
--------------------+---------+-----------+----------+---------
 node_id            | integer |           |          |
 node_name          | text    |           |          |
 active             | boolean |           |          |
 upstream_node_id   | integer |           |          |
 upstream_node_name | text    |           |          |
 type               | text    |           |          |
 priority           | integer |           |          |
 conninfo           | text    |           |          | 
            Table "repmgr.voting_term"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 term   | integer |           | not null |
Indexes:
    "voting_term_restrict" UNIQUE, btree ((true))
Rules:
    voting_term_delete AS
    ON DELETE TO repmgr.voting_term DO INSTEAD NOTHING
 Index "repmgr.voting_term_restrict"
 Column |  Type   | Key? | Definition
--------+---------+------+------------
 bool   | boolean | yes  | (true)
unique, btree, for table "repmgr.voting_term"
III. Troubleshooting witness node registration
=As shown below, the other nodes report the witness as "? unreachable".=
[kingbase@node1 bin]$ ./repmgr cluster show
 ID | Name     | Role    | Status        | Upstream | Location | Priority | Timeline | Connection string
----+----------+---------+---------------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
 1  | node248  | primary | * running     |          | default  | 100      | 18       | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
 2  | node249  | witness | ? unreachable | node248  | default  | 0        | ?        | host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
 3  | node243  | standby |   running     | node248  | default  | 100      | 18       | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
 5  | node243B | standby |   running     | node248  | default  | 100      | 18       | host=192.168.7.243 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
WARNING: following issues were detected
  - unable to connect to node "node249" (ID: 2)
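A check like this can be scripted. The sketch below parses the text table printed by `repmgr cluster show` and lists the nodes whose status is not "running". It assumes the pipe-separated layout shown above; the sample is abridged (the connection strings are shortened) and the function names are hypothetical.

```python
# Abridged copy of the output above, with shortened connection strings.
SAMPLE = """\
 ID | Name     | Role    | Status        | Upstream | Location | Priority | Timeline | Connection string
----+----------+---------+---------------+----------+----------+----------+----------+------------------
 1  | node248  | primary | * running     |          | default  | 100      | 18       | host=192.168.7.248 port=54321
 2  | node249  | witness | ? unreachable | node248  | default  | 0        | ?        | host=192.168.7.249 port=54321
 3  | node243  | standby |   running     | node248  | default  | 100      | 18       | host=192.168.7.243 port=54321
WARNING: following issues were detected
  - unable to connect to node "node249" (ID: 2)
"""


def parse_cluster_show(text: str) -> list:
    """Turn the pipe-separated table into a list of dicts keyed by column name."""
    header = None
    rows = []
    for line in text.splitlines():
        if "|" not in line:
            continue  # separator line, warnings, blank lines
        cells = [cell.strip() for cell in line.split("|")]
        if header is None:
            header = cells
            continue
        rows.append(dict(zip(header, cells)))
    return rows


def unreachable_nodes(rows: list) -> list:
    """Names of nodes whose Status column does not contain 'running'."""
    return [row["Name"] for row in rows if "running" not in row["Status"]]


rows = parse_cluster_show(SAMPLE)
print(unreachable_nodes(rows))
```

With the sample above, the only node reported is the witness, node249.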

1) Test the ksql connection to the witness node (the connection fails)
[kingbase@node1 bin]$ ./ksql -h 192.168.7.249 -U esrep esrep
ksql: error: could not connect to server: could not connect to server: No route to host
       Is the server running on host "192.168.7.249" and accepting
       TCP/IP connections on port 54321?
# Ping the node
[kingbase@node1 bin]$ ping 192.168.7.249
PING 192.168.7.249 (192.168.7.249) 56(84) bytes of data.
64 bytes from 192.168.7.249: icmp_seq=1 ttl=64 time=0.513 ms
64 bytes from 192.168.7.249: icmp_seq=2 ttl=64 time=0.390 ms
64 bytes from 192.168.7.249: icmp_seq=3 ttl=64 time=0.478 ms
^C
--- 192.168.7.249 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2001ms
rtt min/avg/max/mdev = 0.390/0.460/0.513/0.054 ms
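Ping succeeds while ksql fails, so ICMP gets through but the TCP connection to the database port does not. A small TCP probe separates the two cases without needing a database client. This is a minimal sketch using only the Python standard library; `tcp_port_open` is a hypothetical helper name.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, "No route to host" and timeouts.
        return False
```

Calling `tcp_port_open("192.168.7.249", 54321)` from node1 would be expected to return False in the situation shown here, even though ping succeeds.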
2) Check the firewall configuration on the witness server
[root@node2 shell]# iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination
ACCEPT     udp  --  anywhere             anywhere             udp dpt:domain
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:domain
ACCEPT     udp  --  anywhere             anywhere             udp dpt:bootps
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:bootps
ACCEPT     all  --  anywhere             anywhere             ctstate RELATED,ESTABLISHED
ACCEPT     all  --  anywhere             anywhere
INPUT_direct  all  --  anywhere             anywhere
INPUT_ZONES_SOURCE  all  --  anywhere             anywhere
INPUT_ZONES  all  --  anywhere             anywhere
ACCEPT     icmp --  anywhere             anywhere
REJECT     all  --  anywhere             anywhere             reject-with icmp-host-prohibited
Chain FORWARD (policy ACCEPT)
target     prot opt source               destination
ACCEPT     all  --  anywhere             bogon/24             ctstate RELATED,ESTABLISHED
ACCEPT     all  --  192.168.122.0/24     anywhere
ACCEPT     all  --  anywhere             anywhere
REJECT     all  --  anywhere             anywhere             reject-with icmp-port-unreachable
REJECT     all  --  anywhere             anywhere             reject-with icmp-port-unreachable
ACCEPT     all  --  anywhere             anywhere             ctstate RELATED,ESTABLISHED
ACCEPT     all  --  anywhere             anywhere
FORWARD_direct  all  --  anywhere             anywhere
FORWARD_IN_ZONES_SOURCE  all  --  anywhere             anywhere
FORWARD_IN_ZONES  all  --  anywhere             anywhere
FORWARD_OUT_ZONES_SOURCE  all  --  anywhere             anywhere
FORWARD_OUT_ZONES  all  --  anywhere             anywhere
ACCEPT     icmp --  anywhere             anywhere
REJECT     all  --  anywhere             anywhere             reject-with icmp-host-prohibited
......
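=== As the output above shows, the firewall is active on the witness server node. The "ACCEPT icmp" rule lets ping through, while the final "REJECT ... reject-with icmp-host-prohibited" rule in the INPUT chain rejects connections to the database port, which the client reports as "No route to host" ===
3) Flush the firewall rules on the witness host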
[root@node2 shell]# iptables -F
4) Test the database connection to the witness host
[kingbase@node1 bin]$ ./ksql -h 192.168.7.249 -U system test
ksql (V8.0)
Type "help" for help.
5) Check the cluster node status
[kingbase@node1 bin]$ ./repmgr cluster show
 ID | Name     | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string
----+----------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
 1  | node248  | primary | * running |          | default  | 100      | 18       | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
 2  | node249  | witness | * running | node248  | default  | 0        | 1        | host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
 3  | node243  | standby |   running | node248  | default  | 100      | 18       | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
 5  | node243B | standby |   running | node248  | default  | 100      | 18       | host=192.168.7.243 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2

IV. After a cluster failover
1) Check the cluster node status
[kingbase@node2 bin]$ ./repmgr cluster show
 ID | Name     | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string
----+----------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
 1  | node248  | primary | * running |          | default  | 100      | 18       | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
 2  | node249  | witness | * running | node248  | default  | 0        | 1        | host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
 3  | node243  | standby |   running | node248  | default  | 100      | 18       | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
 5  | node243B | standby |   running | node248  | default  | 100      | 18       | host=192.168.7.243 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
2) After the primary/standby switch, re-register the witness so that it connects to the new primary
[kingbase@node2 bin]$ ./repmgr witness register --force -h 192.168.7.243
INFO: connecting to witness node "node249" (ID: 2)
INFO: connecting to primary node
INFO: "repmgr" extension is already installed
INFO: witness registration complete
NOTICE: witness node "node249" (ID: 2) successfully registered
[kingbase@node2 bin]$ ./repmgr cluster show
 ID | Name     | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string
----+----------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
 1  | node248  | standby |   running | node243  | default  | 100      | 18       | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
 2  | node249  | witness | * running | node243  | default  | 0        | 1        | host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
 3  | node243  | primary | * running |          | default  | 100      | 19       | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
 5  | node243B | standby |   running | node243  | default  | 100      | 18       | host=192.168.7.243 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
hamgr.log on the new primary:
[2021-03-01 12:49:05] [WARNING] unable to ping "host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2"
[2021-03-01 12:49:05] [DETAIL] PQping() returned "PQPING_REJECT"
[2021-03-01 12:49:05] [WARNING] unable to connect to upstream node "node248" (ID: 1)
[2021-03-01 12:49:05] [INFO] sleeping 3 seconds until next reconnection attempt
[2021-03-01 12:49:08] [INFO] checking state of node 1, 1 of 2 attempts
[2021-03-01 12:49:08] [WARNING] unable to ping "user=esrep connect_timeout=10 dbname=esrep host=192.168.7.248 port=54321 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2 fallback_application_name=repmgr"
[2021-03-01 12:49:08] [DETAIL] PQping() returned "PQPING_REJECT"
[2021-03-01 12:49:08] [INFO] sleeping 3 seconds until next reconnection attempt
[2021-03-01 12:49:11] [INFO] checking state of node 1, 2 of 2 attempts
[2021-03-01 12:49:11] [WARNING] unable to ping "user=esrep connect_timeout=10 dbname=esrep host=192.168.7.248 port=54321 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2 fallback_application_name=repmgr"
[2021-03-01 12:49:11] [DETAIL] PQping() returned "PQPING_NO_RESPONSE"
[2021-03-01 12:49:11] [WARNING] unable to reconnect to node 1 after 2 attempts
[2021-03-01 12:49:11] [NOTICE] setting "wal_retrieve_retry_interval" to 86405000 milliseconds
[2021-03-01 12:49:12] [WARNING] wal receiver not running
[2021-03-01 12:49:12] [NOTICE] WAL receiver disconnected on all sibling nodes
[2021-03-01 12:49:12] [INFO] WAL receiver disconnected on all 2 sibling nodes
[2021-03-01 12:49:12] [INFO] 2 active sibling nodes registered
[2021-03-01 12:49:12] [INFO] primary and this node have the same location ("default")
[2021-03-01 12:49:12] [INFO] local node's last receive lsn: 5/640000A0
[2021-03-01 12:49:12] [INFO] checking state of sibling node "node249" (ID: 2)
[2021-03-01 12:49:12] [INFO] node "node249" (ID: 2) reports its upstream is node 1, last seen 7 second(s) ago
[2021-03-01 12:49:12] [INFO] node 2 last saw primary node 7 second(s) ago
[2021-03-01 12:49:12] [INFO] checking state of sibling node "node243B" (ID: 5)
[2021-03-01 12:49:12] [WARNING] repmgrd not running on node "node243B" (ID: 5), skipping
[2021-03-01 12:49:12] [INFO] visible nodes: 3; total nodes: 3; no nodes have seen the primary within the last 4 seconds
[2021-03-01 12:49:12] [NOTICE] promotion candidate is "node243" (ID: 3)
[2021-03-01 12:49:12] [NOTICE] setting "wal_retrieve_retry_interval" to 5000 ms
[2021-03-01 12:49:12] [NOTICE] this node is the winner, will now promote itself and inform other nodes
[2021-03-01 12:49:12] [INFO] try to ping the trusted_servers "192.168.7.1" before execute promote_command
[2021-03-01 12:49:14] [NOTICE] PING 192.168.7.1 (192.168.7.1) 56(84) bytes of data.
--- 192.168.7.1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 2.450/2.460/2.471/0.050 ms
[2021-03-01 12:49:14] [NOTICE] successfully ping one or more of the trusted_servers "192.168.7.1"
[2021-03-01 12:49:14] [NOTICE] try to stop old primary db (host: "192.168.7.248")
ERROR: connection to database failed
DETAIL:
could not connect to server: Connection refused
        Is the server running on host "192.168.7.248" and accepting
        TCP/IP connections on port 54321?
DETAIL: attempted to connect using:
  user=esrep connect_timeout=10 dbname=esrep host=192.168.7.248 port=54321 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2 fallback_application_name=repmgr
[2021-03-01 12:49:16] [NOTICE] PING 192.168.7.240 (192.168.7.240) 56(84) bytes of data.
--- 192.168.7.240 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1002ms
rtt min/avg/max/mdev = 0.357/0.365/0.374/0.020 ms
[2021-03-01 12:49:16] [WARNING] the virtual ip is already on other host, try to release it on old primary node (host: "192.168.7.248")
[2021-03-01 12:49:16] [INFO] SSH connection to host "192.168.7.248" succeeded, ready to release vip on it
[2021-03-01 12:49:17] [NOTICE] old primary node (host: "192.168.7.248") release the virtual ip 192.168.7.240/24 success
[2021-03-01 12:49:17] [NOTICE] will acquire the virtual ip again
[2021-03-01 12:49:18] [NOTICE] PING 192.168.7.240 (192.168.7.240) 56(84) bytes of data.
--- 192.168.7.240 ping statistics ---
2 packets transmitted, 0 received, +1 errors, 100% packet loss, time 999ms
[2021-03-01 12:49:18] [WARNING] ping host"192.168.7.240" failed
[2021-03-01 12:49:18] [DETAIL] average RTT value is not greater than zero
[2021-03-01 12:49:19] [NOTICE] new primary node (ID: 3) acquire the virtual ip 192.168.7.240/24 success
[2021-03-01 12:49:19] [INFO] promote_command is:
  "/home/kingbase/cluster/R6HA/KHA/kingbase/bin/repmgr  standby promote -f /home/kingbase/cluster/R6HA/KHA/kingbase/etc/repmgr.conf"
WARNING: 2 sibling nodes found, but option "--siblings-follow" not specified
DETAIL: these nodes will remain attached to the current primary:
  node249 (node ID: 2, witness server)
  node243B (node ID: 5)
NOTICE: promoting standby to primary
DETAIL: promoting server "node243" (ID: 3) using sys_promote()
NOTICE: waiting up to 60 seconds (parameter "promote_check_timeout") for promotion to complete
INFO: SET synchronous TO "async" on primary host
NOTICE: STANDBY PROMOTE successful
DETAIL: server "node243" (ID: 3) was successfully promoted to primary
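The timing of the failover can be read off the log. The sketch below parses a few of the timestamped lines above (copied verbatim) and measures the gaps between them. The detection window it reports appears consistent with `reconnect_attempts=2` and `reconnect_interval=3` from repmgr.conf (2 x 3 = 6 seconds). The helper names are hypothetical.

```python
import re
from datetime import datetime

# Four lines copied from the hamgr.log excerpt above.
LOG_EXCERPT = """\
[2021-03-01 12:49:05] [WARNING] unable to connect to upstream node "node248" (ID: 1)
[2021-03-01 12:49:11] [WARNING] unable to reconnect to node 1 after 2 attempts
[2021-03-01 12:49:12] [NOTICE] promotion candidate is "node243" (ID: 3)
[2021-03-01 12:49:19] [NOTICE] new primary node (ID: 3) acquire the virtual ip 192.168.7.240/24 success
"""

LINE_RE = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] \[(\w+)\] (.*)$")


def parse_log(text: str) -> list:
    """Return (timestamp, level, message) tuples for timestamped lines."""
    events = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            events.append((stamp, match.group(2), match.group(3)))
    return events


def seconds_between(events: list, start_marker: str, end_marker: str) -> int:
    """Seconds from the first message containing start_marker to the first containing end_marker."""
    start = next(ts for ts, _, msg in events if start_marker in msg)
    end = next(ts for ts, _, msg in events if end_marker in msg)
    return int((end - start).total_seconds())


events = parse_log(LOG_EXCERPT)
print(seconds_between(events, "unable to connect to upstream", "unable to reconnect"))
print(seconds_between(events, "unable to connect to upstream", "acquire the virtual ip"))
```

For this excerpt, the first value is the time spent retrying the old primary, and the second is the time from the first failed connection until the new primary holds the virtual IP.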
												