案例说明:

KingbaseES R6集群启动时,出现“incorrect command permissions for the virtual ip”故障,本案例介绍了如何分析和解决此案例方法和步骤。

数据库版本:

test=# select version();
version
----------------------------------------------------------------------------------------------------------------------
KingbaseES V008R006C005B0023 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit
(1 row)

集群架构:

一、集群启动失败

[kingbase@node3 bin]$ ./sys_monitor.sh start
2021-03-01 13:27:26 Ready to start all DB ...
2021-03-01 13:27:26 begin to start DB on "[192.168.7.243]".
incorrect command permissions for the virtual ip.
waiting for server to start..... done
server started
2021-03-01 13:27:30 execute to start DB on "[192.168.7.243]" success, connect to check it.
2021-03-01 13:27:31 DB on "[192.168.7.243]" start success.
2021-03-01 13:27:32 Try to ping trusted_servers on host 192.168.7.248 ...
2021-03-01 13:27:34 Try to ping trusted_servers on host 192.168.7.243 ...
2021-03-01 13:27:37 begin to start DB on "[192.168.7.248]".
incorrect command permissions for the virtual ip.
waiting for server to start..... done
server started
2021-03-01 13:27:40 execute to start DB on "[192.168.7.248]" success, connect to check it.
2021-03-01 13:27:41 DB on "[192.168.7.248]" start success.
ERROR: No execute permission for "/home/kingbase/cluster/R6C5/R6C5R//kingbase/bin/arping"
incorrect command permissions for the virtual ip.
2021-03-01 13:27:41 There is no primary DB running, will do nothing and exit.

=从以上错误信息可知,在加载vip时访问arping时,出现权限问题=

二、故障分析

1、查看repmgr配置信息

[kingbase@node3 bin]$ cat ../etc/repmgr.conf
on_bmj=off
node_id=1
node_name='node243'
promote_command='/home/kingbase/cluster/R6C5/R6C5R/kingbase/bin/repmgr standby promote -f /home/kingbase/cluster/R6C5/R6C5R/kingbase/etc/repmgr.conf'
follow_command='/home/kingbase/cluster/R6C5/R6C5R/kingbase/bin/repmgr standby follow -f /home/kingbase/cluster/R6C5/R6C5R/kingbase/etc/repmgr.conf -W --upstream-node-id=%n'
conninfo='host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3'
log_file='/home/kingbase/cluster/R6C5/R6C5R/kingbase/log/hamgr.log'
kbha_log_file='/home/kingbase/cluster/R6C5/R6C5R/kingbase/log/kbha.log'
data_directory='/home/kingbase/cluster/R6C5/R6C5R/kingbase/data'
sys_bindir='/home/kingbase/cluster/R6C5/R6C5R/kingbase/bin'
ssh_options='-q -o ConnectTimeout=10 -o StrictHostKeyChecking=no -o ServerAliveInterval=2 -o ServerAliveCountMax=5 -p 22'
reconnect_attempts=10
reconnect_interval=6
failover='automatic'
recovery='standby'
monitoring_history='no'
trusted_servers='192.168.7.1'
virtual_ip='192.168.7.241/24'
net_device='enp0s3'
net_device_ip='192.168.7.243'
ipaddr_path='/sbin'
arping_path='/home/kingbase/cluster/R6C5/R6C5R//kingbase/bin'
synchronous='sync'
repmgrd_pid_file='/home/kingbase/cluster/R6C5/R6C5R/kingbase/etc/hamgrd.pid'
kbha_pid_file='/home/kingbase/cluster/R6C5/R6C5R/kingbase/etc/kbha.pid'
ping_path='/usr/bin'
auto_cluster_recovery_level=1
use_check_disk=off

=此版本使用的arping是数据库软件包自带的工具=

2、查看arping版本

3、查看arping权限

[kingbase@node1 bin]$ ls -lh arping
-rwxr-xr-x 1 kingbase root 11K Nov 5 2021 arping

三、问题解决步骤

1、配置arping所有者为kingbase用户

1)配置权限

[kingbase@node1 bin]$ chown -R kingbase.kingbase arping
[kingbase@node1 bin]$ ls -lh arping
-rwxr-xr-x 1 kingbase kingbase 11K Nov 5 2021 arping

2)启动集群(故障依旧)

2、配置arping所有者为root并分配setuid权限

1)配置权限

[root@node3 ~]# cd /home/kingbase/cluster/R6C5/R6C5R//kingbase/bin
[root@node3 bin]# chown -R root.root arping
[root@node3 bin]# chmod u+s arping
[root@node3 bin]# ls -lh arping
-rwsr-xr-x 1 root root 11K Nov 5 2021 arping

2)启动集群

[kingbase@node3 bin]$ ./sys_monitor.sh start
2021-03-01 13:38:04 Ready to start all DB ...
2021-03-01 13:38:04 begin to start DB on "[192.168.7.243]".
2021-03-01 13:38:05 DB on "[192.168.7.243]" already started, connect to check it.
2021-03-01 13:38:06 DB on "[192.168.7.243]" start success.
2021-03-01 13:38:06 Try to ping trusted_servers on host 192.168.7.248 ...
2021-03-01 13:38:08 Try to ping trusted_servers on host 192.168.7.243 ...
2021-03-01 13:38:11 begin to start DB on "[192.168.7.248]".
2021-03-01 13:38:12 DB on "[192.168.7.248]" already started, connect to check it.
2021-03-01 13:38:13 DB on "[192.168.7.248]" start success.
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+---------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
1 | node243 | primary | * running | | default | 100 | 3 | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
2 | node248 | standby | running | node243 | default | 100 | 3 | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
2021-03-01 13:38:13 The primary DB is started.
2021-03-01 13:38:13 check synchronous_standby_names ...
t
2021-03-01 13:38:24 Success to load virtual ip [192.168.7.241/24] on primary host [192.168.7.243].
2021-03-01 13:38:24 Try to ping vip on host 192.168.7.248 ...
2021-03-01 13:38:26 Try to ping vip on host 192.168.7.243 ...
2021-03-01 13:38:29 begin to start repmgrd on "[192.168.7.248]".
[2021-03-01 13:40:52] [NOTICE] using provided configuration file "/home/kingbase/cluster/R6C5/R6C5R/kingbase/bin/../etc/repmgr.conf"
[2021-03-01 13:40:52] [NOTICE] redirecting logging output to "/home/kingbase/cluster/R6C5/R6C5R/kingbase/log/hamgr.log" 2021-03-01 13:38:30 execute to start repmgrd on "[192.168.7.248]" failed.
2021-03-01 13:38:30 begin to start repmgrd on "[192.168.7.243]".
[2021-03-01 13:38:30] [NOTICE] using provided configuration file "/home/kingbase/cluster/R6C5/R6C5R/kingbase/bin/../etc/repmgr.conf"
[2021-03-01 13:38:30] [NOTICE] redirecting logging output to "/home/kingbase/cluster/R6C5/R6C5R/kingbase/log/hamgr.log" 2021-03-01 13:38:32 repmgrd on "[192.168.7.243]" start success.
ID | Name | Role | Status | Upstream | repmgrd | PID | Paused? | Upstream last seen
----+---------+---------+-----------+----------+-------------+-------+---------+--------------------
1 | node243 | primary | * running | | running | 12552 | no | n/a
2 | node248 | standby | running | node243 | not running | n/a | n/a | n/a
[2021-03-01 13:40:56] [NOTICE] redirecting logging output to "/home/kingbase/cluster/R6C5/R6C5R/kingbase/log/kbha.log" [2021-03-01 13:38:37] [NOTICE] redirecting logging output to "/home/kingbase/cluster/R6C5/R6C5R/kingbase/log/kbha.log" 2021-03-01 13:38:39 Done. [kingbase@node3 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+---------+---------+-----------+----------+----------+----------+----------+--------------------------------------------------------------------------------------------------------------------
1 | node243 | primary | * running | | default | 100 | 3 | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
2 | node248 | standby | running | node243 | default | 100 | 3 | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3

=== 由以上可知,集群启动成功。===

四、总结

对于kingbaseES R6集群使用数据库系统自带的arping软件包,一般不会出现版本不匹配的问题;对于arping工具的属主应该是root,不是kingbase用户,但为了kingbase用户也能执行arping,必须配置arping的setuid权限。

KingbaseES R6 集群启动‘incorrect command permissions for the virtual ip’故障案例的更多相关文章

  1. KingbaseES R6 集群手工配置VIP案例

    经常有用户问,V8R6集群搭建时没有配置VIP,搭建完成后,如何添加VIP?以下向大家介绍下手动添加VIP 的过程. 一.操作系统环境 操作系统(UOS): root@uos01:~# cat /et ...

  2. KingbaseES R6 集群创建流复制只读副本库案例

    一.环境概述 [kingbase@node2 bin]$ ./ksql -U system test ksql (V8.0) Type "help" for help. test= ...

  3. KingbaseES R6 集群修改物理IP和VIP案例

    在用户的实际环境里,可能有时需要修改主机的IP,这就涉及到集群的配置修改.以下以例子的方式,介绍下KingbaseES R6集群如何修改IP. 一.案例测试环境 操作系统: [KINGBASE@nod ...

  4. KingbaseES R6 集群 recovery 参数对切换的影响

    案例说明:在KingbaseES R6集群中,主库节点出现宕机(如重启或关机),会产生主备切换,但是当主库节点系统恢复正常后,如何对原主库节点进行处理,保证集群数据的一致性和安全,可以通过对repmg ...

  5. KingbaseES R6 集群修改data目录

    案例说明: 本案例是在部署完成KingbaseES R6集群后,由于业务的需求,集群需要修改data(数据存储)目录的测试.本案例分两种修改方式,第一种是离线修改data目录,即关闭整个集群后,修改数 ...

  6. KingbaseES R6 集群通过备库clone在线添加新节点

    案例说明: KingbaseES R6集群可以通过图形化方式在线添加新节点,但是在添加新节点clone环节时,是从主库copy数据到新的节点,这样在生产环境,如果数据量大,将会对主库的网络I/O造成压 ...

  7. KingbaseES R6 集群repmgr.conf参数'recovery'测试案例(一)

    KingbaseES R6集群repmgr.conf参数'recovery'测试案例(一) 案例说明: 在KingbaseES R6集群中,主库节点出现宕机(如重启或关机),会产生主备切换,但是当主库 ...

  8. KingbaseES R6 集群sys_monitor.sh change_password一键修改集群用户密码

    案例说明: kingbaseES R6集群用户密码修改,需要修改两处: 1)修改数据库用户密码(alter user): 2)修改.encpwd文件中用户密码: 可以通过sys_monitor.sh ...

  9. KingbaseES R6 集群禁用 root ssh 后需要修改集群为es_server 案例

    案例说明: 在生产环境下,由于安全需要,主机间不允许建立root用户的ssh信任连接,这样导致KingbaseES R6 repmgr集群,通过sys_monitor.sh脚本启动集群时,节点之间不能 ...

随机推荐

  1. 一图读懂k8s informer client-go

    概述 为什么要有k8s informer 我们都知道可以使用k8s的Clientset来获取所有的原生资源对象,那么怎么能持续的获取集群的所有资源对象,或监听集群的资源对象数据的变化呢?这里不需要轮询 ...

  2. SpringBoot之缓存

    一.准备工作 首先整合使用Spring整合MyBatis. 可参阅:SpringBoot整合MyBatis SpringBoot整合MyBatis完后后,我们需要在pom.xml中添加缓存相关的依赖. ...

  3. 不同network中的两个docker容器

    1. 创建docker网络 docker network create --subnet 172.18.0.1/16 test docker network ls 2. 创建两个容器指定docker ...

  4. umask默认权限及特殊权限

    1. linux系统中,创建一个新的文件或者目录的时候,新的文件或目录都会有默认的访问权限,umask命令与文件和目录的默认访问权限有关. 用户创建一个文件,文件的默认权限为 -rw-rw-rw-(6 ...

  5. Windows启动谷歌浏览器Chrome失败(应用程序无法启动,因为应用程序的并行配置不正确)解决方法

    目录 一.系统环境 二.问题描述 三.解决方法 一.系统环境 Windows版本 系统类型 浏览器Chrome版本 Windows 10 专业版 64 位操作系统, 基于 x64 的处理器 版本 10 ...

  6. 简单性能测试:springboot-2.x vs actix-web-4.x benchmark

    性能测试:springboot-2.x vs actix-web-4.x benchmark 转载请注明出处 https://www.cnblogs.com/funnyzpc/p/15956465.h ...

  7. 用python进行加密和解密——我看刑

    加密和解密 密码术意味着更改消息的文本,以便不知道你秘密的人永远不会理解你的消息. 下面就来创建一个GUI应用程序,使用Python进行加密和解密. 在这里,我们需要编写使用无限循环的代码,代码将不断 ...

  8. 【小程序自动化Minium】二、元素定位-Page接口中的 get_element() 与 get_elements()

    UI自动化中的重要工作就是元素定位了,高效精准的定位方法可以让工作事半功倍. 在过去的一段web自动化经历中,使用的selenium库支持了多种定位方法,我们可以利用这些定位方法来做进一步封装,写出符 ...

  9. 使用Kind快速构建k8s

    什么是 KindKind(Kubernetes in Docker) 是一个 Kubernetes 孵化项目,Kind 是一套开箱即用的 Kubernetes 环境搭建方案.顾名思义,就是将 Kube ...

  10. 跟HR在大群吵架是什么体验?

    原创不易,求分享.求一键三连 昨天跟HR负责人在公司大群吵了一架,先说结论:我输了... 事情原委是,老板在周一司庆上聊嗨了,说了一句:我觉得打卡没用,建议取消打卡. 下来后老板在公司论坛发了一个问题 ...