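Case description:

KingbaseES V8R6 supports online expansion through a graphical tool. In some production environments, however, the servers have no graphical interface, so cluster deployment and online expansion have to be done from the command line with scripts.

Tips:

In KingbaseES V8R6C5, the default deployment script (V8R6_cluster_install.sh) and configuration file (install.conf) do not support online expansion. Copy the script and configuration file from a KingbaseES V8R6C6 release into the KingbaseES V8R6C5 environment and use those instead.

Applicable version: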

KingbaseES V8R6

I. Cluster node status

1. Host node information

2. Cluster node status before expansion

[kingbase@node101 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+---------+---------+-----------+----------+----------+----------+----------+----------------------------------------------------------------------------------------------------------------------------------------------------
1 | node101 | primary | * running | | default | 100 | 3 | host=192.168.1.101 user=system dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
2 | node102 | standby | running | node101 | default | 100 | 3 | host=192.168.1.102 user=system dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3

[kingbase@node101 bin]$ ./repmgr service status
ID | Name | Role | Status | Upstream | repmgrd | PID | Paused? | Upstream last seen
----+---------+---------+-----------+----------+---------+------+---------+--------------------
1 | node101 | primary | * running | | running | 2997 | no | n/a
2 | node102 | standby | running | node101 | running | 2777 | no | 0 second(s) ago
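The pre-expansion check can also be scripted. The following is a minimal sketch, assuming the pipe-separated layout of `repmgr cluster show` shown above; the function name `list_unhealthy_nodes` is hypothetical and not part of KingbaseES or repmgr.

```shell
# Sketch: list nodes whose Status column in `repmgr cluster show` is not "running".
# Assumes the column order shown above (ID | Name | Role | Status | ...).
# Reads the command output on stdin and prints one node name per line.
list_unhealthy_nodes() {
    awk -F'|' '
        # Data rows start with a numeric node ID; header, separator and
        # WARNING lines are skipped.
        $1 ~ /^[ \t]*[0-9]+[ \t]*$/ {
            name = $2
            status = $4
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            # Drop the leading marker ("*" on the primary, "?" when unreachable).
            gsub(/^[ \t*?!]+|[ \t]+$/, "", status)
            if (status != "running") print name
        }'
}
```

Run as `./repmgr cluster show | list_unhealthy_nodes`; empty output means every listed node reports "running", which is the state the cluster should be in before a node is added.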

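II. System configuration of the new node

For expansion instructions, refer to the official documentation:

https://help.kingbase.com.cn/stage-api/profile/document/kes/v8r6/html/highly/availability/cluster-use/cluster-use-6.html#sshd

1. Configure the system environment of the new node

1) The new node must run the same operating system as the existing cluster nodes, at the same version.

2) Configure kernel parameters, the firewall, SELinux, and process resource limits on the new node; see the official KingbaseES documentation for the specific settings.

3) On the new node, set up SSH trust (passwordless login) to the other cluster nodes for both the database user and the root user.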
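Before running the expansion it may be worth confirming that the SSH trust actually works. The sketch below is one way to do that, assuming OpenSSH is in use; `check_ssh_trust` and the `SSH_BIN` override are hypothetical names introduced here for illustration.

```shell
# Sketch: verify passwordless SSH from the current node to each cluster node.
# SSH_BIN is a hypothetical override so the check can be dry-run with a stub.
SSH_BIN=${SSH_BIN:-ssh}

# Usage: check_ssh_trust USER HOST...
# Prints one "ok"/"FAILED" line per host; returns non-zero if any host failed.
check_ssh_trust() {
    user=$1
    shift
    rc=0
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        if "$SSH_BIN" -o BatchMode=yes -o ConnectTimeout=5 "$user@$host" true >/dev/null 2>&1; then
            echo "ok $user@$host"
        else
            echo "FAILED $user@$host"
            rc=1
        fi
    done
    return $rc
}
```

On node103 this could be run once as kingbase and once as root, for example `check_ssh_trust kingbase 192.168.1.101 192.168.1.102`.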

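2. Create the directory and required files on the new node

=The following files can be obtained from the database software installation directory on the primary.=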

Tips:

V8R6_cluster_install.sh and install.conf are copied from a KingbaseES V8R6C6 release; the KingbaseES V8R6C5 versions do not currently support expansion.

[kingbase@node103 r6_install]$ ls -lh
total 264M
-rw-rw-r-- 1 kingbase kingbase 261M Apr 7 10:33 db.zip
-rw-rw-r-- 1 kingbase kingbase 8.2K Apr 7 10:54 install.conf
-rwxr-x--- 1 kingbase kingbase 3.2K Apr 7 10:42 license.dat
-rw-rw-r-- 1 kingbase kingbase 2.1M Apr 7 10:33 securecmdd.zip
-rwxrwxr-x 1 kingbase kingbase 3.3K Apr 7 10:33 trust_cluster.sh
-rwxrwxr-x 1 kingbase kingbase 87K Apr 7 10:33 V8R6_cluster_install.sh
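A quick way to confirm that nothing was missed when copying is to check for each file by name. This is a minimal sketch; the file list simply mirrors the `ls -lh` listing above, and the function name `missing_install_files` is hypothetical.

```shell
# Sketch: report which deployment files are missing from a directory.
# The file list mirrors the `ls -lh` listing above.
# Usage: missing_install_files DIR
# Prints each missing file name; returns non-zero if anything is missing.
missing_install_files() {
    dir=$1
    rc=0
    for f in db.zip install.conf license.dat securecmdd.zip trust_cluster.sh V8R6_cluster_install.sh; do
        if [ ! -f "$dir/$f" ]; then
            echo "$f"
            rc=1
        fi
    done
    return $rc
}
```

For the layout in this case, `missing_install_files /home/kingbase/r6_install` should print nothing.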

III. Online expansion by script (run on the new node)

1. Edit install.conf

[kingbase@node103 r6_install]$ cat install.conf |grep -v ^$|grep -v ^#

[install]
on_bmj=0
all_ip=(192.168.1.101 192.168.1.102)
witness_ip=""
production_ip=()
local_disaster_recovery_ip=()
remote_disaster_recovery_ip=()
install_dir="/home/kingbase/cluster/HA_R6/kha/kingbase"
zip_package="/home/kingbase/r6_install/db.zip"
license_file=(license.dat)
db_user="system" # the user name of database
db_port="54321" # the port of database, defaults is 54321
db_mode="oracle" # database mode: pg, oracle
db_auth="scram-sha-256" # database authority: scram-sha-256, md5, default is scram-sha-256
db_case_sensitive="yes" # database case sensitive settings: yes, no. default is yes - case sensitive; no - case insensitive (NOTE. cannot set to 'no' when db_mode="pg").
trusted_servers="192.168.1.1"
data_directory="/home/kingbase/cluster/HA_R6/kha/kingbase/data"
virtual_ip=""
net_device=(enp0s3 enp0s3)
net_device_ip=(192.168.1.101 192.168.1.102)
ipaddr_path="/usr/sbin"
arping_path="/opt/Kingbase/ES/V8R6_041/Server/bin/"
ping_path="/bin"
super_user="root"
execute_user="kingbase"
deploy_by_sshd=1 # choose whether to use sshd when deploy, 0 means not to use (deploy by sys_securecmdd), 1 means to use (deploy by sshd), default value is 1; when on_bmj=1, it will auto set to no(deploy_by_sshd=0)
use_scmd=1 # Is the cluster running on sys_securecmdd or sshd? 1 means yes (on sys_securecmdd), 0 means no (on sshd), default value is 1; when on_bmj=1, it will auto set to yes(use_scmd=1)
reconnect_attempts="10" # the number of retries in the event of an error
reconnect_interval="6" # retry interval
recovery="standby" # the way of cluster recovery: standby/automatic/manual
ssh_port="22" # the port of ssh, default is 22
scmd_port="8890" # the port of sys_securecmdd, default is 8890
auto_cluster_recovery_level='1'
use_check_disk='off'
synchronous='quorum'
###### expand configuration follows #######
[expand]
expand_type="0" # The node type of standby/witness node, which would be add to cluster. 0:standby 1:witness
primary_ip="192.168.1.101" # The ip addr of cluster primary node, which need to expand a standby/witness node.
expand_ip="192.168.1.103" # The ip addr of standby/witness node, which would be add to cluster.
node_id="3" # The node_id of standby/witness node, which would be add to cluster. It does not the same with any one in cluster node
# for example: node_id="3"
install_dir="/home/kingbase/cluster/HA_R6/kha"
zip_package="/home/kingbase/r6_install/db.zip"
net_device=(enp0s3) # if virtual_ip set,it must be set
net_device_ip=(192.168.1.103) # if virtual_ip set,it must be set
license_file=(license.dat)
deploy_by_sshd="1"
ssh_port="22"
scmd_port="8890"
############################################
[shrink]
shrink_type="" # The node type of standby/witness node, which would be delete from cluster. 0:standby 1:witness
primary_ip="" # The ip addr of cluster primary node, which need to shrink a standby/witness node.
shrink_ip="" # The ip addr of standby/witness node, which would be delete from cluster.
node_id="" # The node_id of standby/witness node, which would be delete from cluster. It does not the same with any one in cluster node
# for example: node_id="3"
install_dir=""
ssh_port="22" # the port of ssh, default is 22
scmd_port="8890" # the port of sys_securecmd, default is 8890

=The [expand] section holds the configuration for the node being added; see the official documentation for details.=
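A malformed address in the [expand] section may only surface once the script is already running, so a small pre-check can help. The sketch below assumes the `key="value"` layout of install.conf shown above; `expand_value` and `is_ipv4` are hypothetical helper names, not part of the deployment script.

```shell
# Sketch: read a value from the [expand] section of an install.conf-style file.
# Usage: expand_value KEY FILE
expand_value() {
    awk -v key="$1" '
        # Track which section we are in.
        /^\[/ { in_expand = ($0 ~ /^\[expand\]/); next }
        in_expand && index($0, key "=") == 1 {
            val = substr($0, length(key) + 2)
            sub(/[ \t]*#.*$/, "", val)      # drop a trailing comment
            sub(/[ \t]+$/, "", val)         # drop trailing whitespace
            gsub(/^["(]|[")]$/, "", val)    # drop surrounding quotes or parentheses
            print val
            exit
        }' "$2"
}

# Sketch: succeed only for a dotted-quad IPv4 address with octets in 0..255.
# Usage: is_ipv4 ADDRESS
is_ipv4() {
    printf '%s\n' "$1" | awk -F. '
        NF != 4 { exit 1 }
        { for (i = 1; i <= 4; i++) if ($i !~ /^[0-9]+$/ || $i > 255) exit 1 }'
}
```

A possible use before running the expansion:

```shell
for key in primary_ip expand_ip; do
    is_ipv4 "$(expand_value "$key" install.conf)" || echo "check $key in [expand]"
done
```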

2. Run the expansion (expand)

[kingbase@node103 r6_install]$ sh V8R6_cluster_install.sh expand

[CONFIG_CHECK] will deploy the cluster of
[RUNNING] success connect to the target "192.168.1.103" ..... OK
.......
INFO: connecting to local node "node3" (ID: 3)
INFO: connecting to primary database
WARNING: --upstream-node-id not supplied, assuming upstream node is primary (node ID 1)
INFO: standby registration complete
NOTICE: standby node "node3" (ID: 3) successfully registered
Usage: /home/kingbase/cluster/R6HA/kha//kingbase/bin/sys_monitor.sh {start|stop|restart|stoplocal|set [--restart]|change_password user password}
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+---------+---------+---------------+----------+----------+----------+----------+----------------------------------------------------------------------------------------------------------------------------------------------------
1 | node101 | primary | * running | | default | 100 | 3 | host=192.168.1.101 user=system dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
2 | node102 | standby | running | node101 | default | 100 | 3 | host=192.168.1.102 user=system dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
3 | node3 | standby | ? unreachable | node101 | default | 100 | ? | host=192.168.1.103 user=system dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3

WARNING: following issues were detected
- unable to connect to node "node3" (ID: 3)
- node "node3" (ID: 3) is registered as an active standby but is unreachable

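=As shown above, the expansion itself completed, but the new node node3 is reported as "unreachable".=

IV. Troubleshooting the expansion problem

1. Check the cluster node status

=As shown below, querying from the primary reports node3 as "unreachable".=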

[kingbase@node101 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+---------+---------+---------------+----------+----------+----------+----------+----------------------------------------------------------------------------------------------------------------------------------------------------
1 | node101 | primary | * running | | default | 100 | 3 | host=192.168.1.101 user=system dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
2 | node102 | standby | running | node101 | default | 100 | 3 | host=192.168.1.102 user=system dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
3 | node3 | standby | ? unreachable | node101 | default | 100 | ? | host=192.168.1.103 user=system dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3

WARNING: following issues were detected
- unable to connect to node "node3" (ID: 3)
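When a node shows as "unreachable", one quick diagnostic is to test from the primary whether the new node's database port accepts TCP connections at all. The sketch below is an illustration only; it assumes bash and the coreutils `timeout` command are available, and `port_open` is a hypothetical helper name.

```shell
# Sketch: test whether a TCP port accepts connections, using bash's /dev/tcp.
# Requires bash and the coreutils `timeout` command.
# Usage: port_open HOST PORT
# Returns 0 if the connection succeeds within 3 seconds, non-zero otherwise.
port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}
```

For example, on node101: `port_open 192.168.1.103 54321 || echo "database port on node103 is not reachable"`. A failure here while the database is running on the new node points toward a network or firewall problem rather than a replication problem.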

2. Check the firewall on the new node

=As shown below, the firewall on the new node had not been turned off; stop and disable it.=

[root@node103 ~]# systemctl status firewalld
● firewalld.service - firewalld - dynamic firewall daemon
Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
Active: active (running) since Wed 2022-07-20 15:09:55 CST; 47min ago
Main PID: 815 (firewalld)
CGroup: /system.slice/firewalld.service
└─815 /usr/bin/python -Es /usr/sbin/firewalld --nofork --nopid

Jul 20 15:09:52 node103 systemd[1]: Starting firewalld - dynamic firewall daemon...
Jul 20 15:09:55 node103 systemd[1]: Started firewalld - dynamic firewall daemon.
[root@node103 ~]# systemctl stop firewalld
[root@node103 ~]# systemctl disable firewalld
Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
Removed symlink /etc/systemd/system/basic.target.wants/firewalld.service.
[root@node103 ~]# systemctl status firewalld
● firewalld.service - firewalld - dynamic firewall daemon
Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
Active: inactive (dead)

Jul 20 15:09:52 node103 systemd[1]: Starting firewalld - dynamic firewall daemon...
Jul 20 15:09:55 node103 systemd[1]: Started firewalld - dynamic firewall daemon.
Jul 20 15:57:05 node103 systemd[1]: Stopping firewalld - dynamic firewall daemon...
Jul 20 15:57:08 node103 systemd[1]: Stopped firewalld - dynamic firewall daemon.
[root@node103 ~]# iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source destination

Chain FORWARD (policy ACCEPT)
target prot opt source destination

Chain OUTPUT (policy ACCEPT)
target prot opt source destination

3. Re-register the new node

[kingbase@node103 bin]$ ./repmgr standby register --force
INFO: connecting to local node "node3" (ID: 3)
INFO: connecting to primary database
INFO: standby registration complete
NOTICE: standby node "node3" (ID: 3) successfully registered

4. Check the cluster node status

[kingbase@node101 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+---------+---------+-----------+----------+----------+----------+----------+----------------------------------------------------------------------------------------------------------------------------------------------------
1 | node101 | primary | * running | | default | 100 | 3 | host=192.168.1.101 user=system dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
2 | node102 | standby | running | node101 | default | 100 | 3 | host=192.168.1.102 user=system dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
3 | node3 | standby | running | node101 | default | 100 | 3 | host=192.168.1.103 user=system dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3

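=As shown above, all cluster nodes are in a normal state and the expansion is complete.=

V. Summary

1. Make sure the system environment of the new node is configured: firewall, network, kernel parameters, SELinux, and so on.

2. SSH trust must be established between the new node and the existing cluster nodes for both the database user and the root user.

3. In install.conf, the [install] section holds the settings used when the cluster was originally deployed by script; expansion requires filling in the [expand] section.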
