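Case description:

Production requirements call for two clusters on the same set of hosts. This case walks through the deployment of both clusters in detail.

Note: deploying multiple clusters on the same hosts requires the securecmdd service to be deployed first, because the nodes communicate through it. All clusters share the hosts' single securecmdd service for inter-node communication.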

Applicable version:

KingbaseES V8R6

I. Check the securecmdd service on the host nodes

1. Check the securecmdd service and port

[kingbase@node101 bin]$ ps -ef |grep securecmd
root 15486 1 0 14:34 ? 00:00:00 sys_securecmdd: /home/kingbase/cluster/securecmdd/bin/sys_securecmdd -f /etc/.kes/securecmdd_config [listener] 0 of 128-256 startups
[kingbase@node101 bin]$ netstat -antlp|grep 8890
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp 0 0 0.0.0.0:8890 0.0.0.0:* LISTEN -
tcp6 0 0 :::8890 :::* LISTEN -
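
The listener check above can also be scripted. The following is a minimal sketch, assuming netstat-style output is piped in; the helper name `port_is_listening` is hypothetical and is not part of KingbaseES.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a KingbaseES tool): succeed if netstat-style
# output on stdin shows a TCP socket in LISTEN state on the given port.
port_is_listening() {
    local port=$1
    # Field 4 is the local address (e.g. 0.0.0.0:8890 or :::8890),
    # field 6 is the socket state.
    awk -v p="$port" '
        $6 == "LISTEN" && $4 ~ (":" p "$") { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Usage (8890 is the securecmdd port used in this case):
#   netstat -antl | port_is_listening 8890 && echo "securecmdd is listening"
```

Matching on the end of the local address keeps a port such as 18890 from being mistaken for 8890.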

2. Test securecmd connections (all nodes)

# Local host
[kingbase@node101 bin]$ ./sys_securecmd kingbase@127.0.0.1 'whoami'
kingbase
[kingbase@node101 bin]$ ./sys_securecmd root@127.0.0.1 'whoami'
root
# Remote host
[kingbase@node101 bin]$ ./sys_securecmd kingbase@192.168.1.102 'whoami'
kingbase
[kingbase@node101 bin]$ ./sys_securecmd root@192.168.1.102 'whoami'
root
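
Because the test has to pass for every user on every node, it can be wrapped in a loop. This is a sketch under assumptions: the function names `check_scmd` and `check_all_scmd` are hypothetical, and the client is only invoked in the `user@host 'command'` form shown in the transcript above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run 'whoami' through a securecmd client and
# confirm that the reply is the expected user name.
# $1 = client command (e.g. ./sys_securecmd), $2 = user, $3 = host
check_scmd() {
    local client=$1 user=$2 host=$3 reply
    reply=$("$client" "$user@$host" 'whoami' 2>/dev/null) || return 1
    [ "$reply" = "$user" ]
}

# Check every user/host pair and print one line per pair.
# Returns non-zero if any pair fails.
# $1 = client command, $2 = space-separated users, $3 = space-separated hosts
check_all_scmd() {
    local client=$1 users=$2 hosts=$3 user host rc=0
    for host in $hosts; do
        for user in $users; do
            if check_scmd "$client" "$user" "$host"; then
                echo "OK   $user@$host"
            else
                echo "FAIL $user@$host"
                rc=1
            fi
        done
    done
    return $rc
}

# Usage, with the users and addresses from this case:
#   check_all_scmd ./sys_securecmd "kingbase root" "127.0.0.1 192.168.1.102"
```

Run it once on each node so that both directions of communication are covered.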

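II. Deploy the first cluster

1. Deploy the cluster (see the official documentation; either the deployment tool or the deployment script can be used)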

https://help.kingbase.com.cn/v8/install-updata/k-deploy/index.html

2. Check the node status of the first cluster

[kingbase@node101 bin]$ ./repmgr cluster show

 ID | Name  | Role    | Status    | Upstream | Location | Priority | Timeline | LSN_Lag | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+---------+---------------------------------------------------------------------------------------------------------------------------------------------------
 1  | node1 | primary | * running |          | default  | 100      | 6        |         | host=192.168.1.101 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
 2  | node2 | standby |   running | node1    | default  | 100      | 6        | 0 bytes | host=192.168.1.102 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3

3. Configuration of the first cluster

[kingbase@node101 bin]$ cat ../etc/repmgr.conf
use_scmd=on
ha_running_mode='DG'
node_id=1
node_name='node1'
conninfo='host=192.168.1.101 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3'
connection_check_type='mix'
data_directory='/data/kingbase/hac7/data'
#data_directory='/home/kingbase/cluster/R6HA/ha7/kingbase/data'
log_file='/home/kingbase/cluster/R6HA/ha7/kingbase/kingbase/log/hamgr.log'
kbha_log_file='/home/kingbase/cluster/R6HA/ha7/kingbase/kingbase/log/kbha.log'
sys_bindir='/home/kingbase/cluster/R6HA/ha7/kingbase/kingbase/bin'
scmd_options='-q -o ConnectTimeout=10 -o StrictHostKeyChecking=no -o ServerAliveInterval=2 -o ServerAliveCountMax=5 -p 8890'
trusted_servers='192.168.1.1'
running_under_failure_trusted_servers='on'
repmgrd_pid_file='/home/kingbase/cluster/R6HA/ha7/kingbase/kingbase/etc/hamgrd.pid'
kbha_pid_file='/home/kingbase/cluster/R6HA/ha7/kingbase/kingbase/etc/kbha.pid'
......

III. Deploy the second cluster (script deployment)

1. Files required for script deployment

[kingbase@node102 r6_install]$ ls -l
total 234236
-rw-rw-r-- 1 kingbase kingbase 237558136 Apr 7 14:07 db.zip
-rw-rw-r-- 1 kingbase kingbase 12397 Apr 19 16:09 install.conf
-rw-r--r-- 1 kingbase kingbase 3454 Apr 7 14:08 license.dat
-rw-rw-r-- 1 kingbase kingbase 2114981 Apr 19 15:43 securecmdd.zip
-rw-rw-r-- 1 kingbase kingbase 3902 Apr 7 14:07 trust_cluster.sh
-rw-rw-r-- 1 kingbase kingbase 152215 Apr 7 14:07 V8R6_cluster_install.sh

2. Create the cluster deployment directory (each cluster lives in its own directory)

[kingbase@node102 r6_install]$  mkdir -p /home/kingbase/cluster/R6HA/hac7/kingbase/

3. Edit the deployment configuration file

[kingbase@node102 r6_install]$ cat install.conf|grep -v ^$|grep -v ^#
[install]
on_bmj=0
all_ip=(192.168.1.102 192.168.1.101)
witness_ip=""
production_ip=()
local_disaster_recovery_ip=()
remote_disaster_recovery_ip=()
install_dir="/home/kingbase/cluster/R6HA/hac7"
zip_package="/home/kingbase/r6_install/db.zip"
license_file=(license.dat)
db_user="system" # the user name of database
db_port="54325" # the port of database, defaults is 54321
db_mode="oracle" # database mode: pg, oracle
db_auth="scram-sha-256" # database authority: scram-sha-256, md5, default is scram-sha-256
db_case_sensitive="no" # database case sensitive settings: yes, no. default is yes - case sensitive; no - case insensitive (NOTE. cannot set to 'no' when db_mode="pg").
db_checksums="yes" # the checksum for data: yes, no. default is yes - a checksum is calculated for each data block to prevent corruption; no - nothing to do.
archive_mode="on" # enables archiving; off, on, or always
db_encoding="" # Cararcter set encoding to use in the new database.Specify a tring constant,or an integer encoding number, default value provided by locale command.
db_collate="" # Collation order(LC_COLLATE) to use in the new database,This affects the sort order applied to strings, default value provided by locale command.
db_ctype="" # Character classification(LC_CTYPE) to use int the new database. This affects the categorization of characters, default value provided by locale command.
other_db_init_options="" # addional initdb options,such as "--scenario-tuning"
trusted_servers="192.168.1.1"
running_under_failure_trusted_servers='on'
data_directory="/home/kingbase/cluster/R6HA/hac7/kingbase/data"
waldir=''
virtual_ip=""
net_device=(enp0s3 enp0s3)
net_device_ip=(192.168.1.102 192.168.1.101)
ipaddr_path="/sbin"
arping_path=""
ping_path="/bin"
super_user="root"
execute_user="kingbase"
deploy_by_sshd=0 # choose whether to use sshd when deploy, 0 means not to use (deploy by sys_securecmdd), 1 means to use (deploy by sshd), default value is 1; when on_bmj=1, it will auto set to no(deploy_by_sshd=0)
use_scmd=1 # Is the cluster running on sys_securecmdd or sshd? 1 means yes (on sys_securecmdd), 0 means no (on sshd), default value is 1; when on_bmj=1, it will auto set to yes(use_scmd=1)
reconnect_attempts="10" # the number of retries in the event of an error
reconnect_interval="6" # retry interval
recovery="standby" # the way of cluster recovery: standby/automatic/manual
ssh_port="22" # the port of ssh, default is 22
scmd_port="8890" # the port of sys_securecmdd, default is 8890
auto_cluster_recovery_level='1'
use_check_disk='off'
synchronous=''
sync_in_same_location=0
failover_need_server_alive='off'
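
Before running the deployment, it is worth confirming that the values that must be unique per cluster (here `db_port` and `install_dir`) really differ from those of the first cluster. The helpers below are a minimal sketch; `conf_get` and `differs_from_all` are hypothetical names, and the parser assumes simple `key="value" # comment` lines with no `#` inside the value.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value of a key from an install.conf-style
# file. Assumes simple key="value" lines and no '#' inside the value.
conf_get() {
    local file=$1 key=$2
    # Take the last assignment of the key, drop a trailing comment,
    # then strip the surrounding quotes.
    sed -n "s/^${key}=//p" "$file" | tail -n 1 |
        sed -e 's/[[:space:]]*#.*$//' -e "s/^[\"']//" -e "s/[\"'][[:space:]]*\$//"
}

# Succeed only if the first argument differs from every other argument.
differs_from_all() {
    local value=$1 other
    shift
    for other in "$@"; do
        [ "$value" != "$other" ] || return 1
    done
}

# Usage (54321 is the database port of the first cluster in this case):
#   port=$(conf_get install.conf db_port)
#   differs_from_all "$port" 54321 || echo "db_port $port clashes with the first cluster"
```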

4. Copy db.zip to the cluster installation directory and extract it (all nodes)

1) Copy db.zip to all nodes

[kingbase@node102 r6_install]$ cp db.zip /home/kingbase/cluster/R6HA/hac7/kingbase
[kingbase@node102 r6_install]$ scp db.zip node101:/home/kingbase/cluster/R6HA/hac7/kingbase
db.zip 100% 227MB 88.2MB/s 00:02

2) Extract db.zip

[kingbase@node101 kingbase]$ unzip db.zip
[kingbase@node101 kingbase]$ ls -lh
total 227M
drwxr-xr-x 2 kingbase kingbase 4.0K Oct 29 14:57 bin
-rw-rw-r-- 1 kingbase kingbase 227M Apr 19 16:07 db.zip
drwxrwxr-x 5 kingbase kingbase 8.0K Oct 29 14:57 lib
drwxrwxr-x 7 kingbase kingbase 4.0K Oct 29 14:57 share

5. Copy license.dat to the cluster bin directory

[kingbase@node102 r6_install]$ cp license.dat /home/kingbase/cluster/R6HA/hac7/kingbase/bin/
[kingbase@node102 r6_install]$ scp license.dat node101:/home/kingbase/cluster/R6HA/hac7/kingbase/bin/
license.dat 100% 3454 3.3MB/s 00:00
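
Steps 2, 4 and 5 have to be repeated on every node, so they lend themselves to a small script. The sketch below only prints the commands for one remote node instead of running them. It assumes the `kingbase` user can reach the other node with `ssh` and `scp` (the `scp` transcripts above suggest so); the helper name `stage_cmds` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper: print the commands that stage db.zip and
# license.dat on one remote node. Nothing is executed here.
# $1 = remote host, $2 = cluster directory (the .../kingbase directory)
stage_cmds() {
    local host=$1 dir=$2
    echo "ssh $host mkdir -p $dir"
    echo "scp db.zip $host:$dir"
    # bin/ only exists after extraction, so unzip before copying the license.
    echo "ssh $host 'cd $dir && unzip db.zip'"
    echo "scp license.dat $host:$dir/bin/"
}

# Usage: review the output first, then pipe it to sh to run it.
#   stage_cmds node101 /home/kingbase/cluster/R6HA/hac7/kingbase
#   stage_cmds node101 /home/kingbase/cluster/R6HA/hac7/kingbase | sh
```

Printing the commands first makes the order visible: license.dat can only be copied into bin/ after db.zip has been extracted.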

6. Run the deployment script

[kingbase@node102 r6_install]$ sh V8R6_cluster_install.sh
........
2023-04-19 16:14:25 repmgrd on "[192.168.1.101]" start success.
 ID | Name  | Role    | Status    | Upstream | repmgrd | PID   | Paused? | Upstream last seen
----+-------+---------+-----------+----------+---------+-------+---------+--------------------
 1  | node1 | primary | * running |          | running | 32636 | no      | n/a
 2  | node2 | standby |   running | node1    | running | 4895  | no      | 1 second(s) ago
[2023-04-19 16:14:26] [NOTICE] redirecting logging output to "/home/kingbase/cluster/R6HA/hac7/kingbase/log/kbha.log"
[2023-04-19 16:14:28] [NOTICE] redirecting logging output to "/home/kingbase/cluster/R6HA/hac7/kingbase/log/kbha.log"
2023-04-19 16:14:28 Done.
[INSTALL] start up the whole cluster ... OK

As shown above, the second cluster has been deployed successfully.

7. Configuration of the second cluster

[kingbase@node102 bin]$ cat ../etc/repmgr.conf
use_scmd=on
ha_running_mode='DG'
node_id=1
node_name='node1'
conninfo='host=192.168.1.102 user=esrep dbname=esrep port=54325 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3'
connection_check_type='mix'
data_directory='/home/kingbase/cluster/R6HA/hac7/kingbase/data'
log_file='/home/kingbase/cluster/R6HA/hac7/kingbase/log/hamgr.log'
kbha_log_file='/home/kingbase/cluster/R6HA/hac7/kingbase/log/kbha.log'
sys_bindir='/home/kingbase/cluster/R6HA/hac7/kingbase/bin'
scmd_options='-q -o ConnectTimeout=10 -o StrictHostKeyChecking=no -o ServerAliveInterval=2 -o ServerAliveCountMax=5 -p 8890'
trusted_servers='192.168.1.1'
running_under_failure_trusted_servers='on'
repmgrd_pid_file='/home/kingbase/cluster/R6HA/hac7/kingbase/etc/hamgrd.pid'
kbha_pid_file='/home/kingbase/cluster/R6HA/hac7/kingbase/etc/kbha.pid'
......

IV. Cluster verification

1. Node status of the second cluster (hac7, port 54325)

[kingbase@node102 bin]$ pwd
/home/kingbase/cluster/R6HA/hac7/kingbase/bin
[kingbase@node102 bin]$ ./repmgr cluster show
 ID | Name  | Role    | Status    | Upstream | Location | Priority | Timeline | LSN_Lag | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+---------+---------------------------------------------------------------------------------------------------------------------------------------------------
 1  | node1 | primary | * running |          | default  | 100      | 1        |         | host=192.168.1.102 user=esrep dbname=esrep port=54325 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
 2  | node2 | standby |   running | node1    | default  | 100      | 1        | 0 bytes | host=192.168.1.101 user=esrep dbname=esrep port=54325 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3

2. Node status of the first cluster (ha7, port 54321)

[kingbase@node101 bin]$ pwd
/home/kingbase/cluster/R6HA/ha7/kingbase/kingbase/bin
[kingbase@node101 bin]$ ./repmgr cluster show
 ID | Name  | Role    | Status    | Upstream | Location | Priority | Timeline | LSN_Lag | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+---------+---------------------------------------------------------------------------------------------------------------------------------------------------
 1  | node1 | primary | * running |          | default  | 100      | 6        |         | host=192.168.1.101 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
 2  | node2 | standby |   running | node1    | default  | 100      | 6        | 0 bytes | host=192.168.1.102 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
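
The status tables can also be checked mechanically, for example from a monitoring job. The following is a coarse sketch, assuming the column layout shown in the `repmgr cluster show` output in this case (Role in the third column, Status in the fourth); the helper name `cluster_is_healthy` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `repmgr cluster show` output on stdin and
# succeed only if there is exactly one primary and the Status column of
# every node contains "running". This is a coarse check that ignores
# the status marker in front of the word.
cluster_is_healthy() {
    awk -F'|' '
        # Data rows start with a numeric node ID in the first column;
        # the header and separator rows do not.
        $1 ~ /^[[:space:]]*[0-9]+[[:space:]]*$/ {
            nodes++
            if ($3 ~ /primary/) primaries++
            if ($4 !~ /running/) bad++
        }
        END { exit ((nodes > 0 && primaries == 1 && bad == 0) ? 0 : 1) }
    '
}

# Usage, run from the bin directory of each cluster in turn:
#   ./repmgr cluster show | cluster_is_healthy && echo "cluster looks healthy"
```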

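V. Summary

When deploying the second cluster with the script, first create the cluster installation directory, copy db.zip into it and extract it, then copy license.dat into the bin directory under it, and only then run the deployment script. For cluster environments that need a VIP, each cluster must be configured with its own VIP address. Because the database services of the clusters run at the same time, each cluster must also use a different database service port.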
