Case description:

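The production environment runs a single instance and the test environment is a cluster. The production data now needs to be migrated into the cluster. This document describes the steps for restoring data from a single-instance environment into a cluster environment and can serve as a reference for migrating production data.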

Applicable version:

KingbaseES V8R6

Database version used in this case (the single instance and the cluster run the same version):

test=# select version();
version
----------------------------------------------------------------------------------------------------------------------
KingbaseES V008R006C005B0041 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit
(1 row)

Original cluster node information:

[kingbase@node101 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+---------+---------+-----------+----------+----------+----------+----------+----------------------------------------------------------------------------------------------------------------------------------------------------
1 | node101 | primary | * running | | default | 100 | 13 | host=192.168.1.101 user=system dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
2 | node102 | standby | running | node101 | default | 100 | 13 | host=192.168.1.102 user=system dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3

I. Preparing the single-instance environment before migration

Tips:

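1) Migrating the single instance's data to the cluster requires stopping the single-instance database service. Estimate the downtime window from the size of the data directory.

2) Before shutting down the single-instance database, create a checkpoint manually. If the WAL logs take up a lot of space, back them up and then clean them up, keeping only the most recent day's logs through the latest checkpoint.

3) The single instance's data directory has to be copied across hosts to the cluster's primary and standby nodes. Estimate the total copy time from the network bandwidth and the number of nodes.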
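
The copy-time estimate in tip 3 can be roughed out with simple arithmetic. The following is only a sketch: the function name and its inputs are illustrative, it assumes the copies run one after another, and the throughput figure is whatever the operator has measured on their own network.

```shell
# Rough estimate of how long it takes to copy a data directory to several
# nodes, one after another. All inputs are supplied by the operator.
#   $1 = size of the data directory in MB (for example from `du -sm data`)
#   $2 = sustained copy throughput in MB/s (measured, not nominal)
#   $3 = number of cluster nodes that receive a copy
estimate_copy_seconds() (
    size_mb=$1
    rate_mb_s=$2
    nodes=$3
    # Integer ceiling division: a partial second counts as a full one.
    echo $(( (size_mb * nodes + rate_mb_s - 1) / rate_mb_s ))
)
```

For example, `estimate_copy_seconds 10240 100 2` prints `205`, i.e. about three and a half minutes for a 10 GB directory copied to two nodes at 100 MB/s. The downtime window also has to cover the configuration and troubleshooting steps that follow, so this number is a lower bound.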

1. Check the data

prod1=# \l
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
-----------+--------+----------+----------+-------------+-------------------
prod | system | UTF8 | ci_x_icu | zh_CN.UTF-8 |
prod1 | system | UTF8 | ci_x_icu | zh_CN.UTF-8 |
security | system | UTF8 | ci_x_icu | zh_CN.UTF-8 |
template0 | system | UTF8 | ci_x_icu | zh_CN.UTF-8 | =c/system +
| | | | | system=CTc/system
template1 | system | UTF8 | ci_x_icu | zh_CN.UTF-8 | =c/system +
| | | | | system=CTc/system
test | system | UTF8 | ci_x_icu | zh_CN.UTF-8 |
(6 rows)

prod1=# select count(*) from t1;
count
--------
100000
(1 row)

2. Shut down the database service

1) Create a checkpoint manually
prod1=# select sys_switch_wal();
sys_switch_wal
----------------
0/3F74BD70
(1 row)

prod1=# checkpoint;
CHECKPOINT

2) Shut down the database service cleanly

[kingbase@node101 bin]$ ./sys_ctl stop -D /data/kingbase/v8r6_041/data
waiting for server to shut down.... done
server stopped

II. Migrate the data to the cluster

1. Stop the cluster

[kingbase@node101 bin]$ ./sys_monitor.sh stop
2022-06-20 10:33:40 Ready to stop all DB ...
.....
2022-06-20 10:33:55 begin to stop DB on "[192.168.1.102]".
2022-06-20 10:33:57 DB on "[192.168.1.101]" stop success.
2022-06-20 10:33:57 Done.

2. Back up the cluster's data directory

[kingbase@node101 bin]$ cd ../
[kingbase@node101 kingbase]$ mv data data.bk
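
A plain `mv data data.bk` moves `data` into `data.bk` if a directory of that name is already there. The sketch below adds a guard for that case; the function name is illustrative and not part of KingbaseES.

```shell
# Rename a data directory to <dir>.bk, refusing to continue if the backup
# name is already taken.
backup_data_dir() (
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "no such directory: $dir" >&2
        exit 1
    fi
    if [ -e "$dir.bk" ]; then
        echo "backup already exists: $dir.bk" >&2
        exit 1
    fi
    mv "$dir" "$dir.bk"
)
```

It would be called as `backup_data_dir data` from the directory that contains `data`, after the cluster has been stopped.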

3. Copy the single instance's data directory to the cluster's primary and standby nodes

[kingbase@node101 v8r6_041]$ scp -r data node101:/home/kingbase/cluster/R6HA/kha/kingbase
[kingbase@node101 v8r6_041]$ scp -r data node102:/home/kingbase/cluster/R6HA/kha/kingbase
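
After the copy it can be worth confirming that each node received the same files. One way, sketched here under the assumption that GNU coreutils and findutils are available on every host, is to compute a single fingerprint per directory and compare the values. The function name is illustrative, and the comparison is only meaningful while the database is stopped.

```shell
# Print one SHA-256 fingerprint for the contents of a directory tree.
# Two trees with identical relative paths and file contents give the same
# value; ownership and permissions are not covered.
dir_fingerprint() (
    cd "$1" || exit 1
    # Hash every regular file in a stable byte order, then hash the list.
    find . -type f -print0 | LC_ALL=C sort -z | xargs -0 sha256sum | sha256sum | cut -d' ' -f1
)
```

Run the function against the source `data` directory and against each copied `data` directory (for instance through `ssh` on the remote node) and check that the printed values match.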

4. Create a standby.signal file on every standby in the cluster

[kingbase@node102 kingbase]$ cd data
[kingbase@node102 data]$ touch standby.signal

5. Create the kingbase.auto.conf, kingbase.conf, and es_rep.conf files (on all nodes):

[kingbase@node101 data]$ mv kingbase.auto.conf kingbase.auto.conf.bk

# Copy the files from the original cluster
[kingbase@node101 data]$ cp ../data.bk/kingbase.auto.conf ./
[kingbase@node101 data]$ cat kingbase.auto.conf
# Do not edit this file manually!
# It will be overwritten by the ALTER SYSTEM command.
enable_upper_colname = 'on'
wal_retrieve_retry_interval = '5000'
primary_conninfo = 'user=system connect_timeout=10 host=192.168.1.102 port=54321 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3 application_name=node101'
recovery_target_timeline = 'latest'
primary_slot_name = 'repmgr_slot_1'
synchronous_standby_names = '1 (node102)'

[kingbase@node101 data]$ cp ../data.bk/kingbase.conf ./
[kingbase@node101 data]$ cp ../data.bk/es_rep.conf ./
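
Because this step has to be repeated on every node, the three copies can be wrapped in one function. This is a sketch only: the function name is illustrative, and the file list is the one named in the step above.

```shell
# Copy the cluster-specific configuration files from the saved cluster
# directory into the freshly copied data directory. A file that already
# exists in the target is kept with a .bk suffix.
restore_cluster_configs() (
    old_dir=$1   # saved cluster directory, e.g. ../data.bk
    new_dir=$2   # directory copied from the single instance, e.g. .
    files="kingbase.auto.conf kingbase.conf es_rep.conf"
    # Check everything first so that nothing is changed on failure.
    for f in $files; do
        if [ ! -f "$old_dir/$f" ]; then
            echo "missing in old cluster directory: $f" >&2
            exit 1
        fi
    done
    for f in $files; do
        if [ -f "$new_dir/$f" ]; then
            mv "$new_dir/$f" "$new_dir/$f.bk"
        fi
        cp "$old_dir/$f" "$new_dir/$f"
    done
)
```

From inside the new `data` directory the call would be `restore_cluster_configs ../data.bk .`.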

III. Create and configure streaming replication

Tips:

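1) Make sure the database service starts normally on every node.

2) If the database service fails to start, check sys_log (especially on the standby).

3) If streaming replication fails, the standby's sys_log shows why the standby cannot stream (in this case, the cause is a replication slot problem).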
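
When going through sys_log, it helps to filter the newest file for the serious messages. The sketch below assumes that log lines carry a severity label followed by a colon, as in the `WARNING:` line printed by sys_ctl in the next step, and that the log directory is passed in by the operator; the function name is illustrative.

```shell
# Print ERROR/FATAL/PANIC lines from the most recently modified file in a
# log directory. Returns non-zero when no such line is found.
latest_log_errors() (
    log_dir=$1
    latest=$(ls -t "$log_dir" | head -n 1)
    if [ -z "$latest" ]; then
        echo "no log files in $log_dir" >&2
        exit 1
    fi
    grep -E '(ERROR|FATAL|PANIC):' "$log_dir/$latest"
)
```

On the standby this would be run against its sys_log directory, for example `latest_log_errors ../data/sys_log` if the logs live under the data directory (an assumption; adjust the path to the actual log location).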

1. Start the database service on the primary and standby with sys_ctl

[kingbase@node101 bin]$ ./sys_ctl start -D ../data
waiting for server to start....2022-06-20 11:02:01.860 CST [30274] WARNING: enable_upper_colname can only be
.......
server started

2. Check the streaming replication status

test=# select * from sys_stat_replication;
pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | backend_xmin | state | sent_lsn | write_lsn | flush_lsn | replay_lsn | write_lag | flush_lag | replay_lag | sync_priority | sync_state | reply_time
-----+----------+---------+------------------+-------------+-----------------+-------------+---------------+--------------+-------+----------+-----------+-----------+------------+-----------+-----------+------------+---------------+------------+------------
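(0 rows)

# As shown above, streaming replication was not established. The standby's sys_log shows that the standby cannot start streaming replication because of a replication slot problem.

[Figure: standby sys_log output showing the replication slot error]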

3. Check the replication slots

test=# select * from sys_replication_slots;
slot_name | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn
--------------+--------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------
slot_node102 | | physical | | | f | f | | | | 0/370B4F68 |
slot_node101 | | physical | | | f | f | | | | |
(2 rows)

4. Recreate the replication slots

test=# select sys_drop_replication_slot('slot_node101');
sys_drop_replication_slot
---------------------------

(1 row)

test=# select sys_drop_replication_slot('slot_node102');
sys_drop_replication_slot
---------------------------

(1 row)

test=# select * from sys_replication_slots;
slot_name | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn
-----------+--------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------
(0 rows)

test=# select sys_create_physical_replication_slot('repmgr_slot_2');
sys_create_physical_replication_slot
--------------------------------------
(repmgr_slot_2,)
(1 row)

test=# select sys_create_physical_replication_slot('repmgr_slot_1');
sys_create_physical_replication_slot
--------------------------------------
(repmgr_slot_1,)
(1 row)

test=# select * from sys_replication_slots;
slot_name | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn
---------------+--------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------
repmgr_slot_2 | | physical | | | f | t | 31655 | 915 | | 0/420001E0 |
repmgr_slot_1 | | physical | | | f | f | | | | |
(2 rows)

test=# select * from sys_stat_replication;
pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | backend_xmin | state | sent_lsn | write_lsn | flush_lsn | replay_lsn | write_lag | flush_lag | replay_lag | sync_priority | sync_state | reply_time
-------+----------+---------+------------------+---------------+-----------------+-------------+-------------------------------+----
31655 | 10 | system | node102 | 192.168.1.102 | | 36924 | 2022-06-20 11:07:38.664780+08 | | streaming | 0/420001E0 | 0/420001E0 | 0/420001E0 | 0/420001E0 | | | | 1 | sync | 2022-06-20 11:08:05.571558+08
(1 row)

# As shown above, once the correct replication slots are recreated, streaming replication returns to normal.
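
The mismatch behind this failure is visible in the files already shown: the kingbase.auto.conf copied from the original cluster sets `primary_slot_name = 'repmgr_slot_1'`, while the data copied from the single instance only contained `slot_node101` and `slot_node102`. A small check for that condition is sketched below. Both function names are illustrative, and the list of existing slots is assumed to be supplied by the operator, one name per line (for example, the `slot_name` column exported from `sys_replication_slots` on the primary).

```shell
# Print the primary_slot_name value from a kingbase.auto.conf-style file.
# If the setting appears more than once, the last occurrence is used.
configured_slot_name() (
    sed -n "s/^[[:space:]]*primary_slot_name[[:space:]]*=[[:space:]]*'\([^']*\)'.*/\1/p" "$1" | tail -n 1
)

# Succeed if the slot name in $1 appears on stdin (one slot name per line).
slot_exists() (
    grep -qxF -- "$1"
)
```

With the standby's configured name in hand, `printf 'slot_node102\nslot_node101\n' | slot_exists repmgr_slot_1` fails, which is the situation found in step 3, and the same check succeeds once the `repmgr_slot_*` slots have been created.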

IV. Configure repmgr cluster management (done on the primary)

1. Create the esrep database and the esrep user:

# Create the esrep user with the password used in the original cluster
[kingbase@node102 data]$ cat ~/.encpwd
*:*:*:system:MTIzNDU2NzhhYg==
#*:*:*:system:MTIzNDU2Cg==
*:*:*:esrep:S2luZ2Jhc2VoYTExMA==
[kingbase@node102 data]$ echo 'S2luZ2Jhc2VoYTExMA=='|base64 -d
Kingbaseha110

test=# create database esrep;
CREATE DATABASE
test=# create user esrep with superuser password 'Kingbaseha110';
CREATE ROLE

# Change the system user's password to the one used in the original cluster
[kingbase@node101 bin]$ echo 'MTIzNDU2NzhhYg=='|base64 -d
12345678ab
[kingbase@node101 bin]$ ./ksql -U system test
ksql (V8.0)
Type "help" for help.

test=# alter user system with password '12345678ab';
ALTER ROLE
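
The lookup and decoding done by hand above can be combined into one helper. This sketch assumes the file layout seen in the listing: five colon-separated fields with the user name in the fourth and the base64-encoded password in the fifth, and `#` marking a disabled line. The function name is illustrative.

```shell
# Print the decoded password stored for a user in a ~/.encpwd-style file.
# Only the first uncommented entry for the user is used.
decode_encpwd() (
    file=$1
    user=$2
    encoded=$(awk -F: -v u="$user" '!/^#/ && $4 == u { print $5; exit }' "$file")
    if [ -z "$encoded" ]; then
        echo "no entry for user: $user" >&2
        exit 1
    fi
    printf '%s' "$encoded" | base64 -d
)
```

With the file shown above, `decode_encpwd ~/.encpwd esrep` prints `Kingbaseha110` and `decode_encpwd ~/.encpwd system` prints `12345678ab`, both without a trailing newline.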

2. Create the repmgr extension

# Make sure kingbase.conf loads the repmgr extension

[kingbase@node101 bin]$ cat ../data/kingbase.conf |grep repmgr
........
shared_preload_libraries = 'repmgr,liboracle_parser, synonym, plsql, force_view, kdb_flashback,plugin_debugger, plsql_plugin_debugger, plsql_plprofiler, ora_commands,kdb_ora_expr, sepapower, dblink, sys_kwr, sys_ksh, sys_spacequota, sys_stat_statements, backtrace, kdb_utils_function, auto_bmr, sys_squeeze'

# Create the repmgr extension on the primary
test=# create extension repmgr;
CREATE EXTENSION
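
The `grep repmgr` check above matches any line that mentions repmgr, including commented ones. A stricter check is sketched here: it parses the active `shared_preload_libraries` setting and looks for an exact library name. The function name is illustrative, and the parsing assumes the value is written in single quotes on one line, as in the listing above.

```shell
# Succeed if the given library is listed in the active (uncommented)
# shared_preload_libraries setting of a configuration file.
preload_has_library() (
    conf=$1
    lib=$2
    # The last uncommented assignment in the file is the one taken here.
    value=$(sed -n "s/^[[:space:]]*shared_preload_libraries[[:space:]]*=[[:space:]]*'\([^']*\)'.*/\1/p" "$conf" | tail -n 1)
    # Split on commas, trim blanks, and require an exact match.
    printf '%s\n' "$value" | tr ',' '\n' \
        | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' \
        | grep -qxF -- "$lib"
)
```

For example, `preload_has_library ../data/kingbase.conf repmgr && echo ok` prints `ok` for the configuration shown above.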

3. Register the primary and standby in the repmgr cluster

1) Register the primary node

[kingbase@node101 bin]$ ./repmgr primary register --force
INFO: connecting to primary database...
NOTICE: attempting to install extension "repmgr"
NOTICE: "repmgr" extension successfully installed
NOTICE: primary node record (ID: 1) registered

[kingbase@node101 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+---------+---------+-----------+----------+----------+----------+----------+----------------------------------------------------------------------------------------------------------------------------------------------------
1 | node101 | primary | * running | | default | 100 | 1 | host=192.168.1.101 user=system dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3

2) Register the standby node

[kingbase@node102 bin]$ ./repmgr standby register --force
INFO: connecting to local node "node102" (ID: 2)
INFO: connecting to primary database
WARNING: --upstream-node-id not supplied, assuming upstream node is primary (node ID 1)
INFO: standby registration complete
NOTICE: standby node "node102" (ID: 2) successfully registered

3) Check the status of the cluster nodes

[kingbase@node102 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+---------+---------+-----------+----------+----------+----------+----------+----------------------------------------------------------------------------------------------------------------------------------------------------
1 | node101 | primary | * running | | default | 100 | 1 | host=192.168.1.101 user=system dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
2 | node102 | standby | running | node101 | default | 100 | 1 | host=192.168.1.102 user=system dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
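
The status table can also be checked by a script instead of by eye. The sketch below reads the output of `./repmgr cluster show` on stdin and assumes the column layout shown above, with Status as the fourth `|`-separated column and the primary marked as `* running`; the function name is illustrative.

```shell
# Succeed only if the table on stdin has at least one node row and every
# node row reports the status "running".
all_nodes_running() (
    awk -F'|' '
        # Node rows are the ones whose first column is a numeric ID.
        $1 ~ /^[ \t]*[0-9]+[ \t]*$/ {
            rows++
            status = $4
            # Drop surrounding blanks and the "*" marker of the primary.
            gsub(/^[ \t*]+|[ \t]+$/, "", status)
            if (status != "running") bad++
        }
        END { exit (rows > 0 && bad == 0) ? 0 : 1 }
    '
)
```

Usage would be `./repmgr cluster show | all_nodes_running && echo "cluster healthy"`. Any other status text in a node row makes the check fail.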

V. Verify the data

1. Restart the cluster

[kingbase@node101 bin]$ ./sys_monitor.sh restart
2022-06-20 11:26:21 Ready to stop all DB ...
......
2022-06-20 11:27:03 repmgrd on "[192.168.1.102]" start success.
ID | Name | Role | Status | Upstream | repmgrd | PID | Paused? | Upstream last seen
----+---------+---------+-----------+----------+---------+-------+---------+--------------------
1 | node101 | primary | * running | | running | 5641 | no | n/a
2 | node102 | standby | running | node101 | running | 22473 | no | 0 second(s) ago
[2022-06-20 11:27:08] [NOTICE] redirecting logging output to "/home/kingbase/cluster/R6HA/kha/kingbase/log/kbha.log"
[2022-06-20 11:27:13] [NOTICE] redirecting logging output to "/home/kingbase/cluster/R6HA/kha/kingbase/log/kbha.log"
2022-06-20 11:27:19 Done.

2. Check the repmgrd process status (on all nodes)

[kingbase@node101 bin]$ ps -ef |grep repmgr
kingbase 5641 1 0 11:26 ? 00:00:00 /home/kingbase/cluster/R6HA/kha/kingbase/bin/repmgrd -d -v -f /home/kingbase/cluster/R6HA/kha/kingbase/bin/../etc/repmgr.conf
kingbase 5879 1 0 11:27 ? 00:00:00 /home/kingbase/cluster/R6HA/kha/kingbase/bin/kbha -A daemon -f /home/kingbase/cluster/R6HA/kha/kingbase/bin/../etc/repmgr.conf

3. Verify the data after migration

[kingbase@node101 bin]$ ./ksql -U system test
ksql (V8.0)
Type "help" for help.

test=# \l
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
-----------+--------+----------+----------+-------------+-------------------
esrep | system | UTF8 | ci_x_icu | zh_CN.UTF-8 |
prod | system | UTF8 | ci_x_icu | zh_CN.UTF-8 |
prod1 | system | UTF8 | ci_x_icu | zh_CN.UTF-8 |
security | system | UTF8 | ci_x_icu | zh_CN.UTF-8 |
template0 | system | UTF8 | ci_x_icu | zh_CN.UTF-8 | =c/system +
| | | | | system=CTc/system
template1 | system | UTF8 | ci_x_icu | zh_CN.UTF-8 | =c/system +
| | | | | system=CTc/system
test | system | UTF8 | ci_x_icu | zh_CN.UTF-8 |
(7 rows)

test=# \c prod1
You are now connected to database "prod1" as user "system".
prod1=# \d
List of relations
Schema | Name | Type | Owner
--------+---------------------+-------+--------
public | sys_stat_statements | view | system
public | t1 | table | system
(2 rows)

prod1=# select count(*) from t1;
count
--------
100000
(1 row)

# The data after migration matches the data before migration, so the migration succeeded.

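VI. Summary

1. To keep the data consistent when migrating from a single instance to a cluster, both the single instance and the cluster must be shut down. For a production environment, plan for a downtime window.
2. If the original cluster's data needs to be loaded into the new cluster, take a logical backup of it first, and handle duplicate rows during the import (for example, re-importing test data into the new cluster may produce many duplicates).
3. When requesting the downtime window, account for the size of the single instance's data, the network bandwidth between hosts, the number of cluster nodes, the time needed to configure the cluster, and the time needed to troubleshoot cluster startup failures.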
