First, the scale of this 3-node MongoDB cluster along the relevant dimensions:

1. dataSize: 1.9T

2. storageSize: 600G

3. Full backup, compression enabled: 186G, elapsed 8h

4. Full backup, no compression: 1.8T, elapsed 4h27m

The export syntax itself is simple and not worth repeating here; this article focuses on how the import was optimized, and closes with the resulting import best practices.
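For reference, the two full-backup numbers above correspond to running mongodump with and without the --gzip switch. A minimal sketch (host, port, credentials and paths follow the restore commands later in this post; the helper name dump_cmd is ours, introduced only for illustration):

```shell
# Build the mongodump command line for a full backup of likingtest.
# "$1" = "yes" to enable --gzip (the 186G / 8h variant);
#        anything else gives the uncompressed 1.8T / 4h27m variant.
dump_cmd() {
  local gzip="$1"
  local cmd="mongodump --port=20000 -uadmin -p'passwd' --authenticationDatabase=admin -d likingtest -o /u01/nfs/xxxxx_mongodb/10.1.1.1/20230913"
  if [ "$gzip" = "yes" ]; then
    cmd="$cmd --gzip"
  fi
  echo "$cmd"
}
dump_cmd yes
```

Note the trade-off visible in the numbers: compression shrinks the backup roughly 10x but more than doubles the wall-clock time.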

■ 2023-09-13T20:00 Test 1: import with 4 concurrent workers

mongorestore --port=20000 -uadmin -p'passwd' --authenticationDatabase=admin --numInsertionWorkersPerCollection=4 --bypassDocumentValidation -d likingtest /u01/nfs/xxxxx_mongodb/10.1.1.1/20230913/likingtest >> 10.2.2.2.log 2>&1 &
tail -100f /u01/nfs/xxxxx_mongodb/10.1.1.1/20230913/10.2.2.2.log
Output of the above import:
2023-09-13T21:59:55.452+0800 The --db and --collection flags are deprecated for this use-case; please use --nsInclude instead, i.e. with --nsInclude=${DATABASE}.${COLLECTION}
2023-09-13T21:59:55.452+0800 building a list of collections to restore from /u01/nfs/xxxxx_mongodb/10.1.1.1/20230913/likingtest dir
2023-09-13T21:59:55.466+0800 reading metadata for likingtest.oprceConfiguration from /u01/nfs/xxxxx_mongodb/10.1.1.1/20230913/likingtest/oprceConfiguration.metadata.json
2023-09-13T21:59:55.478+0800 reading metadata for likingtest.oprceDataObj from /u01/nfs/xxxxx_mongodb/10.1.1.1/20230913/likingtest/oprceDataObj.metadata.json
2023-09-13T21:59:55.491+0800 reading metadata for likingtest.oprcesDataObjInit from /u01/nfs/xxxxx_mongodb/10.1.1.1/20230913/likingtest/oprcesDataObjInit.metadata.json
2023-09-13T21:59:55.503+0800 reading metadata for likingtest.role from /u01/nfs/xxxxx_mongodb/10.1.1.1/20230913/likingtest/role.metadata.json
2023-09-13T21:59:55.508+0800 reading metadata for likingtest.activityConfiguration from /u01/nfs/xxxxx_mongodb/10.1.1.1/20230913/likingtest/activityConfiguration.metadata.json
2023-09-13T21:59:55.511+0800 reading metadata for likingtest.history_task from /u01/nfs/xxxxx_mongodb/10.1.1.1/20230913/likingtest/history_task.metadata.json
2023-09-13T21:59:55.512+0800 reading metadata for likingtest.resOutRelDataSnapshot from /u01/nfs/xxxxx_mongodb/10.1.1.1/20230913/likingtest/resOutRelDataSnapshot.metadata.json
2023-09-13T21:59:55.520+0800 reading metadata for likingtest.snapshotResource from /u01/nfs/xxxxx_mongodb/10.1.1.1/20230913/likingtest/snapshotResource.metadata.json
2023-09-13T21:59:55.524+0800 reading metadata for likingtest.oprceDataObjDraft from /u01/nfs/xxxxx_mongodb/10.1.1.1/20230913/likingtest/oprceDataObjDraft.metadata.json
2023-09-13T21:59:55.526+0800 reading metadata for likingtest.oprceDataObjInit from /u01/nfs/xxxxx_mongodb/10.1.1.1/20230913/likingtest/oprceDataObjInit.metadata.json
2023-09-13T21:59:55.761+0800 restoring likingtest.snapshotResource from /u01/nfs/xxxxx_mongodb/10.1.1.1/20230913/likingtest/snapshotResource.bson
...
2023-09-13T22:00:01.451+0800 [........................] likingtest.oprceDataObj 408MB/1205GB (0.0%)
...
2023-09-13T21:59:58.323+0800 finished restoring likingtest.oprceDataObjDraft (1559 documents, 0 failures)
2023-09-13T22:00:01.034+0800 finished restoring likingtest.resOutRelDataSnapshot (34426 documents, 0 failures)
2023-09-13T22:00:01.559+0800 finished restoring likingtest.history_task (3629 documents, 0 failures)
2023-09-13T22:00:02.086+0800 finished restoring likingtest.activityConfiguration (974 documents, 0 failures)
2023-09-13T22:00:02.293+0800 finished restoring likingtest.oprceConfiguration (162 documents, 0 failures)
2023-09-13T22:00:02.529+0800 finished restoring likingtest.oprcesDataObjInit (4 documents, 0 failures)
2023-09-13T22:00:02.857+0800 finished restoring likingtest.role (10 documents, 0 failures)
2023-09-13T22:00:29.153+0800 [########################] likingtest.snapshotResource 2.04GB/2.04GB (100.0%)
2023-09-13T22:00:29.155+0800 finished restoring likingtest.snapshotResource (50320 documents, 0 failures)
...
2023-09-14T00:18:58.451+0800 [############............] likingtest.oprceDataObj 651GB/1205GB (54.0%)
2023-09-14T00:18:59.857+0800 [########################] likingtest.oprceDataObjInit 635GB/635GB (100.0%)
2023-09-14T00:18:59.888+0800 finished restoring likingtest.oprceDataObjInit (43776648 documents, 0 failures)
...
2023-09-14T02:05:58.904+0800 [########################] likingtest.oprceDataObj 1205GB/1205GB (100.0%)
2023-09-14T02:05:58.937+0800 finished restoring likingtest.oprceDataObj (53311330 documents, 0 failures)
2023-09-14T02:05:58.945+0800 no indexes to restore for collection likingtest.activityConfiguration
2023-09-14T02:05:58.945+0800 no indexes to restore for collection likingtest.history_task
2023-09-14T02:05:58.945+0800 restoring indexes for collection likingtest.oprcesDataObjInit from metadata
2023-09-14T02:05:58.976+0800 index: &idx.IndexDocument{Options:primitive.M{"name":"flowId_1_activityConfiguration.activityNameEn_1", "ns":"likingtest.oprcesDataObjInit", "v":2}, Key:primitive.D{primitive.E{Key:"flowId", Value:1}, primitive.E{Key:"activityConfiguration.activityNameEn", Value:1}}, PartialFilterExpression:primitive.D(nil)}
2023-09-14T02:05:58.976+0800 index: &idx.IndexDocument{Options:primitive.M{"name":"oprceInfo.oprceInstID_1_activityInfo.activityInstID_1_workitemInfo.workItemID_1", "ns":"likingtest.oprcesDataObjInit", "v":2}, Key:primitive.D{primitive.E{Key:"oprceInfo.oprceInstID", Value:1}, primitive.E{Key:"activityInfo.activityInstID", Value:1}, primitive.E{Key:"workitemInfo.workItemID", Value:1}}, PartialFilterExpression:primitive.D(nil)}
2023-09-14T02:05:58.976+0800 no indexes to restore for collection likingtest.role
2023-09-14T02:05:58.976+0800 no indexes to restore for collection likingtest.snapshotResource
2023-09-14T02:05:58.976+0800 no indexes to restore for collection likingtest.oprceDataObjDraft
2023-09-14T02:05:58.976+0800 restoring indexes for collection likingtest.oprceDataObjInit from metadata
2023-09-14T02:05:58.976+0800 index: &idx.IndexDocument{Options:primitive.M{"name":"oprceInfo.oprceInstID_1_activityInfo.activityInstID_1_workitemInfo.workItemID_1", "ns":"likingtest.oprceDataObjInit", "v":2}, Key:primitive.D{primitive.E{Key:"oprceInfo.oprceInstID", Value:1}, primitive.E{Key:"activityInfo.activityInstID", Value:1}, primitive.E{Key:"workitemInfo.workItemID", Value:1}}, PartialFilterExpression:primitive.D(nil)}
2023-09-14T02:05:58.976+0800 index: &idx.IndexDocument{Options:primitive.M{"name":"flowNo_1", "ns":"likingtest.oprceDataObjInit", "v":2}, Key:primitive.D{primitive.E{Key:"flowNo", Value:1}}, PartialFilterExpression:primitive.D(nil)}
2023-09-14T02:05:58.976+0800 no indexes to restore for collection likingtest.oprceConfiguration
2023-09-14T02:05:58.976+0800 no indexes to restore for collection likingtest.resOutRelDataSnapshot
2023-09-14T02:05:58.976+0800 restoring indexes for collection likingtest.oprceDataObj from metadata
2023-09-14T02:05:58.976+0800 index: &idx.IndexDocument{Options:primitive.M{"name":"flowId_1_activityConfiguration.activityNameEn_1", "ns":"likingtest.oprceDataObj", "v":2}, Key:primitive.D{primitive.E{Key:"flowId", Value:1}, primitive.E{Key:"activityConfiguration.activityNameEn",Value:1}}, PartialFilterExpression:primitive.D(nil)}
2023-09-14T02:05:58.976+0800 index: &idx.IndexDocument{Options:primitive.M{"name":"flowNo_1", "ns":"likingtest.oprceDataObj", "v":2}, Key:primitive.D{primitive.E{Key:"flowNo", Value:1}}, PartialFilterExpression:primitive.D(nil)}
2023-09-14T02:05:58.976+0800 index: &idx.IndexDocument{Options:primitive.M{"name":"oprceInfo.oprceInstID_1_activityInfo.activityInstID_1_workitemInfo.workItemID_1", "ns":"likingtest.oprceDataObj", "v":2}, Key:primitive.D{primitive.E{Key:"oprceInfo.oprceInstID", Value:1}, primitive.E{Key:"activityInfo.activityInstID", Value:1}, primitive.E{Key:"workitemInfo.workItemID", Value:1}}, PartialFilterExpression:primitive.D(nil)}
2023-09-14T02:05:58.976+0800 index: &idx.IndexDocument{Options:primitive.M{"name":"flowId_1_activityConfiguration.activityNameEn_1", "ns":"likingtest.oprceDataObjInit", "v":2}, Key:primitive.D{primitive.E{Key:"flowId", Value:1}, primitive.E{Key:"activityConfiguration.activityNameEn", Value:1}}, PartialFilterExpression:primitive.D(nil)}
2023-09-14T03:45:47.152+0800 97179062 document(s) restored successfully. 0 document(s) failed to restore.

Observations:

1. With the concurrency option --numInsertionWorkersPerCollection=4 and the validation-skipping option --bypassDocumentValidation, restore speed improves dramatically: the 1.2T collection oprceDataObj, which took about 12h with the default restore settings, now takes 4h.

2. Indexes are restored only after all the data has been loaded, and the index restore itself still takes a while, 1h40m in this run. [Note: it did not actually succeed; the indexes never took effect.]

3. In newer tool versions the -d / -c options should be replaced across the board with --nsInclude / --nsFrom= / --nsTo=.
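The deprecation warning at the top of the log names the replacement syntax. A minimal sketch of the same restore expressed with --nsInclude (the helper name build_restore_cmd is ours; everything else follows the command above):

```shell
# Assemble the mongorestore invocation with --nsInclude instead of the
# deprecated -d. Note: with --nsInclude the trailing directory must be
# the dump ROOT (the parent of the database directory), not the
# database directory itself.
build_restore_cmd() {
  local db="$1" dumproot="$2" workers="${3:-4}"
  echo "mongorestore --port=20000 -uadmin -p'passwd'" \
       "--authenticationDatabase=admin" \
       "--numInsertionWorkersPerCollection=${workers}" \
       "--bypassDocumentValidation" \
       "--nsInclude=${db}.* ${dumproot}"
}
build_restore_cmd likingtest /u01/nfs/xxxxx_mongodb/10.1.1.1/20230913
```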

■ 2023-09-14T10:40 Test 2: import with 8 concurrent workers

mongorestore --port=20000 -uadmin -p'passwd' --authenticationDatabase=admin --numInsertionWorkersPerCollection=8 --bypassDocumentValidation -d likingtest /u01/nfs/xxxxx_mongodb/10.1.1.1/20230914/likingtest >> 10.2.2.2.log 2>&1 &
tail -100f /u01/nfs/xxxxx_mongodb/10.1.1.1/20230914/10.2.2.2.log
---
2023-09-14T10:40:45.492+0800 The --db and --collection flags are deprecated for this use-case; please use --nsInclude instead, i.e. with --nsInclude=${DATABASE}.${COLLECTION}
...
2023-09-14T10:40:48.493+0800 [........................] likingtest.oprceDataObj 112MB/1208GB (0.0%)
...
2023-09-14T12:57:34.859+0800 [########################] likingtest.oprceDataObj 1208GB/1208GB (100.0%)
2023-09-14T12:57:34.867+0800 finished restoring likingtest.oprceDataObj (53413481 documents, 0 failures)

Observations:

1. With --numInsertionWorkersPerCollection=8 and --bypassDocumentValidation, restore speed improves again: the 1.2T collection oprceDataObj drops from about 12h with the default restore settings to 2h17m.

2. This restore read the backup from NFS on an 8-core VM. At 8 concurrent workers, CPU usage was about 40%, network receive rate around 300MB/s, and local disk write rate roughly 30-200MB/s, so network bandwidth was not the bottleneck. With a better-provisioned host, in particular faster disks, the restore time would certainly drop further.

■ 2023-09-14T16:10 Test 3: import with 12 concurrent workers

[Note] Newer versions of mongorestore deprecate the -d / -c options; they still work but are inflexible, so the new --nsInclude option should be used instead. It took several attempts to discover its constraint: the directory argument must be the root of the dump, i.e. the parent of the database directory, NOT the database directory itself! That is, something like dumpdir/20230914, not dumpdir/20230914/database! This is a huge trap, so beware. Also, that directory must not contain any other unrecognizable files, or the restore will error out.
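This directory-layout pitfall can be guarded against before kicking off a multi-hour restore. A minimal sketch (is_dump_root is our helper, not a tool option): it checks that the path is a dump root, i.e. it contains a <db>/ subdirectory holding .bson files, rather than being the database directory itself:

```shell
# Return 0 if "$1" is a dump ROOT for database "$2": it must contain a
# subdirectory named after the database, holding at least one .bson file.
is_dump_root() {
  local root="$1" db="$2"
  [ -d "${root}/${db}" ] || return 1
  set -- "${root}/${db}"/*.bson   # glob the dumped collections
  [ -e "$1" ]                     # unmatched glob stays literal -> fails
}
# Example:
#   is_dump_root /u01/nfs/xxxxx_mongodb/10.1.1.1/20230914 likingtest && echo OK
```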

mongorestore --port=20000 -uadmin -p'passwd' --authenticationDatabase=admin --numInsertionWorkersPerCollection=12 --bypassDocumentValidation --nsInclude="likingtest.*" /u01/nfs/xxxxx_mongodb/10.1.1.1/20230914 > 20230914.10.2.2.2-3.log 2>&1 &
tail -100f /u01/nfs/xxxxx_mongodb/10.1.1.1/20230914.10.2.2.2-3.log
---
2023-09-14T16:10:19.245+0800 preparing collections to restore from
...
2023-09-14T18:18:18.996+0800 [########################] likingtest.oprceDataObj 1208GB/1208GB (100.0%)
2023-09-14T18:18:19.014+0800 finished restoring likingtest.oprceDataObj (53413481 documents, 0 failures)

Observations:

1. Raising concurrency from 8 to 12 brought no further speedup; the conclusion is that 6-8 workers is enough, much like the common Oracle best practice of setting parallel import to about 6.

2. This restore also read the backup from NFS on an 8-core VM. At 12 concurrent workers, CPU usage was about 60%, network receive rate around 300MB/s, and local disk write rate roughly 30-500MB/s, so again network bandwidth was not the bottleneck. With a better-provisioned host, in particular faster disks, the restore time would certainly drop further.

3. As for index restore: the data is loaded first and the indexes are created at the very end, and index builds on the larger collections still take considerable time:

      currentOpTime: '2023-09-14T20:23:59.435+08:00',
...
command: {
createIndexes: 'oprceDataObj',
indexes: [
{
key: { flowId: 1, 'activityConfiguration.activityNameEn': 1 },
name: 'flowId_1_activityConfiguration.activityNameEn_1',
ns: 'likingtest.oprceDataObj'
},
{
key: { flowNo: 1 },
name: 'flowNo_1',
ns: 'likingtest.oprceDataObj'
},
{
key: {
'oprceInfo.oprceInstID': 1,
'activityInfo.activityInstID': 1,
'workitemInfo.workItemID': 1
},
name: 'oprceInfo.oprceInstID_1_activityInfo.activityInstID_1_workitemInfo.workItemID_1',
ns: 'likingtest.oprceDataObj'
}
],
.....
currentOpTime: '2023-09-14T20:23:59.489+08:00',
...
command: {
createIndexes: 'oprcesDataObjInit',
indexes: [
{
key: { flowId: 1, 'activityConfiguration.activityNameEn': 1 },
name: 'flowId_1_activityConfiguration.activityNameEn_1',
ns: 'likingtest.oprcesDataObjInit'
},
{
key: {
'oprceInfo.oprceInstID': 1,
'activityInfo.activityInstID': 1,
'workitemInfo.workItemID': 1
},
name: 'oprceInfo.oprceInstID_1_activityInfo.activityInstID_1_workitemInfo.workItemID_1',
ns: 'likingtest.oprcesDataObjInit'
}
],
......Checked again the next day: index creation still not finished:
currentOpTime: '2023-09-15T09:16:16.460+08:00',
effectiveUsers: [ { user: 'admin', db: 'admin' } ],
runBy: [ { user: '__system', db: 'local' } ],
threaded: true,
opid: 'shard1:11312917',
lsid: {
id: new UUID("e78379ff-9664-46b1-9e87-2bdd4abc5c5f"),
uid: Binary.createFromBase64("O0CMtIVItQN4IsEOsJdrPL8s7jv5xwh5a/A5Qfvs2A8=", 0)
},
secs_running: Long("53877"),
microsecs_running: Long("53877330742"),
op: 'command',
ns: 'likingtest.oprcesDataObjInit',
redacted: false,
command: {
createIndexes: 'oprcesDataObjInit',
......A full 24h in, index creation still not finished:
currentOpTime: '2023-09-15T18:55:16.877+08:00',
effectiveUsers: [ { user: 'admin', db: 'admin' } ],
runBy: [ { user: '__system', db: 'local' } ],
threaded: true,
opid: 'shard1:11312917',
lsid: {
id: new UUID("e78379ff-9664-46b1-9e87-2bdd4abc5c5f"),
uid: Binary.createFromBase64("O0CMtIVItQN4IsEOsJdrPL8s7jv5xwh5a/A5Qfvs2A8=", 0)
},
secs_running: Long("88617"),
microsecs_running: Long("88617747875"),
op: 'command',
ns: 'likingtest.oprcesDataObjInit',
redacted: false,
command: {
createIndexes: 'oprcesDataObjInit',
indexes: [
{
key: { flowId: 1, 'activityConfiguration.activityNameEn': 1 },
name: 'flowId_1_activityConfiguration.activityNameEn_1',
ns: 'likingtest.oprcesDataObjInit'
},

From the above, the data-loading side of mongorestore is basically under control and acceptable, at least for a 1.2T collection. The final index creation, however, is far too slow, and no good in-restore fix was found: the indexes need to be built with real parallelism, and verified to have taken effect; in this run the index creation ultimately never took effect.

■ 2023-09-15T19:02 Test 4: import with 10 concurrent workers, skipping index restore

mongorestore --port=20000 -uadmin -p'passwd' --authenticationDatabase=admin --numInsertionWorkersPerCollection=10 --bypassDocumentValidation --nsInclude="likingtest.*" --nsFrom="likingtest.*" --nsTo="likingtest.*" --noIndexRestore /u01/nfs/xxxxx_mongodb/10.1.1.1/20230914 > 20230914.10.2.2.2-4.log 2>&1 &
tail -100f /u01/nfs/xxxxx_mongodb/10.1.1.1/20230914.10.2.2.2-4.log
2023-09-15T19:02:59.747+0800 preparing collections to restore from
...
2023-09-15T21:24:36.145+0800 [########################] likingtest.oprceDataObj 1208GB/1208GB (100.0%)
2023-09-15T21:24:36.161+0800 finished restoring likingtest.oprceDataObj (53413481 documents, 0 failures)
2023-09-15T21:24:36.165+0800 97367732 document(s) restored successfully. 0 document(s) failed to restore.

From the log above, elapsed time: 2h22m.
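For reference, the log timestamps above also give the effective restore rate for the 1.2T collection; a quick back-of-the-envelope calculation (GNU date assumed):

```shell
# Elapsed time and throughput for likingtest.oprceDataObj in test 4:
# the restore ran from 19:02:59 to 21:24:36 and moved 1208GB of bson data.
start=$(date -d '2023-09-15 19:02:59' +%s)
end=$(date -d '2023-09-15 21:24:36' +%s)
secs=$((end - start))
echo "elapsed:    ${secs}s"                        # 8497s ~= 2h22m
echo "throughput: $((1208 * 1024 / secs)) MB/s"    # ~145 MB/s
```

That ~145 MB/s sits inside the observed 30-500MB/s disk-write range, consistent with disk I/O, not the network, being the limiting factor.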

Conclusions

1. When restoring, use per-collection concurrency for large collections: --numInsertionWorkersPerCollection=8

2. Skip index restore during the data load: --noIndexRestore

3. After the data is in, create the indexes in the background (see this site's post on rebuilding MongoDB indexes)
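For the last step, one approach is to replay the index definitions recorded in the restore log (or in the *.metadata.json files of the dump) against the cluster once the data is in. A minimal sketch under our own conventions: generating the mongosh script as text first lets it be reviewed before touching the cluster, and since MongoDB 4.2 createIndexes uses an optimized build that no longer locks out the collection for the whole build, so the old {background: true} option is unnecessary:

```shell
# Emit a mongosh script recreating the indexes listed in the restore
# log above for the two big collections.
make_index_script() {
  cat <<'EOF'
db = db.getSiblingDB('likingtest');
db.oprceDataObj.createIndexes([
  { flowId: 1, 'activityConfiguration.activityNameEn': 1 },
  { flowNo: 1 },
  { 'oprceInfo.oprceInstID': 1,
    'activityInfo.activityInstID': 1,
    'workitemInfo.workItemID': 1 }
]);
db.oprceDataObjInit.createIndexes([
  { flowId: 1, 'activityConfiguration.activityNameEn': 1 },
  { flowNo: 1 },
  { 'oprceInfo.oprceInstID': 1,
    'activityInfo.activityInstID': 1,
    'workitemInfo.workItemID': 1 }
]);
EOF
}
# Run against the cluster (connection details as in the restore commands):
#   make_index_script | mongosh --port 20000 -u admin -p 'passwd' \
#       --authenticationDatabase admin
```

Afterwards, verify with db.collection.getIndexes() that every index actually took effect, which is exactly what went wrong silently in tests 1-3.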
