转载一片mongodb 迁移pg 数据库的文章

The Guardian migrated their CMS's datastore in 2018 from a self-managed MongoDBcluster to PostgreSQL on Amazon RDS for a fully managed solution. The team did an API-based migration without any downtime.

Guardian’s in-house CMS - called Composer - which stores articles, blog content, photo galleries and video was originally built on top of MongoDB as a datastore. This was preceded by a vendor software backed by an Oracle database. This setup had downtimes whenever the schema had to be migrated. As an alternative, the team looked at various NoSQL dbs, and one of the key reasons for choosing MongoDB seems to have been flexibility. Originally hosted on their own datacenter, they moved their MongoDB to their AWS servers after an outage. The installation and management scripts had to be handwritten by Guardian’s team. They opted for a support contract and bought the OpsManager tool, which is a frontend application for managing MongoDB. However, the team did not go for MongoDB's Atlas offering, which is a "fully managed database", for reasons which are unclear. OpsManager does not manage deployments.

After moving to AWS, the team faced two MongoDB outages. Some of them the reasons were basic system administration issues, like not allowing NTP to access time servers to keep clocks in sync. Others pertained to the difficulty of managing OpsManager itself and obtaining timely support from the vendor, according to the article. The team felt that moving to a solution which had minimal database management would suit them best.

The team chose PostgreSQL due to its maturity and support for the jsonb data type, as a hosted database on Amazon’s RDS. The jsonb type allows for indexing of fields inside the JSON object. The migration plan was to write a new API over Postgres and use a proxy that would send traffic to both APIs to keep them in sync for new, incoming data. The existing data would be migrated using the APIs, and then the proxy would switch to the new API. Their previous migration from Oracle was also done using a similar approach. The migration script logs were pushed to Elasticsearch so that the migration could be tracked. In the process, they also improved their structured logging.

The proxy directed all traffic to the MongoDB API in real time, and asynchronously to the Postgres API . Any difference in the responses was logged and analyzed. To ensure that the new API and backend can hold up to production traffic, GoReplay processes were run to generate traffic. GoReplay can capture traffic and replay it against a different environment - in this case, the pre-production one. A complete migration was done on the pre-production environment. The final step in the production migration was to switch the DNS name from the proxy's endpoint (an Amazon ELB) to the Postgres API (another ELB). This allowed their clients to function without any change. Post-migration, their integration tests failed as they had not been migrated to the new API.

Other organizations have moved from MongoDB to PostgreSQL for a variety of reasons.

The Guardian’s Migration from MongoDB to PostgreSQL on Amazon RDS的更多相关文章

  1. MongoDB与PostgresQL无责任初步测试

    PostgresQL一秒能插入多少条记录,MongoDB呢?读取的情况又如何?我写了一些简单的程序,得出了一些简单的数据,贴在这里分享,继续往下阅读前请注意下本文标题中的“无责任”,这表示此测试结果不 ...

  2. data type Migration from MySQL to PostgreSQL

    MySQL          PostgreSQL tinyint           smallint smallint          smallint mediumint      integ ...

  3. python脚本 mongodb到postgresql

    安装 mongo模块 pip install pymongo 安装postgresql 驱动 pip install python-psycopg2  1 # -*- coding: utf-8 -* ...

  4. 快速掌握Flyway

    什么是Flyway? Flyway is an open-source database migration tool. It strongly favors simplicity and conve ...

  5. Tungsten Replicator学习总结

    之前基于Tungsten Replicator实现了内部使用的分布式数据库的数据迁移工具,此文为当时调研Tungsten Replicator时的学习心得,创建于2015.7.22. 1 概述 1.1 ...

  6. Flyway Overview and Installation

    https://flywaydb.org/documentation/ Flyway is an open-source database migration tool. It strongly fa ...

  7. 快速掌握和使用Flyway

    什么是Flyway? 转载:https://blog.waterstrong.me/flyway-in-practice/ Flyway is an open-source database migr ...

  8. Flyway学习笔记

    Flyway做为database migration开源工具,功能上像是git.svn这种代码版本控制.google搜索database migration,或者针对性更强些搜索database mi ...

  9. NoSQL数据库介绍(2)

    2 NoSQL潮流      在这一章中,将一起讨论NoSQL潮流的动机和主要驱动力.以及NoSQL主张的批评和反馈.本章将通过不同的尝试得出结论来分类和描写叙述NoSQL数据库.当中一个分类法将在随 ...

随机推荐

  1. jdk8--stream并行流

    stream的并行流要理解一个框架如下: 单线程,多线程和并行流对比 package com.atguigu.java8; import java.util.concurrent.ForkJoinPo ...

  2. GIL 相关 和进程池

    #GIL (global interpreter Lock) #全局解释器锁 :锁是为了避免资源竞争造成数据错乱 #当一个py启动后 会先执行主线程中的代码#在以上代码中有启动了子线程 子线程的任务还 ...

  3. day12作业答案

    2.1 # lst=['asdgg','as','drtysr'] # lst2=[i.upper() for i in lst if len(i) >3 ] # print(lst2) # 2 ...

  4. 地理空间数据云--TM影像下载

    实验要用到遥感影像,,TM,,之前是可以在美国USGS上下载的,但是要FQ了,有点麻烦,, 想到之前本科实在地理空间数据云平台下载的,就试了一下以前的账号,完美!,, TM数据很丰富,到2017年的都 ...

  5. shell 批量计算MD5值

    #!/bin/sh #需要计算MD5文件列表 # list=`ls` list="file list" for file in $list do file1=`` echo &qu ...

  6. Spring Boot 揭秘与实战(八) 发布与部署 - 远程调试

    文章目录 1. 依赖 2. 部署 3. 调试 4. 源代码 设置远程调试,可以在正式环境上随时跟踪与调试生产故障. 依赖 在 pom.xml 中增加远程调试依赖. <plugins> &l ...

  7. 第七十四课 图的遍历(BFS)

    广度优先相当于对顶点进行分层,层次遍历. 在Graph.h中添加BFS函数: #ifndef GRAPH_H #define GRAPH_H #include "Object.h" ...

  8. npm 包管理器的使用

    1. 权限问题 Warning "root" does not have permission to access the dev dir · Issue #454 · nodej ...

  9. PHP安全之webshell和后门检测(转)

    基于PHP的应用面临着各种各样的攻击: XSS:对PHP的Web应用而言,跨站脚本是一个易受攻击的点.攻击者可以利用它盗取用户信息.你可以配置Apache,或是写更安全的PHP代码(验证所有用户输入) ...

  10. /dev/i2c-*不见了

    /********************************************************************** * /dev/i2c-*不见了 * 说明: * 能在他的 ...