转载一片mongodb 迁移pg 数据库的文章

The Guardian migrated their CMS's datastore in 2018 from a self-managed MongoDBcluster to PostgreSQL on Amazon RDS for a fully managed solution. The team did an API-based migration without any downtime.

Guardian’s in-house CMS - called Composer - which stores articles, blog content, photo galleries and video was originally built on top of MongoDB as a datastore. This was preceded by a vendor software backed by an Oracle database. This setup had downtimes whenever the schema had to be migrated. As an alternative, the team looked at various NoSQL dbs, and one of the key reasons for choosing MongoDB seems to have been flexibility. Originally hosted on their own datacenter, they moved their MongoDB to their AWS servers after an outage. The installation and management scripts had to be handwritten by Guardian’s team. They opted for a support contract and bought the OpsManager tool, which is a frontend application for managing MongoDB. However, the team did not go for MongoDB's Atlas offering, which is a "fully managed database", for reasons which are unclear. OpsManager does not manage deployments.

After moving to AWS, the team faced two MongoDB outages. Some of them the reasons were basic system administration issues, like not allowing NTP to access time servers to keep clocks in sync. Others pertained to the difficulty of managing OpsManager itself and obtaining timely support from the vendor, according to the article. The team felt that moving to a solution which had minimal database management would suit them best.

The team chose PostgreSQL due to its maturity and support for the jsonb data type, as a hosted database on Amazon’s RDS. The jsonb type allows for indexing of fields inside the JSON object. The migration plan was to write a new API over Postgres and use a proxy that would send traffic to both APIs to keep them in sync for new, incoming data. The existing data would be migrated using the APIs, and then the proxy would switch to the new API. Their previous migration from Oracle was also done using a similar approach. The migration script logs were pushed to Elasticsearch so that the migration could be tracked. In the process, they also improved their structured logging.

The proxy directed all traffic to the MongoDB API in real time, and asynchronously to the Postgres API . Any difference in the responses was logged and analyzed. To ensure that the new API and backend can hold up to production traffic, GoReplay processes were run to generate traffic. GoReplay can capture traffic and replay it against a different environment - in this case, the pre-production one. A complete migration was done on the pre-production environment. The final step in the production migration was to switch the DNS name from the proxy's endpoint (an Amazon ELB) to the Postgres API (another ELB). This allowed their clients to function without any change. Post-migration, their integration tests failed as they had not been migrated to the new API.

Other organizations have moved from MongoDB to PostgreSQL for a variety of reasons.

The Guardian’s Migration from MongoDB to PostgreSQL on Amazon RDS的更多相关文章

  1. MongoDB与PostgresQL无责任初步测试

    PostgresQL一秒能插入多少条记录,MongoDB呢?读取的情况又如何?我写了一些简单的程序,得出了一些简单的数据,贴在这里分享,继续往下阅读前请注意下本文标题中的“无责任”,这表示此测试结果不 ...

  2. data type Migration from MySQL to PostgreSQL

    MySQL          PostgreSQL tinyint           smallint smallint          smallint mediumint      integ ...

  3. python脚本 mongodb到postgresql

    安装 mongo模块 pip install pymongo 安装postgresql 驱动 pip install python-psycopg2  1 # -*- coding: utf-8 -* ...

  4. 快速掌握Flyway

    什么是Flyway? Flyway is an open-source database migration tool. It strongly favors simplicity and conve ...

  5. Tungsten Replicator学习总结

    之前基于Tungsten Replicator实现了内部使用的分布式数据库的数据迁移工具,此文为当时调研Tungsten Replicator时的学习心得,创建于2015.7.22. 1 概述 1.1 ...

  6. Flyway Overview and Installation

    https://flywaydb.org/documentation/ Flyway is an open-source database migration tool. It strongly fa ...

  7. 快速掌握和使用Flyway

    什么是Flyway? 转载:https://blog.waterstrong.me/flyway-in-practice/ Flyway is an open-source database migr ...

  8. Flyway学习笔记

    Flyway做为database migration开源工具,功能上像是git.svn这种代码版本控制.google搜索database migration,或者针对性更强些搜索database mi ...

  9. NoSQL数据库介绍(2)

    2 NoSQL潮流      在这一章中,将一起讨论NoSQL潮流的动机和主要驱动力.以及NoSQL主张的批评和反馈.本章将通过不同的尝试得出结论来分类和描写叙述NoSQL数据库.当中一个分类法将在随 ...

随机推荐

  1. 深入理解java虚拟机---对象的创建过程(八)

    1.对象的创建过程 由于类的加载是一个很复杂的过程,所以这里暂时略过,后面会详细讲解,默认为是已加载过的类.着重强调对象的创建过程. 注意: 最后一步的init方法是代码块和构造方法. 以上是总图,下 ...

  2. delete和delete[] 区别

    // DeleteAndDelete[].cpp : 定义控制台应用程序的入口点. // #include "stdafx.h" #include <Windows.h> ...

  3. hdu3861 强连通分量缩点+二分图最最小路径覆盖

    The King’s Problem Time Limit: 2000/1000 MS (Java/Others)    Memory Limit: 65536/32768 K (Java/Other ...

  4. 20165326 java第三周学习笔记

    纸质学习笔记 代码托管

  5. python 元组攻略

    1.元组中只包含一个元素时,需要在元素后面添加逗号来消除歧义 tup1=(50,) 2.元组中的元素值使不允许修改的,但可以对元组进行连接组合复制代码 1 tup1=(12,34.56)2 tup2= ...

  6. Microsoft Project 常用快捷键

    任务升级 : ALT  +  SHIFT + 向左键 任务降级: ALT  +  SHIFT + 向右键 滚动到表头(第一个任务):Ctrl + HOME 滚动到表尾(最后一个任务):Ctrl + E ...

  7. [BUG]数据库日期格式, 到页面是毫秒值

    springboot 配置文件

  8. JAVA数组与List相互转换

    1.数组转成List 数组转成List可以用方法 :Arrays.asList,一起来了解一下 System.out.println(Arrays.asList(new String[] { &quo ...

  9. C和C++内存模型

    以下内容,大部分整理自网络 C分为四个区:堆,栈,静态全局变量区,常量区 C++内存分为5个区域(堆栈全常代 ): 堆 heap : 由new分配的内存块,其释放编译器不去管,由我们程序自己控制(一个 ...

  10. ubuntu: firefox+flashplay

    更新两步: 1.安装firefox:rm-->下载-->mv-->ln http://www.cnblogs.com/yzsatcnblogs/p/4266985.html 2. f ...