The Guardian’s Migration from MongoDB to PostgreSQL on Amazon RDS
In 2018, The Guardian migrated its CMS datastore from a self-managed MongoDB cluster to PostgreSQL on Amazon RDS, a fully managed service. The team carried out the migration through its APIs, without any downtime.
The Guardian’s in-house CMS, called Composer, stores articles, blog content, photo galleries and video, and was originally built on top of MongoDB as its datastore. It replaced vendor software backed by an Oracle database, a setup that suffered downtime whenever the schema had to be migrated. As an alternative, the team looked at various NoSQL databases, and one of the key reasons for choosing MongoDB seems to have been its flexibility. The MongoDB cluster was originally hosted in The Guardian's own datacenter, and the team moved it to their AWS servers after an outage. The installation and management scripts had to be handwritten by The Guardian’s team. They opted for a support contract and bought the OpsManager tool, which is a frontend application for managing MongoDB. OpsManager does not manage deployments. The team did not, however, go for MongoDB's Atlas offering, which is a "fully managed database", for reasons which are unclear.
After moving to AWS, the team faced two MongoDB outages. Some of the causes were basic system administration issues, such as NTP not being allowed to reach time servers to keep clocks in sync. Others pertained to the difficulty of managing OpsManager itself and of obtaining timely support from the vendor, according to the team. The team felt that a solution requiring minimal database management would suit them best.
The team chose PostgreSQL, run as a hosted database on Amazon RDS, for its maturity and its support for the jsonb data type. The jsonb type allows fields inside a JSON object to be indexed. The migration plan was to write a new API over Postgres and put a proxy in front of both APIs, sending traffic to each so that new, incoming data stayed in sync. The existing data would be migrated through the APIs, and the proxy would then switch to the new API. Their previous migration from Oracle had used a similar approach. The migration script's logs were pushed to Elasticsearch so that the migration could be tracked. In the process, the team also improved their structured logging.
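The API-based copy of existing data described above can be sketched roughly as follows. This is a minimal illustration, not The Guardian's actual script: `migrate`, `read_old` and `write_new` are hypothetical names, the two callables stand in for HTTP calls to the MongoDB-backed and Postgres-backed APIs, and the JSON log line is only one plausible shape for a structured log entry that a shipper could forward to Elasticsearch.

```python
import json
import logging
from typing import Any, Callable, Dict, Iterable

logger = logging.getLogger("migration")


def migrate(
    ids: Iterable[str],
    read_old: Callable[[str], Any],
    write_new: Callable[[str, Any], None],
) -> Dict[str, int]:
    """Copy each record from the old API to the new one.

    Emits one JSON log line per record so progress and failures can be
    searched later. Both callables are assumptions standing in for
    real API clients.
    """
    counts = {"migrated": 0, "failed": 0}
    for content_id in ids:
        entry = {"event": "content_migration", "id": content_id}
        try:
            write_new(content_id, read_old(content_id))
            entry["status"] = "migrated"
        except Exception as exc:
            # Record the failure and carry on with the remaining records.
            entry["status"] = "failed"
            entry["error"] = repr(exc)
        counts[entry["status"]] += 1
        logger.info(json.dumps(entry))
    return counts


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO, format="%(message)s")
    # In-memory dictionaries play the role of the two backends.
    old_store = {"article-1": {"headline": "Hello"}, "gallery-2": {"images": 3}}
    new_store: Dict[str, Any] = {}
    summary = migrate(
        ["article-1", "gallery-2", "missing-3"],
        old_store.__getitem__,
        new_store.__setitem__,
    )
    print(summary)
```

Logging one structured entry per record, rather than free-form text, is what makes it practical to filter for failed identifiers and re-run only those.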
The proxy sent all traffic to the MongoDB API in real time, and asynchronously to the Postgres API. Any difference between the two responses was logged and analyzed. To ensure that the new API and backend could hold up under production traffic, GoReplay processes were run to generate load. GoReplay can capture traffic and replay it against a different environment, in this case the pre-production one. A complete migration was first carried out on the pre-production environment. The final step in the production migration was to switch the DNS name from the proxy's endpoint (an Amazon ELB) to the Postgres API (another ELB), which allowed clients to keep working without any change. Post-migration, the team's integration tests failed because the tests themselves had not been migrated to the new API.
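The proxy behaviour described above (answer from the primary API, mirror the request to the new API in the background, log any difference) might look something like the sketch below. It is an illustration under stated assumptions: `ShadowProxy`, `handle` and `drain` are hypothetical names, plain callables stand in for the two HTTP APIs, and a thread pool is used here as one simple way to make the secondary call asynchronous.

```python
import logging
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, List, Tuple

logger = logging.getLogger("proxy")


class ShadowProxy:
    """Serve from the primary backend and mirror requests to a secondary one."""

    def __init__(
        self,
        primary: Callable[[Any], Any],
        secondary: Callable[[Any], Any],
        workers: int = 4,
    ) -> None:
        self._primary = primary
        self._secondary = secondary
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._pending: List[Future] = []
        # (request, primary response, secondary response) for each mismatch.
        self.mismatches: List[Tuple[Any, Any, Any]] = []

    def handle(self, request: Any) -> Any:
        # The caller only ever waits for, and sees, the primary response.
        response = self._primary(request)
        self._pending.append(self._pool.submit(self._compare, request, response))
        return response

    def _compare(self, request: Any, expected: Any) -> None:
        try:
            actual = self._secondary(request)
        except Exception as exc:
            # A failing secondary must never affect the caller.
            actual = f"error: {exc!r}"
        if actual != expected:
            self.mismatches.append((request, expected, actual))
            logger.warning(
                "response mismatch for %r: primary=%r secondary=%r",
                request, expected, actual,
            )

    def drain(self) -> None:
        """Block until all background comparisons have finished."""
        for future in self._pending:
            future.result()
        self._pending.clear()


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    proxy = ShadowProxy(
        primary=lambda r: r.upper(),
        secondary=lambda r: "X" if r == "b" else r.upper(),
    )
    print(proxy.handle("a"), proxy.handle("b"))
    proxy.drain()
    print(proxy.mismatches)
```

Because the comparison runs off the request path, a slow or failing secondary only produces log entries and never changes what clients receive, which is what makes this kind of side-by-side check safe to run against live traffic.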
Other organizations have moved from MongoDB to PostgreSQL for a variety of reasons.