From http://simongui.github.io/2016/12/02/improving-cache-consistency.html

A typical web application introduces an in-memory cache like memcache or redis to reduce load on the primary database for reads requesting hot data. The most primitive design looks something like Figure 1.

+--------------------------------+        +------------+        +----------------+
|            database            <--------+ web server +-------->     cache      |
| mssql, mysql, oracle, postgres |        +------------+        | memcache/redis |
+--------------------------------+                              +----------------+

Figure 1

Unfortunately this design is really common despite the many issues it introduces. I’ve seen organizations with large-scale applications still using this design while maintaining a bunch of hacks to overcome these issues. Those hacks increase the system’s operational complexity, and the underlying problems sometimes surface as inconsistent data shown to end users.
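As a rough illustration of the read path in Figure 1, here is a minimal cache-aside sketch. `DictCache`, `read_through` and `load_from_db` are hypothetical names used only for this example; the dict stands in for memcache or redis and a plain function stands in for the database query.

```python
from typing import Callable, Dict, List, Optional


class DictCache:
    """Stand-in for memcache/redis: a dict with get/set."""

    def __init__(self) -> None:
        self.data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self.data.get(key)

    def set(self, key: str, value: str) -> None:
        self.data[key] = value


def read_through(key: str, cache: DictCache,
                 load_from_db: Callable[[str], str]) -> str:
    """Serve hot reads from the cache, falling back to the database on a miss."""
    cached = cache.get(key)
    if cached is not None:
        return cached              # cache hit: the database is not touched
    value = load_from_db(key)      # cache miss: read the primary database
    cache.set(key, value)          # populate the cache for later readers
    return value


# Usage: record how often the "database" is actually queried.
db_reads: List[str] = []


def load_from_db(key: str) -> str:
    db_reads.append(key)
    return f"row-for-{key}"


cache = DictCache()
first = read_through("user:1", cache, load_from_db)
second = read_through("user:1", cache, load_from_db)
```

The second read is answered from the cache, so the database sees a single query for the hot key.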

Issue 1. Pool of connections to the cache services per web server instance

In a large application, sometimes thousands of web server instances (especially in slower languages like Ruby) host the web application. Each one has to maintain connections to the infrastructure the web application code communicates with directly. This can include primary databases like MSSQL, MySQL, Oracle, Postgres and cache services like Memcache or Redis. Each web server instance would, for example, have a pool of connections for each database or cache service instance it communicates with.

         +------------------------------------------------------------------------+
         |                database (mssql, mysql, oracle, postgres)               |
         +----^--^-----------^--^-----------^--^-----------^--^-----------^--^----+
              |  |           |  |           |  |           |  |           |  |
N connections |  |           |  |           |  |           |  |           |  |
              |  |           |  |           |  |           |  |           |  |
         +------------+ +------------+ +------------+ +------------+ +------------+
         | web server | | web server | | web server | | web server | | web server |
         +------------+ +------------+ +------------+ +------------+ +------------+
              |  |           |  |           |  |           |  |           |  |
N connections |  |           |  |           |  |           |  |           |  |
              |  |           |  |           |  |           |  |           |  |
         +----v--v-----------v--v-----------v--v-----------v--v-----------v--v----+
         |                         cache (memcache, redis)                        |
         +------------------------------------------------------------------------+

Figure 2

This can be a strain on resources on the web server but, more importantly, on the database or cache service, as shown in Figure 2. This is why I included a 16,384 connection benchmark in my benchmarks of Redis server libraries for Go to see how they scaled. It’s not uncommon to see 10,000 or 20,000 connections to a Memcache or Redis server in a large system designed like this.
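To see how quickly the numbers add up, here is a small back-of-the-envelope sketch. The fleet sizes and pool sizes are hypothetical values picked only to illustrate the scale; `total_connections` is a helper name invented for this example.

```python
def total_connections(web_instances: int, pool_size: int) -> int:
    """Connections one backend sees if every web instance holds a full pool."""
    return web_instances * pool_size


# Hypothetical fleets: (number of web server instances, connections per pool).
for instances, pool in [(500, 10), (2_000, 8), (4_000, 5)]:
    print(instances, "instances x", pool, "per pool =",
          total_connections(instances, pool), "connections")
```

Even a modest pool per instance lands in the tens of thousands of connections on a single cache or database server once the fleet reaches a few thousand instances.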

Issue 2. Many web app requests have to execute cache set operations

Similar to how an HTTP request may issue multiple SQL INSERT or UPDATE statements, multiple SET operations may be issued against the cache service. Even though these can be done asynchronously, they still consume resources on the web server, and it would be great if the web servers only had to be concerned with updating the primary database.

Issue 3. No fault tolerance. Data loss if cache set operations fail

In a web application designed like Figure 2, the typical sequence of operations is as follows.

1. Update the primary database (MSSQL, MySQL, Oracle, Postgres, etc).
2. If the transaction fails, return an HTTP error.
3. If the transaction succeeds, send SET operations to the cache server(s) (memcache, redis, etc).

Any SET operation could fail even after retrying, which leaves the cache service(s) inconsistent with the primary database and could result in users seeing incorrect information. Even worse, depending on how the application is designed, you could experience partial failures, which result in users seeing partially correct and partially incorrect information after a change and a cache hit.

Some cache service protocols support sending multiple SET operations in one command but some do not. Not all web applications are smart enough to group SET operations that happen in different areas of the code into a single command either. If this is the case you could have partial failures where some of the SET operations succeeded and some failed.

Outside of retrying, there’s not much the web application can do to eventually correct the missing cache SET operations. It has to retry and give up at some point. The cache will keep serving hits that are inconsistent with the primary database until the cache key(s) are invalidated via a TTL or some other process.
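The failure mode described in this section can be sketched in a few lines. Everything here is a simulation under stated assumptions: `FlakyCache` is a hypothetical stand-in whose SET calls fail a fixed number of times, a dict plays the database, and `update` with its `max_retries` parameter is an invented helper, not any particular framework's API.

```python
from typing import Dict


class FlakyCache:
    """Stand-in cache whose next `failures` SET calls raise an error."""

    def __init__(self, failures: int) -> None:
        self.data: Dict[str, str] = {}
        self.failures = failures

    def set(self, key: str, value: str) -> None:
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("cache SET failed")
        self.data[key] = value


def update(db: Dict[str, str], cache: FlakyCache, key: str, value: str,
           max_retries: int = 3) -> bool:
    """Write the database first, then try the cache SET a bounded number of times.

    Returns True if the cache was updated, False if the caller gave up.
    """
    db[key] = value                   # the database commit succeeds
    for _ in range(max_retries):
        try:
            cache.set(key, value)     # SET against the cache
            return True
        except ConnectionError:
            continue                  # retry, but only up to max_retries
    return False                      # gave up: cache and database now differ


db = {"user:1:name": "old"}
cache = FlakyCache(failures=5)        # the outage outlasts the retry budget
cache.data["user:1:name"] = "old"
cache_updated = update(db, cache, "user:1:name", "new")
```

After the call the database holds the new value while the cache still serves the old one, and nothing in the web application will repair that until the key expires or is overwritten.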

Messaging middleware

Sometimes this gets solved by messaging middleware like Kafka, where the web applications push SET operations into Kafka and consumers pull changes from Kafka and execute the SET operations on the cache service(s). This greatly increases cache consistency and allows the caches to survive failures and catch up after short or long failures.
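The catch-up behaviour comes from the consumer remembering how far into the log it has read. The sketch below is not Kafka's API; it assumes a plain Python list standing in for a single ordered log and a hypothetical `Consumer` class that only advances its offset after a SET has been applied.

```python
from typing import Dict, List, Tuple

SetOp = Tuple[str, str]  # (cache key, value)


class Consumer:
    """Applies queued SET operations in order and remembers how far it got."""

    def __init__(self, log: List[SetOp], cache: Dict[str, str]) -> None:
        self.log = log
        self.cache = cache
        self.offset = 0  # index of the next operation to apply

    def poll(self, cache_available: bool) -> int:
        """Apply pending operations; return how many were applied."""
        applied = 0
        while cache_available and self.offset < len(self.log):
            key, value = self.log[self.offset]
            self.cache[key] = value
            self.offset += 1      # advance only after the SET was applied
            applied += 1
        return applied


log: List[SetOp] = []
cache: Dict[str, str] = {}
consumer = Consumer(log, cache)

log.append(("user:1:name", "alice"))
consumer.poll(cache_available=True)

# The cache goes down; web servers keep appending to the log.
log.append(("user:1:name", "alicia"))
log.append(("user:2:name", "bob"))
during_outage = consumer.poll(cache_available=False)

# The cache comes back and the consumer catches up from its saved offset.
after_recovery = consumer.poll(cache_available=True)
```

Nothing is lost during the outage because the operations stay in the log until the consumer is able to apply them.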

This introduces latency in the system. Changes may not be visible to users right away. Some web applications solve this by using sticky sessions and caching in-memory in the web application to hide that data is inconsistent. Stale results are still possible if the web server fails and requests route to a different web server instance. This also introduces complexity in the request routing tier of the system.

         +------------------------------------------------------------------------+
         |                database (mssql, mysql, oracle, postgres)               |
         +----^--^-----------^--^-----------^--^-----------^--^-----------^--^----+
              |  |           |  |           |  |           |  |           |  |
N connections |  |           |  |           |  |           |  |           |  |
              |  |           |  |           |  |           |  |           |  |
         +----+--+----+ +----+--+----+ +----+--+----+ +----+--+----+ +----+--+----+
         | web server | | web server | | web server | | web server | | web server |
         +----+--+----+ +----+--+----+ +----+--+----+ +----+--+----+ +----+--+----+
              |  |           |  |           |  |           |  |           |  |
N connections |  |           |  |           |  |           |  |           |  |
              |  |           |  |           |  |           |  |           |  |
         +----v--v-----------v--v-----------v--v-----------v--v-----------v--v----+
         |                    message queue (kafka, rabbitmq)                     |
         +----------------------------------^--^----------------------------------+
                                            |  |
                              N connections |  |
                                            |  |
                                     +------+--+------+
                                     | kafka consumer |
                                     +------+--+------+
                                            |  |
                              N connections |  |
                                            |  |
         +----------------------------------v--v----------------------------------+
         |                         cache (memcache, redis)                        |
         +------------------------------------------------------------------------+

Figure 3

As shown in Figure 3, this greatly reduces the connection load on the cache service but introduces a lot of operational complexity, such as the following.

Deploy and operate a high throughput messaging system like Kafka with multiple brokers to survive broker failures.
Deploy and operate multiple consumer processes that consume messages in Kafka and execute SET operations to the cache service(s) to survive consumer failures.

Issue 4. No sequential consistency with the primary database

Leslie Lamport describes sequential consistency as follows.

The result of any execution is the same as if the operations of all the processors were executed in some sequential order, and the operations of each individual processor appear in this sequence in the order specified by its program.

Figure 3 greatly improves fault tolerance and reduces the chances of losing an update, but it does not address order. Issue 3 describes the possibility of complete and partial failures and explains how a user could see partially up-to-date and partially stale results. Diving deeper, an operation could fail while the operations that follow it succeed, so changes could become visible out of order. Some applications may be more sensitive to this kind of inconsistency. Some applications may require strict partial order. Even if order isn’t critical, providing sequential consistency is a better experience for users and less confusing.
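A small simulation makes the ordering problem concrete. It assumes each write carries the database's commit sequence number, which is a simplification introduced for this example; the function names and the `cart:7:total` key are hypothetical.

```python
from typing import Dict, Iterable, Tuple

Write = Tuple[int, str, str]  # (commit sequence number, cache key, value)


def apply_in_arrival_order(writes: Iterable[Write]) -> Dict[str, str]:
    """Apply SETs as they arrive, ignoring commit order."""
    cache: Dict[str, str] = {}
    for _, key, value in writes:
        cache[key] = value
    return cache


def apply_in_commit_order(writes: Iterable[Write]) -> Dict[str, str]:
    """Apply SETs sorted by the database's commit sequence number."""
    return apply_in_arrival_order(sorted(writes, key=lambda w: w[0]))


# Two commits to the same key. The first SET failed once and was retried,
# so it reaches the cache after the second.
arrivals = [(2, "cart:7:total", "30"), (1, "cart:7:total", "10")]
stale = apply_in_arrival_order(arrivals)
correct = apply_in_commit_order(arrivals)
```

Applied in arrival order the cache ends up with the older value "10", even though the database's last committed value is "30". Applying the same writes in commit order gives the right answer.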

Solution: MySQL binlog replication

Figure 3 shows the benefits of a shared message queue; however, deploying one with fault tolerance is not trivial, and operating one smoothly isn’t trivial either. If you use a database with replication, there’s already a queue in your system, and you may not need to deploy yet another queue and new piece of infrastructure like Kafka to solve some of these problems.

+----------+---+---+---+---+---+   binlog replication   +--------------------------+
| MySQL    | 1 | 2 | 3 | 4 | 5 <------------------------+ MySQL replication client |
+----------+---+---+---+---+---+                        +--------------------------+
              MySQL binlog
            binlog positions

Figure 4

MySQL has a binlog replication protocol which is used for primary/secondary replication. This is essentially a replicated queue that has all the transactions recorded in-order as shown in Figure 4.

This isn’t a popular solution but I say, why not? It works very well. You can write an application that speaks the MySQL binlog replication protocol, consumes the binlog entries and executes SET operations against the cache service(s). There are two ways you could consume the binlog data.

1. Interpret the raw SQL syntax and issue SET operations.
2. The web application embeds cache keys as a comment in the SQL.

Both of these options are good because you can even get the transaction scope of each transaction in the binlog statements if you need to and if the target system supports atomic multi-set operations. I prefer the second option because it’s easier to parse and the application already has this information in most cases.
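Here is a minimal sketch of what parsing embedded cache keys could look like. The `/* cache-keys: ... */` comment format is a hypothetical convention invented for this example, not something MySQL or the original project defines, and it assumes the comment survives into the statement text recorded in the binlog.

```python
import re
from typing import List

# Hypothetical convention: the web application prefixes each statement with
# a comment such as /* cache-keys: user:42, user:42:orders */
CACHE_KEYS = re.compile(r"/\*\s*cache-keys:\s*(.*?)\s*\*/", re.DOTALL)


def extract_cache_keys(sql: str) -> List[str]:
    """Return the cache keys named in the statement's comment, if any."""
    match = CACHE_KEYS.search(sql)
    if match is None:
        return []
    return [key.strip() for key in match.group(1).split(",") if key.strip()]


statement = (
    "/* cache-keys: user:42, user:42:orders */ "
    "UPDATE users SET name = 'alice' WHERE id = 42"
)
keys = extract_cache_keys(statement)
```

Compared with interpreting the SQL itself, the replication client only needs one regular expression and never has to understand table or column names.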

         +------------+ +------------+ +------------+ +------------+ +------------+
         | web server | | web server | | web server | | web server | | web server |
         +------------+ +------------+ +------------+ +------------+ +------------+
              |  |           |  |           |  |           |  |           |  |
N connections |  |           |  |           |  |           |  |           |  |
              |  |           |  |           |  |           |  |           |  |
         +----v--v-----------v--v-----------v--v-----------v--v-----------v--v----+
         |                database (mssql, mysql, oracle, postgres)               |
         +------------------------------------^-----------------------------------+
                                              |
                                 1 connection |
                                              |
                               +---------------------------+
                               | binlog replication client |
                               +---------------------------+
                                            |  |
                              N connections |  |
                                            |  |
         +----------------------------------v--v----------------------------------+
         |                         cache (memcache, redis)                        |
         +------------------------------------------------------------------------+

Figure 5

Figure 5 shows the overall architecture with the binlog replication in place.
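The replication client's core loop can be sketched as follows. This is a simulation, not the MySQL replication protocol: `BinlogEvent` and `replicate` are hypothetical names, positions are simplified to the small integers used in Figure 4, and a dict stands in for the cache. A real cache would need an atomic multi-set operation to apply one transaction's keys together.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class BinlogEvent:
    position: int            # binlog position, increasing in commit order
    sets: Dict[str, str]     # cache keys and values changed by the transaction


def replicate(events: Iterable[BinlogEvent], cache: Dict[str, str],
              last_position: int) -> int:
    """Apply events after last_position in order; return the new checkpoint."""
    for event in events:
        if event.position <= last_position:
            continue                     # already applied before a restart
        cache.update(event.sets)         # one transaction's keys applied together
        last_position = event.position   # checkpoint after the write
    return last_position


binlog: List[BinlogEvent] = [
    BinlogEvent(1, {"user:1:name": "alice"}),
    BinlogEvent(2, {"user:1:name": "alicia", "user:1:city": "Oslo"}),
    BinlogEvent(3, {"user:2:name": "bob"}),
]
cache: Dict[str, str] = {}

# The client stops after position 2, then restarts and reads from the beginning.
checkpoint = replicate(binlog[:2], cache, last_position=0)
checkpoint = replicate(binlog, cache, last_position=checkpoint)
```

Because the checkpoint records the last applied position, a restarted client skips what it has already written and continues in commit order.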

Benefits

Drastically reduces connection load on the cache service(s). Web servers only connect to the database.
Sequential consistency because we are reading the database’s commit log into the cache service(s).
Possible to connect to any MySQL replica in the replication chain since they are all sequentially consistent.

I love Kafka and have nothing against it; I use it myself. But reducing infrastructure simplifies the architecture and reduces operational complexity. By replicating the MySQL commit log to the cache service(s) we have increased consistency as well as gained strict partial order between the database and the cache service(s).

I’m currently working on a project in Go that provides this proposed functionality that I’ll announce at a later date. Contact me if you want to know more about it.
