[rabbitmq-discuss] Exactly Once Delivery http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2010-August/008272.html

John Apps johndapps at gmail.com
Thu Aug 5 14:00:11 BST 2010


Matthew,
an excellent response, and thank you for it! Yes, difficult it is!

It raises a somewhat philosophical discussion about where the onus is
placed for guaranteeing such things as 'guaranteed once', i.e., on
the client side or on the server side? The JMS standard offers guaranteed
once, whereby the onus is on the server (the JMS implementation) and not on
the client.

What I am trying to say is that, in my opinion, client programs should be as
'simple' as possible, with the servers doing all the hard work. This is what
the JMS standard forces on implementors and, perhaps to a lesser extent
today, so does AMQP.

Note: the word 'server' is horribly overloaded these days. It is used here
to indicate the software with which clients, both producers and consumers,
communicate.

Oh well, off to librabbitMQ and some example programs written in COBOL...

Cheers, John

On Thu, Aug 5, 2010 at 13:22, Matthew Sackman <matthew at rabbitmq.com> wrote:

> Hi Mike,
>
> On Tue, Aug 03, 2010 at 04:43:56AM -0400, Mike Petrusis wrote:
> > In reviewing the mailing list archives, I see various threads which state
> > that ensuring "exactly once" delivery requires deduplication by the
> > consumer. For example the following:
> >
> > "Exactly-once requires coordination between consumers, or idempotency,
> > even when there is just a single queue. The consumer, broker or network
> > may die during the transmission of the ack for a message, thus causing
> > retransmission of the message (which the consumer has already seen and
> > processed) at a later point."
> > http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2009-July/004237.html
> >
> > In the case of competing consumers which pull messages from the same
> > queue, this will require some sort of shared state between consumers to
> > de-duplicate messages (assuming the consumers are not idempotent).
> >
> > Our application is using RabbitMQ to distribute tasks across multiple
> > workers residing on different servers, this adds to the cost of sharing
> > state between the workers.
> >
> > Another message in the email archive mentions that "You can guarantee
> > exactly-once delivery if you use transactions, durable queues and exchanges,
> > and persistent messages, but only as long as any failing node eventually
> > recovers."
>
> All the above is sort of wrong. You can never *guarantee* exactly once
> (there's always some argument about whether receiving message duplicates
> but relying on idempotency is achieving exactly once. I don't feel it
> does, and why should become clearer further on...)
>
> The problem is publishers. If the server on which RabbitMQ is running
> crashes after committing a transaction containing publishes, it's
> possible the commit-ok message may get lost. Thus the publishers still
> think they need to republish, so they wait until the broker comes back up
> and then republish. This can happen an infinite number of times: the
> publishers connect, start a transaction, publish messages, commit the
> transaction and then the commit-ok gets lost and so the publishers
> repeat the process.
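
The retry loop described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: FlakyBroker and publish_until_confirmed are hypothetical names, the broker is a toy in-memory object rather than RabbitMQ or any client library, and a lost commit-ok is modelled as a raised ConnectionError.

```python
class FlakyBroker:
    """Toy in-memory broker (hypothetical, not a RabbitMQ API)."""

    def __init__(self, lost_acks):
        self.queue = []
        self.lost_acks = lost_acks  # how many commit-ok replies get lost

    def commit(self, messages):
        # The commit itself always succeeds: the messages are stored.
        self.queue.extend(messages)
        if self.lost_acks > 0:
            # ...but the commit-ok reply never reaches the publisher.
            self.lost_acks -= 1
            raise ConnectionError("commit-ok lost")
        return "commit-ok"


def publish_until_confirmed(broker, messages):
    """Republish the same batch until a commit-ok is actually received."""
    attempts = 0
    while True:
        attempts += 1
        try:
            broker.commit(messages)
            return attempts
        except ConnectionError:
            # The publisher cannot tell whether the commit happened,
            # so its only safe option is to try again.
            continue


broker = FlakyBroker(lost_acks=2)
attempts = publish_until_confirmed(broker, ["task-1"])
print(attempts, broker.queue)
```

With two lost replies the publisher needs three attempts, and the broker ends up holding three copies of the same message, even though the publisher behaved correctly at every step.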
>
> As a result, on the clients, you need to detect duplicates. Now this is
> really a barrier to making all operations idempotent. The problem is
> that you never know how many copies of a message there will be. Thus you
> never know when it's safe to remove messages from your dedup cache. Now
> things like Redis apparently have the means to delete entries after an
> amount of time, which would at least allow you to avoid the database
> eating up all the RAM in the universe, but there's still the possibility
> that after the entry's been deleted, another duplicate will come along
> which you now won't detect as a duplicate.
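
The weakness of an expiring dedup cache can be shown in a few lines. This is a minimal sketch, assuming an in-memory dictionary in place of Redis or any real store; ExpiringDedupCache, ttl_seconds and the explicit `now` argument are hypothetical names chosen to keep the example deterministic.

```python
class ExpiringDedupCache:
    """In-memory stand-in for a dedup store whose entries expire."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.seen = {}  # msg_id -> time the id was first recorded

    def is_duplicate(self, msg_id, now):
        # Drop entries older than the TTL so memory use stays bounded.
        expired = [m for m, t in self.seen.items() if now - t >= self.ttl]
        for m in expired:
            del self.seen[m]
        if msg_id in self.seen:
            return True
        self.seen[msg_id] = now
        return False


cache = ExpiringDedupCache(ttl_seconds=60)
first = cache.is_duplicate("msg-42", now=0)    # first copy: not a duplicate
second = cache.is_duplicate("msg-42", now=30)  # caught while the entry is live
late = cache.is_duplicate("msg-42", now=120)   # entry expired: duplicate missed
print(first, second, late)
```

The third call reports the message as new, which is exactly the case described above: a copy that arrives after its entry has been evicted is acted on a second time.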
>
> This isn't just a problem with RabbitMQ - in any messaging system, if
> any message can be lost, you cannot achieve exactly once semantics. The
> best you can hope for is a probability of a large number of 9s that you
> will be able to detect all the duplicates. But that's the best you can
> achieve.
>
> Scaling horizontally is thus more tricky because, as you say, you may
> now have multiple consumers which each receive one copy of a message.
> Thus the dedup database would have to be distributed. With high message
> rates, this might well become prohibitive because of the amount of
> network traffic due to transactions between the consumers.
>
> > What's the recommended way to deal with the potential of duplicate
> > messages?
>
> Currently, there is no "recommended" way. If you have a single consumer,
> it's quite easy - something like tokyocabinet should be more than
> sufficiently performant. For multiple consumers, you're currently going
> to have to look at some sort of distributed database.
>
> > Is this a rare enough edge case that most people just ignore it?
>
> No idea. But one way of making your life easier is for the producer to
> send slightly different messages on every republish (they would still
> obviously need to have the same msg id). That way, if you detect a msg
> with "republish count" == 0, then you know it's the first copy, so you
> can insert async into your shared database and then act on the message.
> You only need to do a query on the database whenever you receive a msg
> with "republish count" > 0 - thus you can tune your database for
> inserts and hopefully save some work - the common case will then be the
> first case, and lookups will be exceedingly rare.
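
A consumer following the "republish count" idea might look roughly like this. It is a sketch only: the message layout (a dict with msg_id and republish_count keys), the handle function and the plain Python set standing in for the shared database are all assumptions made for illustration, and the insert here is synchronous rather than async. It also takes the simplest possible action when a republished message has no database entry, which ignores races between consumers.

```python
def handle(message, seen_ids, process):
    """Act on a message unless it is a known duplicate.

    seen_ids stands in for the shared dedup database.
    """
    msg_id = message["msg_id"]
    if message["republish_count"] == 0:
        # First copy: record it without doing a lookup, then act on it.
        seen_ids.add(msg_id)
        process(message)
        return "processed"
    # Republished copy: only now is a lookup needed.
    if msg_id in seen_ids:
        return "dropped"
    # No entry found. This sketch simply records and processes the message.
    seen_ids.add(msg_id)
    process(message)
    return "processed"


seen = set()
handled = []
results = [
    handle({"msg_id": "a", "republish_count": 0}, seen, handled.append),
    handle({"msg_id": "a", "republish_count": 1}, seen, handled.append),
    handle({"msg_id": "b", "republish_count": 1}, seen, handled.append),
]
print(results)
```

Only the second and third calls touch the lookup path, so in a workload where republishes are rare the store sees almost nothing but inserts.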
>
> The question then is: if you've received a msg, republish count > 0 but
> there are no entries in the database, what do you do? It shouldn't have
> overtaken the first publish (though if consumers disconnected without
> acking, or requeued messages, it could have), but you need to trigger some
> sort of synchronisation between all the consumers to ensure none
> are in the process of adding to the database - it all gets a bit hairy
> at this point.
>
> Thus if your message rate is low, you're much safer doing the insert and
> select on every message. If that's too expensive, you're going to have
> to think very hard indeed about how to avoid races between different
> consumers thinking they're both/all responsible for acting on the same
> message.
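
One way to do the "insert and select on every message" step without a race is to let a uniqueness constraint decide which consumer wins. The sketch below assumes SQLite from the Python standard library as a stand-in for whatever shared database the consumers would really use; the table name `seen` and the function `claim` are hypothetical.

```python
import sqlite3


def claim(conn, msg_id):
    """Return True if this caller is the first to record msg_id.

    The PRIMARY KEY constraint makes the check and the insert one atomic
    step, so two consumers cannot both claim the same message.
    """
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO seen (msg_id) VALUES (?)", (msg_id,))
        return True
    except sqlite3.IntegrityError:
        # Another delivery of this message was already recorded.
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE seen (msg_id TEXT PRIMARY KEY)")

first = claim(conn, "msg-1")   # first delivery: act on the message
second = claim(conn, "msg-1")  # duplicate delivery: skip it
other = claim(conn, "msg-2")   # different message: act on it
print(first, second, other)
```

A consumer would act on a message only when claim returns True. This costs one database round trip per message, which is the price the paragraph above suggests paying when the message rate is low.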
>
> This stuff isn't easy.
>
> Matthew
> _______________________________________________
> rabbitmq-discuss mailing list
> rabbitmq-discuss at lists.rabbitmq.com
> https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss
> --
---
John Apps
(49) 171 869 1813
