[rabbitmq-discuss] Exactly Once Delivery
[rabbitmq-discuss] Exactly Once Delivery http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2010-August/008272.html
John Apps johndapps at gmail.com
Thu Aug 5 14:00:11 BST 2010
Matthew,

an excellent response and thank you for it! Yes, difficult it is!

It raises a somewhat philosophical discussion around where the onus is
placed in terms of guaranteeing such things as 'guaranteed once', i.e., on
the client side or on the server side? The JMS standard offers guaranteed
once, whereby the onus is on the server (JMS implementation) and not on the
client.

What I am trying to say is that, in my opinion, client programs should be as
'simple' as possible with the servers doing all the hard work. This is what
the JMS standard forces on implementors and, perhaps to a lesser extent
today, so does AMQP.

Note: the word 'server' is horribly overloaded these days. It is used here
to indicate the software with which clients, producers and consumers,
communicate.

Oh well, off to librabbitMQ and some example programs written in COBOL...

Cheers, John

On Thu, Aug 5, 2010 at 13:22, Matthew Sackman <matthew at rabbitmq.com> wrote:

> Hi Mike,
>
> On Tue, Aug 03, 2010 at 04:43:56AM -0400, Mike Petrusis wrote:
> > In reviewing the mailing list archives, I see various threads which state
> > that ensuring "exactly once" delivery requires deduplication by the
> > consumer. For example the following:
> >
> > "Exactly-once requires coordination between consumers, or idempotency,
> > even when there is just a single queue. The consumer, broker or network
> > may die during the transmission of the ack for a message, thus causing
> > retransmission of the message (which the consumer has already seen and
> > processed) at a later point."
> > http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2009-July/004237.html
> >
> > In the case of competing consumers which pull messages from the same
> > queue, this will require some sort of shared state between consumers to
> > de-duplicate messages (assuming the consumers are not idempotent).
> >
> > Our application is using RabbitMQ to distribute tasks across multiple
> > workers residing on different servers, this adds to the cost of sharing
> > state between the workers.
> >
> > Another message in the email archive mentions that "You can guarantee
> > exactly-once delivery if you use transactions, durable queues and exchanges,
> > and persistent messages, but only as long as any failing node eventually
> > recovers."
>
> All the above is sort of wrong. You can never *guarantee* exactly once
> (there's always some argument about whether receiving message duplicates
> but relying on idempotency is achieving exactly once. I don't feel it
> does, and this should become clearer as to why further on...)
>
> The problem is publishers. If the server on which RabbitMQ is running
> crashes, after committing a transaction containing publishes, it's
> possible the commit-ok message may get lost. Thus the publishers still
> think they need to republish, so wait until the broker comes back up and
> then republish. This can happen an infinite number of times: the
> publishers connect, start a transaction, publish messages, commit the
> transaction and then the commit-ok gets lost and so the publishers
> repeat the process.
>
> As a result, on the clients, you need to detect duplicates. Now this is
> really a barrier to making all operations idempotent. The problem is
> that you never know how many copies of a message there will be. Thus you
> never know when it's safe to remove messages from your dedup cache. Now
> things like redis apparently have the means to delete entries after an
> amount of time, which would at least allow you to avoid the database
> eating up all the RAM in the universe, but there's still the possibility
> that after the entry's been deleted, another duplicate will come along
> which you now won't detect as a duplicate.
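
A minimal sketch of the expiring dedup cache described in the paragraph above, and of the gap it leaves. The class name `ExpiringDedupCache`, the `ttl` parameter and the injectable `clock` are assumptions made for illustration; a real deployment would more likely lean on the expiry feature of an external store such as redis than on an in-process dict.

```python
import time


class ExpiringDedupCache:
    """Remembers message ids for `ttl` seconds (hypothetical helper)."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._expiry = {}  # msg_id -> time after which the entry is dropped

    def seen_before(self, msg_id):
        """Record msg_id; return True if an unexpired entry already existed."""
        now = self.clock()
        # Drop expired entries so the cache cannot grow without bound.
        self._expiry = {m: t for m, t in self._expiry.items() if t > now}
        duplicate = msg_id in self._expiry
        self._expiry[msg_id] = now + self.ttl
        return duplicate


# A fake clock keeps the demonstration deterministic.
fake_now = [0.0]
cache = ExpiringDedupCache(ttl=60, clock=lambda: fake_now[0])

first = cache.seen_before("msg-1")       # first copy: not a duplicate
fake_now[0] = 30.0
within_ttl = cache.seen_before("msg-1")  # republish inside the window: caught
fake_now[0] = 200.0
after_ttl = cache.seen_before("msg-1")   # republish after expiry: missed
```

The last call is the failure mode the paragraph warns about: once the entry has expired, a late duplicate is indistinguishable from a new message.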
>
> This isn't just a problem with RabbitMQ - in any messaging system, if
> any message can be lost, you can not achieve exactly once semantics. The
> best you can hope for is a probability of a large number of 9s that you
> will be able to detect all the duplicates. But that's the best you can
> achieve.
>
> Scaling horizontally is thus more tricky because, as you say, you may
> now have multiple consumers which each receive one copy of a message.
> Thus the dedup database would have to be distributed. With high message
> rates, this might well become prohibitive because of the amount of
> network traffic due to transactions between the consumers.
>
> > What's the recommended way to deal with the potential of duplicate
> > messages?
>
> Currently, there is no "recommended" way. If you have a single consumer,
> it's quite easy - something like tokyocabinet should be more than
> sufficiently performant. For multiple consumers, you're currently going
> to have to look at some sort of distributed database.
>
> > Is this a rare enough edge case that most people just ignore it?
>
> No idea. But one way of making your life easier is for the producer to
> send slightly different messages on every republish (they would still
> obviously need to have the same msg id). That way, if you detect a msg
> with "republish count" == 0, then you know it's the first copy, so you
> can insert async into your shared database and then act on the message.
> You only need to do a query on the database whenever you receive a msg
> with "republish count" > 0 - thus you can tune your database for
> inserts and hopefully save some work - the common case will then be the
> first case, and lookups will be exceedingly rare.
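
A rough sketch of the "republish count" fast path from the paragraph above. The field names `msg_id` and `republish_count`, the `handle` function and the in-memory `set` standing in for the shared database are all assumptions for illustration. The insert here is synchronous, and the sketch runs in a single process, so it does not address races between several consumers.

```python
def handle(msg, seen_ids, act):
    """Process one message; return True if it was acted on.

    msg      -- dict with hypothetical keys "msg_id" and "republish_count"
    seen_ids -- stand-in for the shared dedup database (a plain set here)
    act      -- callback that performs the actual work
    """
    if msg["republish_count"] == 0:
        # First copy: record the id without any lookup, then act.
        seen_ids.add(msg["msg_id"])
        act(msg)
        return True

    # Republished copy: only now is a lookup needed.
    if msg["msg_id"] in seen_ids:
        return False  # already handled, drop the duplicate

    # Republished, but no record of an earlier copy: act and record it.
    seen_ids.add(msg["msg_id"])
    act(msg)
    return True


seen = set()
acted = []
results = [
    handle(m, seen, acted.append)
    for m in [
        {"msg_id": "a", "republish_count": 0, "body": "task"},
        {"msg_id": "a", "republish_count": 1, "body": "task"},
        {"msg_id": "b", "republish_count": 1, "body": "other"},
    ]
]
```

With this input, the second message is dropped as a duplicate of the first, while the third is acted on because no earlier copy was recorded.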
>
> The question then is: if you've received a msg, republish count > 0 but
> there are no entries in the database, what do you do? It shouldn't have
> overtaken the first publish (though if consumers disconnected without
> acking, or requeued messages, it could have), but you need to cause some
> sort of synchronise operation between all the consumers to ensure none
> are in the process of adding to the database - it all gets a bit hairy
> at this point.
>
> Thus if your message rate is low, you're much safer doing the insert and
> select on every message. If that's too expensive, you're going to have
> to think very hard indeed about how to avoid races between different
> consumers thinking they're both/all responsible for acting on the same
> message.
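
One way to do the insert and the check on every message in a single step is to let a uniqueness constraint decide which consumer wins. The sketch below uses Python's built-in `sqlite3` purely as a local stand-in for whatever shared database the consumers would really use; the table name `seen` and the helpers `open_store` and `claim` are hypothetical.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open the dedup store and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS seen (msg_id TEXT PRIMARY KEY)")
    return conn


def claim(conn, msg_id):
    """Return True if this caller is the first to record msg_id.

    The PRIMARY KEY constraint makes the insert and the duplicate check
    one statement, so there is no gap between a SELECT and an INSERT.
    """
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO seen (msg_id) VALUES (?)", (msg_id,)
        )
        return cur.rowcount == 1  # 1 row inserted -> we own this message


conn = open_store()
first_claim = claim(conn, "m1")   # first copy: act on it
second_claim = claim(conn, "m1")  # duplicate: skip it
other_claim = claim(conn, "m2")   # different message: act on it
```

A consumer would act on a message only when `claim` returns True. This costs one write per message, which is the trade-off the paragraph above describes for low message rates.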
>
> This stuff isn't easy.
>
> Matthew
> _______________________________________________
> rabbitmq-discuss mailing list
> rabbitmq-discuss at lists.rabbitmq.com
> https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss
> --
---
John Apps
(49) 171 869 1813