https://kafka.apache.org/intro.html

Kafka as a Messaging System

How does Kafka's notion of streams compare to a traditional enterprise messaging system?

【队列有扩展性,不支持多订阅者 --- 发布者-订阅者 反之   queue publish-subscribe 】

【 scale processing  multi-subscriber】

【queues aren't multi-subscriber—once one process reads the data it's gone】

【.Publish-subscribe allows you broadcast data to multiple processes, but has no way of scaling processing since every message goes to every subscriber】

Messaging traditionally has two models: queuing and publish-subscribe. In a queue, a pool of consumers may read from a server and each record goes to one of them; in publish-subscribe the record is broadcast to all consumers. Each of these two models has a strength and a weakness. The strength of queuing is that it allows you to divide up the processing of data over multiple consumer instances, which lets you scale your processing. Unfortunately, queues aren't multi-subscriber—once one process reads the data it's gone. Publish-subscribe allows you broadcast data to multiple processes, but has no way of scaling processing since every message goes to every subscriber.

【 consumer group  解决了上述2个问题】

【群内扩展,群间广播】

【不同消费群是可以订阅同一个主题的】

The consumer group concept in Kafka generalizes these two concepts. As with a queue the consumer group allows you to divide up processing over a collection of processes (the members of the consumer group). As with publish-subscribe, Kafka allows you to broadcast messages to multiple consumer groups.

【Kafka's model is that every topic can scale processing and is also multi-subscriber】

The advantage of Kafka's model is that every topic has both these properties—it can scale processing and is also multi-subscriber—there is no need to choose one or the other.

Kafka has stronger ordering guarantees than a traditional messaging system, too.

A traditional queue retains records in-order on the server, and if multiple consumers consume from the queue then the server hands out records in the order they are stored. However, although the server hands out records in order, the records are delivered asynchronously to consumers, so they may arrive out of order on different consumers. This effectively means the ordering of the records is lost in the presence of parallel consumption. Messaging systems often work around this by having a notion of "exclusive consumer" that allows only one process to consume from a queue, but of course this means that there is no parallelism in processing.

【(t,p,o)   t:p>=o 】

【By having a notion of parallelism—the partition—within the topics, Kafka is able to provide both ordering guarantees and load balancing over a pool of consumer processes. 】

【partitions 】

【一话题,多分区:一个分区同时只被一个消费者消费,保证顺序;多分区,保证平行】

Kafka does it better. By having a notion of parallelism—the partition—within the topics, Kafka is able to provide both ordering guarantees and load balancing over a pool of consumer processes. This is achieved by assigning the partitions in the topic to the consumers in the consumer group so that each partition is consumed by exactly one consumer in the group. By doing this we ensure that the consumer is the only reader of that partition and consumes the data in order. Since there are many partitions this still balances the load over many consumer instances. Note however that there cannot be more consumer instances in a consumer group than partitions.

(t,p,o) t:p>=o there cannot be more consumer instances in a consumer group than partitions的更多相关文章

  1. In-Memory:内存数据库

    在逝去的2016后半年,由于项目需要支持数据的快速更新和多用户的高并发负载,我试水SQL Server 2016的In-Memory OLTP,创建内存数据库实现项目的负载需求,现在项目接近尾声,系统 ...

  2. Tomcat一个BUG造成CLOSE_WAIT

    之前应该提过,我们线上架构整体重新架设了,应用层面使用的是Spring Boot,前段日子因为一些第三方的原因,略有些匆忙的提前开始线上的内测了.然后运维发现了个问题,服务器的HTTPS端口有大量的C ...

  3. Oracle分析函数入门

    一.Oracle分析函数入门 分析函数是什么?分析函数是Oracle专门用于解决复杂报表统计需求的功能强大的函数,它可以在数据中进行分组然后计算基于组的某种统计值,并且每一组的每一行都可以返回一个统计 ...

  4. Hangfire项目实践分享

    Hangfire项目实践分享 目录 Hangfire项目实践分享 目录 什么是Hangfire Hangfire基础 基于队列的任务处理(Fire-and-forget jobs) 延迟任务执行(De ...

  5. Sql Server系列:分区表操作

    1. 分区表简介 分区表在逻辑上是一个表,而物理上是多个表.从用户角度来看,分区表和普通表是一样的.使用分区表的主要目的是为改善大型表以及具有多个访问模式的表的可伸缩性和可管理性. 分区表是把数据按设 ...

  6. SQL Server表分区

    什么是表分区 一般情况下,我们建立数据库表时,表数据都存放在一个文件里. 但是如果是分区表的话,表数据就会按照你指定的规则分放到不同的文件里,把一个大的数据文件拆分为多个小文件,还可以把这些小文件放在 ...

  7. 消息队列——RabbitMQ学习笔记

    消息队列--RabbitMQ学习笔记 1. 写在前面 昨天简单学习了一个消息队列项目--RabbitMQ,今天趁热打铁,将学到的东西记录下来. 学习的资料主要是官网给出的6个基本的消息发送/接收模型, ...

  8. 使用Python保存屏幕截图(不使用PIL)

    起因 在极客学院讲授<使用Python编写远程控制程序>的课程中,涉及到查看被控制电脑屏幕截图的功能. 如果使用PIL,这个需求只需要三行代码: from PIL import Image ...

  9. NodeJs之pm2

    pm2 pm2是一个进程管理工具,可以用它来管理你的node进程,并查看node进程的状态,当然也支持性能监控,进程守护,负载均衡等功能. 开发过程中建议时不时的参看官方详细命令行使用:命令行 pm2 ...

  10. iOS总结_UI层自我复习总结

    UI层复习笔记 在main文件中,UIApplicationMain函数一共做了三件事 根据第三个参数创建了一个应用程序对象 默认写nil,即创建的是UIApplication类型的对象,此对象看成是 ...

随机推荐

  1. Django时区配置:有次发现缓存的时间总是有问题,原来是时区配置需要改

    # LANGUAGE_CODE = 'en-us' # TIME_ZONE = 'UTC' LANGUAGE_CODE = 'zh-Hans' TIME_ZONE = 'Asia/Shanghai'

  2. unity3d自动寻路教程

    U3D的自动寻路插件是不少,但是其实U3D的PRO版本就提供了相当实用的自动寻路组件了,以下教程分别讲解自动寻路的路径选择优先,上楼梯跳下的条件判断等等实用方法,教程分三编,但这个教程没有讲到Navm ...

  3. 解决npm 的 shasum check failed for错误

    使用npm安装一些包失败,类似如下报错情况:   C:\Program Files\nodejs>npm update npm npm ERR! Windows_NT 10.0.14393 np ...

  4. AC日记——线段树练习5 codevs 4927

    4927 线段树练习5  时间限制: 1 s  空间限制: 128000 KB  题目等级 : 黄金 Gold 题解       题目描述 Description 有n个数和5种操作 add a b ...

  5. Unsafe in Java

    http://www.cnblogs.com/xrq730/p/4976007.html http://www.importnew.com/14511.html http://blog.csdn.ne ...

  6. 理解webpack中的devTool的配置项

    2.1. eval  eval 会将每一个module模块,执行eval,执行后不会生成sourcemap文件,仅仅是在每一个模块后,增加sourceURL来关联模块处理前后对应的关系.在webpac ...

  7. echarts 金字塔

    <!DOCTYPE html> <html lang="en"> <head> <meta charset="UTF-8&quo ...

  8. CentOS6.7安装部署LNMP(nginx1.8.0+php5.6.10+mysql5.6.12)

    IP-10.0.0.8 1.安装nginx mkdir -p /server/tools cd /server/tools yum install -y pcre pcre-devel openssl ...

  9. 利用例子来理解spring的面向切面编程

    最近学习了spring的面向切面编程,在网上看到猴子偷桃的例子,觉得这种方式学习比书本上讲解有趣多了,也便于理解.现在就来基于猴子偷桃写个基本的例子. maven工程:

  10. 定时任务crontab如何实现每秒执行?

    linux crontab 命令,最小的执行时间是一分钟.如需要在小于一分钟内重复执行,可以有两个方法实现. 方法一:crontab -l内容如下,则每10秒执行一次/home/fdipzone/ph ...