Sematext Monitoring is one of the most comprehensive Kafka monitoring solutions, capturing some 200 Kafka metrics, including Kafka Broker, Producer, and Consumer metrics. While many of those metrics are useful, there is one particular metric everyone wants to monitor – Consumer Lag.

What is Kafka Consumer Lag?

Kafka Consumer Lag is the indicator of how much lag there is between Kafka producers and consumers. When people talk about Kafka they are typically referring to Kafka Brokers. You can think of a Kafka Broker as a Kafka server. A Broker is what actually stores and serves Kafka messages. Kafka Producers are applications that write messages into Kafka (Brokers). Kafka Consumers are applications that read messages from Kafka (Brokers).
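To make these roles concrete, here is a minimal sketch using the Apache Kafka Java client. The broker address (localhost:9092), topic name (events), and group id (demo-group) are just placeholders for illustration, not anything specific to Sematext.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ProducerConsumerSketch {
  public static void main(String[] args) {
    // Producer: writes messages into the "events" topic on the Broker(s).
    Properties prodProps = new Properties();
    prodProps.put("bootstrap.servers", "localhost:9092"); // placeholder Broker address
    prodProps.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
    prodProps.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
    try (KafkaProducer<String, String> producer = new KafkaProducer<>(prodProps)) {
      producer.send(new ProducerRecord<>("events", "key-1", "hello kafka"));
    }

    // Consumer: reads messages from the same topic as part of a consumer group.
    Properties consProps = new Properties();
    consProps.put("bootstrap.servers", "localhost:9092");
    consProps.put("group.id", "demo-group");               // placeholder group id
    consProps.put("auto.offset.reset", "earliest");        // read from the beginning for this demo
    consProps.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    consProps.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consProps)) {
      consumer.subscribe(List.of("events"));
      ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
      for (ConsumerRecord<String, String> record : records) {
        System.out.printf("partition=%d offset=%d value=%s%n",
            record.partition(), record.offset(), record.value());
      }
    }
  }
}
```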

Inside Brokers data is stored in one or more Topics, and each Topic consists of one or more Partitions. When writing data a Broker actually writes it into a specific Partition. As it writes data it keeps track of the last “write position” in each Partition. This is called Latest Offset also known as Log End Offset. Each Partition has its own independent Latest Offset.
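For illustration, here is a hedged sketch of how you could read each Partition's Latest Offset yourself with the Java client's endOffsets() call; the topic name, partition numbers, and broker address are assumptions.

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class LatestOffsetsSketch {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");  // placeholder Broker address
    props.put("key.deserializer", StringDeserializer.class.getName());
    props.put("value.deserializer", StringDeserializer.class.getName());

    try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
      // Each Partition has its own independent Latest Offset (Log End Offset).
      List<TopicPartition> partitions =
          List.of(new TopicPartition("events", 0), new TopicPartition("events", 1));
      Map<TopicPartition, Long> latest = consumer.endOffsets(partitions);
      latest.forEach((tp, offset) ->
          System.out.printf("%s latest offset = %d%n", tp, offset));
    }
  }
}
```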

Just like Brokers keep track of their write position in each Partition, each Consumer keeps track of “read position” in each Partition whose data it is consuming. That is, it keeps track of which data it has read. This is known as Consumer Offset. This Consumer Offset is periodically persisted (to ZooKeeper or a special Topic in Kafka itself) so it can survive Consumer crashes or unclean shutdowns and avoid re-consuming too much old data.
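As a rough sketch of how that persistence looks from the application side, a Java consumer with auto-commit disabled can commit its Consumer Offset explicitly after processing a batch; the topic, group id, and broker address below are placeholders. (Modern clients persist offsets to Kafka's internal __consumer_offsets topic rather than ZooKeeper.)

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class CommitOffsetsSketch {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");  // placeholder
    props.put("group.id", "demo-group");                // placeholder
    props.put("enable.auto.commit", "false");           // we commit manually below
    props.put("key.deserializer", StringDeserializer.class.getName());
    props.put("value.deserializer", StringDeserializer.class.getName());

    try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
      consumer.subscribe(List.of("events"));
      while (true) {
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
        for (ConsumerRecord<String, String> record : records) {
          process(record); // application logic
        }
        // Persist the Consumer Offset so a restart resumes here instead of re-reading old data.
        consumer.commitSync();
      }
    }
  }

  private static void process(ConsumerRecord<String, String> record) {
    System.out.printf("read partition=%d offset=%d%n", record.partition(), record.offset());
  }
}
```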

Kafka Consumer Lag and Read/Write Rates

In our diagram above we can see yellow bars, which represent the rate at which Brokers are writing messages created by Producers. The orange bars represent the rate at which Consumers are consuming messages from Brokers. The rates look roughly equal – and they need to be, otherwise the Consumers will fall behind. However, there is always going to be some delay between the moment a message is written and the moment it is consumed. Reads are always going to be lagging behind writes, and that is what we call Consumer Lag. The Consumer Lag is simply the delta between the Latest Offset and the Consumer Offset.
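The same delta can be computed outside of a monitoring product. The sketch below pulls a group's committed Consumer Offsets via the Java AdminClient and the Latest Offsets via a consumer, then subtracts them per Partition; the group id (demo-group) and broker address are assumptions.

```java
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ConsumerLagSketch {
  public static void main(String[] args) throws Exception {
    Properties adminProps = new Properties();
    adminProps.put("bootstrap.servers", "localhost:9092");  // placeholder

    try (AdminClient admin = AdminClient.create(adminProps)) {
      // Consumer Offsets: how far the group "demo-group" has read in each Partition.
      Map<TopicPartition, OffsetAndMetadata> committed =
          admin.listConsumerGroupOffsets("demo-group")
               .partitionsToOffsetAndMetadata()
               .get();

      Properties consProps = new Properties();
      consProps.put("bootstrap.servers", "localhost:9092");
      consProps.put("key.deserializer", StringDeserializer.class.getName());
      consProps.put("value.deserializer", StringDeserializer.class.getName());
      try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consProps)) {
        // Latest Offsets: how far the Brokers have written in the same Partitions.
        Map<TopicPartition, Long> latest = consumer.endOffsets(committed.keySet());

        // Consumer Lag = Latest Offset - Consumer Offset, per Partition.
        for (TopicPartition tp : committed.keySet()) {
          if (committed.get(tp) == null) continue; // no committed offset for this Partition yet
          long lag = latest.get(tp) - committed.get(tp).offset();
          System.out.printf("%s lag = %d%n", tp, lag);
        }
      }
    }
  }
}
```

This per-partition delta is roughly what Kafka's own kafka-consumer-groups.sh tool reports in its LAG column when you describe a consumer group.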

Why is Consumer Lag Important

Many applications today are based on being able to process (near) real-time data. Think about a performance monitoring system like Sematext Monitoring or a log management service like Sematext Logs. They continuously process infinite streams of near real-time data. If they were to show you metrics or logs with too much delay – if the Consumer Lag were too big – they'd be nearly useless. Consumer Lag tells us how far behind each Consumer (Group) is in each Partition. The smaller the lag, the more real-time the data consumption.

Monitoring Read and Write Rates

Kafka Consumer Lag and Broker Offset Changes

As we just learned, the delta between the Latest Offset and the Consumer Offset is what gives us the Consumer Lag. In the above chart from Sematext you may have noticed a few other metrics:

  • Broker Write Rate
  • Consume Rate
  • Broker Earliest Offset Changes

The rate metrics are derived metrics. If you look at Kafka's metrics you won't find them there. Under the hood, the open source Sematext agent collects a few Kafka metrics with various offsets, from which these rates are computed. In addition, it charts Broker Earliest Offset Changes, which is the earliest known offset in each Broker's Partition. Put another way, this offset is the offset of the oldest message in a Partition. While this offset alone may not be super useful, knowing how it's changing can be handy when things go awry. Data in Kafka has a certain TTL (Time To Live) to allow for easy purging of old data. This purging is performed by Kafka itself. Every time such purging kicks in, the offset of the oldest data changes. Sematext's Broker Earliest Offset Change surfaces this information for your monitoring pleasure. This metric gives you an idea of how often purges are happening and how many messages they remove each time they run.
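As an illustration of both ideas, the sketch below reads each Partition's earliest offset with beginningOffsets() and derives an approximate write rate by sampling the Latest Offset twice; the topic, partition, broker address, and sampling interval are assumptions – Sematext's agent computes its rates internally, this is just the general approach of deriving rates from offsets.

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class OffsetRatesSketch {
  public static void main(String[] args) throws Exception {
    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");  // placeholder
    props.put("key.deserializer", StringDeserializer.class.getName());
    props.put("value.deserializer", StringDeserializer.class.getName());

    try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
      List<TopicPartition> partitions = List.of(new TopicPartition("events", 0));

      // Earliest offsets: the offset of the oldest message still retained in each Partition.
      // When Kafka purges expired data, this value jumps forward.
      Map<TopicPartition, Long> earliest = consumer.beginningOffsets(partitions);
      earliest.forEach((tp, off) -> System.out.printf("%s earliest offset = %d%n", tp, off));

      // A derived Broker write rate: sample the Latest Offset twice and divide by the interval.
      long intervalMs = 10_000;                         // assumed sampling interval
      Map<TopicPartition, Long> before = consumer.endOffsets(partitions);
      Thread.sleep(intervalMs);
      Map<TopicPartition, Long> after = consumer.endOffsets(partitions);
      for (TopicPartition tp : partitions) {
        double msgsPerSec = (after.get(tp) - before.get(tp)) * 1000.0 / intervalMs;
        System.out.printf("%s approx write rate = %.2f msgs/sec%n", tp, msgsPerSec);
      }
    }
  }
}
```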

Kafka Monitoring Tools

There are several Kafka monitoring tools out there, such as LinkedIn's Burrow, whose Kafka Offset monitoring and Consumer Lag monitoring approach is used in Sematext. We've written about various open source monitoring tools in Kafka Open Source Monitoring Tools. If you need a good Kafka monitoring solution, give Sematext a go. Ship your Kafka and other logs into Sematext Logs and you've got yourself a DevOps solution that will make troubleshooting easy instead of dreadful.
