Configuring High Availability and Consistency for Apache Kafka
To achieve high availability and consistency targets, adjust the following parameters to meet your requirements:
Replication Factor
The default replication factor for new topics is one. For high availability production systems, Cloudera recommends setting the replication factor to at least three. This requires at least three Kafka brokers.
To change the replication factor, navigate to Kafka Service > Configuration > Service-Wide. Set Replication factor to 3, click Save Changes, and restart the Kafka service.
Preferred Leader Election
Kafka is designed with failure in mind. At some point in time, web communications or storage resources fail. When a broker goes offline, one of the replicas becomes the new leader for the partition. When the broker comes back online, it has no leader partitions. Kafka keeps track of which machine is configured to be the leader. Once the original broker is back up and in a good state, Kafka restores the information it missed in the interim and makes it the partition leader once more.
Preferred Leader Election is enabled by default, and should occur automatically unless you actively disable the feature. Typically, the leader is restored within five minutes of coming back online. If the preferred leader is offline for a very long time, though, it might need additional time to restore its required information from the replica.
There is a small possibility that some messages might be lost when switching back to the preferred leader. You can minimize the chance of lost data by setting the acks property on the Producer to all. See Acknowledgements.
Unclean Leader Election
Enable unclean leader election to allow an out-of-sync replica to become the leader and preserve the availability of the partition. With unclean leader election, messages that were not synced to the new leader are lost. This provides balance between consistency (guaranteed message delivery) and availability. With unclean leader election disabled, if a broker containing the leader replica for a partition becomes unavailable, and no in-sync replica exists to replace it, the partition becomes unavailable until the leader replica or another in-sync replica is back online.
To enable unclean leader election, navigate to Kafka Service > Configuration > Service-Wide. Check the box labeled Enable unclean leader election, click Save Changes, and restart the Kafka service.
Acknowledgements
When writing or configuring a Kafka producer, you can choose how many replicas commit a new message before the message is acknowledged using the acks property.
Set acks to 0 (immediately acknowledge the message without waiting for any brokers to commit), 1(acknowledge after the leader commits the message), or all (acknowledge after all in-sync replicas are committed) according to your requirements. Setting acks to all provides the highest consistency guarantee at the expense of slower writes to the cluster.
Minimum In-sync Replicas
You can set the minimum number of in-sync replicas (ISRs) that must be available for the producer to successfully send messages to a partition using the min.insync.replicas setting. If min.insync.replicas is set to 2 and acks is set to all, each message must be written successfully to at least two replicas. This guarantees that the message is not lost unless both hosts crash.
It also means that if one of the hosts crashes, the partition is no longer available for writes. Similar to the unclean leader election configuration, setting min.insync.replicas is a balance between higher consistency (requiring writes to more than one broker) and higher availability (allowing writes when fewer brokers are available).
The leader is considered one of the in-sync replicas. It is included in the count of total min.insync.replicas. However, leaders are special, in that producers and consumers can only interact with leaders in a Kafka cluster.
To configure min.insync.replicas at the cluster level, navigate to Kafka Service > Configuration > Service-Wide. Set Minimum number of replicas in ISR to the desired value, click Save Changes, and restart the Kafka service.
min.insync.replicas.per.topic=topic_name_1:value,topic_name_2:value
Replace topic_name_n with the topic names, and replace value with the desired minimum number of in-sync replicas.
/usr/bin/kafka-topics --alter --zookeeper zk01.example.com:2181 --topic topicname \
--config min.insync.replicas=2
Kafka MirrorMaker
Kafka mirroring enables maintaining a replica of an existing Kafka cluster. You can configure MirrorMaker directly in Cloudera Manager 5.4 and higher.
The most important configuration setting is Destination broker list. This is a list of brokers on the destination cluster. You should list more than one, to support high availability, but you do not need to list all brokers.
You can create a Topic whitelist that represents the exclusive set of topics to replicate. The Topic blacklist setting has been removed in CDK 2.0 and higher Powered By Apache Kafka.
Note: The Avoid Data Loss option from earlier releases has been removed in favor of automatically setting the following properties. Also note that MirrorMaker starts correctly if you enter the numeric values in the configuration snippet (rather than using "max integer" for retries and "max long" for max.block.ms).
- Producer settings
- acks=all
- retries=2147483647
- max.block.ms=9223372036854775807
- Consumer setting
- auto.commit.enable=false
- MirrorMaker setting
- abort.on.send.failure=true
Configuring High Availability and Consistency for Apache Kafka的更多相关文章
- Configuring Apache Kafka for Performance and Resource Management
Apache Kafka is optimized for small messages. According to benchmarks, the best performance occurs w ...
- Configuring Apache Kafka Security
This topic describes additional steps you can take to ensure the safety and integrity of your data s ...
- Understanding When to use RabbitMQ or Apache Kafka
https://content.pivotal.io/rabbitmq/understanding-when-to-use-rabbitmq-or-apache-kafka How do humans ...
- Apache Kafka - How to Load Test with JMeter
In this article, we are going to look at how to load test Apache Kafka, a distributed streaming plat ...
- Apache Kafka: Next Generation Distributed Messaging System---reference
Introduction Apache Kafka is a distributed publish-subscribe messaging system. It was originally dev ...
- Spring for Apache Kafka
官方文档详见:http://docs.spring.io/spring-kafka/docs/1.0.2.RELEASE/reference/htmlsingle/ Authors Gary Russ ...
- How Cigna Tuned Its Spark Streaming App for Real-time Processing with Apache Kafka
Explore the configuration changes that Cigna’s Big Data Analytics team has made to optimize the perf ...
- How-to: Do Real-Time Log Analytics with Apache Kafka, Cloudera Search, and Hue
Cloudera recently announced formal support for Apache Kafka. This simple use case illustrates how to ...
- Flafka: Apache Flume Meets Apache Kafka for Event Processing
The new integration between Flume and Kafka offers sub-second-latency event processing without the n ...
随机推荐
- ueditor上传图片尺寸过大导致显示难看的解决办法
昨天遇到这个问题,我也是折腾成了狗, 到处查,最后收集到三个办法,记录一下. 代码贴这里,方便复制 img { max-width: 100%; /*图片自适应宽度*/ } body { overfl ...
- 用户及用户组管理(week1_day4)--技术流ken
本节内容 useradd userdel usermod groupadd groupdel 用户管理 为什么需要有用户? 1. linux是一个多用户系统 2. 权限管理(权限最小化) 用户:存在的 ...
- 解决MySQL报错The server time zone value 'Öйú±ê׼ʱ¼ä' is unrecognized or represents .....
1.前言 今天在用SpringBoot2.0+MyBatis+MySQL搭建项目开发环境的时候启动项目发现报了一个很奇怪的错,报错内容如下: java.sql.SQLException: The se ...
- Java Pom.xml 详解
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/20 ...
- 【默认加入持久化机制,防止消息丢失,v0.0.3】对RabbitMQ.Client进行一下小小的包装,绝对实用方便
RabbitMQ是一个老牌的非微软的消息队列组件,一般来说应该能满足中小型公司对消息队列生产的需求,平时我们在.NET开发环境下运用它是可能会需要RabbitMQ.Client的SDK库,此库是官网提 ...
- MongoDB初了解——用户权限
本文所述MongoDB版本为4.0.5,笔者对MongoDB刚接触,对各个版本的MongoDB不甚了解,本文不对该版本的MongoDB做特性介绍,所涉及命令也许对其余版本不适用. 因为目前有一个试验性 ...
- nginx系列4:日志管理
日志切割 如果使用默认日志配置,经过一段时间运行后,access.log和error.log文件会变得非常大,使维护和排查问题变得不便,所以非常有必要做日志切割. 通常的思路是:使用nginx的-s ...
- 关于我的博客(About My Blogs)
本文算是第一篇文章,记录工作/学习/生活中的觉得值得分享的东西 When,Where,Who,What,How,Why,尽可能每篇博文当作一个事件来描述. 也就是触发想写这篇文章的事件. 要坚持下来写 ...
- xxxx-xx-xx的时间的加减
<!DOCTYPE html> <html lang="zh-cn"> <head> <meta charset="UTF-8& ...
- Vue.js如何在一个页面调用另一个同级页面的方法
使用场景: 页面分为header.home.footer三部分,需要在home中调用header中的方法,这两个没有相互引入 官方给出方法: 需要在展示页里调用顶部导航栏页里的方法,两者之间没有引用关 ...