Apache Kafka (4): Accessing Kafka from Java
1. Producer
1.1. Basic Producer
First, declare the required dependencies with Maven. The Kafka server used here is version 2.12-2.3.0 (Scala 2.12, Kafka 2.3.0), and the pom.xml file is:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.github.tang</groupId>
    <artifactId>kafka-beginner</artifactId>
    <version>1.0</version>

    <dependencies>
        <!-- https://mvnrepository.com/artifact/org.apache.kafka/kafka-clients -->
        <dependency>
            <groupId>org.apache.kafka</groupId>
            <artifactId>kafka-clients</artifactId>
            <version>2.3.0</version>
        </dependency>

        <!-- https://mvnrepository.com/artifact/org.slf4j/slf4j-simple -->
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-simple</artifactId>
            <version>1.7.26</version>
        </dependency>
    </dependencies>
</project>
Then create a Producer:
package com.github.tang.kafka.tutorial1;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class ProducerDemo {

    private static String bootstrapServers = "server_xxx:9092";

    public static void main(String[] args) {

        /**
         * create Producer properties
         *
         * Properties are available in official document:
         * https://kafka.apache.org/documentation/#producerconfigs
         */
        Properties properties = new Properties();
        properties.setProperty(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        properties.setProperty(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        properties.setProperty(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // create the producer
        KafkaProducer<String, String> producer = new KafkaProducer<String, String>(properties);

        // create a producer record
        ProducerRecord<String, String> record =
                new ProducerRecord<String, String>("first_topic", "message from java");

        /**
         * send data - asynchronous
         *
         * send() only adds the record to a buffer that a background thread
         * sends later, and then returns at once.
         * If the program exits right after send(), the record may never reach
         * the Kafka topic, and the consumer will not receive it.
         * So we need flush().
         */
        producer.send(record);

        // use flush() to wait until sending completes
        producer.flush();

        // close the producer and release its resources
        producer.close();
    }
}
Run this program and the message it sends shows up in the console consumer CLI (kafka-console-consumer).
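Why flush() matters can be illustrated without a broker. The following is a toy model under stated assumptions, not Kafka's implementation: ToySender and its send(), flush() and delivered() methods are hypothetical names, and a single background thread stands in for the producer's I/O thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Toy model of an asynchronous sender (hypothetical, not the Kafka API).
final class ToySender implements AutoCloseable {
    // Single background thread standing in for the producer's I/O thread.
    private final ExecutorService ioThread = Executors.newSingleThreadExecutor();
    private final List<Future<?>> inFlight = new ArrayList<>();
    private final List<String> delivered = new CopyOnWriteArrayList<>();

    // Returns immediately; the record is handed to the background thread.
    synchronized Future<?> send(String record) {
        Future<?> future = ioThread.submit(() -> { delivered.add(record); });
        inFlight.add(future);
        return future;
    }

    // Blocks until every record passed to send() so far has been delivered.
    synchronized void flush() {
        for (Future<?> future : inFlight) {
            try {
                future.get();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            } catch (ExecutionException e) {
                throw new IllegalStateException("delivery failed", e);
            }
        }
        inFlight.clear();
    }

    // Snapshot of the records delivered so far, in delivery order.
    List<String> delivered() {
        return new ArrayList<>(delivered);
    }

    @Override
    public void close() {
        flush();
        ioThread.shutdown();
    }
}
```

In this model, reading delivered() right after send() may or may not include the record, while reading it after flush() always does. That is the same guarantee the demo above relies on before the JVM exits.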
1.2. Producer with Callback()
The Callback() function runs each time a record has been sent, or the send has failed. For example:
First, instantiate a logger object:
// create a logger
final Logger logger = LoggerFactory.getLogger(ProducerDemoCallback.class);
Use Callback():
/**
 * send data with Callback()
 */
for (int i = 0; i < 10; i++) {
    // create a producer record
    ProducerRecord<String, String> record =
            new ProducerRecord<String, String>("first_topic", "message from java" + Integer.toString(i));

    producer.send(record, new Callback() {
        @Override
        public void onCompletion(RecordMetadata recordMetadata, Exception e) {
            // executes every time a record is successfully sent or an exception is thrown
            if (e == null) {
                // the record was sent successfully
                logger.info("Received new metadata. \n" +
                        "Topic: " + recordMetadata.topic() + "\n" +
                        "Partition: " + recordMetadata.partition() + "\n" +
                        "Offset: " + recordMetadata.offset() + "\n" +
                        "Timestamp: " + recordMetadata.timestamp());
            } else {
                logger.error("Error while producing", e);
            }
        }
    });
}
Part of the output:
[kafka-producer-network-thread | producer-1] INFO com.github.tang.kafka.tutorial1.ProducerDemoCallback - Received new metadata.
Topic: first_topic
Partition: 2
Offset: 21
Timestamp: 1565501879059
[kafka-producer-network-thread | producer-1] INFO com.github.tang.kafka.tutorial1.ProducerDemoCallback - Received new metadata.
Topic: first_topic
Partition: 2
Offset: 22
Timestamp: 1565501879075
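1.3. Sending Records with a Key
None of the examples above set a key, so the messages were spread across partitions in round-robin fashion. Below is a producer example with a key. Nothing changes in send() itself; simply use the ProducerRecord constructor overload that takes a key: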
String key = "id_" + Integer.toString(i);

ProducerRecord<String, String> record =
        new ProducerRecord<String, String>(topic, key, "message from java" + Integer.toString(i));
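The practical effect of a key is that records with the same key land on the same partition, as long as the partition count does not change. The sketch below illustrates the idea only. KeyPartitioningSketch and partitionFor are hypothetical names, and Arrays.hashCode is a stand-in hash: Kafka's default partitioner applies murmur2 to the serialized key bytes, so the partition numbers printed here will generally differ from what a real cluster picks.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustration of hash-based key partitioning (not Kafka's actual partitioner).
final class KeyPartitioningSketch {

    // Hash the key bytes, clear the sign bit, and take the result modulo the partition count.
    // Arrays.hashCode is an assumption made for brevity; Kafka itself uses murmur2.
    static int partitionFor(String key, int numPartitions) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        int hash = Arrays.hashCode(keyBytes);
        return (hash & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        int numPartitions = 3;
        for (int i = 0; i < 5; i++) {
            String key = "id_" + i;
            // The same key always maps to the same partition.
            System.out.println(key + " -> partition " + partitionFor(key, numPartitions));
        }
    }
}
```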
2. Consumer
2.1. Basic Consumer
Below is a basic consumer example:
package com.github.tang.kafka.tutorial1;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.time.Duration;
import java.util.Arrays;
import java.util.Properties;

public class ConsumerDemo {

    private static String bootstrapServers = "server:9092";
    private static String groupId = "my-forth-app";
    private static String topic = "first_topic";

    public static void main(String[] args) {
        Logger logger = LoggerFactory.getLogger(ConsumerDemo.class);

        /**
         * create Consumer properties
         *
         * Properties are available in official document:
         * https://kafka.apache.org/documentation/#consumerconfigs
         */
        Properties properties = new Properties();
        properties.setProperty(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        properties.setProperty(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        properties.setProperty(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        properties.setProperty(ConsumerConfig.GROUP_ID_CONFIG, groupId);
        properties.setProperty(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");

        // create consumer
        KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(properties);

        // subscribe consumer to our topic(s)
        consumer.subscribe(Arrays.asList(topic));

        // poll for new data
        while (true) {
            // wait up to 100 ms for new records
            ConsumerRecords<String, String> records =
                    consumer.poll(Duration.ofMillis(100));

            for (ConsumerRecord<String, String> record : records) {
                logger.info("Key: " + record.key() + "\t" + "Value: " + record.value() + "\t" +
                        "Topic: " + record.topic() + "\t" + "Partition: " + record.partition());
            }
        }
    }
}
Part of the output:

The output shows that when reading (with the offset reset set to earliest), the consumer finished reading one partition before moving on to the next.
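2.2. Consumer balancing
As mentioned earlier, the consumers in one consumer group balance the load among themselves automatically. Here we start one consumer and then start a second one.
Below is the log of the first consumer:

After the second consumer joined, the first consumer had its partitions reassigned: it went from being responsible for three partitions (0, 1, 2) to one partition (2).
The log of the second consumer:

It shows that after joining, the second consumer became responsible for reading two partitions (0 and 1).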
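The 3-to-(2+1) split described above can be reproduced with a small model of range-style assignment. This is a simplified sketch under stated assumptions: RangeAssignmentSketch, assign and the member names consumer-a and consumer-b are hypothetical, and the real assignment is negotiated through the group coordinator using whichever assignor partition.assignment.strategy selects. Which consumer ends up with the larger share depends on how the member IDs sort, so it need not match the logs above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Simplified model of range-style partition assignment for a single topic.
final class RangeAssignmentSketch {

    // Sort the members, give each a contiguous block of partitions, and hand
    // one extra partition to the first (numPartitions % members) members.
    static Map<String, List<Integer>> assign(List<String> consumers, int numPartitions) {
        List<String> sorted = new ArrayList<>(new TreeSet<>(consumers));
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        if (sorted.isEmpty()) {
            return assignment;
        }
        int perConsumer = numPartitions / sorted.size();
        int extra = numPartitions % sorted.size();
        int nextPartition = 0;
        for (int i = 0; i < sorted.size(); i++) {
            int count = perConsumer + (i < extra ? 1 : 0);
            List<Integer> partitions = new ArrayList<>();
            for (int p = 0; p < count; p++) {
                partitions.add(nextPartition++);
            }
            assignment.put(sorted.get(i), partitions);
        }
        return assignment;
    }

    public static void main(String[] args) {
        List<String> group = new ArrayList<>();
        group.add("consumer-a");
        System.out.println("one member:  " + assign(group, 3));
        group.add("consumer-b"); // a second member joins, triggering a rebalance
        System.out.println("two members: " + assign(group, 3));
    }
}
```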
2.3. Multithreaded Consumer:
package com.github.tang.kafka.tutorial1;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.errors.WakeupException;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.time.Duration;
import java.util.Arrays;
import java.util.Properties;
import java.util.concurrent.CountDownLatch;

public class ConsumerDemoWithThreads {

    private static Logger logger = LoggerFactory.getLogger(ConsumerDemoWithThreads.class);

    public static void main(String[] args) {
        String bootstrapServers = "server:9092";
        String groupId = "my-fifth-app";
        String topic = "first_topic";

        // latch for dealing with multiple threads
        CountDownLatch latch = new CountDownLatch(1);

        ConsumerRunnable consumerRunnable = new ConsumerRunnable(latch,
                bootstrapServers,
                groupId,
                topic);

        Thread myConsumerThread = new Thread(consumerRunnable);
        myConsumerThread.start();

        // add a shutdown hook
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            logger.info("Caught shutdown hook");
            consumerRunnable.shutdown();

            // wait until the consumer thread has closed the consumer
            try {
                latch.await();
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            logger.info("Application has exited");
        }));

        // block the main thread until the consumer thread is done
        try {
            latch.await();
        } catch (InterruptedException e) {
            logger.error("Application got interrupted", e);
        } finally {
            logger.info("Application is closing");
        }
    }

    private static class ConsumerRunnable implements Runnable {

        private CountDownLatch latch;
        private KafkaConsumer<String, String> consumer;
        private String bootstrapServers;
        private String topic;
        private String groupId;

        public ConsumerRunnable(CountDownLatch latch,
                                String bootstrapServers,
                                String groupId,
                                String topic) {
            this.latch = latch;
            this.bootstrapServers = bootstrapServers;
            this.topic = topic;
            this.groupId = groupId;
        }

        @Override
        public void run() {
            Properties properties = new Properties();
            properties.setProperty(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
            properties.setProperty(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            properties.setProperty(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            properties.setProperty(ConsumerConfig.GROUP_ID_CONFIG, groupId);
            properties.setProperty(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");

            consumer = new KafkaConsumer<String, String>(properties);
            consumer.subscribe(Arrays.asList(topic));

            // poll for new data
            try {
                while (true) {
                    // wait up to 100 ms for new records
                    ConsumerRecords<String, String> records =
                            consumer.poll(Duration.ofMillis(100));

                    for (ConsumerRecord<String, String> record : records) {
                        logger.info("Key: " + record.key() + "\t" + "Value: " + record.value());
                        logger.info("Partition: " + record.partition() + "\t" + "Offset: " + record.offset());
                    }
                }
            } catch (WakeupException e) {
                logger.info("Received shutdown signal!");
            } finally {
                consumer.close();
                // tell our main code we're done with the consumer
                latch.countDown();
            }
        }

        public void shutdown() {
            // the wakeup() method is a special method to interrupt consumer.poll()
            // it makes poll() throw a WakeupException
            consumer.wakeup();
        }
    }
}
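The coordination between the worker thread, the shutdown request and the latch can be tried out without Kafka. The sketch below is a stand-alone model, assuming hypothetical names (LatchShutdownSketch, Worker, runAndStop): a flag checked in the loop stands in for wakeup() and the WakeupException, and a short sleep stands in for consumer.poll().

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Stand-alone model of the "worker thread + latch + shutdown request" pattern.
final class LatchShutdownSketch {

    static final class Worker implements Runnable {
        private final CountDownLatch latch;
        private final AtomicBoolean running = new AtomicBoolean(true);

        Worker(CountDownLatch latch) {
            this.latch = latch;
        }

        @Override
        public void run() {
            try {
                while (running.get()) {
                    Thread.sleep(10); // stands in for consumer.poll(...)
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                // Always signal completion, mirroring the finally block above.
                latch.countDown();
            }
        }

        // Stands in for consumer.wakeup(): asks the loop to stop.
        void shutdown() {
            running.set(false);
        }
    }

    // Starts a worker, requests shutdown, and reports whether the worker
    // signalled completion within the timeout.
    static boolean runAndStop(long timeoutMillis) {
        CountDownLatch latch = new CountDownLatch(1);
        Worker worker = new Worker(latch);
        new Thread(worker).start();
        worker.shutdown();
        try {
            return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("worker stopped cleanly: " + runAndStop(2000));
    }
}
```

The point of counting down in finally is that whoever waits on the latch is released whether the loop ended normally or through an exception.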
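2.4. Consumer with Assign and Seek
A consumer can use assign() to attach itself to a specific partition of a topic, and then use seek() to start reading records from a given offset. This approach is generally used to replay data or to fetch one specific record.
To implement it, start from the previous example and change part of the run() method as follows: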
// assign and seek are mostly used to replay data or fetch a specific message
// requires: import org.apache.kafka.common.TopicPartition;

// assign
TopicPartition partitionToReadFrom = new TopicPartition(topic, 0);
long offsetToReadFrom = 15L;
consumer.assign(Arrays.asList(partitionToReadFrom));

// seek
consumer.seek(partitionToReadFrom, offsetToReadFrom);

int numberOfMessagesToRead = 5;
boolean keepOnReading = true;
int numberOfMessagesReadSoFar = 0;

// poll for new data
try {
    while (keepOnReading) {
        // wait up to 100 ms for new records
        ConsumerRecords<String, String> records =
                consumer.poll(Duration.ofMillis(100));

        for (ConsumerRecord<String, String> record : records) {
            numberOfMessagesReadSoFar += 1;

            logger.info("Key: " + record.key() + "\t" + "Value: " + record.value());
            logger.info("Partition: " + record.partition() + "\t" + "Offset: " + record.offset());

            // stop once the requested number of messages has been read
            if (numberOfMessagesReadSoFar >= numberOfMessagesToRead) {
                keepOnReading = false;
                break;
            }
        }
    }
} catch (WakeupException e) {
    logger.info("Received shutdown signal!");
} finally {
    consumer.close();
    // tell our main code we're done with the consumer
    latch.countDown();
}
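Note that with this approach there is no need to specify a consumer group, because the partition is assigned manually rather than through group management.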
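The read pattern itself (start at a chosen offset, stop after a fixed number of records) can be pictured with an in-memory list standing in for one partition. This is only an illustration: SeekSketch and readFrom are hypothetical names, and the list index plays the role of the offset.

```java
import java.util.ArrayList;
import java.util.List;

// In-memory illustration of "seek to an offset, then read N records".
final class SeekSketch {

    // Returns up to maxMessages records starting at startOffset,
    // treating the list index as the record's offset.
    static List<String> readFrom(List<String> partitionLog, long startOffset, int maxMessages) {
        List<String> result = new ArrayList<>();
        for (long offset = Math.max(0, startOffset);
             offset < partitionLog.size() && result.size() < maxMessages;
             offset++) {
            result.add(partitionLog.get((int) offset));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> partitionLog = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            partitionLog.add("message from java" + i);
        }
        // Same numbers as the example above: start at offset 15, read 5 messages.
        System.out.println(readFrom(partitionLog, 15L, 5));
    }
}
```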
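3. Bidirectional Client Compatibility
Since Kafka 0.10.2, Kafka clients and Kafka brokers are bidirectionally compatible. This works because the API is versioned: clients of different versions send requests with different API versions, and the broker can handle requests of several API versions.
In other words:
- An older client (for example, version 1.1) can work normally with a newer broker (for example, version 2.0)
- A newer client (for example, version 2.0) can work normally with an older broker (for example, version 1.1)
The recommendation that follows from this: always use the latest client library version.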
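The idea behind API versioning can be sketched as picking, for each request type, the highest version that both sides support. The code below is a conceptual model only: ApiVersionSketch, negotiate and the version numbers in it are made up for illustration and do not correspond to real Kafka API versions.

```java
// Conceptual model of choosing a request version that both peers understand.
final class ApiVersionSketch {

    // Returns the highest version inside both supported ranges,
    // or -1 when the two ranges do not overlap.
    static int negotiate(int clientMin, int clientMax, int brokerMin, int brokerMax) {
        int candidate = Math.min(clientMax, brokerMax);
        int lowestAcceptable = Math.max(clientMin, brokerMin);
        return candidate >= lowestAcceptable ? candidate : -1;
    }

    public static void main(String[] args) {
        // Older client, newer broker: the client's highest version is used.
        System.out.println(negotiate(0, 5, 0, 7));
        // Newer client, older broker: the client falls back to the broker's highest version.
        System.out.println(negotiate(0, 7, 0, 5));
    }
}
```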