Kafka之 API实战

一、环境准备

  1)启动zk和kafka集群,在kafka集群中打开一个消费者

[hadoop1 kafka]$ bin/kafka-console-consumer.sh \
--zookeeper hadoop1:2181 --topic first

  2)导入pom依赖

<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka-clients</artifactId>
<version>2.0.0</version>
</dependency>

二、Kafka生产者Java API

2.1 创建生产者(过时的API)

package com.libt.kafka;
import java.util.Properties;
import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig; public class OldProducer { @SuppressWarnings("deprecation")
public static void main(String[] args) { Properties properties = new Properties();
properties.put("metadata.broker.list", "hadoop1:9092");
properties.put("request.required.acks", "1");
properties.put("serializer.class", "kafka.serializer.StringEncoder"); Producer<Integer, String> producer = new Producer<Integer,String>(new ProducerConfig(properties)); KeyedMessage<Integer, String> message = new KeyedMessage<Integer, String>("first", "hello world");
producer.send(message );
}
}

2.2 创建生产者(新API)

package com.libt.kafka;
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord; public class NewProducer {
public static void main(String[] args) { //kafka所需要的配置信息
Properties props = new Properties();
// Kafka集群 服务端的主机名和端口号
props.put("bootstrap.servers", "hadoop2:9092");
// 等待所有副本节点的应答
props.put("acks", "all");
// 消息发送最大尝试次数
props.put("retries", 0);
// 一批消息处理大小
props.put("batch.size", 16384);
// 请求延时
props.put("linger.ms", 1);
// 发送缓存区内存大小
props.put("buffer.memory", 33554432);
// key序列化
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
// value序列化
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer"); Producer<String, String> producer = new KafkaProducer<>(props);
for (int i = 0; i < 50; i++) {
producer.send(new ProducerRecord<String, String>("first", Integer.toString(i), "hello world-" + i));
} producer.close();
}
}

2.2 创建生产者带回调函数(API)

package com.libt.kafka;
import java.util.Properties;
import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata; public class CallBackProducer { public static void main(String[] args) { Properties props = new Properties();
// Kafka服务端的主机名和端口号
props.put("bootstrap.servers", "hadoop2:9092");
// 等待所有副本节点的应答
props.put("acks", "all");
// 消息发送最大尝试次数
props.put("retries", 0);
// 一批消息处理大小
props.put("batch.size", 16384);
// 增加服务端请求延时
props.put("linger.ms", 1);
// 发送缓存区内存大小
props.put("buffer.memory", 33554432);
// key序列化
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
// value序列化
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer"); //创建生产者
KafkaProducer<String, String> kafkaProducer = new KafkaProducer<>(props); for (int i = 0; i < 50; i++) { kafkaProducer.send(new ProducerRecord<String, String>("first", "hello" + i), new Callback() { @Override
public void onCompletion(RecordMetadata metadata, Exception exception) { if (metadata != null) { System.err.println(metadata.partition() + "---" + metadata.offset());
}
}
});
} kafkaProducer.close();
}
}

2.3 创建生产者带回调函数(新API)

package com.libt.kafka;
import java.util.Properties;
import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata; public class CallBackProducer { public static void main(String[] args) { Properties props = new Properties();
// Kafka服务端的主机名和端口号
props.put("bootstrap.servers", "hadoop2:9092");
// 等待所有副本节点的应答
props.put("acks", "all");
// 消息发送最大尝试次数
props.put("retries", 0);
// 一批消息处理大小
props.put("batch.size", 16384);
// 增加服务端请求延时
props.put("linger.ms", 1);
// 发送缓存区内存大小
props.put("buffer.memory", 33554432);
// key序列化
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
// value序列化
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer"); //创建生产者
KafkaProducer<String, String> kafkaProducer = new KafkaProducer<>(props); for (int i = 0; i < 50; i++) { kafkaProducer.send(new ProducerRecord<String, String>("first", "hello" + i), new Callback() { @Override
public void onCompletion(RecordMetadata metadata, Exception exception) { if (metadata != null) { System.err.println(metadata.partition() + "---" + metadata.offset());
}
}
});
} kafkaProducer.close();
}
}

2.4 自定义分区生产者

0)需求:将所有数据存储到topic的第0号分区上
1)定义一个类实现Partitioner接口,重写里面的方法(过时API)

package com.libt.kafka;
import java.util.Map;
import kafka.producer.Partitioner; public class CustomPartitioner implements Partitioner { public CustomPartitioner() {
super();
} @Override
public int partition(Object key, int numPartitions) {
// 控制分区
return 0;
}
}

2)自定义分区(新API)

package com.libt.kafka;
import java.util.Map;
import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster; public class CustomPartitioner implements Partitioner { @Override
public void configure(Map<String, ?> configs) { } @Override
public int partition(String topic, Object key, byte[] keyBytes, Object value, byte[] valueBytes, Cluster cluster) {
// 控制分区
return 0;
} @Override
public void close() { }
}

3)在代码中调用

package com.libt.kafka;
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord; public class PartitionerProducer { public static void main(String[] args) { Properties props = new Properties();
// Kafka服务端的主机名和端口号
props.put("bootstrap.servers", "hadoop2:9092");
// 等待所有副本节点的应答
props.put("acks", "all");
// 消息发送最大尝试次数
props.put("retries", 0);
// 一批消息处理大小
props.put("batch.size", 16384);
// 增加服务端请求延时
props.put("linger.ms", 1);
// 发送缓存区内存大小
props.put("buffer.memory", 33554432);
// key序列化
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
// value序列化
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
// 自定义分区
props.put("partitioner.class", "com.atguigu.kafka.CustomPartitioner"); Producer<String, String> producer = new KafkaProducer<>(props);
producer.send(new ProducerRecord<String, String>("first", "1", "atguigu")); producer.close();
}
}

4)测试

(1)在hadoop1上监控/home/bigdata/kafka/logs/目录下first主题3个分区的log日志动态变化情况

[hadoop1 first-0]$ tail -f 00000000000000000000.log
[hadoop1 first-1]$ tail -f 00000000000000000000.log
[hadoop1 first-2]$ tail -f 00000000000000000000.log

(2)发现数据都存储到指定的分区了。

三、Kafka消费者Java API

3.1 高级API

0)在控制台创建发送者

[hadoop1 kafka]$ bin/kafka-console-producer.sh \
--broker-list hadoop1:9092 --topic first
>hello world

1)创建消费者(过时API)

package com.libt.kafka.consume;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector; public class CustomConsumer { @SuppressWarnings("deprecation")
public static void main(String[] args) {
Properties properties = new Properties(); properties.put("zookeeper.connect", "hadoop1:2181");
properties.put("group.id", "g1");
properties.put("zookeeper.session.timeout.ms", "500");
properties.put("zookeeper.sync.time.ms", "250");
properties.put("auto.commit.interval.ms", "1000"); // 创建消费者连接器
ConsumerConnector consumer = Consumer.createJavaConsumerConnector(new ConsumerConfig(properties)); HashMap<String, Integer> topicCount = new HashMap<>();
topicCount.put("first", 1); Map<String, List<KafkaStream<byte[], byte[]>>> consumerMap = consumer.createMessageStreams(topicCount); KafkaStream<byte[], byte[]> stream = consumerMap.get("first").get(0); ConsumerIterator<byte[], byte[]> it = stream.iterator(); while (it.hasNext()) {
System.out.println(new String(it.next().message()));
}
}
}

2)官方提供案例(自动维护消费情况)(新API)

package com.libt.kafka.consume;
import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer; public class CustomNewConsumer { public static void main(String[] args) { Properties props = new Properties();
// 定义kakfa 服务的地址,不需要将所有broker指定上
props.put("bootstrap.servers", "hadoop1:9092");
// 制定consumer group
props.put("group.id", "test");
// 是否自动确认offset
props.put("enable.auto.commit", "true");
// 自动确认offset的时间间隔
props.put("auto.commit.interval.ms", "1000");
// key的序列化类
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
// value的序列化类
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
// 定义consumer
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props); // 消费者订阅的topic, 可同时订阅多个
consumer.subscribe(Arrays.asList("first", "second","third")); while (true) {
// 读取数据,读取超时时间为100ms
ConsumerRecords<String, String> records = consumer.poll(100); for (ConsumerRecord<String, String> record : records)
System.out.printf("offset = %d, key = %s, value = %s%n", record.offset(), record.key(), record.value());
}
}
}

4.2 低级API

实现使用低级API读取指定topic,指定partition,指定offset的数据。
1)消费者使用低级API 的主要步骤:

步骤

主要工作

1

根据指定的分区从主题元数据中找到主副本

2

获取分区最新的消费进度

3

从主副本拉取分区的消息

4

识别主副本的变化,重试

2)方法描述:

findLeader()

客户端向种子节点发送主题元数据,将副本集加入备用节点

getLastOffset()

消费者客户端发送偏移量请求,获取分区最近的偏移量

run()

消费者低级AP I拉取消息的主要方法

findNewLeader()

当分区的主副本节点发生故障,客户将要找出新的主副本

3)代码:

package com.libt;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map; import kafka.api.FetchRequest;
import kafka.api.FetchRequestBuilder;
import kafka.api.PartitionOffsetRequestInfo;
import kafka.cluster.BrokerEndPoint;
import kafka.common.ErrorMapping;
import kafka.common.TopicAndPartition;
import kafka.javaapi.FetchResponse;
import kafka.javaapi.OffsetResponse;
import kafka.javaapi.PartitionMetadata;
import kafka.javaapi.TopicMetadata;
import kafka.javaapi.TopicMetadataRequest;
import kafka.javaapi.consumer.SimpleConsumer;
import kafka.message.MessageAndOffset; public class SimpleExample {
private List<String> m_replicaBrokers = new ArrayList<>(); public SimpleExample() {
m_replicaBrokers = new ArrayList<>();
} public static void main(String args[]) {
SimpleExample example = new SimpleExample();
// 最大读取消息数量
long maxReads = Long.parseLong("3");
// 要订阅的topic
String topic = "test1";
// 要查找的分区
int partition = Integer.parseInt("0");
// broker节点的ip
List<String> seeds = new ArrayList<>();
seeds.add("192.168.9.102");
seeds.add("192.168.9.103");
seeds.add("192.168.9.104");
// 端口
int port = Integer.parseInt("9092");
try {
example.run(maxReads, topic, partition, seeds, port);
} catch (Exception e) {
System.out.println("Oops:" + e);
e.printStackTrace();
}
} public void run(long a_maxReads, String a_topic, int a_partition, List<String> a_seedBrokers, int a_port) throws Exception {
// 获取指定Topic partition的元数据
PartitionMetadata metadata = findLeader(a_seedBrokers, a_port, a_topic, a_partition);
if (metadata == null) {
System.out.println("Can't find metadata for Topic and Partition. Exiting");
return;
}
if (metadata.leader() == null) {
System.out.println("Can't find Leader for Topic and Partition. Exiting");
return;
}
String leadBroker = metadata.leader().host();
String clientName = "Client_" + a_topic + "_" + a_partition; SimpleConsumer consumer = new SimpleConsumer(leadBroker, a_port, 100000, 64 * 1024, clientName);
long readOffset = getLastOffset(consumer, a_topic, a_partition, kafka.api.OffsetRequest.EarliestTime(), clientName);
int numErrors = 0;
while (a_maxReads > 0) {
if (consumer == null) {
consumer = new SimpleConsumer(leadBroker, a_port, 100000, 64 * 1024, clientName);
}
FetchRequest req = new FetchRequestBuilder().clientId(clientName).addFetch(a_topic, a_partition, readOffset, 100000).build();
FetchResponse fetchResponse = consumer.fetch(req); if (fetchResponse.hasError()) {
numErrors++;
// Something went wrong!
short code = fetchResponse.errorCode(a_topic, a_partition);
System.out.println("Error fetching data from the Broker:" + leadBroker + " Reason: " + code);
if (numErrors > 5)
break;
if (code == ErrorMapping.OffsetOutOfRangeCode()) {
// We asked for an invalid offset. For simple case ask for
// the last element to reset
readOffset = getLastOffset(consumer, a_topic, a_partition, kafka.api.OffsetRequest.LatestTime(), clientName);
continue;
}
consumer.close();
consumer = null;
leadBroker = findNewLeader(leadBroker, a_topic, a_partition, a_port);
continue;
}
numErrors = 0; long numRead = 0;
for (MessageAndOffset messageAndOffset : fetchResponse.messageSet(a_topic, a_partition)) {
long currentOffset = messageAndOffset.offset();
if (currentOffset < readOffset) {
System.out.println("Found an old offset: " + currentOffset + " Expecting: " + readOffset);
continue;
}
readOffset = messageAndOffset.nextOffset();
ByteBuffer payload = messageAndOffset.message().payload(); byte[] bytes = new byte[payload.limit()];
payload.get(bytes);
System.out.println(String.valueOf(messageAndOffset.offset()) + ": " + new String(bytes, "UTF-8"));
numRead++;
a_maxReads--;
} if (numRead == 0) {
try {
Thread.sleep(1000);
} catch (InterruptedException ie) {
}
}
}
if (consumer != null)
consumer.close();
} public static long getLastOffset(SimpleConsumer consumer, String topic, int partition, long whichTime, String clientName) {
TopicAndPartition topicAndPartition = new TopicAndPartition(topic, partition);
Map<TopicAndPartition, PartitionOffsetRequestInfo> requestInfo = new HashMap<TopicAndPartition, PartitionOffsetRequestInfo>();
requestInfo.put(topicAndPartition, new PartitionOffsetRequestInfo(whichTime, 1));
kafka.javaapi.OffsetRequest request = new kafka.javaapi.OffsetRequest(requestInfo, kafka.api.OffsetRequest.CurrentVersion(), clientName);
OffsetResponse response = consumer.getOffsetsBefore(request); if (response.hasError()) {
System.out.println("Error fetching data Offset Data the Broker. Reason: " + response.errorCode(topic, partition));
return 0;
}
long[] offsets = response.offsets(topic, partition);
return offsets[0];
} private String findNewLeader(String a_oldLeader, String a_topic, int a_partition, int a_port) throws Exception {
for (int i = 0; i < 3; i++) {
boolean goToSleep = false;
PartitionMetadata metadata = findLeader(m_replicaBrokers, a_port, a_topic, a_partition);
if (metadata == null) {
goToSleep = true;
} else if (metadata.leader() == null) {
goToSleep = true;
} else if (a_oldLeader.equalsIgnoreCase(metadata.leader().host()) && i == 0) {
// first time through if the leader hasn't changed give
// ZooKeeper a second to recover
// second time, assume the broker did recover before failover,
// or it was a non-Broker issue
//
goToSleep = true;
} else {
return metadata.leader().host();
}
if (goToSleep) {
Thread.sleep(1000);
}
}
System.out.println("Unable to find new leader after Broker failure. Exiting");
throw new Exception("Unable to find new leader after Broker failure. Exiting");
} private PartitionMetadata findLeader(List<String> a_seedBrokers, int a_port, String a_topic, int a_partition) {
PartitionMetadata returnMetaData = null;
loop:
for (String seed : a_seedBrokers) {
SimpleConsumer consumer = null;
try {
consumer = new SimpleConsumer(seed, a_port, 100000, 64 * 1024, "leaderLookup");
List<String> topics = Collections.singletonList(a_topic);
TopicMetadataRequest req = new TopicMetadataRequest(topics);
kafka.javaapi.TopicMetadataResponse resp = consumer.send(req); List<TopicMetadata> metaData = resp.topicsMetadata();
for (TopicMetadata item : metaData) {
for (PartitionMetadata part : item.partitionsMetadata()) {
if (part.partitionId() == a_partition) {
returnMetaData = part;
break loop;
}
}
}
} catch (Exception e) {
System.out.println("Error communicating with Broker [" + seed + "] to find Leader for [" + a_topic + ", " + a_partition + "] Reason: " + e);
} finally {
if (consumer != null)
consumer.close();
}
}
if (returnMetaData != null) {
m_replicaBrokers.clear();
for (BrokerEndPoint replica : returnMetaData.replicas()) {
m_replicaBrokers.add(replica.host());
}
}
return returnMetaData;
}
}

Kafka之 API实战的更多相关文章

  1. [转帖]Kafka 原理和实战

    Kafka 原理和实战 https://segmentfault.com/a/1190000020120043 两个小时读完... 实在是看不完... 1.2k 次阅读  ·  读完需要 101 分钟 ...

  2. Kafka 分区备份实战

    1.概述 在 Kafka 集群中,我们可以对每个 Topic 进行一个或是多个分区,并为该 Topic 指定备份数.这部分元数据信息都是存放在 Zookeeper 上,我们可以使用 zkCli 客户端 ...

  3. 百度AI开放平台- API实战调用

    百度AI开放平台- API实战调用 一.      前言 首先说一下项目需求. 两个用户,分别上传了两段不同的文字,要计算两段文字相似度有多少,匹配数据库中的符合条件的数据,初步估计列出来会有60-1 ...

  4. Zookeeper+Kafka完全分布式实战部署

    Zookeeper+Kafka完全分布式实战部署 作者:尹正杰 版权声明:原创作品,谢绝转载!否则将追究法律责任. 其实我之前部署过kafak和zookeeper的完全分布式,集群是可以正常使用没错, ...

  5. Kafka Consumer API样例

    Kafka Consumer API样例 1. 自动确认Offset 说明参照:http://blog.csdn.net/xianzhen376/article/details/51167333 Pr ...

  6. Kafka消费者APi

    Kafka客户端从集群中消费消息,并透明地处理kafka集群中出现故障服务器,透明地调节适应集群中变化的数据分区.也和服务器交互,平衡均衡消费者. public class KafkaConsumer ...

  7. Kafka(五)Kafka的API操作和拦截器

    一 kafka的API操作 1.1 环境准备 1)在eclipse中创建一个java工程 2)在工程的根目录创建一个lib文件夹 3)解压kafka安装包,将安装包libs目录下的jar包拷贝到工程的 ...

  8. Kafka笔记整理(二):Kafka Java API使用

    下面的测试代码使用的都是下面的topic: $ kafka-topics.sh --describe hadoop --zookeeper uplooking01:,uplooking02:,uplo ...

  9. kafka java API的使用

    Kafka包含四种核心的API: 1.Producer API支持应用将数据流发送到Kafka集群的主题 2.Consumer API支持应用从Kafka集群的主题中读取数据流 3.Streams A ...

随机推荐

  1. 使用JDK的同步容器时,应该避免那些坑?

    摘要:在使用JDK中的同步容器时,应该尽量避免哪些坑 本文分享自华为云社区<[高并发]亿级流量高并发秒杀系统商品"超卖"了,只因使用的JDK同步容器中存在这两个巨大的坑!!( ...

  2. React报错之map() is not a function

    正文从这开始~ 总览 当我们对一个不是数组的值调用map()方法时,就会产生"TypeError: map is not a function"错误.为了解决该错误,请将你调用ma ...

  3. GTID主从和lamp架构运行原理

    目录 GTID主从 GTID概念介绍 GTID工作原理 GTID主从配置 lamp lamp简介 web服务器工作流程 cgi与fastcgi http协议 是什么? lamp架构运行的原理 Apac ...

  4. typora的第一天

    一级标题 二级标题 三级标题 ..... 表格 java spring mybatis 代码 java代码 public void Hello(){ } 字体 hello word! hello wo ...

  5. 使用Linux、Nginx和Github Actions托管部署ASP.NET Core 6.0应用

    使用Linux.Nginx和Github Actions托管部署ASP.NET Core 6.0应用 前言 本文主要参考微软这篇文档而来 Host ASP.NET Core on Linux with ...

  6. DevOps落地实践点滴和踩坑记录-(2) -聊聊平台建设

    很久没有写文章记录了,上一篇文章像流水账一样,把所见所闻一个个记录下来.这次专门聊聊DevOps平台的建设吧,有些新的体会和思考,希望给正在做这个事情的同学们一些启发吧. DevOps落地实践点滴和踩 ...

  7. PerfView专题 (第六篇):如何洞察 C# 中 GC 的变化

    一:背景 在洞察 GC 方面,我觉得市面上没有任何一款工具可以和 PerfView 相提并论,这也是为什么我会在 WinDbg 之外还要学习这么一款工具的原因,这篇我们先简单聊聊 PerfView 到 ...

  8. ABC206 F - Interval Game 2 (区间DP,博弈论,SG函数)

    题面 题意很简单 A l i c e \tt Alice Alice 和 B o b \tt Bob Bob 在博弈.摆在他们面前有 N \rm N N 个区间 [ l i , r i ) \rm[l ...

  9. 获取Windows正在运行的窗口进程

    主要是获取Alt+Tab中展示的窗口 原理主要是获取窗口的样式来判断是否会在Alt+Tab中显示 具体代码如下 /// <summary> /// Alt+Tab 应用 /// </ ...

  10. Linux网桥配置(用于大数据虚拟化)

    理解 VMware里面有三个虚拟机,分别为RHEL8,RHEL7,Windows的虚拟机,只有一个物理网卡连接物理网络,现在三台虚拟机都需要直连到物理网络,此时无法访问物理网络,只能给一个虚拟机访问物 ...