kafka_2.9.2-0.8.1.1 Distributed Cluster Setup and Code Development Example
Prepare 3 virtual machines running RHEL 6.4 Server. 1) Configure each machine as follows:
$ cat /etc/hosts
# zookeeper hostnames:
192.168.8.182 zk1
192.168.8.183 zk2
192.168.8.184 zk3
2) Install the JDK, ZooKeeper, and Kafka on each machine, configured as follows:
$ vi /etc/profile
# jdk, zookeeper, kafka
export KAFKA_HOME=/usr/local/lib/kafka/kafka_2.9.2-0.8.1.1
export ZK_HOME=/usr/local/lib/zookeeper/zookeeper-3.4.6
export CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$KAFKA_HOME/bin:$ZK_HOME/bin:$PATH
3) On each machine, run:
$ source /etc/profile
$ mkdir -p /var/lib/zookeeper
$ cd $ZK_HOME/conf
$ cp zoo_sample.cfg zoo.cfg
$ vi zoo.cfg
dataDir=/var/lib/zookeeper
# the port at which the clients will connect
clientPort=2181
# zookeeper cluster
server.1=zk1:2888:3888
server.2=zk2:2888:3888
server.3=zk3:2888:3888
4) Generate the myid file on each machine:
zk1:
$ echo "1" > /var/lib/zookeeper/myid
zk2:
$ echo "2" > /var/lib/zookeeper/myid
zk3:
$ echo "3" > /var/lib/zookeeper/myid 5) 每台机器上运行setup关闭防火墙
Firewall:
[ ] enabled
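Equivalently, the firewall can be disabled from the shell on RHEL 6 (a sketch assuming the stock iptables service):
$ service iptables stop
$ chkconfig iptables off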
6) Start ZooKeeper on each machine:
$ zkServer.sh start
Check the status (one node should report Mode: leader and the others Mode: follower):
$ zkServer.sh status
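As an extra check, ZooKeeper answers the four-letter command ruok with imok on its client port (assuming nc is available):
$ echo ruok | nc zk1 2181
imok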
1) Download Kafka:
$ wget http://apache.fayea.com/apache-mirror/kafka/0.8.1.1/kafka_2.9.2-0.8.1.1.tgz
For installation and configuration, refer to the previous article:
http://blog.csdn.net/ubuntu64fan/article/details/26678877
2) Configure $KAFKA_HOME/config/server.properties
We install 3 brokers, one on each VM (zk1, zk2, zk3); each broker must have a unique broker.id:
zk1:
$ vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=zk1
$ vi $KAFKA_HOME/config/server.properties
broker.id=0
port=9092
host.name=zk1
advertised.host.name=zk1
...
num.partitions=2
...
zookeeper.connect=zk1:2181,zk2:2181,zk3:2181
zk2:
$ vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=zk2
$ vi $KAFKA_HOME/config/server.properties
broker.id=1
port=9092
host.name=zk2
advertised.host.name=zk2
...
num.partitions=2
...
zookeeper.connect=zk1:2181,zk2:2181,zk3:2181
zk3:
$ vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=zk3
$ vi $KAFKA_HOME/config/server.properties
broker.id=2
port=9092
host.name=zk3
advertised.host.name=zk3
...
num.partitions=2
...
zookeeper.connect=zk1:2181,zk2:2181,zk3:2181
3) Start the ZooKeeper service; on zk1, zk2, and zk3, run:
$ zkServer.sh start
4) Start the Kafka service; on zk1, zk2, and zk3, run:
$ kafka-server-start.sh $KAFKA_HOME/config/server.properties
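kafka-server-start.sh runs in the foreground; to keep the broker alive after the terminal closes, a plain nohup wrapper works (the log path is an arbitrary choice):
$ nohup kafka-server-start.sh $KAFKA_HOME/config/server.properties > /var/log/kafka-server.log 2>&1 &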
5) Create a topic (replication-factor = number of brokers):
$ kafka-topics.sh --create --topic test --replication-factor 3 --partitions 2 --zookeeper zk1:2181
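To verify that partitions, replicas, and leaders were assigned across all three brokers, describe the topic:
$ kafka-topics.sh --describe --topic test --zookeeper zk1:2181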
6) On zk2, open a terminal and send messages to Kafka (zk2 acts as the producer):
$ kafka-console-producer.sh --broker-list zk1:9092 --sync --topic test
Type in the producer terminal: Hello Kafka
7) On zk3, open a terminal to watch messages being consumed (zk3 acts as the consumer):
$ kafka-console-consumer.sh --zookeeper zk1:2181 --topic test --from-beginning
The consumer terminal displays: Hello Kafka
Project Development Setup
The project is built with Maven. It has to be said that the Kafka Java client is quite painful to work with, and setting up the build runs into plenty of trouble. The pom.xml below is recommended as a reference; the versions of the individual dependencies must be kept consistent with one another. If the Kafka client version does not match the Kafka server version, you will hit many exceptions such as "broker id not exists", because after Kafka's upgrade from 0.7 to 0.8 (officially renamed 2.8.0), the protocol the client uses to talk to the server changed.
<dependencies>
    <dependency>
        <groupId>log4j</groupId>
        <artifactId>log4j</artifactId>
        <version>1.2.14</version>
    </dependency>
    <dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka_2.8.2</artifactId>
        <version>0.8.0</version>
        <exclusions>
            <exclusion>
                <groupId>log4j</groupId>
                <artifactId>log4j</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
    <dependency>
        <groupId>org.scala-lang</groupId>
        <artifactId>scala-library</artifactId>
        <version>2.8.2</version>
    </dependency>
    <dependency>
        <groupId>com.yammer.metrics</groupId>
        <artifactId>metrics-core</artifactId>
        <version>2.2.0</version>
    </dependency>
    <dependency>
        <groupId>com.101tec</groupId>
        <artifactId>zkclient</artifactId>
        <version>0.3</version>
    </dependency>
</dependencies>
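To confirm that the resolved dependency versions really do agree, one quick check (assuming mvn is on the PATH):
$ mvn dependency:tree | grep -E "kafka|scala|zkclient"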
Producer-Side Code
1) producer.properties file (placed under the /resources directory):
#partitioner.class=
## The broker list may be a subset of the Kafka servers, because the producer only needs
## to fetch metadata from a broker. Although every broker can serve metadata, it is still
## recommended to list all brokers here.
## This value can also be injected via Spring.
##metadata.broker.list=127.0.0.1:9092,127.0.0.1:9093
## Synchronous; async is recommended.
producer.type=sync
compression.codec=0
serializer.class=kafka.serializer.StringEncoder
## Effective only when producer.type=async
#batch.num.messages=100
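The comments above recommend producer.type=async for throughput; a minimal sketch of the async knobs, using Kafka 0.8 producer property names (the values are illustrative):
producer.type=async
queue.buffering.max.ms=5000
batch.num.messages=200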
2) Sample KafkaProducerClient.java:
package com.test.kafka;

import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Properties;

import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class KafkaProducerClient {

    private Producer<String, String> inner;
    private String brokerList; // for metadata discovery, Spring setter
    private String location = "kafka-producer.properties"; // Spring setter
    private String defaultTopic; // Spring setter

    public void setBrokerList(String brokerList) {
        this.brokerList = brokerList;
    }

    public void setLocation(String location) {
        this.location = location;
    }

    public void setDefaultTopic(String defaultTopic) {
        this.defaultTopic = defaultTopic;
    }

    public KafkaProducerClient() {}

    public void init() throws Exception {
        Properties properties = new Properties();
        properties.load(Thread.currentThread().getContextClassLoader().getResourceAsStream(location));
        if (brokerList != null) {
            properties.put("metadata.broker.list", brokerList);
        }
        ProducerConfig config = new ProducerConfig(properties);
        inner = new Producer<String, String>(config);
    }

    public void send(String message) {
        send(defaultTopic, message);
    }

    public void send(Collection<String> messages) {
        send(defaultTopic, messages);
    }

    public void send(String topicName, String message) {
        if (topicName == null || message == null) {
            return;
        }
        KeyedMessage<String, String> km = new KeyedMessage<String, String>(topicName, message);
        inner.send(km);
    }

    public void send(String topicName, Collection<String> messages) {
        if (topicName == null || messages == null) {
            return;
        }
        if (messages.isEmpty()) {
            return;
        }
        List<KeyedMessage<String, String>> kms = new ArrayList<KeyedMessage<String, String>>();
        int i = 0;
        for (String entry : messages) {
            KeyedMessage<String, String> km = new KeyedMessage<String, String>(topicName, entry);
            kms.add(km);
            i++;
            // flush in batches of 20 to avoid building one huge list
            if (i % 20 == 0) {
                inner.send(kms);
                kms.clear();
            }
        }
        if (!kms.isEmpty()) {
            inner.send(kms);
        }
    }

    public void close() {
        inner.close();
    }

    /**
     * @param args
     */
    public static void main(String[] args) {
        KafkaProducerClient producer = null;
        try {
            producer = new KafkaProducerClient();
            // producer.setBrokerList("");
            producer.init(); // the original sample omitted this call; without it, send() throws a NullPointerException
            int i = 0;
            while (true) {
                producer.send("test-topic", "this is a sample" + i);
                i++;
                Thread.sleep(2000);
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            if (producer != null) {
                producer.close();
            }
        }
    }
}
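The "Spring setter" comments suggest the class is meant to be container-managed; a minimal XML wiring sketch (the bean id, broker addresses, and file name are illustrative assumptions, not from the original article):
<bean id="kafkaProducerClient" class="com.test.kafka.KafkaProducerClient"
      init-method="init" destroy-method="close">
    <property name="brokerList" value="zk1:9092,zk2:9092,zk3:9092"/>
    <property name="defaultTopic" value="test-topic"/>
    <property name="location" value="producer.properties"/>
</bean>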
Consumer Side
1) consumer.properties (located under the /resources directory):
## This value can be set here or injected via Spring.
##zookeeper.connect=127.0.0.1:2181,127.0.0.1:2182,127.0.0.1:2183
# timeout in ms for connecting to zookeeper
zookeeper.connectiontimeout.ms=1000000
# consumer group id
group.id=test-group
# consumer timeout
#consumer.timeout.ms=5000
auto.commit.enable=true
auto.commit.interval.ms=60000
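One related knob: when a group has no committed offset yet, auto.offset.reset controls where consumption starts; smallest behaves roughly like the console consumer's --from-beginning (a standard Kafka 0.8 consumer property, not in the original file):
#auto.offset.reset=smallest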
2) Sample KafkaConsumerClient.java:
package com.test.kafka;

import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;
import kafka.message.Message;
import kafka.message.MessageAndMetadata;

public class KafkaConsumerClient {

    private String groupid;   // can be set via Spring
    private String zkConnect; // can be set via Spring
    private String location = "kafka-consumer.properties"; // config file location
    private String topic;
    private int partitionsNum = 1;
    private MessageExecutor executor; // message listener
    private ExecutorService threadPool;
    private ConsumerConnector connector;
    private Charset charset = Charset.forName("utf8");

    public void setGroupid(String groupid) {
        this.groupid = groupid;
    }

    public void setZkConnect(String zkConnect) {
        this.zkConnect = zkConnect;
    }

    public void setLocation(String location) {
        this.location = location;
    }

    public void setTopic(String topic) {
        this.topic = topic;
    }

    public void setPartitionsNum(int partitionsNum) {
        this.partitionsNum = partitionsNum;
    }

    public void setExecutor(MessageExecutor executor) {
        this.executor = executor;
    }

    public KafkaConsumerClient() {}

    // init the consumer: open the connection and start the listener threads
    public void init() throws Exception {
        if (executor == null) {
            throw new RuntimeException("KafkaConsumer: executor can't be null!");
        }
        Properties properties = new Properties();
        properties.load(Thread.currentThread().getContextClassLoader().getResourceAsStream(location));
        if (groupid != null) {
            properties.put("group.id", groupid); // the original sample used the key "groupid", which Kafka 0.8 ignores
        }
        if (zkConnect != null) {
            properties.put("zookeeper.connect", zkConnect);
        }
        ConsumerConfig config = new ConsumerConfig(properties);
        connector = Consumer.createJavaConsumerConnector(config);
        Map<String, Integer> topics = new HashMap<String, Integer>();
        topics.put(topic, partitionsNum);
        Map<String, List<KafkaStream<byte[], byte[]>>> streams = connector.createMessageStreams(topics);
        List<KafkaStream<byte[], byte[]>> partitions = streams.get(topic);
        threadPool = Executors.newFixedThreadPool(partitionsNum * 2);
        // start one runner per stream
        for (KafkaStream<byte[], byte[]> partition : partitions) {
            threadPool.execute(new MessageRunner(partition));
        }
    }

    public void close() {
        try {
            threadPool.shutdownNow();
        } catch (Exception e) {
            //
        } finally {
            connector.shutdown();
        }
    }

    class MessageRunner implements Runnable {
        private KafkaStream<byte[], byte[]> partition;

        MessageRunner(KafkaStream<byte[], byte[]> partition) {
            this.partition = partition;
        }

        public void run() {
            ConsumerIterator<byte[], byte[]> it = partition.iterator();
            while (it.hasNext()) {
                // connector.commitOffsets(); // commit offsets manually when auto.commit.enable=false
                MessageAndMetadata<byte[], byte[]> item = it.next();
                try {
                    executor.execute(new String(item.message(), charset)); // UTF-8; mind exceptions here
                } catch (Exception e) {
                    //
                }
            }
        }

        public String getContent(Message message) {
            ByteBuffer buffer = message.payload();
            if (buffer.remaining() == 0) {
                return null;
            }
            CharBuffer charBuffer = charset.decode(buffer);
            return charBuffer.toString();
        }
    }

    public static interface MessageExecutor {
        public void execute(String message);
    }

    /**
     * @param args
     */
    public static void main(String[] args) {
        KafkaConsumerClient consumer = null;
        try {
            MessageExecutor executor = new MessageExecutor() {
                public void execute(String message) {
                    System.out.println(message);
                }
            };
            consumer = new KafkaConsumerClient();
            consumer.setTopic("test-topic");
            consumer.setPartitionsNum(2);
            consumer.setExecutor(executor);
            consumer.init();
            // keep the main thread alive so the consumer threads can run;
            // the original sample fell straight through to close()
            Thread.sleep(60 * 1000L);
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            if (consumer != null) {
                consumer.close();
            }
        }
    }
}
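The consumer can be wired the same way; the sketch below assumes a hypothetical PrintMessageExecutor class implementing KafkaConsumerClient.MessageExecutor, which is not part of the code above:
<bean id="messageExecutor" class="com.test.kafka.PrintMessageExecutor"/>
<bean id="kafkaConsumerClient" class="com.test.kafka.KafkaConsumerClient"
      init-method="init" destroy-method="close">
    <property name="zkConnect" value="zk1:2181,zk2:2181,zk3:2181"/>
    <property name="topic" value="test-topic"/>
    <property name="partitionsNum" value="2"/>
    <property name="executor" ref="messageExecutor"/>
</bean>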
One reminder: the KafkaConsumerClient class above pays little attention to failure cases; you must handle the situation where MessageExecutor.execute() throws an exception.
When testing, it is best to start the consumer first and then the producer, so that you can watch the newest messages arrive in real time.