Kakfa
Kakfa分布式集群搭建
本位以最新版本kafka_2.11-0.10.1.0版本讲述分布式kafka集群环境的搭建过程。服务器列表:
1
2
3
|
172.31.10.1 172.31.10.2 172.31.10.3 |
1.下载kafka安装包
登录kafka官网http://kafka.apache.org/,
- 单击左侧“Download”按钮
- 选择对应的版本,版本2.11代表scala版本(kafka是由scala编写的),0.10.1.0代表kafka的版本
- 在弹出的窗口中选择下载链接即可
2.下载zookeeper安装包
kafka整体架构如下:
而kafka集群通常会依赖zookeeper的命名服务,单机版的可以直接用kafka安装包的zookeeper,而通常生产环境为保证命名服务的可用性,一般会单独搭建zookeeper集群。服务器不足可以直接和kafka broker共用服务器,zookeeper命名服务队资源要求不高。
登录zookeeper官网http://www.apache.org/dyn/closer.cgi/zookeeper/,一路选择download下载即可,本文选择稳定版zookeeper-3.4.8
3.安装zookeeper集群
将安装包zookeeper-3.4.8.tar上传至服务器172.31.10.1,
- 解压,目录/opt/zookeeper/zookeeper-3.4.8
1
tar -zxvf zookeeper-3.4.8.tar
- 配置,切换到conf目录,并更改dataDir和server.x
12
cd /opt/zookeeper/zookeeper-3.4.8/conf
mv zoo_sample.cfg zoo.cfg
更改后的zoo.cfg配置如下:
12345678910111213141516171819202122232425262728293031# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/
var
/logs/data/zookeeper
# the port at which the clients will connect
clientPort=2181
server.1=172.31.10.1:2888:3888
server.2=172.31.10.2:2888:3888
server.3=172.31.10.3:2888:3888
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
其中dataDir为zookeeper目录,server.x为zookeeper服务器列表的地址和通信端口
- 远程复制到其他两台服务器,并在dataDir目录下创建myid文件,内容为server.x中的数字。本文设置如下:
1
2
3
4
5
6
7
8
9
10
11
|
#172.31.10.1执行 cd / var /logs/data/zookeeper echo "1" > / var /logs/data/zookeeper/myid #172.31.10.2执行 cd / var /logs/data/zookeeper echo "2" > / var /logs/data/zookeeper/myid #172.31.10.3执行 cd / var /logs/data/zookeeper echo "3" > / var /logs/data/zookeeper/myid |
- 启动zookeeper集群和验证
1
2
3
4
5
6
7
|
#在每台服务器上启动zookeeper cd /opt/zookeeper/zookeeper-3.4.8/bin /opt/zookeeper/zookeeper-3.4.8/bin/zkServer.sh start #查看服务器上zookeeper节点角色 cd /opt/zookeeper/zookeeper-3.4.8/bin /opt/zookeeper/zookeeper-3.4.8/bin/zkServer.sh status |
4.安装kafka集群
- 解压,到/opt/kafka/kafka_2.11-0.10.1.0
1
2
|
tar -zxvf kafka_2.11-0.10.1.0.tgz cd /opt/kafka/kafka_2.11-0.10.1.0 |
- 更改conf/server.properties配置,主要是更改如下几项:
- 1234
broker.id=1
host.name=172.31.10.1
log.dirs=/
var
/logs/data/kafka
zookeeper.connect=172.31.10.1:2181,172.31.10.2:2181,172.31.10.2:2181/kafka
注意每台服务器上的broker.id均不同,需要保证整个集群中唯一性
更改后的server.properties如下:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
|
############################# Server Basics ############################# # The id of the broker. This must be set to a unique integer for each broker. broker.id=1 # The port the socket server listens on port=9092 # Hostname the broker will bind to. If not set, the server will bind to all interfaces host.name=172.31.10.1 # Switch to enable topic deletion or not, default value is false #delete.topic.enable=true ############################# Socket Server Settings ############################# # The address the socket server listens on. It will get the value returned from # java.net.InetAddress.getCanonicalHostName() if not configured. # FORMAT: # listeners = security_protocol://host_name:port # EXAMPLE: # listeners = PLAINTEXT://your.host.name:9092 #listeners=PLAINTEXT://:9092 # Hostname and port the broker will advertise to producers and consumers. If not set, # it uses the value for "listeners" if configured. Otherwise, it will use the value # returned from java.net.InetAddress.getCanonicalHostName(). #advertised.listeners=PLAINTEXT://your.host.name:9092 # The number of threads handling network requests num.network.threads=3 # The number of threads doing disk I/O num.io.threads=8 # The send buffer (SO_SNDBUF) used by the socket server socket.send.buffer.bytes=102400 # The receive buffer (SO_RCVBUF) used by the socket server socket.receive.buffer.bytes=102400 # The maximum size of a request that the socket server will accept (protection against OOM) socket.request.max.bytes=104857600 ############################# Log Basics ############################# # A comma seperated list of directories under which to store log files log.dirs=/ var /logs/data/kafka # The default number of log partitions per topic. More partitions allow greater # parallelism for consumption, but this will also result in more files across # the brokers. num.partitions=1 # The number of threads per data directory to be used for log recovery at startup and flushing at shutdown. # This value is recommended to be increased for installations with data dirs located in RAID array. num.recovery.threads.per.data.dir=1 ############################# Log Flush Policy ############################# # Messages are immediately written to the filesystem but by default we only fsync() to sync # the OS cache lazily. The following configurations control the flush of data to disk. # There are a few important trade-offs here: # 1. Durability: Unflushed data may be lost if you are not using replication. # 2. Latency: Very large flush intervals may lead to latency spikes when the flush does occur as there will be a lot of data to flush. # 3. Throughput: The flush is generally the most expensive operation, and a small flush interval may lead to exceessive seeks. # The settings below allow one to configure the flush policy to flush data after a period of time or # every N messages (or both). This can be done globally and overridden on a per-topic basis. # The number of messages to accept before forcing a flush of data to disk #log.flush.interval.messages=10000 # The maximum amount of time a message can sit in a log before we force a flush #log.flush.interval.ms=1000 ############################# Log Retention Policy ############################# # The following configurations control the disposal of log segments. The policy can # be set to delete segments after a period of time, or after a given size has accumulated. # A segment will be deleted whenever *either* of these criteria are met. Deletion always happens # from the end of the log. # The minimum age of a log file to be eligible for deletion log.retention.hours=168 # A size-based retention policy for logs. Segments are pruned from the log as long as the remaining # segments don't drop below log.retention.bytes. #log.retention.bytes=1073741824 # The maximum size of a log segment file. When this size is reached a new log segment will be created. log.segment.bytes=1073741824 # The interval at which log segments are checked to see if they can be deleted according # to the retention policies log.retention.check.interval.ms=300000 ############################# Zookeeper ############################# # Zookeeper connection string (see zookeeper docs for details). # This is a comma separated host:port pairs, each corresponding to a zk # server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002". # You can also append an optional chroot string to the urls to specify the # root directory for all kafka znodes. zookeeper.connect=172.31.10.1:2181,172.31.10.2:2181,172.31.10.2:2181/kafka # Timeout in ms for connecting to zookeeper zookeeper.connection.timeout.ms=6000 |
- 同步到其他服务器,更改broker.id
- kafka启动和验证
12
cd /opt/kafka/kafka_2.11-0.10.1.0/bin
nohup /opt/kafka/kafka_2.11-0.10.1.0/bin/kafka-server-start.sh config/server.properties &
创建topic,如能成功创建topic则表示集群安装完成,也可以用jps命令查看kafka进程是否存在。
1/opt/kafka/kafka_2.11-0.10.1.0/bin/kafka-topics.sh --create --zookeeper 172.31.10.1:2181,172.31.10.2:2181,172.31.10.2:2181/kafka --replication-factor 3 --partitions 1 --topic test
至此,kafka分布式集群安装完成,后续将深入讲解kafka其他内容。
Kakfa的更多相关文章
- 【原创】Kakfa cluster包源代码分析
kafka.cluster包定义了Kafka的基本逻辑概念:broker.cluster.partition和replica——这些是最基本的概念.只有弄懂了这些概念,你才真正地使用kakfa来帮助完 ...
- kakfa源码编译打包
kakfa项目编译: cd /home/zhaofuxin/workspace/kafka-0.8.2.1-src ./gradlew releaseTarGz 会出现如下异常: zhaofuxin@ ...
- Kakfa揭秘 Day9 KafkaReceiver源码解析
Kakfa揭秘 Day9 KafkaReceiver源码解析 上一节课中,谈了Direct的方式来访问kafka的Broker,今天主要来谈一下,另一种方式,也就是KafkaReceiver. 初始化 ...
- Kakfa揭秘 Day8 DirectKafkaStream代码解析
Kakfa揭秘 Day8 DirectKafkaStream代码解析 今天让我们进入SparkStreaming,看一下其中重要的Kafka模块DirectStream的具体实现. 构造Stream ...
- Kakfa揭秘 Day7 Producer源码解密
Kakfa揭秘 Day7 Producer源码解密 今天我们来研究下Producer.Producer的主要作用就是向Kafka的brokers发送数据.从思考角度,为了简化思考过程,可以简化为一个单 ...
- Kakfa揭秘 Day6 Consumer源码解密
Kakfa揭秘 Day6 Consumer源码解密 今天主要分析下Consumer是怎么来工作的,今天主要是例子出发,对整个过程进行刨析. 简单例子 Example中Consumer.java是一个简 ...
- Kakfa揭秘 Day5 SocketServer下的NIO
Kakfa揭秘 Day5 SocketServer下的NIO 整个Kafka底层都是基于NIO来进行开发的,这种消息机制可以达到弱耦合的效果,同时在磁盘有很多数据时,会非常的高效,在gc方面有非常大的 ...
- Kakfa揭秘 Day4 Kafka中分区深度解析
Kakfa揭秘 Day4 Kafka中分区深度解析 今天主要谈Kafka中的分区数和consumer中的并行度.从使用Kafka的角度说,这些都是至关重要的. 分区原则 Partition代表一个to ...
- Kakfa揭秘 Day3 Kafka源码概述
Kakfa揭秘 Day3 Kafka源码概述 今天开始进入Kafka的源码,本次学习基于最新的0.10.0版本进行.由于之前在学习Spark过程中积累了很多的经验和思想,这些在kafka上是通用的. ...
随机推荐
- 【C++面试】常考题复习:排序算法
// Sort.cpp : 定义控制台应用程序的入口点. // #include "stdafx.h" #include <stdlib.h> /*********** ...
- linux tcp状态学习
参考: http://huoding.com/2013/12/31/316 http://www.cnblogs.com/sunxucool/p/3449068.html http://maoyida ...
- Use BEC to do mobile phone forensics
Belkasoft Evidence Center makes me very impressed that it supports lots of evidence type. I have to ...
- jquery 入门之-------jquery 简介
什么是jquery? Jquery是一个Javascript库,通过封装原生的Javascript函数得到一套定义好的方法.它可以用个少的代码完成更多更强大更复杂的功能,从而得到开发者的青睐. So! ...
- 【转】详解JavaScript中的this
ref:http://blog.jobbole.com/39305/ 来源:foocoder 详解JavaScript中的this JavaScript中的this总是让人迷惑,应该是js众所周知的坑 ...
- 5.21_启程日本二面_1 vs 1
昨天上午刚群面完,晚上7点左右就接到了电话.面试官就两位菇凉,看来她们也是很辛苦.今天下午3点 1 vs 1,在一家咖啡店里,主要是询问下去日本的意愿是否足够强烈.太老实,这里实话实说,也没有表现出非 ...
- Mongodb添加地理位置索引
1.同步站点信息到mongo中(支持mysql.sqlserver数据同步) 2.在Collections文件夹下所在文档右键,在菜单中选择Add Index, 3.然后进行数据查询{ "m ...
- iostat命令简单说说
tps: 每秒钟发送到的I/O请求数. Blk_read /s: 每秒读取的block数 Blk_wrtn/s: 每秒写入的block数 Blk_read: 读入的block总数 Blk_wrtn: ...
- ViewPage显示Fragment集合实现左右滑动并且出现tab栏--第三方开源--SlidingTabLayout和SlidingTabStrip实现
注意:有关Fragment的方法和ViewPager的全部是android.support.v4包的,否则会报很多的错误 MainActivity: package com.zzw.fragmentt ...
- HTML5开发规范
1.总体规范——采用html5的结构标签进行页面布局,注意结构的语义化,并注意页面大纲的层级结构.使用css3.0进行样式的设计. a.网页大纲查询网址http://gsnedders.html5.o ...