Kafka Distributed Cluster Setup
This article walks through setting up a distributed Kafka cluster using the latest release at the time of writing, kafka_2.11-0.10.1.0. Server list:
```
172.31.10.1
172.31.10.2
172.31.10.3
```
1. Download the Kafka package
Go to the Kafka website at http://kafka.apache.org/:
- Click the "Download" button in the left-hand menu
- Pick the matching release; 2.11 is the Scala version (Kafka is written in Scala) and 0.10.1.0 is the Kafka version
- Choose a download link from the page that opens; you can also fetch the tarball directly from the Apache archive, as sketched below
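A minimal fetch sketch, assuming the standard Apache release-archive layout for this version (the exact URL is an assumption; verify it against the download page):
```bash
# assumed archive URL for kafka_2.11-0.10.1.0; adjust if your mirror differs
wget https://archive.apache.org/dist/kafka/0.10.1.0/kafka_2.11-0.10.1.0.tgz
# optional: compare against the checksum published on the download page
md5sum kafka_2.11-0.10.1.0.tgz
```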
2. Download the ZooKeeper package
(The overall Kafka architecture diagram is omitted here.)
A Kafka cluster normally relies on ZooKeeper for its coordination (naming) service. A single-node setup can simply use the ZooKeeper bundled with the Kafka package, but production environments usually run a dedicated ZooKeeper cluster to keep the coordination service highly available. If servers are scarce, ZooKeeper can share machines with the Kafka brokers, since the coordination service is not demanding on resources.
Go to the ZooKeeper download page at http://www.apache.org/dyn/closer.cgi/zookeeper/ and follow the download links; this article uses the stable release zookeeper-3.4.8.
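As with Kafka, the tarball can also be pulled directly from the Apache archive; a sketch assuming the usual archive path for 3.4.8 (the exact URL is an assumption; check your mirror):
```bash
# assumed archive URL for zookeeper-3.4.8
wget https://archive.apache.org/dist/zookeeper/zookeeper-3.4.8/zookeeper-3.4.8.tar.gz
```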
3. Install the ZooKeeper cluster
Upload the zookeeper-3.4.8.tar.gz package to server 172.31.10.1.
- Extract it to /opt/zookeeper/zookeeper-3.4.8:
```bash
tar -zxvf zookeeper-3.4.8.tar.gz
```
- Configure: switch to the conf directory, rename the sample config, and change dataDir and the server.x entries:
```bash
cd /opt/zookeeper/zookeeper-3.4.8/conf
mv zoo_sample.cfg zoo.cfg
```
The updated zoo.cfg looks like this:
```properties
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/var/logs/data/zookeeper
# the port at which the clients will connect
clientPort=2181
server.1=172.31.10.1:2888:3888
server.2=172.31.10.2:2888:3888
server.3=172.31.10.3:2888:3888
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
```
Here dataDir is the ZooKeeper data directory, and the server.x entries list the address and communication ports of each ZooKeeper server.
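The configured dataDir should exist before the service is started; a minimal sketch using the path from the zoo.cfg above, to be run on every node:
```bash
# create the snapshot/data directory referenced by dataDir in zoo.cfg
mkdir -p /var/logs/data/zookeeper
```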
- Copy the installation to the other two servers (a copy sketch follows the commands below), and in the dataDir directory on each server create a myid file whose content is the number from that server's server.x entry. For this setup:
```bash
# on 172.31.10.1
cd /var/logs/data/zookeeper
echo "1" > /var/logs/data/zookeeper/myid
# on 172.31.10.2
cd /var/logs/data/zookeeper
echo "2" > /var/logs/data/zookeeper/myid
# on 172.31.10.3
cd /var/logs/data/zookeeper
echo "3" > /var/logs/data/zookeeper/myid
```
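The remote copy itself is not spelled out in the original commands; one possible sketch with scp (assumes SSH access between the hosts and the same directory layout; the root user is an assumption):
```bash
# copy the configured ZooKeeper installation from 172.31.10.1 to the other two nodes
scp -r /opt/zookeeper/zookeeper-3.4.8 root@172.31.10.2:/opt/zookeeper/
scp -r /opt/zookeeper/zookeeper-3.4.8 root@172.31.10.3:/opt/zookeeper/
```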
- Start the ZooKeeper cluster and verify it:
```bash
# start ZooKeeper on every server
cd /opt/zookeeper/zookeeper-3.4.8/bin
/opt/zookeeper/zookeeper-3.4.8/bin/zkServer.sh start
# check the role (leader or follower) of each ZooKeeper node
cd /opt/zookeeper/zookeeper-3.4.8/bin
/opt/zookeeper/zookeeper-3.4.8/bin/zkServer.sh status
```
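For an extra health check beyond zkServer.sh status, ZooKeeper answers simple four-letter commands on its client port; a sketch using nc (assumes netcat is installed on the host):
```bash
# "ruok" should return "imok" from a healthy node
echo ruok | nc 172.31.10.1 2181
# "stat" prints version, client connections and the node's mode (leader/follower)
echo stat | nc 172.31.10.1 2181
```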
4. Install the Kafka cluster
- Extract the package to /opt/kafka/kafka_2.11-0.10.1.0:
```bash
tar -zxvf kafka_2.11-0.10.1.0.tgz
cd /opt/kafka/kafka_2.11-0.10.1.0
```
- Edit config/server.properties; the main settings to change are:
```properties
broker.id=1
host.name=172.31.10.1
log.dirs=/var/logs/data/kafka
zookeeper.connect=172.31.10.1:2181,172.31.10.2:2181,172.31.10.3:2181/kafka
```
Note that broker.id must differ on every server; it has to be unique across the whole cluster.
The updated server.properties looks like this:
```properties
############################# Server Basics #############################

# The id of the broker. This must be set to a unique integer for each broker.
broker.id=1

# The port the socket server listens on
port=9092

# Hostname the broker will bind to. If not set, the server will bind to all interfaces
host.name=172.31.10.1

# Switch to enable topic deletion or not, default value is false
#delete.topic.enable=true

############################# Socket Server Settings #############################

# The address the socket server listens on. It will get the value returned from
# java.net.InetAddress.getCanonicalHostName() if not configured.
#   FORMAT:
#     listeners = security_protocol://host_name:port
#   EXAMPLE:
#     listeners = PLAINTEXT://your.host.name:9092
#listeners=PLAINTEXT://:9092

# Hostname and port the broker will advertise to producers and consumers. If not set,
# it uses the value for "listeners" if configured. Otherwise, it will use the value
# returned from java.net.InetAddress.getCanonicalHostName().
#advertised.listeners=PLAINTEXT://your.host.name:9092

# The number of threads handling network requests
num.network.threads=3

# The number of threads doing disk I/O
num.io.threads=8

# The send buffer (SO_SNDBUF) used by the socket server
socket.send.buffer.bytes=102400

# The receive buffer (SO_RCVBUF) used by the socket server
socket.receive.buffer.bytes=102400

# The maximum size of a request that the socket server will accept (protection against OOM)
socket.request.max.bytes=104857600

############################# Log Basics #############################

# A comma seperated list of directories under which to store log files
log.dirs=/var/logs/data/kafka

# The default number of log partitions per topic. More partitions allow greater
# parallelism for consumption, but this will also result in more files across
# the brokers.
num.partitions=1

# The number of threads per data directory to be used for log recovery at startup and flushing at shutdown.
# This value is recommended to be increased for installations with data dirs located in RAID array.
num.recovery.threads.per.data.dir=1

############################# Log Flush Policy #############################

# Messages are immediately written to the filesystem but by default we only fsync() to sync
# the OS cache lazily. The following configurations control the flush of data to disk.
# There are a few important trade-offs here:
#    1. Durability: Unflushed data may be lost if you are not using replication.
#    2. Latency: Very large flush intervals may lead to latency spikes when the flush does occur as there will be a lot of data to flush.
#    3. Throughput: The flush is generally the most expensive operation, and a small flush interval may lead to exceessive seeks.
# The settings below allow one to configure the flush policy to flush data after a period of time or
# every N messages (or both). This can be done globally and overridden on a per-topic basis.

# The number of messages to accept before forcing a flush of data to disk
#log.flush.interval.messages=10000

# The maximum amount of time a message can sit in a log before we force a flush
#log.flush.interval.ms=1000

############################# Log Retention Policy #############################

# The following configurations control the disposal of log segments. The policy can
# be set to delete segments after a period of time, or after a given size has accumulated.
# A segment will be deleted whenever *either* of these criteria are met. Deletion always happens
# from the end of the log.

# The minimum age of a log file to be eligible for deletion
log.retention.hours=168

# A size-based retention policy for logs. Segments are pruned from the log as long as the remaining
# segments don't drop below log.retention.bytes.
#log.retention.bytes=1073741824

# The maximum size of a log segment file. When this size is reached a new log segment will be created.
log.segment.bytes=1073741824

# The interval at which log segments are checked to see if they can be deleted according
# to the retention policies
log.retention.check.interval.ms=300000

############################# Zookeeper #############################

# Zookeeper connection string (see zookeeper docs for details).
# This is a comma separated host:port pairs, each corresponding to a zk
# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
# You can also append an optional chroot string to the urls to specify the
# root directory for all kafka znodes.
zookeeper.connect=172.31.10.1:2181,172.31.10.2:2181,172.31.10.3:2181/kafka

# Timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=6000
```
- Sync the installation to the other servers and change broker.id on each (host.name must be updated as well, since it is bound to each server's own IP); a sync sketch follows below.
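One possible way to automate that sync and the per-broker edits, as a sketch only (assumes SSH access between the hosts, rsync and sed available, and that the second and third brokers get broker.id 2 and 3):
```bash
# push the installation to the other brokers and fix up their per-host settings
for i in 2 3; do
  rsync -az /opt/kafka/kafka_2.11-0.10.1.0 172.31.10.$i:/opt/kafka/
  ssh 172.31.10.$i "sed -i 's/^broker.id=.*/broker.id=$i/' /opt/kafka/kafka_2.11-0.10.1.0/config/server.properties && \
    sed -i 's/^host.name=.*/host.name=172.31.10.$i/' /opt/kafka/kafka_2.11-0.10.1.0/config/server.properties"
done
```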
- Start Kafka and verify:
```bash
cd /opt/kafka/kafka_2.11-0.10.1.0/bin
nohup /opt/kafka/kafka_2.11-0.10.1.0/bin/kafka-server-start.sh /opt/kafka/kafka_2.11-0.10.1.0/config/server.properties &
```
Create a topic; if the topic can be created successfully, the cluster is installed correctly. You can also run jps to check that the Kafka process is alive.
```bash
/opt/kafka/kafka_2.11-0.10.1.0/bin/kafka-topics.sh --create --zookeeper 172.31.10.1:2181,172.31.10.2:2181,172.31.10.3:2181/kafka --replication-factor 3 --partitions 1 --topic test
```
You can also push and read a few test messages through the new topic to confirm the brokers are serving traffic (see the sketch below). At this point the distributed Kafka cluster installation is complete; later posts will dig into other aspects of Kafka.
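A sketch of that end-to-end check using the console clients shipped in the Kafka bin directory (flags as in the 0.10.x console tools; adjust if your version differs):
```bash
# send a couple of test messages to the "test" topic
echo "hello kafka" | /opt/kafka/kafka_2.11-0.10.1.0/bin/kafka-console-producer.sh \
  --broker-list 172.31.10.1:9092,172.31.10.2:9092,172.31.10.3:9092 --topic test

# read them back from the beginning (Ctrl+C to stop)
/opt/kafka/kafka_2.11-0.10.1.0/bin/kafka-console-consumer.sh \
  --bootstrap-server 172.31.10.1:9092 --topic test --from-beginning
```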