In this blog post I will show you how to integrate Kafka with Ganglia. This is an interesting and important topic for anyone who wants to do benchmarking or measure performance by monitoring specific Kafka metrics via Ganglia.

Before going ahead, let me briefly explain what Kafka and Ganglia are.

Kafka – Kafka is an open-source distributed message broker project developed by the Apache Software Foundation. It provides a unified, high-throughput, low-latency platform for handling real-time data feeds.

Ganglia – Ganglia is a distributed monitoring system for high-performance computing systems such as grids and clusters.

Now let's get started. In this example we have a Hadoop cluster with 3 Kafka brokers. First we will see how to install and configure Ganglia on these machines.

Step 1: Setup and Configure Ganglia gmetad and gmond

First, you need to install the EPEL repository on all the nodes:

yum install epel-release

On the master node (ganglia-server), install the packages below:

yum install rrdtool ganglia ganglia-gmetad ganglia-gmond ganglia-web httpd php apr apr-util

On the slave nodes (ganglia-client), install the package below:

yum install ganglia-gmond

On the master node, give Apache ownership of the Ganglia web directory:

chown apache:apache -R /var/www/html/ganglia

Edit the config file below to allow access to the Ganglia web page from any IP:

vi /etc/httpd/conf.d/ganglia.conf

It should look like the following:

#
# Ganglia monitoring system php web frontend
#
Alias /ganglia /usr/share/ganglia
<Location /ganglia>
  Order deny,allow
  # "Allow from all" is very important; without it you won't be able to see the Ganglia web UI
  Allow from all
  Allow from 127.0.0.1
  Allow from ::1
  # Allow from .example.com
</Location>

On the master node, edit the gmetad config file so it looks like below (please change the IP address shown, 172.30.0.81, to your ganglia-server's private IP address):

#cat /etc/ganglia/gmetad.conf |grep -v ^#
data_source "hadoopkafka" 172.30.0.81:8649
gridname "Hadoop-Kafka"
setuid_username ganglia
case_sensitive_hostnames 0

On the master node, edit gmond.conf and keep all other parameters at their defaults except the ones below.

Then copy the edited gmond.conf to all other nodes in the cluster (see the sketch after the config below).

cluster {
  name = "hadoopkafka"
  owner = "unspecified"
  latlong = "unspecified"
  url = "unspecified"
}

/* The host section describes attributes of the host, like the location */
host {
  location = "unspecified"
}

/* Feel free to specify as many udp_send_channels as you like. Gmond
   used to only support having a single channel */
udp_send_channel {
  #bind_hostname = yes # Highly recommended, soon to be default.
                       # This option tells gmond to use a source address
                       # that resolves to the machine's hostname. Without
                       # this, the metrics may appear to come from any
                       # interface and the DNS names associated with
                       # those IPs will be used to create the RRDs.
  #mcast_join = 239.2.11.71
  host = 172.30.0.81
  port = 8649
  #ttl = 1
}

/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
  #mcast_join = 239.2.11.71
  port = 8649
  #bind = 239.2.11.71
  #retry_bind = true
  # Size of the UDP buffer. If you are handling lots of metrics you really
  # should bump it up to e.g. 10MB or even higher.
  # buffer = 10485760
}
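
A quick way to push the edited file to the other nodes is a simple scp loop; the hostnames here are placeholders for your own slave nodes:

for host in slave1 slave2 slave3; do
  scp /etc/ganglia/gmond.conf $host:/etc/ganglia/gmond.conf
done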

Start the Apache service on the master node:

service httpd start

Start the gmetad service on the master node:

service gmetad start

Start the gmond service on every node in the cluster:

service gmond start
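
Optionally, if you want the daemons to come back after a reboot (this assumes a SysV-init based distro such as CentOS 6, matching the service commands above), enable them at boot:

chkconfig httpd on      # master node only
chkconfig gmetad on     # master node only
chkconfig gmond on      # every node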

That's it! Now you can see basic Ganglia metrics by visiting the web UI at http://IP-address-of-ganglia-server/ganglia

Step 2: Ganglia Integration with Kafka

Enable JMX Monitoring for Kafka Brokers

In order to get custom Kafka metrics, we need to enable JMX monitoring for the Kafka broker daemon.

To enable JMX monitoring for a Kafka broker, please follow the instructions below.

Edit kafka-run-class.sh and modify the KAFKA_JMX_OPTS variable as shown below (please replace kafka.broker.hostname with your Kafka broker's hostname):

KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=kafka.broker.hostname -Djava.net.preferIPv4Stack=true"

Add the line below to kafka-server-start.sh (in the case of Hortonworks HDP, the path is /usr/hdp/current/kafka-broker/bin/kafka-server-start.sh):

export JMX_PORT=${JMX_PORT:-9999}

That's it! Please perform the above steps on all Kafka brokers and restart them (manually or via your management UI, whichever is applicable).

Verify that the JMX port has been enabled! You can use jconsole to do so.
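
For example, from any machine with a JDK installed (kafka.broker.hostname is the same placeholder used above):

jconsole kafka.broker.hostname:9999

Alternatively, on the broker itself, confirm that something is listening on the JMX port:

netstat -tlnp | grep 9999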

Download, install and configure jmxtrans

Download the jmxtrans RPM from the link below and install it using the rpm command:

http://code.google.com/p/jmxtrans/downloads/detail?name=jmxtrans-250-0.noarch.rpm&can=2&q=
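
For example, once downloaded (the file name matches the RPM from the link above):

rpm -ivh jmxtrans-250-0.noarch.rpm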

Once you have installed jmxtrans, please make sure that java and jps are configured in your $PATH variable.
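
A quick way to check both:

which java
which jps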

Write a JSON config for fetching MBeans on each Kafka broker.

I have written a JSON config for monitoring custom Kafka metrics; please download it from here.

Please note that you need to replace "IP_address_of_kafka_broker" with your Kafka broker's IP address in the downloaded JSON; the same goes for the Ganglia server's IP address.
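
If you would rather write your own, below is a minimal sketch of the general shape of a jmxtrans JSON: one query against a single broker MBean, shipped to Ganglia through jmxtrans's GangliaWriter output writer. The MBean and attribute names are illustrative (they exist in Kafka 0.8.2+); adjust them, the placeholder addresses, and the ports for your environment:

{
  "servers": [{
    "host": "IP_address_of_kafka_broker",
    "port": "9999",
    "numQueryThreads": 2,
    "queries": [{
      "obj": "kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec",
      "attr": ["Count", "OneMinuteRate"],
      "outputWriters": [{
        "@class": "com.googlecode.jmxtrans.model.output.GangliaWriter",
        "settings": {
          "groupName": "custom.metrics",
          "host": "IP_address_of_ganglia_server",
          "port": 8649
        }
      }]
    }]
  }]
}

The groupName value ("custom.metrics") is the metrics group you will pick from the dropdown in the Ganglia web UI at the end of this post.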

Once you are done writing the JSON, please verify the syntax using any online JSON validator (e.g. http://jsonlint.com/).

Start jmxtrans using the commands below:

cd /usr/share/jmxtrans/
sh jmxtrans.sh start $name-of-the-json-file

Verify that jmxtrans has started successfully using a simple ps command:
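
For example:

ps -ef | grep jmxtrans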

Repeat the above procedure on all Kafka brokers.

 

Verify custom metrics

Log in to the Ganglia server, go to the RRD directory (by default /var/lib/ganglia/rrds/), and check whether new RRD files for the Kafka metrics have been created.

You should see one .rrd file per Kafka metric under each broker's host directory.
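
For example (hadoopkafka is the data_source name from gmetad.conf above; replace the broker hostname placeholder with one of your own hosts):

ls /var/lib/ganglia/rrds/hadoopkafka/your-broker-hostname/ | grep -i kafka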

Go to the Ganglia web UI and select hadoopkafka from the cluster dropdown.

Select "custom.metrics" from the metrics group dropdown.

That’s all! 
