Kafka integration with Ganglia
In this blog post I will show you how to integrate Kafka with Ganglia. This is an interesting and important topic for anyone who wants to do benchmarking or measure performance by monitoring specific Kafka metrics via Ganglia.
Before going ahead, let me briefly explain what Kafka and Ganglia are.
Kafka – Kafka is an open-source distributed message broker project developed by the Apache Software Foundation. It provides a unified, high-throughput, low-latency platform for handling real-time data feeds.
Ganglia – Ganglia is a distributed monitoring system for high-performance computing systems such as grids and clusters.
Now let's get started. In this example we have a Hadoop cluster with 3 Kafka brokers. First we will see how to install and configure Ganglia on these machines.
Step 1: Setup and Configure Ganglia gmetad and gmond
First, install the EPEL repository on all the nodes:
yum install epel-release
On the master node (ganglia-server), install the packages below:
yum install rrdtool ganglia ganglia-gmetad ganglia-gmond ganglia-web httpd php apr apr-util
On the slave nodes (ganglia-client), install the package below:
yum install ganglia-gmond
On the master node, run:
chown apache:apache -R /var/www/html/ganglia
Edit the config file below to allow access to the Ganglia web page from any IP. The "Allow from all" line is very important; without it you won't be able to see the Ganglia web UI.
vi /etc/httpd/conf.d/ganglia.conf
It should look like below:
#
# Ganglia monitoring system php web frontend
#
Alias /ganglia /usr/share/ganglia
<Location /ganglia>
Order deny,allow
Allow from all
Allow from 127.0.0.1
Allow from ::1
# Allow from .example.com
</Location>
On the master node, edit the gmetad config file so that it looks like below (change the IP address to your ganglia-server's private IP address):
# cat /etc/ganglia/gmetad.conf | grep -v ^#
data_source "hadoopkafka" 172.30.0.81:8649
gridname "Hadoop-Kafka"
setuid_username ganglia
case_sensitive_hostnames 0
On the master node, edit gmond.conf and keep the other parameters at their defaults except the ones below. Then copy gmond.conf to all other nodes in the cluster (a sample scp command is shown after the config block).
cluster {
name = "hadoopkafka"
owner = "unspecified"
latlong = "unspecified"
url = "unspecified"
}
/* The host section describes attributes of the host, like the location */
host {
location = "unspecified"
}
/* Feel free to specify as many udp_send_channels as you like. Gmond
used to only support having a single channel */
udp_send_channel {
#bind_hostname = yes # Highly recommended, soon to be default.
# This option tells gmond to use a source address
# that resolves to the machine's hostname. Without
# this, the metrics may appear to come from any
# interface and the DNS names associated with
# those IPs will be used to create the RRDs.
#mcast_join = 239.2.11.71
host = 172.30.0.81
port = 8649
#ttl = 1
}
/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
#mcast_join = 239.2.11.71
port = 8649
#bind = 239.2.11.71
#retry_bind = true
# Size of the UDP buffer. If you are handling lots of metrics you really
# should bump it up to e.g. 10MB or even higher.
# buffer = 10485760
}
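To distribute the edited file, something like the following works (the hostnames below are placeholders for your own broker/slave nodes):
scp /etc/ganglia/gmond.conf root@kafka-broker-1:/etc/ganglia/
scp /etc/ganglia/gmond.conf root@kafka-broker-2:/etc/ganglia/
scp /etc/ganglia/gmond.conf root@kafka-broker-3:/etc/ganglia/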
Start the Apache service on the master node:
service httpd start
Start the gmetad service on the master node:
service gmetad start
Start the gmond service on every node in the cluster:
service gmond start
That's it! Now you can see basic Ganglia metrics by visiting the web UI at http://IP-address-of-ganglia-server/ganglia
Step 2: Ganglia Integration with Kafka
Enable JMX Monitoring for Kafka Brokers
In order to get custom Kafka metrics, we need to enable JMX monitoring for the Kafka broker daemon. To do so, follow the instructions below:
Edit kafka-run-class.sh and modify the KAFKA_JMX_OPTS variable as shown below (replace kafka.broker.hostname with your Kafka broker's hostname):
KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=kafka.broker.hostname -Djava.net.preferIPv4Stack=true"
Add the line below to kafka-server-start.sh (for Hortonworks Hadoop, the path is /usr/hdp/current/kafka-broker/bin/kafka-server-start.sh):
export JMX_PORT=${JMX_PORT:-9999}
That's it! Perform the above steps on all Kafka brokers and restart them (manually or via the management UI, whichever is applicable).
Verify that the JMX port has been enabled; you can use jconsole to do so.
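For example, with the JMX port of 9999 configured above and a placeholder broker hostname:
jconsole kafka.broker.hostname:9999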
Download, install and configure jmxtrans
Download the jmxtrans rpm from the link below and install it using the rpm command:
http://code.google.com/p/jmxtrans/downloads/detail?name=jmxtrans-250-0.noarch.rpm&can=2&q=
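For example (the filename matches the rpm linked above):
rpm -ivh jmxtrans-250-0.noarch.rpm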
Once you have installed jmxtrans, make sure that java and jps are available in the $PATH variable.
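A quick way to check (the JDK path below is only an example; adjust it to your installation):
which java jps
export PATH=$PATH:/usr/java/default/bin   # only needed if java or jps is not found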
Write a JSON file for fetching MBeans on each Kafka broker.
I have written a JSON file for monitoring custom Kafka metrics; please download it from here.
Please note that you need to replace "IP_address_of_kafka_broker" in the downloaded JSON with your Kafka broker's IP address, and do the same for the Ganglia server's IP address.
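If you want to write or adapt the JSON yourself, a minimal sketch is shown below. This is not the downloadable file mentioned above: the MBean name and attributes are only examples and depend on your Kafka version, and the GangliaWriter settings assume the gmond receive port 8649 configured earlier.
{
  "servers": [{
    "host": "IP_address_of_kafka_broker",
    "port": "9999",
    "queries": [{
      "obj": "kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec",
      "attr": ["Count", "OneMinuteRate"],
      "outputWriters": [{
        "@class": "com.googlecode.jmxtrans.model.output.GangliaWriter",
        "settings": {
          "groupName": "custom.metrics",
          "host": "IP_address_of_ganglia_server",
          "port": 8649
        }
      }]
    }]
  }]
}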
Once you are done writing the JSON, verify the syntax using any online JSON validator (e.g. http://jsonlint.com/).
Start jmxtrans using the commands below:
cd /usr/share/jmxtrans/
sh jmxtrans.sh start <name-of-the-json-file>
Verify that jmxtrans has started successfully using a simple "ps" command.
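For example:
ps -ef | grep -i jmxtrans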
Repeat the above procedure on all Kafka brokers.
Verify custom metrics
Log in to the Ganglia server, go to the rrd directory (by default /var/lib/ganglia/rrds/), and check whether new rrd files for the Kafka metrics have been created; each metric collected by jmxtrans gets its own rrd file.
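For example, listing the directory for the cluster name defined earlier in gmond.conf (replace <broker-hostname> with one of your broker hostnames):
ls /var/lib/ganglia/rrds/hadoopkafka/<broker-hostname>/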
Go to the Ganglia web UI and select "hadoopkafka" from the cluster dropdown.
Then select "custom.metrics" from the metrics group dropdown.
That’s all!