Kafka integration with Ganglia
In this blog post I will show you how to integrate Kafka with Ganglia. This is an interesting and important topic for anyone who wants to do benchmarking or measure performance by monitoring specific Kafka metrics via Ganglia.
Before going ahead, let me briefly explain what Kafka and Ganglia are.
Kafka – Kafka is an open-source distributed message broker project developed by the Apache Software Foundation. It provides a unified, high-throughput, low-latency platform for handling real-time data feeds.
Ganglia – Ganglia is a distributed monitoring system for high-performance computing systems such as grids and clusters.
Now let's get started. In this example we have a Hadoop cluster with 3 Kafka brokers. First we will see how to install and configure Ganglia on these machines.
Step 1: Set Up and Configure Ganglia gmetad and gmond
First, you need to install the EPEL repository on all the nodes:
- yum install epel-release
On the master node (ganglia-server), install the packages below:
- yum install rrdtool ganglia ganglia-gmetad ganglia-gmond ganglia-web httpd php apr apr-util
On the slave nodes (ganglia-client), install the package below:
- yum install ganglia-gmond
On the master node, do the following:
- chown apache:apache -R /var/www/html/ganglia
Edit the config file below to allow access to the Ganglia web page from any IP (the "Allow from all" line is very important; without it you won't be able to see the Ganglia web UI):
- vi /etc/httpd/conf.d/ganglia.conf
It should look like below:
- #
- # Ganglia monitoring system php web frontend
- #
- Alias /ganglia /usr/share/ganglia
- <Location /ganglia>
- Order deny,allow
- Allow from all
- Allow from 127.0.0.1
- Allow from ::1
- # Allow from .example.com
- </Location>
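After editing, you can check the Apache configuration for syntax errors before going further:
- httpd -t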
On the master node, edit the gmetad config file so that it looks like below (change 172.30.0.81 to your ganglia-server's private IP address). Note that the data_source name ("hadoopkafka") should match the cluster name we set in gmond.conf next.
- # cat /etc/ganglia/gmetad.conf | grep -v ^#
- data_source "hadoopkafka" 172.30.0.81:8649
- gridname "Hadoop-Kafka"
- setuid_username ganglia
- case_sensitive_hostnames 0
On the master node, edit gmond.conf. Keep the other parameters at their defaults and change only the ones below.
Then copy gmond.conf to all the other nodes in the cluster (a one-line copy sketch follows the config below).
- cluster {
- name = "hadoopkafka"
- owner = "unspecified"
- latlong = "unspecified"
- url = "unspecified"
- }
- /* The host section describes attributes of the host, like the location */
- host {
- location = "unspecified"
- }
- /* Feel free to specify as many udp_send_channels as you like. Gmond
- used to only support having a single channel */
- udp_send_channel {
- #bind_hostname = yes # Highly recommended, soon to be default.
- # This option tells gmond to use a source address
- # that resolves to the machine's hostname. Without
- # this, the metrics may appear to come from any
- # interface and the DNS names associated with
- # those IPs will be used to create the RRDs.
- #mcast_join = 239.2.11.71
- host = 172.30.0.81
- port = 8649
- #ttl = 1
- }
- /* You can specify as many udp_recv_channels as you like as well. */
- udp_recv_channel {
- #mcast_join = 239.2.11.71
- port = 8649
- #bind = 239.2.11.71
- #retry_bind = true
- # Size of the UDP buffer. If you are handling lots of metrics you really
- # should bump it up to e.g. 10MB or even higher.
- # buffer = 10485760
- }
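A one-line way to push the file out from the master node (the hostnames kafka-node2 and kafka-node3 are placeholders; substitute your own):
- for h in kafka-node2 kafka-node3; do scp /etc/ganglia/gmond.conf $h:/etc/ganglia/; done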
Start the Apache service on the master node:
- service httpd start
Start the gmetad service on the master node:
- service gmetad start
Start the gmond service on every node in the cluster:
- service gmond start
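Before opening the web UI, you can verify that the gmonds are reporting in by dumping the collector's XML over TCP (this assumes the default tcp_accept_channel on port 8649 was left unchanged in gmond.conf):
- nc 172.30.0.81 8649 | head -n 50
You should see a <HOST> element for each node in the cluster.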
This is it! You can now see basic Ganglia metrics by visiting the web UI at http://IP-address-of-ganglia-server/ganglia
Step 2: Ganglia Integration with Kafka
Enable JMX Monitoring for Kafka Brokers
In order to get custom Kafka metrics, we need to enable JMX monitoring for the Kafka broker daemon.
To enable JMX monitoring for the Kafka broker, follow the instructions below.
Edit kafka-run-class.sh and modify the KAFKA_JMX_OPTS variable as shown below (replace kafka.broker.hostname with your Kafka broker's hostname):
- KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=kafka.broker.hostname -Djava.net.preferIPv4Stack=true"
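For orientation, in a stock kafka-run-class.sh the variable sits inside a guard block roughly like the one below (the exact contents vary by Kafka version), so the modification amounts to replacing the default assignment:
- # JMX settings
- if [ -z "$KAFKA_JMX_OPTS" ]; then
-   KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=kafka.broker.hostname -Djava.net.preferIPv4Stack=true"
- fi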
Add the line below to kafka-server-start.sh (for Hortonworks Hadoop, the path is /usr/hdp/current/kafka-broker/bin/kafka-server-start.sh):
- export JMX_PORT=${JMX_PORT:-9999}
That’s it! Perform the above steps on all Kafka brokers and restart them (manually or via your management UI, whichever is applicable).
Verify that the JMX port has been enabled!
You can use jconsole to do so.
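For example, from any machine with a JDK installed (9999 is the JMX port we exported above):
- jconsole kafka.broker.hostname:9999
If the connection succeeds and the kafka.* MBeans are visible, JMX is working.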
Download, install and configure jmxtrans
Download the jmxtrans RPM from the link below and install it using the rpm command:
http://code.google.com/p/jmxtrans/downloads/detail?name=jmxtrans-250-0.noarch.rpm&can=2&q=
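For example, once the RPM is downloaded:
- rpm -ivh jmxtrans-250-0.noarch.rpm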
Once you have installed jmxtrans, make sure that java and jps are available in your $PATH.
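A quick sanity check (the /usr/java/default/bin path is just an example of a JDK location; adjust it to your installation):
- which java jps || export PATH=$PATH:/usr/java/default/bin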
Write a JSON file for fetching MBeans on each Kafka broker.
I have written a JSON file for monitoring custom Kafka metrics; please download it from here.
Please note that you need to replace “IP_address_of_kafka_broker” in the downloaded JSON with your Kafka broker’s IP address, and do the same for the Ganglia server’s IP address.
Once you are done with the JSON, verify its syntax using any online JSON validator (http://jsonlint.com/).
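If you would rather write your own, below is a minimal sketch of such a JSON file. The MBean name assumes Kafka's standard BrokerTopicMetrics naming (Kafka 0.8.2 and later), the writer is jmxtrans's GangliaWriter, and the placeholder IP addresses must be substituted as described above:
- {
-   "servers": [ {
-     "host": "IP_address_of_kafka_broker",
-     "port": "9999",
-     "queries": [ {
-       "obj": "kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec",
-       "attr": [ "Count", "OneMinuteRate" ],
-       "outputWriters": [ {
-         "@class": "com.googlecode.jmxtrans.model.output.GangliaWriter",
-         "settings": {
-           "groupName": "custom.metrics",
-           "host": "IP_address_of_ganglia_server",
-           "port": 8649
-         }
-       } ]
-     } ]
-   } ]
- }
Each query pulls the listed attributes from one MBean; add more query blocks for the other MBeans you care about. The groupName "custom.metrics" is what we will later select in the Ganglia web UI.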
Start jmxtrans using the commands below:
- cd /usr/share/jmxtrans/
- sh jmxtrans.sh start $name-of-the-json-file
Verify that jmxtrans has started successfully using a simple ps command.
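For example:
- ps -ef | grep [j]mxtrans
(The [j] in the pattern keeps the grep process itself out of the results.)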
Repeat the above procedure on all Kafka brokers.
Verify custom metrics
Log in to the Ganglia server, go to the RRD directory (by default it is /var/lib/ganglia/rrds/), and check whether new RRD files for the Kafka metrics have been created.
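A quick way to look from the shell (your-broker-hostname is a placeholder; the hadoopkafka directory name comes from the data_source we set in gmetad.conf):
- ls -lt /var/lib/ganglia/rrds/hadoopkafka/your-broker-hostname/ | head
The newest .rrd files at the top of the listing should correspond to the Kafka metrics defined in your JSON.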
Go to the Ganglia web UI and select hadoopkafka from the cluster dropdown.
Then select “custom.metrics” from the metrics group dropdown.
That’s all!