Deploying the Ganglia monitoring service and the Nagios alerting service on a Hadoop cluster

1. Deploying the Ganglia service

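Ganglia consists of the following components:

Monitoring node (gmond): installed on every node to be monitored. It collects statistics about the local node and passes them on to gmetad. On Ubuntu it is provided by the ganglia-monitor package;
Collection node (gmetad, gweb): collects the data sent by gmond and displays it through the web component. It can be installed with the ganglia-webfrontend package;
Web interface: renders the XML data aggregated by gmetad as web pages. It is already included in the ganglia-webfrontend package.

gmetad collects the monitoring data of all nodes, so it only needs to be installed on one node of the cluster, and the results can then be viewed through that node. gmond monitors an individual node, so it must be installed on every node in the cluster.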

1.1 Deploying the monitoring master node (collection and display node), 192.168.2.237

sudo apt-get install ganglia-monitor

sudo apt-get install ganglia-webfrontend

This node acts as the master node: it collects and displays the information of the other nodes.

Copy the Ganglia webfrontend Apache configuration:

sudo cp /etc/ganglia-webfrontend/apache.conf /etc/apache2/sites-enabled/ganglia.conf

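Before configuring the Ganglia cluster, it is important to understand the difference between unicast and multicast.

Unicast: can cross network segments and sends data only to the specified machines. To use unicast you must specify one (or more) receiving hosts.
Multicast: sends data to a multicast group within the machine's own network segment, so it reaches all machines in that segment. If you use multicast you do not need to change anything, because it is the default of the Ganglia packages. The only thing to do is point gmetad at one or a few hosts running gmond. There is no need to list every single host, because a gmond in receiving mode holds the list of all hosts and the statistics of the whole cluster.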
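
One way to tell the two modes apart in a gmond.conf channel is by the target address: multicast groups live in the 224.0.0.0/4 range (Ganglia's default group is 239.2.11.71), while a unicast target is an ordinary host address. A minimal Python sketch using the standard ipaddress module; the helper name channel_mode is made up for illustration:

```python
import ipaddress


def channel_mode(address: str) -> str:
    """Classify a channel target as 'multicast' or 'unicast' by its IP range."""
    if ipaddress.ip_address(address).is_multicast:
        return "multicast"
    return "unicast"


# 239.2.11.71 is Ganglia's default multicast group;
# 192.168.2.237 is the master node used in this article.
print(channel_mode("239.2.11.71"))
print(channel_mode("192.168.2.237"))
```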

This cluster uses unicast mode, so /etc/ganglia/gmond.conf must be configured on every machine:

globals {
  daemonize = yes
  setuid = yes
  user = root /* user that the gmond daemon runs as */
  debug_level = 0
  max_udp_msg_len = 1472
  mute = no
  deaf = no
  host_dmax = 120 /* secs */
  cleanup_threshold = 300 /* secs */
  gexec = no
  send_metadata_interval = 10 /* secs; interval for resending metric metadata, must be non-zero in unicast mode */
}

cluster {
  name = "hadoop" /* cluster name */
  owner = "root" /* owner (administrator) of the cluster */
  latlong = "unspecified"
  url = "unspecified"
}

udp_send_channel {
  # mcast_join = 192.168.52.105 /* multicast is commented out */
  host = 192.168.2.237 /* send to the machine running gmetad */
  port = 8649
  ttl = 1
}

udp_recv_channel {
  # mcast_join = 239.2.11.71
  port = 8649
  # bind = 239.2.11.71
}
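
To check that a gmond is actually collecting data, its XML dump can be read over TCP. The Python sketch below assumes that the tcp_accept_channel section from the package's default gmond.conf is still enabled on port 8649 (that section is not part of the listing above); the function names are illustrative only.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host: str, port: int = 8649, timeout: float = 5.0) -> bytes:
    """Read the full XML dump that gmond writes to a TCP client before closing."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        while True:
            data = conn.recv(65536)
            if not data:  # gmond closes the connection once the dump is complete
                break
            chunks.append(data)
    return b"".join(chunks)


def hosts_by_cluster(xml_bytes: bytes) -> dict:
    """Map each CLUSTER NAME to the sorted list of HOST NAME values it contains."""
    root = ET.fromstring(xml_bytes)
    return {
        cluster.get("NAME"): sorted(host.get("NAME") for host in cluster.iter("HOST"))
        for cluster in root.iter("CLUSTER")
    }


# Example call against the master node from this article:
#   hosts_by_cluster(fetch_gmond_xml("192.168.2.237"))
```

In unicast mode every node reports to the master, so the hosts listed under the "hadoop" cluster on 192.168.2.237 should include all nodes whose gmond is configured and running.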

Configure /etc/ganglia/gmetad.conf. This file is configured only on the master node.

data_source "hadoop" 192.168.2.237:8649 192.168.2.201:8649 192.168.2.202:8649 192.168.2.203:8649 192.168.2.204:8649 192.168.2.205:8649 192.168.2.206:8649 192.168.2.207:8649 192.168.2.208:8649 192.168.2.209:8649 192.168.2.210:8649 192.168.2.211:8649 192.168.2.212:8649 192.168.2.213:8649 192.168.2.214:8649 192.168.2.215:8649 192.168.2.216:8649

Note: the name "hadoop" here must match name = "hadoop" in the cluster section of /etc/ganglia/gmond.conf, and different clusters must use different ports.
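
With this many nodes the data_source line is easy to mistype. A small Python sketch that generates it from a host list; it assumes the whitespace-separated host:port form described in the gmetad.conf documentation, and the function name data_source_line is illustrative only:

```python
def data_source_line(cluster: str, hosts: list, port: int = 8649) -> str:
    """Render a gmetad.conf data_source entry with whitespace-separated host:port pairs."""
    endpoints = " ".join(f"{host}:{port}" for host in hosts)
    return f'data_source "{cluster}" {endpoints}'


# Master node plus 192.168.2.201 through 192.168.2.216, as used in this article.
hosts = ["192.168.2.237"] + [f"192.168.2.{n}" for n in range(201, 217)]
print(data_source_line("hadoop", hosts))
```

The cluster argument is the value that has to match the name in the cluster section of gmond.conf.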

Restart the services:

sudo /etc/init.d/ganglia-monitor restart

sudo /etc/init.d/gmetad restart

sudo /etc/init.d/apache2 restart

1.2 Deploying the slave nodes

Slave nodes only need gmond installed; /etc/ganglia/gmond.conf is configured the same way as above.

sudo apt-get install ganglia-monitor

Restart the service:

sudo /etc/init.d/ganglia-monitor restart

2. Monitoring the Hadoop cluster

Edit the file hadoop-2.5.2/etc/hadoop/hadoop-metrics2.properties:

#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# syntax: [prefix].[source|sink].[instance].[options]
# See javadoc of package-info.java for org.apache.hadoop.metrics2 for details

#*.sink.file.class=org.apache.hadoop.metrics2.sink.FileSink
# default sampling period, in seconds
#*.period=10

# The namenode-metrics.out will contain metrics from all context
#namenode.sink.file.filename=namenode-metrics.out
# Specifying a special sampling period for namenode:
#namenode.sink.*.period=8

#datanode.sink.file.filename=datanode-metrics.out

# the following example split metrics of different
# context to different sinks (in this case files)
#jobtracker.sink.file_jvm.context=jvm
#jobtracker.sink.file_jvm.filename=jobtracker-jvm-metrics.out
#jobtracker.sink.file_mapred.context=mapred
#jobtracker.sink.file_mapred.filename=jobtracker-mapred-metrics.out

#tasktracker.sink.file.filename=tasktracker-metrics.out

#maptask.sink.file.filename=maptask-metrics.out

#reducetask.sink.file.filename=reducetask-metrics.out

*.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
*.sink.ganglia.period=10

# default for supportsparse is false
*.sink.ganglia.supportsparse=true

*.sink.ganglia.slope=jvm.metrics.gcCount=zero,jvm.metrics.memHeapUsedM=both
*.sink.ganglia.dmax=jvm.metrics.threadsBlocked=70,jvm.metrics.memHeapUsedM=40

namenode.sink.ganglia.servers=192.168.2.237:8649
datanode.sink.ganglia.servers=192.168.2.237:8649
jobtracker.sink.ganglia.servers=192.168.2.237:8649
tasktracker.sink.ganglia.servers=192.168.2.237:8649
maptask.sink.ganglia.servers=192.168.2.237:8649
reducetask.sink.ganglia.servers=192.168.2.237:8649
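
The per-daemon servers lines all follow the [prefix].sink.ganglia.servers pattern, so they can be generated rather than typed by hand. A minimal Python sketch; the function name ganglia_sink_lines and the PREFIXES list are illustrative, with the prefixes copied from the file above:

```python
def ganglia_sink_lines(prefixes, server):
    """Return one '<prefix>.sink.ganglia.servers=<server>' line per daemon prefix."""
    return [f"{prefix}.sink.ganglia.servers={server}" for prefix in prefixes]


# Daemon prefixes used in the configuration above.
PREFIXES = ["namenode", "datanode", "jobtracker", "tasktracker", "maptask", "reducetask"]

# 192.168.2.237:8649 is the gmond on the master node.
print("\n".join(ganglia_sink_lines(PREFIXES, "192.168.2.237:8649")))
```

On a YARN-based Hadoop 2.x cluster, prefixes such as resourcemanager and nodemanager can be added to the list in the same way; that is an assumption about the deployment, not something the configuration above sets.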

Restart all services so that the changes take effect.
