来源:Hadoop Metrics2

  Metrics are collections of information about Hadoop daemons, events and measurements; for example, data nodes collect metrics such as the number of blocks replicated, number of read requests from clients, and so on. For that reason, metrics are an invaluable resource for monitoring Apache Hadoop services and an indispensable tool for debugging system problems.

This blog post focuses on the features and use of the Metrics2 system for Hadoop, which allows multiple metrics output plugins to be used in parallel, supports dynamic reconfiguration of metrics plugins, provides metrics filtering, and allows all metrics to be exported via JMX.

Metrics vs. MapReduce Counters

When speaking about metrics, a question about their relationship to MapReduce counters usually arises. This differences can be described in two ways: First, Hadoop daemons and services are generally the scope for metrics, whereas MapReduce applications are the scope for MapReduce counters (which are collected for MapReduce tasks and aggregated for the whole job). Second, whereas Hadoop administrators are the main audience for metrics, MapReduce users are the audience for MapReduce counters.

Contexts and Prefixes

For organizational purposes metrics are grouped into named contexts – e.g., jvm for java virtual machine metrics or dfs for the distributed file system metric. There are different sets of contexts supported by Hadoop-1 and Hadoop-2; the table below highlights the ones supported for each of them.

Branch-1

Branch-2

– jvm
– rpc
– rpcdetailed
– metricssystem
– mapred
– dfs
– ugi
– yarn
– jvm
– rpc
– rpcdetailed
– metricssystem
– mapred
– dfs
– ugi

A Hadoop daemon collects metrics in several contexts. For example, data nodes collect metrics for the “dfs”, “rpc” and “jvm” contexts. The daemons that collect different metrics in Hadoop (for Hadoop-1 and Hadoop-2) are listed below:

Branch-1 Daemons/Prefixes Branch-2 Daemons/Prefixes

– namenode
– datanode
– jobtracker
– tasktracker
– maptask
– reducetask

– namenode
– secondarynamenode
– datanode
– resourcemanager
– nodemanager
– mrappmaster
– maptask
– reducetask

System Design

The Metrics2 framework is designed to collect and dispatch per-process metrics to monitor the overall status of the Hadoop system. Producers register the metrics sources with the metrics system, while consumers register the sinks. The framework marshals metrics from sources to sinks based on (per source/sink) configuration options. This design is depicted below.

Here is an example class implementing the MetricsSource:

class MyComponentSource implements MetricsSource {
@Override
public void getMetrics(MetricsCollector collector, boolean all) {
collector.addRecord("MyComponentSource")
.setContext("MyContext")
.addGauge(info("MyMetric", "My metric description"), 42);
}
}

The “MyMetric” in the listing above could be, for example, the number of open connections for a specific server.

Here is an example class implementing the MetricsSink:

public class MyComponentSink implements MetricsSink {
public void putMetrics(MetricsRecord record) {
System.out.print(record);
}
public void init(SubsetConfiguration conf) {}
public void flush() {}
}

To use the Metric2s framework, the system needs to be initialized and sources and sinks registered. Here is an example initialization:

DefaultMetricsSystem.initialize(”datanode");
MetricsSystem.register(source1, “source1 description”, new MyComponentSource());
MetricsSystem.register(sink2, “sink2 description”, new MyComponentSink())

Configuration and Filtering

The Metrics2 framework uses the PropertiesConfiguration from the apache commons configuration library.

Sinks are specified in a configuration file (e.g., “hadoop-metrics2-test.properties”), as:

test.sink.mysink0.class=com.example.hadoop.metrics.MySink
 The configuration syntax is:
[prefix].[source|sink|jmx|].[instance].[option]

In the previous example, test is the prefix and mysink0 is an instance name. DefaultMetricsSystem would try to load hadoop-metrics2-[prefix].properties first, and if not found, try the default hadoop-metrics2.properties in the class path. Note, the [instance] is an arbitrary name to uniquely identify a particular sink instance. The asterisk (*) can be used to specify default options.

Here is an example with inline comments to identify the different configuration sections:

 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
# syntax: [prefix].[source|sink].[instance].[options]
# Here we define a file sink with the instance name “foo”
*.sink.foo.class=org.apache.hadoop.metrics2.sink.FileSink
# Now we specify the filename for every prefix/daemon that is used for
# dumping metrics to this file. Notice each of the following lines is
# associated with one of those prefixes.
namenode.sink.foo.filename=/tmp/namenode-metrics.out
secondarynamenode.sink.foo.filename=/tmp/secondarynamenode-metrics.out
datanode.sink.foo.filename=/tmp/datanode-metrics.out
resourcemanager.sink.foo.filename=/tmp/resourcemanager-metrics.out
nodemanager.sink.foo.filename=/tmp/nodemanager-metrics.out
maptask.sink.foo.filename=/tmp/maptask-metrics.out
reducetask.sink.foo.filename=/tmp/reducetask-metrics.out
mrappmaster.sink.foo.filename=/tmp/mrappmaster-metrics.out
# We here define another file sink with a different instance name “bar”
*.sink.bar.class=org.apache.hadoop.metrics2.sink.FileSink
# The following line specifies the filename for the nodemanager daemon
# associated with this instance. Note that the nodemanager metrics are
# dumped into two different files. Typically you’ll use a different sink type
# (e.g. ganglia), but here having two file sinks for the same daemon can be
# only useful when different filtering strategies are applied to each.
nodemanager.sink.bar.filename=/tmp/nodemanager-metrics-bar.out

Here is an example set of NodeManager metrics that are dumped into the NodeManager sink file:

1
2
3
4
5
6
7
1349542623843 jvm.JvmMetrics: Context=jvm, ProcessName=NodeManager, SessionId=null, Hostname=ubuntu, MemNonHeapUsedM=11.877365, MemNonHeapCommittedM=18.25, MemHeapUsedM=2.9463196, MemHeapCommittedM=30.5, GcCountCopy=5, GcTimeMillisCopy=28, GcCountMarkSweepCompact=0, GcTimeMillisMarkSweepCompact=0, GcCount=5, GcTimeMillis=28, ThreadsNew=0, ThreadsRunnable=6, ThreadsBlocked=0, ThreadsWaiting=23, ThreadsTimedWaiting=2, ThreadsTerminated=0, LogFatal=0, LogError=0, LogWarn=0, LogInfo=0
1349542623843 yarn.NodeManagerMetrics: Context=yarn, Hostname=ubuntu, AvailableGB=8
1349542623843 ugi.UgiMetrics: Context=ugi, Hostname=ubuntu
1349542623843 mapred.ShuffleMetrics: Context=mapred, Hostname=ubuntu
1349542623844 rpc.rpc: port=42440, Context=rpc, Hostname=ubuntu, NumOpenConnections=0, CallQueueLength=0
1349542623844 rpcdetailed.rpcdetailed: port=42440, Context=rpcdetailed, Hostname=ubuntu
1349542623844 metricssystem.MetricsSystem: Context=metricssystem, Hostname=ubuntu, NumActiveSources=6, NumAllSources=6, NumActiveSinks=1, NumAllSinks=0, SnapshotNumOps=6, SnapshotAvgTime=0.16666666666666669

Each line starts with a time followed by the context and metrics name and the corresponding value for each metric.

Filtering

By default, filtering can be done by source, context, record and metrics. More discussion of different filtering strategies can be found in the Javadoc and wiki.

Example:

1
2
3
4
5
6
7
8
9
10
11
mrappmaster.sink.foo.context=jvm
# Define the classname used for filtering
*.source.filter.class=org.apache.hadoop.metrics2.filter.GlobFilter
*.record.filter.class=${*.source.filter.class}
*.metric.filter.class=${*.source.filter.class}
# Filter in any sources with names start with Jvm
nodemanager.*.source.filter.include=Jvm*
# Filter out records with names that matches foo* in the source named "rpc"
nodemanager.source.rpc.record.filter.exclude=foo*
# Filter out metrics with names that matches foo* for sink instance "file" only
nodemanager.sink.foo.metric.filter.exclude=MemHeapUsedM

Conclusion

The Metrics2 system for Hadoop provides a gold mine of real-time and historical data that help monitor and debug problems associated with the Hadoop services and jobs.

Ahmed Radwan is a software engineer at Cloudera, where he contributes to various platform tools and open-source projects.

Hadoop Metrics2的更多相关文章

  1. log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory). log4j:WARN Please initialize the log4j system properly. log4j:WARN See http://logging.apache.o

    上面的报错是在本地java调试(windows) hadoop集群 出现的 解决方案: 在resources文件夹下面创建一个文件log4j.properties(这个其实hadoop安装目录下的 e ...

  2. hadoop项目开发运行报错(log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).)

    使用hadoop+myeclipse开发项目是测试运行报错: log4j:WARN No appenders could be found for logger (org.apache.hadoop. ...

  3. 关于log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).的问题

    解决办法(非长久之计,折中) 将该方法插入到main函数中,可以自行打印日志信息了 BasicConfigurator.configure(); //自动快速地使用缺省Log4j环境.原文链接:htt ...

  4. 使用ganglia监控hadoop及hbase集群

    一.Ganglia简介 Ganglia 是 UC Berkeley 发起的一个开源监视项目,设计用于测量数以千计的节点.每台计算机都运行一个收集和发送度量数据(如处理器速度.内存使用量等)的名为 gm ...

  5. hadoop安装及配置入门篇

    声明: author: 龚细军 时间: -- 类型: 笔记 转载时请注明出处及相应链接. 链接地址: http://www.cnblogs.com/gongxijun/p/5726024.html 本 ...

  6. hadoop安装遇到的各种异常及解决办法

    hadoop安装遇到的各种异常及解决办法 异常一: 2014-03-13 11:10:23,665 INFO org.apache.hadoop.ipc.Client: Retrying connec ...

  7. Windows下Eclipse连接hadoop

    2015-3-27 参考: http://www.cnblogs.com/baixl/p/4154429.html http://blog.csdn.net/u010911997/article/de ...

  8. Hadoop 2.2.0学习笔记20131210

    伪分布式单节点安装执行pi失败: [root@server- ~]# ./bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples ...

  9. Hadoop 2.2.0学习笔记20131209

    1.下载java 7并安装 [root@server- ~]# rpm -ivh jdk-7u40-linux-x64.rpm Preparing... ####################### ...

随机推荐

  1. django:multivaluedictkeyerror错误

    查了一下,是因为获取前台数据时,用了request.POST[],改用request.POST.get()之后没有这个报错了 细节: request.POST是用来接受从前端表单中传过来的数据,比如用 ...

  2. Layui:踩坑之我见

    layui.form.on("XXX",function(){});  此方法会有事件冒泡的现象产生,解决方法是return false  或者使用 layui.stope(),但 ...

  3. win10下安装配置iis,发布iis

    老有朋友不会配置iis跟发布iis,今天整理一下,欢迎参考借鉴 打开控制面板 找到 程序 点击程序  找到启用或关闭windows功能 在windows服务中找到 Internet Informati ...

  4. 仿微信聊天面板制作 javascript

    先上图吧 , 点击头像更换说话对象,简单说下实现原理,html中创建一个ul用于存放所有说话的内容,对话内容是有javascript 动态生成, 主要难点:先布局好css,当时奥巴马发送时候,让这个l ...

  5. java.lang.NoSuchMethodError: javax.servlet.ServletContext.getContextPath()Ljava/lang/String;

    问题描述:在eclipse3.7中启动tomcat6时一直出现这个错误, java.lang.NoSuchMethodError: javax.servlet.ServletContext.getCo ...

  6. NOIP2013PUZZLE

    #include<cstdio> #include<cstring> #define MIN(A,B) (A)<(B)?(A):(B) using namespace s ...

  7. collections模块—— Counter

    ounter目的是用来跟踪值出现的次数.它是一个无序的容器类型,以字典的键值对形式存储,其中元素作为key,其计数作为value.计数值可以是任意的Interger(包括0和负数).Counter类和 ...

  8. 深入了解java虚拟机(JVM) 第十一章 类的加载

    一.类加载机制概述 虚拟机把描述类的数据从class文件加载到内存并对数据进行效验,解析和初始化,最终形成可以被虚拟机直接使用的java类型,这就是虚拟机的类加载机制. 二.类加载的机制 类加载的过程 ...

  9. LOJ#6038. 「雅礼集训 2017 Day5」远行(LCT)

    题面 传送门 题解 要不是因为数组版的\(LCT\)跑得实在太慢我至于去学指针版的么--而且指针版的完全看不懂啊-- 首先有两个结论 1.与一个点距离最大的点为任意一条直径的两个端点之一 2.两棵树之 ...

  10. bash脚本编程学习笔记(一)

    bash脚本语言,不同于C/C++是一种解释性语言.即在执行前不需要事先转变为可执行的二进制代码,而是每次执行时经解释器解释后执行.bash脚本语言是命令的堆砌,即按照实际需要,结合命令流程机制实现的 ...