Starting with how Metrics are used

Flink has four types of metrics: Counters, Gauges, Histograms, and Meters.

How are metrics used? Take Counter as an example:

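    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.metrics.Counter;

    public class MyMapper extends RichMapFunction<String, String> {
        // transient: the counter is created in open(), not serialized with the function
        private transient Counter counter;

        @Override
        public void open(Configuration config) {
            // look up the MetricGroup and register a counter named "myCounter"
            this.counter = getRuntimeContext()
                .getMetricGroup()
                .counter("myCounter");
        }

        @Override
        public String map(String value) throws Exception {
            this.counter.inc(); // count every record that passes through
            return value;
        }
    }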

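getRuntimeContext().getMetricGroup() obtains the MetricGroup.

counter("myCounter") obtains a Metric instance from that MetricGroup.

So let's take a closer look at MetricGroup.

The Metric container: MetricGroup

A MetricGroup is a container for Metric objects and metric subgroups.

Calling one of the following four methods returns the Metric object and registers it by calling addMetric().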

(AbstractMetricGroup.java)

    public <C extends Counter> C counter(String name, C counter) {
        addMetric(name, counter);
        return counter;
    }

    public <T, G extends Gauge<T>> G gauge(String name, G gauge) {
        addMetric(name, gauge);
        return gauge;
    }

    public <H extends Histogram> H histogram(String name, H histogram) {
        addMetric(name, histogram);
        return histogram;
    }

    public <M extends Meter> M meter(String name, M meter) {
        addMetric(name, meter);
        return meter;
    }

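Note 1: UnregisteredMetricsGroup, another implementation of the MetricGroup interface, only returns the Metric instance and does not register it.

Note 2: ProxyMetricGroup, a third implementation of the MetricGroup interface, holds a parent MetricGroup and forwards every call to that parent.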
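
The difference between the three kinds of group can be illustrated with a small, self-contained sketch. MiniCounter, MiniGroup, RegisteringGroup, UnregisteredGroup and ProxyGroup are hypothetical stand-ins written for this illustration; they are not Flink classes and only mirror the behaviour described in the two notes.

```java
import java.util.HashMap;
import java.util.Map;

class GroupVariantsSketch {

    /** Hypothetical minimal counter; not Flink's Counter. */
    static class MiniCounter {
        private long count;
        void inc() { count++; }
        long getCount() { return count; }
    }

    /** Hypothetical stand-in for the MetricGroup interface, reduced to one method. */
    interface MiniGroup {
        MiniCounter counter(String name);
    }

    /** Registers every counter it hands out, like a regular group. */
    static class RegisteringGroup implements MiniGroup {
        final Map<String, MiniCounter> registered = new HashMap<>();

        @Override
        public MiniCounter counter(String name) {
            MiniCounter c = new MiniCounter();
            registered.put(name, c);
            return c;
        }
    }

    /** Returns a working counter but never registers it anywhere. */
    static class UnregisteredGroup implements MiniGroup {
        @Override
        public MiniCounter counter(String name) {
            return new MiniCounter();
        }
    }

    /** Forwards every call to its parent group. */
    static class ProxyGroup implements MiniGroup {
        private final MiniGroup parent;

        ProxyGroup(MiniGroup parent) { this.parent = parent; }

        @Override
        public MiniCounter counter(String name) {
            return parent.counter(name);
        }
    }

    public static void main(String[] args) {
        RegisteringGroup real = new RegisteringGroup();
        MiniGroup proxy = new ProxyGroup(real);
        proxy.counter("viaProxy").inc();               // ends up registered in 'real'
        new UnregisteredGroup().counter("lost").inc(); // usable, but registered nowhere
        System.out.println(real.registered.keySet());  // [viaProxy]
    }
}
```

User code can call counter() on any of the three without caring which one it received; only the registering group makes the metric visible to anything else.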

(AbstractMetricGroup.java) Important fields

    /** The registry that this metrics group belongs to. */
    protected final MetricRegistry registry;

    /** All metrics that are directly contained in this group. */
    private final Map<String, Metric> metrics = new HashMap<>();

    /** All metric subgroups of this group. */
    private final Map<String, AbstractMetricGroup> groups = new HashMap<>();

    /** Flag indicating whether this group has been closed. */
    private volatile boolean closed;

(AbstractMetricGroup.java)

    protected void addMetric(String name, Metric metric) {
        if (metric == null) {
            LOG.warn("Ignoring attempted registration of a metric due to being null for name {}.", name);
            return;
        }
        // add the metric only if the group is still open
        synchronized (this) {
            if (!closed) {
                // immediately put without a 'contains' check to optimize the common case (no collision)
                // collisions are resolved later
                Metric prior = metrics.put(name, metric);

                // check for collisions with other metric names
                if (prior == null) {
                    // no other metric with this name yet

                    if (groups.containsKey(name)) {
                        // we warn here, rather than failing, because metrics are tools that should not fail the
                        // program when used incorrectly
                        LOG.warn("Name collision: Adding a metric with the same name as a metric subgroup: '" +
                            name + "'. Metric might not get properly reported. " + Arrays.toString(scopeComponents));
                    }

                    registry.register(metric, name, this);
                }
                else {
                    // we had a collision. put back the original value
                    metrics.put(name, prior);

                    // we warn here, rather than failing, because metrics are tools that should not fail the
                    // program when used incorrectly
                    LOG.warn("Name collision: Group already contains a Metric with the name '" +
                        name + "'. Metric will not be reported." + Arrays.toString(scopeComponents));
                }
            }
        }
    }

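Let's walk through the addMetric() code in detail.

synchronized (this): acquires the mutex.

if (!closed): checks whether the current group has been closed.

The body of the if (!closed) block adds the Metric object being registered to the metrics map.

A small trick here: the code assumes by default that there is no key collision and puts the metric straight into the map, then checks afterwards whether a value was displaced. This helps performance: when there is no collision, it saves one map lookup.

registry.register(metric, name, this): registers the Metric with the MetricRegistry. The next section covers this in detail.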
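
The put-first, check-afterwards trick can be isolated in a few lines of plain Java. This is a minimal sketch, not Flink code: PutThenCheckSketch and its add() method are names made up for the illustration, and it relies on null values never being stored (addMetric() rejects null metrics before touching the map), so a null return from put() can only mean "no previous entry".

```java
import java.util.HashMap;
import java.util.Map;

class PutThenCheckSketch {

    private final Map<String, Object> metrics = new HashMap<>();

    /**
     * Adds the metric unless the name is already taken.
     * Returns true if the metric was added, false on a name collision.
     */
    boolean add(String name, Object metric) {
        // Optimistic path: put right away, no containsKey() beforehand.
        Object prior = metrics.put(name, metric);
        if (prior == null) {
            return true; // common case: a single map operation in total
        }
        // Collision: restore the entry that was just displaced.
        metrics.put(name, prior);
        return false;
    }

    Object get(String name) {
        return metrics.get(name);
    }

    public static void main(String[] args) {
        PutThenCheckSketch group = new PutThenCheckSketch();
        Object first = new Object();
        System.out.println(group.add("myCounter", first));        // true
        System.out.println(group.add("myCounter", new Object())); // false
        System.out.println(group.get("myCounter") == first);      // true
    }
}
```

The price is a second put() on a collision, which is acceptable because collisions are expected to be rare.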

Calling addGroup() adds a subgroup.

(AbstractMetricGroup.java)

    private AbstractMetricGroup<?> addGroup(String name, ChildType childType) {
        synchronized (this) {
            if (!closed) {
                // adding a group with the same name as a metric creates problems in many reporters/dashboards
                // we warn here, rather than failing, because metrics are tools that should not fail the
                // program when used incorrectly
                if (metrics.containsKey(name)) {
                    LOG.warn("Name collision: Adding a metric subgroup with the same name as an existing metric: '" +
                        name + "'. Metric might not get properly reported. " + Arrays.toString(scopeComponents));
                }

                AbstractMetricGroup newGroup = createChildGroup(name, childType);
                AbstractMetricGroup prior = groups.put(name, newGroup);
                if (prior == null) {
                    // no prior group with that name
                    return newGroup;
                } else {
                    // had a prior group with that name, add the prior group back
                    groups.put(name, prior);
                    return prior;
                }
            }
            else {
                // return a non-registered group that is immediately closed already
                GenericMetricGroup closedGroup = new GenericMetricGroup(registry, this, name);
                closedGroup.close();
                return closedGroup;
            }
        }
    }

    protected GenericMetricGroup createChildGroup(String name, ChildType childType) {
        switch (childType) {
            case KEY:
                return new GenericKeyMetricGroup(registry, this, name);
            default:
                return new GenericMetricGroup(registry, this, name);
        }
    }

    /**
     * Enum for indicating which child group should be created.
     * `KEY` is used to create {@link GenericKeyMetricGroup}.
     * `VALUE` is used to create {@link GenericValueMetricGroup}.
     * `GENERIC` is used to create {@link GenericMetricGroup}.
     */
    protected enum ChildType {
        KEY,
        VALUE,
        GENERIC
    }

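synchronized (this): acquires the mutex.

From createChildGroup(name, childType) through the check on prior: creates the new MetricGroup object, or returns the existing group if one with that name is already present.

Note that adding a subgroup whose name equals the name of a Metric object causes problems.

new GenericMetricGroup(registry, this, name) and the constructors called in createChildGroup(): all MetricGroup objects in the same tree share the same MetricRegistry.

closedGroup.close(): closes the MetricGroup. The close() method looks like this: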

    public void close() {
        synchronized (this) {
            if (!closed) {
                closed = true;

                // close all subgroups
                for (AbstractMetricGroup group : groups.values()) {
                    group.close();
                }
                groups.clear();

                // un-register all directly contained metrics
                for (Map.Entry<String, Metric> metric : metrics.entrySet()) {
                    registry.unregister(metric.getValue(), metric.getKey(), this);
                }
                metrics.clear();
            }
        }
    }

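synchronized (this): acquires the mutex.

close() recursively closes all subgroups and unregisters all metrics.

In MetricGroup, addMetric(), addGroup(), close(), and getAllVariables() (not shown above) all need to acquire the mutex.

The reason: to prevent resource leaks caused by adding metrics or subgroups while the group is being closed.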
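
To make the leak scenario concrete, here is a simplified sketch of the same locking discipline. TinyRegistry and TinyGroup are hypothetical classes written for this illustration, and name-collision handling is left out. Because addMetric(), addGroup() and close() all synchronize on the group, an add can never slip in between "closed = true" and the unregistration loop, so nothing stays registered after close().

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ClosableGroupSketch {

    /** Hypothetical registry that just remembers which names are currently registered. */
    static class TinyRegistry {
        final List<String> live = new ArrayList<>();
        synchronized void register(String name) { live.add(name); }
        synchronized void unregister(String name) { live.remove(name); }
    }

    static class TinyGroup {
        private final TinyRegistry registry;
        private final Map<String, Object> metrics = new HashMap<>();
        private final List<TinyGroup> subgroups = new ArrayList<>();
        private boolean closed; // guarded by 'this'

        TinyGroup(TinyRegistry registry) { this.registry = registry; }

        /** Returns false if the group is already closed, so nothing leaks into the registry. */
        synchronized boolean addMetric(String name, Object metric) {
            if (closed) {
                return false;
            }
            metrics.put(name, metric);
            registry.register(name);
            return true;
        }

        synchronized TinyGroup addGroup() {
            TinyGroup child = new TinyGroup(registry); // same registry for the whole tree
            if (closed) {
                child.close(); // hand back a group that is already closed
            } else {
                subgroups.add(child);
            }
            return child;
        }

        synchronized void close() {
            if (closed) {
                return;
            }
            closed = true;
            for (TinyGroup group : subgroups) {
                group.close(); // recurse into the subtree
            }
            subgroups.clear();
            for (String name : metrics.keySet()) {
                registry.unregister(name);
            }
            metrics.clear();
        }
    }

    public static void main(String[] args) {
        TinyRegistry registry = new TinyRegistry();
        TinyGroup root = new TinyGroup(registry);
        TinyGroup child = root.addGroup();
        root.addMetric("a", new Object());
        child.addMetric("b", new Object());
        root.close();
        System.out.println(registry.live);                      // []
        System.out.println(child.addMetric("c", new Object())); // false
    }
}
```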

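Another important MetricGroup method is public String getMetricIdentifier(String metricName, CharacterFilter filter, int reporterIndex).

It returns a unique name for a Metric that serves as its identifier.

The identifier has three parts: system scope, user scope, and metric name.

A.B.C   where A is the system scope, B is the user scope, C is the metric name, and '.' is the delimiter.

The system scope can be defined in conf/flink-conf.yaml.

The user scope is the group tree, which is defined by calling addGroup(String) (groups can be nested to multiple levels).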
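
The A.B.C layout can be sketched as follows. This is an illustration of the naming scheme only, not the actual getMetricIdentifier() implementation: NameFilter is a hypothetical stand-in for CharacterFilter, the scope values are made up, and applying the filter to every component is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class IdentifierSketch {

    /** Hypothetical stand-in for CharacterFilter: cleans up one name component. */
    interface NameFilter {
        String filter(String input);
    }

    /** Joins system scope, user scope and metric name with '.' as the delimiter. */
    static String identifier(List<String> systemScope, List<String> userScope,
                             String metricName, NameFilter filter) {
        List<String> parts = new ArrayList<>();
        for (String component : systemScope) {
            parts.add(filter.filter(component));
        }
        for (String component : userScope) {
            parts.add(filter.filter(component));
        }
        parts.add(filter.filter(metricName));
        return String.join(".", parts);
    }

    public static void main(String[] args) {
        // Made-up system scope, as it might be configured in flink-conf.yaml.
        List<String> system = Arrays.asList("host-1", "taskmanager", "tm-42");
        // User scope: one entry per nested addGroup(String) call.
        List<String> user = Arrays.asList("myGroup", "mySubGroup");
        NameFilter noSpaces = s -> s.replace(' ', '_');
        System.out.println(identifier(system, user, "my counter", noSpaces));
        // prints host-1.taskmanager.tm-42.myGroup.mySubGroup.my_counter
    }
}
```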

The bridge between MetricGroup and MetricReporter: MetricRegistry

MetricRegistry keeps track of all registered Metrics. It serves as the bridge between MetricGroup and MetricReporter.

MetricGroup's addMetric() method calls MetricRegistry's register() method:

    registry.register(metric, name, this);

MetricGroup's close() method calls MetricRegistry's unregister() method:

    registry.unregister(metric.getValue(), metric.getKey(), this);

    // ------------------------------------------------------------------------
    //  Metrics (de)registration
    // ------------------------------------------------------------------------

    @Override
    public void register(Metric metric, String metricName, AbstractMetricGroup group) {
        synchronized (lock) {
            if (isShutdown()) {
                LOG.warn("Cannot register metric, because the MetricRegistry has already been shut down.");
            } else {
                if (reporters != null) {
                    for (int i = 0; i < reporters.size(); i++) {
                        MetricReporter reporter = reporters.get(i);
                        try {
                            if (reporter != null) {
                                FrontMetricGroup front = new FrontMetricGroup<AbstractMetricGroup<?>>(i, group);
                                reporter.notifyOfAddedMetric(metric, metricName, front);
                            }
                        } catch (Exception e) {
                            LOG.warn("Error while registering metric.", e);
                        }
                    }
                }
                try {
                    if (queryService != null) {
                        MetricQueryService.notifyOfAddedMetric(queryService, metric, metricName, group);
                    }
                } catch (Exception e) {
                    LOG.warn("Error while registering metric.", e);
                }
                try {
                    if (metric instanceof View) {
                        if (viewUpdater == null) {
                            viewUpdater = new ViewUpdater(executor);
                        }
                        viewUpdater.notifyOfAddedView((View) metric);
                    }
                } catch (Exception e) {
                    LOG.warn("Error while registering metric.", e);
                }
            }
        }
    }

    @Override
    public void unregister(Metric metric, String metricName, AbstractMetricGroup group) {
        synchronized (lock) {
            if (isShutdown()) {
                LOG.warn("Cannot unregister metric, because the MetricRegistry has already been shut down.");
            } else {
                if (reporters != null) {
                    for (int i = 0; i < reporters.size(); i++) {
                        try {
                            MetricReporter reporter = reporters.get(i);
                            if (reporter != null) {
                                FrontMetricGroup front = new FrontMetricGroup<AbstractMetricGroup<?>>(i, group);
                                reporter.notifyOfRemovedMetric(metric, metricName, front);
                            }
                        } catch (Exception e) {
                            LOG.warn("Error while registering metric.", e);
                        }
                    }
                }
                try {
                    if (queryService != null) {
                        MetricQueryService.notifyOfRemovedMetric(queryService, metric);
                    }
                } catch (Exception e) {
                    LOG.warn("Error while registering metric.", e);
                }
                try {
                    if (metric instanceof View) {
                        if (viewUpdater != null) {
                            viewUpdater.notifyOfRemovedView((View) metric);
                        }
                    }
                } catch (Exception e) {
                    LOG.warn("Error while registering metric.", e);
                }
            }
        }
    }

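register() and unregister() are largely similar.

synchronized (lock): acquires the lock. The lock object is no longer this but a separate new Object(), which makes it easy to add a second lock later.

The for loop over reporters: adds the Metric to every attached MetricReporter.

The queryService block: adds the Metric to the MetricQueryService.

MetricQueryService is an actor; it serializes Metrics and writes them to an output stream.

The instanceof View block: if the Metric implements the View interface, registers it with the viewUpdater.

When a Metric class implements the View interface, the Metric can be updated at a set time interval (the viewUpdater performs the update).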
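
The fan-out in register() can be reduced to the following sketch. TinyReporter, TinyView, TinyRegistry and TickingMetric are hypothetical names, not Flink types, and updateViews() is triggered by hand here, whereas a real updater would run on a timer. It shows three points from the notes above: a dedicated lock object, each reporter wrapped in its own try/catch so one failure does not affect the others, and View metrics being collected for periodic updates.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class RegistryFanOutSketch {

    /** Hypothetical reporter callback, reduced to the "added" notification. */
    interface TinyReporter {
        void notifyOfAddedMetric(Object metric, String name);
    }

    /** Hypothetical stand-in for View: a metric that wants periodic updates. */
    interface TinyView {
        void update();
    }

    static class TinyRegistry {
        private final Object lock = new Object(); // dedicated lock object instead of 'this'
        private final List<TinyReporter> reporters = new ArrayList<>();
        private final Set<TinyView> views = new LinkedHashSet<>();

        void addReporter(TinyReporter reporter) {
            synchronized (lock) {
                reporters.add(reporter);
            }
        }

        void register(Object metric, String name) {
            synchronized (lock) {
                for (TinyReporter reporter : reporters) {
                    try {
                        reporter.notifyOfAddedMetric(metric, name);
                    } catch (Exception e) {
                        // one failing reporter must not stop the others
                        System.err.println("Error while registering metric: " + e);
                    }
                }
                if (metric instanceof TinyView) {
                    views.add((TinyView) metric);
                }
            }
        }

        /** Called by hand in this sketch; stands in for the periodic updater. */
        void updateViews() {
            synchronized (lock) {
                for (TinyView view : views) {
                    view.update();
                }
            }
        }
    }

    /** A view metric that counts how often it was updated. */
    static class TickingMetric implements TinyView {
        int ticks;

        @Override
        public void update() { ticks++; }
    }

    public static void main(String[] args) {
        TinyRegistry registry = new TinyRegistry();
        List<String> seen = new ArrayList<>();
        registry.addReporter((m, n) -> { throw new IllegalStateException("broken reporter"); });
        registry.addReporter((m, n) -> seen.add(n));

        TickingMetric rate = new TickingMetric();
        registry.register(rate, "rate");
        registry.updateViews();
        System.out.println(seen + " ticks=" + rate.ticks); // [rate] ticks=1
    }
}
```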

MetricReporter

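A MetricReporter exports Metrics to an external backend.

The external backend's parameters can be set in conf/flink-conf.yaml.

Multiple external backends can be configured at the same time.

The MetricReporter interface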

    public interface MetricReporter {

        // ------------------------------------------------------------------------
        //  life cycle
        // ------------------------------------------------------------------------

        void open(MetricConfig config);

        void close();

        // ------------------------------------------------------------------------
        //  adding / removing metrics
        // ------------------------------------------------------------------------

        void notifyOfAddedMetric(Metric metric, String metricName, MetricGroup group);

        void notifyOfRemovedMetric(Metric metric, String metricName, MetricGroup group);
    }

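open(MetricConfig config): configures this reporter.

Because the reporter's constructor takes no arguments, this method is used to initialize the reporter's fields.

This method is always called after the object is constructed.

close(): closes this reporter.

This method should close channels and streams and release resources.

notifyOfAddedMetric() and notifyOfRemovedMetric(): add and remove metrics.

A regular reporter class also needs to implement the Scheduled interface to report the current measurements.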

    public interface Scheduled {

        void report();
    }

report(): called periodically by the metric registry to report the current measurements.
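
Putting the two interfaces together, a reporter might be shaped roughly like the sketch below. It deliberately avoids Flink types so that it runs on its own: TinyReporter and TinyScheduled are simplified, hypothetical versions of the interfaces above, metrics are modelled as LongSupplier, java.util.Properties stands in for MetricConfig, and the "prefix" key is invented for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.function.LongSupplier;

class ReporterSketch {

    /** Hypothetical, simplified version of the reporter interface. */
    interface TinyReporter {
        void open(Properties config);
        void close();
        void notifyOfAddedMetric(LongSupplier metric, String metricName);
        void notifyOfRemovedMetric(LongSupplier metric, String metricName);
    }

    /** Hypothetical, simplified version of the scheduling interface. */
    interface TinyScheduled {
        void report();
    }

    /** Collects "name=value" lines in memory instead of talking to a real backend. */
    static class InMemoryReporter implements TinyReporter, TinyScheduled {
        private final Map<String, LongSupplier> metrics = new LinkedHashMap<>();
        final List<String> sink = new ArrayList<>();
        private String prefix = "";

        // The implicit no-arg constructor does nothing; configuration arrives through open().
        @Override
        public synchronized void open(Properties config) {
            prefix = config.getProperty("prefix", "");
        }

        @Override
        public synchronized void close() {
            metrics.clear(); // a real reporter would close channels and streams here
        }

        @Override
        public synchronized void notifyOfAddedMetric(LongSupplier metric, String metricName) {
            metrics.put(metricName, metric);
        }

        @Override
        public synchronized void notifyOfRemovedMetric(LongSupplier metric, String metricName) {
            metrics.remove(metricName);
        }

        @Override
        public synchronized void report() {
            // read the current value of every known metric and emit it
            for (Map.Entry<String, LongSupplier> e : metrics.entrySet()) {
                sink.add(prefix + e.getKey() + "=" + e.getValue().getAsLong());
            }
        }
    }

    public static void main(String[] args) {
        InMemoryReporter reporter = new InMemoryReporter();
        Properties config = new Properties();
        config.setProperty("prefix", "flink.");
        reporter.open(config);

        long[] count = {0};
        reporter.notifyOfAddedMetric(() -> count[0], "myCounter");
        count[0] = 3;
        reporter.report(); // the registry would call this periodically
        System.out.println(reporter.sink); // [flink.myCounter=3]
        reporter.close();
    }
}
```

The reporter only stores references in the notify callbacks and reads the values in report(), so each report reflects the metric's value at that moment.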
