Metrics are organized into MetricGroups.

MetricGroup

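A MetricGroup is simply a container for metrics: it can hold subgroups as well as metrics of any kind.

So its main interface consists of the registration methods: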

/**
 * A MetricGroup is a named container for {@link Metric Metrics} and further metric subgroups.
 *
 * <p>Instances of this class can be used to register new metrics with Flink and to create a nested
 * hierarchy based on the group names.
 *
 * <p>A MetricGroup is uniquely identified by its place in the hierarchy and name.
 */
public interface MetricGroup {
    <C extends Counter> C counter(String name, C counter);
    <T, G extends Gauge<T>> G gauge(String name, G gauge);
    <H extends Histogram> H histogram(String name, H histogram);
    MetricGroup addGroup(String name);
}
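To make the container idea concrete, here is a minimal sketch of how registration into nested groups might look. SimpleGroup, SimpleCounter and the Metric/Gauge stand-ins are hypothetical types written for this illustration, not the Flink classes, and reusing an existing subgroup of the same name in addGroup is an assumption of the sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-ins for illustration only; these are not the Flink classes. */
final class MetricGroupSketch {

    interface Metric {}

    interface Gauge<T> extends Metric {
        T getValue();
    }

    static final class SimpleCounter implements Metric {
        private long count;
        void inc() { count++; }
        long getCount() { return count; }
    }

    /** A named container holding metrics and subgroups. */
    static final class SimpleGroup {
        final String name;
        final Map<String, Metric> metrics = new LinkedHashMap<>();
        final Map<String, SimpleGroup> groups = new LinkedHashMap<>();

        SimpleGroup(String name) { this.name = name; }

        SimpleCounter counter(String metricName, SimpleCounter counter) {
            metrics.put(metricName, counter);
            return counter; // returning the metric lets callers register and keep it in one step
        }

        <T, G extends Gauge<T>> G gauge(String metricName, G gauge) {
            metrics.put(metricName, gauge);
            return gauge;
        }

        SimpleGroup addGroup(String groupName) {
            // assumption of this sketch: an existing subgroup with the same name is reused
            return groups.computeIfAbsent(groupName, SimpleGroup::new);
        }
    }

    public static void main(String[] args) {
        SimpleGroup root = new SimpleGroup("taskmanager");
        SimpleGroup jvm = root.addGroup("Status").addGroup("JVM");

        SimpleCounter restarts = jvm.counter("restarts", new SimpleCounter());
        restarts.inc();

        Gauge<Long> fixed = jvm.gauge("fixedValue", new Gauge<Long>() {
            @Override
            public Long getValue() { return 42L; }
        });

        System.out.println(restarts.getCount() + " " + fixed.getValue());
    }
}
```

The register methods return the metric they were given, so a caller can create, register and keep a reference in a single expression.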

 

AbstractMetricGroup

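Its main job is to implement MetricGroup. The logic is simple: both registering a metric and closing the group take a lock, so the two are mutually exclusive.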

/**
 * Abstract {@link MetricGroup} that contains key functionality for adding metrics and groups.
 */
public abstract class AbstractMetricGroup implements MetricGroup {

    /** The registry that this metrics group belongs to */
    protected final MetricRegistry registry;

    /** All metrics that are directly contained in this group */
    private final Map<String, Metric> metrics = new HashMap<>();

    /** All metric subgroups of this group */
    private final Map<String, AbstractMetricGroup> groups = new HashMap<>();

    /** The metrics scope represented by this group.
     *  For example ["host-7", "taskmanager-2", "window_word_count", "my-mapper" ]. */
    private final String[] scopeComponents; // the namespace

    /** The metrics scope represented by this group, as a concatenated string, lazily computed.
     *  For example: "host-7.taskmanager-2.window_word_count.my-mapper" */
    private String scopeString;

    // ... (other members, including the 'closed' flag, are omitted from this excerpt)

    @Override
    public <C extends Counter> C counter(String name, C counter) {
        addMetric(name, counter);
        return counter;
    }

    /**
     * Adds the given metric to the group and registers it at the registry, if the group
     * is not yet closed, and if no metric with the same name has been registered before.
     *
     * @param name the name to register the metric under
     * @param metric the metric to register
     */
    protected void addMetric(String name, Metric metric) {
        // add the metric only if the group is still open
        synchronized (this) { // take the lock
            if (!closed) {
                // immediately put without a 'contains' check to optimize the common case (no collision)
                // collisions are resolved later
                Metric prior = metrics.put(name, metric);

                // check for collisions with other metric names
                if (prior == null) {
                    // no other metric with this name yet
                    registry.register(metric, name, this);
                }
                else {
                    // we had a collision. put back the original value
                    metrics.put(name, prior);
                }
            }
        }
    }
}
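The "put first, resolve collisions afterwards" pattern in addMetric can be tried out in isolation. The sketch below is a stand-alone approximation, not the Flink class: AddMetricSketch, the boolean return value and the registrations counter (standing in for the call to registry.register) are all assumptions made so the behaviour is observable.

```java
import java.util.HashMap;
import java.util.Map;

/** Stand-alone approximation of the addMetric logic; not the Flink class. */
final class AddMetricSketch {

    private final Map<String, Object> metrics = new HashMap<>();
    private boolean closed;
    private int registrations; // stands in for calls to registry.register(...)

    /** Returns true if the metric was newly registered (the original method returns void). */
    boolean addMetric(String name, Object metric) {
        synchronized (this) {
            if (closed) {
                return false; // a closed group accepts no further metrics
            }
            // optimistic put: in the common case there is no metric under this name yet
            Object prior = metrics.put(name, metric);
            if (prior == null) {
                registrations++;
                return true;
            }
            // collision: restore the metric that was registered first
            metrics.put(name, prior);
            return false;
        }
    }

    synchronized void close() { closed = true; }

    synchronized Object get(String name) { return metrics.get(name); }

    synchronized int registrations() { return registrations; }

    public static void main(String[] args) {
        AddMetricSketch group = new AddMetricSketch();
        System.out.println(group.addMetric("numRecords", "first"));  // true
        System.out.println(group.addMetric("numRecords", "second")); // false, name already taken
        System.out.println(group.get("numRecords"));                 // first
        group.close();
        System.out.println(group.addMetric("other", "third"));       // false, group is closed
    }
}
```

Because close() and addMetric synchronize on the same object, a metric can never be added to the map after the group has been marked closed.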

 

MetricReporter

Once metrics have been collected, a reporter is needed to send them out:

/**
 * Reporters are used to export {@link Metric Metrics} to an external backend.
 *
 * <p>Reporters are instantiated via reflection and must be public, non-abstract, and have a
 * public no-argument constructor.
 */
public interface MetricReporter {

    // ------------------------------------------------------------------------
    //  life cycle
    // ------------------------------------------------------------------------

    /**
     * Configures this reporter. Since reporters are instantiated generically and hence parameter-less,
     * this method is the place where the reporters set their basic fields based on configuration values.
     *
     * <p>This method is always called first on a newly instantiated reporter.
     *
     * @param config The configuration with all parameters.
     */
    void open(MetricConfig config);

    /**
     * Closes this reporter. Should be used to close channels, streams and release resources.
     */
    void close();

    void notifyOfAddedMetric(Metric metric, String metricName, MetricGroup group);
    void notifyOfRemovedMetric(Metric metric, String metricName, MetricGroup group);
}

 

AbstractReporter implements the MetricReporter interface:

/**
 * Base interface for custom metric reporters.
 */
public abstract class AbstractReporter implements MetricReporter, CharacterFilter {
    protected final Logger log = LoggerFactory.getLogger(getClass());

    protected final Map<Gauge<?>, String> gauges = new HashMap<>();
    protected final Map<Counter, String> counters = new HashMap<>();
    protected final Map<Histogram, String> histograms = new HashMap<>();

    @Override
    public void notifyOfAddedMetric(Metric metric, String metricName, MetricGroup group) {
        final String name = group.getMetricIdentifier(metricName, this); // the group is only used to build the metric's full name

        synchronized (this) {
            if (metric instanceof Counter) {
                counters.put((Counter) metric, name);
            } else if (metric instanceof Gauge) {
                gauges.put((Gauge<?>) metric, name);
            } else if (metric instanceof Histogram) {
                histograms.put((Histogram) metric, name);
            } else {
                log.warn("Cannot add unknown metric type {}. This indicates that the reporter " +
                    "does not support this metric type.", metric.getClass().getName());
            }
        }
    }

    @Override
    public void notifyOfRemovedMetric(Metric metric, String metricName, MetricGroup group) {
        synchronized (this) {
            if (metric instanceof Counter) {
                counters.remove(metric);
            } else if (metric instanceof Gauge) {
                gauges.remove(metric);
            } else if (metric instanceof Histogram) {
                histograms.remove(metric);
            } else {
                log.warn("Cannot remove unknown metric type {}. This indicates that the reporter " +
                    "does not support this metric type.", metric.getClass().getName());
            }
        }
    }
}
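A reporter built on this bookkeeping only has to walk its maps when it is asked to report. The following sketch shows that idea with hypothetical stand-in types (InMemoryReporter, Counter, Metric); it takes the full metric name directly instead of a group, and its report() returns a snapshot map purely so the result can be inspected, which is an assumption of the sketch and not how a real reporter exports data.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical reporter sketch; the types here are stand-ins, not the Flink classes. */
final class ReporterSketch {

    interface Metric {}

    static final class Counter implements Metric {
        private long count;
        void inc() { count++; }
        long getCount() { return count; }
    }

    static final class InMemoryReporter {
        // metric instance -> full name, the same shape as the maps in AbstractReporter
        private final Map<Counter, String> counters = new HashMap<>();

        void notifyOfAddedMetric(Metric metric, String fullName) {
            synchronized (this) {
                if (metric instanceof Counter) {
                    counters.put((Counter) metric, fullName);
                }
            }
        }

        void notifyOfRemovedMetric(Metric metric) {
            synchronized (this) {
                counters.remove(metric);
            }
        }

        /** Reads every tracked counter at call time and returns name -> current value. */
        Map<String, Long> report() {
            synchronized (this) {
                Map<String, Long> snapshot = new TreeMap<>();
                for (Map.Entry<Counter, String> entry : counters.entrySet()) {
                    snapshot.put(entry.getValue(), entry.getKey().getCount());
                }
                return snapshot;
            }
        }
    }

    public static void main(String[] args) {
        InMemoryReporter reporter = new InMemoryReporter();
        Counter records = new Counter();
        reporter.notifyOfAddedMetric(records, "host-7.taskmanager-2.numRecords");
        records.inc();
        records.inc();
        System.out.println(reporter.report());
        reporter.notifyOfRemovedMetric(records);
        System.out.println(reporter.report());
    }
}
```

Keying the map by metric instance rather than by name makes removal a plain remove(metric), which is why notifyOfRemovedMetric does not need to rebuild the full name.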

 

MetricRegistry

MetricRegistry connects MetricGroups and MetricReporters.

It hands every metric that needs to be reported to the MetricReporters and starts the thread that reports them periodically.

/**
 * A MetricRegistry keeps track of all registered {@link Metric Metrics}. It serves as the
 * connection between {@link MetricGroup MetricGroups} and {@link MetricReporter MetricReporters}.
 */
public class MetricRegistry {

    private List<MetricReporter> reporters;
    private ScheduledExecutorService executor;

    private final ScopeFormats scopeFormats;

    private final char delimiter;

    /**
     * Creates a new MetricRegistry and starts the configured reporter.
     */
    public MetricRegistry(Configuration config) {
        // first parse the scope formats, these are needed for all reporters
        ScopeFormats scopeFormats;
        try {
            scopeFormats = createScopeConfig(config); // read the scope format from the config, i.e. the format of the metrics namespace
        }
        catch (Exception e) {
            LOG.warn("Failed to parse scope format, using default scope formats", e);
            scopeFormats = new ScopeFormats();
        }
        this.scopeFormats = scopeFormats;

        char delim;
        try {
            delim = config.getString(ConfigConstants.METRICS_SCOPE_DELIMITER, ".").charAt(0); // read the delimiter from the config
        } catch (Exception e) {
            LOG.warn("Failed to parse delimiter, using default delimiter.", e);
            delim = '.';
        }
        this.delimiter = delim;

        // second, instantiate any custom configured reporters
        this.reporters = new ArrayList<>();

        final String definedReporters = config.getString(ConfigConstants.METRICS_REPORTERS_LIST, null); // read the configured reporters

        if (definedReporters == null) {
            // no reporters defined
            // by default, don't report anything
            LOG.info("No metrics reporter configured, no metrics will be exposed/reported.");
            this.executor = null;
        } else {
            // we have some reporters so
            String[] namedReporters = definedReporters.split("\\s*,\\s*");
            for (String namedReporter : namedReporters) { // for each configured reporter

                DelegatingConfiguration reporterConfig = new DelegatingConfiguration(config, ConfigConstants.METRICS_REPORTER_PREFIX + namedReporter + ".");
                final String className = reporterConfig.getString(ConfigConstants.METRICS_REPORTER_CLASS_SUFFIX, null); // configured reporter class name

                try {
                    String configuredPeriod = reporterConfig.getString(ConfigConstants.METRICS_REPORTER_INTERVAL_SUFFIX, null); // configured report interval
                    TimeUnit timeunit = TimeUnit.SECONDS;
                    long period = 10;

                    if (configuredPeriod != null) {
                        try {
                            String[] interval = configuredPeriod.split(" ");
                            period = Long.parseLong(interval[0]);
                            timeunit = TimeUnit.valueOf(interval[1]);
                        }
                        catch (Exception e) {
                            LOG.error("Cannot parse report interval from config: " + configuredPeriod +
                                " - please use values like '10 SECONDS' or '500 MILLISECONDS'. " +
                                "Using default reporting interval.");
                        }
                    }

                    Class<?> reporterClass = Class.forName(className);
                    MetricReporter reporterInstance = (MetricReporter) reporterClass.newInstance(); // instantiate the reporter

                    MetricConfig metricConfig = new MetricConfig();
                    reporterConfig.addAllToProperties(metricConfig);
                    reporterInstance.open(metricConfig); // open the reporter

                    if (reporterInstance instanceof Scheduled) {
                        if (this.executor == null) {
                            executor = Executors.newSingleThreadScheduledExecutor(); // create the executor
                        }
                        LOG.info("Periodically reporting metrics in intervals of {} {} for reporter {} of type {}.", period, timeunit.name(), namedReporter, className);

                        executor.scheduleWithFixedDelay(
                            new ReporterTask((Scheduled) reporterInstance), period, period, timeunit); // schedule the periodic report
                    }
                    reporters.add(reporterInstance); // add it to the reporters list
                }
                catch (Throwable t) {
                    shutdownExecutor();
                    LOG.error("Could not instantiate metrics reporter" + namedReporter + ". Metrics might not be exposed/reported.", t);
                }
            }
        }
    }

    // ------------------------------------------------------------------------
    //  Metrics (de)registration
    // ------------------------------------------------------------------------

    /**
     * Registers a new {@link Metric} with this registry.
     *
     * @param metric the metric that was added
     * @param metricName the name of the metric
     * @param group the group that contains the metric
     */
    public void register(Metric metric, String metricName, MetricGroup group) {
        // called from AbstractMetricGroup.addMetric: a metric added to a group is handed to the reporters at the same time
        try {
            if (reporters != null) {
                for (MetricReporter reporter : reporters) {
                    if (reporter != null) {
                        reporter.notifyOfAddedMetric(metric, metricName, group); // add the metric to every reporter
                    }
                }
            }
        } catch (Exception e) {
            LOG.error("Error while registering metric.", e);
        }
    }

    /**
     * Un-registers the given {@link org.apache.flink.metrics.Metric} with this registry.
     *
     * @param metric the metric that should be removed
     * @param metricName the name of the metric
     * @param group the group that contains the metric
     */
    public void unregister(Metric metric, String metricName, MetricGroup group) {
        try {
            if (reporters != null) {
                for (MetricReporter reporter : reporters) {
                    if (reporter != null) {
                        reporter.notifyOfRemovedMetric(metric, metricName, group);
                    }
                }
            }
        } catch (Exception e) {
            LOG.error("Error while registering metric.", e);
        }
    }

    // ------------------------------------------------------------------------

    /**
     * This task is explicitly a static class, so that it does not hold any references to the enclosing
     * MetricsRegistry instance.
     *
     * This is a subtle difference, but very important: With this static class, the enclosing class instance
     * may become garbage-collectible, whereas with an anonymous inner class, the timer thread
     * (which is a GC root) will hold a reference via the timer task and its enclosing instance pointer.
     * Making the MetricsRegistry garbage collectible makes the java.util.Timer garbage collectible,
     * which acts as a fail-safe to stop the timer thread and prevents resource leaks.
     */
    private static final class ReporterTask extends TimerTask {

        private final Scheduled reporter;

        private ReporterTask(Scheduled reporter) {
            this.reporter = reporter;
        }

        @Override
        public void run() {
            try {
                reporter.report(); // the core of the task is simply calling reporter.report()
            } catch (Throwable t) {
                LOG.warn("Error while reporting metrics", t);
            }
        }
    }
}
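The interval handling in the constructor (a string such as "10 SECONDS", with a fallback of 10 seconds) is easy to exercise on its own. Below is a minimal sketch of that parsing; ReportIntervalSketch and Interval are hypothetical names. One deliberate difference from the excerpt: the excerpt assigns period before it parses the unit, so a value like "500 BOGUS" would keep 500 with the default unit, whereas this sketch only applies the parsed values once both parts are valid.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper mirroring the report-interval parsing in the constructor above. */
final class ReportIntervalSketch {

    static final class Interval {
        final long period;
        final TimeUnit unit;

        Interval(long period, TimeUnit unit) {
            this.period = period;
            this.unit = unit;
        }

        @Override
        public String toString() { return period + " " + unit; }
    }

    /** Parses values like "10 SECONDS" or "500 MILLISECONDS"; falls back to 10 SECONDS. */
    static Interval parse(String configured) {
        long period = 10;
        TimeUnit unit = TimeUnit.SECONDS;
        if (configured != null) {
            try {
                String[] parts = configured.split(" ");
                long parsedPeriod = Long.parseLong(parts[0]);
                TimeUnit parsedUnit = TimeUnit.valueOf(parts[1]);
                // only overwrite the defaults after both parts parsed successfully
                period = parsedPeriod;
                unit = parsedUnit;
            } catch (Exception e) {
                // malformed value (bad number, unknown unit, missing unit): keep the defaults
            }
        }
        return new Interval(period, unit);
    }

    public static void main(String[] args) {
        System.out.println(parse("500 MILLISECONDS"));
        System.out.println(parse(null));
        System.out.println(parse("500 BOGUS"));
    }
}
```

The resulting period and unit are what would be passed as both the initial delay and the delay between runs to scheduleWithFixedDelay.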

 

TaskManager

In the TaskManager:

associateWithJobManager
    metricsRegistry = new FlinkMetricRegistry(config.configuration)

    taskManagerMetricGroup =
      new TaskManagerMetricGroup(metricsRegistry, this.runtimeInfo.getHostname, id.toString)

    TaskManager.instantiateStatusMetrics(taskManagerMetricGroup)

This creates the metricsRegistry and the TaskManagerMetricGroup.

As you can see, instantiateStatusMetrics merely registers the TaskManager's various status metrics:

private def instantiateStatusMetrics(taskManagerMetricGroup: MetricGroup) : Unit = {
  val jvm = taskManagerMetricGroup
    .addGroup("Status")
    .addGroup("JVM")

  instantiateClassLoaderMetrics(jvm.addGroup("ClassLoader"))
  instantiateGarbageCollectorMetrics(jvm.addGroup("GarbageCollector"))
  instantiateMemoryMetrics(jvm.addGroup("Memory"))
  instantiateThreadMetrics(jvm.addGroup("Threads"))
  instantiateCPUMetrics(jvm.addGroup("CPU"))
}

private def instantiateClassLoaderMetrics(metrics: MetricGroup) {
  val mxBean = ManagementFactory.getClassLoadingMXBean // ManagementFactory hands out the MXBeans that expose the JVM's own metrics

  metrics.gauge[Long, FlinkGauge[Long]]("ClassesLoaded", new FlinkGauge[Long] {
    override def getValue: Long = mxBean.getTotalLoadedClassCount
  })
  metrics.gauge[Long, FlinkGauge[Long]]("ClassesUnloaded", new FlinkGauge[Long] {
    override def getValue: Long = mxBean.getUnloadedClassCount
  })
}
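The same MXBean lookup can be tried from plain Java without any Flink classes. This is a small sketch: JvmGaugeSketch is a hypothetical name and java.util.function.Supplier stands in for a gauge, while the ManagementFactory and ClassLoadingMXBean calls are the standard JDK API.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;
import java.util.function.Supplier;

/** Reads JVM class-loading figures through the standard MXBean; Supplier stands in for a gauge. */
final class JvmGaugeSketch {

    static Supplier<Long> classesLoaded() {
        ClassLoadingMXBean mxBean = ManagementFactory.getClassLoadingMXBean();
        // the value is read each time get() is called, not when the supplier is created
        return mxBean::getTotalLoadedClassCount;
    }

    static Supplier<Long> classesUnloaded() {
        ClassLoadingMXBean mxBean = ManagementFactory.getClassLoadingMXBean();
        return mxBean::getUnloadedClassCount;
    }

    public static void main(String[] args) {
        System.out.println("ClassesLoaded   = " + classesLoaded().get());
        System.out.println("ClassesUnloaded = " + classesUnloaded().get());
    }
}
```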

 

When a task is submitted:

submitTask
    val taskMetricGroup = taskManagerMetricGroup.addTaskForJob(tdd)

    val task = new Task(
      tdd,
      memoryManager,
      ioManager,
      network,
      bcVarManager,
      selfGateway,
      jobManagerGateway,
      config.timeout,
      libCache,
      fileCache,
      runtimeInfo,
      taskMetricGroup)

A taskMetricGroup is created for every task and passed in when the Task object is constructed.

Environment env = new RuntimeEnvironment(jobId, vertexId, executionId,
    executionConfig, taskInfo, jobConfiguration, taskConfiguration,
    userCodeClassLoader, memoryManager, ioManager,
    broadcastVariableManager, accumulatorRegistry,
    splitProvider, distributedCacheEntries,
    writers, inputGates, jobManager, taskManagerConfig, metrics, this);

// let the task code create its readers and writers
invokable.setEnvironment(env);

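Inside Task, the key step is putting this taskMetricGroup into the RuntimeEnvironment, so that the actual task logic can obtain its metrics through the RuntimeEnvironment.

StreamTask is one kind of invokable; the base class is defined as follows: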

public abstract class AbstractInvokable {

    /** The environment assigned to this invokable. */
    private Environment environment;

    /**
     * Starts the execution.
     *
     * <p>Must be overwritten by the concrete task implementation. This method
     * is called by the task manager when the actual execution of the task
     * starts.
     *
     * <p>All resources should be cleaned up when the method returns. Make sure
     * to guard the code with <code>try-finally</code> blocks where necessary.
     *
     * @throws Exception
     *         Tasks may forward their exceptions for the TaskManager to handle through failure/recovery.
     */
    public abstract void invoke() throws Exception;

    /**
     * Sets the environment of this task.
     *
     * @param environment
     *        the environment of this task
     */
    public final void setEnvironment(Environment environment) {
        this.environment = environment;
    }

    /**
     * Returns the environment of this task.
     *
     * @return The environment of this task.
     */
    public Environment getEnvironment() {
        return this.environment;
    }
}

 

So inside StreamTask, metrics can be used like this:

getEnvironment().getMetricGroup().gauge("lastCheckpointSize", new Gauge<Long>() {
    @Override
    public Long getValue() {
        return StreamTask.this.lastCheckpointSize;
    }
});
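What makes this gauge useful is that it reads the field when it is queried, not when it is registered. The sketch below shows that behaviour with hypothetical stand-ins (the Gauge interface and CheckpointingTask are written for this illustration and are not the Flink classes).

```java
/** Hypothetical stand-ins showing that a gauge reads its value lazily. */
final class LazyGaugeSketch {

    interface Gauge<T> {
        T getValue();
    }

    static final class CheckpointingTask {
        private volatile long lastCheckpointSize;

        // the anonymous class captures the enclosing task, like StreamTask.this in the snippet above
        final Gauge<Long> sizeGauge = new Gauge<Long>() {
            @Override
            public Long getValue() {
                return CheckpointingTask.this.lastCheckpointSize;
            }
        };

        void completeCheckpoint(long sizeInBytes) {
            lastCheckpointSize = sizeInBytes;
        }
    }

    public static void main(String[] args) {
        CheckpointingTask task = new CheckpointingTask();
        System.out.println(task.sizeGauge.getValue()); // 0, no checkpoint yet
        task.completeCheckpoint(1024);
        System.out.println(task.sizeGauge.getValue()); // 1024, read at query time
    }
}
```

The field is volatile in this sketch because a reporter would typically call getValue() from a different thread than the one that updates the field.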
