Metrics是以MetricsGroup来组织的

MetricGroup

MetricGroup

这就是个metric容器,里面可以放subGroup,或者各种metric

所以主要的接口就是注册,

/**
* A MetricGroup is a named container for {@link Metric Metrics} and further metric subgroups.
*
* <p>Instances of this class can be used to register new metrics with Flink and to create a nested
* hierarchy based on the group names.
*
* <p>A MetricGroup is uniquely identified by it's place in the hierarchy and name.
*/
public interface MetricGroup {
<C extends Counter> C counter(int name, C counter);
<T, G extends Gauge<T>> G gauge(int name, G gauge);
<H extends Histogram> H histogram(String name, H histogram);
MetricGroup addGroup(String name);
}

 

AbstractMetricGroup

关键是实现MetricGroup,逻辑很简单,在注册或close的时候都需要加锁互斥

/**
* Abstract {@link MetricGroup} that contains key functionality for adding metrics and groups.
*
*/ public abstract class AbstractMetricGroup implements MetricGroup { /** The registry that this metrics group belongs to */
protected final MetricRegistry registry; /** All metrics that are directly contained in this group */
private final Map<String, Metric> metrics = new HashMap<>(); /** All metric subgroups of this group */
private final Map<String, AbstractMetricGroup> groups = new HashMap<>(); /** The metrics scope represented by this group.
* For example ["host-7", "taskmanager-2", "window_word_count", "my-mapper" ]. */
private final String[] scopeComponents; //命名空间 /** The metrics scope represented by this group, as a concatenated string, lazily computed.
* For example: "host-7.taskmanager-2.window_word_count.my-mapper" */
private String scopeString; @Override
public <C extends Counter> C counter(String name, C counter) {
addMetric(name, counter);
return counter;
} /**
* Adds the given metric to the group and registers it at the registry, if the group
* is not yet closed, and if no metric with the same name has been registered before.
*
* @param name the name to register the metric under
* @param metric the metric to register
*/
protected void addMetric(String name, Metric metric) {
// add the metric only if the group is still open
synchronized (this) { //加锁
if (!closed) {
// immediately put without a 'contains' check to optimize the common case (no collition)
// collisions are resolved later
Metric prior = metrics.put(name, metric); // check for collisions with other metric names
if (prior == null) {
// no other metric with this name yet registry.register(metric, name, this);
}
else {
// we had a collision. put back the original value
metrics.put(name, prior); }
}
}
}
}

 

MetricReporter

采集好的Metrics需要用reporter才能发送出去,

/**
* Reporters are used to export {@link Metric Metrics} to an external backend.
*
* <p>Reporters are instantiated via reflection and must be public, non-abstract, and have a
* public no-argument constructor.
*/
public interface MetricReporter { // ------------------------------------------------------------------------
// life cycle
// ------------------------------------------------------------------------ /**
* Configures this reporter. Since reporters are instantiated generically and hence parameter-less,
* this method is the place where the reporters set their basic fields based on configuration values.
*
* <p>This method is always called first on a newly instantiated reporter.
*
* @param config The configuration with all parameters.
*/
void open(MetricConfig config); /**
* Closes this reporter. Should be used to close channels, streams and release resources.
*/
void close(); void notifyOfAddedMetric(Metric metric, String metricName, MetricGroup group);
void notifyOfRemovedMetric(Metric metric, String metricName, MetricGroup group);
}

 

AbstractReporter实现MetricReport接口,

/**
* Base interface for custom metric reporters.
*/
public abstract class AbstractReporter implements MetricReporter, CharacterFilter {
protected final Logger log = LoggerFactory.getLogger(getClass()); protected final Map<Gauge<?>, String> gauges = new HashMap<>();
protected final Map<Counter, String> counters = new HashMap<>();
protected final Map<Histogram, String> histograms = new HashMap<>(); @Override
public void notifyOfAddedMetric(Metric metric, String metricName, MetricGroup group) {
final String name = group.getMetricIdentifier(metricName, this); //group只是用来获取metrics完整的name synchronized (this) {
if (metric instanceof Counter) {
counters.put((Counter) metric, name);
} else if (metric instanceof Gauge) {
gauges.put((Gauge<?>) metric, name);
} else if (metric instanceof Histogram) {
histograms.put((Histogram) metric, name);
} else {
log.warn("Cannot add unknown metric type {}. This indicates that the reporter " +
"does not support this metric type.", metric.getClass().getName());
}
}
} @Override
public void notifyOfRemovedMetric(Metric metric, String metricName, MetricGroup group) {
synchronized (this) {
if (metric instanceof Counter) {
counters.remove(metric);
} else if (metric instanceof Gauge) {
gauges.remove(metric);
} else if (metric instanceof Histogram) {
histograms.remove(metric);
} else {
log.warn("Cannot remove unknown metric type {}. This indicates that the reporter " +
"does not support this metric type.", metric.getClass().getName());
}
}
}
}

 

MetricRegistry

MetricRegistry用于连接MetricGroups和MetricReporters,

会把需要report的metric加到MetricReporters,并启动定时的report线程

/**
* A MetricRegistry keeps track of all registered {@link Metric Metrics}. It serves as the
* connection between {@link MetricGroup MetricGroups} and {@link MetricReporter MetricReporters}.
*/
public class MetricRegistry { private List<MetricReporter> reporters;
private ScheduledExecutorService executor; private final ScopeFormats scopeFormats; private final char delimiter; /**
* Creates a new MetricRegistry and starts the configured reporter.
*/
public MetricRegistry(Configuration config) {
// first parse the scope formats, these are needed for all reporters
ScopeFormats scopeFormats;
try {
scopeFormats = createScopeConfig(config); //从配置中读到scope的格式,即监控数据的namespace的格式是什么
}
catch (Exception e) {
LOG.warn("Failed to parse scope format, using default scope formats", e);
scopeFormats = new ScopeFormats();
}
this.scopeFormats = scopeFormats; char delim;
try {
delim = config.getString(ConfigConstants.METRICS_SCOPE_DELIMITER, ".").charAt(0); //从配置里面读出分隔符
} catch (Exception e) {
LOG.warn("Failed to parse delimiter, using default delimiter.", e);
delim = '.';
}
this.delimiter = delim; // second, instantiate any custom configured reporters
this.reporters = new ArrayList<>(); final String definedReporters = config.getString(ConfigConstants.METRICS_REPORTERS_LIST, null); //读出配置的Reporters if (definedReporters == null) {
// no reporters defined
// by default, don't report anything
LOG.info("No metrics reporter configured, no metrics will be exposed/reported.");
this.executor = null;
} else {
// we have some reporters so
String[] namedReporters = definedReporters.split("\\s*,\\s*");
for (String namedReporter : namedReporters) { //对于配置的每个reporter DelegatingConfiguration reporterConfig = new DelegatingConfiguration(config, ConfigConstants.METRICS_REPORTER_PREFIX + namedReporter + ".");
final String className = reporterConfig.getString(ConfigConstants.METRICS_REPORTER_CLASS_SUFFIX, null); //reporter class名配置 try {
String configuredPeriod = reporterConfig.getString(ConfigConstants.METRICS_REPORTER_INTERVAL_SUFFIX, null); //report interval配置
TimeUnit timeunit = TimeUnit.SECONDS;
long period = 10; if (configuredPeriod != null) {
try {
String[] interval = configuredPeriod.split(" ");
period = Long.parseLong(interval[0]);
timeunit = TimeUnit.valueOf(interval[1]);
}
catch (Exception e) {
LOG.error("Cannot parse report interval from config: " + configuredPeriod +
" - please use values like '10 SECONDS' or '500 MILLISECONDS'. " +
"Using default reporting interval.");
}
} Class<?> reporterClass = Class.forName(className);
MetricReporter reporterInstance = (MetricReporter) reporterClass.newInstance(); //实例化reporter MetricConfig metricConfig = new MetricConfig();
reporterConfig.addAllToProperties(metricConfig);
reporterInstance.open(metricConfig); //open reporter if (reporterInstance instanceof Scheduled) {
if (this.executor == null) {
executor = Executors.newSingleThreadScheduledExecutor(); //创建Executor
}
LOG.info("Periodically reporting metrics in intervals of {} {} for reporter {} of type {}.", period, timeunit.name(), namedReporter, className); executor.scheduleWithFixedDelay(
new ReporterTask((Scheduled) reporterInstance), period, period, timeunit); //Scheduled report
}
reporters.add(reporterInstance); //加入reporters列表
}
catch (Throwable t) {
shutdownExecutor();
LOG.error("Could not instantiate metrics reporter" + namedReporter + ". Metrics might not be exposed/reported.", t);
}
}
}
} // ------------------------------------------------------------------------
// Metrics (de)registration
// ------------------------------------------------------------------------ /**
* Registers a new {@link Metric} with this registry.
*
* @param metric the metric that was added
* @param metricName the name of the metric
* @param group the group that contains the metric
*/
public void register(Metric metric, String metricName, MetricGroup group) { //在AbstractMetricGroup.addMetric中被调用,metric被加到group的同时也会加到reporter中
        try {
if (reporters != null) {
for (MetricReporter reporter : reporters) {
if (reporter != null) {
reporter.notifyOfAddedMetric(metric, metricName, group); //把metric加到每个reporters上面
}
}
}
} catch (Exception e) {
LOG.error("Error while registering metric.", e);
}
} /**
* Un-registers the given {@link org.apache.flink.metrics.Metric} with this registry.
*
* @param metric the metric that should be removed
* @param metricName the name of the metric
* @param group the group that contains the metric
*/
public void unregister(Metric metric, String metricName, MetricGroup group) {
try {
if (reporters != null) {
for (MetricReporter reporter : reporters) {
if (reporter != null) {
reporter.notifyOfRemovedMetric(metric, metricName, group);
}
}
}
} catch (Exception e) {
LOG.error("Error while registering metric.", e);
}
} // ------------------------------------------------------------------------ /**
* This task is explicitly a static class, so that it does not hold any references to the enclosing
* MetricsRegistry instance.
*
* This is a subtle difference, but very important: With this static class, the enclosing class instance
* may become garbage-collectible, whereas with an anonymous inner class, the timer thread
* (which is a GC root) will hold a reference via the timer task and its enclosing instance pointer.
* Making the MetricsRegistry garbage collectible makes the java.util.Timer garbage collectible,
* which acts as a fail-safe to stop the timer thread and prevents resource leaks.
*/
private static final class ReporterTask extends TimerTask { private final Scheduled reporter; private ReporterTask(Scheduled reporter) {
this.reporter = reporter;
} @Override
public void run() {
try {
reporter.report(); //Task的核心就是调用reporter.report
} catch (Throwable t) {
LOG.warn("Error while reporting metrics", t);
}
}
}
}

 

TaskManager

在TaskManager中,

associateWithJobManager
metricsRegistry = new FlinkMetricRegistry(config.configuration)

taskManagerMetricGroup =
new TaskManagerMetricGroup(metricsRegistry, this.runtimeInfo.getHostname, id.toString) TaskManager.instantiateStatusMetrics(taskManagerMetricGroup)

创建metricsRegistry 和TaskManagerMetricGroup

可以看到instantiateStatusMetrics,只是注册各种taskManager的status metrics,

private def instantiateStatusMetrics(taskManagerMetricGroup: MetricGroup) : Unit = {
val jvm = taskManagerMetricGroup
.addGroup("Status")
.addGroup("JVM") instantiateClassLoaderMetrics(jvm.addGroup("ClassLoader"))
instantiateGarbageCollectorMetrics(jvm.addGroup("GarbageCollector"))
instantiateMemoryMetrics(jvm.addGroup("Memory"))
instantiateThreadMetrics(jvm.addGroup("Threads"))
instantiateCPUMetrics(jvm.addGroup("CPU"))
} private def instantiateClassLoaderMetrics(metrics: MetricGroup) {
val mxBean = ManagementFactory.getClassLoadingMXBean //从ManagementFactory可以取出表示JVM指标的MXBean metrics.gauge[Long, FlinkGauge[Long]]("ClassesLoaded", new FlinkGauge[Long] {
override def getValue: Long = mxBean.getTotalLoadedClassCount
})
metrics.gauge[Long, FlinkGauge[Long]]("ClassesUnloaded", new FlinkGauge[Long] {
override def getValue: Long = mxBean.getUnloadedClassCount
})
}

 

在submitTask的时候,

submitTask
  val taskMetricGroup = taskManagerMetricGroup.addTaskForJob(tdd)

  val task = new Task(
tdd,
memoryManager,
ioManager,
network,
bcVarManager,
selfGateway,
jobManagerGateway,
config.timeout,
libCache,
fileCache,
runtimeInfo,
taskMetricGroup)

看到会为每个task,创建taskMetricGroup

并在创建Task对象的时候传入该对象,

Environment env = new RuntimeEnvironment(jobId, vertexId, executionId,
executionConfig, taskInfo, jobConfiguration, taskConfiguration,
userCodeClassLoader, memoryManager, ioManager,
broadcastVariableManager, accumulatorRegistry,
splitProvider, distributedCacheEntries,
writers, inputGates, jobManager, taskManagerConfig, metrics, this); // let the task code create its readers and writers
invokable.setEnvironment(env);

在Task中, 关键的就是把这个taskMetricGroup,加入RuntimeEnvironment,这样在实际逻辑中,就可以通过RuntimeEnvironment获取到metrics

而StreamTask就是一种Invokable,接口定义如下

public abstract class AbstractInvokable {

    /** The environment assigned to this invokable. */
private Environment environment; /**
* Starts the execution.
*
* <p>Must be overwritten by the concrete task implementation. This method
* is called by the task manager when the actual execution of the task
* starts.
*
* <p>All resources should be cleaned up when the method returns. Make sure
* to guard the code with <code>try-finally</code> blocks where necessary.
*
* @throws Exception
* Tasks may forward their exceptions for the TaskManager to handle through failure/recovery.
*/
public abstract void invoke() throws Exception; /**
* Sets the environment of this task.
*
* @param environment
* the environment of this task
*/
public final void setEnvironment(Environment environment) {
this.environment = environment;
} /**
* Returns the environment of this task.
*
* @return The environment of this task.
*/
public Environment getEnvironment() {
return this.environment;
}
}

 

所以在StreamTask里面可以这样使用metrics,

getEnvironment().getMetricGroup().gauge("lastCheckpointSize", new Gauge<Long>() {
@Override
public Long getValue() {
return StreamTask.this.lastCheckpointSize;
}
});

Flink - metrics的更多相关文章

  1. Flink Metrics 源码解析

    Flink Metrics 有如下模块: Flink Metrics 源码解析 -- Flink-metrics-core Flink Metrics 源码解析 -- Flink-metrics-da ...

  2. 深入理解Flink ---- Metrics的内部结构

    从Metrics的使用说起 Flink的Metrics种类有四种Counters, Gauges, Histograms和Meters. 如何使用Metrics呢? 以Counter为例, publi ...

  3. Flink – metrics V1.2

    WebRuntimeMonitor   .GET("/jobs/:jobid/vertices/:vertexid/metrics", handler(new JobVertexM ...

  4. Apache Flink 进阶(八):详解 Metrics 原理与实战

    本文由 Apache Flink Contributor 刘彪分享,本文对两大问题进行了详细的介绍,即什么是 Metrics.如何使用 Metrics,并对 Metrics 监控实战进行解释说明. 什 ...

  5. Flink写入kafka时,只写入kafka的部分Partitioner,无法写所有的Partitioner问题

    1. 写在前面 在利用flink实时计算的时候,往往会从kafka读取数据写入数据到kafka,但会发现当kafka多个Partitioner时,特别在P量级数据为了kafka的性能kafka的节点有 ...

  6. flink metric库的使用和自定义metric-reporter

    简单介绍 flink内部实现了一套metric数据收集库. 同时flink自身系统有一些固定的metric数据, 包括系统的一些指标,CPU,内存, IO 或者各个task运行的一些指标.具体包含那些 ...

  7. Flink知识点

    1. Flink.Storm.Sparkstreaming对比 Storm只支持流处理任务,数据是一条一条的源源不断地处理,而MapReduce.spark只支持批处理任务,spark-streami ...

  8. Flink 灵魂两百问,这谁顶得住?

    Flink 学习 https://github.com/zhisheng17/flink-learning 麻烦路过的各位亲给这个项目点个 star,太不易了,写了这么多,算是对我坚持下来的一种鼓励吧 ...

  9. Flink 从0到1学习 —— Flink 中如何管理配置?

    前言 如果你了解 Apache Flink 的话,那么你应该熟悉该如何像 Flink 发送数据或者如何从 Flink 获取数据.但是在某些情况下,我们需要将配置数据发送到 Flink 集群并从中接收一 ...

随机推荐

  1. Middleware In ASP.NET Core

    中间件简介 ASP.NET Core 由很多中间件构成,实现了一个HTTP请求管道(pipeline). Request的Response的管道可以看成一个Push Stack 和 Pop Stack ...

  2. three.js 显示一个绿色的正方体

    第一个框架的效果是显示一个绿色的正方体 <!DOCTYPE html> <html> <head> <title></title> < ...

  3. 【Oracle】使用Oracle的v$sql视图查看近段时间执行的SQL语句

    SELECT * FROM v$sql ORDER BY first_load_time DESC;

  4. 51Nod 1021 石子合并 Label:Water DP

    N堆石子摆成一条线.现要将石子有次序地合并成一堆.规定每次只能选相邻的2堆石子合并成新的一堆,并将新的一堆石子数记为该次合并的代价.计算将N堆石子合并成一堆的最小代价.   例如: 1 2 3 4,有 ...

  5. 20161004 NOIP 模拟赛 T1 解题报告

    第1题  小麦亩产一千八 [问题描述] “有了金坷垃,肥料一袋能顶两袋撒,小麦亩产一千八,吸收两米下的氮磷钾……”,话说HYSBZ(Hengyang School for Boys & Zy) ...

  6. url 转码

    //URL解码 //-(NSString *)URLDecodedString:(NSString *)str //{ // NSString *decodedString=(__bridge_tra ...

  7. Flex与.net进行URL参数传递编码处理

    在JS中用到的三种编码方式escape 对应于Flex中是一样的,并且支持相互的解码 var a:String = "超越梦想#"; trace(escape(a)); //%u8 ...

  8. php 实现推技术comet(转)

    实现实时通信一般有两种方式:socket或comet.socket是比较好的解决方案,问题在于不是所有的浏览器都兼容,服务器端实现起来也稍微有点麻烦.相比之下,comet(基于HTTP长连接的&quo ...

  9. ASP.NET生成静态方法

    CMS系统如果新闻多了,全部生成静态的话.不现实,而且占用空间比较大.那么只生成网站首页是必须的了,下面列出JCMS首页生成静态的方法.换一种思路其实更简单. 当点击生成首页静态的时候.去获取动态首页 ...

  10. [LintCode] Left Pad 左填充

    You know what, left pad is javascript package and referenced by React: Github link One day his autho ...