Reading accumulators

JobManager

When a job finishes, the JobManager aggregates the accumulator values:

newJobStatus match {
  case JobStatus.FINISHED =>
    try {
      val accumulatorResults = executionGraph.getAccumulatorsSerialized()
      val result = new SerializedJobExecutionResult(
        jobID,
        jobInfo.duration,
        accumulatorResults)

      jobInfo.client ! decorateMessage(JobResultSuccess(result))
    } // ... (error handling omitted in this excerpt)

 

When the client requests the accumulators:

public Map<String, Object> getAccumulators(JobID jobID, ClassLoader loader) throws Exception {
    ActorGateway jobManagerGateway = getJobManagerGateway();

    Future<Object> response;
    try {
        response = jobManagerGateway.ask(new RequestAccumulatorResults(jobID), timeout);
    } catch (Exception e) {
        throw new Exception("Failed to query the job manager gateway for accumulators.", e);
    }
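The rest of the method (truncated above) blocks on the future and unpacks the reply. A rough sketch of that continuation, assuming the JobManager answers with the AccumulatorResultsFound message shown below, and that its serialized map is exposed by an accessor like result() (the accessor name is an assumption, not verified source):

// Sketch of the continuation; the result() accessor name is assumed.
Object result = Await.result(response, timeout.duration());
if (result instanceof AccumulatorResultsFound) {
    Map<String, SerializedValue<Object>> serialized =
            ((AccumulatorResultsFound) result).result();

    Map<String, Object> accumulators = new HashMap<String, Object>();
    for (Map.Entry<String, SerializedValue<Object>> entry : serialized.entrySet()) {
        // deserialize with the caller-provided class loader (user-defined accumulator types)
        accumulators.put(entry.getKey(), entry.getValue().deserializeValue(loader));
    }
    return accumulators;
}
throw new Exception("Failed to fetch accumulators for job " + jobID);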

 

The message arrives at the JobManager; if the job is no longer in currentJobs, the request is forwarded to the archive:

case message: AccumulatorMessage => handleAccumulatorMessage(message)

private def handleAccumulatorMessage(message: AccumulatorMessage): Unit = {
  message match {
    case RequestAccumulatorResults(jobID) =>
      try {
        currentJobs.get(jobID) match {
          case Some((graph, jobInfo)) =>
            val accumulatorValues = graph.getAccumulatorsSerialized()
            sender() ! decorateMessage(AccumulatorResultsFound(jobID, accumulatorValues))
          case None =>
            archive.forward(message)
        }
      } // ... (catch clause and other cases omitted)

 

ExecutionGraph

Fetching the accumulator values:

/**
 * Gets a serialized accumulator map.
 * @return The accumulator map with serialized accumulator values.
 * @throws IOException
 */
public Map<String, SerializedValue<Object>> getAccumulatorsSerialized() throws IOException {
    Map<String, Accumulator<?, ?>> accumulatorMap = aggregateUserAccumulators();

    Map<String, SerializedValue<Object>> result = new HashMap<String, SerializedValue<Object>>();
    for (Map.Entry<String, Accumulator<?, ?>> entry : accumulatorMap.entrySet()) {
        result.put(entry.getKey(), new SerializedValue<Object>(entry.getValue().getLocalValue()));
    }
    return result;
}
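For reference, getLocalValue() here, and clone() and merge() used in the merge logic further down, all come from Flink's Accumulator contract, simplified here as a sketch:

public interface Accumulator<V, R extends Serializable> extends Serializable, Cloneable {
    void add(V value);                   // add one value to the local (per-task) aggregate
    R getLocalValue();                   // read the local result, as used above
    void resetLocal();                   // reset the local aggregate
    void merge(Accumulator<V, R> other); // fold another task's accumulator into this one
    Accumulator<V, R> clone();
}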

 

Aggregating the user accumulators across Executions:

/**
 * Merges all accumulator results from the tasks previously executed in the Executions.
 * @return The accumulator map
 */
public Map<String, Accumulator<?, ?>> aggregateUserAccumulators() {
    Map<String, Accumulator<?, ?>> userAccumulators = new HashMap<String, Accumulator<?, ?>>();

    for (ExecutionVertex vertex : getAllExecutionVertices()) {
        Map<String, Accumulator<?, ?>> next = vertex.getCurrentExecutionAttempt().getUserAccumulators();
        if (next != null) {
            AccumulatorHelper.mergeInto(userAccumulators, next);
        }
    }
    return userAccumulators;
}

The actual merge logic:

public static void mergeInto(Map<String, Accumulator<?, ?>> target, Map<String, Accumulator<?, ?>> toMerge) {
    for (Map.Entry<String, Accumulator<?, ?>> otherEntry : toMerge.entrySet()) {
        Accumulator<?, ?> ownAccumulator = target.get(otherEntry.getKey());
        if (ownAccumulator == null) {
            // Create initial counter (copy!)
            target.put(otherEntry.getKey(), otherEntry.getValue().clone());
        }
        else {
            // Both should have the same type
            AccumulatorHelper.compareAccumulatorTypes(otherEntry.getKey(),
                    ownAccumulator.getClass(), otherEntry.getValue().getClass());
            // Merge target counter with other counter
            mergeSingle(ownAccumulator, otherEntry.getValue());
        }
    }
}
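mergeSingle is not shown in this excerpt; conceptually it just casts both sides to a common generic type and delegates to Accumulator.merge. A minimal sketch (the generics trick is an assumption about the implementation, but the observable behavior matches):

private static <V, R extends Serializable> void mergeSingle(Accumulator<?, ?> target,
                                                            Accumulator<?, ?> toMerge) {
    // safe only because compareAccumulatorTypes has already verified both classes match
    @SuppressWarnings("unchecked")
    Accumulator<V, R> typedTarget = (Accumulator<V, R>) target;

    @SuppressWarnings("unchecked")
    Accumulator<V, R> typedToMerge = (Accumulator<V, R>) toMerge;

    typedTarget.merge(typedToMerge);
}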

 

Updating accumulators

JobManager

On receiving a Heartbeat from a TaskManager, which carries accumulator snapshots:

case Heartbeat(instanceID, metricsReport, accumulators) =>
  updateAccumulators(accumulators)

By JobID, each snapshot is routed to its ExecutionGraph; the update runs in a future so the actor thread is not blocked:

private def updateAccumulators(accumulators: Seq[AccumulatorSnapshot]) = {
  accumulators foreach {
    case accumulatorEvent =>
      currentJobs.get(accumulatorEvent.getJobID) match {
        case Some((jobGraph, jobInfo)) =>
          future {
            jobGraph.updateAccumulators(accumulatorEvent)
          }(context.dispatcher)
        case None =>
          // ignore accumulator values for old job
      }
  }
}

By ExecutionAttemptID, the snapshot is then applied to the corresponding Execution:

/**
 * Updates the accumulators during the runtime of a job. Final accumulator results are transferred
 * through the UpdateTaskExecutionState message.
 * @param accumulatorSnapshot The serialized flink and user-defined accumulators
 */
public void updateAccumulators(AccumulatorSnapshot accumulatorSnapshot) {
    Map<AccumulatorRegistry.Metric, Accumulator<?, ?>> flinkAccumulators;
    Map<String, Accumulator<?, ?>> userAccumulators;
    try {
        flinkAccumulators = accumulatorSnapshot.deserializeFlinkAccumulators();
        userAccumulators = accumulatorSnapshot.deserializeUserAccumulators(userClassLoader);

        ExecutionAttemptID execID = accumulatorSnapshot.getExecutionAttemptID();
        Execution execution = currentExecutions.get(execID);
        if (execution != null) {
            execution.setAccumulators(flinkAccumulators, userAccumulators);
        }
    } // ... (catch clause omitted in this excerpt)
}

On the Execution side, the values are applied directly as long as the state is not terminal:

/**
 * Update accumulators (discarded when the Execution has already been terminated).
 * @param flinkAccumulators the flink internal accumulators
 * @param userAccumulators the user accumulators
 */
public void setAccumulators(Map<AccumulatorRegistry.Metric, Accumulator<?, ?>> flinkAccumulators,
                            Map<String, Accumulator<?, ?>> userAccumulators) {
    synchronized (accumulatorLock) {
        if (!state.isTerminal()) {
            this.flinkAccumulators = flinkAccumulators;
            this.userAccumulators = userAccumulators;
        }
    }
}

 

Now let's see how the TaskManager collects the accumulators and sends the heartbeat:

/**
 * Sends a heartbeat message to the JobManager (if connected) with the current
 * metrics report.
 */
protected def sendHeartbeatToJobManager(): Unit = {
  try {
    val metricsReport: Array[Byte] = metricRegistryMapper.writeValueAsBytes(metricRegistry)

    val accumulatorEvents =
      scala.collection.mutable.Buffer[AccumulatorSnapshot]()

    runningTasks foreach {
      case (execID, task) =>
        val registry = task.getAccumulatorRegistry
        val accumulators = registry.getSnapshot
        accumulatorEvents.append(accumulators)
    }

    currentJobManager foreach {
      jm => jm ! decorateMessage(Heartbeat(instanceID, metricsReport, accumulatorEvents))
    }
  } // ... (catch clause omitted in this excerpt)
}

As shown, the accumulators of every running task are gathered into accumulatorEvents and shipped to the JobManager inside the Heartbeat message.

 

A task's accumulators are obtained via task.getAccumulatorRegistry.getSnapshot.

Let's look at AccumulatorRegistry:
/**
 * Main accumulator registry which encapsulates internal and user-defined accumulators.
 */
public class AccumulatorRegistry {

    protected static final Logger LOG = LoggerFactory.getLogger(AccumulatorRegistry.class);

    protected final JobID jobID;               // the job these accumulators belong to
    protected final ExecutionAttemptID taskID; // the task attempt they belong to

    /* Flink's internal Accumulator values stored for the executing task. */
    private final Map<Metric, Accumulator<?, ?>> flinkAccumulators =
            new HashMap<Metric, Accumulator<?, ?>>();

    /* User-defined Accumulator values stored for the executing task. */
    private final Map<String, Accumulator<?, ?>> userAccumulators = new HashMap<>();

    /* The reporter reference that is handed to the reporting tasks. */
    private final ReadWriteReporter reporter;

    /**
     * Creates a snapshot of this accumulator registry.
     * @return a serialized accumulator map
     */
    public AccumulatorSnapshot getSnapshot() {
        try {
            return new AccumulatorSnapshot(jobID, taskID, flinkAccumulators, userAccumulators);
        } catch (IOException e) {
            LOG.warn("Failed to serialize accumulators for task.", e);
            return null;
        }
    }
}

The snapshot logic is straightforward as well:

public AccumulatorSnapshot(JobID jobID, ExecutionAttemptID executionAttemptID,
                           Map<AccumulatorRegistry.Metric, Accumulator<?, ?>> flinkAccumulators,
                           Map<String, Accumulator<?, ?>> userAccumulators) throws IOException {
    this.jobID = jobID;
    this.executionAttemptID = executionAttemptID;
    this.flinkAccumulators = new SerializedValue<Map<AccumulatorRegistry.Metric, Accumulator<?, ?>>>(flinkAccumulators);
    this.userAccumulators = new SerializedValue<Map<String, Accumulator<?, ?>>>(userAccumulators);
}
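Wrapping both maps in SerializedValue serializes them eagerly on the TaskManager; the JobManager later deserializes them with an explicit class loader, since user-defined accumulator classes only exist in the user-code class loader. A minimal illustration of that round-trip, assuming Flink's org.apache.flink.util.SerializedValue:

import org.apache.flink.util.SerializedValue;

// serialize eagerly, e.g. on the TaskManager side
SerializedValue<Object> wrapped = new SerializedValue<Object>(42L);

// ... the bytes travel inside the Heartbeat / result messages ...

// deserialize with an explicit class loader, e.g. the user-code class loader
Object restored = wrapped.deserializeValue(Thread.currentThread().getContextClassLoader());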

 

Finally, how do the statistics get added onto the accumulators in the first place?

Flink's internal accumulators are all updated through this reporter:

/**
 * Accumulator based reporter for keeping track of internal metrics (e.g. bytes and records in/out)
 */
private static class ReadWriteReporter implements Reporter {

    private LongCounter numRecordsIn = new LongCounter();
    private LongCounter numRecordsOut = new LongCounter();
    private LongCounter numBytesIn = new LongCounter();
    private LongCounter numBytesOut = new LongCounter();

    private ReadWriteReporter(Map<Metric, Accumulator<?, ?>> accumulatorMap) {
        accumulatorMap.put(Metric.NUM_RECORDS_IN, numRecordsIn);
        accumulatorMap.put(Metric.NUM_RECORDS_OUT, numRecordsOut);
        accumulatorMap.put(Metric.NUM_BYTES_IN, numBytesIn);
        accumulatorMap.put(Metric.NUM_BYTES_OUT, numBytesOut);
    }

    @Override
    public void reportNumRecordsIn(long value) {
        numRecordsIn.add(value);
    }

    @Override
    public void reportNumRecordsOut(long value) {
        numRecordsOut.add(value);
    }

    @Override
    public void reportNumBytesIn(long value) {
        numBytesIn.add(value);
    }

    @Override
    public void reportNumBytesOut(long value) {
        numBytesOut.add(value);
    }
}
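The Reporter interface implemented above can be reconstructed from the four @Override methods; a sketch, not guaranteed to match the source line for line:

interface Reporter {
    void reportNumRecordsIn(long value);
    void reportNumRecordsOut(long value);
    void reportNumBytesIn(long value);
    void reportNumBytesOut(long value);
}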

 

Where are these report methods called?

On the input side, bytes-in and records-in are counted when a record is deserialized:

AdaptiveSpanningRecordDeserializer

public DeserializationResult getNextRecord(T target) throws IOException {
    // check if we can get a full length;
    if (nonSpanningRemaining >= 4) {
        int len = this.nonSpanningWrapper.readInt();

        if (reporter != null) {
            reporter.reportNumBytesIn(len);
        }

        if (len <= nonSpanningRemaining - 4) {
            // we can get a full record from here
            target.read(this.nonSpanningWrapper);

            if (reporter != null) {
                reporter.reportNumRecordsIn(1);
            }
            // ...

 

Conversely, on the output side, the counts are reported when a record is serialized:

SpanningRecordSerializer

@Override
public SerializationResult addRecord(T record) throws IOException {
    int len = this.serializationBuffer.length();
    this.lengthBuffer.putInt(0, len);

    if (reporter != null) {
        reporter.reportNumBytesOut(len);
        reporter.reportNumRecordsOut(1);
    }
    // ...

 

To use an accumulator from user code, the function must extend RichFunction; register the accumulator by calling getRuntimeContext().addAccumulator(...), then add values to it, as sketched below.
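A minimal end-to-end sketch of the user-facing side (the accumulator name "num-lines" and the surrounding function are made up for illustration; IntCounter, RichMapFunction and JobExecutionResult are Flink's public API):

import org.apache.flink.api.common.accumulators.IntCounter;
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;

public class LineCounter extends RichMapFunction<String, String> {

    private final IntCounter numLines = new IntCounter();

    @Override
    public void open(Configuration parameters) {
        // register under a job-wide unique name
        getRuntimeContext().addAccumulator("num-lines", numLines);
    }

    @Override
    public String map(String value) {
        numLines.add(1); // accumulated locally, merged across tasks by the JobManager
        return value;
    }
}

// After env.execute() returns, read the merged result:
// JobExecutionResult result = env.execute("accumulator example");
// Integer total = result.getAccumulatorResult("num-lines");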
