Let's look at how Flink CEP converts a Pattern into an NFA.

And when an event arrives, how is it processed inside the NFA?

The call chain covered earlier: CEP -> PatternStream -> select -> CEPOperatorUtils.createPatternStream
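For context, here is a sketch of the user-side job that drives this chain. Everything in it (the Event POJO, field names, conditions, thresholds) is hypothetical and only for illustration; the CEP calls follow the 1.3-era API, and it is the select() call at the end of the chain that eventually reaches CEPOperatorUtils.createPatternStream.

import org.apache.flink.cep.CEP;
import org.apache.flink.cep.PatternStream;
import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.cep.pattern.conditions.SimpleCondition;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;

public class CepJobSketch {

    // Hypothetical event type used throughout the sketches below.
    public static class Event {
        public String name;
        public double price;
        public String getName() { return name; }
        public double getPrice() { return price; }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStream<Event> input = env.fromElements(new Event(), new Event());

        // A two-step pattern: "start" strictly followed by "middle", within 10 seconds.
        Pattern<Event, ?> pattern = Pattern.<Event>begin("start")
                .where(new SimpleCondition<Event>() {
                    @Override
                    public boolean filter(Event event) {
                        return "start".equals(event.getName());
                    }
                })
                .next("middle")
                .where(new SimpleCondition<Event>() {
                    @Override
                    public boolean filter(Event event) {
                        return "middle".equals(event.getName());
                    }
                })
                .within(Time.seconds(10));

        // CEP.pattern only wraps the stream and the pattern; compilation into an NFA factory
        // happens when select(...) is called (see the select sketch in section 2).
        PatternStream<Event> patternStream = CEP.pattern(input, pattern);
    }
}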

1. NFACompiler.compileFactory is invoked, which performs the conversion from pattern to states

final NFACompiler.NFAFactory<T> nfaFactory = NFACompiler.compileFactory(pattern, inputSerializer, false);

// inside NFACompiler.compileFactory:
final NFAFactoryCompiler<T> nfaFactoryCompiler = new NFAFactoryCompiler<>(pattern);
nfaFactoryCompiler.compileFactory();
return new NFAFactoryImpl<>(inputTypeSerializer, nfaFactoryCompiler.getWindowTime(), nfaFactoryCompiler.getStates(), timeoutHandling);
This calls nfaFactoryCompiler.compileFactory:
void compileFactory() {
    // we're traversing the pattern from the end to the beginning --> the first state is the final state
    State<T> sinkState = createEndingState();
    // add all the normal states
    sinkState = createMiddleStates(sinkState);
    // add the beginning state
    createStartState(sinkState);
}
As you can see, the work here is mainly about generating states, i.e. converting the pattern into the NFA's states and stateTransitions.
Because patterns are appended one after another, and each pattern only points back to its predecessor via private final Pattern<T, ? extends T> previous, the pattern sequence can only be traversed backwards.
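To make the backward traversal concrete, here is a tiny illustrative loop over the chained Pattern from the sketch above; getPrevious() and getName() are the same accessors the compiler uses in the code below.

// "pattern" is the last pattern in the chain ("middle" in the earlier sketch);
// walking getPrevious() eventually reaches "start", whose previous is null.
Pattern<Event, ?> current = pattern;
while (current != null) {
    System.out.println(current.getName()); // prints: middle, start
    current = current.getPrevious();
}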
First, the final state at the end is created:
private State<T> createEndingState() {
    State<T> endState = createState(ENDING_STATE_NAME, State.StateType.Final);
    windowTime = currentPattern.getWindowTime() != null ? currentPattern.getWindowTime().toMilliseconds() : 0L;
    return endState;
}
This is straightforward: it simply creates a state.
private State<T> createState(String name, State.StateType stateType) {
    String stateName = getUniqueInternalStateName(name);
    usedNames.add(stateName);
    State<T> state = new State<>(stateName, stateType);
    states.add(state);
    return state;
}
 
Next, the middle states are added:
private State<T> createMiddleStates(final State<T> sinkState) {
    State<T> lastSink = sinkState; // keep track of the previous (sink) state
    while (currentPattern.getPrevious() != null) {
        checkPatternNameUniqueness(currentPattern.getName());
        lastSink = convertPattern(lastSink); // convert the pattern into a state

        // we traverse the pattern graph backwards
        followingPattern = currentPattern;
        currentPattern = currentPattern.getPrevious(); // step back to the previous pattern

        final Time currentWindowTime = currentPattern.getWindowTime();
        if (currentWindowTime != null && currentWindowTime.toMilliseconds() < windowTime) {
            // the window time is the global minimum of all window times of each state
            windowTime = currentWindowTime.toMilliseconds();
        }
    }
    return lastSink;
}
This calls convertPattern:
private State<T> convertPattern(final State<T> sinkState) {
    final State<T> lastSink = createSingletonState(sinkState); // here we only look at the singleton-state case
    addStopStates(lastSink);
    return lastSink;
}
 
createSingletonState
private State<T> createSingletonState(final State<T> sinkState, final IterativeCondition<T> ignoreCondition, final boolean isOptional) {
    final IterativeCondition<T> currentCondition = (IterativeCondition<T>) currentPattern.getCondition(); // take the condition from the pattern
    final IterativeCondition<T> trueFunction = BooleanConditions.trueFunction();

    final State<T> singletonState = createState(currentPattern.getName(), State.StateType.Normal); // create the singleton state for currentPattern

    // if event is accepted then all notPatterns previous to the optional states are no longer valid
    singletonState.addTake(sinkState, currentCondition); // add the TAKE StateTransition

    if (isOptional) {
        // if no element accepted the previous nots are still valid.
        singletonState.addProceed(sinkState, trueFunction); // for an optional pattern, also add a PROCEED StateTransition
    }
    return singletonState;
}
addTake delegates to addStateTransition:
public void addStateTransition(
        final StateTransitionAction action,
        final State<T> targetState,
        final IterativeCondition<T> condition) {
    stateTransitions.add(new StateTransition<T>(this, action, targetState, condition));
}
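addTake, addProceed and addIgnore are presumably just thin wrappers around addStateTransition that fix the StateTransitionAction; a sketch of what they likely look like (not a verbatim quote of the State class):

public void addTake(final State<T> targetState, final IterativeCondition<T> condition) {
    addStateTransition(StateTransitionAction.TAKE, targetState, condition);
}

public void addProceed(final State<T> targetState, final IterativeCondition<T> condition) {
    addStateTransition(StateTransitionAction.PROCEED, targetState, condition);
}

public void addIgnore(final IterativeCondition<T> condition) {
    // IGNORE without an explicit target state is a self-loop
    addStateTransition(StateTransitionAction.IGNORE, this, condition);
}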
 
createStartState
private State<T> createStartState(State<T> sinkState) {
    checkPatternNameUniqueness(currentPattern.getName());
    final State<T> beginningState = convertPattern(sinkState);
    beginningState.makeStart();
    return beginningState;
}
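Putting section 1 together: for the two-step pattern from the first sketch, begin("start").next("middle"), the compiler ends up with three states connected by TAKE edges. Below is a hand-written approximation using only the State API quoted above; the end-state name is a placeholder, trueFunction() stands in for the real conditions, and stop/ignore states are left out.

// Approximation of the compiled state graph, not actual compiler output.
IterativeCondition<Event> startCondition = BooleanConditions.trueFunction();  // placeholder
IterativeCondition<Event> middleCondition = BooleanConditions.trueFunction(); // placeholder

State<Event> endState = new State<>("$end$", State.StateType.Final);  // createEndingState
State<Event> middle = new State<>("middle", State.StateType.Normal);  // createMiddleStates -> convertPattern
State<Event> start = new State<>("start", State.StateType.Normal);    // createStartState -> convertPattern

middle.addTake(endState, middleCondition); // TAKE: accept a "middle" event and reach the Final state
start.addTake(middle, startCondition);     // TAKE: accept a "start" event and move to "middle"
start.makeStart();                         // mark "start" as the start state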
 
 
2. When an event comes in, how is it processed?
AbstractKeyedCEPPatternOperator.processElement
NFA<IN> nfa = getNFA();
processEvent(nfa, element.getValue(), getProcessingTimeService().getCurrentProcessingTime());
updateNFA(nfa);
 
If the state backend already holds an NFA it is retrieved, otherwise nfaFactory.createNFA is called:
private NFA<IN> getNFA() throws IOException {
    NFA<IN> nfa = nfaOperatorState.value();
    return nfa != null ? nfa : nfaFactory.createNFA();
}
 
createNFA
NFA<T> result = new NFA<>(inputTypeSerializer.duplicate(), windowTime, timeoutHandling);
result.addStates(states);
 
addState
public void addStates(final Collection<State<T>> newStates) {
    for (State<T> state : newStates) {
        addState(state);
    }
}

public void addState(final State<T> state) {
    states.add(state);
    if (state.isStart()) {
        computationStates.add(ComputationState.createStartState(this, state));
    }
}
The states are added to the NFA.
The start state is also added to computationStates, because pattern matching always begins from the start state.
 
KeyedCEPPatternOperator -> processEvent
protected void processEvent(NFA<IN> nfa, IN event, long timestamp) {
    Tuple2<Collection<Map<String, List<IN>>>, Collection<Tuple2<Map<String, List<IN>>, Long>>> patterns =
        nfa.process(event, timestamp);
    emitMatchedSequences(patterns.f0, timestamp);
}
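emitMatchedSequences eventually hands each matched sequence, the Map<String, List<IN>> keyed by pattern name, to the user's select function. A sketch of that user-side code, continuing the earlier job sketch (the output format is made up):

DataStream<String> matches = patternStream.select(new PatternSelectFunction<Event, String>() {
    @Override
    public String select(Map<String, List<Event>> match) {
        // the keys are the pattern names used when building the Pattern
        Event first = match.get("start").get(0);
        Event second = match.get("middle").get(0);
        return "matched: " + first.getName() + " -> " + second.getName();
    }
});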
 
NFA -> process
public Tuple2<Collection<Map<String, List<T>>>, Collection<Tuple2<Map<String, List<T>>, Long>>> process(final T event, final long timestamp) {
    final int numberComputationStates = computationStates.size();
    final Collection<Map<String, List<T>>> result = new ArrayList<>();
    final Collection<Tuple2<Map<String, List<T>>, Long>> timeoutResult = new ArrayList<>();

    // iterate over all current computation states
    for (int i = 0; i < numberComputationStates; i++) {
        ComputationState<T> computationState = computationStates.poll(); // poll one computation state

        final Collection<ComputationState<T>> newComputationStates;
        newComputationStates = computeNextStates(computationState, event, timestamp); // let the NFA compute the next batch of states

        // delay adding new computation states in case a stop state is reached and we discard the path.
        final Collection<ComputationState<T>> statesToRetain = new ArrayList<>(); // newComputationStates may contain a stop state, so not all of them end up in statesToRetain
        // if stop state reached in this path
        boolean shouldDiscardPath = false;
        for (final ComputationState<T> newComputationState : newComputationStates) {
            if (newComputationState.isFinalState()) { // a final state means a complete match
                // we've reached a final state and can thus retrieve the matching event sequence
                Map<String, List<T>> matchedPattern = extractCurrentMatches(newComputationState);
                result.add(matchedPattern);

                // remove found patterns because they are no longer needed
                eventSharedBuffer.release(
                    newComputationState.getPreviousState().getName(),
                    newComputationState.getEvent(),
                    newComputationState.getTimestamp(),
                    computationState.getCounter());
            } else if (newComputationState.isStopState()) { // a stop state means this path is discarded
                // reached stop state. release entry for the stop state
                shouldDiscardPath = true;
                eventSharedBuffer.release(
                    newComputationState.getPreviousState().getName(),
                    newComputationState.getEvent(),
                    newComputationState.getTimestamp(),
                    computationState.getCounter());
            } else { // an intermediate state goes into statesToRetain
                // add new computation state; it will be processed once the next event arrives
                statesToRetain.add(newComputationState);
            }
        }

        if (shouldDiscardPath) { // release the discarded path
            // a stop state was reached in this branch. release branch which results in removing previous event from
            // the buffer
            for (final ComputationState<T> state : statesToRetain) {
                eventSharedBuffer.release(
                    state.getPreviousState().getName(),
                    state.getEvent(),
                    state.getTimestamp(),
                    state.getCounter());
            }
        } else { // keep the intermediate states in computationStates
            computationStates.addAll(statesToRetain);
        }
    }

    // prune shared buffer based on window length
    if (windowTime > 0L) { // prune patterns that have expired beyond the window
        long pruningTimestamp = timestamp - windowTime;
        if (pruningTimestamp < timestamp) {
            // the check is to guard against underflows
            // remove all elements which are expired with respect to the window length
            eventSharedBuffer.prune(pruningTimestamp);
        }
    }

    return Tuple2.of(result, timeoutResult);
}
 
computeNextStates
 
private Collection<ComputationState<T>> computeNextStates(
        final ComputationState<T> computationState,
        final T event,
        final long timestamp) {
    final OutgoingEdges<T> outgoingEdges = createDecisionGraph(computationState, event); // collect all outgoing edges of the current state
    final List<StateTransition<T>> edges = outgoingEdges.getEdges();
    final List<ComputationState<T>> resultingComputationStates = new ArrayList<>();

    for (StateTransition<T> edge : edges) {
        switch (edge.getAction()) {
            case IGNORE: {
                if (!computationState.isStartState()) {
                    final DeweyNumber version;
                    if (isEquivalentState(edge.getTargetState(), computationState.getState())) {
                        // Stay in the same state (it can be either looping one or singleton)
                        final int toIncrease = calculateIncreasingSelfState(
                            outgoingEdges.getTotalIgnoreBranches(),
                            outgoingEdges.getTotalTakeBranches());
                        version = computationState.getVersion().increase(toIncrease);
                    } else {
                        // IGNORE after PROCEED
                        version = computationState.getVersion()
                            .increase(totalTakeToSkip + ignoreBranchesToVisit)
                            .addStage();
                        ignoreBranchesToVisit--;
                    }

                    // for IGNORE the event itself is not taken; just add the target state as a new computation state
                    addComputationState(
                        resultingComputationStates,
                        edge.getTargetState(),
                        computationState.getPreviousState(),
                        computationState.getEvent(),
                        computationState.getCounter(),
                        computationState.getTimestamp(),
                        version,
                        computationState.getStartTimestamp()
                    );
                }
            }
            break;
            case TAKE:
                final State<T> nextState = edge.getTargetState();
                final State<T> currentState = edge.getSourceState();
                final State<T> previousState = computationState.getPreviousState();

                final T previousEvent = computationState.getEvent();

                final int counter;
                final long startTimestamp;
                // for TAKE, the current event has to be recorded on the path, i.e. put into the eventSharedBuffer
                if (computationState.isStartState()) {
                    startTimestamp = timestamp;
                    counter = eventSharedBuffer.put(
                        currentState.getName(),
                        event,
                        timestamp,
                        currentVersion);
                } else {
                    startTimestamp = computationState.getStartTimestamp();
                    counter = eventSharedBuffer.put(
                        currentState.getName(),
                        event,
                        timestamp,
                        previousState.getName(),
                        previousEvent,
                        computationState.getTimestamp(),
                        computationState.getCounter(),
                        currentVersion);
                }

                addComputationState(
                    resultingComputationStates,
                    nextState,
                    currentState,
                    event,
                    counter,
                    timestamp,
                    nextVersion,
                    startTimestamp);

                // check if newly created state is optional (have a PROCEED path to Final state)
                final State<T> finalState = findFinalStateAfterProceed(nextState, event, computationState);
                if (finalState != null) {
                    addComputationState(
                        resultingComputationStates,
                        finalState,
                        currentState,
                        event,
                        counter,
                        timestamp,
                        nextVersion,
                        startTimestamp);
                }
                break;
        }
    }

    return resultingComputationStates;
}
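A strict next() step only produces TAKE edges, so where do IGNORE edges come from? For relaxed contiguity (followedBy), the compiler also attaches an IGNORE self-loop to the waiting state; that part of createSingletonState, driven by the ignoreCondition parameter, was trimmed from the excerpt above. Conceptually, reusing the hand-written graph sketch from section 1 (ignoreCondition here is a placeholder for whatever condition the compiler derives from the consuming strategy):

// Conceptual sketch only, not compiler output.
middle.addIgnore(ignoreCondition);         // IGNORE: skip this event, stay in "middle" and keep waiting
middle.addTake(endState, middleCondition); // TAKE: accept this event, advance towards the Final state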
 
private OutgoingEdges<T> createDecisionGraph(ComputationState<T> computationState, T event) {
    final OutgoingEdges<T> outgoingEdges = new OutgoingEdges<>(computationState.getState());

    final Stack<State<T>> states = new Stack<>();
    states.push(computationState.getState());

    // First create all outgoing edges, so to be able to reason about the Dewey version
    while (!states.isEmpty()) {
        State<T> currentState = states.pop();
        Collection<StateTransition<T>> stateTransitions = currentState.getStateTransitions(); // get all state transitions of this state

        // check all state transitions for each state
        for (StateTransition<T> stateTransition : stateTransitions) {
            try {
                if (checkFilterCondition(computationState, stateTransition.getCondition(), event)) {
                    // filter condition is true
                    switch (stateTransition.getAction()) {
                        case PROCEED: // for PROCEED, move straight on to the target state
                            // simply advance the computation state, but apply the current event to it
                            // PROCEED is equivalent to an epsilon transition
                            states.push(stateTransition.getTargetState());
                            break;
                        case IGNORE:
                        case TAKE: // otherwise, record the stateTransition as an outgoing edge
                            outgoingEdges.add(stateTransition);
                            break;
                    }
                }
            } catch (Exception e) {
                throw new RuntimeException("Failure happened in filter function.", e);
            }
        }
    }
    return outgoingEdges;
}
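checkFilterCondition evaluates the condition attached to the transition, which is ultimately the user's where() clause. SimpleCondition (used in the first sketch) only looks at the current event, while an IterativeCondition can also look at the events already accepted on the current path through its context. A sketch, again using the hypothetical Event POJO and a made-up threshold:

import org.apache.flink.cep.pattern.conditions.IterativeCondition;

public class PriceSumBelowThreshold extends IterativeCondition<Event> {

    @Override
    public boolean filter(Event value, Context<Event> ctx) throws Exception {
        // sum the prices of the events already accepted for the "middle" pattern on this path
        double sum = value.getPrice();
        for (Event previous : ctx.getEventsForPattern("middle")) {
            sum += previous.getPrice();
        }
        return sum < 100.0;
    }
}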
 
 
 
 
 
 
 
 
