How does Flink CEP convert a pattern into an NFA?

When an event arrives, how is it processed by the NFA?

The call chain leading up to this point: CEP –> PatternStream –> select –> CEPOperatorUtils.createPatternStream

1. NFACompiler.compileFactory converts the pattern into states

createPatternStream obtains the NFA factory:
    final NFACompiler.NFAFactory<T> nfaFactory = NFACompiler.compileFactory(pattern, inputSerializer, false);
Inside NFACompiler.compileFactory:
    final NFAFactoryCompiler<T> nfaFactoryCompiler = new NFAFactoryCompiler<>(pattern);
    nfaFactoryCompiler.compileFactory();
    return new NFAFactoryImpl<>(inputTypeSerializer, nfaFactoryCompiler.getWindowTime(), nfaFactoryCompiler.getStates(), timeoutHandling);
This in turn calls nfaFactoryCompiler.compileFactory:
    void compileFactory() {
        // we're traversing the pattern from the end to the beginning --> the first state is the final state
        State<T> sinkState = createEndingState();
        // add all the normal states
        sinkState = createMiddleStates(sinkState);
        // add the beginning state
        createStartState(sinkState);
    }
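As the code shows, the main job here is to generate states, that is, to convert the pattern into the State and StateTransition objects of the NFA.
Patterns are built by appending one after another, and each pattern refers to its predecessor through private final Pattern<T, ? extends T> previous, so the chain can only be walked backwards, from the last pattern to the first.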
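The backward walk can be illustrated with a small self-contained sketch. ToyPattern, compile and the "ENDING" label are hypothetical stand-ins invented for this example, not Flink classes; the sketch only assumes that each pattern holds a reference to its predecessor, as described above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are not Flink's Pattern/State classes.
class BackwardCompileSketch {

    static final class ToyPattern {
        final String name;
        final ToyPattern previous; // null for the first pattern in the chain

        ToyPattern(String name, ToyPattern previous) {
            this.name = name;
            this.previous = previous;
        }

        // Appends a new pattern that points back at this one.
        ToyPattern next(String nextName) {
            return new ToyPattern(nextName, this);
        }
    }

    // Walks from the last pattern back to the first and records states in creation order.
    static List<String> compile(ToyPattern last) {
        List<String> states = new ArrayList<>();
        states.add("ENDING");                  // the final state is created first
        ToyPattern current = last;
        while (current.previous != null) {     // middle states
            states.add(current.name);
            current = current.previous;
        }
        states.add(current.name + " (start)"); // the first pattern becomes the start state
        return states;
    }

    public static void main(String[] args) {
        ToyPattern last = new ToyPattern("start", null).next("middle").next("end");
        System.out.println(compile(last));
    }
}
```

The creation order mirrors compileFactory: ending state, then the middle states from back to front, then the start state.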
First the final state is created:
    private State<T> createEndingState() {
        State<T> endState = createState(ENDING_STATE_NAME, State.StateType.Final);
        windowTime = currentPattern.getWindowTime() != null ? currentPattern.getWindowTime().toMilliseconds() : 0L;
        return endState;
    }
This is straightforward: it simply creates a state.
    private State<T> createState(String name, State.StateType stateType) {
        String stateName = getUniqueInternalStateName(name);
        usedNames.add(stateName);
        State<T> state = new State<>(stateName, stateType);
        states.add(state);
        return state;
    }
 
Next, the middle states are added:
    private State<T> createMiddleStates(final State<T> sinkState) {
        State<T> lastSink = sinkState; // remembers the most recently created state
        while (currentPattern.getPrevious() != null) {
            checkPatternNameUniqueness(currentPattern.getName());
            lastSink = convertPattern(lastSink); // convert the pattern into a state

            // we traverse the pattern graph backwards
            followingPattern = currentPattern;
            currentPattern = currentPattern.getPrevious(); // step back to the previous pattern

            final Time currentWindowTime = currentPattern.getWindowTime();
            if (currentWindowTime != null && currentWindowTime.toMilliseconds() < windowTime) {
                // the window time is the global minimum of all window times of each state
                windowTime = currentWindowTime.toMilliseconds();
            }
        }
        return lastSink;
    }
This calls convertPattern:
    private State<T> convertPattern(final State<T> sinkState) {
        final State<T> lastSink;
        lastSink = createSingletonState(sinkState); // only the singleton-state case is shown here
        addStopStates(lastSink);
        return lastSink;
    }
 
createSingletonState
    private State<T> createSingletonState(final State<T> sinkState, final IterativeCondition<T> ignoreCondition, final boolean isOptional) {
        final IterativeCondition<T> currentCondition = (IterativeCondition<T>) currentPattern.getCondition(); // take the condition from the pattern
        final IterativeCondition<T> trueFunction = BooleanConditions.trueFunction();

        final State<T> singletonState = createState(currentPattern.getName(), State.StateType.Normal); // create the singleton state for currentPattern
        // if event is accepted then all notPatterns previous to the optional states are no longer valid
        final State<T> sink = copyWithoutTransitiveNots(sinkState); // this line was dropped from the original excerpt; restored from the Flink source
        singletonState.addTake(sink, currentCondition); // set up the TAKE StateTransition

        if (isOptional) {
            // if no element accepted the previous nots are still valid.
            singletonState.addProceed(sinkState, trueFunction); // for an optional pattern, also set up a PROCEED StateTransition
        }
        // the handling of ignoreCondition is not shown in this excerpt
        return singletonState;
    }
addTake –> addStateTransition
    public void addStateTransition(
            final StateTransitionAction action,
            final State<T> targetState,
            final IterativeCondition<T> condition) {
        stateTransitions.add(new StateTransition<T>(this, action, targetState, condition));
    }
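To make the difference between TAKE and PROCEED transitions concrete, here is a minimal sketch under stated assumptions. ToyState, Transition and singleton are hypothetical names made up for this example; conditions are modelled as java.util.function.Predicate rather than Flink's IterativeCondition, and the sketch ignores NOT patterns and ignore conditions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical toy classes for illustration; not Flink's State/StateTransition.
class TransitionSketch {

    enum Action { TAKE, PROCEED }

    static final class Transition<T> {
        final Action action;
        final ToyState<T> target;
        final Predicate<T> condition;

        Transition(Action action, ToyState<T> target, Predicate<T> condition) {
            this.action = action;
            this.target = target;
            this.condition = condition;
        }
    }

    static final class ToyState<T> {
        final String name;
        final List<Transition<T>> transitions = new ArrayList<>();

        ToyState(String name) {
            this.name = name;
        }

        void addTake(ToyState<T> target, Predicate<T> condition) {
            transitions.add(new Transition<>(Action.TAKE, target, condition));
        }

        void addProceed(ToyState<T> target, Predicate<T> condition) {
            transitions.add(new Transition<>(Action.PROCEED, target, condition));
        }
    }

    // Every state gets a TAKE edge guarded by its condition;
    // an optional state additionally gets an always-true PROCEED edge to the same sink.
    static <T> ToyState<T> singleton(String name, ToyState<T> sink, Predicate<T> condition, boolean optional) {
        ToyState<T> state = new ToyState<>(name);
        state.addTake(sink, condition);
        if (optional) {
            state.addProceed(sink, event -> true);
        }
        return state;
    }

    public static void main(String[] args) {
        ToyState<String> end = new ToyState<>("end");
        ToyState<String> optional = singleton("maybe", end, "x"::equals, true);
        for (Transition<String> t : optional.transitions) {
            System.out.println(optional.name + " -" + t.action + "-> " + t.target.name);
        }
    }
}
```

A required state ends up with one outgoing edge, an optional one with two, which is what lets the matcher skip the optional state without consuming an event.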
 
createStartState
    private State<T> createStartState(State<T> sinkState) {
        checkPatternNameUniqueness(currentPattern.getName());
        final State<T> beginningState = convertPattern(sinkState);
        beginningState.makeStart();
        return beginningState;
    }
 
 
2. How is an incoming event processed?
AbstractKeyedCEPPatternOperator.processElement
    NFA<IN> nfa = getNFA();
    processEvent(nfa, element.getValue(), getProcessingTimeService().getCurrentProcessingTime());
    updateNFA(nfa);
 
If the state backend already holds an NFA it is taken from there; otherwise nfaFactory.createNFA builds a new one.
    private NFA<IN> getNFA() throws IOException {
        NFA<IN> nfa = nfaOperatorState.value();
        return nfa != null ? nfa : nfaFactory.createNFA();
    }
 
createNFA
    NFA<T> result = new NFA<>(inputTypeSerializer.duplicate(), windowTime, timeoutHandling);
    result.addStates(states);
 
addStates –> addState
    public void addStates(final Collection<State<T>> newStates) {
        for (State<T> state: newStates) {
            addState(state);
        }
    }

    public void addState(final State<T> state) {
        states.add(state);

        if (state.isStart()) {
            computationStates.add(ComputationState.createStartState(this, state));
        }
    }
The states are added to the NFA.
The start state is additionally added to computationStates, because pattern matching always begins at the start state.
 
KeyedCEPPatternOperator –> processEvent
    protected void processEvent(NFA<IN> nfa, IN event, long timestamp) {
        Tuple2<Collection<Map<String, List<IN>>>, Collection<Tuple2<Map<String, List<IN>>, Long>>> patterns =
            nfa.process(event, timestamp);

        emitMatchedSequences(patterns.f0, timestamp);
    }
 
NFA –> process
    public Tuple2<Collection<Map<String, List<T>>>, Collection<Tuple2<Map<String, List<T>>, Long>>> process(final T event, final long timestamp) {
        final int numberComputationStates = computationStates.size();
        final Collection<Map<String, List<T>>> result = new ArrayList<>();
        final Collection<Tuple2<Map<String, List<T>>, Long>> timeoutResult = new ArrayList<>();

        // iterate over all current computations
        for (int i = 0; i < numberComputationStates; i++) { // visit every current computation state
            ComputationState<T> computationState = computationStates.poll(); // poll one state

            final Collection<ComputationState<T>> newComputationStates;
            newComputationStates = computeNextStates(computationState, event, timestamp); // let the NFA compute the next batch of states

            // delay adding new computation states in case a stop state is reached and we discard the path.
            final Collection<ComputationState<T>> statesToRetain = new ArrayList<>(); // newComputationStates may contain stop states, so not all of them end up here
            // if stop state reached in this path
            boolean shouldDiscardPath = false;
            for (final ComputationState<T> newComputationState: newComputationStates) {
                if (newComputationState.isFinalState()) { // a final state means the match is complete
                    // we've reached a final state and can thus retrieve the matching event sequence
                    Map<String, List<T>> matchedPattern = extractCurrentMatches(newComputationState);
                    result.add(matchedPattern);

                    // remove found patterns because they are no longer needed
                    eventSharedBuffer.release(
                        newComputationState.getPreviousState().getName(),
                        newComputationState.getEvent(),
                        newComputationState.getTimestamp(),
                        computationState.getCounter());
                } else if (newComputationState.isStopState()) { // a stop state means this path must be dropped
                    // reached stop state. release entry for the stop state
                    shouldDiscardPath = true;
                    eventSharedBuffer.release(
                        newComputationState.getPreviousState().getName(),
                        newComputationState.getEvent(),
                        newComputationState.getTimestamp(),
                        computationState.getCounter());
                } else { // an intermediate state: keep it in statesToRetain
                    // add new computation state; it will be processed once the next event arrives
                    statesToRetain.add(newComputationState);
                }
            }

            if (shouldDiscardPath) { // release the discarded path
                // a stop state was reached in this branch. release branch which results in removing previous event from
                // the buffer
                for (final ComputationState<T> state : statesToRetain) {
                    eventSharedBuffer.release(
                        state.getPreviousState().getName(),
                        state.getEvent(),
                        state.getTimestamp(),
                        state.getCounter());
                }
            } else { // add the intermediate states back to computationStates
                computationStates.addAll(statesToRetain);
            }
        }

        // prune shared buffer based on window length
        if (windowTime > 0L) { // prune partial matches that have fallen out of the window
            long pruningTimestamp = timestamp - windowTime;

            if (pruningTimestamp < timestamp) {
                // the check is to guard against underflows

                // remove all elements which are expired
                // with respect to the window length
                eventSharedBuffer.prune(pruningTimestamp);
            }
        }

        return Tuple2.of(result, timeoutResult);
    }
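The shape of this loop (poll each pending computation state, advance it with the event, emit a match on a final state, otherwise keep it for the next event) can be shown with a deliberately simplified sketch. Everything here is hypothetical: Run and ProcessLoopSketch are invented names, the pattern is a fixed strictly contiguous sequence of strings, and there is no shared buffer, versioning, stop state or window pruning. Re-adding the start run on every event is an assumption of this sketch so that later matches can still begin.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Toy matcher for a fixed, strictly contiguous sequence; not Flink's NFA.
class ProcessLoopSketch {

    // A partial match: index of the next element to match plus the events taken so far.
    static final class Run {
        final int position;
        final List<String> taken;

        Run(int position, List<String> taken) {
            this.position = position;
            this.taken = taken;
        }
    }

    private final String[] sequence;
    private final Queue<Run> runs = new ArrayDeque<>();

    ProcessLoopSketch(String... sequence) {
        this.sequence = sequence;
        runs.add(new Run(0, new ArrayList<>())); // seed the queue with the start state
    }

    List<List<String>> process(String event) {
        List<List<String>> matches = new ArrayList<>();
        int pending = runs.size(); // only runs that existed before this event are visited
        for (int i = 0; i < pending; i++) {
            Run run = runs.poll();
            if (run.position == 0) {
                runs.add(run); // sketch assumption: the start run always stays alive
            }
            if (sequence[run.position].equals(event)) { // corresponds to a TAKE edge
                List<String> taken = new ArrayList<>(run.taken);
                taken.add(event);
                if (run.position + 1 == sequence.length) {
                    matches.add(taken); // final state reached: emit the match
                } else {
                    runs.add(new Run(run.position + 1, taken)); // intermediate state: retain
                }
            }
            // a non-start run whose condition fails is simply dropped
        }
        return matches;
    }

    public static void main(String[] args) {
        ProcessLoopSketch nfa = new ProcessLoopSketch("a", "b");
        System.out.println(nfa.process("a"));
        System.out.println(nfa.process("b"));
    }
}
```

Capturing the queue size before the loop matters for the same reason as numberComputationStates above: states produced by the current event must not be advanced again by that same event.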
 
computeNextStates
    private Collection<ComputationState<T>> computeNextStates(
            final ComputationState<T> computationState,
            final T event,
            final long timestamp) {

        final OutgoingEdges<T> outgoingEdges = createDecisionGraph(computationState, event); // find all outgoing edges of the state

        final List<StateTransition<T>> edges = outgoingEdges.getEdges();
        // note: the declarations of totalTakeToSkip, ignoreBranchesToVisit, currentVersion and nextVersion are not shown in this excerpt

        final List<ComputationState<T>> resultingComputationStates = new ArrayList<>();
        for (StateTransition<T> edge : edges) {
            switch (edge.getAction()) {
                case IGNORE: {
                    if (!computationState.isStartState()) {
                        final DeweyNumber version;
                        if (isEquivalentState(edge.getTargetState(), computationState.getState())) {
                            // Stay in the same state (it can be either looping one or singleton)
                            final int toIncrease = calculateIncreasingSelfState(
                                outgoingEdges.getTotalIgnoreBranches(),
                                outgoingEdges.getTotalTakeBranches());
                            version = computationState.getVersion().increase(toIncrease);
                        } else {
                            // IGNORE after PROCEED
                            version = computationState.getVersion()
                                .increase(totalTakeToSkip + ignoreBranchesToVisit)
                                .addStage();
                            ignoreBranchesToVisit--;
                        }

                        // for IGNORE the event itself is not taken; the target state is simply added to the computation states
                        addComputationState(
                            resultingComputationStates,
                            edge.getTargetState(),
                            computationState.getPreviousState(),
                            computationState.getEvent(),
                            computationState.getCounter(),
                            computationState.getTimestamp(),
                            version,
                            computationState.getStartTimestamp()
                        );
                    }
                }
                break;
                case TAKE:
                    final State<T> nextState = edge.getTargetState();
                    final State<T> currentState = edge.getSourceState();
                    final State<T> previousState = computationState.getPreviousState();

                    final T previousEvent = computationState.getEvent();

                    final int counter;
                    final long startTimestamp;
                    // for TAKE the current state must be recorded on the path, i.e. stored in eventSharedBuffer
                    if (computationState.isStartState()) {
                        startTimestamp = timestamp;
                        counter = eventSharedBuffer.put(
                            currentState.getName(),
                            event,
                            timestamp,
                            currentVersion);
                    } else {
                        startTimestamp = computationState.getStartTimestamp();
                        counter = eventSharedBuffer.put(
                            currentState.getName(),
                            event,
                            timestamp,
                            previousState.getName(),
                            previousEvent,
                            computationState.getTimestamp(),
                            computationState.getCounter(),
                            currentVersion);
                    }

                    addComputationState(
                        resultingComputationStates,
                        nextState,
                        currentState,
                        event,
                        counter,
                        timestamp,
                        nextVersion,
                        startTimestamp);

                    // check if newly created state is optional (have a PROCEED path to Final state)
                    final State<T> finalState = findFinalStateAfterProceed(nextState, event, computationState);
                    if (finalState != null) {
                        addComputationState(
                            resultingComputationStates,
                            finalState,
                            currentState,
                            event,
                            counter,
                            timestamp,
                            nextVersion,
                            startTimestamp);
                    }
                    break;
            }
        }

        return resultingComputationStates;
    }
 
    private OutgoingEdges<T> createDecisionGraph(ComputationState<T> computationState, T event) {
        final OutgoingEdges<T> outgoingEdges = new OutgoingEdges<>(computationState.getState());

        final Stack<State<T>> states = new Stack<>();
        states.push(computationState.getState());

        // First create all outgoing edges, so to be able to reason about the Dewey version
        while (!states.isEmpty()) {
            State<T> currentState = states.pop();
            Collection<StateTransition<T>> stateTransitions = currentState.getStateTransitions(); // fetch all stateTransitions of the state

            // check all state transitions for each state
            for (StateTransition<T> stateTransition : stateTransitions) {
                try {
                    if (checkFilterCondition(computationState, stateTransition.getCondition(), event)) {
                        // filter condition is true
                        switch (stateTransition.getAction()) {
                            case PROCEED: // for PROCEED, jump straight to the next state
                                // simply advance the computation state, but apply the current event to it
                                // PROCEED is equivalent to an epsilon transition
                                states.push(stateTransition.getTargetState());
                                break;
                            case IGNORE:
                            case TAKE: // otherwise, add the stateTransition to the outgoing edges
                                outgoingEdges.add(stateTransition);
                                break;
                        }
                    }
                } catch (Exception e) {
                    throw new RuntimeException("Failure happened in filter function.", e);
                }
            }
        }
        return outgoingEdges;
    }
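The stack-based handling of PROCEED as an epsilon transition can be tried out in isolation with the following sketch. Node, Edge and outgoing are hypothetical names for this example, conditions are plain Predicate<String> instances, and the sketch assumes the PROCEED edges form no cycle (otherwise the loop would not terminate).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Predicate;

// Toy graph for illustration; not Flink's State/StateTransition/OutgoingEdges.
class DecisionGraphSketch {

    enum Action { TAKE, IGNORE, PROCEED }

    static final class Edge {
        final Action action;
        final Node target;
        final Predicate<String> condition;

        Edge(Action action, Node target, Predicate<String> condition) {
            this.action = action;
            this.target = target;
            this.condition = condition;
        }
    }

    static final class Node {
        final String name;
        final List<Edge> edges = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }

        Node edge(Action action, Node target, Predicate<String> condition) {
            edges.add(new Edge(action, target, condition));
            return this;
        }
    }

    // Collects every TAKE/IGNORE edge whose condition holds for the event,
    // following PROCEED edges without consuming the event.
    static List<String> outgoing(Node start, String event) {
        List<String> result = new ArrayList<>();
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(start);
        while (!stack.isEmpty()) {
            Node current = stack.pop();
            for (Edge e : current.edges) {
                if (!e.condition.test(event)) {
                    continue; // filter condition is false: the edge is not usable
                }
                if (e.action == Action.PROCEED) {
                    stack.push(e.target); // epsilon move: examine the target with the same event
                } else {
                    result.add(current.name + " -" + e.action + "-> " + e.target.name);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Node end = new Node("end");
        Node b = new Node("b").edge(Action.TAKE, end, "b"::equals);
        // "a" is optional: it can TAKE an "a" event or PROCEED straight to "b"
        Node a = new Node("a").edge(Action.TAKE, b, "a"::equals).edge(Action.PROCEED, b, e -> true);
        System.out.println(outgoing(a, "b"));
    }
}
```

With the optional state "a" as the current state, a "b" event yields the TAKE edge of "b", reached through the PROCEED edge, which is how an optional pattern gets skipped.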
 
 
 
 
 
 
 
 
