Flink – CEP NFA

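Let's look at how Flink CEP converts a pattern into an NFA.

When an event arrives, how is it executed in the NFA?

The call chain leading up to this point: CEP –> PatternStream –> select –> CEPOperatorUtils.createPatternStream

1. Call NFACompiler.compileFactory, which converts the pattern into states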

final NFACompiler.NFAFactory<T> nfaFactory = NFACompiler.compileFactory(pattern, inputSerializer, false);

            final NFAFactoryCompiler<T> nfaFactoryCompiler = new NFAFactoryCompiler<>(pattern);

            nfaFactoryCompiler.compileFactory();

            return new NFAFactoryImpl<>(inputTypeSerializer, nfaFactoryCompiler.getWindowTime(), nfaFactoryCompiler.getStates(), timeoutHandling);

This calls nfaFactoryCompiler.compileFactory:

        void compileFactory() {

            // we're traversing the pattern from the end to the beginning --> the first state is the final state

            State<T> sinkState = createEndingState();

            // add all the normal states

            sinkState = createMiddleStates(sinkState);

            // add the beginning state

            createStartState(sinkState);

        }

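As you can see, the main work is generating states, i.e. converting the pattern into the states and state transitions of the NFA.

Because new patterns are always appended at the end, and each pattern only points to its predecessor through private final Pattern<T, ? extends T> previous, the pattern chain can only be traversed backwards.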
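The backward walk can be sketched with a toy pattern chain. This is only an illustration: ToyPattern, BackwardCompileSketch and the state name "FINAL" are made up for this example and are not Flink classes or constants.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Flink's Pattern: each pattern only knows its predecessor.
final class ToyPattern {
    final String name;
    final ToyPattern previous;

    private ToyPattern(String name, ToyPattern previous) {
        this.name = name;
        this.previous = previous;
    }

    static ToyPattern begin(String name) {
        return new ToyPattern(name, null);
    }

    // Appending returns a new tail that points back at the current pattern.
    ToyPattern next(String name) {
        return new ToyPattern(name, this);
    }
}

final class BackwardCompileSketch {
    // Returns state names in creation order: final state first, start state last.
    static List<String> compileStateNames(ToyPattern last) {
        List<String> created = new ArrayList<>();
        created.add("FINAL");                 // the ending state is created before anything else
        ToyPattern current = last;
        while (current != null) {
            created.add(current.name);        // one state per pattern
            current = current.previous;       // only a backward link exists, so walk backwards
        }
        return created;
    }
}
```

For a chain built as begin("start").next("middle").next("end-pattern"), the states are created in the order FINAL, end-pattern, middle, start, which is why the first state created is the final one.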

First, create the final state:

        private State<T> createEndingState() {

            State<T> endState = createState(ENDING_STATE_NAME, State.StateType.Final);

            windowTime = currentPattern.getWindowTime() != null ? currentPattern.getWindowTime().toMilliseconds() : 0L;

            return endState;

        }

This is simple: it just creates a state.

        private State<T> createState(String name, State.StateType stateType) {

            String stateName = getUniqueInternalStateName(name);

            usedNames.add(stateName);

            State<T> state = new State<>(stateName, stateType);

            states.add(state);

            return state;

        }

Next, add the middle states:

        private State<T> createMiddleStates(final State<T> sinkState) {

            State<T> lastSink = sinkState; // remember the previously created state

            while (currentPattern.getPrevious() != null) {

                checkPatternNameUniqueness(currentPattern.getName());

                lastSink = convertPattern(lastSink); // convert the pattern into a state

                // we traverse the pattern graph backwards

                followingPattern = currentPattern;

                currentPattern = currentPattern.getPrevious(); // step back to the previous pattern

                final Time currentWindowTime = currentPattern.getWindowTime();

                if (currentWindowTime != null && currentWindowTime.toMilliseconds() < windowTime) {

                    // the window time is the global minimum of all window times of each state

                    windowTime = currentWindowTime.toMilliseconds();

                }

            }

            return lastSink;

        }
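The window-time bookkeeping in createEndingState and createMiddleStates can be mirrored in isolation. The sketch below is a literal reading of the two excerpts above, nothing more; WindowTimeSketch and its parameter are hypothetical names, and windows are plain millisecond values with null meaning "no window set on this pattern".

```java
import java.util.List;

final class WindowTimeSketch {
    // windowsLastToFirst: per-pattern window in ms, ordered from the last pattern back to the first.
    static long effectiveWindowTime(List<Long> windowsLastToFirst) {
        Long last = windowsLastToFirst.get(0);
        long windowTime = last != null ? last : 0L;            // as in createEndingState
        for (int i = 1; i < windowsLastToFirst.size(); i++) {
            Long current = windowsLastToFirst.get(i);
            if (current != null && current < windowTime) {     // as in createMiddleStates
                windowTime = current;                          // keep the smallest window seen so far
            }
        }
        return windowTime;
    }
}
```

Read this way, a window set only on an earlier pattern cannot lower the value when the last pattern has none, because the starting value is already 0. That follows from the excerpt as quoted; other Flink versions may behave differently.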

This calls convertPattern:

        private State<T> convertPattern(final State<T> sinkState) {

            final State<T> lastSink;

            lastSink = createSingletonState(sinkState); // only the singleton-state case is shown here

            addStopStates(lastSink);

            return lastSink;

        }

createSingletonState

        private State<T> createSingletonState(final State<T> sinkState, final IterativeCondition<T> ignoreCondition, final boolean isOptional) {

            final IterativeCondition<T> currentCondition = (IterativeCondition<T>) currentPattern.getCondition(); // take the condition out of the pattern

            final IterativeCondition<T> trueFunction = BooleanConditions.trueFunction();

            final State<T> singletonState = createState(currentPattern.getName(), State.StateType.Normal); // create the singletonState for currentPattern

            // if event is accepted then all notPatterns previous to the optional states are no longer valid

            // note: `sink` is derived from sinkState in the full source; that line is omitted in this excerpt

            singletonState.addTake(sink, currentCondition); // add the TAKE StateTransition

            if (isOptional) {

                // if no element accepted the previous nots are still valid.

                singletonState.addProceed(sinkState, trueFunction); // for an optional pattern, also add a PROCEED StateTransition

            }

            return singletonState;

        }
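To make the TAKE / PROCEED distinction concrete, here is a stripped-down sketch of a state that records its transitions. ToyState and SingletonStateSketch are invented for illustration; conditions are plain java.util.function.Predicate values rather than Flink's IterativeCondition, and the not-pattern handling is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class ToyState<T> {
    enum Action { TAKE, IGNORE, PROCEED }

    static final class Transition<T> {
        final Action action;
        final ToyState<T> target;
        final Predicate<T> condition;

        Transition(Action action, ToyState<T> target, Predicate<T> condition) {
            this.action = action;
            this.target = target;
            this.condition = condition;
        }
    }

    final String name;
    final List<Transition<T>> transitions = new ArrayList<>();

    ToyState(String name) {
        this.name = name;
    }

    void addTake(ToyState<T> target, Predicate<T> condition) {
        transitions.add(new Transition<>(Action.TAKE, target, condition));
    }

    void addProceed(ToyState<T> target, Predicate<T> condition) {
        transitions.add(new Transition<>(Action.PROCEED, target, condition));
    }
}

final class SingletonStateSketch {
    static <T> ToyState<T> createSingleton(
            String name, ToyState<T> sink, Predicate<T> condition, boolean optional) {
        ToyState<T> state = new ToyState<>(name);
        state.addTake(sink, condition);          // consume the event when the condition holds
        if (optional) {
            state.addProceed(sink, e -> true);   // always-true edge: the state may be skipped
        }
        return state;
    }
}
```

A required pattern ends up with a single TAKE edge, while an optional one gets an extra always-true PROCEED edge to the same sink, so the matcher can move on without consuming an event.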

addTake –> addStateTransition

    public void addStateTransition(

            final StateTransitionAction action,

            final State<T> targetState,

            final IterativeCondition<T> condition) {

        stateTransitions.add(new StateTransition<T>(this, action, targetState, condition));

    }

createStartState

        private State<T> createStartState(State<T> sinkState) {

            checkPatternNameUniqueness(currentPattern.getName());

            final State<T> beginningState = convertPattern(sinkState);

            beginningState.makeStart();

            return beginningState;

        }

2. How is an incoming event processed?

AbstractKeyedCEPPatternOperator.processElement

            NFA<IN> nfa = getNFA();

            processEvent(nfa, element.getValue(), getProcessingTimeService().getCurrentProcessingTime());

            updateNFA(nfa);

If the state backend already holds an NFA, fetch it; otherwise create one with nfaFactory.createNFA:

    private NFA<IN> getNFA() throws IOException {

        NFA<IN> nfa = nfaOperatorState.value();

        return nfa != null ? nfa : nfaFactory.createNFA();

    }
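The get / process / update cycle around the stored NFA can be pictured with an in-memory map standing in for the state backend. This is a sketch under that assumption only: KeyedNfaStoreSketch is a made-up class, the explicit key argument replaces the operator's current-key context, and the created counter exists purely to make the behaviour observable.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

final class KeyedNfaStoreSketch<K, N> {
    private final Map<K, N> backend = new HashMap<>();   // stand-in for the keyed state backend
    private final Supplier<N> factory;                   // stand-in for nfaFactory.createNFA()
    int created = 0;                                     // how many fresh NFAs were built

    KeyedNfaStoreSketch(Supplier<N> factory) {
        this.factory = factory;
    }

    // Mirrors getNFA(): reuse the stored instance, otherwise build a new one.
    N get(K key) {
        N nfa = backend.get(key);
        if (nfa != null) {
            return nfa;
        }
        created++;
        return factory.get();
    }

    // Mirrors updateNFA(): write the (possibly mutated) NFA back after processing.
    void update(K key, N nfa) {
        backend.put(key, nfa);
    }
}
```

Note that get alone does not store anything; the freshly created NFA only becomes visible to later events once update has written it back, which matches the order getNFA, processEvent, updateNFA in processElement.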

createNFA

    NFA<T> result =  new NFA<>(inputTypeSerializer.duplicate(), windowTime, timeoutHandling);

    result.addStates(states);

addState

    public void addStates(final Collection<State<T>> newStates) {

        for (State<T> state: newStates) {

            addState(state);

        }

    }

    public void addState(final State<T> state) {

        states.add(state);

        if (state.isStart()) {

            computationStates.add(ComputationState.createStartState(this, state));

        }

    }

This adds the states to the NFA.

The start state is also added to computationStates, because pattern matching always begins from the start state.

KeyedCEPPatternOperator – > processEvent

    protected void processEvent(NFA<IN> nfa, IN event, long timestamp) {

        Tuple2<Collection<Map<String, List<IN>>>, Collection<Tuple2<Map<String, List<IN>>, Long>>> patterns =

            nfa.process(event, timestamp);

        emitMatchedSequences(patterns.f0, timestamp);

    }

NFA –> process

    public Tuple2<Collection<Map<String, List<T>>>, Collection<Tuple2<Map<String, List<T>>, Long>>> process(final T event, final long timestamp) {

        final int numberComputationStates = computationStates.size();

        final Collection<Map<String, List<T>>> result = new ArrayList<>();

        final Collection<Tuple2<Map<String, List<T>>, Long>> timeoutResult = new ArrayList<>();

        // iterate over all current computations

        for (int i = 0; i < numberComputationStates; i++) { // iterate over all current computation states

            ComputationState<T> computationState = computationStates.poll(); // poll one state

            final Collection<ComputationState<T>> newComputationStates;

            newComputationStates = computeNextStates(computationState, event, timestamp); // compute the next batch of states through the NFA

            //delay adding new computation states in case a stop state is reached and we discard the path.

            final Collection<ComputationState<T>> statesToRetain = new ArrayList<>(); // newComputationStates may contain a stop state, in which case statesToRetain is discarded rather than kept

            //if stop state reached in this path

            boolean shouldDiscardPath = false;

            for (final ComputationState<T> newComputationState: newComputationStates) {

                if (newComputationState.isFinalState()) { // a final state means the match is complete

                    // we've reached a final state and can thus retrieve the matching event sequence

                    Map<String, List<T>> matchedPattern = extractCurrentMatches(newComputationState);

                    result.add(matchedPattern);

                    // remove found patterns because they are no longer needed

                    eventSharedBuffer.release(

                            newComputationState.getPreviousState().getName(),

                            newComputationState.getEvent(),

                            newComputationState.getTimestamp(),

                            computationState.getCounter());

                } else if (newComputationState.isStopState()) { // a stop state means this path must be dropped

                    //reached stop state. release entry for the stop state

                    shouldDiscardPath = true;

                    eventSharedBuffer.release(

                        newComputationState.getPreviousState().getName(),

                        newComputationState.getEvent(),

                        newComputationState.getTimestamp(),

                        computationState.getCounter());

                } else { // intermediate state: put it into statesToRetain

                    // add new computation state; it will be processed once the next event arrives

                    statesToRetain.add(newComputationState);

                }

            }

            if (shouldDiscardPath) { // release the discarded path

                // a stop state was reached in this branch. release branch which results in removing previous event from

                // the buffer

                for (final ComputationState<T> state : statesToRetain) {

                    eventSharedBuffer.release(

                        state.getPreviousState().getName(),

                        state.getEvent(),

                        state.getTimestamp(),

                        state.getCounter());

                }

            } else { // add the intermediate states to computationStates

                computationStates.addAll(statesToRetain);

            }

        }

        // prune shared buffer based on window length

        if (windowTime > 0L) { // prune patterns that have timed out

            long pruningTimestamp = timestamp - windowTime;

            if (pruningTimestamp < timestamp) {

                // the check is to guard against underflows

                // remove all elements which are expired

                // with respect to the window length

                eventSharedBuffer.prune(pruningTimestamp);

            }

        }

        return Tuple2.of(result, timeoutResult);

    }
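The overall shape of this loop (take a snapshot of the queue size, poll each computation state, emit on a final state, re-queue everything else) can be reproduced with a toy linear NFA over strings. Everything here is a simplification made up for illustration: ToyNfa is not a Flink class, there is no shared buffer, versioning or window pruning, a non-matching partial match is simply dropped (strict contiguity), and the start state is re-queued on every event so new matches can begin, a step the excerpt above does not show.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Predicate;

final class ToyNfa {
    // A partial match: index of the state waiting for the next event, plus events taken so far.
    private static final class Computation {
        final int stateIndex;
        final List<String> taken;

        Computation(int stateIndex, List<String> taken) {
            this.stateIndex = stateIndex;
            this.taken = taken;
        }
    }

    private final List<Predicate<String>> conditions;   // conditions.get(i) guards the TAKE edge leaving state i
    private final Queue<Computation> computationStates = new ArrayDeque<>();

    ToyNfa(List<Predicate<String>> conditions) {
        this.conditions = conditions;
        computationStates.add(new Computation(0, new ArrayList<>()));   // matching starts at the start state
    }

    List<List<String>> process(String event) {
        List<List<String>> matches = new ArrayList<>();
        int n = computationStates.size();   // snapshot: only states that existed before this event
        for (int i = 0; i < n; i++) {
            Computation c = computationStates.poll();
            if (c.stateIndex == 0) {
                computationStates.add(c);   // toy assumption: keep the start state alive
            }
            if (conditions.get(c.stateIndex).test(event)) {
                List<String> taken = new ArrayList<>(c.taken);
                taken.add(event);
                int next = c.stateIndex + 1;
                if (next == conditions.size()) {
                    matches.add(taken);     // final state reached: emit the match
                } else {
                    computationStates.add(new Computation(next, taken));   // wait for the next event
                }
            }
        }
        return matches;
    }
}
```

With the two conditions "starts with a" and "starts with b", feeding a1 and then b1 yields the single match [a1, b1] on the second call; a later a2 followed by c leaves nothing pending, so a subsequent b3 produces no match.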

computeNextStates

    private Collection<ComputationState<T>> computeNextStates(

            final ComputationState<T> computationState,

            final T event,

            final long timestamp) {

        final OutgoingEdges<T> outgoingEdges = createDecisionGraph(computationState, event); // collect all outgoing edges of the state

        final List<StateTransition<T>> edges = outgoingEdges.getEdges();

        final List<ComputationState<T>> resultingComputationStates = new ArrayList<>();

        for (StateTransition<T> edge : edges) {

            switch (edge.getAction()) {

                case IGNORE: {

                    if (!computationState.isStartState()) {

                        final DeweyNumber version;

                        if (isEquivalentState(edge.getTargetState(), computationState.getState())) {

                            //Stay in the same state (it can be either looping one or singleton)

                            final int toIncrease = calculateIncreasingSelfState(

                                outgoingEdges.getTotalIgnoreBranches(),

                                outgoingEdges.getTotalTakeBranches());

                            version = computationState.getVersion().increase(toIncrease);

                        } else {

                            //IGNORE after PROCEED

                            version = computationState.getVersion()

                                .increase(totalTakeToSkip + ignoreBranchesToVisit)

                                .addStage();

                            ignoreBranchesToVisit--;

                        }

                        addComputationState( // IGNORE: the event is not taken; just add the target state to the computation states

                                resultingComputationStates,

                                edge.getTargetState(),

                                computationState.getPreviousState(),

                                computationState.getEvent(),

                                computationState.getCounter(),

                                computationState.getTimestamp(),

                                version,

                                computationState.getStartTimestamp()

                        );

                    }

                }

                break;

                case TAKE:

                    final State<T> nextState = edge.getTargetState();

                    final State<T> currentState = edge.getSourceState();

                    final State<T> previousState = computationState.getPreviousState();

                    final T previousEvent = computationState.getEvent();

                    final int counter;

                    final long startTimestamp;

                    // TAKE: the current state must be recorded in the path, i.e. put into eventSharedBuffer

                    if (computationState.isStartState()) {

                        startTimestamp = timestamp;

                        counter = eventSharedBuffer.put(

                            currentState.getName(),

                            event,

                            timestamp,

                            currentVersion);

                    } else {

                        startTimestamp = computationState.getStartTimestamp();

                        counter = eventSharedBuffer.put(

                            currentState.getName(),

                            event,

                            timestamp,

                            previousState.getName(),

                            previousEvent,

                            computationState.getTimestamp(),

                            computationState.getCounter(),

                            currentVersion);

                    }

                    addComputationState(

                            resultingComputationStates,

                            nextState,

                            currentState,

                            event,

                            counter,

                            timestamp,

                            nextVersion,

                            startTimestamp);

                    //check if newly created state is optional (have a PROCEED path to Final state)

                    final State<T> finalState = findFinalStateAfterProceed(nextState, event, computationState);

                    if (finalState != null) {

                        addComputationState(

                                resultingComputationStates,

                                finalState,

                                currentState,

                                event,

                                counter,

                                timestamp,

                                nextVersion,

                                startTimestamp);

                    }

                    break;

            }

        }

        return resultingComputationStates;

    }

    private OutgoingEdges<T> createDecisionGraph(ComputationState<T> computationState, T event) {

        final OutgoingEdges<T> outgoingEdges = new OutgoingEdges<>(computationState.getState());

        final Stack<State<T>> states = new Stack<>();

        states.push(computationState.getState());

        //First create all outgoing edges, so to be able to reason about the Dewey version

        while (!states.isEmpty()) {

            State<T> currentState = states.pop();

            Collection<StateTransition<T>> stateTransitions = currentState.getStateTransitions(); // fetch all stateTransitions of the state

            // check all state transitions for each state

            for (StateTransition<T> stateTransition : stateTransitions) {

                try {

                    if (checkFilterCondition(computationState, stateTransition.getCondition(), event)) {

                        // filter condition is true

                        switch (stateTransition.getAction()) {

                            case PROCEED:  // PROCEED: jump straight to the next state

                                // simply advance the computation state, but apply the current event to it

                                // PROCEED is equivalent to an epsilon transition

                                states.push(stateTransition.getTargetState());

                                break;

                            case IGNORE:

                            case TAKE: // IGNORE and TAKE: add the stateTransition to the outgoing edges

                                outgoingEdges.add(stateTransition);

                                break;

                        }

                    }

                } catch (Exception e) {

                    throw new RuntimeException("Failure happened in filter function.", e);

                }

            }

        }

        return outgoingEdges;

    }
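The way PROCEED behaves like an epsilon transition can be seen in a small stack-based traversal. The sketch below follows the structure of createDecisionGraph but uses invented types (EdgeCollectorSketch, Node, Edge) and string events; it is not the Flink implementation, and it assumes the PROCEED edges do not form a cycle.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Predicate;

final class EdgeCollectorSketch {
    enum Action { TAKE, IGNORE, PROCEED }

    static final class Node {
        final String name;
        final List<Edge> edges = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }

        Node edge(Action action, Node target, Predicate<String> condition) {
            edges.add(new Edge(this, action, target, condition));
            return this;
        }
    }

    static final class Edge {
        final Node source;
        final Action action;
        final Node target;
        final Predicate<String> condition;

        Edge(Node source, Action action, Node target, Predicate<String> condition) {
            this.source = source;
            this.action = action;
            this.target = target;
            this.condition = condition;
        }

        @Override
        public String toString() {
            return source.name + "-" + action + "->" + target.name;
        }
    }

    // Collects every TAKE/IGNORE edge reachable from start for this event,
    // following PROCEED edges without consuming the event.
    static List<Edge> outgoingEdges(Node start, String event) {
        List<Edge> result = new ArrayList<>();
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(start);
        while (!stack.isEmpty()) {
            Node current = stack.pop();
            for (Edge e : current.edges) {
                if (!e.condition.test(event)) {
                    continue;                  // filter condition is false: edge not usable
                }
                if (e.action == Action.PROCEED) {
                    stack.push(e.target);      // epsilon move: also inspect the target's edges
                } else {
                    result.add(e);             // TAKE or IGNORE: a real outgoing edge
                }
            }
        }
        return result;
    }
}
```

For an optional state a (TAKE to b on events starting with "a", plus an always-true PROCEED to b) followed by b (TAKE to end on events starting with "b"), the event b1 yields the edge b-TAKE->end even though the computation is still sitting in a, which is exactly what lets an optional pattern be skipped.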
