Understanding Complex Event Processing (CEP)/ Streaming SQL Operators with WSO2 CEP (Siddhi)
Reposted from: https://iwringer.wordpress.com/2013/08/07/understanding-complex-event-processing-cep-operators-with-wso2-cep-siddhi/
A CEP model has many sensors. A sensor can be a real sensor (e.g. a temperature sensor), an agent, or a system that supports instrumentation. Each sensor sends events to the CEP engine, and each event carries several name-value properties.

We call the events coming from the same sensor a “stream” and give it a name. When an interesting event occurs, the sensor sends that event to the stream.
To use a stream, you first need to define it.
define stream PizzaOrders (id string, price float, ts long, custid string)
The CEP engine listens to one or more streams, and we can write queries telling it to look for certain conditions. To write queries, you can use the following constructs.
- Filters
- Windows
- Joins
- Patterns and Sequences
- Event tables
- Partitions
Let us see what we can do with each construct.
Filter
A filter checks a condition on the properties of an event. The condition can use operators such as =, >, and <, and you can build complex conditions by combining several of them with and, or, not, etc.
The following query detects pizza orders that are small and placed too far from the store.
select from PizzaOrders[price < … and distance > 1km]
insert into NBNOrders id, price, distance
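The filter idea can be sketched in plain Python. This is only an illustration, not how Siddhi evaluates filters: the PizzaOrder class, the distance_km field, and the threshold values (10.0 and 1.0 km) are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class PizzaOrder:
    id: str
    price: float
    distance_km: float  # hypothetical attribute, assumed for this sketch


def filter_orders(events, max_price, min_distance_km):
    """Yield (id, price, distance) for orders that are cheap and far away."""
    for e in events:
        # Both conditions must hold, like combining them with "and".
        if e.price < max_price and e.distance_km > min_distance_km:
            yield (e.id, e.price, e.distance_km)


orders = [
    PizzaOrder("o1", 8.0, 2.5),   # small and far: matches
    PizzaOrder("o2", 30.0, 3.0),  # far but not small
    PizzaOrder("o3", 6.0, 0.4),   # small but close
]
matches = list(filter_orders(orders, max_price=10.0, min_distance_km=1.0))
print(matches)
```

Only the first order passes both conditions; the other two each fail one of them.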
Windows
An event stream can have an infinite number of events. Windows are a way to select a subset of events for further processing. You can select events in many ways: events that arrived within a time period, the last N events, etc.
The output of a window is a set of events. You can use it for further processing (e.g. joining event streams) or to calculate aggregate functions such as sum and average.
Output can be triggered either once all events of the window have been collected or whenever a new event is added. We call the first type batch windows and the second sliding windows.
For example, a window can collect all pizza orders placed in the last hour and emit the average value of the order once every hour.
from PizzaOrders#window.time( 1h ) insert into HourlyOrderStats avg(price) as avgPrice
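To make the difference between sliding and batch windows concrete, here is a small Python sketch. It is a simplification under stated assumptions, not Siddhi's window implementation: events are (timestamp in seconds, price) pairs that arrive in time order, and the function names are hypothetical.

```python
from collections import deque


def sliding_avg(events, length_s):
    """Emit an average on every arrival, over events from the last length_s seconds."""
    window = deque()
    out = []
    for ts, price in events:
        window.append((ts, price))
        # Drop events that are length_s seconds old or older.
        while window and window[0][0] <= ts - length_s:
            window.popleft()
        out.append(sum(p for _, p in window) / len(window))
    return out


def batch_avg(events, length_s):
    """Emit one average per fixed, non-overlapping bucket of length_s seconds."""
    buckets = {}
    for ts, price in events:
        buckets.setdefault(ts // length_s, []).append(price)
    return [sum(v) / len(v) for _, v in sorted(buckets.items())]


events = [(0, 10.0), (1800, 20.0), (3600, 30.0), (5400, 40.0)]
sliding = sliding_avg(events, 3600)
batch = batch_avg(events, 3600)
print(sliding, batch)
```

The sliding version produces one output per incoming event, while the batch version produces one output per hour of input.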
Joins
The join operator joins two event streams. The idea is to match events coming from the two streams and create a new event stream.
For example, you can use the join operator to join the PizzaDelivery stream and the PizzaOrder stream and calculate the time taken to deliver each order.
from PizzaOrder#window.time(1h) as o join PizzaDelivery as d
on o.id == d.id
insert into DeliveryTime o.id as id, d.ts-o.ts as ts
At least one side of the join must have a window. In the example above, we use a one-hour window on PizzaOrder (because delivery always happens after the order): the join stores the events arriving on PizzaOrder for one hour and matches them against delivery events. If both sides have windows, the join stores the events of each stream and matches them against the events arriving on the other stream.
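The way a windowed join keeps one side in memory can be sketched in Python. This is an illustrative model only, assuming time-ordered input and a single window on the order side; the tuple layout and the function name are hypothetical.

```python
from collections import deque


def join_delivery_times(events, window_s=3600):
    """events: time-ordered (kind, order_id, ts), kind is 'order' or 'delivery'."""
    orders = deque()  # (order_id, ts) pairs kept for window_s seconds
    out = []
    for kind, oid, ts in events:
        # Expire orders that have fallen out of the window.
        while orders and orders[0][1] < ts - window_s:
            orders.popleft()
        if kind == "order":
            orders.append((oid, ts))
        else:
            # Match the delivery against every order still in the window.
            for o_id, o_ts in orders:
                if o_id == oid:
                    out.append((oid, ts - o_ts))
    return out


events = [
    ("order", "o1", 0),
    ("order", "o2", 600),
    ("delivery", "o1", 1500),
    ("delivery", "o2", 5000),  # order o2 has already expired from the window
]
result = join_delivery_times(events)
print(result)
```

The second delivery finds no partner because its order left the one-hour window before the delivery arrived, which is the trade-off of bounding the stored events.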
Patterns and Sequences
Patterns and sequences let us match conditions that happen over time.
For example, we can use patterns to identify returning customers with the following query. Here -> denotes the “followed by” relationship.
from every a1 = PizzaOrder
-> a2 = PizzaOrder[custid == a1.custid]
insert into ReturningCustomers
a1.custid as custid, a2.ts as ts
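A rough Python model of the “followed by” pattern may help. It assumes simplified semantics: every order starts a new pending match, and a pending match completes on the next order from the same customer, regardless of what arrives in between. The function name and the (custid, ts) tuple layout are hypothetical.

```python
def returning_customers(events):
    """events: time-ordered (custid, ts). Emit (custid, ts) when a customer orders again."""
    pending = set()  # customers with an earlier order waiting for a follow-up
    out = []
    for custid, ts in events:
        if custid in pending:
            out.append((custid, ts))  # completes the pending match
        pending.add(custid)           # this order also starts a new match
    return out


events = [("c1", 1), ("c2", 2), ("c1", 3), ("c1", 4)]
returning = returning_customers(events)
print(returning)
```

The order from c2 at time 2 sits between two c1 orders and does not prevent the match at time 3.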
Patterns match even when other events occur between the two matching conditions. Sequences are similar, but the given event sequence must exactly match the events that happened. For example, the following is the same query implemented using sequences. Note that the second line is there to skip any events that do not match.
from every a1 = PizzaOrder,
PizzaOrder[custid != a1.custid]*,
a2 = PizzaOrder[custid == a1.custid]
insert into ReturningCustomers
a1.custid as custid, a2.ts as ts
Here, instead of the -> relationship, we use a regular-expression-like notation to define a sequence of conditions.
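The effect of the skip clause in a sequence can be illustrated with a short Python sketch. It evaluates a finished list of events rather than a live stream, and it is a simplified model rather than Siddhi's matcher; the function name and the allow_other_customers flag are hypothetical.

```python
def sequence_matches(events, allow_other_customers):
    """events: time-ordered (custid, ts)."""
    out = []
    for i, (cust, _) in enumerate(events):
        for other, ts in events[i + 1:]:
            if other == cust:
                out.append((cust, ts))
                break
            if not allow_other_customers:
                break  # strict sequence: the very next event had to match
    return out


events = [("c1", 1), ("c2", 2), ("c1", 3), ("c1", 4)]
strict = sequence_matches(events, allow_other_customers=False)
relaxed = sequence_matches(events, allow_other_customers=True)
print(strict, relaxed)
```

Without the skip clause only directly adjacent orders from the same customer match; with it, the result is the same as for the pattern version.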
Partitions (available in upcoming 3.0 release)
Siddhi evaluates a query matching all the events in event streams used by that query. Partitions let us partition events into several groups based on some condition before evaluating queries.
For example, let us say we need to find the time spent until a pizza left the shop and until it was delivered. We can first partition pizza orders by order ID and then evaluate the query. This simplifies the query to a great extent.
define partition orderPartition by PizzaOrder.id, PizzaDone.oid, PizzaDelivered.oid
select from PizzaOrder as o -> PizzaDone as p -> PizzaDelivered as d
insert into OrderTimes (p.ts-o.ts) as time2Prepare, (d.ts-p.ts) as time2Delivery
partition by orderPartition
We do this for several reasons.
- Evaluating events separately within several partitions might be faster than matching them all together, because with partitions we match events only within each partition.
- Sometimes it makes queries easier to design. For example, in the above query, partitioning lets us write the query without worrying about other orders that overlap with the same order.
- Partitions let the CEP runtime distribute evaluation across multiple machines, which can help when scaling queries.
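Partitioning can be pictured as keeping a separate small state per key. The Python sketch below assumes time-ordered (stage, order_id, ts) tuples and does not enforce the stage order the way a real pattern would; all names are hypothetical.

```python
from collections import defaultdict


def order_times(events):
    """events: time-ordered (stage, order_id, ts); stage is 'order', 'done' or 'delivered'."""
    by_order = defaultdict(dict)  # one independent state per partition key
    out = []
    for stage, oid, ts in events:
        state = by_order[oid]
        state[stage] = ts
        if stage == "delivered" and "order" in state and "done" in state:
            time_to_prepare = state["done"] - state["order"]
            time_to_deliver = ts - state["done"]
            out.append((oid, time_to_prepare, time_to_deliver))
            del by_order[oid]  # this partition's work is finished
    return out


events = [
    ("order", "A", 0), ("order", "B", 5),
    ("done", "A", 20), ("done", "B", 22),
    ("delivered", "B", 40), ("delivered", "A", 50),
]
times = order_times(events)
print(times)
```

Orders A and B overlap in time, yet each is evaluated only against its own events, and the per-key states could in principle live on different machines.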
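Event tables
Event tables store events so that later queries can use them. You define an event table much like a stream.
define table LatePizzaOrdersTable (ordered string, ts long, price float);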
Then you can add events to it, delete events from it, and join those events in the table against incoming events.
For example, let’s say we need to store all late deliveries, and if a late delivery happens to the same customer twice, we want to give them a free pizza.
from LatePizzaDeliveries insert into LatePizzaOrdersTable;
Then we can join events from event table with incoming events as follows.
from LatePizzaDeliveries as l join LatePizzaOrdersTable as t
on l.custid == t.custid AND l.ts != t.ts
insert into FreePizzaOrders
You can also do the same using an event stream. However, event tables can be written to disk, which makes them very useful for long-running use cases. For example, if we do the above using an event stream, the stored values will be lost when we restart the server, whereas values in event tables are preserved on disk.
Update 2017 September: You can try out the above queries with WSO2 Stream Processor, which is freely available under the Apache License 2.0.
Update 2018 January: You can find a detailed discussion about operators in Stream Processing 101: From SQL to Streaming SQL in 10 Minutes.
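The free-pizza rule can be modelled in Python with a list standing in for the event table. This is a sketch under assumptions: the list lives in memory only (a real event table could be backed by disk), input is time-ordered (custid, ts) tuples, and the function name is hypothetical.

```python
def free_pizza_customers(late_deliveries):
    """late_deliveries: time-ordered (custid, ts) tuples."""
    table = []  # in-memory stand-in for the event table
    out = []
    for custid, ts in late_deliveries:
        # Join the incoming event against the rows stored so far:
        # same customer, different timestamp.
        if any(c == custid and t != ts for c, t in table):
            out.append(custid)
        table.append((custid, ts))  # store the late delivery
    return out


late = [("c1", 10), ("c2", 20), ("c1", 30)]
free = free_pizza_customers(late)
print(free)
```

Customer c1 is late twice and earns a free pizza, while c2 has only one late delivery on record.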