Use Elasticsearch to solve the Top N problem
The raw data looks like this:
timestamp, router, interface, src_ip, dst_ip, protocol, pkts
10000000, 1.1.1.1, 1, 2.2.2.2, 1.3.3.3, tcp, 100
10000000, 1.1.1.2, 2, 2.2.8.2, 2.3.3.3, tcp, 200
10000001, 8.1.1.1, 1, 2.2.2.8, 3.3.3.3, udp, 500
10000001, 2.1.1.1, 1, 2.2.8.2, 4.3.3.8, udp, 800
I loaded this data into Elasticsearch. Now I want to answer the following question:
Which 3 combinations of src_ip and dst_ip sent the most pkts?
Translated into SQL, this requirement would be:
SELECT src_ip, dst_ip, sum(pkts) as total FROM raw_data GROUP BY src_ip, dst_ip ORDER BY total DESC LIMIT 3
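To make the expected result concrete, here is a minimal Python sketch of what the SQL above computes, run against the four sample rows. The function name top_n_pairs is only illustrative, and the rows are copied by hand from the sample (src_ip, dst_ip and pkts columns only).

```python
from collections import Counter

# Sample rows from the raw data above: (src_ip, dst_ip, pkts)
rows = [
    ("2.2.2.2", "1.3.3.3", 100),
    ("2.2.8.2", "2.3.3.3", 200),
    ("2.2.2.8", "3.3.3.3", 500),
    ("2.2.8.2", "4.3.3.8", 800),
]


def top_n_pairs(rows, n=3):
    """Sum pkts per (src_ip, dst_ip) pair and return the n largest totals."""
    totals = Counter()
    for src_ip, dst_ip, pkts in rows:
        totals[(src_ip, dst_ip)] += pkts  # GROUP BY src_ip, dst_ip + SUM(pkts)
    return totals.most_common(n)          # ORDER BY total DESC LIMIT n


top3 = top_n_pairs(rows)
for (src_ip, dst_ip), total in top3:
    print(src_ip, dst_ip, total)
```

In this tiny sample every pair occurs only once, so the totals are just the individual pkts values; with real data the same pair would accumulate across many rows.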
First of all, here is the actual data that I am trying to group and aggregate:
{"TAG":10001,"SRC_MAC":"52:54:00:14:05:4a","DST_MAC":"52:54:00:2c:e4:7c","VLAN":1342,"COS":84,"IN_IFACE":2,"OUT_IFACE":2,"SRC_IP":"42.120.85.133","DST_IP":"42.120.83.164","SRC_MASK":24,"DST_MASK":24,"SRC_PORT":13628,"DST_PORT":13783,"PROTOCOL":"ggp","PACKETS":11,"BYTES":4330,"time":1452643200}
{"TAG":10002,"SRC_MAC":"52:54:00:5d:b5:10","DST_MAC":"52:54:00:18:d8:de","VLAN":543,"COS":66,"IN_IFACE":2,"OUT_IFACE":2,"SRC_IP":"42.120.83.123","DST_IP":"42.120.86.184","SRC_MASK":24,"DST_MASK":24,"SRC_PORT":14731,"DST_PORT":14856,"PROTOCOL":"ip","PACKETS":6,"BYTES":3958,"time":1452643200}
{"TAG":10002,"SRC_MAC":"52:54:00:50:14:e1","DST_MAC":"52:54:00:3f:50:38","VLAN":250,"COS":77,"IN_IFACE":2,"OUT_IFACE":3,"SRC_IP":"42.120.83.165","DST_IP":"42.120.86.172","SRC_MASK":24,"DST_MASK":24,"SRC_PORT":11778,"DST_PORT":14673,"PROTOCOL":"isis","PACKETS":2,"BYTES":3803,"time":1452643200}
{"TAG":10001,"SRC_MAC":"52:54:00:17:5e:e3","DST_MAC":"52:54:00:75:da:af","VLAN":2647,"COS":2,"IN_IFACE":2,"OUT_IFACE":1,"SRC_IP":"42.120.86.253","DST_IP":"42.120.83.58","SRC_MASK":24,"DST_MASK":24,"SRC_PORT":16767,"DST_PORT":16418,"PROTOCOL":"ipv6-route","PACKETS":10,"BYTES":2852,"time":1452643200}
{"TAG":10002,"SRC_MAC":"52:54:00:4a:e6:49","DST_MAC":"52:54:00:37:f2:78","VLAN":1005,"COS":88,"IN_IFACE":3,"OUT_IFACE":2,"SRC_IP":"42.120.81.90","DST_IP":"42.120.81.248","SRC_MASK":24,"DST_MASK":24,"SRC_PORT":11573,"DST_PORT":16757,"PROTOCOL":"encap","PACKETS":7,"BYTES":1745,"time":1452643200}
{"TAG":10001,"SRC_MAC":"52:54:00:52:1e:29","DST_MAC":"52:54:00:6a:2b:0e","VLAN":1816,"COS":26,"IN_IFACE":2,"OUT_IFACE":3,"SRC_IP":"42.120.82.91","DST_IP":"42.120.85.121","SRC_MASK":24,"DST_MASK":24,"SRC_PORT":15961,"DST_PORT":15761,"PROTOCOL":"gre","PACKETS":16,"BYTES":2753,"time":1452643200}
{"TAG":10000,"SRC_MAC":"52:54:00:3d:16:89","DST_MAC":"52:54:00:33:b8:54","VLAN":3393,"COS":64,"IN_IFACE":4,"OUT_IFACE":2,"SRC_IP":"42.120.86.27","DST_IP":"42.120.85.184","SRC_MASK":24,"DST_MASK":24,"SRC_PORT":18677,"DST_PORT":17202,"PROTOCOL":"eigrp","PACKETS":6,"BYTES":3474,"time":1452643200}
{"TAG":10000,"SRC_MAC":"52:54:00:01:bb:4c","DST_MAC":"52:54:00:21:91:c0","VLAN":3803,"COS":23,"IN_IFACE":1,"OUT_IFACE":2,"SRC_IP":"42.120.85.186","DST_IP":"42.120.82.206","SRC_MASK":24,"DST_MASK":24,"SRC_PORT":15093,"DST_PORT":18784,"PROTOCOL":"ospf","PACKETS":20,"BYTES":3171,"time":1452643200}
The mapping
curl -XGET localhost:9200/sflow/_mapping | json_reformat
{
  "sflow": {
    "mappings": {
      "9k": {
        "properties": {
          "BYTES": {
            "type": "long"
          },
          "COS": {
            "type": "long"
          },
          "DST_IP": {
            "type": "string"
          },
          "DST_MAC": {
            "type": "string"
          },
          "DST_MASK": {
            "type": "long"
          },
          "DST_PORT": {
            "type": "long"
          },
          "IN_IFACE": {
            "type": "long"
          },
          "OUT_IFACE": {
            "type": "long"
          },
          "PACKETS": {
            "type": "long"
          },
          "PROTOCOL": {
            "type": "string"
          },
          "SRC_IP": {
            "type": "string"
          },
          "SRC_MAC": {
            "type": "string"
          },
          "SRC_MASK": {
            "type": "long"
          },
          "SRC_PORT": {
            "type": "long"
          },
          "TAG": {
            "type": "long"
          },
          "VLAN": {
            "type": "long"
          },
          "time": {
            "type": "long"
          }
        }
      }
    }
  }
}
The aggregation query
curl -XPOST 'localhost:9200/sflow/_search?pretty' -d '
{
  "query": { "match_all": {} },
  "aggs": {
    "by_SRC_IP": {
      "terms": {
        "script": "[doc.SRC_IP.value, doc.DST_IP.value].join('-')",
        "size": 3,
        "order": { "sum_bits": "desc" }
      },
      "aggs": { "sum_bits": { "sum": { "field": "BYTES" } } }
    }
  }
}'
The output
{
  "error" : {
    "root_cause" : [ {
      "type" : "groovy_script_compilation_exception",
      "reason" : "failed to compile groovy script"
    } ],
    "type" : "search_phase_execution_exception",
    "reason" : "all shards failed",
    "phase" : "query",
    "grouped" : true,
    "failed_shards" : [ {
      "shard" : 0,
      "index" : "sflow",
      "node" : "N0xpH1hbQG-V9DU-ANfVgw",
      "reason" : {
        "type" : "script_exception",
        "reason" : "Failed to compile inline script [[doc.SRC_IP.value, doc.DST_IP.value].join(-)] using lang [groovy]",
        "caused_by" : {
          "type" : "groovy_script_compilation_exception",
          "reason" : "failed to compile groovy script",
          "caused_by" : {
            "type" : "multiple_compilation_errors_exception",
            "reason" : "startup failed:\nc6d3ea304aad32f8cc4f0efa6b5540cf0ad6be1a: 1: unexpected token: ) @ line 1, column 44.\n alue, doc.DST_IP.value].join(-)\n ^\n\n1 error\n"
          }
        }
      }
    } ]
  },
  "status" : 500
}
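Comparing the script in the request with the one quoted in the error message suggests what went wrong. The request contains join('-'), but Elasticsearch reports join(-). The whole request body is wrapped in single quotes for the shell, so the single quotes around the dash end and reopen the shell string and never reach Elasticsearch. Below is a minimal sketch of one way around this, assuming Python is available to build the body. The helper name build_top_n_body is hypothetical. The script only prints a curl command with shell-safe quoting and does not contact Elasticsearch, so whether inline Groovy scripting is enabled on the cluster is not checked here.

```python
import json
import shlex


def build_top_n_body(size=3):
    """Hypothetical helper: the same aggregation as the query above, as a dict."""
    # The single quotes around the dash are part of the Groovy script and
    # must arrive at Elasticsearch unchanged.
    script = "[doc.SRC_IP.value, doc.DST_IP.value].join('-')"
    return {
        "query": {"match_all": {}},
        "aggs": {
            "by_SRC_IP": {
                "terms": {
                    "script": script,
                    "size": size,
                    "order": {"sum_bits": "desc"},
                },
                "aggs": {"sum_bits": {"sum": {"field": "BYTES"}}},
            }
        },
    }


body = json.dumps(build_top_n_body())

# shlex.quote escapes the embedded single quotes so that the shell passes
# the JSON body to curl exactly as written.
command = "curl -XPOST 'localhost:9200/sflow/_search?pretty' -d " + shlex.quote(body)
print(command)
```

An alternative that stays entirely in the shell is to write the separator with escaped double quotes inside the JSON string, i.e. join(\"-\"), since Groovy accepts double-quoted string literals as well.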