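search(12) - elastic4s - Aggregation = Bucket + Metric
This post introduces the aggregation feature of Elasticsearch (ES). Aggregations are the main tool for turning indexed data into readable, useful summaries, for example as the basis for visualization. An aggregation is made up of two parts: buckets and metrics.
A bucket corresponds to GROUP BY in SQL, as follows: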
GET /cartxns/_search
{
"size" : ,
"aggs": {
"color": {
"terms": {"field": "color.keyword"}
}
}
}
...
"aggregations" : {
"color" : {
"doc_count_error_upper_bound" : ,
"sum_other_doc_count" : ,
"buckets" : [
{
"key" : "red",
"doc_count" :
},
{
"key" : "blue",
"doc_count" :
},
{
"key" : "green",
"doc_count" :
}
]
}
}
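In the example above the bucket is defined on color.keyword. In elastic4s it is expressed as follows (here the aggregation is named colors and is restricted to the values red and green):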
val aggTerms = search("cartxns").aggregations(
termsAgg("colors","color.keyword").includeExactValues("red","green")
).sourceInclude("color","make").size()
println(aggTerms.show)
val termsResult = client.execute(aggTerms).await
termsResult.result.hits.hits.foreach(m => println(m.sourceAsMap))
termsResult.result.aggregations.terms("colors").buckets.foreach(b => println(s"${b.key},${b.docCount}"))
The output is:
POST:/cartxns/_search?
StringEntity({"size":,"_source":{"includes":["color","make"]},"aggs":{"colors":{"terms":{"field":"color.keyword","include":["red","green"]}}}},Some(application/json))
Map(color -> red, make -> honda)
Map(color -> red, make -> honda)
Map(color -> green, make -> ford)
red,
green,
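To make the GROUP BY analogy concrete, here is a minimal sketch in plain Scala collections, with no Elasticsearch involved. The CarTxn case class, the sample documents, and the termsByColor helper are all assumptions made for illustration; the actual contents of the cartxns index are not shown in this post.

```scala
object TermsBucketSketch {
  // Hypothetical document shape and sample data (assumption, not the real index contents).
  final case class CarTxn(color: String, make: String, price: Double)

  val txns: List[CarTxn] = List(
    CarTxn("red", "honda", 10000.0),
    CarTxn("red", "honda", 20000.0),
    CarTxn("green", "ford", 30000.0),
    CarTxn("blue", "toyota", 15000.0),
    CarTxn("green", "toyota", 12000.0),
    CarTxn("red", "honda", 20000.0),
    CarTxn("red", "bmw", 80000.0),
    CarTxn("blue", "ford", 25000.0)
  )

  // One bucket per distinct color, each carrying its document count.
  // `include` mimics the include filter of the terms aggregation;
  // an empty set keeps every key.
  def termsByColor(docs: List[CarTxn], include: Set[String] = Set.empty): Map[String, Int] =
    docs
      .filter(d => include.isEmpty || include.contains(d.color))
      .groupBy(_.color)
      .map { case (color, group) => color -> group.size }

  def main(args: Array[String]): Unit = {
    // Print "key,docCount" per bucket, largest bucket first.
    termsByColor(txns, Set("red", "green")).toList
      .sortBy { case (_, count) => -count }
      .foreach { case (color, count) => println(s"$color,$count") }
  }
}
```

The point of the sketch is only that a terms bucket is a grouping key plus a count; ES does the same work on the server side across shards.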
The avg_price below is a simple metric:
POST /cartxns/_search
{
"aggs":{
"colors":{
"terms":{"field":"color.keyword"},
"aggs":{
"avg_price":{
"avg":{"field":"price"}
}
}
}
}
}
...
"aggregations" : {
"colors" : {
"doc_count_error_upper_bound" : ,
"sum_other_doc_count" : ,
"buckets" : [
{
"key" : "red",
"doc_count" : ,
"avg_price" : {
"value" : 32500.0
}
},
{
"key" : "blue",
"doc_count" : ,
"avg_price" : {
"value" : 20000.0
}
},
{
"key" : "green",
"doc_count" : ,
"avg_price" : {
"value" : 21000.0
}
}
]
}
}
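terms defines the bucket. Adding an aggs with avg under terms computes avg_price, the average price of the documents that fall into each bucket. In elastic4s it is expressed as follows: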
val aggTermsAvg = search("cartxns").aggregations(
termsAgg("colors","color.keyword").subAggregations(
avgAgg("avg_price","price")
)
).sourceInclude("color","make").size()
println(aggTermsAvg.show)
val avgResult = client.execute(aggTermsAvg).await
avgResult.result.hits.hits.foreach(m => println(m.sourceAsMap))
avgResult.result.aggregations.terms("colors").buckets
.foreach(b => println(s"${b.key},${b.docCount},${b.avg("avg_price").value}"))
...
POST:/cartxns/_search?
StringEntity({"size":,"_source":{"includes":["color","make"]},"aggs":{"colors":{"terms":{"field":"color.keyword"},"aggs":{"avg_price":{"avg":{"field":"price"}}}}}},Some(application/json))
Map(color -> red, make -> honda)
Map(color -> red, make -> honda)
Map(color -> green, make -> ford)
red,,32500.0
blue,,20000.0
green,,21000.0
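The bucket-plus-metric idea can also be sketched without ES: group the documents first, then compute one number per group. Everything in this sketch (the CarTxn and ColorBucket case classes, the sample prices, and the avgPriceByColor helper) is hypothetical; the sample prices were merely chosen so that the averages agree with the ones printed above.

```scala
object BucketMetricSketch {
  // Hypothetical document shape and sample data (assumption, not the real index contents).
  final case class CarTxn(color: String, price: Double)

  val txns: List[CarTxn] = List(
    CarTxn("red", 10000.0), CarTxn("red", 20000.0),
    CarTxn("red", 20000.0), CarTxn("red", 80000.0),
    CarTxn("blue", 15000.0), CarTxn("blue", 25000.0),
    CarTxn("green", 30000.0), CarTxn("green", 12000.0)
  )

  // A bucket (key + docCount) with one metric (avgPrice) attached to it.
  final case class ColorBucket(key: String, docCount: Int, avgPrice: Double)

  // Bucket step: groupBy color. Metric step: average price inside each group.
  def avgPriceByColor(docs: List[CarTxn]): List[ColorBucket] =
    docs.groupBy(_.color).toList
      .map { case (color, group) =>
        ColorBucket(color, group.size, group.map(_.price).sum / group.size)
      }
      // Ordering is a choice made for this sketch: larger buckets first, then by key.
      .sortBy(b => (-b.docCount, b.key))

  def main(args: Array[String]): Unit =
    avgPriceByColor(txns).foreach(b => println(s"${b.key},${b.docCount},${b.avgPrice}"))
}
```

The metric is always computed over the documents of one bucket, which is why avg_price is nested under the terms aggregation in the request body rather than placed next to it.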
Next, we can add another bucket inside a bucket, as follows:
POST /cartxns/_search
{
"aggs":{
"colors":{
"terms":{"field":"color.keyword"},
"aggs":{
"avg_price":{"avg":{"field":"price"}},
"makes":{"terms":{"field":"make.keyword"}}
}
}
}
}
...
"aggregations" : {
"colors" : {
"doc_count_error_upper_bound" : ,
"sum_other_doc_count" : ,
"buckets" : [
{
"key" : "red",
"doc_count" : ,
"makes" : {
"doc_count_error_upper_bound" : ,
"sum_other_doc_count" : ,
"buckets" : [
{
"key" : "honda",
"doc_count" :
},
{
"key" : "bmw",
"doc_count" :
}
]
},
"avg_price" : {
"value" : 32500.0
}
},
{
"key" : "blue",
"doc_count" : ,
"makes" : {
"doc_count_error_upper_bound" : ,
"sum_other_doc_count" : ,
"buckets" : [
{
"key" : "ford",
"doc_count" :
},
{
"key" : "toyota",
"doc_count" :
}
]
},
"avg_price" : {
"value" : 20000.0
}
},
{
"key" : "green",
"doc_count" : ,
"makes" : {
"doc_count_error_upper_bound" : ,
"sum_other_doc_count" : ,
"buckets" : [
{
"key" : "ford",
"doc_count" :
},
{
"key" : "toyota",
"doc_count" :
}
]
},
"avg_price" : {
"value" : 21000.0
}
}
]
}
}
elastic4s example:
val aggTAvgT = search("cartxns").aggregations(
termsAgg("colors","color.keyword").subAggregations(
avgAgg("avg_price","price"),
termsAgg("makes","make.keyword")
)
).size()
println(aggTAvgT.show)
val avgTTResult = client.execute(aggTAvgT).await
avgTTResult.result.hits.hits.foreach(m => println(m.sourceAsMap))
avgTTResult.result.aggregations.terms("colors").buckets
.foreach { cb =>
println(s"${cb.key},${cb.docCount},${cb.avg("avg_price").value}")
cb.terms("makes").buckets.foreach(mb => println(s"${mb.key},${mb.docCount}"))
}
...
POST:/cartxns/_search?
StringEntity({"size":,"aggs":{"colors":{"terms":{"field":"color.keyword"},"aggs":{"avg_price":{"avg":{"field":"price"}},"makes":{"terms":{"field":"make.keyword"}}}}}},Some(application/json))
Map(price -> , color -> red, make -> honda, sold -> --)
Map(price -> , color -> red, make -> honda, sold -> --)
Map(price -> , color -> green, make -> ford, sold -> --)
red,,32500.0
honda,
bmw,
blue,,20000.0
ford,
toyota,
green,,21000.0
ford,
toyota,
Finally, we add two more metrics, min and max, to the innermost bucket:
POST /cartxns/_search
{
"size":,
"aggs":{
"colors":{
"terms":{"field":"color.keyword"},
"aggs":{
"avg_price":{"avg":{"field":"price"}},
"makes":{"terms":{"field":"make.keyword"},
"aggs":{
"max_price":{"max":{"field":"price"}},
"min_price":{"min":{"field":"price"}}
}
}
}
}
}
}
...
"aggregations" : {
"colors" : {
"doc_count_error_upper_bound" : ,
"sum_other_doc_count" : ,
"buckets" : [
{
"key" : "red",
"doc_count" : ,
"makes" : {
"doc_count_error_upper_bound" : ,
"sum_other_doc_count" : ,
"buckets" : [
{
"key" : "honda",
"doc_count" : ,
"max_price" : {
"value" : 20000.0
},
"min_price" : {
"value" : 10000.0
}
},
{
"key" : "bmw",
"doc_count" : ,
"max_price" : {
"value" : 80000.0
},
"min_price" : {
"value" : 80000.0
}
}
]
},
"avg_price" : {
"value" : 32500.0
}
},
{
"key" : "blue",
"doc_count" : ,
"makes" : {
"doc_count_error_upper_bound" : ,
"sum_other_doc_count" : ,
"buckets" : [
{
"key" : "ford",
"doc_count" : ,
"max_price" : {
"value" : 25000.0
},
"min_price" : {
"value" : 25000.0
}
},
{
"key" : "toyota",
"doc_count" : ,
"max_price" : {
"value" : 15000.0
},
"min_price" : {
"value" : 15000.0
}
}
]
},
"avg_price" : {
"value" : 20000.0
}
},
{
"key" : "green",
"doc_count" : ,
"makes" : {
"doc_count_error_upper_bound" : ,
"sum_other_doc_count" : ,
"buckets" : [
{
"key" : "ford",
"doc_count" : ,
"max_price" : {
"value" : 30000.0
},
"min_price" : {
"value" : 30000.0
}
},
{
"key" : "toyota",
"doc_count" : ,
"max_price" : {
"value" : 12000.0
},
"min_price" : {
"value" : 12000.0
}
}
]
},
"avg_price" : {
"value" : 21000.0
}
}
]
}
}
elastic4s example:
val aggTAvgTMM = search("cartxns").aggregations(
termsAgg("colors","color.keyword").subAggregations(
avgAgg("avg_price","price"),
termsAgg("makes","make.keyword").subAggregations(
maxAgg("max_price","price"),
minAgg("min_price","price")
)
)
).size()
println(aggTAvgTMM.show)
val avgTTMMResult = client.execute(aggTAvgTMM).await
avgTTMMResult.result.hits.hits.foreach(m => println(m.sourceAsMap))
avgTTMMResult.result.aggregations.terms("colors").buckets
.foreach { cb =>
println(s"${cb.key},${cb.docCount},${cb.avg("avg_price").value}")
cb.terms("makes").buckets.foreach { mb =>
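// min_price and max_price are read through avg(...) here. This works because each of these
// single-value metrics comes back as {"value": ...}. elastic4s also offers min(...) and max(...)
// accessors, whose result types may differ between versions.
println(s"${mb.key},${mb.docCount},${mb.avg("min_price").value},${mb.avg("max_price").value}")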
}
}
...
POST:/cartxns/_search?
StringEntity({"size":,"aggs":{"colors":{"terms":{"field":"color.keyword"},"aggs":{"avg_price":{"avg":{"field":"price"}},"makes":{"terms":{"field":"make.keyword"},"aggs":{"max_price":{"max":{"field":"price"}},"min_price":{"min":{"field":"price"}}}}}}}},Some(application/json))
Map(price -> , color -> red, make -> honda, sold -> --)
Map(price -> , color -> red, make -> honda, sold -> --)
Map(price -> , color -> green, make -> ford, sold -> --)
red,,32500.0
honda,,10000.0,20000.0
bmw,,80000.0,80000.0
blue,,20000.0
ford,,25000.0,25000.0
toyota,,15000.0,15000.0
green,,21000.0
ford,,30000.0,30000.0
toyota,,12000.0,12000.0
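The full nesting (color bucket, avg metric, make sub-bucket, min and max metrics) can be sketched in plain Scala as two levels of groupBy. As before, this is only an illustration under assumptions: the case classes, the helper names, and the sample documents are hypothetical, and the sample prices were picked to be consistent with the values printed above.

```scala
object NestedBucketSketch {
  // Hypothetical document shape and sample data (assumption, not the real index contents).
  final case class CarTxn(color: String, make: String, price: Double)

  val txns: List[CarTxn] = List(
    CarTxn("red", "honda", 10000.0),
    CarTxn("red", "honda", 20000.0),
    CarTxn("green", "ford", 30000.0),
    CarTxn("blue", "toyota", 15000.0),
    CarTxn("green", "toyota", 12000.0),
    CarTxn("red", "honda", 20000.0),
    CarTxn("red", "bmw", 80000.0),
    CarTxn("blue", "ford", 25000.0)
  )

  // Inner bucket: one per make, with min and max price as its metrics.
  final case class MakeBucket(key: String, docCount: Int, minPrice: Double, maxPrice: Double)
  // Outer bucket: one per color, with avg price as its metric and the make buckets nested inside.
  final case class ColorBucket(key: String, docCount: Int, avgPrice: Double, makes: List[MakeBucket])

  def makeBuckets(docs: List[CarTxn]): List[MakeBucket] =
    docs.groupBy(_.make).toList
      .map { case (make, group) =>
        val prices = group.map(_.price)
        MakeBucket(make, group.size, prices.min, prices.max)
      }
      .sortBy(b => (-b.docCount, b.key))

  def colorBuckets(docs: List[CarTxn]): List[ColorBucket] =
    docs.groupBy(_.color).toList
      .map { case (color, group) =>
        // The sub-aggregation only ever sees the documents of its parent bucket.
        ColorBucket(color, group.size, group.map(_.price).sum / group.size, makeBuckets(group))
      }
      .sortBy(b => (-b.docCount, b.key))

  def main(args: Array[String]): Unit =
    colorBuckets(txns).foreach { cb =>
      println(s"${cb.key},${cb.docCount},${cb.avgPrice}")
      cb.makes.foreach(mb => println(s"${mb.key},${mb.docCount},${mb.minPrice},${mb.maxPrice}"))
    }
}
```

Each level of "aggs" in the request corresponds to one level of grouping here, and the shape of the result case classes mirrors how the elastic4s code above walks the response: outer buckets first, then the sub-aggregations of each bucket.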