Elasticsearch 查询语言(Query DSL)认识(一)

一、基本认识

查询子句的行为取决于

  • query context
  • filter context

也就是执行的是查询(query)还是过滤(filter)

  • query context 描述的是:被搜索的文档和查询子句的匹配程度

  • filter context 描述的是: 被搜索的文档和查询子句是否匹配

一个是匹配程度问题,一个是是否匹配的问题

二、实例

  1. 导入数据 bank account data download
  2. 将数据导入到elasticsearch
curl -XPOST 'localhost:9200/bank/account/_bulk?pretty' --data-binary "@accounts.json"
curl 'localhost:9200/_cat/indices?v'

这里有两个地方需要注意,1.host要改成符合自己的。2.早期版本中下载的数据可以能是'accounts.json?raw=true'

大概如下 curl -XPOST 'wbelk:9200/bank/account/_bulk?pretty' --data-binary "@accounts.json?raw=true"

  1. 参数认识

为了便捷操作,可以安装一个kiabna sense

$./bin/kibana plugin --install elastic/sense

$./bin/kibana
sudo -i service restart kibana(或者用这个启动kibana)

match_all 搜索,直接返回所有文档

GET /bank/_search
{
"query": {
"match_all": {}
}
}

返回大致如下:

{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1000,
"max_score": 1,
"hits": [
{
"_index": "bank",
"_type": "account",
"_id": "25",
"_score": 1,
"_source": {
"account_number": 25,
"balance": 40540,
"firstname": "Virginia",
"lastname": "Ayala",
"age": 39,
"gender": "F",
"address": "171 Putnam Avenue",
"employer": "Filodyne",
"email": "virginiaayala@filodyne.com",
"city": "Nicholson",
"state": "PA"
}
},

参数大致解释:

  • took: 执行搜索耗时,毫秒为单位,也就是本文我1ms
  • time_out: 搜索是否超时
  • _shards: 多少分片被搜索,成功多少,失败多少
  • hits: 搜索结果展示
  • hits.total: 匹配条件的文档总数
  • hits.hits: 返回结果展示,默认返回十个
  • hits.max_score:最大匹配得分
  • hits._score: 返回文档的匹配得分(得分越高,匹配程度越高,越靠前)
  • _index _type _id 作为剥层定位到特定的文档
  • _source 文档源
  1. 查询语言之 执行查询
  • 只显示account_number 和 balance
POST /bank/_search
{
"query": { "match_all": {} },
"_source": ["account_number", "balance"]
}
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1000,
"max_score": 1,
"hits": [
{
"_index": "bank",
"_type": "account",
"_id": "25",
"_score": 1,
"_source": {
"account_number": 25,
"balance": 40540
}
},
{
"_index": "bank",
"_type": "account",
"_id": "44",
"_score": 1,
"_source": {
"account_number": 44,
"balance": 34487
}
},
{
"_index": "bank",
"_type": "account",
"_id": "99",
"_score": 1,
"_source": {
"account_number": 99,
"balance": 47159
}
},
  • 返回accountu_number 为20的document
POST /bank/_search
{
"query": { "match": { "account_number": 20 } }
}
{
"took": 4,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 5.6587105,
"hits": [
{
"_index": "bank",
"_type": "account",
"_id": "20",
"_score": 5.6587105,
"_source": {
"account_number": 20,
"balance": 16418,
"firstname": "Elinor",
"lastname": "Ratliff",
"age": 36,
"gender": "M",
"address": "282 Kings Place",
"employer": "Scentric",
"email": "elinorratliff@scentric.com",
"city": "Ribera",
"state": "WA"
}
}
]
}
}
  • 返回地址中包含(term)mill的所有账户
POST /bank/_search
{
"query": { "match": { "address": "mill" } }
}
  • 返回地址中包含term 'mill'或者 'lane'的所有账户
POST /bank/_search
{
"query": { "match": { "address": "mill lane" } }
}
  • 匹配phrase 'mill lane'
POST /bank/_search
{
"query": { "match_phrase": { "address": "mill lane" } }
}
  • 返回address包含'mill'和'lane'的所有账户 (AND)
POST /bank/_search
{
"query": {
"bool": {
"must": [
{ "match": { "address": "mill" } },
{ "match": { "address": "lane" } }
]
}
}
}
  • 返回address包含'mill'或'lane'的所有账户 (OR)
POST /bank/_search
{
"query": {
"bool": {
"should": [
{ "match": { "address": "mill" } },
{ "match": { "address": "lane" } }
]
}
}
}
  • 返回address既不包含'mill'也不包含'lane'的所有账户 (NO)
POST /bank/_search
{
"query": {
"bool": {
"must_not": [
{ "match": { "address": "mill" } },
{ "match": { "address": "lane" } }
]
}
}
}
  • 返回age为40,并且state不是ID的所有账户 (组合)
POST /bank/_search
{
"query": {
"bool": {
"must": [
{ "match": { "age": "40" } }
],
"must_not": [
{ "match": { "state": "ID" } }
]
}
}
}
  1. 查询语言之 执行过滤

过滤不会进行相关度得分的计算

  • 在所有账户中寻找balance 在29900到30000之间(闭区间)的所有账户

    (先查询到所有的账户,然后进行过滤)
POST /bank/_search
{
"query": {
"filtered": {
"query": { "match_all": {} },
"filter": {
"range": {
"balance": {
"gte": 29900,
"lte": 30000
}
}
}
}
}
}
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 5,
"max_score": 1,
"hits": [
{
"_index": "bank",
"_type": "account",
"_id": "243",
"_score": 1,
"_source": {
"account_number": 243,
"balance": 29902,
"firstname": "Evangelina",
"lastname": "Perez",
"age": 20,
"gender": "M",
"address": "787 Joval Court",
"employer": "Keengen",
"email": "evangelinaperez@keengen.com",
"city": "Mulberry",
"state": "SD"
}
},
{
"_index": "bank",
"_type": "account",
"_id": "781",
"_score": 1,
"_source": {
"account_number": 781,
"balance": 29961,
"firstname": "Sanford",
"lastname": "Mullen",
"age": 26,
"gender": "F",
"address": "879 Dover Street",
"employer": "Zanity",
"email": "sanfordmullen@zanity.com",
"city": "Martinez",
"state": "TX"
}
},
...

根据返回结果我们可以看到filter得到的_score为1.不存在程度上的问题。是0和1的问题

三、query和filter效率

一般认为filter的速度快于query的速度

  • filter不会计算相关度得分,效率高
  • filter的结果可以缓存到内存中,方便再用

以bank account 数据为例,认识elasticsearch query 和 filter的更多相关文章

  1. ElasticSearch - query vs filter

    query vs filter 来自stackoverflow Stackoverflow - queries-vs-filters Question 题主希望知道Query和Filter的区别 An ...

  2. elasticsearch query 和 filter 的区别

    Query查询器 与 Filter 过滤器 尽管我们之前已经涉及了查询DSL,然而实际上存在两种DSL:查询DSL(query DSL)和过滤DSL(filter DSL).过滤器(filter)通常 ...

  3. Elasticsearch query和filter的区别

    1.关于Query context和filter context 查询语句的表现行为取决于使用了查询上下文方式还是过滤上下文方式. Query context:查询上下文,回答了“文档是如何被查询语句 ...

  4. 数据从文件导入Elasticsearch

    1.资源准备 1.数据文件:accounts.json 2.索引名称:bank 3.数据类型:account 4.批量操作API:bulk 2.导入数据 curl -XPOST 'localhost: ...

  5. [Codeforces Round #186 (Div. 2)] A. Ilya and Bank Account

    A. Ilya and Bank Account time limit per test 2 seconds memory limit per test 256 megabytes input sta ...

  6. php curl模拟post请求提交数据样例总结

    在php中要模拟post请求数据提交我们会使用到curl函数,以下我来给大家举几个curl模拟post请求提交数据样例有须要的朋友可參考參考.注意:curl函数在php中默认是不被支持的,假设须要使用 ...

  7. How To Change the Supplier Bank Account Masking in UI (Doc ID 877074.1)

      Give Feedback...           How To Change the Supplier Bank Account Masking in UI (Doc ID 877074.1) ...

  8. Pandas之:Pandas高级教程以铁达尼号真实数据为例

    Pandas之:Pandas高级教程以铁达尼号真实数据为例 目录 简介 读写文件 DF的选择 选择列数据 选择行数据 同时选择行和列 使用plots作图 使用现有的列创建新的列 进行统计 DF重组 简 ...

  9. Query DSL for elasticsearch Query

    Query DSL Query DSL (资料来自: http://www.elasticsearch.cn/guide/reference/query-dsl/) http://elasticsea ...

随机推荐

  1. 谈谈DOMContentLoaded:Javascript中的domReady引入机制

    一.扯淡部分 回想当年,在摆脱写页面时js全靠从各种DEMO中copy出来然后东拼西凑的幽暗岁月之后,毅然决然地打算放弃这种处处“拿来主义”的不正之风,然后开启通往高大上的“前端攻城狮”的飞升之旅.想 ...

  2. Entity Framework 6 Recipes 2nd Edition 译 -> 目录 -持续更新

    因为看了<Entity Framework 6 Recipes 2nd Edition>这本书前面8章的翻译,感谢china_fucan. 从第九章开始,我是边看边译的,没有通读,加之英语 ...

  3. jquery.Callbacks的实现

    前言 本人是一个热爱前端的菜鸟,一直喜欢学习js原生,对于jq这种js库,比较喜欢理解他的实现,虽然自己能力有限,水平很低,但是勉勉强强也算是能够懂一点吧,对于jq源码解读系列,博客园里有很多,推荐大 ...

  4. CRL快速开发框架系列教程十三(嵌套查询)

    本系列目录 CRL快速开发框架系列教程一(Code First数据表不需再关心) CRL快速开发框架系列教程二(基于Lambda表达式查询) CRL快速开发框架系列教程三(更新数据) CRL快速开发框 ...

  5. scp报错 -bash: scp: command not found

    环境:RHEL6.5 使用scp命令报错: [root@oradb23 media]# scp /etc/hosts oradb24:/etc/ -bash: scp: command not fou ...

  6. 使用nginx反向代理,一个80端口下,配置多个微信项目

    我们要接入微信公众号平台开发,需要填写服务器配置,然后依据接口文档才能实现业务逻辑.但是微信公众号接口只支持80接口(80端口).我们因业务需求需要在一个公众号域名下面,发布两个需要微信授权的项目,怎 ...

  7. Pramp mock interview (4th practice): Matrix Spiral Print

    March 16, 2016 Problem statement:Given a 2D array (matrix) named M, print all items of M in a spiral ...

  8. ReSharper详解Index0

    JetBrains ReSharper可以帮助Visual Studio用户编写出更好的代码.支持对C#,VB.NET,XAML,JavaScript,TypeScript,JSON,XML,HTML ...

  9. Xamarin技术文档------VS多平台开发

    此技术业余时间研究,仅供大家学习参考,不涉及深入研究,有一定开发基础的人员,应该都能较快上手. 一.简介 Xamarin始创于2011年,旨在使移动开发变得难以置信地迅捷和简单.Xamarin的产品简 ...

  10. 避免调试代码导致IE出错

    记录一下 if(!window.console){ var names = ["log", "debug", "info", "w ...