Elasticsearch Essentials, Part 1: CRUD on ES Index Documents

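When querying massive data sets in a traditional relational DBMS, fuzzy matching in particular, we usually write LIKE '%value%'. A leading wildcard prevents the database from using an index, so the query degrades into a slow full table scan. Even exact-value lookups on an indexed column are relatively slow once the data set is large enough. For this reason, internet companies and large enterprises that need to query massive data usually turn to a search engine, and the mainstream search engine framework today is Elasticsearch. This post summarizes Elasticsearch Essentials, Part 1: CRUD on ES index documents. More posts in the series will follow.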

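CRUD on ES index documents (6.x and 7.x differ: an index created in 6.x can hold only a single type, whose name you choose, and only indices carried over from 5.x may still hold several types; 7.x deprecates types and uses the fixed endpoint name _doc. The API usage also differs slightly between the two):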

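Create a document [POST index/type/id, where id is optional; without an id, ES generates one. If a document with the given index, type, and id already exists, it is re-indexed (the old document is deleted and a new one is created, the same as PUT index/type/id). To prevent this silent overwrite when you supply an id, use index/type/id?op_type=create or index/type/id/_create, which sets the operation type to create so that the request fails when the document already exists]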

POST demo_users/_doc or demo_users/_doc/2vJKsm8BriJODA6s9GbQ/_create

Request Body:
```
{
  "userId": 1,
  "username": "张三",
  "role": "administrator",
  "enabled": true,
  "createdDate": "2020-01-01T12:00:00"
}
```
Response Body:
```
{
  "_index": "demo_users",
  "_type": "_doc",
  "_id": "2vJKsm8BriJODA6s9GbQ",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 0,
  "_primary_term": 1
}
```
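The two create variants can be sketched in Python with only the standard library. This is a minimal sketch, not a definitive client: the base URL `http://localhost:9200` and the helper name `build_create_request` are assumptions made for illustration, and the request is built but not sent.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "http://localhost:9200"  # assumption: a local, unsecured test node


def build_create_request(index, doc, doc_id=None, base_url=BASE_URL):
    """Build (but do not send) a request that creates a document.

    Without an id, ES generates one (POST index/_doc).
    With an id, op_type=create asks ES to fail if that id already exists
    instead of overwriting the stored document.
    """
    if doc_id is None:
        url = f"{base_url}/{index}/_doc"
    else:
        url = f"{base_url}/{index}/_doc/{quote(doc_id, safe='')}?op_type=create"
    body = json.dumps(doc, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


req = build_create_request(
    "demo_users", {"userId": 1, "username": "张三"}, doc_id="2"
)
print(req.get_method(), req.full_url)
# To send it against a running node: urllib.request.urlopen(req)
```

Keeping the builder separate from the network call makes it easy to inspect the URL and body before anything reaches the cluster.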

Get a document [GET index/type/id]

GET demo_users/_doc/123

Response Body:

```
{
  "_index": "demo_users",
  "_type": "_doc",
  "_id": "123",
  "_version": 1,
  "found": true,
  "_source": {
    "userId": 1,
    "username": "张三",
    "role": "administrator",
    "enabled": true,
    "createdDate": "2020-01-01T12:00:00"
  }
}
```

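Index (PUT) a document [PUT index/type/id or index/type/id?op_type=index; the id is required. If no document with that id exists, one is created; otherwise the existing document is deleted and re-created, and its version is incremented by 1]

PUT/POST demo_users/_doc/123 or demo_users/_doc/123?op_type=index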

Request Body:

```
{
  "userId": 1,
  "username": "张三",
  "role": "administrator",
  "enabled": true,
  "createdDate": "2020-01-01T12:00:00",
  "remark": "仅演示"
}
```

Response Body:

```
{
  "_index": "demo_users",
  "_type": "_doc",
  "_id": "123",
  "_version": 4,
  "result": "updated",
  "_shards": {
    "total": 2,
    "successful": 2,
    "failed": 0
  },
  "_seq_no": 10,
  "_primary_term": 1
}
```

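Update a document [POST index/type/id/_update (with the typeless 7.x API: POST index/_update/id). The request body must be {"doc": {partial document JSON}}. Keys that already exist are updated, keys that do not exist are added as new key-value pairs, and nested objects are supported. When the request is repeated, the version is incremented by 1 only if some field value actually changed; otherwise the response reports "result": "noop" and nothing is written]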

POST demo_users/_doc/123/_update

Request Body:

```
{
  "doc": {
    "userId": 1,
    "username": "张三",
    "role": "administrator",
    "enabled": true,
    "createdDate": "2020-01-01T12:00:00",
    "remark": "仅演示POST更新5",
    "updatedDate": "2020-01-17T15:30:00"
  }
}
```

Response Body:

```
{
  "_index": "demo_users",
  "_type": "_doc",
  "_id": "123",
  "_version": 26,
  "result": "updated",
  "_shards": {
    "total": 2,
    "successful": 2,
    "failed": 0
  },
  "_seq_no": 35,
  "_primary_term": 1
}
```
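The merge behaviour described for partial updates (existing keys overwritten, new keys added, nested objects merged, no version bump when nothing changes) can be modelled locally. The following is a rough sketch of those semantics in plain Python, not Elasticsearch's actual implementation; `merge_partial` is a hypothetical helper name.

```python
def merge_partial(source, partial):
    """Recursively merge `partial` into a copy of `source`.

    Returns (merged, changed). Nested objects are merged key by key;
    any other value (scalar or list) replaces the old value.
    `changed` stays False when the merge alters nothing, which
    corresponds to the "noop" result of an update request.
    """
    merged = dict(source)
    changed = False
    for key, value in partial.items():
        old = merged.get(key)
        if isinstance(value, dict) and isinstance(old, dict):
            merged[key], sub_changed = merge_partial(old, value)
            changed = changed or sub_changed
        elif key not in merged or old != value:
            merged[key] = value
            changed = True
    return merged, changed


stored = {"userId": 1, "profile": {"city": "Shenzhen"}}
merged, changed = merge_partial(stored, {"remark": "demo", "profile": {"age": 30}})
print(merged, changed)

# Sending the same partial document again changes nothing.
_, changed_again = merge_partial(merged, {"remark": "demo"})
print(changed_again)
```

The second call mirrors what happens when the same update request is repeated: the stored document already contains the values, so there is nothing to write.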

Delete a document [DELETE index/type/id]

DELETE demo_users/_doc/123

Response Body:

```
{
  "_index": "demo_users",
  "_type": "_doc",
  "_id": "123",
  "_version": 2,
  "result": "deleted",
  "_shards": {
    "total": 2,
    "successful": 2,
    "failed": 0
  },
  "_seq_no": 39,
  "_primary_term": 1
}
```

Bulk operations [POST _bulk, index/_bulk, or index/type/_bulk. A single request can carry different CRUD operations against multiple indices and types; if one operation fails, the others are not affected]

POST _bulk

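Request Body: (The body must end with a newline. ES uses newline characters to separate the individual commands, and a missing final newline causes an error. Note that the body as a whole is not standard JSON: each line is its own JSON object, so at most it can be read as a sequence of JSON objects delimited by \n. The lines must follow each other directly, with no blank lines in between.)

```
{ "index" : { "_index" : "demo_users_test", "_type" : "_doc", "_id" : "1" } }
{ "bulk_field1" : "测试创建index" }
{ "delete" : { "_index" : "demo_users", "_type" : "_doc", "_id" : "123" } }
{ "create" : { "_index" : "demo_users", "_type" : "_doc", "_id" : "2" } }
{ "bulk_field2" : "测试创建index2" }
{ "update" : { "_index" : "demo_users_test","_type" : "_doc","_id" : "1" } }
{ "doc": {"bulk_field1" : "测试创建index1","bulk_field2" : "测试创建index2"} }
```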
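Because the bulk body is newline-delimited JSON and not a single JSON document, it is easy to get the framing wrong when building it by hand. Here is a small sketch of a serializer; `build_bulk_body` and its tuple-based input format are assumptions chosen for this example.

```python
import json


def build_bulk_body(operations):
    """Serialize bulk operations into newline-delimited JSON.

    `operations` is a list of (action, metadata, source) tuples.
    `source` is None for "delete", which has no second line.
    Every line, including the last one, ends with "\n".
    """
    lines = []
    for action, meta, source in operations:
        lines.append(json.dumps({action: meta}, ensure_ascii=False))
        if action != "delete":
            if source is None:
                raise ValueError(f"{action} requires a source line")
            lines.append(json.dumps(source, ensure_ascii=False))
    return "".join(line + "\n" for line in lines)


body = build_bulk_body([
    ("index", {"_index": "demo_users_test", "_id": "1"}, {"bulk_field1": "v1"}),
    ("delete", {"_index": "demo_users", "_id": "123"}, None),
    ("create", {"_index": "demo_users", "_id": "2"}, {"bulk_field2": "v2"}),
    ("update", {"_index": "demo_users_test", "_id": "1"}, {"doc": {"bulk_field1": "v3"}}),
])
print(body, end="")
```

When sending such a body over HTTP, the bulk endpoint expects the `Content-Type: application/x-ndjson` header.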

Response Body:

```
{
  "took": 162,
  "errors": true,
  "items": [
    {
      "index": {
        "_index": "demo_users_test",
        "_type": "_doc",
        "_id": "1",
        "_version": 8,
        "result": "updated",
        "_shards": {
          "total": 2,
          "successful": 2,
          "failed": 0
        },
        "_seq_no": 7,
        "_primary_term": 1,
        "status": 200
      }
    },
    {
      "delete": {
        "_index": "demo_users",
        "_type": "_doc",
        "_id": "123",
        "_version": 2,
        "result": "not_found",
        "_shards": {
          "total": 2,
          "successful": 2,
          "failed": 0
        },
        "_seq_no": 44,
        "_primary_term": 1,
        "status": 404
      }
    },
    {
      "create": {
        "_index": "demo_users",
        "_type": "_doc",
        "_id": "2",
        "status": 409,
        "error": {
          "type": "version_conflict_engine_exception",
          "reason": "[_doc][2]: version conflict, document already exists (current version [1])",
          "index_uuid": "u7WE286CQnGqhHeuwW7oyw",
          "shard": "2",
          "index": "demo_users"
        }
      }
    },
    {
      "update": {
        "_index": "demo_users_test",
        "_type": "_doc",
        "_id": "1",
        "_version": 9,
        "result": "updated",
        "_shards": {
          "total": 2,
          "successful": 2,
          "failed": 0
        },
        "_seq_no": 8,
        "_primary_term": 1,
        "status": 200
      }
    }
  ]
}
```
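A bulk request returns HTTP 200 even when individual operations fail, so the caller has to inspect `errors` and the per-item results. A minimal sketch of that check follows; `failed_items` is a hypothetical helper, and the sample dictionary is a trimmed copy of the response above.

```python
def failed_items(bulk_response):
    """Return (action, _id, status, error_type) for every failed item."""
    failures = []
    if not bulk_response.get("errors"):
        return failures  # fast path: nothing failed
    for item in bulk_response.get("items", []):
        # Each item has exactly one key: index, create, update, or delete.
        for action, result in item.items():
            if "error" in result:
                failures.append(
                    (action, result["_id"], result["status"], result["error"]["type"])
                )
    return failures


sample = {
    "errors": True,
    "items": [
        {"index": {"_id": "1", "status": 200, "result": "updated"}},
        {"delete": {"_id": "123", "status": 404, "result": "not_found"}},
        {"create": {"_id": "2", "status": 409,
                    "error": {"type": "version_conflict_engine_exception"}}},
        {"update": {"_id": "1", "status": 200, "result": "updated"}},
    ],
}
print(failed_items(sample))
```

In this sample only the `create` item carries an `error` object. The `delete` with status 404 reports `"result": "not_found"` without an error, so the sketch does not treat it as a failure.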

mGet [POST _mget, index/_mget, or index/type/_mget. If the index or type is given in the URL, the request body does not need to repeat it. Use _source to specify which fields to include and which to exclude]

POST _mget

Request Body:

```
{
  "docs": [
    {
      "_index": "demo_users",
      "_type": "_doc",
      "_id": "12345"
    },
    {
      "_index": "demo_users",
      "_type": "_doc",
      "_id": "1234567",
      "_source": [
        "userId",
        "username",
        "role"
      ]
    },
    {
      "_index": "demo_users",
      "_type": "_doc",
      "_id": "1234",
      "_source": {
        "include": [
          "userId",
          "username"
        ],
        "exclude": [
          "role"
        ]
      }
    }
  ]
}
```

Response Body:

```
{
  "docs": [
    {
      "_index": "demo_users",
      "_type": "_doc",
      "_id": "12345",
      "_version": 1,
      "found": true,
      "_source": {
        "userId": 1,
        "username": "张三",
        "role": "administrator",
        "enabled": true,
        "createdDate": "2020-01-01T12:00:00"
      }
    },
    {
      "_index": "demo_users",
      "_type": "_doc",
      "_id": "1234567",
      "_version": 7,
      "found": true,
      "_source": {
        "role": "administrator",
        "userId": 1,
        "username": "张三"
      }
    },
    {
      "_index": "demo_users",
      "_type": "_doc",
      "_id": "1234",
      "_version": 1,
      "found": true,
      "_source": {
        "userId": 1,
        "username": "张三"
      }
    }
  ]
}
```

POST demo_users/_doc/_mget

Request Body:

```
{
  "ids": [
    "1234",
    "12345",
    "123457"
  ]
}
```

Response Body:

```
{
  "docs": [
    {
      "_index": "demo_users",
      "_type": "_doc",
      "_id": "1234",
      "_version": 1,
      "found": true,
      "_source": {
        "userId": 1,
        "username": "张三",
        "role": "administrator",
        "enabled": true,
        "createdDate": "2020-01-01T12:00:00",
        "remark": "仅演示"
      }
    },
    {
      "_index": "demo_users",
      "_type": "_doc",
      "_id": "12345",
      "_version": 1,
      "found": true,
      "_source": {
        "userId": 1,
        "username": "张三",
        "role": "administrator",
        "enabled": true,
        "createdDate": "2020-01-01T12:00:00"
      }
    },
    {
      "_index": "demo_users",
      "_type": "_doc",
      "_id": "123457",
      "found": false
    }
  ]
}
```
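As the response shows, `_mget` returns an entry for every requested id, and missing documents are marked `"found": false` instead of failing the request. A short sketch of splitting the result accordingly; `split_mget` is a hypothetical helper and the sample data is abbreviated.

```python
def split_mget(response):
    """Split an _mget response into found sources and missing ids."""
    found = {}
    missing = []
    for doc in response["docs"]:
        if doc.get("found"):
            found[doc["_id"]] = doc["_source"]
        else:
            missing.append(doc["_id"])
    return found, missing


sample = {
    "docs": [
        {"_id": "1234", "found": True, "_source": {"userId": 1, "username": "张三"}},
        {"_id": "12345", "found": True, "_source": {"userId": 1, "username": "张三"}},
        {"_id": "123457", "found": False},
    ]
}
found, missing = split_mget(sample)
print(sorted(found), missing)
```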

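_update_by_query: update specified fields of every document that matches a query [POST index/_update_by_query. The request body holds the query and the update; here the update is written as a Painless script for flexibility]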

POST demo_users/_update_by_query

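Request Body: (This finds documents where role=administrator, sets role to poweruser, and sets remark to the old remark plus "采用_update_by_query更新". The query uses role.keyword because role is a text field: its value is analyzed into terms, so a term query against role may not match the original string, whereas the keyword sub-field stores the unanalyzed value. This is explained briefly later.)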

```
{
  "script": {
    "source": "ctx._source.role=params.role;ctx._source.remark=ctx._source.remark+params.remark",
    "lang": "painless",
    "params": {
      "role": "poweruser",
      "remark": "采用_update_by_query更新"
    }
  },
  "query": {
    "term": {
      "role.keyword": "administrator"
    }
  }
}
```

For details on writing Painless scripts, see: Painless syntax tutorial
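The request above passes the new values through `params` and does not splice them into the script text. A sketch of building that body programmatically, under the assumption of a hypothetical helper `build_update_by_query`, shows why this is convenient: the script source stays constant while only the parameters vary.

```python
import json

# The script text never changes; only the params do.
SCRIPT_SOURCE = (
    "ctx._source.role = params.role; "
    "ctx._source.remark = ctx._source.remark + params.remark"
)


def build_update_by_query(match_role, new_role, remark_suffix):
    """Build an _update_by_query body for the role change shown above."""
    return {
        "script": {
            "lang": "painless",
            "source": SCRIPT_SOURCE,
            "params": {"role": new_role, "remark": remark_suffix},
        },
        "query": {"term": {"role.keyword": match_role}},
    }


body = build_update_by_query("administrator", "poweruser", " (bulk updated)")
print(json.dumps(body, ensure_ascii=False, indent=2))
```

Keeping user-supplied values out of the script source also avoids having to escape them, and a constant script only needs to be compiled once.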

Response Body:

```
{
  "took": 114,
  "timed_out": false,
  "total": 6,
  "updated": 6,
  "deleted": 0,
  "batches": 1,
  "version_conflicts": 0,
  "noops": 0,
  "retries": {
    "bulk": 0,
    "search": 0
  },
  "throttled_millis": 0,
  "requests_per_second": -1,
  "throttled_until_millis": 0,
  "failures": [ ]
}
```

_delete_by_query: delete every document that matches a query [POST index/_delete_by_query. The request body holds the query]

POST demo_users/_delete_by_query

Request Body: (matches documents where enabled=false)

```
{
  "query": {
    "match": {
      "enabled": false
    }
  }
}
```

Response Body:

```
{
  "took": 29,
  "timed_out": false,
  "total": 3,
  "deleted": 3,
  "batches": 1,
  "version_conflicts": 0,
  "noops": 0,
  "retries": {
    "bulk": 0,
    "search": 0
  },
  "throttled_millis": 0,
  "requests_per_second": -1,
  "throttled_until_millis": 0,
  "failures": [ ]
}
```

Search queries

URL GET search (GET index/_search?q=<query_string syntax>. Note that with the default analyzer, Chinese text is split into one term per character)



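A. Term Query [matches against analyzed tokens (terms); "contains" below means that one of the field's terms matches]

GET /demo_users/_search?q=role:poweruser //query on a specific field: the field contains the value

GET /demo_users/_search?q=poweruser //unqualified query (no field given): searches all fields for poweruser, and a document is returned as soon as any one field matches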

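B. Group and Phrase Queries

Operators: AND / OR / NOT, also written as && / || / !

+ means must and - means must_not. For example, field:(+a -b) means the field must contain a and must not contain b

GET /demo_users/_search?q=remark:(POST test) //group query: terms in parentheses are combined with the default operator OR, so this matches documents whose remark contains POST or test

GET /demo_users/_search?q=remark:(POST OR test) //the same query with the OR written out

GET /demo_users/_search?q=remark:"POST test" //phrase query: remark must contain POST followed directly by test, in that order

GET /demo_users/_search?q=remark:(test AND POST) //remark contains both test and POST

GET /demo_users/_search?q=remark:(test NOT POST) //remark contains test but not POST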
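The queries above are shown unencoded, which works in tools such as the Kibana console. When they are sent from code, characters such as spaces, colons, parentheses, and quotes should be percent-encoded. A small sketch with the standard library; the base URL and the helper name `search_url` are assumptions.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:9200"  # assumption: a local test node


def search_url(index, query, base_url=BASE_URL):
    """Return a URI-search URL with the q parameter percent-encoded."""
    # quote_via=quote encodes spaces as %20 instead of "+", so a literal
    # "+" (the must operator) can never be confused with a space.
    return f"{base_url}/{index}/_search?" + urlencode({"q": query}, quote_via=quote)


print(search_url("demo_users", "remark:(POST OR test)"))
print(search_url("demo_users", 'remark:"POST test"'))
print(search_url("demo_users", "remark:(+test -POST)"))
```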

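C. Range Queries

Interval notation: [] is a closed (inclusive) bound, {} is an open (exclusive) bound

Examples: year:[2019 TO 2020], year:{2019 TO 2020}, year:{2019 TO 2020], or year:[* TO 2020]

Comparison operators

year:>2019, year:(>2012 && <=2020), or year:(+>=2012 +<=2020)

GET /demo_users/_search?q=userId:>123 //documents whose userId is greater than 123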

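D. Wildcard, Regular Expression, and Approximate Queries

? matches exactly one character and * matches zero or more characters. For example: role:power* , role:use?

GET /demo_users/_search?q=role:power* //role starts with power, followed by zero or more arbitrary characters.

Regular expressions can be used when wrapped in forward slashes, for example: username:/张三[0-9]+/ (older versions may not support shorthand classes such as \d, so [0-9] is the safer choice)

Approximate matching with ~N can widen the set of results. After a single term, N is the fuzziness, that is, the maximum edit distance. After a quoted phrase, N is the slop, that is, how far the terms may be moved apart.

GET /demo_users/_search?q=remark:tett~1 //the intent is to find documents whose remark contains test, but the query was mistyped as tett; with ~1, terms within one edit still match, so documents containing test are returned

GET /demo_users/_search?q=remark:"i like shenzhen"~2 //searches for i like shenzhen while the actual remark value is i like hubei and shenzhen, which has the extra words hubei and; ~2 allows the phrase terms to be up to 2 positions (here, two words) apart, so the document is still found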
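To make the single-term case concrete, the sketch below computes the classic Levenshtein edit distance between the mistyped and the intended term. It is only an illustration of the idea: Elasticsearch's fuzziness is based on a Damerau-Levenshtein variant in which swapping two adjacent characters counts as one edit, whereas this plain version counts such a swap as two.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning `a` into `b`."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        cur = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        prev = cur
    return prev[-1]


print(edit_distance("tett", "test"))  # 1, so remark:tett~1 can match "test"
```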

DSL POST search (POST index/_search)

POST demo_users/_search

Request Body:

```
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "enabled": "true"  # enabled = true
          }
        },
        {
          "term": {
            "role.keyword": "poweruser"  # and role = poweruser
          }
        },
        {
          "query_string": {
            "default_field": "username.keyword",
            "query": "张三"  # and username matches 张三 (a keyword field is not analyzed, so this is an exact match)
          }
        }
      ],
      "must_not": [
      ],
      "should": [
      ]
    }
  },
  "from": 0,
  "size": 1000,
  "sort": [
    {
      "createdDate": "desc"  # sort by createdDate, newest first
    }
  ],
  "_source": {  # fields to return: includes lists fields to return, excludes lists fields to omit
    "includes": [
      "role",
      "username",
      "userId",
      "remark"
    ],
    "excludes": [
    ]
  }
}
```
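The `#` comments in the body above are annotations for the reader and are not valid JSON, so they have to be removed before the body is sent from code. One way around this is to build the body as a data structure and serialize it. The following sketch assumes a hypothetical helper `build_user_search`; the field names are the ones used in this post.

```python
import json


def build_user_search(role, username, enabled=True, size=1000):
    """Build the bool query shown above as a plain dict."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"term": {"enabled": enabled}},
                    {"term": {"role.keyword": role}},
                    {
                        "query_string": {
                            "default_field": "username.keyword",
                            "query": username,
                        }
                    },
                ]
            }
        },
        "from": 0,
        "size": size,
        "sort": [{"createdDate": "desc"}],
        "_source": {"includes": ["role", "username", "userId", "remark"]},
    }


body = build_user_search("poweruser", "张三")
print(json.dumps(body, ensure_ascii=False, indent=2))
```

The empty `must_not`, `should`, and `excludes` arrays are left out here because empty clauses do not change the result.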

For detailed usage, see:

[Elasticsearch] The various uses of query_string

The differences between match, match_phrase, query_string, and term in Elasticsearch

A summary of the Elasticsearch Query DSL

Bool Query

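Finally, a guide to the official ES API documentation:

Indices APIs: index-level operations such as create, delete, get, and exists.

Document APIs: document-level operations such as index (create), delete, and get.

Search APIs: searching documents. The Document APIs look documents up by doc_id, whereas the Search APIs query by conditions.

Aggregations: aggregating the documents of an index along various dimensions.

cat APIs: querying various kinds of index-related information.

Cluster APIs: querying various kinds of cluster-related information.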
