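Continuing from the previous part, this post looks at how to query data.

1. Querying with SQL

Anyone used to database development will naturally like this approach best. To make the walkthrough easier, let's first write some code that generates a batch of records.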

package com.cnblogs.yjmyzz;

import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class Test {

    public static void main(String[] args) throws IOException, URISyntaxException, InterruptedException {
        HttpClient httpClient = HttpClient.newBuilder().build();
        // One request per document, using i as both the _id and the blog_id
        for (int i = 1000000; i < 2000000; i++) {
            HttpRequest httpRequest = HttpRequest.newBuilder()
                    .header("Content-Type", "application/json")
                    .version(HttpClient.Version.HTTP_1_1)
                    .uri(new URI("http://localhost:9200/cnblogs/_doc/" + i))
                    .POST(HttpRequest.BodyPublishers.ofString("{\n" +
                            " \"blog_id\":" + i + ",\n" +
                            " \"blog_title\":\"java并发编程(" + i + ")\",\n" +
                            " \"blog_content\":\"java并发编程学习笔记" + i + "-by 菩提树下的杨过\",\n" +
                            " \"blog_category\":\"java\"\n" +
                            "}")).build();
            HttpResponse<String> response = httpClient.send(httpRequest, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.toString() + "\t" + i);
        }
    }
}

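No third-party library is used here; the HttpClient built into JDK 11 adds 1,000,000 records to ES (ids 1000000 through 1999999). Each inserted document has the fields blog_id, blog_title, blog_content and blog_category.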
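Sending one HTTP request per document is simple but slow for a million records. One possible alternative (not what the code above does) is the _bulk endpoint, which takes newline-delimited JSON: an action line followed by the document itself, with a trailing newline at the end. The sketch below only builds such a body and request; the class name BulkBody and the id range are illustrative assumptions.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class BulkBody {

    // Builds an NDJSON body for ids in [from, to): one action line plus one document line per id.
    static String build(int from, int to) {
        StringBuilder sb = new StringBuilder();
        for (int i = from; i < to; i++) {
            sb.append("{\"index\":{\"_id\":\"").append(i).append("\"}}\n");
            sb.append("{\"blog_id\":").append(i)
              .append(",\"blog_title\":\"java并发编程(").append(i).append(")\"")
              .append(",\"blog_content\":\"java并发编程学习笔记").append(i).append("-by 菩提树下的杨过\"")
              .append(",\"blog_category\":\"java\"}\n");
        }
        return sb.toString();
    }

    // Wraps the body in a POST to the index's _bulk endpoint (the request is built, not sent).
    static HttpRequest request(int from, int to) {
        return HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/cnblogs/_bulk"))
                .header("Content-Type", "application/x-ndjson")
                .POST(HttpRequest.BodyPublishers.ofString(build(from, to)))
                .build();
    }

    public static void main(String[] args) {
        System.out.print(build(1000000, 1000002));
    }
}
```

The request can be sent with httpClient.send(...) exactly as in the loop above; how many documents to put in one batch is something to tune by experiment.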

To fetch the top 10 rows with SQL, you can do this:

POST http://localhost:9200/_sql?format=txt

{
"query": "SELECT * FROM cnblogs where blog_category='java' and blog_id between 1000000 and 1005000 order by blog_id desc limit 10"
}

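Just write SQL the way you would query MySQL, which is very convenient. With format=txt, the result comes back as a plain-text table.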
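The same SQL call can be issued from Java with the JDK 11 HttpClient used earlier. A minimal sketch: the class name SqlQueryRequest and the helper names are made up for illustration, and the escaping only handles backslashes and double quotes, so it assumes the SQL text contains no control characters.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class SqlQueryRequest {

    // Wraps the SQL text in {"query": "..."}, escaping backslashes and double quotes.
    static String jsonBody(String sql) {
        String escaped = sql.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"query\":\"" + escaped + "\"}";
    }

    // Builds (but does not send) the POST request for the _sql endpoint.
    static HttpRequest build(String sql) {
        return HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/_sql?format=txt"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody(sql)))
                .build();
    }

    public static void main(String[] args) {
        String sql = "SELECT * FROM cnblogs WHERE blog_category='java' ORDER BY blog_id DESC LIMIT 10";
        System.out.println(jsonBody(sql));
        System.out.println(build(sql).uri());
    }
}
```

Sending it is then a matter of httpClient.send(request, HttpResponse.BodyHandlers.ofString()) and printing the body.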

ES also provides a SQL CLI: run ./elasticsearch-sql-cli in a terminal (from the bin directory of the ES installation).

For more details on SQL search, see https://www.elastic.co/guide/en/elasticsearch/reference/current/xpack-sql.html

2. Simple URI search

2.1 Exact lookup by internal _id

GET http://localhost:9200/cnblogs/_doc/1001818

If a document with _id=1001818 exists, the response is:

{
"_index": "cnblogs",
"_type": "_doc",
"_id": "1001818",
"_version": 1,
"_seq_no": 954,
"_primary_term": 1,
"found": true,
"_source": {
"blog_id": 1001818,
"blog_title": "java并发编程(1001818)",
"blog_content": "java并发编程学习笔记1001818-by 菩提树下的杨过",
"blog_category": "java"
}
}

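If the document does not exist, an HTTP 404 status code is returned.

Tip: if you do not want the _xxx metadata fields in the response, append /_source to the URI, i.e. http://localhost:9200/cnblogs/_doc/1001818/_source, which returns: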

{
"blog_id": 1001818,
"blog_title": "java并发编程(1001818)",
"blog_content": "java并发编程学习笔记1001818-by 菩提树下的杨过",
"blog_category": "java"
}

Large text fields are also costly to return on every request. If you only need specific fields, you can do this:

http://localhost:9200/cnblogs/_doc/1001818/_source/?_source=blog_id,blog_title

Only the two fields blog_id and blog_title are returned.
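If these URLs are assembled in code, a small helper keeps the /_source suffix and the field list in one place. This is only a sketch: SourceFilterUri and its build method are hypothetical names, and the host is the local one used throughout this post.

```java
import java.net.URI;

class SourceFilterUri {

    // With no fields, returns the plain /_source URL; otherwise appends ?_source=f1,f2,...
    static URI build(String index, String id, String... fields) {
        String base = "http://localhost:9200/" + index + "/_doc/" + id + "/_source";
        if (fields.length == 0) {
            return URI.create(base);
        }
        return URI.create(base + "/?_source=" + String.join(",", fields));
    }

    public static void main(String[] args) {
        System.out.println(build("cnblogs", "1001818"));
        System.out.println(build("cnblogs", "1001818", "blog_id", "blog_title"));
    }
}
```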

 

2.2 Searching with _search?q

GET http://localhost:9200/cnblogs/_search?q=blog_id:1001818

This searches for records whose blog_id is 1001818.

For more search details, see https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html
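When the q value contains spaces or Chinese characters (for example a blog_title search), it needs to be URL-encoded before it goes into the URI. A minimal sketch using only the JDK; QueryStringUri is a made-up name, and the "+" produced by URLEncoder is rewritten to %20 so that it cannot be mistaken for a literal plus sign.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class QueryStringUri {

    // Builds /{index}/_search?q=field:value with the whole q value percent-encoded.
    static URI build(String index, String field, String value) {
        String q = URLEncoder.encode(field + ":" + value, StandardCharsets.UTF_8)
                .replace("+", "%20");
        return URI.create("http://localhost:9200/" + index + "/_search?q=" + q);
    }

    public static void main(String[] args) {
        System.out.println(build("cnblogs", "blog_id", "1001818"));
        System.out.println(build("cnblogs", "blog_title", "并发 ES"));
    }
}
```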

3. DSL search

_search also supports more complex searches via POST, known as Query DSL. For example, to fetch the first 5 records:

POST http://localhost:9200/cnblogs/_search

{
"size": 5,
"from": 0
}

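This has a similar effect to paging with limit x,y in MySQL. Note, however, that this style of paging performs very poorly at large offsets. By default ES 7.x checks the window and returns an error outright when from + size exceeds 10000 (the index.max_result_window setting).

For example: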

{
"size": 5,
"from": 10000
}

returns:

{
"error": {
"root_cause": [
{
"type": "illegal_argument_exception",
"reason": "Result window is too large, from + size must be less than or equal to: [10000] but was [10005]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting."
}
],
"type": "search_phase_execution_exception",
"reason": "all shards failed",
"phase": "query",
"grouped": true,
"failed_shards": [
{
"shard": 0,
"index": "cnblogs",
"node": "TZ_qYEMOSZ63E1HMl4lFfA",
"reason": {
"type": "illegal_argument_exception",
"reason": "Result window is too large, from + size must be less than or equal to: [10000] but was [10005]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting."
}
}
],
"caused_by": {
"type": "illegal_argument_exception",
"reason": "Result window is too large, from + size must be less than or equal to: [10000] but was [10005]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting.",
"caused_by": {
"type": "illegal_argument_exception",
"reason": "Result window is too large, from + size must be less than or equal to: [10000] but was [10005]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting."
}
}
},
"status": 400
}
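The rule in the error message ("from + size must be less than or equal to: [10000]") is easy to check on the client side before sending a request. The sketch below mirrors that arithmetic; ResultWindow, allowed and fromForPage are illustrative names, and 10000 is the default quoted in the error above.

```java
class ResultWindow {

    // Default value of index.max_result_window, as quoted in the error message.
    static final int DEFAULT_MAX_RESULT_WINDOW = 10_000;

    // True when from + size stays within the window (long arithmetic avoids int overflow).
    static boolean allowed(int from, int size, int maxResultWindow) {
        return (long) from + size <= maxResultWindow;
    }

    // Converts a 1-based page number into the "from" offset for a given page size.
    static int fromForPage(int page, int size) {
        return (page - 1) * size;
    }

    public static void main(String[] args) {
        System.out.println(allowed(0, 5, DEFAULT_MAX_RESULT_WINDOW));     // first page
        System.out.println(allowed(9995, 5, DEFAULT_MAX_RESULT_WINDOW));  // exactly at the limit
        System.out.println(allowed(10000, 5, DEFAULT_MAX_RESULT_WINDOW)); // the failing request above
    }
}
```

For anything deeper than the window, the error message itself points to the scroll API.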

The DSL can express very complex queries, for example:

POST http://localhost:9200/cnblogs/_search

{
"query": {
"bool": {
"must": [
{
"range": {
"blog_id": {
"gte": 1001818,
"lte": 1001830
}
}
},
{
"match": {
"blog_category": "java"
}
}
]
}
},
"size": 10,
"from": 0
}

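Translated to SQL, this is equivalent to: blog_id between 1001818 and 1001830 and blog_category='java' limit 0,10

There is no need to memorize the DSL; it can be generated visually with Elasticsearch Tools.

You can also use highlight so that the matched keywords are highlighted in the results: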

{
"query": {
"bool": {
"must": [
{
"match": {
"blog_title": "并发 ES"
}
}
]
}
},
"highlight": {
"fields": {
"blog_title": {}
}
},
"size": "1",
"from": 0
}

Response:

{
"took": 63,
"timed_out": false,
"_shards": {
"total": 2,
"successful": 2,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 10000,
"relation": "gte"
},
"max_score": 9.87141,
"hits": [
{
"_index": "cnblogs",
"_type": "_doc",
"_id": "1",
"_score": 9.87141,
"_source": {
"blog_id": 10000001,
"blog_title": "ES 7.8速成笔记(新标题)",
"blog_content": "这是一篇关于ES的测试内容by 菩提树下的杨过",
"blog_category": "ES"
},
"highlight": {
"blog_title": [
"<em>ES</em> 7.8速成笔记(新标题)"
]
}
}
]
}
}

In the additional highlight section, each matched keyword is wrapped in an em tag.

Sorting (sort)
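On the client side, the matched keywords can be pulled back out of a highlight fragment with a small regular expression. A sketch under the assumption that the default em tags are in use (as in the response above); HighlightTerms is a hypothetical helper name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HighlightTerms {

    // Non-greedy match so that each <em>...</em> pair is captured separately.
    private static final Pattern EM = Pattern.compile("<em>(.*?)</em>");

    // Returns the text inside every <em>...</em> pair, in order of appearance.
    static List<String> extract(String fragment) {
        List<String> terms = new ArrayList<>();
        Matcher m = EM.matcher(fragment);
        while (m.find()) {
            terms.add(m.group(1));
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(extract("<em>ES</em> 7.8速成笔记(新标题)"));
    }
}
```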

{
"query": {
"bool": {
"must": [
{
"match": {
"blog_title": "并发 ES"
}
}
]
}
},
"highlight": {
"fields": {
"blog_title": {}
}
},
"sort": [
{
"blog_id": {
"order": "desc"
}
}
],
"size": "1",
"from": 0
}

Note the sort section: when order is omitted for a field, it defaults to asc (ascending).

Aggregation (group by)

{
"aggs": {
"all_interests": {
"terms": {
"field": "blog_category"
}
}
},
"size": 0,
"from": 0
}

The query above is similar to select count(0) from cnblogs group by blog_category in SQL. The response is:

{
"took": 1783,
"timed_out": false,
"_shards": {
"total": 2,
"successful": 2,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 10000,
"relation": "gte"
},
"max_score": null,
"hits": []
},
"aggregations": {
"all_interests": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "java",
"doc_count": 514666
},
{
"key": "ES",
"doc_count": 1
},
{
"key": "sql",
"doc_count": 1
}
]
}
}
}
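To make the group-by analogy concrete, here is an in-memory equivalent of what the terms aggregation computes: one bucket per distinct category with its document count. The sample categories are invented for illustration, and CategoryCounts is a hypothetical name; ES of course does this on the server across shards.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class CategoryCounts {

    // Groups the values and counts each group, like "terms" on blog_category with size 0 hits.
    static Map<String, Long> countBy(List<String> categories) {
        return categories.stream()
                .collect(Collectors.groupingBy(c -> c, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // Made-up sample data: three "java" documents, one "ES", one "sql"
        System.out.println(countBy(List.of("java", "java", "ES", "sql", "java")));
    }
}
```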

For more Query DSL details, see the documentation at https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html

4. Querying with the client SDK

ES provides two clients: elasticsearch-rest-client and elasticsearch-rest-high-level-client.

4.1 elasticsearch-rest-client

pom dependencies:

<dependency>
    <groupId>com.google.code.gson</groupId>
    <artifactId>gson</artifactId>
    <version>2.8.6</version>
</dependency>
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-client</artifactId>
    <version>7.8.0</version>
</dependency>

Sample code:

package com.cnblogs.yjmyzz;

import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
import org.apache.http.HttpHost;
import org.apache.http.util.EntityUtils;
import org.elasticsearch.client.*;

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class EsClientTest {

    private static Gson gson = new GsonBuilder()
            .setPrettyPrinting()
            .setDateFormat("yyyy-MM-dd HH:mm:ss.SSS")
            .create();

    public static void main(String[] args) throws IOException {
        RestClientBuilder builder = RestClient.builder(new HttpHost("127.0.0.1", 9200, "http"));
        // Log every node failure reported by the client
        builder.setFailureListener(new RestClient.FailureListener() {
            @Override
            public void onFailure(Node node) {
                System.out.println("fail:" + node);
            }
        });

        RestClient client = builder.build();

        // Simple GET example
        Request request = new Request("GET", "/cnblogs/_doc/1001818/_source/?_source=blog_id,blog_title");
        request.addParameter("pretty", "true");
        Response response = client.performRequest(request);
        System.out.println(response.getRequestLine());
        System.out.println(response.getStatusLine());
        System.out.println(EntityUtils.toString(response.getEntity()));

        System.out.println("----------------");

        // POST search example
        request = new Request("POST", "/cnblogs/_search/?_source=blog_id,blog_title");
        request.addParameter("pretty", "true");
        Map<String, Integer> map = new HashMap<>();
        map.put("size", 2);
        map.put("from", 0);
        request.setJsonEntity(gson.toJson(map));
        response = client.performRequest(request);
        System.out.println(response.getRequestLine());
        System.out.println(response.getStatusLine());
        System.out.println(EntityUtils.toString(response.getEntity()));

        // Release the underlying HTTP resources
        client.close();
    }
}

  

4.2 elasticsearch-rest-high-level-client

pom dependencies:

<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
    <version>7.8.0</version>
</dependency>

Sample code:

package com.cnblogs.yjmyzz;

import org.apache.http.HttpHost;
import org.elasticsearch.action.get.GetRequest;
import org.elasticsearch.action.get.GetResponse;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.*;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;

import java.io.IOException;

public class EsClientHighLevelTest {

    public static void main(String[] args) throws IOException {
        RestClientBuilder builder = RestClient.builder(new HttpHost("127.0.0.1", 9200, "http"));
        // Log every node failure reported by the client
        builder.setFailureListener(new RestClient.FailureListener() {
            @Override
            public void onFailure(Node node) {
                System.out.println("fail:" + node);
            }
        });

        RestHighLevelClient client = new RestHighLevelClient(builder);

        // Simple GET example
        GetRequest request = new GetRequest("cnblogs", "1001818");
        GetResponse response = client.get(request, RequestOptions.DEFAULT);
        System.out.println(response.getSourceAsString());

        // Search example
        SearchRequest searchRequest = new SearchRequest("cnblogs");
        SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
        sourceBuilder.query(QueryBuilders.matchQuery("blog_title", "并发 笔记"));
        sourceBuilder.from(0);
        sourceBuilder.size(5);
        searchRequest.source(sourceBuilder);

        SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
        for (SearchHit hit : searchResponse.getHits()) {
            System.out.println(hit.getSourceAsString());
        }

        client.close();
    }
}
