Fortunately, Elasticsearch provides a very comprehensive and powerful REST API that you can use to interact with your cluster. Among the few things that can be done with the API are as follows:

  • Check your cluster, node, and index health, status, and statistics
  • Administer your cluster, node, and index data and metadata
  • Perform CRUD (Create, Read, Update, and Delete) and search operations against your indexes
  • Execute advanced search operations such as paging, sorting, filtering, scripting, aggregations, and many others

es提供了一套容易理解并且强大的rest api接口,通过该接口你可以和集群进行交互,完成各种操作:检查集群状态、管理集群、对索引做CRUD操作、查询索引;

Rest API Pattern:

<REST Verb> /<Index>/<Type>/<ID>[?pretty|v]

ps:Type相当于Category或者Partition的概念;Type未来会被废弃掉;

The pretty parameter, again, just tells Elasticsearch to return pretty-printed JSON results.

所有返回json接口都可以增加pretty参数,这样返回的json是格式化的;

Each of the commands accepts a query string parameter v to turn on verbose output.

v参数意味着详细输出;

以下通过CURL请求,关于CURL详见:https://www.cnblogs.com/barneywill/p/10279555.html

一  集群相关

1 查看健康情况

# curl http://$es_server:9200/_cat/health?v
epoch timestamp cluster status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1547990539 21:22:19 elasticsearch green 3 3 10 5 0 0 0 0 - 100.0%

2 查看节点

# curl http://$es_server:9200/_cat/nodes?v
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
server1 29 74 1 0.07 0.10 0.13 mdi * 3iLMMxu
server2 45 74 1 0.11 0.11 0.13 mdi - vz1k1MB
server3 47 75 1 0.08 0.07 0.08 mdi - vGUu-b6

3 查看master

# curl 'http://$es_server:9200/_cat/master?v'
id host ip node
3iLMMxuCTISHPJaVo6I4SA server1 server1 3iLMMxu

4 查看所有索引

# curl http://$es_server:9200/_cat/indices?v
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
green open testdoc GFZhtn6GSMy2pPPj8UK70Q 5 1 1 0 8.9kb 4.4kb

5 查看节点状态

# curl -XGET 'http://localhost:9200/_nodes/stats?pretty'

二 索引相关

1 建立新索引

# curl -XPUT 'http://$es_server:9200/testdoc/'
{"acknowledged":true,"shards_acknowledged":true,"index":"testdoc"}

2 删除索引

# curl -XDELETE 'http://$es_server:9200/testdoc/'

3 查看shards

# curl http://localhost:9200/_cat/shards

三 文档相关

1 插入单个文档

# curl -XPUT 'http://localhost:9200/testdoc/testtype/1' -d '{"name":"test"}'
{"_index":"testdoc","_type":"testtype","_id":"1","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":0,"_primary_term":1}

如果报错:

{"error":"Incorrect HTTP method for uri [/testdoc/testtype] and method [PUT], allowed: [POST]","status":405}

添加header

-H 'Content-Type: application/json'

2 查询单个文档

# curl -XGET 'http://$es_server:9200/testdoc/testtype/1'
{"_index":"testdoc","_type":"testtype","_id":"1","_version":1,"found":true,"_source":{"name":"test"}}

3 修改单个文档

1)使用相同的id和不同的数据再调用一次

# curl -XPUT 'http://$es_server:9200/testdoc/testtype/1' -d '{"name":"test hello"}'
{"_index":"testdoc","_type":"testtype","_id":"1","_version":2,"result":"updated","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":1,"_primary_term":2}

2)通过update

# curl -XPOST 'http://$es_server:9200/testdoc/testtype/1/_update' -d '{"doc":{"name":"test hello again"}}'
{"_index":"testdoc","_type":"testtype","_id":"1","_version":3,"result":"updated","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":2,"_primary_term":2}

4 删除单个文档

# curl -XDELETE 'http://$es_server:9200/testdoc/testtype/1'

5 批量文档操作接口

同时进行两个插入一个修改一个删除

# curl -XPOST 'http://$es_server:9200/testdoc/testtype/1/_bulk' -d '
{"index":{"_id":"3"}}
{"name": "John Doe" }
{"index":{"_id":"4"}}
{"name": "Jane Doe" }
{"update":{"_id":"1"}}
{"doc": { "name": "John Doe becomes Jane Doe" } }
{"delete":{"_id":"2"}}'

6 查询所有文档

以下两种请求等价

# curl -XGET 'http://$es_server:9200/testdoc/_search?q=*'
{"took":2,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":1,"max_score":1.0,"hits":[{"_index":"testdoc","_type":"testtype","_id":"1","_score":1.0,"_source":{"name":"test hello again"}}]}}

# curl -XPOST 'http://$es_server:9200/testdoc/_search' -d '{"query":{"match_all":{}}}'

7 查询count总数

# curl http://localhost:9200/testdoc/_count

8 通过条件查询count

# curl http://localhost:9200/testdoc/_count?q=name:hello

# curl http://localhost:9200/testdoc/_count?q=name:hello%20AND%20age:10

注意url传递query时如果有多个field,需要使用AND或OR连接,同时空格替换为编码%20

9 sql查询

# curl -XPOST -H 'Content-Type: application/json' 'http://$es_server:9200/_xpack/sql?format=txt' -d '{"query":"select * from testdoc"}'
name
----------------
test hello again

四 Setting相关

1 查看一个索引的setting

# curl -XGET 'http://localhost:9200/testdoc/_settings'

2 查看所有setting

# curl -XGET 'http://localhost:9200/_all/_settings'

五 Mapping相关

Mapping(索引结构定义)类似于表结构定义,定义所有的字段、数据类型、是否存储、是否索引、analyzer等;

Mapping is the process of defining how a document, and the fields it contains, are stored and indexed. For instance, use mappings to define:

  • which string fields should be treated as full text fields.
  • which fields contain numbers, dates, or geolocations.
  • whether the values of all fields in the document should be indexed into the catch-all _all field.
  • the format of date values.
  • custom rules to control the mapping for dynamically added fields.

1 查看单个索引的mapping

# curl http://localhost:9200/testdoc/_mapping/testtype
{"testdoc":{"mappings":{"testtype":{"properties":{"name":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}}}}}}}

2 查看所有的索引的mapping

# curl http://localhost:9200/_mapping
# curl http://localhost:9200/_all/_mapping

3 在已有mapping上添加字段

# curl -XPOST -H 'Content-Type: application/json' http://localhost:9200/testdoc/_mapping/testtype -d '
{
"properties": {
"email": {
"type": "keyword"
}
}
}'

4 设置mapping

# curl -XPOST -H 'Content-Type: application/json' http://localhost:9200/testdoc -d '
{
"mappings": {
"testtype": {
"properties": {
"title": { "type": "text", "analyzer": "standard"},
"name": { "type": "text" },
"age": { "type": "integer" },
"created": {
"type": "date",
"format": "strict_date_optional_time||epoch_millis"
}
}
}
}
}'

5 更新mapping

mapping无法更新,只能使用新的mapping创建新的索引,然后重建索引来间接实现mapping更新;

Other than where documented, existing field mappings cannot be updated. Changing the mapping would mean invalidating already indexed documents. Instead, you should create a new index with the correct mappings and reindex your data into that index. If you only wish to rename a field and not change its mappings, it may make sense to introduce an alias field.

参考:https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html

六 Analyzer相关

Analysis is the process of converting text, like the body of any email, into tokens or terms which are added to the inverted index for searching. Analysis is performed by an analyzer which can be either a built-in analyzer or a custom analyzer defined per index.

analyzer在mapping中配置,比如

"title": { "type": "text", "analyzer": "standard"}, \

测试analyzer

# curl -XPOST -H 'Content-Type: application/json' http://localhost:9200/_analyze?pretty -d '{"tokenizer":"standard","filter":  [ "lowercase", "asciifolding" ],"text":      "Is this chandler?"}'
{
"tokens" : [
{
"token" : "is",
"start_offset" : 0,
"end_offset" : 2,
"type" : "<ALPHANUM>",
"position" : 0
},
{
"token" : "this",
"start_offset" : 3,
"end_offset" : 7,
"type" : "<ALPHANUM>",
"position" : 1
},
{
"token" : "chandler",
"start_offset" : 8,
"end_offset" : 16,
"type" : "<ALPHANUM>",
"position" : 2
}
]
}
# curl -XPOST -H 'Content-Type: application/json' http://localhost:9200/_analyze?pretty -d '{"tokenizer":"standard","text":"联想是全球最大的笔记本厂商"}'
{
"tokens" : [
{
"token" : "联",
"start_offset" : 0,
"end_offset" : 1,
"type" : "<IDEOGRAPHIC>",
"position" : 0
},
{
"token" : "想",
"start_offset" : 1,
"end_offset" : 2,
"type" : "<IDEOGRAPHIC>",
"position" : 1
},
{
"token" : "是",
"start_offset" : 2,
"end_offset" : 3,
"type" : "<IDEOGRAPHIC>",
"position" : 2
},
{
"token" : "全",
"start_offset" : 3,
"end_offset" : 4,
"type" : "<IDEOGRAPHIC>",
"position" : 3
},
{
"token" : "球",
"start_offset" : 4,
"end_offset" : 5,
"type" : "<IDEOGRAPHIC>",
"position" : 4
},
{
"token" : "最",
"start_offset" : 5,
"end_offset" : 6,
"type" : "<IDEOGRAPHIC>",
"position" : 5
},
{
"token" : "大",
"start_offset" : 6,
"end_offset" : 7,
"type" : "<IDEOGRAPHIC>",
"position" : 6
},
{
"token" : "的",
"start_offset" : 7,
"end_offset" : 8,
"type" : "<IDEOGRAPHIC>",
"position" : 7
},
{
"token" : "笔",
"start_offset" : 8,
"end_offset" : 9,
"type" : "<IDEOGRAPHIC>",
"position" : 8
},
{
"token" : "记",
"start_offset" : 9,
"end_offset" : 10,
"type" : "<IDEOGRAPHIC>",
"position" : 9
},
{
"token" : "本",
"start_offset" : 10,
"end_offset" : 11,
"type" : "<IDEOGRAPHIC>",
"position" : 10
},
{
"token" : "厂",
"start_offset" : 11,
"end_offset" : 12,
"type" : "<IDEOGRAPHIC>",
"position" : 11
},
{
"token" : "商",
"start_offset" : 12,
"end_offset" : 13,
"type" : "<IDEOGRAPHIC>",
"position" : 12
}
]
}

中文分词smarkcn

$ bin/elasticsearch-plugin install analysis-smartcn

This plugin can be downloaded for offline install from https://artifacts.elastic.co/downloads/elasticsearch-plugins/analysis-smartcn/analysis-smartcn-6.6.2.zip.

The plugin provides the smartcn analyzer and smartcn_tokenizer tokenizer, which are not configurable.

分词效果:

# curl -XPOST -H 'Content-Type: application/json' http://localhost:9200/_analyze?pretty -d '{"tokenizer":"smartcn_tokenizer","text":"联想是全球最大的笔记本厂商"}'
{
"tokens" : [
{
"token" : "联想",
"start_offset" : 0,
"end_offset" : 2,
"type" : "word",
"position" : 0
},
{
"token" : "是",
"start_offset" : 2,
"end_offset" : 3,
"type" : "word",
"position" : 1
},
{
"token" : "全球",
"start_offset" : 3,
"end_offset" : 5,
"type" : "word",
"position" : 2
},
{
"token" : "最",
"start_offset" : 5,
"end_offset" : 6,
"type" : "word",
"position" : 3
},
{
"token" : "大",
"start_offset" : 6,
"end_offset" : 7,
"type" : "word",
"position" : 4
},
{
"token" : "的",
"start_offset" : 7,
"end_offset" : 8,
"type" : "word",
"position" : 5
},
{
"token" : "笔记本",
"start_offset" : 8,
"end_offset" : 11,
"type" : "word",
"position" : 6
},
{
"token" : "厂商",
"start_offset" : 11,
"end_offset" : 13,
"type" : "word",
"position" : 7
}
]
}

参考:https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-smartcn.html

中文分词ik

$ bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.6.2/elasticsearch-analysis-ik-6.6.2.zip

The IK Analysis plugin integrates Lucene IK analyzer (http://code.google.com/p/ik-analyzer/) into elasticsearch, support customized dictionary.
Analyzer: ik_smart , ik_max_word , Tokenizer: ik_smart , ik_max_word

分词效果:

ik_smark

# curl -XPOST -H 'Content-Type: application/json' http://localhost:9200/_analyze?pretty -d '{"tokenizer":"ik_smart","text":"联想是全球最大的笔记本厂商"}'
{
"tokens" : [
{
"token" : "联想",
"start_offset" : 0,
"end_offset" : 2,
"type" : "CN_WORD",
"position" : 0
},
{
"token" : "是",
"start_offset" : 2,
"end_offset" : 3,
"type" : "CN_CHAR",
"position" : 1
},
{
"token" : "全球",
"start_offset" : 3,
"end_offset" : 5,
"type" : "CN_WORD",
"position" : 2
},
{
"token" : "最大",
"start_offset" : 5,
"end_offset" : 7,
"type" : "CN_WORD",
"position" : 3
},
{
"token" : "的",
"start_offset" : 7,
"end_offset" : 8,
"type" : "CN_CHAR",
"position" : 4
},
{
"token" : "笔记本",
"start_offset" : 8,
"end_offset" : 11,
"type" : "CN_WORD",
"position" : 5
},
{
"token" : "厂商",
"start_offset" : 11,
"end_offset" : 13,
"type" : "CN_WORD",
"position" : 6
}
]
}

ik_max_word

# curl -XPOST -H 'Content-Type: application/json' http://localhost:9200/_analyze?pretty -d '{"tokenizer":"ik_max_word","text":"联想是全球最大的笔记本厂商"}'
{
"tokens" : [
{
"token" : "联想",
"start_offset" : 0,
"end_offset" : 2,
"type" : "CN_WORD",
"position" : 0
},
{
"token" : "是",
"start_offset" : 2,
"end_offset" : 3,
"type" : "CN_CHAR",
"position" : 1
},
{
"token" : "全球",
"start_offset" : 3,
"end_offset" : 5,
"type" : "CN_WORD",
"position" : 2
},
{
"token" : "最大",
"start_offset" : 5,
"end_offset" : 7,
"type" : "CN_WORD",
"position" : 3
},
{
"token" : "的",
"start_offset" : 7,
"end_offset" : 8,
"type" : "CN_CHAR",
"position" : 4
},
{
"token" : "笔记本",
"start_offset" : 8,
"end_offset" : 11,
"type" : "CN_WORD",
"position" : 5
},
{
"token" : "笔记",
"start_offset" : 8,
"end_offset" : 10,
"type" : "CN_WORD",
"position" : 6
},
{
"token" : "本厂",
"start_offset" : 10,
"end_offset" : 12,
"type" : "CN_WORD",
"position" : 7
},
{
"token" : "厂商",
"start_offset" : 11,
"end_offset" : 13,
"type" : "CN_WORD",
"position" : 8
}
]
}

参考:https://github.com/medcl/elasticsearch-analysis-ik

七 复杂查询

1 查询接口主要参数

q
The query string.

stored_fields
The selective stored fields of the document to return for each hit, comma delimited. Not specifying any value will cause no fields to return.

sort
Sorting to perform. Can either be in the form of fieldName, or fieldName:asc/fieldName:desc. The fieldName can either be an actual field within the document, or the special _score name to indicate sorting based on scores. There can be several sort parameters (order is important).

from
The starting from index of the hits to return. Defaults to 0.

size
The number of hits to return. Defaults to 10.

timeout
A search timeout, bounding the search request to be executed within the specified time value and bail with the hits accumulated up to that point when expired. Defaults to no timeout.

default_operator
The default operator to be used, can be AND or OR. Defaults to OR.

其中sort有很多种实现,比如 _geo_distance 可以用来实现地理位置远近排序,另外还可以通过filter来实现地理位置圈定,详见:https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-geo-distance-query.html

【原创】大数据基础之ElasticSearch(2)常用API整理的更多相关文章

  1. 大数据(5) - HDFS中的常用API操作

    一.安装java 二.IntelliJ IDEA(2018)安装和破解与初期配置 参考链接 1.进入官网下载IntelliJ IDEA https://www.jetbrains.com/idea/d ...

  2. 【原创】大数据基础之ElasticSearch(4)es数据导入过程

    1 准备analyzer 内置analyzer 参考:https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis- ...

  3. 【原创】大数据基础之ElasticSearch(1)简介、安装、使用

    ElasticSearch 6.6.0 官方:https://www.elastic.co/ 一 简介 ElasticSearch简单来说是对lucene的分布式封装,增加了shard(每个shard ...

  4. 【原创】大数据基础之ElasticSearch(3)升级

    elasticsearch版本升级方案 常用的滚动升级过程(Rolling Upgrade)如下: $ curl -XPUT '$es_server:9200/_cluster/settings?pr ...

  5. 【原创】大数据基础之ElasticSearch(5)重要配置及调优

    Index Settings 重要索引配置 Index level settings can be set per-index. Settings may be: 1 static 静态索引配置 Th ...

  6. 【原创】大数据基础之Zookeeper(2)源代码解析

    核心枚举 public enum ServerState { LOOKING, FOLLOWING, LEADING, OBSERVING; } zookeeper服务器状态:刚启动LOOKING,f ...

  7. 大数据基础知识问答----spark篇,大数据生态圈

    Spark相关知识点 1.Spark基础知识 1.Spark是什么? UCBerkeley AMPlab所开源的类HadoopMapReduce的通用的并行计算框架 dfsSpark基于mapredu ...

  8. 大数据除了Hadoop还有哪些常用的工具?

    大数据除了Hadoop还有哪些常用的工具? 1.Hadoop大数据生态平台Hadoop 是一个能够对大量数据进行分布式处理的软件框架.但是 Hadoop 是以一种可靠.高效.可伸缩的方式进行处理的.H ...

  9. 大数据篇:ElasticSearch

    ElasticSearch ElasticSearch是什么 ElasticSearch是一个基于Lucene的搜索服务器.它提供了一个分布式多用户能力的全文搜索引擎,基于RESTful web接口. ...

随机推荐

  1. BZOJ3328 PYXFIB 单位根反演

    题意:求 \[ \sum_{i=0}^n[k|i]\binom{n}{i}Fib(i) \] 斐波那契数列有简单的矩阵上的通项公式\(Fib(n)=A^n_{1,1}\).代入得 \[ =\sum_{ ...

  2. Epemme

    Goss wa lap tirre kamme da, Waess u'malarre zuzze nasa. Mat abbe price junirre nay, Ywe zay prolodde ...

  3. mysql int(19) float(7,2) decimal(7,2)对比

    nt(19):指定数字的显示宽度为19,与实际存储数值的范围无关 float(7,2):  7是显示宽度指示器,指定显示的浮点数为7位数字(与float实际存储值的范围无关),2代表小数点后只有两位小 ...

  4. 异常SRVE0199E

    后台生成导出exe表格,在tomcat自己环境下完全没问题到websphere环境下保SRVE0199E产生这个问题是因为response.OutputStream已经打开再次打开就报这个异常,前台如 ...

  5. [BZOJ 4819] [SDOI 2017] 新生舞会

    Description 学校组织了一次新生舞会,Cathy作为经验丰富的老学姐,负责为同学们安排舞伴. 有 \(n\) 个男生和 \(n\) 个女生参加舞会买一个男生和一个女生一起跳舞,互为舞伴. C ...

  6. BZOJ4241历史研究——回滚莫队

    题目描述 IOI国历史研究的第一人——JOI教授,最近获得了一份被认为是古代IOI国的住民写下的日记.JOI教授为了通过这份日记来研究古代IOI国的生活,开始着手调查日记中记载的事件. 日记中记录了连 ...

  7. Mybatis Generator的model生成中文注释,支持oracle和mysql(通过实现CommentGenerator接口的方法来实现)

    自己手动实现的前提,对maven项目有基本的了解,在本地成功搭建了maven环境,可以参考我之前的文章:maven环境搭建 项目里新建表时model,mapper以及mapper.xml基本都是用My ...

  8. 仙人掌&圆方树学习笔记

    仙人掌&圆方树学习笔记 1.仙人掌 圆方树用来干啥? --处理仙人掌的问题. 仙人掌是啥? (图片来自于\(BZOJ1023\)) --也就是任意一条边只会出现在一个环里面. 当然,如果你的图 ...

  9. Product(欧拉函数)

    原题地址 先吐槽一波:凉心出题人又卡时间又卡空间 先来化简一波柿子 \[\prod_{i=1}^{n}\prod_{j=1}^{n}\frac{lcm(i,j)}{gcd(i,j)}\] \[=\pr ...

  10. python xpath学习

    一.选取节点: 二.谓词: 注意:在scrapy中用xpath进行搜索时,如果使用相对路径,要加上.,如,不然搜索的是整个文档.