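Elasticsearch: Getting Started

I. Index operations
II. Document operations
III. Inserting data into ES with the Scala API
IV. Conditional queries against ES with the Java API, explained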

I. Index operations

1. List indices

GET /_cat/indices?v

2. Create an index

PUT /<index_name>

3. Delete an index

DELETE /<index_name>

II. Document operations

1. Create a document

Request format: PUT /<index_name>/<type_name>/<id>

PUT /movie_index/movie/1
{
  "id": 1,
  "name": "operation red sea",
  "doubanScore": 8.5,
  "actorList": [
    {"id": 1, "name": "zhang yi"},
    {"id": 2, "name": "hai qing"},
    {"id": 3, "name": "zhang han yu"}
  ]
}

PUT /movie_index/movie/2
{
  "id": 2,
  "name": "operation meigong river",
  "doubanScore": 8.0,
  "actorList": [
    {"id": 3, "name": "zhang han yu"}
  ]
}

Note: if the index or type does not exist yet, ES creates it automatically.

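2. Query all documents of a type

GET /<index_name>/<type_name>/_search

3. Get a document by id

GET /<index_name>/<type_name>/<id>

4. Update a document

There are two kinds of update: replacing the whole document, and changing only some of its fields.

Full replacement uses the same request as creating a document:

PUT /movie_index/movie/3
{
  "id": 3,
  "name": "incident red sea",
  "doubanScore": 8.0,
  "actorList": [
    {"id": 1, "name": "zhang chen"}
  ]
}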

To change only some fields, send a POST to the _update endpoint. Each entry under "doc" is a field name and its new value:

POST /movie_index/movie/3/_update
{
  "doc": {
    "doubanScore": 8.1
  }
}
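The difference between the two kinds of update can be pictured with a small in-memory sketch in Scala. This is only an illustration, not how ES is implemented: UpdateSketch, replace, and partialUpdate are hypothetical names, and the merge shown here is shallow, whereas ES merges nested objects recursively.

```scala
object UpdateSketch {
  // A document modeled as a flat map of field name to value (an assumption of this sketch).
  type Doc = Map[String, Any]

  // Full replacement (PUT): the stored document becomes exactly the new body.
  def replace(stored: Doc, body: Doc): Doc = body

  // Partial update (_update with "doc"): listed fields overwrite, all others are kept.
  def partialUpdate(stored: Doc, partial: Doc): Doc = stored ++ partial
}
```

With a stored document holding name and doubanScore, partialUpdate with only doubanScore keeps name, while replace with the same body drops it.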

5. Delete a document

DELETE /movie_index/movie/3

6. Conditional queries

Format of the source data:

{
  "id": 1,
  "name": "operation red sea",
  "doubanScore": 8.5,
  "actorList": [
    {"id": 1, "name": "zhang yi"},
    {"id": 2, "name": "hai qing"},
    {"id": 3, "name": "zhang han yu"}
  ]
}

Query everything:

GET /movie_index/movie/_search
{
  "query": {
    "match_all": {}
  }
}

This is equivalent to:

GET /movie_index/movie/_search

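Query by the value of a field (here, documents whose name contains the term "sea"):

GET /movie_index/movie/_search
{
  "query": {
    "match": {
      "name": "sea"
    }
  }
}

Query by a sub-field (actorList is a JSON array of objects; actorList.name addresses the name field of its elements):

GET /movie_index/movie/_search
{
  "query": {
    "match": {
      "actorList.name": "zhang"
    }
  }
}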

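7. Phrase query

A phrase query (match_phrase) treats the query text as a unit. The text is still analyzed into terms, but a document matches only if all of those terms appear in the field next to each other and in the same order.

GET /movie_index/movie/_search
{
  "query": {
    "match_phrase": {
      "name": "operation red"
    }
  }
}

Here "operation red" is matched as a whole phrase.

For comparison, a plain match query returns every document whose name contains "operation" or "red":

GET /movie_index/movie/_search
{
  "query": {
    "match": {
      "name": "operation red"
    }
  }
}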
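The contrast between match and match_phrase can be sketched in a few lines of Scala. This is a simplified model under stated assumptions, not ES code: lowercasing and splitting on whitespace stands in for the analyzer, and MatchVsPhrase, tokens, matches, and matchesPhrase are hypothetical names.

```scala
object MatchVsPhrase {
  // Assumption: lowercasing and splitting on whitespace stands in for the analyzer.
  def tokens(text: String): Vector[String] =
    text.toLowerCase.split("\\s+").filter(_.nonEmpty).toVector

  // match (default OR behavior): at least one query term occurs in the field.
  def matches(field: String, query: String): Boolean = {
    val fieldTerms = tokens(field).toSet
    tokens(query).exists(fieldTerms.contains)
  }

  // match_phrase: all query terms occur consecutively and in the same order.
  def matchesPhrase(field: String, query: String): Boolean = {
    val queryTerms = tokens(query)
    queryTerms.nonEmpty && tokens(field).containsSlice(queryTerms)
  }
}
```

Under this model, "operation meigong river" satisfies matches for "operation red" because it contains "operation", but it does not satisfy matchesPhrase; "operation red sea" satisfies both.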

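8. Fuzzy query

A fuzzy query tolerates spelling differences. Even when the query word matches no term exactly, ES uses an edit-distance algorithm to also match, and score, terms that are very close to it. This costs more than an exact match.

GET /movie_index/movie/_search
{
  "query": {
    "fuzzy": {
      "name": "red"
    }
  }
}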
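The notion of "very close" terms is based on edit distance. The following Scala sketch computes the classic Levenshtein distance to show the idea. It is an illustration only: FuzzySketch, editDistance, and fuzzyMatches are hypothetical names, the maxEdits parameter plays the role of the query's fuzziness setting, and ES can additionally count a swap of two adjacent characters as one edit, which this sketch does not model.

```scala
object FuzzySketch {
  // Levenshtein distance: the minimum number of single-character insertions,
  // deletions, or substitutions needed to turn `a` into `b`.
  def editDistance(a: String, b: String): Int = {
    // prev(j) is the distance between the processed prefix of `a` and b.take(j).
    var prev = (0 to b.length).toArray
    for (i <- 1 to a.length) {
      val curr = new Array[Int](b.length + 1)
      curr(0) = i
      for (j <- 1 to b.length) {
        val cost = if (a(i - 1) == b(j - 1)) 0 else 1
        curr(j) = math.min(math.min(curr(j - 1) + 1, prev(j) + 1), prev(j - 1) + cost)
      }
      prev = curr
    }
    prev(b.length)
  }

  // Hypothetical helper: does an indexed term match the query term within maxEdits edits?
  def fuzzyMatches(term: String, query: String, maxEdits: Int): Boolean =
    editDistance(term, query) <= maxEdits
}
```

For example, the misspelled query "rad" is one substitution away from the indexed term "red", so it matches with maxEdits = 1, while "sea" is two edits away and does not.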

9. Filter after the query (post_filter)

GET /movie_index/movie/_search
{
  "query": {
    "match": {
      "name": "red"
    }
  },
  "post_filter": {
    "term": {
      "actorList.id": "3"
    }
  }
}

10. Filter before the query (recommended)

Condition: actorList.id = 3 AND actorList.id = 1 AND name matches "zhang". All clauses listed under filter must hold, so they are combined with AND, not OR.

GET movie_index/movie/_search
{
  "query": {
    "bool": {
      "filter": [
        {"term": {"actorList.id": 3}},
        {"term": {"actorList.id": 1}}
      ],
      "must": {
        "match": {"name": "zhang"}
      }
    }
  }
}

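Differences between must, should, and must_not

Roughly, must corresponds to AND (&), should to OR (|), and must_not to NOT (!).

When must and should are combined, must decides which documents match, and should only raises the score of matching documents that also satisfy it. In that combination should plays a supporting role.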

must

Age is 39 and gender is female:

GET /bank/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {"age": "39"}},
        {"match": {"gender": "F"}}
      ]
    }
  }
}

should

Age is 39 or gender is female:

GET /bank/_search
{
  "query": {
    "bool": {
      "should": [
        {"match": {"age": "39"}},
        {"match": {"gender": "F"}}
      ]
    }
  }
}

must_not

Age is neither 39 nor 40:

GET /bank/_search
{
  "query": {
    "bool": {
      "must_not": [
        {"match": {"age": "39"}},
        {"match": {"age": "40"}}
      ]
    }
  }
}

must + should

The hits are the same as with must alone. The difference is that documents that also match the should clause get a higher score.

GET /bank/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {"age": "39"}}
      ],
      "should": [
        {"match": {"gender": "F"}}
      ]
    }
  }
}
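The way must, should, and must_not combine can be modeled on in-memory records. The Scala sketch below is a simplification under stated assumptions: BoolSketch, Account, Clause, isHit, and score are hypothetical names, filter clauses are left out, and the score is just a count of matching clauses rather than ES relevance scoring.

```scala
object BoolSketch {
  final case class Account(age: Int, gender: String)
  type Clause = Account => Boolean

  // A document is a hit when it satisfies every must clause and no must_not clause.
  // should clauses are required (at least one) only when there is no must clause;
  // otherwise they are optional and only influence the score.
  def isHit(doc: Account, must: Seq[Clause], should: Seq[Clause], mustNot: Seq[Clause]): Boolean = {
    val shouldOk = must.nonEmpty || should.isEmpty || should.exists(_(doc))
    must.forall(_(doc)) && !mustNot.exists(_(doc)) && shouldOk
  }

  // Simplified score (an assumption): one point per matching must or should clause.
  def score(doc: Account, must: Seq[Clause], should: Seq[Clause]): Int =
    (must ++ should).count(_(doc))
}
```

With must = age 39 and should = gender F, both Account(39, "M") and Account(39, "F") are hits, but the second one scores higher, which mirrors the must + should behavior described above.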

11. Range filter

lt: less than; lte: less than or equal to; gt: greater than; gte: greater than or equal to.

GET movie_index/movie/_search
{
  "query": {
    "bool": {
      "filter": {
        "range": {
          "doubanScore": {
            "gt": 5,
            "lt": 9
          }
        }
      }
    }
  }
}

12. Sorting

The key under sort names the sort field (doubanScore here), and order sets the sort direction.

GET movie_index/movie/_search
{
  "query": {
    "match": {"name": "red operation"}
  },
  "sort": [
    {
      "doubanScore": {
        "order": "desc"
      }
    }
  ]
}

13. Pagination

from is the zero-based offset of the first hit to return, and size is the number of hits to return.

GET movie_index/movie/_search
{
  "query": { "match_all": {} },
  "from": 10,
  "size": 10
}
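Applications usually think in page numbers rather than offsets. As a small sketch of the arithmetic, assuming 1-based page numbers (Paging, fromFor, and pageOf are hypothetical names, not part of any ES client):

```scala
object Paging {
  // Offset (`from`) of the first hit on a 1-based page of `size` hits.
  def fromFor(page: Int, size: Int): Int = {
    require(page >= 1 && size >= 0, "page is 1-based and size must be non-negative")
    (page - 1) * size
  }

  // What from/size select out of an already ordered list of hits.
  def pageOf[A](hits: Seq[A], from: Int, size: Int): Seq[A] =
    hits.slice(from, from + size)
}
```

The request above ("from": 10, "size": 10) therefore corresponds to page 2 with 10 hits per page, that is, hits 11 through 20.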

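14. Aggregations

Equivalent to the SQL: select gender, count(*) from bank group by gender

The size parameter of a terms aggregation caps the number of buckets (groups) returned.

GET /bank/_search
{
  "aggs": {
    "groupby_gender": {
      "terms": {
        "field": "gender.keyword",
        "size": 2
      }
    }
  }
}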

Multiple aggregations

This corresponds to the results of two SQL GROUP BY queries:

select sum(balance), max(balance) from .. group by gender

select sum(balance) from .. group by age

In the request below, groupby_gender aggregates by gender and groupby_age aggregates by age:

GET /bank/_search
{
  "query": {
    "match": {
      "address": "Terrace"
    }
  },
  "aggs": {
    "groupby_gender": {
      "terms": {
        "field": "gender.keyword",
        "size": 2
      },
      "aggs": {
        "b_sum": {
          "sum": {
            "field": "balance"
          }
        },
        "b_max": {
          "max": {
            "field": "balance"
          }
        }
      }
    },
    "groupby_age": {
      "terms": {
        "field": "age",
        "size": 100
      },
      "aggs": {
        "b_sum": {
          "sum": {
            "field": "balance"
          }
        }
      }
    }
  },
  "sort": [
    {
      "age": {
        "order": "desc"
      }
    }
  ],
  "_source": ["balance", "age"]
}
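For readers more at home with collections than with the aggregation DSL, the two groupings above can be mirrored on in-memory data. This Scala sketch only illustrates the shape of the result: AggSketch, Account, sumAndMaxByGender, and sumByAge are hypothetical names, and the field types are assumptions.

```scala
object AggSketch {
  final case class Account(gender: String, age: Int, balance: Long)

  // Like a terms aggregation on gender with sum and max sub-aggregations:
  // one (sum, max) pair of balances per distinct gender value.
  def sumAndMaxByGender(docs: Seq[Account]): Map[String, (Long, Long)] =
    docs.groupBy(_.gender).map { case (gender, group) =>
      val balances = group.map(_.balance)
      (gender, (balances.sum, balances.max))
    }

  // Like a terms aggregation on age with a sum sub-aggregation.
  def sumByAge(docs: Seq[Account]): Map[Int, Long] =
    docs.groupBy(_.age).map { case (age, group) => (age, group.map(_.balance).sum) }
}
```

Each key of the resulting maps plays the role of one bucket, and the values play the role of the b_sum and b_max sub-aggregation results.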

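III. Inserting data into ES with the Scala API

The same can be done with the Java API.

1. Create a test index in ES

Search the index:

GET /user/_search

Insert one document into the index, which also creates the index:

PUT /user/_doc/1
{
  "name": "zhangsan",
  "age": 10
}

2. The User case class

case class User(name: String, age: Int)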

3. ES utility class

import io.searchbox.client.{JestClient, JestClientFactory}
import io.searchbox.client.config.HttpClientConfig
import io.searchbox.core.{Bulk, Index}

/**
 * @description: ES utility class
 * @author: HaoWu
 * @create: 2020-09-09
 */
object ESUtil {

  import scala.collection.JavaConverters._

  // ES server addresses. Either a single URL or a collection of URLs can be given;
  // a Scala collection has to be converted to a Java collection first.
  val esServerUrl = Set("http://hadoop102:9200", "http://hadoop103:9200", "http://hadoop104:9200").asJava

  // Build the client factory
  private val factory = new JestClientFactory
  val conf: HttpClientConfig = new HttpClientConfig.Builder(esServerUrl)
    .multiThreaded(true)
    .maxTotalConnection(100)
    .connTimeout(10 * 1000)
    .readTimeout(10 * 1000)
    .build()
  factory.setHttpClientConfig(conf)

  /**
   * Get an ES client
   */
  def getESClient(): JestClient = {
    factory.getObject
  }

  /**
   * Insert a single document
   *
   * @param index  the index to insert into
   * @param source either a source, or a tuple (id, source); the source can be
   *               a case class instance or a JSON object string
   */
  def insertSingle(index: String, source: Any): Unit = {
    val client: JestClient = getESClient()
    val action =
      source match {
        case (id, s) =>
          new Index.Builder(s)
            .index(index)
            .`type`("_doc")
            .id(id.toString) // ES ids are strings; an Int id passed as-is may fail to insert
            .build()
        case _ =>
          new Index.Builder(source)
            .index(index)
            .`type`("_doc")
            .build()
      }
    try client.execute(action)
    finally client.close() // release the client even if the request fails
  }

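  /**
   * Insert documents in bulk
   *
   * @param index   the index to insert into
   * @param sources each element is either a source, or a tuple (id, source); the
   *                source can be a case class instance or a JSON object string.
   *                The data will later be written in batches from mapPartitions,
   *                so the parameter is an Iterator.
   */
  def insertBluk(index: String, sources: Iterator[Object]): Unit = {
    // 1. Get an ES client
    val client: JestClient = getESClient()
    // 2. Create the bulk builder
    val builder: Bulk.Builder = new Bulk.Builder()
      .defaultIndex(index)
      .defaultType("_doc")
    // 3. Add one action per source
    //================== Option 1: foreach ========================================
    /*    sources.foreach(
          source => {
            val action =
              source match {
                case (id, s) => // the element is a tuple (id, source)
                  new Index.Builder(s)
                    .id(id.toString)
                    .build()
                case _ => // the element is a source: a case class instance or a JSON object string
                  new Index.Builder(source)
                    .build()
              }
            // add the action
            builder.addAction(action)
          }
        )*/
    //================== Option 2: map, then foreach ===============================
    sources.map { // convert each element to an action
      // Match any id type and convert it to a String, so that an Int id such as 99
      // is used as the id instead of the whole tuple being indexed as the source.
      case (id, s) =>
        new Index.Builder(s)
          .id(id.toString)
          .build()
      case source =>
        new Index.Builder(source)
          .build()
    } // add the actions to the builder
      .foreach(action => builder.addAction(action))
    // 4. Execute the bulk insert, 5. close the client even if the request fails
    try client.execute(builder.build())
    finally client.close()
  }
}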

4. Test

Insert single documents:

    val source1 = User("lisi", 20)         // no id given: ES generates one
    val source2 = ("11", User("lisi", 20)) // id is "11"
    insertSingle("user", source1)
    insertSingle("user", source2)

Query result:

      {
        "_index": "user",
        "_type": "_doc",
        "_id": "pwvTcXQBrKDUC6YPHEQZ",
        "_score": 1,
        "_source": {
          "name": "lisi",
          "age": 20
        }
      },
      {
        "_index": "user",
        "_type": "_doc",
        "_id": "11",
        "_score": 1,
        "_source": {
          "name": "lisi",
          "age": 20
        }
      }

Insert documents in bulk:

// One source without an id and one with an id
val sources = Iterator(User("wangwu", 21), (99, User("zhaoliu", 30)))
insertBluk("user", sources)
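The bulk example passes one element with an Int id (99), so it matters how the tuple pattern is written. The sketch below isolates that dispatch without any ES dependency. DispatchSketch, split, and splitStringIdOnly are hypothetical names used only for illustration; they return the (optional id, source) pair that the Index.Builder calls would receive.

```scala
object DispatchSketch {
  final case class User(name: String, age: Int)

  // Accepts any id type and normalizes it to a String.
  def split(source: Any): (Option[String], Any) = source match {
    case (id, s) => (Some(id.toString), s) // tuple (id, source)
    case s       => (None, s)              // plain source: no id, ES generates one
  }

  // A pattern that insists on a String id does not match (99, user); the element
  // falls through to the second case and the whole tuple is treated as the source.
  def splitStringIdOnly(source: Any): (Option[String], Any) = source match {
    case (id: String, s) => (Some(id), s)
    case s               => (None, s)
  }
}
```

Calling split((99, User("zhaoliu", 30))) yields the id "99" with the User as the source, whereas splitStringIdOnly on the same input yields no id and the tuple itself as the source. This is why converting the id with toString is the safer choice.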