Elasticsearch: Basic Concepts and Basic Operations

1.1. Elasticsearch (ES for short)

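ES was created to address the shortcomings of using raw Lucene directly: it simplifies how Lucene is called and provides a highly available, distributed, clustered search solution. Its first version appeared on GitHub in February 2010, and it quickly became one of the most popular projects there.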

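ES is more than just a wrapper around Lucene. Its main features are:

A distributed real-time document store in which every field is indexed and searchable

A distributed real-time analytics and search engine

The ability to scale to hundreds of servers and handle petabytes of structured or unstructured data

A highly integrated service that your application can talk to through a simple RESTful API, clients for various languages, or even the command line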

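Getting started with Elasticsearch is easy. It ships with sensible defaults and hides complex search-engine theory from beginners. It works out of the box (install it and use it), and only a little learning is needed before it can be used in production.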

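Frameworks similar to ES

Solr

Solr compared with ES:

Solr relies on ZooKeeper for distributed coordination and supports more data formats (HTML/PDF/CSV). It offers more official features and performs better than ES in traditional search applications, but it is less efficient at real-time search.

ES has built-in distributed coordination but only accepts data in JSON format. It focuses on core functionality and leaves many advanced features to third-party plugins. For real-time search applications it is noticeably more efficient than Solr.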

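Katta

A Lucene-based search solution that is distributed, scalable, fault tolerant, and near real-time.

Pros: works out of the box and can be combined with Hadoop for distribution. It has scaling and fault-tolerance mechanisms.

Cons: it is only a search solution, so indexing still has to be implemented separately. Its search features cover only the most basic needs. There are few success stories, and the project is somewhat less mature.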

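HadoopContrib

A Map/Reduce-based distributed indexing solution that can be used together with Katta.

Pros: distributed indexing, with scalability.

Cons: it is only an indexing solution and does not include search. It works in batch mode, so its support for real-time search is poor.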

ES data management

Creating (indexing) a document

① Create with your own ID:

PUT {index}/{type}/{id}
{
  "field": "value",
  ...
}

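Partially updating a document

The update API accepts a partial document in the doc parameter and merges it into the existing document: objects are merged together, existing scalar fields are overwritten, and new fields are added.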

POST itsource/employee/123/_update
{
  "doc": {
    "email": "nixianhua@itsource.cn",
    "salary": 1000
  }
}
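
The merge rule described above can be illustrated without a cluster. The following is only a sketch of the semantics in plain Java, not how ES implements it; the class name PartialMerge and the sample field "name" are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PartialMerge {
    // Merge `doc` into a copy of `existing`: nested objects are merged recursively,
    // scalar values are overwritten, and keys that do not exist yet are added.
    @SuppressWarnings("unchecked")
    static Map<String, Object> merge(Map<String, Object> existing, Map<String, Object> doc) {
        Map<String, Object> result = new LinkedHashMap<>(existing);
        for (Map.Entry<String, Object> e : doc.entrySet()) {
            Object oldValue = result.get(e.getKey());
            Object newValue = e.getValue();
            if (oldValue instanceof Map && newValue instanceof Map) {
                result.put(e.getKey(),
                        merge((Map<String, Object>) oldValue, (Map<String, Object>) newValue));
            } else {
                result.put(e.getKey(), newValue);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> existing = new LinkedHashMap<>();
        existing.put("name", "nixianhua"); // hypothetical field already stored
        existing.put("salary", 500);

        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("email", "nixianhua@itsource.cn");
        doc.put("salary", 1000);

        // prints {name=nixianhua, salary=1000, email=nixianhua@itsource.cn}
        System.out.println(merge(existing, doc));
    }
}
```

Fields that are not mentioned in doc (here "name") are left untouched, which is what distinguishes a partial update from re-indexing the whole document.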

Deleting a document

DELETE {index}/{type}/{id}

Batch operations with the bulk API

A single request can create, index, update, or delete multiple documents.

Format of the bulk request body:

{ action: { metadata }}\n
{ request body }\n
{ action: { metadata }}\n
{ request body }\n

POST _bulk
{ "delete": { "_index": "itsource", "_type": "employee", "_id": "123" }}
{ "create": { "_index": "itsource", "_type": "blog", "_id": "123" }}
{ "title": "我发布的博客" }
{ "index": { "_index": "itsource", "_type": "blog" }}
{ "title": "我的第二博客" }
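
When the bulk body is assembled by hand rather than through a client library, the line-oriented format is the part that is easy to get wrong. Here is a small plain-Java sketch, assuming each action and each request body is already a single-line JSON string; the class name BulkBody is hypothetical.

```java
import java.util.Arrays;
import java.util.List;

final class BulkBody {
    // Join single-line JSON strings into a bulk body: one JSON object per line,
    // every line (including the last one) terminated by '\n'.
    static String build(List<String> jsonLines) {
        StringBuilder sb = new StringBuilder();
        for (String line : jsonLines) {
            if (line.indexOf('\n') >= 0) {
                throw new IllegalArgumentException("each bulk line must be single-line JSON");
            }
            sb.append(line).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String body = build(Arrays.asList(
                // a delete action has no request body line after it
                "{ \"delete\": { \"_index\": \"itsource\", \"_type\": \"employee\", \"_id\": \"123\" }}",
                // a create action is followed by the document to store
                "{ \"create\": { \"_index\": \"itsource\", \"_type\": \"blog\", \"_id\": \"123\" }}",
                "{ \"title\": \"我发布的博客\" }"));
        System.out.print(body);
    }
}
```

Note that delete is the only action here without a request body line, and that the body must end with a newline.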

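Batch retrieval

# Batch retrieval, option 1: name the index, type and ID of each document
GET _mget
{
  "docs": [{
      "_index": "itsource",
      "_type": "blog",
      "_id": "123"
    }, {
      "_index": "itsource",
      "_type": "blog",
      "_id": "AWpXiEfhCq6ubXlpA9Ia",
      "_source": "title"
    }]
}

# Batch retrieval, option 2: put the index and type in the URL and list only the IDs
GET itsource/blog/_mget
{
  "ids": ["123", "AWpXiEfhCq6ubXlpA9Ia"]
}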

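Paginated queries

# Paginated query: skip the first 2 hits and return the next 3
GET _search?size=3&from=2

# Query for documents where age = 18
GET crm/employees/_search?q=age:18

# Query for documents where 10 <= age <= 30
GET crm/employees/_search?q=age:[10 TO 30]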
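
Since from is the number of hits to skip and size is the number of hits to return, a page number has to be converted into an offset first. A minimal sketch in plain Java, assuming 1-based page numbers; the class and method names are made up for illustration.

```java
final class Paging {
    // Offset of the first hit on a 1-based page: (page - 1) * size.
    static int from(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be >= 1");
        }
        return (page - 1) * size;
    }

    // Build the query-string form used in the requests above.
    static String searchPath(int page, int size) {
        return "_search?size=" + size + "&from=" + from(page, size);
    }

    public static void main(String[] args) {
        // Page 3 with 10 hits per page skips the first 20 hits.
        System.out.println(searchPath(3, 10)); // prints _search?size=10&from=20
    }
}
```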

DSL queries

# Querying with the DSL
GET crm/employees/_search
{
  "query": {
    "match": {
      "name": "大哥"
    }
  }
}

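Example: a JD.com-style product search for the keyword iphone, with country us and a price range of 6000 to 8000, sorted by price in descending order and returning the first 2 results: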

GET shop/goods/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {
          "name": "iphone"
        }}
      ],
      "filter": [{
        "term": {
          "local": "us"
        }
      }, {
        "range": {
          "price": {
            "gte": 6000,
            "lte": 8000
          }
        }
      }]
    }
  },
  "from": 0,
  "size": 2,
  "_source": ["id", "name", "type", "price"],
  "sort": [{"price": "desc"}]
}

The same operations in Java:

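import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.elasticsearch.action.bulk.BulkRequestBuilder;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.action.delete.DeleteResponse;
import org.elasticsearch.action.get.GetResponse;
import org.elasticsearch.action.index.IndexRequestBuilder;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.action.search.SearchRequestBuilder;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.action.update.UpdateResponse;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.sort.SortOrder;
import org.elasticsearch.transport.client.PreBuiltTransportClient;
import org.junit.Test;

public class elasticTest {

    // Obtain a client connected to a local node over the transport port (9300).
    // InetSocketTransportAddress belongs to the 5.x transport client API.
    public TransportClient getClient() throws UnknownHostException {
        return new PreBuiltTransportClient(Settings.EMPTY)
                .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName("127.0.0.1"), 9300));
    }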

    // Index (create) a document
    @Test
    public void getCreated() throws Exception {
        try (TransportClient client = getClient()) {
            // Prepare an index request for index "crm", type "user", id "1"
            IndexRequestBuilder indexRequestBuilder = client.prepareIndex("crm", "user", "1");
            // Prepare the document fields
            Map<String, Object> mp = new HashMap<>();
            mp.put("id", 2);
            mp.put("name", "kg");
            mp.put("age", 18);
            // Send the document and print the response
            IndexResponse indexResponse = indexRequestBuilder.setSource(mp).get();
            System.out.println(indexResponse);
        }
    }

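    // Partially update a document
    @Test
    public void update() throws Exception {
        try (TransportClient client = getClient()) {
            Map<String, Object> mp = new HashMap<>();
            mp.put("id", 2);
            mp.put("name", "黄巢");
            mp.put("age", 35);
            // setDoc merges these fields into the existing document
            UpdateResponse response = client.prepareUpdate("crm", "user", "1").setDoc(mp).get();
            System.out.println(response);
            // Read the document back to check the result
            GetResponse fields = client.prepareGet("crm", "user", "1").get();
            System.out.println(fields.getSource());
        }
    }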

    // Delete a document
    @Test
    public void testdelete() throws Exception {
        try (TransportClient client = getClient()) {
            DeleteResponse response = client.prepareDelete("crm", "user", "1").get();
            System.out.println(response);
        }
    }

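    // Bulk indexing
    @Test
    public void testBulk() throws Exception {
        try (TransportClient client = getClient()) {
            BulkRequestBuilder bulk = client.prepareBulk();
            for (int i = 0; i < 10; i++) {
                Map<String, Object> map = new HashMap<>();
                map.put("id", i);
                map.put("age", 6 + i);
                map.put("name", "zhansan" + i);
                // Index "crm", type "user", id = i
                bulk.add(client.prepareIndex("crm", "user", String.valueOf(i)).setSource(map));
            }
            BulkResponse response = bulk.get();
            if (response.hasFailures()) {
                System.out.println(response.buildFailureMessage());
            }
            // prepareSearch takes index names, so the type is set separately.
            // Printing the builder shows the request; the search is not executed here.
            SearchRequestBuilder search = client.prepareSearch("crm").setTypes("user");
            System.out.println(search);
        }
    }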

    // Bool query with a filter, paging and sorting
    @Test
    public void testQuery() throws Exception {
        try (TransportClient client = getClient()) {
            BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();
            // must: the name has to match
            List<QueryBuilder> must = boolQuery.must();
            must.add(QueryBuilders.termQuery("name", "zhansan1"));
            // filter: 6 <= age <= 10
            List<QueryBuilder> filter = boolQuery.filter();
            filter.add(QueryBuilders.rangeQuery("age").gte(6).lte(10));
            // Paging and sorting
            SearchResponse response = client.prepareSearch("crm")
                    .setFrom(0).setSize(3)
                    .setQuery(boolQuery)
                    .addSort("id", SortOrder.DESC).get();
            System.out.println("Total hits: " + response.getHits().getTotalHits());
            // The first getHits() returns the hits container (total count and metadata);
            // the second getHits() returns the array of matching documents
            SearchHit[] hits = response.getHits().getHits();
            // Iterate over the array to read each document
            for (SearchHit hit : hits) {
                System.out.println(hit.getSource());
            }
        }
    }
}