Some notes on the Elasticsearch _source field
_source field
The _source field contains the original JSON document body that was passed at index time. The _source field itself is not indexed (and thus is not searchable), but it is stored so that it can be returned when executing fetch requests, like get or search.
Disabling the _source field
Though very handy to have around, the source field does incur storage overhead within the index. For this reason, it can be disabled as follows:
PUT tweets
{
  "mappings": {
    "tweet": {
      "_source": {
        "enabled": false
      }
    }
  }
}

Think before disabling the _source field
Users often disable the _source field without thinking about the consequences, and then live to regret it. If the _source field isn't available then a number of features are not supported:
- The update, update_by_query, and reindex APIs.
- On the fly highlighting.
- The ability to reindex from one Elasticsearch index to another, either to change mappings or analysis, or to upgrade an index to a new major version.
- The ability to debug queries or aggregations by viewing the original document used at index time.
- Potentially in the future, the ability to repair index corruption automatically.

If disk space is a concern, increase the compression level rather than disabling the _source field.
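As a rough illustration of the compression option, the sketch below builds the body of an index-creation request that selects a higher-compression codec. It assumes the `index.codec` setting with the value `best_compression`; the helper name `index_settings` is hypothetical, and the snippet only prints the JSON rather than sending it to a cluster.

```python
import json


def index_settings(codec="best_compression"):
    """Build an index-creation body that picks the stored-field codec.

    Assumption: the cluster accepts "best_compression" for index.codec.
    The setting is applied when the index is created.
    """
    return {"settings": {"index": {"codec": codec}}}


# Body that would accompany a request such as `PUT tweets`.
body = index_settings()
print(json.dumps(body, indent=2))
```

The _source field stays enabled here, so update, reindex, and highlighting keep working; the trade-off is extra CPU when stored fields are written and read.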
The metrics use case
The metrics use case is distinct from other time-based or logging use cases in that there are many small documents which consist only of numbers, dates, or keywords. There are no updates, no highlighting requests, and the data ages quickly so there is no need to reindex. Search requests typically use simple queries to filter the dataset by date or tags, and the results are returned as aggregations.
In this case, disabling the _source field will save space and reduce I/O. It is also advisable to disable the _all field in the metrics case.
Including / Excluding fields from _source
An expert-only feature is the ability to prune the contents of the _source field after the document has been indexed, but before the _source field is stored.

Removing fields from the _source has similar downsides to disabling _source, especially the fact that you cannot reindex documents from one Elasticsearch index to another. Consider using source filtering instead.
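Source filtering is applied at request time and leaves the stored _source untouched. A minimal sketch of a search body that asks for only some fields is shown here; the helper name `search_body` and the field patterns are illustrative assumptions, and the snippet only prints the JSON instead of calling a cluster.

```python
import json


def search_body(query, includes=None, excludes=None):
    """Build a search body whose _source filter trims the returned document."""
    body = {"query": query}
    source_filter = {}
    if includes:
        source_filter["includes"] = list(includes)
    if excludes:
        source_filter["excludes"] = list(excludes)
    if source_filter:
        # Request-time filter: the document on disk keeps every field.
        body["_source"] = source_filter
    return body


# Hypothetical field patterns, chosen only for illustration.
body = search_body(
    {"match_all": {}},
    includes=["*.count", "meta.*"],
    excludes=["meta.description"],
)
print(json.dumps(body, indent=2))
```

Because the full document is still stored, reindexing and updates remain possible; only the response payload shrinks.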
The includes/excludes mapping parameters (which also accept wildcards) can be used as follows:
PUT logs
{
  "mappings": {
    "event": {
      "_source": {
        "includes": [
          "*.count",
          "meta.*"
        ],
        "excludes": [
          "meta.description",
          "meta.other.*"
        ]
      }
    }
  }
}

PUT logs/event/1
{
  "requests": {
    "count": 10,
    "foo": "bar"
  },
  "meta": {
    "name": "Some metric",
    "description": "Some metric description",
    "other": {
      "foo": "one",
      "baz": "two"
    }
  }
}

GET logs/event/_search
{
  "query": {
    "match": {
      "meta.other.foo": "one"
    }
  }
}
The fields matched by excludes (meta.description and meta.other.*) will be removed from the stored _source field.
We can still search on meta.other.foo, even though it is not in the stored _source.
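To make the effect of includes and excludes concrete, here is a simplified Python sketch of the pruning applied to the example document. It is not Elasticsearch's implementation: the function names are hypothetical, only leaf field paths are compared against the patterns, arrays are not handled, and objects left empty are dropped, so edge cases may differ from the real behavior.

```python
import re


def _matches(path, patterns):
    """Return True if the dotted path matches any pattern ('*' = any run of characters)."""
    for pattern in patterns:
        regex = ".*".join(re.escape(part) for part in pattern.split("*"))
        if re.fullmatch(regex, path):
            return True
    return False


def prune_source(doc, includes=(), excludes=(), prefix=""):
    """Return a copy of doc keeping only leaf fields allowed by includes/excludes."""
    kept = {}
    for key, value in doc.items():
        path = prefix + key
        if isinstance(value, dict):
            child = prune_source(value, includes, excludes, path + ".")
            if child:  # simplification: drop objects emptied by the filter
                kept[key] = child
        elif (not includes or _matches(path, includes)) and not _matches(path, excludes):
            kept[key] = value
    return kept


doc = {
    "requests": {"count": 10, "foo": "bar"},
    "meta": {
        "name": "Some metric",
        "description": "Some metric description",
        "other": {"foo": "one", "baz": "two"},
    },
}
stored = prune_source(
    doc,
    includes=["*.count", "meta.*"],
    excludes=["meta.description", "meta.other.*"],
)
print(stored)  # {'requests': {'count': 10}, 'meta': {'name': 'Some metric'}}
```

The pruning affects only what is stored and returned. The full document is still analyzed and indexed, which is why the query on meta.other.foo continues to match.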