elasticsearch _source字段的一些说明
_source field
The _source field contains the original JSON document body that was passed at index time. The_source field itself is not indexed (and thus is not searchable), but it is stored so that it can be returned when executing fetch requests, like get or search.
Disabling the _source field
Though very handy to have around, the source field does incur storage overhead within the index. For this reason, it can be disabled as follows:
PUT tweets
{
"mappings": {
"tweet": {
"_source": {
"enabled": false
}
}
}
}
Think before disabling the _source field
Users often disable the _source field without thinking about the consequences, and then live to regret it. If the _source field isn’t available then a number of features are not supported:
- The
update,update_by_query, andreindexAPIs. - On the fly highlighting.
- The ability to reindex from one Elasticsearch index to another, either to change mappings or analysis, or to upgrade an index to a new major version.
- The ability to debug queries or aggregations by viewing the original document used at index time.
- Potentially in the future, the ability to repair index corruption automatically.
If disk space is a concern, rather increase the compression level instead of disabling the _source.
The metrics use case
The metrics use case is distinct from other time-based or logging use cases in that there are many small documents which consist only of numbers, dates, or keywords. There are no updates, no highlighting requests, and the data ages quickly so there is no need to reindex. Search requests typically use simple queries to filter the dataset by date or tags, and the results are returned as aggregations.
In this case, disabling the _source field will save space and reduce I/O. It is also advisable to disable the _all field in the metrics case.
Including / Excluding fields from _source
An expert-only feature is the ability to prune the contents of the _source field after the document has been indexed, but before the _source field is stored.
Removing fields from the _source has similar downsides to disabling _source, especially the fact that you cannot reindex documents from one Elasticsearch index to another. Consider using source filtering instead.
The includes/excludes parameters (which also accept wildcards) can be used as follows:
PUT logs
{
"mappings": {
"event": {
"_source": {
"includes": [
"*.count",
"meta.*"
],
"excludes": [
"meta.description",
"meta.other.*"
]
}
}
}
} PUT logs/event/1
{
"requests": {
"count": 10,
"foo": "bar"
},
"meta": {
"name": "Some metric",
"description": "Some metric description",
"other": {
"foo": "one",
"baz": "two"
}
}
} GET logs/event/_search
{
"query": {
"match": {
"meta.other.foo": "one"
}
}
}
|
|
These fields will be removed from the stored |
|
|
We can still search on this field, even though it is not in the stored |
elasticsearch _source字段的一些说明的更多相关文章
- elasticsearch的store属性 vs _source字段
众所周知_source字段存储的是索引的原始内容,那store属性的设置是为何呢?es为什么要把store的默认取值设置为no?设置为yes是否是重复的存储呢? 我们将一个field的值写入es中,要 ...
- elasticsearch的store属性跟_source字段——如果你的文档长度很长,存储了_source,从_source中获取field的代价很大,你可以显式的将某些field的store属性设置为yes,否则设置为no
转自:http://kangrui.iteye.com/blog/2262860 众所周知_source字段存储的是索引的原始内容,那store属性的设置是为何呢?es为什么要把store的默认取值设 ...
- ElasticStack系列之八 & _source 字段
有很多人会有这样的一个疑问: _source字段存储的是索引的原始内容,那 store 属性的设置是为何呢?elasticsearch 为什么要把 store 的默认取值设置为 no?设置为 yes ...
- ES _source字段介绍——json文档,去掉的话无法更新部分文档,最重要的是无法reindex
摘自:https://es.xiaoleilu.com/070_Index_Mgmt/31_Metadata_source.html The _source field stores the JSON ...
- elasticsearch _source
默认地,Elasticsearch 在 _source 字段存储代表文档体的JSON字符串.和所有被存储的字段一样, _source 字段在被写入磁盘之前先会被压缩.这个字段的存储几乎总是我们想要的, ...
- [Elasticsearch] 多字段搜索 (三) - multi_match查询和多数字段 <译>
multi_match查询 multi_match查询提供了一个简便的方法用来对多个字段执行相同的查询. NOTE 存在几种类型的multi_match查询,其中的3种正好和在“了解你的数据”一节中提 ...
- Elasticsearch - 理解字段分析过程(_analyze与_explain)
我们经常会遇到问题.为什么指定的文档没有被搜索到.许多情况下, 这都归因于映射的定义和分析例程配置存在问题. 针对分析过程的调试,ElasticSearch提供了专用的REST API. _analy ...
- Elasticsearch 多字段搜索
查询很少是对一个字段做 match 查询,通常都是一个 query 查询多个字段,比如一个 doc 有 title.content.pagetag 等文本字段,要在这些字段查询含多个 term 的 q ...
- [Elasticsearch] 多字段搜索 (三) - multi_match查询和多数字段
multi_match查询 multi_match查询提供了一个简便的方法用来对多个字段执行相同的查询. NOTE 存在几种类型的multi_match查询,其中的3种正好和在"了解你的数据 ...
随机推荐
- 有问必答项目 -数据库设计文档(ask-utf-8)
有问必答项目 -数据库设计文档(ask-utf-8) 表前缀的使用 早期租用公共的服务器 一个数据库,保存多个项目(问答.电子商务.医院),为了区分这些项目,使用前缀分割 ask_ ec_ hospi ...
- jq:正则表达式
var checkNum = /^[A-Za-z0-9]+$/; 注意没有引号 checkNum.test(这里添加待匹配的字符串); var textNull=/^\s*$/; if(textN ...
- 高仿阴阳师官网轮播图效果的jQuery插件
代码地址如下:http://www.demodashi.com/demo/12302.html 插件介绍 这是一个根据阴阳师官网的轮播效果所扒下来的轮播插件,主要应用于定制个性化场景,目前源码完全公开 ...
- JOB Hunting 总结-----2013-11-5
从9月份开始的找工作大战,告一段落:其实早在10月中旬就已搞定,现在回想起这过去的几个月,很充实,很疲惫,很挫败又很有成就感! 开始找工作,对未来有过很多憧憬,也很迷茫,不知道自己的未来会在 ...
- 还需要学习的十二种CSS选择器
在前面的文章中,我们在介绍了<五种你必须彻底了解的CSS选择器>,现在向大家介绍,还需要学习的另外十二种CSS选择器.如果你还没有用过,就好好学习一下,如果你已经熟知了就当是温习. 一.X ...
- 怎么样自己动手写OS
虽然我现在并不是从事内核方向,却本着探索计算机本质的想法学习的内核,自从写完这个内核以后真的发现对很多东西的理解都更深一层,所以专研内核,对我现在的工作是很有帮助的.我个人强烈建议师弟师妹们尽早地啃一 ...
- 对Oracle的rownum生成时机的理解
在Oracle中,rownum和rowid是平时经常用到的.比如rownum经常用于分页查询,rowid用于排重或者快速定位到记录. 对rownum跟order by配合下的生成时机一直没有具体研究过 ...
- 华为p20:拍美景,听讲解,旅行更智能
华为P20轰轰烈烈地上市了,本来对手机并不感冒的我,看到身边的好友换了P20,不禁感慨:这个月的活又要白干了,全部都要上交给华为,因为这款手机完全戳中了旅游爱好者的痛点. 痛点一:丢弃笨重的单反,手机 ...
- Mac 常用属性
如果需要让隐藏的文件可见. 具体做法就是打开一个Terminal终端窗口,输入以下命令: 对于OS X Mavericks 10.9: defaults write com.apple.finder ...
- iOS 应用发布
本文转载至 http://blog.csdn.net/ysy441088327/article/details/7833579 苹果为广大的开发者提供了一个很好的应用生态环境 参考资料: 1:如何向 ...