What _field_stats does: https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-field-stats.html

It returns statistics for the fields of an index, as listed below, and it can also filter indices based on those statistics:

Field statistics

The field stats api is supported on string based, number based and date based fields and can return the following statistics per field:

max_doc

The total number of documents.

doc_count

The number of documents that have at least one term for this field, or -1 if this measurement isn’t available on one or more shards.

density

The percentage of documents that have at least one value for this field. This is a derived statistic and is based on the max_doc and doc_count.

sum_doc_freq

The sum of each term’s document frequency in this field, or -1 if this measurement isn’t available on one or more shards. Document frequency is the number of documents containing a particular term.

sum_total_term_freq

The sum of the term frequencies of all terms in this field across all documents, or -1 if this measurement isn’t available on one or more shards. Term frequency is the total number of occurrences of a term in a particular document and field.
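
To make the definitions above concrete, here is a minimal sketch that computes the same five statistics over a tiny in-memory corpus. The class name FieldStatsSketch and the sample documents are made up for illustration, and the integer-percentage form of density is an assumption; this is not how Elasticsearch itself gathers the numbers (it reads them from Lucene per shard).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;

// Hypothetical illustration class; not part of Elasticsearch.
class FieldStatsSketch {
    final long maxDoc;          // total number of documents
    long docCount = 0;          // documents with at least one term in the field
    long sumDocFreq = 0;        // sum over terms of the number of documents containing each term
    long sumTotalTermFreq = 0;  // total number of term occurrences in the field

    // Each inner list holds the terms of one document's field; an empty list means "no value".
    FieldStatsSketch(List<List<String>> docs) {
        maxDoc = docs.size();
        for (List<String> terms : docs) {
            if (terms.isEmpty()) {
                continue;
            }
            docCount++;
            sumTotalTermFreq += terms.size();
            // A distinct term contributes 1 to its document frequency per document.
            sumDocFreq += new HashSet<>(terms).size();
        }
    }

    // Derived from max_doc and doc_count; assumed here to be an integer percentage.
    long density() {
        if (docCount < 0 || maxDoc <= 0) {
            return -1;
        }
        return docCount * 100 / maxDoc;
    }

    public static void main(String[] args) {
        FieldStatsSketch s = new FieldStatsSketch(Arrays.asList(
                Arrays.asList("quick", "brown", "fox", "quick"),
                Arrays.asList("lazy", "fox"),
                Collections.<String>emptyList(),
                Arrays.asList("fox")));
        // prints: max_doc=4 doc_count=3 density=75 sum_doc_freq=6 sum_total_term_freq=7
        System.out.println("max_doc=" + s.maxDoc + " doc_count=" + s.docCount
                + " density=" + s.density() + " sum_doc_freq=" + s.sumDocFreq
                + " sum_total_term_freq=" + s.sumTotalTermFreq);
    }
}
```

Note that "quick" appears twice in the first document: it adds 2 to sum_total_term_freq but only 1 to sum_doc_freq.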

Field stats index constraints (this is what Kibana uses when it plots data over a time range)

Field stats index constraints allow omitting all field stats for indices that don’t match the constraint. An index constraint can exclude indices' field stats based on the min_value and max_value statistics. This option is only useful if the level option is set to indices. Fields that are not indexed (not searchable) are always omitted when an index constraint is defined.

For example index constraints can be useful to find out the min and max value of a particular property of your data in a time based scenario. The following request only returns field stats for the answer_count property for indices holding questions created in the year 2014:

POST _field_stats?level=indices
{
   "fields" : ["answer_count"],
   "index_constraints" : {
      "creation_date" : {
         "max_value" : {
            "gte" : "2014-01-01T00:00:00.000Z"
         },
         "min_value" : {
            "lt" : "2015-01-01T00:00:00.000Z"
         }
      }
   }
}
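
The constraint in the request above can be read as a range-overlap test: an index is kept only if its largest creation_date is on or after 2014-01-01 and its smallest creation_date is before 2015-01-01. The sketch below shows that logic in isolation. The class name, the index names, and the per-index min/max values are all hypothetical, and this is not the Elasticsearch implementation.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of Elasticsearch.
class IndexConstraintSketch {
    // Each map value is {min_value, max_value} of creation_date for that index.
    static List<String> matchingIndices(Map<String, Instant[]> minMaxByIndex,
                                        Instant gte, Instant lt) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Instant[]> e : minMaxByIndex.entrySet()) {
            Instant min = e.getValue()[0];
            Instant max = e.getValue()[1];
            boolean maxOk = !max.isBefore(gte); // "max_value": { "gte": ... }
            boolean minOk = min.isBefore(lt);   // "min_value": { "lt": ... }
            if (maxOk && minOk) {
                kept.add(e.getKey());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, Instant[]> stats = new LinkedHashMap<>();
        stats.put("questions-2013", new Instant[] {
                Instant.parse("2013-01-03T00:00:00Z"), Instant.parse("2013-12-30T00:00:00Z")});
        stats.put("questions-2014", new Instant[] {
                Instant.parse("2014-01-02T00:00:00Z"), Instant.parse("2014-12-29T00:00:00Z")});
        stats.put("questions-2015", new Instant[] {
                Instant.parse("2015-01-02T00:00:00Z"), Instant.parse("2015-06-01T00:00:00Z")});

        // prints: [questions-2014]
        System.out.println(matchingIndices(stats,
                Instant.parse("2014-01-01T00:00:00.000Z"),
                Instant.parse("2015-01-01T00:00:00.000Z")));
    }
}
```

An index whose data straddles a year boundary would also pass the test, since only an overlap with the requested range is required.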

The corresponding source code in ES 5.5: elasticsearch/search/lookup/IndexField.java

import org.apache.lucene.search.CollectionStatistics;
import org.elasticsearch.common.util.MinimalMap;

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

/**
 * Script interface to all information regarding a field.
 */
public class IndexField extends MinimalMap<String, IndexFieldTerm> {

    /*
     * IndexFieldTerm objects that represent the terms are stored in this map
     * when requested. Information such as frequency, doc frequency and
     * position information can be retrieved from the objects in this map.
     */
    private final Map<String, IndexFieldTerm> terms = new HashMap<>();

    // the name of this field
    private final String fieldName;

    /*
     * This holds the current reader. We need it to populate the field
     * statistics. We just delegate all requests there.
     */
    private final LeafIndexLookup indexLookup;

    /*
     * General field statistics such as the number of documents containing
     * the field.
     */
    private final CollectionStatistics fieldStats;

    public IndexField(String fieldName, LeafIndexLookup indexLookup) throws IOException {
        assert fieldName != null;
        this.fieldName = fieldName;

        assert indexLookup != null;
        this.indexLookup = indexLookup;

        fieldStats = this.indexLookup.getIndexSearcher().collectionStatistics(fieldName);
    }

    /* get the number of documents containing the field */
    public long docCount() throws IOException {
        return fieldStats.docCount();
    }

    /* get the sum of the number of words over all documents that were indexed */
    public long sumttf() throws IOException {
        return fieldStats.sumTotalTermFreq();
    }

    /*
     * get the sum of doc frequencies over all words that appear in any
     * document that has the field.
     */
    public long sumdf() throws IOException {
        return fieldStats.sumDocFreq();
    }

    // ... (rest of the class omitted)
}
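
As one illustration of how the three accessors above might be combined, the sketch below derives per-document averages from them. FieldStatsView is a hypothetical stand-in interface for the docCount(), sumttf() and sumdf() methods so the example runs without Elasticsearch. Returning -1 when a statistic is unavailable is an assumption that mirrors the -1 convention described earlier.

```java
// Hypothetical illustration class; not part of Elasticsearch.
class AverageFieldLengthSketch {

    // Stand-in for the three accessors of IndexField shown above.
    interface FieldStatsView {
        long docCount();
        long sumttf();
        long sumdf();
    }

    // Average number of term occurrences per document that has the field.
    static double averageFieldLength(FieldStatsView f) {
        if (f.docCount() <= 0 || f.sumttf() < 0) {
            return -1; // statistic unavailable
        }
        return (double) f.sumttf() / f.docCount();
    }

    // Average number of distinct terms per document that has the field.
    static double averageDistinctTerms(FieldStatsView f) {
        if (f.docCount() <= 0 || f.sumdf() < 0) {
            return -1; // statistic unavailable
        }
        return (double) f.sumdf() / f.docCount();
    }

    public static void main(String[] args) {
        FieldStatsView stats = new FieldStatsView() {
            public long docCount() { return 3; }
            public long sumttf() { return 7; }
            public long sumdf() { return 6; }
        };
        System.out.println("average field length:   " + averageFieldLength(stats));
        System.out.println("average distinct terms: " + averageDistinctTerms(stats));
    }
}
```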
