Histogram
folly/Histogram.h
Classes
Histogram
Histogram.h defines a simple histogram class, templated on the type of data you want to store. This class is useful for tracking a large stream of data points, where you want to remember the overall distribution of the data, but do not need to remember each data point individually.
Each histogram bucket stores the number of data points that fell in the bucket, as well as the overall sum of the data points in the bucket. Note that no overflow checking is performed, so if you have a bucket with a large number of very large values, it may overflow and cause inaccurate data for this bucket. As such, the histogram class is not well suited to storing data points with very large values. However, it works very well for smaller data points such as request latencies, request or response sizes, etc.
In addition to providing access to the raw bucket data, the Histogram class also provides methods for estimating percentile values. This allows you to estimate the median value (the 50th percentile) and other values such as the 95th or 99th percentiles.
All of the buckets have the same width. The number of buckets and the bucket width are fixed for the lifetime of the histogram, so you need to know your expected data range ahead of time in order to get accurate statistics. The histogram does keep one bucket to store all data points that fall below the histogram minimum, and one bucket for the data points above the maximum. However, because these two buckets have no meaningful lower or upper bound, percentile estimates that land in them may be inaccurate.
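The bucket layout described above can be sketched in a few lines. This is only an illustration of the arithmetic, not folly's implementation: the helper names numBucketsFor and bucketIndexFor are hypothetical, and the sketch assumes the tracked range is half-open, so a value equal to the maximum lands in the "above max" bucket.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not part of folly: total number of buckets for a
// histogram covering [min, max) with fixed-width buckets, including the
// two extra buckets for values below min and above max.
inline std::size_t numBucketsFor(int64_t bucketSize, int64_t min, int64_t max) {
  assert(bucketSize > 0 && min < max);
  // Round up so that a partial final bucket still gets its own slot.
  const std::size_t regular =
      static_cast<std::size_t>((max - min + bucketSize - 1) / bucketSize);
  return regular + 2;
}

// Hypothetical helper, not part of folly: maps a value to its bucket index.
// Index 0 is the "below min" bucket; the last index is the "above max" bucket.
inline std::size_t bucketIndexFor(
    int64_t value, int64_t bucketSize, int64_t min, int64_t max) {
  if (value < min) {
    return 0;
  }
  if (value >= max) {
    return numBucketsFor(bucketSize, min, max) - 1;
  }
  // Regular buckets start at index 1.
  return static_cast<std::size_t>((value - min) / bucketSize) + 1;
}
```

With a bucket width of 100 and a range of 0 to 5000, this gives 50 regular buckets plus the two out-of-range buckets.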
HistogramBuckets
The Histogram class is built on top of HistogramBuckets. HistogramBuckets provides an API very similar to Histogram, but allows a user-defined bucket class. This allows users to implement more complex histogram types that store more than just the count and sum in each bucket.
When computing percentile estimates, HistogramBuckets accepts user-defined functions for computing the average value and the data count in each bucket. This allows you to define more complex buckets that offer several different ways of computing the average value and the count.
For example, one use case could be tracking timeseries data in each bucket. Each set of timeseries data can have independent data in the bucket, which can show how the data distribution is changing over time.
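To make the idea of a user-supplied count function concrete, here is a minimal sketch. It does not use HistogramBuckets itself: WindowedBucket and percentileBucketIndex are hypothetical names invented for this illustration, and the sketch only locates the bucket that contains a given percentile.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical bucket type that keeps two independent sets of counts.
struct WindowedBucket {
  uint64_t recentCount = 0;
  uint64_t allTimeCount = 0;
};

// Returns the index of the first bucket at which the running count reaches
// pct of the total. countFn decides which count is read from each bucket,
// so the same buckets can answer the question for different data sets.
template <typename Bucket, typename CountFn>
std::size_t percentileBucketIndex(
    const std::vector<Bucket>& buckets, double pct, CountFn countFn) {
  assert(pct >= 0.0 && pct <= 1.0);
  uint64_t total = 0;
  for (const auto& b : buckets) {
    total += countFn(b);
  }
  if (total == 0) {
    return 0;  // Convention chosen for this sketch when there is no data.
  }
  const double target = pct * static_cast<double>(total);
  uint64_t running = 0;
  for (std::size_t i = 0; i < buckets.size(); ++i) {
    running += countFn(buckets[i]);
    if (static_cast<double>(running) >= target) {
      return i;
    }
  }
  return buckets.size() - 1;
}
```

Passing a function that returns recentCount or one that returns allTimeCount yields the median bucket for the recent window or for all data, without storing two separate histograms.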
Example Usage
Say we have code that sends many requests to remote services, and we want to generate a histogram showing how long the requests take. The following code initializes a histogram with 50 buckets, tracking values between 0 and 5000. (There are 50 buckets because the bucket width is specified as 100. If the histogram range is not an even multiple of the bucket width, the last bucket will simply be shorter than the others.)
// Arguments: bucket width, minimum, maximum.
folly::Histogram<int64_t> latencies(100, 0, 5000);
The addValue() method is used to add values to the histogram. Each time a request finishes we can add its latency to the histogram:
latencies.addValue(now - startTime);
You can access each of the histogram buckets to display the overall distribution. Note that bucket 0 tracks all data points that were below the specified histogram minimum, and the last bucket tracks the data points that were above the maximum.
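unsigned int numBuckets = latencies.getNumBuckets();
// Bucket 0 holds everything below the histogram minimum.
cout << "Below min: " << latencies.getBucketByIndex(0).count << "\n";
// Buckets 1 through numBuckets - 2 are the regular, fixed-width buckets.
for (unsigned int n = 1; n < numBuckets - 1; ++n) {
  cout << latencies.getBucketMin(n) << "-" << latencies.getBucketMax(n)
       << ": " << latencies.getBucketByIndex(n).count << "\n";
}
// The last bucket holds everything above the histogram maximum.
cout << "Above max: "
     << latencies.getBucketByIndex(numBuckets - 1).count << "\n";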
You can also use the getPercentileEstimate() method to estimate the value at the Nth percentile in the distribution. For example, to estimate the median, as well as the 95th and 99th percentile values:
int64_t median = latencies.getPercentileEstimate(0.5);
int64_t p95 = latencies.getPercentileEstimate(0.95);
int64_t p99 = latencies.getPercentileEstimate(0.99);
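To give an intuition for how a percentile can be estimated from bucket counts alone, here is a simplified sketch. It is not how getPercentileEstimate() is implemented: the function estimatePercentile is a hypothetical name, it ignores the below-min and above-max buckets, and it assumes the points are spread evenly within each bucket so that it can interpolate linearly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of folly. counts[i] is the number of points
// in regular bucket i, which covers [min + i*width, min + (i+1)*width).
// Assumes points are uniformly distributed inside each bucket.
inline double estimatePercentile(
    const std::vector<uint64_t>& counts, double min, double width, double pct) {
  assert(pct >= 0.0 && pct <= 1.0 && width > 0.0);
  uint64_t total = 0;
  for (uint64_t c : counts) {
    total += c;
  }
  if (total == 0) {
    return min;  // Convention chosen for this sketch when there is no data.
  }
  const double target = pct * static_cast<double>(total);
  double below = 0.0;  // Number of points in the buckets already passed.
  for (std::size_t i = 0; i < counts.size(); ++i) {
    const double c = static_cast<double>(counts[i]);
    if (c > 0.0 && below + c >= target) {
      // Fraction of the way through this bucket where the target rank falls.
      const double fraction = (target - below) / c;
      return min + (static_cast<double>(i) + fraction) * width;
    }
    below += c;
  }
  return min + static_cast<double>(counts.size()) * width;
}
```

The sketch also shows why wider buckets give coarser estimates: the result can be off by up to one bucket width when the data inside a bucket is not evenly spread.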
Thread Safety
Note that Histogram and HistogramBuckets objects are not thread-safe. If you wish to access a single Histogram from multiple threads, you must perform your own locking to ensure that multiple threads do not access it at the same time.
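One way to do that locking is to keep the histogram behind a mutex and only touch it through a helper that holds the lock. The following is a minimal sketch under that assumption (it relies on C++14 return type deduction): LockedHistogram is a hypothetical wrapper, and CountingHistogram is a stand-in used only so the example compiles without folly.

```cpp
#include <cstdint>
#include <mutex>
#include <utility>

// Stand-in for a histogram type, used only to keep this sketch self-contained.
struct CountingHistogram {
  uint64_t count = 0;
  int64_t sum = 0;
  void addValue(int64_t value) {
    ++count;
    sum += value;
  }
};

// Hypothetical wrapper: every access to the wrapped object goes through
// withLock(), so no two threads can touch it at the same time.
template <typename Hist>
class LockedHistogram {
 public:
  template <typename... Args>
  explicit LockedHistogram(Args&&... args)
      : hist_(std::forward<Args>(args)...) {}

  // Runs fn with exclusive access to the histogram and returns its result.
  template <typename Fn>
  auto withLock(Fn&& fn) {
    std::lock_guard<std::mutex> guard(mutex_);
    return std::forward<Fn>(fn)(hist_);
  }

 private:
  std::mutex mutex_;
  Hist hist_;
};
```

In real code the template argument would be folly::Histogram<int64_t>, constructed with the bucket width, minimum, and maximum, and each thread would record a latency with something like withLock([&](auto& h) { h.addValue(latency); }). A single mutex serializes all writers, so for very hot paths it may be preferable to keep one histogram per thread and combine them when reporting.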