FBVector
folly/FBVector.h
Simply replacing std::vector with folly::fbvector (after having included the folly/FBVector.h header file) will improve the performance of your C++ code using vectors with common coding patterns. The improvements are always non-negative, almost always measurable, frequently significant, sometimes dramatic, and occasionally spectacular.
Sample
folly::fbvector<int> numbers({0, 1, 2, 3});
numbers.reserve(10);
for (int i = 4; i < 10; i++) {
  numbers.push_back(i * 2);
}
assert(numbers[6] == 12);
Motivation
std::vector is the stalwart abstraction many use for dynamically-allocated arrays in C++. It is also the best known and most used of all containers. It may therefore seem a surprise that std::vector leaves important - and sometimes vital - efficiency opportunities on the table. This document explains how our own drop-in abstraction fbvector improves key performance aspects of std::vector. Refer to folly/test/FBVectorTest.cpp for a few benchmarks.
Memory Handling
It is well known that std::vector grows exponentially (at a constant factor) in order to avoid quadratic growth performance. The trick is choosing a good factor. Any factor greater than 1 ensures O(1) amortized append complexity towards infinity. But a factor that's too small (say, 1.1) causes frequent vector reallocation, and one that's too large (say, 3 or 4) forces the vector to consume much more memory than needed.
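The trade-off can be made concrete with a small simulation. The sketch below is not fbvector code; `GrowthStats` and `simulateAppends` are hypothetical names, and the model assumes a vector that starts empty and multiplies its capacity by a fixed factor (rounding down, but always growing by at least one slot).

```cpp
#include <cassert>
#include <cstddef>

struct GrowthStats {
  std::size_t reallocations;  // how many times the buffer had to grow
  std::size_t finalCapacity;  // capacity after the last append
};

// Simulates n appends into an initially empty vector that multiplies its
// capacity by `factor` whenever it runs out of room. The product is rounded
// down, and the capacity always grows by at least one slot.
GrowthStats simulateAppends(std::size_t n, double factor) {
  GrowthStats stats{0, 0};
  for (std::size_t size = 0; size < n; ++size) {
    if (size == stats.finalCapacity) {
      std::size_t grown =
          static_cast<std::size_t>(stats.finalCapacity * factor);
      stats.finalCapacity =
          grown > stats.finalCapacity ? grown : stats.finalCapacity + 1;
      ++stats.reallocations;
    }
  }
  return stats;
}
```

For 1100 appends, this model reallocates 12 times with a factor of 2 and ends at capacity 2048, while a factor of 4 reallocates only 7 times but ends at capacity 4096, almost four times what is needed. A factor of 1.1 keeps slack low at the price of far more reallocations.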
The initial HP implementation by Stepanov used a growth factor of 2; i.e., whenever you'd push_back into a vector without there being room, it would double the current capacity. This was not a good choice: it can be mathematically proven that a growth factor of 2 is rigorously the worst possible because it never allows the vector to reuse any of its previously-allocated memory. Despite other compilers reducing the growth factor to 1.5, gcc has staunchly maintained its factor of 2. This makes std::vector cache-unfriendly and memory-manager-unfriendly.
To see why that's the case, consider a large vector of capacity C. When there's a request to grow the vector, the vector (assuming no in-place resizing, see the appropriate section in this document) will allocate a chunk of memory next to its current chunk, copy its existing data to the new chunk, and then deallocate the old chunk. So now we have a chunk of size C followed by a chunk of size k * C. Continuing this process we'll then have a chunk of size k * k * C to the right and so on. That leads to a series of the form (using ^^ for power):
C, C*k, C*k^^2, C*k^^3, ...
If we choose k = 2 we know that every element in the series will be strictly larger than the sum of all previous ones because of the remarkable equality:
1 + 2^^1 + 2^^2 + 2^^3... + 2^^n = 2^^(n+1) - 1
This means that any new chunk requested will be larger than all previously used chunks combined, so the vector must crawl forward in memory; it can't move back to its deallocated chunks. But any number smaller than 2 guarantees that you'll at some point be able to reuse the previous chunks. For instance, choosing 1.5 as the factor allows memory reuse after 4 reallocations; 1.45 allows memory reuse after 3 reallocations; and 1.3 allows reuse after only 2 reallocations.
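The reuse counts quoted above can be checked with a few lines of arithmetic. This is a minimal sketch of the simplified model only: chunks sit back to back, nothing else is allocated in between, and a freed region can be reused as soon as its total size covers the next request. The function name `reallocationsBeforeReuse` is made up for the illustration.

```cpp
#include <cassert>

// Sizes are in units of the initial capacity C. Returns the number of
// reallocations performed before the next request fits into the memory
// freed so far, or -1 if that never happens within maxSteps.
int reallocationsBeforeReuse(double k, int maxSteps = 40) {
  double freed = 0.0;    // combined size of the chunks already deallocated
  double current = 1.0;  // size of the live chunk, initially C
  for (int done = 0; done < maxSteps; ++done) {
    double next = current * k;  // size of the chunk about to be requested
    if (next <= freed) {
      return done;  // the freed region is finally large enough
    }
    // The live chunk has to stay alive while its data is copied out, so it
    // joins the freed region only after the new chunk has been allocated.
    freed += current;
    current = next;
  }
  return -1;
}
```

Under these assumptions the function returns 4 for a factor of 1.5, 3 for 1.45, and 2 for 1.3, and it never finds a reusable region for a factor of 2.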
Of course, the above makes a number of simplifying assumptions about how the memory allocator works, but definitely you don't want to choose the theoretically absolute worst growth factor. fbvector uses a growth factor of 1.5. That does not impede good performance at small sizes because of the way fbvector cooperates with jemalloc (below).
The jemalloc Connection
Virtually all modern allocators allocate memory in fixed-size quanta that are chosen to minimize management overhead while at the same time offering good coverage at low slack. For example, an allocator may choose blocks of doubling size (32, 64, 128, 256, ...) up to 4096, and then blocks of size multiples of a page up until 1MB, and then 512KB increments and so on.
As discussed above, std::vector also needs to (re)allocate in quanta. The next quantum is usually defined in terms of the current size times the infamous growth constant. Because of this setup, std::vector has some slack memory at the end much like an allocated block has some slack memory at the end.
It doesn't take a rocket surgeon to figure out that an allocator-aware std::vector would be a marriage made in heaven: the vector could directly request blocks of "perfect" size from the allocator so there would be virtually no slack in the allocator. Also, the entire growth strategy could be adjusted to work perfectly with the allocator's own block growth strategy. That's exactly what fbvector does - it automatically detects the use of jemalloc and adjusts its reallocation strategy accordingly.
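To illustrate what requesting a "perfect" size means, here is a sketch that rounds a request up to the example size classes mentioned earlier (doubling up to 4096 bytes, page multiples up to 1MB, then 512KB steps). Those classes, and the names `roundUpToSizeClass` and `slackFreeCapacity`, are assumptions of this example; jemalloc's real size classes are finer-grained and this is not how fbvector queries them.

```cpp
#include <cassert>
#include <cstddef>

// Rounds a byte count up to the hypothetical size classes from the text.
std::size_t roundUpToSizeClass(std::size_t bytes) {
  constexpr std::size_t kPage = 4096;
  constexpr std::size_t kMegabyte = 1024 * 1024;
  constexpr std::size_t kLargeStep = 512 * 1024;
  if (bytes <= kPage) {
    std::size_t cls = 32;  // smallest class in this toy table
    while (cls < bytes) {
      cls *= 2;
    }
    return cls;
  }
  // Above one page: page multiples up to 1MB, 512KB increments afterwards.
  std::size_t step = (bytes <= kMegabyte) ? kPage : kLargeStep;
  return (bytes + step - 1) / step * step;
}

// Number of elements of type T that fit in the block the allocator would
// hand out anyway for a request of `wanted` elements.
template <class T>
std::size_t slackFreeCapacity(std::size_t wanted) {
  return roundUpToSizeClass(wanted * sizeof(T)) / sizeof(T);
}
```

A vector that asks for room for 100 chars would receive a 128-byte block under this table, so it may as well report a capacity of 128 and put the slack to use.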
But wait, there's more. Many memory allocators do not support in-place reallocation, although most of them could. This comes from the now notorious design of realloc() to opaquely perform either in-place reallocation or an allocate-memcpy-deallocate cycle. Such lack of control subsequently forced all clib-based allocator designs to avoid in-place reallocation, and that includes C++'s new and std::allocator. This is a major loss of efficiency because an in-place reallocation, being very cheap, may mean a much less aggressive growth strategy. In turn that means less slack memory and faster reallocations.
Object Relocation
One particularly sensitive topic about handling C++ values is that they are all conservatively considered non-relocatable. In contrast, a relocatable value would preserve its invariant even if its bits were moved arbitrarily in memory. For example, an int32 is relocatable because moving its 4 bytes would preserve its actual value, so the address of that value does not "matter" to its integrity.
C++'s assumption of non-relocatable values hurts everybody for the benefit of a few questionable designs. The issue is that moving a C++ object "by the book" entails (a) creating a new copy from the existing value; (b) destroying the old value. This is quite vexing and violates common sense; consider this hypothetical conversation between Captain Picard and an incredulous alien:
Incredulous Alien: "So, this teleporter, how does it work?"
Picard: "It beams people and arbitrary matter from one place to another."
Incredulous Alien: "Hmmm... is it safe?"
Picard: "Yes, but earlier models were a hassle. They'd clone the person to another location. Then the teleporting chief would have to shoot the original. Ask O'Brien, he was an intern during those times. A bloody mess, that's what it was."
Only a tiny minority of objects are genuinely non-relocatable:
Objects that use internal pointers, e.g.:
class Ew {
  char buffer[1024];
  char * pointerInsideBuffer;
public:
  Ew() : pointerInsideBuffer(buffer) {}
  ...
};
Objects that need to update "observers" that store pointers to them.
The first class of designs can always be redone at small or no cost in efficiency. The second class of objects should not be values in the first place - they should be allocated with new and manipulated using (smart) pointers. It is highly unusual for a value to have observers that alias pointers to it.
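The internal-pointer case is easy to demonstrate. The sketch below uses a struct shaped like Ew, with public members so the effect can be inspected; `survivesBitwiseMove` is a hypothetical helper written for this illustration.

```cpp
#include <cassert>
#include <cstring>

// Same shape as the Ew class above, with public members for inspection.
struct Ew {
  char buffer[1024] = {};
  char* pointerInsideBuffer;
  Ew() : pointerInsideBuffer(buffer) {}
};

// Copies the bytes of `from` over `to`, the way a memcpy-based relocation
// would, and reports whether the copy's internal pointer still refers to
// the copy's own buffer.
bool survivesBitwiseMove(const Ew& from, Ew& to) {
  std::memcpy(&to, &from, sizeof(Ew));
  return to.pointerInsideBuffer == to.buffer;
}
```

For two distinct objects the function returns false: the copied pointer still refers to the buffer of the source object, which is exactly the invariant a bitwise move breaks.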
Relocatable objects are of high interest to std::vector because such knowledge makes insertion into the vector and vector reallocation considerably faster: instead of going to Picard's copy-destroy cycle, relocatable objects can be moved around simply by using memcpy or memmove. This optimization can yield arbitrarily high wins in efficiency; for example, it transforms vector< vector<double> > or vector< hash_map<int, string> > from risky liabilities into highly workable compositions.
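The difference between the two strategies can be sketched as follows. This is not fbvector's implementation: `relocate` and `relocateImpl` are hypothetical names, and std::is_trivially_copyable stands in for a relocatability trait as a conservative assumption (many relocatable types are not trivially copyable).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>
#include <type_traits>
#include <utility>

// Relocatable path: one bulk copy, no constructors or destructors run.
template <class T>
void relocateImpl(T* from, T* to, std::size_t n, std::true_type) {
  std::memcpy(static_cast<void*>(to), static_cast<const void*>(from),
              n * sizeof(T));
}

// "By the book" path: construct each new object, then destroy the old one.
template <class T>
void relocateImpl(T* from, T* to, std::size_t n, std::false_type) {
  for (std::size_t i = 0; i < n; ++i) {
    ::new (static_cast<void*>(to + i)) T(std::move(from[i]));
    from[i].~T();
  }
}

// Moves n objects from `from` into the uninitialized storage at `to`,
// picking the strategy at compile time.
template <class T>
void relocate(T* from, T* to, std::size_t n) {
  relocateImpl(from, to, n, std::is_trivially_copyable<T>());
}
```

With a trait that also recognizes types such as a vector of vectors as relocatable, the first overload would handle them too, which is where the large wins come from.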
In order to allow fast relocation without risk, fbvector uses a trait folly::IsRelocatable defined in "folly/Traits.h". By default, folly::IsRelocatable<T>::value conservatively yields false. If you know that your type Widget is in fact relocatable, go right after Widget's definition and write this:
// at global namespace level
namespace folly {
  template <> struct IsRelocatable<Widget> : boost::true_type {};
}
If you don't do this, fbvector<Widget> will fail to compile with a static_assert.
Miscellaneous
fbvector uses a careful implementation all around to make sure it doesn't lose efficiency through the cracks. Some future directions may be in improving raw memory copying (memcpy is not an intrinsic in gcc and does not work terribly well for large chunks) and in furthering the collaboration with jemalloc. Have fun!