Replacing malloc and free in libc

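How malloc and free get replaced differs by platform. With glibc on Unix-like systems, tcmalloc uses weak aliases. glibc defines these entry points as weak symbols, and GCC supports the alias attribute, so each replacement takes this general form:

    void* malloc(size_t size) __THROW __attribute__ ((alias ("tc_malloc")));

As a result, every call to malloc jumps to the tc_malloc implementation.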
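To make the alias mechanism concrete, here is a minimal sketch that does not touch the real malloc. The names demo_tc_malloc, demo_malloc, g_demo_calls and AliasReachesImplementation are hypothetical and exist only for this illustration. It assumes GCC or Clang on an ELF platform such as Linux; the alias attribute is not available on every platform.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical names for illustration only; none of them belong to tcmalloc.
static int g_demo_calls = 0;

// Stand-in for tc_malloc: counts calls, then delegates to the system allocator.
extern "C" void* demo_tc_malloc(std::size_t size) {
  ++g_demo_calls;
  return std::malloc(size);
}

// demo_malloc becomes a second symbol name for the same function body.
// The alias target must be defined in this translation unit, and
// extern "C" keeps the symbol unmangled so the string matches it.
extern "C" void* demo_malloc(std::size_t size)
    __attribute__((alias("demo_tc_malloc")));

// Returns true if a call made through the alias ran demo_tc_malloc.
bool AliasReachesImplementation() {
  const int before = g_demo_calls;
  void* p = demo_malloc(32);
  assert(p != nullptr);
  std::free(p);
  return g_demo_calls == before + 1;
}
```

Calling AliasReachesImplementation() should return true, because demo_malloc and demo_tc_malloc resolve to the same code. tcmalloc applies the same idea to the real malloc symbol, which works because glibc leaves that symbol weak.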

Small allocations: do_malloc_small

Requests no larger than kMaxSize (256 KB) are classified as small allocations and handled by do_malloc_small, defined as follows:

    inline void* do_malloc_small(ThreadCache* heap, size_t size) {
      ASSERT(Static::IsInited());
      ASSERT(heap != NULL);
      size_t cl = Static::sizemap()->SizeClass(size);
      size = Static::sizemap()->class_to_size(cl);

      if ((FLAGS_tcmalloc_sample_parameter > 0) && heap->SampleAllocation(size)) {
        return DoSampledAllocation(size);
      } else {
        // The common case, and also the simplest.  This just pops the
        // size-appropriate freelist, after replenishing it if it's empty.
        return CheckedMallocResult(heap->Allocate(size, cl));
      }
    }

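The sizemap rounds the requested size up to a nearby size class. It manages these mappings, and the translation from the source size to the target size goes through three maps:

The ClassIndex mapping
The source code documents this mapping in some detail: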

    // Sizes <= 1024 have an alignment >= 8.  So for such sizes we have an
    // array indexed by ceil(size/8).  Sizes > 1024 have an alignment >= 128.
    // So for these larger sizes we have an array indexed by ceil(size/128).
    //
    // We flatten both logical arrays into one physical array and use
    // arithmetic to compute an appropriate index.  The constants used by
    // ClassIndex() were selected to make the flattening work.
    //
    // Examples:
    //   Size       Expression                      Index
    //   -------------------------------------------------------
    //   0          (0 + 7) / 8                     0
    //   1          (1 + 7) / 8                     1
    //   ...
    //   1024       (1024 + 7) / 8                  128
    //   1025       (1025 + 127 + (120<<7)) / 128   129
    //   ...
    //   32768      (32768 + 127 + (120<<7)) / 128  376

In short: sizes of 1024 bytes or less are rounded up in 8-byte steps, and sizes above 1024 bytes are rounded up in 128-byte steps.
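The index arithmetic from the comment can be written as a small function. This is a sketch reconstructed from the expressions in the comment table, not a copy of the tcmalloc source; the constant name kMaxSmallSize is made up here, and kMaxSize uses the 256 KB value mentioned earlier.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kMaxSmallSize = 1024;   // hypothetical name for the 1024-byte boundary
constexpr std::size_t kMaxSize = 256 * 1024;  // small-allocation limit quoted in the text

// Maps a request size to a slot in the single flattened array.
inline std::size_t ClassIndex(std::size_t size) {
  assert(size <= kMaxSize);  // only small allocations use this table
  if (size <= kMaxSmallSize) {
    return (size + 7) >> 3;  // ceil(size / 8): slots 0..128
  }
  // ceil(size / 128) plus an offset of 120. For 1025 the quotient
  // (1025 + 127) / 128 is 9, and adding 120 gives 129, the slot
  // right after index 128.
  return (size + 127 + (120 << 7)) >> 7;
}
```

With these definitions, ClassIndex(1024) is 128 and ClassIndex(1025) is 129, so the two logical arrays join without a gap or an overlap.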

class_array_ and class_to_size_

class_array_ and class_to_size_ are plain arrays. They are initialized in SizeMap::Init when the module is loaded:

    // Compute the size classes we want to use
    int sc = 1;   // Next size class to assign
    int alignment = kAlignment;
    CHECK_CONDITION(kAlignment <= kMinAlign);
    for (size_t size = kAlignment; size <= kMaxSize; size += alignment) {
      alignment = AlignmentForSize(size);
      CHECK_CONDITION((size % alignment) == 0);

      int blocks_to_move = NumMoveSize(size) / 4;
      size_t psize = 0;
      do {
        psize += kPageSize;
        // Allocate enough pages so leftover is less than 1/8 of total.
        // This bounds wasted space to at most 12.5%.
        while ((psize % size) > (psize >> 3)) {
          psize += kPageSize;
        }
        // Continue to add pages until there are at least as many objects in
        // the span as are needed when moving objects from the central
        // freelists and spans to the thread caches.
      } while ((psize / size) < (blocks_to_move));
      const size_t my_pages = psize >> kPageShift;

      if (sc > 1 && my_pages == class_to_pages_[sc-1]) {
        // See if we can merge this into the previous class without
        // increasing the fragmentation of the previous class.
        const size_t my_objects = (my_pages << kPageShift) / size;
        const size_t prev_objects = (class_to_pages_[sc-1] << kPageShift)
                                    / class_to_size_[sc-1];
        if (my_objects == prev_objects) {
          // Adjust last class to include this size
          class_to_size_[sc-1] = size;
          continue;
        }
      }

      // Add new class
      class_to_pages_[sc] = my_pages;
      class_to_size_[sc] = size;
      sc++;
    }


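The entries of class_to_size_ are built by repeatedly adding the alignment that applies to the current size. That alignment comes from alignment = AlignmentForSize(size); and the function looks like this: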

    int AlignmentForSize(size_t size) {
      int alignment = kAlignment;
      if (size > kMaxSize) {
        // Cap alignment at kPageSize for large sizes.
        alignment = kPageSize;
      } else if (size >= 128) {
        // Space wasted due to alignment is at most 1/8, i.e., 12.5%.
        alignment = (1 << LgFloor(size)) / 8;
      } else if (size >= kMinAlign) {
        // We need an alignment of at least 16 bytes to satisfy
        // requirements for some SSE types.
        alignment = kMinAlign;
      }
      // Maximum alignment allowed is page size alignment.
      if (alignment > kPageSize) {
        alignment = kPageSize;
      }
      CHECK_CONDITION(size < kMinAlign || alignment >= kMinAlign);
      CHECK_CONDITION((alignment & (alignment - 1)) == 0);
      return alignment;
    }

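LgFloor uses binary search to find the position of the highest set bit of a value, i.e. floor(log2(size)). Read off from the code above, the alignment rule can be simplified to the following formula:

    alignment(size) = kPageSize                   if size > kMaxSize
                    = 2^floor(log2(size)) / 8     if 128 <= size <= kMaxSize
                    = kMinAlign (16)              if kMinAlign <= size < 128
                    = kAlignment (8)              if size < kMinAlign

The result is additionally capped at kPageSize.

With this formula, class_to_size_[1] = 8, class_to_size_[2] = 16, class_to_size_[3] = 32, and so on.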
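The alignment rule can be tried out in isolation with the sketch below. It is an illustration rather than the tcmalloc source: the constants are assumptions (kAlignment = 8, kMinAlign = 16, an 8 KiB kPageSize, kMaxSize = 256 KB), and this LgFloor is one possible binary-search implementation that handles values up to 32 bits.

```cpp
#include <cassert>
#include <cstddef>

// Assumed values for this sketch; a given tcmalloc build may differ.
constexpr std::size_t kAlignment = 8;
constexpr std::size_t kMinAlign = 16;
constexpr std::size_t kPageSize = 8192;
constexpr std::size_t kMaxSize = 256 * 1024;

// floor(log2(n)) by binary search over shift widths 16, 8, 4, 2, 1.
inline int LgFloor(std::size_t n) {
  assert(n != 0 && n <= 0xFFFFFFFFu);
  int log = 0;
  for (int shift = 16; shift >= 1; shift /= 2) {
    const std::size_t upper = n >> shift;
    if (upper != 0) {  // the highest set bit lies in the upper part
      n = upper;
      log += shift;
    }
  }
  return log;
}

// Same branch structure as the AlignmentForSize listing above.
inline std::size_t AlignmentForSize(std::size_t size) {
  std::size_t alignment = kAlignment;
  if (size > kMaxSize) {
    alignment = kPageSize;
  } else if (size >= 128) {
    alignment = (std::size_t{1} << LgFloor(size)) / 8;
  } else if (size >= kMinAlign) {
    alignment = kMinAlign;
  }
  if (alignment > kPageSize) alignment = kPageSize;  // cap at one page
  assert((alignment & (alignment - 1)) == 0);        // must be a power of two
  return alignment;
}
```

Under these assumptions AlignmentForSize(8) is 8, AlignmentForSize(16) and AlignmentForSize(128) are both 16, and AlignmentForSize(256) is 32, which produces the 8, 16, 32, 48, ... progression of class sizes.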

class_array_ is initialized after class_to_size_:

    // Initialize the mapping arrays
    int next_size = 0;
    for (int c = 1; c < kNumClasses; c++) {
      const int max_size_in_class = class_to_size_[c];
      for (int s = next_size; s <= max_size_in_class; s += kAlignment) {
        class_array_[ClassIndex(s)] = c;
      }
      next_size = max_size_in_class + kAlignment;
    }

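Overall, ClassIndex generally works in 8-byte steps, the class_to_size_ entries generally advance in 16-byte steps (for small sizes), and class_array_ is what ties the two together.

Take a concrete example: when an application calls malloc(25), how much memory does tcmalloc actually hand out?

    size    ClassIndex          class_array_    class_to_size_
    25  ->  (25 + 7) / 8 = 4  ->  3           ->  32

The result is a 32-byte block.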
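The three-step lookup can be reproduced with a toy version of the size map. This is a simplified sketch, not the real SizeMap: ToySizeMap and AllocatedSize are hypothetical names, the constants are assumed (8-byte kAlignment, 16-byte kMinAlign, 8 KiB pages, 256 KB kMaxSize), and the class-merging step of SizeMap::Init is left out, so the set of classes for larger sizes can differ from what tcmalloc actually builds.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed constants for this sketch.
constexpr std::size_t kAlignment = 8;
constexpr std::size_t kMinAlign = 16;
constexpr std::size_t kPageSize = 8192;
constexpr std::size_t kMaxSize = 256 * 1024;

inline std::size_t ClassIndex(std::size_t s) {
  return s <= 1024 ? (s + 7) >> 3 : (s + 127 + (120 << 7)) >> 7;
}

inline std::size_t AlignmentForSize(std::size_t size) {
  std::size_t alignment = kAlignment;
  if (size >= 128) {
    std::size_t pow2 = 128;  // becomes the largest power of two <= size
    while (pow2 * 2 <= size) pow2 *= 2;
    alignment = pow2 / 8;
  } else if (size >= kMinAlign) {
    alignment = kMinAlign;
  }
  return alignment > kPageSize ? kPageSize : alignment;
}

// Hypothetical, simplified stand-in for SizeMap.
struct ToySizeMap {
  std::vector<std::size_t> class_to_size;  // size class -> rounded size
  std::vector<std::size_t> class_array;    // ClassIndex(size) -> size class

  ToySizeMap()
      : class_to_size(1, 0), class_array(ClassIndex(kMaxSize) + 1, 0) {
    // Step 1: one class per aligned size (the real Init also merges classes).
    std::size_t alignment = kAlignment;
    for (std::size_t size = kAlignment; size <= kMaxSize; size += alignment) {
      alignment = AlignmentForSize(size);
      class_to_size.push_back(size);
    }
    // Step 2: every 8-byte step up to a class's size maps to that class.
    std::size_t next_size = 0;
    for (std::size_t c = 1; c < class_to_size.size(); ++c) {
      const std::size_t max_size_in_class = class_to_size[c];
      for (std::size_t s = next_size; s <= max_size_in_class; s += kAlignment) {
        class_array[ClassIndex(s)] = c;
      }
      next_size = max_size_in_class + kAlignment;
    }
  }

  // Number of bytes actually handed out for a small request.
  std::size_t AllocatedSize(std::size_t request) const {
    assert(request <= kMaxSize);
    return class_to_size[class_array[ClassIndex(request)]];
  }
};
```

Constructing a ToySizeMap and calling AllocatedSize(25) walks the same path as the diagram: ClassIndex gives 4, class_array maps 4 to class 3, and class 3 has size 32.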

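class_to_pages_ and num_objects_to_move_

SizeMap contains two more maps: class_to_pages_ and num_objects_to_move_.

class_to_pages_ is used by the central free list. It records how many pages a size class takes from the page heap each time it needs more memory. It is also initialized in SizeMap::Init: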

    do {
      psize += kPageSize;
      // Allocate enough pages so leftover is less than 1/8 of total.
      // This bounds wasted space to at most 12.5%.
      while ((psize % size) > (psize >> 3)) {
        psize += kPageSize;
      }
      // Continue to add pages until there are at least as many objects in
      // the span as are needed when moving objects from the central
      // freelists and spans to the thread caches.
    } while ((psize / size) < (blocks_to_move));

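Two conditions determine the resulting page count:

1) The span must hold at least blocks_to_move objects. In the Init code above, blocks_to_move is NumMoveSize(size) / 4, a quarter of the number of objects moved between the central free list and a thread cache in one batch.

2) After the pages are carved into objects of this size, the leftover space (psize % size) must not exceed 1/8 of the span (psize >> 3), so at most 12.5% of the span is wasted.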
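The two conditions are easier to follow with numbers. The sketch below wraps the loop in a function; PagesPerSpan is a hypothetical name, blocks_to_move is passed in directly instead of being derived from NumMoveSize, and kPageShift = 13 (8 KiB pages) is an assumption.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kPageShift = 13;  // assumed: 8 KiB pages
constexpr std::size_t kPageSize = std::size_t{1} << kPageShift;

// Number of pages in a span for objects of `size` bytes, following the
// two conditions above. blocks_to_move is supplied by the caller here.
inline std::size_t PagesPerSpan(std::size_t size, std::size_t blocks_to_move) {
  assert(size > 0);
  std::size_t psize = 0;
  do {
    psize += kPageSize;
    // Condition 2: grow until the leftover is at most 1/8 of the span.
    while ((psize % size) > (psize >> 3)) {
      psize += kPageSize;
    }
    // Condition 1: grow until the span holds at least blocks_to_move objects.
  } while ((psize / size) < blocks_to_move);
  return psize >> kPageShift;
}
```

For example, PagesPerSpan(3072, 2) rejects a single page because 8192 % 3072 = 2048 exceeds 8192 / 8 = 1024, then accepts two pages, where the leftover of 1024 bytes is within the 2048-byte limit. PagesPerSpan(1024, 16) also returns 2, but for the other reason: one page wastes nothing yet holds only 8 objects.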

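Summary

SizeMap gathers every size-related map in tcmalloc into one place, so allocation behavior can be tuned by adjusting SizeMap alone. The open question is why the requested size is first mapped in 8-byte steps, then mapped again onto 16-byte steps, and finally looked up through two tables. My first idea was to map the source size directly in 16-byte steps:

    src size    index          dst size
    0           0              0
    1           1              16
    2           1              16
    n           (n+15)/16      (n+15)/16 * 16

This would be simpler and more intuitive to implement, and it would also get the job done. tcmalloc may have a deeper reason that I have not found yet.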
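For comparison, the direct 16-byte mapping from the table fits in two one-line functions. Round16Index and Round16Size are hypothetical names for this illustration; this shows only the alternative proposed here, not anything tcmalloc does.

```cpp
#include <cassert>
#include <cstddef>

// Index column of the table: ceil(n / 16).
inline std::size_t Round16Index(std::size_t n) {
  assert(n <= static_cast<std::size_t>(-1) - 15);  // guard against overflow
  return (n + 15) / 16;
}

// Destination size column: n rounded up to a multiple of 16.
inline std::size_t Round16Size(std::size_t n) {
  return Round16Index(n) * 16;
}
```

For the earlier malloc(25) example this scheme also yields 32 bytes, the same as the table-based lookup; the two approaches only diverge for sizes where the class step is not 16 bytes.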

TCMalloc Source Code Study (Part 2)