[转] 分代垃圾回收的 新旧代引用问题(原标题:Back To Basics: Generational Garbage Collection)
原文链接: https://blogs.msdn.microsoft.com/abhinaba/2009/03/02/back-to-basics-generational-garbage-collection/
This post is Part 8 in the series of posts on Garbage Collection (GC). Please see the index here.
One of the primary disadvantage discussed in the post on mark-sweep garbage collection is that it introduces very large system pauses when the entire heap is marked and swept. One of the primary optimization employed to solve this issue is employing generational garbage collection. This optimization is based on the following observations
- Most objects die young
- Over 90% garbage collected in a GC is newly created post the previous GC cycle
- If an object survives a GC cycle the chances of it becoming garbage in the short term is low and hence the GC wastes time marking it again and again in each cycle
The optimization based on the above observations is to segregate objects by age into multiple generations and collect each with different frequencies.
This scheme has proven to work rather well and is widely used in many modern systems (including .NET).
Detailed algorithm
The objects can be segregated into age based generations in different ways, e.g. by time of creation. However one common way is to consider a newly created object to be in Generation 0 (Gen0) and then if it is not collected by a cycle of garbage collection then it is promoted to the next higher generation, Gen1. Similarly if an object in Gen1 survives a GC then that gets promoted to Gen2.
Lower generations are collected more often. This ensures lower system pauses. The higher generation collection is triggered fewer times.
How many generations are employed, varies from system to system. In .NET 3 generations are used. Here for simplicity we will consider a 2 generation system but the concepts are easily extended to more than 2.
Let us consider that the memory is divided into two contiguous blocks, one for Gen1 and the other for Gen0. At start memory is allocated only from Gen0 area as follows
![]()
So we have 4 objects in Gen0. Now one of the references is released
![]()
Now if GC is fired it will use mark and sweep on Gen0 objects and cleanup the two objects that are not reachable. So the final state after cleaning up is
![]()
The two surviving objects are then promoted to Gen1. Promotion includes copying the two objects to Gen1 area and then updating the references to them
![]()
Now assume a whole bunch of allocation/de-allocation has happened. Since new allocations are in Gen0 the memory layout looks like
![]()
The whole purpose of segregating into generations is to reduce the number of objects to inspect for marking. So the first root is used for marking as it points to a Gen0 object. While using the second root the moment the marker sees that the reference is into a Gen1 object it does not follow the reference, speeding up marking process.
Now if we only consider the Gen0 objects for marking then we only mark the objects indicated by ✓. The marking algorithm will fail to locate the Gen1 to Gen0 references (shown in red) and some object marking will be left out leading to dangling pointers.
One of the way to handle this is to somehow record all references from Gen1 to Gen0 (way to do that is in the next section) and then use these objects as new roots for the marking phase. If we use this method then we get a new set of marked objects as follows
![]()
This now gives the full set of marked objects. Post another GC and promotion of surviving objects to higher generation we get
![]()
At this point the next cycle as above resumes…
Tracking higher to lower generation references
In general applications there are very few (some studies show < 1% of all references) of these type of references. However, they all need to be recorded. There are two general approached of doing this
Write barrier + card-table
First a table called a card table is created. This is essentially an array of bits. Each bit indicates if a given range of memory is dirty (contains a write to a lower generation object). E.g. we can use a single bit to mark a 4KB block.
![]()
Whenever an reference assignment is made in user code, instead of directly doing the assignment it is redirected to a small thunk (incase .NET the JITter does this). The thunk compares the assignees address to that of the Gen1 memory range. If the range falls within, then the thunk updates the corresponding bit in the card table to indicate that the range which the bit covers is now dirty (shown as red).
First marking uses only Gen0 objects. Once this is over it inspects the card table to locate dirty blocks. Then it considers every object in that dirty block to be new roots and marks objects using it.
As you can see that the 4KB block is just an optimization to reduce the size of the card table. If we increase the granularity to be per object then we can save marking time by having to consider only one object (in contrast to all in 4KB range) but our card table size will also significantly increase.
One of the flip sides is that the thunk makes reference assignment slower.
HW support
Hardware support also uses card table but instead of using thunk it simply uses special features exposed by the HW+OS for notification of dirty writes. E.g. it can use the Win32 api GetWriteWatch to get the list of pages where write happened and use that information to get the card table entries.
However, these kind of support is not available on all platforms (or older version of platforms) and hence is less utilized.
[转] 分代垃圾回收的 新旧代引用问题(原标题:Back To Basics: Generational Garbage Collection)的更多相关文章
- Java分代垃圾回收机制:年轻代/年老代/持久代(转)
虚拟机中的共划分为三个代:年轻代(Young Generation).年老点(Old Generation)和持久代(Permanent Generation).其中持久代主要存放的是Java类的类信 ...
- Java中的分代垃圾回收策略
一.分代GC的理论基础 分代的垃圾回收策略,是基于这样一个事实:不同的对象的生命周期是不一样的.因此,不同生命周期的对象可以采取不同的收集方式,以便提高回收效率. 在Java程序运行的过程中,会产生大 ...
- JVM分代垃圾回收策略的基础概念
由于不同对象的生命周期不一样,因此在JVM的垃圾回收策略中有分代这一策略.本文介绍了分代策略的目标,如何分代,以及垃圾回收的触发因素. 文章总结了JVM垃圾回收策略为什么要分代,如何分代,以及垃圾回收 ...
- JVM调优总结(五)-分代垃圾回收详述1
为什么要分代 分代的垃圾回收策略,是基于这样一个事实:不同的对象的生命周期是不一样的.因此,不同生命周期的对象可以采取不同的收集方式,以便提高回收效率. 在Java程序运行的过程中,会产生大量的对象, ...
- JVM调优总结:分代垃圾回收详述
为什么要分代 分代的垃圾回收策略,是基于这样一个事实:不同的对象的生命周期是不一样的.因此,不同生命周期的对象可以采取不同的收集方式,以便提高回收效率. 在Java程序运行的过程中,会产生大量的对象, ...
- java虚拟机学习-JVM调优总结-分代垃圾回收详述(9)
为什么要分代 分代的垃圾回收策略,是基于这样一个事实:不同的对象的生命周期是不一样的.因此,不同生命周期的对象可以采取不同的收集方式,以便提高回收效率. 在Java程序运行的过程中,会产生大量的对象, ...
- JVM堆内存控制/分代垃圾回收
JVM的堆的内存, 是通过下面面两个参数控制的 -Xms 最小堆的大小, 也就是当你的虚拟机启动后, 就会分配这么大的堆内存给你 -Xmx 是最大堆的大小 当最小堆占满后,会尝试进行GC,如果GC之后 ...
- JVM调优总结(4):分代垃圾回收
为什么要分代 分代的垃圾回收策略,是基于这样一个事实:不同的对象的生命周期是不一样的.因此,不同生命周期的对象可以采取不同的收集方式,以便提高回收效率. 在Java程序运行的过程中,会产生大量的对象, ...
- Java 垃圾回收机制 (分代垃圾回收ZGC)
什么是自动垃圾回收? 自动垃圾回收是一种在堆内存中找出哪些对象在被使用,还有哪些对象没被使用,并且将后者删掉的机制.所谓使用中的对象(已引用对象),指的是程序中有指针指向的对象:而未使用中的对象(未引 ...
随机推荐
- Vue数据交互
注意:本地json文件必须放在static目录下面,读取或交互数据前,请先安装vue-resource.点击前往 -->(vue-resource中文文档) 一.Vue读取本地JSON数据 c ...
- 编写函数,接受一个string,返回一个bool值,指出string是否有5个或者更多字符,使用此函数打印出长度大于等于5的元素
#include <algorithm> using namespace std; bool isFive(const string& s1) { return s1.size() ...
- python3 报错
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: ...
- 通信导论-IP数据网络基础(4)
IP地址的编址方法--IP地址+掩码地址=网络地址 分类的IP地址 每一类地址都由两个固定长度的字段组成,其中一个字段是网络号 net-id,标志主机或路由器所连接到的网络,另一个字段则是主机号 ho ...
- Retrofit 2.0 上传文件
1.用MultipartBody.Part的方式上传文件(单文件上传)(表单方式) @Multipart @POST("xxx/xxx") Call<ResponseBody ...
- node.js 从入门到。。。
本人安装环境为 mac ,所以只记录了 mac 下的操作步骤 1.安装 node node的国内下载地址:http://nodejs.cn/download/ 安装之后,在终端输入指令 node -v ...
- Python之路(第三十四篇) 网络编程:验证客户端合法性
一.验证客户端合法性 如果你想在分布式系统中实现一个简单的客户端链接认证功能,又不像SSL那么复杂,那么利用hmac+加盐的方式来实现. 客户端验证的总的思路是将服务端随机产生的指定位数的字节发送到客 ...
- postgresql 日期生成流水号
--表结构 DROP TABLE if exists public.sys_tabid; CREATE TABLE public.sys_tabid ( id serial NOT NULL , ty ...
- Java多线程编程核心技术(一)
先提一下进程,可以理解为操作系统管理的基本单元. 而线程呢,在进程中独立运行的子任务.举个栗子:QQ.exe运行时有很多子任务在同时运行,比如好友视频线程.下载视频线程.传输数据线程等等. 多线程的优 ...
- python 基础———— 字符串常用的调用 (图2)
1. replace 2. join 3.split 4 rsplit 5. strip : 去除字符串左右两边特定(指定)的字符 7. rstrip : 去除右边特定(指定)的字符 8. l ...