[Repost] The old-to-new generation reference problem in generational garbage collection (original title: Back To Basics: Generational Garbage Collection)
Original link: https://blogs.msdn.microsoft.com/abhinaba/2009/03/02/back-to-basics-generational-garbage-collection/
This post is Part 8 in the series of posts on Garbage Collection (GC). Please see the index here.
One of the primary disadvantages discussed in the post on mark-sweep garbage collection is that it introduces very long system pauses, because the entire heap is marked and swept. One of the primary optimizations used to address this is generational garbage collection. It is based on the following observations:
- Most objects die young
- Over 90% of the garbage collected in a GC cycle was created after the previous GC cycle
- If an object survives a GC cycle, the chance of it becoming garbage in the short term is low, so the GC wastes time marking it again and again in each cycle
The optimization based on the above observations is to segregate objects by age into multiple generations and collect each with different frequencies.
This scheme has proven to work rather well and is widely used in many modern systems (including .NET).
Detailed algorithm
Objects can be segregated into age-based generations in different ways, e.g. by time of creation. One common way is to place a newly created object in Generation 0 (Gen0); if it is not collected by a garbage collection cycle, it is promoted to the next higher generation, Gen1. Similarly, if an object in Gen1 survives a GC, it is promoted to Gen2.
Lower generations are collected more often. This ensures lower system pauses. The higher generation collection is triggered fewer times.
The number of generations varies from system to system. .NET uses 3 generations. Here, for simplicity, we will consider a 2-generation system, but the concepts extend easily to more than 2.
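The promotion rule described above can be sketched in a few lines of Python. This is only an illustration: `Obj`, `promote_survivors` and `MAX_GEN` are hypothetical names, not part of any real runtime, and a real collector would only promote objects from the generation it just collected.

```python
from dataclasses import dataclass

MAX_GEN = 2  # three generations, Gen0..Gen2 (the .NET count mentioned above)


@dataclass
class Obj:
    name: str
    generation: int = 0  # a newly created object starts in Gen0


def promote_survivors(survivors):
    """Move every object that survived a collection up one generation."""
    for obj in survivors:
        # The oldest generation has nowhere higher to go, so cap at MAX_GEN.
        obj.generation = min(obj.generation + 1, MAX_GEN)


a = Obj("a")
promote_survivors([a])  # survives its first GC: Gen0 -> Gen1
promote_survivors([a])  # survives another GC:  Gen1 -> Gen2
promote_survivors([a])  # already in the oldest generation, stays there
```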
Let us assume that memory is divided into two contiguous blocks, one for Gen1 and the other for Gen0. At the start, memory is allocated only from the Gen0 area, as follows:
![]()
So we have 4 objects in Gen0. Now one of the references is released
![]()
If a GC is triggered now, it will run mark and sweep on the Gen0 objects and clean up the two objects that are not reachable. The state after cleanup is:
![]()
The two surviving objects are then promoted to Gen1. Promotion involves copying the two objects to the Gen1 area and then updating the references to them:
![]()
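The Gen0 collection and promotion steps walked through above can be modelled roughly as follows. The object graph (A → B and C → D, with one root each for A and C) is an assumed layout chosen to match the description of four objects of which two become unreachable; the names `Obj` and `collect_gen0` are hypothetical, and "copying" to Gen1 is reduced to moving entries between two Python lists.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.refs = []   # outgoing references to other objects
        self.gen = 0     # every new object starts in Gen0


def collect_gen0(roots, gen0, gen1):
    """Mark-sweep Gen0 from the roots, then promote the survivors to Gen1."""
    marked = set()
    stack = [r for r in roots if r.gen == 0]
    while stack:
        obj = stack.pop()
        if id(obj) in marked:
            continue
        marked.add(id(obj))
        stack.extend(c for c in obj.refs if c.gen == 0)

    survivors = [o for o in gen0 if id(o) in marked]
    for o in survivors:          # stand-in for copying into the Gen1 area
        o.gen = 1
    gen1.extend(survivors)
    gen0.clear()                 # everything left unmarked is swept
    return survivors


a, b, c, d = (Obj(n) for n in "ABCD")
a.refs.append(b)
c.refs.append(d)
gen0, gen1 = [a, b, c, d], []
roots = [a, c]

roots.remove(c)                  # one of the references is released
collect_gen0(roots, gen0, gen1)
print([o.name for o in gen1])    # ['A', 'B']
```

Gen1 is empty at this point, so the sketch can safely ignore references from Gen1 into Gen0.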
Now assume a whole bunch of allocation/de-allocation has happened. Since new allocations are in Gen0 the memory layout looks like
![]()
The whole purpose of segregating objects into generations is to reduce the number of objects inspected during marking. The first root is used for marking because it points to a Gen0 object. For the second root, the moment the marker sees that the reference points into a Gen1 object, it stops following it, which speeds up the marking process.
However, if we only consider the Gen0 objects for marking, then we only mark the objects indicated by ✓. The marking algorithm fails to locate the Gen1 to Gen0 references (shown in red), so some live Gen0 objects are left unmarked and would be swept, leading to dangling pointers.
One way to handle this is to record all references from Gen1 to Gen0 (how to do that is covered in the next section) and then use the Gen1 objects holding them as additional roots for the marking phase. With this method we get a new set of marked objects, as follows:
![]()
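A small sketch makes the difference between the two marking passes concrete. It assumes a toy object graph (`young1`, `young2`, `young3`, `old1` are invented for illustration) and a hypothetical `mark_gen0` function; the `recorded_old` argument stands for whatever mechanism has recorded the Gen1 objects that point into Gen0.

```python
class Obj:
    def __init__(self, name, gen):
        self.name = name
        self.gen = gen
        self.refs = []


def mark_gen0(roots, recorded_old=()):
    """Return the names of the Gen0 objects found without tracing through Gen1."""
    marked = set()
    # Roots that point into Gen1 are not followed.
    stack = [r for r in roots if r.gen == 0]
    # Recorded Gen1 objects act as extra roots: their Gen0 children are live.
    for old in recorded_old:
        stack.extend(c for c in old.refs if c.gen == 0)
    while stack:
        obj = stack.pop()
        if obj.name in marked:
            continue
        marked.add(obj.name)
        stack.extend(c for c in obj.refs if c.gen == 0)
    return marked


young1, young2, young3 = Obj("young1", 0), Obj("young2", 0), Obj("young3", 0)
old1 = Obj("old1", 1)
old1.refs.append(young2)      # the problematic Gen1 -> Gen0 reference
young2.refs.append(young3)
roots = [young1, old1]

naive = mark_gen0(roots)                        # misses young2 and young3
fixed = mark_gen0(roots, recorded_old=[old1])   # finds all three
```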
This now gives the full set of marked objects. After another GC and promotion of the surviving objects to the higher generation, we get:
![]()
At this point the cycle described above repeats.
Tracking higher to lower generation references
In typical applications there are very few references of this type (some studies show < 1% of all references). However, they all need to be recorded. There are two general approaches to doing this.
Write barrier + card-table
First, a table called a card table is created. This is essentially an array of bits. Each bit indicates whether a given range of memory is dirty (contains a write to a lower generation object). For example, a single bit can be used to mark a 4KB block.
![]()
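As a rough sketch of the data structure, the card table can be modelled as a bit array indexed by `(address - heap_base) // 4KB`. The `CardTable` class, its method names and the example addresses are all assumptions made for illustration, not the layout used by any particular runtime.

```python
CARD_SIZE = 4 * 1024  # one bit covers a 4KB block, as in the example above


class CardTable:
    def __init__(self, heap_base, heap_size):
        self.heap_base = heap_base
        self.n_cards = (heap_size + CARD_SIZE - 1) // CARD_SIZE
        self.bits = bytearray((self.n_cards + 7) // 8)  # 8 cards per byte

    def card_index(self, address):
        """Map a heap address to the index of the card that covers it."""
        return (address - self.heap_base) // CARD_SIZE

    def mark_dirty(self, address):
        i = self.card_index(address)
        self.bits[i // 8] |= 1 << (i % 8)

    def is_dirty(self, i):
        return bool(self.bits[i // 8] & (1 << (i % 8)))

    def dirty_cards(self):
        return [i for i in range(self.n_cards) if self.is_dirty(i)]


table = CardTable(heap_base=0x10000, heap_size=64 * 1024)  # 16 cards, 2 bytes
table.mark_dirty(0x10000 + 5000)   # byte offset 5000 lies in the second card
print(table.dirty_cards())         # [1]
```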
Whenever a reference assignment is made in user code, instead of performing the assignment directly, it is redirected to a small thunk (in the case of .NET, the JITter does this). The thunk compares the assignee's address to the Gen1 memory range. If the address falls within that range, the thunk sets the corresponding bit in the card table to indicate that the range the bit covers is now dirty (shown in red).
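The thunk's logic can be approximated as below. This is a simulation, not how a JIT emits a barrier: memory is a plain dictionary, the Gen1 address range is made up, and a Python set stands in for the card table bits. The names `write_ref`, `memory` and `dirty_cards` are hypothetical.

```python
CARD_SIZE = 4 * 1024
GEN1_START, GEN1_END = 0x10000, 0x20000  # assumed Gen1 address range

memory = {}          # slot address -> address of the referenced object
dirty_cards = set()  # stand-in for the card table bits


def write_ref(slot_address, target_address):
    """Stand-in for the thunk that replaces a plain reference store."""
    memory[slot_address] = target_address
    # Only stores into Gen1 need recording, so test the assignee's address.
    if GEN1_START <= slot_address < GEN1_END:
        dirty_cards.add((slot_address - GEN1_START) // CARD_SIZE)


write_ref(0x10000 + 0x1800, 0x30000)  # field inside Gen1: card 1 becomes dirty
write_ref(0x30010, 0x30020)           # field outside Gen1: nothing recorded
```

As in the description above, the check looks only at where the store lands, so a card may be marked dirty even when the stored reference does not actually point into Gen0. That costs some extra scanning later but keeps the barrier short.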
Marking first considers only Gen0 objects. Once this is over, the GC inspects the card table to locate dirty blocks. It then treats every object in each dirty block as a new root and marks the objects reachable from it.
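Turning dirty cards into extra roots might look like this. The sketch assumes a hypothetical map from Gen1 object names to their start addresses and, for simplicity, treats an object as belonging to the card that contains its start address (objects spanning several cards are ignored).

```python
CARD_SIZE = 4 * 1024


def objects_in_dirty_cards(gen1_objects, gen1_start, dirty_cards):
    """Return the Gen1 objects whose start address lies in a dirty card."""
    return [
        name
        for name, address in gen1_objects.items()
        if (address - gen1_start) // CARD_SIZE in dirty_cards
    ]


gen1_objects = {"p": 0x10000, "q": 0x10800, "r": 0x11010}  # made-up addresses
extra_roots = objects_in_dirty_cards(gen1_objects, 0x10000, dirty_cards={0})
print(extra_roots)  # ['p', 'q']
```

Both `p` and `q` become roots even if only one of them was written to, which is the cost of tracking per 4KB block rather than per object.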
As you can see, the 4KB block size is just an optimization to reduce the size of the card table. If we made the granularity per object instead, we would save marking time by having to consider only one object (rather than all objects in a 4KB range), but the card table would grow significantly.
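The size trade-off is easy to quantify. The numbers below are illustrative assumptions only: a 1 GiB heap, and 32 bytes as a stand-in for per-object granularity.

```python
def card_table_bytes(heap_bytes, card_bytes):
    """Size of a one-bit-per-card table covering the given heap."""
    cards = -(-heap_bytes // card_bytes)  # ceiling division
    return -(-cards // 8)                 # 8 cards per byte


GIB = 1 << 30
coarse = card_table_bytes(GIB, 4 * 1024)  # 32768 bytes (32 KiB)
fine = card_table_bytes(GIB, 32)          # 4194304 bytes (4 MiB)
print(fine // coarse)                     # 128
```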
One downside is that the thunk makes reference assignment slower.
HW support
The hardware-supported approach also uses a card table, but instead of a thunk it relies on special features exposed by the HW+OS for notification of dirty writes. For example, it can use the Win32 API GetWriteWatch to get the list of pages where writes happened and use that information to fill in the card table entries.
However, this kind of support is not available on all platforms (or on older versions of platforms) and hence is less widely used.