[转] 分代垃圾回收的 新旧代引用问题(原标题:Back To Basics: Generational Garbage Collection)
原文链接: https://blogs.msdn.microsoft.com/abhinaba/2009/03/02/back-to-basics-generational-garbage-collection/
This post is Part 8 in the series of posts on Garbage Collection (GC). Please see the index here.
One of the primary disadvantage discussed in the post on mark-sweep garbage collection is that it introduces very large system pauses when the entire heap is marked and swept. One of the primary optimization employed to solve this issue is employing generational garbage collection. This optimization is based on the following observations
- Most objects die young
- Over 90% garbage collected in a GC is newly created post the previous GC cycle
- If an object survives a GC cycle the chances of it becoming garbage in the short term is low and hence the GC wastes time marking it again and again in each cycle
The optimization based on the above observations is to segregate objects by age into multiple generations and collect each with different frequencies.
This scheme has proven to work rather well and is widely used in many modern systems (including .NET).
Detailed algorithm
The objects can be segregated into age based generations in different ways, e.g. by time of creation. However one common way is to consider a newly created object to be in Generation 0 (Gen0) and then if it is not collected by a cycle of garbage collection then it is promoted to the next higher generation, Gen1. Similarly if an object in Gen1 survives a GC then that gets promoted to Gen2.
Lower generations are collected more often. This ensures lower system pauses. The higher generation collection is triggered fewer times.
How many generations are employed, varies from system to system. In .NET 3 generations are used. Here for simplicity we will consider a 2 generation system but the concepts are easily extended to more than 2.
Let us consider that the memory is divided into two contiguous blocks, one for Gen1 and the other for Gen0. At start memory is allocated only from Gen0 area as follows
![]()
So we have 4 objects in Gen0. Now one of the references is released
![]()
Now if GC is fired it will use mark and sweep on Gen0 objects and cleanup the two objects that are not reachable. So the final state after cleaning up is
![]()
The two surviving objects are then promoted to Gen1. Promotion includes copying the two objects to Gen1 area and then updating the references to them
![]()
Now assume a whole bunch of allocation/de-allocation has happened. Since new allocations are in Gen0 the memory layout looks like
![]()
The whole purpose of segregating into generations is to reduce the number of objects to inspect for marking. So the first root is used for marking as it points to a Gen0 object. While using the second root the moment the marker sees that the reference is into a Gen1 object it does not follow the reference, speeding up marking process.
Now if we only consider the Gen0 objects for marking then we only mark the objects indicated by ✓. The marking algorithm will fail to locate the Gen1 to Gen0 references (shown in red) and some object marking will be left out leading to dangling pointers.
One of the way to handle this is to somehow record all references from Gen1 to Gen0 (way to do that is in the next section) and then use these objects as new roots for the marking phase. If we use this method then we get a new set of marked objects as follows
![]()
This now gives the full set of marked objects. Post another GC and promotion of surviving objects to higher generation we get
![]()
At this point the next cycle as above resumes…
Tracking higher to lower generation references
In general applications there are very few (some studies show < 1% of all references) of these type of references. However, they all need to be recorded. There are two general approached of doing this
Write barrier + card-table
First a table called a card table is created. This is essentially an array of bits. Each bit indicates if a given range of memory is dirty (contains a write to a lower generation object). E.g. we can use a single bit to mark a 4KB block.
![]()
Whenever an reference assignment is made in user code, instead of directly doing the assignment it is redirected to a small thunk (incase .NET the JITter does this). The thunk compares the assignees address to that of the Gen1 memory range. If the range falls within, then the thunk updates the corresponding bit in the card table to indicate that the range which the bit covers is now dirty (shown as red).
First marking uses only Gen0 objects. Once this is over it inspects the card table to locate dirty blocks. Then it considers every object in that dirty block to be new roots and marks objects using it.
As you can see that the 4KB block is just an optimization to reduce the size of the card table. If we increase the granularity to be per object then we can save marking time by having to consider only one object (in contrast to all in 4KB range) but our card table size will also significantly increase.
One of the flip sides is that the thunk makes reference assignment slower.
HW support
Hardware support also uses card table but instead of using thunk it simply uses special features exposed by the HW+OS for notification of dirty writes. E.g. it can use the Win32 api GetWriteWatch to get the list of pages where write happened and use that information to get the card table entries.
However, these kind of support is not available on all platforms (or older version of platforms) and hence is less utilized.
[转] 分代垃圾回收的 新旧代引用问题(原标题:Back To Basics: Generational Garbage Collection)的更多相关文章
- Java分代垃圾回收机制:年轻代/年老代/持久代(转)
虚拟机中的共划分为三个代:年轻代(Young Generation).年老点(Old Generation)和持久代(Permanent Generation).其中持久代主要存放的是Java类的类信 ...
- Java中的分代垃圾回收策略
一.分代GC的理论基础 分代的垃圾回收策略,是基于这样一个事实:不同的对象的生命周期是不一样的.因此,不同生命周期的对象可以采取不同的收集方式,以便提高回收效率. 在Java程序运行的过程中,会产生大 ...
- JVM分代垃圾回收策略的基础概念
由于不同对象的生命周期不一样,因此在JVM的垃圾回收策略中有分代这一策略.本文介绍了分代策略的目标,如何分代,以及垃圾回收的触发因素. 文章总结了JVM垃圾回收策略为什么要分代,如何分代,以及垃圾回收 ...
- JVM调优总结(五)-分代垃圾回收详述1
为什么要分代 分代的垃圾回收策略,是基于这样一个事实:不同的对象的生命周期是不一样的.因此,不同生命周期的对象可以采取不同的收集方式,以便提高回收效率. 在Java程序运行的过程中,会产生大量的对象, ...
- JVM调优总结:分代垃圾回收详述
为什么要分代 分代的垃圾回收策略,是基于这样一个事实:不同的对象的生命周期是不一样的.因此,不同生命周期的对象可以采取不同的收集方式,以便提高回收效率. 在Java程序运行的过程中,会产生大量的对象, ...
- java虚拟机学习-JVM调优总结-分代垃圾回收详述(9)
为什么要分代 分代的垃圾回收策略,是基于这样一个事实:不同的对象的生命周期是不一样的.因此,不同生命周期的对象可以采取不同的收集方式,以便提高回收效率. 在Java程序运行的过程中,会产生大量的对象, ...
- JVM堆内存控制/分代垃圾回收
JVM的堆的内存, 是通过下面面两个参数控制的 -Xms 最小堆的大小, 也就是当你的虚拟机启动后, 就会分配这么大的堆内存给你 -Xmx 是最大堆的大小 当最小堆占满后,会尝试进行GC,如果GC之后 ...
- JVM调优总结(4):分代垃圾回收
为什么要分代 分代的垃圾回收策略,是基于这样一个事实:不同的对象的生命周期是不一样的.因此,不同生命周期的对象可以采取不同的收集方式,以便提高回收效率. 在Java程序运行的过程中,会产生大量的对象, ...
- Java 垃圾回收机制 (分代垃圾回收ZGC)
什么是自动垃圾回收? 自动垃圾回收是一种在堆内存中找出哪些对象在被使用,还有哪些对象没被使用,并且将后者删掉的机制.所谓使用中的对象(已引用对象),指的是程序中有指针指向的对象:而未使用中的对象(未引 ...
随机推荐
- 100-days: twenty-six
Title: The Guardian(英国卫报) view on the Notre Dame fire: we share France's terrible loss Notre Dame 巴黎 ...
- 100-days: twenty-four
Title: No word of a lie: scientists rate(评出) the world's biggest peddlers of bull(说瞎话的人,扯淡的人) bullsh ...
- AUDIOqueue 为什么会播放一段时间就听不到声音
转自简书:非常有用 AudioQueue缓冲区为空时,那么AudioQueueOutputCallback回调不会再调用 这个其实很好理解,AudioQueue的回调本事就是数据播完了才回调的 Aud ...
- c#: TextBox添加水印效果(PlaceHolderText)
基于他人代码修改,不闪,以做备忘. 与SendMessage EM_SETCUEBANNER消息相比,它能改变字体绘制颜色,EM_SETCUEBANNER只限定了DimGray颜色,太深 //与Sen ...
- 浙江省赛之Singing Everywhere
题目:http://acm.zju.edu.cn/onlinejudge/showContestProblem.do?problemId=5996 方法: 在大佬的指导下完成. 寻找峰值,找到一共k个 ...
- oracle 锁表sql 解锁
1.select * from v$locked_object; 查看具体的 : select session_id , oracle_username, process from v$loc ...
- C#多线程--信号量(Semaphore)[z]
百度百科:Semaphore,是负责协调各个线程, 以保证它们能够正确.合理的使用公共资源.也是操作系统中用于控制进程同步互斥的量. Semaphore常用的方法有两个WaitOne()和Releas ...
- JAVA多线程之线程间的通信方式
(转发) 收藏 记 周日,北京的天阳光明媚,9月,北京的秋格外肃穆透彻,望望窗外的湛蓝的天,心似透过栏杆,沐浴在这透亮清澈的蓝天里,那朵朵白云如同一朵棉絮,心意畅想....思绪外扬, 鱼和熊掌不可兼得 ...
- angular使用Md5加密
一.现象 用户登录时需要记住密码的功能,在前端需要对密码进行加密处理,增加安全性 二解决 1.利用npm(如果没有,先自行安装npm)安装ts-md5 npm install ts-md5 --sav ...
- MySQL导入数据报 Got a packet bigger than‘max_allowed_packet’bytes 错误的解决方法
MySQL根据配置文件会限制Server接受的数据包大小.有时候大的插入和更新会受 max_allowed_packet 参数限制,导致大数据写入或者更新失败. 通过终端进入mysql控制台,输入如下 ...