Performance

Now that we have a basic model for how things are working, let's consider some things that could go wrong that would make it slow. That will give us a good idea what sorts of things we should try to avoid to get the best performance out of the collector.

Too Many Allocations

This is really the most basic thing that can go wrong. Allocating new memory with the garbage collector is really quite fast. As you can see in Figure 2 above, all that needs to happen typically is for the allocation pointer to get moved to create space for your new object on the "allocated" side—it doesn't get much faster than that. However, sooner or later a garbage collect has to happen and, all things being equal, it's better for that to happen later than sooner. So you want to make sure when you're creating new objects that it's really necessary and appropriate to do so, even though creating just one is fast.

This may sound like obvious advice, but actually it's remarkably easy to forget that one little line of code you write could trigger a lot of allocations. For example, suppose you're writing a comparison function of some kind, and suppose that your objects have a keywords field and that you want your comparison to be case-insensitive on the keywords in the order given. Now in this case you can't just compare the entire keywords string, because the first keyword might be very short. It would be tempting to use String.Split to break the keyword string into pieces and then compare each piece in order using the normal case-insensitive compare. Sounds great, right?

Well, as it turns out doing it like that isn't such a good idea. You see, String.Split is going to create an array of strings, which means one new string object for every keyword originally in your keywords string plus one more object for the array. Yikes! If we're doing this in the context of a sort, that's a lot of comparisons and your two-line comparison function is now creating a very large number of temporary objects. Suddenly the garbage collector is going to be working very hard on your behalf, and even with the cleverest collection scheme there is just a lot of trash to clean up. Better to write a comparison function that doesn't require the allocations at all.
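
For illustration, here is a minimal sketch of such an allocation-free comparison. The KeywordComparer name and the assumption that keywords are comma-separated are mine, not the article's: the point is that string.Compare with explicit start indexes and lengths compares one keyword at a time in place, so no array and no per-keyword strings are ever created.

    using System;

    static class KeywordComparer
    {
        // Compare two comma-separated keyword strings case-insensitively, one
        // keyword at a time, without the array and per-keyword strings that
        // String.Split would allocate.
        public static int CompareKeywords(string a, string b)
        {
            int ia = 0, ib = 0;
            while (ia < a.Length || ib < b.Length)
            {
                // Find the end of the current keyword in each string.
                int ea = a.IndexOf(',', ia); if (ea < 0) ea = a.Length;
                int eb = b.IndexOf(',', ib); if (eb < 0) eb = b.Length;

                // Compare this keyword in place; no temporary strings are created.
                int len = Math.Min(ea - ia, eb - ib);
                int c = string.Compare(a, ia, b, ib, len, StringComparison.OrdinalIgnoreCase);
                if (c != 0) return c;

                // Equal prefixes: the shorter keyword sorts first.
                if (ea - ia != eb - ib) return (ea - ia) - (eb - ib);

                // Step past the comma, but never past the end of the string.
                ia = Math.Min(ea + 1, a.Length);
                ib = Math.Min(eb + 1, b.Length);
            }
            return 0;
        }
    }

Used as the comparison in a sort, a function like this allocates nothing per call, so even millions of comparisons add no extra work for the collector.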

Too-Large Allocations

When working with a traditional allocator, such as malloc(), programmers often write code that makes as few calls to malloc() as possible because they know the cost of allocation is comparatively high. This translates into the practice of allocating in chunks, often speculatively allocating objects we might need, so that we can do fewer total allocations. The pre-allocated objects are then manually managed from some kind of pool, effectively creating a sort of high-speed custom allocator.

In the managed world this practice is much less compelling for several reasons:

First, the cost of doing an allocation is extremely low—there's no searching for free blocks as with traditional allocators; all that needs to happen is for the boundary between the free and allocated areas to move. The low cost of allocation means that the most compelling reason to pool simply isn't present.

Second, if you do choose to pre-allocate you will of course be making more allocations than are required for your immediate needs, which could in turn force additional garbage collections that might otherwise have been unnecessary.

Finally, the garbage collector will be unable to reclaim space for objects that you are manually recycling, because from the global perspective all of those objects, including the ones that are not currently in use, are still live. You might find that a great deal of memory is wasted keeping ready-to-use but not in-use objects on hand.

This isn't to say that pre-allocating is always a bad idea. You might wish to do it to force certain objects to be initially allocated together, for instance, but you will likely find it is less compelling as a general strategy than it would be in unmanaged code.
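
To illustrate the third point above, here is a minimal sketch; the WorkItem and WorkItemPool types are hypothetical, not from the article. Every object sitting in the pool remains reachable through the pool's own stack, so the collector has to treat those idle, ready-to-use items as live and can never reclaim their memory.

    using System.Collections.Generic;

    // Hypothetical pooled type; Reset clears per-use state before the item is reused.
    class WorkItem
    {
        public string Payload;
        public void Reset() => Payload = null;
    }

    class WorkItemPool
    {
        private readonly Stack<WorkItem> _free = new Stack<WorkItem>();

        // Hand out an idle item if we have one, otherwise allocate a fresh one.
        public WorkItem Rent() => _free.Count > 0 ? _free.Pop() : new WorkItem();

        public void Return(WorkItem item)
        {
            item.Reset();
            _free.Push(item);   // still reachable here, so still "live" to the collector
        }
    }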

Too Many Pointers

If you create a data structure that is a large mesh of pointers you'll have two problems. First, there will be a lot of object writes (see Figure 3 below) and, second, when it comes time to collect that data structure, you will make the garbage collector follow all those pointers and, if necessary, change them all as things move around. If your data structure is long-lived and won't change much, then the collector will only need to visit all those pointers when full collections happen (at the gen2 level). But if you create such a structure on a transitory basis, say as part of processing transactions, then you will pay the cost much more often.

Figure 3. Data structure heavy in pointers

Data structures that are heavy in pointers can have other problems as well, not related to garbage collection time. Again, as we discussed earlier, when objects are created they are allocated contiguously in the order of allocation. This is great if you are creating a large, possibly complex, data structure by, for instance, restoring information from a file. Even though you have disparate data types, all your objects will be close together in memory, which in turn will help the processor to have fast access to those objects. However, as time passes and your data structure is modified, new objects will likely need to be attached to the old objects. Those new objects will have been created much later and so will not be near the original objects in memory. Even when the garbage collector does compact your memory, your objects will not be shuffled around; they merely "slide" together to remove the wasted space. The resulting disorder might get so bad over time that you may be inclined to make a fresh copy of your whole data structure, all nicely packed, and let the old disorderly one be condemned by the collector in due course.
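
Here is a minimal sketch of that last idea, assuming a simple linked Node type (hypothetical, not from the article): copying the structure into freshly allocated nodes lays the copies out contiguously again, and once the caller drops its reference to the old, disorderly version, the collector reclaims it in due course.

    // Hypothetical singly linked list node; the real structure could be any object graph.
    class Node
    {
        public string Keyword;
        public Node Next;
    }

    static class ListRepacker
    {
        // Copy the list into freshly allocated nodes so the copies are laid out
        // contiguously; the old, scattered list becomes garbage once the caller
        // stops referencing it.
        public static Node Repack(Node head)
        {
            Node newHead = null, tail = null;
            for (Node n = head; n != null; n = n.Next)
            {
                // Nodes allocated in this tight loop land next to each other in memory.
                var copy = new Node { Keyword = n.Keyword };
                if (tail == null) newHead = copy; else tail.Next = copy;
                tail = copy;
            }
            return newHead;
        }
    }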

Too Many Roots

The garbage collector must of course give roots special treatment at collection time—they always have to be enumerated and duly considered in turn. The gen0 collection can be fast only to the extent that you don't give it a flood of roots to consider. If you were to create a deeply recursive function that has many object pointers among its local variables, the result can actually be quite costly. This cost is incurred not only in having to consider all those roots, but also in the extra-large number of gen0 objects that those roots might be keeping alive for not very long (discussed below).
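
As a contrived sketch of how this happens (the TreeNode type and TotalLabelLength function are hypothetical), consider a deep recursion whose frames each hold several object references in locals:

    // Hypothetical tree type for the sketch.
    class TreeNode
    {
        public string Label;
        public TreeNode Left, Right;
    }

    static class TreeWalker
    {
        public static int TotalLabelLength(TreeNode node)
        {
            if (node == null) return 0;
            // label and right remain live across the recursive calls below, so a
            // recursion thousands of frames deep hands the collector thousands of
            // extra roots to enumerate at collection time.
            string label = node.Label;
            TreeNode left = node.Left, right = node.Right;
            int sum = TotalLabelLength(left) + TotalLabelLength(right);
            return sum + (label != null ? label.Length : 0);
        }
    }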

Too Many Object Writes

Once again referring to our earlier discussion, remember that every time a managed program modifies an object pointer the write barrier code is also triggered. This can be bad for two reasons:

First, the cost of the write barrier might be comparable to the cost of what you were trying to do in the first place. If you are, for instance, doing simple operations in some kind of enumerator class, you might find that you need to move some of your key pointers from the main collection into the enumerator at every step. This is actually something you might want to avoid, because you effectively double the cost of copying those pointers around due to the write barrier, and you might have to do it one or more times per loop on the enumerator (see the sketch below).

Second, triggering write barriers is doubly bad if you are in fact writing on older objects. As you modify your older objects you effectively create additional roots to check (discussed above) when the next garbage collection happens. If you modified enough of your old objects you would effectively negate the usual speed improvements associated with collecting only the youngest generation.

These two reasons are of course complemented by the usual reasons for not doing too many writes in any kind of program. All things being equal, it's better to touch less of your memory (read or write, in fact) so as to make more economical use of the processor's cache.
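
To make the enumerator example from the first point concrete, here is a contrived sketch; the Item and ItemEnumerator types are hypothetical. Each assignment of an object reference into a field of a heap object goes through the write barrier, so copying references into the enumerator on every step pays that cost over and over, whereas keeping only an integer index and computing Current on demand avoids the extra object writes.

    // Hypothetical element type.
    class Item
    {
        public string Key;
    }

    class ItemEnumerator
    {
        private readonly Item[] _items;
        private int _index = -1;          // an int field: writing it needs no barrier

        // Reference-typed fields: every write to them goes through the write barrier.
        public Item CurrentItem;
        public string CurrentKey;

        public ItemEnumerator(Item[] items) { _items = items; }

        public bool MoveNext()
        {
            if (++_index >= _items.Length) return false;
            CurrentItem = _items[_index];        // object write -> write barrier
            CurrentKey  = _items[_index].Key;    // another object write -> write barrier
            return true;
        }

        // Cheaper alternative: derive Current on demand from the index instead of
        // copying references into fields on every step.
        public Item Current => _items[_index];
    }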

Too Many Almost-Long-Life Objects

Finally, perhaps the biggest pitfall of the generational garbage collector is the creation of many objects, which are neither exactly temporary nor are they exactly long-lived. These objects can cause a lot of trouble, because they will not be cleaned up by a gen0 collection (the cheapest), as they will still be necessary, and they might even survive a gen1 collection because they are still in use, but they soon die after that.

The trouble is, once an object has arrived at the gen2 level, only a full collection will get rid of it, and full collections are sufficiently costly that the garbage collector delays them as long as is reasonably possible. So the result of having many "almost-long-lived" objects is that your gen2 will tend to grow, potentially at an alarming rate; it might not get cleaned up nearly as fast as you would like, and when it does get cleaned up it will certainly be a lot more costly to do so than you might have wished.
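
You can actually watch this promotion happen using the GC class. A small sketch follows; the byte array merely stands in for some "almost-long-lived" per-request state. An object that survives a gen0 collection is typically promoted to gen1, and if it survives again it reaches gen2, where only a full collection will reclaim it.

    using System;

    var survivor = new byte[1024];                  // stands in for per-request state
    Console.WriteLine(GC.GetGeneration(survivor));  // typically 0: freshly allocated

    GC.Collect(0);                                  // force a gen0 collection
    Console.WriteLine(GC.GetGeneration(survivor));  // typically 1: it survived gen0

    GC.Collect(1);                                  // force a gen0 + gen1 collection
    Console.WriteLine(GC.GetGeneration(survivor));  // typically 2: now only a full
                                                    // (gen2) collection can free it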

To avoid these kinds of objects, your best lines of defense go like this:

  1. Allocate as few objects as possible, with due attention to the amount of temporary space you are using.
  2. Keep the longer-lived object sizes to a minimum.
  3. Keep as few object pointers on your stack as possible (those are roots).

If you do these things, your gen0 collections are more likely to be highly effective, and gen1 will not grow very fast. As a result, gen1 collections can be done less frequently and, when it becomes prudent to do a gen1 collection, your medium lifetime objects will already be dead and can be recovered, cheaply, at that time.

If things are going great then during steady-state operations your gen2 size will not be increasing at all!

Original article: http://msdn.microsoft.com/en-us/library/ms973837.aspx
