I have an odd situation I am trying to figure out.
The Genesis:
I am running my program on a physical machine with 16 cores and 128GB of RAM. I am trying to determine why it is not using all available cores: typically it uses 20-25% CPU on average (so 4-5 of the 16 cores). When I look at performance counters, they show on the order of 60-70% Time in Garbage Collection.
For reference, I am using .NET Framework 4 and the TPL (Parallel.ForEach) to thread the performance-intensive portion of my program. I am limiting the number of threads to the number of cores.
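For context, the parallel section is set up roughly like this (a simplified, self-contained sketch; the work items and per-item work here are placeholders, not my real code):

using System;
using System.Linq;
using System.Threading.Tasks;

class ParallelSketch
{
    static void Main()
    {
        var workItems = Enumerable.Range(0, 1000);   // stand-in for the real work items

        var options = new ParallelOptions
        {
            // Cap the degree of parallelism at the core count (16 on this machine).
            MaxDegreeOfParallelism = Environment.ProcessorCount
        };

        Parallel.ForEach(workItems, options, item =>
        {
            // The performance-intensive, allocation-heavy work happens per item here.
        });
    }
}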
The Problem:
I was creating a large number of objects, far too many for the garbage collector to handle efficiently, and as a result the program spent a large amount of time in garbage collection.
The Simple Solution thus far:
I am introducing object pooling to reduce the pressure on the garbage collector, and I will continue pooling more objects to improve performance. Pooling just some objects has already reduced time in garbage collection from 60-70% to 45%, and my program ran 40% faster.
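The pooling I am introducing is along these lines (a simplified sketch, not my actual pool; the real one is typed to my own objects and resets their state on return):

using System.Collections.Concurrent;

// Thread-safe pool: workers rent an instance at the start of the loop body and
// return it at the end, so steady-state allocation drops towards zero.
public class ObjectPool<T> where T : new()
{
    private readonly ConcurrentBag<T> _items = new ConcurrentBag<T>();

    public T Rent()
    {
        T item;
        return _items.TryTake(out item) ? item : new T();   // allocate only when the pool is empty
    }

    public void Return(T item)
    {
        _items.Add(item);   // caller resets the object's state before returning it
    }
}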
The Nagging Question (the one I hope you will answer for me):
My program when running uses at most 14GB of the available RAM, compared to 128GB of RAM this is quite small. Nothing else is running on this machine (it is purely a testbed for me) and there is plenty of RAM available.
- If there is plenty of RAM available, why are any gen2 (or full) collections occurring at all? A fairly large number of these gen2 collections (in the thousands) are occurring. In other words, how is it determining the threshold to start a gen2 collection?
- Why doesn't the garbage collector simply delay any full collections until pressure on physical RAM reaches a higher threshold?
- Is there any way I can configure the garbage collector to wait for a higher threshold? (i.e. not bother collecting at all if not necessary)
EDIT:
I am already using the option to use the server garbage collector ... what I need to know is what is triggering a gen2 collection, not that the server garbage collector is better (I already know that).
As I recall, the Client GC is the default. My experience with it is that it doesn't let the heap get very large before collecting. For my heavy duty processing applications, I use the "server" GC.
You enable the server GC in your application configuration file:
<?xml version="1.0"?>
<configuration>
  <runtime>
    <gcServer enabled="true"/>
  </runtime>
</configuration>
That makes a huge difference in performance for me. For example, one of my programs was spending upwards of 80% of its time in garbage collection. Enabling the server GC dropped that to just a little over 10%. Memory usage went up because the server GC lets the heap grow larger before collecting, but that's fine for most of my applications.
Another thing that will cause a Gen 2 collection is the Large Object Heap. See CLR Inside Out: Large Object Heap Uncovered. In a nutshell, if you exceed the LOH threshold, it will trigger a Gen 2 collection. If you're allocating a lot of short-lived large objects (larger than about 85 kilobytes each), this will be a problem.
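If it helps to see that in action, here is a minimal sketch (the array sizes are chosen just to sit on either side of the threshold; GC.GetGeneration reports LOH-resident objects as generation 2):

using System;

class LohSketch
{
    static void Main()
    {
        // Arrays of roughly 85,000 bytes or more go straight to the Large Object Heap,
        // and the LOH is only collected as part of a Gen 2 collection.
        byte[] small = new byte[80 * 1024];   // ~82 KB: small object heap, starts in gen 0
        byte[] large = new byte[85 * 1024];   // ~87 KB: over the threshold, allocated on the LOH

        Console.WriteLine(GC.GetGeneration(small));   // typically 0
        Console.WriteLine(GC.GetGeneration(large));   // 2, because the LOH is treated as gen 2
    }
}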
From vague memory, and having read through http://msdn.microsoft.com/en-us/library/ee787088.aspx, I think one trigger of a Gen 2 GC can be a Gen 2 segment filling up. The article states that Server GC uses larger segments, so as already noted, this is probably important for your performance.
Having the machine wait until it has virtually no memory free will mean you get one hell of a GC at some stage. This is probably not ideal. If your time in GC is so high, it's a sign you're allocating too many objects that survive long enough to get past gen 0 and 1, and doing it in a repetitive way. If the memory usage of your application is not rising over time, this indicates that these objects are actually short-lived, but live long enough to survive a gen 0 and a gen 1 collection. This is a bad situation: you're allocating a short-lived object but paying a full Gen 2 collection cost to clean it up.
If that's the case, you have a few different directions to take:
- Try to make the short-lived objects collectable sooner (so they don't make it to gen 2 and hence the GC cost is lower)
- Try to allocate fewer short-lived objects (so GCs happen less frequently and you have more time to finish using your short-lived objects before the allocations force a GC and the objects are moved to older generations)
- Use stack-allocated value types instead of reference types for the short-lived objects, if it suits your purpose (see the sketch after this list)
- If you know you need a large chunk of these objects, pool them upfront. It sounds like you're doing this, but there must still be a lot of allocation going on to keep the GC at 45%. If your pool isn't big enough, allocate more upfront - as you say, you have plenty of spare memory.
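On the value-type bullet, here is a hypothetical illustration (the point types are invented for the example): a small struct used as a local, or stored inside an array, is allocated inline rather than as a separate heap object, so it never becomes work for the GC.

// A value type: instances live inline (on the stack or inside their container),
// so there is no per-instance object for the GC to track.
struct PointValue
{
    public double X;
    public double Y;
}

// The equivalent reference type: every instance is a separate heap allocation.
class PointReference
{
    public double X;
    public double Y;
}

class ValueTypeSketch
{
    static void Main()
    {
        // No heap allocation happens here; the struct lives on the stack.
        var p = new PointValue { X = 1.0, Y = 2.0 };
        System.Console.WriteLine(p.X + p.Y);
    }
}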
It's likely that a combination of all of these would be a good solution. You need to have a good understanding of what objects you're allocating, how long they're living, and how long they actually need to live to fulfill your purpose.
The GC is happy with temporary objects that have short lifetimes (i.e. they become collectable quickly), or long-term/permanent objects that have long lifetimes. Allocating lots of objects in the middle of those two categories is where you get the pain. So allocate fewer of them, or change their lifetimes to match their usage scenario.