I have a solution consisting of a number of C# projects. It was written in C# to get it operational quickly. Garbage collections are starting to become an issue: we are seeing some 100 ms delays that we'd like to avoid.
One thought was to re-write it in C++, project by project. But if you combine C# with unmanaged C++, will the threads in the C++ projects also be frozen by garbage collections?
UPDATE
Thanks for your replies. This is, in fact, an app where 100 ms might be significant. It was probably a poor decision to build it in C#, but it was essential that it be up and running quickly at the time.
Right now, we're using Windows' Multimedia Timers to fire an event every 5 ms. We do see some 100+ ms gaps, and we've confirmed by checking the GC counters that these always occur during a collection. Optimization is on; built in Release mode.
I work as a .NET developer at a trading firm where, like you, we care about 100 ms delays. Garbage collection can indeed become a significant issue when dependable minimal latency is required.
That said, I don't think migrating to C++ is going to be a smart move, mainly due to how time consuming it would be. Garbage collection occurs after a certain amount of memory has been allocated on the heap over time. You can substantially mitigate this issue by minimizing the amount of heap allocation your code creates.
I'd recommend trying to spot methods in your application that are responsible for significant amounts of allocation. Anywhere objects are constructed is going to be a candidate for modification. A classic approach to fighting garbage collection is utilizing resource pools: instead of creating a new object every time a method is called, maintain a pool of already-constructed objects, borrowing from the pool on every method call and returning the object to the pool once the method has completed.
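As a rough illustration of the pooling idea described above, here is a minimal sketch. The `Message` and `MessagePool` types are hypothetical names invented for this example (nothing in the original code is known), and the pool is deliberately single-threaded; a real application would likely need locking or one pool per thread.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type standing in for whatever object is allocated on each call.
public sealed class Message
{
    public readonly byte[] Buffer = new byte[256];
    public int Length;

    // Clears per-use state so a recycled instance starts clean.
    public void Reset() { Length = 0; }
}

// Minimal, NOT thread-safe pool: instances are created once up front
// and reused, so steady-state calls allocate nothing on the managed heap.
public sealed class MessagePool
{
    private readonly Stack<Message> _free;

    public MessagePool(int capacity)
    {
        _free = new Stack<Message>(capacity);
        for (int i = 0; i < capacity; i++)
        {
            _free.Push(new Message());
        }
    }

    // Number of instances currently waiting in the pool.
    public int Available { get { return _free.Count; } }

    // Borrow an instance; falls back to allocating if the pool is empty.
    public Message Rent()
    {
        return _free.Count > 0 ? _free.Pop() : new Message();
    }

    // Hand the instance back once the caller is done with it.
    public void Return(Message message)
    {
        message.Reset();
        _free.Push(message);
    }
}
```

A caller would typically `Rent()` at the top of a method and `Return()` in a `finally` block, so the instance goes back to the pool even when an exception is thrown.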
Another no-brainer involves hunting down any ArrayList, Hashtable, or similar non-generic collections in your code that box/unbox value types, leading to totally unnecessary heap allocation. Replace these with List<T>, Dictionary<TKey, TValue>, and so on wherever possible (here I am specifically referring to collections of value types such as int, double, long, etc.). Likewise, look out for any methods you may be calling which box value type arguments (or return boxed value types).
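To make the boxing point concrete, the sketch below sums the same integers through an ArrayList and through a List<int>. The class and method names (`BoxingDemo`, `SumNonGeneric`, `SumGeneric`) are made up for illustration. Both methods return the same result; the difference is that the non-generic version allocates one heap object per element.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class BoxingDemo
{
    // ArrayList stores object references, so every int added is boxed
    // (one small heap allocation per element) and unboxed when read back.
    public static long SumNonGeneric(int count)
    {
        ArrayList list = new ArrayList(count);
        for (int i = 0; i < count; i++)
        {
            list.Add(i); // boxing happens here
        }

        long sum = 0;
        foreach (object item in list)
        {
            sum += (int)item; // unboxing happens here
        }
        return sum;
    }

    // List<int> stores the ints inline in its backing array: no boxing,
    // and the only allocations are the list and its array.
    public static long SumGeneric(int count)
    {
        List<int> list = new List<int>(count);
        for (int i = 0; i < count; i++)
        {
            list.Add(i);
        }

        long sum = 0;
        foreach (int item in list)
        {
            sum += item;
        }
        return sum;
    }
}
```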
These are just a couple of relatively small steps you can take toward reducing your garbage collection count, but they can make a big difference. With enough effort it can even be possible to completely (or at least nearly) eliminate all generation 2 garbage collections during the continuous operations phase (everything except for startup and shutdown) of your application. And I think you'll find that generation 2 collections are the real heavy-hitters.
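One way to check whether a code path triggers generation 2 collections is to compare GC.CollectionCount(2) before and after running it. The helper below is only a sketch; `GcCounter` and `Gen2CollectionsDuring` are hypothetical names. Note that the count is process-wide, so collections caused by other threads are included.

```csharp
using System;

public static class GcCounter
{
    // Returns how many generation 2 collections occurred in the process
    // while 'work' was running. GC.CollectionCount is process-wide, so
    // activity on other threads is counted as well.
    public static int Gen2CollectionsDuring(Action work)
    {
        int before = GC.CollectionCount(2);
        work();
        return GC.CollectionCount(2) - before;
    }
}
```

Wrapping the steady-state loop of the application in a call like this (or simply logging the counter periodically) gives a quick indication of whether allocation-reduction work is actually paying off.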
Here's a paper outlining one company's efforts to minimize latency in a .NET application through resource pooling, in addition to a couple of other methods, with great success:
Rapid Addition leverages Microsoft .NET 3.5 Framework to build ultra-low latency FIX and FAST processing
So to reiterate: I would strongly recommend investigating ways to modify your code so as to cut down on garbage collection over converting to an entirely different language.
First, have you tried profiling things to see if you could optimize your memory usage? A good place to start is with the CLR Profiler (works with all CLRs up to 3.5).
Rewriting everything in C++ is an incredibly drastic change just for the sake of a small performance hit -- this is like fixing a paper cut by amputating your hand.
Are you certain that those 100ms delays are due to the GC? I would make VERY sure that the GC really is your problem before you spend a lot of time, effort, and money rewriting the thing in C++. Combining managed code with unmanaged code also presents its own problems, as you have to deal with marshalling between those two contexts. That will add its own performance drain, and your net gain could quite likely end up being zero in the end.
I would profile your C# application and narrow down exactly where your 100ms delays are coming from. This tool might be helpful:
How To: Use CLR Profiler
A word on the GC
Another word about the .NET GC (or really any GC, for that matter.) This one is not nearly said often enough, but it is a critical factor in successfully writing code with a GC:
Having a Garbage Collector does not mean you don't have to think about memory management!
Writing optimal code that plays nicely with the GC requires less effort and hassle than writing C++ code that plays nicely with an unmanaged heap, but you still have to understand the GC and write code that plays nicely with it. You can't completely ignore all memory management related things. You have to worry about it less, but you still have to think about it. Writing code that plays nicely with the GC is a critically important factor in achieving performant code that does not CREATE memory management problems.
The following article should also be helpful, as it outlines the fundamental behavior of the .NET GC (valid through .NET 3.5; it's quite likely that this article is no longer completely valid for .NET 4.0, as there have been some critical changes to its GC. For one, background collection means managed threads are blocked for less of a generation 2 collection):
Garbage Collection: Automatic Memory Management in the Microsoft .NET Framework
The CLR GC does not suspend threads running unmanaged code during a collection. If the native code calls into managed code, or returns to managed code then it may be affected by a collection (like any other managed code).
If 100 ms is an issue, I assume your code is mission critical. Mixing managed and unmanaged code adds interop overhead on every call between the managed AppDomain and unmanaged space.
The GC is very well optimized, so before doing that, try to profile your code and refactor it. If you are concerned about GC, try adjusting thread priorities, minimize object creation, and cache data whenever possible. Also turn on the "Optimize code" setting in your project properties.
One thought was to re-write it in C++, project by project. But if you combine C# with unmanaged C++, will the threads in the C++ projects also be frozen by garbage collections?
Not if the C++ code is running on different threads. The C++ heap and the managed heap are different things.
On the other hand, if your C++ code is doing a lot of new/delete, you will still begin to see allocation stalls in the C++ code as the heap gets to be fragmented. And these stalls are likely to be much worse than what you see in C# code because there is no GC. When the heap needs to be cleaned up, it just happens inside the call to new or delete.
If you really have a tight performance requirement, then you need to plan on not doing any memory allocation from the general heap inside your time critical code. In practice that means this will be more like C code than C++ code, or using special memory pools and placement new.
.NET 4.0 has what's called Background Garbage Collection, which is different from the Concurrent Garbage Collection in earlier versions, which may be what is causing your issue. Jason Olson talks about it with Carl Franklin and Richard Campbell on .NET Rocks Episode #517. You can view the transcript here. It's on page 5.
I'm not completely sure if just upgrading to the 4.0 Framework will solve your problem, but I imagine it would be well worth your time looking into it before rewriting everything in C++.
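Before and after an upgrade, it may help to confirm which runtime and GC mode the process is actually using. The snippet below is a small diagnostic sketch built on Environment.Version and System.Runtime.GCSettings; the `GcModeReport` class name is invented for this example. A latency mode of Interactive indicates that concurrent collection (background collection on CLR 4) is enabled, while Batch indicates it is disabled.

```csharp
using System;
using System.Runtime;

public static class GcModeReport
{
    // Builds a one-line summary of the runtime version and GC configuration.
    // GCLatencyMode.Interactive => concurrent/background GC is enabled.
    // GCLatencyMode.Batch       => concurrent GC is disabled.
    public static string Describe()
    {
        return string.Format(
            "CLR {0}, server GC: {1}, latency mode: {2}",
            Environment.Version,
            GCSettings.IsServerGC,
            GCSettings.LatencyMode);
    }
}
```

Logging `GcModeReport.Describe()` once at startup is enough; the values do not change unless the application sets GCSettings.LatencyMode itself.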