As part of a memory analysis, we've found the following:
         percent           live            alloc'ed         stack  class
rank    self   accum     bytes  objs       bytes     objs   trace  name
   3   3.98%  19.85%  24259392   808  3849949016  1129587  359697  byte[]
   4   3.98%  23.83%  24259392   808  3849949016  1129587  359698  byte[]
You'll notice that many objects are allocated, but few remain live. This is for a simple reason - the two byte arrays are allocated for each instance of a "client" that is generated. Clients are not reusable - each one can only handle one request and is then thrown away. The byte arrays always have the same size (30000).
We're considering moving to a pool (Apache's GenericObjectPool) of byte arrays, as there is normally a known number of active clients at any given moment (so the pool size shouldn't fluctuate much). This way, we can save on memory allocation and garbage collection. The question is, would the pool cause a severe CPU hit? Is this a good idea at all?
Thanks for your help!
I think there are good GC-related reasons to avoid this sort of allocation behaviour. Depending on the size of the heap and the free space in eden at the time of allocation, simply allocating a 30000-element byte[] could be a serious performance hit. The array could easily be bigger than the TLAB, in which case the allocation is not a bump-the-pointer event. There may not even be enough space available in eden, in which case the array is allocated directly into tenured, which is itself likely to cause another hit down the line through increased full GC activity (particularly with CMS, because of fragmentation).
Having said that, the comments from fdreger are completely valid too. A multithreaded object pool is a bit of a grim thing that is likely to cause headaches. You mention that each client handles a single request only. If that request is serviced by a single thread, then a ThreadLocal byte[] that is wiped at the end of the request could be a good option. If the request is short-lived relative to your typical young GC period, then the old-to-young reference issue may not be a big problem (the probability of any given request being in flight during a GC is small, even though you're guaranteed to hit one periodically).
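A minimal sketch of the ThreadLocal idea, assuming each request really is handled start to finish on one thread. The names RequestBuffers and RequestHandler are made up for illustration, and the copy inside handle() is only a stand-in for the real client work. The question mentions two arrays per client; a second ThreadLocal would follow the same pattern.

```java
import java.util.Arrays;

// Hypothetical helper: one reusable 30000-byte buffer per thread.
final class RequestBuffers {
    static final int BUFFER_SIZE = 30000; // array size stated in the question

    // Each thread lazily gets its own array on first access.
    private static final ThreadLocal<byte[]> BUFFER =
            ThreadLocal.withInitial(() -> new byte[BUFFER_SIZE]);

    private RequestBuffers() {}

    // Returns the calling thread's buffer; no locking is involved.
    static byte[] get() {
        return BUFFER.get();
    }

    // Zeroes the buffer so data from one request cannot leak into the next.
    static void wipe() {
        Arrays.fill(BUFFER.get(), (byte) 0);
    }
}

// Hypothetical stand-in for the "client" described in the question.
final class RequestHandler {
    // Copies the payload into the thread's buffer and returns the bytes used.
    static int handle(byte[] payload) {
        byte[] buf = RequestBuffers.get();
        try {
            int n = Math.min(payload.length, buf.length);
            System.arraycopy(payload, 0, buf, 0, n);
            return n;
        } finally {
            // Wipe at the end of the request, even if handling throws.
            RequestBuffers.wipe();
        }
    }
}
```

The buffer lives as long as its thread does, so this only pays off when requests run on a bounded set of long-lived worker threads. The wipe is still an O(n) pass over the array on every request.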
Probably pooling will not help you much if at all - possibly it will make things worse, although it depends on a number of factors (what GC are you using, how long the objects live, how much memory is available, etc.):
The time a GC takes depends mostly on the number of live objects. The collector (I assume you run a vanilla Java JRE) does not visit dead objects and does not deallocate them one by one. It frees whole areas of memory after copying the live objects away (this keeps memory neat and compacted). 100 dead objects can be collected as fast as 100000. On the other hand, all the live objects must be copied - so if you, say, have a pool of 100 objects and only 50 are used at a given time, keeping the unused objects is going to cost you.
If your arrays currently tend to die before they would get tenured (copied to the old generation space), there is another problem: your pooled arrays will certainly live long enough to be tenured. This produces a situation where there are a lot of references from the old generation to the young one - and GCs are optimized with the reverse situation in mind.
Actually it is quite possible that pooling arrays will make your GC SLOWER than creating new ones; this is usually the case with cheap objects.
Another cost of pooling comes from synchronizing objects across threads and cleaning them up after use. Both are trickier than they sound.
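To make those two costs concrete, here is a hand-rolled sketch of a fixed-size pool. It is not the Apache Commons Pool API; ByteArrayPool and its method names are hypothetical, and the fall-back-to-allocation behaviour on an empty pool is just one possible policy.

```java
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical fixed-size pool of byte arrays (illustration only).
final class ByteArrayPool {
    private final BlockingQueue<byte[]> free;
    private final int arraySize;

    ByteArrayPool(int capacity, int arraySize) {
        this.arraySize = arraySize;
        this.free = new ArrayBlockingQueue<>(capacity);
        for (int i = 0; i < capacity; i++) {
            free.add(new byte[arraySize]);
        }
    }

    // Cost 1: every borrow goes through the queue's lock.
    // If the pool is empty, fall back to a fresh allocation instead of blocking.
    byte[] borrow() {
        byte[] a = free.poll();
        return (a != null) ? a : new byte[arraySize];
    }

    // Cost 2: an O(n) wipe on every return, plus a second trip through the lock.
    // Nothing here stops a caller from using the array after giving it back.
    void giveBack(byte[] a) {
        if (a.length != arraySize) {
            return; // not one of ours; let the GC have it
        }
        Arrays.fill(a, (byte) 0);
        free.offer(a); // silently dropped if the pool is already full
    }

    // Number of arrays currently idle in the pool.
    int available() {
        return free.size();
    }
}
```

Even this small version has to decide what happens when the pool runs dry, when a caller returns the wrong array, and when an array is returned twice (the sketch does not guard against the last case at all).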
Summing up, unless you are well aware of the internals of your GC and understand how it works under the hood, AND have results from a profiler that show that managing all the arrays is a bottleneck - DO NOT POOL. In most cases it is a bad idea.
If garbage collection in your case is really a performance hit (often cleaning up the eden space does not take much time if not many objects survive), and it is easy to plug in the object pool, try it, and measure it.
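As a rough way to "measure it", the sketch below reads the JVM's own GC counters around an allocation-heavy loop and around a loop that reuses one array. GcProbe and its methods are hypothetical names, the iteration count is arbitrary, and this is a crude probe rather than a rigorous benchmark: the JIT and the collector in use can distort the numbers, so treat the output as a hint and confirm with a profiler or GC logs on the real application.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.Arrays;

// Hypothetical probe: GC activity for fresh allocation vs. reuse of one array.
final class GcProbe {
    // Total GC time in ms across all collectors; undefined (-1) values are skipped.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();
            if (t > 0) total += t;
        }
        return total;
    }

    // Total number of collections across all collectors.
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long c = gc.getCollectionCount();
            if (c > 0) total += c;
        }
        return total;
    }

    // Allocates a new array per iteration; the returned sum keeps the work observable.
    static long allocateFresh(int iterations, int size) {
        long sink = 0;
        for (int i = 0; i < iterations; i++) {
            byte[] a = new byte[size];
            a[i % size] = 1;
            sink += a[i % size];
        }
        return sink;
    }

    // Reuses a single array, wiping it each iteration as a pool would have to.
    static long reuseOne(int iterations, int size) {
        byte[] a = new byte[size];
        long sink = 0;
        for (int i = 0; i < iterations; i++) {
            Arrays.fill(a, (byte) 0);
            a[i % size] = 1;
            sink += a[i % size];
        }
        return sink;
    }

    public static void main(String[] args) {
        int iterations = 200_000; // arbitrary; adjust to your request volume
        long c0 = totalGcCount(), g0 = totalGcMillis(), t0 = System.nanoTime();
        long s1 = allocateFresh(iterations, 30000);
        long c1 = totalGcCount(), g1 = totalGcMillis(), t1 = System.nanoTime();
        long s2 = reuseOne(iterations, 30000);
        long c2 = totalGcCount(), g2 = totalGcMillis(), t2 = System.nanoTime();
        System.out.println("fresh: " + (t1 - t0) / 1_000_000 + " ms wall, "
                + (c1 - c0) + " GCs, " + (g1 - g0) + " ms in GC (sum " + s1 + ")");
        System.out.println("reuse: " + (t2 - t1) / 1_000_000 + " ms wall, "
                + (c2 - c1) + " GCs, " + (g2 - g1) + " ms in GC (sum " + s2 + ")");
    }
}
```

Note that the reuse loop pays for the wipe on every iteration, which is exactly the cost a pool would add in exchange for fewer allocations.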
This certainly depends on your application's need.
The pool would work out much better as long as you always hold a reference to it. That way the garbage collector never reclaims the pool, and it only needs to be created once (you could always declare it static to be on the safe side). The pool's memory would stay allocated permanently, but I doubt that will be a problem for your application.