I'm writing some pretty CPU-intensive, concurrent numerical code that will process large amounts of data stored in Java arrays (e.g. lots of double[100000]s). Some of the algorithms might run millions of times over several days so getting maximum steady-state performance is a high priority.
In essence, each algorithm is a Java object that has an method API something like:
public double[] runMyAlgorithm(double[] inputData);
or alternatively a reference could be passed to the array to store the output data:
public runMyAlgorithm(double[] inputData, double[] outputData);
Given this requirement, I'm trying to determine the optimal strategy for allocating / managing array space. Frequently the algorithms will need large amounts of temporary storage space. They will also take large arrays as input and create large arrays as output.
Among the options I am considering are:
- Always allocate new arrays as local variables whenever they are needed (e.g. new double[100000]). Probably the simplest approach, but will produce a lot of garbage.
- Pre-allocate temporary arrays and store them as final fields in the algorithm object - big downside would be that this would mean that only one thread could run the algorithm at any one time.
- Keep pre-allocated temporary arrays in ThreadLocal storage, so that a thread can use a fixed amount of temporary array space whenever it needs it. ThreadLocal would be required since multiple thre开发者_运维知识库ads will be running the same algorithm simultaneously.
- Pass around lots of arrays as parameters (including the temporary arrays for the algorithm to use). Not good since it will make the algorithm API extremely ugly if the caller has to be responsible for providing temporary array space....
- Allocate extremely large arrays (e.g. double[10000000]) but also provide the algorithm with offsets into the array so that different threads will use a different area of the array independently. Will obviously require some code to manage the offsets and allocation of the array ranges.
Any thoughts on which approach would be best (and why)?
What I have noticed when working with memory in Java is the following. If your memory needs patterns are simple (mostly 2-3 types of memory allocations) you can usually be better than the default allocator. You can either preallocate a pool of buffers at the application startup and use them as needed or go to the other route (allocate an huge array at the beginning and provide pieces of that when needed). In effect you are writing your own memory allocator. But chances are you will do a worse job than the default allocator of Java.
I would probably try to do the following: standardize the buffer sizes and allocate normally. That way after a while the only memory allocation/deallocation will be in fixed sizes which will greatly help the garbage collector to run fast. Another thing I would do is to make sure at the algorithm design time that the total memory needed at any one point will not exceed something like 80-85% of the memory of the machine in order to not trigger a full collection inadvertently.
Apart from those heuristics I would probably test the hell of any solution I would pick and see how it works in practice.
Allocating big arrays is relatively cheap for the GC. You tend to use you your Eden space quickly, but the cost is largely per object. I suggest you write the code in the simplest manner possible and optimise it later after profiling the application. a double[100000] is less than a MB and you can over a thousand in a GB.
Memory is a lot cheaper than it used to be. An 8 GB server costs about £850. A 24 GB server costs about £1,800. (a 24 GB machine could allow you 24K x double[100000]) You may find using a large heap size or even a large Eden size gives you the efficiency you want.
精彩评论