When I was using C++ in college, I was told to use multidimensional arrays (hereby MDA) whenever possible, since it exhibits better memory locality since it's allocated in one big chunk. Array of arrays (AoA), on the other hand, are allocated in multiple smaller chunks, possibly scattered all over the place in the physical memory wherever vacancies are found.
So I guess the first question is: is this a myth, or is this an advice worth following?
Assuming that it's the latter, then the next question would be what to do in a language like Java that doesn't have true MDA. It's not that hard to emulate MDA with a 1DA, of course. Essentially, what is syntactic sugar for languages with MDA can be implemented as library support for languages without MDA.
Is this worth the开发者_开发问答 effort? Is this too low level of an optimization issue for a language like Java? Should we just abandon arrays and use List
s even for primitives?
Another question: in Java, does allocating AoA at once (new int[M][N]
) possibly yield a different memory allocation than doing it hierarchically (new int[M][]; for (... new int[N]
)?
Java and C# allocate memory in much different fashion that C++ does. In fact, in .NET for sure all the arrays of AoA will be close together if they are allocated one after another because memory there is just one continuous chunk without any fragmentation whatsoever.
But it is still true for C++ and still makes sense if you want maximum speed. Although you shouldn't follow that advise every time you want multidimensional array, you should write maintainable code first and then profile it if it is slow, premature optimization is root for all evil in this world.
Is this worth the effort? Is this too low level of an optimization issue for a language like Java?
Generally speaking, it is not worth the effort. The best strategy to to forget about this issue in the first version of your application, and implement in a straight-forward (i.e. easy to maintain) way. If the first version runs too slowly for your requirements, use a profiling tool to find the application's bottlenecks. If the profiling suggests that arrays of arrays is likely to be the problem, do some experiments to change your data structures to simulated multi-dimensional arrays and profile see if it makes a significant difference. [I suspect that it won't make much difference. But the most important things is to not waste your time optimizing something unnecessarily.]
Should we just abandon arrays and use Lists even for primitives?
I wouldn't go that far. Assuming that you are dealing with arrays of a predetermined size:
- arrays of objects will be a bit faster than equivalent lists of objects, and
- arrays of primitives will be considerably faster and take considerably less space than equivalent lists of primitive wrapper.
On the other hand, if your application needs to "grow" the arrays, using a List will simplify your code.
I would not waste the effort to use a 1D array as a multidim array in Java because there is no syntax to help. Of course one could define functions (methods) to hide the work for you, but you just end up with a function call instead of following a pointer when using an array of arrays. Even if the compiler/interpreter speeds this up for you, I still don't think it is worth the effort. In addition you can run into complications when trying to use code that expects 2D (or N-Dim) arrays that expect as arrays of arrays. I'm sure most general code out there will be written for arrays like this in Java. Also one can cheaply reassign rows (or columns if you decide to think like that).
If you know that this multidim array is a bottleneck, you may disregard what I said and see if manually optimizing helps.
From personal experience in Java, multidimensional arrays are far far slower than one dimensional arrays if one is loading a large amount of data, or accessing elements in the data which are at different positions. I wrote a program that took a screen shot image in BMP format, and then searched the screenshot for a smaller image. Loading the screenshot image (approx. 3 mb) into a multidimensional array (three dimensional, [xPos][yPos][color] (with color=0 being red value, and suchforth)) took 14 seconds. To load it into a single dimensional array took 1 second. The gain for finding the smaller image in the larger image was similar. It took around 28 seconds to find the smaller image in the larger image when both images were stored as multi-dimensional arrays. It took around a second to find the smaller image in the larger image when both images were stored as one dimensional arrays. That said, I first wrote my program using a dimensional arrays for the sake of readability.
精彩评论