We have a big class with 68 int members, 22 double members, and 4 members that are objects of another class, e.g.:
class A {
    public int i1;
    public int i2;
    public int i3;
    ....
    public Order order1;
    public Order order2;
    ...
    public double ..
}
1: Are i1, i2 and i3 stored contiguously in physical memory?
2: Does class A store pointers to order1 and order2, or does it store the contents of order1 and order2?
There is another class B which has an array of A as a member, holding 365 A instances. So the memory for B could be very large. My concern is that if B is too large, we will get lots of L2 cache misses and performance will degrade. We mainly sum the values of i1, then the values of i2, then the values of i3, and so on. E.g. when summing i1 over all 365 A instances, the i1 values will not sit contiguously in memory, so we could hit cache misses and get poor performance.
I am thinking of keeping class B but removing class A, and moving all the elements of A into B, so we get
class B {
    public int[] array_of_i1;
    public int[] array_of_i2;
    ..
}
In this way, when I calculate the sum of i1 or i2, all the i1 or i2 values sit together, so maybe we could get a performance improvement?
As the class is huge, I'd like to look for your opinions before the change.
It's generally consecutive but it depends on which JVM you are using.
One complication is that the runtime in-memory structure of Java objects is not specified by the virtual machine specification, which means that virtual machine providers can implement it as they please. The consequence is that instances of a class can occupy a different amount of memory in one VM than instances of that same class in another VM.
As for the specific layout,
In order to save some memory, the Sun VM doesn't lay out object's attributes in the same order they are declared. Instead, the attributes are organized in memory in the following order:
- doubles and longs
- ints and floats
- shorts and chars
- booleans and bytes
- references
(from http://www.codeinstructions.com/2008/12/java-objects-memory-structure.html)
The author of that post also covers how inherited classes are handled.
The JLS doesn't strongly specify the exact sizes of objects, so this can vary between JVM implementations (though you can infer some lower bounds, e.g. an int must be at least 32 bits).
In Sun's JVM however, integers take 32 bits, doubles take 64 bits and object references take 32 bits (unless you're running on a 64-bit JVM and pointer compression is disabled). Then the object itself has a 2 word header, and the overall memory size is aligned to a multiple of 8 bytes.
So overall this object should take 8 * ceil((8 + 68 * 4 + 22 * 8 + 4 * 4) / 8)
= 472 bytes, if I haven't forgotten to account for something (which is entirely possible), and if you're running on a 32-bit machine.
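To make that arithmetic easy to re-check, here is a small sketch that evaluates the formula above under the same assumptions (8-byte header, 4-byte ints and references, 8-byte doubles, 8-byte alignment). The class and method names are made up for illustration, and the constants are assumptions about one particular JVM, not anything the JLS guarantees.

```java
// Rough size estimate for one instance of A.
// All constants are assumptions about a 32-bit (or compressed-reference) Sun/HotSpot JVM.
final class ObjectSizeEstimate {
    static final int HEADER_BYTES = 8;    // 2-word header, 4 bytes per word (assumed)
    static final int INT_BYTES = 4;
    static final int DOUBLE_BYTES = 8;
    static final int REFERENCE_BYTES = 4; // 32-bit or compressed reference (assumed)
    static final int ALIGNMENT = 8;       // object size rounded up to a multiple of 8

    // Header plus fields, rounded up to the next multiple of ALIGNMENT.
    static long estimate(int ints, int doubles, int references) {
        long raw = HEADER_BYTES
                + (long) ints * INT_BYTES
                + (long) doubles * DOUBLE_BYTES
                + (long) references * REFERENCE_BYTES;
        return ((raw + ALIGNMENT - 1) / ALIGNMENT) * ALIGNMENT;
    }

    public static void main(String[] args) {
        long perObject = estimate(68, 22, 4); // field counts from the question
        System.out.println("one A: " + perObject + " bytes");
        // Excludes the array holding the references and the Order objects themselves.
        System.out.println("365 A: " + 365 * perObject + " bytes");
    }
}
```

With the field counts from the question this comes to 472 bytes per instance, so roughly 172,280 bytes for 365 instances, not counting the array itself or the Order objects.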
But - as stated above, you shouldn't really rely too strongly on this as it's not specified anywhere, and will vary between implementations and on different platforms. As always with performance-related metrics, the key is to write clean code, measure the impact (in this case use a profiler to look at memory usage, and execution time) and then optimise as required.
Performance only really matters from the macro perspective; worrying about L2 cache misses when designing your object model is really the wrong way round to do it.
(And a class with 94 fields is almost certainly not a clean design, so you're right to consider refactoring it...)
Firstly, before you embark on any work, have you profiled your application? Are cache misses causing a bottleneck?
What are your performance requirements? (Note: 'As fast as possible' isn't a requirement.)
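If you don't have a profiler to hand, a crude timing loop can at least tell you whether the two layouts differ measurably on your machine. The following is only a sketch: Row is a hypothetical cut-down stand-in for the question's class A (three ints instead of 94 fields), and hand-rolled timing like this is easily distorted by JIT warm-up and dead-code elimination, so treat the numbers as a hint and use a proper harness such as JMH before drawing conclusions.

```java
import java.util.Random;

// Hypothetical stand-in for the question's class A, reduced to three int fields.
final class Row {
    int i1, i2, i3;
}

final class LayoutTiming {
    static final int COUNT = 365; // number of A instances, from the question

    // Sum i1 by walking an array of objects (the current layout).
    static long sumRows(Row[] rows) {
        long sum = 0;
        for (Row r : rows) sum += r.i1;
        return sum;
    }

    // Sum i1 from a plain int[] (the proposed layout).
    static long sumColumn(int[] i1) {
        long sum = 0;
        for (int v : i1) sum += v;
        return sum;
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        Row[] rows = new Row[COUNT];
        int[] column = new int[COUNT];
        for (int d = 0; d < COUNT; d++) {
            rows[d] = new Row();
            rows[d].i1 = column[d] = rnd.nextInt(1000);
        }

        long sink = 0; // accumulated and printed so the JIT cannot discard the loops
        for (int i = 0; i < 20_000; i++) {   // warm-up so both methods get compiled
            sink += sumRows(rows) + sumColumn(column);
        }

        int reps = 200_000;
        long t0 = System.nanoTime();
        for (int i = 0; i < reps; i++) sink += sumRows(rows);
        long t1 = System.nanoTime();
        for (int i = 0; i < reps; i++) sink += sumColumn(column);
        long t2 = System.nanoTime();

        System.out.println("array of objects: " + (t1 - t0) / reps + " ns per sum");
        System.out.println("int[] column:     " + (t2 - t1) / reps + " ns per sum");
        System.out.println("(ignore) " + sink);
    }
}
```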
1: That would be implementation dependent.
2: Yes, it stores pointers. The objects will reside elsewhere.
1: In general, yes. But I don't think you necessarily want to depend on it. Wrong language for that low-level type stuff.
2: Pointers, but I'm not sure why that matters.
As for the refactoring: profile before making significant changes for performance reasons. I think the second is cleaner though. Wouldn't you rather do a simple array loop for your summing?
Or you could change the structure to use a smaller class, keeping the stuff that runs in a tight loop together will tend to improve cache hits (iff that is your performance bottleneck).
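For reference, the array-per-field layout from the question could look roughly like this. It is a minimal sketch, not a recommendation: ColumnarB, SIZE and the sum helper are names invented here, only i1 to i3 are shown, and the Order references and double members would need their own arrays in the same style.

```java
// Hypothetical column-oriented version of B: one primitive array per former field of A.
final class ColumnarB {
    static final int SIZE = 365; // number of former A instances, from the question

    // Each int[] is a single contiguous array object, so all i1 values sit next to each other.
    final int[] i1 = new int[SIZE];
    final int[] i2 = new int[SIZE];
    final int[] i3 = new int[SIZE];
    // ... the remaining int and double fields would follow the same pattern

    // Simple array loop; accumulates into a long to avoid int overflow.
    static long sum(int[] column) {
        long total = 0;
        for (int v : column) total += v;
        return total;
    }

    public static void main(String[] args) {
        ColumnarB b = new ColumnarB();
        for (int d = 0; d < SIZE; d++) {
            b.i1[d] = d; // index d plays the role of "the d-th A"
        }
        System.out.println("sum of i1 = " + sum(b.i1));
    }
}
```

The trade-off is that everything belonging to one former A is now spread across many arrays and tied together only by the index, so code that works on a single record at a time gets more awkward.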