I'm trying to facilitate automatic vectorization by the compiler in the blitz++ array library. For this reason, I'd like to present a view of the array data that is in chunks of fixed-length vectors, which are already vectorized well. However, I can't figure out what the type aliasing rules imply in conjunction with dynamically allocated arrays.
Here's the idea. An array currently consists of
T_numtype* restrict data_;
Operations are done by looping over these data. What I would like to do is present an alternative view of this array as an array of TinyVector<T_numtype, N>, which is a fixed-length vector whose operations are totally vectorized using the expression template machinery. The idea would be that an L-length array should be viewable as either T_numtype[L] or TinyVector<T_numtype, N>[L/N]. Is there a way to accomplish this without running afoul of the type aliasing rules?
For a statically allocated array, one would do
union {
T_numtype data_[L];
TinyVector<T_numtype, N> vectors_[L/N];
};
The closest I could think of is to define
typedef union {
T_numtype data_[N];
TinyVector<T_numtype, N> vector_;
} u;
u* data_;
and then allocate it with
data_ = new u[L/N];
But it seems that I have now given up my right to address the entire array as a flat array of T_numtype, so to access a particular element I would need to write data_[i/N].data_[i%N], which is a lot more complicated.
So, is there a way to legally create a union of T_numtype data_[L] and TinyVector<T_numtype, N>[L/N] where L is a dynamically determined size?
(I'm aware that there are additional alignment concerns, i.e. N must be a value that is the same as the alignment of the TinyVector member, otherwise there will be holes in the array.)
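The no-holes condition can at least be checked at compile time. The following is only a sketch: TinyVec is a hypothetical stand-in for TinyVector (not the real blitz++ class), and packs_without_holes and AlignedVec3 are names made up for illustration. It assumes the vector type stores nothing but its N elements.

```cpp
#include <cstddef>

// Hypothetical stand-in for blitz::TinyVector<T, N>: just N contiguous elements.
template <typename T, std::size_t N>
struct TinyVec {
    T v[N];
};

// True when an array of V occupies exactly the same bytes as a flat
// array of N * count elements of T, i.e. V has no trailing padding.
template <typename V, typename T, std::size_t N>
constexpr bool packs_without_holes() {
    return sizeof(V) == N * sizeof(T);
}

// An over-aligned 3-element vector: its size is rounded up to the alignment,
// so consecutive vectors leave a one-float hole between them.
struct alignas(16) AlignedVec3 {
    float v[3];
};

static_assert(packs_without_holes<TinyVec<double, 4>, double, 4>(),
              "TinyVec<double, 4> should tile a flat double array");
static_assert(!packs_without_holes<AlignedVec3, float, 3>(),
              "an over-aligned 3-float vector leaves padding");
```

A check like this only establishes that the two layouts coincide; it does not by itself make accessing one type through a pointer to the other legal under the aliasing rules.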
Aliasing is hard to make legal. However, if "operations are done by looping over these data", do those operations really require that the data be exactly an array of T_numtype?
It may be better to wrap the data in a class with one data member of type TinyVector<T_numtype, N>[L/N], or even std::vector<TinyVector<T_numtype, N> > since L is apparently determined at runtime, and expose a pair of iterators for those operations that want to loop over the entire data as a single sequence.
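A minimal sketch of that wrapper might look like the following. It is not blitz++ code: TinyVec is a hypothetical stand-in for TinyVector, and ChunkedArray, chunk, and chunk_count are names invented for the example. The storage is only ever accessed as chunks, so no aliasing between types takes place; the flat view is computed through i / N and i % N.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for blitz::TinyVector<T, N>; not the real class.
template <typename T, std::size_t N>
struct TinyVec {
    T v[N];
    T& operator[](std::size_t i) { return v[i]; }
    const T& operator[](std::size_t i) const { return v[i]; }
};

// Owns the data as chunks and exposes a flat element view on top of them.
template <typename T, std::size_t N>
class ChunkedArray {
public:
    // Rounds the chunk count up so a length that is not a multiple of N fits.
    explicit ChunkedArray(std::size_t length)
        : length_(length), chunks_((length + N - 1) / N) {}

    std::size_t size() const { return length_; }
    std::size_t chunk_count() const { return chunks_.size(); }

    // Chunk view: what vectorized fixed-length operations would work on.
    TinyVec<T, N>& chunk(std::size_t c) { return chunks_[c]; }

    // Flat view: element i lives in chunk i / N at offset i % N.
    T& operator[](std::size_t i) { return chunks_[i / N][i % N]; }
    const T& operator[](std::size_t i) const { return chunks_[i / N][i % N]; }

    // Minimal forward iterator over the flat sequence of elements.
    class iterator {
    public:
        iterator(ChunkedArray* a, std::size_t i) : a_(a), i_(i) {}
        T& operator*() const { return (*a_)[i_]; }
        iterator& operator++() { ++i_; return *this; }
        bool operator!=(const iterator& o) const { return i_ != o.i_; }
    private:
        ChunkedArray* a_;
        std::size_t i_;
    };

    iterator begin() { return iterator(this, 0); }
    iterator end() { return iterator(this, length_); }

private:
    std::size_t length_;
    std::vector<TinyVec<T, N>> chunks_;
};
```

With this, ChunkedArray<double, 4> a(10) holds three chunks, a[6] refers to the same element as a.chunk(1)[2], and a range-based for loop over a visits the ten elements in order. When N is a power of two, the division and modulo in operator[] typically compile down to a shift and a mask.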