I want to vectorize some C code by hand in order to speed it up. For that purpose (the target is the SPE on the Cell processor, or CBE) I want to use SIMD math. The code does physical vector calculations (velocity, acceleration, etc.), so in some parts of it there are a lot of operations like:
ax=a*vx+b*rx;
ay=a*vy+b*ry;
az=a*vz+b*rz;
So at this point I thought about converting the v's and r's to vectors (on the SPE, one vector can hold 4 single-precision float values), so in pseudocode it would be something like:
vector V,R,A;
V.x=vx;
R.x=rx; (and the same for the other components, "y" and "z")
A=spu_sum(spu_prod(a,V),spu_prod(b,R));
ax=A.x; (and the same for the other components, "y" and "z")
So do you think this approach is worthwhile, or can you think of a better one?
Thanks
If you have to pack and unpack the components at every SIMD calculation, you're unlikely to get much, if any, speedup at all.
You really need to see if you can make deeper changes, so that the components are normally kept in vector form and passed around as vectors as much as possible.
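As a rough illustration of what "kept in vector form" could look like, here is a minimal sketch in plain C. The names vec4f, particle, axpby and update_accelerations are hypothetical and not from your code, and vec4f is only a portable stand-in for the SPE's 4-float vector type. The point is that velocity, position and acceleration are stored as 4-lane values all the time, so the hot loop never packs or unpacks x, y, z.

```c
#include <stddef.h>

/* Hypothetical portable stand-in for a 4-float SIMD register.
 * Lanes 0..2 hold x, y, z; lane 3 is padding. */
typedef struct { float f[4]; } vec4f;

/* Computes a*v + b*r on all four lanes at once.
 * Assumption: on the SPE this loop would be replaced by vector
 * multiply/add intrinsics operating on real vector registers. */
static vec4f axpby(float a, vec4f v, float b, vec4f r)
{
    vec4f out;
    for (int i = 0; i < 4; ++i)
        out.f[i] = a * v.f[i] + b * r.f[i];
    return out;
}

/* Hypothetical particle layout: every quantity lives in vector form,
 * so nothing has to be packed before or unpacked after the math. */
typedef struct {
    vec4f v;    /* velocity     */
    vec4f r;    /* position     */
    vec4f acc;  /* acceleration */
} particle;

/* Updates the acceleration of n particles without touching scalars. */
static void update_accelerations(particle *p, size_t n, float a, float b)
{
    for (size_t i = 0; i < n; ++i)
        p[i].acc = axpby(a, p[i].v, b, p[i].r);
}
```

With this layout, the scalars ax, ay, az only need to be extracted at the boundaries of the program (I/O, for example), not around every calculation. Note that the fourth lane is wasted here. If that matters, another option is to store all x components together, all y together and so on, and process four particles per vector operation.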