I have this SphereInFrustrum function here:
0.49% int FrustumG::sphereInFrustum(Vec3 &p, float radius) {
int result = INSIDE;
float distance;
2.40% for(int i=0; i < 6; i++) {
7.94% distance = pl[i].distance(p);
12.21% if (distance < -radius)
0.67% return OUTSIDE;
3.67% else if (distance < radius)
result = INTERSECT;
}
return(result);
}
The numbers are from my code profiler. The issue is that, this check is taking longer t开发者_JAVA百科han actually rendering. The whole point of implementing geometry culling was so that I could have really big levels. I really just need a very quick and dirty way to see if an AABB is in or out. Right now I provide it with the radius of the cube and the center. Given that my boxes are AABB, is there a faster way to do this? I favor speed over accuracy.
Thanks
If I provided the cube's min and max would that make it faster? I'm sure there must be a way to do this without the distance formula with an expensif square root;
float Plane::distance(Vec3 &p) {
return (d + normal.innerProduct(p));
}
float Vec3::innerProduct(Vec3 &v) {
return (x * v.x + y * v.y + z * v.z);
}
I wanted to just leave a comment to ask a question but i only seem to be able to leave an answer, this is just an observation:
Does this code really do what you want to do?
int result = INSIDE;
float distance;
2.40% for(int i=0; i < 6; i++) {
7.94% distance = pl[i].distance(p);
12.21% if (distance < -radius)
0.67% return OUTSIDE;
3.67% else if (distance < radius)
result = INTERSECT;
how this function reads to me is, assume the sphere is inside, for each point on your frustrum i, take the distance between i and the center of sphere p, if this distance is less than negative radius... and here my paradigm is destroyed.
So this returns early if you have a negative distance that is less than your negative radius? Is that really what you want right there?
Are you really executing this code for each sphere ? If you do, no wonder it's slower.
You should use a hierarchical approach, which can cull entire parts of the scenes in one call. For instance, you can use a quadtree of spheres.
A few ideas (in order of increasing complexity)
Elimintate branches - it may be faster (depending on platform and compiler) to compute all 6 distances and return based on the minimum of them. E.g. on PowerPC,
minDist = (d < minDist) ? d : minDist
can be computed with anfsel
instruction and avoid the branches.Unroll (part I) - Unrolling the loop for all six planes might give the compiler better chances to optimize the code (by hiding instruction latency).
Unroll (part II) - Can you process multiple spheres at once? Again, you might be able to hide some of the latency.
SIMD - If you don't mind getting your hands dirty with SIMD, you can store the "transpose" of 3 planes together in 4 quads. This lets you compute the dot products more easily, without having to do "horizontal" SIMD operations. Symbolically, this would look like
vec4 sphereX = sphere.splat(0);
vec4 sphereY = sphere.splat(1);
vec4 sphereZ = sphere.splat(2);
vec4 dot = sphereX * planesX; // planesX is the x components of 3 planes
dot += sphereY * planesY;
dot += sphereZ * planesZ;
dot += planesDistance;
// dot now contains the results of 3 distances
// now do the same for the other 3 planes
The actual SIMD code will depend on the platform and compiler.
There are also some methods for doing frustum queries on kd-trees (i.e. MLRTA) - I've never tried them, but it should cut down dramatically on the number of spheres you need to query.
Hope that helps, let me know if anything is unclear.
精彩评论