Optimizing SphereInFrustrum check_问答_开发者_运维开发者技术经验分享

I have this SphereInFrustrum function here:

0.49%  int FrustumG::sphereInFrustum(Vec3 &p, float radius) {

        int result = INSIDE;
        float distance;

2.40%   for(int i=0; i < 6; i++) {
7.94%       distance = pl[i].distance(p);
12.21%      if (distance < -radius)
0.67%           return OUTSIDE;
3.67%       else if (distance < radius)
                result =  INTERSECT;
        }
        return(result);

       }

The numbers are from my code profiler. The issue is that, this check is taking longer t开发者_JAVA百科han actually rendering. The whole point of implementing geometry culling was so that I could have really big levels. I really just need a very quick and dirty way to see if an AABB is in or out. Right now I provide it with the radius of the cube and the center. Given that my boxes are AABB, is there a faster way to do this? I favor speed over accuracy.

Thanks

If I provided the cube's min and max would that make it faster? I'm sure there must be a way to do this without the distance formula with an expensif square root;

float Plane::distance(Vec3 &p) {

    return (d + normal.innerProduct(p));
}


float Vec3::innerProduct(Vec3 &v) {

    return (x * v.x + y * v.y + z * v.z);
}

I wanted to just leave a comment to ask a question but i only seem to be able to leave an answer, this is just an observation:

Does this code really do what you want to do?

        int result = INSIDE;
        float distance;

2.40%   for(int i=0; i < 6; i++) {
7.94%       distance = pl[i].distance(p);
12.21%      if (distance < -radius)
0.67%           return OUTSIDE;
3.67%       else if (distance < radius)
                result =  INTERSECT;

how this function reads to me is, assume the sphere is inside, for each point on your frustrum i, take the distance between i and the center of sphere p, if this distance is less than negative radius... and here my paradigm is destroyed.

So this returns early if you have a negative distance that is less than your negative radius? Is that really what you want right there?

Are you really executing this code for each sphere ? If you do, no wonder it's slower.

You should use a hierarchical approach, which can cull entire parts of the scenes in one call. For instance, you can use a quadtree of spheres.

A few ideas (in order of increasing complexity)

Elimintate branches - it may be faster (depending on platform and compiler) to compute all 6 distances and return based on the minimum of them. E.g. on PowerPC, minDist = (d < minDist) ? d : minDist can be computed with an fsel instruction and avoid the branches.
Unroll (part I) - Unrolling the loop for all six planes might give the compiler better chances to optimize the code (by hiding instruction latency).
Unroll (part II) - Can you process multiple spheres at once? Again, you might be able to hide some of the latency.
SIMD - If you don't mind getting your hands dirty with SIMD, you can store the "transpose" of 3 planes together in 4 quads. This lets you compute the dot products more easily, without having to do "horizontal" SIMD operations. Symbolically, this would look like

vec4 sphereX = sphere.splat(0);
vec4 sphereY = sphere.splat(1);
vec4 sphereZ = sphere.splat(2);

vec4 dot = sphereX * planesX; // planesX is the x components of 3 planes
dot += sphereY * planesY; 
dot += sphereZ * planesZ;
dot += planesDistance;
// dot now contains the results of 3 distances
// now do the same for the other 3 planes

The actual SIMD code will depend on the platform and compiler.

There are also some methods for doing frustum queries on kd-trees (i.e. MLRTA) - I've never tried them, but it should cut down dramatically on the number of spheres you need to query.

Hope that helps, let me know if anything is unclear.