How to do z-ordering in software correctly

I am rendering 3D objects on a 2D canvas by doing all necessary calculations in software. I am not using graphics acceleration.

Initially all the objects were cubes of the same size, so I could sort them by their Z distance from the camera and that would order them correctly. But now I am trying to draw cubes of varying dimensions, which makes my simple z-ordering algorithm fail under perspective projection.

I looked into computer graphics books and found the techniques used; they eventually recommend a pixel-based comparison of two polygons to determine which one is in front of the other. That is probably what the graphics card does. But doing so in software seems overly difficult, and I suspect it would be too slow for practical use even if I managed it.

Is there a simple trick to do this in software? Any examples from the early days of 3D graphics, when graphics cards were not available?

Although this is a generic 3D graphics question, if it helps, I am doing this on top of the HTML5 Canvas 2D API.


As @ybungalobill has already mentioned, the Z-buffer is the easiest algorithm to implement. When you fill the triangles/polygons that make up your cubes, interpolate the Z coordinate across each one and store it per pixel. If you later fill another polygon that covers the same X, Y coordinate, check whether its Z is less than the Z already stored in the buffer; if it is, draw the pixel and update the buffer. Do not forget to clear the Z buffer to infinity before repainting. Pseudocode:

foreach (scanline in polygon)  {
  int length = scanline.right.x - scanline.left.x;
  foreach (pixel in scanline)  {
    float t = (float)(pixel.x - scanline.left.x) / length;  // How far along the scanline we are
    float z = (1 - t) * scanline.left.z + t * scanline.right.z;  // Interpolate the Z coordinate
    if (z < zbuffer[scanline.y][pixel.x])  {  // The pixel is closer than anything drawn so far
      zbuffer[scanline.y][pixel.x] = z;  // Remember the new closest depth
      drawPixel(pixel.x, scanline.y, polygon.color);
    }
  }
}
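Since the question mentions the HTML5 Canvas 2D API: the 2D context gives you no depth buffer, but you can keep your own alongside an ImageData and blit once per frame. A minimal TypeScript sketch of that idea (the scanline rasterizer that calls plot is assumed to be yours; only the buffer handling is shown here):

const canvas = document.querySelector('canvas')!;
const ctx = canvas.getContext('2d')!;
const W = canvas.width, H = canvas.height;

const frame = ctx.createImageData(W, H);   // Software framebuffer
const zbuffer = new Float32Array(W * H);   // One depth value per pixel

function clearBuffers(): void {
  frame.data.fill(0);
  zbuffer.fill(Infinity);                  // Everything starts "infinitely far away"
}

// Your scanline rasterizer calls this once per covered pixel.
function plot(x: number, y: number, z: number, r: number, g: number, b: number): void {
  const i = y * W + x;
  if (z < zbuffer[i]) {                    // Closer than what is already there?
    zbuffer[i] = z;                        // Remember the new closest depth
    const p = i * 4;
    frame.data[p] = r;
    frame.data[p + 1] = g;
    frame.data[p + 2] = b;
    frame.data[p + 3] = 255;
  }
}

// After rasterizing all polygons for the frame:
// ctx.putImageData(frame, 0, 0);

Blitting the whole ImageData once per frame is typically much faster than issuing a fillRect call per pixel.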

A variant of the Z-buffer that performs better on the CPU, by never drawing pixels that would later be overwritten, is the segment buffer: http://www.gamedev.net/reference/articles/article668.asp
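The article has the details; as a rough sketch of the core idea (hypothetical names, and constant depth per span for brevity, where a real implementation interpolates Z within spans): each scanline keeps a list of non-overlapping, depth-tagged spans, and inserting a new span keeps only the parts that are nearer than what is already stored:

interface Span { x0: number; x1: number; z: number; color: number }  // half-open [x0, x1)

const H = 600;  // screen height, for example
const scanlines: Span[][] = Array.from({ length: H }, () => []);  // non-overlapping spans per scanline

function insertSpan(y: number, s: Span): void {
  let pieces: Span[] = [s];        // parts of the new span not yet hidden
  const kept: Span[] = [];
  for (const old of scanlines[y]) {
    if (old.z <= s.z) {
      // The old span is nearer: subtract its interval from every remaining piece
      pieces = pieces.flatMap(p => {
        if (old.x1 <= p.x0 || old.x0 >= p.x1) return [p];  // no overlap
        const left  = p.x0 < old.x0 ? [{ ...p, x1: old.x0 }] : [];
        const right = p.x1 > old.x1 ? [{ ...p, x0: old.x1 }] : [];
        return [...left, ...right];
      });
      kept.push(old);
    } else if (old.x1 <= s.x0 || old.x0 >= s.x1) {
      kept.push(old);              // no overlap: keep the old span as-is
    } else {
      // The new span is nearer: keep only the uncovered ends of the old span
      if (old.x0 < s.x0) kept.push({ ...old, x1: s.x0 });
      if (old.x1 > s.x1) kept.push({ ...old, x0: s.x1 });
    }
  }
  scanlines[y] = [...kept, ...pieces];
}

Pixels are only touched once, at the end of the frame, when you walk the final span lists and fill them.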

Another approach is Warnock's algorithm. It uses recursion, which makes it hard to use on GPUs, but a CPU will do fine if you use your own stack to avoid stack overflow. The idea is to divide the screen area into 4 parts and check whether each part is simple enough to paint directly, e.g. covered entirely by a single polygon. If not, split it again until the condition is met (in the worst case it will be met at the pixel level). Pseudocode:

void warnock(Rectangle rect)
{
  if (every polygon either covers rect completely or misses it)  {  // "Simple enough" to paint directly
    // Among the covering polygons, paint the one closest to the camera
    float minZ = infinity;
    Polygon closest = none;
    foreach (polygon in polygons)  {
      if (rect is inside polygon)  {
        float z = interpolateZ(polygon, rect.x + rect.width / 2, rect.y + rect.height / 2);  // Z at the centre of the rectangle
        if (z < minZ)  {
          minZ = z;
          closest = polygon;
        }
      }
    }
    if (closest != none)
      fillRect(rect, closest.color);
  } else {
    // Divide into 4 subrectangles and recurse (stops at pixel size in the worst case)
    warnock(Rectangle(rect.x, rect.y, rect.width / 2, rect.height / 2));  // Top left
    warnock(Rectangle(rect.x + rect.width / 2, rect.y, rect.width / 2, rect.height / 2));  // Top right
    warnock(Rectangle(rect.x, rect.y + rect.height / 2, rect.width / 2, rect.height / 2));  // Bottom left
    warnock(Rectangle(rect.x + rect.width / 2, rect.y + rect.height / 2, rect.width / 2, rect.height / 2));  // Bottom right
  }
}

The painter's algorithm is what you have done with your cubes; the only difference is that it sorts polygons instead of whole objects. Even then it is difficult to handle intersecting and cyclically overlapping polygons, and I personally wouldn't use it for non-trivial scenes.
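For scenes where it does suffice, the sort itself is trivial; a TypeScript sketch, ordering by each triangle's farthest camera-space Z (a common heuristic; sorting by the centroid is another, and neither resolves intersections or ordering cycles):

interface Vec3 { x: number; y: number; z: number }
interface Triangle { v: [Vec3, Vec3, Vec3]; color: string }

// Sort back-to-front (larger camera-space Z = farther) and simply overdraw.
function painters(tris: Triangle[], drawTriangle: (t: Triangle) => void): void {
  const farthestZ = (t: Triangle) => Math.max(t.v[0].z, t.v[1].z, t.v[2].z);
  for (const t of [...tris].sort((a, b) => farthestZ(b) - farthestZ(a))) {
    drawTriangle(t);  // nearer triangles are drawn last, covering farther ones
  }
}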

Another algorithm you might use is backface culling. It only works for convex objects that do not overlap, though. The algorithm calculates the normal of each polygon, and if the normal points away from the camera, the polygon is removed.
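After projection you don't even need the full 3D normal: the sign of the projected triangle's 2D cross product (its winding on screen) gives the same answer. A sketch, assuming front faces are wound counter-clockwise:

interface Vec2 { x: number; y: number }

// Z component of the 2D cross product (b - a) x (c - a): its sign gives the
// winding of the projected triangle on screen.
function isBackFacing(a: Vec2, b: Vec2, c: Vec2): boolean {
  const cross = (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
  return cross <= 0;  // assumes counter-clockwise front faces with Y up; flip the test for Y-down canvas coordinates
}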

Raycasting is another way to determine visibility per pixel. It is, however, quite CPU intensive. The basic idea is to check, for each pixel of the screen, which polygon intersects it (which polygon gets hit by the ray cast through the current pixel). The origin of the rays is the eye position. Pseudocode:

foreach (pixel in screen)  {
  float minZ = infinity;  // Can be zfar from the perspective projection
  Color pixelColor = backgroundColor;
  foreach (polygon in projectedPolygons)  {
    if (polygon contains Point(pixel.x, pixel.y))  {
      float z = interpolateZ(polygon, pixel.x, pixel.y);  // Z of this polygon at (x, y), interpolated from its vertices
      if (z < minZ)  {  // Nearest hit so far wins
        minZ = z;
        pixelColor = polygon.color;
      }
    }
  }
  drawPixel(pixel.x, pixel.y, pixelColor);  // Paint the nearest polygon's color (or the background)
}
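The two helpers the pseudocode assumes, the containment test and interpolateZ, can both be built from barycentric coordinates of the projected triangle. A sketch (note that linearly interpolating Z in screen space is only an approximation under perspective; interpolating 1/z is the correct way):

interface Vec3 { x: number; y: number; z: number }

// Barycentric coordinates of screen point (px, py) in projected triangle abc.
function barycentric(a: Vec3, b: Vec3, c: Vec3, px: number, py: number) {
  const d = (b.y - c.y) * (a.x - c.x) + (c.x - b.x) * (a.y - c.y);
  const u = ((b.y - c.y) * (px - c.x) + (c.x - b.x) * (py - c.y)) / d;
  const v = ((c.y - a.y) * (px - c.x) + (a.x - c.x) * (py - c.y)) / d;
  return { u, v, w: 1 - u - v };
}

// "polygon contains point": all three weights are non-negative.
function contains(a: Vec3, b: Vec3, c: Vec3, px: number, py: number): boolean {
  const { u, v, w } = barycentric(a, b, c, px, py);
  return u >= 0 && v >= 0 && w >= 0;
}

// Depth at (px, py), linearly interpolated from the vertex depths.
function interpolateZ(a: Vec3, b: Vec3, c: Vec3, px: number, py: number): number {
  const { u, v, w } = barycentric(a, b, c, px, py);
  return u * a.z + v * b.z + w * c.z;
}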


Right, today you use Z-buffering on the GPU to do a per-pixel depth comparison. You can do this in software too.

The sorting technique will not work in general; see the Wikipedia article on the painter's algorithm. You can improve it, though, by breaking your cubes into individual faces and sorting those instead of whole cubes.

A more general technique that was used in many early games (e.g. Doom) is the BSP tree. BSP trees don't work with dynamic scenes, since they are expensive to build, but they solve the ordering problem in general.
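Building the tree is the expensive offline part; rendering from it is just an in-order traversal that draws the subtree on the far side of each splitting plane first, then the polygons on the plane, then the near side. A sketch, assuming a prebuilt tree with hypothetical field names:

interface Vec3 { x: number; y: number; z: number }
interface Triangle { v: [Vec3, Vec3, Vec3]; color: string }
interface Plane { nx: number; ny: number; nz: number; d: number }  // nx*x + ny*y + nz*z + d = 0

interface BspNode {
  plane: Plane;
  polygons: Triangle[];  // polygons lying in this node's splitting plane
  front?: BspNode;       // subtree on the positive side of the plane
  back?: BspNode;        // subtree on the negative side
}

// Back-to-front traversal relative to the camera: far subtree first, then the
// node's own polygons, then the near subtree. Overdraw resolves visibility.
function renderBsp(node: BspNode | undefined, cam: Vec3, draw: (t: Triangle) => void): void {
  if (!node) return;
  const side = node.plane.nx * cam.x + node.plane.ny * cam.y + node.plane.nz * cam.z + node.plane.d;
  const [far, near] = side >= 0 ? [node.back, node.front] : [node.front, node.back];
  renderBsp(far, cam, draw);
  for (const t of node.polygons) draw(t);
  renderBsp(near, cam, draw);
}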


What I've found suits me is a fixed grid combined with Warnock. Partition the area of the screen encompassing the model(s) in the frustum into cells:

[image: the screen area partitioned into grid cells]

For this you can just use the bounding rectangle of the primitives you insert. The structure can be very quick to update, since all you have to do is manipulate integers to move things from one cell to another. To avoid constantly allocating and deallocating data, use a free list approach:

[image: a node pool whose unused entries are chained into a free list]
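A minimal sketch of that kind of structure (hypothetical names): one flat pool of list nodes, each cell storing the index of its first node, and removed nodes pushed onto a free list, so moving a primitive between cells is pure integer manipulation with no per-frame allocation:

// A fixed grid over the screen; each cell heads an intrusive singly linked
// list of primitive indices. Nodes live in one pool and removed nodes are
// recycled through a free list, so updates never allocate per frame.
class GridIndex {
  private cellHead: Int32Array;        // index of the first node per cell, -1 if empty
  private nodePrim: number[] = [];     // primitive index stored in each node
  private nodeNext: number[] = [];     // next node in the same cell, -1 at the end
  private freeHead = -1;               // head of the free list

  constructor(private cols: number, rows: number) {
    this.cellHead = new Int32Array(cols * rows).fill(-1);
  }

  insert(cellX: number, cellY: number, prim: number): number {
    let n = this.freeHead;
    if (n !== -1) this.freeHead = this.nodeNext[n];  // reuse a freed node
    else { n = this.nodePrim.length; this.nodePrim.push(0); this.nodeNext.push(-1); }
    const cell = cellY * this.cols + cellX;
    this.nodePrim[n] = prim;
    this.nodeNext[n] = this.cellHead[cell];          // push onto the cell's list
    this.cellHead[cell] = n;
    return n;                                        // keep this to remove the entry later
  }

  remove(cellX: number, cellY: number, node: number): void {
    const cell = cellY * this.cols + cellX;
    let prev = -1;
    for (let cur = this.cellHead[cell]; cur !== -1; prev = cur, cur = this.nodeNext[cur]) {
      if (cur !== node) continue;
      if (prev === -1) this.cellHead[cell] = this.nodeNext[cur];  // unlink from the cell
      else this.nodeNext[prev] = this.nodeNext[cur];
      this.nodeNext[cur] = this.freeHead;            // push onto the free list
      this.freeHead = cur;
      return;
    }
  }
}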

Now render each grid cell directly if it is "simple enough" (the Warnock criterion); if not, apply Warnock's subdivision to it.

A grid cell is "simple enough" if, for example, the cell's rectangle is contained entirely within one of the triangles rendered into that cell and that triangle is in front of all the others at all 4 corners of the rectangle (has the minimum depth value there), or if the cell is empty or contains just one primitive. A sketch of that test follows below.
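Something like this, reusing Triangle, contains, and interpolateZ from the raycasting sketch above (the exact criterion is a tuning choice; this version just checks the four cell corners):

interface Rect { x: number; y: number; w: number; h: number }

// "Simple enough": the cell is empty, holds one primitive, or some triangle
// fully covers the cell and is nearest at all four of its corners.
function isSimpleEnough(cell: Rect, tris: Triangle[]): boolean {
  if (tris.length <= 1) return true;
  const corners: [number, number][] = [
    [cell.x, cell.y], [cell.x + cell.w, cell.y],
    [cell.x, cell.y + cell.h], [cell.x + cell.w, cell.y + cell.h],
  ];
  return tris.some(t =>
    corners.every(([x, y]) =>
      contains(t.v[0], t.v[1], t.v[2], x, y) &&
      tris.every(o => o === t ||
        interpolateZ(t.v[0], t.v[1], t.v[2], x, y) <=
        interpolateZ(o.v[0], o.v[1], o.v[2], x, y))));
}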

That said, I don't really do this for real-time display purposes. It might be quite difficult to do this efficiently enough on complex meshes in realtime.

I mainly do it for things like marquee and lasso selection of vertices/edges/polygons in 3D software on very dense meshes, where approximating with a fixed pixel resolution would risk missing unoccluded primitives. The user could zoom far out from a mesh, and we don't want lasso and marquee selections to miss a whole bunch of sub-pixel primitives. The appeal of resolution-independent Warnock here is that you can recursively apply the algorithm as deep as you need until you get that "simple enough" result, which could be a rectangle much smaller than a pixel.

It might also be useful for antialiasing with reasonably efficient sub-sampling (since it won't sub-sample a pixel that already has full coverage, for example). I never actually used this in a rasterization context.

Raytracing is also fun because of all the options it opens up for indirect lighting, caustics, DOF, etc., though it's very computationally expensive, as pointed out by Karel. That said, I have found that with a good BVH I can do real-time raytracing these days at pretty high resolutions if I'm mostly doing direct lighting.

Here's a little example I dug up of raytracing a million-triangle mesh in real time on the CPU, which I whipped up some years back. This was on my i3 at 1600x1200 pixels, and it only took a day to code. The GIF really degraded the quality and frame rate (it was originally over ~120 FPS), but hopefully you get the idea:

[animation: real-time CPU raytracing of a million-triangle mesh]

The main downside to me with realtime raytracing on the CPU (and the GPU as well) is actually not the rendering itself. While I could render basic materials and lighting in realtime quite easily on an i3 (and this was not even optimized code, just some basic SIMD and parallel loops in C), it would be much more difficult if that million-triangle mesh were being deformed every frame. Then I would have to update a BVH storing over a million triangles at over 100 FPS, which I have no idea how to do fast enough.

That said, there is one piece of software out there that actually rasterizes millions of polygons' worth of deforming data in realtime. It's called ZBrush:

[image: ZBrush rasterizing and sculpting a very dense deforming mesh]

I have no idea how they manage it, though. They might use LOD, or voxelize the mesh very quickly while the user deforms it with a brush; to me it doesn't really matter, since they let you control things at the per-vertex and per-polygon level, show you wireframes, and let you load and save polygonal meshes. Either way, it handles millions of polygons' worth of data (it even managed to rasterize and deform meshes spanning over 20 million polygons 17 years ago, which is still unmatched today) while letting the user sculpt and control the result at a per-vertex level, all at interactive frame rates and without using the GPU for rasterization. They've got some kind of voodoo programming going on there as far as I can see, though I'd trade a finger to find out how they do it.
