I was going to write a long-winded post, but I'll boil it down here:
I'm trying to emulate the old-school graphical style of the NES in XNA. However, my frame rate is slow when I modify 65K pixels per frame. If I just loop through all 65K pixels and set them to some arbitrary color, I get 64 FPS. With the code I wrote to look up which colors should be placed where, I get 1 FPS.
I think it is because of my object-oriented code.
Right now, I have things divided into about six classes, with getters/setters. I'm guessing that I'm calling at least 360K getters per frame, which I think is a lot of overhead. Each class contains 1D and/or 2D arrays of custom enumerations, int, Color, Vector2D, or byte values.
What if I combined all of the classes into just one, and accessed the contents of each array directly? The code would look a mess, and ditch the concepts of object-oriented coding, but the speed might be much faster.
I'm also not concerned about access violations, as any attempts to get/set the data in the arrays will be done in blocks. E.g., all writing to the arrays will take place before any data is read from them.
As for casting, I stated that I'm using custom enumerations, int, Color, Vector2D, and byte. Which data types are fastest to use and access in C# on the .NET Framework, XNA, and the Xbox? I think that constant casting might be a cause of the slowdown here.
Also, instead of using math to figure out which indexes data should be placed in, I've used precomputed lookup tables so I don't have to use constant multiplication, addition, subtraction, division per frame. :)
There's a terrific presentation from GDC 2008 that is worth reading if you are an XNA developer. It's called Understanding XNA Framework Performance.
For your current architecture - you haven't really described it well enough to give a definite answer - you probably are doing too much unnecessary "stuff" in a tight loop. If I had to guess, I'd suggest that your current method is thrashing the cache - you need to fix your data layout.
In the ideal case you should have a nice big array of small-as-possible value types (structs not classes), and a heavily inlined loop that shoves data into it linearly.
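As a rough illustration of that layout, here is a minimal sketch in plain C#. The class name, the idea of storing one palette index per pixel, and the packed `uint` colour format are all assumptions made for the example; they are not taken from the asker's code.

```csharp
using System;

// Hypothetical helper: every name here is invented for illustration.
public static class FlatFrameBuffer
{
    // Resolves one palette index per pixel into a packed colour.
    // All three arrays are flat, and the two per-pixel arrays are
    // walked linearly, so memory is touched in order.
    public static void Resolve(byte[] paletteIndices, uint[] palette, uint[] output)
    {
        if (paletteIndices.Length != output.Length)
        {
            throw new ArgumentException("Index and output buffers must be the same size.");
        }

        for (int i = 0; i < output.Length; i++)
        {
            // One array read, one small table read, one array write per pixel.
            output[i] = palette[paletteIndices[i]];
        }
    }
}
```

In an XNA game the `output` array would presumably be uploaded once per frame with `Texture2D.SetData`, rather than being pushed through several layers of objects for each pixel.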
(Aside: regarding what is fast: Integer and floating point maths is very fast - in general, you shouldn't use lookup tables. Function calls are pretty fast - to the point that copying large structs when you pass them will be more significant. The JIT will inline simple getters and setters - although you shouldn't depend on it to inline anything else in very tight loops - like your blitter.)
HOWEVER - even if optimised - your current architecture sucks. What you are doing flies in the face of how a modern GPU works. You should be loading your sprites onto your GPU and letting it composite your scene.
If you want to manipulate your sprites at a pixel level (for example: palette swapping as you have mentioned) then you should be using pixel shaders. The CPU on the 360 (and on PCs) is fast, but the GPU is so much faster when you're doing something like this!
The Sprite Effects XNA sample is a good place to get started.
Have you profiled your code to determine where the slowdown is? Before you go rewriting your application, you ought to at least know which parts need to be rewritten.
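If no profiler is at hand, even a crude timer around each stage of the frame can show where the time goes. The following is a minimal sketch using `System.Diagnostics.Stopwatch`; the `FrameTimer` class and the idea of splitting the frame into separately timed stages are assumptions for illustration only.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper for coarse, manual timing of one stage of a frame.
public static class FrameTimer
{
    // Runs the given stage once and returns how long it took in milliseconds.
    public static double Measure(Action stage)
    {
        Stopwatch watch = Stopwatch.StartNew();
        stage();
        watch.Stop();
        return watch.Elapsed.TotalMilliseconds;
    }
}
```

Timing the colour look-up and the pixel write separately, for example, would show which of the two accounts for the drop from 64 FPS to 1 FPS. A real profiler gives far more detail, but this is often enough to rule out a wrong guess.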
I strongly suspect that the overhead of the accessors and data conversions is trivial. It's much more likely that your algorithms are doing unnecessary work, recomputing values that they could cache, and other things that can be addressed without blowing up your object design.
Are you specifying a color and such for each pixel? If that is the case, I think you should really think about the architecture some more. Start using sprites; that will speed things up.
EDIT
Okay, I think your solution could be to load several small sprites (a few pixels each) in different colours and reuse those. It is faster to point to the same sprite than to assign a different colour to each pixel, as the sprite has already been loaded into memory.
As with any performance problem, you should profile the application to identify the bottlenecks rather than trying to guess. I seriously doubt that getters and setters are at the root of your problem. The compiler almost always inlines these sorts of functions. I'm also curious what you have against math. Multiplying two integers, for instance, is one of the fastest things the computer can do.
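To make the point about arithmetic concrete, here is a minimal sketch of computing a pixel's position in a flat buffer directly instead of reading it from a precomputed table. The `PixelIndex` class and its method names are hypothetical, and the row-major layout is an assumption about how the buffer is stored.

```csharp
using System;

// Hypothetical helper: computes flat-buffer positions with plain integer maths.
public static class PixelIndex
{
    // Row-major index of pixel (x, y) in a buffer that is 'width' pixels wide.
    public static int ToIndex(int x, int y, int width)
    {
        return y * width + x;
    }

    // Fills a rectangle of the buffer with one value, using one
    // multiplication per row and one addition per pixel.
    public static void FillRect(uint[] buffer, int width, int left, int top,
                                int rectWidth, int rectHeight, uint value)
    {
        for (int y = top; y < top + rectHeight; y++)
        {
            int rowStart = y * width;
            for (int x = left; x < left + rectWidth; x++)
            {
                buffer[rowStart + x] = value;
            }
        }
    }
}
```

A lookup table replaces that multiplication with a memory read, which is unlikely to be cheaper and can be slower if the table falls out of cache. Only measurement can confirm which is faster in a particular program.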