I'm developing a game for iPhone in OpenGL ES 1.1. I have a lot of textured quads in a data structure where each node has a list of child nodes. I traverse the structure from the root, rendering each quad, then its children, and so on.
The thing is, for each quad I'm calling glVertexPointer to set the vertices.
- Should I avoid calling it for each quad? Would calling it just once, for example, improve performance?
- Does glVertexPointer copy the vertices to GPU memory, or does it just save the pointer?
Trying to minimize the number of calls will not be easy since each node may have a different quad. I have a lot of identical sprites with the same vertex data, but I'm not necessarily rendering them one after another, since I may be drawing a different sprite between them.
Thanks.
glVertexPointer keeps just the pointer, but incurs a state change in the OpenGL driver and an explicit synchronisation, so costs quite a lot. Normally when you say 'here's my data, please draw', the GPU starts drawing and continues to do so in parallel to whatever is going on on the CPU for as long as it can. When you change rendering state, it needs to finish whatever it was doing in the old state. So by changing once per quad, you're effectively forcing what could be concurrent processing to be consecutive. Hence, avoiding glVertexPointer (and, presumably, a glDrawArrays or glDrawElements?) per quad should give you a significant benefit.
An immediate optimisation is simply to keep a count of the number of quads in total in the data structure, allocate a single target buffer for vertices that is at least that size and have all quads copy their geometry into the target buffer rather than calling glVertexPointer each time. Then call glVertexPointer and your drawing calls (condensed to just one call also, hopefully) with the one big array at the end. It's a bit more costly on the CPU side but the parallelism and lack of repeated GPU/CPU synchronisations should save you a lot.
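As a rough sketch of the CPU side of that batching idea: the Node and Batch types and the function names below are hypothetical, not taken from your code. It assumes axis-aligned quads with 2D positions, and that every quad in the batch can be drawn with the same texture and state. Each quad is written as two triangles, since ES 1.1 has no GL_QUADS. Texture coordinates would be appended to a parallel array in the same way. The GL calls are left as a comment because they need a live context.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scene node: one axis-aligned quad plus its child nodes. */
typedef struct Node {
    float x, y, w, h;            /* quad origin and size */
    struct Node **children;
    size_t child_count;
} Node;

/* Hypothetical batch: a flat array of 2D positions, two floats per vertex. */
typedef struct {
    float *positions;
    size_t vertex_count;
    size_t vertex_capacity;
} Batch;

#define VERTS_PER_QUAD 6         /* two triangles per quad */

/* Append one quad's six vertices to the shared array. */
static void batch_add_quad(Batch *b, const Node *n) {
    /* The buffer must have been sized from the total quad count. */
    assert(b->vertex_count + VERTS_PER_QUAD <= b->vertex_capacity);

    float x0 = n->x, y0 = n->y, x1 = n->x + n->w, y1 = n->y + n->h;
    const float quad[VERTS_PER_QUAD * 2] = {
        x0, y0,  x1, y0,  x0, y1,    /* first triangle  */
        x1, y0,  x1, y1,  x0, y1,    /* second triangle */
    };

    float *dst = b->positions + b->vertex_count * 2;
    for (size_t i = 0; i < VERTS_PER_QUAD * 2; ++i) {
        dst[i] = quad[i];
    }
    b->vertex_count += VERTS_PER_QUAD;
}

/* Depth-first traversal: the node's quad first, then its children. */
static void batch_collect(Batch *b, const Node *n) {
    batch_add_quad(b, n);
    for (size_t i = 0; i < n->child_count; ++i) {
        batch_collect(b, n->children[i]);
    }
}

/* Once per frame, after batch_collect(&batch, root), a single submission
 * would replace the per-quad calls:
 *
 *   glVertexPointer(2, GL_FLOAT, 0, batch.positions);
 *   glDrawArrays(GL_TRIANGLES, 0, (GLsizei)batch.vertex_count);
 */
```

Reset vertex_count to 0 at the start of each frame and reuse the same allocation, so the only per-frame cost is the copy itself.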
While tiptoeing around topics currently under NDA, I strongly suggest you look at the Xcode 4 beta. Amongst other features Apple have stated publicly to be present is an OpenGL ES profiler. So you can easily compare approaches.
To copy data to the GPU, you need to use a vertex buffer object. That means creating a buffer with glGenBuffers, binding it with glBindBuffer, pushing data to it with glBufferData and then posting a glVertexPointer with an address of e.g. 0 if the first byte in the data you uploaded is the first byte of your vertices. In ES 1.x, you can upload data as GL_DYNAMIC_DRAW to flag that you intend to update it quite often and draw from it quite often. It's probably worth doing if you can get into a position where you're drawing more often than you're uploading.
If you ever switch to ES 2.x there's also GL_STREAM_DRAW, which may be worth investigating but isn't directly relevant to your question. I mention it as it'll likely come up if you Google for vertex buffer objects, being available on desktop OpenGL. Options for ES 1.x are only GL_STATIC_DRAW and GL_DYNAMIC_DRAW.
I've just recently worked on an iPad ES 1.x application with objects that change every frame but are drawn twice per the rendering pipeline in use. There are only five such objects on screen, each 40 vertices, but switching from the initial implementation to the VBO implementation cut 20% off my total processing time.