Is there a maximum number of assembly language instructions that can be loaded into the fragment program unit? I have an algorithm to port from CPU to GPU, and apparently it doesn't fit on the GPU.
There are several hard and soft limits, some of which are not immediately obvious:
- Instruction slots: The total number of instructions that the hardware can accommodate in local memory.
- Executed instructions: The maximum number of instructions that will execute (including instructions that run several times in a loop).
- A single GLSL instruction can map to a dozen or more instructions
- Several GLSL instructions can map to a single instruction depending on the optimizer's quality (e.g. multiply-add, dot, lerp)
- With only 32 temp registers, pre-SM4 hardware may need more instructions than strictly necessary (no such problem on SM4 hardware, which has 4096).
- Swizzling usually costs no extra instructions nowadays, but it does on some older hardware and may in some situations on other hardware (writes to gl_FragColor are a likely candidate)
- Regardless of actual instructions, OpenGL 2.0 compatible hardware is limited to 8 dependent texture fetches (unlimited on hardware that can do OpenGL 2.1 or better)
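To make the difference between instruction slots and executed instructions concrete, here is a minimal sketch in C. The `ShaderShape` struct and the three helper functions are hypothetical names invented for illustration; the model assumes a shader made of straight-line code plus one loop and ignores any overhead instructions the loop itself may need.

```c
/* Hypothetical shape of a shader: straight-line code plus a single loop. */
typedef struct {
    unsigned long straight_line; /* instructions outside the loop */
    unsigned long loop_body;     /* instructions inside the loop body */
    unsigned long iterations;    /* number of times the loop runs */
} ShaderShape;

/* Slots needed when the hardware can loop: the body is stored once. */
unsigned long slots_with_loop(ShaderShape s)
{
    return s.straight_line + s.loop_body;
}

/* Slots needed when the compiler must unroll: one body copy per iteration. */
unsigned long slots_unrolled(ShaderShape s)
{
    return s.straight_line + s.loop_body * s.iterations;
}

/* Executed instructions: every iteration counts, unrolled or not. */
unsigned long executed_instructions(ShaderShape s)
{
    return s.straight_line + s.loop_body * s.iterations;
}
```

For a shape of 20 straight-line instructions and a 30-instruction body run 64 times, this gives 50 slots with a real loop but 1940 slots once unrolled, and 1940 executed instructions either way. That is how a shader that looks short in source can still overflow a 512-slot limit.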
You have these guaranteed minimums (most cards have more):
- 512 instruction slots for vertex and pixel shaders on OpenGL 2.x (SM3) capable hardware
- 65536 executed instructions on that same SM3 hardware
- 4096 vertex and 65536 pixel shader instruction slots on OpenGL 3.x (SM4) hardware
- 65536 executed vertex shader instructions and unlimited executed pixel shader instructions on SM4 hardware
- At least 24 dynamic branches possible on 2.x (SM3) hardware
- Fully dynamic branching (no limits) on SM4 hardware
- Only conditional move is available on SM2.x; everything else must be accommodated by code duplication and loop unrolling, or compilation fails
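As a rough sketch of how these minimums could be used as a budget check, the C snippet below hard-codes the pixel shader figures from the list above. The `FragmentLimits` struct, the `UNLIMITED` marker, and `fits` are hypothetical names for illustration only. Real limits should be read from the driver; for ARB assembly fragment programs, for instance, `glGetProgramivARB` with `GL_MAX_PROGRAM_NATIVE_INSTRUCTIONS_ARB` reports the native instruction limit.

```c
#include <stdbool.h>

/* Marker meaning "no cap on this quantity" (an assumption of this sketch). */
#define UNLIMITED 0UL

/* Hypothetical record of pixel shader limits for one hardware class. */
typedef struct {
    const char   *name;
    unsigned long max_slots;    /* instruction slots */
    unsigned long max_executed; /* executed instructions, or UNLIMITED */
} FragmentLimits;

/* Figures copied from the guaranteed minimums listed above, not queried. */
const FragmentLimits SM3_FRAGMENT = { "SM3", 512UL,   65536UL   };
const FragmentLimits SM4_FRAGMENT = { "SM4", 65536UL, UNLIMITED };

/* True if a shader with the given counts stays within both limits. */
bool fits(const FragmentLimits *lim, unsigned long slots, unsigned long executed)
{
    bool slots_ok    = slots <= lim->max_slots;
    bool executed_ok = lim->max_executed == UNLIMITED
                    || executed <= lim->max_executed;
    return slots_ok && executed_ok;
}
```

With these numbers, a shader needing 1940 slots fails `fits(&SM3_FRAGMENT, 1940, 1940)` but passes the same check against `SM4_FRAGMENT`. Since most cards offer more than the minimum, a failed check here only means the shader is not guaranteed to load.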
There is a limit on the number of instructions a shader can have. As far as I know, it varies from GPU to GPU. If your shader is too large, compilation will fail with an error.