(Some may consider small parts of this off-topic, but it's still half on-topic, and I'd like to show alternatives, so I'm posting it with this warning.)
I've researched this a lot, as I'm currently creating some landscape tools and materials and want to make them exceptional. I'm trying to incorporate a great deal of functionality while still having them perform well, so material performance optimization has been a huge topic for me. Many parts of my materials felt hugely optimizable with dynamic branching, so that unneeded calculations wouldn't be performed.
Now, after figuring out the fine details of dynamic branching, I've come to the conclusion that it is mostly useful in small areas, where predictable operations are performed entirely within the custom expression, and only when both branches together won't add more complexity/calculations to the material (because both branches may still be processed - see previous post!). Using it for external branch optimization, while still functional, is rather something to avoid in the first place. I now consider it something to reach for only in late-stage material optimization, very sparingly and carefully, as it can involve more testing than I'd like. It also seems to me something to avoid when creating materials that may be used in unknown environments or with unknown assets.
So I'd rather look into alternative ways to optimize the performance of my landscape materials. These could be:
- Clever combinations of instructions that give the same result as a "proper" calculation but get there with far fewer instructions
- Using texture samples/maps instead of calculations to look up the wanted result (or, in some cases, the opposite)
- Using runtime virtual textures to cache some of the results - remember, you can always do more calculations on top of the cached texture afterwards, even within the landscape material, so cache only the "predictable" part of the landscape layer that involves no camera-dependent values (camera position, pixel depth…)
- Avoiding inefficient texture operations, like non-power-of-two texture sizes or offsetting texture UVs
- Moving calculations from the pixel shader to the vertex shader - this may be difficult with some very advanced landscape materials, but I'm still considering it and looking for opportunities
- Repeating the same nodes for the same calculations (the editor does a great job of collapsing those and avoiding unnecessary instructions) rather than building a different node setup for the same calculation - material functions help greatly in that regard
- More lerps, as a semi-branching feature, since it seems shader compilers will optimize those (not 100% certain yet, but some things point to this)
- Static switches still do a great job of increasing performance
- Quality switches are a built-in form of branching that already works great and will optimize a material based on the quality level set - for example, someone asking for low quality doesn't need roughness computed from a roughness map, a heightmap, and the normals just to get proper specular reflections at distance and at grazing view angles.
- **(update)** Here's one based on how both the GPU and the editor-to-HLSL compiler work and process things: some operations may perform better if "unrolled". Take, as an example, an iteration inside a custom expression, which could run more efficiently as nodes because of the way the GPU processes things. Conversely, heavily repeated nodes that generate HLSL code with a lot of "local??" variables may be rewritable in a way that collapses them into far fewer locals and performs faster. This is something to look into case by case.
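To illustrate the instruction-combining point above, here is a small hedged HLSL sketch (the function names are my own, purely illustrative, not from any specific material):

```hlsl
// Sketch only - illustrative HLSL, names are hypothetical.
// Average of three channels: a single dot product instead of
// two adds followed by a multiply.
float Average3(float3 Color)
{
    return dot(Color, float3(1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0));
}

// pow(x, 4) written as two multiplies - often cheaper than the
// generic pow path, which may expand to exp2(4 * log2(x)).
float Pow4(float X)
{
    float X2 = X * X;
    return X2 * X2;
}
```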
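On the lerp point: a lerp driven by a 0/1 mask acts as a branch-free select. A minimal HLSL sketch, assuming a hypothetical two-result blend (both inputs are still evaluated, but the compiler can often simplify constant or trivially derived masks):

```hlsl
// Sketch only. Mask == 0 picks CheapResult, Mask == 1 picks
// ExpensiveResult, values in between blend - no branch is emitted.
float BlendLayers(float CheapResult, float ExpensiveResult, float Mask)
{
    return lerp(CheapResult, ExpensiveResult, saturate(Mask));
}
```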
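And for the unrolling point, a hedged sketch of the same 4-tap accumulation written as a loop with HLSL's `[unroll]` attribute, which asks the compiler to replicate the body and remove the loop overhead. Whether the rolled or unrolled form wins depends on the GPU and should be profiled case by case; all names here are my own:

```hlsl
// Sketch only - hypothetical 4-tap accumulation.
float AccumulateTaps(Texture2D Tex, SamplerState Samp, float2 UV, float2 Step)
{
    float Sum = 0.0;
    [unroll] // replicate the body four times instead of looping
    for (int i = 0; i < 4; i++)
    {
        Sum += Tex.Sample(Samp, UV + Step * i).r;
    }
    return Sum * 0.25;
}
```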
There could be more, but these come to mind for now and will hopefully help those who were looking for solutions around branching.