Hi,
I noticed that encapsulating two material functions into a larger material function can obstruct shader code optimizations. In such cases, placing the inner material function nodes directly in the material asset yields the expected optimized code.
I have tried to come up with a very simple example to illustrate the issue.
Please refer to the attached images below for guidance.
Assume two hypothetical material functions, acting like a multiplexer/demultiplexer pair:
- FEncoder : receives several inputs, operates on them, and produces a single output value
- FDecoder : receives a single input, and outputs multiple values
We could now explicitly place and connect both of them in a material, or we could encapsulate them within an FCodec material function and place this abstraction in the material instead.
If we use the FCodec node, as in MTestJoined, and inspect the generated HLSL code, we can see that each of its outputs makes its own invocation of the FEncoder logic, even though the inputs did not change. (In this case the logic is the Custom Expression inside FEncoder, but it could just as well be a standard material node network.)
On the other hand, if we place FEncoder and FDecoder directly in the material instead, as in MTestSplit, only one invocation is made, as expected.
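To make the difference concrete, here is a small C++ sketch of the two code shapes described above. It is not the HLSL that the material compiler actually emits; `Encoder`, `DecoderOutput`, `EvaluateJoined`, `EvaluateSplit` and the placeholder math are all hypothetical names invented for illustration. A call counter stands in for the cost of the FEncoder logic.

```cpp
#include <array>

// Hypothetical counter standing in for the cost of the FEncoder logic.
static int g_encoder_calls = 0;

// Hypothetical stand-in for FEncoder: several inputs, one output.
float Encoder(float a, float b, float c) {
    ++g_encoder_calls;
    return a + 2.0f * b + 4.0f * c;  // placeholder math, not the real logic
}

// Hypothetical stand-in for a single FDecoder output pin.
float DecoderOutput(float encoded, int index) {
    return encoded * static_cast<float>(index + 1);
}

// Shape reported for MTestJoined: every FCodec output re-invokes the
// encoder with the same, unchanged inputs.
std::array<float, 3> EvaluateJoined(float a, float b, float c) {
    return {
        DecoderOutput(Encoder(a, b, c), 0),
        DecoderOutput(Encoder(a, b, c), 1),
        DecoderOutput(Encoder(a, b, c), 2),
    };
}

// Shape reported for MTestSplit: the encoder is invoked once and its
// result is shared by all decoder outputs.
std::array<float, 3> EvaluateSplit(float a, float b, float c) {
    const float encoded = Encoder(a, b, c);
    return {
        DecoderOutput(encoded, 0),
        DecoderOutput(encoded, 1),
        DecoderOutput(encoded, 2),
    };
}
```

Both functions return the same values; only the number of `Encoder` invocations differs (three versus one in this sketch, one per output in the joined case).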
Granted, for such simple material functions the resulting number of machine instructions will be the same, since the shader optimizer does a good job of detecting the redundant calls. However, in many of our real, more complex materials the shader optimizer is unable to detect the redundancy, and we end up paying a heavy performance toll for it (especially if the custom expression logic being invoked contains loops and the like).
Is there any way to assist the shader code generator in this matter?
(For the time being, I quickly hacked a simple variable caching mechanism to prevent redundant work, but my caching system can only handle one instance of FEncoder per material, which limits the work of the technical artists.)
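For readers unfamiliar with the idea, a single-slot cache of this kind might look roughly like the C++ sketch below. This is an assumption about the general technique, not the actual workaround used in our materials, and `EncoderCache`, `ExpensiveEncode` and `CachedEncode` are hypothetical names. It also shows why one slot cannot serve two FEncoder instances with different inputs.

```cpp
// Hypothetical single-slot cache: one flag and one stored result.
struct EncoderCache {
    bool  valid = false;
    float value = 0.0f;
};

// Hypothetical counter to observe how often the expensive path runs.
static int g_expensive_calls = 0;

// Hypothetical stand-in for costly encoder logic containing a loop.
float ExpensiveEncode(float a, float b) {
    ++g_expensive_calls;
    float acc = 0.0f;
    for (int i = 0; i < 8; ++i) {
        acc += a * b;  // placeholder work
    }
    return acc;
}

// Computes the result on the first call and reuses it afterwards.
// The inputs are NOT part of the cache key, so a second encoder
// instance with different inputs would receive the first result.
float CachedEncode(EncoderCache& cache, float a, float b) {
    if (!cache.valid) {
        cache.value = ExpensiveEncode(a, b);
        cache.valid = true;
    }
    return cache.value;
}
```

Calling `CachedEncode` twice with the same cache runs the loop only once, which removes the redundant work. Calling it again with different inputs still returns the stored value, which is the one-instance-per-material limitation mentioned above.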
Thanks in advance.