Shader Dynamic branch not working?

kurylo3d · July 7, 2024, 2:35pm

Well, I don’t see anything on that tiny chart either… you have to look at the ms counter for the GPU on the right-hand side of stat unit.

And it’s technically a dynamic parameter on a material instance… I moved it back and forth without saving the material instance, and the performance changed as I did.

Arkiras · July 7, 2024, 2:36pm

There’s no difference. The graph would be spiking if dynamic branching actually worked here, but it isn’t spiking, because the branching isn’t working.

kurylo3d · July 7, 2024, 2:37pm

Doesn’t spike for me… and if it did, how would you see it? Look at that red line at the bottom… if it did spike, it would spike 1 pixel on your monitor, barely noticeable. Check the ms counter for the GPU really quick and see if there’s any difference.

Arkiras · July 7, 2024, 2:37pm

That’s my point entirely…

kurylo3d · July 7, 2024, 2:38pm

The numbers do though.

Also, for your example… it probably wouldn’t benefit anything anyway. You don’t even use that many instructions or texture samples.

Arkiras · July 7, 2024, 2:39pm

It’s 1281 instructions in the noise branch. How many would be a valid test? Because I can just keep adding more, I promise you the result isn’t going to change.

kurylo3d · July 7, 2024, 2:40pm

I just tested with your method… the whole time-based changing of the alpha with the ceil and sin, etc.

That worked for me as well… I save between 0.4 ms and 0.6 ms on the GPU.

It fluctuates back and forth… high to low as expected.
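
For anyone following along, here is a minimal sketch of what a time-driven switch built from ceil and sin might compute. The exact node setup isn’t shown in the thread, so the formula (ceil(sin(time)) clamped to 0..1) and the name `branch_alpha` are assumptions, and it is written in Python only to show the values, not as shader code.

```python
import math


def branch_alpha(time_seconds: float) -> float:
    """Hypothetical toggle value: 1.0 while sin(time) is positive, else 0.0."""
    # ceil(sin(t)) is 1 for sin in (0, 1], 0 for sin in (-1, 0],
    # and -1 only when sin is exactly -1, so clamp to the 0..1 range.
    raw = math.ceil(math.sin(time_seconds))
    return float(min(1, max(0, raw)))


# Sample the toggle at a few points in time (seconds).
samples = [branch_alpha(t) for t in (0.5, 2.0, 4.0, 5.5, 7.0)]
print(samples)  # [1.0, 1.0, 0.0, 0.0, 1.0]
```

With that assumed formula the switch flips roughly every 3.14 seconds (half the period of sin), which would line up with GPU time bouncing between a high and a low value.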

kurylo3d · July 7, 2024, 2:42pm

Maybe a DX12 feature? Could also be your GPU.

Arkiras · July 7, 2024, 2:43pm

I’m on DX12 on a 3070, for reference

kurylo3d · July 7, 2024, 2:43pm

Don’t know what to tell you… try adding some 4K texture samples in there and see what happens.

kurylo3d · July 7, 2024, 2:47pm

This is with your time-fluctuating alpha switching back and forth.

Arkiras · July 7, 2024, 3:21pm

I mean, maybe it behaves differently for texture samplers, I don’t know. I honestly don’t care enough to test it, because successfully branching in one synthetic test case isn’t very helpful. I’ve never seen this work, and the generated HLSL code indicates it shouldn’t work. You’re the first person I’ve seen say it does, and honestly, shaving two tenths of a millisecond off a ~1300 instruction material isn’t very convincing, even on a 3090.

This is what I would expect to see when dynamic branching actually works:

This is what you get with all the branch logic inside of a custom node:

(Please forgive this code, I’m not a programmer or a tech artist, just a regular artist)

My opinion is still the same as what I said in my very first reply: Don’t expect it to work unless the code for the branches is all inside a custom node.
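
To make the distinction concrete, here is a rough CPU-side analogy in Python, not shader code. It assumes the non-branching case behaves like a select that evaluates both sides before picking one, while a real dynamic branch only runs the side that is taken. The function names and the call counter are hypothetical, made up purely for illustration.

```python
# Counts how often each side is actually evaluated.
calls = {"expensive": 0, "cheap": 0}


def expensive_noise(x: float) -> float:
    """Stand-in for the costly side (e.g. a large noise function)."""
    calls["expensive"] += 1
    return sum((x * i) % 1.0 for i in range(1, 1000))


def cheap_constant(x: float) -> float:
    """Stand-in for the cheap side."""
    calls["cheap"] += 1
    return 0.0


def flattened_select(alpha: float, x: float) -> float:
    # Both sides are computed up front, then one result is chosen.
    a = expensive_noise(x)
    b = cheap_constant(x)
    return a if alpha > 0.5 else b


def dynamic_branch(alpha: float, x: float) -> float:
    # Only the side that is taken is ever computed.
    if alpha > 0.5:
        return expensive_noise(x)
    return cheap_constant(x)
```

Calling `flattened_select(0.0, 0.3)` still pays for `expensive_noise`, while `dynamic_branch(0.0, 0.3)` skips it. That skipped work is the saving you would expect to show up in GPU time if the branch were real.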

kurylo3d · July 7, 2024, 4:01pm

kurylo3d · July 7, 2024, 4:04pm

kurylo3d · July 7, 2024, 4:05pm

Perhaps my graph would bounce more if I had that many instructions. I just have no idea how to get my instruction count that high… it seems like adding 100 mults doesn’t add 100 instructions for some reason.

Is there a way to zoom in on that stat unitgraph, or shrink the value range so you can see spikes? I kind of see my yellow line going up and down on the graph, but the graph is so small it’s barely noticeable.

Frenetic · July 7, 2024, 4:08pm

In general, when the shader is compiled down, something ‘simple’ like a multiply can be nested and folded together by the compiler, so even a dozen multiply calls can end up as one instruction, à la: mul(mul(mul(mul(x, y), 1), 2), 3), etc.
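
A small sketch of that idea in Python, assuming the extra multipliers are constants known at compile time. Whether a particular shader compiler actually folds them this way is an assumption here, and the helper names are hypothetical.

```python
from functools import reduce
from operator import mul


def chained(x, y, constants):
    """One multiply per constant, the way the node graph is drawn."""
    result = x * y
    for c in constants:
        result = result * c
    return result


def folded(x, y, constants):
    """All constants pre-multiplied into a single factor first."""
    k = reduce(mul, constants, 1)  # would happen once, at compile time
    return x * y * k


# Integers are used so the two orderings agree exactly.
print(chained(2, 5, [1, 2, 3]), folded(2, 5, [1, 2, 3]))  # 60 60
```

If the compiler does this, a hundred constant multiplies in the graph cost about the same as one, which would explain the instruction count barely moving.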

Specific to this case, one can often swap out a power function, which is ‘expensive’ just to load, let alone execute, for a chain of a few multiplies. IMHO, if you aren’t raising something to a power tens of times, just multiply it out. At least this is my understanding: not all functions on the GPU are equal in cost.
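
As a quick illustration of multiplying it out, here is a Python sketch for a fixed exponent of 4. The function name is hypothetical, and the relative cost on an actual GPU is not measured here.

```python
import math


def pow4_by_multiply(x: float) -> float:
    """x to the 4th power using two multiplies instead of a pow call."""
    x2 = x * x      # x^2
    return x2 * x2  # x^4


print(pow4_by_multiply(3.0), math.pow(3.0, 4))  # 81.0 81.0
```

This only works when the exponent is a small fixed integer. A variable or fractional exponent still needs a real power function.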

kurylo3d · July 7, 2024, 4:10pm

I didn’t know that, thanks.

Arkiras · July 7, 2024, 4:19pm

This guy notes specifically that the if node (which is the “dynamic branch” node) does not actually branch and he demonstrates it in the video. His “working” branch is using an if branch inside of a custom node, but I cannot reproduce that either. For all I know, it works in some cases, I don’t know. Regardless, it’s clearly unreliable.

His result is also very strange, because his GPU time drops when it takes the cheaper branch, but his performance is still extremely poor (69ms to return a constant??) so I really don’t know what to make of this.

Arkiras · July 7, 2024, 4:30pm

By the way, according to the documentation the entire reason why Volumetric Cloud materials have a conservative density output is specifically because the material editor doesn’t support dynamic branching:

Anyway I don’t really know what else to add here. If you think you’ve managed to get it working all I can say is more power to you. Based on what I have seen in this thread though, I still won’t be branching for optimization outside of a custom node.

kurylo3d · July 10, 2024, 6:25pm

I mean, I heard somewhere that DX12 added a lot of features, but I’m not sure.