Dynamic shadows artifacts

GeForce GTX 960M. It’s no surprise that UNROLL is faster on that kind of loop. I have never encountered a simple, non-nested loop that was slower with unrolling. Sometimes the benefits are not worth the additional code size, but in this case it’s quite a clear win. I also have to test this on my desktop GPU when I get to the office.
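To illustrate what unrolling does to a simple fixed-count loop, here is a rough CPU-side sketch in C++. It is not the actual shadow shader: the tap count `kTaps` and the function names are made up for the example, and on the GPU the compiler does this transformation for you when the loop is marked for unrolling.

```cpp
#include <array>
#include <cstddef>

// Hypothetical fixed sample count, chosen only for this example.
constexpr std::size_t kTaps = 8;

// Plain loop: one multiply-add plus a counter compare/branch per tap.
float filter_rolled(const std::array<float, kTaps>& samples,
                    const std::array<float, kTaps>& weights) {
    float acc = 0.0f;
    for (std::size_t i = 0; i < kTaps; ++i) {
        acc += samples[i] * weights[i];
    }
    return acc;
}

// Same work unrolled by a factor of 4: fewer branches and counter updates
// per tap, at the cost of a larger loop body (more code size).
float filter_unrolled(const std::array<float, kTaps>& samples,
                      const std::array<float, kTaps>& weights) {
    static_assert(kTaps % 4 == 0, "tap count must be a multiple of the unroll factor");
    float acc = 0.0f;
    for (std::size_t i = 0; i < kTaps; i += 4) {
        acc += samples[i]     * weights[i];
        acc += samples[i + 1] * weights[i + 1];
        acc += samples[i + 2] * weights[i + 2];
        acc += samples[i + 3] * weights[i + 3];
    }
    return acc;
}
```

Both versions add the taps in the same order, so they return the same value. The only difference is how much loop overhead is paid per tap, which is the trade-off against code size mentioned above.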

My directional light has the default angle (1 degree) and I have tuned r.Shadow.MaxSoftKernelSize=18. When the kernel size gets too big, cache misses start to dominate the performance cost.