Dynamic shadows artifacts
-
Oh come on, I just praised Kalle-H; I didn't say Krzysztof is a bad person, don't be so sensitive. Didn't you read my post where I welcomed him to the community?
But if that was hurtful towards him, I apologize.
Last edited by spacegojira; 02-21-2018, 05:19 AM.
-
Code:
#if (!MODULATED_SHADOWS) || (FEATURE_LEVEL >= FEATURE_LEVEL_SM4 && !FORWARD_SHADING && !APPLY_TRANSLUCENCY_SHADOWS)
	FGBufferData GBufferData = GetGBufferData(ScreenUV);
#endif
#if !MODULATED_SHADOWS
	#if USE_PCSS
		#if SPOT_LIGHT_PCSS
			float Attenuation = GetLightInfluenceMask(WorldPosition) * saturate(dot(GBufferData.WorldNormal, DeferredLightUniforms.LightPosition - WorldPosition));
		#else
			float Attenuation = saturate(dot(GBufferData.WorldNormal, DeferredLightUniforms.NormalizedLightDirection));
		#endif
	#else
		// Both spot and directional lights use the same shadowing code. Select the proper direction. No need to normalize.
		half3 Dir = DeferredLightUniforms.LightInvRadius > 0 ? (DeferredLightUniforms.LightPosition - WorldPosition) : DeferredLightUniforms.NormalizedLightDirection;
		float Attenuation = GetLightInfluenceMask(WorldPosition) * saturate(dot(GBufferData.WorldNormal, Dir));
	#endif
	BRANCH
	if (Attenuation > 0)
		// Shadow sampling code.
The normal-pointing-towards-light test and the pixel-inside-spotlight-volume test are equally beneficial in my test scene.
The normal test also combines the unlit ShadingModel test with this PR: https://github.com/EpicGames/UnrealEngine/pull/4441 (unlit pixels do not need a normal, so it's defined as (0,0,0)).
Without that optimization it might still be beneficial to explicitly test whether the pixel's shading model is Unlit.
Also, for some reason subsurface shadows are calculated for all subsurface models, but not all of them use subsurface shadows. I am not sure about the others, but I am sure that MATERIAL_SHADINGMODEL_SUBSURFACE_PROFILE is not using them. In cinematics these pixels can cover a large screen area.
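The early-out described above can be sketched outside HLSL. A minimal Python sketch of the attenuation test, with hypothetical normals and light direction (all values here are illustrative, not from the engine):

```python
def saturate(x):
    # HLSL saturate(): clamp to [0, 1]
    return max(0.0, min(1.0, x))

def dot3(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def needs_shadow_samples(world_normal, light_dir):
    # If the surface faces away from the light, attenuation is 0 and
    # the expensive shadow-sampling loop can be skipped entirely.
    attenuation = saturate(dot3(world_normal, light_dir))
    return attenuation > 0.0

# Pixel facing the light: run the sampling loop.
assert needs_shadow_samples((0.0, 0.0, 1.0), (0.0, 0.0, 1.0))
# Pixel facing away from the light: skip it.
assert not needs_shadow_samples((0.0, 0.0, -1.0), (0.0, 0.0, 1.0))
# Unlit pixels, whose normal the linked PR defines as (0,0,0), also skip it.
assert not needs_shadow_samples((0.0, 0.0, 0.0), (0.0, 0.0, 1.0))
```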
-
-
Doubled the performance of PCF soft shadows by simply adding the UNROLL attribute to the loops, and reordered the inner-loop math: -7 ALU per sample. The default uses 32 samples, so 224 ALU saved in total. https://github.com/EpicGames/UnrealE...ull/4508/files
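A rough, non-HLSL sketch of what the two changes exploit: the sample count is fixed at compile time (so the compiler can fully unroll with the UNROLL attribute), and sample-invariant math is hoisted out of the inner loop. The depth-compare function and the offset kernel here are hypothetical stand-ins, not UE4's actual filter:

```python
NUM_SAMPLES = 32  # fixed at compile time by UE4's quality presets

# Hypothetical 8x4 grid of filter offsets, a stand-in for the real kernel.
OFFSETS = [((i % 8) / 8.0, (i // 8) / 8.0) for i in range(NUM_SAMPLES)]

def pcf(shadow_depth_at, base_uv, pixel_depth, bias):
    # Hoisted out of the loop: the compare threshold does not vary per sample.
    threshold = pixel_depth - bias
    lit = 0.0
    # Fixed trip count: with the UNROLL attribute the HLSL compiler can
    # flatten this loop and fold the constant offsets into straight-line code.
    for ox, oy in OFFSETS:
        u, v = base_uv[0] + ox, base_uv[1] + oy
        lit += 1.0 if shadow_depth_at(u, v) >= threshold else 0.0
    return lit / NUM_SAMPLES

# Uniform far depth map: fully lit.
assert pcf(lambda u, v: 1.0, (0.5, 0.5), 0.5, 0.001) == 1.0
# Uniform near depth map (blocker everywhere): fully shadowed.
assert pcf(lambda u, v: 0.0, (0.5, 0.5), 0.5, 0.001) == 0.0
```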
-
Originally posted by Kalle_H:
Doubled the performance of PCF soft shadows by simply adding the UNROLL attribute to the loops, and reordered the inner-loop math: -7 ALU per sample. The default uses 32 samples, so 224 ALU saved in total. https://github.com/EpicGames/UnrealE...ull/4508/files
-
Originally posted by Deathrey:
That is a surprise. What hardware did you test on?
My directional light has the default angle (1 degree) and I have tuned r.Shadow.MaxSoftKernelSize=18. When the kernel size gets too big, cache misses start to dominate the performance cost.
-
-
Originally posted by Kalle_H:
GeForce GTX 960M. It's no surprise that UNROLL is faster on that kind of loop; I have never encountered a simple, non-nested loop that would be slower with unrolling. Sometimes the benefits are not worth the additional code size, but in this case it's quite a clear win. I have to test this with my desktop GPU as well when I get to the office.
My directional light has the default angle (1 degree) and I have tuned r.Shadow.MaxSoftKernelSize=18. When the kernel size gets too big, cache misses start to dominate the performance cost.
But yeah, considering that UE4 uses quality presets that define the loop iteration count at compile time, there is absolutely no reason not to unroll. As for code size, I think the inflation of shader size and compile time is incomparable to the speed gains in shadow filtering, so it can't even be regarded as a downside.
-
There is a diff between unroll and not:
https://www.diffchecker.com/43fcZ35Z
I use 32 samples for both the search and PCF loops. The shader is quite big (1662 assembly lines) but performance is quite good: it's just 2.2 ms slower than non-soft shadows.
Last edited by Kalle_H; 02-23-2018, 08:08 AM.
-
Originally posted by Deathrey:
Aye, definitely a large improvement. Makes me want to take a look at loops in other shaders and re-check them.
On a side, unrelated note: why are coarse derivatives used? Maybe it is worth looking into fine ones, when available? They should give better biasing.
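For reference, the coarse/fine distinction can be sketched in Python over a single 2x2 pixel quad (the values are made up; in D3D, ddx_coarse may compute one derivative per quad while ddx_fine computes one per pixel pair):

```python
# Some shader quantity v sampled at a 2x2 pixel quad, indexed v[y][x].
v = [[1.0, 4.0],
     [2.0, 8.0]]

def ddx_coarse(v, y):
    # Coarse: one horizontal difference (here from the top row) is
    # computed for the whole quad and reused by every pixel in it.
    return v[0][1] - v[0][0]

def ddx_fine(v, y):
    # Fine: each pixel uses the difference from its own row.
    return v[y][1] - v[y][0]

assert ddx_coarse(v, y=1) == 3.0  # bottom-row pixel still gets the top-row delta
assert ddx_fine(v, y=1) == 6.0    # fine derivative tracks the pixel's own row
```

Whether the extra precision is worth anything for shadow biasing is exactly the open question in the post above.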
-
Originally posted by Kalle_H:
I have to check whether there is any visible difference with ddx_fine. Is there a performance difference, and how much?
Overall, there are a few decisions that I don't understand regarding PCSS in UE4. First: why is the PCF bias only positive? The technique itself kind of implies it being both positive and negative. I agree that a positive-only bias safeguards you from some acne, but it also eats away the shadows where they should be. I totally agree that clamping the bias at some point is a must, but it should not be only positive.
Second: why was an adaptive bias used at all? Following the logic of the conventional PCF implementation in UE4, it might be more consistent (not better, just more consistent) to follow tradition and just use a transition scale and a flat bias.
Thirdly and lastly: why Sobol random? I might be biased here, but I was never able to pull a decent random out of it.
And as a general thought: what about using the blocker search result to reduce the PCF sampling rate in the umbra?
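The positive-only versus symmetric clamp being questioned here is a two-line difference; a Python sketch with arbitrary bias values (not the engine's actual constants):

```python
def positive_only_bias(slope_bias, max_bias):
    # Clamp to [0, max_bias]: any negative bias is discarded.
    return min(max(slope_bias, 0.0), max_bias)

def symmetric_bias(slope_bias, max_bias):
    # Clamp to [-max_bias, max_bias]: negative bias survives, so
    # shadows are not eaten where the receiver tilts the other way.
    return min(max(slope_bias, -max_bias), max_bias)

assert positive_only_bias(-0.003, 0.01) == 0.0   # negative bias thrown away
assert symmetric_bias(-0.003, 0.01) == -0.003    # preserved
# Both variants still clamp runaway bias, which everyone agrees is a must.
assert positive_only_bias(0.5, 0.01) == 0.01
assert symmetric_bias(-0.5, 0.01) == -0.01
```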
-
Originally posted by Deathrey:
No idea, to be fair. I'd expect it to be roughly 4x the cost of coarse. I doubt that would be large enough to be profilable.
Overall, there are a few decisions that I don't understand regarding PCSS in UE4. First: why is the PCF bias only positive? The technique itself kind of implies it being both positive and negative. I agree that a positive-only bias safeguards you from some acne, but it also eats away the shadows where they should be. I totally agree that clamping the bias at some point is a must, but it should not be only positive.
Second: why was an adaptive bias used at all? Following the logic of the conventional PCF implementation in UE4, it might be more consistent (not better, just more consistent) to follow tradition and just use a transition scale and a flat bias.
Thirdly and lastly: why Sobol random? I might be biased here, but I was never able to pull a decent random out of it.
And as a general thought: what about using the blocker search result to reduce the PCF sampling rate in the umbra?
For a reduced sampling rate, I am not sure it's worth it. There are already two early-outs based on the blocker search: if there are no blockers it skips all the samples and there is no shadow, and if all search samples are blocked it also skips all the PCF samples. A reduced sampling rate could be a win without unrolling, using a variable loop counter.
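The two early-outs amount to this control flow (a simplified PCSS sketch in Python; `run_pcf` is a hypothetical placeholder for the full filtering loop, not a UE4 function):

```python
def shadow_with_early_outs(blocker_depths, receiver_depth, run_pcf):
    # Blocker search: keep search samples nearer the light than the receiver.
    blockers = [d for d in blocker_depths if d < receiver_depth]
    if not blockers:
        return 1.0  # early-out 1: no blockers, fully lit, skip all PCF samples
    if len(blockers) == len(blocker_depths):
        return 0.0  # early-out 2: everything blocked, fully in umbra, skip PCF
    avg_blocker = sum(blockers) / len(blockers)
    return run_pcf(avg_blocker)  # penumbra: pay for the full PCF loop

pcf_stub = lambda avg_blocker: 0.5  # stand-in PCF result for the penumbra case
assert shadow_with_early_outs([0.9, 0.8, 0.7], 0.5, pcf_stub) == 1.0  # lit
assert shadow_with_early_outs([0.1, 0.2, 0.3], 0.5, pcf_stub) == 0.0  # umbra
assert shadow_with_early_outs([0.1, 0.9], 0.5, pcf_stub) == 0.5       # penumbra
```

Only the penumbra branch would benefit from a reduced rate, which is why the extra complexity may not pay off.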
Just found another optimization: the filter radius can be premultiplied into PCFUVMatrix. -2 ALU per sample.
Code:
PCFUVMatrix = mul(float2x2(FilterRadius, 0, 0, FilterRadius), PCFUVMatrix);
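Why this is valid: scaling the matrix once up front yields the same UV offsets as multiplying every transformed sample by the radius. A quick Python check (the matrix and offset values are arbitrary):

```python
def mat2_mul(a, b):
    # Row-major 2x2 product, mirroring HLSL's mul(float2x2, float2x2).
    return [[a[0][0]*b[0][0] + a[0][1]*b[1][0], a[0][0]*b[0][1] + a[0][1]*b[1][1]],
            [a[1][0]*b[0][0] + a[1][1]*b[1][0], a[1][0]*b[0][1] + a[1][1]*b[1][1]]]

def mat2_vec(m, v):
    return (m[0][0]*v[0] + m[0][1]*v[1], m[1][0]*v[0] + m[1][1]*v[1])

filter_radius = 0.5
pcf_uv_matrix = [[0.8, -0.6], [0.6, 0.8]]  # arbitrary rotation-like matrix
premultiplied = mat2_mul([[filter_radius, 0.0], [0.0, filter_radius]],
                         pcf_uv_matrix)

offset = (0.25, -0.75)
# Per-sample scaling (2 extra ALU per sample)...
per_sample = tuple(filter_radius * c for c in mat2_vec(pcf_uv_matrix, offset))
# ...equals one transform with the premultiplied matrix.
folded = mat2_vec(premultiplied, offset)
assert all(abs(p - q) < 1e-12 for p, q in zip(per_sample, folded))
```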
-