Hello,
I am writing to report unexpected performance behavior with Virtual Shadow Maps (VSM) when using Nanite’s ‘Voxelize’ shape preservation mode in Unreal Engine 5.7.2.
Summary
When using Nanite meshes with “Voxelize” for shape preservation, increasing r.Shadow.NaniteLODBias causes the shadow cost to increase, which is the opposite of the expected performance gain and the behavior seen with other preservation methods.
Context
- Engine Version: 5.7.2
- Scene: A simple test case with a landscape, a PCG-populated forest of Nanite foliage (static meshes), and a single directional light.
- Settings:
- VSM cache is disabled to measure worst-case performance.
- The Nanite foliage assets are configured with Shape Preservation set to Voxelize.
- The base shadow depth cost is around 5.8ms, which I am trying to optimize.
Observed Behavior
When increasing the r.Shadow.NaniteLODBias value from 0, the shadow cost increases instead of decreasing. The cost only begins to drop after the bias value exceeds 4.
Expected Behavior
I expected the shadow cost to decrease proportionally as r.Shadow.NaniteLODBias increases. This is the behavior I observe with other shape preservation methods such as ‘Preserve Area’ or ‘None’.
I attached a graph showing the performance discrepancy.
Questions
- Is this a known issue, or is this the intended behavior for the ‘Voxelize’ method? If it is intended, could you provide some insight into why it behaves this way?
- Because of this issue, my only remaining lever to reduce VSM cost is to lower the shadow map resolution with r.Shadow.Virtual.ResolutionLodBiasDirectional. However, this causes a much more visible and undesirable degradation in quality. Do you have any advice on other ways to effectively reduce VSM costs when using the Voxelize method?
Thank you for your time and assistance.
ShadowCost.png (162 KB)
Hi Eric,
- In short, this is not unexpected. The voxel rendering is very heavily optimized as an LOD technique; triangles are used in the magnified case. Thus, for unbiased Nanite rendering it only encounters sub-pixel voxels, which it is designed to sample very efficiently. As the voxels become larger, they become boxes that would need a different algorithm to be efficient; for triangles, for example, we use the HW rasterizer for clusters with large triangles. No such path exists for voxels, making it very unlikely that bias will be helpful. In our tests the actual rasterization goes down for a bias of 0, but cluster culling etc. increases, meaning that leaving it at 1 is typically a win (and helps the triangle part).
- What specific resolution bias and resolution are you targeting here, and on what hardware? Since VSM is based on the screen resolution, dynamic resolution also plays a part and the overall budget must take that into account: e.g., if you end up dialing down the dynamic resolution for other parts of the frame, the shadow cost will also respond, so one must be careful about focusing on one feature in isolation. Generally, I’d say a VSM resolution LOD bias of 0 is a reasonable starting point for a dynamic sun. Some other knobs to try:
- r.Shadow.Virtual.MarkCoarsePagesDirectional - if you have little or no atmospheric effects, turning this off completely can help a lot. It remains an unfortunate weak point in the VSM implementation. At the very least for the sake of experimentation you should try without it.
- r.Shadow.Virtual.UseReceiverMaskDirectional should be on by default in 5.7, but make sure it is.
- When testing uncached, make sure to use r.Shadow.Virtual.Cache.ForceInvalidateDirectional; I’m not 100% confident r.Shadow.Virtual.Cache=0 has the same effect (notably around HZB use there could be some difference).
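As a rough sketch, the experiment above could be set up in ConsoleVariables.ini along these lines. The cvar names are the ones listed above; the 0/1 values and the choice of file are assumptions, so check each cvar's help text in your build before relying on them.

```ini
[Startup]
; Assumed value: 0 turns coarse page marking off for directional lights.
r.Shadow.Virtual.MarkCoarsePagesDirectional=0
; Assumed value: 1 enables the receiver mask (expected to be the default already).
r.Shadow.Virtual.UseReceiverMaskDirectional=1
; Assumed value: 1 forces invalidation so the timings reflect uncached rendering.
r.Shadow.Virtual.Cache.ForceInvalidateDirectional=1
; Nanite shadow LOD bias left at 1, as suggested above for voxelized foliage.
r.Shadow.NaniteLODBias=1
```

The same names can also be entered in the in-game console one at a time, which makes it easier to see how much each setting contributes to the shadow depth timing.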
Beyond that, given the state of voxels and VSM in 5.7, I’d advise aiming to use caching for large foliage scenes. In 5.8, there’s an experimental feature coming (r.Shadow.Virtual.PrefilteredDistant) that enables rendering distant geometry at lower resolution, which helps a lot in dense foliage scenes where the sun is dynamic (this path does not use caching at all). It is, however, highly experimental at this stage.
Best,
Ola