Nanite Performance is Not Better than Overdraw Focused LODs [TEST RESULTS]. Epic's Documentation is Endangering Optimization.

tsr then… If you have anything better that I can use in Unreal for upscaling and all that good stuff, please let me know what you think. I think I saw a YouTube video a while back about a plugin for a different upscaling method, possibly from AMD, that works a little differently and is much faster but not as high quality.

Okay, that's a very fair response, so I will try to provide as much data as possible.

But first, we need to be on the same page on what we are discussing here.
This thread is about Nanite Performance. But in your response you shifted very much towards TAA and Lumen. So let’s do a quick detour there.

Lumen is absolutely not production ready, as it is geared towards 60 FPS. That is not even close to enough FPS IMO. We disabled Lumen because we could not get it to work within our 90 FPS budget.

Lumen's main selling point is that it does not require raytracing hardware to achieve photorealistic lighting. But at the same time it costs so much performance that old hardware can barely run it.

As you can see in the Steam Hardware Charts, most users nowadays have an RTX graphics card, as these have become affordable. The RTX 3050 sits at around $200 right now. If you sort by “changed since last month”, you can also see that the number of users with an RTX card is increasing rapidly.
So as a developer creating a game in the next 2 years, I would just go for hardware raytracing.

Now for TAA. In our case it was actually a design choice. We liked the smooth look it created. It gave the environment a “dreamy” look.
Also, since our game is a VR game, we had barely any movement, so the temporal artifacts were not visible.

The point of TAA, or AA in general, is to smooth jagged edges. If you use any kind of upscaler (which I would recommend anyway, free performance), you don’t need AA, because edges will be smoothed in the upscaled image. So in our case it was totally unnecessary.
Upscalers also have the advantage that they improve Nanite performance drastically, as Nanite scales with screen resolution.
When rendering at 2k (native VR), Nanite takes around 2.5ms for culling and rasterization. For us it only took 1.8ms because we upscaled from 75%.

If you want, you can give your take on upscalers. I would be interested in a good discussion there.
But IMO TAA, or any AA, is unnecessary because of upscaling.


Alright back to the topic :smiley:

The resolution of the Quest is 1832 x 1900 for each eye.
So 3664 x 1900 in total. BUT VR has a few rendering quirks that lower the resolution cost.

For example, foveated rendering lets you render at full resolution only in the center of the screen and at lower resolution (blurry) in the peripheral vision.


Using round robin queries, you can run Nanite culling for only one eye at a time, always switching between the left and right eye. With this you save the culling cost for the second camera.
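To make the round robin idea concrete, here is a minimal conceptual sketch. It is not Unreal's implementation: the function names are hypothetical, and it assumes the switch happens once per frame and that the eye that is skipped simply reuses its previous result.

```python
def eye_for_frame(frame_index):
    """Alternate between eyes: even frames -> left, odd frames -> right."""
    return "left" if frame_index % 2 == 0 else "right"


def culling_schedule(num_frames):
    # One culling pass per frame instead of two (one per eye).
    return [eye_for_frame(i) for i in range(num_frames)]


schedule = culling_schedule(6)
print(schedule)
```

Under those assumptions, each eye gets a fresh culling result every second frame, and the per-frame culling work is roughly that of a single camera.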

All in all, our tests showed that rendering on the Quest has about the same performance cost as rendering to a 2k monitor. So let’s go with that.
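For reference, this is what the raw pixel counts look like before any of those savings. It is plain arithmetic, and it assumes that "2k" means a 2560 x 1440 monitor, which is not stated in the post.

```python
# Raw pixel counts only; this ignores foveated rendering and round robin culling.
QUEST_EYE = (1832, 1900)    # per-eye resolution as quoted in this post
MONITOR_2K = (2560, 1440)   # assumption: "2k" means 2560 x 1440


def pixels(size):
    width, height = size
    return width * height


quest_total = 2 * pixels(QUEST_EYE)    # both eyes
monitor_total = pixels(MONITOR_2K)
ratio = quest_total / monitor_total

print(quest_total, monitor_total, round(ratio, 2))
```

On raw pixels alone the two eyes come out at roughly 1.9x the monitor, so the "about the same as 2k" result rests on the savings from foveated rendering and round robin culling, not on the resolution itself.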

Based on the stats provided by Epic, rendering Nanite at 2k costs about 2.5ms. Nanite scales with screen resolution.

Using 75% upscaling, we save ~0.7ms, so we are left with ~1.8ms of Nanite rendering.
This varies between 1 and 2ms depending on where you are in the scene and how much overdraw there is.
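As a quick sanity check on those numbers, the sketch below compares the measured cost with two simple scaling models. It assumes the 75% is a per-axis screen percentage; both models are illustrative, not how Nanite's cost is actually defined.

```python
NATIVE_MS = 2.5     # Nanite cost at native 2k, as quoted above
MEASURED_MS = 1.8   # cost observed when upscaling from 75%
SCALE = 0.75        # assumption: per-axis screen percentage

pixel_fraction = SCALE * SCALE                    # share of pixels actually rendered
linear_in_pixels_ms = NATIVE_MS * pixel_fraction  # model 1: cost follows pixel count
linear_in_axis_ms = NATIVE_MS * SCALE             # model 2: cost follows the per-axis scale
measured_fraction = MEASURED_MS / NATIVE_MS

print(f"pixels rendered: {pixel_fraction:.4f}")
print(f"model 1: {linear_in_pixels_ms:.2f} ms, model 2: {linear_in_axis_ms:.2f} ms")
print(f"measured: {MEASURED_MS} ms ({measured_fraction:.2f} of native)")
```

The measured 1.8ms sits closer to the per-axis estimate than to the pure pixel-count estimate, which would be consistent with part of the cost not shrinking with resolution. With only two data points, that is a reading, not a conclusion.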

As for the virtual shadow maps: we do not have a dynamic day night cycle, so most shadows can be cached.
That is also why the shadow costs are so low for us, even though we have multiple real time point lights and a directional light in the scene.
Shadow Depths is around 0.6ms and Lights is at 0.4ms, so all VSM calculations together come to ~1ms.
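Put against the 90 FPS target mentioned earlier, the Nanite and VSM numbers add up like this. A small sketch using only the figures quoted in this post; the dictionary keys are just labels chosen for illustration.

```python
TARGET_FPS = 90
frame_budget_ms = 1000.0 / TARGET_FPS   # time available per frame

costs_ms = {
    "nanite": 1.8,              # culling + rasterization at 75% upscaling
    "vsm_shadow_depths": 0.6,
    "vsm_lights": 0.4,
}

vsm_ms = costs_ms["vsm_shadow_depths"] + costs_ms["vsm_lights"]
total_ms = sum(costs_ms.values())
share = total_ms / frame_budget_ms

print(f"budget {frame_budget_ms:.2f} ms, Nanite + VSM {total_ms:.1f} ms ({share:.0%})")
```

So with caching active, Nanite and VSM together take about a quarter of the 90 FPS frame budget in this scene.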

We also lowered the shadow resolution using
r.Shadow.Virtual.ResolutionLodBiasLocal = 0
r.Shadow.Virtual.ResolutionLodBiasDirectional = 0.
In our tests, a value of 0 is perfect in VR. Shadow glitches are barely noticeable, but it gives significant performance gains.

Out of interest, I disabled caching using
r.Shadow.Virtual.Cache.ForceInvalidateDirectional = 1
and got the following stats:


Now the total VSM cost is at 2.2ms. But keep in mind that here the shadows for the whole scene are re-rendered every frame, which does not happen in a normal/optimized game.

For a day night cycle there are multiple options:

  • Moving the light at a lower update rate (1/sec)
  • Keeping the light mostly static and moving it only over short timeframes
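The first option can be sketched as a simple count of how often the light actually moves. This is an illustration only: the function is hypothetical, and it assumes that the cached directional shadows only need a full redraw on frames where the light moves.

```python
def count_light_updates(duration_s, frame_rate, light_update_hz):
    """Count the frames on which the sun moves (and cached shadows are redrawn)."""
    frames = int(duration_s * frame_rate)
    frames_per_update = frame_rate // light_update_hz
    updates = sum(1 for frame in range(frames) if frame % frames_per_update == 0)
    return frames, updates


frames, updates = count_light_updates(duration_s=10, frame_rate=90, light_update_hz=1)
print(f"{updates} shadow redraws in {frames} frames")
```

At 90 FPS with one light update per second, only 10 of 900 frames pay the uncached shadow cost, and the rest can keep using the cache.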

Also keep in mind that you can always raise or lower these settings based on hardware.
In a “low quality” setting, users expect low resolution shadows, so even a value of
r.Shadow.Virtual.ResolutionLodBiasDirectional = 2 might be viable.
Our stats are based on the maximum quality settings of our game.

Hope that clarifies it. With these stats, Nanite and VSM are absolutely production ready in my opinion.
And don’t forget the advantages you get! Super high resolution meshes, automatic and seamless LODs, practically unlimited objects in your scenes, and a renderer whose cost scales with screen resolution rather than scene complexity. Upscalers also have a much stronger effect because of that.

If you have any further questions feel free to ask. I did not include the motion and still screenshots for now, because TAA is not necessary if you use upscalers. So that discussion is unnecessary IMO :slight_smile:


This newest talk from UE Fest Prague might also be helpful:


It’s a joke and a complete backtrack on concepts already spoken about.


For what it is worth, Epic has directional shadow caching disabled for Fortnite. To maintain performance they dropped the effective shadow resolution instead.

The tradeoffs involved in this decision will differ for other games. There is certainly a visual impact from giving up on shadow map caching and dropping the shadow resolution that may not be an optimal choice for other games. That said, we are reasonably happy with the level of quality we are able to achieve in Fortnite even with no directional light shadow caching at 60 fps, as it is still a noticeable improvement over previous solutions.


Indeed. It always depends on the number of moving elements in the scene.
We had WPO animated leaves on the trees, but disabled WPO for Virtual Shadow Maps, so there were barely any cache invalidations.

Fortnite wanted to have animated shadows AND a day night cycle, so they had to do it this way.

But even with caching disabled in my example project, the total render time is 5.8-6.5ms.

That works out to roughly 150 FPS or more, which is more than enough for any VR game at that level of quality.
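For anyone who wants to check the conversion, frame time and FPS are just reciprocals of each other. A short sketch using the render times quoted above and the 90 FPS target from earlier in the thread:

```python
def fps_from_ms(frame_time_ms):
    """Convert a frame time in milliseconds to frames per second."""
    return 1000.0 / frame_time_ms


low_fps = fps_from_ms(6.5)      # slowest quoted frame time
high_fps = fps_from_ms(5.8)     # fastest quoted frame time
budget_90_ms = 1000.0 / 90      # frame budget for a 90 FPS target

print(f"{low_fps:.0f} to {high_fps:.0f} FPS, 90 FPS budget is {budget_90_ms:.2f} ms")
```

Both quoted frame times sit well below the 90 FPS budget, assuming nothing else in the frame outside the measured render time is the bottleneck.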