Nanite Performance is Not Better than Overdraw Focused LODs [TEST RESULTS]. Epic's Documentation is Endangering Optimization.

The reason I ran the test was to disprove theories and official lessons from Epic claiming that poly density helps performance even where culling isn’t happening all the time, and to show that the 5th iteration of Nanite still doesn’t deliver the substantial performance gains everyone claimed would arrive a couple of years ago.

(Nanite needs a substantial change in how it works to improve performance. We’re talking changes so massive that still calling the result “Nanite” rather than “Nanite 2.0” would be a bit of a cheat.)

@Lucian The reason you’re not seeing much of a performance difference is that you started off with much more Nanite overdraw: a scenario where you would be told to follow “protocol” and try to get more clusters.

You have to do the same thing when you compare quad overdraw with Nanite: analyze the surface area of heat in your frame.

I’ve already proven that:

More quad overdraw surface area in frame = more gain with Nanite, or 2x the Nanite gain with proper LODs and batching.

So with your test it’s:

More Nanite overdraw = less gain from micro-variance displacement/clusters.
(since flat subdivision used to hack in clusters gets compressed down to zero clusters by Nanite)

You’re getting the same amount of overdraw in different forms. In your “low poly” case, Nanite’s pixel overdraw is spread across the screen. With displacement/clusters, overdraw is getting packed into the same pixels over and over, which doesn’t let you measure the situation correctly unless you know how this stuff works. Again, this is why the view mode is so ineffective.
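
To make the “same total overdraw, different distribution” point concrete, here is a minimal sketch with made-up numbers (a toy illustration, not engine data): two heatmaps whose shaded-sample totals are identical, even though one reads as a small local hotspot in a heat view.

```python
import numpy as np

# Hypothetical 4x4 overdraw heatmaps (values = times each pixel is drawn).
spread = np.full((4, 4), 2)           # 2x overdraw smeared across the frame
packed = np.zeros((4, 4), dtype=int)
packed[:2, :2] = 8                    # 8x overdraw packed into 4 pixels

# Total shaded-sample cost is identical, but a per-pixel heat view makes
# "packed" look like a tiny local problem and "spread" like a mild one.
print(spread.sum(), packed.sum())     # 32 32
```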

(Again, you’re welcome Epic, invest in fixing this)

So let’s rank performance scenarios from best to worst for anyone wondering:
(I’ll remind everyone here that non-Nanite meshes also have to watch pixel overdraw. It just doesn’t have a view mode in Unreal like Nanite does)

Each step down drops significantly in performance potential.

  1. Non-Nanite scenes with Quad Overdraw and Pixel Overdraw contained
  2. Nanite scenes with low Pixel Overdraw
  3. Nanite scenes with high Pixel Overdraw
  4. Non-Nanite scenes with Quad Overdraw and Pixel Overdraw overwhelmed

So basically you have to neglect optimization (again) to gain with Epic’s suggestions.

I will link your test and my reply of context to the main mega thread.
And for anyone complaining about my using “5.5 preview”: it’s the same case with UE 5.4 and all the other versions, since barely anything was done to improve/modify Nanite other than making other things compatible with it. I assumed this was the case and went back to prove myself right (again).

Let us know your PC specs. I tested several different models, adjusting their positions in the scene and changing the camera view (from any angle), but I couldn’t replicate such a large performance difference just by displacing the mesh. In fact, I barely noticed any change, even with a lot more meshes than in your scene. So, I think the issue is likely something else here, as it might be a compatibility or driver issue (or a bug in 5.5 preview). You can share the project with us so we can compare the performance on different systems.

I don’t believe the idea that “high poly makes Nanite faster” either, but it might be true in some cases depending on other objects that occlude in a complex scene. Similarly, when someone says “more LODs can reduce object popping and improve performance”, it’s true to an extent, but using 30 LODs per model could actually make things worse.

2 Likes

@Lucian

RTX 3060 12GB, latest game drivers. That’s not target hardware (but we test on it since there are fewer Nvidia hardware optimizations); base 9th-gen consoles are (at least for our studio).
We also have RDNA2 hardware we can test the same scenario on when we’re less busy.
I always test at 1080p and that’s a giant factor you might miss.

So, I think the issue is likely something else here, as it might be a compatibility or driver issue (or a bug in 5.5 preview)

It’s not. We just include a lot of context.

You can share the project with us so we can compare the performance on different systems.

It’s not a project, it’s just a bunch of cubes. Easy to replicate. My map was inside a much larger personal project.

My response to Lucian’s oversimplification of our (thousands of people’s) viewpoint and of LODs:

I don’t believe either the idea that “high poly makes Nanite faster”

But you already admitted to not fully understanding how everything is drawn when you shared this comment, paired with insults towards me. Nanite and non-Nanite meshes draw the same way, except one draws and culls with clusters instead of the whole mesh. Both will only render triangles that face the camera and end up in sampling range.
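
A minimal sketch of that shared behavior (the standard backface test, not engine code): a triangle is only rasterized when its face normal points back toward the camera.

```python
def is_front_facing(face_normal, camera_to_tri):
    # Standard backface test: the triangle faces the camera when its
    # normal points back along the camera-to-triangle direction.
    dot = sum(n * v for n, v in zip(face_normal, camera_to_tri))
    return dot < 0

print(is_front_facing((0, 0, -1), (0, 0, 1)))  # True: rendered
print(is_front_facing((0, 0, 1), (0, 0, 1)))   # False: culled
```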

You seem to want to approach this topic reasonably now. You need some things set straight, though. You’ve suggested mind-boggling things such as “needing 10 LODs” or “increased quality” from Nanite+VSMs. VSM cost during simple traversal is staggering, it requires incompetent TAA to mask terrible noise, and it can’t even render at a high enough resolution to hide jagged lines with reasonable performance.

Nanite also displays pop-in (you can see it if you don’t have blurry TAA/TSR on), and tessellation combined with smarter material workflows would perform 2x faster while incurring only reasonable management.

it’s true to an extent, but using 30 LODs per model could actually make things worse.

Again, tessellation for so many reasons…LODs are kindergarten.

You never factor in the aliasing issues Nanite promotes.
That is a downgrade in visual quality.

Pop-in is worse these days because Unreal’s auto-LODs and recent engine dithering patterns are garbage and VERY low quality.
Am I saying everyone should hand-make LODs? No, I’m saying we need workflows and higher-quality approaches that don’t destroy the potential of hardware. I’ve even stated scenarios where Nanite makes sense, but there is no reasonable compromise system in Unreal.

And it doesn’t matter what the hardware is; the percentage increase will be mostly universal. I recently confirmed a 20 FPS difference in a recent title that used Nanite, on a 4080: a 20 FPS increase with no visual loss other than terrain quality when Nanite was disabled, and that held even when the terrain wasn’t on screen.

When I say “share the project” I mean sharing it at the exact same level to avoid the argument, “It’s not like my test”.

Your comparison was unfair anyway because you added displacement, which introduced more geometric detail and changed the shape of the mesh. This essentially forced the creation of far more clusters. Also, the PS5 is better than an RTX 3060, so you’ve considered the worst-case scenario. That’s not necessarily a bad approach, but it’s worth mentioning for context.

In a real scenario, it’s beneficial for Nanite to increase geometry (without changing the model’s shape so much with hard noise) to generate more clusters on large/long polygons.

The clusters are created depending on mesh surface angles, UVs, etc. Here is a test of 2 meshes with exactly the same number of triangles:

More information about this: https://youtu.be/dj4kNnj4FAQ?si=0De9MINyRneN6p00&t=2004

My test recreated your scene by adding subdivision - which improves culling:

~31,000 triangles per cube:

~1,300,000 triangles per cube

5 Likes

Your comparison was unfair anyway because you added displacement, which introduced more geometric details and changed the shape of the mesh.

It’s not unfair; you’re missing the point entirely if you think that.

This essentially forced the creation of far more clusters.

That’s why I displaced it. The whole point is to show that overdraw induced by dense clusters ends up being the same as overlap/failed-culling overdraw.

In a real scenario, it’s beneficial for Nanite to increase geometry (without changing the model’s shape so much with hard noise).

I’m kind of confused by what you mean here, since you can’t just “increase” geometry. That makes it sound like flat subdivisions will do the trick, when Nanite detects this and just collapses them into no clusters or large clusters. Unless you have a way to share how flat subdivisions can be kept for clusters? What my test showed (and I replicated it in 5.4) is that increasing clusters via detail (noise, rock/wood patterns, it doesn’t matter) will increase cost due to the cluster overdraw they induce.
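
For what it’s worth, that collapse behavior is what you’d expect if Nanite’s simplifier works like standard quadric-error decimation (my assumption; the snippet below is an illustration, not engine code): a vertex added by flat subdivision has zero plane-distance error, so removing it is free, while even a subtle displacement gives it nonzero error and lets it survive into clusters.

```python
def plane_error(point, plane):
    # Squared distance-to-plane term of a quadric error metric.
    # plane = (a, b, c, d) with ax + by + cz + d = 0 and unit normal.
    a, b, c, d = plane
    x, y, z = point
    return (a * x + b * y + c * z + d) ** 2

flat = (0.0, 0.0, 1.0, 0.0)                 # the z = 0 plane
print(plane_error((0.5, 0.5, 0.0), flat))   # 0.0    -> flat subdivision collapses for free
print(plane_error((0.5, 0.5, 0.02), flat))  # ~4e-04 -> displaced vertex costs error to remove
```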

More information about this: https://youtu.be/dj4kNnj4FAQ?si=0De9MINyRneN6p00&t=2004

That video was released 9 days after the Threat Interactive Nanite video. The presenter said the opposite last year (which I proved wrong), and he gave more wrong information in this video. There are even comments on the video calling him out for it.

My test recreated your scene by adding subdivision - which improves culling:

There is no change in cost, as you can measure from the surface area of heat in both heatmap shots.
Yes, one has more clusters/better culling, but that’s because the second set of objects has the opportunity for more angle variance. Three issues:
#1 It causes more aliasing.
#2 Smooth objects cannot “benefit” from this (see #3).
#3 It costs more than basic overlap overdraw.

There’s nothing really wrong with your test if I’m going by your word (which I can and will for the time being).
It just needs the context.

To avoid re-quoting everything you’ve said, I’ve ordered these sections to only include the parts I want to respond to:

  1. Yes, because you clearly know more than an entire team of engine programmers most of whom have been coding game engines since 1995 which is likely before you were born. lol

  2. They’re so lucky to have someone as inexperienced as you telling them what they need to fix.

  3. Of course completely ignoring the fact that every test you’ve run has been in-editor and not a single one in a packaged version of the game. Best way to get real-world results. As many people have asked, what are your system specs anyway? Are you sure you meet the minimum requirements to run UE5? How many different system configurations have you run your tests on?

Finally, don’t you have anything better to do than rant in these forums every day and spread propaganda? Not just the fact that the majority of what you claim is patently false, but the fact that you think this thread and your YouTube videos are somehow contributing to the community. What on Earth are you trying to get out of this? You claim you want them to stop integrating future features around Nanite, but answer me this, what’s wrong with the regular LOD system in UE? Is it broken in any way? Because you’ve predicated this entire thread on the fact that the regular LOD system is so much better than Nanite. So why not just use that? You’re not being forced to use Nanite. Just opt out. Epic continuing to work on Nanite and what they see as the future of the engine will in no way stop you from being able to pick and choose the features you want and opt out of those you don’t like.

I’ll give you one thing, you’re certainly dedicated. Ranting about this for more than 3 months now. Imagine if you put all of this energy into working on a game project!

5 Likes

Just relax and let Epic take care of it - they are aware of things such as Nanite Foliage and are working hard on making it more performant. Unclench - it will affect your health.

Constantly insulting them and demanding they do things your way is extremely counterproductive and just makes you look like a child.

Meanwhile, I’ve been enjoying watching real game devs do some very cool things with UE5, Lumen and Nanite.

4 Likes

Not a flat subdivision, a very subtle displacement will do the job. However, of course, the clusters will appear based on the camera’s distance, and beyond a certain point, these clusters will collapse.



2 Likes

So I would like to help and share my thoughts on real scenarios:

→ DX12 can give you additional depth and clarity in the scene for realistic games when you use a lot of foliage. It just looks awesome and works well with DLSS. Everything looks great here: volumetric fog makes a natural barrier between the foliage, and rays and shadows just fill up your soul. (Standard LODs don’t work here, the FPS drops.) I mean, you just enable everything and try to set your PC on fire:
HW RT + HW Lumen + Nanite + Virtual Shadow Maps + DLSS. There is no room here for quality exceptions: every detail improves your immersion.
If you’re not interested in breathtaking scenes → go DX11.
Shader Model 6 is for high-end machines targeting AAA+ quality.

→ DX11 is better if you want to create a low-poly game without dense foliage.
It will be hard to get a similar result as on DX12, and DLSS will also be blurrier, imho. Here you can do a lot of tweaking and try switching renderer settings to get better frames. It’s also better to work with LODs here.

My point is: if you create an open-world “realistic title”, you should go with DX12 and Nanite, since the difference is a whole dimension apart.
DX11


DX12

2 Likes

So correct me if I’m wrong, but doesn’t Nanite do deferred texturing? Because geometry overdraw is magnitudes less expensive than shader overdraw.
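
As I understand it from Epic’s public material, yes: Nanite rasterizes into a visibility buffer and evaluates materials in a later pass, so geometry overdraw mostly costs a depth test and an ID write rather than a full material shader invocation. A conceptual sketch (hypothetical data layout, not engine code):

```python
W, H = 4, 4
INF = float("inf")
vis_buffer = [[(INF, None)] * W for _ in range(H)]  # per pixel: (depth, tri_id)

def rasterize(tri_id, covered_pixels):
    # Pass 1: many triangles can land on the same pixel, but each
    # overdraw event is only a depth compare plus an ID store (cheap).
    for x, y, depth in covered_pixels:
        if depth < vis_buffer[y][x][0]:
            vis_buffer[y][x] = (depth, tri_id)

def shade():
    # Pass 2: the expensive material evaluation runs once per covered
    # pixel, no matter how much geometry was rasterized on top of it.
    for row in vis_buffer:
        for depth, tri_id in row:
            if tri_id is not None:
                pass  # look up tri_id's material and shade here

rasterize(0, [(1, 1, 0.5)])
rasterize(1, [(1, 1, 0.3)])  # overdraws the same pixel, wins the depth test
shade()
```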

I just spent hours testing a scene I built with 2k Megascans, landscape, foliage, etc. I have a fixed-perspective camera, observing standalone fullscreen performance. I tested the scene reconstructed with Lumen, Nanite, and every shader model and RHI/API: DX11, DX12, Vulkan.
No ray tracing, no virtual shadow maps, TAA, no upscaling. Lumen is faster than screen space.

Without Nanite, Vulkan SM5, Lumen GI/reflections: 3.3-6.8 ms
Without Nanite, Vulkan SM6, Lumen GI/reflections: 4.2-8.8 ms
Without Nanite, DX11 SM5, Lumen GI/reflections: 7.8-10.8 ms
Without Nanite, DX12 SM5, Lumen GI/reflections: 8-13.2 ms
Without Nanite, DX12 SM6, Lumen GI/reflections: 12.8-18.3 ms

Switching to the TAA upscaler added 1-2 ms to everything.
Switching to TSR added 6-10 ms to everything.

With Nanite, DX12 SM6, Lumen GI/reflections: 32.2-52.5 ms.
Hard to tell the quad overdraw, because everything not in the center was fine; anything in the center of the viewport became 1/3 overdraw (unless I’m reading the scale wrong).

None of this means anything to anyone other than me, I guess; it’s just what I noticed testing these scenarios in a real scene I can measure for myself.
The attached image only shows the landscape/PCG layers/foliage.
I used a refined 1x1, 63x63 test landscape.

2 Likes

You are doing performance testing… in the editor?
Never test performance in the editor. Always test on a packaged build.

4 Likes

@TheKJ

CPU performance up 40% and GPU up 20% since 5.0. I’m gonna say I trust their benchmarking more than yours.

5 Likes

To be honest, I didn’t expect that after my last post a few months ago, I would come back and find so many new messages. Since then, I’ve been following the conversation closely, and I must say I’ve been quite entertained by it. Many thanks to the community for the engaging discussion, and special kudos to those who’ve posted detailed and objective responses.

Thanks also to those who uploaded screenshots. Regardless of Nanite, I have to say some of them were truly artistically impressive – keep it up!

Now, a bit of real talk: Normally, problems come with possible solutions, but in this case, that’s not so clear. Is Nanite really the problem? If so, what would the solution be? There are still no proven, functional, documented alternatives. Or is traditional LOD the alternative? Or perhaps the problem lies elsewhere entirely…

Even if the author’s claims were accurate, it would be by chance. His tests are always run in-editor on the same hardware, never showcasing scenarios that would actually be used in a real game. Real-world data is different. Even if Nanite’s performance in these specific tests isn’t ideal, who cares if this scenario would never be replicated in a game or film? Why focus on problems from hypothetical tests rather than real-world challenges?

I also have my doubts about where this discussion is headed. The author will continue to claim that Nanite is garbage; the next ten posts will try to counter that; someone will cautiously ask a question, several people will respond, and the author will start over. It’s an endless loop and a waste of time. It’s clear we’re not going to reach an agreement here – maybe it’s time to let this go? This forum exists to help others with actual problems, not to highlight ones that will never arise. But I don’t mind a little more entertainment either.

5 Likes

Hey, if anyone is interested in checking comparisons with Nanite enabled on an 8k landscape, I’ve made some tests: it is less performant in 5.4.4 than a standard landscape with 8-layer texturing, at default engine settings. See the Landscape Performance Techniques Thread.

never showcasing scenarios that would actually be used in a real game.

There are several people getting 40% increases from disabling Nanite in Silent Hill 2, which was covered in this video. The performance boosts are very pronounced in certain areas according to commenters, which I didn’t even cover.

What you’ll miss is that I can take a real-world example and people here will find any excuse to say “it’s the developers’ fault, they did something wrong!” when this is my entire point. If it’s so hard to get right, stop pushing it on developers.

When someone shows a Nanite scenario running faster, I don’t scream “you did something wrong”; I state the exact issue they got wrong (overdraw).

maybe it’s time to let this go?

No, because it shows how inherently flawed the community is and how stuck it is on Epic’s word on these issues. It also lets users confirm the stagnant performance enhancements per UE version release.

His tests are always run in-editor on the same hardware,

People saying this are blindly repeating Epic’s suggestions about “editor distortion” without even understanding why it’s suggested. 3+4+5+6 = 18. If you take out the first cost (the editor, say the 3), then the 4, 5, and 6 (shader timings) do not magically shrink.
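
A toy version of that arithmetic with made-up millisecond numbers, just to pin the point down:

```python
editor, base_pass, shadows, lighting = 3.0, 4.0, 5.0, 6.0  # ms (made up)

in_editor = editor + base_pass + shadows + lighting  # 18.0 ms total
packaged = base_pass + shadows + lighting            # 15.0 ms total

# Packaging removes the roughly constant editor term; the pass timings
# themselves (4, 5, 6) stay put, so relative comparisons between two
# techniques measured on those passes still hold.
print(in_editor, packaged)  # 18.0 15.0
```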

Or is traditional LOD the alternative?

Compute-based tessellation (Nanite-independent) plus batching, and newer LOD algorithms that create new textures such as depth bias, intelligently remap UVs, optimize topology for “max area”, and intelligently reduce tri count by analyzing overdraw, instead of the blind, deforming algorithms in UE5 right now.

It’s a workflow problem. And right now Nanite is a 12-second checkbox vs LODs, which can take up to 15 minutes with poor results (due to funding neglect).

Why focus on problems from hypothetical tests rather than real-world challenges?

Such as the massive number of UE5 games that have come out?
That’s the real-world problem; the small tests here only prove the exponential performance loss in more complex scenarios and disprove the “threshold of detail for constant cost” and polycount lies.

This forum exists to help others with actual problems,

The thread has helped numerous people turn performance around by understanding the fundamental fact (which Epic refuses to state themselves, if not outright lies about) that Nanite is slower than proper optimization, and lots of people care about performance. These same people didn’t have quick access to results until this thread was made.

I also have my doubts about where this discussion is headed.

Then it’s not for you; it’s for developers who want good performance, and for consumer reference.
The thread updates with every major release of UE. It heads where Epic leads it.

The issue I have with your posts is that you repeatedly explain why the results of those who don’t experience significant performance issues are irrelevant, arguing that their tests simply don’t cover the problem areas and thus their performance appears “good.” This implies that you are aware of scenarios where Nanite performs well, but you only highlight those in which it causes performance drops. However, when it comes to your own example, where you show performance gains by turning off Nanite, you don’t explain how one might optimize for Nanite specifically – not for LODs, but Nanite. Even though others have shown you that various factors are at play, you simply put all the blame on Nanite. That’s, of course, an easy position to take.

I understand the argument that it’s often easier to criticize developers. But, honestly, if Nanite were truly that poor in performance, developers would have noticed it during optimization – well before release. Bloober Team, the developers behind Silent Hill 2, are experienced and have worked on several titles in the past. It’s hard to believe they would release the game without knowing about these issues; they may have knowingly chosen Nanite despite some performance loss. I don’t know if you work, but in my job, I regularly see tasks that simply need to be completed, even if every detail isn’t perfect. As long as something is “good enough,” that’s what matters.

Another issue is how you present performance gains. You mention a 40% improvement, but don’t specify whether that’s in FPS or render time (ms). Here’s a simple example: at 10 FPS, adding 4 more frames means a 40% increase in FPS, but the reduction in render time is only about 28.6%. This difference matters because FPS can be misleading, as the actual gain in ms is much lower. Increasing FPS from 60 to 120 yields a 100% increase, but the reduction in render time is only 50%. These details are essential for clarity, especially when speaking from a developer’s perspective rather than a consumer’s.
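
A quick sanity check of that conversion (plain arithmetic, nothing engine-specific):

```python
def ms_reduction(fps_before, fps_after):
    # Frame time in ms is 1000 / FPS; return the percentage reduction.
    before, after = 1000.0 / fps_before, 1000.0 / fps_after
    return (before - after) / before * 100.0

print(ms_reduction(10, 14))   # ~28.57: a 40% FPS gain
print(ms_reduction(60, 120))  # 50.0:   a 100% FPS gain
```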

Hardware also plays a role. Newer hardware lessens the render time differences, depending on whether Nanite is enabled or not. Your argument could just as easily apply to ray tracing: although it reduces FPS, it’s increasingly embraced on modern hardware. Many graphics features we appreciate today were previously only possible with more powerful hardware. Even if Nanite currently requires 40% more resources, this difference will matter less as hardware improves. For instance, with an RTX 3060, the difference between Nanite and LOD might be 40%, but with an RTX 4080 or 5070, it might be only 10% or less.

With Nanite, Epic Games has pioneered future technology, much like NVIDIA did with ray tracing. Hardware will need to catch up, as it often has throughout gaming and graphics technology history. Take Crysis as a classic example – when it launched, few systems could handle its features. Even back in 1998, Unreal didn’t perform as well as other games, but it looked considerably better. So if you really want to show that something is better than Nanite, then please give us a demo so that everyone can see for themselves. That would make this discussion far more productive.

8 Likes

Hey, the problem with Nanite is that it’s the only option you have when virtual shadow maps, hardware RT, and Lumen are enabled, so you can’t use another technique in that scenario. In that case the SH2 devs had to use Nanite, because otherwise the game would perform about twice as badly with some settings.

Of course, a new comment to attract people to watch your new video. However… when you disable Nanite by console command, you won’t get the same models with the same details; the models will have far fewer triangles. So what you’ve done is compare a very low-poly version of the models without Nanite vs the original (highly detailed) meshes with Nanite enabled.
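
For anyone reproducing this, the toggle in question is presumably the stock cvar below; with Nanite rendering off, a Nanite-enabled mesh draws its auto-generated fallback mesh (whose density comes from the mesh’s fallback settings at import), not the original source geometry, which is exactly why the comparison ends up lopsided.

```
r.Nanite 0
```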

@myasga Maybe the NVIDIA branch could be an alternative to Lumen in that case? ReSTIR has some nice results.

1 Like