Nanite Performance is Not Better than Overdraw Focused LODs [TEST RESULTS]. Epic's Documentation is Endangering Optimization.

I did some nanite vs non-nanite testing of my own a few months back. What I tested, though, was mass “converting” nanite meshes into LOD meshes automatically: disable nanite, make LOD0 have the same number of triangles as the fallback mesh (usually 2% of the original mesh), then add LOD levels at a 50% ratio until an LOD has <=100 triangles. The first test was the Old West learning project ( https://www.unrealengine.com/marketplace/en-US/product/old-west-learning-project ). After about 24h spent compiling the new meshes, the GPU frame time was nearly identical, about 30ms on my RX 6800 XT.
The second test was CitySample. In CitySample the GPU time with LODs was something like double, so around 60ms vs 30ms with Nanite, but what was really bad was that shadow culling on the CPU took 130ms, because when using nanite I think all instances get culled on the GPU and use the indirect drawing APIs. Sure, LOD meshes could do that as well, but epic never implemented indirect drawing for them. I think a proper nanite vs non-nanite comparison absolutely needs a huge scene with ~1M mesh instances (like in CitySample), and that’s where nanite shines.
PS: Nanite has compute rasterization for small triangles but for big triangles it uses hardware rasterization. You can even change the threshold used in determining what path to use.
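For anyone wanting to reproduce the conversion rule described above, here is a minimal sketch of the triangle budget per LOD. It only computes the counts (2% fallback for LOD0, 50% per step, stop at or below 100 triangles); the function name and parameters are made up for illustration and it does not touch any Unreal API.

```python
def lod_chain(source_triangles, fallback_ratio=0.02, reduction=0.5, stop_at=100):
    """Return the triangle count for LOD0..LODn.

    LOD0 matches the Nanite fallback mesh (fallback_ratio of the source mesh).
    Every following LOD keeps `reduction` of the previous LOD.
    The chain ends with the first LOD that has `stop_at` triangles or fewer.
    """
    counts = [max(1, int(source_triangles * fallback_ratio))]
    while counts[-1] > stop_at:
        counts.append(max(1, int(counts[-1] * reduction)))
    return counts


if __name__ == "__main__":
    # Hypothetical 1M triangle source mesh.
    for index, triangles in enumerate(lod_chain(1_000_000)):
        print(f"LOD{index}: {triangles} triangles")
```

Note that a dense source mesh produces a fairly long chain under this rule, so it is worth checking the result against whatever LOD count limit your engine version has.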

1 Like

The issue with this post is that all of these tests lack a basic understanding of how to use nanite.
As you correctly said @Ciprian_Stanciu, you need a full scene of nanite meshes to feel the performance difference. A game. Not just some benchmarks.

Nanite has a higher base cost but scales much better when adding detail and creating a full scene. So whenever we use benchmarks on “simple scenes” and “single meshes duplicated a thousand times”, traditional rendering will always outperform nanite.

Let's go through this one by one.

  1. This test just imports some high quality meshes without anything else in the scene. Nanite has higher base cost than traditional rendering. Having proper LOD and an Instanced Indirect call will always be faster on single meshes than nanite. But this scenario is unrealistic for real games.
  2. Same thing. Still unrealistic scenario for a game.
  3. Nanite Landscapes only make sense if you generate them with a high triangle count by default. (just like all Nanite meshes should be). Nanite is really bad at handling large triangles. Culling will fail and generate overdraw. Tessellation does not fix this, it adds details after the culling phase.
  4. For me this looks like it is caused by too high a texture density. This can be fixed by using Virtual Textures (which is recommended by epic anyway). If it is the triangle count, you can decrease the nanite LOD Bias (r.Nanite.ViewMeshLODBias.Offset) or lower the Nanite “KeepTrianglePercent” option. Also i would love to see a wireframe view here, because i am assuming that the stones are really low poly and only the inscriptions are high poly. Nanite meshes need a similar triangle count everywhere for best looks.
  5. You are using traditional, optimized meshes here. Large triangles will lead to large overdraw because clustering and culling is failing. Also this scene has very few models, so again, the nanite overhead is significant.
  6. Lumen is absolutely not production ready and tanks too much FPS. But this thread is about nanite, so this test has nothing to do with the issue.
  7. While this is funny, because Lyra is actually an epic project, the same issue applies here. You have massive triangles, smooth surfaces and long stretched triangles. Additionally, it is an enclosed room with barely any objects. So it works absolutely against anything nanite is good at.
  8. I don't know what this is even supposed to show. 50% of the image is just the skybox, and you only have massive objects with huge triangles. No details or micro objects. This is optimized for “classic” games, not for nanite.
  9. Same as 5, traditional meshes with large triangles, small scene, not a lot of objects. Works against nanite's use case.
  10. That's just a rant post. We don't know any specifics on how the user used nanite, so nothing can be gained from this.
  11. Same point as 4. You can tweak your nanite meshes to show less geometric detail manually. Nanite is an auto LOD system, and as all of these systems, it is not perfect. So on some meshes you do have to manually adjust them for visual quality. Or you just use TAA.
    I know you hate it for some reason, but it works and looks good. Would like to hear some good arguments against it.
  12. A lot of things are talked about here.
      1. I don't get the context here.
      2. SS Shadows are just contact shadows, and unreal already has them implemented and recommends using them for micro details. Just like you suggested. https://www.unrealengine.com/en-US/tech-blog/virtual-shadow-maps-in-fortnite-battle-royale-chapter-4
      3. You can handle draw calls manually or you can just ignore them by using nanite. In the end, less work for your studio results in higher production quality, because you don't waste time.
      4. Unreal has virtual texturing, which is their solution for the same problem.

The rest of the article is just factually incorrect. Performance is affected by poly count. There are 2 major factors here:

  • Loading: With nanite the amount of triangles on screen and in the VRAM stays about the same. Nanite uses a clustered streaming system to only load triangles that are visible, while not loading the whole mesh.
    Assuming you have a full scene with a fine-grained traditional LOD setup, the renderer always needs to load the whole mesh into VRAM instead of just single clusters. Do this for every mesh and every LOD level, and you have much higher traffic and loading times, which results in worse performance.
    Also, assuming you are optimizing for low end hardware and you are limited in VRAM, you constantly need to load and unload meshes from your hard drive, which results in stuttering. Nanite doesn't have this problem.
  • Nanite does very fine grained culling, because of the clusters. When you have 2 high poly meshes in a scene behind each other, classic renderers will still need to render both meshes, while nanite can just cull parts of the far mesh. We do agree, that higher poly counts result in more computations, for both nanite and classic. The graphics card simply has more data to calculate.
    So in classic renderers the overdraw and computation is much worse, when rendering full meshes, because you need to calculate/render more triangles. Nanite just culls them away.

The test you did here (again) takes a single mesh with 6 million polys. You described it as a “simple” scene. Well… Again, this is not what nanite was intended to do.

If your whole point was to go against epic's advice to “use nanite everywhere” i am completely on your side. But that's just one statement they are pushing, so i don’t get why you are so upset about it.
Nanite is a tool, and if you know how to use it, you can achieve much higher details and performance for fully fledged next gen games.

Classic rendering can only get you so far. And while i agree that there are some amazing looking games out there (Horizon Forbidden West, just incredible), the quality level in a AAA nanite production is just on another level. Show me a game with that much detail.

4 Likes

Btw do you have an automated script that updates the number of posts in your first sentence? ^^ Because it perfectly matches the amount of posts that are actually here :slight_smile:
If so, that's pretty cool!

Also please mind, epic also states that nanite does not work well for large triangles, and objects that do not occlude other objects. Which is mostly the case for traditional meshes. So they are not “lying” at all. You simply took one quote from their docs out of context…

Nanite should generally be enabled wherever possible. Any Static Mesh that has it enabled will typically render faster, and take up less memory and disk space.

More specifically, a mesh is an especially good candidate for Nanite if it:

* Contains many triangles, or has triangles that will be very small on screen
* Has many instances in the scene
* Acts as a major occluder of other Nanite geometry
* Casts shadows using Virtual Shadow Maps

An example of an exception to these rules is something like a sky sphere: its triangles will be large on screen, it doesn't occlude anything, and there is only one in the scene. Typically these exceptions are rare and performance loss for using Nanite with them is fairly minimal so the recommendation is to not be overly concerned about where Nanite shouldn't be enabled if Nanite supports the use case.
1 Like

There’s no point in debating with @TheKJ. I left a comment on his YouTube video, but it mysteriously disappeared after 1 day :smiley: Seems like he’s not interested in hearing different opinions, he’s got his own, and that’s the “truth.” The funny thing is, he doesn’t even use Unreal regularly, but somehow he still knows everything from testing 2 scenes. I will attach a photo of my comment; here it can’t be deleted like on YouTube.

1 Like

@s_phir_h

You have wasted all this time and space defending Nanite; did you not realize that Threat Interactive has repeatedly expressed that it IS a solution to REAL problems but is an incompetent solution overall?

Here is what I think about that game example:

@Lucian

As far as YT comments go we have the right to delete all kinds of comments. My PR rep deletes comments that are rude, stupid, etc.

Yes, he deletes any comments that disagree with him, and he’s been doing this since his very first video. He erases and bans anyone who challenges his narrative.

It’s just another YouTube channel with no value, like so many others out there.

3 Likes

We know that ‘Threat Interactive,’ which you reference, is actually you.

The quality is rather mediocre, and it’s clear you don’t even understand most of what you’re talking about.

Furthermore, there’s a lot of empty talk with little substance, except for the goal of raising $900k for a supposed magical in-house solution.

This really smells like a scam.

2 Likes

The toxicity from users like you guys is why I made the important edits to the main post.
You’re not even discussing the topic.

Why did you tag me with this

I’m more than positive that someone else has the same avatar icon as Lucian that has a username similar to yours(A and K etc). Complete accident :+1:

That was hardly toxic, though. You randomly spoke in 3rd person which is weird, even more so - deleting comments that don’t follow your narrative is just cringe and removes any and all of your credibility.

If people are being rude, sure. But you or “your PR” have been removing comments willy nilly.

I initially started following this thread because I was genuinely curious about the various test results, but some of your tests do seem, like many others have pointed out, to be quite unrealistic.

3D modeling takes time, making good LODs takes time, optimizing takes time. Time is money and Nanite does a good enough job eliminating that issue to an extent. It’s not by any means perfect, it definitely has its issues and limitations - but I would expect nothing less from a feature that is arguably still very experimental.

All that said, I think the main reason this thread is going off topic is partly because of you and your utter inability to accept the fact that your tests may not be the “damning proof” you think they are.

Frankly, I found the views of the various other thread contributors to be more insightful than your various videos. But from a reader’s perspective it’s as if they’re trying to talk to a brick wall.

This thread started out intriguing but now it’s just silly.

6 Likes

We both wasted our time apparently.
You created and moderated a giant thread AND a youtube channel, dedicated to rage baiting about Nanite, while lacking even the most fundamental understanding of its usage.
You wasted hours of time, doing benchmarks on unrealistic scenarios, and faking data to prove your point.

Nanite is by far not an incompetent solution. It’s a tool that has different requirements than traditional rendering. If one does not understand those requirements (as you, apparently) it will lead to worse performance.

About your thoughts on Marvel: The video is 4k, not 1080p. We only see cutscenes, not gameplay, in this trailer. As a video game developer you should know that 30 FPS cutscenes are standard in “Quality” mode, because they ramp up the post processing to make them look better.
Horizon does that, The Last of Us does that, everyone does that.
The characters are much higher quality than PS4. Would love to see an example for your claim.
The detail of the environment is on a different level than any other game. (Due to nanite)
And “Bad TAA” is just your preference. Which does not matter in this discussion.

So to summarize, again, half of your claims are lies or lack understanding of development and practices. Looking forward to playing a game of “Thread Interactive”. ^^

4 Likes

I might release my quality settings… but ive made myself some low, medium and high settings for nanite, lumen and virtual shadow maps… and im shocked that i got my detailed city with lots of detailed nanite meshes running at 90 fps even with my RVT in the scene. At highest settings i never get below 60 fps … even with tessellation enabled.

The trick to using nanite, lumen, virtual shadow maps, and Run Time Virtual textures is the CVARS console commands and the best way to test them is device profiles.

And im not even done yet… i feel like i can get this up to 100 fps. Its currently 4k rendered at 1920x1080 with TSAA upscaling.

World Position Offset materials need to be handled with care… disabling them at a distance or telling them not to modify the virtual shadow map altogether with realtime updates.

1 Like

Do you mind sharing the console commands you are using? :slight_smile: This will probably help a lot of people!
@kurylo3d

Personally i can recommend:

r.Shadow.Virtual.ResolutionLodBiasDirectional = -0.5
Every time you increase the number by 1.0 it will halve the shadow resolution for VSM.
Default is -1.5, but that is a 16k shadow map. If you don’t need pixel perfect shadows, this is the best way to increase performance.
There are also variants for different light types:
r.Shadow.Virtual.ResolutionLodBiasDirectionalMoving
r.Shadow.Virtual.ResolutionLodBiasLocal
r.Shadow.Virtual.ResolutionLodBiasLocalMoving
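To get a feel for what a given bias costs in resolution, here is a small sketch of the rule of thumb above (each +1.0 halves the resolution). It assumes "resolution" means the per-axis size and uses the -1.5 / 16k pairing from this post as the reference point; the function names are made up and none of this is read from the engine.

```python
REFERENCE_BIAS = -1.5        # bias quoted in the post as the default
REFERENCE_RESOLUTION = 16384 # "16k" per the post, treated as per-axis size


def vsm_resolution_scale(bias, reference_bias=REFERENCE_BIAS):
    """Per-axis resolution relative to the reference bias.

    Every +1.0 of bias halves the resolution, every -1.0 doubles it.
    """
    return 2.0 ** (reference_bias - bias)


def vsm_effective_resolution(bias):
    """Approximate per-axis resolution for a given bias (illustrative only)."""
    return int(REFERENCE_RESOLUTION * vsm_resolution_scale(bias))


if __name__ == "__main__":
    for bias in (-1.5, -0.5, 0.5):
        scale = vsm_resolution_scale(bias)
        print(f"bias {bias:+.1f}: {scale:.2f}x per axis, "
              f"~{vsm_effective_resolution(bias)} px, {scale * scale:.4f}x texels")
```

If the halving is per axis, going from -1.5 to -0.5 cuts the texel count to a quarter, which would explain why this setting has such a visible effect on shadow cost.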

Using WPO will lead to a lot of cache invalidations and therefore worse performance.
You can set the “Shadow Cache Invalidation Behavior” for each individual mesh under Lighting -> Advanced -> Shadow Cache Invalidation Behavior; set it to Rigid. This will disable invalidations from WPO.

In addition you can use r.Shadow.Virtual.Cache.MaxMaterialPositionInvalidationRange, which will set a max range in cm, after which the WPO will not affect VSM Invalidations.

Then i do recommend
r.Nanite.ViewMeshLODBias.Offset = 2
This might reintroduce mesh popping AND will lead to more overdraw, as the clusters will be bigger.
BUT as multiple people already pointed out, nanite is not perfect, and if you have a lot of sub pixel triangles, you can increase your performance quite a bit.
Ofc this depends on your project and mesh density, play around with the value and see if it has any effect.

Had a few more, that i used for my latest project but can’t remember. Will put them here when i get to it.
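Since device profiles were mentioned earlier in the thread as the way to test these, here is a rough sketch that turns a set of cvars into device profile lines. It assumes the usual `[<Name> DeviceProfile]` section with `+CVars=name=value` entries; the "Windows" profile name and the helper itself are just for illustration, so check the output against your own DefaultDeviceProfiles.ini before relying on it.

```python
# Values taken from the recommendations above; adjust per project.
SETTINGS = {
    "r.Shadow.Virtual.ResolutionLodBiasDirectional": -0.5,
    "r.Nanite.ViewMeshLODBias.Offset": 2,
}


def render_device_profile(profile_name, cvars):
    """Format cvars as a device profile section (assumed ini layout)."""
    lines = [f"[{profile_name} DeviceProfile]"]
    for name, value in cvars.items():
        # One "+CVars=" line per console variable.
        lines.append(f"+CVars={name}={value}")
    return "\n".join(lines)


if __name__ == "__main__":
    # "Windows" is an assumed profile name for the example.
    print(render_device_profile("Windows", SETTINGS))
```

Keeping low, medium and high variants as separate dictionaries makes it easy to diff what actually changes between quality levels.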

2 Likes

ill put some of the settings up when i finish all my testing… But i will say Run Time Virtual textures… man… i dont know why and i cant explain why… but setting r.vt.FeedbackFactor=250 instead of the default 16… upped me by about 20 fps… im not even joking. And its steady… at 16 it flops around between 55 and 70, at 250 it sticks around 83-85…

and if u try to set r.vt.FeedbackFactor=1 … and try to lower the quality, for some reason your fps gets worse… down to like 40 fps… with blurry textures… Its very interesting and i guess i need to test it on lower end machines to see what thats all about.

Also i can really recommend this talk:

Please keep in mind, people, in Unreal Engine the graphics are 150% from the start, and you need to scale them down to make a game. Nobody needs pixel perfect shadows, or completely seamless LODs.
People don’t care about stuff like that, as long as your game runs smoothly and is well done!

1 Like

Alright, here are some benchmarks from my side. This is Nanite-only, as i have no time to manually optimize 300 meshes.

The scene is rather small, compared to an open world, as this was a VR escape room puzzle we did for fun a while ago.
It is using full photogrammetry assets, 8k Textures and a lot of foliage.
The full scene has nanite enabled, except the skybox. Real Time VSM. No Lumen.


The detail you can achieve with nanite is beautiful. Especially for a VR game, people's jaws dropped when they first played the demo, as you can literally go right up to the meshes, and they still look detailed. We are using super dense foliage meshes and tessellated landscapes. We also have WPO animated leaves and particles. So not the ideal use cases for Nanite.

The foliage and leaves also introduce quite a bit of overdraw, which is not ideal and could be optimized.


On the other hand, you can look at each individual leaf, as it has geometry!

Okay, now after i apply all optimizations (including TAA, DLSS, console commands to adjust quality settings) we get the following profiler stats:

The Nanite VisBuffer is the biggest chunk and sits around 1.8ms.
This is consistent across hardware. We barely noticed an increase when switching to lower end systems. What does change is the Shadow Depths and the rendering.
So OFC there will be less FPS on lower end systems.

Right now, we are getting 180-200 FPS with a frame time of 5ms.
On my Laptop (3050, 8 core, 3.8GHz) the frame time is ~8ms, so ~120FPS.

This is about the same when running the game in a standalone build.
Running on my Quest 2 with PC link i get ~6.5ms frametime with a VisBuffer of ~2.5ms.

For a VR Game, that requires 90FPS, this is absolutely playable and in Budget.
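The budget claim is easy to check from the frame times quoted above. This is just the frame time to FPS arithmetic with the numbers from this post; the helper names are made up for the example.

```python
def fps_from_frame_time(frame_ms):
    """Frames per second for a given frame time in milliseconds."""
    return 1000.0 / frame_ms


def budget_ms(target_fps):
    """Frame time budget in milliseconds for a target frame rate."""
    return 1000.0 / target_fps


def headroom_ms(frame_ms, target_fps):
    """Positive when the frame fits the budget, negative when it does not."""
    return budget_ms(target_fps) - frame_ms


if __name__ == "__main__":
    measurements = [
        ("desktop", 5.0),
        ("laptop (3050)", 8.0),
        ("Quest 2 via PC link", 6.5),
    ]
    target = 90  # VR target from the post
    print(f"budget at {target} FPS: {budget_ms(target):.2f} ms")
    for label, frame_ms in measurements:
        print(f"{label}: {frame_ms} ms -> {fps_from_frame_time(frame_ms):.0f} FPS, "
              f"headroom {headroom_ms(frame_ms, target):.2f} ms")
```

All three measurements leave a few milliseconds of headroom against the roughly 11.1 ms budget that 90 FPS allows.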
And while next gen graphics are absolutely demanding, the leap you can achieve with Nanite is incomparable to manual LODs.

Also it is important to notice, we did this project in one week with 3 people. This includes everything! The Programming, downloading the assets (which took a day), building the level, optimization. The quality we achieved in such a short amount of time would be impossible without nanite.

3 Likes

Now first I would like to state a few things.

  • You’re not using Lumen, which is most likely due to a gameplay design advantage. Good for you, but that doesn’t help the majority of game designs. That’s nobody’s fault, it just needs to be recognized.

  • Second, I’m not sure if you have a dynamic time of day/ position of lights according to geo which can have a considerable impact (positive) on your shadow timings.

While it might be true that you created a scenario that can balance the usual cost of Nanite, you also said this:

Okay, now after i apply all optimizations including TAA, DLSS,
On my Laptop (3050, 8 core, 3.8GHz) the frame time is ~8ms, so ~120FPS.

You just admitted motion is going to appear slop-like to thousands of potential players and all your detail is not going to be visible in motion/gameplay (unless gameplay mostly consists of the player simply turning their head). You almost convinced me, but what is the price in image quality on the 3050?

Also, you can’t argue with other studios opinions where the belief is realistic lighting is going to have a bigger impact on realism vs high res models. Just like I’m not going to argue with your opposite view.

Give me the exact resolution, full frame and in-motion screenshots. Then I can add input as you have given yours to my test. The issues with TAA are far from irrelevant.

@kurylo3d

Give me the same; there is no context for how motion looks at 60fps. You need to measure motion at 60 since it affects temporal image quality (nobody is mentioning this…), and don’t expect most consumers to play with vsync off when most are limited to 60hz.

TSAA upscaling.

TAAU or TSR. Big difference in quality and cost.

with the temporal thing… certain Materials that are animated need a specific check box to look good with temporal. “Has Pixel Animation” works wonders… like night and day. I had animated rain drop puddles that looked like crap with temporal but then i enabled the checkbox for it… I think its for shaders and moving things… and it makes the temporal look awesome on those things. Like it needs to know there is motion in the shader and a way to account for it. Whereas temporal has ways to know if things are moving by default, but not with shaders unless you enable it in the shader.

I am curious about exploring other AA methods to see what performance bumps i can get instead of using tsaa… though TSAA also has a whole bunch of cvars to tweak for quality and performance that i havent touched yet either.

For now im using TSAA

TSAA is not in unreal.
There is TAA(U)-Low-Epic
And TSR-CVAR control

There are thousands of different AA implementations, and one letter off means an entirely different implementation.

though TSAA also has a whole bunch of cvars to tweak for quality and performance that i havent touched yet either.

You can achieve better results with r.AntiAliasingMethod 2 and using
r.TemporalAA.Algorithm 0 (UE4)
r.TemporalAA.HistoryScreenPercentage 200
r.TemporalAA.Upsampling 1 (Only in UE4 unless you want instructions on how to take advantage of this algorithm by deleting a few parts of the engine, otherwise put to 0)
r.TemporalAACatmullRom 0
r.TemporalAAFilterSize 0.09
r.TemporalAASamples 2 (MSAAx2)
r.TemporalAACurrentFrameWeight .6 (or whatever barely stops the jitter under v-sync)

Better motion than standard TAA and cheaper than TSR.
No, it will not resolve a BS 8k frame during a still shot, but it’s thousands of times more consistent across frames.
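To illustrate why the current frame weight matters for motion, here is a toy model of temporal accumulation as a plain exponential blend between the new frame and the history. This is a simplification for intuition only, not Unreal's actual TAA shader, and 0.04 is used purely as a low-weight comparison point against the 0.6 suggested above.

```python
def accumulate(samples, current_frame_weight):
    """Blend each new sample into the history with the given weight."""
    history = samples[0]
    output = [history]
    for sample in samples[1:]:
        history = current_frame_weight * sample + (1.0 - current_frame_weight) * history
        output.append(history)
    return output


def frames_to_converge(current_frame_weight, tolerance=0.05):
    """Frames until old history contributes less than `tolerance` after a change.

    After n frames the stale history still contributes (1 - weight) ** n.
    """
    frames = 0
    residual = 1.0
    while residual > tolerance:
        residual *= (1.0 - current_frame_weight)
        frames += 1
    return frames


if __name__ == "__main__":
    for weight in (0.04, 0.6):
        print(f"weight {weight}: stale history fades in "
              f"{frames_to_converge(weight)} frames")
```

In this model a high weight forgets stale history within a handful of frames, which means less smearing in motion but also less smoothing of jitter, and that trade-off is what the "barely stops the jitter" tuning is balancing.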

“Has Pixel Animation” works wonders… like night and day.

Cool, we should have had it years ago along with mass UV subpixel jitter so only edges are sampled by TAA. But this doesn’t benefit abusive uses of TAA.