Nanite Performance is Not Better than Overdraw Focused LODs [TEST RESULTS]. Epic's Documentation is Endangering Optimization.

Idk, he is getting a lot of backlash and spending a ton of time and money trying to censor critics. Then again, creators usually just make the same type of content until they fall off. Mr Beast has used the same formula for like 10 years now and doesn’t seem to be stopping. Kevin’s growth has plateaued though, so his content will end up feeling repetitive to his existing fans. But rage baiters like him just need to keep people angry, they don’t need to keep people entertained.

2 Likes

Alan Wake 2 is only GPU limited at high resolutions; otherwise it looks to be CPU limited. We are not given exact thread timings, but the high CPU usage and the small difference between GPUs suggest this. 4K has always been hard to run, which is why upscalers are necessary. At 1080p, Alan Wake 2 runs only 20-30 fps lower than Apex Legends, which runs on the Source engine and relies heavily on art tricks to look good; it also uses extensive LODs and streaming.

Alan Wake 2 video: https://youtu.be/DMBfdbjzQDA?si=3-eTDYWd6l8yY1aR

GTA V (old tech) on a GTX 970 (a GPU of its time) at 4K, with performance similar to Alan Wake 2 at 4K on hardware of its time: https://youtu.be/HHfKS3ytzrU?si=NKc6gI65Ro_orR4k

I did this quick test in a real Unreal Engine scenario that uses a lot of occlusion. I compared the full Nanite meshes to the same meshes with just 5% of the original triangles. The difference is barely noticeable in performance (in this scene): https://www.youtube.com/watch?v=eST6XSf-Tyw

Of course, performance results may vary depending on the scene, but it’s always more useful to test these things in actual environments.

2 Likes

The gain from reducing triangles / precision is mostly in GPU memory allocation in a packaged game, but it requires a lot of instances and packing with ISM. It requires making Nanite switch to a minimal LOD1 with 0 triangles (not an impostor), because setting the fallback factor to 0 percent, for example, does not reduce as much. That creates the memory magic. It’s possible to gain a lot of headroom for open-world projects. That technique will not allow baking HLODs, because trees, for example, will be invisible. But it can also be done this way: switch LOD1 to an impostor, bake, then remove the impostor and create a dummy LOD1 with the lowest triangle count, for instanced HLODs. Pure hacks. I’m trying to find the point at which World Partition is needed for Nanite in my scenario, because the system also adds some overhead itself and that is problematic.

1 Like

Myasga, I’ve watched you test and experiment with all the different types of rendering - you’ve kept an open mind, not taking sides but forming your opinion by testing and collating the results with real game scenarios - you have my respect.

1 Like

haha - how’s your “game” going?

How do you feel watching everyone else making awesome Nanite games in the latest engine versions?

Thank you for the kind words and for taking the time to watch/read. I’m trying to get the most juice out of the engine. I can’t watch tutorials that have no real application to the topic at hand and just say it’s good or bad. That’s not because the documents and videos are uninformative, but because they turn into propaganda from both old-pipeline lovers and new ones. I’m also building my game in a basement and eating rocks since I left my job, so I have more energy and time to continue the diplomatic discussion on the forums and to learn C++ ^^. Fingers crossed for 5.7 (Witcher Edition) with Nanite assemblies and the new render type for them. I’m also testing the Nvidia branch and RTXDI methods.

Thank you as well for delivering awesome plugins; they and the tools inside them help me a lot with running tests and making this content. Please also ask Epic whether there is a way to reduce the overhead on impostors / last LODs that causes more memory usage with instances. I have no idea whether it is a material thing or just the material slot placement on the asset. I have not touched the debug tools to see what happens (not an expert). The billboards were generated with RDLodTools, of course. Fixing that could help minimize the differences between the systems and let LOD + shadow map users enable higher graphics settings / ray tracing (which allocate more memory), so it could be helpful for consoles and 8 GB graphics cards. I’m not sure I did it correctly, but maybe the Meshpack plugin also doesn’t pick up everything when packing and does a better job for Nanite itself. If you need help configuring it, I’m here. All the best!

BTW, here is how the baked HLOD looks (the not-so-dense distant area), with instances in front of that gap. I’m looking for a way to manage density; maybe Epic can help with that too. Baking detailed Nanite without a LOD1 switch is painful and can take a lot of hours and memory in the end: I had 11.5 GB allocated with RT and Epic details, just trees, 20k instances, 1.5 mln, auto setup, and it took 8 hours, versus 30 minutes and 5.7 GB allocated with a 2-triangle LOD1 switch. So mixing both techniques is a must for distant baking, as I discovered. Making the fallback factor aggressive does not help (it cuts half of the trees in the HLOD). Also, a 5k-triangle tree is not the 2-triangle thing that comes from a billboard ^, which saves that crucial memory allocation for close-range detail in the open-world case.
I have tried probably every possibility and scenario for HLODs, but I was not able to manage density: the baker cuts half of the impostor or 0% Nanite fallback factor instances. That is the problem I see that should be fixed somehow in the engine, by forcing all instances to be visible on the distant HLOD.
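
To put the two HLOD bake runs above side by side, here is a small Python sketch. It only restates the figures quoted in the post (8 hours and 11.5 GB for the full-detail bake, 30 minutes and 5.7 GB with the 2-triangle LOD1 switch); the class and function names are made up for illustration and are not part of any engine API.

```python
from dataclasses import dataclass


@dataclass
class BakeRun:
    label: str
    bake_minutes: float   # wall-clock HLOD bake time
    allocated_gb: float   # GPU memory allocated in the final result


def compare(baseline: BakeRun, candidate: BakeRun) -> dict:
    """Return how much faster and how much lighter the candidate run is."""
    return {
        "bake_speedup": baseline.bake_minutes / candidate.bake_minutes,
        "memory_saved_gb": baseline.allocated_gb - candidate.allocated_gb,
        "memory_ratio": candidate.allocated_gb / baseline.allocated_gb,
    }


# Figures quoted in the post above (trees only, 20k instances).
full_detail = BakeRun("full Nanite, no LOD1 switch", 8 * 60, 11.5)
lod1_switch = BakeRun("LOD1 2-triangle switch", 30, 5.7)

result = compare(full_detail, lod1_switch)
print(f"bake speedup: {result['bake_speedup']:.0f}x")
print(f"memory saved: {result['memory_saved_gb']:.1f} GB")
print(f"memory ratio: {result['memory_ratio']:.0%}")
```

On those numbers the LOD1-switch bake is about 16 times faster and ends up at roughly half the allocation, which is the trade-off the post is describing.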

1 Like

Did you try not baking the trees with the HLOD and instead leaving them as instances in the level? It would be an interesting comparison on the render time and VRAM usage not to mention a lot less for the HLOD build if it works out as fairly similar.

1 Like

I will test and get back to you. :clinking_beer_mugs:

The problem is that switching to an impostor takes a new slot/material in the Nanite mesh, and that is more memory demanding than an empty LOD1 Nanite switch with the lowest possible triangle count. So at the moment I need baked impostors, then I remove the billboard from the mesh and let Nanite stay in memory with an empty LOD switch to itself. I know it’s kind of illogical, but in my tests this allocated the least memory I could achieve. My terrain tiles are an example, and the reason I don’t use the Landscape system.

That material or material slot allocation takes memory for some reason when switching. It’s visible with thousands of instances. So it will take more than the empty Nanite switch, as I discovered, since the Nanite mesh is calculated somewhat like with the fallback factor but reduced to the setup made in LOD1. Keep in mind that the Nanite asset will also be blurry in fast movement because of the massive triangle reduction, which Epic would not expect ^^.

Nanite builds a 12x12 frame Impostor when importing/converting meshes and displays that when the mesh is rendered at or less than 12 pixels - but they only take about 40KB memory and I think that’s always resident (I could be wrong about that, but it doesn’t really make any difference) - did you test with pure Nanite trees without HLODing them?

1 Like

That’s why I’m trying not to use World Partition at all, because of my discovery. But I’m pushing it to the limits. I will get back to you on how things went. Here is how it looks with a pine tree as an example, with 10% base triangles and 0 fallback factor.
Full Nanite with 100% fallback factor (full geo pine)

AUTO

0% Feedback

10% triangles, 0% feedback

Auto look

Maximal fallback reduced

10% + 0% fallback and lowest precision

Lowest setup

Highest (check frames)

1 Like

Here’s some snapshots from a quick test of the impostors at play, pure Nanite tree (100% triangles).

There’s 18,600 trees (all same type) with 18,286 triangles each (total of 340,305,600).
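
As a quick sanity check on the scale involved, the raw triangle budget is just the instance count times the triangles per instance. The Python sketch below uses the two figures quoted above; `keep_percent` is a hypothetical knob that models a plain percentage reduction of the source triangle count and is not an engine API.

```python
def raw_triangle_budget(instances: int, tris_per_instance: int,
                        keep_percent: float = 100.0) -> int:
    """Triangles in the scene if every instance were drawn at source detail.

    keep_percent is a hypothetical knob that scales the per-mesh triangle
    count; it models a simple percentage reduction, nothing more.
    """
    kept = round(tris_per_instance * keep_percent / 100.0)
    return instances * kept


full = raw_triangle_budget(18_600, 18_286)
reduced = raw_triangle_budget(18_600, 18_286, keep_percent=5.0)
print(f"full detail: {full:,} triangles")
print(f"5% kept:     {reduced:,} triangles")
```

The product of the two figures as quoted lands in the region of 340 million, very slightly under the total given above, so one of the quoted numbers is probably off by a digit. Either way, the point of the test is how far below this raw budget the rendered triangle count stays.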

When the scene (in Stand-alone) is viewed from a height where the trees are 12px or less:

You can zoom out further but the tri-count remains around the same amount.

As soon as you move closer and the trees start getting rendered at over 12px, the tri-count quickly goes up to around this:

and stays at that until you get to around this point:

and then as you move even closer, it starts getting much larger as the detail fills in.

Reducing the “Keep Triangle Percent” down also dramatically reduces the amount of triangles.

Interestingly, replacing the Nanite meshes with aggressive LODs (5 LODs + Impostor, SS: 2, 1.2, 1.0, .75, .6, Impostor .5) gives around the same number of triangles at distance, and increases to the same amount of 68K at around the same height:
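
For readers unfamiliar with threshold-based LOD switching, here is a simplified Python model of how a chain like the one above could be evaluated. It assumes "SS" means screen size, that the thresholds are listed in descending order, and that a level becomes active once the mesh's screen size drops to or below its threshold. It is a sketch of the general idea, not Unreal's actual selection code.

```python
# Screen-size thresholds from the post: LOD0..LOD4, then the impostor.
THRESHOLDS = [2.0, 1.2, 1.0, 0.75, 0.6, 0.5]
NAMES = ["LOD0", "LOD1", "LOD2", "LOD3", "LOD4", "Impostor"]


def select_lod(screen_size: float, thresholds=THRESHOLDS) -> int:
    """Pick the coarsest level whose threshold is still >= screen_size.

    Simplified model: thresholds are in descending order and a level becomes
    active once the mesh's screen size drops to or below its threshold.
    """
    chosen = 0
    for index, threshold in enumerate(thresholds):
        if screen_size <= threshold:
            chosen = index
    return chosen


for size in (1.5, 1.1, 0.8, 0.55, 0.1):
    print(f"screen size {size:>4}: {NAMES[select_lod(size)]}")
```

With thresholds this high, the full-detail LOD0 is only picked when the tree fills more than the screen, which is what makes the setup "aggressive".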

But as you get close, even with the LODs the tri-count gets larger (you could reduce the polycount of the LODs more - but when they’re this aggressive it already is starting to really show):

Regarding memory usage:

2 Likes

In the middle of 80k tree instances, 16k level size, without World Partition and RT, Epic settings, 100% screen percentage, Nvidia branch:

1) MAX
Nanite 400k + 400k triangles, auto precision
Packaged size: [image]
Level loading time (PCIe 4.0 SSD): 33 s
Memory usage: [image]

2) Auto fallback
Packaged size: [image]
Level loading time: 32 s
Memory usage: [image]

3) 10% triangles, fallback 0%, lowest precision possible without breaking the asset
Level loading time: 31.5 s
Memory usage: [image]

This comes with scale, so with 80k instances there is minimal difference, but this is without RT and Lumen. So you can multiply the memory by 1.5 in each case.

So, as an example of the proportion:
Highest without RT: 1 GB
Lowest: 800 MB
With RT the highest will take 1.5 GB and the lowest 1.2 GB. Then you scale that by the number of assets used and the quality they deliver.
I’m trying to make the impostor work with Nanite and switch to it for the 80k scenario, then increase the amount and turn on RT.
Keep in mind that this is not the Landscape system; I’m sure you would hit something like 6-7 GB of memory at the moment with an 8k size, due to many more triangles being drawn.
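
The proportion example above can be written out as a tiny Python sketch. The 1.5 multiplier is the rule of thumb from the post, not a measured constant, and the function name is hypothetical.

```python
RT_MULTIPLIER = 1.5  # rule of thumb from the post, not a measured constant


def with_ray_tracing(base_gb: float, multiplier: float = RT_MULTIPLIER) -> float:
    """Estimate GPU memory with RT/Lumen enabled from a non-RT baseline."""
    return base_gb * multiplier


for label, base_gb in (("highest", 1.0), ("lowest", 0.8)):
    print(f"{label}: {base_gb:.1f} GB without RT -> "
          f"{with_ray_tracing(base_gb):.1f} GB with RT")
```

Because the multiplier applies to both setups, the absolute gap between highest and lowest grows with it (0.2 GB without RT, 0.3 GB with RT in this example), and grows again once it is scaled by the number of assets.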

I’m not sure what you mean by “feedback factor.” Did you mean “Fallback”? The Fallback mesh is only used when the platform doesn’t support Nanite, or in specific cases like spline meshes in versions earlier than 5.3 or meshes with transparent materials.

In my test, I reduced the “Keep Triangle Percent” value, which effectively controls the maximum number of triangles Nanite can use for a mesh. As shown in the video, setting this to 5% results in significantly simplified meshes.

I ran this test because there was a previous claim that increasing mesh density would negatively impact performance and that Epic was being dishonest about it. My goal was to show that in a real, complex scene, not just a few spheres in a map, performance is barely affected when using more or fewer triangles, as long as you’re leveraging Nanite and occlusion properly.

1 Like

Fallback, excuse me. What I mean is that reducing triangles gives you memory headroom, because more triangles mean more allocation at runtime.
There are 80k tree instances and this is only a small part of the terrain.

3440x1440 with Epic settings, 100% screen percentage: around 4.5 GB of GPU memory now, and this is without ray tracing.
Think about how you could fill that map with more assets (trees), because the rest will be removed or culled somehow.

Path tracing and RTXDI, packaged: 80k tree instances, 10% triangles and fallback factor at 0%
RTX 4060, full path tracing scenario, without frame generation:

Level loading time: 36 s

Memory allocation:

33% screen percentage: 6.3 GB allocation (TSR)

100% screen percentage: 7.4 GB allocation (TSR)

33% DLSS 4 Ultra Performance: 6 GB allocation

100% DLAA: 7.1 GB allocation
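
One way to read the four allocations above is to look at the difference between the 33% and 100% runs for each upscaler. The Python sketch below does only that subtraction on the quoted figures; treating the difference as the resolution-dependent part of the allocation is an interpretation, not something measured in the test.

```python
# Allocations quoted above, in GB: (33% run, 100% run).
ALLOCATIONS_GB = {
    "TSR": (6.3, 7.4),
    "DLSS / DLAA": (6.0, 7.1),
}


def resolution_cost(low_gb: float, high_gb: float) -> float:
    """Extra memory attributable to rendering at 100% instead of 33%."""
    return high_gb - low_gb


for upscaler, (low, high) in ALLOCATIONS_GB.items():
    print(f"{upscaler}: +{resolution_cost(low, high):.1f} GB at 100%")
```

Both pairs differ by about 1.1 GB, so most of the 6-7 GB is allocated regardless of the internal resolution.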


It looks how it looks (33% DLSS), and I don’t like it. RTXDI is very expensive with the path tracing feature, as you can see in the frame rate. I pretty much turned on everything that was available through the renderer settings and post-process features (one option had a visual bug). Now packaging with 100% Nanite.

Also, a new option showed that it is reduced to the minimum possible fallback. Does anyone know where that can be tweaked?

100% Nanite took the same amount of memory, 7.4 GB, maybe because of that RT fallback factor. Going to scale to 300k instances without RT.

280k: it took like 30 minutes to add that amount and the engine froze. How do you fill a world with those 500k-triangle Nanite assets, can anyone share tips? :frowning: And it takes longer each time as more stuff is placed.

Loading the level on a 7000 MB/s SSD: 8:40, so nice for getting a coffee (each time, and imagine a crash), and 300 MB more memory demanding.
It’s unplayable, so without WP and really low-poly assets there is no way to do a seamless open world with 500k-triangle Nanite trees.



map fill

With 10% and 0% feedback factor: 8:45 and 200 MB less GPU memory usage.
Next: World Partition tests.

Yep, there are no problems placing a lot of foliage, because WP splits instanced actors… maybe someone has a solution for that?

The problem is the spawned instance count. 2 mln trees are possible with World Partition, but the point is that they are in memory and the map takes a long time to load…
So everything must be unloaded and the trees need to be baked into merged HLODs, which don’t take a lot of memory because the instances are not spawned there at runtime.

This is how a low-triangle-count Nanite pine tree looks baked into an HLOD.
A more detailed Nanite mesh gives more visibility, but the bake can take eons and adds more streaming overhead when the tile loads.
So that is why an impostor or billboard should be baked:
it looks better, it’s visible from a distance, and it has a low memory cost.

And the trees are gone from that distance

One merged tile with 20k instances

Nice to see you are outing yourself as to why you are clueless.

@MySaga
Don’t use the landscape system. Bake it down to a proper mesh, section it off for occlusion.
The size of the level should be drastically reduced and faster to load as a result.

If you are using Nanite for whatever reason, then you should really only see benefits from removing other parts of the engine that are obviously flawed to the point of only bringing issues even on non-Nanite projects…

I do have to ask this though: is it worth wasting all of your time on this instead of using it to build proper mesh LODs and level geometry with HLODs or whatever else baked custom?

Probably not? Anyone can make custom components that work way better than whatever Epic provides. I suggest you do just that and make your own HLOD aggregation system instead of being at Epic’s mercy - in other words, use your time towards things you will forever re-use within your projects rather than focusing on something that Epic will likely change or kill in a version or 2…
Also, I think HLODs are probably still broken. Haven’t seen anything positive on them in the forums or anywhere else either way…

1 Like

Those are baked and optimized meshes (detail/vis/minimal material overhead, so there was no frame loss when applying an advanced material). Fewer triangles would reduce detail ;( and visible seams between baked HLODs would appear. So reducing the detail further is kind of killing the realistic style at this point. I see your point, and I will try to find someone who can help with HLODs, or tools that can be used instead of Unreal’s. I can generate HLODs, but the point is that there are no tools that can fix the density of the forest/trees; even with billboards, the number of mesh instances is reduced on the last HLODs. As for building tools, I won’t be able to make this any time soon on my own, heh.

I found this document the other day, you may have already seen it - but it’s got some useful tips in it.

1 Like

I recently read a very good Nanite vs LODs test here: https://synapse.crdg.jp/2025/05/29/testing-nanite-performance-on-mid-range-hardware-lessons-from-the-field/

2 Likes

And it was done on 5.5. Considering 5.6 has even better culling for Nanite, that means Nanite beats the ■■■■ out of LODs every day. It looks and performs better.
Edit: He didn’t use HLODs for the test?