NEW(3/22/2024) UE5.5+ feedback: Please Invest In Actual PERFORMANCE Innovations Beyond Frame Smearing For Actual GAMES.

Nope. Nanite is best tech

@Trastrastras NO it is not. It was made for virtual production and lazy developers. All the games that perform very well DO NOT use Nanite.

Here is an entire thread showing how damaging Nanite can be in a simple test.

Optimizing your meshes via detail textures and overdraw-friendly LODs will ALWAYS blow away Nanite.

I decided on Nanite the moment I saw how one nanite mesh looks nested inside the other. For example, you have a pile of rubble that you place on the landscape - in the case of UE4 (non-nanite) there is a line on the ground clearly separating the landscape and the mesh. You can have a perfect low poly model with a perfectly baked normal map, you can use postprocess material to blend it, but it will never look good, it just won’t, there will always be a sharp line where the geometry of rubble intrudes into the landscape. In the case of Nanite rubble with several million polygons in UE5, the individual pieces of rubble are actual geometry and if you dip the rubble into the landscape it looks like there are little stones lying on the ground. Or you take a barrel and stick it in a pile of rubble - in UE4 you see a clear articulation of geometry, an ugly sharp line that can’t be hidden, in UE5 with Nanite it looks like one organic model. You just can’t get around this with a normal map or anything else. So it’s not about the laziness of the developers to make honest low poly models but it’s about the fact that the result looks much better with Nanite.

I’m not saying it doesn’t have a place in development, but in combination with Lumen and VSMs it is performance hell.

in UE5 with Nanite it looks like one organic model.

Yes, but that exact point becomes completely ridiculous if it performs so badly that you have to use a motion-smearing upscaler like DLSS or TSR. Then the detail you asked for becomes irrelevant during gameplay/motion.
The situation becomes an oxymoron and hurts gamers with excellent hardware.
We shouldn’t be trying to make games for 4090s.

Meanwhile on the RE Engine (RE4 Remake) I get amazing graphics running on an old GTX1070 and even that card isn’t exactly cheap. I agree it is nonsense to make software for gamers only for the fastest hardware out there. Optimization is key.

To be fair to UE5, all the great-looking “optimized” RE games I know of aren’t calculating anything super hard. Lighting, shadows and environments are all baked (pretty much free) because the games don’t require a lot of dynamism.

UE5 could do the same things (except SMAA) and get similar if not better perf.

But I do agree with the last statement tho, studios need to stop catering to ultra rich GPU purchasers. It’s become darn near elitism.

Also, The First Descendant looks great (without the forced TAA) and uses Lumen and no Nanite.

I WAS WRONG. You can control WPO disable distances on nanite foliage. Spent a few hours digging through the engine source and different settings searching for ways to optimize my level. I found that WPO distances are set on a per-mesh basis when digging through all of the foliage settings. Never noticed it before.

For foliage placed with the foliage placement tool, you have to set the WPO disable distance for each foliage type. (You can absolutely set them all at the same time, like any other setting.)

If you’re like me and easily overlook things, I have screenshots

Screenshot showing the setting in the foliage editor.

The setting appears differently in the LGT (Landscape Grass Type) settings.

After adjusting these settings, I was able to push my main development PC just above 70FPS at native 1440p in Unreal Engine 5.3.

Performance uplift at native 1440p


To be completely fair, I modified the NaniteRasterizer.usf shader file. That said, I don’t think this gave me a major boost in performance, if any at all. I only noticed a slight millisecond decrease when I first amended the shader, so I think (and hope) that it’s within the margin of error.

GPU Profiling


I know, that’s why I shared the FN Nanite presentation. Which I see no one clicked on…
They added that just for foliage. Read the doc to see how to debug/optimize it more.

If you have optimized meshes, turn off Nanite because of the slowdown from Nanite’s overhead.
Unless you want to sacrifice performance for no pop.

Also, again show me the Real-time “stat gpu” chart. Not the profiler.

My problem was that I was looking for settings via console commands and only found a way to disable WPODisableDistance, with no command to change the distance. I also didn’t see it in the foliage or static mesh editor, and I barely noticed it when I discovered it the first time.

That’s why I provided screenshots for anyone else like me who is bogged down with other shyt and can easily overlook things like this. An honest mistake.


For me, Nanite isn’t about pop-in. I took down the quality of so many other settings that my VSMs are showing major pop-in.

Nanite was supposed to be a draw call optimization feature, as advertised by Epic with “Lumen in the Land of Nanite”. They talked about how Nanite could automatically reduce many draw calls down to a single draw call. I assumed they were using the new mesh shader rendering pipeline, which was shown to have massive potential for future games that would one day swap pipelines.

I wrongly assumed that optimized geometry would perform better than the Epic tech demos.

My issue with Nanite is what I said before: there is no real way to control how Nanite behaves or which elements you want to sacrifice in favor of performance.

For example:

  • If you’re a virtual production team, you’ll benefit from Nanite’s ability to blend geometry on a micro level. But for video games, this level of complexity is not required and should be opt-in. I should be able to control how nanite chooses to cull geometry. This way, I can have optimized geometry that also takes advantage of Nanite’s optimizations without overhead.

  • Nanite also offers no way to control cluster density per static mesh. Let’s say we’re using the City Sample cars. These cars use Nanite and look ugly with r.Nanite.MaxPixelsPerEdge=8 once you are close enough for Nanite to start aggressively decreasing the geometric detail. Let’s say I don’t want these cars to scale as aggressively as I may want my buildings or other things to scale. You have no way of controlling this. (I already attempted adjusting the precision in the Static Mesh editor. This just creates a weird ‘voxelized’ geometry look on everything.)

No, I don’t believe Nanite is for Lazy developers. It’s a new tool with a lot of potential. However, it is not yet ready for games production unless you have the resources to investigate potential improvements for yourself.

I amended the post to include the capture from the GPU.

Updated again

No, I don’t believe Nanite is for Lazy developers

Tell that to the Remnant 2 devs lmao. Anyway, if you want to use Nanite, fine, but I’m not going to be happy with your game’s performance until it’s running at 70fps at native 4K on your 3090. (Which I believe is very possible.)
I’m rooting for your project, I’m serious about fixing it because it will affect the public perception of UE5.

Show me the real-time “stat GPU” log.
I enjoy optimizing, so it’s a mutual benefit for the both of us.

EDIT: just saw reply, not profile gpu.
I want the stat gpu.
EDIT 2: Can you include the real-time stat gpu chart as well as the Alt+R Nvidia GPU usage stats in the same picture?

Did you also revert Foliage and all meshes to non-nanite since no VSM?

What would you say then was the setting(s) that gave you the most uplift in perf? It’s a good 15FPS and 6-7ms gain!

Did you also revert Foliage and all meshes to non-nanite since no VSM?

All of my tests have been with Nanite enabled.

What would you say then was the setting(s) that gave you the most uplift in perf? It’s a good 15FPS and 6-7ms gain!

Note, I’m only giving the scalability settings that I used in the screenshots above. I copied BaseScalability.ini from the engine’s Config directory to my project’s Config directory and renamed the file to DefaultScalability.ini.
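
A minimal Python sketch of that copy step, for anyone who wants to script it. The function name and the paths in the usage comment are placeholders made up for illustration; the only things taken from the post are the two file names. It assumes the stock file sits under Engine/Config inside the engine install, and it refuses to overwrite an existing DefaultScalability.ini.

```python
import shutil
from pathlib import Path


def copy_base_scalability(engine_root: Path, project_root: Path) -> Path:
    """Copy the engine's BaseScalability.ini into the project as DefaultScalability.ini."""
    # Assumed layout: <engine_root>/Engine/Config/BaseScalability.ini
    source = engine_root / "Engine" / "Config" / "BaseScalability.ini"
    target = project_root / "Config" / "DefaultScalability.ini"
    if target.exists():
        # Do not silently clobber settings the project already overrides.
        raise FileExistsError(f"{target} already exists; refusing to overwrite it")
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source, target)
    return target


# Example call (placeholder paths, adjust to your own install and project):
# copy_base_scalability(Path("C:/UE_5.3"), Path("D:/Projects/MyGame"))
```

Once the copy exists, the overrides listed below go into the matching sections of the project file.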

All of these settings played a role in performance.

I then changed the settings in DefaultScalability.ini as follows:

Virtual Shadow Settings:
[ShadowQuality@3]
r.LightFunctionQuality=1
r.ShadowQuality=5
r.Shadow.CSM.MaxCascades=10
r.Shadow.MaxResolution=2048
r.Shadow.MaxCSMResolution=2048
r.Shadow.RadiusThreshold=0.01
r.Shadow.DistanceScale=1.0
r.Shadow.CSM.TransitionScale=1.0
r.Shadow.PreShadowResolutionFactor=1.0
r.DistanceFieldShadowing=1
r.VolumetricFog=1
r.VolumetricFog.GridPixelSize=16
r.VolumetricFog.GridSizeZ=64
r.VolumetricFog.HistoryMissSupersampleCount=2
r.LightMaxDrawDistanceScale=1
r.CapsuleShadows=1
r.Shadow.Virtual.MaxPhysicalPages=2048
r.Shadow.Virtual.ResolutionLodBiasDirectional=1.0
r.Shadow.Virtual.ResolutionLodBiasDirectionalMoving=0.5
r.Shadow.Virtual.ResolutionLodBiasLocal=0.0
r.Shadow.Virtual.ResolutionLodBiasLocalMoving=1.0
r.Shadow.Virtual.SMRT.RayCountDirectional=6
r.Shadow.Virtual.SMRT.SamplesPerRayDirectional=4
r.Shadow.Virtual.SMRT.RayCountLocal=6
r.Shadow.Virtual.SMRT.SamplesPerRayLocal=4

Global Illumination Settings:
[GlobalIlluminationQuality@3]
r.DistanceFieldAO=1
r.AOQuality=2
r.Lumen.DiffuseIndirect.Allow=1
r.LumenScene.Radiosity.ProbeSpacing=8
r.LumenScene.Radiosity.HemisphereProbeResolution=2
r.Lumen.TraceMeshSDFs.Allow=1
r.Lumen.ScreenProbeGather.RadianceCache.ProbeResolution=8
r.Lumen.ScreenProbeGather.RadianceCache.NumProbesToTraceBudget=200
r.Lumen.ScreenProbeGather.DownsampleFactor=32
r.Lumen.ScreenProbeGather.TracingOctahedronResolution=8
r.Lumen.ScreenProbeGather.IrradianceFormat=1
r.Lumen.ScreenProbeGather.StochasticInterpolation=0
r.Lumen.ScreenProbeGather.FullResolutionJitterWidth=0
r.Lumen.ScreenProbeGather.TwoSidedFoliageBackfaceDiffuse=1
r.Lumen.ScreenProbeGather.ScreenTraces.HZBTraversal.FullResDepth=0
r.Lumen.TranslucencyVolume.GridPixelSize=64
r.Lumen.TranslucencyVolume.TraceFromVolume=0
r.Lumen.TranslucencyVolume.TracingOctahedronResolution=2
r.Lumen.TranslucencyVolume.RadianceCache.ProbeResolution=8
r.Lumen.TranslucencyVolume.RadianceCache.NumProbesToTraceBudget=100

DefaultEngine.ini
; NANITE SETTINGS:
r.OptimizedWPO=1
; THIS SETTING NO LONGER EXISTS IN 5.3. It disabled invalidating the VSM cache page on moving foliage. You can now do this by using the new VSM options on each foliage type and setting the Cache Invalidation Behavior to "Static".
r.Nanite.ProgrammableRaster.Shadows=0

r.Nanite.MaxPixelsPerEdge=2
r.Nanite.DicingRate=1
r.Nanite.FastVisBufferClear=2

; I'm still testing if these gave me any performance uplift. So use these with caution.
; ------------------ TEST ------------------
r.Nanite.FastTileClear=1
r.Nanite.MaterialSortMode=2
r.Nanite.MinPixelsPerEdgeHW=32.0
r.Nanite.ImposterMaxPixels=10
r.Nanite.PrimShaderRasterization=1
r.Nanite.VSMMeshShaderRasterization=1
; ------------------ END TEST ------------------

; LUMEN SETTINGS:
r.Lumen.TraceMeshSDFs=0
r.Lumen.SampleFog=0

They added that option in 5.2.1, both per mesh (in case you have stuff like flags moving with WPO) and for foliage. In 5.3 they added even more flexibility.
But, yeah, VSM caching was taking a big chunk, especially for the overlap of shadows, so it made it look like Nanite wasn’t worth it.

In my map I have 400M polys and Nanite works as intended. I honestly can’t believe people are against it! It has worked amazingly since Early Access and has gotten better and better, especially at culling and at keeping small meshes in the scene.
I’ll post more about my racing game and its performance in the next few days. Spoiler: both Nanite and Lumen do a great job, even at high speed.

I honestly can’t believe people are against it!

It’s worse perf than just optimizing it. And gamers want their games optimized.

Is your game meeting the current gen to resolution ratio at 60fps?
If not, it’s because of something you shouldn’t have chosen to use for your game.

“Just” optimizing it, like that doesn’t add extra months to development and, likely, lower the quality of the outcome…
Oh, your test is quite pointless as that’s not a video game, those are meshes thrown together in an empty space. I’m talking about thousands of actors, heavy foliage, particles, and whatnot. You know, like a real game.

Yes, my game hits 60fps even on a GTX 1080 (with proper upscale). I guess I decided to put my time into optimizing the whole game rather than any single part of it? The result is a better-looking result, likely more performant, and with a shorter development cycle. The horror…

“Just” optimizing it

It’s called making LODs for your objects and using texture tricks.

I’m talking about thousands of actors

I did a test with 10,000. The GPU was faster rendering 5 million triangles than it was with Nanite “crunching” them down to 4000.

The result is a better-looking result, likely more performant, and with a shorter development cycle. The horror…

That depends. If your game is so unperformant that it needs a blurry upscaler, it will look like smeary pixels when you move the camera (so basically during gameplay, all the time). It doesn’t really make sense to say it looks better except when you stop the camera to take a screenshot and use that instant as the promotional image.

like that doesn’t add extra months to development and, likely, lower the quality of the outcome…

You know what, go ahead. If customers don’t like what they see, they will review it for others.

Every UE game that has impressed me performance wise, didn’t use Nanite.

Oh, your test is quite pointless as that’s not a video game

If you don’t like my test go make your own.
If such a huge difference can be seen in a small test, what do you expect from a real scene? And stop treating me like I’m the only one who did a test. (I linked 5 other tests by 5 other people.)

Go optimize your scene and compare it with the Nanite version.
You’re like everyone else I meet who goes insane and gets insulting when someone says Nanite isn’t God’s gift to rendering meshes.

You are choosing between yourself and your customers when you decide between optimizing and slapping on Nanite.

my game hits 60fps even on a GTX 1080 (with proper upscale)

By the 30 series, that card is 20% slower than a 3060. So at 1080p, your game should run 48fps at native 1080p.
If you are hitting above that then good for you and your customers who give you their money.
But you said you’re making a racing game.
Not exactly expecting that to require the maximum amount of resources out of UE5 anyways.
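
For anyone checking where 48fps comes from, here is the arithmetic as a small Python sketch. It assumes the reference point is 60fps on a 3060 at native 1080p and that frame rate scales linearly with relative GPU speed, which is a rough rule of thumb and not a measurement. The function names are made up for illustration.

```python
def scaled_fps(baseline_fps: float, relative_speed: float) -> float:
    """Estimate fps on a card that runs at `relative_speed` times the baseline card."""
    return baseline_fps * relative_speed


def frame_time_ms(fps: float) -> float:
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps


baseline = 60.0   # assumed target on the reference card (3060, native 1080p)
slower_by = 0.20  # the "20% slower" figure from the post above

estimate = scaled_fps(baseline, 1.0 - slower_by)
print(f"{estimate:.0f} fps, {frame_time_ms(estimate):.2f} ms per frame")
# prints: 48 fps, 20.83 ms per frame
```

Real games rarely scale this cleanly, so treat the number as a ballpark target only.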

EDIT: As for the time constraints of optimizing and making good-looking LODs, I already made a post about an AI workflow that would benefit developers the same way Nanite does, but would also help gamers because of the performance uplift it would provide.

I thought that nanite had some fancy rasterization + culling method that was tanking performance. After investigating and digging through the C++ and shader code, I found that I WAS WRONG

Their culling seems to be very normal and efficient. So that’s a plus. However, it just goes back to what I said before about ignoring traditional optimization techniques.


The Problem: Nanite’s Rasterization

Simply put, the issue boils down to good old contention issues on the GPU using image atomic operations.

Nanite makes things a bit tricky because it insists on using its GPU LOD picking system. While this system does a decent job of managing scene complexity, it doesn’t consider the optimization of textures, materials, and other vital factors for efficient distant rendering.

Potential Fixes and Ideas

  1. Customizable LODs Per Mesh: Giving developers the power to define custom LOD settings for individual Nanite meshes could be a game-changer. It would enable us developers to optimize textures and materials specifically for Nanite geometry when rendered at a distance. Example: These LOD settings could be distances for which more simplified materials are rendered.

  2. The introduction of a Nanite imposter baking tool for foliage would be a valuable addition. This tool could help simplify the material complexity and reduce the number of holes in masked materials, contributing to the mitigation of contention issues by allowing the culling pass to effectively cull more geometry.
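
To make idea 1 concrete, here is a rough Python sketch of what per-mesh distance tiers for materials could look like. This is a hypothetical feature: nothing here is an existing Nanite or Unreal API, and the class name, function name, tier names and distances are all invented for illustration.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class MaterialTier:
    min_distance: float  # camera distance at which this tier takes over
    material: str        # hypothetical material name


def pick_material(tiers: list[MaterialTier], distance: float) -> str:
    """Return the material of the last tier whose min_distance is <= distance.

    `tiers` must be sorted by min_distance, with the first tier starting at 0.
    """
    starts = [tier.min_distance for tier in tiers]
    index = bisect_right(starts, distance) - 1
    return tiers[max(index, 0)].material


# Invented example: full material up close, cheaper variants further away.
foliage_tiers = [
    MaterialTier(0.0, "M_Foliage_Full"),
    MaterialTier(5000.0, "M_Foliage_NoWPO"),
    MaterialTier(20000.0, "M_Foliage_Opaque"),
]

print(pick_material(foliage_tiers, 7500.0))
```

The point of the sketch is only that the selection itself is cheap; the real work would be in letting the engine swap materials on Nanite geometry at those distances.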

One thing that I have noticed when testing non-Nanite static meshes with Virtual Shadow Maps is that they do not perform as well in motion. So a customizable Nanite LOD system seems like a better idea than a hybrid setup.

If I need to explain contention issues, let me know. But this is why lowering the internal resolution boosts Nanite performance.
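
For readers who want the contention point spelled out, here is a toy, sequential Python model. It is not Nanite’s code and not a GPU timing model. It assumes each pixel of a visibility buffer holds one packed integer with depth in the high 32 bits, that a larger depth value means nearer, and that every fragment updates its pixel with a max operation standing in for an image atomic. Full-screen layers stand in for worst-case overdraw such as stacked masked foliage. All names are made up for illustration.

```python
from collections import Counter


def pack(depth: int, payload: int) -> int:
    """Put depth in the high 32 bits so an integer max keeps the nearest fragment."""
    return (depth << 32) | (payload & 0xFFFFFFFF)


def full_screen_layers(width: int, height: int, layers: int):
    """Yield (x, y, depth, payload) for `layers` stacked full-screen layers."""
    for layer in range(layers):
        for y in range(height):
            for x in range(width):
                yield x, y, layer + 1, layer


def rasterize(fragments, width: int, height: int):
    """Apply a max per pixel and count how many writes each pixel receives."""
    vis = [0] * (width * height)
    writes = Counter()
    for x, y, depth, payload in fragments:
        idx = y * width + x
        vis[idx] = max(vis[idx], pack(depth, payload))  # stand-in for an atomic max
        writes[idx] += 1
    return vis, writes


def contended_writes(writes: Counter) -> int:
    """Writes beyond the first on each pixel, i.e. the ones competing for one address."""
    return sum(count - 1 for count in writes.values())


_, full_res = rasterize(full_screen_layers(64, 64, 8), 64, 64)
_, half_res = rasterize(full_screen_layers(32, 32, 8), 32, 32)
print(contended_writes(full_res), contended_writes(half_res))
```

In this model, halving the resolution on each axis cuts the competing writes to a quarter for the same number of layers, which matches the direction of the effect described above. How much that costs on real hardware depends on how many of those writes actually land at the same time.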

Those are the same instances. I’m talking about thousands of instances from hundreds of different meshes of any kind. Like, you know, a real game.

You notice any smearing when you go below 75% upscale. And the heavy lifters are Lumen and VSM, Nanite does its job properly.

Yeah, I’ve been going ahead for over two years now and people clearly love what I’m doing, either from images, videos, and closed beta testing. What have you done?

How many games have been released that used Nanite since the beginning of their development? Likely none. It’s too early to tell.

I don’t have time to do tests as I’m making a game instead. Those tests are likely flawed because they aren’t as complex as a real-life scenario. All the demo scenes I’ve seen converted to Nanite meshes performed better.

It’s actually 2% faster than a 3060: UserBenchmark: Nvidia GTX 1080 vs RTX 3060
And that 48fps where is even coming from? Should run? Because every game is the same in terms of level complexity?
And since when does a racing game not require the maximum amount of resources from an engine? In fact, it’s the opposite, since you need to push high speed on top of good visuals. :joy:

First of all, thanks for the CVars you posted, those are great! I’m going to share my findings in the next few days. Going back on topic…

Wouldn’t having to create LODs and impostors be the exact opposite of what Nanite is trying to achieve? This would be like adding a gas engine back to an EV, just in case your battery runs out.
Nanite works great as it is (especially in 5.3), they might just add extra controls for stuff you mentioned.
Non-nanite SM will be a thing of the past as I’m confident they’ll be able to use Nanite for everything, including translucent materials and SKM.

I don’t use userbenchmark.
I watch benchmarks based on real games and did the math that way.
Once again. You just “listened” to no actual data you saw with your eyes.

Like, you know, a real game. How many games have been released that used Nanite since the beginning of their development? Likely none. It’s too early to tell.

How many times are you going to ignore the 5 other tests and the UE5 GAMES that DON’T use Nanite but use LUMEN and VSM?

And that 48fps where is even coming from? Should run? Because every game is the same in terms of level complexity?

I’ll just quote myself. The attitude you’re giving is unbelievably rude.

So at 1080p, your game should run 48fps at native 1080p.
If you are hitting above that then good for you and your customers-

Emphasis on “good for you”. Do I really have to explain to you why 60fps at native res is good?
Did you lack the common sense to see that going above 60fps on your hardware would not offend me?

You notice any smearing when you go below 75% upscale. And the heavy lifters are Lumen and VSM, Nanite does its job properly.

A LOT of us see smearing at 25 to 100%

Yeah, I’ve been going ahead for over two years now and people clearly love what I’m doing, either from images, videos, and closed beta testing. What have you done?

So you came to argue with literally everything I said (and, more importantly, showed) and wanted to see if you could insult me. Great :clap::clap::clap:

I have been working on games for plenty of years. I’m only “newish” (still vastly ahead of most people) to C++ and UE. And because I am newer to Unreal than most, I sadly and clearly see major issues.