Lumen GI and Reflections feedback thread

Okay… That sentence really p****s me off. A 3060 is not a potato card.
Well-optimized games with outstanding visuals can run well above 100 fps at native 1080p on that card (even the mobile version).

Potato? Maybe for 4K and 1440p, maybe for unoptimized trash games. Its rasterization power should NOT be called potato for 1080p gaming.

Let me remind you that 1080p is the most popular resolution and can't meaningfully use upscaling (upscaling to 1080p from an even lower internal resolution looks poor).

Like what @jblackwell said, you might be developing a puzzle game, but I'm developing a highly dynamic action game targeting (even requiring) 60 fps at each GPU's recommended native resolution (3060 at 1080p, 3080 for 4K; plug in other GPU vendors' equivalents). Epic has already stated in several of their presentations that their millisecond budgets are measured against roughly 20/30-series-class GPU rasterization.

With strict monitoring of UE5's enabled-but-unneeded features and exact budgeting, you can get 60 fps with Nanite, SWRT Lumen, and VSMs (highly realistic graphics) on the hardware specs/resolutions I gave. But only those 3 things. Post-processing off, quality settings off, AA off, almost everything lowered in order to achieve 60 fps at spectacular native resolution. But that just isn't enough; we also need motion blur, optional AA methods, etc.

The biggest problem with UE5 games is that they ship "Max settings" to gamers who don't know the engine like we do. Those gamers are unaware that max settings target 30 fps, and then they're shocked when they aren't hitting 60 fps at their GPU's recommended native resolution.

The 40 series (and frame-gen tech in general) is a joke. I never see a difference above 60 fps; I feel the difference, and DLSS 3 doesn't provide that. We need more innovations in performance-enhancing code in Nanite, VSMs, and Lumen in order to fully bring projects to a new level for the real player market. All we need is a 22% increase in performance from those features (in total, not 22% from each feature).

Honestly, my biggest problem is VSMs. I can't use regular shadow maps since they cost way too much, but I can't figure out how to keep VSMs under 2 ms in City Sample with no AI.
(My game’s main environment is similar to that project but it’s not built on top of it)
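For anyone else fighting VSM cost: these are the knobs I'd start with to trade shadow resolution for milliseconds. CVar names are from 5.2-era builds and the values are just guesses to profile against, not recommendations:

```ini
[SystemSettings]
; Higher bias = lower-resolution VSM pages = cheaper (and softer)
r.Shadow.Virtual.ResolutionLodBiasDirectional=0.5
r.Shadow.Virtual.ResolutionLodBiasLocal=0.5
; Cap the physical page pool if the VSM stats show headroom
r.Shadow.Virtual.MaxPhysicalPages=2048
```

Then watch the ShadowDepths/shadow projection timings in `stat gpu` while you bias things down, since the win is very content-dependent.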

Really, all jokes aside - if your game isnt suited to Lumen, then dont use it.

Virtual Shadow Maps?

Have ya checked 5.3 yet, afaik there were improvements to VSMs.

It requires dynamic lighting. It’s a part of gameplay.

Have ya checked 5.3 yet, afaik there were improvements to VSMs.

Yep, heard about it (haven't freed up the storage to download it yet). I made a post about the possible performance highlights in 5.3 earlier in this thread.

One being the move to C++20, which could possibly provide that 22 percent I was looking for.

I just tried Unigine 2.17, and while its framerate is much better than Unreal's, you can't really compare the two: Unigine still requires you to fiddle with probes and bake voxel volumes, while Lumen lets you skip all of that, which is a huge productivity boost. It's also great that Unigine comes with many very nice presets for ocean and clouds. Epic technically has those systems too, but one is optimized for Fortnite while the other is optimized for Hollywood, neither comes with good presets, and even when they do, Epic forgets to update them for each release. Still, Unigine looks like a game, while Unreal can look photorealistic.

I’ve done a bit of profiling of VSMs, and they do seem to cost less overall, but the performance is very content-dependent. While Nanite is pretty much a drop-in replacement for any traditional geo (transparency notwithstanding) and Lumen is pretty much unprecedented in game development, VSMs are not the same level of essential feature. It’s particularly weird when I enable RT shadows instead of VSMs and see the frame time improve. Not to mention that animated geo and characters can cost a lot as well.

Anyways, I’m not surprised VSMs are giving grief, but I am a little concerned that Epic is taking them out of beta with these kinds of performance issues. At the very least, some sort of clear scalability knob is needed, just to bring the cost down ‘somehow’.

There is definitely a danger to the psychological implications of max or ultra settings. While taking lumen from high to epic can vastly improve image stability, noise, and resolve, epic to cinematic only really reduces noise in some cases in exchange for an incredible drop in performance. Cinematic doesn’t mean ‘runs if you have a 4090’ (even if you technically can). It means that these are shots baked via the MRQ with a number of seconds to process final-pixel quality graphics, and disabling a lot of optimizations to max out quality.

Most of the process breakdowns I’ve read on successful UE5 projects (Matrix, The Coalition’s work, Fortnite, etc.) seem to depend on going over every setting and CVar with a fine-toothed comb, disabling everything that isn’t essential to the specific demands of the project. The Series S had Lumen reflections and grass shadows cut entirely to give it the perf it needed, but because that aligned with the art of the project, it wasn’t an unacceptable hit.

On the note of VSMs: is there any way you can increase how aggressively VSM caches things? I understand that a lot of VSM performance is dedicated to aggressively reusing information whenever possible.
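Not authoritative, but these are the cache-related knobs I'm aware of; names and defaults are worth double-checking against your engine version:

```ini
[SystemSettings]
; Page caching (on by default); setting 0 forces full redraws every
; frame, which is a handy A/B test for how much the cache is saving you
r.Shadow.Virtual.Cache=1
; Cache static geometry separately so moving objects invalidate fewer pages
r.Shadow.Virtual.Cache.StaticSeparate=1
```

There's also a per-primitive "Shadow Cache Invalidation Behavior" option (if your version has it) that can stop WPO-animated meshes from invalidating pages every frame.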

I know that pre-5.2 VSMs had massive issues with foliage and especially forests… and I would go see what changed between 5.2 and 5.3 if stupid 5.3 would let me… but no, it complains about every 2nd mesh made with the new mesh tools, and I can’t package stuff because of that… tableflip (the meshes were completely fine in 5.2… had the same drama with 5.1 → 5.2)

Maybe… if whoever is in charge of this world allows me to make a build, I will see what changed and how it affects performance.

I actually found VSM performance quite good, given that it’s range-unlimited. Cascaded shadow maps get a lot worse very quickly if you want them to actually cover distance. (Yes, yes, it depends a bit on the game, I know, but for me VSMs cost roughly the same as shadow maps at 12k distance, except they’re unlimited.)

Did a quick test, and the conclusion is: don’t use 5.3. It actually performs significantly worse than 5.2, which already performs worse than 5.0 for me.

Here are my results, same project, same savegame, same settings, same everything:

Explanation:

  • Nanite only = no VSM (CSM instead), Lumen off, everything at lowest settings, TSR disabled.
  • VSM only = Nanite + VSM (I can’t disable Nanite; doing so would wreck performance completely).
  • Lumen only = Nanite + Lumen (see above).
  • Maxed out = Nanite + Lumen + VSM + TSR, and everything else turned up as high as the game allows.
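For anyone wanting to reproduce a matrix like this at runtime, these are the toggles I'd use (5.x method CVars; the values shown give the "Nanite only" row, and other rows just flip the relevant lines back on):

```ini
[SystemSettings]
; Lumen GI off (1 = Lumen, 2 = SSGI)
r.DynamicGlobalIlluminationMethod=0
; Reflections off (1 = Lumen, 2 = SSR)
r.ReflectionMethod=0
; VSM off; shadowing falls back to regular shadow maps / CSM
r.Shadow.Virtual.Enable=0
; TAA instead of TSR (4 = TSR)
r.AntiAliasingMethod=2
```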

As is clearly visible, 5.3 is slower when GPU bound.

Visually speaking, there are tiny differences visible, like better shadows on far-away leaves and slightly brighter Lumen, but that’s it.

PS: Yes, I am aware that Lumen doesn’t play nice with some trees… working on it, it’s painful. (You either get dark trees (5.0 and 5.1) or you get this… 5.2+)

Looking at these numbers… I will stick with 5.2 for the foreseeable future, until at least the next GPU generation.
Performance-wise I could “afford” to go from 5.0 to 5.2 despite some lost frames here and there, but 5.3 is another 10% that just goes “poof”. I already have a hard time reaching the same performance in 5.2 that I had in 5.0, so 5.3 is out of the question.

I do not use Nanite for everything, because I don’t want to push onto low-end GPUs even more work that the CPU can otherwise handle, so the game is mixed-use, and turning even more Nanite stuff on has a negative impact on performance.
With 5.3 I am running out of ideas for how to claw back performance if it gets thrown out of the window like this. (I have not yet fully recovered from 5.0 → 5.2… :sob:)

As much as I like the improvements and progress, -10% fps is a hefty price for no (realistically) noticeable visual impact. (Especially since we have performance stagnation on the cheap graphics cards…)


After 5.3 preview came out I removed my custom build and just switched to that. Of course, it’s a preview build, and I only use it for tech tests rather than actual personal projects, so I’m not really expecting good performance out of it, just novel features to play around with, test, and break.

I also know that refactoring code is sort of like remodeling a bathroom: things will get way worse before they get better, and you shouldn’t expect anything to be usable the same way until the job is done. If they’re refactoring Nanite, especially to get compute material cost under control, I’m not surprised it’s undoing a fair bit of VSM performance optimization in the process.

Something I will say though: they have definitely improved HWRT performance and stability. For instance, their RT nanite path actually works now, instead of producing insane shading artifacts. I can now use full-res nanite meshes to trace against, and neither the performance costs nor the memory usage is anywhere near as bad as it used to be.

I can now use full-res nanite meshes to trace against, and neither the performance costs nor the memory usage is anywhere near as bad as it used to be.

What would you say would be more performant in City Sample: 5.2’s SWRT Lumen + VSMs, or 5.3’s HWRT Lumen with RT shadows (no VSMs)?

It makes no difference to my target hardware, which is now current-gen consoles (PS5 etc.) plus PCs with 30-series cards and other GPU vendor equivalents. (All have RT cores.)

Is there any way you can increase how aggressively VSM caches things

Tbh, I am very confused by VSMs. (The documentation has not helped much.) All I know is that medium, low-res VSMs are all I need, and regular SMs can’t handle the high-resolution meshes (costing 11 ms, etc.).
I feel as if VSM computing power is being wasted on softening the shadows. My project’s photorealism doesn’t require it; hard shadows look better tbh (imo, for my project). Even if I change the source angle, I don’t know whether the engine is still computing with the 0 value. I need a hard cut (CVar) on that “feature”. I would appreciate any input on VSM caching if you have any.
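On the softening specifically: as far as I understand it, VSM penumbras come from the SMRT (shadow map ray tracing) pass, so dropping its ray/sample counts should buy back some of that cost. A sketch, with CVar names from 5.2-era builds and values that are only a starting point:

```ini
[SystemSettings]
; Fewer SMRT rays/samples per pixel = harder, cheaper directional shadows
r.Shadow.Virtual.SMRT.RayCountDirectional=2
r.Shadow.Virtual.SMRT.SamplesPerRayDirectional=2
```

Combined with a 0 source angle on the light, that should get close to hard cuts; whether the engine fully skips the soft path at a 0 angle, I can't say.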

5.3, it actually performs significantly worse than 5.2 – Yaeko

Thanks for sharing that. I had a couple of moments where I thought the newest version’s performance was worse, but then I looked at each scene with the real-time `stat gpu` readout and found that Lumen and shadows had the same ms cost, while [unaccounted] was way higher than in previous versions. Check the ms timings on Lumen, shadows, etc.

If each feature shows a higher millisecond cost with no [unaccounted] affecting performance, then report it in this thread.
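If anyone wants to chase [unaccounted] themselves, the profiling commands I mean are these (all standard UE console commands):

```
stat gpu        -> live per-pass GPU timings (Lumen, ShadowDepths, etc.)
stat unitgraph  -> frame/game/draw/GPU times graphed over time
ProfileGPU      -> one-frame hierarchical breakdown, useful for finding
                   cost that the live stats lump into [unaccounted]
```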

Also, TSR, as its creator told me, is not meant for native resolution. It was meant for upscaling only. Epic’s TAA is far cheaper (0.30 ms vs 3 ms, lmao) and can look almost as good with my modified variables.

I do not use Nanite for everything

Fortnite 5.1 and City Sample have shown me that needs to be the case. I will not be relying on Nanite in my game unless it proves to be much more performant than the LOD setup I make for the mesh.

Lol, btw, has anyone noticed that UE5 games (using the Lumen workflow) are built like Fortnite? Every little piece of the world is broken up into separate pieces lmao.

I was able to improve Lumen performance by 29% (turns out some settings have changed, etc.), but I am still missing a few frames… one day I will find them… was a long day. EDIT: Found them, TSR is the culprit… see below… that makes it a 57% improvement then.

I just checked, since I knew TSR cost me about 3 fps in 5.0… it now costs me 10 (!) fps.
When I turn it off, I am at 60 fps at 1440p with Lumen enabled :clown_face:

So apparently TSR now needs 3x the milliseconds it did in 5.0, great. (3 fps in a 5.0 game, 10 in 5.2.)

Guess I will try your variables tomorrow… I can accept losing 3 fps, but not 10… (and that is 10 over TAA…)

looks at 20k pieces the map is made out of - yes.


Let me know how you improved Lumen’s performance. All feedback and testing from others is greatly appreciated by me and my studio :slight_smile:

I recently experienced FSR 2.2 and wow. It looks better than blurry ole DLSS and, sadly, much better than TSR (sry Guillaume.Abadie). That was in another game, but I’m trying to test the FSR plugin to see if it can perform better than TSR and look better than my modified TAA in a photorealistic scene. (So far I can’t get it to work, so I’ll give it some time for the hotfix updates.)

I heard it costs around 1.5 ms at worst. I want to see how it performs with Lumen’s artifacts.

EDIT: Woo hoo…1000th post in the thread XD


That is a really good question. The kicker on HWRT performance is that you can actually get it pretty fast if you build things using the same workflow that Epic does. Using the new BVH visualizer, you can see how many additional evaluations a ray would have to do to hit a given object. In City Sample, that number is actually relatively small, because they keep their bounding boxes very tight and all of the meshes are watertight to one another, so you don’t get a bunch of stacked bounding boxes. Fortnite is very similar in its modular construction. Additionally, courtesy of far field, you can have distant lighting for a much, much lower cost than tracing against every mesh. If you tune it correctly, you can control at what distance near field hands over to far field, and minimize your trace cost quite a bit.
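For reference, the far-field handover described above is driven by CVars along these lines. Names are from 5.2/5.3-era builds and the distance is just an example value in centimeters:

```ini
[SystemSettings]
r.Lumen.HardwareRayTracing=1
; Enable far-field traces for distant geometry
r.LumenScene.FarField=1
; How far the far field extends (cm); near field hands over before this
r.LumenScene.FarField.MaxTraceDistance=1000000
```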

By and large, SWRT is faster, but that isn’t always the case. If you’re doing any sort of kitbashing, you pretty much need HWRT, but then you have more work cut out for you making sure your distance fields are high-res enough while still not taking up too much memory. Of course, Epic used HWRT and a ton of kitbashing in their Electric Dreams demo (the reflections suck with SWRT), but since it ran on a 4090, that wasn’t really an issue.

1000 posts! That is genuinely incredible. I remember when Lumen in the Land of Nanite first came out, being absolutely gobsmacked by what Epic did and what we all could do. To be here, years later, and see that vision coming to pass (well, once AAA studios actually get UE5 games into production) is wonderful.

And shoutout to the lumen team for taking user ideas and feedback seriously. IDK about anyone else, but it’s felt very collaborative to watch this technology grow and mature over time. Thank you guys.


My game will use the HWRT-Lumen/Fortnite workflow in case we ever want to switch to HWRT Lumen. In City Sample, I was able to remove 3+ ms of ray-tracing calculations from a 21 ms render time at native 1080p on a mobile 3060 (so not even target hardware; the target is non-mobile). That got me much closer to the 16.67 ms budget.

Which is why I’m sticking with SWRT Lumen and doing my absolute best with that. I’m going for playable photorealism, so I will also keep reflection range high but resolution low and blurred. Consistency is all that matters for my game.

Can you use SWRT Lumen with ray-traced shadows in 5.3? I imagine that isn’t possible, but if it were, that sounds like a 60fps-friendly solution I could push for my game.

IDK about anyone else, but it’s felt very collaborative

All I want is more performance and my volume box idea to intercept lighting traces with no cost.

I just want a response from Epic or a source code dev on those 2 things.

I feel like SWRT Lumen + RT shadows may not give the performance you’re looking for, but that’s just a gut assumption; I haven’t done any performance profiling on that specific feature combination yet. Using Lumen means paying the memory costs for the surface cache, the MDFs if you’re using detail tracing, and the GDF; adding hardware RT on top of that may be very expensive, because you’re maintaining two separate scene representations. That’s just a gut reaction, I don’t actually know how the perf will bear out. Out of curiosity, what have DF shadows given you? If you’re already committed to using SWRT, then DF shadows may be a good option.


I can’t figure out how the hell to turn DF shadows on with Lumen. Setting the light to Movable doesn’t do anything. I’ll let you know if I ever get them turned on and working in my project.


You haven’t figured it out as in they’re unexpectedly not working, or as in you can’t find the setting?

Side note: What I’ve generally found is that DF shadows+SWRT tend to run very well together, and it’s not too surprising considering the architecture. I can’t help but imagine that a project committed to using SWRT, DF shadows, and DF collision physics could probably be stunningly well-optimized if you had the right programmers.


probably be stunningly well-optimized if you had the right programmers.

That sounds like the end mission goal I’m searching for lol.

You haven’t figure out as in they’re unexpectedly not working, or you can’t find the setting

I have no idea? “Generate Mesh Distance Fields” is already flipped on, and the light is set to Movable. Maybe they are already on? I don’t see them in the GPU profiler. Same thing in all my projects.
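In case it helps, here's the checklist I'd run through. As far as I can tell, DF shadows are skipped entirely while VSMs are active, which may be why they never show up in the profiler (CVar names worth verifying against your version):

```ini
[SystemSettings]
; Mesh distance fields must be built (project setting; needs a restart)
r.GenerateMeshDistanceFields=1
; VSMs appear to take over all shadowing; disable them to test DF shadows
r.Shadow.Virtual.Enable=0
```

Then, on the directional light, enable the Distance Field Shadows flag; DF shadows normally take over beyond the CSM “Dynamic Shadow Distance”.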

I have the exact same issue. I am not sure we will ever see reflections that come close to path tracing, other than some kind of “real-time path tracing”, which I believe will be the new “holy grail” of real-time rendering for the coming years.