Lumen GI and Reflections feedback thread

We already have jitter from screen pixels and not sure what else could be jittered for roughness=0.

Sorry, there was a typo. I meant to say this is handed off to TSR/TAA, which is problematic for accessibility (temporal upscaling and motion blur included). Temporal effects such as Frostbite’s SSR and Warframe’s SSAO are low resolution but have their own systems in place to upscale and accumulate. Advil recently advised against temporal AA and upscalers, and Blur Busters is also trying to bring awareness to devs about this issue. Per-effect ghosting isn’t an issue, because it doesn’t cause the entire scene or even the effect to blur.

DLSS RR uses a specular motion vector buffer based on specular hit distance from the first bounce. Maybe combine that with short-term interpolation as a fallback. Also, I would like more control over how fast past reflections are faded away. I have done some slight jittering with GGX sampling, but it didn’t end up working because past frames are discarded/faded too fast.
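
To show what I mean by getting motion from hit distance: for a flat, mirror-like surface you can reproject a "virtual" point that sits behind the surface along the view ray, instead of reprojecting the surface itself. This is only a geometric sketch under that flat-mirror assumption; the function names are made up for illustration and this is not DLSS RR's or Lumen's actual code.

```python
import math

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def virtual_reflection_point(camera_pos, surface_pos, hit_distance):
    """Point where a flat mirror's reflection appears to sit.

    It lies on the camera->surface ray, extended past the surface by the
    reflection ray's hit distance. Reprojecting this point (instead of the
    surface point) into the previous frame gives the reflection's motion.
    """
    view_dir = normalize(tuple(s - c for s, c in zip(surface_pos, camera_pos)))
    return tuple(s + d * hit_distance for s, d in zip(surface_pos, view_dir))

# Mirror is the plane y = 0; the camera looks down at it at 45 degrees.
camera = (0.0, 1.0, 0.0)
surface = (1.0, 0.0, 0.0)
reflected_hit = (2.0, 1.0, 0.0)   # what the reflection ray hit
hit_distance = math.dist(surface, reflected_hit)

virtual = virtual_reflection_point(camera, surface, hit_distance)
print(virtual)  # approximately the hit point mirrored below the plane y = 0
```

Rough surfaces break the flat-mirror assumption, which is where a fallback would come in.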

Maybe a value that controls how fast/slow past channel information cuts off over time too?
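
To make the request concrete: a typical temporal accumulator blends each new sample into the history with a weight of 1/N, where N is capped at some maximum frame count. Here is a sketch assuming a plain capped running average (the names are hypothetical, not Lumen's internals), plus a helper showing how the cap decides how long stale data hangs around.

```python
def accumulate(history, sample, frames_accumulated, max_frames):
    """Blend one new sample into the history with a capped running average.

    A small max_frames forgets old frames quickly (less ghosting, more
    noise); a large one keeps them longer (smoother, more smearing).
    """
    frames_accumulated = min(frames_accumulated + 1, max_frames)
    weight = 1.0 / frames_accumulated      # share given to the new sample
    history = history + (sample - history) * weight
    return history, frames_accumulated

def frames_to_fade(max_frames, remaining=0.1):
    """Frames until a stale value's contribution drops to `remaining` or
    less, once the history length has already reached max_frames."""
    keep = 1.0 - 1.0 / max_frames          # fraction of history kept per frame
    frames, contribution = 0, 1.0
    while contribution > remaining:
        contribution *= keep
        frames += 1
    return frames

for cap in (4, 12, 32):
    print(cap, frames_to_fade(cap))
```

An exposed value like `max_frames` (or the per-frame `keep` fraction) is the kind of control I am asking for.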

Now, again, I really appreciate the work done on reflection performance, but I do have a concern over the significant jump in cost of the ScreenReconstruction method. Also, I find screen traces lack the performance/quality seen in other engines and implementations, such as Frostbite’s SSSR and this AMD hybrid solution demo (which might be optimized for consoles due to its RDNA2 focus).

I have carefully measured performance on each scene. These aren’t meant to put Lumen down, but rather to make a point about cost/quality. In all my tests, AA is off, clipmap extent is at 0.0, r.Upscale.Quality is at 0, and you can assume everything is on default High settings unless stated otherwise.

Frostbite’s documented SSR uses TAA, which was kind of needed at the time (PS4), but modders and recent games have tuned that solution to remain stable without TAA and at very little cost. We are talking maybe 12% of a 3060 at 1440p in a very wet environment: no world-normal limits on trace directions, little to no smearing, both high and 0.0 roughness remain very stable without TAA, and no major dithering is produced. Even in fast traveling motion, smearing is nowhere near as pronounced as with Lumen Screen Traces.

Test scene

Path Traced:


I’ll go ahead and omit blurry 0.0 reflections from the comments below, since it seems this will be tackled soon, so I’m not worried about that.

  • 1440p, 3060, cost 4.3ms with the settings I stated. Looks fine, if not pretty close to path traced, but looks more smeary in motion compared to Frostbite’s SSSR. That smear issue is still present in every one of these scenarios, and could be fixed with fewer accumulated frames or better use of specular motion vectors.

  • 1440p, 3060, cost 1.5ms with r.Lumen.Reflections.ScreenSpaceReconstruction 0 and r.Lumen.Reflections.BilateralFilter 0. Anything not moving with roughness above .02 displays very noisy and visually sporadic reflections. The reflection output is rough on the eyes because you can easily tell that 4 real pixels represent one output pixel. Ugly (sorry😬) dithering patterns also become visible on cutoff edges.

  • 1440p, 3060, cost 3.5ms with r.Lumen.Reflections.ScreenSpaceReconstruction 0, r.Lumen.Reflections.BilateralFilter 0 and r.Lumen.Reflections.DownsampleFactor 1. The noise mentioned in the bullet above is now 80-90% cleaned up. Dithering patterns become less visible but are still present. The noise can be further softened with r.Lumen.Reflections.Temporal.MaxRayDirections 32 at no cost.

  • 1440p (r.ScreenPercentage 40, 1027x579), 3060, cost 1.04ms with r.Lumen.Reflections.DownsampleFactor 1 and r.Lumen.Reflections.BilateralFilter.StrongBlurVarianceThreshold 0.1.
    Everything in the scene looks low res, but reflections are clean with no noise, proportionally blurred, and clearly differentiate roughness. Even 0.0 looks clear on my display, which is 1440p.
    A slight problem arises: reflections above 0.07 roughness look too blocky, which could be taken care of by a spatial upscaler (rather than nearest) like r.Upscale.Quality 5 or the FSR1 + edge correction seen in the AMD hybrid SSSR+RT project (it would need a cvar to control which roughness range receives this spatial upscale).
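
For anyone who wants to reproduce the four scenarios, here is the same list written out as data, with a small helper that prints the console commands for each one. The cvar names and timings are the ones quoted in the bullets; the scenario labels and the helper itself are just for illustration.

```python
# Measured on a desktop 3060 at 1440p; cvars exactly as listed in the bullets.
SCENARIOS = [
    # Baseline: the general settings described earlier, no extra overrides.
    {"label": "baseline", "cost_ms": 4.3, "cvars": {}},
    {"label": "no reconstruction/bilateral", "cost_ms": 1.5, "cvars": {
        "r.Lumen.Reflections.ScreenSpaceReconstruction": 0,
        "r.Lumen.Reflections.BilateralFilter": 0,
    }},
    {"label": "full-res traces", "cost_ms": 3.5, "cvars": {
        "r.Lumen.Reflections.ScreenSpaceReconstruction": 0,
        "r.Lumen.Reflections.BilateralFilter": 0,
        "r.Lumen.Reflections.DownsampleFactor": 1,
    }},
    {"label": "40% screen percentage", "cost_ms": 1.04, "cvars": {
        "r.ScreenPercentage": 40,
        "r.Lumen.Reflections.DownsampleFactor": 1,
        "r.Lumen.Reflections.BilateralFilter.StrongBlurVarianceThreshold": 0.1,
    }},
]

def console_commands(scenario):
    """One 'name value' console line per cvar override in the scenario."""
    return [f"{name} {value}" for name, value in scenario["cvars"].items()]

for scenario in SCENARIOS:
    print(f"# {scenario['label']} ({scenario['cost_ms']} ms)")
    for line in console_commands(scenario):
        print(line)
```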

Taking advantage of the last scenario’s performance would entail computing downsampled versions of the g-buffers and spatial algorithms, but I can’t imagine that being more expensive than all the other scenarios I listed. Also, r.Lumen.Reflections.RadianceCache 1 feels misleading, as in my experience it actually increases the cost of Lumen Reflections (lowest .25ms, highest .50ms).
I feel like this would then be better paired with the power of offscreen traces.
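
The spatial upscale step I have in mind is nothing exotic. As a rough sketch (plain bilinear filtering, not the actual r.Upscale.Quality 5 or FSR1 algorithm), filtering the low-res reflection channel instead of nearest-sampling it looks like this:

```python
def upscale_nearest(img, factor):
    """Repeat each low-res pixel factor x factor times (blocky)."""
    return [[img[y // factor][x // factor]
             for x in range(len(img[0]) * factor)]
            for y in range(len(img) * factor)]

def upscale_bilinear(img, factor):
    """Interpolate between the four closest low-res pixels (smooth)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h * factor):
        # Map the output pixel centre back into low-res pixel space.
        fy = min(max((y + 0.5) / factor - 0.5, 0.0), h - 1)
        y0 = int(fy)
        y1 = min(y0 + 1, h - 1)
        ty = fy - y0
        row = []
        for x in range(w * factor):
            fx = min(max((x + 0.5) / factor - 0.5, 0.0), w - 1)
            x0 = int(fx)
            x1 = min(x0 + 1, w - 1)
            tx = fx - x0
            top = img[y0][x0] * (1 - tx) + img[y0][x1] * tx
            bottom = img[y1][x0] * (1 - tx) + img[y1][x1] * tx
            row.append(top * (1 - ty) + bottom * ty)
        out.append(row)
    return out

# Hard vertical edge in a tiny 2x2 reflection buffer.
low_res = [[0.0, 1.0],
           [0.0, 1.0]]
print(upscale_nearest(low_res, 2)[0])   # edge stays a hard step
print(upscale_bilinear(low_res, 2)[0])  # edge becomes a ramp
```

A real version would also need the roughness cutoff mentioned above, so sharp 0.0 reflections are not softened.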

you got this test scene as a download? i’d throw my eyeballs at it and judge. i mean… i would not render ssr anyway. i’m over that one. it raytraced or bust. i don’t like ssr artefacts.

what you mean by specular motion? that’s hard to get. there’s no first bounce. specular is an analytical light term created at hit point. if you look at it, it’s computed directly. if you look at specular highlights in a mirror you have a reflection bounce and then the specular is computed for the surface in the mirror. the hit point. getting correct motion for that is like a buffer for every bounce and computing reflection vectors for every motion sample. expensive. and it can differ based on the roughness and the random reflection vector. roughness creates a random reflection vector/ray.

your pathtraced screenshot is just a reflection. and ofc the rougher the surface is the more scattered the rays will be. in the game engine you get one ray per pixel per frame and temporal stabilisation via motion history. it’s still randomly sampling roughness. it is what it is. the bilateral is denoising and blurring it. the downsampler changes the resolution. and screenspace reconstruction is a goofball i don’t even know. i don’t use screenspace reflections. it will never reconstruct what it can’t see and expose hidden surface artefacts. facts.

so you may have to accept that you have scattered rays that compute temporal noise on very clean surfaces. in a game world with detailed texture you may not see as much.

what you mean by specular motion? that’s hard to get. there’s no first bounce.

First off, Nvidia spoke about this. You say it’s expensive, but we are already dealing with that, aren’t we? The goal is an alternative.

i mean… i would not render ssr anyway. i’m over that one. it raytraced or bust. i don’t like ssr artefacts.

If done well, you shouldn’t get any, and you get free performance. It’s pretty contextual: when the information isn’t on screen, it isn’t on screen (basepass etc. cost goes down as raytracing cost goes up). I would check out the AMD hybrid SSSR+DXR demo, because you’re just throwing away free performance instead of “working smarter”.

your pathtraced screenshot is just a reflection

It’s just a reference for how reflections should scatter across roughness and fresnel, and certain Lumen reflection cvars can skew how closely Lumen replicates it. I know how sampling works in engines; the problem is that other engines do it faster and in more visually pleasing ways. Regarding the test I did, this was the scene that gave some context, which is important for any rainy scene.

temporal stabilisation via motion history–
so you may have to accept that you have scattered rays that compute temporal noise on very clean surfaces.

It’s not a matter of accepting, it’s already possible.

you got this test scene as a download? i’d throw my eyeballs at it and judge

Aside from personally made assets, isn’t everything a download? I made the scene recently out of some Nvidia projects. Not sure what this meant from your view?

well… i know it doesn’t work, nor is it smart. even hybrid doesn’t change much. look… screen space portion.

arguably i could use hybrid to save on the visible facades. but… if the floor was a puddle i’d have to hybrid raytrace 75 percent of the screen anyway and blend it at the ss edge. and don’t get me started on the characters. the complexity is off the charts for hidden surface artefacts. hybrid, peek thru holes, check if you see screenspace pixels, motion tracking and all the fakery sh*t doesn’t need to be done if you have a bruteforce method/raytracing and just do it.

Enabling Substrate in the 5.4 preview appears to mess with the GI. Any help on this?

Sebastian Hillaire in the substrate channel spoke to this a bit: part of it is simply a product of the fact that substrate and the legacy materials aren’t actually guaranteed to be 1-1 matches for each other, and so you may have some lost/gained energy. Not to mention, the system is still in experimental, so I wouldn’t be surprised if you were seeing strange behaviors with radiance.

I would check your lumen scene view first however, to see if the card capture is registering significantly different materials.

Yeah, I checked that and they look the same. I’ve just made some screenshots. I’m also discussing this with Seb in the Substrate feedback thread.
So I take them using the “High resolution screenshot” tool in the Viewport but as you can see from the manual screen grabs there is also a strange blurring happening when Game Mode is On.

UPDATE: I can confirm that enabling “Allow Static Lighting” does fix the problem

I decided to dig a little deeper into performance comparisons, maybe @Krzysztof.N or @Daniel_Wright can share some insight as to how Lumen reflections work. I decided to use a hardware/API inspector to get the exact timings on FB SSSR since it is rather visually appealing and pretty performant for real-time.

Test scene specifications:

Frostbite modded(higher res) SSSR channel, 1.4ms@1440p-Desktop3060
Most of the reflections visible in this channel are overlapped(covered) by the next processes.

I tried to match it with the scene above, this was “2.2ms”@1440p on the same 3060.
Settings: High preset, bilateral and reconstruction off(0), downsample factor 1, clipmap extent 0.0, r.Lumen.Reflections.Temporal.MaxRayDirections 32.

The reason why 2.2ms is in quotation marks is because I’m just trying to measure the screen tracing speed. In the same scene, when r.Lumen.Reflections.ScreenTraces 0 is entered in the console, Lumen reflection cost drops to 1.5ms. So the actual screen traces are costing around (2.2-1.5=) 0.7ms.
That’s a lot faster than r.SSR.Quality 3, which is the only screen space solution that offers elongation but costs 1.7ms plus more temporal instability (without TAA etc.) due to jitteriness and noise:

Conclusion after data: My point is that SSR in Unreal needs to be replaced with the Lumen Reflection Screen Traces solution with the settings I applied (downsample factor etc.), as it’s much more efficient. It would also serve us well to have this screen trace solution at this level of quality and cost without needing to bump up the actual SDF/triangle traces. Also, this would save us the cost of reconstruction running on screen traces (since at that quality, it really isn’t needed).

Now, the next step is profiling non-screen traces. We aren’t always lucky to have on-screen information so I’ll give my thoughts on the same scenario with no screen traces.

For the purposes of keeping this informational post short, I’m only going to give timing results on visually stable settings with no TAA/DLSS etc.

Non-Screen Trace timings

Default High settings-Software tracing-Desktop 3060:

  • 1440p 4.80ms(3.4ms without SSreconstruction(-30%))

  • 1440p(r.ScreenPercentage 50) 1.5ms(1.2 without SSreconstruction(-20%))

Default High settings-Hardware tracing-Desktop 3060:

  • 1440p 3.02ms(1.5ms without SSreconstruction(-50%))

  • 1440p(r.ScreenPercentage 50) .70ms( .50ms without SSreconstruction(-28%))
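
For anyone checking the math, the savings in brackets are just (full - without) / full. A quick sketch using the four pairs of timings above (the bracketed percentages in the list are rounded by hand, so they can be off by a point or so):

```python
# (label, full cost in ms, cost in ms without screen space reconstruction)
timings = [
    ("software, 1440p",      4.80, 3.4),
    ("software, 50% screen", 1.5,  1.2),
    ("hardware, 1440p",      3.02, 1.5),
    ("hardware, 50% screen", 0.70, 0.50),
]

def saving_percent(full_ms, without_ms):
    """Share of the full cost that disappears when reconstruction is off."""
    return (full_ms - without_ms) / full_ms * 100.0

for label, full_ms, without_ms in timings:
    print(f"{label}: {saving_percent(full_ms, without_ms):.1f}% saved")
```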

I’m wondering if hardware acceleration always boosted reflections like this, or is this the new HWRT optimizations for 5.4? Either way, these tests are showing issues with SSreconstruction and showing the value of upscaling low resolution traces after SSreconstruction runs at a lower resolution:

Trace>>SSreconstruction(ofc with the exception of low roughness which needs separate jitter if TAA etc is OFF)>>SpatialUpscale reflection channel.

It’s a small test in a small scene but it’s highlighting some issues that could blow up in bigger scenarios.

Side note: Enabling ReSTIR on Lumen results in no GI but increased cost, and the GI is visible in the reflections. Not sure what’s up with that. I’ve done a lot of troubleshooting but it just doesn’t work.

EDIT 2:

Quick note, 0.7ms isn’t the full story. That only disables the screen tracing; r.Lumen.Reflections.Temporal 1 in that scene costs .22ms and would still be needed. So in reality it’s 0.7ms + anything in the 1.5ms that is needed (for instance, .30ms of that 1.5ms is tracing voxels, etc.).
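
Spelled out with the numbers from this post, a rough lower bound for a standalone screen-trace pass looks like this (only the temporal pass is added on top; whatever else from the 1.5ms turns out to be required would push it higher):

```python
with_screen_traces_ms = 2.2      # Lumen reflections, screen traces on
without_screen_traces_ms = 1.5   # r.Lumen.Reflections.ScreenTraces 0
temporal_ms = 0.22               # r.Lumen.Reflections.Temporal 1
ssr_quality_3_ms = 1.7           # r.SSR.Quality 3 in the same scene

screen_traces_ms = with_screen_traces_ms - without_screen_traces_ms
# Lower bound: the traces themselves plus the temporal pass they still need.
lower_bound_ms = screen_traces_ms + temporal_ms

print(f"screen traces alone: {screen_traces_ms:.2f} ms")
print(f"with temporal pass:  {lower_bound_ms:.2f} ms")
print(f"r.SSR.Quality 3:     {ssr_quality_3_ms:.2f} ms")
```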

EDIT3: UE5 is inconsistent with temporal stochastic SSR, which until recently hadn’t worked for a couple of years and then randomly started working again. I’m pretty sure this makes SSR faster in Unreal.

@Krzysztof.N Please consider making matching RaytracingGroupIDs change for each instance of Packed Level Actors. Or at least expose the setting to blueprint for ISM components. Currently, whatever IDs you assign to the meshes are used for every instance of the PLA, leading to this annoying behavior where cards are merged into a big box around all the PLAs instead of how they were assigned inside the PLA:

I thought I could fix this myself by assigning the GroupID manually in the construction script, but this can’t be done from Blueprint because there is no function available for setting the raytracing group ID for individual components, it can only be done at the actor level:

image

Hi,
Is the “High Quality Translucency Reflection” setting supposed to work without Lumen GI? I’m only using Lumen reflections, but this setting produces extreme artifacts (actual colorful rendering glitches) on all transparent surfaces without Lumen GI.
I hoped this would help to improve transparency quality with less performance penalty than RT translucency.
From a UI perspective it’s a bit counter intuitive to have it in the reflections tab, if it’s actually linked to GI.

Issue is fixed in 5.4 :slight_smile:

You’re on 5.3, right?
I tested it in 5.4 a while ago and it seemed to be already fixed and working.

Yes, this was on 5.3. I should have noted that.

I’ll give 5.4 another go.
Yes, it’s actually fixed in 5.4 :man_facepalming:

Thanks for the heads up

@Krzysztof.N I’m not sure if it’s still even being supported, but switching on lumen’s irradiance field gather in 5.4 instantly crashed my client, and has done so multiple times, opening different levels and with different lumen settings enabled beforehand. Is that lighting strategy considered abandoned at the moment?

Truthfully, I feel like it hasn’t worked since 5.0 or so, or at least that is the last time I remember it being usable (albeit low quality of course).

@Krzysztof.N or @Daniel_Wright I would like more information on this erratic behavior in 5.4 software Lumen’s AO (I’m positive it’s not short range AO). I haven’t experienced this in editor, but I haven’t used 5.4 Lumen very much and just wanted to see the quality in FN, and I keep finding this annoying glitch (video is 4k, which means it’s short; put the video on loop).

See how the AO disappeared. It kept happening over and over in FN with high scalability.
Maybe you guys will immediately recognize the issue here, idk. Hope that’s not in the final released 5.4 version.

Let me know when or if you guys find or fix this issue in the shaders.

I like lumen a lot, but I’m having 2 issues.

Lumen is apparently using TAA to get rid of noise. This TAA is blurry in motion, especially on stylized low poly geometry. I can’t find an option to use 200% history screen percentage anywhere. This is essentially upscaling to 200% screen resolution and it’s the solution to TAA blur in motion. I have tried to set the final gather lighting update speed as low as possible to reduce the noise in a different way (with r.Lumen.ScreenProbeGather.Temporal.MaxFramesAccumulated=3 to get less smoothing and blur) but I cannot set it lower than 0.5. I don’t know if it’s possible to go lower at all, but it would certainly help

When I use foliage paint, procedural foliage or PCG with regular LOD meshes, all instances are black in the Lumen scene. As a result they can only block the landscape and skylight, with no indirect lighting or illuminated reflections. Nanite does not have this problem, but it makes my 3070 quite a bit slower than regular LODs. Static mesh actors are correctly illuminated in the Lumen scene as well, but it’s not practical to drag every object into the scene by hand. I don’t know if the problem can be solved for regular meshes, but I think it should be possible to bake instanced foliage into a series of static mesh actors, only for objects that are bigger than a certain size, to keep the number of actors as low as possible.

I forgot to mention one issue. Physically accurate indirect lighting is a little hard to see. That’s why I often brighten it up with the indirect lighting intensity property of the directional light. The problem is that it brightens the reflections as well, which does not look good at all. I think the reflection brightness needs to be decoupled from the indirect lighting intensity. The screen traced reflections are unaffected though, I’m talking about the reflections of off screen objects

@NormaalEwoon11,

Lumen uses temporal accumulation from r.Lumen.ScreenProbeGather.Temporal.MaxFramesAccumulated, which at least doesn’t blur the whole screen like TAA (and TSR costs too much or fuzzes up motion).

But yeah, I’ve spoken about how raising MaxFramesAccumulated above 10 (the default, still a lot of noise) causes the light to smear, so at the moment it forces you to use hybrid temporal smoothing, and that really sucks because I’ve seen some pretty dependable GI that doesn’t do this.

At this point I really want Lumen to incorporate bakeable aspects; the volumetric lightmaps are so broken, with so much leaking and memory usage. Lumen also seems to be wasting a lot of performance on trying to figure out and re-iterate on parts of the scene that are static, even if everything is lowered to the slowest possible update. The noise and sporadic behavior could be fixed with bakeable probes.

Lumen’s supersampled probes are a really smart way to go about faking multiple diffuse path traced rays, but what would be more appropriate for most static games with moving lighting would be something in combination with The Division’s GI, which runs way better than Lumen. Allowing us to bake supersampled probes could include supersampled world normal buffers to prevent leaking, as well as depth functions for the probes closest to geometry, like the ones used in EA’s solution.

I’m guessing that for a solution like this, we would need two kinds of bakeable volumes: one for indoors and one for outdoors. Indoor volumes would need to mark probes with a hierarchy in this order: light source is visible, only direct light is visible, and neither is visible. Then for basic dynamism, those would only have to trace if an object is occluding the probe’s “view” of the light source or direct light, and dim probes under a lower hierarchy based on what percent of higher-hierarchy probes are being occluded from whatever their tag was.
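
As a sketch of the tagging idea only (everything here is hypothetical, none of it is an existing Lumen or engine API): each baked probe gets one of three tags in priority order, and at runtime a probe only traces again if a dynamic object blocks what it was tagged as seeing.

```python
from enum import IntEnum

class ProbeTag(IntEnum):
    """Hypothetical bake-time tags, highest priority first."""
    LIGHT_SOURCE_VISIBLE = 0   # probe sees the light source itself
    DIRECT_LIGHT_VISIBLE = 1   # probe only sees directly lit surfaces
    NEITHER_VISIBLE = 2        # probe sees neither

def tag_probe(sees_light_source, sees_direct_light):
    """Pick the highest-priority tag that applies to a baked probe."""
    if sees_light_source:
        return ProbeTag.LIGHT_SOURCE_VISIBLE
    if sees_direct_light:
        return ProbeTag.DIRECT_LIGHT_VISIBLE
    return ProbeTag.NEITHER_VISIBLE

def needs_runtime_trace(tag, view_is_occluded):
    """Only probes whose baked view of the light or lit surface is now
    blocked by a dynamic object need to trace again."""
    return tag != ProbeTag.NEITHER_VISIBLE and view_is_occluded

print(tag_probe(True, True).name)
print(needs_runtime_trace(ProbeTag.DIRECT_LIGHT_VISIBLE, view_is_occluded=True))
```

The dimming of lower-hierarchy probes is left out here, since it depends on how the occluded percentage would be gathered.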

This wouldn’t be too bad to work with if each volume could have an array of profiles we could blend between and access via Blueprints. Atm Lumen seems to cater to dynamic worlds with dynamic lights, whereas most games using Unreal are static worlds with dynamic lights.

image
This only came in a day or so ago, and as UE 5.4 just released I don’t exactly have time to rebuild UE and test it, but I’d be very curious to see what the resolve of hit lighting GI looks like. Does it mean the surface cache itself is just discarded, or only read on the second bounce?

This looks very interesting. I am also curious how this behaves in terms of noise and update speed.
GI currently can be painfully slow to update. Not relevant in many use cases, but if fast updates are required it’s almost unusable at the moment.

Good to see Lumen is pushing further in both directions though. Down in terms of scalability and up in performance.

thx for the update. i’ll rebuild it. will test and throw some pictures in approximately 3.5 hours. hmm