I decided to dig a little deeper into performance comparisons; maybe @Krzysztof.N or @Daniel_Wright can share some insight into how Lumen reflections work. I used a hardware/API inspector to get exact timings on FB (Frostbite) SSSR, since it is visually appealing and quite performant for real-time.
Test scene specifications:
Frostbite modded (higher-res) SSSR channel: 1.4ms @ 1440p on a desktop 3060.
Most of the reflections visible in this channel are overlapped (covered) by the subsequent passes.
I tried to match it with the scene above; this was “2.2ms” @ 1440p on the same 3060.
Settings: High preset, bilateral filter and reconstruction off (0), downsample factor 1, clipmap extent 0.0, r.Lumen.Reflections.Temporal.MaxRayDirections 32.
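For anyone wanting to reproduce this, here is my best reconstruction of those settings as console variables (treat the exact cvar names as an assumption and verify them in your engine version; I’ve left the clipmap extent out since I’m not certain of its cvar name):

```
r.Lumen.Reflections.BilateralFilter 0
r.Lumen.Reflections.ScreenSpaceReconstruction 0
r.Lumen.Reflections.DownsampleFactor 1
r.Lumen.Reflections.Temporal.MaxRayDirections 32
```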
The reason 2.2ms is in quotes is because I’m only trying to measure the screen-tracing speed. In the same scene, entering r.Lumen.Reflections.ScreenTraces 0 into the console drops the Lumen reflection cost to 1.5ms, so the screen traces themselves cost around 2.2 - 1.5 = 0.7ms.
That’s a lot faster than r.SSR.Quality 3, which is the only screen-space solution that offers elongation but costs 1.7ms and adds more temporal instability (without TAA etc.) due to jitter and noise:
Conclusion from the data: My point is that SSR in Unreal should be replaced with Lumen Reflections’ screen-trace solution using the settings I applied (downsample factor, etc.), as it’s much more efficient. It would also serve us well to have this screen-trace solution at this level of quality and cost without needing to bump up the actual SDF/triangle traces. This would also save the cost of running reconstruction on screen traces (at that quality, it really isn’t needed).
Now, the next step is profiling non-screen traces. We aren’t always lucky enough to have on-screen information, so I’ll give my thoughts on the same scenario with no screen traces.
To keep this informational post short, I’m only going to give timing results on visually stable settings with no TAA/DLSS etc.
Non-Screen Trace timings
Default High settings, software tracing, desktop 3060:
- 1440p: 4.80ms (3.40ms without SS reconstruction, -30%)
- 1440p (r.ScreenPercentage 50): 1.50ms (1.20ms without SS reconstruction, -20%)

Default High settings, hardware tracing, desktop 3060:
- 1440p: 3.02ms (1.50ms without SS reconstruction, -50%)
- 1440p (r.ScreenPercentage 50): 0.70ms (0.50ms without SS reconstruction, -28%)
I’m wondering if hardware acceleration has always boosted reflections like this, or if this is down to the new HWRT optimizations in 5.4. Either way, these tests are showing issues with SS reconstruction, and showing the value of upscaling low-resolution traces after SS reconstruction runs at that lower resolution:
Trace >> SS reconstruction (with the exception of low roughness, which needs a separate jitter if TAA etc. is off) >> spatial upscale of the reflection channel.
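The attraction of that ordering can be sketched with a back-of-the-envelope model. My assumption here is that reconstruction cost scales roughly linearly with the pixel count it runs on; the timings are my software-tracing measurements from above:

```python
# Back-of-the-envelope check (not engine code): per-megapixel cost of
# SS reconstruction, derived from the software-tracing timings above.
px_full = 2560 * 1440          # 1440p
px_half = 1280 * 720           # r.ScreenPercentage 50

recon_full = 4.80 - 3.40       # measured reconstruction cost at full res (ms)
recon_half = 1.50 - 1.20       # measured reconstruction cost at half res (ms)

# Per-pixel cost comes out nearly constant, so running reconstruction
# before the spatial upscale keeps it at the cheap end of the scale.
print(round(recon_full / px_full * 1e6, 3))   # → 0.38  (ms per megapixel)
print(round(recon_half / px_half * 1e6, 3))   # → 0.326 (ms per megapixel)
```

The two per-megapixel figures being so close is what suggests the pass scales with resolution, so reconstructing at trace resolution and upscaling afterwards should keep the cost near the half-res number.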
It’s a small test in a small scene, but it highlights issues that could blow up in bigger scenarios.
Side note: Enabling ReSTIR with Lumen results in no GI but increased cost, yet the GI is still visible in the reflections. Not sure what’s up with that; I’ve done a lot of troubleshooting, but it just doesn’t work.
EDIT 2:
Quick note: 0.7ms isn’t the full story. That only disables the screen tracing; r.Lumen.Reflections.Temporal 1 in that scene costs 0.22ms and would still be needed. So in reality it’s 0.7ms plus whatever portion of the remaining 1.5ms is still required (for instance, 0.30ms of that 1.5ms is voxel tracing, etc.).
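To make that accounting explicit, the adjusted estimate works out like this (a simple sketch using my measured numbers; how much of the remaining 1.5ms is actually required will vary per scene):

```python
# All values in ms, taken from the measurements above.
total_with_screen_traces = 2.2     # Lumen reflections, screen traces on
total_without_screen_traces = 1.5  # after r.Lumen.Reflections.ScreenTraces 0
temporal_cost = 0.22               # r.Lumen.Reflections.Temporal 1, measured separately

# Cost of the screen traces alone:
screen_trace_cost = total_with_screen_traces - total_without_screen_traces

# A minimal standalone screen-trace path still pays for temporal accumulation,
# plus whatever else in the remaining 1.5ms turns out to be required.
floor_estimate = screen_trace_cost + temporal_cost

print(round(screen_trace_cost, 2))  # → 0.7
print(round(floor_estimate, 2))     # → 0.92
```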
EDIT 3: UE5 is inconsistent with temporal stochastic SSR; after a couple of years of it not working, it only recently started working again, seemingly at random. I’m pretty sure this makes SSR faster in Unreal.