Hi,
We are running an ST 2110 tiled inner-frustum setup with nDisplay. One of the (many) issues we have encountered is visible seams between inner-frustum tiles, which appear to be caused by a radiosity mismatch between render tiles, i.e. the surface cache appears to be per tile / per view frustum. Each tile is rendered on a separate GPU, so presumably no cross-GPU transfer is happening. This behaviour is particularly strong in Lumen Reflections (the non-screen-traced ones).
We maintain a custom UE build, so we are free to make engine modifications, although preferably as simple and atomic as possible. I think one way to ensure that all inner-frustum tiles update the same surface cache / have the same surface cache data would be to expand the frustum that determines the surface cache updates per view. This needs to be a pretty large expansion, so a regular per-tile overscan is not viable for performance reasons.
My question is: what would be the most straightforward way to expand the frustum used for surface cache updates, so that in a tiled rendering setup, e.g. a dual split (left and right frustum tile), both tiles update the same surface cache over their combined, expanded frustum?
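To make the idea concrete, here is a minimal, engine-independent sketch of what I mean by a "combined expanded frustum". It assumes every tile shares the same camera origin and view direction and is described only by the tangents of its edge angles. `TileFrustum`, `UnionFrustum`, `Expand` and `Contains` are hypothetical names for illustration, not Unreal Engine types or functions.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical description of one off-axis tile frustum: tangents of the
// angles at the left/right/bottom/top edges, relative to a shared camera
// origin and view direction. Not an Unreal Engine type.
struct TileFrustum {
    float left, right, bottom, top;
};

// Smallest off-axis frustum that contains every tile. Requires at least one tile.
TileFrustum UnionFrustum(const std::vector<TileFrustum>& tiles) {
    assert(!tiles.empty());
    TileFrustum u = tiles.front();
    for (const TileFrustum& t : tiles) {
        u.left   = std::min(u.left,   t.left);
        u.right  = std::max(u.right,  t.right);
        u.bottom = std::min(u.bottom, t.bottom);
        u.top    = std::max(u.top,    t.top);
    }
    return u;
}

// Scales the frustum extents around their centre (scale > 1 grows the frustum).
TileFrustum Expand(const TileFrustum& f, float scale) {
    const float cx = 0.5f * (f.left + f.right);
    const float cy = 0.5f * (f.bottom + f.top);
    const float hx = 0.5f * (f.right - f.left) * scale;
    const float hy = 0.5f * (f.top - f.bottom) * scale;
    return TileFrustum{cx - hx, cx + hx, cy - hy, cy + hy};
}

// True if a view-space point (x right, y up, z forward) lies inside the
// side planes of the frustum. Near and far planes are ignored.
bool Contains(const TileFrustum& f, float x, float y, float z) {
    if (z <= 0.0f) return false;  // behind the camera
    const float tx = x / z;
    const float ty = y / z;
    return tx >= f.left && tx <= f.right && ty >= f.bottom && ty <= f.top;
}
```

With a left tile `{-1, 0, -0.5, 0.5}` and a right tile `{0, 1, -0.5, 0.5}`, a point at `(0.5, 0, 1)` is outside the left tile but inside the union, which is the behaviour I would want both GPUs to agree on when deciding what to update.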
I have come across the FrustumTranslatedWorldToClip matrix:
// Matrix used for frustum clipping tests in Lumen. For typical views, this is set to WorldToClip, while cube captures
// have an omnidirectional projection, and use a trivial matrix that will pass any point as in-frustum.
FMatrix44f FrustumTranslatedWorldToClip;
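My working assumption, based only on the comment above, is that this matrix feeds a standard clip-space containment test, so substituting a matrix built from the combined frustum would make both tiles accept the same points. The following standalone sketch shows that assumption outside the engine. `Mat4`, `MakeOffAxisProjection` and `IsPointInFrustum` are hypothetical helpers, the row-vector convention is chosen to match how I read the engine code, and I have not verified that Lumen's culling actually works this way.

```cpp
#include <cassert>

// Hypothetical minimal 4x4 matrix using the row-vector convention
// (clip = [x y z 1] * M). Not an Unreal Engine type.
struct Mat4 {
    float m[4][4];
};

// Builds an off-axis perspective matrix from edge tangents (l, r, b, t) for a
// view space with x right, y up, z forward. Depth is mapped as nearZ / z;
// only x, y and w matter for the side-plane test below.
Mat4 MakeOffAxisProjection(float l, float r, float b, float t, float nearZ) {
    assert(r != l && t != b);
    Mat4 p = {};  // all entries start at zero
    p.m[0][0] = 2.0f / (r - l);
    p.m[1][1] = 2.0f / (t - b);
    p.m[2][0] = -(r + l) / (r - l);
    p.m[2][1] = -(t + b) / (t - b);
    p.m[2][3] = 1.0f;   // w_clip = z_view
    p.m[3][2] = nearZ;  // z_clip = nearZ
    return p;
}

// Transforms a view-space point to clip space and tests it against the four
// side planes (-w <= x <= w, -w <= y <= w). Points behind the camera fail.
bool IsPointInFrustum(const Mat4& viewToClip, float x, float y, float z) {
    const float v[4] = {x, y, z, 1.0f};
    float c[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    for (int col = 0; col < 4; ++col) {
        for (int row = 0; row < 4; ++row) {
            c[col] += v[row] * viewToClip.m[row][col];
        }
    }
    const float w = c[3];
    return w > 0.0f && c[0] >= -w && c[0] <= w && c[1] >= -w && c[1] <= w;
}
```

For example, a matrix built for the left tile only (`l = -1, r = 0`) rejects the view-space point `(0.5, 0, 1)`, while a matrix built for the combined extents (`l = -1, r = 1`) accepts it.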
I also suspect that the general frustum culling for the surface cache updates lives somewhere in LumenSceneRendering.cpp, e.g.
void UpdateSurfaceCachePrimitives(
FLumenSceneData& LumenSceneData,
const TArray<FVector, TInlineAllocator<2>>& LumenSceneCameraOrigins,
bool bOrthographicCamera,
float LumenSceneDetail,
float MaxCardUpdateDistanceFromCamera,
FLumenCardRenderer& LumenCardRenderer,
bool bAddTranslucentToCache)
{}
and / or
void UpdateSurfaceCacheMeshCards(
FLumenSceneData& LumenSceneData,
FLumenSceneData::FFeedbackData LumenFeedbackData,
const TArray<FVector, TInlineAllocator<2>>& LumenSceneCameraOrigins,
bool bOrthographicCamera,
float LumenSceneDetail,
float MaxCardUpdateDistanceFromCamera,
TArray<FSurfaceCacheRequest, SceneRenderingAllocator>& SurfaceCacheRequests,
const FViewFamilyInfo& ViewFamily)
{}
It is worth mentioning that, once the surface cache has had a long enough “warm-up”, the radiosity stored in it appears to be reasonably stable between render tiles. However, that does not scale over a long camera trajectory: even if we keep unused surface cache pages in memory for more frames, at some point they will be cleared if they are not used again.
We are currently making a couple of engine modifications to Lumen ScreenProbe radiance sampling to get better tiling support on clustered render machines, so a pointer in the right direction here would help us speed things up.
Any hints on how we could implement that in an easy way are most appreciated!
Thank you!
Best,
Markus