We have been experiencing a crash when using color correction regions or windows in both vanilla 5.4.4 and an icvfx build in 5.4.2. Our inner frustum resolution is set to 8448x2897 with an overscan of 1.2.
We have the theory that the ndisplay texture resolution might be too big (inner frustum resolution * overscan).
This was flagged last year but we haven’t seen a working fix yet, and we are resuming a production using the same build.
Find attached two different logs of 2 crashes.
It doesn’t crash straight away, we had to wait around 20 minutes.
We tried setting a cvar override: DC.OverrideMaxTextureDimension=16384 (doubling the default max texture size), and we could see a different behaviour. It didn’t crash for an hour, until we moved a bunch of CCR (12 CCR parented to a root) and changed the brightness of it and it froze.
It didn’t crash, but all the nodes froze. So can’t really provide a crash log.
We will try setting the max texture dimension override to 12K instead of 16K, to leave a bit of headroom.
One of our devs just identified that IF the regionstate is inverted, the code which causes the garbage data in the viewport would not be executed. So in the exact same frame we are crashing on, if we HAD used regionstate invert, the viewport sizes would be:
We left it running overnight with only CCWindows instead of CCRegions and it didn’t crash the engine for 16h, so we think there’s gotta be some difference between both actor types that is preventing the crash from happening, even though they share the same core. Any ideas?
[2025.05.09-16.01.57:649][291]LogDisplayClusterViewport: Error: The viewport 'ICVFXCameraA' rect 'CustomFrustum' size 10137x3476 clamped: max texture dimensions is 8192
This makes your resolution clamped to 8192x2809 (not entirely sure about the vertical resolution, but horizontal is definitely clamped).
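For reference, the numbers in that log line can be reproduced with a small standalone calculation. This is only a sketch: `Size2D`, `ApplyOverscan` and `ClampToMaxDimension` are hypothetical names, the overscan is expressed in tenths to avoid floating-point rounding, and the truncation and uniform down-scaling are assumptions about how the clamp behaves rather than the actual nDisplay code.

```cpp
#include <cstdint>

// Hypothetical stand-in for a 2D size; not an engine type.
struct Size2D
{
    std::int64_t X;
    std::int64_t Y;
};

// Multiplies by an overscan factor given in tenths (12 == 1.2) and truncates.
constexpr Size2D ApplyOverscan(Size2D Base, std::int64_t OverscanTenths)
{
    return Size2D{ Base.X * OverscanTenths / 10, Base.Y * OverscanTenths / 10 };
}

// Scales the size down uniformly so that its larger side fits MaxDim
// (assumed behaviour; sizes that already fit are returned unchanged).
constexpr Size2D ClampToMaxDimension(Size2D In, std::int64_t MaxDim)
{
    const std::int64_t Largest = In.X > In.Y ? In.X : In.Y;
    if (Largest <= MaxDim)
    {
        return In;
    }
    return Size2D{ In.X * MaxDim / Largest, In.Y * MaxDim / Largest };
}

// Inner frustum 8448x2897 with overscan 1.2 gives the size reported in the log.
constexpr Size2D Overscanned = ApplyOverscan(Size2D{ 8448, 2897 }, 12);
static_assert(Overscanned.X == 10137 && Overscanned.Y == 3476, "matches the log line");

// Clamping the larger side to 8192 while keeping the aspect ratio.
constexpr Size2D Clamped = ClampToMaxDimension(Overscanned, 8192);
static_assert(Clamped.X == 8192 && Clamped.Y == 2809, "aspect-preserving clamp");
```

Under these assumptions the overscanned frustum exceeds the 8192 default, and a 12K override (12288) would already leave it unclamped.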
Regarding the crash:
Could you try doing this:
`// Clipping is required because as we get closer to the bounding box the bounds
// may extend beyond the allowed render target size.
BoundingRectangle.Clip(PrimaryViewRect);
// Add these two lines to clamp to max texture dimension:
// (FIntRect takes MinX, MinY, MaxX, MaxY, so the clip rect has to start at 0,0)
const FIntRect ViewportMax = FIntRect(0, 0, 16383, 16383);
BoundingRectangle.Clip(ViewportMax);`
Add the two lines in the RenderRegion function in Engine\Plugins\Experimental\ColorCorrectRegions\Source\ColorCorrectRegions\Private\ColorCorrectRegionsSceneViewExtension.cpp
I believe this should plug the issue and, if it does, I will permanently fix it in 5.6.1 if not 5.6.
Thank you for jumping on this with us. I have been looking into the CCR render parts on our side in the meanwhile and I believe we have hotfixes for several critical issues within CCR. I would like to share my findings, observations and mitigations with you here.
All the issues encountered, or at least their symptoms, appear in ::RenderRegion(). Prior to the issue we have faced recently, there was a narrowing conversion issue in the Scale() function due to a hard assumption of IntType being 64-bit, which had already been addressed with:
`UE::Math::TIntRect<int64> BoundingRectToBeTruncated(BoundingRectangle);`
One of the new issues is that the viewport dimensions of the BoundingRectangle are sometimes rogue. The behavior is quite non-deterministic. One might think this shouldn’t happen, as the render code attempts a mitigation with:
`BoundingRectangle.Clip(PrimaryViewRect);`
However, based on the crash report from above and the fact that the code *is* already clipping against PrimaryViewRect, the mitigation is simply not robust enough, as we are reaching values above the allowed platform sizes, i.e. D3D12_VIEWPORT_BOUNDS_MAX in our case for DX12. It could also be that PrimaryViewRect is rogue and hence not *always* clipping Min towards 0. Also, we can’t straight away clip Min.XY with the function, as it would clip negative values towards the Max platform number instead of 0 first for Min.XY, which would result in a viewport with Min.XY == Max.XY (basically 0 extent). That said, since Min.XY can be negative, we have to make sure to clip Min.XY first towards 0 and then towards the upper DX12 platform bounds, like so:
`// ++Dimension Markus : Tentative fix to resolve ICVFX crash due to invalid DX12 dimensions on RHI viewport
#include "D3D12.h"
// --Dimension Markus : Tentative fix to resolve ICVFX crash due to invalid DX12 dimensions on RHI viewport
…
// ++Dimension Markus : Tentative fix to resolve ICVFX crash due to invalid DX12 dimensions on RHI viewport
// Preventing a graphics API platform crash caused by the hard limit on viewport sizes (DX12 only for now)
// We are in a war zone here on-set, so we are only considering the DX12 platform for the hotfix
FIntRect ViewportLimits{ FIntRect(0, 0, D3D12_VIEWPORT_BOUNDS_MAX, D3D12_VIEWPORT_BOUNDS_MAX) };
BoundingRectangle.Clip(ViewportLimits);
BoundingRectangle.Min.X = FMath::Min(BoundingRectangle.Min.X, D3D12_VIEWPORT_BOUNDS_MAX);
BoundingRectangle.Min.Y = FMath::Min(BoundingRectangle.Min.Y, D3D12_VIEWPORT_BOUNDS_MAX);
// --Dimension Markus : Tentative fix to resolve ICVFX crash due to invalid DX12 dimensions on RHI viewport`
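To make the ordering argument concrete outside the engine, here is a minimal standalone sketch. `IntRect`, `ClampCoordinate`, `ClampMinUpperOnly` and `ClampToViewportBounds` are hypothetical stand-ins (not FIntRect or engine functions), and 32767 is hard-coded as the value d3d12.h defines for D3D12_VIEWPORT_BOUNDS_MAX.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for FIntRect; not the engine type.
struct IntRect
{
    std::int32_t MinX;
    std::int32_t MinY;
    std::int32_t MaxX;
    std::int32_t MaxY;
};

// Value of D3D12_VIEWPORT_BOUNDS_MAX in d3d12.h, hard-coded for this sketch.
constexpr std::int32_t ViewportBoundsMax = 32767;

// Clamps one coordinate into [0, ViewportBoundsMax]: lower bound first, then upper.
inline std::int32_t ClampCoordinate(std::int32_t Value)
{
    const std::int32_t AtLeastZero = std::max<std::int32_t>(Value, 0);
    return std::min<std::int32_t>(AtLeastZero, ViewportBoundsMax);
}

// Upper-bound-only clamp on Min, shown for contrast: a negative Min passes through.
inline IntRect ClampMinUpperOnly(IntRect R)
{
    R.MinX = std::min<std::int32_t>(R.MinX, ViewportBoundsMax);
    R.MinY = std::min<std::int32_t>(R.MinY, ViewportBoundsMax);
    return R;
}

// Clamps all four coordinates, then keeps Max >= Min so the extent is never negative.
inline IntRect ClampToViewportBounds(IntRect R)
{
    R.MinX = ClampCoordinate(R.MinX);
    R.MinY = ClampCoordinate(R.MinY);
    R.MaxX = std::max(R.MinX, ClampCoordinate(R.MaxX));
    R.MaxY = std::max(R.MinY, ClampCoordinate(R.MaxY));
    return R;
}
```

With a rogue rectangle such as `{-500, -20, 50000, 3000}`, `ClampMinUpperOnly` leaves Min at (-500, -20), whereas `ClampToViewportBounds` yields `{0, 0, 32767, 3000}`.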
The crash, which was probably the nastiest one, is another non-deterministic crash, potentially due to a race condition between the RenderThread and the RenderGraph execution. More precisely, it seems to me that since we are not in an RDG pass but reference an RDG-managed resource, the RDG could be updating that particular resource without knowing that we also rely on read access while it does so. Let me elaborate step by step. After the viewport clips, ::RenderRegion() creates a platform RHI viewport with this particular constructor:
`const FScreenPassTextureViewport RegionViewport(SceneColorRenderTarget.Texture, BoundingRectangle);`
which chains further into a FScreenPassTextureViewport construction via:
`FScreenPassTextureViewport(FRDGTextureRef InTexture, FIntRect InRect) : FScreenPassTextureViewport(FScreenPassTexture(InTexture, InRect)) {}`
and ultimately results in the explicit FScreenPassTextureViewport constructor here, which can crash on the runtime assertion:
`inline FScreenPassTextureViewport::FScreenPassTextureViewport(FScreenPassTexture InTexture)
{
    check(InTexture.IsValid());
    Extent = InTexture.Texture->Desc.Extent;
    Rect = InTexture.ViewRect;
}`
Now, this case should not happen, although a hotfix might be to check whether the RHI resource of FScreenPassTexture points to valid memory. So, one might think we could fix this in ::RenderRegion() with:
`// Check RHI resource valid before creating platform Viewport
if (!SceneColorRenderTarget.IsValid() || !SceneColorRenderTarget.Texture)
{
    return false;
}
const FScreenPassTextureViewport RegionViewport(SceneColorRenderTarget.Texture, BoundingRectangle);`
However, sadly this is where the non-deterministic race condition behavior kicks in. This prevention is definitely not enough, and the runtime assertion can quite likely still trigger in the inline constructor. Looking at SceneColorRenderTarget a bit further, the derived .Texture resource from FScreenPassTexture is an FRDGTextureRef, so it is clearly intended to be managed by the RenderGraph. I noticed that the FRDGTextureRef seems to be valid at the time ::RenderRegion() executes, *before* creating the platform viewport, however the underlying FRHIResource* was already null. There is no check on the actual FRHIResource* specifically here, because ::RenderRegion() is also outside a graph render pass, so we would probably need to be within a RenderGraph.Pass() in order to access the RDG-managed RHI resource natively. Or de-register the resource before accessing it on the RenderThread, or better yet, not rely on it if possible. That said, this gave me a clearer picture of what was going on and how we could prevent this on-set in the very limited time frame we have left.
By looking through the available constructors of FScreenPassTextureViewport(), I could circumvent the assertion by creating the platform viewport without passing the actual FRDGTextureRef, instead fetching the Extent and using the corresponding FScreenPassTextureViewport() constructor like so:
`// ++Dimension Markus : Tentative fix to resolve ICVFX crash due to RenderGraph RHIResource RaceCondition
// This is not robust alone, as there is a very nasty race condition here having RHI pointer mismatch between the RDG and the RHI resource
if (!SceneColorRenderTarget.IsValid() || !SceneColorRenderTarget.Texture)
{
return false;
}
// We altered the Constructor from FScreenPassTextureViewport to not crash anymore on invalid RHI memory, but set the
// RHI resource of the Viewport to Null as well as the extent to 0
// Since we can’t access the underlying RHI texture in this function due to RDG restrictions, we come up with this double gating and a modified Constructor
// const FScreenPassTextureViewport RegionViewport(SceneColorRenderTarget.Texture, BoundingRectangle);
// Circumventing the crash by using a different constructor chain
RegionViewport = FScreenPassTextureViewport(SceneColorRenderTarget.Texture->Desc.Extent, BoundingRectangle);
if (!(RegionViewport.Extent.X > 0 && RegionViewport.Extent.X < D3D12_VIEWPORT_BOUNDS_MAX) ||
!(RegionViewport.Extent.Y > 0 && RegionViewport.Extent.Y < D3D12_VIEWPORT_BOUNDS_MAX) ||
!SceneColorRenderTarget.IsValid())
{
return false;
}
// --Dimension Markus : Tentative fix to resolve ICVFX crash due to RenderGraph RHIResource RaceCondition`
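The control flow of that double gating can be sketched in isolation. Everything below is a hypothetical stand-in (`ToyTexture`, `ToyViewport`, `TryMakeRegionViewport` are not engine types or functions), and a single-threaded sketch like this cannot reproduce or prove a race; it only illustrates the pattern of building the viewport from an extent and rejecting unusable extents instead of asserting.

```cpp
#include <cstdint>

// Hypothetical stand-ins for FIntPoint / FIntRect.
struct IntPoint { std::int32_t X; std::int32_t Y; };
struct IntRect { std::int32_t MinX; std::int32_t MinY; std::int32_t MaxX; std::int32_t MaxY; };

// Stand-in for an RDG texture: only the descriptor extent is modelled.
struct ToyTexture { IntPoint Extent; };

// Stand-in for a viewport built from an extent and a rect (no texture pointer kept).
struct ToyViewport { IntPoint Extent; IntRect Rect; };

// Value of D3D12_VIEWPORT_BOUNDS_MAX in d3d12.h, hard-coded for this sketch.
constexpr std::int32_t ViewportBoundsMax = 32767;

// Second gate: the extent must be strictly positive and below the platform bound.
inline bool IsExtentUsable(IntPoint Extent)
{
    return Extent.X > 0 && Extent.X < ViewportBoundsMax
        && Extent.Y > 0 && Extent.Y < ViewportBoundsMax;
}

// Returns false instead of asserting when the texture or its extent is unusable.
inline bool TryMakeRegionViewport(const ToyTexture* Texture, IntRect Region, ToyViewport& OutViewport)
{
    // First gate: no texture, no viewport.
    if (Texture == nullptr)
    {
        return false;
    }

    // Copy the extent once, then validate the copy.
    const IntPoint Extent = Texture->Extent;
    if (!IsExtentUsable(Extent))
    {
        return false;
    }

    OutViewport = ToyViewport{ Extent, Region };
    return true;
}
```

The caller skips the region for that frame when the function returns false, which is the same early-out behaviour as the `return false;` branches in the hotfix.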
Another way would be to modify FScreenPassTextureViewport in the Renderer module directly, for example by producing a conditional zero-sized viewport instead of asserting when the RDG resource is not valid. However, one could argue it’s good to leave the runtime assertion there: from the perspective of the Renderer module, the RDG resource should arguably always be valid, so I believe it is fair to keep that assertion.
We had 2 stable overnight runs with the race condition hotfix. We are still in the testing phase with the rest of it, but we thought it was definitely worth sharing these fixes and insights with you here.
I could imagine that, if one would prefer to fix the root of the race condition with the FRDGTextureRef entirely, one could wrap at least the accessing part of ::RenderRegion() within an actual RenderGraph.Pass() to make sure the RDG schedules our RHI access here without causing a race condition like the one we experience.
I also recommend adding the sanity clamp and a check for the Max Region before creating the Platform Viewport, like so:
`// ++Dimension Markus : Tentative fix to resolve ICVFX crash due to invalid DX12 dimensions on RHI viewport
// Preventing a graphics API platform crash caused by the hard limit on viewport sizes (DX12 only for now)
// We are in a war zone here on-set, so we are only considering the DX12 platform for the hotfix
FIntRect ViewportLimits{ FIntRect(0, 0, D3D12_VIEWPORT_BOUNDS_MAX, D3D12_VIEWPORT_BOUNDS_MAX) };
BoundingRectangle.Clip(ViewportLimits);
// Clip Min to < D3D12_VIEWPORT_BOUNDS_MAX if Min > 0
BoundingRectangle.Min.X = FMath::Min(BoundingRectangle.Min.X, D3D12_VIEWPORT_BOUNDS_MAX);
BoundingRectangle.Min.Y = FMath::Min(BoundingRectangle.Min.Y, D3D12_VIEWPORT_BOUNDS_MAX);
// In case Min was > Max before Clipping && Min > D3D12_VIEWPORT_BOUNDS_MAX so that Max == Min which is > D3D12_VIEWPORT_BOUNDS_MAX
BoundingRectangle.Max.X = FMath::Min(BoundingRectangle.Max.X, D3D12_VIEWPORT_BOUNDS_MAX);
BoundingRectangle.Max.Y = FMath::Min(BoundingRectangle.Max.Y, D3D12_VIEWPORT_BOUNDS_MAX);
// --Dimension Markus : Tentative fix to resolve ICVFX crash due to invalid DX12 dimensions on RHI viewport`
This is because, looking at the Clip function, one will realize that the last section basically produces a zero-area region; not zero in terms of an actual 0,0 position, but literally zero area, while that zero-area viewport can still sit at a larger-than-allowed position. This can happen if Min was originally larger than Max. For reference:
`// return zero area if not overlapping
Max.X = FMath::Max(Min.X, Max.X);
Max.Y = FMath::Max(Min.Y, Max.Y);
}`
Maybe the Clip should just set 0,0 as zero-area rather than the original (probably rogue) input values. But that’s up for discussion and intent I suppose. Anyway, if we apply the additional manual clip from the fix above also for Max, we should be good to ensure both Min and Max are within Platform Viewport Bounds.
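The zero-area case can be walked through with a small standalone model. `IntRect::Clip` below is a hypothetical re-implementation: its last two lines follow the snippet quoted above, while the four Min/Max lines before them are my assumption of what the rest of the function does. 32767 is the value d3d12.h defines for D3D12_VIEWPORT_BOUNDS_MAX.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for FIntRect; not the engine type.
struct IntRect
{
    std::int32_t MinX;
    std::int32_t MinY;
    std::int32_t MaxX;
    std::int32_t MaxY;

    void Clip(const IntRect& R)
    {
        // Assumed first half: intersect with R.
        MinX = std::max(MinX, R.MinX);
        MinY = std::max(MinY, R.MinY);
        MaxX = std::min(MaxX, R.MaxX);
        MaxY = std::min(MaxY, R.MaxY);

        // Quoted second half: return zero area if not overlapping.
        MaxX = std::max(MinX, MaxX);
        MaxY = std::max(MinY, MaxY);
    }
};

// Value of D3D12_VIEWPORT_BOUNDS_MAX in d3d12.h, hard-coded for this sketch.
constexpr std::int32_t ViewportBoundsMax = 32767;

// Clip alone: an inverted rect collapses to zero area but keeps its rogue position.
inline IntRect ClipOnly(IntRect R)
{
    R.Clip(IntRect{ 0, 0, ViewportBoundsMax, ViewportBoundsMax });
    return R;
}

// Clip followed by the manual upper-bound clamp on both Min and Max.
inline IntRect ClipThenClampCorners(IntRect R)
{
    R = ClipOnly(R);
    R.MinX = std::min(R.MinX, ViewportBoundsMax);
    R.MinY = std::min(R.MinY, ViewportBoundsMax);
    R.MaxX = std::min(R.MaxX, ViewportBoundsMax);
    R.MaxY = std::min(R.MaxY, ViewportBoundsMax);
    return R;
}
```

For an inverted input such as `{40000, 10, 100, 200}`, `ClipOnly` produces a zero-width rect at X = 40000 (outside the bound), while `ClipThenClampCorners` brings both Min.X and Max.X down to 32767.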
Thank you for all of your investigation. I’ve submitted a “plug” for this issue in UE5.6. I still don’t understand what exactly is causing it, since the underlying problem must be an invalid View.ViewRect, which is a much bigger issue.
If you have a moment to try it in UE5.6 and let me know, that would be appreciated.
That’s great, thank you. I had a look at your changes and they look good to me! The only thing is that other platforms may have different viewport min/max bounds (if that is relevant). I didn’t consider that in our hotfix myself because we are DX12 only.
As for the race condition with the RDG resource I mentioned above, my fixes seemingly also resolved that crash. I personally believe that, for a clean solution, it would be good to ensure thread-safe/“graph-safe” access to the RDG resource SceneColorRenderTarget.Texture, as we are outside the graph. Since we encountered a race condition, some other thread (e.g. the RenderGraph execution running on a worker thread or something similar) must be updating that particular RDG resource while the RenderThread is running through ::RenderRegions().
Since my hotfix for that case, we have not experienced that crash in any of our overnight runs, which I would say supports the theory, and the issue behind it, with reasonable confidence. But for a completely safe and clean approach I would probably think of extracting the SceneColorRenderTarget.Texture (GraphBuilder.QueueTextureExtraction()) from the graph to ensure it is not manipulated during ::RenderRegions(), so that .IsValid() remains deterministic. Or perhaps just fetching the viewport extent during a graph pass, caching it and creating the viewport from the cached safe extent with the alternative constructor, akin to what I did in the hotfix.
Anyway, just wanted to share my thoughts on the race condition crash with the RDG Resources as well in more detail here for you.
I believe it is something else. In Unreal Engine the recommended RDG workflow is to accumulate all rendering passes before executing the graph which is what happens in this case. This ensures proper resource management, synchronization, and avoids accessing GPU resources prematurely. While accumulating passes on the render thread, you can safely reference FScreenPassRenderTarget& SceneColorRenderTarget especially since it is a local resource. This doesn’t apply to RHI resources which we don’t access on Render Thread. We also check for (SceneColorRenderTarget.Texture) validity with !SceneColor.IsValid().
If you are encountering a crash on access of SceneColorRenderTarget.Texture, this means that for some unknown reason the render pipeline provides us with both an invalid View and the SceneColor that comes with it. Would you happen to have a callstack for such a crash? Without one, I can only guess and throw out conjectures.
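As a rough illustration of the "accumulate passes first, execute later" model described above, here is a toy standalone sketch. `ToyGraphBuilder` and `RunToyFrame` are hypothetical and deliberately simplistic; they are not FRDGBuilder and say nothing about how the engine schedules work across threads.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a render graph builder; not FRDGBuilder.
class ToyGraphBuilder
{
public:
    // Setup phase: records the pass body without running it.
    void AddPass(std::string Name, std::function<void()> Body)
    {
        Log.push_back("record:" + Name);
        Passes.emplace_back(std::move(Name), std::move(Body));
    }

    // Execute phase: runs every recorded pass body in recording order.
    void Execute()
    {
        for (auto& Pass : Passes)
        {
            Log.push_back("execute:" + Pass.first);
            Pass.second();
        }
        Passes.clear();
    }

    std::vector<std::string> Log;

private:
    std::vector<std::pair<std::string, std::function<void()>>> Passes;
};

// Records two passes, then executes the graph, logging what has run at each point.
inline std::vector<std::string> RunToyFrame()
{
    ToyGraphBuilder Graph;
    int ExecutedPasses = 0;

    Graph.AddPass("RegionA", [&ExecutedPasses] { ++ExecutedPasses; });
    Graph.AddPass("RegionB", [&ExecutedPasses] { ++ExecutedPasses; });

    // Setup is finished, yet no pass body has run so far.
    Graph.Log.push_back("executed-before-flush:" + std::to_string(ExecutedPasses));

    Graph.Execute();
    Graph.Log.push_back("executed-after-flush:" + std::to_string(ExecutedPasses));
    return Graph.Log;
}
```

In this model, code that runs during setup (the equivalent of ::RenderRegion) only sees graph descriptors, and nothing recorded has touched a resource until `Execute()` is called.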
Yeah, so the non-deterministic behavior with that resource access has the effect that checking for SceneColor.IsValid() inside ::RenderRegion() doesn’t seem to be thread-safe: the resource, or at least the return value of IsValid(), can change during the execution of ::RenderRegion() on the RenderThread. In other words, the race condition causes IsValid() to be true at first, but it *may* suddenly become invalid afterwards, during the viewport constructor currently used in vanilla, which then hits the IsValid() assertion and crashes.
Due to that race-condition-like behavior, checking IsValid() before the constructor that asserts IsValid() won’t remove the potential for that crash, as IsValid() can suddenly change its return value, implying that some other thread is manipulating that resource while the RenderThread runs ::RenderRegion(). That is why switching the construction chain to an alternate constructor that bypasses the assertion, and checking the validity of the viewport extent afterwards, seems to resolve the crash. But similar to the other crash, the worrying part is: why does the return value of IsValid() change during that function’s execution on the RenderThread, and who is manipulating the RDG resource if not the graph? These are the 2 questions to answer to find the root cause. ::RenderRegion() is not within a graph pass, so accessing the underlying RHI resource (which *can* be null here) from the RDG resource would be invalid.
We don’t have a minidump from a pure vanilla version of this, unfortunately, only from a custom build on-set, so I am not sure whether you would be able to read it, as you might need the custom symbols from our on-set engine build to read the callstack. Although I believe this particular build was mostly vanilla, to be fair, with some ICVFX patches from Epic.
And yeah, these issues have been quite tricky. I had to make quite a few guesses due to the non-deterministic behavior and the fact that we could do only a single test run overnight, if that.
You are correct, *SceneColor.IsValid()* isn’t thread safe, which is why we use it on the rendering thread. The FScreenPassTexture SceneColor resource cannot be changed (at this stage of the process) on any other thread (specifically RHI or RDG, because RDG hasn’t been executed yet and therefore RHI hasn’t been either), so I doubt it is this specific race condition unless there’s an invalid or unsafe use of SceneColor elsewhere in the pipeline. FPostProcessingInputs must have SceneColor either initialized or not at this point, and nothing should be able to change it on any other thread until the graph is executed. ::RenderRegion is executed on the render thread, so any access within that context should be safe as long as it’s before graph execution, which it certainly is.
It is a strange problem. A stack trace or even an error message could help shed light on this issue.
Yeah, this issue has been much weirder to address than the original one with the viewport sizes. If we can absolutely rule out, as you say, any graph execution from any other thread while we are in ::RenderRegions() on the RenderThread, that would narrow down the case quite a lot. Although then the question is what else exactly it narrows down to.
I managed to dig up one of these IsValid() crashes sent from on-set, alongside the log and the minidump. Let me know if you get the 5.4.4 symbols working on the minidump. In one of the last lines of the log you will see the assertion that fires within the original FScreenPassTextureViewport constructor.
In terms of crash frequency without our hotfix, it’s similar to the first crash, i.e. non-deterministic: it might or might not run for hours or even overnight. Based on one week of testing, at one test per day, I would say the occurrence frequency is probably similar to that of the viewport extent crash.
[2025.05.13-15.11.35:317][320]LogWindows: Error: appError called: Assertion failed: InTexture.IsValid() [File:D:\EngineBuild\Unreal-Engine-5.4-ICVFX\Engine\Source\Runtime\Renderer\Public\ScreenPass.inl] [Line: 148]
No worries at all Rus, I am happy to help make these things more stable in vanilla. In the meantime I am keeping our patches for our build. Once all of these cases are addressed in vanilla, we can stop maintaining our hotfixes, which is always nice.