5.5 GPU crash and hang

anonymous-edc · September 17, 2025, 5:31pm

We’re seeing a very frequent crash on 5.5 related to async Lumen diffuse indirect. It’s more frequent on high-end hardware, but it also seems to manifest as an idle GPU crash.

We have Aftermath crash dumps with shader symbols. The crash shows up one of two ways in Aftermath: a GPU page fault in the shadow map code, or a GPU page fault in the Lumen SDF code. These two passes run around the same time on the GPU, Lumen on the async compute path and shadows on the graphics pipe.

I’ve attached both Aftermath dumps.

The other issue is that this isn’t the only way the problem shows up. For example, yesterday I was crashing for 9 straight hours with this GPU fault, but this morning I’m 95% fine. We also get these floating shimmers:

[Image Removed]

This seems to be related to

[Content removed]

Turning off async Lumen would cost us a lot of perf, but we’re seeing so many crashes related to it.

Any recommendations?

Petrockets · September 17, 2025, 10:47pm

Hello,

From looking at the .dxil and assembly, these crashes don’t look similar to the crash dumps I’ve seen for the InstanceCull/NodeAndClusterCull GPU crash linked [Content removed] and [Content removed]. To my knowledge, we haven’t seen a repro of that crash on a 50 series card yet, only on 30 and 40 series cards.

These are a couple of known potential GPU crashes and fixes related to Lumen:

Tech Note: Fix for GPU Crash in Lumen in Unreal Engine 5.5 (likely Intel GPUs only)
GPU Crash from out of memory read/writes in Lumen when using non-power-of-2 TracingOctahedronResolution

And there are some other known potential GPU crashes documented here:

UE 5.5.x Most Common Rendering Issues

Regarding the floating shimmers, it’s not something I’ve seen reported, but if you have a PIX capture of the issue that could help narrow down the possibilities. There was a splitscreen issue with artifacts that might be related:

(UE-232413) r.Lumen.Reflections.DownsampleFactor split screen artifact

Taking a look at the crash dumps, it’s hard to say what’s going on without the shader source or breadcrumbs. Can you attach those for us to look at?

Petrockets · September 23, 2025, 11:33pm

Thanks for trying that. I haven’t found any other more recent fixes that may address this. The InstanceCull/NodeAndClusterCull GPU crash does appear related to running Lumen on async compute, though our investigations are pointing towards a driver issue. What kind of performance drop do you see with r.Lumen.DiffuseIndirect.AsyncCompute=0?

Petrockets · September 24, 2025, 7:00pm

Assuming that’s 1ms on a 50 series GPU, if you’re willing to accept that tradeoff, you can target testing that on the subset of users that have that hardware using device profile matching rules. In the case of the InstanceCull crash, it typically occurred on the first frame of drawing lots of Nanite geo and several local shadow-casting lights. Have you been able to narrow down from the logs where players are and what content (number and type of lights, amount of Nanite geo) is typically in the scene? Otherwise, I don’t have any further recommendations at this time.
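
As a rough illustration of scoping the workaround to the affected hardware, here is a standalone C++ sketch. It is not engine code and not the device profile rule syntax. The helper names `LooksLikeRtx50Series` and `AsyncLumenValueFor` are hypothetical, and the assumption that the adapter description contains "GeForce RTX 50xx" should be checked against what the affected machines actually report.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: true if the adapter description looks like a
// GeForce RTX 50 series card, e.g. "NVIDIA GeForce RTX 5090".
// Assumption: the description contains "GeForce RTX 50" followed by two digits.
bool LooksLikeRtx50Series(const std::string& AdapterName)
{
    const std::string Marker = "GeForce RTX 50";
    const std::size_t Pos = AdapterName.find(Marker);
    if (Pos == std::string::npos)
    {
        return false;
    }
    // Require two more digits after the marker (5070, 5080, 5090, ...).
    const std::size_t Next = Pos + Marker.size();
    return Next + 1 < AdapterName.size()
        && std::isdigit(static_cast<unsigned char>(AdapterName[Next]))
        && std::isdigit(static_cast<unsigned char>(AdapterName[Next + 1]));
}

// Hypothetical helper: the value to use for
// r.Lumen.DiffuseIndirect.AsyncCompute on this adapter.
// 0 disables async Lumen diffuse indirect, 1 leaves it enabled.
int AsyncLumenValueFor(const std::string& AdapterName)
{
    return LooksLikeRtx50Series(AdapterName) ? 0 : 1;
}
```

With something like this, only 50 series users would pay the roughly 1ms cost while everyone else keeps async compute enabled.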

Also, I’m still not seeing the breadcrumbs for these crashes in the attached files. Can you attach those? That will help me look for similar GPU crashes on our side.

Petrockets · October 1, 2025, 12:08am

Apologies for the delay - I’ve been unable to merge the “broken fog 5090.7z” files together with 7-Zip and keep getting an “unexpected end of file” error. Are you able to open the multi-part 7z files locally? If it isn’t a problem with the files or something I’m doing wrong, I will spin up a separate upload location for the capture where we don’t have this limit.

I was able to find similar crash reports on our end with those breadcrumbs in 5.5 and some in 5.6, but none had known workarounds aside from disabling async compute, and none had additional information.

anonymous-edc · September 17, 2025, 11:16pm

Whoops, I didn’t attach the PDBs as they’re in my search path for Aftermath.

Here you go.

anonymous-edc · September 17, 2025, 11:27pm

And here’s the GPU capture of the shimmering fog.

anonymous-edc · September 17, 2025, 11:28pm

Part 2 of the file. I seem to only be able to attach one file per comment.

anonymous-edc · September 22, 2025, 5:00pm

Hey [mention removed], I implemented both Lumen GPU crash CLs from the future and we still see the crash consistently on 5090s.

anonymous-edc · September 24, 2025, 2:09pm

We see about a 1ms drop on the GPU.

anonymous-edc · September 24, 2025, 9:21pm

Here are the breadcrumbs.

1ms is too much. We’ve been looking for fixes or ways to get that perf back.

anonymous-edc · October 2, 2025, 5:44pm

Let me see what I can do about the GPU capture. I will say that with the newer 581.49 driver I haven’t been able to repro.
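
If the driver version does turn out to matter, one option would be to gate the async compute workaround on the installed driver. Below is a minimal standalone C++ sketch, assuming the version is available as a "major.minor" string such as 581.49. `DriverVersion`, `ParseDriverVersion`, and `IsAtLeast` are hypothetical names, and 581.49 is only the version mentioned above where the crash stopped reproducing, not a confirmed fix.

```cpp
#include <cstdio>
#include <tuple>

// Hypothetical representation of a "major.minor" driver version string.
struct DriverVersion
{
    int Major;
    int Minor;
};

// Parses text such as "581.49". Returns false if the text is not in
// "major.minor" form; Out may be partially written in that case.
bool ParseDriverVersion(const char* Text, DriverVersion& Out)
{
    return std::sscanf(Text, "%d.%d", &Out.Major, &Out.Minor) == 2;
}

// True if Version is the same as or newer than Minimum.
// Compares major first, then minor, rather than treating the version as a float.
bool IsAtLeast(const DriverVersion& Version, const DriverVersion& Minimum)
{
    return std::tie(Version.Major, Version.Minor)
        >= std::tie(Minimum.Major, Minimum.Minor);
}
```

A caller could keep r.Lumen.DiffuseIndirect.AsyncCompute enabled only when `IsAtLeast(Installed, DriverVersion{581, 49})` holds, and fall back to 0 otherwise, pending more evidence that the newer driver actually resolves the crash.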