We have been consistently experiencing GPU hangs in StableSpringsSystem (we know this is the culprit because the Aftermath hash in the crash matches the hash we see when it is compiled). I was able to reproduce a similar-looking hang, which happens randomly, by modifying the SpawnStableNodes values. This originally happened in our project on 5.5, and I was able to confirm that it happens in vanilla 5.6 as well.
I have attached the Aftermath crash and a screenshot of what it looks like when crashing. Unfortunately, I do not have an Insights trace.
Steps to Reproduce
Download Unreal 5.6 Vanilla.
I have attached a project that reproduces the problem, but you should be able to reproduce it with any MetaHuman hair attached to an unstable source, such as the character (in the sample project, you can see that we have attached a groom to the head bone).
Run PIE and then in Niagara find the SpawnStableNodes of the StableSpringsSystem and double click it.
Disconnect the StepTime, NumSteps, DeltaTime and TimeOffset nodes, then plug random numbers into them, BUT KEEP THEM ALL UNDER 1000 to avoid an "obvious" GPU hang, since the system does of course scale naturally with NumSteps. At some point you will experience a GPU hang.
This unnaturally high-rate repro was an attempt to trigger the same groom GPU hang that we had been seeing in our 5.5 project, where it reproduces at a fairly low rate. We examined an Aftermath crash of a groom hang and noticed that a number of callstack positions were inside a loop over InitGridSamples_Emitter_HairStrandsOutput_NumSamples, which appears in the generated .usf file that gets dumped on shader compile. We were able to stop the hang by clamping the number of samples in InitGridSamples_Emitter_HairStrands; the generated .usf from our version is attached.
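For illustration only, the clamp described above could look roughly like the following C++ sketch. This is not the generated .usf and not the engine's code: the function names and the limit `kMaxGridSamples` are hypothetical, and the loop is just a stand-in for the per-sample loop seen in the Aftermath callstack.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical upper bound, chosen only for illustration.
constexpr std::int32_t kMaxGridSamples = 64;

// Clamp a computed sample count into [0, kMaxGridSamples] so that a
// runaway value cannot turn the sampling loop into a GPU hang.
std::int32_t ClampNumSamples(std::int32_t numSamples)
{
    return std::clamp(numSamples, std::int32_t{0}, kMaxGridSamples);
}

// Stand-in for the sampling loop: returns how many iterations would run
// once the count has been clamped.
std::int32_t CountLoopIterations(std::int32_t numSamples)
{
    const std::int32_t bounded = ClampNumSamples(numSamples);
    std::int32_t iterations = 0;
    for (std::int32_t i = 0; i < bounded; ++i)
    {
        ++iterations;
    }
    return iterations;
}
```

With this kind of guard, a huge or negative count costs at most `kMaxGridSamples` iterations, at the price of undersampling whenever the true count legitimately exceeds the limit.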
We think there is some way for NumSamples to get very high, but we have not been able to debug the compute shader well enough to know why it gets high enough to cause a GPU hang. The hang appears to happen randomly, and the repro above is the easiest way to trigger it. Although the repro seems unnatural, it may be the same thing that happens randomly, so I have provided it.
Please advise on whether this is a known problem and what the desired fix is.
I should also note that checking if the delta is infinite and then clamping it will also work (a little less aggressive than clamping OutNumSamples).
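As a rough sketch of that less aggressive option, the check might look like the C++ below. It assumes the delta is a single float and uses a hypothetical limit `kMaxDelta`; the actual expression in the generated shader is not reproduced here, and the name `SanitizeDelta` is made up for this example.

```cpp
#include <cmath>

// Hypothetical magnitude to fall back to when the delta is infinite.
constexpr float kMaxDelta = 1.0f;

// If the delta is +/- infinity, replace it with +/- kMaxDelta.
// Finite values (and NaN, which this sketch does not handle) pass through.
float SanitizeDelta(float delta)
{
    if (std::isinf(delta))
    {
        return std::copysign(kMaxDelta, delta);
    }
    return delta;
}
```

Because only infinite inputs are altered, valid deltas are left untouched, which is why this is gentler than clamping the sample count itself.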
Hi Alex, thanks for reporting the issue and for providing a fix. We are going to debug it locally thanks to your repro and will likely submit a fix!
Thanks again,
Michael