Crash in InitializeVertexBuffersResources

This question was created in reference to: [UE5.5.4 - Crash in FNiagaraRibbonGpuBuffer::Allocate and [Content removed]

Just wanted to call out that since our upgrade from 5.5.4 to 5.6.1, we have also been seeing this crash:

Unhandled Exception: EXCEPTION_ACCESS_VIOLATION reading address 0x0000000000000028

[ 00 ] FD3D12DynamicRHI::RHILockBuffer (D3D12Buffer.cpp:686)

[ 01 ] FRHICommandListBase::LockBuffer (RHICommandList.h:729)

[ 02 ] FNiagaraRendererRibbons::InitializeVertexBuffersResources (NiagaraRendererRibbons.cpp:2514)

[ 03 ] FNiagaraRendererRibbons::GetDynamicMeshElements (NiagaraRendererRibbons.cpp:963)

[ 04 ] FNiagaraSystemRenderData::GetDynamicMeshElements (NiagaraSystemRenderData.cpp:235)

[ 05 ] FNiagaraSceneProxy::GetDynamicMeshElements (NiagaraComponent.cpp)

[ 06 ] FProjectedShadowInfo::GatherDynamicMeshElementsArray

[ 07 ] FProjectedShadowInfo::GatherDynamicMeshElements

[ 08 ] UE::Trace::FChannel::operator|

[ 09 ] TaskTrace::FTaskTimingEventScope::{ctor}

[ 10 ] UE::Tasks::Private::FTaskBase::TryExecuteTask

[…]

It seems to stem from parallel threads altering the vertex buffers out from under each other. The call to …

NiagaraRendererRibbons.cpp
 
// Make sure our ribbon data buffers are setup
VertexBuffers.InitializeOrUpdateBuffers(RHICmdList, GenerationConfig, DynamicDataRibbon->GenerationOutput, SourceParticleData, DynamicDataRibbon->MaxAllocationCount, bShouldUseGPUInit);

… can release or reallocate any of the shared vertex buffers while another thread is within the Lock/Memcpy/Unlock sequence below, operating on the same vertex buffers.

I’ve done a tentative fix locally: I moved the UE::TScopeLock VertexBuffersLock(VertexBuffersGuard) declaration to the top of FNiagaraRendererRibbons::InitializeVertexBuffersResources, so that only one thread can touch the vertex buffers at a time, and I am no longer seeing the crash. I haven’t been able to test that extensively, though.
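The idea behind that change can be sketched outside the engine. The following is a minimal standalone illustration, not Unreal code: `SharedVertexBuffers`, `InitializeOrUpdate` and `RunConcurrentGather` are hypothetical names, `std::mutex` stands in for the engine's guard, and a `std::vector` stands in for the GPU allocation. The payloads are given different sizes only to force a reallocation on most calls. It shows the lock being taken before the (re)allocation step, so the allocation and the copy form one critical section.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the renderer's shared vertex buffers (not Unreal code).
struct SharedVertexBuffers
{
    std::mutex Guard;            // plays the role of VertexBuffersGuard
    std::vector<float> Storage;  // plays the role of a GPU buffer allocation

    // May release and reallocate Storage, similar in spirit to InitializeOrUpdateBuffers.
    void InitializeOrUpdate(std::size_t RequiredCount)
    {
        if (Storage.size() != RequiredCount)
        {
            Storage.assign(RequiredCount, 0.0f);
        }
    }
};

// The lock is taken at the top, before the buffers can be reallocated,
// so no other thread can resize Storage while this thread is copying into it.
void InitializeVertexBuffersResources(SharedVertexBuffers& Buffers, const std::vector<float>& Source)
{
    std::lock_guard<std::mutex> Lock(Buffers.Guard);
    Buffers.InitializeOrUpdate(Source.size());
    assert(Buffers.Storage.size() == Source.size());
    if (!Source.empty())
    {
        // Stand-in for the Lock/Memcpy/Unlock sequence on the real buffers.
        std::memcpy(Buffers.Storage.data(), Source.data(), Source.size() * sizeof(float));
    }
}

// Simulates several gather threads hitting the same renderer at once.
// Returns true if the final buffer holds exactly one thread's complete payload.
bool RunConcurrentGather(int ThreadCount)
{
    SharedVertexBuffers Buffers;
    std::vector<std::vector<float>> Payloads;
    for (int Index = 0; Index < ThreadCount; ++Index)
    {
        // Different sizes per thread so that most calls trigger a reallocation.
        Payloads.emplace_back(static_cast<std::size_t>(64 + Index), static_cast<float>(Index + 1));
    }

    std::vector<std::thread> Threads;
    for (int Index = 0; Index < ThreadCount; ++Index)
    {
        Threads.emplace_back([&Buffers, &Payloads, Index]
        {
            for (int Iteration = 0; Iteration < 1000; ++Iteration)
            {
                InitializeVertexBuffersResources(Buffers, Payloads[Index]);
            }
        });
    }
    for (std::thread& Thread : Threads)
    {
        Thread.join();
    }

    for (const std::vector<float>& Payload : Payloads)
    {
        if (Buffers.Storage == Payload)
        {
            return true;
        }
    }
    return false;
}
```

If the `std::lock_guard` were moved below the `InitializeOrUpdate` call, one thread could resize `Storage` while another is still inside `memcpy`, which is the same shape of failure as the access violation in the call stack above.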

I’ll update if this indeed seems to fix the crash for us.

Hi Daniel,

Thanks for the information. Do you happen to know whether, when this occurs, the ribbon is visible in multiple shadow projections but not in the main rendering pass?

I’m asking because for the main rendering pass GetDynamicMeshElements should only be called once, and we loop per view for split screen inside it. Therefore, if the main pass was invoked, the buffers would already be allocated and the projection passes would be fine, since the alloc is a no-op. What I’m assuming is that the shadow projection pass is calling GetDynamicMeshElements multiple times and on different threads. If that’s true, it would explain why your fix works.

However, I might have concerns around RHICmdList execution ordering. I think you should be OK, as we don’t do anything with these buffers until all the collectors have executed, but it is something to watch out for.

Thanks,

Stu

Thanks for confirming my suspicion. I’m going to see if I can make myself a repro, and I’ll look over the surrounding code to confirm we don’t execute anything on those buffers before they join back to the immediate context (I don’t think we do, but I want to be careful).

Thanks,

Stu

Wanted to let you know that I confirmed the race by adding some verification locally. I didn’t get the crash, but I confirmed it can race when the ribbon is off screen and there are multiple shadow-casting lights.
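The thread does not say what that verification looked like. One generic way to confirm that a code path is entered by more than one thread at a time is a small scoped counter, sketched below. This is an assumption for illustration only: `ConcurrentEntryDetector` and its members are hypothetical names, and nothing here is taken from the engine.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: records whether two scopes were ever alive at the same time.
class ConcurrentEntryDetector
{
public:
    class Scope
    {
    public:
        explicit Scope(ConcurrentEntryDetector& InOwner) : Owner(InOwner)
        {
            // fetch_add returns the previous value; non-zero means another scope is already active.
            if (Owner.ActiveCount.fetch_add(1, std::memory_order_acq_rel) != 0)
            {
                Owner.bSawOverlap.store(true, std::memory_order_relaxed);
            }
        }

        ~Scope()
        {
            const int Previous = Owner.ActiveCount.fetch_sub(1, std::memory_order_acq_rel);
            assert(Previous >= 1);
            (void)Previous;
        }

        Scope(const Scope&) = delete;
        Scope& operator=(const Scope&) = delete;

    private:
        ConcurrentEntryDetector& Owner;
    };

    // True once any two scopes have overlapped in time.
    bool SawOverlap() const { return bSawOverlap.load(std::memory_order_relaxed); }

private:
    std::atomic<int> ActiveCount{0};
    std::atomic<bool> bSawOverlap{false};
};
```

Placing a `Scope` at the top of the suspect function and checking `SawOverlap()` afterwards would report whether the function was ever re-entered concurrently, even on runs where the race does not end in a crash.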

The lock also works OK; it’s what I’m going to submit for 5.7.

Thanks again for the report and fix.

Stu

I don’t know if it was the case every time, but in the repros that I can recall, yes, that seemed to be the case.

There would be ~3 threads waiting on the mutex, all with the same callstack as the crashing thread and all with the same vertex buffer pointers. The crash also seemed to happen once I turned sharply away from our ribbons, suggesting they were no longer part of the main render pass.