Hi there,
As you noted, this crash is very rare and quite difficult to reproduce. It does look like a race condition to me, but I’m unsure where it might be coming from. From checking the code, concurrent updates and reads of DynamicData should be protected by mutually exclusive scope guards. See my detailed breakdown below for my reasoning, and for possible additional debugging steps.
It does look like this code was heavily refactored in UE 5.5, so this should no longer be an issue in 5.5+. The following commit should fix the issue, as it removes all dependencies on DynamicData from the GetBaseSkinVertexFactory method:
https://github.com/epicgames/UnrealEngine/commit/a9ca63cf10614cc446572c880459c58e4156dd76
However, it’s also quite a large refactor, so it might be difficult to backport if you need to do that.
Detailed breakdown:
This crash comes from a deformer graph containing a cloth node, which triggers a call to FSkeletalMeshObjectGPUSkin::GetBaseSkinVertexFactory, where the race condition and crash occur. The DynamicData variable it reads is updated in FSkeletalMeshObjectGPUSkin::UpdateDynamicData_RenderThread, which FSkeletalMeshObjectGPUSkin::Update enqueues to the UE::RenderCommandPipe::SkeletalMesh command pipe. This command pipe DOES run on its own thread (NOT the render thread); however, it should not be possible for the pipe to be running when FSkeletalMeshObjectGPUSkin::GetBaseSkinVertexFactory is called. In your call stack there is a call to ComputeFramework::FlushWork, which contains the line:
UE::RenderCommandPipe::FSyncScope SyncScope({ &UE::RenderCommandPipe::SkeletalMesh });
This should prevent the pipe from running while this render thread task is executing, but perhaps something isn’t working as expected here. You could try adding `check(!UE::RenderCommandPipe::SkeletalMesh.IsReplaying())` to the top of the FSkeletalMeshObjectGPUSkin::GetBaseSkinVertexFactory function to verify that this condition holds.
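To make the invariant behind that check concrete, here is a small standalone C++ sketch of the idea. It is a toy model only, not engine code: FToyPipe, FToySyncScope, FToyDynamicData and RunDemo are hypothetical names I made up for illustration, and a plain std::mutex stands in for whatever the real pipe synchronization does internally. The writer thread plays the role of the command pipe updating the data, and the reader plays the role of the render thread task that holds a sync scope while reading.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a command pipe. NOT the Unreal API.
class FToyPipe {
public:
    // Execute one command on the pipe, flagging the pipe as replaying meanwhile.
    template <typename Fn>
    void Replay(Fn&& Command) {
        std::lock_guard<std::mutex> Lock(Mutex);
        bReplaying.store(true);
        Command();
        bReplaying.store(false);
    }
    bool IsReplaying() const { return bReplaying.load(); }

private:
    friend class FToySyncScope;
    std::mutex Mutex;
    std::atomic<bool> bReplaying{false};
};

// Hypothetical stand-in for a sync scope: waits for any in-flight command,
// then keeps the pipe from replaying until the scope is destroyed.
class FToySyncScope {
public:
    explicit FToySyncScope(FToyPipe& Pipe) : Lock(Pipe.Mutex) {}

private:
    std::lock_guard<std::mutex> Lock;
};

// Hypothetical stand-in for the data the pipe updates.
struct FToyDynamicData { int Revision = 0; };

// Returns how many times the data changed underneath a reader holding the scope.
int RunDemo() {
    FToyPipe Pipe;
    FToyDynamicData Data;
    std::atomic<bool> bStop{false};

    // "Pipe" thread: keeps updating the data, as the enqueued update would.
    std::thread Writer([&] {
        while (!bStop.load()) {
            Pipe.Replay([&] { ++Data.Revision; });
            std::this_thread::yield();
        }
    });

    int Violations = 0;
    for (int i = 0; i < 200; ++i) {
        FToySyncScope Scope(Pipe);       // analogous to the scope in FlushWork
        assert(!Pipe.IsReplaying());     // analogous to the suggested check()
        const int Before = Data.Revision;
        std::this_thread::yield();       // give the writer a chance to interfere
        if (Data.Revision != Before) ++Violations;
    }

    bStop.store(true);
    Writer.join();
    return Violations;
}
```

In this model RunDemo() returns 0, because the writer can never run while a scope is alive. If the equivalent check fires in the engine, it would suggest the read is happening on a code path that is not covered by the sync scope.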
It might also be interesting to look at the Parallel Stacks window in Visual Studio the next time the crash occurs. This might tell you where the modification to DynamicData is coming from, assuming the thread modifying it hasn’t moved on very far since the crash occurred.
Hopefully this can help you debug and solve your issue,
Regards,
Lance Chaney