We have been seeing an intermittent crash in our project when the Recast nav mesh serializes. When we debugged it, it looked like the Recast nav mesh actor was being destroyed on the main thread while it was simultaneously being serialized on a worker thread. Essentially, the null pointer check for RecastNavMeshImpl in SerializeRecastNavMesh passes, but once execution is inside that function, something on the main thread calls Destroy on the nav mesh actor. ANavigationData::Destroyed then calls UnregisterAndCleanUp, which calls ARecastNavMesh::CleanUp, which calls DestroyRecastPImpl and deletes RecastNavMeshImpl out from under the async serialization.
We experimented with two different fixes for this:
Fix #1: In ARecastNavMesh::Serialize, before the else if (RecastNavMeshSizeBytes > 4) check, we added this block of code:
    else if (IsActorBeingDestroyed())
    {
        UE_LOG(LogNavigation, Warning, TEXT("%s: ARecastNavMesh: Navmesh destroyed prior to load. Skipping serialization. \n"), *GetFullName());
        CleanUpBadVersion();
    }
However, I suspect this narrows the timing window rather than closing it: the actor can still be destroyed right after the IsActorBeingDestroyed() check passes, so the crash is still possible in theory even if it rarely or never shows up in practice.
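To illustrate the concern, here is a minimal standalone sketch in plain C++ (not engine code) of what it would take to close the window rather than shrink it: the check and the use of the pointer happen under the same lock that the clean-up path takes. All names here (NavMeshOwner, NavMeshImpl, ImplMutex, DemoLockedSerialize) are hypothetical stand-ins, and we have not tried this in the engine itself.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical stand-in for the RecastNavMeshImpl payload.
struct NavMeshImpl {
    std::string Data = "tiles";
};

// Hypothetical stand-in for the actor that owns the impl.
class NavMeshOwner {
public:
    NavMeshOwner() : Impl(std::make_unique<NavMeshImpl>()) {}

    // Worker-thread side: the null check and the use share one lock,
    // so CleanUp cannot run in between them.
    bool Serialize(std::string& Out) {
        std::lock_guard<std::mutex> Lock(ImplMutex);
        if (!Impl) {
            return false; // already cleaned up, skip serialization
        }
        Out = Impl->Data;
        return true;
    }

    // Main-thread side: blocks until any in-flight Serialize has finished.
    void CleanUp() {
        std::lock_guard<std::mutex> Lock(ImplMutex);
        Impl.reset();
    }

private:
    std::mutex ImplMutex;
    std::unique_ptr<NavMeshImpl> Impl;
};

// Usage: serialization succeeds before CleanUp and is skipped after it.
inline void DemoLockedSerialize() {
    NavMeshOwner Owner;
    std::string Out;
    assert(Owner.Serialize(Out));
    Owner.CleanUp();
    assert(!Owner.Serialize(Out));
}
```

The trade-off in a sketch like this is that the main thread can stall in CleanUp for as long as a serialization is running.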
Fix #2: Remove the call to DestroyRecastPImpl from ARecastNavMesh::CleanUp. The ARecastNavMesh destructor destroys RecastNavMeshImpl anyway, so we leave it to the destructor, which should run at a thread-safe point. This also fixed the crash for us, and of the two it feels like the more comprehensive fix.
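The idea behind Fix #2 can be sketched in plain C++ (again not engine code, with hypothetical names such as NavMeshOwner, LiveImplCount and bRegistered): clean-up only unregisters, and the impl is freed solely by the owner's destructor. The sketch assumes the owner itself is never destroyed while a worker is still serializing, which it does not enforce.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Counts live impl instances so the example can show when the payload is freed.
inline int& LiveImplCount() {
    static int Count = 0;
    return Count;
}

// Hypothetical stand-in for the RecastNavMeshImpl payload.
struct NavMeshImpl {
    std::string Data = "tiles";
    NavMeshImpl() { ++LiveImplCount(); }
    ~NavMeshImpl() { --LiveImplCount(); }
};

// Hypothetical stand-in for the actor that owns the impl.
class NavMeshOwner {
public:
    NavMeshOwner() : Impl(std::make_unique<NavMeshImpl>()) {}

    // Mirrors Fix #2: clean-up unregisters but leaves Impl alone.
    void CleanUp() { bRegistered = false; }

    bool IsRegistered() const { return bRegistered; }

    // Still valid after CleanUp, because Impl lives until ~NavMeshOwner.
    std::string Serialize() const { return Impl->Data; }

private:
    bool bRegistered = true;
    std::unique_ptr<NavMeshImpl> Impl; // freed only by the destructor
};

// Usage: the impl survives CleanUp and is released with the owner.
inline void DemoDeferredDestroy() {
    {
        NavMeshOwner Owner;
        Owner.CleanUp();
        assert(Owner.Serialize() == "tiles");
        assert(LiveImplCount() == 1);
    }
    assert(LiveImplCount() == 0);
}
```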
Have you guys seen anything like this? Are there other fixes you would recommend?
Thanks!