What is the proper way to use Async Pathfinding?

I created a custom AI MoveTo node that uses the Async Path request instead of sync path requests. My current flow for using it is as follows:

  • ExecuteTask is called and I call the function FindPathAsync to receive a uint32 handle for the query result and store the result in the node memory block (non-instanced C++ native Task node)
  • In TickTask, I check each frame to see when the Async path is formed
  • In TickTask, if a query is pending and the uint32 handle is valid, I collect the NavPathSharedPtr and I call RequestMove() on the AI’s AIController which passes the path to the PathFollowingComponent
  • Sometimes (I will clarify later), when RequestMove is called, reallocation of memory for the multi-cast delegate FPathObserverDelegate ObserverDelegate on the FNavPathSharedPtr cached on the UPathFollowingComponent will fail on a memory allocation on an AddObserver call with a read access violation 0xFFFFFFFF, which I believe is uninitialized memory

To clarify on the sometimes, this only becomes a reliable repro if I crank my unit test stress test of many AI’s requesting and re-requesting Async paths constantly. If I don’t stress test, then this crash has a pretty low repro to almost non-existent rate with ~12 AI’s pathing. When the AI’s do path, they do path properly to the correct locations.

What I first want to know is if this is the proper way to incorporate Async pathing for AI’s? Is the flow of requesting an Async path and then calling RequestMove on the PathFollowingComponent with the FNavSharedPathPtr the intended way it is used? Or, could the problem be in the use/setup of the Async MoveTo BehaviorTask Node? I know C++ native nodes are non-instanced, so I’ve only used the memory pool for the node for any persistent data and I am not making changes to any Node member data itself.

I do have a way to test the Async pathfinding outside of the behavior node. I wrote some functions to draw a path based on NavPolyRefs (Async path doesn’t populate more than the start and end for path points and instead uses PolyRefs from the NavMesh). So, I can separate and test if needed.

I am happy to share code or Pseudo-code if it helps. Thanks for any advice.

Edit:
Adding some more information to this. It looks like the current way Async Path Queries are done is not thread-safe. It is accessing an Actor (UObject derivative outside of the main game-thread. When using CrowdAvoidance, they process the same data set and ocassionally allocate delegate memory outside of the main game-thread, when certain steering conditions hit.

Here is an image of the parallel stacks:

The For loop on the non-main-game thread mentions this itself:

for (FAsyncPathFindingQuery& Query : PathFindingQueries)
{
	// @todo this is not necessarily the safest way to use UObjects outside of main thread. 
	//	think about something else.
	const ANavigationData* NavData = Query.NavData.IsValid() ? Query.NavData.Get() : GetDefaultNavDataInstance(FNavigationSystem::DontCreate);

	// perform query
	if (NavData)
	{
		if (Query.Mode == EPathFindingMode::Hierarchical)
		{
			Query.Result = NavData->FindHierarchicalPath(Query.NavAgentProperties, Query);
		}
		else
		{
			Query.Result = NavData->FindPath(Query.NavAgentProperties, Query);
		}
	}
	else
	{
		Query.Result = ENavigationQueryResult::Error;
	}

	// @todo make it return more informative results (bResult == false)
	// trigger calling delegate on main thread - otherwise it may depend too much on stuff being thread safe
	DECLARE_CYCLE_STAT(TEXT("FSimpleDelegateGraphTask.Async nav query finished"),
		STAT_FSimpleDelegateGraphTask_AsyncNavQueryFinished,
		STATGROUP_TaskGraphTasks);

	FSimpleDelegateGraphTask::CreateAndDispatchWhenReady(
		FSimpleDelegateGraphTask::FDelegate::CreateStatic(AsyncQueryDone, Query),
		GET_STATID(STAT_FSimpleDelegateGraphTask_AsyncNavQueryFinished), NULL, ENamedThreads::GameThread);
}

My guess is that the Async queries were never tested rigorously while CrowdAvoidance was turned on. Would the best current workaround be to override the NavigationSystem and get rid of the reliance on accessing a UObject outside of the main game-thread?

2 Likes

For the very few that might care about this, I was able to fix this crash by putting a scoped critical section on any use of a NavPathSharedPtr that is also used as a part of a RequestMove. While FNavPathSharePtr is supposed to be thread-safe, its contents are certainly not, such as the PathCorridors. Fortunately, we only draw paths when the user opts into it to see a path preview, so the lock won’t hinder performance very much. I haven’t had much time to dive deeper into this but for those who also ran into this issue, this is a potential solution

TArray navPathPointsDoubleBuffer;
TArray navPolyRefsDoubleBuffer;
{
   check( IsInGameThread() );
   FScopeLock scopedLock( &PathPointPolyEdgeCorridorsCriticalSection );

   const TArray& navPathPoints = navPath->GetPathPoints();

   // According to their docs, FNavPathSharedPtr is valid when PathPoints is > 1
   constexpr int ValidPathResults = 1;
   const bool bHasValidPath = ( navPathPoints.Num() > ValidPathResults );
   if ( bHasValidPath == false )
   {
      return false;
   }

   const TArray& navPolyRefs = navMeshPath->PathCorridor;

   // Append takes care of Reserving Memory Blocks
   navPathPointsDoubleBuffer.Append( navPathPoints );
   navPolyRefsDoubleBuffer.Append( navPolyRefs );
}
3 Likes