I’ve been getting reports from a couple of projects having traversal stutters with Recast, and finally had a chance to reproduce. I tracked down the problem to Detour and also how Unreal uses Detour, and I’m offering a fix I developed that removes the issue entirely at a small memory cost.
I’m going to try to explain what happens as best as I can, but feel free to ask if you have questions.
This project is a World Partition project, and when a new section is streamed in, tile generators are launched in async tasks. Recast actually tries to spread this over various frames (generally every 3rd frame). Once the tile generation is complete, it calls FRecastNavMeshGenerator::AddGeneratedTilesAndGetUpdatedTiles() in the game thread to add the new generated tiles.
In this function it first checks if the tiles were fully regenerated and it removes all the tiles first if that’s the case:
if (TileGenerator.IsFullyRegenerated()) { // remove all layers ResultTileRefs = RemoveTileLayersAndGetUpdatedTiles(TileX, TileY, &OldLayerTileIdMap); }
Then it proceeds to go into a loop adding all the newly generated tiles, by calling FRecastNavMeshGenerator::AddGeneratedTileLayer().
Now, in FRecastNavMeshGenerator::AddGeneratedTileLayer(), Unreal first tries to remove the tile if it exists, then tries to add the new tile using an old reference to the removed tile. This is an optimization in Detour to avoid having to recreate the seed for the tile. If it fails to add it that way, then it tries to add the new tile without an old reference.
The issue we were running into is that dtNavMesh::addTile() can sometimes take inordinate amounts of time to add every tile. I tracked it down with the profiler to this bit of code:
while (tile && tile != target) { prev = tile; tile = tile->next; }
That code is fulfilling two tasks:
1) Ensure that the node is on the free list, thus still free.
2) Get the prev pointer in case it’s free and we need to unlink it from the list.
It looks pretty innocuous, and fast, but if you look at how the m_nextFree free nodes linked list is implemented in Detour, you’ll notice that as you remove tiles, the nodes get appended to the head of the list. So if you’re just deleting a single node, and then using the old ref to re-add it, it’s super fast, because the node you just removed is right there at the head of the free list.
Now, if you remove 5000 tiles and then try to do the old-ref match with the first node you removed, you’ll be traversing the entire free list to find it. Then when you do the same with the next tile, you do it all over again. Rinse and repeat. Noteworthy as well is that while the tiles array is in a block of contiguous memory, the free list links to arbitrary nodes, meaning they’re scattered through the memory block, so the traversal can also thrash the cache.
FRecastNavMeshGenerator::AddGeneratedTileLayer() does the right thing. It removes the node first, collects the old-ref then re-adds it. That’s the fast case. But if we go back to the TileGenerator.IsFullyRegenerated() condition, we’ll run into the problem that it’ll remove all the tiles first, bulk up the free list in Detour, then add the new ones individually, still trying to re-use, which triggers the worst case of dtNavMesh::addTile().
What I did to address this, is effectively turn the free list into a double-linked list, keeping a prev node, and reusing it also to mark if the node is still effectively free. This avoids the entire iteration through the free list, at the cost of an extra 8 bytes per tile node. The memory cost is negligible and completely eliminates the traversal stuttering. That said, if you wanted to turn this into only 4 bytes per node, you could store the node index instead at the expense of some extra pointer math.
Attached is the patch applied to Unreal 5.5. I marked the change with OPTIMIZE_FREELIST_LOOKUP.
Hope it helps!
Ernesto.