Race condition crash with incremental GC on

When we turn on incremental GC, we get a crash in ULevelStreaming::GetLevelStreamingStatus when it calls FindObjectFast. It appears that the "if (IsGarbageCollecting())" check evaluates to false, but by the time execution reaches FindObjectFast the UObject hash table is locked. We hit this fatal error in StaticFindObjectFast:

UE_CLOG(IsGarbageCollectingAndLockingUObjectHashTables(), LogUObjectGlobals, Fatal, TEXT("Illegal call to StaticFindObjectFast() while garbage collecting!"));

It looks like GetLevelStreamingStatus is being called from a worker thread, potentially at the moment garbage collection starts running, so it reads the GC state bools while they are in the middle of being written. I don’t think incremental GC is the problem itself; rather, it exposes the problem.
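
To make the suspected check-then-act window concrete, here is a minimal standalone C++ sketch. It is not engine code: FHashTableGate, ELookup, RacyLookup and GuardedLookup are hypothetical names invented for the illustration, and the Window callback only simulates what another thread might do between the check and the lookup. The racy version checks a flag and then acts on it later; the guarded version registers as a reader so that GC cannot take the lock until the lookup has finished.

```cpp
#include <cassert>
#include <functional>
#include <mutex>

// Hypothetical model of "GC has locked the UObject hash tables". Not an engine type.
class FHashTableGate
{
public:
    bool IsLockedByGC()
    {
        std::lock_guard<std::mutex> Lock(Mutex);
        return bLockedByGC;
    }
    // A reader may enter only while GC does not hold the tables.
    bool TryBeginRead()
    {
        std::lock_guard<std::mutex> Lock(Mutex);
        if (bLockedByGC) { return false; }
        ++Readers;
        return true;
    }
    void EndRead()
    {
        std::lock_guard<std::mutex> Lock(Mutex);
        assert(Readers > 0);
        --Readers;
    }
    // GC may lock the tables only while no reader is inside.
    bool TryBeginGC()
    {
        std::lock_guard<std::mutex> Lock(Mutex);
        if (bLockedByGC || Readers > 0) { return false; }
        bLockedByGC = true;
        return true;
    }

private:
    std::mutex Mutex;
    int Readers = 0;
    bool bLockedByGC = false;
};

enum class ELookup { SkippedDuringGC, Ok, IllegalDuringGC };

// Check-then-act: the check passes, then GC can start before the lookup runs.
ELookup RacyLookup(FHashTableGate& Gate, const std::function<void()>& Window)
{
    if (Gate.IsLockedByGC()) { return ELookup::SkippedDuringGC; }
    Window(); // simulated interleaving with another thread
    // Stand-in for the lookup: the engine would hit the Fatal log in this state.
    return Gate.IsLockedByGC() ? ELookup::IllegalDuringGC : ELookup::Ok;
}

// Check and lookup happen while registered as a reader, so GC cannot begin in between.
ELookup GuardedLookup(FHashTableGate& Gate, const std::function<void()>& Window)
{
    if (!Gate.TryBeginRead()) { return ELookup::SkippedDuringGC; }
    Window();
    const ELookup Result = Gate.IsLockedByGC() ? ELookup::IllegalDuringGC : ELookup::Ok;
    Gate.EndRead();
    return Result;
}
```

In this model, passing a Window that calls TryBeginGC() makes RacyLookup report the illegal state, while GuardedLookup completes normally because the GC attempt is refused until the reader leaves. Whether the engine can offer an equivalent guard for this call site is exactly the open question here.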

As a workaround we added this code in GetLevelStreamingStatus after the “if (IsGarbageCollecting())” check:

if (!LevelStreamingCVars::bShouldReuseUnloadedButStillAroundLevels)
{
    return LEVEL_Unloaded;
}

Since this var needs to be false for incremental GC to work properly anyway, this gives us an early out before trying to FindObjectFast, so it avoids the problem altogether.
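
As a rough illustration of why the early-out sidesteps the crash, the following standalone sketch models only the order of the checks. It is a simplification under stated assumptions, not the engine function: FModelState, EModelStatus and GetStatusModel are hypothetical names, the value returned while GC is running is an assumption of the model, and the real function handles more states than shown.

```cpp
#include <cassert>

// Hypothetical stand-ins; none of these are engine declarations.
enum class EModelStatus { Unloaded, UnloadedButStillAround };

struct FModelState
{
    bool bGarbageCollecting = false;   // models the IsGarbageCollecting() result
    bool bReuseUnloadedLevels = false; // models bShouldReuseUnloadedButStillAroundLevels
    bool bPackageInMemory = false;     // what the hash table lookup would report
    int LookupCount = 0;               // how many times the lookup was reached
};

EModelStatus GetStatusModel(FModelState& State)
{
    // Existing GC check (return value here is an assumption of this model).
    if (State.bGarbageCollecting)
    {
        return EModelStatus::Unloaded;
    }
    // Workaround from the post: leave before any hash table access.
    if (!State.bReuseUnloadedLevels)
    {
        return EModelStatus::Unloaded;
    }
    // Stand-in for the FindObjectFast call that can race with GC.
    ++State.LookupCount;
    return State.bPackageInMemory ? EModelStatus::UnloadedButStillAround
                                  : EModelStatus::Unloaded;
}
```

With bReuseUnloadedLevels left false, LookupCount stays at zero no matter when GC starts, which is why the race cannot be hit on this path. The race itself is still present whenever the flag is true.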

Do you have any insights into a better way to fix this issue?

Thanks,

Andy

Hi,

We don’t recommend using Incremental GC unless you are running a single-threaded server. Incremental GC is not thread safe and can cause all sorts of crashes. This is documented in the ‘Known Limitation’ section at the very bottom of this page: https://dev.epicgames.com/documentation/en-us/unreal-engine/incremental-garbage-collection-in-unreal-engine. At the moment, we have no work in progress to fix incremental GC. That said, if your fix works and you don’t notice any side effects, ship it! But on your customers’ PCs, with widely different capacities and hardware, the timing will be different and you may hit different race conditions leading to other crashes. So if you can run without incremental GC, that would be safer.

Regards,

Patrick

Hi,

I’m confident work will eventually be done for incremental GC, but that’s not in 5.6 or 5.7, and probably not in 5.8. Trying to guess priorities further out is gambling. We don’t have an exhaustive list of bugs or problems: since we run with it off (except maybe on single-threaded servers), we are not tracking issues.

Regards,

Patrick

Thanks for the response Patrick. It’s unfortunate to hear there is no work being done on incremental GC at the moment, considering how large of a perf boost it could give. But I guess that’s how it goes.

Do you guys have examples of known async work being done by the engine that can cause the incremental reachability race conditions you’re talking about?