Loading is stuck, flush will never finish

Hey Danny, thanks for the reply.

Our shipping server map load times are not close to the currently configured 120 second timeout- increasing this timeout without first verifying async loading is working as expected is not an option.

If I understand correctly the DumpState crash is expected if async loading is still active? Is there a known set of debugging instrumentation or logs that are valid to be called at any point during async loading so we can enable those in our prod environment?

Here is some state from one of our cores which leads me to think there’s something unexpected happening:

(lldb) f frame #3: 0x00000000052e35d5 Server-Linux-ShippingFAsyncLoadingThread2::FlushLoading(this=0x00007fa0c60b59c0, RequestIDs=TConstArrayView @ 0x000057172ced7e40) at AsyncLoading2.cpp:9991:7

(lldb) p PendingRequests
(TSet) size=1 {
[0] = -1
}

(lldb) p RequestIdToPackageMap
(TMap<int, FAsyncPackage2 *>) size=1 {
[0] = {
UE::Core::Private::Tuple::TTupleBase<TIntegerSequence<unsigned int, 0U, 1U>, int, FAsyncPackage2 *> = {
UE::Core::Private::Tuple::TTupleBaseElement<int, 0, 2> = (Key = -1)
UE::Core::Private::Tuple::TTupleBaseElement<FAsyncPackage2 *, 1, 2> = {
Value = 0x00007fa05ecaf520
}
}
}
}

(lldb) p RequestIDs
(TConstArrayView) {
DataPtr = 0x00007fa05313aa80
ArrayNum = 95
}
[ 119, 120, .., 213 ]

(lldb) p SyncLoadContext->RequestIDs
(TArray<int, TSizedInlineAllocator<4, 32> >) size=95 {
[ 119, 120, .., 213 ]

(lldb) p SyncLoadContext->RequestedPackages
(TArray<FAsyncPackage2 *, TSizedInlineAllocator<4, 32> >) size=95 {
ArrayNum = 95
[0] = nullptr
[1] = 0x00007fa05ecaf520

}

(lldb) p ((FAsyncPackage2*)0x00007fa05ecaf520)->Desc
(FAsyncPackageDesc2) {
RequestID = 120

while (IsAsyncLoadingPackages()) // → return PackagesWithRemainingWorkCounter != 0; (lldb) p PackagesWithRemainingWorkCounter == 12
{
bool bDidSomething = false;
EAsyncPackageState::Type Result = TickAsyncLoadingFromGameThread(*GameThreadState, false, false, 0.0, RequestIDs, bDidSomething);
if (RequestIDs.Num() != 0 && !ContainsAnyRequestID(RequestIDs)) <— 1. // ContainsAnyRequestID → return Algo::AnyOf(RequestIDs, [this](int32 RequestID) { return PendingRequests.Contains(RequestID); });
{
break; // 2. no break here
}

if (!bDidSomething)
{

if (LastActivity == 0.0) // LastActivity != 0

else if (FPlatformTime::Seconds() - LastActivity > GStallDetectorTimeout && ++IdleLoopCount > GStallDetectorIdleLoops) // haven’t bDidSomething for >120seconds and checked at >50 times
{

// TConstArrayView RequestIDs = .Num = 95, .Data = [ 119, 120, .., 213 ]
// TSet PendingRequests = .Num = 1, .Data = [ -1 ]

UE_LOG(LogStreaming, Fatal, TEXT(“Loading is stuck, flush will never finish”)); <— 3. rip
}
}`If I’m reading this correctly: at 1. both RequestIDs.Num > 0 && PendingRequest matches an element in RequestIDs but when we assert at 3. in the same loop PendingRequests has been changed which is suspicious.

While PendingRequests[0] = -1 and RequestIdToPackageMap[-1] = 0x00007fa05ecaf520, the package at 0x00007fa05ecaf520 Desc::RequestID = 120 which matches the SyncLoadContext->RequestID[1] = 120 and SyncLoadContext->RequestedPackages[1] = 0x00007fa05ecaf520. Is this a valid state for PendingRequests? Is SyncLoadContext->RequestedPackages[0] = nullptr expected?

Please let me know if I can provide anymore info,

-blake