Horde/UBA: improving stability

Hello,

We rely entirely on Horde/UBA to compile our project and run CI/CD, but we keep running into stability problems.

Some of them are:

fatal error C1356: unable to find mspdbcore.dll

// or

Unhandled Exception: System.IO.FileNotFoundException: The specified module could not be found. (Exception from HRESULT: 0x8007007E)
System.IO.FileNotFoundException: The specified module could not be found. (Exception from HRESULT: 0x8007007E)
ASSERT: Unhandled exception/crash. Suppress debugger startup and try to report issue instead. This message is here to hopefully see callstack
 CALLSTACK:
   uba::Detoured_CreateProcessW (UbaDetoursFunctionsKernelBase.inl:3052)
   KERNEL32.dll: +0x713f9
   KERNEL32.dll: +0x71e94
   KERNEL32.dll: +0x71996
   KERNELBASE.dll: +0x12c6f9
   ntdll.dll: +0xa58d8
   ntdll.dll: +0x8ce46
   ntdll.dll: +0xa296f
   ntdll.dll: +0x52554
   ntdll.dll: +0x522a7
   KERNELBASE.dll: +0x25369
   clr.dll: +0x1810a8
EXEC : error : Process 3492 VCTIP.EXE
fatal error C1356: unable to find mspdbcore.dll

Sometimes we also face agent stalls (the agent responds to pings but never finishes the job).

While users can retrigger compilation, these issues on build machines cause builds to fail. To mitigate this, we use some flags/build configurations such as -UBAForcedRetry or -UBAForcedRetryRemote.
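As an illustration of a CI-side fallback (this is not something UBA or Horde provides), a wrapper could retry the whole build step only when the log matches one of the error signatures quoted above. A minimal Python sketch; the function names `is_transient_failure` and `should_retry` and the limit of three attempts are hypothetical:

```python
import re

# Signatures copied from the errors quoted above; extend the list as needed.
TRANSIENT_PATTERNS = [
    re.compile(r"fatal error C1356: unable to find mspdbcore\.dll"),
    re.compile(r"HRESULT: 0x8007007E"),
]


def is_transient_failure(log_text: str) -> bool:
    """Return True if the log contains one of the known transient signatures."""
    return any(p.search(log_text) for p in TRANSIENT_PATTERNS)


def should_retry(exit_code: int, log_text: str, attempt: int,
                 max_attempts: int = 3) -> bool:
    """Retry only failed builds whose log matches a known transient signature.

    `attempt` is the 1-based number of the attempt that just finished.
    """
    if exit_code == 0:
        return False  # build succeeded, nothing to retry
    if attempt >= max_attempts:
        return False  # retry budget exhausted
    return is_transient_failure(log_text)
```

Genuine compile errors do not match the patterns, so they still fail the build on the first attempt instead of being retried.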

We are also considering dedicated agents, but we still want to leverage the compute power of our workstations. To reduce errors, we’ve already updated the agents and the Horde server, and reviewed the agent configurations.

We’ve also increased the page file size. Workstation users use Toolbox’s idle system to ensure enough resources are available.

Questions:

  • Is there anything we’ve missed to improve stability?
  • We plan to update to the latest engine release. Should we target a specific CL?
  • Because of our requirements, we use devenv in CI (and we have noticed that users see this as well): with -UBAForcedRetryRemote, even when the failed task is restarted, MSVC still considers the compilation as failed. Is there a way to avoid this?

Thank you for your help :smiley:

Hi,

Can you test the latest UBA binaries (found under Engine/Binaries/Win64/UnrealBuildAccelerator)? UBA itself is always supposed to be backward compatible. I recognize this VCTIP error from a year or so back and have a vague memory of fixing it. I think we have reached the point where we need to start separating UBA releases from engine releases, but we haven’t done that yet.

If that doesn’t help and you still run into problems, the next step is to build UBA from main in debug and run again to see if we can get more information about the issues.

We did that, but we still face issues on some agents. It’s unclear how it works and what uses these DLLs.

However, we plan to upgrade, which was mostly what the post was about: we would like to know whether there are major changes in non-“release” versions of the engine that you would recommend we bring into our branch.

Also, do you have more information about the retry flags?

Check out this thread, maybe the last answer can help you: [Content removed]

I have finally found this issue! (A colleague got a frequent enough repro.) I just submitted the fix, so it will end up on GitHub today (the change is in UbaDetoursFunctionsKernelBase.inl). I have kicked off binaries as well, so they should be on GitHub today too.

I’m glad to hear this; I will check it ASAP. We’ve moved our code base to 5.7.3, so it will be easier to pick up these changes. Thanks.
