The smallest change I can make to have things function again is to revert to the dxcompiler.dll from 5.7-preview. That single DLL change makes things function.
Further investigation performed:
Manually loaded the dll with a custom program on the remote agent machine - this works fine
Compared the dependencies of the dll; they are identical between 5.7-preview (the working version) and 5.7 (.0).
Intercepted syscalls using procmon, none of the LoadModule calls fail.
Because of all this, I currently suspect that the detoured version of LoadLibrary is causing the problem. (Invoked via FDllHandle::FDllHandle in DXCWrapper.cpp, via FWindowsPlatformProcess::LoadLibraryWithSearchPaths.)
There is clearly logging code in the UbaDetours module, but so far I’ve not been able to get it to work (I compiled this and other modules in debug mode to ensure logging is compiled in, enabled various cvars for log relaying, etc., but no luck). There is one bit of (to me) dubious code in FUbaJobProcessor::RunTaskWithUba where the “LogFile” gets set to the input file, but changing this to a different filename does not seem to help create a log file either.
Has anyone else seen this? Some help getting detours logging to work properly, or insights into other ways of debugging this, would be wonderful. At this point what I’m really after is the DEBUG_LOG_DETOURED output of the remote agent, to get a better sense of what’s going on internally there.
I do see some errors like this in our log files, but barring more detailed logging I’m not sure if this is a red herring:
LogUbaHordeAgent: Response [ExecuteOutput]: UbaSessionClient - DetourCreateProcessWithDllEx failed: "D:\P4Workspace\Engine\Binaries\Win64\ShaderCompileWorker.exe" <removed shader params> -Multiprocess (Working dir: D:\Horde\Work\Saved\Uba\empty\). Exit code: 8 - Not enough memory resources are available to process this command.
Steps to Reproduce
For us, on 5.7 (final, not preview), try to load EngineTest while having configured Horde (5.6 deploy) for UBA. When it starts dispatching remote shader compilation, it asserts due to failing to load dxcompiler.dll.
I presume you already found CVar `r.UbaController.LogVerbosity`, but if not, please set it to 2 to forward all UBA logs to UE_LOG. Then you should also run with a debug build of UbaAgent as this enables detoured logging. From what I can see in the code, the log file should have the same path as the input file (indeed not obvious why) plus `.log` extension and optionally some other suffixes for child processes.
Hi, we found a bug in UBA where it does not properly copy files from the system32 folder if they exist locally. UBA has a “known system files” list containing files that should not be copied, but the logic had a bug which made it skip other files in system32 as well (vcruntime is not a known system file). I’m working on a fix in the code right now, which will need new binaries.
You can do the fix locally if you want to test it out. Open UbaSessionClient.cpp
In case it helps find the culprit code, this also occurs when compiling shaders distributed with SN-DBS without Horde (with the exact same assertion and log).
Yeah that’s our workaround, but I’d like to find the root cause of the issue to actually use the latest (and I suspect Epic needs to fix this for others too).
FWIW: I just recalled that our dxcompiler.dll as well as ShaderConductor.dll are now compiled with Clang by default instead of MSVC. Are you able to rebuild dxcompiler.dll from UE5.7 source? If so, you can verify whether this is causing the issue by running this command:
Engine/Source/ThirdParty/ShaderConductor/Build_ShaderConductor_Win64.bat -msvc
You’ll need CMake 3.17+ (not compatible with CMake 4 at the moment unfortunately), Python 3, and Visual Studio Active Template Library (ATL).
For A/B comparison, run this command without the `-msvc` argument to compile with Clang compiler.
Tried compiling with -msvc and it doesn’t seem to make a difference. Didn’t really expect there to be any because I already verified all dependencies of the dll and they all checked out.
FWIW I’ve added debug code to LoadLibraryWithSearchPaths that reads the dll’s imports and verifies they are all valid on the target machine. This doesn’t result in any missing imports.
[mention removed] can you confirm you can have detour logging redirected and working for yourself? Just trying to rule out if it is me doing something wrong or if it doesn’t work on your side either.
I built UbaAgent, UbaCli, UbaDetours, and UbaHost modules in Debug mode (maybe only UbaAgent is necessary, but I haven’t double checked). I also enabled CVar `r.UbaController.ProcessLogEnabled=true` and now I can see the UBA agent logs show up under %LOCALAPPDATA%\Temp\UbaControllerStorageDir\0\sessions\<SESSION>\log. These log files have (unintuitively) the same filename as the agent input files, e.g. 0.uba.j-4.in (“j-4” meaning there were 4 shader compiler jobs encoded). This file looks something like this:
Detached: true
RPC_MESSAGE Init
ProcessId: 1
CmdLine: "<REDACTED>\Engine\Binaries\Win64\ShaderCompileWorker.exe" "C:/Users/<USER>/AppData/Local/Temp/UbaControllerWorkingDir/<GUID-OR-HASH>/0/" 45780 0 "0.uba.j-4.in" "0.uba.j-4.out" -Multiprocess
WorkingDir: <REDACTED>\Engine\Binaries\Win64\
ExeDir: <REDACTED>\Engine\Binaries\Win64\
ExeDir (actual): <REDACTED>\Engine\Binaries\Win64\
SystemTemp: C:\Users\<USER>\AppData\Local\Temp\UbaControllerStorageDir\0\sessions\<SESSION>\temp
Rules: 27 (171384111)
RPC_MESSAGE GetSharedMemory
T GetEnvironmentVariableA OANOCACHE -> NOTFOUND
T GetEnvironmentVariableA OAPERUSERTLIBREG -> NOTFOUND
T GetEnvironmentVariableA OACACHEPARAMS -> NOTFOUND
T RegOpenKeyExA 0 (SOFTWARE\Microsoft\OLEAUT) -> Error
D GetModuleFileNameW 0 260 (<REDACTED>\Engine\Binaries\Win64\ShaderCompileWorker.exe) -> 62
T NtQueryInformationProcess (class 37) 18446744073709551615 -> Success
T CreateMutexEx Local\SM0:24200:304:WilStaging_02
T NtClose 516 (UNKNOWN) -> Success
T NtClose 512 (UNKNOWN) -> Success
T NtClose 508 (UNKNOWN) -> Success
...
I hope that works for you. Please note that we’ll be on winter break and I’m OOO starting tomorrow.
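For anyone sifting through these process logs, here is a minimal sketch of a filter for the call lines in the excerpt above. It assumes each call line has the shape `<letter> <FunctionName> ... -> <result>`, which is only inferred from the sample; the names `TraceCall`, `ParseTraceLine` and `FindCallsWithResult` are hypothetical and not part of UBA.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One detoured call as it appears in the log excerpt (layout inferred, not documented).
struct TraceCall {
    char tag = 0;          // leading single-letter column ("T" or "D" in the excerpt)
    std::string function;  // e.g. "RegOpenKeyExA"
    std::string result;    // text after the last " -> ", empty if the line has none
};

// Returns false for lines that are not call lines, such as "RPC_MESSAGE Init".
bool ParseTraceLine(const std::string& line, TraceCall& out) {
    if (line.size() < 3 || line[1] != ' ' || line[2] == ' ') return false;
    out.tag = line[0];
    const std::string::size_type nameEnd = line.find(' ', 2);
    out.function = line.substr(
        2, nameEnd == std::string::npos ? std::string::npos : nameEnd - 2);
    const std::string arrow = " -> ";
    const std::string::size_type pos = line.rfind(arrow);
    out.result = (pos == std::string::npos) ? std::string()
                                            : line.substr(pos + arrow.size());
    return true;
}

// Collects every call line whose result text equals `wanted` (for example "Error").
std::vector<TraceCall> FindCallsWithResult(const std::vector<std::string>& lines,
                                           const std::string& wanted) {
    std::vector<TraceCall> hits;
    for (const std::string& line : lines) {
        TraceCall call;
        if (ParseTraceLine(line, call) && call.result == wanted) hits.push_back(call);
    }
    return hits;
}
```

Feeding it the lines of a log file and asking for `"Error"` would surface entries like the `RegOpenKeyExA` line in the excerpt, which makes it easier to spot a failing call near a LoadLibrary attempt.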
I tracked down the issue and I do think it has to do with the newly built dxcompiler.dll. The newer C++ runtime dlls provide exports the old ones do not, and it looks like the newer dxcompiler.dll makes use of these.
The newly compiled dxcompiler.dll in 5.7 requires VC_redist.x64.exe of 17.8 or newer. This is not a missing DLL; instead, something goes wrong at init time with older runtimes.
UE5 (5.7) itself (shadercompiler.exe, etc) and dxcompiler.dll from 5.6 all happily work with the latest redist I can download for VS2015+ (v17.2).
For more detailed version numbers, 14.36.32532 (17.6) does not work for me but 14.38.33142 (17.8) does work.
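To make this failure easier to diagnose on an agent, the version comparison could be sketched as below. This is only an illustration: it treats 14.38.33142 as the minimum because that is the oldest runtime reported to work in this thread, not an official requirement, and the helper names are hypothetical. How the installed version string is obtained on the agent is left out.

```cpp
#include <array>
#include <cassert>
#include <cstdio>
#include <string>

// Three-part runtime version, e.g. 14.38.33142 (major.minor.build).
using RuntimeVersion = std::array<int, 3>;

// Assumption: oldest runtime observed to work with the 5.7 dxcompiler.dll in this thread.
const RuntimeVersion kAssumedMinimum = {{14, 38, 33142}};

// Parses exactly "major.minor.build"; returns false for any other shape.
bool ParseRuntimeVersion(const std::string& text, RuntimeVersion& out) {
    char trailing = 0;
    // sscanf returns the number of converted fields; 4 would mean trailing characters.
    const int fields = std::sscanf(text.c_str(), "%d.%d.%d%c",
                                   &out[0], &out[1], &out[2], &trailing);
    return fields == 3 && out[0] >= 0 && out[1] >= 0 && out[2] >= 0;
}

// True if the installed runtime is at least the assumed minimum.
// std::array compares lexicographically, which matches version ordering here.
bool IsRuntimeNewEnough(const std::string& installed) {
    RuntimeVersion version{};
    return ParseRuntimeVersion(installed, version) && version >= kAssumedMinimum;
}
```

With the two versions quoted above, `IsRuntimeNewEnough("14.36.32532")` is false and `IsRuntimeNewEnough("14.38.33142")` is true, so a check like this could log a clear warning before dxcompiler.dll is loaded instead of failing at init time.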
Does this also mean that UBA doesn’t actually detour the c++ runtimes? Meaning that the local executing environment isn’t entirely mirrored?
Hey, yes - as per my last post of the 18th of December that’s what we ran into as well, including some of the surprising findings how UBA handles this. I assume for performance reasons UBA doesn’t detour the VC runtimes, but maybe something can be done to make error handling around this more robust? I guess Epic manually update the VC runtimes on their UBA build farm as part of software management processes on the agents (e.g., ansible or the like).