DXGI_ERROR_DEVICE_REMOVED while building HLOD (on Windows 11?)

Hello!

We have three builders for building HLODs for one of our projects.

They used to all work flawlessly. Then the project finished, was not touched for a while, and now we’re doing a port to a different platform.

We need to rebuild the HLODs with different settings for this, so I reactivated our TeamCity HLOD build configuration. However, only one of the three builders is now able to fully build the HLODs. On the other two builders, the build process always crashes halfway through with a DXGI_ERROR_DEVICE_REMOVED error. Full error:

LogD3D12RHI: Error: Resource->Map(0, ReadRange, &ResourceBaseAddress) failed
 at D:\BuildAgent\work\e31dbb7ced8a18bf\Engine\Source\Runtime\D3D12RHI\Private\D3D12Resources.h:282
 with error DXGI_ERROR_DEVICE_REMOVED

LogD3D12RHI: Error: GPU crash detected:

 - Device 0 Removed: DXGI_ERROR_DEVICE_REMOVED


LogRHI: Error: Active GPU breadcrumbs:



 Device 0, Pipeline Graphics: (In: 0x800170b0, Out: 0x800170a8)

  No breadcrumb nodes found for this queue.



 Device 0, Pipeline AsyncCompute: (In: 0x00000000, Out: 0x00000000)

  No breadcrumb nodes found for this queue.


LogD3D12RHI: Error: Shader diagnostic messages and asserts:



 Device: 0, Queue 3D:

  No shader diagnostics found for this queue.



 Device: 0, Queue Copy:

  No shader diagnostics found for this queue.



 Device: 0, Queue Compute:

  No shader diagnostics found for this queue.


LogD3D12RHI: Error: DRED: No breadcrumb head found.
LogD3D12RHI: Error: DRED: No PageFault data.
LogD3D12RHI: Error: Video Memory Stats from frame ID 227:
LogD3D12RHI: Error:  Local Budget: 64702.85 MB
LogD3D12RHI: Error:  Local Used: 6865.26 MB
LogD3D12RHI: Error:  System Budget:   0.00 MB
LogD3D12RHI: Error:  System Used:   0.00 MB
LogD3D12RHI: Error: GPU Crashed or D3D Device Removed.


Check log for GPU state information.
Took 452,04s to run UnrealEditor-Cmd.exe, ExitCode=3
********** World Partition HLOD Build Command FAILED **********

The breadcrumbs vary a lot between crashes. Sometimes there are none, like above, sometimes there’s quite a lot of output.

Physically, the three build agents are still exactly the same as when they were all able to build HLODs successfully. They are also identical to each other (same CPU, RAM, etc.). I should also point out that none of the three builders has a dedicated GPU, and the build runs through TeamCity with the agent running as a service, so Unreal is running completely headless.

The only change I can point to between the builds finishing fine a couple of months ago and failing now is that two of the build agents were updated to Windows 11. The one build agent that can still complete the build is still on Windows 10.

Any ideas what could be going on here?

Ciao, Daniel!


Hello!

Have you validated that the software GPU driver is available on the Windows 11 nodes? Check the log for “Chosen D3D12 Adapter Id” to confirm that the “Microsoft Basic Display Adapter” is found and used.
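A minimal sketch of that log check in Python, in case it helps to automate it on the build agents. It only does a case-insensitive substring search for the adapter-selection text and for "Microsoft Basic". The exact wording of the engine's log lines is not shown in this thread, so the search strings are assumptions, and `find_adapter_lines` and `print_adapter_lines` are hypothetical helper names.

```python
from typing import Iterable, List

# Substrings to look for (assumed wording, matched case-insensitively).
NEEDLES = ("chosen d3d12 adapter", "microsoft basic")


def find_adapter_lines(log_lines: Iterable[str]) -> List[str]:
    """Return the log lines that mention the chosen adapter or a Microsoft Basic adapter."""
    hits = []
    for line in log_lines:
        lowered = line.lower()
        if any(needle in lowered for needle in NEEDLES):
            hits.append(line.rstrip("\r\n"))
    return hits


def print_adapter_lines(log_path: str) -> None:
    """Print every matching line from the log file at log_path."""
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        for hit in find_adapter_lines(handle):
            print(hit)
```

Calling `print_adapter_lines` with the path to a builder's log and comparing the output between a Windows 10 and a Windows 11 node should show whether the same adapter is being selected on both.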

Please share a full log of a failure for analysis.

Regards,

Martin


You are welcome!

I did find some additional information that you will want to try. We use a similar setup when building our HLODs (i.e. no GPU on the nodes), but we use the Windows WARP driver. You can force the editor to use it by adding the -warp argument.
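If the build step is driven from a script, one way to add the argument is sketched below in Python. The helper `with_flag` is hypothetical, and the base command line is a placeholder: `MyProject.uproject` and the other arguments are assumptions that should be replaced with whatever your TeamCity step actually runs. Only the `-warp` argument itself comes from this thread.

```python
from typing import List, Sequence


def with_flag(args: Sequence[str], flag: str) -> List[str]:
    """Return a copy of args with flag appended, unless it is already present (case-insensitive)."""
    result = list(args)
    if flag.lower() not in (arg.lower() for arg in result):
        result.append(flag)
    return result


# Placeholder command line (assumption): substitute the real HLOD build arguments.
base_command = [
    "UnrealEditor-Cmd.exe",
    "MyProject.uproject",
    "-run=WorldPartitionBuilderCommandlet",
]

# Append -warp so the editor is asked to use the WARP software adapter.
command = with_flag(base_command, "-warp")
print(" ".join(command))
```

Appending the flag only when it is missing keeps the helper safe to call from several build steps that share the same base command.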

Regards,

Martin


First of all - thank you for always pointing my nose in the correct direction. I really appreciate the great support!

I checked and compared the WorldPartitionBuilderCommandlet logs on both the builder that works, and a builder where it doesn’t work.

  • On both builders, there are two Microsoft Basic Render Driver adapters.
  • On both builders, the first one is picked, and D3D12 with SM6 is attempted.

That’s where the commonalities end.

First, on the builder that works: the adapter is reported to support at most SM5 under D3D12, so the editor falls back to D3D11 instead. The log output is this:

LogRHI: Using Default RHI: D3D12
LogRHI: Using Highest Feature Level of D3D12: SM6
LogRHI: Loading RHI module D3D12RHI
LogRHI: Checking if RHI D3D12 with Feature Level SM6 is supported by your system.
LogD3D12RHI: Adapter only supports up to Feature Level 'SM5', requested Feature Level was 'SM6'
LogRHI: RHI D3D12 with Feature Level SM6 is not supported on your system, attempting to fall back to RHI D3D11 with Feature Level SM5
LogRHI: Loading RHI module D3D11RHI
LogRHI: Checking if RHI D3D11 with Feature Level SM5 is supported by your system.
LogRHI: RHI D3D11 with Feature Level SM5 is supported and will be used.

On the builders where it doesn’t work, this happens instead:

LogRHI: Using Default RHI: D3D12
LogRHI: Using Highest Feature Level of D3D12: SM6
LogRHI: Loading RHI module D3D12RHI
LogRHI: Checking if RHI D3D12 with Feature Level SM6 is supported by your system.
LogRHI: RHI D3D12 with Feature Level SM6 is supported and will be used.
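To spot this difference automatically across builders, a small Python sketch like the following can pull the selected RHI and feature level out of a log. It relies only on the "is supported and will be used" line quoted in the two excerpts above; `selected_rhi` is a hypothetical helper name, and other engine versions may word the line differently.

```python
import re
from typing import Iterable, Optional, Tuple

# Matches the final selection line from the log excerpts quoted in this thread.
_SELECTED = re.compile(
    r"LogRHI: RHI (\S+) with Feature Level (\S+) is supported and will be used"
)


def selected_rhi(log_lines: Iterable[str]) -> Optional[Tuple[str, str]]:
    """Return (rhi, feature_level) from the first selection line, or None if there is none."""
    for line in log_lines:
        match = _SELECTED.search(line)
        if match:
            return match.group(1), match.group(2)
    return None


working_builder = [
    "LogRHI: RHI D3D12 with Feature Level SM6 is not supported on your system, "
    "attempting to fall back to RHI D3D11 with Feature Level SM5",
    "LogRHI: RHI D3D11 with Feature Level SM5 is supported and will be used.",
]
failing_builder = [
    "LogRHI: Checking if RHI D3D12 with Feature Level SM6 is supported by your system.",
    "LogRHI: RHI D3D12 with Feature Level SM6 is supported and will be used.",
]

print(selected_rhi(working_builder))  # ('D3D11', 'SM5')
print(selected_rhi(failing_builder))  # ('D3D12', 'SM6')
```

A check like this could run at the start of the build log to flag any agent that ends up on D3D12 with SM6.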

I wanted to know where this discrepancy comes from, so I looked at the feature level identification code in Unreal and compared the supported features of the adapters on Windows 10 and Windows 11.

The driver on Windows 11 is indeed newer (UMDVersion 10.0.26100.5074 vs 10.0.19041.4355) and reports more supported features. For example, D3D_SHADER_MODEL_6_8 is supported on Windows 11 vs. D3D_SHADER_MODEL_6_2 on Windows 10.

The crucial difference however seems to be that Windows 10 doesn’t have Atomic64 support, as AtomicInt64OnTypedResourceSupported = FALSE, while Windows 11 does support it.

This doesn’t explain why the build with the D3D12 adapter is crashing, however. Maybe the support check is insufficient and there is some feature missing, which then trips it up during the HLOD build.

In any case, as a workaround, I’m now forcing D3D11 for the HLOD build on all builders. The HLOD builds on all builders have now completed successfully.

Thank you once more for your great support,

Ciao, Daniel!
