I’m wondering if there are any known workarounds to stop the game crashing and terminating when the ‘removed device’ DirectX error code is received during a rendering operation?
This looks like it exits harshly by design in the code, and a fix would require restoring rendering resources after this happens. There is a comment suggesting this is known, at least for textures and the swap chain, in D3D11Util.cpp @ 190 or so (commit ed100718a12140d948a34d0c247f0f67d1a282c6).
Although it’s an unusual case and difficult to recover from smoothly, there is no strict reason why software can’t handle this error a little more gracefully (and not lose the player’s progress)… even if we need to stall for a bit to reinitialise and re-upload things.
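To make the "stall, reinitialise and re-upload" idea concrete, here is a minimal platform-neutral sketch. It is not the engine's code and uses no real D3D calls: `FakeDevice`, `ResourceRegistry`, `PresentResult` and `renderFrame` are all hypothetical names invented for illustration. The only assumptions are that the registry keeps a CPU-side copy of every resource and that the backend can report a lost device (for D3D11 that would presumably mean mapping `DXGI_ERROR_DEVICE_REMOVED` onto `DeviceLost`).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical result code; a real backend would translate its own errors.
enum class PresentResult { Ok, DeviceLost };

// Stand-in for a GPU device. Losing the device invalidates every handle.
class FakeDevice {
public:
    std::uint32_t upload(const std::vector<std::uint8_t>& bytes) {
        const std::uint32_t handle = nextHandle_++;
        live_[handle] = bytes.size();
        return handle;
    }
    bool isLive(std::uint32_t handle) const { return live_.count(handle) != 0; }
    void simulateRemoval() { live_.clear(); lost_ = true; }
    void reinitialise() { lost_ = false; }
    PresentResult present() const {
        return lost_ ? PresentResult::DeviceLost : PresentResult::Ok;
    }
private:
    std::unordered_map<std::uint32_t, std::size_t> live_;
    std::uint32_t nextHandle_ = 1;
    bool lost_ = false;
};

// Keeps a CPU-side copy of each resource so it can be uploaded again.
class ResourceRegistry {
public:
    explicit ResourceRegistry(FakeDevice& device) : device_(device) {}

    void add(const std::string& name, std::vector<std::uint8_t> bytes) {
        Entry& entry = entries_[name];
        entry.cpuCopy = std::move(bytes);
        entry.handle = device_.upload(entry.cpuCopy);
    }
    // Re-upload every resource and replace the stale handles.
    void restoreAll() {
        for (auto& kv : entries_) {
            kv.second.handle = device_.upload(kv.second.cpuCopy);
        }
    }
    bool allLive() const {
        for (const auto& kv : entries_) {
            if (!device_.isLive(kv.second.handle)) return false;
        }
        return true;
    }
    std::size_t count() const { return entries_.size(); }
private:
    struct Entry {
        std::vector<std::uint8_t> cpuCopy;
        std::uint32_t handle = 0;
    };
    FakeDevice& device_;
    std::unordered_map<std::string, Entry> entries_;
};

// Returns true if the frame was presented, possibly after one recovery pass.
bool renderFrame(FakeDevice& device, ResourceRegistry& resources,
                 const std::function<void()>& saveProgress) {
    if (device.present() == PresentResult::Ok) return true;
    saveProgress();          // last chance to persist the player's state
    device.reinitialise();   // the stall happens here
    resources.restoreAll();  // re-upload from the CPU-side copies
    assert(resources.allLive());
    return device.present() == PresentResult::Ok;
}
```

The point of the sketch is only the shape of the control flow: a lost device triggers a save hook and a rebuild rather than an exit. Keeping CPU-side copies costs memory, so a real implementation might reload from disk instead.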
I noticed several ‘fixed’ issues for the ‘hung’ case - where specific reproduction cases have been handled:
My reproduction case is unfortunately not something I can share… however I will say it is reproducible when switching power config on the machine in question. Whilst this is quite unusual, it’s also not something I expect to bring a game down (because we can recreate rendering resources in principle) - especially with no good last chance to save some data.
It’s possibly a driver issue. Altering the power config of the machine does not prevent the issue, even with power management disabled and the documented debug registry settings applied to try and force it to never restart the hardware.
Comparing other engines, the problem doesn’t occur with Unity, although it does with Frostbite.
FWIW, it doesn’t happen in my own bedroom code either. Having to restore things on mobile platforms (and other platforms with higher standards) made it quick to implement on all platforms, because I’m pathological about not repeating code and about avoiding platform dependencies that aren’t strictly necessary. I’m not expecting a quick fix here as a result, though: from seeing the engine source code, I can appreciate that this has the look and feel of a substantially larger task, due to the architectural choices and code repetition.
I was hoping this might be gone with the DX12 implementation, but the logic in D3D12Util.cpp around line 210 (same revision) looks functionally identical.