D3D device being lost: UE-42280

This issue affects cooked builds and projects in Editor, on desktops and laptops alike. I have read countless posts reflecting hundreds of affected users, and I see that Epic has acknowledged the bug since February 2017, with a solution promised for each engine version since 4.15. I spent several months chasing windmills trying to break loose from UE4 crashing every single time:

  • tested across projects
  • tested across engines
  • disconnected from wall current, held power button 20 secs
  • reset Windows
  • fresh install of UE4
  • many hours working with Nvidia TS
  • many hours working with Microsoft TS
  • many hours working with Boxx (manufacturer) TS

Nvidia and Boxx techs reviewing performance results from Kombustor and MemTest86 were satisfied I wasn’t dealing with any hardware issues. This problem only occurs in UE4, so UE4 is clearly the trigger here. I can only speak for this PC and these projects, but it would be useful if others compared notes and sorted for what’s in common. Hopefully, Epic techs learn something that puts them on the right track.

I encounter this crash in a cooked build at various points in the game, but what’s interesting in Editor is that simply opening a project and letting Editor sit there with no input from me is enough. I can hear the fans throttling and see GPU-Z showing my GTX 1080 working, albeit at perfectly acceptable levels for my machine (and well below what Kombustor demands), and this static state leads to a crash. No info from Nvidia Aftermath has any hope of shedding light on what’s going on. All we (Microsoft techs, Nvidia, Boxx) can see is the expected response to something happening in the GPU: Windows sees the freeze and steps in by resetting the graphics driver, namely TDR (Timeout Detection and Recovery). This information isn’t useful for understanding the problem; it is normal behavior in response to some deeper problem within UE4.
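For anyone comparing notes, it may help to record which TDR settings each affected machine is running with. Below is a minimal Python sketch, assuming the documented Windows registry location for TDR (HKLM\SYSTEM\CurrentControlSet\Control\GraphicsDrivers) and the documented defaults of TdrLevel 3, TdrDelay 2 seconds and TdrDdiDelay 5 seconds. The function names are my own, made up for illustration. It only reads values and changes nothing, and I’m not claiming that altering these values fixes this bug.

```python
import sys

# Registry key where Windows looks for TDR overrides (read-only use here).
TDR_KEY = r"SYSTEM\CurrentControlSet\Control\GraphicsDrivers"

# Assumption: documented Windows defaults that apply when a value is absent.
TDR_DEFAULTS = {"TdrLevel": 3, "TdrDelay": 2, "TdrDdiDelay": 5}


def read_tdr_overrides():
    """Return TDR values explicitly set in the registry.

    Returns an empty dict when nothing is overridden or when not on Windows.
    """
    if sys.platform != "win32":
        return {}
    import winreg  # Windows-only standard library module

    found = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, TDR_KEY) as key:
        for name in TDR_DEFAULTS:
            try:
                value, _value_type = winreg.QueryValueEx(key, name)
                found[name] = value
            except FileNotFoundError:
                # Value not present, so the Windows default applies.
                pass
    return found


def effective_tdr_settings(overrides):
    """Merge registry overrides on top of the assumed defaults."""
    settings = dict(TDR_DEFAULTS)
    settings.update(overrides)
    return settings


if __name__ == "__main__":
    overrides = read_tdr_overrides()
    for name, value in effective_tdr_settings(overrides).items():
        source = "registry" if name in overrides else "default"
        print(f"{name} = {value} ({source})")
```

Posting that output alongside GPU model, driver version and engine version would give one more column to sort on when looking for what affected machines have in common.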

What’s truly perplexing, but possibly a clue, is a workaround I found in a post by a user who wasn’t experiencing this issue but reported, “whenever I encounter a problem like this in UE4 90% of the time the problem goes away when I remove the gpu and memory, clean the contacts with alcohol and reconnect them.” I was out of ideas, so what the hell, I’ll try anything. Stability in Editor and packaged builds was restored. Okay, that’s just one data point, but three months later I’m encountering the same crashing behavior again. This time I was more methodical, taking care not to introduce more than one variable at a time. I began by running MemTest86: 0 errors. I then removed four RAM DIMMs, wiped the contacts with alcohol, and reseated them. I tested across projects and a packaged build, and all seems stable. Go figure. What does this say about conditions? What could theoretically explain why draining the power doesn’t fix what physically reseating the DIMMs does (or might; that’s just two data points now, but I can’t dismiss the error going away a second time)? Furthermore, what is happening in UE4 to produce that condition, and what else is (mis)happening in the engine? This now seems memory related. Does this help?

Part of the problem in chasing this issue down, I’m told, is that UE4 devs have their hands tied if they can’t reproduce the problem on their end. I send a project file that is giving me fits, and it works fine on their end. Nonetheless, with hundreds of users encountering this error, albeit under different conditions (system resources, GPU and RAM load per game, etc.), they acknowledge this as a bug, yet since February 2017 and UE4 4.15 they can’t seem to make any progress. What I honestly find discouraging is that I submitted a bug report and have been in communication with a dev, but get only one-line responses: we’re working on it, will let you know. So far I have not been told the first thing. I even offered to make my PC available for testing, giving them a machine that exhibits the issue to get around the reproducibility problem. I still get the same one-line responses. Whining gets us nowhere. I’m sharing my experience solely to feed the system what’s needed to hopefully make a little more sense of the ultimate causes, as well as to light a flame. Does anyone see a pilot light?

Thanks!

Addendum: Things were stable for about an hour, both in the packaged game and in Editor. Now the crashing is back in both. I’ll now remove both the memory and the GPU and will report my findings.

Hello,

We’ve recently made a switch to a new bug reporting method using a more structured form. Please visit the link below for more details and report the issue using the new Bug Submission Form. Feel free to continue to use this thread for community discussion around the issue.

https://epicsupport.force.com/unrealengine/s/

Thanks