We recently started experiencing a frequent GPU crash in our game builds and could use some help identifying likely causes and possible fixes. We are currently running a round of internal testing to try to isolate the problem, and will post further details if it yields results.
We are using a fork of the 5.3.2 release with minimal changes. The crash occurs in game builds for Windows, currently built in the Development configuration.
This crash seems to reliably affect our target machines using ADA 6000 GPUs, but we aren’t able to reproduce it on workstations using RTX 4090 GPUs.
Nsight description of the crash:
- MMU fault detected during a GPU memory Read of a destroyed unnamed resource or other resource(s) at address 0x000000XXXXXXXXXX.
- There are no debug names found for the resources in the Page Fault Resource History list.
We can reproduce the crash fairly reliably in either of two ways:
- Running the build with automated level switching in a cycle until it crashes
- Loading directly into a specific level and looping its sequence until it crashes
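For anyone who wants to time the repro, the relaunch loop can be sketched as a small external script. This is only an illustration and not our actual tooling: the executable path and map name are placeholders, the helper names are made up for this post, and it assumes that a GPU crash surfaces as a nonzero process exit code and that the `-gpucrashdebugging` flag is honoured by the build in the same way as in the stock engine.

```python
import subprocess
import time


def build_launch_args(exe_path, map_name, extra_flags=("-gpucrashdebugging",)):
    """Return the argument list for one run of the packaged build."""
    # exe_path and map_name are placeholders; extra_flags are an assumption
    # about which diagnostics switches the build accepts.
    return [exe_path, map_name, *extra_flags]


def run_until_failure(args, max_runs=50, timeout_s=3600):
    """Relaunch the build until it exits with a nonzero code.

    Returns a list of (run_index, exit_code, seconds) tuples, one per run.
    """
    results = []
    for run_index in range(1, max_runs + 1):
        start = time.monotonic()
        try:
            exit_code = subprocess.run(args, timeout=timeout_s).returncode
        except subprocess.TimeoutExpired:
            exit_code = None  # run outlived the timeout without crashing
        elapsed = time.monotonic() - start
        results.append((run_index, exit_code, elapsed))
        if exit_code not in (0, None):
            break  # nonzero exit: treat it as a crash and stop relaunching
    return results
```

Calling `run_until_failure(build_launch_args("OurGame.exe", "SomeLevel"))` with the real path and map would give a rough time-to-crash per run, which could make it easier to compare the ADA 6000 and RTX 4090 machines.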
The levels consist of a Nanite and Lumen environment with several skeletal meshes performing looping animations driven by Sequencer. Some of these emit Niagara particles from their skeletal mesh.
We have determined the following from testing:
- Crash persists even after removing all Nanite environment meshes and Niagara effects from the level
- Not likely related to GPU skinning as this is not enabled in our project
The development team suspects it’s related to:
- A caching issue when loading data from the previous level
- Data being corrupted by the load between scenes
Any insight you can provide would be appreciated. I will share more information on the issue as I receive it from the team.
Thanks,
Thor