Hello,
I have some updates from my colleague, who is currently improving the crash minidump generation process. To test the crash handling, he overwrites random bytes in the game's mapped memory. Interestingly, 1 out of 10 of the resulting crashes is reported as a GPU crash. The GPU report looks like the ones from the crash hunt sessions: no DRED, no Aftermath, no active shaders or resources, although all of them are enabled. During the testing, he also got a lot of crashes originating in the Nvidia driver. However, subsequent sessions without any artificial memory overwrites hit the same Nvidia driver crashes after several seconds of running the game. This state persisted until he cleared the PSO cache, after which the game became stable again. This behaviour leads us to conclude that some memory stomps may cause GPU crashes and overall instability in subsequent runs.
“In the past we had a UE side PSO cache but that’s been disabled for years - I’m assuming you’re not using that feature and are referring to the default driver PSO cache.”
I cannot give you an exact answer here, because PSO caching was handled by another colleague [mention removed]. But I think he just cherry-picked PSO handling improvements from newer engine versions and improved the precaching on the game side.
Regards,
Tomas Ruzicka.