Hello!
We have started experiencing somewhat frequent deadlocks when a minidump is being generated in a Windows Development Client build. This occurs when an ensure is triggered while PSOs are being compiled in parallel. Specifically, we’ve been seeing the deadlock with this ensure during loading:
void FCoreRedirects::Initialize()
{
    ensureMsgf(IsInGameThread(), TEXT("FCoreRedirects can only be initialized on the game thread."));
I’ve attached a screenshot showing the PSO compilation thread that is locking the critical section and the thread attempting to write the minidump.
I went through the same investigations as mentioned in [Content removed]. My understanding is that, because MiniDumpWriteDump suspends all threads while generating the minidump, an in-process call to MiniDumpWriteDump will deadlock if any thread holds the heap management critical section at that moment. Also, as mentioned in [Content removed], I’ve seen that Unreal already supports out-of-process minidump generation using USE_CRASH_REPORTER_MONITOR.
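To make the hazard concrete, here is a small portable model of it. This is only a sketch: it does not call MiniDumpWriteDump or any Windows or Unreal API. It assumes the dump writer needs the same heap lock a worker may hold, and it uses a `std::timed_mutex` as a stand-in for that lock and a worker parked on a condition variable as a stand-in for a suspended thread. The function name `DumperCanTakeHeapLock` and all variable names are hypothetical.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Portable model only: HeapLock stands in for the heap critical section, and a
// worker parked on a condition variable stands in for a suspended thread.
// Hypothetical helper, not part of Unreal or the Windows API.
bool DumperCanTakeHeapLock(bool bWorkerHoldsLockWhenSuspended)
{
    std::timed_mutex HeapLock;
    std::mutex StateMutex;
    std::condition_variable StateChanged;
    bool bWorkerParked = false;
    bool bResumeWorker = false;

    std::thread Worker([&] {
        std::unique_lock<std::timed_mutex> Held(HeapLock, std::defer_lock);
        if (bWorkerHoldsLockWhenSuspended)
        {
            Held.lock(); // e.g. caught in the middle of an allocation
        }
        std::unique_lock<std::mutex> State(StateMutex);
        bWorkerParked = true;
        StateChanged.notify_all();
        // "Suspended": the worker makes no progress until it is resumed.
        StateChanged.wait(State, [&] { return bResumeWorker; });
    });

    {
        // The dumper only proceeds once the worker is parked.
        std::unique_lock<std::mutex> State(StateMutex);
        StateChanged.wait(State, [&] { return bWorkerParked; });
    }

    // An in-process dumper would block here indefinitely; the model gives up
    // after a timeout so that the outcome can be observed.
    const bool bAcquired = HeapLock.try_lock_for(std::chrono::milliseconds(100));
    if (bAcquired)
    {
        HeapLock.unlock();
    }

    {
        std::lock_guard<std::mutex> State(StateMutex);
        bResumeWorker = true; // "resume" the worker so it can exit
    }
    StateChanged.notify_all();
    Worker.join();
    return bAcquired;
}
```

Calling `DumperCanTakeHeapLock(false)` models threads suspended outside the allocator, where the dumper gets the lock. Calling `DumperCanTakeHeapLock(true)` models the case described above, where the lock owner can never release it while suspended. An out-of-process dumper avoids this because it does not share the target's heap lock.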
We have a few questions regarding all of this:
FCoreRedirects::Initialize ensure
This ensure has been occurring from time to time and we’re not sure what actions people are doing that trigger this.
- Are there actions that are known to cause this that we should stop doing?
- Are there things that we should be doing to prevent this?
USE_CRASH_REPORTER_MONITOR
- Has Epic started enabling this for non-Editor builds (latest on git still has this enabled for Editor only)?
- Are there still known issues with enabling this option? [Content removed] mentions issues with the pipe used for communication.
USE_CRASH_REPORTER_MONITOR and EAC
We are planning on using EAC as one of our anti-cheat mechanisms.
- Are there any integration steps that we need to be aware of if we do decide to enable USE_CRASH_REPORTER_MONITOR?
- From what we saw, EAC will allow the external crash process monitor to run for 30 seconds before ending it. Can you confirm whether we need to configure anticheat_integritytool.cfg, specifically runtime_configuration -> crash_reporter_name = "CrashReportClient.exe", with a different value?
Thanks!
Hi,
The core redirect ensure seems to have been fixed by CL 36713788 (https://github.com/EpicGames/UnrealEngine/commit/a7fcf0cbad6d199f5e8f246add45f0c7543951d0). Obviously, an ensure deadlocking the creation of the minidump is a bug and I’ll report it, but you can probably mitigate the issue by integrating the CL I pointed out. I’m not sure the team will prioritize fixing this deadlock, though, so I hope CL 36713788 works around your issue. Let me know if it doesn’t.
Regarding your other questions, I enabled the monitor mode to report crashes out-of-process in Fortnite 4 years ago, but we started getting crash reports with missing information, so I disabled it in June 2021. The post you linked mentioning a pipe issue was from November 2020, so it was probably fixed by the time I enabled out-of-process reporting in 2021. I remember a lot of work happening on this at that time. Anyway, I documented the latest status I’m aware of in \Engine\Source\Runtime\Core\Private\Windows\WindowsPlatformCrashContext.cpp:
// Workaround for non-Editor build. When remote debugging was enabled (CRC generating the minidump + callstack) in games, several crashes
// were emitted with no call stack and zero-sized minidump. Until this issue is resolved, games bypass the remote debugging and fallback to
// in-process callstack/minidump generation, then spawn a new instance of CRC to send that crash. The CRC spanwed at startup will keep running in
// background with the only purpose to capture the monitored process exit code and send the analytic summary event once the process died.
#if USE_CRASH_REPORTER_MONITOR && WITH_EDITOR
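As a rough illustration of what that preprocessor condition implies, the sketch below shows how the two macros select a reporting path at compile time. The macro values are assumed here for a non-Editor game build with the monitor enabled, and `DescribeCrashReportingPath` is a hypothetical helper, not an engine function. The returned descriptions paraphrase the comment above.

```cpp
#include <string>

// Assumed values for illustration only: a game (non-Editor) build with the
// monitor compiled in. In the engine these come from the build configuration.
#ifndef USE_CRASH_REPORTER_MONITOR
#define USE_CRASH_REPORTER_MONITOR 1
#endif
#ifndef WITH_EDITOR
#define WITH_EDITOR 0
#endif

// Hypothetical helper that names the path the gate would select.
std::string DescribeCrashReportingPath()
{
#if USE_CRASH_REPORTER_MONITOR && WITH_EDITOR
    // Editor builds: the monitoring CRC generates the minidump and callstack.
    return "out-of-process";
#else
    // Game builds: minidump and callstack are generated in-process, then a
    // new CRC instance is spawned to send the crash.
    return "in-process";
#endif
}
```

With the assumed values, the condition is false, so the in-process branch is compiled. Defining `WITH_EDITOR` as 1 would select the out-of-process branch instead.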
Since then, nobody has retried or fixed it, and the workaround is still there. When it was enabled, we were able to get some crashes with EAC, but that was 4 years ago. As of today, we still launch CRC in monitor mode with Fortnite to get and report the game exit code along with some out-of-process analytics, so it is known to work with EAC to some extent, but we haven’t tested out-of-process crash reporting with EAC active in about 4 years, so it might be broken.
If you end up using USE_CRASH_REPORTER_MONITOR with EAC, I don’t know for sure about the EAC configuration (anticheat_integritytool.cfg), as I’m not an expert, but the template I found says you don’t need to configure it at all (see below). If you run into problems, I’d recommend opening a new ticket specific to that, and we will route it to the EAC team.
`/*
 * Name of the crash reporter used by the game (Windows only).
 * In the case that the game uses a widely used game engine this will not need
 * to be provided here. Crash reporters of engines such as Unity and Unreal Engine
 * have already been whitelisted on our end.
 */
/*crash_reporter_name = "MyCrashReporter.exe";*/`

Regards,
Patrick
Hey Patrick,
Sorry for the delay, I wanted to find the time to sync back to a changelist that triggered the ensure to test the fix. The ensure is gone now that I’ve integrated the change, so thanks for sharing that changelist!
Regarding USE_CRASH_REPORTER_MONITOR, I had missed that comment about the workaround. For the moment, given that I’m not aware of other occurrences of a MiniDumpWriteDump deadlock beyond this specific situation, where the timing happened to line up to trigger it, I would rather stick with the most commonly used path for crash reporting. I wouldn’t want to potentially trade one crash reporting issue for another in a path that hasn’t been proven for a while 😅. As for EAC, that’s good to know, and we’ll keep it in mind if we do go down that route at a later time.
Thanks for the help!