Game crash report

OK, I think I’m good w/r/t symbols etc. now, but I’m having trouble with CRP, which seems unable to correctly invoke the MDD exe.

Probably some service configuration issue (see e.g. the Stack Overflow question “How can I run an EXE program from a Windows Service using C#?”).

Do you have some special tips to run CRP as a service / did you have trouble with that yourself?

Thanks!

I haven’t had any problems with that. We just install the service using InstallUtil.exe and it runs as a specific user (the one with the Perforce environment that MDD needs).

OK thanks Chris. I’m sure I’ll find the reason this exe call misbehaves

How do we run these as services? I see the ServiceInstaller.cs files, but not sure what to do with them.

I’ve just used sc.exe

For the record though, I don’t run CRP as a service yet, because the MDD it invokes behaves differently than when CRP runs “normally” (service vs. normal execution yields subtly different environments, so I have to investigate…). No problem for CRR.
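One way to investigate the "subtly different environments" mentioned above is to dump the environment variables once from the service context and once from an interactive session, then compare the two. The following is only a sketch: the helper names `snapshot_env` and `diff_env` are hypothetical, and it assumes you can run a small script (or equivalent code) under the same account and launch method as the service.

```python
import json
import os


def snapshot_env(path):
    """Write the current process environment to a JSON file (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(dict(os.environ), f, indent=2, sort_keys=True)


def diff_env(a, b):
    """Compare two environment dicts.

    Returns (keys only in a, keys only in b, keys present in both with different values).
    """
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    changed = sorted(k for k in set(a) & set(b) if a[k] != b[k])
    return only_a, only_b, changed
```

Variables that show up only in the interactive snapshot (for example Perforce settings such as P4PORT, or a different PATH or TEMP) would be the first things to check.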

@.perrin is there a link you can post that gave you the .zip file for the missing models? I am also getting that error. I got this somewhat working for 4.12 but cannot get it running on 4.13. It would be great to have this all documented.

Our symbols are uploaded to Amazon S3, and we have a proxy that turns them into a symbol server. There doesn’t seem to be a config property for specifying the location of the symbol server.

Can someone clarify what exactly the local, internal, and external zones are? @chriswood

Hi,

If you have access to UDN:

https://udn.unrealengine.com/questions/313115/crash-reporter-clientserver-setup.html

or

https://udn.unrealengine.com/questions/312726/crashreporterwebsite-solution-is-missing-files.html

If not, you should probably ask Steve Hutton to post it here.

afaik it’s WIP / documentation is thin for now.

For the symbols issues (so, CRC/CRP rather than CRW), I think I tweaked the code to directly point to the srv* I wanted.

If you’re talking about internal/external landing zones, I’m not exactly sure, but I’ve only used the InternalLandingZone.

Internal/external is something we use at Epic, as we used to have a single CRR feeding internal crashes to a landing zone inside our network, and multiple CRRs receiving public crashes and feeding a landing zone in the DMZ.

Hi Chris,

Another quick question (I might edit it later for details):

So basically now I’ve got a “full CR architecture” running, and now I get to observe more finely what works and what does not - I’m noticing that on CRP side, only 1 MDD runs at any given time, yielding logs like:

PROC-7 ProcessDumpFile: Thread blocked for 6 483,1s then MDD ran for 220,8s.

If I’m not mistaken the lock (MinidumpDiagnosticsLock) in engine\source\programs\crashreporter\crashreportprocess\reportprocessor.cs is recent (4.13 vs. 4.12) - could you tell me more about it?

Thanks,

Because MDD can interact with a Perforce workspace and your pdb cache, multiple instances of MDD could cause issues. Older versions of the CRP were single threaded so there would always be one MDD. Later I added support for multiple processing threads but had to add MinidumpDiagnosticsLock to ensure only one MDD would run so this became the main bottleneck. The newest version (actually from CRP v1.1.22, Aug 10th 2016) uses named locks to protect access to P4 and the pdb cache inside MDD allowing multiple instances to run pretty well.
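To illustrate why a single lock around the whole MDD run becomes the bottleneck, and why locks scoped to the shared resources allow overlap, here is a small simulation. It is a sketch under assumptions, not the CRP or MDD code: it uses in-process threads and `threading.Lock` instead of the cross-process named locks described above, and the sleep durations are made up.

```python
import threading
import time


def peak_concurrency(n_workers, lock_whole_run):
    """Return the peak number of simulated MDD runs that were active at once."""
    coarse_lock = threading.Lock()   # stands in for a lock around the whole MDD run
    cache_lock = threading.Lock()    # stands in for a lock scoped to the pdb cache
    counter_lock = threading.Lock()  # protects the bookkeeping below
    state = {"active": 0, "peak": 0}

    def simulated_mdd_run():
        with counter_lock:
            state["active"] += 1
            state["peak"] = max(state["peak"], state["active"])
        with cache_lock:
            time.sleep(0.01)  # pretend to touch the shared pdb cache
        time.sleep(0.05)      # pretend to do work that needs no shared state
        with counter_lock:
            state["active"] -= 1

    def worker():
        if lock_whole_run:
            with coarse_lock:  # only one simulated run at a time
                simulated_mdd_run()
        else:
            simulated_mdd_run()  # only the cache access is serialized

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"]
```

With `lock_whole_run=True` the peak stays at 1 regardless of the number of processing threads, which matches the "thread blocked for a long time, then MDD ran" pattern in the log line quoted earlier. With `lock_whole_run=False` several runs can be in flight at once.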

Over 200 seconds to run MDD seems pretty slow. We typically take between 1 and 10 seconds per run with a fully populated pdb cache.

Speed-wise, before CRP v1.1.25, MDD would also slow down if its log folder filled up with too many files. That version added better handling of MDD logs.

OK thanks Chris - in our case MDD doesn’t do anything with p4, but it makes sense to have a pdb cache-related critical section indeed (glad you’re making it smaller though!)

But yeah my bigger issue actually seems to be MDD exec time (I have some MDD execs that time out (1800s), but for those that don’t, durations are all over the place between 0 and 1800…)

Times in the log entries should help pinpoint the problem.

Time in the logs tends to jump abruptly :wink:

Looks like Control->GetContextStackTrace takes some time, but less than, I guess, all the Symbol->ReloadWide calls. I could try adding more logs.

Ex:

[0000.13][ 0]LogCrashDebugHelper: Successfully opened minidump: [redacted]
[0000.13][ 0]LogCrashDebugHelper: Modules loaded: 460, unloaded: 0
[0000.17][ 0]LogCrashDebugHelper: Symbol paths
[0000.17][ 0]LogCrashDebugHelper: SRV*[redacted]
[0000.17][ 0]LogCrashDebugHelper: SRV*[redacted]
[0000.17][ 0]LogCrashDebugHelper: SRV*Symbol information
[0000.17][ 0]LogCrashDebugHelper: Image paths
[0000.17][ 0]LogCrashDebugHelper: SRV*[redacted]
[0000.17][ 0]LogCrashDebugHelper: SRV*[redacted]
[0000.17][ 0]LogCrashDebugHelper: SRV*Symbol information
[0959.41][ 0]LogCrashDebugHelper: Context size matches x64 sizeof( CONTEXT )
[1077.68][ 0]LogCrashDebugHelper: 0: KERNELBASE!RaiseException()
[1077.73][ 0]LogCrashDebugHelper: 1: UE4Editor_Core!NewReportEnsure() [redacted]
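For longer logs, the jumps in the timestamps can be located mechanically. The sketch below assumes the line format shown in the excerpt above (a `[seconds.hundredths]` prefix followed by a second bracketed field); the function name `largest_gaps` is hypothetical. Each gap is attributed to the line logged at the end of it, since the time was spent between the previous entry and that one.

```python
import re

# Matches e.g. "[0959.41][ 0]LogCrashDebugHelper: Context size matches ..."
LINE_RE = re.compile(r"^\[(\d+\.\d+)\]\[\s*\d+\](.*)$")

SAMPLE = """\
[0000.17][ 0]LogCrashDebugHelper: Image paths
[0000.17][ 0]LogCrashDebugHelper: SRV*Symbol information
[0959.41][ 0]LogCrashDebugHelper: Context size matches x64 sizeof( CONTEXT )
[1077.68][ 0]LogCrashDebugHelper: 0: KERNELBASE!RaiseException()
[1077.73][ 0]LogCrashDebugHelper: 1: UE4Editor_Core!NewReportEnsure() [redacted]
"""


def largest_gaps(log_text, top=3):
    """Return up to `top` (gap_seconds, message) pairs, largest gap first."""
    entries = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:  # lines that do not match the assumed format are skipped
            entries.append((float(m.group(1)), m.group(2).strip()))
    gaps = []
    for (t0, _), (t1, msg) in zip(entries, entries[1:]):
        gaps.append((round(t1 - t0, 2), msg))
    gaps.sort(key=lambda g: g[0], reverse=True)
    return gaps[:top]


for gap, msg in largest_gaps(SAMPLE):
    print(f"{gap:8.2f}s before: {msg}")
```

On the sample lines, the largest gap is the roughly 959 seconds between the last path entry and the "Context size matches" line, followed by about 118 seconds before the first stack frame is printed.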

Was that a significant slow down? Starting with how many files?

It would gradually slow down initialization of MDD (by a few seconds) when tens of thousands of logs would build up. You don’t have that problem in the log you posted above.

OK.

I guess I got something very wrong on the machine that runs CRP; taking one of those queued dmps and running MDD on my machine, I get something like 300s with an empty cache / 50s for repeated executions, while pretty much all MDD execs on the dedicated machine now time out at 1800s…

I’m adding more detailed logs; looks like the first Symbol->ReloadWide takes some time (C:\Windows\System32\XINPUT1_3.dll but I doubt it’s relevant), then it’s mostly Control->GetContextStackTrace

(I guess with SYMOPT_DEFERRED_LOADS*, time will always be spent in Control->GetContextStackTrace, so it’s hard to investigate deeper than this…)

* By the way, I think the code comment is wrong here:
// Always load immediately; no deferred loading
SymOpts |= SYMOPT_DEFERRED_LOADS;
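To spell out the mismatch: `|=` sets the SYMOPT_DEFERRED_LOADS bit, which asks DbgHelp to defer symbol loading until a symbol is actually needed, the opposite of what the comment says. The sketch below only illustrates the bit arithmetic; the constant value is the one defined in dbghelp.h, and the two helper names are hypothetical.

```python
SYMOPT_DEFERRED_LOADS = 0x00000004  # value as defined in dbghelp.h


def with_deferred_loads(sym_opts):
    """Equivalent of `SymOpts |= SYMOPT_DEFERRED_LOADS`: turns deferred loading on."""
    return sym_opts | SYMOPT_DEFERRED_LOADS


def without_deferred_loads(sym_opts):
    """Equivalent of `SymOpts &= ~SYMOPT_DEFERRED_LOADS`: turns deferred loading off."""
    return sym_opts & ~SYMOPT_DEFERRED_LOADS
```

So either the comment or the operator would need to change for the two lines to agree; which one reflects the intended behavior is not stated in the thread.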

Is there any ETA on the DataRouter being provided as SaaS?

Hello, I have been having a problem with my engine crashing. Here is the crash report; can someone tell me what it means?

https://pastebin.com/aK1zxhdb