Unreal Insights with nothreading

Could you tell me whether Unreal Insights works correctly with the "nothreading" argument? Is it possible that single-threaded mode is introducing artifacts into the profiling data?

We're observing strange behavior on dedicated servers deployed in Kubernetes. During profiling, spikes appear at some point after "between frames"; it looks as if a context switch occurred and the CPU was taken away from the server process, but it always happens strictly at the end of the frame and always with a one-minute periodicity.

For example:

13 m 05 s

14 m 05 s

15 m 05 s

[Image Removed] [Image Removed]

The spikes don’t appear immediately, but they do show up even when idle. The issue reproduces both with and without FPS limits.

Maybe you know something about this.

Best regards,

Edward.

Hi,

Unreal Insights/tracing is expected to work correctly with "nothreading". When single-threaded, the trace update loop runs on the game thread, once per frame.

To get more information about the actual execution of the trace system, record a trace using -tracescopes and also enable the counter channel: -trace=default,counter.

In Timing Insights, look for the FTraceAux_* and Writer_* CPU timing scopes. Also, in the Counters panel, look for the Trace/* counters (such as Trace/Bandwidth/Emitted and Trace/Memory/BlockPool). See if these correlate with the stalls.

A possible cause for stalls coming from the trace system is the runtime emitting far too many trace events in a short amount of time, faster than the trace update loop can process and send the data to the connected target (a network socket or a file). Still, it is strange that the stalls happen at fixed (1 min) intervals… If trace were the cause, I would expect a few consecutive frames with bigger and bigger stalls, not stalls only at 1 min intervals.

In UE 5.6, we introduced a default limit of 79 MB for the block pool (see UE_TRACE_BLOCK_POOL_MAXSIZE). This avoids unbounded memory usage (when the runtime emits more trace data than it can process), but it will also introduce a stall in that case. Still, in a single-threaded run this is unlikely to be the cause of the stalls you see (i.e. it is unlikely for trace to produce more than 79 MB of data per frame). Also, since you are using UE 5.5, this cannot be the cause.

Use -statnamedevents (if you are not already) to ensure you get more CPU scopes. Also try adding more CPU scopes in FEngineLoop::Tick() to narrow down the code location where the stall occurs.
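For example, something along these lines (just a rough sketch; the scope name and the wrapped section are illustrative, not actual engine code):

```cpp
// Illustrative sketch only: an extra CPU profiler scope around a suspect
// section of FEngineLoop::Tick(). The name "ServerTick_SuspectSection" is
// made up; pick names that identify the code you want to isolate.
#include "ProfilingDebugging/CpuProfilerTrace.h"

void TickSuspectSection()
{
	// Emits a named CPU timing event, visible in Timing Insights when the
	// "cpu" trace channel is enabled.
	TRACE_CPUPROFILER_EVENT_SCOPE(ServerTick_SuspectSection);

	// ... the code you want to measure ...
}
```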

Best Regards,

Ionut

Hello. I followed your advice and added counters to FEngineLoop::Tick.

I was able to reproduce similar behavior on a local dedicated server, but it’s different – the spikes occur more frequently and without the regularity seen on Linux. It turned out that FCoreDelegates::OnEndFrame.Broadcast() periodically causes spikes; ten objects are subscribed to it, and I had to track down each one and add a counter.
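Roughly like this (a simplified sketch of what I added; the counter name and the handler are placeholders):

```cpp
// Simplified sketch of the per-subscriber instrumentation (names are
// placeholders): time one OnEndFrame handler and publish the duration as
// an Insights counter.
#include "ProfilingDebugging/CountersTrace.h"
#include "HAL/PlatformTime.h"

TRACE_DECLARE_FLOAT_COUNTER(EndFrameHandlerMs, TEXT("Server/EndFrameHandlerMs"));

void SomeEndFrameHandler()
{
	const double StartSeconds = FPlatformTime::Seconds();

	// ... the subscriber's original end-of-frame work ...

	TRACE_COUNTER_SET(EndFrameHandlerMs, (FPlatformTime::Seconds() - StartSeconds) * 1000.0);
}
```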

In the end, I see that the spike is generated by UE::Trace::Update(), possibly due to disk writes.

Best Regards,

Edward

[Image Removed]

Yes, we intend to add CPU profiler scopes for all Begin/End Frame delegates in UE 5.7.

In UE 5.6, the OnEndFrame delegate for UE::Trace::Update() (called from FTraceAuxiliary) already has a CPU profiler scope: "FTraceAux_Trace_Update". In addition, "-tracescopes" adds more child scopes under this one.

If the stalls are caused by file I/O operations, a possible workaround is to trace to UTS (UnrealTraceServer) instead of tracing to a file. If needed, UTS can be set up on a different machine, and the server would then run with -tracehost=<ip_of_machine_running_UTS> (instead of -tracefile=…), or you can use the console command "trace.send <ip>" (instead of "trace.start" or "trace.file").
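For example, one way to issue that console command from code on a running server (a sketch only: the IP address is a placeholder, and it assumes GEngine is initialized and routes the command to the trace system):

```cpp
// Sketch: switch tracing to a UTS instance at runtime by issuing the
// "trace.send" console command. The IP address is a placeholder for the
// machine running UnrealTraceServer.
#include "Engine/Engine.h"

void SendTraceToUts()
{
	if (GEngine)
	{
		// Equivalent to typing "trace.send 10.0.0.5" in the console,
		// instead of starting a file trace with "trace.file".
		GEngine->Exec(nullptr, TEXT("trace.send 10.0.0.5"));
	}
}
```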

Still, 14 ms for UE::Trace::Update() seems like too much… It looks as if an unusually large amount of trace data is emitted on those frames. Check whether there are unnecessary trace channels enabled that may generate that data, and disable the channels you don't actually need. Usually the "default" channels should be fine from this point of view, but "memory" or "task" (for example) are known to generate large amounts of data. Unfortunately, without the detailed CPU trace scopes from UE 5.6 it is more difficult to diagnose this issue.

Cheers,

Ionut

Hi. As far as I can tell from the source code, -tracescopes was only added in Unreal 5.6. In the profiler we enable the cpu, gpu, frame, log, bookmark, screenshot, and region channels, and we also separately turn on "stats.NamedEvents On" and "stats.VerboseNamedEvents On".

I’ll gather a profile with the additional “counter” channel.

> Still, it is strange that the stalls happens on fixed (1min) intervals…

Yes, I also find these spikes very strange. It’s possible that something on the Kubernetes side is taking resources from our process; I’m trying to find the cause. But first I’d like to be sure that it’s not profiling overhead that isn’t captured by the counter.

Best Regards,

Edward

I think you should also add a counter for the FCoreDelegates::OnEndFrame delegate. It can contain logic that otherwise won't be visible in the profiler.
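Something like this is what I mean (just a sketch, not a proposed engine change):

```cpp
// Sketch of the idea: wrap the end-of-frame delegate broadcast in its own
// CPU profiler scope so the time spent in all of its subscribers shows up
// as a single named event in Timing Insights.
#include "ProfilingDebugging/CpuProfilerTrace.h"
#include "Misc/CoreDelegates.h"

void BroadcastEndFrameWithScope()
{
	TRACE_CPUPROFILER_EVENT_SCOPE(FCoreDelegates_OnEndFrame);
	FCoreDelegates::OnEndFrame.Broadcast();
}
```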

Best Regards,

Edward