Long times spent in D3D12_Present

Hello,

we’ve noticed that on some computers, our game spends a significant but highly variable amount of time in D3D12_Present on the RHI thread (more than 2 ms on some frames, out of a total frame time of about 7 ms). This is causing the frame rate to become unstable. I assume that it’s waiting for something to finish, but looking at it in Insights, it doesn’t seem to line up with anything specific happening on the GPU. Does anyone have any advice about what it might be waiting for, and how we can avoid it? It barely happens on some computers and happens much more often on others.

Thanks,

Sofie

Hello,

Present can block the calling thread if there isn’t room in the swap chain for a new frame, for example if you’re far enough ahead of the swap chain flip or the GPU that the RHI thread is getting throttled. Can you provide more information about the frame rate instability, or share the Insights trace for us to look at?
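To illustrate the throttling described above, here is a toy C++ model. It is not engine or driver code: it assumes a fixed CPU cost and a fixed GPU cost per frame, ignores vsync entirely, and simply lets Present block whenever `maxQueued` frames are already in flight. The function name and all parameters are made up for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy model (assumption, not the real driver behaviour): the submitting thread
// does cpuMs of work per frame and then calls Present. The GPU needs gpuMs per
// frame and runs frames in order. Present blocks until fewer than maxQueued
// frames are still in flight. Returns the time spent blocked per frame, in ms.
std::vector<double> SimulatePresentBlocking(int frames, double cpuMs, double gpuMs, int maxQueued)
{
    assert(frames >= 0 && maxQueued > 0);

    std::vector<double> gpuDone(frames, 0.0);  // time at which the GPU finishes frame i
    std::vector<double> blocked(frames, 0.0);  // time frame i spent blocked in Present
    double cpuTime = 0.0;

    for (int i = 0; i < frames; ++i)
    {
        cpuTime += cpuMs;  // CPU-side work for frame i

        // Present for frame i has to wait until frame (i - maxQueued) has left the queue.
        if (i >= maxQueued && gpuDone[i - maxQueued] > cpuTime)
        {
            blocked[i] = gpuDone[i - maxQueued] - cpuTime;
            cpuTime = gpuDone[i - maxQueued];
        }

        // The GPU starts frame i once it is submitted and the previous frame is done.
        const double gpuStart = std::max(cpuTime, i > 0 ? gpuDone[i - 1] : 0.0);
        gpuDone[i] = gpuStart + gpuMs;
    }
    return blocked;
}
```

With 5 ms of CPU work, 7 ms of GPU work and `maxQueued = 2`, the first few Present calls in this model return immediately. Once the queue has filled, each call blocks for 2 ms, which is the difference between the GPU and CPU cost. If the CPU is the slower side, the model never blocks. Real traces will be noisier than this, since neither cost is constant.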

Apologies for the delay, and thanks for providing the trace. To discover where the time is going, you may need to use a sampling profiler like Superluminal to see if it’s all in the driver. In the past we’ve also seen long D3D12_Present calls when using D3DKMTQueryStatistics, but that’s a super edge case. I’m passing this over to my colleague who is more familiar with this area.

Hi Sofie,

From your Insights trace, it looks like the engine is running without vsync enabled. Frame pacing is generally quite poor when vsync is disabled. This is something we’re planning to address at some point in the future, but it’s likely a multi-year project to solve entirely. A related issue is that the “Frame” stat shown in the “stat unit” display and elsewhere is measured at a specific point on the game thread. It isn’t representative of how consistently frames actually flip onto the front buffer, but it can affect delta time calculations.

You might find you have more stable frame kick-offs on the game thread if you set “r.GTSyncType” to 1. This makes the game thread synchronize with the RHI thread, which in turn is implicitly synchronized with GPU progress based on the throttling the driver does inside D3D12_Present. This does come at the cost of a shorter pipeline, which can make you more susceptible to dropping frames if you’re over budget.

The Windows D3D12 RHI already sets GD3D12RHINumBackBuffers to 3, which gives you 3 buffers in the swap chain and allows the GPU to get up to 2 frames ahead of the front buffer flip.

Cheers,

Luke

Hi Alex,

thanks, that makes sense. Is there a way to avoid that happening, e.g. by requesting a longer swap chain, even if that increases the latency as well?

I’m attaching one of the trace files that we’ve seen this behavior on. Usually, D3D12_Present only very rarely takes more than 1 ms, and never more than 2 ms, but long calls are fairly common in this trace. There are a few rather big UI-related hitches in this one as well that we are working on, but it’s mainly the very unstable frame rate that worries me, since we don’t feel we have a good understanding of what’s causing it yet. The main common denominator of the cases we’ve seen so far has been D3D12_Present taking a significant amount of time more often than usual, but there could of course be other contributing factors as well.
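To compare machines on the 1 ms and 2 ms thresholds mentioned above, something like the following could be used. It is only a sketch: it assumes the per-frame D3D12_Present durations have already been exported from the trace into a list of milliseconds, and the helper name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: fraction of frames whose Present duration (in ms)
// is strictly greater than thresholdMs. Returns 0 for an empty input.
double FractionOverThreshold(const std::vector<double>& presentMs, double thresholdMs)
{
    assert(thresholdMs >= 0.0);
    if (presentMs.empty())
    {
        return 0.0;
    }

    std::size_t over = 0;
    for (double ms : presentMs)
    {
        if (ms > thresholdMs)
        {
            ++over;
        }
    }
    return static_cast<double>(over) / static_cast<double>(presentMs.size());
}
```

Running this with thresholds of 1.0 and 2.0 on traces from a well-behaved machine and from a problematic one would put a number on "more often than usual".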

Hi Luke,

thank you for the advice! We’ll experiment with vsync and the game thread sync CVar, and see how they affect the problem.

Best wishes,

Sofie