Getting GPU-Frametime in a shipping exe

So I am starting to think it is not possible without source code changes, but would love to be wrong about it!

We want to get the “naive” GPU frame time, without any breakdown: just the overall time the GPU needs to render a frame, with no need to identify bubbles, stalls, etc. What the stat gpu command shows in its topmost row seems like the right fit, but we need it in a packaged game built in the Shipping configuration.

The purpose is to show users how specific graphics options affect their specific hardware configuration, in terms of milliseconds spent.

Suggestions for third-party libraries would also be highly appreciated.

What we already tried:

  1. From the thread “Any way to get the GPU bound time in milliseconds?”:
    FPlatformTime::ToMilliseconds(GPUCycles) -> only gives us the overall frame time, including VSync.

  2. RHIGetGPUFrameCycles() -> same as above.

  3. Stats from “stat GPU” are not generated in a shipping configuration.

  4. Looking at msUntilRenderComplete from PresentMon (GameTechDev/PresentMon on GitHub, a tool for collecting and processing ETW events related to frame presentation on Windows):
    The description reads: “The time between the Present() call and when the GPU work completed, in milliseconds.” But the time doesn't seem to correlate with GPU usage in our UE4 test exes.
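
For reference, here is roughly what attempts 1 and 2 boil down to, written as a standalone C++ sketch outside the engine. The names CyclesToMilliseconds and FFrameTimeAverager are hypothetical helpers, not engine API, and SecondsPerCycle is an assumed input that the engine would normally take from the platform timer. We are assuming the engine conversion is a plain cycles * seconds-per-cycle * 1000 scaling. Note that this only converts and smooths whatever cycle count it is given, so if that count already includes the VSync wait, the result does too.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical stand-in for the cycles -> milliseconds conversion.
// SecondsPerCycle is an assumption here; it is passed in explicitly
// so the sketch does not depend on any engine header.
inline double CyclesToMilliseconds(std::uint32_t Cycles, double SecondsPerCycle)
{
    assert(SecondsPerCycle > 0.0);
    return static_cast<double>(Cycles) * SecondsPerCycle * 1000.0;
}

// Hypothetical helper: rolling mean over the last `Window` frame times,
// so a single slow frame does not dominate the number shown to the user.
class FFrameTimeAverager
{
public:
    explicit FFrameTimeAverager(std::size_t InWindow) : Window(InWindow)
    {
        assert(InWindow > 0);
    }

    // Adds one frame time in milliseconds and drops the oldest sample
    // once the window is full.
    void AddSample(double Milliseconds)
    {
        Samples.push_back(Milliseconds);
        Sum += Milliseconds;
        if (Samples.size() > Window)
        {
            Sum -= Samples.front();
            Samples.pop_front();
        }
    }

    // Returns 0.0 when no samples have been added yet.
    double Average() const
    {
        return Samples.empty() ? 0.0 : Sum / static_cast<double>(Samples.size());
    }

private:
    std::size_t Window;
    std::deque<double> Samples;
    double Sum = 0.0;
};
```

The idea would be to feed one sample per frame, read Average() before and after toggling a graphics option, and show the difference. That part is easy; the open problem is getting a per-frame cycle count that excludes the VSync wait in a Shipping build.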

Thank you in advance!