VR Profiling and performance thread!

I have seen there is no performance-centred thread, so let’s create one!

Let’s discuss:

  • Using the profiler, issues, tips and tricks
  • Common things that are not viable in VR (hello, grass!) and workarounds
  • Tips and tricks for performance tweaks!

Notable stuff:
HTC Vive template on Steam forums

Amazing blog post on VR performance and profiling, pointed out by The Beej

Quickly understand whether your VR scene is CPU or GPU bound with a free tool called CPUKiller!
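Run a scene, check the frame time, run CPUKiller at 10%, check the frame time again, and compare. If the frame time climbs once the CPU is throttled, the scene is most likely CPU bound; if it barely moves, it is most likely GPU bound. Done!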
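The CPUKiller comparison above can be sketched in a few lines of Python. This is only an illustration: the function name, the 5% tolerance, and the sample frame times are my own assumptions, not something the tool reports. It also assumes the frame times are measured with the framerate cap off, otherwise V-Sync rounding hides small changes.

```python
def classify_bottleneck(baseline_ms, throttled_ms, tolerance=0.05):
    """Guess whether a scene is CPU or GPU bound from two frame times.

    baseline_ms  -- frame time measured with the CPU running normally
    throttled_ms -- frame time measured while the CPU is slowed down
    tolerance    -- relative change treated as noise (assumed value)
    """
    if baseline_ms <= 0:
        raise ValueError("baseline_ms must be positive")
    relative_change = (throttled_ms - baseline_ms) / baseline_ms
    # If slowing the CPU makes frames noticeably slower, the CPU was the limit.
    return "cpu-bound" if relative_change > tolerance else "gpu-bound"


# Hypothetical measurements, in milliseconds.
print(classify_bottleneck(11.0, 11.1))  # frame time barely moved
print(classify_bottleneck(11.0, 15.0))  # frame time grew a lot
```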

Tips from Epic’s VR Showdown

VR Best Practices at Epic Wiki
Contains some things about performance too!

I’ll get this started with one nasty series of issues:

I am profiling with Vive and somehow causing both the GPU and CPU profiler to LIE to me!
Look at these:

  • While frame skipping is on and the DRAW thread is obviously the bottleneck, according to the profiler there is this SELF action that takes the majority of the time!



How is one supposed to profile the bottleneck if the majority of the time is spent on something like that? How can I understand what the CPU is working on from that?

22 ms = 45 fps. I’m not 100% sure, but I think SteamVR divides the framerate by two if it’s below 90, so your actual frame rendering time is lower than 22 ms.

I’m 90% sure the majority of the time spent on your draw thread is V-Sync waiting. I think your GPU can’t push out the frame in under 11.1 ms, so the whole system has to slow down to 22.2 ms for V-Sync. Just in case you didn’t know, stat commands aren’t the same as the GPU profiler, which can be triggered with Ctrl+Shift+, (comma); it captures a frame that you can look over in the editor after your play session. Expensive screen space effects like SSR, AO, and even atmospheric fog have a heavy performance cost in VR, so disable those if you don’t absolutely need them. This blog post has a lot of good info on optimising VR in UE4.
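To make the 11.1 ms / 22.2 ms arithmetic concrete, here is a minimal sketch of how V-Sync rounds a frame up to a whole number of refresh intervals. The function name is made up, and the model assumes a plain 90 Hz V-Sync with no reprojection, which is a simplification of what SteamVR actually does.

```python
import math


def vsync_frame_time_ms(render_ms, refresh_hz=90.0):
    """Frame time actually displayed when every frame waits for V-Sync.

    A frame that misses one refresh has to wait for the next one, so the
    displayed frame time is a whole multiple of the refresh interval.
    """
    interval_ms = 1000.0 / refresh_hz  # about 11.1 ms at 90 Hz
    intervals = max(1, math.ceil(render_ms / interval_ms))
    return intervals * interval_ms


for render_ms in (10.0, 12.0):  # hypothetical render times
    shown = vsync_frame_time_ms(render_ms)
    print(f"{render_ms} ms rendered, {shown:.1f} ms displayed, {1000.0 / shown:.0f} fps")
```

Under this model, a frame that needs even slightly more than 11.1 ms is displayed at 22.2 ms, which is why the stat output jumps straight from 90 fps to 45 fps.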

Thanks for the reply.

Actually, profileGPU (aka Ctrl+Shift+,) is the only one that is showing the correct frame times.
Even if stat unit shows 22 ms on the GPU, if I profile the GPU I get my actual frame time, which is around 10 ms.
I believe the bottleneck is the draw thread, which, just like the game thread, I cannot seem to profile properly: if I capture with stat startfile / stat stopfile, I get frame times of 22/11 ms with the “self” action taking most of the time (see image).
Maybe the bottleneck actually IS the GPU and the frame time on profileGPU is halved? I really can’t get any useful data with this freaking fps control from SteamVR: the first thing I am trying to do is disable it so I can remove the FPS cap and start profiling properly. I think I’ll ask Valve; I don’t really know.

Fun thing is, I tested profiling without VR in the same project and the frame times are absurdly lower. Draw tops out at 3 ms while it goes up to 22 ms in VR (yes, I know frame skipping is involved here, but the point is it should not trigger at all). With these stats I don’t really get why Draw is the bottleneck… Maybe the one-frame profileGPU hypothesis is actually the real answer.

So, I did a lot, and I mean A LOT, of profiling. I managed to disable SteamVR’s control over the framerate (it was in SteamVR’s options!), I tried many different profiling tools and theories, and I have concluded that the bottleneck is the Draw thread.
That being said, the issue is still there.
The CPU profiler is not telling me what is costing what. Instead, FDrawSceneCommand gets a nice “Self” action costing pretty much all the ms, like in the picture above. This does NOT happen if I am not in VR. I will keep up my research and share any results.

There have been a few mentions of using -emulatestereo on the command line when launching the game, then using ‘r.SetRes 2160x1200’ to get the VR resolution and setting r.ScreenPercentage to the value you are using. This should give better profiling results; however, I still see some unexplained oddities in my stat profiling as well, with a few stray ‘Self’ entries taking up some time.

Wow! It is great to know I am not alone!
For now I went back to good old high-level observation. When it comes to performance optimization, I’ll be sharing my tears and methods.

I encounter the same issue, and I get the same unknown “Self” action in FDrawSceneCommand that you found. The framerate of my game is limited to 30 fps when I enable SteamVR’s frame controller, and it changes to 45 fps if I disable the frame controller. Have you found any solution? HELP ME!

And when I switch to the Oculus DK2, I can get about 75 fps…

From my experience, the only quick way to figure out stuff is usually to bulk remove things and see what happens.
The GPU profiler is useful to get some things sorted out, like what in the post process costs what.
Stat scenerendering can tell you about draw calls, but I haven’t figured out how to use it in VR mode since the graph is unreadable.

You can try “hmd mirror 2”, and you will see the stat scenerendering result.
And have you found out why the “Self” part of FDrawSceneCommand takes so much time?

Bit late to the thread, but did anyone figure out what was causing the “self” action?

If your fps is above 45, I think the reason is that you have interleaved reprojection enabled, a technology similar to Oculus’s ATW.

From the little knowledge and experience I have, I have come to learn that when it comes to VR, one should stop expecting the UE4 CPU profiler to always make sense.
Some things will be weird, and the tool was never very useful to me.
To profile VR, I have come to use the good old “insert feature - test - remove feature - test”. It is a very bad way to approach profiling, but it is quick and it works for most things.
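For what it’s worth, the “insert feature - test - remove feature - test” loop boils down to subtracting a baseline frame time from each measurement and sorting the results. A minimal sketch follows; the function name, feature names, and numbers are hypothetical and only there to show the bookkeeping.

```python
def rank_feature_costs(baseline_ms, with_feature_ms):
    """Rank features by how much frame time each one adds over the baseline.

    baseline_ms     -- frame time with all tested features removed
    with_feature_ms -- dict of feature name -> frame time with only that
                       feature added back
    """
    costs = {name: ms - baseline_ms for name, ms in with_feature_ms.items()}
    # Most expensive feature first.
    return sorted(costs.items(), key=lambda item: item[1], reverse=True)


# Hypothetical measurements, in milliseconds.
measurements = {"ssr": 10.5, "ao": 9.0, "fog": 8.5}
for name, cost_ms in rank_feature_costs(8.0, measurements):
    print(f"{name}: +{cost_ms:.1f} ms")
```

Keep in mind that the costs only mean something if the framerate cap is off, and that features can interact, so the individual costs will not necessarily add up.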

I have a 1080p screen, and when I use “hmd mirror 2” all the stat text becomes so blurry that I can’t read it at all. Do you have this problem?

I’m using SteamVR and the Vive and getting a similar result. Here’s some stuff I just posted to a bug report on the AnswerHub:

This can be reproduced in the VR template on 4.13 with the HTC Vive.

To those of you trying to hunt down performance issues and wondering why FDrawSceneCommand:Self is taking so long, here’s my hypothesis so far:

The FDrawSceneCommand in the render thread is actually calling a blocking function (probably WaitGetPoses from SteamVR), which makes it look like it’s doing work (since we are all looking for a WaitForWork or Sleep type event name). Thus, it’s not actually the performance bottleneck. Think of Self as a sleep call in this instance.

On non-SteamVR games, Self doesn’t block so you are left with the normal CPU Stall event. I haven’t tried with VR setups outside of the Vive so I’m not sure if they have the same issue, but I would wager they do since the rendering pipeline should be similar.

Additionally, when using stat unit to profile the game & render threads & GPU, you’ll always see the render thread at a very similar value to GPU & frame times, even if it’s not the bottleneck, most likely because of this issue.

What I’d like to see out of the Epic devs:
1. Make it clearer in the CPU profiler that this is indeed a sleep command, or rewrite the blocking mechanism to sit on a separate thread and have the render thread stall until that thread comes back
2. Remove that extra time from the stats to make it more easily discernible to noobs like me that the render thread isn’t causing the bottleneck
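Until something like point 2 exists, the subtraction can be done by hand on a captured frame. The sketch below assumes my hypothesis holds, i.e. that the whole Self time under FDrawSceneCommand is blocking wait and not real work, which is unconfirmed. The function name and the sample numbers are made up.

```python
def render_thread_work_ms(draw_total_ms, blocking_self_ms):
    """Estimate real render-thread work by removing the blocking wait.

    draw_total_ms    -- total FDrawSceneCommand time from the CPU profiler
    blocking_self_ms -- its "Self" time, assumed here to be pure waiting
    """
    if not 0 <= blocking_self_ms <= draw_total_ms:
        raise ValueError("Self time must be between 0 and the total time")
    return draw_total_ms - blocking_self_ms


# Hypothetical capture: 22 ms total, 19 ms of it reported as Self.
print(render_thread_work_ms(22.0, 19.0), "ms of actual render-thread work")
```

If the remainder is well under the 11.1 ms budget, the render thread is probably not the real bottleneck, whatever stat unit says.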