Please help me understand where the performance block is!

Hello,

I’m posting here as I didn’t get any resolution from my forum post and I can’t find any specific profiling examples anywhere that seem relevant to my case.
I’ve spent a few days on this now and am getting really frustrated by the feeling of running around in circles without getting any closer to figuring out where the problem lies.

Basically,

In my VR game I am not getting the performance I need. I don’t think I have much geometry or crazy shaders or many shadow lights or much post processing.
But most confusingly I can’t find out where the block is.

This is my stat unit,
stat_unit.png
The GPU time seems to be pushing the frame time over the 11ms limit, halving the fps.
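
To make the frame-budget arithmetic concrete, here is a minimal Python sketch. It assumes a 90 Hz headset (which is what a limit of about 11ms implies) and a simplified model in which a frame that misses a refresh is held until the next one. The function names and the 12.5ms example value are made up for illustration, and real VR runtimes with reprojection may behave differently.

```python
import math


def frame_budget_ms(refresh_hz: float) -> float:
    """Time available per frame, in milliseconds, at a given refresh rate."""
    return 1000.0 / refresh_hz


def effective_fps(frame_ms: float, refresh_hz: float) -> float:
    """Frame rate under a simple vsync-locked model (an assumption).

    A frame that takes longer than one refresh interval is shown on the
    next interval it fits into, so the rate drops in whole steps.
    """
    intervals = max(1, math.ceil(frame_ms / frame_budget_ms(refresh_hz)))
    return refresh_hz / intervals


# Hypothetical example: a frame slightly over budget on an assumed 90 Hz headset.
print(round(frame_budget_ms(90.0), 2))
print(effective_fps(12.5, 90.0))
```

Under this model, going even slightly over the budget costs a whole refresh interval, which is why the fps appears to halve rather than dip a little.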

BUT I then packaged up the game, did a startFile-stopFile and took that into the frontend profiler.
I am confused because the numbers it’s reporting seem to say that the Game thread is fast enough and just idling,
AND also that the Rendering thread is fast enough. There’s a high Slate value, but this is not running in the editor, nor do I have any other 2D UI elements or stat commands running.

Game Thread

Rendering Thread

It’s like they’re both fast and fine but waiting for each other nonetheless.

Please can someone suggest what might be going on here?

Cheers

Hi,

The high GPU time indicates that you are GPU bound, not CPU bound. stat startfile/stopfile will only help with profiling CPU time (e.g., Game or Draw time, but not GPU time). Check out the documentation page "Timing Insights in Unreal Engine 5" (Unreal Engine 5.1 Documentation) and use profilegpu (with or without r.showmaterialdrawevents).
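
As a rough illustration of that reasoning, here is a small Python sketch that takes the three timings shown by stat unit and reports which one dominates, along with the tool mentioned above for that case. The function name and the sample timings are hypothetical; only the command names come from this thread.

```python
# Which profiling command suits which bottleneck, per the advice above.
TOOL_FOR = {
    "Game": "stat startfile / stat stopfile",
    "Draw": "stat startfile / stat stopfile",
    "GPU": "profilegpu",
}


def likely_bottleneck(game_ms: float, draw_ms: float, gpu_ms: float) -> str:
    """Return the name of the largest of the three stat unit timings."""
    timings = {"Game": game_ms, "Draw": draw_ms, "GPU": gpu_ms}
    return max(timings, key=timings.get)


# Hypothetical timings, not taken from the screenshot in the original post.
bound = likely_bottleneck(game_ms=4.0, draw_ms=5.0, gpu_ms=14.0)
print(bound, "->", TOOL_FOR[bound])
```

This is only a first-pass heuristic: when the threads wait on each other, the largest number tells you where to look first, not necessarily the whole story.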

Cheers,
Michael Noland

Hi Michael,

I have been looking at that too but thought it only worked in the editor. I now figured out how to read the log file from a packaged game run dump.

Would you be able to help me have a quick look and see if you think these numbers are high or that the resulting ms is roughly as expected?
(It’s a VR render on a GTX 970)

LogRHI: 100.0% 17.67ms FRAME 508 draws 6456210 prims 11295894 verts
LogRHI: 83.1% 14.68ms Scene 503 draws 6455804 prims 11295082 verts
LogRHI: 1.5% 0.26ms Dynamic 1 draws 170056 prims 110014 verts 642749 prims/ms 415813 verts/ms
LogRHI: 51.5% 9.09ms Other Children
LogRHI: 8.5% 1.50ms PrePass DDM_AllOpaque (Forced by ForwardShading) 98 draws 3221430 prims 4726042 verts
LogRHI: 16.4% 2.90ms SlateUI 3 draws 402 prims 804 verts
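
For anyone wanting to sort a dump like this by cost, here is a minimal Python sketch that parses the lines above and ranks them by milliseconds. The regular expressions are an assumption based only on the six lines pasted here (including the first line's missing space after the percent sign), not on any documented log format, and the indentation that normally shows the parent/child hierarchy has been lost in the paste, so FRAME and Scene are totals that contain the rows below them.

```python
import re

SAMPLE = """\
LogRHI: 100.0%17.67ms FRAME 508 draws 6456210 prims 11295894 verts
LogRHI: 83.1%14.68ms Scene 503 draws 6455804 prims 11295082 verts
LogRHI: 1.5% 0.26ms Dynamic 1 draws 170056 prims 110014 verts 642749 prims/ms 415813 verts/ms
LogRHI: 51.5% 9.09ms Other Children
LogRHI: 8.5% 1.50ms PrePass DDM_AllOpaque (Forced by ForwardShading) 98 draws 3221430 prims 4726042 verts
LogRHI: 16.4% 2.90ms SlateUI 3 draws 402 prims 804 verts
"""

# Percent, milliseconds, then everything else on the line.
LINE_RE = re.compile(r"LogRHI:\s*(?P<pct>[\d.]+)%\s*(?P<ms>[\d.]+)ms\s+(?P<rest>.*)")
# Trailing draw/prim/vert statistics, stripped to leave just the pass name.
STATS_RE = re.compile(r"\s+\d+ draws\b.*$")


def parse_profilegpu(log_text: str):
    """Return a list of (pass_name, ms, percent) tuples, in log order."""
    rows = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip anything that does not look like a timing line
        name = STATS_RE.sub("", match.group("rest")).strip()
        rows.append((name, float(match.group("ms")), float(match.group("pct"))))
    return rows


# Rank passes from most to least expensive.
ranked = sorted(parse_profilegpu(SAMPLE), key=lambda row: row[1], reverse=True)
for name, ms, pct in ranked:
    print(f"{ms:6.2f} ms  {pct:5.1f}%  {name}")
```

Ignoring the FRAME and Scene totals, the largest single entry is "Other Children", which suggests most of the time sits in passes not broken out in this excerpt.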

Thanks

Are you profiling the GPU in the editor? If so you won’t get accurate results - it’s better to profile in a standalone instance. Currently, Slate is taking almost 3ms, but that’s likely due to the editor and not your game.

I’d recommend profiling in Standalone first. What kind of things are in the scene?

500 draw calls and 11 million vertices seem pretty high for a normal game, and extremely high for a VR game.

Thanks for your comments!
I have recently moved to testing only cooked builds, as the results seem just too random when doing it in PIE. Also, like you say, Jamsh, I get things like the Slate times, which are confusing.
I yesterday found the excellent performance Youtube videos by TechArtAid and have started using the Intel Graphics Monitor to pinpoint render costs in a much more clear and detailed way. Awesome tool for anyone interested and the videos too.
I am also starting to strip out shaders and turn off lights to see what changes, but unfortunately I don’t see huge gains. I’m starting to think that maybe it’s just too much in there?

(BTW, that 3ms slate time was actually from a cooked build and I was surprised to see that item as I don’t think I am using any slatey things in the game)

But then again even just a low poly floor with a simple shader is 0.3ms. It is covering maybe 30% of the screen but I would have expected such a low poly object with very basic shader would render faster than that.

The scene is a boxing/mma style arena. (no crowds yet). Shaders I’m generally trying to keep as basic as I can but maybe there’s a few too many/dense triangles here and there that I need to get rid of.
There are a lot of lights of various kinds. Many are set to ‘Stationary’, but I keep the influence radius very low (around 1 meter at real-world scale, which ends up being small in pixels covered).
Bunch of spotlights have translucent fog cones but I don’t see the translucency item on the GPU profiler reporting high ms times.

Currently, in the Intel tool, I don’t have many time-spiking assets left as I have been stripping stuff out, but I still see ~16ms GPU time.

Work continues…

I recommend that you profile something which is already good and VR ready, so you can see the numbers produced by something already optimized and compare its environment, lights, assets and materials with yours. I suggest going to the Modding tab in Epic’s Launcher, downloading Robo Recall, playing with it a bit and profiling it. You might discover the problem quickly by doing so; otherwise you will need to share some parts of your project for others to look at, but you will not learn as effectively that way.

Good luck!

@NilsonLima

Great idea! I’ll do that right away!

Are you using intel hd graphics? That’d be why it’s slow :smiley:

Haha no I am not. Thanks for the suggestion though.
It’s not THAT slow, just too slow for my needs. :slight_smile:

The graphics monitor is a general GPU tool not tied to Intel graphics hardware. It’s very cool I recommend taking a look.

[QUOTE=Fredrum;748779]
Haha no I am not. Thanks for the suggestion though.
It’s not THAT slow, just too slow for my needs. :slight_smile:

The graphics monitor is a general GPU tool not tied to Intel graphics hardware. It’s very cool I recommend taking a look.
UE4 Graphics Profiling: Intel Frame Analyzer - YouTube
[/QUOTE]

Ah, looks very interesting! The intel site made it seem like this can only be used with their integrated graphics.