GPU performance tuning of "HZB SetupMips" (As seen in GPU Visualizer)

Now that we can see the internal JIRA tickets (thank you!), I noticed that this is a Won’t Fix because it’s blamed on DX11. We see the same problem on OpenGL as well, so waiting on DX12 won’t really help us. Is there any plan to dig into the problem a bit more?

Another data point: when I installed RenderDoc, the problem returned. Coincidentally, when I last fixed the problem by rebuilding my whole project, I hadn’t reinstalled RenderDoc, so the fix may very well have been uninstalling RenderDoc. It’s reproducible for me now, though: HZB goes through the roof with RenderDoc installed and goes back to normal with RenderDoc removed. But man, I’ve got to have RenderDoc! Does anybody have any clue what the real “DX11” issue is?

It seems this can also affect 4.11.2, but only in an empty level in the Editor (not PIE). Merely a curiosity at this point, but it may help track down the root cause.

According to the JIRA ticket, this only affects the profiler view and doesn’t actually affect performance? So VR Preview should still run at full speed when this is happening? Can anyone confirm that? I thought I was seeing some performance differences, but I’ll have to go back and test (for me HZB was taking around 4ms in the profiler, not as extreme as some others have reported).

VR Preview is definitely affected by this, at least under the condition I see. As I mentioned earlier, the error occurs for me when I have the RenderDoc plugin installed (regardless of whether I’m running it), and it kills my VR Preview. If I remove RenderDoc, the issue seems to go away, though that is hardly a solution since I have no idea how to debug my shaders without RenderDoc.

Hello, I just stumbled across this thread because I’m having the exact same issue with 4.12.5.
I’ve just checked the JIRA ticket and it’s tagged as “Won’t Fix”. What are we supposed to do now? It worked with 4.11 and stopped working with 4.12.
I don’t understand what DX11 has to do with any of this if it was working in the previous version.

I’ve come across exactly the same issue in 4.12: I’m seeing both BIG and wildly DIFFERENT numbers in ‘HZB SetupMips Mips 1…9 512x256’.

As you may know, the bug has been acknowledged on the Unreal Engine Issues and Bug Tracker (UE-33448), and it seems that the plan is just to wait until DX12…

The thing is, DX11 could be around for a long time and remain a huge part of the user base, because DX12 is to be Windows 10 only (as far as I know). I love UE4, but this issue seems worth looking at.

What’s especially puzzling is that I’ve only been comparing the GPU performance of two identical, untouched copies of the default Flying game, using ‘profilegpu’. I’ve been using these as benchmarks to compare with my game (which is just a slightly advanced version of it).

So I have one BADLY performing benchmark version and one VERY-BADLY-performing benchmark version, from the same source (not counting my own game in this discussion). In case you are interested, I’ve cropped out the two GPU profiles of my two benchmarks and put them side by side in the attached image. The left image is HIGH and the right image shows HUGE in ‘HZB SetupMips Mips 1…9 512x256’ (underlined in yellow in the picture). The difference is constantly about 5.00 ms.

What puzzles me is that the two benchmarks are identical projects, untouched by me, straight from hitting the ‘Flying’ project button, with no intervention. I’ve hackily tried copying and pasting the high-performing benchmark version… but this 3rd copy shows similar performance to the worst-performing benchmark version. I also tried creating new Flying projects, again untouched, but these copies were the same.

As you’ll understand, I’d like to keep the high-performing version… but it might just get erased or corrupted. As for everyone else here, it would really help if I could get to the root of this, so I can keep ‘HZB SetupMips Mips 1…9 512x256’ nice and quick.

Looks like the issue was closed as “won’t fix”. That’s too bad, I have the same issue.

I have bumped into this thing, with around 10-15 ms wasted randomly on HZB SetupMips. I found a strange solution: I just needed to change the screen percentage on the fly, so typing HMD SP X got rid of the nasty stuff. On other levels I had a similar issue, but in that case there was a wild ShadowDepths thing eating around the same amount. That could be eliminated by changing the SP too. If it doesn’t work the first time, just try switching around multiple times, going back and forth between SP settings. It’s a good idea to attach the console command to a key so you just need to press it. Really weird stuff.

I know this is way too hacky but at least it’s a temporary solution. Also I was using 4.13 with CV1.

Hi guys,

This problem affects Sequencer in Unreal 4.13 so severely that it is not usable. See the attached screenshot, where it is using up 150ms+ of GPU time.

Are you sure that there is a relation between the two, other than that they both run? Have you tried changing the screen percentage at runtime? :)

Hey

Just a heads up that in our tests, we found that “HZB SetupMips Mips 512x256” hits about 2ms in the standalone game and close to 4ms in the editor. So it’s about halved, but still has quite an impact (9% of the scene in standalone).

I can confirm that in 4.14, it definitely affects VR performance.
I really need this fixed before I release my game on Steam!

Yeah. Having the same problem still in 4.14 on a fresh project. It uses 4ms out of 11ms total. That’s crazy.
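To put the numbers quoted in this thread side by side, here is a minimal Python sketch of the arithmetic. The function names are made up for illustration, and the inputs are simply the figures posters reported above (4ms of an 11ms frame; 2ms described as 9% of the scene), not new measurements.

```python
def share_of_frame(pass_ms: float, frame_ms: float) -> float:
    """Return the percentage of the frame time spent in one GPU pass."""
    if frame_ms <= 0:
        raise ValueError("frame_ms must be positive")
    return 100.0 * pass_ms / frame_ms


def implied_total_ms(pass_ms: float, share_percent: float) -> float:
    """Back out the total time from a pass time and its reported share."""
    if share_percent <= 0:
        raise ValueError("share_percent must be positive")
    return pass_ms * 100.0 / share_percent


# Figures as reported by posters in this thread.
print(f"4ms of an 11ms frame: {share_of_frame(4, 11):.1f}% of the frame")
print(f"2ms at 9% implies a scene of about {implied_total_ms(2, 9):.1f}ms")
```

So 4ms out of 11ms is a bit over a third of the whole frame, which is why it hurts so much in VR.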

Hi Guys!

So, I did a bunch of digging, and it appears this is not actually a problem with generating HZB mipmaps. It is actually time spent in the FSlateApplication::DrawWindowAndChildren function leaking into spurious stat counters.

To figure this out, I turned off all the relevant parts for this issue (HZB, SSAO, SSR), which caused a nearly identical amount of time to show up in the ClearTranslucentVolumeLighting GPUStat instead. When this happened, I decided to go to the CPU Profiler to find out what was going on. It was then that I noticed a correlation between the CPU stat counter of the Slate drawing and the time of the ClearTranslucentVolumeLighting GPUStat.
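For anyone who wants to check that kind of correlation with numbers rather than by eye, here is a rough Python sketch. It assumes you have copied per-frame timings out of the profilers by hand; the sample values below are invented placeholders, not measurements from the engine, and `pearson` is just a plain Pearson correlation helper written for this post.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


# HYPOTHETICAL per-frame samples in ms, for illustration only.
slate_cpu_ms = [1.2, 2.5, 4.1, 6.0, 8.3]   # CPU time in Slate drawing
gpu_stat_ms = [1.0, 2.7, 4.0, 6.4, 8.1]    # time shown under the GPU stat

print(f"correlation: {pearson(slate_cpu_ms, gpu_stat_ms):.3f}")
```

A value close to 1 would support the idea that the GPU stat is tracking Slate's CPU cost, though correlation alone does not prove where the time is really being spent.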

The best way to check this is to take a Sequencer Recording of an Actor’s transform for at least 10 seconds, with it moving constantly, then open the Recording in Sequencer.

The SSequencerObjectTrack::OnPaint causes a very significant performance impact, as it is seriously inefficient and renders very large quantities of Slate Elements.

It would be awesome if a few others could verify these findings.

Regards,

Gossy

Also, by logical extension, this could mean other rendering jobs are leaking into this stat counter, hiding other potential performance problems.

Just to confirm, this step helped in our case. We need neither ambient occlusion nor SSR, so at the moment we are fine with it.

Just want to bump this. Out of nowhere my project (UE 4.15) all of a sudden shows 80ms per frame, of which 70 are “HZB”. Come on guys, and you won’t even fix this?

I’m in the same case, using DirectX12… “Not much we can do about it until we are on DX12.”
And then?

Some tests later: Instanced Stereo mode lowers the base pass time from 3.16 to 1.03, but raises the HZB time from 2.8 to 4.2.
With 4.16 Preview 3, the results are similar with or without Instanced Stereo: 2.6-3 for the base pass, 4.08-4.18 for HZB.
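To see what those Instanced Stereo numbers add up to, here is a small Python sketch of the arithmetic. It assumes the figures are in milliseconds (the post does not state the unit) and only covers the two passes mentioned; the dictionary keys are labels invented for this example.

```python
# Timings quoted above, assumed to be in ms (unit not stated in the post).
timings = {
    "instanced_stereo_off": {"base_pass": 3.16, "hzb": 2.8},
    "instanced_stereo_on": {"base_pass": 1.03, "hzb": 4.2},
}


def combined_ms(passes):
    """Sum the time of every listed pass for one configuration."""
    return sum(passes.values())


off_ms = combined_ms(timings["instanced_stereo_off"])
on_ms = combined_ms(timings["instanced_stereo_on"])
net_saving_ms = off_ms - on_ms

print(f"off: {off_ms:.2f}  on: {on_ms:.2f}  net saving: {net_saving_ms:.2f}")
```

In other words, most of what the base pass gains appears to be handed back to the HZB entry, so the net saving across the two is well under a millisecond.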