Intermittent Invisible Cluster Issue on PGO-Optimized builds

We have run into a peculiar issue with Nanite on our PGO-optimized Test config builds.

Roughly 1 in 4 times we load into our main level, a number of clusters in our Nanite meshes are invisible. Every mesh is affected, and the set of invisible clusters changes as you move towards and away from the meshes, as you’d expect. We’ve also found that the issue is easily corrected at runtime by creating an instance of FGlobalComponentRecreateRenderStateContext. A number of console commands do this, and any of them will do.
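For readers unfamiliar with the workaround, here is a minimal standalone sketch of the scoped pattern it relies on. It is not engine code: MockComponent, ScopedRecreateAllRenderState and ForceRecreateAllRenderState are hypothetical stand-ins, written on the assumption that FGlobalComponentRecreateRenderStateContext tears down render state in its constructor and rebuilds it in its destructor.

```cpp
#include <vector>

// Hypothetical stand-in for a component that owns render-thread state.
struct MockComponent {
    bool bRenderStateCreated = true;
    int RecreateCount = 0;
};

// Hypothetical stand-in for the scoped behaviour assumed of
// FGlobalComponentRecreateRenderStateContext: the constructor destroys the
// render state of every component, the destructor recreates it.
class ScopedRecreateAllRenderState {
public:
    explicit ScopedRecreateAllRenderState(std::vector<MockComponent>& InComponents)
        : Components(InComponents) {
        for (MockComponent& Component : Components) {
            Component.bRenderStateCreated = false;
        }
    }

    ~ScopedRecreateAllRenderState() {
        for (MockComponent& Component : Components) {
            Component.bRenderStateCreated = true;
            ++Component.RecreateCount;
        }
    }

    // A scope guard should not be copied; two destructors would rebuild twice.
    ScopedRecreateAllRenderState(const ScopedRecreateAllRenderState&) = delete;
    ScopedRecreateAllRenderState& operator=(const ScopedRecreateAllRenderState&) = delete;

private:
    std::vector<MockComponent>& Components;
};

// The whole "fix" is the lifetime of one local object.
void ForceRecreateAllRenderState(std::vector<MockComponent>& Components) {
    ScopedRecreateAllRenderState Context(Components);
    // Nothing else is needed here; leaving scope triggers the rebuild.
}
```

In the engine, the equivalent would presumably be a single local FGlobalComponentRecreateRenderStateContext inside a console command or debug hook, which is why any command that constructs one has the same effect.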

My hunch is that there is some kind of race condition that PGO is bringing to the surface.

Can anyone provide a lead to look into, a potential fix (or workaround)?

Hello,

Thank you for reaching out.

Can you please send us a minimal test project that demonstrates this?

The guide for test projects: [Content removed]

Hi Jonny,

I reached out to some folks internally to assist you with tracking down this issue. I am still waiting to hear back from them, but in the meantime, does the issue also occur with -PGOProfile enabled? The reason I am asking is that we also have -PGOFastGen enabled on Fortnite, and I have not heard of any issues with Nanite and PGO so far. I will update you as soon as I have more information.

Cheers,

Tim

Hello,

We’re not aware of any issues with this currently, but this could be related to GPUScene, given that it’s fixed with FGlobalComponentRecreateRenderStateContext. If PGO is corrupting the bounds or Max WPO displacement, that could affect culling and cause what you are seeing. Can you compare your GPUScene primitive and instance data between the two in PIX or RenderDoc to narrow down which calculation is going wrong?
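As a rough illustration of that comparison, the sketch below diffs two sets of per-primitive values copied by hand out of a capture and reports which fields disagree. PrimitiveSnapshot and its field names are hypothetical and do not match the real GPUScene layout; the tolerance is also an arbitrary assumption.

```cpp
#include <cmath>
#include <string>
#include <vector>

// Hypothetical subset of per-primitive values read out of a GPU capture.
// The real GPUScene layout differs; these names are for illustration only.
struct PrimitiveSnapshot {
    float BoundsCenter[3];
    float BoundsExtent[3];
    float MaxWPODisplacement;
};

// NaN never satisfies the comparison, so corrupted values are always reported.
static bool NearlyEqual(float A, float B, float Tolerance) {
    return std::fabs(A - B) <= Tolerance;
}

// Returns the names of the fields that differ between the two snapshots.
std::vector<std::string> DiffSnapshots(const PrimitiveSnapshot& Broken,
                                       const PrimitiveSnapshot& Fixed,
                                       float Tolerance = 1e-3f) {
    static const char* AxisNames[3] = {"X", "Y", "Z"};
    std::vector<std::string> Mismatches;

    for (int Axis = 0; Axis < 3; ++Axis) {
        if (!NearlyEqual(Broken.BoundsCenter[Axis], Fixed.BoundsCenter[Axis], Tolerance)) {
            Mismatches.push_back(std::string("BoundsCenter.") + AxisNames[Axis]);
        }
        if (!NearlyEqual(Broken.BoundsExtent[Axis], Fixed.BoundsExtent[Axis], Tolerance)) {
            Mismatches.push_back(std::string("BoundsExtent.") + AxisNames[Axis]);
        }
    }

    if (!NearlyEqual(Broken.MaxWPODisplacement, Fixed.MaxWPODisplacement, Tolerance)) {
        Mismatches.push_back("MaxWPODisplacement");
    }
    return Mismatches;
}
```

Running the same primitive through this before and after the render state recreation would point at the field to trace back through the CPU-side calculation, assuming the corruption shows up in the uploaded data at all.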

Ok, no problem. I will keep this ticket pending, since our platform can ping you in two weeks if there is no answer. Hopefully, that will give you a chance to track down a repro with QA. Please keep us posted once you have some more information.

Hi there. I’m still waiting on our QA team to reproduce this issue. Can we please keep this open for now?

Hi, I can reset this case to go back into pending. We prefer not to have the case stay in the open state since there is nothing actionable on my end, and our platform will continuously ping us if we do not answer. I hope that is okay with you.

Hi Kevin. This is an extremely niche issue. It’s only something we’ve seen in our sub-level loaded Nanite meshes on Test config builds with PGO enabled. It is unlikely to be reproducible without a lot of effort.

Is there any chance you could provide some insight into what could cause some Nanite meshes to partially render and require a full flush of all FPrimitiveSceneInfo instances via FGlobalComponentRecreateRenderStateContext?

Hello,

Changes that affect the quality of materials, meshes, etc. on startup can trigger render state recreation across the board. Most commonly this comes from Device Profiles or Scalability settings applied on startup.

However, what is triggering it in your case initially is not clear based on the current context.

Given the niche nature of the issue, can you answer some questions to help debug the issue?

  • What platform(s) do you see the issue on?
  • Are you using other optimization flags in addition to PGO?

The issue we have is that Nanite cluster rendering is badly broken *unless* we deliberately create an instance of FGlobalComponentRecreateRenderStateContext.

We are seeing the issue on Windows and are not using any optimization flags other than PGO. We used PGOFastGen when we created our instrumented build.

The issue does not occur on an instrumented build. It only happens on an optimized build.

Thanks Alex. I spent a lot of time inspecting the FPrimitiveSceneInfo data before and after the “fix” but couldn’t find an issue. Let me look specifically at bounds and WPO displacement.

Unfortunately, this issue no longer reproduces with our latest builds, so I’m not able to investigate with PIX. I am trying to get some time from QA to see if we can find a consistent repro in our latest build, but it will take a week or so.