Instance culling after being enabled does not seem to do anything

I am working on a test project that targets mobile and PC, and I want to use instance culling (GPU-based occlusion culling) on both platforms.

These are the console variables (CVars) that I modified for instance culling.

; Instance culling
r.AllowOcclusionQueries=0
r.InstanceCulling.OcclusionCull=1

; GPU Scene on Mobile
r.Mobile.SupportGPUScene=1

; Z Pre Pass on Mobile(I think)
r.Mobile.EarlyZPass=1

To test instance culling, I created a scene and placed an occluder in front of the camera, but I did not notice any change in FPS.
I captured a frame in RenderDoc and it does seem to be performing instance culling: it builds the HZB, it logs events related to instance culling, and the draw calls are DrawIndexedInstanced. But even when all the objects in the scene were occluded, the number of draw calls was still massive. (I later figured out that with instance culling some draw calls may be empty, so the number of draw calls is not a reliable indicator of whether instance culling is working.)

Also, when I freeze rendering and move the camera behind the occluder, I can see the objects being rendered. I don’t know whether this is a limitation of the debugging mode (it may not handle instance culling) or whether instance culling is simply not working. (Since instance culling uses indirect draw calls, the CPU does not know which objects are culled, so we can’t freeze rendering to check whether instance culling is working.)

The only way to know whether instance culling is working is to check the DrawIndexedInstanced events in RenderDoc: if the second argument is 0, it is an empty draw call and the object was successfully culled. In my case the second argument is not 0, meaning instance culling is not working.

Additional Information:
Even after enabling instance culling, the engine does not seem to want to perform it. I obtained the results below by commenting out the following return statements in InstanceCullingOcclusionQuery.cpp:

uint32 FInstanceCullingOcclusionQueryRenderer::Render(
	FRDGBuilder& GraphBuilder,
	FGPUScene& GPUScene,
	FViewInfo& View)
{
	// SA: Had to comment the return statements, otherwise instance culling would early exit
	if (!IsCompatibleWithView(View))
	{
		// return 0;
	}

	const uint32 ViewMask = RegisterView(View);

	if (ViewMask == 0)
	{
		// Silently fall back to no culling when we hit the limit of maximum supported views
		// return 0;
	}

        .....
}

RenderDoc events:
[image]

Occlusion culling can be tricky to set up properly. I agree that putting an occluder in front of the camera should do “something”, but that’s not always the case. You seem to be using the right tools to check occlusion culling performance, so I think you’re right that it’s not doing anything for you. That said, you mentioned GPU culling (also known as hardware culling), yet you have AllowOcclusionQueries=0, which turns off hardware culling according to the docs. Furthermore, not all mobile devices support hardware culling anyway.

Then there’s the whole other issue of instance culling, which is a slightly separate beast. Have you adjusted the occlusion culling distance for the instances? I’m thinking of something like this: Unreal Engine 5 3 Instance Start Cull Distance (youtube.com)

Getting occlusion culling right cross-platform can be tricky for sure.

Thank you for your response, it gives me a few topics that I can look into.

There were a few mistakes in my initial testing methodology, so I will start with those.

Corrections

  • With instance culling some draw calls may be empty, so we can’t rely on the number of draw calls as an indicator of whether instance culling is working.
  • Since instance culling uses indirect draw calls, the CPU does not know which objects are culled, meaning we can’t freeze rendering to check if instance culling is working.

The only way to know whether instance culling is working is to check the DrawIndexedInstanced events in RenderDoc: if the second argument is 0, it is an empty draw call and the object was successfully culled. In my case the second argument is not 0, meaning instance culling is not working.

Reasoning behind my usage of the CVars

As far as I can tell, there are two different hardware culling methods in Unreal: occlusion culling using GPU queries, and instance culling.

I want to use instance culling. Based on my understanding so far, with instance culling a compute shader performs the culling, and indirect draw calls are then used to either draw or cull objects. Because indirect draw calls are used, the result of the culling process does not need to be sent back to the CPU (the result is stored in a buffer on the GPU). This reduces latency, which happens to be the biggest problem with hardware occlusion queries.
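To make that flow concrete, here is a minimal CPU-side sketch of the idea, not Unreal’s implementation. The argument struct follows the field order of D3D12_DRAW_INDEXED_ARGUMENTS and VkDrawIndexedIndirectCommand; InstanceData, CullInstances and the single OccluderDepth value are hypothetical stand-ins for the per-instance bounds, the culling compute shader and the HZB lookup. It also assumes a plain (non-reversed) depth range purely to keep the comparison readable.

```cpp
#include <cstdint>
#include <vector>

// Same field order as D3D12_DRAW_INDEXED_ARGUMENTS / VkDrawIndexedIndirectCommand.
struct DrawIndexedIndirectArgs {
    uint32_t IndexCountPerInstance;
    uint32_t InstanceCount;          // the "second argument" seen in a capture
    uint32_t StartIndexLocation;
    int32_t  BaseVertexLocation;
    uint32_t StartInstanceLocation;
};

// Hypothetical per-instance record used only for this illustration.
struct InstanceData {
    uint32_t DrawIndex;     // which indirect draw this instance belongs to
    float    NearestDepth;  // assumption: 0 = near plane, 1 = far plane
};

// Stand-in for the culling compute pass. A real HZB test samples a depth
// pyramid over the instance's screen-space bounds; here one occluder depth
// is assumed to cover the whole screen.
void CullInstances(const std::vector<InstanceData>& Instances,
                   float OccluderDepth,
                   std::vector<DrawIndexedIndirectArgs>& Args)
{
    // Reset the instance counts, then re-add only the visible instances.
    for (DrawIndexedIndirectArgs& A : Args) {
        A.InstanceCount = 0;
    }
    for (const InstanceData& I : Instances) {
        const bool bVisible = I.NearestDepth <= OccluderDepth;
        if (bVisible) {
            ++Args[I.DrawIndex].InstanceCount;
        }
    }
    // The draw would then consume Args directly from the GPU buffer.
    // A draw whose InstanceCount stayed 0 is still issued, but renders nothing.
}
```

The point of the sketch is that the number of draws never changes; only the instance count inside each argument record does, which is why nothing has to be read back to the CPU.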

Unreal, however, does send the result back to the CPU in the form of a visibility mask. I can’t figure out why, because it does not seem to be used anywhere (although from what I remember, it is added as a pass parameter). The code can be found in MobileShadingRenderer.cpp > FMobileSceneRenderer::RenderHZB.

My assumption is that with the following settings:

r.AllowOcclusionQueries=0
r.InstanceCulling.OcclusionCull=1

I am enabling instance culling and disabling hardware occlusion queries. Again, all of this is based on my current understanding and I may be wrong.

I managed to enable instance culling. It turns out I had forgotten to add the CVar that enables instance culling queries: r.InstanceCulling.OcclusionQueries=1
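Since the whole problem came down to one missing line, a small standalone checker like the one below could catch it earlier. This is only a sketch: ParseCVars and MissingInstanceCullingCVars are hypothetical helper names, and the "required" list contains just the two CVars named in this thread, not an official list from Epic.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Strip leading and trailing spaces, tabs and carriage returns.
static std::string Trim(const std::string& S)
{
    const char* Whitespace = " \t\r";
    const auto Begin = S.find_first_not_of(Whitespace);
    if (Begin == std::string::npos) {
        return "";
    }
    const auto End = S.find_last_not_of(Whitespace);
    return S.substr(Begin, End - Begin + 1);
}

// Parse "name=value" lines, skipping blank lines, ';' comments and
// anything without an '=' (for example section headers).
std::map<std::string, std::string> ParseCVars(const std::string& IniText)
{
    std::map<std::string, std::string> CVars;
    std::istringstream In(IniText);
    std::string Line;
    while (std::getline(In, Line)) {
        Line = Trim(Line);
        if (Line.empty() || Line[0] == ';') {
            continue;
        }
        const auto Eq = Line.find('=');
        if (Eq == std::string::npos) {
            continue;
        }
        CVars[Trim(Line.substr(0, Eq))] = Trim(Line.substr(Eq + 1));
    }
    return CVars;
}

// Return the CVars from this thread that are absent or not set to 1.
std::vector<std::string> MissingInstanceCullingCVars(const std::string& IniText)
{
    static const char* Required[] = {
        "r.InstanceCulling.OcclusionCull",
        "r.InstanceCulling.OcclusionQueries",
    };
    const auto CVars = ParseCVars(IniText);
    std::vector<std::string> Missing;
    for (const char* Name : Required) {
        const auto It = CVars.find(Name);
        if (It == CVars.end() || It->second != "1") {
            Missing.push_back(Name);
        }
    }
    return Missing;
}
```

Run against the CVars from my original post, this would report r.InstanceCulling.OcclusionQueries as missing.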

Another correction for my original post:

When instance culling is working, the draw call to check is IndirectDrawIndexed(<v1, v2>), not DrawIndexedInstanced. If instance culling is working, v2 should be 0 for a culled object.
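For anyone who wants to run this check over an exported event list instead of clicking through the capture, here is a rough sketch. It assumes the event name has the IndirectDrawIndexed(<v1, v2>) shape quoted above; TryParseIndirectDraw and IsEmptyIndirectDraw are hypothetical helpers, not part of RenderDoc or Unreal.

```cpp
#include <cstdio>
#include <string>

// Extract v1 and v2 from an event name of the form "Name(<v1, v2>)".
// Returns false if the name does not contain that pattern.
bool TryParseIndirectDraw(const std::string& EventName, unsigned& V1, unsigned& V2)
{
    const auto Open = EventName.find('<');
    if (Open == std::string::npos) {
        return false;
    }
    // Whitespace in the format matches any amount of whitespace, including none.
    return std::sscanf(EventName.c_str() + Open, "<%u ,%u", &V1, &V2) == 2;
}

// True when the event is an indirect draw whose second value (v2) is 0,
// i.e. the draw was issued but renders nothing.
bool IsEmptyIndirectDraw(const std::string& EventName)
{
    unsigned V1 = 0;
    unsigned V2 = 0;
    return TryParseIndirectDraw(EventName, V1, V2) && V2 == 0;
}
```

A name without the angle-bracket arguments, such as a plain direct draw, is simply reported as not being an empty indirect draw.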