Optimisation for culling in RuntimeVirtualTexture::GatherMeshesToDraw

We had an issue (multi-second delays waiting for mesh/grass scatter to occur, because Landscape was taking a long time to resolve detail), which we temporarily resolved by increasing the value of `r.VT.MaxUploadsPerFrame`. Unfortunately, this caused a lot of time to be spent in `RuntimeVirtualTexture::GatherMeshesToDraw` on the Render Thread.

When I looked at what was happening, I noticed a lot of time was being spent in `FConvexVolume::IntersectSphere` and that there was a TODO comment around investigating a cheaper test for intersections.

Anyway, I had a go and managed to save a bunch of time here, so I thought you might be interested (and, more importantly, might be able to tell me if I’m doing something wrong).

I changed:

for (const int32 PrimitiveIndex : PrimitiveIndices)
{
	//todo[vt]: In our case we know that frustum is an oriented box so investigate cheaper test for intersecting that
	const FSphere SphereBounds = Scene->PrimitiveBounds[PrimitiveIndex].BoxSphereBounds.GetSphere();
	if (!View->ViewFrustum.IntersectSphere(SphereBounds.Center, SphereBounds.W))
	{
		continue;
	}
...

to:

// We are guaranteed that View->ViewFrustum is such that:
// * View->ViewFrustum.PermutedPlanes.Num() is either 0 or 4
// * View->ViewFrustum.PermutedPlanes[0] is { -1.0, 1.0, 0.0, 0.0 } when it exists
// * View->ViewFrustum.PermutedPlanes[1] is { 0.0, 0.0, -1.0, 1.0 } when it exists
// * View->ViewFrustum.PermutedPlanes[2] is { 0.0, 0.0, 0.0, 0.0 } when it exists
//
// This means that we can greatly simplify the calculations done in FConvexVolume::IntersectSphere
if (View->ViewFrustum.Planes.IsEmpty())
{
	return;
}
const VectorRegister PlanesXY = MakeVectorRegister(-1.0, 1.0, -1.0, 1.0);
const VectorRegister PlanesW = VectorMultiply(VectorLoadAligned(&View->ViewFrustum.PermutedPlanes[3]), MakeVectorRegister(-1.0, -1.0, -1.0, -1.0));

for (const int32 PrimitiveIndex : PrimitiveIndices)
{
	const FSphere SphereBounds = Scene->PrimitiveBounds[PrimitiveIndex].BoxSphereBounds.GetSphere();

	// see FConvexVolume::IntersectSphere for details of replacing !View->ViewFrustum.IntersectSphere(SphereBounds.Center, SphereBounds.W)
	const VectorRegister OrigXY = VectorSwizzle(VectorLoadFloat2(reinterpret_cast<const double*>(&SphereBounds)), 0, 0, 1, 1);
	const VectorRegister VRadius = VectorLoadFloat1(&SphereBounds.W);
	const VectorRegister DistXY = VectorMultiplyAdd(OrigXY, PlanesXY, PlanesW);
	if (VectorAnyGreaterThan(DistXY, VRadius) != 0)
	{
		continue;
	}
...
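
For anyone checking the equivalence, here is a minimal scalar sketch of the same idea outside the engine. It assumes the four plane normals are (-1,0,0), (1,0,0), (0,-1,0) and (0,1,0) in that order, as the comment above states, and that a sphere is rejected when its signed distance to any plane exceeds its radius. `Plane`, `Sphere`, `MakeBoxPlanes` and both test functions are hypothetical stand-ins, not Unreal types or APIs.

```cpp
#include <array>

// Hypothetical stand-ins for the engine types (not FPlane / FSphere).
struct Plane  { double X, Y, Z, W; };       // signed distance = dot(N, P) - W
struct Sphere { double X, Y, Z, Radius; };

// General test: reject the sphere if it lies entirely outside any plane.
inline bool IntersectsGeneral(const std::array<Plane, 4>& Planes, const Sphere& S)
{
    for (const Plane& P : Planes)
    {
        const double Dist = P.X * S.X + P.Y * S.Y + P.Z * S.Z - P.W;
        if (Dist > S.Radius)
        {
            return false;
        }
    }
    return true;
}

// Simplified test: the normals are assumed to be -X, +X, -Y, +Y in that
// order, so only the four W values are read and the Z term vanishes.
inline bool IntersectsAxisAlignedXY(const std::array<double, 4>& W, const Sphere& S)
{
    const double Dist[4] = { -S.X - W[0], S.X - W[1], -S.Y - W[2], S.Y - W[3] };
    for (const double D : Dist)
    {
        if (D > S.Radius)
        {
            return false;
        }
    }
    return true;
}

// Helper for experiments: planes bounding MinX <= x <= MaxX, MinY <= y <= MaxY.
inline std::array<Plane, 4> MakeBoxPlanes(double MinX, double MaxX, double MinY, double MaxY)
{
    return {{ { -1, 0, 0, -MinX }, { 1, 0, 0, MaxX }, { 0, -1, 0, -MinY }, { 0, 1, 0, MaxY } }};
}
```

Under those assumptions both functions return the same answer for any sphere, which is the property the SIMD version relies on. If the planes ever arrive in a different order or with non-unit normals, the simplified form no longer matches the general one.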

I’d really appreciate you letting me know if I’ve walked into a pitfall.


Hi Tom,

This seems reasonable. I think it won’t cull items that are within the XY bounds but outside Z bounds of the frustum? But maybe that is a worthwhile tradeoff for your (and most) content setups.
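
To make that trade-off concrete, here is a minimal scalar sketch of an XY-only overlap test next to a variant that also rejects on Z. Everything in it is hypothetical: `BoundsSphere`, `BoxXY`, `OverlapsXY`, `OverlapsXYZ`, and in particular the `MinZ`/`MaxZ` range, which nothing in this thread says the RVT view actually provides.

```cpp
// Hypothetical stand-ins, not Unreal types.
struct BoundsSphere { double X, Y, Z, Radius; };
struct BoxXY        { double MinX, MaxX, MinY, MaxY; };

// XY-only test: a sphere anywhere above or below the box still passes.
inline bool OverlapsXY(const BoxXY& B, const BoundsSphere& S)
{
    return S.X + S.Radius >= B.MinX && S.X - S.Radius <= B.MaxX
        && S.Y + S.Radius >= B.MinY && S.Y - S.Radius <= B.MaxY;
}

// Variant with two extra compares against an assumed Z range.
inline bool OverlapsXYZ(const BoxXY& B, double MinZ, double MaxZ, const BoundsSphere& S)
{
    return OverlapsXY(B, S)
        && S.Z + S.Radius >= MinZ && S.Z - S.Radius <= MaxZ;
}
```

A sphere centred over the box but far above it passes `OverlapsXY` and fails `OverlapsXYZ`. Whether the extra compares pay for themselves depends on how many primitives sit outside the Z range in practice.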

One further question it raises is whether your content has many primitives writing to the RVT that aren’t streamed or are somehow too granular. Your code optimization here is great, but it could be worth validating that there aren’t content optimizations on the table too.

Thanks for sharing this!

Jeremy


Thanks for taking a look!

Great shout on taking a look at the content. I think we may be using the Grass system in an odd way that is causing trouble (we are building a racing game where we travel very fast through large, detailed worlds, and there are an awful lot of requests for streaming/LODs/mips going on all the time). I’ll ask our TAs if they can think of any content levers that we can pull (they’re all much more knowledgeable about the various cool bits of Unreal tech than me).
