Knowledge Base: Garbage Collector Internals

First, make sure the UObject class allows itself to be placed in a cluster:

virtual bool CanBeInCluster() const override { return true; }

Then, on the parent UObject that contains all the child UObjects you don’t want the garbage collector to scan, call this single function. It automatically scans all subobjects recursively and adds each one for which CanBeInCluster returns true:

ParentObject->CreateCluster();
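To make the "scan recursively and add if allowed" idea concrete, here is a minimal standalone sketch. It is a toy model, not engine code: ToyObject, GatherClusterMembers and CountClusterMembers are hypothetical names, and the choice to skip an opted-out object without descending into it is an assumption made for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a UObject. This is NOT an engine type.
struct ToyObject {
    bool bCanBeInCluster = true;         // plays the role of the CanBeInCluster() override
    std::vector<ToyObject*> SubObjects;  // objects referenced by this one
    bool bVisited = false;               // guards against reference cycles
};

// Walks the references of Obj depth-first and records every sub-object that
// opts in to clustering. Objects that opt out are skipped and not descended
// into (an assumption of this sketch).
void GatherClusterMembers(ToyObject& Obj, std::vector<ToyObject*>& OutMembers) {
    for (ToyObject* Sub : Obj.SubObjects) {
        assert(Sub != nullptr);
        if (Sub->bVisited || !Sub->bCanBeInCluster) {
            continue;
        }
        Sub->bVisited = true;
        OutMembers.push_back(Sub);
        GatherClusterMembers(*Sub, OutMembers);
    }
}

// Returns how many sub-objects would end up in the toy cluster rooted at Root.
// The root itself is marked visited but not counted as a member.
std::size_t CountClusterMembers(ToyObject& Root) {
    std::vector<ToyObject*> Members;
    Root.bVisited = true;
    GatherClusterMembers(Root, Members);
    return Members.size();
}
```

In this model, a root with one clusterable child (which has its own clusterable child) and one opted-out child yields two members; anything reachable only through the opted-out child is left for the regular scan.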

If the automatic cluster creation is too slow for you, you can speed it up by creating the cluster manually and adding the subobjects explicitly:

// Run this on the parent object that contains all the sub-objects you want clustered.
FUObjectItem* RootItem = GUObjectArray.ObjectToObjectItem(this);
const int32 InternalIndex = GUObjectArray.ObjectToIndex(this);
const int32 ClusterIndex = GUObjectClusters.AllocateCluster(InternalIndex);
FUObjectCluster& Cluster = GUObjectClusters[ClusterIndex];

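// SetClusterIndex and the EInternalObjectFlags version of SetFlags belong to FUObjectItem, so call them on RootItem.
RootItem->SetClusterIndex(ClusterIndex);
RootItem->SetFlags(EInternalObjectFlags::ClusterRoot);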

// Optional, if you know in advance how many sub-objects you're adding, including this parent.
// This saves the array from having to resize all the time if you're adding many objects.
Cluster.Objects.Reserve(TotalSubobjectCount);

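// Cluster.Objects holds object indices (int32), so add the index rather than the pointer.
Cluster.Objects.Add(InternalIndex);
// Then call Add with GUObjectArray.ObjectToIndex(SubObject) for each of the sub-objects.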
	
// ClusterTestObject->CreateCluster() calls Sort. I don't know whether it's required, but presumably it does so for a reason.
Cluster.Objects.Sort();

I must admit that I was just parroting advice I heard years ago. While checking how to make GC clusters for you, I realized I wasn’t sure what the actual performance gain was, so I ran a test. The results aren’t as mind-blowing as I had hoped: it’s maybe a 50% time saving in a Shipping build for the clustered objects. The main reason it isn’t faster than that is how long MarkClusteredObjectsAsReachable takes.

In Development and Test builds my cluster was actually not faster at all because of GC verification. You can turn the verification off in those builds with the startup parameters -NoVerifyGC -DPCvars="gc.VerifyAssumptionsOnFullPurge=0". You can also pass -LogCmds="logGarbage verbose" to get the GC timings in the log for comparison.

This is what the results look like for me, with around 600k objects clustered and not much else happening in the scene:

Collecting garbage - No GC Cluster
0.019800 ms for MarkObjectsAsUnreachable Phase (34 Objects To Serialize)
17.532300 ms for Reachability Analysis
GC Reachability Analysis total time: 18.11 ms (18.11 ms on reference traversal)
18.11 ms for GC - 99367 refs/ms while processing 1799453 references from 598753 objects  with 0 clusters
Freed 65536b from 16 GC contexts
1.186699 ms for Gather Unreachable Objects (10 objects collected / 598762 scanned with 126 thread(s))
0.125203 ms for unhashing unreachable objects (10 objects unhashed)
GC purged 10 objects (598762 -> 598752) in 11.437ms (1 iteration(s))
Compacting FUObjectHashTables data took  21.28ms


Collecting garbage - GC Cluster
10.099500 ms for MarkObjectsAsUnreachable Phase (34 Objects To Serialize)
0.851799 ms for Reachability Analysis
GC Reachability Analysis total time: 11.47 ms (11.47 ms on reference traversal)
11.47 ms for GC - 508 refs/ms while processing 5840 references from 882 objects  with 1 clusters
0.002302 ms for Dissolve Unreachable Clusters (0/1 clusters dissolved containing 0 cluster objects)
Freed 65536b from 16 GC contexts
1.156200 ms for Gather Unreachable Objects (0 objects collected / 598762 scanned with 126 thread(s))
0.000097 ms for unhashing unreachable objects (0 objects unhashed)
GC purged 0 objects (598752 -> 598752) in 0.002ms (1 iteration(s))
Compacting FUObjectHashTables data took   7.52ms

Reachability analysis decreased from 17.5ms to 0.85ms.
But MarkObjectsAsUnreachable increased from 0.02ms to 10.1ms. So yeah, your mileage may vary and it might not be worth it. Maybe it’s better to just keep the UObject count low :man_shrugging:
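To put the trade-off into one number, here is a small sketch that recomputes the net saving from the two log excerpts above. The timings are copied from the logs; the struct and function names (GcTimings, PercentSaved, TwoPhaseSum, PrintSummary) are made up for this example.

```cpp
#include <cassert>
#include <cstdio>

// Timings in milliseconds, copied from the two log excerpts above.
struct GcTimings {
    double MarkUnreachableMs;  // "MarkObjectsAsUnreachable Phase"
    double ReachabilityMs;     // "Reachability Analysis"
    double TotalGcMs;          // "GC Reachability Analysis total time"
};

constexpr GcTimings NoCluster{0.019800, 17.532300, 18.11};
constexpr GcTimings WithCluster{10.099500, 0.851799, 11.47};

// Percentage of BeforeMs that was saved by going to AfterMs.
double PercentSaved(double BeforeMs, double AfterMs) {
    assert(BeforeMs > 0.0);
    return 100.0 * (BeforeMs - AfterMs) / BeforeMs;
}

// Sum of the two phases that moved in opposite directions.
double TwoPhaseSum(const GcTimings& T) {
    return T.MarkUnreachableMs + T.ReachabilityMs;
}

// Prints the savings for the logged total and for the two-phase sum.
void PrintSummary() {
    std::printf("total time saved: %.1f%%\n",
                PercentSaved(NoCluster.TotalGcMs, WithCluster.TotalGcMs));
    std::printf("mark + reachability saved: %.1f%%\n",
                PercentSaved(TwoPhaseSum(NoCluster), TwoPhaseSum(WithCluster)));
}
```

For this particular run, the logged totals (18.11 ms vs 11.47 ms) work out to a saving of roughly 37%, because most of the time removed from reachability analysis reappears in the mark phase.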
