[PCG] High memory usage of runtime PCG metadata assets

Hi there, I’m investigating the memory usage of PCG assets for our open world game. We’re seeing very high memory usage when loading the PCG metadata - up to 600MB in some cases. Our PCG cell size is set to the default of 51200.

More specifically, the allocations are being made from FPCGMetadataAttributeBase::Serialize called from UPCGMetadata::Serialize. The EntryToValueKeyMap allocates large amounts of memory for the hash table, tagged as PCGMetaData in the LLM stats.

For example, in one location during load we’re seeing 17 assets loaded, with 40-50 attributes per asset, and each attribute has up to 15,000 hash map entries costing 300KB (250MB in total). Surprisingly, this data doesn’t seem to be accessed or needed at runtime at all, and can be deleted immediately after load with no impact on the placement of the PCG assets in game.

Currently our PCGVolume Generation Trigger is set to ‘Generate on Demand’, as we were previously seeing CPU performance spikes when using Generate at Runtime.

Can I check:

1) Is it possible to remove the EntryToValueKeyMap hash table from the serialized runtime data as it seems to be unused?

2) Is this data tied to the Generate on Demand setting? What’s the difference between Generate on Load and Generate on Demand?

3) Is it possible to reduce the overhead of the data based on how the PCG network is set up? It looks like there is a lot of duplication of the hash table across multiple attributes.

For a memreport showing this issue, please see the attached CSV; the usage appears in the section ‘Obj List: -resourcesizesort’ for object PCGMetadata, with 137 objects and 763MB of data.

Many thanks,

Andrew

Hi Andy,

Thanks for reaching out.

There might be a few things at play here, so let me know if I haven’t covered your case.

I’ll answer your questions below, but these steps might help you out regardless:

1- Generation happens in the editor.

If the generation happens in the editor only, you can mark the component (or the PCG graph) as editor-only, and the data will be removed at cook, which might be a good idea if you don’t need it.

2- Generation happens at runtime (once on load)

It’s possible that you are serializing (or have serialized) data on the components that isn’t needed.

Any data that’s passed to the output of the “top” PCG graph (i.e. the one in the PCG component) is serialized on the component itself - this is used to be able to pass data between different components.

However, if you’re not relying on that data, you don’t need to serialize it - so simply don’t return it from the graph.

3- Generation happens at runtime (once on load) and you need to pass data to other components

You can remove attributes from data (with the Remove / Keep attribute nodes); that will strip out the unneeded data, and it might be worth doing if that data needs to be serialized.

4- Generation happens at runtime, but there’s nothing on the graph output

Data can still be serialized for hierarchical generation; if so, please refer to (3), as it is similar in nature.

Otherwise, you may be in a case where the cache is holding on to data that’s no longer needed.

The PCG cache can empty itself and trigger a GC under some circumstances, and you can play with the relevant cvars (pcg.Cache.MemoryBudgetMB is one) to adjust what makes sense for your game.

You could also empty the cache completely (while you’re loading - through pcg.FlushCache at the console, or via the PCG subsystem’s “FlushCache” method) and then trigger a GC for good measure.
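The cache-flush-then-GC sequence above can be sketched in C++ roughly as follows. This is a sketch only, not a definitive implementation - it assumes a UE 5.x project with the PCG plugin enabled, and you should verify the exact signatures against your engine version:

```cpp
// Sketch: flush the PCG cache and force a GC afterwards.
// Assumes the PCG plugin module is enabled and linked ("PCG" in Build.cs).
#include "PCGSubsystem.h"
#include "Engine/Engine.h"
#include "Engine/World.h"

void FlushPCGCacheAndGC(UWorld* World)
{
	if (UPCGSubsystem* PCGSubsystem = UPCGSubsystem::GetInstance(World))
	{
		// Equivalent to running `pcg.FlushCache` at the console.
		PCGSubsystem->FlushCache();
	}

	// Trigger a GC for good measure, as suggested above.
	if (GEngine)
	{
		GEngine->ForceGarbageCollection(/*bForcePurge=*/true);
	}
}
```

During iteration it may be simpler to just run `pcg.FlushCache` from the console and watch the LLM stats before and after.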

Ok, so onto your questions:

1- Is it possible to remove [data] from the serialized runtime data if it’s not used?

You can’t remove the entry-to-value key map per se (it’s used!), but as I pointed out, you can remove attributes you don’t need, and that will reclaim the memory.

2- What’s the difference between generate on load and generate on demand?

Generate on Load happens when the PCGComponent has its OnBeginPlay method called, after streaming in.

Generate on Demand is when you want to manage generation yourself - either as part of your editor tooling or in response to whatever’s happening in your game (triggers, etc.)

There is no functional difference in how things are executed otherwise.
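To make the “manage generation yourself” case concrete, a minimal on-demand trigger might look like the sketch below. This is an illustrative sketch, not engine-verified code - `GeneratePCGOnActor` is a hypothetical helper name, and you should check `UPCGComponent`’s exact API against your engine version:

```cpp
// Sketch: with Generation Trigger set to "Generate on Demand", nothing
// generates until you ask for it - e.g. from a gameplay trigger/overlap.
#include "PCGComponent.h"
#include "GameFramework/Actor.h"

void GeneratePCGOnActor(AActor* TargetActor)
{
	UPCGComponent* PCGComponent =
		TargetActor ? TargetActor->FindComponentByClass<UPCGComponent>() : nullptr;

	if (PCGComponent)
	{
		// bForce=false: skip regeneration if the component is already
		// generated and nothing has changed.
		PCGComponent->Generate(/*bForce=*/false);
	}
}
```

Calling this from your own systems gives the same execution path as Generate on Load, just at a time of your choosing.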

3- Is it possible to reduce the overhead of the data, based on how the PCG graph is setup?

It is very much possible, and you might notice that we do it in some of our samples/plugins (such as BiomeCore or Electric Dreams), where we’ll aggressively remove attributes we don’t need, abstract multiple attributes to an index in a table, and so on.

The problem is especially prevalent when you have long chains of operations that add more and more attributes; reusing attributes and similar strategies might be a reasonable idea in some cases.

On a side note, in 5.6 we’re introducing a new data structure for points that can significantly lower the memory cost, which might interest you - basically, we’ve added a SoA (structure-of-arrays) version of the point data, and it becomes the default point format flowing through the graph. It allows us to allocate only the properties that are needed, and also supports inheritance.

Hopefully this helps,

Cheers,

Julien

Hi Julien,

Thanks very much for the detailed reply. I’m hoping that your step 1 should be applicable to us, because we should only need generation in the editor, and this should remove all of the metadata at runtime. Is there any difference between setting ‘Is Editor Only’ under PCGComponent/Cooking and the one under just ‘Cooking’? We’re testing this and can let you know if we see the expected memory reduction.

You mentioned that the PCG cache can trigger a GC - could you let us know under what circumstances you’d expect that to happen (in game, not in the editor)?

Cheers,

Andy

Just to add to this: we tried setting ‘Is Editor Only’ on the PCGVolume for both the PCGComponent/Cooking and Cooking settings. After re-generating the PCG data in the editor and re-cooking everything, we don’t see any reduction in the PCG metadata memory usage in game.

We’ve also spotted the ‘Is Editor Only Actor’ property, which applies to both the PCGVolume and the PCGWorldActor - is this setting required on one or both to remove the runtime metadata without removing the scattered assets in game?

Hi again Andy,

I’d suggest you try marking the PCG graph as editor-only (the “Is Editor Only” property in the PCG graph properties, under the “Cooking” category); that will bubble up to any and all PCG components using that graph.

I’m surprised that marking the component as editor-only didn’t work directly, but maybe something has changed since we added that.

In your case, your graph may be partitioned, so removing the “original” component could have broken some of the PCG mechanisms - but if everything is already generated, the components & data on the partition actors are still serialized (because they themselves won’t be editor-only).

Let me know if that fixes your issue.

I looked at the memory capture, and sure enough, it has to do with serialized point data, most likely on the components.

As far as the pcg partition actors & the pcg world actor:

  • if you mark the partition actors as editor-only, then all the content on them won’t be in the cook (which isn’t what you want)
  • if you mark the PCG world actor as editor-only, then you might run into issues with components that are still alive, and of course runtime generation will straight up not work.

Hope this helps,

Cheers,

Julien

Hey Julien, just to update and close this thread - marking the PCG graph as editor-only removed all the runtime memory we were seeing associated with PCGMetaData and PCGPointData in the memreport, saving over 700MB.

Many thanks,

Andy