Understanding bOptimizeBPComponentData

Hi, we’re in the process of optimizing some of our expensive BP actor spawning and I’m looking into whether or not we should be using bOptimizeBPComponentData on these actors. By attaching to the cook process, I can see that it’s executing BeginCacheForCookedPlatformData and correctly identifying some components as eligible, but when I profile the cooked build with Unreal Insights, I’m not seeing any obvious difference with it on or off. For reference, we use IOStore and I’m profiling in Development to get more detailed reporting in the trace, in case that’s relevant.

I’m interested in understanding what cost it is expected to reduce and whether or not it’s applicable to our title. It would also be good to know if there are any conditions or gotchas that may diminish the effectiveness of enabling bOptimizeBPComponentData.

Thanks.

Hi!

bOptimizeBPComponentData optimizes serialization for components, like the ones spawned via the BP’s Construction Script. If you have many/heavy components spawned in BP Construction Script, and/or lots of properties in components, it can speed up that part of actor construction. But if most of your actor construction cost is in functions that run during initialization, like the C++ constructor or, more likely, in PostLoad and BeginPlay, then the cost saved will seem minimal in comparison.

Can you post your performance gains here for future answer seekers? In case someone else has the same question :slightly_smiling_face:

Excellent, thank you! I’ll close this ticket now :+1:

Ah, that makes sense, most of our components were added in the C++ constructor. That said, we do have a number of properties added in the BP, so I would expect some difference if that’s the case. Do you know if that cost is labeled in Unreal Insights traces? Or do you know where I could look in the code to profile it and see if there’s a meaningful difference?

The fast path appears to be used only in USCS_Node::ExecuteNodeOnActor and AActor::AddComponent.

There aren’t any profiling scopes in those functions, so I would look for the functions themselves in a CPU sampling profiler, or add your own scopes there.
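As a rough illustration of what adding your own scope means, here is a minimal sketch of an RAII timing scope in standard C++. In an engine source build you would more likely place an existing macro such as TRACE_CPUPROFILER_EVENT_SCOPE at the top of the function, so that it shows up in Unreal Insights. Everything named below (ScopedTimer, ScopeStats, scope_registry, execute_node_on_actor_stub) is hypothetical and is not engine code.

```cpp
#include <chrono>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for an engine profiling scope. It only needs the
// standard library, so the idea can be tried outside the engine.
struct ScopeStats {
    long long calls = 0;   // how many times the scope was entered
    double total_ms = 0.0; // accumulated wall-clock time in milliseconds
};

// Accumulated timings keyed by scope name (not thread-safe; sketch only).
inline std::unordered_map<std::string, ScopeStats>& scope_registry() {
    static std::unordered_map<std::string, ScopeStats> registry;
    return registry;
}

// Starts timing on construction and records the elapsed time on destruction,
// so the measurement covers exactly the enclosing block.
class ScopedTimer {
public:
    explicit ScopedTimer(std::string name)
        : name_(std::move(name)), start_(std::chrono::steady_clock::now()) {}

    ~ScopedTimer() {
        const auto end = std::chrono::steady_clock::now();
        ScopeStats& stats = scope_registry()[name_];
        stats.calls += 1;
        stats.total_ms +=
            std::chrono::duration<double, std::milli>(end - start_).count();
    }

    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    std::string name_;
    std::chrono::steady_clock::time_point start_;
};

// Hypothetical function standing in for the code path being measured.
inline int execute_node_on_actor_stub(int component_count) {
    ScopedTimer timer("ExecuteNodeOnActor"); // scope covers the whole call
    int checksum = 0;
    for (int i = 0; i < component_count; ++i) {
        checksum += i; // placeholder for the real per-component work
    }
    return checksum;
}
```

After a run, scope_registry() holds the call count and total time per scope name, which is enough to compare the flag on and off for the same content.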

Perfect, thanks!

Sure. It seems to give a very minor improvement in my test case here (enabling it for one of our more expensive character BPs). The observed range was 1.3-1.5ms without the checkbox checked and 1-1.1ms with it checked (ignore the absolute timings, since this isn’t an optimized build). It’s close enough that it feels like margin of error, but with a sample size of ~8 for each setting and no overlap between the two ranges, having it enabled does look consistently slightly better. I imagine the difference in our particular scenario may be less than ~100us in Shipping config though (and that’s on our slowest target hardware). Certainly not zero difference, but clearly we’re not making heavy use of the code path it’s trying to optimize.
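For anyone who wants to check whether a gap like this is more than noise, one option is Welch’s t statistic over the two sets of per-run timings. This is a minimal sketch, assuming the timings are collected by hand into two lists; the helper names (mean, sample_variance, welch_t) are illustrative, and no measured values from this thread are included.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Arithmetic mean of the samples. Precondition: xs is not empty.
inline double mean(const std::vector<double>& xs) {
    double sum = 0.0;
    for (double x : xs) sum += x;
    return sum / static_cast<double>(xs.size());
}

// Unbiased sample variance (divides by n - 1). Precondition: xs.size() >= 2.
inline double sample_variance(const std::vector<double>& xs) {
    const double m = mean(xs);
    double sum_sq = 0.0;
    for (double x : xs) sum_sq += (x - m) * (x - m);
    return sum_sq / static_cast<double>(xs.size() - 1);
}

// Welch's t statistic for two independent samples with possibly different
// variances. Preconditions: both samples have at least two values and at
// least one of them has non-zero variance.
inline double welch_t(const std::vector<double>& a,
                      const std::vector<double>& b) {
    const double standard_error =
        std::sqrt(sample_variance(a) / static_cast<double>(a.size()) +
                  sample_variance(b) / static_cast<double>(b.size()));
    return (mean(a) - mean(b)) / standard_error;
}
```

Pass the timings without the flag as the first argument and the timings with it as the second. A clearly positive value suggests the flag helps, although with roughly 8 samples per side the result should be treated as a hint rather than proof.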

And for completeness, we had zero calls to AActor::AddComponent, which is why the trace only shows USCS_Node::ExecuteNodeOnActor.
