StateTree Performance vs BehaviorTree

1. Purpose & Key Conclusions

Test Purpose

We benchmarked StateTree vs BehaviorTree runtime performance in UE 5.7 to evaluate whether StateTree is feasible as a BehaviorTree replacement in large-scale AI simulations (100-400 NPCs).

Key Conclusions

1. ~2x Framework Overhead: At a unified TickInterval (0.5s), StateTree’s total execution duration is approximately 2x that of BehaviorTree (1.8x with animation, 2.1x without), even with fewer executions.

2. High Fixed Overhead: Unreal Insights profiling shows that the StateTree component’s own exclusive (Excl) time is significantly higher than its internal task execution time, indicating substantial framework overhead independent of the actual AI logic.

3. Default Tick Difference: Under default settings, StateTree ticks every frame while BehaviorTree does not, resulting in up to 18x higher CPU consumption at 400 NPCs.

2. Test Environment

- CPU: Intel Core Ultra 7 265K @ 3.90 GHz

- RAM: 32 GB

- GPU: NVIDIA GeForce RTX 5060 (8 GB)

- Engine: UE 5.7 (Packaged Build)

Methodology: Both BT and ST implement identical AI logic (movement, perception, behavioral responses). Data was collected via Unreal Insights after a 30s warm-up.

3. Supporting Data

3.1 Default Tick Strategy Difference

At 400 NPCs without animation/mesh, StateTree executes every frame while BehaviorTree has variable per-frame execution:

Figure 1: BehaviorTree Single Frame Execution (400 NPCs)

[Image Removed]

Figure 2: StateTree Single Frame Execution (400 NPCs)

[Image Removed]

Performance Comparison (400 NPCs, No Animation, Default Settings):

- Execution Count: BT 271,073 vs ST 15,563,600 → 57x difference

- Total Duration: BT 2,721 ms vs ST 49,803 ms → 18x difference

- CPU Time %: BT 0.91% vs ST 16.63% → 18x difference
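As a quick cross-check of the ratios above, the following sketch recomputes them and also derives the average duration of a single execution. It assumes that the execution count and total duration in each row come from the same Insights timer; the dictionary keys and function names are illustrative only.

```python
# Figures copied from the 400-NPC, no-animation, default-settings run above.
bt = {"executions": 271_073, "total_ms": 2_721.0, "cpu_pct": 0.91}
st = {"executions": 15_563_600, "total_ms": 49_803.0, "cpu_pct": 16.63}


def ratio(key):
    """ST value divided by BT value for one metric."""
    return st[key] / bt[key]


def avg_us(run):
    """Average duration of one execution, in microseconds."""
    return run["total_ms"] * 1000.0 / run["executions"]


for key in ("executions", "total_ms", "cpu_pct"):
    print(f"{key}: ST/BT = {ratio(key):.1f}x")
print(f"BT average per execution: {avg_us(bt):.2f} us")
print(f"ST average per execution: {avg_us(st):.2f} us")
```

Under that assumption, the average cost per execution is about 10.0 µs for BT and about 3.2 µs for ST, so the 18x gap in this run comes from how often StateTree executes rather than from the cost of each execution.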

3.2 Unified TickInterval = 0.5s (400 NPCs)

With Animation (2min test):

- BT: 101,483 executions, 869 ms total

- ST: 98,854 executions, 1,577 ms total

- Ratio: ST is 1.8x slower

Without Animation (2min test):

- BT: 132,365 executions, 1,178 ms total

- ST: 106,872 executions, 2,432 ms total

- Ratio: ST is 2.1x slower (even with fewer executions)
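The numbers in this section can be put on a per-execution basis with a short calculation. The sketch below assumes each NPC ticks exactly once per TickInterval over a 120 s test, which is an idealization, and that each count and duration pair comes from the same Insights timer; all names are illustrative.

```python
NPC_COUNT = 400
TEST_SECONDS = 120.0   # "2min test"
TICK_INTERVAL = 0.5    # seconds

# Ideal number of ticks if every NPC ticked exactly once per interval.
expected_ticks = NPC_COUNT * TEST_SECONDS / TICK_INTERVAL

# (executions, total duration in ms), copied from the lists above.
runs = {
    "with animation": {"BT": (101_483, 869.0), "ST": (98_854, 1_577.0)},
    "without animation": {"BT": (132_365, 1_178.0), "ST": (106_872, 2_432.0)},
}


def per_execution_us(executions, total_ms):
    """Average duration of one execution, in microseconds."""
    return total_ms * 1000.0 / executions


def summarize(case):
    """Return (BT us/exec, ST us/exec, total-time ratio, per-execution ratio)."""
    bt_us = per_execution_us(*runs[case]["BT"])
    st_us = per_execution_us(*runs[case]["ST"])
    total_ratio = runs[case]["ST"][1] / runs[case]["BT"][1]
    return bt_us, st_us, total_ratio, st_us / bt_us


print(f"ideal tick count: {expected_ticks:.0f}")
for case in runs:
    bt_us, st_us, total_ratio, exec_ratio = summarize(case)
    print(f"{case}: BT {bt_us:.2f} us, ST {st_us:.2f} us, "
          f"total {total_ratio:.2f}x, per execution {exec_ratio:.2f}x")
```

The ideal tick count is 96,000, which is close to the measured ST counts. Because ST ran fewer times than BT in both tests, its per-execution ratio (about 1.9x with animation, about 2.6x without) is higher than the total-time ratio.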

3.3 Framework Overhead Analysis

Figure 3: BehaviorTree Task Execution Duration (No Animation, 2min)

[Image Removed]

Figure 4: StateTree Task Execution Duration (No Animation, 2min)

[Image Removed]

StateTree’s Excl time (framework overhead) is significantly higher than its internal task callees, while BehaviorTree shows lower component overhead relative to task execution.

4. Questions

Q1: Is the ~2x framework overhead expected for StateTree’s architecture? What contributes to this overhead?

Q2: Under what scenarios would StateTree outperform BehaviorTree?

Q3: What optimization strategies are recommended to reduce StateTree overhead for large-scale AI (400+ NPCs)?

Q4: In Insights, StateTree’s Excl time far exceeds internal task time. What internal processes cause this?

Q5: Adjusting ForegroundWorker via console crashes release builds. Is this known? Is there a safe way to configure worker threads at runtime?

Steps to Reproduce
Open the “ST_BT_Benchmark” project and run it:

  1. Fill in the “TotalTime”, “FileName”, and “Count” input boxes, for example “300”, “ST_300s_400n”, and “400”.
  2. Click the “StartTreeNPC” button (with animation) or the “Pawn_StateTree” button (pawn only) to spawn the actors.
  3. Click “StartLogic” to start the test.
  4. When the game stops, the Unreal Insights trace file for that run is at “$PRJ$/Saved/Profiling/ST_300s_400n.utrace”.

This is quite intriguing. We expect StateTree to be more performant than BT on both CPU and memory. We used StateTree to ship the Witcher 4 demo last year, and we have been using it on internal projects as well.

It may take me a little time to dive deeper into the project and find out what may be causing the slowdown you are experiencing. Thank you for the repro project; it is certainly helpful for finding something we may have missed or a change that caused a perf regression.

-James

I have run a few tests of your project with varying numbers of AI. I have seen cases where the Insights total execution time for StateTree was higher than for Behavior Tree. However, I have also seen that perf is better with StateTree, especially as the agent count increases. At 200 agents, fps was nearly identical, and at 400 agents StateTree averaged 6 fps higher than BT.

I continually got errors and crashes when attempting to use the NPC variants in your project, so something may have been corrupted on my end. I used the Pawn versions for the testing instead.

I spoke with our primary dev on StateTree, who noticed that some of the engine’s StateTree tags are not showing up properly in Insights. Insights should track preparing the runtime data for the component, starting the StateTree execution, transitions, and cleanup, which are all steps of the StateTree tick. Some of the extra time in StateTree comes from EnterTask, because of BP (Blueprint). With the other changes to StateTree, you could probably make the tree rely on tick even less by not running transitions On Tick and instead sending events or using delegates to trigger them.
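To illustrate why event-triggered transitions reduce work compared with On Tick transitions, here is a toy model in plain Python. It is not StateTree or Unreal Engine API: the function name, parameters, and the random stimulus schedule are all hypothetical, and it only counts how many transition-condition evaluations each approach would perform.

```python
import random


def count_evaluations(num_agents, duration_s, tick_interval_s,
                      events_per_agent, seed=0):
    """Count transition checks for polling vs. event-driven triggering.

    Toy model only: each agent receives up to `events_per_agent` stimuli at
    random ticks. Polling evaluates the condition on every tick; the
    event-driven variant evaluates it only when a stimulus arrives.
    """
    rng = random.Random(seed)
    ticks = int(duration_s / tick_interval_s)
    polled = 0
    event_driven = 0
    for _ in range(num_agents):
        # Hypothetical stimulus schedule (a set, so duplicate ticks collapse).
        stimulus_ticks = {rng.randrange(ticks) for _ in range(events_per_agent)}
        polled += ticks                      # one check per tick
        event_driven += len(stimulus_ticks)  # one check per stimulus
    return polled, event_driven


polled, event_driven = count_evaluations(
    num_agents=400, duration_s=120.0, tick_interval_s=0.5, events_per_agent=5)
print(f"polling checks: {polled}, event-driven checks: {event_driven}")
```

In this model the polling cost grows with agent count times tick count, while the event-driven cost grows only with the number of stimuli, which is the effect the suggestion above is aiming for. Real savings depend on how often events actually fire in the project.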

-James

Thank you for your reply. Our project involves many NPCs executing complex AI logic, and we’re eager to discover approaches that can substantially enhance StateTree’s performance.
