GUseNewTaskBackend merges pooled threads with task graph threads, which can quickly bottleneck task graph threads

Prior to UE5, task graph threads and pooled threads were managed as separate groups. With the new Tasks System in UE5 and GUseNewTaskBackend=1 (the default), these two groups appear to have been merged into one. This behavioral change introduces situations where long-running tasks can interfere with important tasks on task graph worker threads.

It’s common to have long-running tasks that should not prevent task graph threads from running. FUDPPing::UDPEcho is one example.

Here’s the situation where this issue caused major hitches on the Game Thread:

We use FUDPPing::UDPEcho to measure network latency to various servers. These UDP tasks wait on the reply and block the calling thread. In UE4 this would cause the pooled threads to stall, which was not a big deal. In UE5 it stalls the background worker threads instead, and those are the same threads that many Game Thread tasks depend on. The background worker threads can become saturated with this work.

I’m looking for guidance on why these systems were merged in UE5 compared to UE4, and how we should respond. It seems like we could switch the UDPEcho task to use EAsyncExecution::Thread instead of EAsyncExecution::ThreadPool. However, there are many places in the engine that use EAsyncExecution::ThreadPool, and those would need auditing. Should those have been swapped to EAsyncExecution::Thread?

Steps to Reproduce
In UE5:

  1. Start several (10-20) FUDPPing::UDPEcho tasks to ping servers and calculate the average ping duration.
  2. Observe the duration of the tasks and how they block other work from executing on task graph threads.

In UE4:

  1. Start the same number of FUDPPing::UDPEcho tasks.
  2. Observe that the duration of those tasks does not impact task graph threads. Those tasks execute in a separate thread pool and allow task graph threads to proceed without issue.

Hi,

The goal with UE5 was to scale properly (remove limitations) and reduce preemption on systems with lots of cores. Having more threads than cores is generally bad for a task graph, so everything was merged together so that the task graph would run optimally on high-core-count systems.

That being said, long-running tasks are not friendly to a task graph. In your specific case, we suggest that you create your own thread pool and issue your blocking ping requests or long-running tasks on that thread pool.
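As a rough illustration of that suggestion, here is a minimal sketch in standard C++ rather than engine code. `FBlockingWorkPool`, `SimulatedBlockingPingMs`, and `AveragePingMs` are hypothetical names invented for this example, and the "ping" simply sleeps instead of waiting on a socket. The point is only that blocking work gets its own small, fixed set of threads, so it never occupies the shared workers.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical helper: a small fixed-size pool reserved for blocking work.
class FBlockingWorkPool
{
public:
    explicit FBlockingWorkPool(std::size_t NumThreads)
    {
        assert(NumThreads > 0);
        for (std::size_t i = 0; i < NumThreads; ++i)
        {
            Workers.emplace_back([this] { WorkerLoop(); });
        }
    }

    ~FBlockingWorkPool()
    {
        {
            std::lock_guard<std::mutex> Lock(Mutex);
            bStopping = true;
        }
        Wakeup.notify_all();
        for (std::thread& Worker : Workers) { Worker.join(); }
    }

    // Queues a callable and returns a future for its result.
    template <typename Fn>
    auto Submit(Fn&& Work) -> std::future<decltype(Work())>
    {
        using ResultType = decltype(Work());
        auto Task = std::make_shared<std::packaged_task<ResultType()>>(std::forward<Fn>(Work));
        std::future<ResultType> Result = Task->get_future();
        {
            std::lock_guard<std::mutex> Lock(Mutex);
            Pending.push([Task] { (*Task)(); });
        }
        Wakeup.notify_one();
        return Result;
    }

private:
    void WorkerLoop()
    {
        for (;;)
        {
            std::function<void()> Job;
            {
                std::unique_lock<std::mutex> Lock(Mutex);
                Wakeup.wait(Lock, [this] { return bStopping || !Pending.empty(); });
                if (Pending.empty()) { return; } // stopping and queue drained
                Job = std::move(Pending.front());
                Pending.pop();
            }
            Job(); // may block; only this pool's threads are affected
        }
    }

    std::mutex Mutex;
    std::condition_variable Wakeup;
    std::queue<std::function<void()>> Pending;
    bool bStopping = false;
    std::vector<std::thread> Workers;
};

// Stand-in for a blocking ping: sleeps instead of waiting on a socket.
int SimulatedBlockingPingMs(int ServerIndex)
{
    std::this_thread::sleep_for(std::chrono::milliseconds(20));
    return 10 + ServerIndex;
}

// Issues all pings on the dedicated pool and averages the results.
int AveragePingMs(int NumServers)
{
    FBlockingWorkPool PingPool(4); // pool size is an arbitrary choice here
    std::vector<std::future<int>> Results;
    for (int i = 0; i < NumServers; ++i)
    {
        Results.push_back(PingPool.Submit([i] { return SimulatedBlockingPingMs(i); }));
    }
    int Total = 0;
    for (std::future<int>& Result : Results) { Total += Result.get(); }
    return Total / NumServers;
}
```

In an engine project the same shape would presumably be built on the engine's own thread pool types instead of `std::thread`; the pool size is a tuning decision, since every extra thread adds some oversubscription.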

Ideally, you would use non-blocking APIs to accomplish the same goal, which would not require a separate thread pool, but obviously that requires more work.

Hope this helps,

Danny

EAsyncExecution::Thread should almost be banned, as it creates a thread for each task, which is not what people want or expect most of the time because the cost is huge. There is no downside to using a thread pool unless the code misbehaves. One such example would be a recursive function that creates a task at every recursion and waits on it; such code would likely exhaust all threads in the pool and end up in a deadlock. But such code would be far better off just being replaced with a sane pattern anyway :)

As for all the other things that run from the same pool, this is by design: it is what we want in order to avoid preemption, oversubscription, and performance problems. In a typical task graph, long tasks should be split into smaller ones so that every task has a fair chance to run. For long tasks that can’t be broken up for some reason, an external thread pool could be the proper avenue, keeping in mind that it will inevitably cause oversubscription and additional preemption when the system is saturated with work.
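To make the "split long tasks into smaller ones" advice concrete, here is a small sketch in standard C++. `MakeChunkedSumTasks` and `RunAll` are hypothetical names made up for this example, and `RunAll` is only a sequential stand-in for a real task system. It shows one long summation turned into many short tasks that a scheduler could interleave with other work.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper: splits one long summation into many short tasks.
// Each returned task sums at most ChunkSize elements, so a scheduler could
// run other work between chunks. Values must outlive the returned tasks.
std::vector<std::function<long long()>> MakeChunkedSumTasks(
    const std::vector<int>& Values, std::size_t ChunkSize)
{
    assert(ChunkSize > 0);
    std::vector<std::function<long long()>> Tasks;
    for (std::size_t Begin = 0; Begin < Values.size(); Begin += ChunkSize)
    {
        const std::size_t End = std::min(Begin + ChunkSize, Values.size());
        Tasks.push_back([&Values, Begin, End]
        {
            long long Partial = 0;
            for (std::size_t i = Begin; i < End; ++i) { Partial += Values[i]; }
            return Partial;
        });
    }
    return Tasks;
}

// Stand-in for a task system: runs the tasks one by one and combines results.
long long RunAll(const std::vector<std::function<long long()>>& Tasks)
{
    long long Total = 0;
    for (const auto& Task : Tasks) { Total += Task(); }
    return Total;
}
```

The chunk size is the knob: smaller chunks give other tasks more chances to run, at the price of more scheduling overhead per unit of work.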

Hope this helps,

Danny

Hey Danny,

Thank you for the extra context.

For our specific issue with FUDPPing::UDPEcho, we have a few avenues to mitigate the impact.

For more general uses of EAsyncExecution::Thread throughout the engine, are there known issues with switching over to use the unified set of threads? The engine has several uses that would have previously used the old split, and now those tasks are running from the same pool.

I misspoke with my previous question: I intended to ask about `EAsyncExecution::ThreadPool`. I just performed an audit of usage throughout the engine and found that `FUDPPing::UDPEcho` is likely the primary long-running task that could introduce issues. Most other uses are not long-running or blocking tasks.

This has been helpful, thank you Danny!