Hi, I was wondering how Unreal Engine 4 tackles CPU processing and multithreading. Is processing done on threads spread automatically across all cores, or is the majority of the workload piled on a single Worker core? Is it done automatically? Also, is it possible to assign processing jobs to multiple threads whereby specific jobs could be done halfway and then moved to a different processing Worker core (like Core 0, Core 1, Core 2 etc.) to reduce workload? Are you able to influence this through blueprint code or would I have to change it through C++?
Internally various tasks are multithreaded. A good place to start analyzing the engine’s behaviour in this regard would be to use the profiling tools to find out how the different workloads are spread over the available cores.
As far as I know blueprints don’t expose this, and in that case users shouldn’t have to worry about threading as it is all handled internally. In C++ you do have control over this. Rama wrote a tutorial about this specific topic for the wiki, which you can find here.
All blueprints currently run on the main game thread as far as I know. Blueprints aren’t intended to run on multiple threads and odd things occur if you do that (at least with vanilla Unreal). It’d take some work to multithread blueprints and make them safe.
As was already said, the main game loop (where all the ticks and blueprint stuff happens) is executed in a single thread. The rendering is done in a separate independent thread.
There are also other threads, e.g. for asset streaming and various job executors.
With C++ you can either tap into the threading framework provided by the Unreal Engine or you can just spin up your own vanilla C++ threads.
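To illustrate the "vanilla C++ threads" option, here is a minimal sketch using only the standard library. `ParallelSum` is a made-up name for this example and nothing in it touches Unreal's own threading framework; it just shows the split, run, join pattern on plain `std::thread`.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical helper: sums `values` by splitting the range across
// `workerCount` plain std::threads. Each worker writes only to its own
// slot in `partial`, so no locking is needed.
long long ParallelSum(const std::vector<int>& values, unsigned workerCount)
{
    if (workerCount == 0) workerCount = 1;

    std::vector<long long> partial(workerCount, 0);
    std::vector<std::thread> workers;
    const std::size_t chunk = (values.size() + workerCount - 1) / workerCount;

    for (unsigned i = 0; i < workerCount; ++i)
    {
        const std::size_t begin = std::min(values.size(), i * chunk);
        const std::size_t end = std::min(values.size(), begin + chunk);
        workers.emplace_back([&values, &partial, i, begin, end] {
            partial[i] = std::accumulate(values.begin() + begin,
                                         values.begin() + end, 0LL);
        });
    }

    // Wait for every worker before reading the partial results.
    for (std::thread& worker : workers) worker.join();

    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

Calling `ParallelSum(data, std::thread::hardware_concurrency())` would use one worker per reported hardware thread. Which core each worker actually lands on is still up to the OS scheduler.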
Which CPU cores the threads are working on is not decided by the Unreal Engine but rather by the operating system. And it is possible that the OS decides to move a thread from one core to another one, for example to spread the generated heat.
A typical job graph system is used for most tasks that can be computed independently, like animation updates, particle systems, etc.
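The idea behind a job graph can be sketched with standard C++ futures. This is not the engine's task graph; `UpdateAnimation`, `UpdateParticles` and `RunFrameJobs` are hypothetical stand-ins, and `std::async` is swapped in for a real job scheduler. It only shows two independent jobs running concurrently and a dependent step that waits for both.

```cpp
#include <future>

// Hypothetical stand-ins for independent per-frame workloads.
float UpdateAnimation(float deltaTime) { return deltaTime * 2.0f; }
float UpdateParticles(float deltaTime) { return deltaTime * 3.0f; }

// Launches the two independent jobs concurrently, then runs a dependent
// step that can only proceed once both results are available.
float RunFrameJobs(float deltaTime)
{
    std::future<float> animation =
        std::async(std::launch::async, UpdateAnimation, deltaTime);
    std::future<float> particles =
        std::async(std::launch::async, UpdateParticles, deltaTime);

    // get() blocks until the prerequisite job has finished.
    return animation.get() + particles.get();
}
```

A real job system would reuse a fixed pool of worker threads rather than potentially starting a thread per job, but the dependency structure is the same.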
I want to correct something: according to this article by Intel, blueprints are run by the engine with multithreading: https://software.intel.com/en-us/articles/unreal-engine-4-optimization-tutorial-part-3
"Because UE4 takes advantage of multithreading, it spreads the blueprints across many worker threads, allowing the evaluation to run faster by effectively utilizing all CPU cores."
I hope someone from staff can confirm or deny that.
That’s false, blueprints don’t (and can’t) run on multiple threads. All ticks are executed on the game thread.
I’m not sure why intel would spread such misinformation…
Actually, it’s true for animation blueprints that are set up to use the fast path. But yeah, normal BPs all run on the same thread. Also, there’s nothing stopping you from calling UFUNCTIONs on other threads (there are even some plugins on the marketplace for that), but you need to make sure your code only does thread-safe things.
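As a rough picture of what "only does thread-safe things" means, here is a plain C++ sketch. `ScoreBoard` and `StressScoreBoard` are hypothetical and not engine types; the point is just that shared state touched from several threads is either atomic or guarded by a mutex.

```cpp
#include <atomic>
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical object whose methods may be called from any thread.
class ScoreBoard
{
public:
    // A single integer can be updated safely with an atomic.
    void AddPoints(int points) { total_.fetch_add(points); }

    // A container needs a lock around every access.
    void Log(const std::string& line)
    {
        std::lock_guard<std::mutex> lock(logMutex_);
        log_.push_back(line);
    }

    int Total() const { return total_.load(); }

    std::size_t LogSize() const
    {
        std::lock_guard<std::mutex> lock(logMutex_);
        return log_.size();
    }

private:
    std::atomic<int> total_{0};
    mutable std::mutex logMutex_;
    std::vector<std::string> log_;
};

// Calls into the board from `threadCount` threads at once.
void StressScoreBoard(ScoreBoard& board, int threadCount, int callsPerThread)
{
    std::vector<std::thread> workers;
    for (int t = 0; t < threadCount; ++t)
    {
        workers.emplace_back([&board, callsPerThread] {
            for (int i = 0; i < callsPerThread; ++i) board.AddPoints(1);
            board.Log("worker finished");
        });
    }
    for (std::thread& worker : workers) worker.join();
}
```

With a plain `int` and an unguarded `std::vector` the same stress run would be a data race, which is the kind of "odd things" that show up when non-thread-safe code is called off the game thread.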
If everything in Blueprints except anims etc. is one thread, it’d help to understand more, as this seems at odds with practice…
Race-conditions are common at startup, plus different startup orders in packaged vs editor… How does that occur then???
Newly spawned actors also often appear to be ahead of or behind the code that created them. Plus how does Delay-0 work…
Those problems are mostly “object A’s initialization depends on object B’s initialization, but A initializes before B”, not threads.
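A small single-threaded sketch of that kind of ordering problem, with no threads involved at all. Everything here (`World`, `RunStartup`, the "level" and "player" names) is hypothetical, and the one-tick deferral is only the general pattern, not a description of how any particular Blueprint node is implemented.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical single-threaded world: nothing in this sketch uses threads.
struct World
{
    int spawnPointCount = 0;                      // set by the "level" object
    std::vector<std::function<void()>> nextTick;  // work deferred by one tick
    std::vector<std::string> log;
};

// Initializes objects in the given order on ONE thread. The "player"
// reads a value that the "level" is responsible for setting.
std::vector<std::string> RunStartup(const std::vector<std::string>& order,
                                    bool deferPlayer)
{
    World world;
    for (const std::string& name : order)
    {
        if (name == "level")
        {
            world.spawnPointCount = 4;
        }
        else if (name == "player")
        {
            auto readSpawnPoints = [&world] {
                world.log.push_back(
                    "player sees " + std::to_string(world.spawnPointCount));
            };
            if (deferPlayer) world.nextTick.push_back(readSpawnPoints);
            else readSpawnPoints();
        }
    }

    // "Next tick": run everything that was deferred during startup.
    for (auto& job : world.nextTick) job();
    return world.log;
}
```

With the order `{"level", "player"}` the player logs 4 spawn points; with `{"player", "level"}` it logs 0, purely because of the order in which the objects were initialized. Deferring the read by one tick makes the result independent of that order.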
If you do some empirical experiments you do see different running orders, so how does that happen…
If it’s not related to threads etc., what other factors cause dependencies to end up executing differently?
I’m also confused, that’s why I hope someone from Unreal staff can confirm this 100%.
Might try this: https://www.unrealengine.com/marketplace/bpthreads
I have an explanation for why Intel made such a statement, and why Intel is wrong and (not quite) right at the same time.
It’s common knowledge that UE4 doesn’t support multi-threading in BPs, but I wanted to verify this just in case, since it came from an entity as respectable as Intel.
So, I created a BP with a ForLoop from 0 to 50,000.
A single instance of this BP took around ~100ms to complete.
Now, two instances of this BP took (to my surprise) about ~145ms to complete.
With 4 instances, 250ms. That’s 1.6x faster than the expected 400ms. But why?
After scratching my head for a while and trying different settings, I finally found an explanation: Allow Parallel GC
This setting is turned on by default. After turning it off and running the test again, the CPU time increased linearly with the number of instances of this BP.
No, UE4 doesn’t use Multi-Threading for Blueprints.
Yes, it’s faster to split the load of a single (and cpu-intensive) BP into multiple BPs because the GC operations for those BPs will run in parallel.
What I do not know is how the GC distributes the cleanup jobs and why it is tied only to what a single BP instance allocated (some kind of VM scopes?).
Maybe a UE4 engineer has an answer to a question which no one really wants to know the answer to.
I don’t know why one should bother about multithreading for BPs in the first place, since you are not supposed to do heavy calculations and complex logic in BPs anyways. I know that you have the ability to create a whole game in BPs, but that doesn’t mean you should, especially if you are seeking good performance. If you stumble across stuff like multithreading you obviously care or even have to care for the performance of your game, which leads to the question why you are using BPs for core logic in the first place. Therefore, the first thing I would do is extract all the core game logic (except UMG) to C++, THEN if it still doesn’t perform well you can think about further optimization, threads etc.