Hi,
Recently, we discovered that under specific circumstances, FTickTaskManager::StartFrame may deadlock on consoles and low-core-count CPU targets when tick.AllowConcurrentTickQueue=1. We investigated and found through Superluminal traces that the ParallelFor over AllTickFunctions can leave QueueTickFunctionParallel spinning in an infinite loop on every available core.
The scenario where we can reproduce this issue involves a web of prerequisites around blueprints and skeletal meshes. The important part is the prerequisites themselves, not how they got there. The short version of the issue is as follows: when running the ParallelFor in StartFrame, each tick is handled on a different thread, and for each prerequisite that is encountered, a new run of QueueTickFunctionParallel is triggered. When enough ticks share the same prerequisite or chain of prerequisites, it is possible to run out of CPU cores before the whole chain is resolved, resulting in a livelock.
Quick example: if we have 9 cores and are “lucky” enough that 9 ticks have intertwined prerequisites, we could end up with B=>A and A=>B, so the two block each other. The relationship between B and A is simplified here; in our case, it is chains of prerequisites with blueprints in the mix. Once that deadlock is in place, any other tick with a prerequisite on A or B propagates it until everything eventually stops (or rather, ends up in the else branch of QueueTickFunctionParallel, resulting in an infinite loop). This case should seldom happen since it is a cycle, but it could happen before StackForCycleDetection actually detects it.
A different, more insidious example: again, if we have 9 cores/workers and are “lucky” enough, we could end up with K=>J, J=>I, I=>H, H=>G, G=>F, F=>E, E=>D, D=>C, C=>B, B=>A, A, where K, J, I, H, G, F, E, D, and C are waiting for B and A to resolve. Since no cores/workers are left, B and A cannot run, and all those cores/workers are now deadlocked (well, livelocked). Yes, they are calling YieldThread, but that only helps if there are enough extra workers to run the remaining ticks, which may not be the case for particularly long or complicated chains of prerequisites.
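To make the chain example concrete, here is a small single-threaded toy model of it. This is not engine code and does not reproduce the real logic of QueueTickFunctionParallel; SimulateQueue, Prereq, Pending, and Held are names made up for this illustration. It assumes the behavior described above: a worker that holds a tick spins until that tick's prerequisite has been queued, and a pending tick can only be picked up by an idle worker.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Toy model only (hypothetical names, not the engine's implementation).
// Tick i has at most one prerequisite, Prereq[i]; -1 means "no prerequisite".
// Pending lists the ticks in the order workers will pick them up.
// Returns true if every tick gets queued, false if all workers end up spinning.
bool SimulateQueue(int NumWorkers, const std::vector<int>& Prereq, std::deque<int> Pending)
{
    assert(NumWorkers > 0);
    std::vector<bool> Queued(Prereq.size(), false);
    std::vector<int> Held(NumWorkers, -1); // tick held by each worker, -1 if idle
    std::size_t NumQueued = 0;

    while (NumQueued < Prereq.size())
    {
        bool bProgress = false;
        for (int& Tick : Held)
        {
            // An idle worker picks up the next pending tick, if any.
            if (Tick == -1 && !Pending.empty())
            {
                Tick = Pending.front();
                Pending.pop_front();
                bProgress = true;
            }
            // A held tick can be queued only once its prerequisite is queued.
            if (Tick != -1 && (Prereq[Tick] == -1 || Queued[Prereq[Tick]]))
            {
                Queued[Tick] = true;
                ++NumQueued;
                Tick = -1; // the worker becomes idle again
                bProgress = true;
            }
        }
        if (!bProgress)
        {
            // Every worker is spinning on a prerequisite that no free worker can run.
            return false;
        }
    }
    return true;
}
```

With the 11-tick chain above (A has index 0, K has index 10, and Prereq[i] = i - 1) picked up in the order K, J, ..., A, this model gets stuck with 9 workers and only completes once there are as many workers as ticks in the chain. The same chain picked up in the order A, B, ..., K completes with 9 workers, which matches our experience that the problem depends on being “lucky” with scheduling.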
For now, our workaround is simply to disable the concurrent tick queue (tick.AllowConcurrentTickQueue=0), but hopefully it can be re-enabled in the future.
Hope you have a nice day!