Does Parallel For need manual barriers?

Let’s say I’ve got an array of similar but independent data and need to run the same non-trivial but completely independent function on each entry. I’ve seen that there is a function named ParallelFor that would be perfect for this purpose, as it splits the work into several task graph tasks and blocks until all of them are complete.

Now the question is whether I need to do something special afterwards, like setting up memory barriers, to be certain that the processed data is available in the game thread.

What I mean: Does this code work, or do I need to put something in place of the comment to ensure that the data set in the ParallelFor is available for DoSomethingNotParallel()?

Is there a better way to ensure that I have the results back by the time ParallelFor returns?

void processData(TArray<DataType>& Data)
{
    ParallelFor(Data.Num(), [&](int32 CurrIdx)
    {
        Data[CurrIdx] = DoSomethingParallel(Data[CurrIdx]);
    });

    //Do I need to put a memory barrier or something else here?

    DoSomethingNotParallel(Data);
}

I couldn’t stop myself from reading engine source code. The answer is actually pretty easy to find, if one knows where to look…

ParallelFor uses FEvent to tell the game thread that the individual tasks it created have finished. FEvent is basically an abstraction of thread synchronization using events, and behaves like Windows Event objects (on Windows it’s just a wrapper around those). This means that whenever the game thread is informed that a ParallelFor task has finished, a memory barrier is in place, so the game thread is guaranteed to see all changes made within the ParallelFor.

In other words: Parallel For is perfect for what I want to do, and no additional synchronization is needed.
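To illustrate the reasoning above outside the engine, here is a minimal sketch in standard C++. It is not the engine's implementation: ParallelForSketch and its NumWorkers parameter are hypothetical names, and a std::mutex plus std::condition_variable stand in for FEvent. The idea is the same, though: the workers signal completion through a synchronization primitive, and waiting on that primitive is what makes their writes visible to the caller.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for ParallelFor (NOT engine code): runs Body(Idx)
// for every Idx in [0, Num) on NumWorkers threads and blocks until done.
void ParallelForSketch(int Num, const std::function<void(int)>& Body, int NumWorkers = 4)
{
    assert(NumWorkers > 0);

    std::mutex Mutex;
    std::condition_variable Done; // plays the role FEvent plays in the engine (assumption)
    int Remaining = NumWorkers;   // guarded by Mutex

    std::vector<std::thread> Workers;
    for (int W = 0; W < NumWorkers; ++W)
    {
        Workers.emplace_back([&, W]
        {
            // Each worker handles a strided subset of the indices.
            for (int Idx = W; Idx < Num; Idx += NumWorkers)
            {
                Body(Idx);
            }

            // Signal completion. Releasing the mutex publishes every write
            // this worker made before this point.
            std::lock_guard<std::mutex> Lock(Mutex);
            if (--Remaining == 0)
            {
                Done.notify_one();
            }
        });
    }

    {
        // Acquiring the mutex while waiting pairs with the workers' releases,
        // so all of their writes are visible once the wait returns.
        std::unique_lock<std::mutex> Lock(Mutex);
        Done.wait(Lock, [&] { return Remaining == 0; });
    }

    // Cleanup only; join() also synchronizes, but the wait above already did.
    for (std::thread& Worker : Workers)
    {
        Worker.join();
    }
}
```

Calling it looks just like the snippet in my question: pass the element count and a lambda that writes to Data[Idx], then use Data normally once the call returns. No extra barrier is needed in the caller because the wait already provides the ordering.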
