Why does the performance of my threaded code not increase when I use more than one thread?

I want to spawn a handful of threads at the start of a level that generate various pieces of a voxel terrain. The threads seem to function correctly, but for some reason it takes the same amount of time for the terrain to finish generating whether I split it across two threads or eight!

All I’m doing is creating a subclass of FRunnable, “FTerrainGenWorker”, and starting its thread with:

Thread = FRunnableThread::Create(this, TEXT("FTerrainGenWorker"), 0,
                                  TPri_AboveNormal,
                                  FGenericPlatformAffinity::GetPoolThreadMask());

The Run() function for each thread just loops through a part of the voxel terrain grid and fills it in with noise:

const FVector scale(0.1f, 0.1f, 0.1f);

FVector pos;
for (int32 z = MinZ; z <= MaxZ && z < Grid.NumZ(); ++z)
{
    pos.Z = (float)z * scale.Z;

    for (int32 y = 0; y < Grid.NumY(); ++y)
    {
        pos.Y = (float)y * scale.Y;

        for (int32 x = 0; x < Grid.NumX(); ++x)
        {
            pos.X = (float)x * scale.X;

            Grid.Get(x, y, z) = NoiseGen::GetNoise(pos);
        }
    }

    FPlatformProcess::Sleep(0.005);
}

return 0;

The more threads I have, the smaller the range of Z values each thread works through, yet it always takes about 27-30 seconds whether I have 1, 4, or 8 threads! I don’t see any problems with how I’m splitting my work, so is there something wrong with how I’m using FRunnable or thread affinities?


Try swapping the x and y loops, and check your data for cache alignment and cache misses. Depending on how your data is laid out in memory, writes from one thread might invalidate cache lines that other threads are using. Try having each thread write into its own temporary array, then join the results when it is done…
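To make the temporary-array idea concrete, here is a minimal sketch in standard C++ rather than Unreal types, so it can be compiled on its own. Everything in it is an assumption for illustration: SampleNoise is a hypothetical stand-in for NoiseGen::GetNoise, GenerateTerrain is a made-up helper, and the grid is assumed to be a flat array with x varying fastest. Each worker fills a private buffer for its own range of Z slices, and the buffers are only copied into the final grid after all threads have joined, so no two threads write to neighbouring memory while generating.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for NoiseGen::GetNoise; any pure function of position works.
inline float SampleNoise(float x, float y, float z)
{
    return std::sin(x) * std::cos(y) + std::sin(z);
}

// Hypothetical helper: fills an nx * ny * nz grid (x varies fastest, then y, then z)
// using numThreads workers, each writing only to its own private buffer.
std::vector<float> GenerateTerrain(int nx, int ny, int nz, int numThreads)
{
    const float scale = 0.1f;
    const std::size_t sliceSize = static_cast<std::size_t>(nx) * ny;
    numThreads = std::max(1, numThreads);
    const int slabDepth = (nz + numThreads - 1) / numThreads; // ceiling division

    std::vector<std::vector<float>> slabs(numThreads); // one result slot per worker
    std::vector<std::thread> workers;

    for (int t = 0; t < numThreads; ++t)
    {
        const int minZ = std::min(nz, t * slabDepth);
        const int maxZ = std::min(nz, minZ + slabDepth); // exclusive upper bound

        workers.emplace_back([=, &slabs]()
        {
            // Private buffer: no other thread touches this memory during generation.
            std::vector<float> local;
            local.reserve(sliceSize * static_cast<std::size_t>(maxZ - minZ));

            for (int z = minZ; z < maxZ; ++z)
                for (int y = 0; y < ny; ++y)
                    for (int x = 0; x < nx; ++x) // innermost loop walks contiguous memory
                        local.push_back(SampleNoise(x * scale, y * scale, z * scale));

            slabs[t] = std::move(local); // single write to this worker's own slot
        });
    }

    for (std::thread& worker : workers)
        worker.join();

    // Join step: concatenate the slabs in Z order into the final grid.
    std::vector<float> grid;
    grid.reserve(sliceSize * static_cast<std::size_t>(nz));
    for (const std::vector<float>& slab : slabs)
        grid.insert(grid.end(), slab.begin(), slab.end());
    return grid;
}
```

Calling GenerateTerrain(64, 64, 64, 4) would split the 64 Z slices into four slabs of 16. In an FRunnable-based setup the same pattern would presumably mean giving each worker its own array as a member and copying it into the shared grid once the worker has finished, but that mapping is not shown here.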