Using TArray inside an FAsyncTask is very slow in a packaged game

Hi,

I’ve already posted this question on AnswerHub, but that site seems dead: https://answers.unrealengine.com/que…ing-tarra.html

Basically, I noticed that multithreaded performance gets very bad in a packaged game when you use TArrays inside an FAsyncTask or FRunnable class. Everything is fine in the editor; only the packaged version of the game is slow.

If someone could try this and let me know their result, just to confirm I’m not crazy, that would be great:

  1. Create a new Third Person C++ project called “MyProject”,
  2. Create a new C++ class child of Actor called “MyActor”,
  3. Open the solution in Visual Studio,
  4. Replace the contents of MyActor.h with the code provided below,
  5. Replace the contents of MyActor.cpp with the code provided below,
  6. Compile,
  7. In Unreal Engine, go inside ThirdPersonCPP>Blueprints>ThirdPersonCharacter,
  8. In the graph, bind the Enter key to a SpawnActorFromClass node (spawning a MyActor actor),
  9. Play In Editor and press Enter to launch async tasks: the total ms is printed on screen when the tasks are done,
  10. Now package the game and play it: the tasks are now very slow.

MyActor.h


// Fill out your copyright notice in the Description page of Project Settings.

#pragma once

#include "CoreMinimal.h"
#include "GameFramework/Actor.h"
#include "Async/AsyncWork.h"
#include "MyActor.generated.h"

class FMyTask : public FNonAbandonableTask
{
    friend class FAsyncTask<FMyTask>;

protected:

    void DoWork();

    FORCEINLINE TStatId GetStatId() const { RETURN_QUICK_DECLARE_CYCLE_STAT(FMyTask, STATGROUP_ThreadPoolAsyncTasks); }

public:

    FMyTask();

    float Time = 0.0f;
};

UCLASS()
class MYPROJECT_API AMyActor : public AActor
{
    GENERATED_BODY()

public:

    UPROPERTY(BlueprintReadWrite)
    float FinalTime = 0.0f;

    UPROPERTY(BlueprintReadWrite)
    bool bIsDone = false;

private:

    TArray<FAsyncTask<FMyTask>*> Tasks;

    TArray<float> WorkTime;

public:
    // Sets default values for this actor's properties
    AMyActor();

protected:
    // Called when the game starts or when spawned
    virtual void BeginPlay() override;

public:
    // Called every frame
    virtual void Tick(float DeltaTime) override;

};




MyActor.cpp


// Fill out your copyright notice in the Description page of Project Settings.


#include "MyActor.h"
#include "Async/Async.h"
#include <chrono>

using namespace std::chrono;

FMyTask::FMyTask()
{

}

void FMyTask::DoWork()
{
    auto start = high_resolution_clock::now();

    for (int32 i = 0; i < 5000; ++i)
    {
        TArray<float> Biomes;
        Biomes.Add(1.0f);
    }

    auto stop = high_resolution_clock::now();
    auto duration = duration_cast<milliseconds>(stop - start);
    Time = duration.count();
}

// Sets default values
AMyActor::AMyActor()
{
    // Set this actor to call Tick() every frame. You can turn this off to improve performance if you don't need it.
    PrimaryActorTick.bCanEverTick = true;

}

// Called when the game starts or when spawned
void AMyActor::BeginPlay()
{
    Super::BeginPlay();

    for (int32 i = 0; i < 1000; ++i)
    {
        FAsyncTask<FMyTask>* Task = new FAsyncTask<FMyTask>();
        Tasks.Emplace(Task);
        Task->StartBackgroundTask();
    }

    GEngine->AddOnScreenDebugMessage(-1, 10.0f, FColor::Red, TEXT("Tasks launched..."));

}

// Called every frame
void AMyActor::Tick(float DeltaTime)
{
    Super::Tick(DeltaTime);

    if (!bIsDone)
    {
        for (int32 i = Tasks.Num() - 1; i >= 0; --i)
        {
            FAsyncTask<FMyTask>* Task = Tasks[i];
            if (Task && Task->IsDone())
            {
                WorkTime.Emplace(Task->GetTask().Time);
                delete Task;
                Tasks.RemoveAtSwap(i);
            }
        }

        if (Tasks.Num() == 0)
        {
            bIsDone = true;
            float Sum = 0.0f;
            for (float T : WorkTime) Sum += T;
            FinalTime = Sum / WorkTime.Num();
            GEngine->AddOnScreenDebugMessage(-1, 60.0f, FColor::Orange, *FString::SanitizeFloat(FinalTime));
        }
    }

}

In the editor, the tasks finish instantly, whereas in the packaged build they take almost 3 seconds on my computer.
If you comment out the Biomes.Add(1.0f) line (line 22 of MyActor.cpp), you get the same (instant) performance in the editor and in the packaged game.
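
One caveat on the numbers, in case anyone compares results: duration_cast<milliseconds> truncates to whole milliseconds, so a task that finishes in under 1 ms records a Time of 0. Below is a minimal standalone sketch (plain C++, no engine code) that keeps the fractional part instead. ToWholeMs, ToFractionalMs and TimeMs are hypothetical helper names, not part of the project above, and steady_clock is my own choice here rather than something the test project uses.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: same conversion as in FMyTask::DoWork.
// duration_cast to an integer duration truncates toward zero.
inline long long ToWholeMs(std::chrono::microseconds d)
{
    return std::chrono::duration_cast<std::chrono::milliseconds>(d).count();
}

// Hypothetical helper: a floating-point millisecond duration keeps
// the sub-millisecond part.
inline double ToFractionalMs(std::chrono::microseconds d)
{
    return std::chrono::duration<double, std::milli>(d).count();
}

// Hypothetical helper: times any callable and returns fractional milliseconds.
template <typename F>
double TimeMs(F&& work)
{
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

In DoWork the same idea would be to assign from a duration<float, std::milli> instead of the truncated milliseconds value. It does not change the editor vs packaged difference, only how precisely it is reported.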

Thanks!

Bump.
I tested again in my packaged game and the slowdown is very noticeable; it is unplayable.

I’m currently going through the same issue. I’m procedurally generating my world with the Runtime Mesh Component (from a plugin), and using Async / FNonAbandonableTask to speed up the process.

Depending on how large a world I generate, the loading time can be 3 to 15x slower in the packaged build.

I’ve attached an image from Unreal Insights showing the time taken when running my project in the editor compared to the packaged build.

That’s weird. The whole loop in FMyTask::DoWork should be completely optimized out by the compiler, because it has no observable effects.

The problem is probably the allocation, since changing the allocator type solves it, for example:
TArray<float, TInlineAllocator<4>> Biomes;
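
For anyone wondering why that declaration would matter: with the default allocator, the first Add on an empty TArray has to request heap memory, while TInlineAllocator<4> keeps the first four elements inside the array object itself, so the loop in DoWork never touches the heap. Here is a rough standalone analogue in plain C++ (no engine code), assuming the heap allocation really is the bottleneck. InlineFloatArray, HeapWork, InlineWork and RunOnThreads are hypothetical names for illustration, std::vector stands in for the default TArray, and unlike the real TInlineAllocator this sketch has no heap fallback once the inline slots are full.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for TArray<float, TInlineAllocator<N>>: elements are
// stored inside the object, so constructing and filling it never allocates.
template <std::size_t N>
class InlineFloatArray
{
public:
    void Add(float value)
    {
        assert(count_ < N && "sketch only: no heap fallback");
        data_[count_++] = value;
    }
    float operator[](std::size_t i) const { return data_[i]; }
    std::size_t Num() const { return count_; }

private:
    std::array<float, N> data_{};
    std::size_t count_ = 0;
};

// Same shape as the loop in FMyTask::DoWork, with a heap-backed container:
// every iteration allocates on push_back and frees at the end of the scope.
float HeapWork(int iterations)
{
    float sum = 0.0f;
    for (int i = 0; i < iterations; ++i)
    {
        std::vector<float> biomes;
        biomes.push_back(1.0f);
        sum += biomes[0]; // use the value so the work has an observable result
    }
    return sum;
}

// Same loop with inline storage: no allocator calls at all.
float InlineWork(int iterations)
{
    float sum = 0.0f;
    for (int i = 0; i < iterations; ++i)
    {
        InlineFloatArray<4> biomes;
        biomes.Add(1.0f);
        sum += biomes[0];
    }
    return sum;
}

// Runs `work` on several threads at once and returns one result per thread.
template <typename Work>
std::vector<float> RunOnThreads(int threadCount, int iterations, Work work)
{
    std::vector<float> results(threadCount, 0.0f);
    std::vector<std::thread> threads;
    for (int t = 0; t < threadCount; ++t)
    {
        // Each thread writes only its own slot, so no locking is needed.
        threads.emplace_back([&results, t, iterations, work] {
            results[t] = work(iterations);
        });
    }
    for (std::thread& th : threads) th.join();
    return results;
}
```

Calling RunOnThreads(8, 5000, HeapWork) and RunOnThreads(8, 5000, InlineWork) under a timer gives a feel for how much of the per-task cost comes from the allocator when several threads hit it at once. It says nothing about why the packaged build's allocator behaves differently from the editor's, which is still the open question here.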