Spread spawning across multiple frames?

To avoid the performance issues of spawning large amounts of AI in a single frame, I want to spread them across multiple frames. At the moment I’m doing this with a few goofy delay loops.

What is the recommended way of doing this? I was going to create an AsyncForLoop that uses UBlueprintAsyncActionBase to create a latent node and spawn actors when the loop’s delegate is called. Would this work to spread it across frames? Or do I also need to add some delay logic to that array handling?

I am not an expert on multithreading, but I think an async spawner would be less performant than spawning on the main thread, because the task would need to synchronize with the main game thread too often (to actually spawn the object). Async can probably help if you have some really heavy initialization before the object is spawned.

I would probably make a manager that spawns a fixed number (1-3) per tick and see if it works. I can later decrease the tick rate of the manager or increase the number of spawned actors per tick. (Hell, I can even change the tick rate depending on how many it has already spawned if I want to.)
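
The manager idea can be sketched without any engine code. This is a minimal, engine-agnostic illustration: `SpawnQueue`, `Enqueue` and `Tick` are hypothetical names, and the queued callback stands in for whatever actually spawns the actor.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Engine-agnostic sketch: pending spawn requests are drained a few at a time.
// All names here are hypothetical; the callback stands in for the real spawn.
class SpawnQueue {
public:
    using SpawnFn = std::function<void()>;

    explicit SpawnQueue(std::size_t maxPerTick) : maxPerTick_(maxPerTick) {
        assert(maxPerTick > 0 && "must allow at least one spawn per tick");
    }

    // Queue one spawn request; nothing is spawned yet.
    void Enqueue(SpawnFn fn) { pending_.push_back(std::move(fn)); }

    // Call once per frame (from a manager's tick or a looping timer).
    // Runs at most maxPerTick_ requests and returns how many actually ran.
    std::size_t Tick() {
        std::size_t spawned = 0;
        while (spawned < maxPerTick_ && !pending_.empty()) {
            SpawnFn fn = std::move(pending_.front());
            pending_.pop_front();
            fn();
            ++spawned;
        }
        return spawned;
    }

    // True once everything has been spawned, so the caller can stop ticking.
    bool IsIdle() const { return pending_.empty(); }

    // Lets the caller raise or lower the per-tick count later on.
    void SetMaxPerTick(std::size_t n) {
        assert(n > 0);
        maxPerTick_ = n;
    }

private:
    std::deque<SpawnFn> pending_;
    std::size_t maxPerTick_;
};
```

With a limit of 3, ten queued requests are worked off over four ticks (3, 3, 3, 1), after which `IsIdle()` reports that the manager has nothing left to do.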

Again, not an expert on multithreading. Feel free to correct me.

As far as I’m aware UBlueprintAsyncActionBase is not multi-threading. It will run on the main thread. It’s just async so it won’t tie up the main thread. Basically the goal is to avoid locking the main thread while it’s spawning everything in (that’s why you get big ol’ hiccups when spawning in a lot at once). So 1 frame might spawn like 10, another 12, another 6 depending on main thread load. That would be the ideal goal.

The event tick happens once per frame.

This code would thus spawn one actor per frame

That’s 60 actors per second if you have 60fps, should be plenty, but if it’s not, you can just spawn X amount per frame.

There is, as far as I can tell, no multithreading support in Blueprints; you’d have to use C++ or plugins to do it. I’ve looked for it but haven’t found any information at all about multithreading in BP.

But if your goal is to spread spawning between frames, then event tick is the way to do it, each tick is one frame, so spawn a reasonable amount per tick.

I am embarrassed. I always assumed it was on another thread without actually checking. As I said, not an expert.

Still, I believe there won’t be a lot of benefit in moving it to async because of the actual spawn. I don’t think it can happen without blocking the thread. I might be wrong though.

Yea that would be ideal but I don’t think Async will do that for you out of the box.

Edit: You can have target budget set and spawn actors in the same tick until you reach that budget. You will have to somehow get the previous frame time minus the spawning though to have some info on the budget you have for the spawning.
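
A rough sketch of the budget idea, assuming the budget in milliseconds has already been worked out by the caller (how to derive it from the previous frame time is left open here). `DrainWithBudget` and its parameters are hypothetical names, and the clock is passed in so the sketch does not depend on any engine timer.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Runs queued spawn jobs until the per-frame budget is used up.
// nowMs is injected (hypothetical): in a real project it would wrap whatever
// high-resolution clock is available.
std::size_t DrainWithBudget(std::deque<std::function<void()>>& pending,
                            double budgetMs,
                            const std::function<double()>& nowMs)
{
    const double start = nowMs();
    std::size_t ran = 0;

    // Always run at least one job so a tiny budget cannot stall the queue.
    while (!pending.empty() && (ran == 0 || nowMs() - start < budgetMs)) {
        std::function<void()> job = std::move(pending.front());
        pending.pop_front();
        job();
        ++ran;
    }
    return ran;
}
```

The number of spawns per frame then floats with how expensive each spawn turns out to be, instead of being a fixed count. Note that the check happens before each job, so a single expensive job can still overshoot the budget.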

If you have the time I would love to see a test “async task vs manager actor” with fixed number of actors per tick.

I’m going to go ahead and give UBlueprintAsyncActionBase a try and create an AsyncForLoop. Then I will try adding delay support for it (e.g. like a Delay For Loop macro), which should work fine.

I have already done such an implementation. My bullet manager is a world subsystem that utilizes async projectiles. Lets me have thousands and thousands flying around with the only real cost being the visuals. The projectiles themselves are just sphere traces. I far prefer subsystems over manager actors.

@Cestarian I do not want to use tick. This isn’t something that needs to be running non-stop every frame. I’ve an AI spawner that spawns in waves of enemies. So for example, let’s say I need to spawn in 300 AI (my game is optimized for large amounts of AI, but spawning is still a heavy process regardless). I want to spread those 300 AI across multiple frames. I’m doing this now by using a bit of randomized delay per chunk of AI (e.g. every 5 AI delay by 1 frame). This works OK, but is sloppy IMO. So I’m just trying other methods to improve it.

Why not just stop ticking once everything has been spawned?

Then start ticking again when you need to spawn more?

The ideal way to solve this is to split it between threads. The problem is that Blueprints don’t let you do that by default, but I think there are plugins that make it doable, and there’s a way in C++.

You could also try just changing your spawner from a Blueprint to a C++ class; sometimes that brings a ridiculous amount of performance improvement.

I don’t want the actor ticking at all. If I’m going to do that I’d just use a timer. Spawning can only be done on main thread so spreading it to other threads won’t work.

I’ve got my AsyncForLoop working and it does bring some minor performance improvements compared to the Unreal Engine ForLoop macro, but it needs more improvement. I’m thinking that injecting a delay into it using a timer might be the solution here.

Ok, looks like my AsyncForLoop with a timer works great. Normally I’d just do this in BP, but the problem with BP timers is that they’re scoped to an event/function. So if the event is called again while the timer is still running, you run into problems with the timer being overridden, etc… But with this, the timer is scoped to an async task, so it can be run 30 times and it will properly generate those 30 timers.
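
The difference between one shared timer handle and one handle per task can be illustrated with a toy model. `MiniTimers` below is a hypothetical miniature timer manager, not the engine's; it only assumes that setting a timer on a handle that is already pending replaces the earlier timer.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <utility>

// Hypothetical miniature timer manager: at most one pending callback per handle.
class MiniTimers {
public:
    // Setting a timer on a handle that is already pending replaces it.
    void Set(int handle, double delay, std::function<void()> callback) {
        pending_[handle] = Entry{ now_ + delay, std::move(callback) };
    }

    // Advance the clock and fire every timer that has become due.
    void Advance(double deltaSeconds) {
        now_ += deltaSeconds;
        for (auto it = pending_.begin(); it != pending_.end(); ) {
            if (it->second.due <= now_) {
                std::function<void()> callback = std::move(it->second.callback);
                it = pending_.erase(it);
                callback();
            } else {
                ++it;
            }
        }
    }

private:
    struct Entry {
        double due = 0.0;
        std::function<void()> callback;
    };

    std::map<int, Entry> pending_;
    double now_ = 0.0;
};
```

Calling `Set` 30 times with the same handle leaves a single pending timer, so only one callback fires. Calling it with 30 distinct handles (one per task) fires all 30, which is the behaviour the async task gives you.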

Below is my current solution. I can delay each broadcast by 0.02 for example and spawn 300 actors with minimal FPS impact (10-15 fps). So no more stuttering or massive frame drops.
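
As a back-of-the-envelope check on how long such a spread takes, here is a simplified model. It assumes a timer can only fire on a frame boundary and is re-armed once per index, which is an assumption about timer behaviour rather than something measured in this thread. `EstimateSpread` is a hypothetical helper.

```cpp
#include <cassert>
#include <cmath>

struct SpreadEstimate {
    long frames;     // frames needed to work through all indices
    double seconds;  // the same duration in seconds
};

// Simplified model: each index waits for the first frame boundary at or after
// its delay, so a delay shorter than one frame still costs one whole frame.
SpreadEstimate EstimateSpread(int count, double delaySeconds, double frameSeconds) {
    assert(count >= 0 && delaySeconds > 0.0 && frameSeconds > 0.0);
    const long framesPerStep =
        static_cast<long>(std::ceil(delaySeconds / frameSeconds));
    const long frames = framesPerStep * count;
    return { frames, static_cast<double>(frames) * frameSeconds };
}
```

Under this model, 300 actors at a 0.02 s delay and 60 fps need two frames each, so roughly 600 frames or about 10 seconds. The plain product 300 × 0.02 s = 6 s is only a lower bound.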

AsyncForLoop.cpp

#include "AsyncLoops/Public/AsyncForLoop.h"

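void UAsyncForLoop::Activate()
{
	// Start at FirstIndex; otherwise the delayed path would always begin at 0.
	CurrentIndex = FirstIndex;

	// Only take the delayed path for a non-empty range, so it can always finish.
	if ( Delay > 0.f && FirstIndex <= LastIndex ) {
		DelayLoop();
		return;
	}

	for ( int32 Index = FirstIndex; Index <= LastIndex; ++Index ) {
		LoopBody.Broadcast( Index );
	}

	Completed.Broadcast( 0 );

	SetReadyToDestroy();
}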

void UAsyncForLoop::DelayLoop()
{
	if ( ! IsValid( World ) ) {
		SetReadyToDestroy();
		return;
	}

	World->GetTimerManager().SetTimer( DelayTimerHandle, this, &UAsyncForLoop::LoopDelayed, Delay, false, Delay );
}

void UAsyncForLoop::LoopDelayed()
{
	LoopBody.Broadcast( CurrentIndex );

	if ( CurrentIndex == LastIndex ) {
		if ( IsValid( World ) ) {
			World->GetTimerManager().ClearTimer( DelayTimerHandle );
		}

		Completed.Broadcast( 0 );
		
		SetReadyToDestroy();
	} else {
		++CurrentIndex;

		DelayLoop();
	}	
}

UAsyncForLoop* UAsyncForLoop::AsyncForLoop(
	UObject* WorldContextObject,
	int FirstIndex,
	int LastIndex,
	float Delay
) {
	UAsyncForLoop* BlueprintNode = NewObject<UAsyncForLoop>();

	if ( IsValid( WorldContextObject ) ) {
		BlueprintNode->World = WorldContextObject->GetWorld();
	}

	BlueprintNode->FirstIndex = FirstIndex;
	BlueprintNode->LastIndex = LastIndex;
	BlueprintNode->Delay = Delay;

	return BlueprintNode;
}

AsyncForLoop.h

#pragma once

#include "CoreMinimal.h"
#include "Kismet/BlueprintAsyncActionBase.h"
#include "Delegates/IDelegateInstance.h"
#include "AsyncForLoop.generated.h"

UCLASS( Blueprintable )
class ASYNCLOOPS_API UAsyncForLoop : public UBlueprintAsyncActionBase
{
	GENERATED_BODY()

public:
	DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam( FOAsyncForLoopSignature, int, Index );

	virtual void Activate() override;		

	UPROPERTY( BlueprintAssignable, Category = "AsyncLoops" )
		FOAsyncForLoopSignature LoopBody;

	UPROPERTY( BlueprintAssignable, Category = "AsyncLoops" )
		FOAsyncForLoopSignature Completed;

	UFUNCTION( BlueprintCallable, meta = ( BlueprintInternalUseOnly = "true", WorldContext = "WorldContextObject" ), Category = "AsyncLoops" )
		static UAsyncForLoop* AsyncForLoop(
			UObject* WorldContextObject,
			int FirstIndex = 0,
			int LastIndex = 0,
			float Delay = 0.f
		);

protected:
	class UWorld* World = nullptr;
	int FirstIndex = 0;
	int LastIndex = 0;
	int CurrentIndex = 0;
	float Delay = 0.f;
	FTimerHandle DelayTimerHandle;

	UFUNCTION()
		void DelayLoop();

	UFUNCTION()
		void LoopDelayed();
};

Going to continue to work on it and add burst processing. So something like spawn 5 every 0.02, etc…

Object Pooling ?!?!?!?!

Spawn all the AI instances (max wave size) during map load. Spawn them dormant and under the map so they are well outside scene render distance. I typically do 3 km below map origin.

As needed teleport them to their wave spawn area. On death, set them dormant and teleport them back to the dormant area.
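
A minimal, engine-agnostic sketch of that pooling pattern. `PooledAgent` and `AgentPool` are hypothetical stand-ins for the AI actors and their manager, and the default dormant depth of -300000 assumes units of centimetres (3 km below the origin).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a pooled AI actor.
struct PooledAgent {
    bool active = false;
    float x = 0.f, y = 0.f, z = 0.f;
};

class AgentPool {
public:
    // Pay the full construction cost up front (e.g. during map load) and
    // park every agent at the dormant depth.
    explicit AgentPool(std::size_t capacity, float dormantZ = -300000.f)
        : agents_(capacity), dormantZ_(dormantZ)
    {
        free_.reserve(capacity);
        for (std::size_t i = capacity; i > 0; --i) {
            agents_[i - 1].z = dormantZ_;
            free_.push_back(i - 1);  // index 0 ends up on top of the free stack
        }
    }

    // "Teleport" a dormant agent to the spawn point; nullptr when exhausted.
    PooledAgent* Acquire(float x, float y, float z) {
        if (free_.empty()) return nullptr;
        PooledAgent& agent = agents_[free_.back()];
        free_.pop_back();
        agent.active = true;
        agent.x = x; agent.y = y; agent.z = z;
        return &agent;
    }

    // On death: mark dormant and send back below the map. Releasing an agent
    // that is already dormant is ignored.
    void Release(PooledAgent* agent) {
        if (agent == nullptr || !agent->active) return;
        agent->active = false;
        agent->z = dormantZ_;
        free_.push_back(static_cast<std::size_t>(agent - agents_.data()));
    }

    std::size_t Available() const { return free_.size(); }

private:
    std::vector<PooledAgent> agents_;
    std::vector<std::size_t> free_;
    float dormantZ_;
};
```

The pool never allocates after construction, so a wave only costs the teleport and the state reset. The up-front cost of creating every agent is still paid once, at load time.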

Object Pooling is a last resort and completely unnecessary. Spreading the spawn across multiple frames entirely solves frame drop issues. With the above I can spawn 300 AI (I don’t even need 300 for my game) with little frame drops and I don’t have any of the memory leak risks of object pooling. I would rather just go for a proper ECS than waste my time on something like Object Pooling.

I also just destroy the actors and spread GC across multiple frames. This is now natively supported in UE5 (experimental) so it’s even easier to do. I also clear GC during downtime (load screens for example).

With that said Object Pooling STILL doesn’t solve this problem. To pool 300 actors you still need to spawn 300 actors. It only solves the problem of the NEXT 300.

I do it during level load so there’s no impact visually on the client.

Ok, below is my final solution. It includes 3 new nodes:

AsyncForLoop: standard loop using an async task broadcasting each loop
AsyncDelayForLoop: inserts a delay between each loop
AsyncBurstForLoop: inserts a delay between every X loops (e.g. insert a delay every 3 indices)
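
The schedule the burst node is meant to produce can be written down independently of the engine. This is a sketch only: `PlanBursts` is a hypothetical helper that lists which indices would be broadcast in each timer step, assuming an inclusive index range and exactly `burst` indices per step.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Returns one inner vector per timer step; each holds the indices that are
// broadcast back to back before the next delay. A burst below 1 is treated
// as 1, which matches a plain delay-per-index loop.
std::vector<std::vector<int>> PlanBursts(int firstIndex, int lastIndex, int burst) {
    const int perStep = std::max(burst, 1);
    std::vector<std::vector<int>> steps;

    int index = firstIndex;
    while (index <= lastIndex) {
        std::vector<int> step;
        for (int n = 0; n < perStep && index <= lastIndex; ++n, ++index) {
            step.push_back(index);
        }
        steps.push_back(std::move(step));
    }
    return steps;
}
```

For indices 0 to 9 with a burst of 3 this gives four steps: {0, 1, 2}, {3, 4, 5}, {6, 7, 8} and {9}. The last index is always included and an empty range yields no steps at all.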

AsyncForLoop.cpp

#include "AsyncLoops/Public/AsyncForLoop.h"

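void UAsyncForLoop::Activate()
{
	// Start at FirstIndex; otherwise the delayed path would always begin at 0.
	CurrentIndex = FirstIndex;

	// Only take the delayed path for a non-empty range, so it can always finish.
	if ( Delay > 0.f && FirstIndex <= LastIndex ) {
		DelayLoop();
		return;
	}

	for ( int32 Index = FirstIndex; Index <= LastIndex; ++Index ) {
		LoopBody.Broadcast( Index );
	}

	Completed.Broadcast( 0 );

	SetReadyToDestroy();
}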

void UAsyncForLoop::DelayLoop()
{
	if ( ! IsValid( World ) ) {
		Completed.Broadcast( 0 );

		SetReadyToDestroy();
		return;
	}

	World->GetTimerManager().ClearTimer( DelayTimerHandle );

	// Broadcast up to Burst indices per timer step (at least one), and never
	// step past LastIndex, so the final index is not skipped.
	const int32 StepCount = FMath::Max( Burst, 1 );

	for ( int32 Step = 0; Step < StepCount && CurrentIndex <= LastIndex; ++Step ) {
		LoopBody.Broadcast( CurrentIndex );

		++CurrentIndex;
	}

	// Every index has been broadcast: finish instead of scheduling another step.
	if ( CurrentIndex > LastIndex ) {
		Completed.Broadcast( 0 );

		SetReadyToDestroy();
		return;
	}

	World->GetTimerManager().SetTimer( DelayTimerHandle, this, &UAsyncForLoop::DelayLoop, Delay, false, Delay );
}

UAsyncForLoop* UAsyncForLoop::AsyncForLoop(
	int FirstIndex,
	int LastIndex
) {
	UAsyncForLoop* BlueprintNode = NewObject<UAsyncForLoop>();

	BlueprintNode->FirstIndex = FirstIndex;
	BlueprintNode->LastIndex = LastIndex;

	return BlueprintNode;
}

UAsyncForLoop* UAsyncForLoop::AsyncDelayForLoop(
	UObject* WorldContextObject,
	int FirstIndex,
	int LastIndex,
	float Delay
) {
	UAsyncForLoop* BlueprintNode = NewObject<UAsyncForLoop>();

	if ( IsValid( WorldContextObject ) ) {
		BlueprintNode->World = WorldContextObject->GetWorld();
	}

	BlueprintNode->FirstIndex = FirstIndex;
	BlueprintNode->LastIndex = LastIndex;
	BlueprintNode->Delay = Delay;

	return BlueprintNode;
}

UAsyncForLoop* UAsyncForLoop::AsyncBurstForLoop(
	UObject* WorldContextObject,
	int FirstIndex,
	int LastIndex,
	float Delay,
	int Burst
) {
	UAsyncForLoop* BlueprintNode = NewObject<UAsyncForLoop>();

	if ( IsValid( WorldContextObject ) ) {
		BlueprintNode->World = WorldContextObject->GetWorld();
	}

	BlueprintNode->FirstIndex = FirstIndex;
	BlueprintNode->LastIndex = LastIndex;
	BlueprintNode->Delay = Delay;
	BlueprintNode->Burst = Burst;

	return BlueprintNode;
}

AsyncForLoop.h

#pragma once

#include "CoreMinimal.h"
#include "Kismet/BlueprintAsyncActionBase.h"
#include "Delegates/IDelegateInstance.h"
#include "AsyncForLoop.generated.h"

UCLASS( Blueprintable )
class ASYNCLOOPS_API UAsyncForLoop : public UBlueprintAsyncActionBase
{
	GENERATED_BODY()

public:
	DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam( FOAsyncForLoopSignature, int, Index );

	virtual void Activate() override;		

	UPROPERTY( BlueprintAssignable, Category = "AsyncLoops" )
		FOAsyncForLoopSignature LoopBody;

	UPROPERTY( BlueprintAssignable, Category = "AsyncLoops" )
		FOAsyncForLoopSignature Completed;

	UFUNCTION( BlueprintCallable, meta = ( BlueprintInternalUseOnly = "true" ), Category = "AsyncLoops", DisplayName = "Async For Loop" )
		static UAsyncForLoop* AsyncForLoop(
			int FirstIndex = 0,
			int LastIndex = 0
		);

	UFUNCTION( BlueprintCallable, meta = ( BlueprintInternalUseOnly = "true", WorldContext = "WorldContextObject" ), Category = "AsyncLoops", DisplayName = "Async Delay For Loop" )
		static UAsyncForLoop* AsyncDelayForLoop(
			UObject* WorldContextObject,
			int FirstIndex = 0,
			int LastIndex = 0,
			float Delay = 0.01f
		);

	UFUNCTION( BlueprintCallable, meta = ( BlueprintInternalUseOnly = "true", WorldContext = "WorldContextObject" ), Category = "AsyncLoops", DisplayName = "Async Burst For Loop" )
		static UAsyncForLoop* AsyncBurstForLoop(
			UObject* WorldContextObject,
			int FirstIndex = 0,
			int LastIndex = 0,
			float Delay = 0.1f,
			int Burst = 5
		);

protected:
	class UWorld* World = nullptr;
	int FirstIndex = 0;
	int LastIndex = 0;
	int CurrentIndex = 0;
	float Delay = 0.f;
	int Burst = 0;
	FTimerHandle DelayTimerHandle;

	UFUNCTION()
		void DelayLoop();
};

It’s probably not perfect and there are probably better ways to code this, but for now it works great.