Simulating a large number of persistent NPCs in UE4 [Part 1]

This is the post I have written on our studio’s webpage. It is reproduced in full below.

The Article on our webpage.

Youtube account where I will be adding videos about the technical process.

The log messages, spawning of the NPCs and the static meshes are done through commands from the Simulation Thread. It also decides which NPCs to show.

List of features:

  • Separate simulation thread.
  • Thread communication.
  • Variable Simulation frequency.
  • Simulation Load Balancing based on CPU power.
  • Handling player interaction.
  • Handling features that can only be safely accessed by the main game thread.
  • Simulation Level of Detail.

In our game, The Absence, you play as the leader of a clan of Swedish Vikings. To properly convey how well you are doing as a leader, your home town changes as the game progresses. This allows us to show you how you are doing, instead of simply displaying a text bubble. For this, we need to simulate Birka circa 815 AD. The population at that time is estimated to have been around eight hundred.

Our plan is to simulate all eight hundred as the player tries to make a mark on history. This includes their likes, dislikes, interests, skills, abilities, wealth, friendships, rivalries, history and so on. As you can imagine, updating each of the eight hundred NPCs every frame would be incredibly and unnecessarily taxing on the CPU.

So, to achieve our goal, we use a multitude of methods. The most prominent is the use of another thread to simulate the NPCs. In Unreal, it is trivial to create a new thread. However, simply making a new thread and pushing all your NPCs there would be naive. For one, Unreal does not like its objects being accessed randomly, or from other threads. It uses Garbage Collection, so it needs to know whether an object is still being accessed, which is difficult to communicate between threads. But more importantly, simply sharing data without forethought will cause all sorts of issues. There has to be a structure and a fixed time to access any data so it can be effectively monitored for race conditions.

Since Unreal expects per-frame updating of its Actors to be done during its Tick Events, we should follow that as a rule and only have the Simulation Thread affect the Actors when the Tick Events are occurring. But even then, how do we know, from the Simulation Thread, which Actor’s Tick Event is occurring and when these events are completed? Also, how do we time the writes so that they don’t occur when the Main Game Thread is itself writing data?

The easiest way to solve this is to let the Main Game Thread do the actions on Actors itself. That means it handles all data changes, function calls etc, on the Actors on its own. How does the Simulation Thread inform the Main Game Thread what to do? Commands.

A command is any action that one of the threads needs the other thread to do. It could be to start or stop rendering an Actor, log a message, update the known location of the player etc. An example Command:

    struct SCTW_LogMessage
    {
        int32 MessageSlotID = -1;
        float Time = 4.0f;
        FColor Color = FColor::White;
        FString Message;
    };

Which allows us to call the following functions in the Main Game Thread:

    GEngine->AddOnScreenDebugMessage(Log.MessageSlotID, Log.Time, Log.Color, Log.Message);
    GLog->Log(TEXT("SimulThreadLog"), ELogVerbosity::Warning, Log.Message);

(The first one displays the message on the screen. The second one saves it in the output logs.)

The next question is how to send a command from one thread to another. Thread communication can be done in a few similar ways. One such way is to use a message box. The basic idea is that instead of directly telling the other thread, the Simulation Thread puts its commands in a message box, which the Main Game Thread can read at its leisure. To ensure that the Main Game Thread doesn’t read a message that the Simulation Thread is still writing (the information would be incomplete), we can implement a lock. A lock stops two threads from accessing the same data at the same time, which would otherwise cause race conditions. But you may notice the lack of code. That’s because we use a simpler idea: Circular Buffers and One Write Access.
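
For comparison only, since the article itself does not go this route, a locked message box could be sketched in standard C++ roughly as follows. `CommandMessageBox`, `LogCommand`, `Post` and `TryTake` are hypothetical names invented for this sketch, and `std::mutex` stands in for whatever lock the engine would provide.

```cpp
#include <cassert>
#include <mutex>
#include <queue>
#include <string>
#include <utility>

// Hypothetical stand-in for a command; not a type from the article.
struct LogCommand
{
    std::string Message;
    float Time = 4.0f;
};

// A "message box": a queue that is only ever touched while holding a lock.
class CommandMessageBox
{
public:
    // Called by the Simulation Thread to leave a command in the box.
    void Post(LogCommand Command)
    {
        std::lock_guard<std::mutex> Guard(Lock);
        Commands.push(std::move(Command));
    }

    // Called by the Main Game Thread. Returns false if the box is empty,
    // otherwise moves the oldest command into Out.
    bool TryTake(LogCommand& Out)
    {
        std::lock_guard<std::mutex> Guard(Lock);
        if (Commands.empty())
        {
            return false;
        }
        Out = std::move(Commands.front());
        Commands.pop();
        return true;
    }

private:
    std::mutex Lock;
    std::queue<LogCommand> Commands;
};
```

Because both functions take the same lock, the reader can never observe a half-written command; the cost is that every post and every read contends for that one lock.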

Circular Buffers are a (usually fixed size) structure which can hold a certain amount of data. You start writing from Index 0 until Index N-1 and after that you start overwriting from Index 0. A fixed size means the storage is never reallocated as it fills up, the way a growing TArray’s would be. Why is that important? Because it allows us to read while writing (but, as with the message box, you don’t get to read something that is being written).
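
To make the wrap-around concrete, here is a minimal sketch in standard C++ rather than Unreal types. `IntRing` and `Write` are hypothetical names used only for illustration, and the slots hold plain integers instead of command headers.

```cpp
#include <cassert>
#include <cstddef>

// Minimal fixed-size ring: N slots, and the write index wraps to 0 after N - 1.
template <std::size_t N>
struct IntRing
{
    static_assert(N > 0, "A ring needs at least one slot");

    int Slots[N] = {};
    std::size_t NextToWrite = 0;

    // Stores Value in the current slot and returns the index that was written.
    std::size_t Write(int Value)
    {
        const std::size_t Written = NextToWrite;
        Slots[Written] = Value;
        NextToWrite = (NextToWrite + 1) % N; // wrap around at the end
        return Written;
    }
};
```

With `IntRing<3>`, the fourth call to `Write` lands on index 0 again and overwrites the first value, which is why the reader has to keep up with the writer.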

    /* N is the max number of objects in the queue */
    template<int32 N>
    struct SimulHeaderBufferQueue
    {
        SimulHeader Headers[N];
        int32 NextToRead = 0;
        int32 NextToWrite = 0;
        FCriticalSection Mutex;
        // The write and read functions shown later are members of this struct.
    };

This is a simple circular buffer implementation. It stores N SimulHeaders and tracks where to read next (if the buffer size is 10 and only 5 spots are written, then after reading these 5, it must read from Index 5 next). NextToWrite stores where the thread with write access is going to write next. Remember, we only allow one thread to write to a buffer. The other thread can only read. We also use templates to hammer home the idea that these are fixed size.

    struct SimulHeader
    {
        SimulCommand::Type CommandType = SimulCommand::S2G_NoCommand;
        void* Data = nullptr;
    };

The Header simply contains the CommandType and a pointer to the data. The Enum tells us which struct to cast the pointer to (for example, SCTW_LogMessage).

To write to the buffer we use:

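    template<class T>
    void AddData(SimulHeader& Header, T& Data)
    {
        // Spin until the writer owns the Mutex.
        while (!Mutex.TryLock());
        Headers[NextToWrite].CommandType = Header.CommandType != SimulCommand::S2G_NoCommand ? Header.CommandType : SimulCommand::TW_NoCommandSpecified;
        Headers[NextToWrite].Data = new T(Data);
        // Advance the write index, wrapping at N, then release the Mutex.
        NextToWrite = (NextToWrite + 1) % N;
        Mutex.Unlock();
    }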

We wait until we can Lock the Mutex ourselves and then proceed.
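
The same wait-then-write pattern can be sketched in standard C++, with `std::mutex` standing in for `FCriticalSection`. `SpinWriteBuffer` is a hypothetical name and the slots hold plain integers. The yield inside the loop and the guard that releases the lock are additions of this sketch, not something taken from the article’s code.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical buffer with a single int payload per slot.
template <int N>
struct SpinWriteBuffer
{
    int Slots[N] = {};
    int NextToWrite = 0;
    std::mutex Mutex;

    void AddData(int Value)
    {
        // Spin until the lock is ours, yielding so the other thread can run.
        while (!Mutex.try_lock())
        {
            std::this_thread::yield();
        }
        // Adopt the already-held lock so it is released on every exit path.
        std::lock_guard<std::mutex> Guard(Mutex, std::adopt_lock);

        Slots[NextToWrite] = Value;
        NextToWrite = (NextToWrite + 1) % N;
    }
};
```

Unreal’s `FScopeLock` offers a similar scoped release for `FCriticalSection`, although it blocks on the lock rather than spinning.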

To read we first check if there is something to read:

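    bool IsThereSomethingToRead()
    {
        if (!Mutex.TryLock())
            return false; // The writer holds the Mutex; try again later.
        else
        {
            // Inspect the slot while holding the Mutex, then release it.
            const bool bHasData = Headers[NextToRead].CommandType != SimulCommand::S2G_NoCommand
                && NextToRead != NextToWrite;
            Mutex.Unlock();
            return bHasData;
        }
    }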

This is the simplest way to do the read check. We use a Mutex, which we lock when writing. The reader function simply checks whether it can also lock the Mutex. There is a chance that the reading thread was paused at the else line. In that case, when actually reading the Headers, we have to ensure that we wait until the Mutex can be locked by the reader.
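
The article does not show `ReadNextSlot`, so the following is only a guess at its shape, written in standard C++ with stand-in types. `HeaderQueue`, `AddLog`, `HandleHeader` and the `CommandType` enum values are hypothetical, and the rule that the reader frees the payload after casting it is an assumption of this sketch.

```cpp
#include <cassert>
#include <mutex>
#include <string>

// Stand-ins for the article's enum and command struct; names are assumptions.
enum class CommandType { NoCommand, LogMessage };

struct LogMessage
{
    std::string Message;
};

struct Header
{
    CommandType Type = CommandType::NoCommand;
    void* Data = nullptr;
};

template <int N>
struct HeaderQueue
{
    Header Headers[N];
    int NextToRead = 0;
    int NextToWrite = 0;
    std::mutex Mutex;

    // Writer side: store a heap-allocated payload and advance the write index.
    void AddLog(const std::string& Text)
    {
        std::lock_guard<std::mutex> Guard(Mutex);
        Headers[NextToWrite].Type = CommandType::LogMessage;
        Headers[NextToWrite].Data = new LogMessage{Text};
        NextToWrite = (NextToWrite + 1) % N;
    }

    // Reader side: wait for the lock, copy the slot out, clear it, advance.
    Header ReadNextSlot()
    {
        std::lock_guard<std::mutex> Guard(Mutex);
        assert(NextToRead != NextToWrite && "Call only after the read check succeeded");
        Header Result = Headers[NextToRead];
        Headers[NextToRead] = Header{};
        NextToRead = (NextToRead + 1) % N;
        return Result;
    }
};

// Cast Data according to the command type, use it, then free it.
inline std::string HandleHeader(const Header& H)
{
    if (H.Type == CommandType::LogMessage)
    {
        LogMessage* Log = static_cast<LogMessage*>(H.Data);
        std::string Text = Log->Message;
        delete Log; // assumption: the reader owns the payload once it is read
        return Text;
    }
    return std::string();
}
```

Unlike the read check, this function blocks until the lock is free, which matches the requirement above that the actual read must wait for the writer to finish.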

Some of you may wonder why we use this simpler form of a message box. This is because the commands vary in importance and impact. By using this method we can trivially add new types of Circular Buffers as needed. For example, logging something is nice, but it is nowhere near as important as updating the player location, or as important and taxing as creating or rendering a new NPC. So we give each thread two of these buffers.

    // Declared first so the crucial buffers below can use it.
    typedef SimulHeaderBufferQueue<4096> UsedBuffer;

    static SimulHeaderBufferQueue<10000> GameThreadWriteBuffer;
    static SimulHeaderBufferQueue<10000> SimulationThreadWriteBuffer;
    static UsedBuffer GameThreadWriteBufferCrucial;
    static UsedBuffer SimulationThreadWriteBufferCrucial;

Logs go into GameThreadWriteBuffer and render commands go into GameThreadWriteBufferCrucial. This way the crucial commands can be tended to first, and then, if the frame time permits, the non-crucial commands can be looked at. Here is some example usage:
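
The "crucial first, the rest if the frame time permits" idea can be sketched in standard C++ as below. This is not the article’s code: `DrainForFrame` and `Command` are hypothetical names, the queues are plain `std::deque`s rather than the circular buffers, and the time budget is an assumed way of deciding whether the frame time permits more work.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <deque>
#include <functional>

using Clock = std::chrono::steady_clock;

// Hypothetical stand-in: each queued command is just a callable here.
using Command = std::function<void()>;

// Runs every crucial command, then non-crucial ones until the budget is spent.
// Returns the number of non-crucial commands left for a later frame.
inline std::size_t DrainForFrame(std::deque<Command>& Crucial,
                                 std::deque<Command>& NonCrucial,
                                 std::chrono::microseconds Budget)
{
    const Clock::time_point Deadline = Clock::now() + Budget;

    // Crucial commands are never skipped, whatever the budget says.
    while (!Crucial.empty())
    {
        Crucial.front()();
        Crucial.pop_front();
    }

    // Non-crucial commands only run while there is time left in the frame.
    while (!NonCrucial.empty() && Clock::now() < Deadline)
    {
        NonCrucial.front()();
        NonCrucial.pop_front();
    }
    return NonCrucial.size();
}
```

Anything left in the non-crucial queue simply waits for the next frame, so a burst of log messages cannot delay a spawn or a player-location update.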

    if (FSimulationMainThread::GetSimulationMainThread())
    {
        // Tell the Simulation Thread when the player has moved far enough.
        if (FVector::Dist(OldLocation, GetActorLocation()) > 200)
        {
            OldLocation = GetActorLocation();
            SimulHeader Header;
            Header.CommandType = SimulCommand::G2S_UpdatePlayerLocation;
            SCG2S_UpdatePlayerLocation PlayerLocation;
            PlayerLocation.NewLocation = GetActorLocation();
            FSimulationMainThread::GameThreadWriteBufferCrucial.AddData<SCG2S_UpdatePlayerLocation>(Header, PlayerLocation);
        }
        // Carry out the crucial commands sent by the Simulation Thread.
        while (FSimulationMainThread::SimulationThreadWriteBufferCrucial.IsThereSomethingToRead())
        {
            SimulHeader Header = FSimulationMainThread::SimulationThreadWriteBufferCrucial.ReadNextSlot();
            if (Header.CommandType == SimulCommand::S2G_CreateHouse)
            {
                SCS2G_CreateHouse& NewHouse = *(SCS2G_CreateHouse*)Header.Data;
                FActorSpawnParameters SpawnParam;
                SpawnParam.Owner = this;
                SpawnParam.SpawnCollisionHandlingOverride = ESpawnActorCollisionHandlingMethod::AdjustIfPossibleButAlwaysSpawn;
                ANPCHouse* House = GetWorld()->SpawnActor<ANPCHouse>(NPCHouseBP, NewHouse.SpawnLocation, FRotator(0, 0, 0), SpawnParam);
                House->House = NewHouse.House;
                MGTInterface->RenderedHouseList.Add(NewHouse.House, House);
            }
        }
    }

It is important to plan all this in advance so changes can be made easily. This is also the reason why I discuss thread communication before discussing how to create or use the threads. If you mess up this part, all is for naught. The next article will look at actually creating the threads. But, if you don’t want to wait, here is a nice post on UnrealWiki that you may refer to.

Thank you for reading. Have a good day.

**** this is good! Really looking forward to this tutorial series. I’m gasping here, drawing in small amounts of air in excitement; this has literally given me asthama exciteamitis!

PS. Please be easy on the beginners. I mean I will do my research but… yea its ok. I’d be super grateful just for having this series completed :wink:

PPS. Also so looking forward to The Absence. 800 is a high number of AI. Kinda gives a STALKER series type of vibe with their A-Life. Here is a link which you may probably already have read but is a good one if not. I’m not sure the AI will be of a similar scale but its a good read nonetheless.

this is interesting