How to efficiently render 10,000 or more soldiers?

Hi.
I am looking into making a game with battles on a scale comparable to Total War, and I hope someone here can offer suggestions.
Here is what I have found out so far:

  1. Each unit should be a StaticMesh animated with the vertex animation texture technique: no SkeletalMesh, Character, or other ready-made Unreal features, and physics turned off completely;
  2. Using one Actor per unit, with custom functionality added to it, is not suitable. On my computer, FPS drops below 60 at around 3000 units. I have moved all the movement calculations to separate threads, so the GameThread only has to walk several arrays in one loop and assign Location, Rotation, etc. to each Actor. Even with this implementation, the call to UWorld_SendAllEndOfFrameUpdates takes 8+ ms of each frame, and that cost grows with the number of Actors;
  3. Niagara helps achieve high performance: the same 3000 units render with plenty of FPS headroom. However, using Niagara for this purpose feels like a dirty hack. For example, each system has Bounds, a volume within which particles are visible to the player. As soon as the Bounds leave the camera’s view, all particles disappear. This can be fixed by setting fixed bounds the size of the entire battlefield. Also, transferring an array of 2000 vectors to the system parameters (to GPU memory, if I understood correctly) takes ~5 ms. That is significant and will only grow, because besides each particle’s location and rotation you also need to transfer information such as the “alive” and “selected by the player” flags;
  4. I looked at MassEntity but did not go into detail, because it appears to render the same Actors, whose limitations I have already found. Perhaps I am wrong;
  5. The last option I know of at the moment is Instanced Static Mesh. A thousand identical knights should go to the renderer in one draw call, which is good. But the component is designed for drawing trees, clusters of identical village houses, and the like. Driving position on the map and animation through world position offset seems possible, but it also looks like a dirty hack.

In short, solving this problem seems to need something like Niagara, but built for Actors rather than for particles that appear and disappear. Has anyone solved a similar problem and can share their experience?
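
On the concern in item 3 about also transferring flags such as “alive” and “selected by the player”: one way to avoid a separate array per flag is to pack all flags into a single number that travels alongside the position. The following is only a sketch in plain C++, not engine code. The names `UnitFlags`, `PackedUnit`, `PackFlags`, and the rest are hypothetical, and it assumes the transfer channel carries 32-bit floats (for example the fourth component of a 4-component vector).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag bits; the names are illustrative, not engine API.
enum UnitFlags : std::uint32_t {
    kAlive    = 1u << 0,
    kSelected = 1u << 1,
};

// A 32-bit float represents every integer below 2^24 exactly, so up to
// 24 flag bits survive a round trip through a float channel.
inline float PackFlags(std::uint32_t flags) {
    assert(flags < (1u << 24));
    return static_cast<float>(flags);
}

inline std::uint32_t UnpackFlags(float packed) {
    return static_cast<std::uint32_t>(packed);
}

// One 4-float record per unit: position in x, y, z and packed flags in w.
struct PackedUnit {
    float x, y, z, w;
};

inline PackedUnit MakePackedUnit(float x, float y, float z, std::uint32_t flags) {
    return PackedUnit{x, y, z, PackFlags(flags)};
}

inline bool HasFlag(const PackedUnit& unit, std::uint32_t flag) {
    return (UnpackFlags(unit.w) & flag) != 0;
}
```

With this layout, a unit that is alive and selected would be built as `MakePackedUnit(x, y, z, kAlive | kSelected)`, and the receiving side would recover each flag with a bitwise test. Whether this actually reduces the ~5 ms transfer cost would have to be measured.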

On the rendering side, it has to be done this way and only this way:

  • If each unit has a hundred or more triangles, units should be rendered as an instanced mesh.

  • If at any point a unit gets so small on screen that any of its triangles covers fewer than about 30 screen pixels, the geometry has to be simplified and the unit rendered with a lower-resolution instanced mesh.

  • If at any point you need such a low geometry resolution that a unit contains fewer than a hundred triangles, a group of several units needs to be merged into a single mesh.

Any deviation from these rules is substandard.
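
As a rough illustration of the three rules, here is a minimal sketch in plain C++ of a per-unit LOD decision. It makes several assumptions that are not part of the rules themselves: the unit's on-screen area in pixels is already estimated elsewhere, the average pixels per triangle (area divided by triangle count) stands in for "any of its triangles", and the names `ChooseLod`, `LodChoice`, and `RenderPath` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class RenderPath { InstancedMesh, MergedGroup };

struct LodChoice {
    std::size_t lodIndex;  // index into the LOD list, 0 = most detailed
    RenderPath path;       // how units at this LOD should be drawn
};

constexpr float kMinPixelsPerTriangle = 30.0f;  // threshold from rule 2
constexpr int kMinTrianglesPerInstance = 100;   // threshold from rules 1 and 3

// triangleCounts: triangle count of each LOD, ordered from most to least detailed.
// unitScreenAreaPx: estimated on-screen area of one unit in pixels (assumed given).
inline LodChoice ChooseLod(const std::vector<int>& triangleCounts,
                           float unitScreenAreaPx) {
    assert(!triangleCounts.empty());
    // Fall back to the coarsest LOD if even that one has tiny triangles.
    std::size_t chosen = triangleCounts.size() - 1;
    for (std::size_t i = 0; i < triangleCounts.size(); ++i) {
        // Average coverage is an approximation of the per-triangle rule.
        const float pixelsPerTriangle =
            unitScreenAreaPx / static_cast<float>(triangleCounts[i]);
        if (pixelsPerTriangle >= kMinPixelsPerTriangle) {
            chosen = i;  // most detailed LOD whose triangles are large enough
            break;
        }
    }
    // Below a hundred triangles, instancing single units stops paying off,
    // so several units should be merged into one mesh.
    const RenderPath path = triangleCounts[chosen] >= kMinTrianglesPerInstance
                                ? RenderPath::InstancedMesh
                                : RenderPath::MergedGroup;
    return LodChoice{chosen, path};
}
```

For example, with LODs of 2000, 500, 120, and 40 triangles, a unit covering about 20,000 pixels would land on the 500-triangle LOD as an instanced mesh, while one covering about 1,500 pixels would land on the 40-triangle LOD and be flagged for merging.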

To achieve that, you can use an Instanced Static Mesh component, a Hierarchical Instanced Static Mesh component, Niagara, or your own custom component.
Unless for some reason you need Instanced Static Mesh functionality beyond rendering (such as collision queries), Niagara is preferable.

Building a system that keeps you within the three rules above is up to you, regardless of whether you use ISM, HISM, or Niagara.

Hi DeathreyCG.
The soldiers should be smart, like in Total War, and collisions need to be calculated too.
Because of this, I can’t do all the calculations in Niagara. I currently use the following architecture:

  1. I have a struct like this:
struct 
{
    TArray<FVector> Locations;
    TArray<FVector2d> ForwardVectors;
    ...
};
  2. I do all calculations in separate threads and fill this struct each frame;
  3. The GameThread only needs to send the data arrays to Niagara using UNiagaraDataInterfaceArrayFunctionLibrary::SetNiagaraArrayVector. This takes about 5 ms per frame for an array of 2000 entries;
  4. The Niagara emitter sets the location, rotation, and animation for each particle. The index in the array equals the particle’s index;
  5. The emitter has LODs using the renderer tag.
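
The hand-off between the worker threads and the GameThread described in the steps above could look roughly like the sketch below. It is plain C++ rather than engine code: `Vec3` and `Vec2` are stand-ins for `FVector` and `FVector2d`, `UnitFrame` stands in for the struct of arrays from step 1, and `FrameExchange` is a hypothetical helper. It assumes one producer publishing whole frames and one consumer on the game thread.

```cpp
#include <cassert>
#include <mutex>
#include <utility>
#include <vector>

struct Vec3 { float x, y, z; };  // stand-in for FVector
struct Vec2 { float x, y; };     // stand-in for FVector2d

// Stand-in for the struct of arrays: one entry per unit in each array.
struct UnitFrame {
    std::vector<Vec3> locations;
    std::vector<Vec2> forwardVectors;
};

// Hypothetical single-slot exchange between a worker and the game thread.
class FrameExchange {
public:
    // Worker side: hand over a completed frame, replacing any unread one.
    void Publish(UnitFrame&& frame) {
        std::lock_guard<std::mutex> lock(mutex_);
        pending_ = std::move(frame);
        hasPending_ = true;
    }

    // Game-thread side: take the newest frame if there is one.
    // Returns false and leaves `out` untouched when nothing new was published.
    bool TryConsume(UnitFrame& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!hasPending_) {
            return false;
        }
        // Swapping exchanges the vectors' internal buffers in constant time,
        // so the lock is held briefly and no per-unit data is copied.
        std::swap(out, pending_);
        hasPending_ = false;
        return true;
    }

private:
    std::mutex mutex_;
    UnitFrame pending_;
    bool hasPending_ = false;
};
```

In this sketch the game thread would call `TryConsume` once per frame and, on success, pass the arrays on to the renderer. This only covers the CPU-side hand-off and does nothing about the cost of the upload to Niagara itself.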

The performance of this approach is not bad, but I think it’s a dirty hack. Niagara is not designed for this. I’m looking for the most elegant solution.

Hopefully relevant: “simulating large crowds in niagara” :)