How to efficiently render 10,000 or more soldiers?

Hi DeathreyCG.
The soldiers should be smart, like in Total War, and collisions need to be calculated too.
Because of this I can’t do all the calculations in Niagara. I currently use the following architecture:

  1. I have a struct like this:
struct 
{
    TArray<FVector> Locations;
    TArray<FVector2d> ForwardVectors;
    ...
};
  2. I do all calculations in separate threads and fill this struct each frame;
  3. The GameThread only needs to send the data arrays to Niagara using UNiagaraDataInterfaceArrayFunctionLibrary::SetNiagaraArrayVector. This takes about 5 ms per frame for an array of 2000 entries;
  4. The Niagara emitter sets the location, rotation and animation for each particle. The index in the array is equal to the particle’s index;
  5. The emitter has LODs using the renderer tag.
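
To make the hand-off between the worker threads and the GameThread more concrete, here is a minimal sketch of how it could look. It is plain C++ so it compiles outside the engine: Vec3 and Vec2 are stand-ins for FVector and FVector2d, std::vector stands in for TArray, and SoldierSnapshot, SnapshotMailbox and StepSimulation are hypothetical names, not my real code or any engine API.

```cpp
#include <cstddef>
#include <mutex>
#include <utility>
#include <vector>

// Stand-ins for Unreal's FVector / FVector2d (assumption: engine types not available here).
struct Vec3 { double X, Y, Z; };
struct Vec2 { double X, Y; };

// Structure-of-arrays snapshot: entry i describes soldier i (and particle i in Niagara).
struct SoldierSnapshot {
    std::vector<Vec3> Locations;
    std::vector<Vec2> ForwardVectors;
};

// Hypothetical single-slot mailbox between the simulation thread and the game thread.
class SnapshotMailbox {
public:
    // Simulation thread: hand over a finished snapshot (replaces any unread one).
    void Publish(SoldierSnapshot&& Finished) {
        std::lock_guard<std::mutex> Lock(Mutex);
        Pending = std::move(Finished);
        bHasPending = true;
    }

    // Game thread: take the newest snapshot if one arrived since the last call.
    // Returns false and leaves Out untouched when nothing new is available.
    bool TryConsume(SoldierSnapshot& Out) {
        std::lock_guard<std::mutex> Lock(Mutex);
        if (!bHasPending) {
            return false;
        }
        std::swap(Out, Pending);  // swaps vector internals, no per-element copy
        bHasPending = false;
        return true;
    }

private:
    std::mutex Mutex;
    SoldierSnapshot Pending;
    bool bHasPending = false;
};

// Hypothetical simulation step: move every soldier along its forward vector.
SoldierSnapshot StepSimulation(const SoldierSnapshot& Prev, double Speed, double DeltaSeconds) {
    SoldierSnapshot Next = Prev;
    for (std::size_t i = 0; i < Next.Locations.size(); ++i) {
        Next.Locations[i].X += Next.ForwardVectors[i].X * Speed * DeltaSeconds;
        Next.Locations[i].Y += Next.ForwardVectors[i].Y * Speed * DeltaSeconds;
    }
    return Next;
}
```

In the engine, the GameThread would call TryConsume once per tick and, when it returns true, pass the Locations array to SetNiagaraArrayVector. The mutex is only held for the swap, so neither side waits on the other while the arrays are being filled or uploaded.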

The performance of this approach is not bad, but it feels like a dirty hack, because Niagara is not designed for this. I’m looking for a cleaner solution.