Context: another cube voxel engine, following the “standard” practices:
1. The world is split into chunks - each chunk is an Actor
2. Only the “surface” faces are added as render geometry for each chunk
3. Chunks are rendered independently, one after another
4. Two steps for each chunk:
4.1. initial building of the chunk mesh;
4.2. updating the chunk mesh when new voxels are added or existing voxels are removed
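For readers new to the “surface faces only” practice listed above, here is a minimal sketch in plain C++ (no engine types). The `Chunk`, `Face` and `ExtractSurfaceFaces` names are hypothetical, and treating voxels outside the chunk as empty is an assumption of the sketch, not something stated in this thread.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical chunk layout: N*N*N voxels, 0 = empty, anything else = solid.
constexpr int N = 32;

struct Chunk {
    std::vector<std::uint8_t> voxels = std::vector<std::uint8_t>(N * N * N, 0);

    // Assumption: anything outside the chunk counts as empty.
    bool IsSolid(int x, int y, int z) const {
        if (x < 0 || y < 0 || z < 0 || x >= N || y >= N || z >= N) return false;
        return voxels[(z * N + y) * N + x] != 0;
    }
    void Set(int x, int y, int z, std::uint8_t v) { voxels[(z * N + y) * N + x] = v; }
};

// One exposed quad: the voxel it belongs to and which way it faces (index into kDirs).
struct Face { int x, y, z, dir; };

constexpr int kDirs[6][3] = {
    {+1, 0, 0}, {-1, 0, 0}, {0, +1, 0}, {0, -1, 0}, {0, 0, +1}, {0, 0, -1}};

// A face is part of the surface when its voxel is solid and the neighbour
// on that side is empty.
std::vector<Face> ExtractSurfaceFaces(const Chunk& c) {
    std::vector<Face> faces;
    for (int z = 0; z < N; ++z)
        for (int y = 0; y < N; ++y)
            for (int x = 0; x < N; ++x) {
                if (!c.IsSolid(x, y, z)) continue;
                for (int d = 0; d < 6; ++d) {
                    if (!c.IsSolid(x + kDirs[d][0], y + kDirs[d][1], z + kDirs[d][2]))
                        faces.push_back({x, y, z, d});
                }
            }
    return faces;
}
```

The resulting face list is what would then be turned into vertices/triangles (PMC/DMC) or into per-face instances (ISMC).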
Tried:
PMC: one ProceduralMeshComponent per Chunk, vertices (and triangles) of the “surface” mesh are added as MeshSections (per different Materials);
Result:
Initial chunk mesh building: not great, but tolerable (exact performance metrics not recorded)
Update chunk mesh during play: PMC needs to rebuild the mesh sections when any vertices are changed (no incremental update) - not ideal, but still playable, with glitches when updating voxels
DMC: one DynamicMeshComponent per Chunk; the vertices/triangles of the surface mesh are added to it
Result:
Initial chunk mesh building: comparable to PMC - basically adding the same order of magnitude of vertices/triangles
Update chunk mesh during play: DMC can incrementally update vertices/triangles (FDynamicMesh3) - performance seems better than PMC - smooth overall
ISMC (5.4-preview-1): the reason I waited until 5.4 is that 5.3.2 still has a bug in RemoveInstanceInternal: the InstanceBody removal logic doesn’t respect the “bUseRemoveAtSwap” flag. This only got fixed in the 5.4/ue5-main branches.
For the ISMC approach, I generated the 6 voxel faces as StaticMesh assets using a Python script. All the faces are actually the same square, but with different orientations, so I don’t have to rotate a face at runtime; also, each face can have different materials. The Python script uses GeneratedDynamicMeshActor.OnRebuildGeneratedMesh to build the faces dynamically with Geometry Scripting functions (basically the same FDynamicMesh3 stuff), and finally converts the face meshes to static mesh assets. Those face static meshes are then used as the base static meshes for the instanced static meshes.
For each chunk, to keep things simple, I only handled the single-voxel-type case: I created 6 ISMCs, where each ISMC contains all the instances for the corresponding face of each voxel of this chunk and uses the corresponding generated face static mesh from above. Each ISMC holds roughly 10k to 20k instances, as observed in the debugger.
With the ISMC approach, the initial chunk mesh building is extremely slow, about 10x slower than DMC, and before the building even finished, an ensure condition was hit at RenderGraphUtils.h line 433 in ValidateGroupCount. Switching from AddInstanceById to the batched AddInstancesById still gave the same result.
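To make the “6 ISMCs per chunk, one per face direction” layout concrete, here is a rough sketch in plain C++ that groups exposed faces into six buckets, one per direction. `Face`, `Offset` and `BucketFacesByDirection` are hypothetical names, and `Offset` is only a stand-in for a full instance transform. Each bucket would then go to the corresponding ISMC in one batched add; the sketch only shows the data layout and says nothing about the cost of the add itself.

```cpp
#include <array>
#include <cstddef>
#include <vector>

struct Face { int x, y, z, dir; };   // dir in [0, 6): which of the 6 face meshes to use
struct Offset { float x, y, z; };    // stand-in for an instance transform (translation only)

using FaceBuckets = std::array<std::vector<Offset>, 6>;

FaceBuckets BucketFacesByDirection(const std::vector<Face>& faces, float voxelSize) {
    // First pass: count per direction so each bucket is allocated exactly once.
    std::array<std::size_t, 6> counts{};
    for (const Face& f : faces) ++counts[static_cast<std::size_t>(f.dir)];

    FaceBuckets buckets;
    for (std::size_t d = 0; d < 6; ++d) buckets[d].reserve(counts[d]);

    // Second pass: convert voxel coordinates to local offsets and append.
    for (const Face& f : faces) {
        buckets[static_cast<std::size_t>(f.dir)].push_back(
            {f.x * voxelSize, f.y * voxelSize, f.z * voxelSize});
    }
    return buckets;
}
```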
I’m not going to dive into the implementation details of ISMC, but it does not seem efficient: instances and body instances are all kept around, and for overlap / raycast tests it even goes through the instances / bodies sequentially. That doesn’t look like a very performant implementation, let alone that each AddInstance(s) call might need to rebuild the render data and physics data (and I saw navigation updates as well).
So far, the conclusion I can draw is that ISMC is not meant to be used for a huge number of instances: a few hundred maybe; several thousand instances, a definite NO.
I think ISMs are very efficient for static objects - but yeah, transforming them isn’t exactly lightning fast - it looks like they’ve made some changes in 5.4 by pre-calculating some things - I haven’t tested whether that speeds it up yet, though.
The actual instance data itself is available - you can get pointers to the transform data (an array of matrices), custom data, etc. and make the changes yourself, then just mark it as dirty (prior to 5.4, you also need to increment ismc->InstanceUpdateCmdBuffer.NumEdits, but that’s changed now).
There are also Batch methods for processing arrays of instance data.
Maybe another approach would be more optimal? What about having static instances in the distance which switch to dynamic actors when close by?
For something like laying out a voxel area - perhaps having tiles of ISMCs where you’re laying out other tiles in the background?
Also - you may actually find that just using a full cube voxel mesh is more efficient than breaking it into sides and calculating which ones to show.
The problem I’m facing with ISMC is actually the performance of initially building up the instances (AddInstance(s)) - adding 10k ~ 20k instances exhibits a huge performance problem, and eventually, after adding several chunks, it crashed the renderer (ensure).
I haven’t even had the chance to get to the “updating” side of things - once the expensive initial build-up finishes successfully, updating will actually be much more lightweight, since only a handful of voxels change at any one moment.
Furthermore, while I appreciate the optimization advice and thoughts, I’m currently mostly on the benchmarking side - generating the entire world (800 chunks, with 32x32x32 voxels per chunk) all at once - PMC & DMC can handle this load “pretty” fine, while ISMC seems to fail miserably - and when digging into “some” implementation details of ISMC (as described in my initial post), it really doesn’t seem that performant by nature.
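For scale, the benchmark figures quoted in this thread work out as below. This is just arithmetic on those numbers; the last two lines extrapolate the 10k to 20k instances-per-ISMC observation to all 800 chunks, which is an assumption and not something that was actually measured (the ISMC run never got that far).

```cpp
#include <cstdint>

constexpr std::int64_t kChunks          = 800;
constexpr std::int64_t kVoxelsPerChunk  = 32 * 32 * 32;               // 32,768
constexpr std::int64_t kTotalVoxels     = kChunks * kVoxelsPerChunk;  // 26,214,400

constexpr std::int64_t kIsmcsPerChunk   = 6;                          // one per face direction
constexpr std::int64_t kTotalIsmcs      = kChunks * kIsmcsPerChunk;   // 4,800

// Assumption: every ISMC ends up in the observed 10k to 20k range.
constexpr std::int64_t kMinInstances    = kTotalIsmcs * 10000;        // 48,000,000
constexpr std::int64_t kMaxInstances    = kTotalIsmcs * 20000;        // 96,000,000
```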
The reason I decided to give ISMC a try was that I thought ISMC might handle “merging” of the face quad meshes (instances) more efficiently, for both rendering and collision, than manually adding vertices/triangles in PMC/DMC - but the reality is that ISMC has to keep the instances around for its designed logic - it has to maintain the instances / bodies to return InstanceIDs when doing physics queries, just as an example.
The Batch update functionality is still for existing instances, so it won’t help the initial adding - AddInstances seems to do the batching already, but it doesn’t seem to be as effective as expected.
I’m not sure how you did it: did you use AddInstances() to create the 20-40k instances on the same ISMC in the same tick? Is your rdInst plugin using the stock UE ISMC, or did you roll your own?
Yes, that’s 20-50K instances spawned in a single frame in that tutorial - the internals use AddInstances, but it also has a “recycle” system that reuses existing old instances that have been hidden. This test was on a 12th gen i7 with an RTX4070.
It was tested in UE5.3 though - I’m wondering if there is an issue with 5.4 - I’ll do some testing when I get the chance.
Maybe try using HISMs in 5.3 to get the RemoveAtSwap (or just hide them and recycle when needed) to see how performance compares with 5.4?
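The “hide and recycle” idea can be sketched as a simple free list, shown here in plain C++ without engine types. This is not how rdInst is implemented (its internals aren’t described in this thread); `InstancePool`, `Acquire` and `Release` are hypothetical names, and the `visible` flag stands in for whatever hiding mechanism is used on the real component.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the instance array of one instanced-mesh component.
class InstancePool {
public:
    // Returns the index of a visible instance at the given position,
    // reusing a hidden slot when one is available instead of growing the array.
    std::size_t Acquire(float x, float y, float z) {
        std::size_t index;
        if (!free_.empty()) {
            index = free_.back();
            free_.pop_back();
        } else {
            index = instances_.size();
            instances_.push_back(Instance{});
        }
        instances_[index] = Instance{x, y, z, true};
        return index;
    }

    // Hides the instance and remembers its slot for reuse; nothing is erased,
    // so the indices of other instances stay valid.
    void Release(std::size_t index) {
        if (index < instances_.size() && instances_[index].visible) {
            instances_[index].visible = false;
            free_.push_back(index);
        }
    }

    std::size_t Capacity() const { return instances_.size(); }
    std::size_t VisibleCount() const { return instances_.size() - free_.size(); }

private:
    struct Instance { float x, y, z; bool visible; };
    std::vector<Instance> instances_;
    std::vector<std::size_t> free_;   // indices of hidden, reusable slots
};
```

With a pool like this, removing and re-adding voxels only touches existing slots, which is the “update” case; it does nothing for the very first fill, where every instance still has to be added once.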
I created a simple blueprint-only project to showcase the performance issues I encountered; anyone interested, please feel free to play with it:
It creates 200 ISM Actors, each containing one ISMC, which in turn contains 200x200=40,000 instances, so altogether 8 million instances are added to the scene.
The initial spawning process takes a while to complete due to the spawning logic (see PC_ISM event graph for details); when the camera is moved around, the rendering exhibits huge lag in GPU timing, even though no geometry is changed!
Hi! Did you find a solution to the ISM lag? I’m trying to make a voxel world, but the performance of PMC is too bad when I create the sections for one PMC every 0.05 seconds.