Instanced static mesh transform update performance

I’m trying to update the transforms on an instanced static mesh (with 50k instances), but I’m finding the performance unacceptable (7-8 fps). I can call BatchUpdateInstancesTransforms with bMarkRenderStateDirty set to false without dropping the framerate, but not with it set to true. Calling MarkRenderStateDirty() once per tick after the batch update performs equally badly. If I don’t mark the render state dirty, the instances don’t appear to move.

The use case is a bullet hell sort of thing and I’m attempting to implement an optimized (and simplified) physics and collision system that applies transform changes to the ISM.

So the question is, am I missing a setting or function call that will work for this, or is it not possible to update ISM instance transforms in realtime? If it’s not, is there any other construct that might work? Projectiles as individual actors (especially attempting to use the engine’s standard collision system) were far too slow. The ISM can at least render the required number of cubes.

I’m not in front of the editor at the moment to test this out, but in the past I’ve iterated through the instances and set each new transform with UpdateInstanceTransform. The key to making that work efficiently is to pass bMarkRenderStateDirty as true only on the very last instance update. Otherwise you’re performing a render update for every transform update, which of course is really slow. I’ve never used BatchUpdateInstancesTransforms, so I can’t speak to whether or not it’s doing that. If you haven’t solved this by the time I’m in front of an editor, I’ll take a look at the function to see what it’s doing, but I suspect it might not be doing what we think.
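To make the "dirty only on the last update" idea concrete, here is a minimal standalone C++ sketch. It does not use the Unreal API: `MockInstancedMesh`, `UpdateInstance`, and `MoveAll` are hypothetical stand-ins that only mimic the shape of an update call with a dirty flag, and the counter is an assumption used to show how often a rebuild would be triggered.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in types; this is NOT the Unreal API.
struct Transform { float x = 0.0f, y = 0.0f, z = 0.0f; };

struct MockInstancedMesh {
    std::vector<Transform> instances;
    int renderRebuilds = 0;  // counts how often render data would be rebuilt

    explicit MockInstancedMesh(std::size_t count) : instances(count) {}

    // Mimics an "update one instance" call that takes a dirty flag.
    void UpdateInstance(std::size_t index, const Transform& t, bool markDirty) {
        assert(index < instances.size());
        instances[index] = t;
        if (markDirty) { ++renderRebuilds; }
    }
};

// Moves every instance along +x and sets the dirty flag only on the last write.
inline void MoveAll(MockInstancedMesh& mesh, float dx) {
    const std::size_t n = mesh.instances.size();
    for (std::size_t i = 0; i < n; ++i) {
        Transform t = mesh.instances[i];
        t.x += dx;
        mesh.UpdateInstance(i, t, /*markDirty=*/i + 1 == n);
    }
}
```

With 1000 mock instances, one call to `MoveAll` bumps the rebuild counter once instead of 1000 times. As the thread goes on to note, even a single rebuild per tick can still be expensive at 50k instances, so this only removes the per-instance overhead.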

Thinking about this problem a bit I reached back into my old tutorials list to an optimization video I usually go to for ISM work.

UE4 Optimization: Instancing - YouTube

In it he’s simulating an asteroid belt with thousands of asteroids while maintaining frame rate. Admittedly I’ve never had to work in that territory, which is where you appear to be. What I found interesting is that, in the section starting around 20:12, he talks about what I mentioned with setting the render state dirty, and goes on to mention (and demonstrate) how with that many actors it still tanks performance. In fact, his goes from 120 fps down to 7-8 fps, just like your situation. For his asteroid belt he settles on rotating the entire belt (actor) rather than each individual instance.

It’s the next section that might contain information that helps you out. You can essentially create a single static mesh out of a group of individual static meshes and use that as your instance rather than the individual static meshes. Consider this: even if you only turned 2 bullets into a single static mesh, that would cut your static mesh instance count in half, thus halving the iterations on tick. I feel like this will be the key to getting the performance you want.

The only other concern I might have is running any sort of per-instance physics or collision with the number of instances you’re trying to use. Simplified or not, I’d imagine that is going to be really expensive. This might come down to a trade-off between performance and functionality in the end: if you want the cool visual, you might need to lower your expectations with regard to simulation of effects. In the tutorial, he’s moving the entire asteroid belt rather than the individual meshes. This means he loses out on the per-instance fidelity that might make it look cooler (or more realistic) but gains the performance required. He ends up with something that still looks visually stunning, so your mind ignores the minor details.

Either way, it sounds like a fun problem to solve. Good luck!!

Thanks for the info, I’ll check that out. I thought an ISM would be the natural structure for reducing batch calls, but apparently it does some extra work that makes it unsuitable for individually movable meshes.

This is actually something I did in an earlier iteration of this project in a custom engine. The proof of concept used 100,000 projectiles colliding against 100 targets as a core gameplay mechanic. It was able to achieve >60 fps in a dev build. But the rendering was simple 2D point sprites in raw OpenGL, with a flat data structure that minimized cache misses when iterating over the entities, and it used 4 threads for the collision. Anyway, I wouldn’t expect to reach that scale in Unreal, but I was hoping for at least 10k projectiles against 20 or so targets. Even if that’s not feasible, I was surprised that just marking the render state dirty was such a big hit.
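For anyone curious what that kind of layout can look like, here is a rough standalone C++ sketch of a flat (structure-of-arrays) projectile store with a circle-overlap test split across worker threads. This is not the code from that engine: the names `Projectiles`, `Target`, and `FindHits`, the circular targets, and the even chunking are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Flat layout (assumed): each field lives in its own contiguous array.
struct Projectiles {
    std::vector<float> x, y;
};

// Hypothetical circular target.
struct Target { float x, y, radius; };

// Returns hit[i] == 1 for every projectile inside at least one target.
// The projectile range is split into contiguous chunks, one per thread.
inline std::vector<unsigned char> FindHits(const Projectiles& p,
                                           const std::vector<Target>& targets,
                                           unsigned threadCount = 4) {
    assert(p.x.size() == p.y.size());
    assert(threadCount > 0);
    const std::size_t n = p.x.size();
    std::vector<unsigned char> hit(n, 0);

    // Each worker reads shared data and writes only its own slice of hit.
    auto worker = [&](std::size_t begin, std::size_t end) {
        for (std::size_t i = begin; i < end; ++i) {
            for (const Target& t : targets) {
                const float dx = p.x[i] - t.x;
                const float dy = p.y[i] - t.y;
                if (dx * dx + dy * dy <= t.radius * t.radius) {
                    hit[i] = 1;
                    break;
                }
            }
        }
    };

    const std::size_t chunk = (n + threadCount - 1) / threadCount;
    std::vector<std::thread> threads;
    for (unsigned k = 0; k < threadCount; ++k) {
        const std::size_t begin = std::min(n, k * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        if (begin < end) { threads.emplace_back(worker, begin, end); }
    }
    for (std::thread& th : threads) { th.join(); }
    return hit;
}
```

Because every thread writes to a disjoint slice of `hit`, no locking is needed. A real game would more likely keep a persistent thread pool than spawn threads every frame, and would add a broad phase before testing every projectile against every target.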
