InstancedStaticMesh vs SkeletalMesh


Once again I have a question about skeletal meshes: how does their performance compare to instanced static meshes?

In my current project I want to spawn small meshes (around 600 triangles each) along an animated spline curve. Collisions are disabled.

The curve is animated at runtime, using a multithreaded system to handle all the calculations, and this system provides an array of calculated transforms.

I managed to write a class that generates a skeletal mesh at runtime (when the actor spawns). The skeletal mesh is composed of a root bone and one bone per mesh instance.
However, reading the forums, I realized that InstancedStaticMeshes are available and that there are blueprint nodes to update the transforms of the individual mesh instances.

In the final game I will have like 60 meshes with 120 instances each visible at once.

What is better suited to render these meshes?

a) 60 skeletal meshes with 120 bones (one per instance)(each bone transform is updated every frame)
b) 60 instanced static meshes with 120 instances (each instance transform is updated every frame)

Is there any disadvantage with instanced static meshes?

Looking for your answers


In your scenario, I believe Instanced Static Mesh would give you much better performance.

Instanced Static Mesh: all 120 instances are drawn with a single draw call. Even if this draw call is updated each frame, nothing else changes (the vertices are all at the same positions; only the transform of each instance differs).

120 separate copies of a static mesh each need their own draw call (so 120 of them).

As for the number of vertex updates, the GPU should take about the same amount of time either way, and your position calculations should take about the same time on the CPU too. The only difference is that overall, with instanced meshes you’ll make 60 draw calls (one for each original), and with skeletal meshes you’ll make 7200 draw calls (one for each copy of whatever original is being used).
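To make those draw-call counts concrete, here is a rough back-of-the-envelope sketch in Python. The function names are made up for illustration, and it assumes one material section per mesh, so one draw call per instanced component or per individual copy; a real engine may batch or split calls differently.

```python
def draw_calls_instanced(mesh_types: int) -> int:
    """One draw call per instanced component (assumes a single material section)."""
    return mesh_types


def draw_calls_separate_copies(mesh_types: int, copies_per_type: int) -> int:
    """One draw call per individual copy when nothing is instanced or merged."""
    return mesh_types * copies_per_type


if __name__ == "__main__":
    # Numbers from the question: 60 meshes, 120 instances each.
    print(draw_calls_instanced(60))             # 60
    print(draw_calls_separate_copies(60, 120))  # 7200
```

The 7200 figure only applies when every copy is submitted as its own object.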

Check how much time is currently spent in the Render Thread (where draw calls are made), e.g. by running ‘stat unit’ from the console on the hardware you intend to target. If this is not a bottleneck for you now, then overall you won’t see any improvement.

For example, if your ‘stat unit’ with skeletal mesh shows:
Frame: 10ms
Game: 2ms
Draw: 10ms
GPU: 10ms
This tells you that Draw (Render Thread) takes the longest (the GPU ends up waiting on Draw, and thus shows the same time). Although reported separately, time spent on the Render Thread is still CPU time.

If your current stats are:
Frame: 10ms
Game: 2ms
Draw: 5ms
GPU: 10ms
This tells you that the video card takes the most time, and switching to instancing won’t give you any benefit, since the GPU will still draw the same number of vertices/polygons.

Last scenario, if your stats are:
Frame: 10ms
Game: 10ms
Draw: 10ms
GPU: 10ms
This means that the CPU (Game thread) takes the longest, and the other two are waiting on it. Again, you won’t see any performance benefit from optimizing draw calls; you’ll need to optimize CPU time first.
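The way of reading the three ‘stat unit’ examples above can be sketched as a small Python helper. This is only a heuristic, not something the engine provides: the function name is hypothetical, and it assumes that when timings tie, the earlier stage (Game, then Draw, then GPU) is the real bottleneck because the later stages are just waiting on it.

```python
def likely_bottleneck(game_ms: float, draw_ms: float, gpu_ms: float) -> str:
    """Guess which 'stat unit' timer is the bottleneck.

    Assumption: on ties the earlier stage (Game, then Draw, then GPU) is
    blamed, since later stages include time spent waiting on it.
    """
    stages = [("Game", game_ms), ("Draw", draw_ms), ("GPU", gpu_ms)]
    # max() returns the first maximal element, so ties go to the earlier stage.
    return max(stages, key=lambda stage: stage[1])[0]


if __name__ == "__main__":
    print(likely_bottleneck(2, 10, 10))   # Draw: fewer draw calls should help
    print(likely_bottleneck(2, 5, 10))    # GPU: instancing alone won't help
    print(likely_bottleneck(10, 10, 10))  # Game: optimize game-thread CPU time first
```

Only the first case suggests that cutting draw calls would shorten the frame.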

The target platform is PC (Windows, Linux, MacOS).

If InstancedStaticMeshes are rendered with only one draw call, this would indeed be an alternative, as the overhead of updating the internal state of skeletal meshes is removed.

However, what I don’t understand is why a skeletal mesh would take 7200 draw calls.
I repeat the static mesh multiple times(120x) in a single vertex buffer and build the bone hierarchy at runtime (when the actor spawns). One mesh with 120 bones, one bone per instance of the
As far as I understand, a skeletal mesh takes only 1 draw call and the skinning takes place on the GPU, so shouldn’t it be the same as an InstancedStaticMesh, except for the internal overhead of updating bone matrices?

Ahh, I see. In that case the skeletal option would be sent to the GPU in one call, but with each vertex included separately, so it is just a 120x bigger draw call (draw calls still happen each frame, regardless of where the skinning takes place). In addition you have bone matrices to send, but these are equivalent to the transforms that are sent for each instance of a static mesh.
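As a rough illustration of the "120x bigger" point, the Python sketch below compares how many triangles each approach keeps in a mesh's buffers against how many are drawn per frame. The constants come from the question; the function names are hypothetical, and LODs and culling are ignored.

```python
TRIANGLES_PER_MESH = 600  # from the question
INSTANCES_PER_MESH = 120  # from the question
MESH_TYPES = 60           # from the question


def triangles_stored(merged: bool) -> int:
    """Triangles held in one mesh's buffers.

    The merged skeletal mesh repeats the geometry once per bone; the
    instanced mesh stores it once and reuses it for every instance.
    """
    return TRIANGLES_PER_MESH * (INSTANCES_PER_MESH if merged else 1)


def triangles_drawn_per_frame() -> int:
    """Triangles drawn per frame; identical for both approaches."""
    return MESH_TYPES * INSTANCES_PER_MESH * TRIANGLES_PER_MESH


if __name__ == "__main__":
    print(triangles_stored(merged=True))   # 72000
    print(triangles_stored(merged=False))  # 600
    print(triangles_drawn_per_frame())     # 4320000
```

Either way the GPU draws the same number of triangles per frame; what differs is how much geometry is stored and submitted per call.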

The overall difference would be in the draw calls; instanced meshes would give you some benefit.

Measure whether your current bottleneck is in draw calls; if it isn’t, then perhaps there is no reason to change.