Instanced Static Mesh Decreasing performance!

Do this:

Add static mesh. Put the return value into an instanced static mesh variable, and plug that variable into your add instance.

Click on the add instance node and see if your instance is set to static or movable. Movable is the default, I think; set it to static, see if that helps, and let me know.


Thanks for all good suggestions.

Tried this, and for both cases (movable, static) it was more efficient to create Static Meshes instead of instances of Instanced Static Mesh.

I did more or less what you described in my first try with instanced static meshes (before I fell back to the tutorial).
But I didn’t do it exactly the way you described, and I will try that in a couple of days when I have access to my UE4 environment again.
Meanwhile, is there anybody out there who can reproduce the problem by implementing the small blueprint above and analysing the result of “stat scenerendering”?
I will also test the obvious thing myself: use a standard box instead of the mesh, in case there is something wrong with the mesh I am using.

Although the main question remains: How can it be more effective to use individual static meshes than instanced static meshes under the same circumstances?

I can’t really get a grip on the problem. I think one needs to know what UE4 does under the hood during this “RenderQuery” time to sort this out.

Instanced Meshes reduce draw calls, or the number of separate cases where the CPU has to talk directly to the GPU.

Depends on your PC, but with a mid to high end computer you should be able to handle some tens of thousands of basic instances without having issues. I can spawn 100-200K relatively non-optimized tree instances before dropping to the FPS you are experiencing.

The figures are from a machine with a Core i5 (16GB) and a Geforce 650Ti card (1GB). In the example above I used some 4k instances. With 1k instances I get around 30 fps, so obviously something is not working well here. As I mentioned earlier, using static meshes gives significantly higher FPS (but it takes the blueprint a very long time to create them).

According to that stats list, you are heavily GPU bound. When you see a lot of time in Present or Query, that means it had to wait on the GPU to finish what it was doing.

Note that using InstancedStaticMeshComponent defeats CPU side culling, so if you have 4K instances but only 5 are visible, then you’re still paying to draw the other 3995 of them. How many triangles is each of your instances?

Michael Noland

Each instance should be around 840 triangles and 691 vertices.
Thanks for your answer, this was all new to me.
So using instances is then not recommended when a lot of them are not visible?

But… in the example above all (or most) of the meshes are visible. So why is it then more effective to use static meshes?

I made two test cases. One using the editorcube mesh, and another using Sphere (StaticMesh’/Engine/BasicShapes/Sphere.Sphere’). That sphere is 960 tris and 559 verts, very close to your model. Keep in mind my video card is a beast (Titan).

  1. Cube:

Was able to spawn 100,000 instanced cubes and the GPU time was 8ms

  2. Sphere:

With 5000, GPU is still at 8ms.

With 15000, GPU time is ~28ms. That is 28 million triangles coincidentally. My renderquery time was only 20ms.

What video card are you running?

You mentioned that renderquery was still just as long with only 2x2 (aka 4) total instances, right? Is it actually the same number or just also high?

If you can give us the renderquery time with only a few instances, that will be helpful. If there aren’t many instances there should be no reason for that number to be high. If it still is really high I’d suggest making a new test BP in a new level. Never hurts to rule things out. Maybe old instance data was somehow hanging around.

The card is a Geforce 650Ti (1GB).
The times were much shorter for only a few meshes, but still worse for instances than ordinary meshes, and I don’t need more than a few hundred meshes before the fps drops to critical levels.

I will repeat the experiments in a new project and with different meshes this weekend (on a business trip right now).

@RyanB: Do you get better or worse performance if you replace the add instance with just an ordinary add static mesh? I suppose you can’t try that on 15000 meshes though, so it might be difficult for you to test …

Instanced Static Meshes cost more on the GPU than regular static meshes, but cost less CPU time to process and submit them. However, my understanding is that they only cost a little bit more, and only in vertex time, not pixel time.

If all or most instances are visible at once, then they’re useful but if most are not visible then they’re not a good fit (but can still be used in spatially local regions, e.g., instead of 64x64, do 16x16 regions for example to balance culling with batching).
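To make the culling trade-off concrete, here is a rough sketch in plain Python (not engine code). It assumes a simplified model in which a component is either drawn in full or skipped entirely, and the function names and the five visible tiles are made up for illustration. It compares one 64x64 component with sixteen 16x16 components.

```python
from collections import defaultdict

def group_into_regions(grid_size, region_size):
    """Assign each (x, y) tile to the region (one component per region) that owns it."""
    regions = defaultdict(list)
    for x in range(grid_size):
        for y in range(grid_size):
            regions[(x // region_size, y // region_size)].append((x, y))
    return regions

def instances_submitted(regions, visible_tiles):
    """Assumed whole-component culling: a region is drawn in full if any tile is visible."""
    visible = set(visible_tiles)
    return sum(len(tiles) for tiles in regions.values()
               if visible.intersection(tiles))

# Hypothetical view: only five tiles in one corner are on screen.
visible = [(0, 0), (1, 0), (2, 1), (3, 3), (5, 2)]

one_component = group_into_regions(64, 64)       # everything in a single component
sixteen_components = group_into_regions(64, 16)  # 4 x 4 regions of 16 x 16 tiles

print(len(one_component), instances_submitted(one_component, visible))
print(len(sixteen_components), instances_submitted(sixteen_components, visible))
```

Smaller regions mean more components (and more draw calls) but far fewer off-screen instances submitted, which is the balance between culling and batching described above.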

RE: Your actual situation, 4k x 840 is a decent number but not totally insane for a modern GPU (3.36 M tris). However, I’m wondering if they’re somehow accumulating and you are drawing way more than you think you are; maybe throw in a ClearInstances call on the component before the for loop, and also make sure you only have one of these actors in the level.
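A toy illustration of the accumulation idea, in plain Python rather than Blueprint. `ToyInstancedComponent` is a hypothetical stand-in, not the engine class, and the 64x64 grid and the five re-runs of the build loop are assumptions chosen only to show the effect.

```python
class ToyInstancedComponent:
    """Hypothetical stand-in for an instanced mesh component; not the engine API."""

    def __init__(self):
        self.instances = []

    def add_instance(self, transform):
        self.instances.append(transform)
        return len(self.instances) - 1  # index of the new instance

    def clear_instances(self):
        self.instances.clear()

def build_grid(component, size, clear_first):
    """Fill a size x size grid, optionally clearing old instances first."""
    if clear_first:
        component.clear_instances()
    for x in range(size):
        for y in range(size):
            component.add_instance((x, y, 0))

TRIS_PER_INSTANCE = 840  # triangle count reported earlier in the thread

leaky = ToyInstancedComponent()
clean = ToyInstancedComponent()
for _ in range(5):  # assumed: the build loop runs five times on the same component
    build_grid(leaky, 64, clear_first=False)
    build_grid(clean, 64, clear_first=True)

print(len(leaky.instances), len(leaky.instances) * TRIS_PER_INSTANCE)
print(len(clean.instances), len(clean.instances) * TRIS_PER_INSTANCE)
```

Without the clear, every re-run stacks another full grid on top of the old one, so the triangle count grows with each run even though the level looks the same.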

Michael Noland

Once again, thanks all for the help and information.
There are a lot of tests suggested in this thread that I will carry out as soon as I am back.

I added the “ClearInstances” call before adding the instances, and suddenly the framerates are up around 70 fps.
Now I can remove the “ClearInstances” call and still have 70 fps. Obviously there must have been tons of instances lurking in the background.
Problem solved, but I am not completely comfortable that I can’t reproduce it.

[EDIT:] I am able to reproduce it!
While fiddling around I found that I had removed the PointLights I had in the first example, and when I placed a couple of point lights over the tiles the framerates dropped very low again.
It turns out that if I have a few point lights in the scene, rendering with Static Meshes is more efficient, but if one only uses a directional light, the instanced meshes are more efficient. How can that be? (The point lights are set to stationary.)

Sorry for reviving an old thread, but I am interested in the answer to Kartom’s last question.

Was this overlooked, or is it solved somewhere else? Can anyone provide any helpful statements?

Have you (or anybody) been able to reproduce the results in the latest version of the engine?

Dynamic lighting is expensive? I can reproduce this

That dynamic lighting is expensive is understandable, but why does it work so badly together with instanced meshes? In my previous experiments I got nowhere near the same penalty with ordinary static meshes as I got with instanced static meshes.

That I can’t answer, mate. Try hierarchical instanced meshes instead.

It’s simple math and expected behavior.

With a directional light, all of the trees get rendered twice no matter where the light is: once for the view, once for the shadow.

With point lights, it’s going to do the lighting calculation using the bounds. Individual meshes have tight bounds that fit around the mesh. Instanced static meshes get grouped together, so a few lights inside a forest will of course cause more shadow rendering compared to regular static meshes. Because the instances have been grouped together, each point light in effect affects more polygons and objects with larger bounds, which then touch more neighboring groups.
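The bounds argument can be sketched with a simple sphere-versus-box test in plain Python. This is only an illustration under assumed numbers (a 64x64 grid of tiles 100 units apart, a point light of radius 300 hovering over the middle); it is not how the engine computes light interaction.

```python
def sphere_hits_box(center, radius, box_min, box_max):
    """True if a sphere (the light's range) overlaps an axis-aligned box."""
    dist_sq = 0.0
    for c, lo, hi in zip(center, box_min, box_max):
        nearest = min(max(c, lo), hi)  # closest point on the box along this axis
        dist_sq += (c - nearest) ** 2
    return dist_sq <= radius ** 2

GRID, SPACING, HALF, HEIGHT = 64, 100.0, 50.0, 100.0   # assumed layout
light_pos, light_radius = (3200.0, 3200.0, 200.0), 300.0  # assumed light

tiles = [(x * SPACING, y * SPACING) for x in range(GRID) for y in range(GRID)]

# Individual static meshes: each tile has its own tight bounding box.
lit_tiles = sum(
    sphere_hits_box(light_pos, light_radius,
                    (tx - HALF, ty - HALF, 0.0),
                    (tx + HALF, ty + HALF, HEIGHT))
    for tx, ty in tiles
)

# One instanced component: a single box around every instance.
group_min = (-HALF, -HALF, 0.0)
group_max = ((GRID - 1) * SPACING + HALF, (GRID - 1) * SPACING + HALF, HEIGHT)
lit_instances = len(tiles) if sphere_hits_box(
    light_pos, light_radius, group_min, group_max) else 0

print(lit_tiles, lit_instances)
```

With separate meshes only the handful of tiles near the light are touched; with one big group the light overlaps the group's bounds, so in this simplified model every instance in it counts as affected.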

Hierarchical instances should not have the same limitation or the crossover point might be different.

I know this is an old post, but do the foliage tool and foliage spawner use Hierarchical instances?