Nanite & ISMs - Best Practices

Hi

We are using the Nanite pipeline.

We have been investigating ISMs. We have very large worlds with large structures and many actors, which I want to reduce, and we are also looking for anything else we can do to help Nanite performance.

We found that ISMs were effective in saving CPU cost, by about 1-1.5 ms in the editor on the RHI and Render threads. They also seemed to help the GPU. Again, this was sampled in the editor, so timings could be off.

We went from 25k actors down to 8k actors by using ISMs.

When it comes to grouping objects into ISMs, I assume we still need to be reasonably careful about how we batch them?

For my test, I just selected every instance and used the modelling tools to put them in an ISM, which isn’t particularly “intelligent”, as all the instances have to be rendered if a single object gets within the frustum, right?

An example would be:

We have a massive building which has loads of identical interior walls. Does it make sense to put those all into one ISM, or to build the ISMs more intelligently so they are split into multiple areas?

I watched this Unreal Fest talk, which said to still use ISMs but didn’t go into detail on whether there is any particular batching strategy:

https://www.youtube.com/live/Cb63bHkWkwk?t=6272s

We are weighing up art/dev time against what Nanite actually needs.

Does this impact GPU Nanite performance at all?

I assume it must, as Nanite would have to cull more meshes itself?

Just trying to understand the full impact. This really matters for lower-end hardware, as we are trying to make sure we can scale to older GPUs.

I know that for Nanite we should ideally use custom primitive/instance data rather than material instances, as another optimisation.

If we are meant to use a more intelligent way of converting static mesh actors into ISMs, is there anything automated to do that? Like by distance/cluster?

EDIT:

Sorry, I forgot to add this as well:

Do ISMs work with the new Virtual Painting system?

I have read that it doesn’t survive the conversion process

And what happens if we add more instances to the ISM afterwards?

Thanks

Karl

Hello,

Thank you for reaching out.

I’ve been assigned this issue, and we will be looking into these performance questions for you.

Hello,

If you are using World Partition, keep the cell size in mind when batching. A single ISM that contains both nearby objects and objects across the map isn’t going to be the most efficient way to load objects. If you are using some other streaming system, then that needs to be taken into account as well.
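As a rough illustration of batching by cell, here is a minimal standalone C++ sketch that buckets instance positions into a 2D grid, so that each bucket could become its own ISM component. It does not use any engine API: `InstancePos`, `CellOf`, and `GroupByCell` are hypothetical names for this example, and `cellSize` is assumed to be whatever cell size your streaming grid actually uses.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for an instance's world-space location.
struct InstancePos {
    double x;
    double y;
};

// Integer coordinates of a 2D grid cell.
using CellCoord = std::pair<std::int64_t, std::int64_t>;

// Map a position to the grid cell that contains it.
// std::floor keeps negative coordinates in the correct cell.
CellCoord CellOf(const InstancePos& p, double cellSize) {
    assert(cellSize > 0.0);
    return { static_cast<std::int64_t>(std::floor(p.x / cellSize)),
             static_cast<std::int64_t>(std::floor(p.y / cellSize)) };
}

// Group instance indices by cell. Each bucket is a candidate for one ISM,
// so no single ISM spans more than one cell.
std::map<CellCoord, std::vector<std::size_t>>
GroupByCell(const std::vector<InstancePos>& instances, double cellSize) {
    std::map<CellCoord, std::vector<std::size_t>> buckets;
    for (std::size_t i = 0; i < instances.size(); ++i) {
        buckets[CellOf(instances[i], cellSize)].push_back(i);
    }
    return buckets;
}
```

A real tool would also group by mesh and material before bucketing, since an ISM holds instances of a single mesh. Whether finer or coarser buckets perform better is something to confirm by profiling on target hardware.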

Distance, frustum, occlusion, and similar per-instance culling will be taken care of automatically on the GPU, both with and without Nanite. When using ISMs, it is also possible to set Nanite meshes to distance cull. For an example of this in action, see the water plants in and on the ponds in the Hillside Sample. For more information, please see the function “IsInstanceVisible(…)” in “BuildInstanceDrawCommands.usf”.
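To make the idea of per-instance distance culling concrete, here is a small CPU-side C++ sketch. It is only an illustration of the concept and not the engine's implementation, which runs on the GPU in shader code. `Vec3`, `DistanceSquared`, `CullByDistance`, and `endCullDistance` are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal 3D vector type for this sketch.
struct Vec3 {
    double x;
    double y;
    double z;
};

// Squared distance avoids a square root per instance.
double DistanceSquared(const Vec3& a, const Vec3& b) {
    const double dx = a.x - b.x;
    const double dy = a.y - b.y;
    const double dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

// Return the indices of instances whose origin lies within endCullDistance
// of the view origin. Each instance is tested on its own, so one visible
// instance does not force the rest of its group to be kept.
std::vector<std::size_t> CullByDistance(const std::vector<Vec3>& instanceOrigins,
                                        const Vec3& viewOrigin,
                                        double endCullDistance) {
    assert(endCullDistance >= 0.0);
    const double limitSq = endCullDistance * endCullDistance;
    std::vector<std::size_t> visible;
    for (std::size_t i = 0; i < instanceOrigins.size(); ++i) {
        if (DistanceSquared(instanceOrigins[i], viewOrigin) <= limitSq) {
            visible.push_back(i);
        }
    }
    return visible;
}
```

The point of the sketch is that the test is applied per instance rather than per component, which is why a large ISM does not have to be rendered in full when only one of its instances is in view.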

There are several existing tools for converting large bodies of Static Meshes into ISMs:

  • Merge Actors tool
  • HLOD Instancing layers
  • Packed Level Actors

If none of these satisfy what you are looking for, you might need to build your own tool.

These tickets discuss batching Nanite Static Meshes further, including potential cook-time batching of Static Meshes into ISMs:

[Content removed]

[Content removed]

This World Building Guide also goes into more detail on this topic in general:

[Content removed]

This ticket discusses working with Mesh Paint Textures and ISMs:

[Content removed]

Please let us know if this helps.

Ok thank you

We will bear in mind the ISM size for our worlds; we certainly won’t make them large enough to span loading cells.

“Distance, frustum, occlusion, and similar per-instance culling will be taken care of automatically, both with and without Nanite, on the GPU.”

The thing is, while I know it is *automatically* handled, we are much more concerned about what gets us the best *performance*.

There must be some CPU-side culling taking place with ISMs? Or do ISMs get culled as an entire group, or per instance?

The aim here is just to save as much CPU and GPU time as possible.

Hence some confusion about the recommended workflows and pipelines

Thanks

Karl

Hello,

ISMs will cull individual instances on the GPU. This happens before Nanite rasterizes the instances, so it helps improve Nanite performance. This is where the GPU performance gain comes from when using Nanite ISMs, as compared to individual Nanite Static Meshes.

While you can cull whole ISM components on the CPU, the distance you would want to cull them at will likely correspond to the distance that you want to unload the related streaming cell at.

Keep in mind that this advice is general. To get the best performance, profile frequently on your target hardware, and guide your optimizations based on that.

Please let us know if this helps.

Thanks for the detailed explanation, really appreciate it.

Also very excited about the new features announced with 5.6, especially FastGeo and Cell Transformers.

Seems very much like what we need!

Hello,

Thank you for the reply.

Can we close your case? You can always re-open it if you need additional assistance with the same issue.

Yes please, all done for now, thank you.