Article written by Matt O.
In this article, we’ll cover at a high level what shader permutations are, how they happen, the performance issues they create both at runtime and compile time, and potential remedies.
As covered in more detail in How the Unreal Engine Translates a Material Graph to HLSL, the Unreal Engine compiles a material graph down to a program in the High-Level Shading Language (HLSL) for a target platform (DirectX SM5, Android GLES 3.1, Metal, etc.). These programs still contain branches and #ifdefs, but modern graphics processors work best when they execute only the instructions needed to draw a pixel. To get there, additional compilation steps take the generated HLSL source code and convert it into bytecode that can run on the graphics hardware itself.
The bytecode is, in essence, the HLSL source code flattened as much as possible. This wouldn’t be a problem if each material were suited for precisely one circumstance: we’d have as many bytecode files as we do materials and could call it a day.
The Unreal Engine opts to compile a lot of different versions of a material at cook time, as opposed to compiling these on-the-fly while the application is running. This has the benefit of ensuring that very little of the graphics hardware’s time is spent compiling shaders when it should be rendering a frame.
Where Do Permutations Come From?
Materials in Unreal need to support a wide range of conditions, features, platforms, and usages that require slightly different shader code (or languages!).
Project-level Settings:
The first category of situations that can generate additional permutations are global settings:
- Project Settings: for example, Unreal needs to compile different code for raytraced materials vs. raster materials. The same goes for virtual texture support and other features.
- Project Shader Permutation Settings: here Unreal allows you to toggle different features that you may or may not need for your project. These are all turned on by default so that you can freely create your worlds without having to worry about whether or not you’ve got “Support Stationary Skylight” checked, for example.
- Supported Platforms: again, to ensure you’re able to freely create content, a project is set up by default to support all available platforms. This means Unreal will have to compile different versions of your materials for each platform!
So for a simplified example, if the project allows static lighting, and supports atmospheric fog, we need to compile the Statically-lit, Atmospheric Fogged version of the material, the Dynamically-lit and Atmospheric Fogged version of the material, as well as Statically- and Dynamically-lit versions of the material that aren’t affected by Atmospheric Fog.
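The four versions in this simplified example can be enumerated with a short sketch. The axis names and option labels below are illustrative assumptions rather than engine identifiers; the point is only that each independent toggle multiplies the number of versions to compile.

```python
from itertools import product

# Hypothetical feature axes taken from the simplified example above.
# These labels are illustrative only, not engine identifiers.
feature_axes = {
    "lighting": ["Statically-lit", "Dynamically-lit"],
    "atmospheric_fog": ["Atmospheric Fogged", "No Atmospheric Fog"],
}


def enumerate_permutations(axes):
    """Return every combination that picks one option per axis."""
    names = list(axes)
    option_lists = [axes[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*option_lists)]


material_versions = enumerate_permutations(feature_axes)
for version in material_versions:
    print(", ".join(version.values()))
```

Adding a third two-way toggle to `feature_axes` would double the length of the list again.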
Material Features
Once we consider each of those permutation generators, we get down to the level of the material itself. There are two places in a material’s settings that will require an additional set of permutations to be generated:
- Material Property Overrides: changing settings like Blend Mode, TwoSided, and DitheredLODTransitions requires a different set of permutations for those features.
- Material Usages: this one is oft-overlooked. Unreal compiles different versions of a material based on how it’s used in your project. For example, if you apply a material (or an instance of a material) to both a Niagara GPU Sprite and a Static Mesh Actor, Unreal has to create slightly different shader programs based on the different vertex factories.
Landscape Materials
Landscape materials provide an added layer to the already numerous factors that affect the number of permutations. For a given landscape material, Unreal needs to compile a separate permutation for each component of the landscape, and additionally compile different versions of the same material based on the Landscape Layers used on a given landscape. If you have a landscape material with four layers (Snow, Dirt, Grass, Rocks), but you’ve only painted Snow in a given component, Unreal will require different permutations of that component’s material when you paint Dirt onto it, and again for Grass and Rocks.
The Material Graph Itself
Finally, you can create additional permutations in your material graph itself through the usage of switches and switch parameters. If the preview in the MaterialInstance window briefly switches to the WorldGridMaterial when you change something, it’s highly likely that you’ve found something that creates permutations. These usually come in two flavors:
- Quality, FeatureLevel, Shading Path, ShaderStage, ReflectionCaptureSwitch, RayTracingQualitySwitch, VirtualTextureFeatureSwitch, and ShadowPassSwitch. (For example, if you’re using a MaterialQualitySwitch to selectively remove some complex math from a material on a lower spec, or switch to a cheaper version of the same functionality.)
- StaticSwitch, StaticSwitchParameters, and StaticComponentMaskParameters.
The latter case is a common cause of a lot of user-generated shader permutations. These often arise from the desire to create a single “Base” material that contains all the functionality you might need for all of your uses, and selectively toggling those features based on your needs. For example, your standard diffuse material may optionally include support for using dithered Pixel Depth Offset to blend the material into other surfaces, or you may optionally support Fuzzy Shading. This is often done for maintainability, so if you need to change how you process your Albedo maps across your entire project, then it’s easy enough to go and make that change in one place.
Static switches in Unreal don’t work quite the same way as an #ifdef or [branch] would in HLSL. Instead of triggering a separate permutation for each side of the static switch, the final compiled HLSL for a given material instance or graph contains only the code used by the selected branches. So if you have two static switch parameters set to (True, False), then Unreal will only compile the required permutations (i.e. quality, platform, rendering features) for the (True, False) case. If a user creates a material instance from that material which sets the flags to (False, False), then Unreal will compile the required permutations for that case. Essentially, each used combination of static switches creates a new shader.
This means that Static Switch usage does not increase permutations by 2^N, so the bloat of permutations can be mitigated through the careful structuring of the material hierarchy. However, this can lead to redundant permutations in the case where a user creates two separate material instances that independently set those static switch parameters to the same values.
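As a rough sketch of the difference between the 2^N upper bound and the combinations actually in use, the snippet below groups material instances by their static switch values. The instance names and switch values are hypothetical, and this is bookkeeping you might do on your own asset list, not a description of how the engine tracks shaders.

```python
from collections import defaultdict

# Hypothetical material instances and the values they set for two
# static switch parameters on a shared base material.
instance_switches = {
    "MI_Rock": (True, False),
    "MI_Wall": (False, False),
    "MI_Floor": (True, False),  # same values as MI_Rock
}


def group_by_switch_values(switches_by_instance):
    """Map each used switch combination to the instances that set it."""
    groups = defaultdict(list)
    for name, values in switches_by_instance.items():
        groups[values].append(name)
    return dict(groups)


groups = group_by_switch_values(instance_switches)
upper_bound = 2 ** 2  # every possible combination of two switches
used_combinations = len(groups)

# Instances that independently set identical values are candidates for
# sharing a common parent instance in the material hierarchy.
duplicates = {values: names for values, names in groups.items() if len(names) > 1}

print(f"{used_combinations} of {upper_bound} combinations are in use")
print("Instances with identical switch values:", duplicates)
```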
In the material editor in Unreal, you can plug inputs into pins that are greyed out on the Material Output block. When you do this, Unreal doesn’t compile anything for those inputs, but you can override certain material properties in a material instance, including Shading Model and Blend Mode. This allows you to create a single base material that also includes functionality for, say, Two-Sided Foliage materials and Translucent materials. So if you’ve got 8 switches in a base material that is set to Opaque, Default Lit by default, and you then create a material instance to act as the base for all your Masked, Two-Sided Foliage materials, you’ve now created a number of permutations equal to (or potentially greater than) the number of permutations from your base material.
And finally, if you have a single base material that’s covering all your needs, Unreal will check the “Used with XXXX” checkboxes in that material, and each of those compiles its own set of permutations for each combination of switches, material features, and project-level settings. The numbers add up at an alarming rate.
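To get a feel for how quickly the numbers add up, here is a back-of-the-envelope estimate. All of the counts are made-up assumptions for a hypothetical base material, and a plain product is only a rough upper bound, since not every combination necessarily applies to every material.

```python
from math import prod

# Made-up counts for a hypothetical monolithic base material.
factors = {
    "project_setting_combinations": 4,     # e.g. static lighting x atmospheric fog
    "checked_usages": 3,                   # "Used with XXXX" checkboxes
    "used_static_switch_combinations": 5,  # distinct switch settings across instances
    "quality_levels": 3,                   # branches of a quality switch
}


def estimate_shader_count(counts):
    """Multiply independent factors to get a rough upper bound."""
    return prod(counts.values())


estimated_shaders = estimate_shader_count(factors)
print(f"Rough upper bound: {estimated_shaders} shaders")
```

Under these assumptions, unchecking a single unused usage drops the estimate by a third, which is why the remedies later in this article focus on removing whole factors.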
Performance Implications of Permutations
There are two key areas where an increased number of permutations can result in performance issues:
Cook-Time
Since we compile all the required permutations of a material based on the above-listed factors, the total number of shaders we need to compile during a cook can be quite high. The more shaders we need to compile, the longer the cook will take. This can result in longer-than-desirable iteration and build times, which reduces the amount of time you can spend making your project the best it can be!
Run-time
Without going into great depth about the architecture of GPUs, suffice it to say that the GPU likes to do a lot of the same thing. Modern GPUs can evaluate a certain number of pixels in parallel, as long as they are using the exact same bytecode. So if a 64-pixel area all uses the same bytecode, great! We can render 64 pixels in parallel and reduce our overall draw time. However, different permutations of a shader result in different bytecode, so while it may appear that all of your materials are the same because they inherit from the same monolithic base material, each usage and switch combination of that material is different as far as the GPU is concerned. The greater the variety of materials that need to be rendered in a frame, the longer it will take to render the frame.
Remedies
In discussing ways to remedy the issue of long cook times for users, we have to find a balance between a few key factors:
- The ability to affect an entire project’s worth of materials by changing one file
- The amount of time it takes to compile all those permutations
- What calculations do we need to make on the fly, and which calculations would be better made as part of a different permutation
Reduce Project-level Supported Features
It’s perfectly reasonable, and may even be sufficient, to go into your project settings and turn off the shader permutations that you don’t need for your project. This could prove such a drastic reduction in the number of permutations Unreal compiles that no further action is necessary!
Increase the Number of Shader Compiler Threads
You can increase the number of threads compiling shaders on your computer by modifying the DevOptions.Shaders parameters in your Engine/Config/BaseEngine.ini file. Increasing the number of threads used for shader compiling will increase the load on your computer, potentially making your computer less usable while shaders are compiling. However, more concurrently running threads could result in an overall reduction in the amount of time it takes to compile shaders.
Here’s that section from the ini file with explanations for each of the settings:
[DevOptions.Shaders]
; Make sure we don't starve loading threads
NumUnusedShaderCompilingThreads=3
; Make sure the game has enough cores available to maintain reasonable performance
NumUnusedShaderCompilingThreadsDuringGame=4
; Core count threshold. Below this amount will use NumUnusedShaderCompilingThreads. Above this threshold will use PercentageUnusedShaderCompilingThreads when determining the number of cores to reserve.
ShaderCompilerCoreCountThreshold=12
; Percentage of your available logical cores that will be reserved and NOT used for shader compilation
; 0 means use all your cores to compile Shaders
; 100 means use none of your cores to compile shaders (it will still use 1 core).
PercentageUnusedShaderCompilingThreads=50
For machines with ShaderCompilerCoreCountThreshold cores or fewer, Unreal determines the number of shader compiling threads using NumUnusedShaderCompilingThreads. Those machines will use TotalNumberOfCores - NumUnusedShaderCompilingThreads threads (so on a 12-core machine with the default settings, Unreal will use 9 threads). You can increase the number of cores used for shader compiling by decreasing NumUnusedShaderCompilingThreads. As mentioned, this can make your computer unusable during shader compilation.
For machines with more than ShaderCompilerCoreCountThreshold cores, Unreal determines the number of threads to use based on PercentageUnusedShaderCompilingThreads. By default this is set to 50%, so on a Threadripper with 128 logical cores, shader compilation will happen on 64 of those cores. Decreasing this value will increase the number of used cores. Again, setting this all the way to 0 will likely result in an unusable machine.
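The two rules above can be sketched as a small function. This is a paraphrase of the behavior described in this article, not the engine's actual source; the function and parameter names are hypothetical, and the defaults mirror the ini values shown earlier.

```python
def shader_compiling_threads(
    logical_cores,
    num_unused=3,             # NumUnusedShaderCompilingThreads
    core_count_threshold=12,  # ShaderCompilerCoreCountThreshold
    percentage_unused=50,     # PercentageUnusedShaderCompilingThreads
):
    """Estimate how many threads are used for shader compilation.

    Assumption: at or below the threshold a fixed number of cores is
    reserved; above it a percentage of cores is reserved. At least one
    thread is always used, per the ini comments.
    """
    if logical_cores <= core_count_threshold:
        threads = logical_cores - num_unused
    else:
        threads = logical_cores * (100 - percentage_unused) // 100
    return max(1, threads)


for cores in (4, 12, 16, 128):
    print(cores, "cores ->", shader_compiling_threads(cores), "threads")
```

With the defaults, a 16-core machine gets 8 threads under this sketch, one fewer than a 12-core machine, so machines just above the threshold may benefit most from lowering PercentageUnusedShaderCompilingThreads.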
Check Material Usages
In some cases, it’s possible that the Usages of your material are not updated in the event that a usage is no longer valid. For example, if you apply your material to a Skeletal Mesh then Unreal will check the “Used with Skeletal Meshes” checkbox. If you then remove the material from all skeletal meshes but do not also check out the material, then the “Used with Skeletal Meshes” checkbox will remain checked and those unused permutations will be compiled.
It may be preferable to turn off “Automatically Set Usage In Editor” for commonly-used materials. This is under the Advanced tab of the Usages section in the material’s details panel. This way if someone tries to use your DefaultLit base material on a Niagara particle, they’ll be met with the WorldGridMaterial to show that that’s not a supported usage.
Change Your Base Material Architecture
Unreal also has a number of features to help you architect your base materials in such a way as to reduce the number of permutations while still also permitting a high degree of maintainability. You can leverage MaterialFunctions to encapsulate your common functionality. For example, let’s say you want to make sure all your Albedo calculations are consistent across most of your different material needs (your default lit base material, your tree trunk material that blends with runtime virtual textures, and your leaf material), but not all of them (your fire spark particle effect). Instead of setting up a static switch for “UseFireSparkAlbedo”, you’d have four separate materials:
- M_Base_DefaultLit
- M_Base_TreeTrunkRVTBlend
- M_Base_Foliage
- M_Base_FireSparks
In the first three materials you would wrap all your desired albedo adjustments into a material function (MF_Albedo), which you would then connect into the Base Color input for the first three materials. Then in your FireSparks material you’d set up an entirely different set of albedo adjustments (if any!). Now if you need to make any changes to your common albedo adjustments, you can do that in MF_Albedo instead of M_Base, and you’ve removed a few different places where additional permutations would crop up by splitting M_Base_FireSparks out into its own base material.
Caveats
As has been mentioned a few times already, the benefit of the Unreal Engine’s cook-time shader compilation is to save runtime resources on the graphics hardware. So far we’ve discussed techniques for reducing the total number of permutations generated by a single base material that are either net neutral or a net benefit to the runtime execution of the material.
With that in mind, there are also ways to reduce the total number of permutations generated by a base material that can have a net negative impact on runtime execution of the material (both by increasing instruction counts and resources used on the graphics hardware). They are presented here as a cautionary statement whose application must be carefully considered.
It is not recommended that you replace StaticSwitchParameters with If statements or Lerps in your material in order to reduce permutations generated by a base material. Because these statements are not static, they won’t generate permutations, and as an added benefit they allow you to quickly toggle or blend between states on the fly (you’ll notice, for example, that you cannot call Set Static Switch Parameter on a Material Instance Dynamic). However, in these cases (especially with Lerp, and sometimes with If) the graphics hardware will calculate both inputs to those nodes. So you’ve traded the cost of compiling those shaders at cook time for a different cost that accumulates on every pixel you render. Ifs and Lerps in materials are better suited for working with dynamic inputs (for example, if you wanted to quickly toggle between two states for debugging purposes).
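The trade-off can be illustrated with a CPU-side analogy. This is not shader code, and the function names are hypothetical: the Lerp version computes both inputs for every pixel, while the static version picks one path once, ahead of time, at the price of a separate variant per choice.

```python
# Counts how often each (hypothetical) albedo path runs.
evaluations = {"detail": 0, "simple": 0}


def detail_albedo(uv):
    evaluations["detail"] += 1
    return 0.8 * uv


def simple_albedo(uv):
    evaluations["simple"] += 1
    return 0.5


def lerp(a, b, alpha):
    return a + (b - a) * alpha


def shade_with_lerp(uv, use_detail):
    # Both inputs are computed for every pixel, even when alpha is 0 or 1.
    alpha = 1.0 if use_detail else 0.0
    return lerp(simple_albedo(uv), detail_albedo(uv), alpha)


def build_static_variant(use_detail):
    # Chosen once, ahead of time; the unused path never runs per pixel.
    return detail_albedo if use_detail else simple_albedo


pixels = [i / 4 for i in range(4)]

for uv in pixels:
    shade_with_lerp(uv, use_detail=False)
lerp_cost = dict(evaluations)

evaluations.update(detail=0, simple=0)
shade = build_static_variant(use_detail=False)
for uv in pixels:
    shade(uv)
static_cost = dict(evaluations)

print("Lerp:", lerp_cost)
print("Static:", static_cost)
```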
Additional Resources and References
The Shader Permutation Problem, Part 1
The Shader Permutation Problem, Part 2
Unreal Engine 4 Rendering Pipeline Part 5, Shader Permutations
Seven Tricks to Speed Up Shader Compilation in Unreal 4
Tim Jones’ Shader Playground - Write some HLSL and see how it compiles down to various flavors of bytecode!
UE VERSION
4.27