Material optimization

Maximum-Dev · June 26, 2016, 10:38pm

Hi,

I’m working on a landscape shader with several layers and many textures. Shader complexity is brown and it’s just because of the high number of textures.
I have several questions regarding optimization here.

A) Previously I asked this on the forums and the reply I got was that 1 Texture map + Alpha channel = 2 Texture maps in terms of memory usage. Is this true in all cases? Do we get no benefit from packing into the alpha channel then?

B) If we put Roughness into the B channel of the normal map, how is the normal map supposed to work without its B channel? Do we have to append the RG with a flat blue color or something? Wouldn’t this result in incorrect lighting?

C) If the answer to question A is that 1 Texture map + Alpha channel = 2 Texture maps, then why do they put grayscale Albedo into the Alpha channel?

Thank you.

ZacD · June 26, 2016, 11:17pm

Different texture compression means different memory usage; they might not be using DXT5. The blue channel of a normal map can be stripped like you suggested. A normal map doesn’t contain depth, it contains lighting information for left to right and top to bottom; that’s only 2 axes and 2 channels. For the color channel, they are probably combining it with a constant or with a gradient map, see this A Technical Artist's Blog: Gradient Mapping - An Awesome Way to Get Cheap Variation in Textures

Sotalo · June 26, 2016, 11:31pm

A - Yes, kind of. The alpha channel is always uncompressed, so if you use all 4 channels, you’re really not saving much memory. HOWEVER, uncompressed is typically such good quality that it looks as though you have doubled the size of the texture for only 3 times the memory cost. Actually doubling the size costs FOUR times as much memory. Under certain circumstances where compression is undesirable, the cost can definitely be worth it. By putting the texture detail in the uncompressed slot, they get some nice quality detail and normal/roughness information for far less memory than using larger textures. Of course, the only problem with this is you can only have one flat color, and that color can only vary in terms of value.

B - You can append with a flat blue, but it is recommended to mask the RG values, plug that into DeriveNormalZ, and then use that result as the normal map. This will generate and append the blue channel based off of the RG values. I’m not sure what the math is behind it, but it does look a lot nicer than appending a flat 0,0,1 value. Just so you know, if you’re using this method, you first need to convert the RG range from 0:1 to -1:1. Constant bias scale it: Bias at -0.5, scale 2.
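For anyone wondering about the math: a minimal sketch in Python, assuming the normal is unit length so that x² + y² + z² = 1 and Z can be rebuilt from X and Y. The function names are hypothetical, and the clamp to zero is an assumption added here, not something confirmed from the engine's DeriveNormalZ node.

```python
import math


def unpack_channel(value):
    """Map a stored texture value in [0, 1] to a signed value in [-1, 1].

    Same arithmetic as a bias of -0.5 followed by a scale of 2.
    """
    return (value - 0.5) * 2.0


def derive_normal_z(x, y):
    """Rebuild Z of a unit-length tangent-space normal from X and Y.

    The clamp is an assumption for this sketch: it keeps slightly
    over-long XY vectors (for example after lossy compression) from
    producing the square root of a negative number.
    """
    return math.sqrt(max(0.0, 1.0 - x * x - y * y))


def reconstruct_normal(r, g):
    """Turn the stored R and G values of one texel into an (x, y, z) normal."""
    x = unpack_channel(r)
    y = unpack_channel(g)
    return (x, y, derive_normal_z(x, y))
```

A flat texel stored as R = G = 0.5 comes back as (0, 0, 1), which is the same as appending flat blue; texels that tilt away from flat get a shorter Z, which is what the flat-blue append misses.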

C - It was their preference to have more detail in the texture than in the normals or roughness. The roughness channel shouldn’t have extremely sharp increases in value. Normal maps are much better to leave uncompressed so the lighting is accurate, but if you’re in the mode of saving memory, you need to sacrifice the normal detail. One normal map uncompressed costs as much as 2 textures, and 2 textures can hold 6 compressed channels. Also, having one uncompressed normal channel and another compressed channel while leaving textures uncompressed can look awkward. Their method puts texture detail quality at the forefront at the expense of everything else.

Stephane_Roncada · June 26, 2016, 11:38pm

Hi Maximum-Dev!

A) Previously I asked this on the forums and the reply I got was that 1 Texture map + Alpha channel = 2 Texture maps in terms of memory usage. Is this true in all cases? Do we get no benefit from packing into the alpha channel then?
Yes, this is true in all cases because DXT5 compression is used when you have an alpha channel, which doubles the texture memory since the alpha channel does NOT get compressed. One way to keep the texture memory down when using an alpha channel is to force DXT1 compression with 1-bit alpha (no gradients in alpha anymore), but as of right now, UE 4.11 does not allow you to do that (if anyone knows a way then please correct me). The benefit you get from packing a texture in the alpha channel is one fewer texture sampler call.
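To put numbers on the "one DXT5 = two DXT1" rule of thumb, here is a rough Python sketch. It relies on the standard block sizes: DXT1 stores each 4x4 texel block in 8 bytes, and DXT5 uses 16 bytes per block, with the extra 8 bytes holding the alpha. The function name and the simple mip handling are assumptions for illustration, not how the engine reports memory.

```python
# Bytes used by one 4x4 texel block in each format.
BYTES_PER_BLOCK = {"DXT1": 8, "DXT5": 16}


def texture_bytes(width, height, fmt, mips=False):
    """Approximate size in bytes of a block-compressed texture.

    With mips=True the full mip chain down to 1x1 is added on top.
    """
    blocks_w = max(1, (width + 3) // 4)
    blocks_h = max(1, (height + 3) // 4)
    total = blocks_w * blocks_h * BYTES_PER_BLOCK[fmt]
    if mips and (width > 1 or height > 1):
        total += texture_bytes(max(1, width // 2), max(1, height // 2), fmt, True)
    return total
```

For a 1024x1024 texture this gives 512 KiB as DXT1 and 1 MiB as DXT5, so a texture with an alpha channel costs the same memory as two textures without one.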

B) If we put Roughness into the B channel of the normal map, how is the normal map supposed to work without its B channel? Do we have to append the RG with a flat blue color or something? Wouldn’t this result in incorrect lighting?
The blue channel of a normal map does not contain that much information, so you can re-construct the normals using DDX and DDY. See https://forums.unrealengine.com/showthread.php?15179-Storing-a-heightmap-in-Normal-Map-B-Channel&highlight=reconstruct+normals+from+red+green+channels for doing that.

C) If the answer to question A is that 1 Texture map + Alpha channel = 2 Texture maps, then why do they put grayscale Albedo into the Alpha channel?
They are doing it this way to save texture sampler calls. You’re limited to 16 texture samplers per material (unless you use “Shared: Wrap” in your Sampler Source), and 2 (or 3?) are already reserved for lightmaps and something else (can’t remember what), so that leaves you with about 13-14 texture samplers. By packing 3 different textures into the normal map, they save 2 texture samplers per material type, which adds up quickly.

So for example, let’s say you’re making a material that contains 3 different texture types (dirt, concrete and grass). If you only need a normal map, a reflectance map and a grayscale albedo (more like a gradient ramp) map, then packing all 3 into the normal map texture like the example above would make sense, and would give you 3 texture samplers total instead of 9. But if you need a few more textures, for example a roughness map and an AO map, then you might as well NOT use the alpha channel of the normal map for the grayscale albedo and instead use a second texture where you would pack the roughness map in the red channel, grayscale albedo in the green channel and the AO in the blue channel. Doing it this way, you still use the same amount of texture memory, but you basically get 2 more textures for free (as long as none of your texture maps use an alpha channel).

I hope this helps it make a little more sense. I am sure you’ll get more info from other people as well. Good luck!

Maximum-Dev · June 27, 2016, 12:04am

Wow thanks guys!

So with all that, some new questions have come up.

D) When I import a normal map if I don’t set the compression to “normal map” the lighting looks bad. If I set the compression to “normal map” it ignores the Alpha channel. What do we do?
E) If I pack into normal map and set the compression to “normal map” then the BA channels which are Roughness and Albedo, are no longer sRGB (since normal maps should not be checked sRGB). What about that?

Thanks a lot for the helpful tips!

divi · June 27, 2016, 9:47am

For D, refer to mariomguy’s answer on B regarding adjusting the value ranges. Generally speaking, if you want to use channel packing then you will have to do in the shader all the things the compression settings would normally do for you.
As for E: roughness should, just like normal maps, not be authored with sRGB on.

Maximum-Dev · June 27, 2016, 11:16pm

My mistake about roughness needing sRGB. I was writing at 5 AM I think.
Thanks guys for all the tips. I am still heavily channel packing to reduce the shader complexity.
If anyone also knows of other methods for optimization, I’d like to hear them.

Thank you all!

ZacD · June 28, 2016, 1:43pm

You can also make a non sRGB albedo texture, just make sure you are referencing values correctly on PBR charts.

Sotalo · June 29, 2016, 5:20am

sRGB just raises a linear space to a power of 1.8 or 2.2 (IDK what UE4 uses, everyone uses something different). You can always convert a linear texture to sRGB fairly cheaply by multiplying the texture by itself. This raises it to a power of 2. It MIGHT be a bit lighter than it should be, but it only costs one instruction and eliminates the obvious washed-out look with linear space, and players playing the game won’t notice the difference. Alternatively, you can use the levels adjustment in Photoshop ahead of time and save one instruction at the expense of some quality in the brighter areas of the texture. I’d recommend not doing this, exactly, because the one instruction savings is not worth the loss in quality.
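A quick Python sketch to compare the options, assuming what is being approximated is decoding sRGB-encoded values that were sampled with the sRGB flag off. The piecewise function is the standard sRGB decoding curve; the function names are made up for this example.

```python
def srgb_to_linear_exact(c):
    """Standard piecewise sRGB decoding for a value in [0, 1]."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def srgb_to_linear_gamma22(c):
    """Common approximation: a plain power of 2.2."""
    return c ** 2.2


def srgb_to_linear_square(c):
    """Cheapest approximation: multiply the value by itself (power of 2)."""
    return c * c
```

At a stored mid-grey of 0.5 the exact curve gives roughly 0.214, the 2.2 power gives roughly 0.218, and squaring gives 0.25, so the squared version does come out a little lighter, as described above.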

The best way to channel-pack normals is to use the default texture compression setting with no sRGB, append the normal channels together, ConstantBiasScale them (Bias: -0.5, Scale: 2.0), run the result through DeriveNormalZ to get the normal map, and multiply whatever part of the texture is going to affect screen colors by itself. Take a look at the example below:

Sotalo · June 29, 2016, 6:06am

Of course, the benefits of having three distinct textures (standard 3-channel texture map, normal map, channel-packed roughness, displacement, metallic) are color, quality, and quicker computation. You’ll be able to have parallax occlusion with a semi-metallic surface, complete with normal maps and full color textures. Even though those expanded capabilities are also twice as memory-intensive and require 3 draw calls instead of just one, really, memory and draw calls don’t matter anymore. You should try to avoid anything that is computationally expensive, with a high shader or lighting complexity. That will be your bottleneck.

With DirectX 12 being integrated in UE4 and the prevalence of shared samplers, there is less of a need to worry about draw calls. And with 2 GB GDDR5 VRAM + 8 GB DDR4 RAM and solid state drives as large as 250 GB becoming something of a standard on modern computers, memory has already become a problem of the past. If you still obsess about memory, then you’re developing for consoles and mobile. But even the cheapest of consoles have 1 GB of RAM, and they’re all going to be upgraded pretty soon. By and large, as we move into VR, 4K, and beyond-60-FPS framerates, which is where the future is going now, memory will be the last problem we will have to worry about: instead, we should focus on trying to eliminate the computational expense of rendering such features. Achieving higher framerates with even remotely good graphics will cost us DEARLY in GPU computations. We need to find a way to get better graphics for a cheaper cost, and this means looking back at some of the rendering methods from the past: the Gamecube/PS2/Xbox era games can provide some good study cases. Use reflection maps over water instead of realtime reflections. Use smarter textures with basic lighting built-in. Use vertex shading for translucency instead of pixel shading. Use opaque materials instead of translucent ones. Make tons of smaller particles instead of many larger ones. Use geometry in lieu of bump/parallax mapping. Separate materials, and keep them simple. If you have an expensive shader, limit your lighting. Worst case scenario, use static lighting for the world, and only use specular highlights for dynamic lights. On top of that, don’t have too many dynamic lights overlapping each other.

When it comes to materials, as much as it is a pain to admit, the shader complexity and overdraw will always be your number 1 performance killer. As long as your polygon count is not too bad and you’re not using too many 2K textures and your gameplay elements aren’t complete and total overkill, then your problem is nothing else but shader complexity. The Wii ran hundreds of moving gameplay objects fairly easily, and a modern CPU runs 5 times faster than that on just one thread. Most computers have 4 threads, and since more work is being done on GPUs nowadays, that frees up our CPUs a lot. That darn GPU is going to be a tough bottleneck to overcome. My PC runs textures like nobody’s business. I shine a few lights on a chandelier, and the frame rate crashes. If you want to optimize, just try to find a way to make good graphics with the fewest instructions possible. Use multiply exponents instead of the power function. Find ways to reuse calculations so you don’t run the same calculations twice (this was a problem with parallax occlusion in UDK and older versions of UE4). Limit yourself to a majority of 1K, rarely 2K textures, and make sure those pixels count. Use blending and detail texturing techniques to reduce the repetition in materials, and try not to break the bank doing it. Sometimes, less is more. If your material is getting too complex, try to find ways you can keep the same feeling while cutting back on the complexity. If you made an awesome material, find a way to cut down half of your material nodes while still making it feel as awesome as it did before. If you can do that, then you’re gold.
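As a small illustration of "multiply exponents instead of the power function", here is a Python sketch that gets fixed integer powers from repeated squaring. The helper names are hypothetical, and whether this actually beats a power node depends on the shader compiler and the hardware, so treat it as an idea to profile rather than a guaranteed win.

```python
def pow4(x):
    """x to the 4th power with two multiplies: square, then square again."""
    x2 = x * x
    return x2 * x2


def pow8(x):
    """x to the 8th power with three multiplies, reusing pow4."""
    x4 = pow4(x)
    return x4 * x4
```

The same reuse idea applies to any intermediate result: compute it once, store it, and feed it to every place that needs it.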

Deathrey · June 29, 2016, 12:51pm

You would benefit from packing stuff into alpha channel, when you need to reduce number of samplers at memory cost. The example you’ve linked was about detail textures, and I think it would be safe to assume that these textures were relatively low res.

You would have to pick a different compression method, other than default BC5, for your normal map and reconstruct B channel in the shader.

I’m quite sure that they found using fewer samplers more efficient. Besides, as I said above, this method was used for detail textures. It is safe to assume that these textures were not larger than 512.

I will give you an example where the opposite is true, in my opinion. Imagine needing a material that blends three 4K sets of normal, basecolor, and roughness maps. You can pack roughness into (A) of Albedo, or you can pack the 3 roughness maps into (R), (G) and (B) of another texture respectively. In the second case, you would need an extra sampler, but the memory savings with large textures would be worth it.
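A back-of-the-envelope Python sketch of that trade-off. The assumptions are mine: top mip only, one sampler per texture (no shared samplers), albedo stored as DXT1 without alpha or DXT5 with alpha, normal maps as BC5, and the standard block sizes of 8 bytes (DXT1) and 16 bytes (DXT5, BC5) per 4x4 texels.

```python
# Bytes used by one 4x4 texel block in each format.
BLOCK_BYTES = {"DXT1": 8, "DXT5": 16, "BC5": 16}


def mib(size, fmt):
    """Size in MiB of the top mip of a square block-compressed texture."""
    blocks = (size // 4) ** 2
    return blocks * BLOCK_BYTES[fmt] / (1024 * 1024)


def layout_cost(textures):
    """Return (sampler_count, total_mib) for a list of (size, fmt) textures."""
    return len(textures), sum(mib(size, fmt) for size, fmt in textures)


# Option 1: roughness packed into the alpha of each albedo map.
alpha_packed = [(4096, "DXT5")] * 3 + [(4096, "BC5")] * 3

# Option 2: albedo without alpha, plus one extra RGB texture holding
# the three roughness maps in R, G and B.
separate_rgb = [(4096, "DXT1")] * 3 + [(4096, "DXT1")] + [(4096, "BC5")] * 3
```

Under these assumptions option 1 comes to 6 samplers and 96 MiB, while option 2 comes to 7 samplers and 80 MiB, so the extra sampler buys back 16 MiB before mips are counted.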

Maximum-Dev · June 29, 2016, 2:22pm

I’m not actually worried about memory ATM because we’re doing good there so far and still have a lot of room. My reason for starting to pack textures is that the number of textures is greatly increasing the shader complexity. It’s not that there is a lot going on, but when it comes to landscapes it really counts how many textures are painted on a component. And we can’t really limit the number of layers for the game we are working on, so the purpose of channel packing here for me is to A) be able to have more layers to paint with and B) not have a high shader complexity. Freeing up memory is a plus.

I have already dropped a lot of Roughness maps and have started driving roughness from albedo and only use roughness map for where it’s really necessary. The materials that don’t need a roughness map I am trying to pack the normal maps together like this:

Texture map 1: RG-> Normal map 1 RG channels, B-> Normal map 2 R channel
Texture map 2: R-> Normal map 2 G channel, GB-> Normal map 3 RG channels
…
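The layout above, sketched per texel in Python. This only shows the channel bookkeeping for fitting three two-channel normal maps into two three-channel textures; the function names are hypothetical, and each unpacked pair would still need its Z rebuilt in the shader.

```python
def pack_three_normals(n1, n2, n3):
    """Pack the (r, g) values of three normal maps into two RGB texels."""
    tex_a = (n1[0], n1[1], n2[0])  # RG = normal 1, B = normal 2 R
    tex_b = (n2[1], n3[0], n3[1])  # R = normal 2 G, GB = normal 3
    return tex_a, tex_b


def unpack_three_normals(tex_a, tex_b):
    """Recover the three (r, g) pairs from the two packed RGB texels."""
    n1 = (tex_a[0], tex_a[1])
    n2 = (tex_a[2], tex_b[0])
    n3 = (tex_b[1], tex_b[2])
    return n1, n2, n3
```

Note that normal map 2 is split across both textures, so any layer that uses it has to sample both.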

Thank you for the very helpful information so far!
@Deathrey, That’s good information. Thank you.

Sotalo · June 29, 2016, 5:35pm

The biggest problem with landscape is having too many layers, not textures. While there are limits to the number of texture samplers you can have on landscape, having too many layers will reach that limit. But since most landscape textures will be loaded at once, using shared samplers and controlling your layer count to no more than 3-4 layers will indeed curb the rendering costs. Shared samplers will group all the shared textures into one gigantic texture array and call them at once, so again, at the cost of memory, you can save draw calls and computation. And with DX 12, there’s no need to worry so much about draw calls anyways. The rendering costs of handling blending physical and surface details between 4 different layers will kill you before anything else does.

I suggest keeping normals as standard uncompressed normal maps: this might cost two extra textures memory wise, but you’re getting the quality of a texture 4 times as large doing so. You can pack all of your roughness channels into one 3-channel texture, your height maps into one 3-channel texture, and standard 3-channel textures for colorful texture maps (or use the basic coloring method with just one channel for whatever monochromatic surfaces you might have). So for 4 independent textures, that translates to 4 normal maps, 4 diffuse textures, and 3 channel-packed maps to store roughness and height information for a combined total of 11 textures. And of course, you can always exchange texture data with detail texture maps, masks, add some detail normals as well, take away certain maps you might not need, whatever you need to do. And if you do go the route of using shared samplers, you will have the ability to boot up to 128 textures at once, so, I don’t really see textures as being a problem unless you start using ridiculous 2K and 4K sizes. Some characters on the PS4 had 6 @ 2K texture maps to render, though, so if you’re targeting mid to low-range PCs, 2K textures might not even be such a huge deal.

Millionviews · June 29, 2016, 8:15pm

Are we talking too many material layers? I mean…8 layers doesn’t have to be 8 materials / ‘nature kind of layers’ in a material. In my landscape I have 4 nature layers and 7 layers to bring variation such as fading rocks away or blend two variations.

Deathrey · June 30, 2016, 7:33am

I don’t think that sharing samplers has anything to do with texture arrays and memory cost at all.

The biggest problem is exactly about balancing out math and texture fetches, and so far, with landscapes, in roughly 80% of cases it was exactly texture fetches that were bottlenecking us, so @Maximum-Dev is on the right track in reducing the number of texture samplers per component.
Number of landscape layers hits memory mostly.

That seems wrong. I did not look into the engine source but I am quite positive that physical surface ID is calculated once and stored. Real-time cost, connected with number of layers, should lie within costs of sampling weightmaps and memory costs of storing them.

Maximum-Dev · June 30, 2016, 6:03pm

I am putting the normal map into RG and roughness into B and setting the compression to default. I get these black artifacts on the normal maps, which I can see up close.

Am I doing something wrong, or is it natural to get these artifacts?

Thanks for the information Deathrey.

Millionviews · June 30, 2016, 6:18pm

I once had that while reconstructing the Z component. I don’t know how you are doing that, but I had to multiply by 0.99. I also get these black artifacts when I export stuff from Substance Designer in 16-bit.
Turns out, pure black is not OK. Hence the math: blue * 0.99.

Deathrey · June 30, 2016, 7:36pm

@Maximum-Dev

Ensure that sRGB is turned off for your texture.
Then make sure that you unpack the RG of your normal into the -1 to 1 range.
After that, calculate the B of your normal using the DeriveNormalZ node.

Sotalo · June 30, 2016, 8:39pm

From what I understand, shared samplers combine textures into one array so that the whole array is called at once instead of multiple fetches to individual textures. And shared samplers need to load in all the textures at once at the highest LOD of any texture being used. So if you’re sharing textures between a lamp far away and a landscape texture up close, both textures will boot in at the highest LOD setting in use. So, the memory cost is that all the textures need to be loaded at the highest LOD.

And shader complexity! It might not seem like much, but you put 8 different layers on one component, and not only do you have to deal with all the textures loading into all those 8 layers, but the shader math to blend it all, too! It’s VERY easy to make a landscape shader with very expensive components when you go beyond 3 to 4 layers. 8 layers on one component is overkill for sure. Unless you’re targeting high-end PCs, like NVIDIA GTX 960+, I would not try to use that many layers in my materials.

Maybe not physical surface details, but shader complexity, yes. I’m making a landscape project with Sand, Grass, Dirt, and Rock layers. The sand layer also has special blending for the shoreline, above water, and underwater. The more layers overlap, the more the shader complexity kicks up. I have tons of textures on my landscape, somewhere between 10-12 textures, and it runs fine. The texture count in the entire material doesn’t matter: the best performance I get is when I don’t have too many layers overlapping. The sand layer by itself has a colored texture, noise map, medium normals, and detail normals. And shared samplers work really nicely to reduce the texture fetches. I’m not sure if it uses texture arrays, but I do know on Direct X shared samplers will allow you to use up to 128 textures in one pass.

Maximum-Dev · June 30, 2016, 9:55pm

Everything is set up correctly on my end but the black spots are still there.