What is faster: Extra Texture Sample or this function on existing Texture Sample?

Slavq · July 19, 2019, 7:15pm

I have a material that samples a 4k heightmap render target. I need a normal map and a slope map derived from it, and I can get them in two ways:

a) Use this function on the existing heightmap (that is already used in the material):

b) Draw the normal map (+ slope map in the alpha channel) into a render target in Blueprints once (at design time), then just use it in my material through an extra Texture Sample.

… In option a) we skip adding one additional texture (render target with normal map) by reusing the existing heightmap sampler. But is it worth it? The function samples the texture from 5 different UVs and does these calculations on it, so I’m not sure… It samples the same texture, but still, I don’t know if that matters or if it’s expensive to do that 5 times on 4k.
I’d be grateful for any tips.
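
The function from option a) is not reproduced in the thread, so the following is only a minimal sketch of what a five-tap heightmap-to-normal function typically does, written in Python rather than material nodes. The names `normal_and_slope`, `texel_size` and `intensity` are hypothetical, and defining slope as 1 minus the normal's Z component is an assumption.

```python
import math

def normal_and_slope(height, x, y, texel_size=1.0, intensity=1.0):
    """Central-difference normal and slope from a 2D height grid.

    Reads five texels: the center plus its left/right/up/down neighbors.
    Coordinates are clamped at the borders.
    """
    rows, cols = len(height), len(height[0])

    def h(ix, iy):
        ix = min(max(ix, 0), cols - 1)
        iy = min(max(iy, 0), rows - 1)
        return height[iy][ix] * intensity

    center = h(x, y)  # the fifth tap; the gradient itself only needs the 4 neighbors
    dhdx = (h(x + 1, y) - h(x - 1, y)) / (2.0 * texel_size)
    dhdy = (h(x, y + 1) - h(x, y - 1)) / (2.0 * texel_size)

    # The normal of the surface z = h(x, y) is (-dh/dx, -dh/dy, 1), normalized.
    length = math.sqrt(dhdx * dhdx + dhdy * dhdy + 1.0)
    normal = (-dhdx / length, -dhdy / length, 1.0 / length)

    # Assumed slope measure: 0 on flat ground, approaching 1 on vertical walls.
    slope = 1.0 - normal[2]
    return center, normal, slope
```

Option b) would run the same math once per texel when the render target is drawn, instead of once per shaded pixel every frame.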

Deathrey · July 20, 2019, 2:51am

It depends on the ratio of screen resolution to texture resolution and on how the result will be used, but generally option b) would be preferable.

Slavq · July 20, 2019, 5:37pm

Thanks, I’ll probably go with b) indeed.

I’m not very proficient in the material stuff yet, but I’m trying to learn more. I’ve read about texture lookups, shader instructions, etc., and it seems like a function like that can be quite expensive on a 4k RT, since it fetches a value from it 5 times and then does these calculations.
I also found another thread on a related topic here:
https://forums.unrealengine.com/development-discussion/rendering/1502226-material-cost-texture-lookups-vs-shader-instructions

anonymous_user_fbe2d247 · July 20, 2019, 10:59pm

I would prefer A, but profile both options in your use case.

Manoel.Neto · July 21, 2019, 12:13pm

You should profile. You are sampling the same texture, yes, but since the samples are on neighboring texels, the GPU texture cache will play a part, so it’s not clear whether it’s going to be slower than a single sample from another texture.

Slavq · July 21, 2019, 12:49pm

Indeed, since there is no universal answer and it depends on the use case, I’ll build both scenarios and profile them in my scene. Thank you for the info!

anonymous_user_fbe2d247 · July 21, 2019, 1:40pm

And if memory is a problem, then A is better.
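
To put a rough number on the memory side, here is a small back-of-the-envelope sketch. The texel sizes (4 bytes for 8-bit RGBA, 8 bytes for 16-bit float RGBA) are assumptions about how the extra normal/slope render target might be set up; the thread does not state its format. `texture_bytes` is a hypothetical helper.

```python
def texture_bytes(width, height, bytes_per_texel, with_mips=False):
    """Uncompressed size of a texture, optionally including its full mip chain."""
    total = 0
    w, h = width, height
    while True:
        total += w * h * bytes_per_texel
        if not with_mips or (w == 1 and h == 1):
            break
        # Each mip level halves both dimensions, down to 1x1.
        w, h = max(w // 2, 1), max(h // 2, 1)
    return total

MIB = 1024 * 1024
print(texture_bytes(4096, 4096, 4) / MIB)  # 8-bit RGBA: 64.0
print(texture_bytes(4096, 4096, 8) / MIB)  # 16-bit float RGBA: 128.0
```

A full mip chain adds about one third on top of those figures.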

anonymous_user_fbe2d247 · July 21, 2019, 1:52pm

You can also try option C: screen-space derivative normals. Calculating the screen-space polygonal face normal is easy.

But you can also get a lot smoother normal via the same method if you calculate the heightmap offset per pixel for the world-space position.

Pseudo code:
OffsetPosition = WorldPos + VertexNormal * HeightMapSample * Intensity;
normal = normalize(cross(ddx(OffsetPosition), ddy(OffsetPosition)));

This avoids the faceted look because OffsetPosition is continuous.
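
As a plain-math illustration of the pseudo code above, here is a minimal Python sketch. It replaces ddx()/ddy() with explicit differences to the neighboring pixel's position in screen X and screen Y, which is an assumption about how those derivatives behave. All function names are hypothetical, and the sign of the result depends on the coordinate conventions in use.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    length = math.sqrt(sum(x * x for x in v))
    return tuple(x / length for x in v)

def derivative_normal(pos, pos_next_x, pos_next_y):
    """Normal from a position and its neighbors one pixel over in screen X and Y."""
    dpdx = sub(pos_next_x, pos)  # stands in for ddx(OffsetPosition)
    dpdy = sub(pos_next_y, pos)  # stands in for ddy(OffsetPosition)
    return normalize(cross(dpdx, dpdy))

# Flat surface in the XY plane: the normal points along +Z.
print(derivative_normal((0, 0, 0), (1, 0, 0), (0, 1, 0)))  # (0.0, 0.0, 1.0)
```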

Slavq · July 21, 2019, 2:09pm

Thanks, in my case I can’t use Vertex Normals since they are altered in an unusual way (long story, but I can’t change it in the current project and just need to avoid it. I have Tangent Space Normal off in my material). But it’s definitely a nice solution if we aim for a stylized flat-shaded look.

Manoel.Neto · July 23, 2019, 12:02pm

anonymous_user_fbe2d247:

You can also try option C: screen-space derivative normals. Calculating the screen-space polygonal face normal is easy. https://answers.unrealengine.com/sto…latshading.png
But you can also get a lot smoother normal via the same method if you calculate the heightmap offset per pixel for the world-space position.
Pseudo code:
OffsetPosition = WorldPos + VertexNormal * HeightMapSample * Intensity;
normal = normalize(cross(ddx(OffsetPosition), ddy(OffsetPosition)));
This avoids the faceted look because OffsetPosition is continuous.

Using ddx() and ddy() on values that come from texture samples returns the same value for every 2x2 pixel block, causing the effect to look pixelated, as if rendered at half resolution. You can’t use that to cheat your way out of sampling multiple times.
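
The 2x2 behavior can be illustrated with a small Python simulation. This is only a sketch that assumes coarse derivatives, meaning one difference per 2x2 quad shared by all four pixels; `coarse_ddx` is a hypothetical name, and real hardware may evaluate derivatives at a finer granularity (HLSL also exposes ddx_fine/ddy_fine).

```python
def coarse_ddx(values):
    """Simulate a coarse ddx(): one horizontal difference per 2x2 pixel quad.

    `values` is a 2D list of per-pixel values with even width and height.
    Every pixel of a quad receives the difference taken on the quad's top row.
    """
    rows, cols = len(values), len(values[0])
    out = [[0.0] * cols for _ in range(rows)]
    for qy in range(0, rows, 2):
        for qx in range(0, cols, 2):
            d = values[qy][qx + 1] - values[qy][qx]
            for dy in (0, 1):
                for dx in (0, 1):
                    out[qy + dy][qx + dx] = d
    return out

# A curved height profile: the true slope changes at every pixel,
# but the simulated derivative only changes once per quad.
heights = [[float(x * x) for x in range(4)] for _ in range(2)]
print(coarse_ddx(heights)[0])  # prints [1.0, 1.0, 5.0, 5.0]
```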

There’s option D: figure out a way to use TextureObject.Gather() through a custom node (if that’s even possible), which will give you the four sample values in one call (which you then have to bilinear-filter manually).
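
The manual bilinear step could look roughly like the Python sketch below. It is not HLSL and does not model the real component order of Gather(). `gather4` and `bilinear` are hypothetical helpers, and the texel-center convention (subtracting 0.5) and the clamped addressing are assumptions.

```python
import math

def gather4(tex, u, v):
    """Fetch the 2x2 texel footprint around (u, v) and the blend fractions.

    `tex` is a 2D list of scalars; u and v are normalized to [0, 1].
    """
    rows, cols = len(tex), len(tex[0])
    # Shift by half a texel so that integer coordinates land on texel centers.
    x = u * cols - 0.5
    y = v * rows - 0.5
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0

    def texel(ix, iy):
        ix = min(max(ix, 0), cols - 1)  # clamp addressing
        iy = min(max(iy, 0), rows - 1)
        return tex[iy][ix]

    texels = (texel(x0, y0), texel(x0 + 1, y0),
              texel(x0, y0 + 1), texel(x0 + 1, y0 + 1))
    return texels, (fx, fy)

def bilinear(texels, fractions):
    """Blend four gathered texels by hand."""
    t00, t10, t01, t11 = texels
    fx, fy = fractions
    top = t00 + (t10 - t00) * fx
    bottom = t01 + (t11 - t01) * fx
    return top + (bottom - top) * fy

tex = [[0.0, 1.0],
       [2.0, 3.0]]
print(bilinear(*gather4(tex, 0.5, 0.5)))  # 1.5
```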