So say you have a 4 channel texture, you plop it in a texture object and sample that using a texture sample. This has a cost, now you want to do something with the uvs or the mips or something so you plop down another texture sample and plug it up to the same object. Did you double your cost or no, or something else? What’s going on under the hood here?
Shouldn’t be a double cost, but you need to bench it to be sure.
Using the same texture manually set in two Texture Sample nodes has no additional memory cost (that's the main thing you care about) for the material.
It's basically the reason it's better to pack textures together as much as possible.
Fewer files = smaller memory footprint = less drag on performance.
There's also the fact that you always want to use the correct texture size for the job.
I.e.: there's no point loading an 8k texture when you see maybe 1/64th of it on screen.
Using larger-than-needed textures is the #1 cause of performance loss.
At the same time, it is better to have the 8K sources saved somewhere when you make an asset. It's just that they shouldn't make it into the final game without being reduced and compressed as much as needed.
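To make the packing point concrete, here is a minimal HLSL-style sketch; the texture/sampler names and the R = AO, G = Roughness, B = Metallic ("ORM") layout are just a common convention assumed for illustration, not something from this thread:

```hlsl
// One packed texture, one fetch, three masks recovered from its channels.
// PackedTex / PackedSampler / uv are placeholder names.
void UnpackORM(Texture2D PackedTex, SamplerState PackedSampler, float2 uv,
               out float ao, out float roughness, out float metallic)
{
    float4 packed = PackedTex.Sample(PackedSampler, uv);
    ao        = packed.r;   // ambient occlusion in R (assumed layout)
    roughness = packed.g;   // roughness in G
    metallic  = packed.b;   // metallic in B
}
```

Three unpacked greyscale textures would mean three files, three resident textures and three fetches; the packed version pays for one.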
When a texture is sampled, it is also partially or fully cached in memory. Subsequent samples of the same texture will be faster because they can access that cache.
However, the act of actually applying the UV transforms and sampling the texture itself is not free. Assuming a high cache hit rate, the cost will be much less than double. But don't take that to mean it's a good idea to use this liberally. Notice that many consider multisample blurs or tri-planar mapping expensive techniques even though they re-sample the same texture. Parallax Occlusion Mapping is another example of a technique that resamples a texture repeatedly, often dozens of times, although it's usually a low-resolution heightmap.
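To make that concrete, here is a rough HLSL sketch of what tri-planar mapping boils down to; every name in it (Tex, TexSampler, worldPos, worldNormal, tileScale) is a placeholder rather than an engine-provided input:

```hlsl
// Tri-planar mapping: one texture, three fetches plus blend math, all per pixel.
float4 TriplanarSample(Texture2D Tex, SamplerState TexSampler,
                       float3 worldPos, float3 worldNormal, float tileScale)
{
    float3 blend = abs(worldNormal);
    blend /= (blend.x + blend.y + blend.z);                          // normalize blend weights

    float4 xProj = Tex.Sample(TexSampler, worldPos.zy * tileScale);  // projection along X
    float4 yProj = Tex.Sample(TexSampler, worldPos.xz * tileScale);  // projection along Y
    float4 zProj = Tex.Sample(TexSampler, worldPos.xy * tileScale);  // projection along Z

    return xProj * blend.x + yProj * blend.y + zProj * blend.z;      // weighted blend
}
```

One texture in memory, but three samples plus the blend math per pixel, which is why it reads as "expensive" even with perfect caching.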
Well wait a minute there…
That's got to do with the math involved; it's not just "use the same image and call it a day."
The cost comes from the number of instructions, which in some of those techniques run per pixel, multiple times (passes).
The "cost" of a single texture can actually be really high on something like a landscape, to the point that you want to strip out even the most basic stuff and re-use a base color channel as the basis for height blends, for example.
Or it can be insignificant on something like water, where you usually sample the same normal map 4 times with different UVs, but adding in an extra image can end up looking better *and* costing less…
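As a rough HLSL-style sketch of that water setup (the texture name, pan speeds and tiling factors are all illustrative):

```hlsl
// One normal map, four panned/scaled UV sets: no extra memory,
// but four fetches plus the UV math and the blend, per pixel.
float3 WaterNormal(Texture2D NormalTex, SamplerState NormalSampler, float2 uv, float time)
{
    float3 n = NormalTex.Sample(NormalSampler, uv * 1.0 + time * float2( 0.020,  0.010)).xyz
             + NormalTex.Sample(NormalSampler, uv * 2.0 + time * float2(-0.015,  0.020)).xyz
             + NormalTex.Sample(NormalSampler, uv * 4.0 + time * float2( 0.010, -0.020)).xyz
             + NormalTex.Sample(NormalSampler, uv * 8.0 + time * float2(-0.020, -0.010)).xyz;

    return normalize(n * 0.5 - 1.0);   // unpack the averaged [0,1] result back to [-1,1]
}
```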
Right, that's my point exactly. You're doing math, often per pixel, when you sample a texture. You can save most or all of the memory cost on subsequent samples, but that isn't a green light to load dozens of Texture Sample nodes into a material, because they all carry unique instructions such as UV transforms. I bring it up because OP specifically mentions "doing something" with UVs and mips, which will increase the cost of additional samples (though still not near double), especially when using high-resolution textures or UDIMs.
I often put stuff (read: math) in a custom UV just to have it computed faster (per vertex instead of per pixel), even when it doesn't really need it.
In a way, "doing stuff" to the UV and re-sampling is maybe even 1/4 the cost of an extra texture… mileage may vary.
Graphics cards are way faster at computation than you'd think, in spite of the 40-series bringing nearly nothing but a higher price to the table.
Either way, as an extra tip for OP: just set your viewport to the Shader Complexity view mode and fine-tune the material.
As a general rule of thumb for large surfaces (landscape components, around 100 m²), you get into the red at about 9 textures (not necessarily just samples).
For landscapes in particular, more layers = more complexity = the shader work costs you more, simply because you can apply many layers to a single component…
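To put a rough number on that (purely illustrative, assuming each painted layer samples a few textures of its own): samples per pixel on a component is roughly textures-per-layer times layers painted on that component, so 3 textures per layer across 4 painted layers is already 12 samples, past the "around 9 and you're in the red" rule of thumb above.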
^^^ This. I thought it would be standard practice to make a performance pass, and for materials, why wouldn’t one load up any free custom-UVs with all possible maths?
Well for one because the Vertex Interpolator node exists so there isn’t much need to go through the process of doing that manually. But to each their own workflow.
And, as your comment implies, not all math can be done per vertex. All that said, I agree that math is almost always cheaper than memory, and per vertex beats per pixel whenever possible.
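For illustration, a conceptual HLSL-style sketch of what a Customized UV / Vertex Interpolator buys you; this is not the code UE generates, and all names are placeholders:

```hlsl
// Vertex-frequency work: paid once per vertex, then interpolated.
float2 AnimatedUV(float2 uv, float time)
{
    return (uv + time * float2(0.05, 0.0)) * 4.0;   // pan + tile, done per vertex
}

// Pixel-frequency work: the interpolated UV arrives ready, only the fetch remains.
float4 ShadePixel(Texture2D Tex, SamplerState TexSampler, float2 interpolatedUV)
{
    return Tex.Sample(TexSampler, interpolatedUV);
}
```

The caveat is the one already mentioned: linear math interpolates cleanly across a triangle, but anything non-linear or dependent on per-pixel data can look wrong when moved to the vertex stage.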
I'm just making the point that just because sampling the second time isn't as expensive as the first, it's not open season. Once again, you very frequently see performance concerns about having to sample the same texture multiple times when using techniques that require it, like tri-planar, anti-tiling, blurs, POM, etc.
I am really glad that modern GPUs are as good as they are. The hoops you used to have to jump through to get something to look good and run decent even as recently as PS4. We’re really spoiled these days.
“possible maths” but yes…
What is the drawback of using one 4096 vs four 2048s? I mean in terms of performance in a shader. I know with the virtual texture route you'd be sampling the texture twice for every UV set… but I'm more curious about non-virtual textures at 4k.
This depends on the system in use.
For something as simple as the landscape with a single 2 km² tile, 3 or 4 years back, with the engine version around 4.22,
Intel wrote a paper showing that changing the texture from 2k to 1k netted well over a 50% performance increase.
Now, that's probably unimportant, because the landscape system is such a mess that you can't consider benchmarks with it any kind of standard.
The reality is, it's also 4 (or perhaps more) years later, and technology is way different.
The give and take on this is basically RAM cost, which depends on the type of image you have to store.
It also depends on the underlying system that reads in the texture (regular and VT behave differently).
You have hardware as the unchangeable bottleneck, and the way you choose to read the texture as the bottleneck you can control.
Obviously, smaller is better.
Less load, less performance drag.
The question you should really ask yourself is: "how small a power-of-2 texture can I get away with while this item still looks like a million bucks when shaded?"
If your answer is 4k, then something may be wrong with the model (unless you are rendering cinematics in 8k).
Think about it this way: 4k is the screen size you display at. That's the largest size to shoot for, because people do game in 4k (at 240 Hz if possible?).
Any object is contained within that screen size, so whatever you are depicting should always be smaller than 4k…
Do you really need the object's texture to be the same size as, or even bigger than, the screen?
(Rhetorical as it may be, the answer to that is "it depends".)
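One rough way to put a number on "how small can I get away with" (just an approximation, ignoring mip bias and anisotropy): texture size needed is roughly the object's largest on-screen footprint in pixels divided by the fraction of the 0-1 UV range that footprint covers. A prop that never fills more than ~1000 px of a 4k frame and uses most of its UV space is fully served by a 1k texture, and mips take care of every smaller case.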
I can tell you just from anecdotal experience that changing my texture arrays from 4k, to 2k, to 1k all had demonstrable performance differences in practice. The smaller the texture, the less information is traveling around.
In my landscape/heightfield-mesh experiments, I have a decently expensive material. Switching between 4k and 1k is about a 12 fps difference for me at 4k resolution. And I know that without specs that doesn't really quantify it for you, but the fact that it CAN be quantified shows it's valid…
Well, you know… we think that 2k vs 4k is basically half because of the numbering, but it really isn't.
There are 4 2k images inside a 4k image…
And 16 within an 8k one.
Expecting half the savings is really just wishful thinking. It's more like 3/4 worth of savings when you scale down to the next available option.
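For the arithmetic (uncompressed RGBA8 at 4 bytes per texel, ignoring mips and block compression, just to make the ratio concrete): 4096² × 4 bytes is about 64 MB versus 2048² × 4 bytes at about 16 MB, so stepping down one power of two drops the texel data to a quarter, i.e. the "3/4 worth of savings" above.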
And yeah, it's 100% valid… as always, projects need to be benched/tested for performance considerations…
Thanks for the input. I am also curious about texture arrays. I am considering using them to lessen draw calls rather than building 1 atlas texture. The question there is… does it treat a texture array as 1 texture to sample/look up? And if so… is it like sampling an extremely large texture when it holds multiple textures? Or does it maintain the speed of a few smaller ones?
That's only partially true in terms of monitor resolution. I mean, with a 4k texture, although it's large, you have to take into account that the pixels on the other side of the mesh will not be in view… it's not a 1-to-1 measurement, 4k texture to 4k monitor… it would be more like seeing half the texture on display on the monitor…
Hence the “it depends” after it?
An array is only one sample no matter how many textures it contains, as it can only sample one index per pixel.
But does a larger array of textures make it more expensive… like a 2048 is more expensive than a 1024? Would an array of four 2048s… somehow have the performance of a single 4096 texture?
A larger array consumes more memory, but if we assume those textures would’ve been in memory anyway, this is irrelevant. Where you can get into trouble is if you make an excessively large array of textures that results in memory being allocated to textures that aren’t actually in use. I would say rather than four 2k tex performing like one 4k tex, you should think of it as performing like a 2k tex that consumes as much memory as a 4k tex.
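For what that looks like at the shader level, an illustrative HLSL snippet (TexArray, TexArraySampler, uv and layerIndex are placeholder names): only the slice addressed by the index is fetched for a given pixel, the other slices cost memory rather than lookups.

```hlsl
// Sampling a Texture2DArray: the third UVW component selects the array slice.
float4 SampleArrayLayer(Texture2DArray TexArray, SamplerState TexArraySampler,
                        float2 uv, float layerIndex)
{
    return TexArray.Sample(TexArraySampler, float3(uv, layerIndex));
}
```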
I guess I'm just curious about the lookups (not sure if I got the terminology right… lookups or samples)… like every time you access one of the textures to modify it with math and operations… does it know to use only the selected one in the array? Does resolution, in that regard, matter?