Performance cost of double sided material: not visible in stats?

I am currently creating character meshes for my small Unreal Engine prototype. I wanted to be clever and save polygons by making the cloth out of a double-sided material… but then I forgot to single out the clothing into its own material! Doh!
I am now testing my first bake, where everything is in a single UV map and uses a double-sided material. I just wanted to gauge the performance difference between single and double sided but… according to “stats Engine”, there is none to speak of.

Switching the material from single to double sided does not increase the tris count at all, nor does the render pass time budge. I have multiplied the number of character meshes to get a greater impact, but there is still no difference.
The only difference I can see is that the material is reporting an increase of 4 instructions for the base pass.

Now, are these stats misleading? One thing to note is that MOST of these doubled tris would be invisible to the camera because of the angle. They would mostly be useful for the shadow calculations when light falls on the inside of clothing, and for certain camera shots where the inside is visible. Can Unreal Engine cull such hidden tris early (like it does for backface ones), and is this the reason why they have almost no performance impact?
Am I looking at the wrong stats? :)

If these findings are correct, should I even bother with creating a second UV Map containing all double sided geometry of the character, so I can move the rest (maybe 2/3 of the total polygons) to a single sided material? Wouldn’t the additional draw call eat up all the performance I could possibly gain by avoiding rendering those tris with a slightly more expensive shader, if the additional tris not visible get culled early anyway?

I am a little bit confused here, given I found a thread where someone was advised to create two materials for his flag to avoid rendering the flagpole two-sided…

You can use the same UVs but split the character into two materials. Two-sided materials will cause more overdraw. Normally the GPU culls back-facing triangles before the pixel shader, so there is no pixel cost related to them. But with a two-sided material all triangles are rasterized and you are relying only on early-Z culling. So, depending on the viewing angle, your character can cost double the fill rate in the worst case.
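
To make the culling step described above concrete, here is a minimal Python sketch (not engine code) of how facing can be decided from the winding order of a triangle’s screen-space vertices. The function names, the counter-clockwise-is-front convention, and the two sample triangles are assumptions made up for illustration.

```python
def signed_area(tri):
    """Twice the signed area of a 2D screen-space triangle.

    Positive means counter-clockwise winding, negative means clockwise.
    """
    (x0, y0), (x1, y1), (x2, y2) = tri
    return (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)


def triangles_to_rasterize(tris, two_sided):
    """Return the triangles that survive culling.

    Assumption: counter-clockwise triangles count as front faces.
    Degenerate (zero-area) triangles are dropped in both modes.
    """
    kept = []
    for tri in tris:
        area = signed_area(tri)
        if area == 0:
            continue
        if two_sided or area > 0:
            kept.append(tri)
    return kept


front = ((0, 0), (4, 0), (0, 4))  # counter-clockwise
back = ((0, 0), (0, 4), (4, 0))   # same triangle, clockwise
mesh = [front, back]

single_sided_count = len(triangles_to_rasterize(mesh, two_sided=False))
two_sided_count = len(triangles_to_rasterize(mesh, two_sided=True))
print(len(mesh), single_sided_count, two_sided_count)
```

The mesh holds the same two triangles in both modes; only the number handed on to rasterization changes, which is consistent with the triangle count in the stats staying put.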

The stats alone are not enough to understand how a material will behave; they are important, but not sufficient. You have other things to look at: the shader complexity view, which is also good but again not precise, and, best of all, the profiler. The profiler gives you an analysis of the entire scene, and you can dig into what is going on feature by feature. Your experiment is the best approach, because when you multiply the instances you should see the difference in milliseconds, which is what you want to care about. Also, multiplying the number of meshes (if they are the same) with the same material will not show much difference, because the GPU handles similar work very efficiently. If, though, the meshes have even small differences and the parameters for the same material also differ a little, then there is not much similarity, the optimization is gone, and you are relying on brute force.

The profiler is always the best tool. The stats can’t tell you everything, because materials sometimes have built-in conditions that change which instructions actually execute. Since I program in HLSL a lot, I can say that a material showing 1,000 instructions in the stats might, depending on conditions, run only 200 of them in a pass, while a material with 100 instructions might contain a loop that runs 1,000 times for the same pixel! That’s why, in my opinion, you can’t rely on the stat info alone.
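
The gap between listed and executed instructions can be put into numbers with a small Python sketch. The two materials below are hypothetical and simply reuse the figures from the post above; the 90/10 split of the second material into straight-line code and loop body is an assumption. Real GPUs shade pixels in groups, so a branch is only truly skipped when the whole group agrees, which this toy model ignores.

```python
def static_instructions(segments):
    """Instruction count as a static listing would report it."""
    return sum(count for count, _ in segments)


def executed_instructions(segments):
    """Instructions actually run for one pixel.

    Each segment is (instruction_count, times_executed): a branch that is
    not taken has times_executed == 0, a loop body has times_executed > 1.
    """
    return sum(count * times for count, times in segments)


# Hypothetical material A: 1000 instructions listed, but 800 of them sit
# in a branch that is not taken for this pixel.
material_a = [(200, 1), (800, 0)]

# Hypothetical material B: 100 instructions listed, 10 of which form a
# loop body that runs 1000 times.
material_b = [(90, 1), (10, 1000)]

for name, material in (("A", material_a), ("B", material_b)):
    print(name, static_instructions(material), executed_instructions(material))
```

Under these assumptions material B looks ten times cheaper than A in a static listing but executes roughly fifty times as many instructions for that pixel.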

Stats are correct in this case. The amount of triangles does not change.

Setting a material to double-sided increases the strain on the rasterizer and, situationally, on the pixel shader, but you won’t see any performance difference until these become the bottleneck.

It depends on circumstances, but occasionally you might want to save that one extra draw call. Triangles that are not visible are not getting culled, at least not to the full extent you’d expect.

Whether to do so is debatable. In the given context it is definitely not a ground truth and is open to discussion. No doubt there are cases where you’d want to do exactly the opposite. But commonly there is a truckload of other reasons to split them into two materials. Typically you’d need another shading model and some vertex animation for the flag, which is not needed for the pole.

Bottom line: common sense should be applied first when deciding things like that, for it is quite hard to anticipate the performance hit of such choices.

Triangles are culled by backface and object culling. So if you have a sphere, typically, the forward-facing tris are rendered while the backfaced tris are not. But by making the sphere double-sided, now the engine is filling all the backfaced tris on the sphere, doubling the vertex rendering for that object. And it’s doing this even if one pixel of the sphere is in view (technically, the object bounds). Keep in mind modern GPUs can handle millions of polygons, so you won’t ever reach a situation where making a single material double-sided will have a real notable performance impact (unless you’re rendering translucency or exceptionally high-polycount objects).

It’s a good idea on modern graphics cards to split up materials and take advantage of specialized shaders, tileable textures, and stuff like that. A flag on a flagpole is an excellent example: the flagpole might be made of metal or paint, and therefore should have a different material surface than the soft fabric cloth. And yes, the cloth can be rendered two-sided while the pole is not, but that’s not really significant.

Thanks for all the replies…

So then I simply cannot see the impact in stats because a) stats only show the raw tris, without adjusting for double sided (which makes sense), and b) the render time does not budge because the additional load will probably not show up until a bottleneck is reached with an increasing instance count of the same character being visible, which might happen sooner thanks to the double sided material, correct?

I am trying to save on render cost here as I might want to have many characters onscreen, and therefore I wanted to make sure the material is as cheap as possible. Also, the characters will not be seen up close, so I thought about simply not caring about hair and skin shader, and use a simpler subsurface material, which, with well created maps, has usually been good enough when seen from a certain distance. I thought that fitting everything into a single material would be best for DX9 and DX11 performance, don’t know how relevant draw calls still are under DX12 / Vulkan (haven’t had time to really look into this).
But then I have been using Unity for a long time, where saving on draw calls is extremely important because the engine has almost no “automagic” to do that for you (save for static meshes)…

So, given you don’t need specialized shading models for different parts and COULD fit everything into one material: do you reckon it’s a good idea to go with the smallest number of materials you can get away with? Is reducing the number of materials from 2 to 1 worth about 3-4000 additional tris being double sided, even if they don’t have to be?
Or is it not really that much overhead for the GPU if the character is split into 2 materials, compared to the additional tris being rendered double sided?

Looking at the shader cost view mode, it looks identical to the shader without double sided… green-ish (yeah, the accuracy of that is probably not too great given you need to eyeball it). I guess this only takes the 4 additional base pass instructions into account, but not what they are spent on? Only the cutout parts (which thankfully are small) appear red, due to overdraw I guess.
This view mode makes me believe that the double sided tris are cheaper than the area that is affected by the opacity mask… is that true? Or just the view mode not taking everything into account?

One question before more answers: are your characters going to have LODs? If yes, you usually prepare even cheaper materials for far distances and the most expensive ones for near distances. Then there is another approach which I love to talk about, since we are going to have a presentation from Ryan Brucks about this soon: the use of impostors for very far meshes, which I think you can use even for characters that are not moving, and then use the regular LOD system when they are moving.

So there are many options… it is hard to say anything generic; the best way to understand the impact is the actual use case. You said you would have “many characters onscreen”; that helps, but better info would be the number of tris per character x number of instances x number of materials per character. Whether the material is double sided does not matter for a first analysis, but depending on the numbers it may need to be taken into account.

Personally, I do my best to optimize pixel rendering cost. Optimizations here allow games to run faster at higher resolutions and yield the most dramatic performance improvements, especially on mid to high end hardware. Draw calls are more of a constant overhead that affects lower end users more than anyone else. They are still important in D3D11, but nowhere near as much in D3D12, Vulkan, and Metal. Unfortunately, Vulkan is still not stable enough to develop for right now.

Even if you have a lot of characters you can certainly get away with two material separations, take advantage, and make a nicer, more appropriate cloth shader. But that all depends on your desired fidelity and number of characters: a truly proper character needs a special model for hair, eyes, skin, cloth, and accessories as all these components require radically different shading models from each other. The eye alone typically requires separate shaders too, if you break it down far enough.

How many characters are in your project, what platforms are you targeting, and what fidelity are you trying to achieve?

Well… that’s the plan, for the future in which I hopefully have time to generate the LODs. As you can see from this answer, maybe I should be LESS concerned about optimizing my character meshes for performance right now, as I plan to omit LODs until later… I probably should simply not worry about it at the moment.

Imposters sound EXACTLY like what I would have planned for the highest/lowest LOD (never sure where you start counting with LOD :)) … A simple billboard with the coarse outline of the character for distances where it would be hard to see even simple animations. Very interested in this presentation… will it be available online?

About the numbers: The idea was to create a very detailed-looking art style, in spirit mimicking the old 2D isometric RPGs with extremely detailed 2D characters. Thus the characters would have a rather high poly count for characters seen from an isometric perspective (so that characters look good even in 4k), which I plan to make up for with aggressive LODs for characters further away to keep the polycount low.

Currently I am aiming for 6000-8000 tris for a humanoid character. The current, rather conservative plan is to allow for around 100 characters onscreen max. Of course, not all of them in the highest LOD… I might create the lower LODs “by hand” and replace the geometry that needs the double sided material for the highest LOD there, so the double sided material would only be used for the highest LOD (which would also have the highest tris count, but would only be active for a small percentage of characters on screen as it’s a “pseudo-isometric” view).

Would you reckon the impact of having 6000-8000 double sided tris, and only one material is way higher than having only 2000-3000 double sided, and 4000-5000 one sided ones, but in two separate materials, given there might be 20-30 such characters in the highest LOD on screen max at any given time?
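
The two options in the question above can be tallied with a short Python sketch. It uses the upper ends of the quoted ranges, and it assumes one draw call per material section per character per pass, which is a simplification (extra passes such as shadows or a depth pass are ignored). The function and variable names are made up for illustration.

```python
def scenario_cost(characters, sections):
    """Summarise one way of splitting a character into material sections.

    sections is a list of (triangle_count, two_sided) tuples, one per
    material. Assumption: one draw call per section per character per pass.
    """
    draws = characters * len(sections)
    two_sided_tris = characters * sum(t for t, two in sections if two)
    one_sided_tris = characters * sum(t for t, two in sections if not two)
    return draws, two_sided_tris, one_sided_tris


CHARACTERS = 30  # upper end of "20-30 characters in the highest LOD"

single_material = scenario_cost(CHARACTERS, [(8000, True)])
split_materials = scenario_cost(CHARACTERS, [(3000, True), (5000, False)])

print(single_material)  # (draws, two-sided tris, one-sided tris)
print(split_materials)
```

With these inputs the split doubles the draws from 30 to 60 while cutting the triangles rasterized without backface culling from 240,000 to 90,000. Which side of that trade matters more is exactly what has to be profiled.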

Okay, good points. I also was under the impression that draw calls were no longer as important in more modern rendering pipelines as in DX9… Given that this is just a prototype and still years away from any kind of release, I probably should stop trying to work around DX9/DX11 problems.

As for the fidelity/style: the characters look quite realistic, but I am not targeting real photorealism. Thus I don’t care too much about small inconsistencies with lighting and shading… all the textures will be hand-drawn, so trying to go for a 100% realistic look would probably increase the problems with textures looking off. So, as said, I am less concerned about using hair and eye shaders and all of that. From the distance the characters are seen, and in the environment using my textures, they look perfectly fine using a simpler subsurface shader.

Now, I of course decided that while starting the project in Unity, where AFAIK draw calls have a way higher impact than in Unreal 4. So my decision to move everything into a single material was motivated by the technical necessity as well as not seeing a visual problem with it. If I could use 3 or 4 materials “for free”, thus without impact, I probably would look into what the eye, hair and skin shader could offer (though aren’t all these shaders more “expensive” performance wise?).
Sticking with that idea, is it possible, and wise, to have a single texture atlas mapped to different UV materials? Thus have the eyes in the same texture map as the hair? Or does it make more sense to create a hair atlas for different characters, and share the materials between the different characters’ hair geometry? I am a little wary of having too many extra textures for small things like eyes, or strands of hair. I guess that is fine for games where characters will fill the whole screen during a cutscene anyway, but for what I plan to do, this seems overkill.

He will post here first and a stream for the community will follow later: http://shaderbits.com/blog/various-distance-field-generation-techniques
Always keep an eye on his blog at the link above; that article came out a few days ago.

This is a piece of cake for a current system to handle, even for an Xbox One S.

As a side note, if your project is still years away, or even just one year, GPUs will probably get even better and today’s high-tier cards will fall in price and become the common ground (hopefully the cryptocurrency madness will go away). So even if the performance you need requires what is a GTX 1080 today, that card will already have been replaced by something better.

Okay, I guess the answer is then “It shouldn’t matter as every system will be fast enough either way”… right? If that is the case, guess I will stop worrying about that for now, and simply try to pick what fits my art creation pipeline, and material needs better.

Guess I will have to come back to fix problems with my textures and meshes later on anyway…

Thanks for taking the time to post all this useful information!

Quick follow-up question if anyone here has the time to answer: Is it true that, at least on the GPU side, a “Context Switch” caused by a draw call is actually cheaper when little changes, for example when the same material is used with different texture inputs (a material instance), versus a “Context Switch” to a completely different material using a different shading model, blend mode, and whatnot?

This would mean modern GPUs are already optimized to recognize when the same input is used and circumvent loading the same data again, correct?

Now that Drawcalls shouldn’t be the huge bottleneck to the CPU anymore with DX12 and Vulkan, this might become more important… would mean that you can split your models into multiple materials fine with little overhead (unless there is a batchsize limit to drawcall batching in DX12 or Vulkan), as long as the materials are as similar as possible.

The reason for the question is that all this discussion of splitting the mesh into more UV maps to use multiple materials has gotten me thinking that I might want to split the “main part” of the mesh into 2 UV maps. At the moment that part would be best served by a 1024x2048 texture, yet I read that using non-square maps in Unreal Engine 4 would be a bad idea. 1kx1k has too low a resolution, 2kx2k too high, and having two 1kx1k ones would give me enough resolution without needing more VRAM than necessary.
Thus if I can use 2 material instances without a big hit under modern rendering pipelines, that is probably what I will end up doing. At least for the highest LOD…
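
The texture sizes being weighed here can be compared with a few lines of Python. This is a rough sketch that assumes uncompressed 4-byte texels and counts only the top mip; real projects would normally use compressed formats and full mip chains, which scales all three numbers but not their ratio. The helper name is made up for illustration.

```python
def texture_bytes(width, height, bytes_per_texel=4):
    """Size of the top mip only, assuming uncompressed 4-byte texels."""
    return width * height * bytes_per_texel


MIB = 1024 * 1024

one_non_square = texture_bytes(1024, 2048)
two_squares = 2 * texture_bytes(1024, 1024)
one_large_square = texture_bytes(2048, 2048)

for label, size in [
    ("1 x 1024x2048", one_non_square),
    ("2 x 1024x1024", two_squares),
    ("1 x 2048x2048", one_large_square),
]:
    print(f"{label}: {size / MIB:.0f} MiB")
```

Two 1kx1k maps hold exactly as many texels as one 1024x2048 map, and half as many as a single 2kx2k map, so the split costs no extra memory under these assumptions; the price is the second material section.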

Low-end graphics cards nowadays have a minimum of 4GB of RAM, and it is GDDR5, which is fast. Models like the GTX 1050 Ti and RX 560 offer cheap Full HD play at good framerates; the next faster cards up the scale, the GTX 1060 and RX 570, with roughly similar performance and memory setups, are about twice the price…

In general, if games made with your assets are meant to be played on PCs, you will be fine even without worrying too much… but in the end only PC game developers will be interested in your assets. A really good approach is to make your assets performant enough for consoles, knowing their limitations, because consoles stay on the market longer than PC graphics cards. Lastly, developers create a second set of materials for mobile.

If you have many characters onscreen simultaneously, a material built to take different textures, masks for grunge, scratches, dust, wetness, etc., and different normals to add detail will clearly offer the variety and modularity you need. That’s why Substance is a great tool for texturing nowadays; the versatility is too good to be ignored.

If my line of work were modeling and texturing, I would always keep my assets to a standard that works at least on the PS4 Pro; since the Xbox One X is quite a bit more powerful, anything that runs well on the PS4 is OK for the Xbox One X. Having a PS4 dev kit would be nice for fine-tuning. As for Xbox, you can use a regular console in Dev Mode and register with Microsoft to be able to create and run test builds; as far as I know, you can’t with the PS4.

This is not correct. Two-sided or not, the vertex and triangle counts are exactly the same. Backface culling happens after the vertex shader stage and only affects rasterization and pixel shading cost. On the CPU side there is no difference. It’s just a simple rasterizer state.

Has anyone experienced a performance difference in real projects with 2-sided materials? If so, can you describe it?

Well, in that case, the only way two-sided materials would incur any extra cost is with translucency and stuff like baking. For characters there’s absolutely no difference?

Incorrect. You are getting increased rasterisation and material pixel shader costs.

Is there a way to quantify that, without actual real world testing? What kind of overhead are we looking at? 2x the cost? 50% additional performance cost? 10%?

Just trying to gauge if it is even worth worrying about, unless the target is very weak hardware.

Roughly double cost of rasterisation and fragment shader.
That may sound scary, but the difference can be so small that you might not even be able to measure it in a practical scene. It all should be viewed in context, not as an absolute.

Double is the worst case. Depth buffering should cut the overhead in the average case by half with convex models, and by even more with complex models that have multiple layers.
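
The “by half” figure can be checked with a toy model in Python. It looks at a single pixel covered by a number of surface layers and assumes that triangles arrive in an arbitrary order, that a fragment failing a less-than depth test is rejected before the pixel shader runs, and that there is no separate depth pre-pass. These are simplifying assumptions, and the function names are made up for illustration.

```python
from fractions import Fraction
from itertools import permutations


def shaded_fragments(draw_order):
    """Count fragments that pass a less-than depth test at one pixel.

    draw_order lists layer depths in submission order (smaller is nearer).
    """
    nearest = float("inf")
    shaded = 0
    for depth in draw_order:
        if depth < nearest:
            nearest = depth
            shaded += 1
    return shaded


def average_shaded(layer_count):
    """Average shaded fragments over every possible submission order."""
    orders = list(permutations(range(layer_count)))
    total = sum(shaded_fragments(order) for order in orders)
    return Fraction(total, len(orders))


# Convex model: single sided covers a pixel once, two sided covers it twice.
print(average_shaded(1), average_shaded(2), average_shaded(4))
```

For a convex two-sided model the average is 3/2 shaded fragments per pixel instead of the worst-case 2, so the extra cost over single sided drops from 1 fragment to 1/2. With four layers the average is 25/12 instead of 4, which matches the point that layered models benefit even more from the depth test.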