I’ve been trying, unsuccessfully, to replicate a dense foliage scene that really should be possible in UE4 with decent performance. I set up 3 grass mesh LODs, with only the first one using world position offset. My trees also have 3 LODs, with the first one using modeled leaves to avoid overdraw (similar to the Kite demo). Both have a culling distance set up.
Here are a few screenshots:
Here’s an album showing the grass and trees on an empty third person project, on an empty map: http://imgur.com/a/rrAp7
As you can see in the first screenshot, it’s running at just over 40fps. However, this is an nVidia 780GT and an i7-4770; it should be eating that scene for lunch.
Could you take a screenshot with “Show Shader Complexity” activated? With deferred renderers, that’s usually the biggest issue. Are you using the foliage tool to place everything?
Hm. The shader complexity seems to be fine.
I haven’t worked much with the performance profiling tools yet, but if I’m reading it correctly you have around 8.7 million triangles in your scene (not sure if that counts the whole scene or only the currently rendered triangles).
What’s the triangle count on your trees in LOD0? A screenshot in wireframe mode might be helpful.
A while back I was working on foliage with SpeedTree and I noticed that having (even simple) modeled leaves would cause a significant performance drop while not really giving a big visual payoff - except when you are really close to a leaf.
It was definitely the right decision for the Kite Demo, because it really showed the great streaming capabilities of the engine (smoothly moving the camera from a detailed leaf to a bird’s-eye view of the ridiculously huge map is just amazing); however, that might not be suitable for every game project.
As you can see from the ShaderComplexity, my LOD0 has level 2 branches that were modeled to fit their texture, to avoid overdraw. The LOD0 has some 40k triangles which I know is too much. However, locking the trees to LOD1 (4k) or LOD2 (2k) does little to raise the FPS unfortunately. Both of them kick in rather quickly anyhow.
It seems from your screenshots that the trees are by far the biggest performance problem since the grass by itself was running at 80fps.
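FPS figures are easier to compare once they are converted to frame time. Here is a minimal sketch of that arithmetic using the two numbers quoted so far (80fps with grass only, roughly 40fps for the full scene). It assumes the costs simply add up in milliseconds, which is only a rough approximation on a real GPU, and the variable names are made up for illustration.

```python
def frame_ms(fps: float) -> float:
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps


# Figures quoted in the thread.
grass_only_ms = frame_ms(80.0)   # grass by itself
full_scene_ms = frame_ms(40.0)   # grass + trees ("just over 40fps")

# Assumption: per-frame costs are additive, so the difference is the trees.
trees_ms = full_scene_ms - grass_only_ms

print(f"grass only:        {grass_only_ms:.1f} ms")
print(f"full scene:        {full_scene_ms:.1f} ms")
print(f"trees (estimated): {trees_ms:.1f} ms")
```

Under that assumption the trees account for about half of the frame. The same conversion is handy for judging later tweaks, since a change of a few fps means very different things at 24fps and at 80fps.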
In your shader complexity view most of it looks good but there is a decent amount of white pixels creeping in towards the horizon where it looks like grass and trees overlap. You could start by trying to reduce that area somehow.
Is there any room to optimize your grass or leaf materials?
Also, if your most aggressive tree LOD is 2k, that could be the problem. For the Kite demo our furthest LOD was a billboard, and the distance at which it kicked in was actually pretty close to the camera. The screen size for the billboard was pretty big at 0.01. They were supposed to be 2 polys, but I mistakenly used a plane mesh that was tessellated into 4x4, for absolutely no reason other than my oversight…
As for my materials - the grass material is almost a copy of the Kite demo grass material, with a few things removed for simplicity (I can’t recall what exactly now). That being said, testing out the grass with the material set to Opaque rather than Masked gives less than a 2 fps performance boost, so overdraw doesn’t seem to be the issue. I have a similar setup for my trees as the Kite demo as well, with LOD0 having the second level branches modeled to fit the branch and leaf shape, while LOD1 and LOD2 use cards. None of the leaf materials do anything fancy either. They apply a tint to the texture, do the SpeedTree wind (set to Better quality), use a mask to separate the leaves and branches, and apply proper subsurface color and fuzzy shading.
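To make the screen-size idea concrete, here is a rough sketch of how a LOD could be picked from a screen-size value. It is an illustration only, not the engine's code: the engine's exact screen-size definition differs between versions, so the formula below (projected bounding-sphere diameter as a fraction of screen width) is an assumption. Every threshold except the 0.01 billboard value is hypothetical, and so are the tree radius and distances.

```python
import math
from typing import Sequence


def screen_size(radius: float, distance: float, fov_deg: float = 90.0) -> float:
    """Approximate fraction of the screen width covered by a bounding sphere."""
    half_fov = math.radians(fov_deg) / 2.0
    # Sphere diameter divided by the visible width at that distance.
    return (2.0 * radius) / (2.0 * distance * math.tan(half_fov))


def pick_lod(size: float, thresholds: Sequence[float]) -> int:
    """Return the highest LOD index whose threshold is still >= size.

    `thresholds` must be in descending order, one entry per LOD.
    """
    lod = 0
    for index, threshold in enumerate(thresholds):
        if size <= threshold:
            lod = index
    return lod


# Hypothetical thresholds for LOD0..LOD2, plus the 0.01 billboard from the post.
THRESHOLDS = [1.0, 0.3, 0.1, 0.01]

for distance in (1_000.0, 10_000.0, 100_000.0):
    size = screen_size(radius=500.0, distance=distance)
    print(f"distance {distance:>9.0f}: screen size {size:.3f} -> LOD{pick_lod(size, THRESHOLDS)}")
```

The point of the sketch is that the cheapest LOD only helps if its threshold is high enough that most trees in view actually reach it.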
I also guessed that the vert count on the trees was the biggest factor and my original intention was to make a billboard as well, but frankly the billboard material from the kite demo is just a bit out of my skill reach. That being said, I can probably just use the fake UV macro to get the tiling proper, however I’d need to check out how to generate a proper sprite atlas in SpeedTree, as it generates some weird sideways stuff by default.
Last but not least, I doubt the vert count explanation just a bit, because I don’t see much of a performance gain from locking the tree mesh to LOD 1 (which has 10x fewer verts than LOD0) or LOD 2. I am however in the process of creating a more lightweight tree to test it out.
The default billboards output by SpeedTree should be just fine. Most of our tree variants used the billboards straight out of SpeedTree due to time constraints. They aren’t really billboards; rather, they are a series of cross planes, and the vertex shader hides all but the one (or two) most facing the camera. For all intents and purposes you can consider them billboards though.
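As an illustration of the "keep only the most camera-facing plane" idea, here is a small sketch on the CPU. It is not SpeedTree's actual shader, just the underlying dot-product test; the plane count, the even spacing of the planes and the function names are all assumptions.

```python
import math


def plane_normals(count: int):
    """Horizontal normals of `count` vertical cross planes spread over 180 degrees."""
    return [
        (math.cos(math.pi * i / count), math.sin(math.pi * i / count))
        for i in range(count)
    ]


def most_facing(normals, to_camera):
    """Index of the plane whose normal is best aligned with the camera direction."""
    length = math.hypot(to_camera[0], to_camera[1])
    cx, cy = to_camera[0] / length, to_camera[1] / length
    # abs() because a two-sided card faces the camera from either side.
    scores = [abs(nx * cx + ny * cy) for nx, ny in normals]
    return max(range(len(normals)), key=scores.__getitem__)


normals = plane_normals(4)  # hypothetical: 4 cross planes, 45 degrees apart
for direction in [(1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (-1.0, 1.0)]:
    print(direction, "-> show plane", most_facing(normals, direction))
```

In a shader the same comparison would run per vertex, with the losing planes collapsed or pushed off-screen so only the winner gets rasterized.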
Just because changing the LOD one level doesn’t show a big enough performance increase does not rule out the issue.
To test if that’s really the case, replace every tree with a cube or something else ridiculously simple.
I am also curious if you see any performance difference if you remove the “SpeedTree” node from the worldpositionoffset input on all the tree materials. It is possible there is some high cost there due to high vert count.
@RyanB, don’t billboards cause big overdraw issues? Doesn’t that affect performance a lot when there are so many of them, for example when you are a little way from a forest and all the trees are billboards?
What I meant by SpeedTree billboard was the atlas texture of billboards it outputs. I never could figure out how to get their billboard LOD to show up, it always just spat out 3 LOD levels for me.
Yes it can. For these kinds of optimizations we are always balancing between several bad options so you sometimes have to try things and adjust as needed on a case by case basis.
For certain cases it makes sense to try to cut the billboard out. SpeedTree has some tools to do this automatically. It’s not too hard to set it up manually either. For the Kite demo it wasn’t a huge deal since the shape of the trees was somewhat boxy.
For a pine tree this means your billboard can often be a single triangle which is nice.
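A quick back-of-the-envelope sketch of why that helps with overdraw: if the pine silhouette fits inside a triangle with the same base and height as the original quad, the card covers half the area, so roughly half as many masked pixels get shaded. The dimensions are hypothetical, and a real cut-out would need a slightly larger triangle than this idealized one.

```python
def quad_area(width: float, height: float) -> float:
    """Area covered by a full rectangular billboard."""
    return width * height


def triangle_area(base: float, height: float) -> float:
    """Area covered by a single triangle with the same base and height."""
    return 0.5 * base * height


width, height = 4.0, 10.0  # hypothetical billboard size in meters
quad = quad_area(width, height)
tri = triangle_area(width, height)
saved_fraction = 1.0 - tri / quad

print(f"quad: {quad:.1f}, triangle: {tri:.1f}, area saved: {saved_fraction:.0%}")
```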
Ok, I did some research and replacing my trees with appropriately sized boxes makes their impact negligible. However, I did observe something interesting. Replacing my standard leaf material with just a masked diffuse + normal with the foliage shading model gives me back a lot of FPS (from 24 to 42). My standard material is this:
Will try plugging stuff out to see what’s causing it…
Ok. I looked at that fuzzy shading function and there’s a ton of expensive stuff in there that really should be done in the vertex shader for foliage. It has a transform, two one-minuses, powers, etc.
The problem is that it’s doing this per-pixel, so it’s not possible to do it in the vertex shader as is. If you used the vertex normal as the normal for fuzzy shading, you could compute it via Custom UVs and it should help. You’d also have to put 1 for base color and multiply in your base color after the fact using the Custom UV result.
Actually scratch my previous message - the copy of the tree that had the simplified material I was switching it with was locked at LOD 2, hence the FPS gain.
Edit: Just skipping the FuzzyShading node gives me a 1fps gain. Sigh, gonna continue working on the simplified tree.
Doh… no worries it can be easy to confuse yourself when trying a bunch of things. Still you found something and everything does add up so I would expect to have to do both changes to meet your targets.
Hey Ryan, I’m in the middle of creating my tree and SpeedTree still won’t generate my Billboard LOD level (3 levels, all are mesh based) even though Make Billboards is checked. Do I need to do something in addition to that?