Trying to Understand Kite Demo Performance Implications

Hey there Epic community!

I’ve recently downloaded the Kite Demo to my computer, because I wanted to play around with it and see what I could do. I really wanted to get a better understanding of some of the rendering features in the demo. Before I go further, here are my specs:

Core i5-6600K quad-core CPU @ 3.5 GHz
32GB DDR4 RAM
GTX 1070 GPU with some light overclocking

Based on these specs, the Kite Demo really should run fine given Epic’s recommended specs; my GPU should actually be a little faster than the GTX 980 that Epic recommends. The only place I fall short of the target spec is the 6-core CPU, but the demo never really uses more than 40-60% of my CPU. And yes, I updated my graphics driver.

I’m getting 20FPS with just the opening scene loaded (none of the background levels, or the rest of the landscape). I ran the GPU profiler, since I know the CPU and RAM aren’t falling short. It tells me that the BasePass is taking up most of the frame computation, at ~30-35ms/frame. Under that BasePass, ‘Dynamic’ takes most of its frame time, and the two ‘Static’ passes take a little time but not a ton.

Aside from that, the ShadowDepths takes a big chunk too (~10-15ms). There is only one 8k atlas, and there are four ‘WholeScene split’ parts to the directional light source. One of those is taking most of the computation time for this category. I think this category is for CSM from the directional light, but please correct me if I am wrong.

Then there’s the Lights category taking ~2ms. This category seems to contain Ray-Traced DF shadows which are taking just 1ms. There are several references to HFGI in this category too, but in all, it seems to be using a lot less resources on this than I was anticipating, assuming this Lights category is including all the calculations for DFGI and HFGI.

Finally, the last category that has a significant impact is SkyLightDiffuse at ~2-3ms. This category includes ‘DistanceFieldLighting’ which is mostly ‘HeightfieldIrradiance’.

So just from what I’m gathering, it seems like HFGI and DFGI/DFAO are only using 5-6ms of frame time. CSM (or whatever ShadowDepths is) takes a larger chunk, but what I don’t get is BasePass. Is this just mesh rendering? It seems like it’s too high for some reason, but I don’t know what.
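As a quick sanity check on the numbers above, here is a minimal sketch that adds up the per-pass timings quoted in the post and compares them with the frame time implied by 20 FPS. The pass names and ranges are copied from the post; the dictionary layout and the `frame_ms` helper are just illustrative.

```python
# Frame-time budget check using the figures quoted in the post.
# Each entry is (low_ms, high_ms) as read off the GPU profiler.
passes = {
    "BasePass": (30.0, 35.0),
    "ShadowDepths": (10.0, 15.0),
    "Lights": (2.0, 2.0),
    "SkyLightDiffuse": (2.0, 3.0),
}


def frame_ms(fps):
    """Convert a frame rate to milliseconds per frame."""
    return 1000.0 / fps


# Sum the low ends and the high ends of every listed pass.
low = sum(lo for lo, _ in passes.values())
high = sum(hi for _, hi in passes.values())

print(f"20 FPS = {frame_ms(20):.0f} ms per frame")
print(f"Listed passes sum to {low:.0f}-{high:.0f} ms")
```

The listed passes roughly account for the whole frame, and BasePass alone is well over half of it, which is why it stands out as the thing to explain.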

Here are a few screenshots of the camera angle I got my numbers from, in lit and wireframe, plus some screenshots of the GPU profile: http://imgur.com/a/aeQYf

PS: The shadowing on the landscape/trees on the upper right side of the lit image in the imgur link is pretty harsh. Is this normal behavior, or are those supposed to be a little softer?

I think this issue is related to HLOD.

In the project downloaded from the launcher, HLOD data is stripped, since HLOD is a kind of derived data.

You have to rebuild it. However, to do so you need Simplygon - which has issues with UE4 currently.

I think Epic should reconsider distributing the project with HLOD data.

So the problem is that I don’t have LOD information? That would certainly make sense. Is there a way for me to fix this?

Make sure you have loaded all sublevels.

Then go to Window -> Hierarchical LOD Outliner and click Generate Clusters, then Generate Proxy Meshes.

If you have Simplygon, meshes will be merged and polygons will be reduced; otherwise, only meshes will be merged to get lower draw call numbers.

However, since the acquisition of Simplygon by Microsoft, Simplygon Connect has issues with UE4; if you want to use Simplygon, you have to either buy the full SDK or wait for M$ to fix it.

Wait, do I need to have Simplygon external from UE to run the demo? That’s really stupid. I thought UE had its own LODing tools, or it had Simplygon built in or something.

HLOD was not used in Kite demo.

I’m guessing you are actually rendering thread bound - unfortunately when you do a ProfileGPU while RT bound, huge timings show up in nonsensical places. 30ms GPU base pass doesn’t make any sense.

‘stat dumpframe’ will show you what the rendering thread is spending time on.
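For anyone unsure how to tell which part is the bottleneck: ‘stat unit’ reports separate Game, Draw (rendering thread) and GPU times, and the largest one is usually the limiting factor. A rough sketch of that comparison is below. The function name and the sample numbers are hypothetical, not taken from this thread, and GPU timings can be misleading when the rendering thread is the real limit, as noted above.

```python
def likely_bottleneck(game_ms, draw_ms, gpu_ms):
    """Return which of the three timings is largest.

    This is only a first guess: the largest timing is usually the
    limiting one, but it is not a substitute for a proper profile.
    """
    timings = {
        "game thread": game_ms,
        "rendering thread": draw_ms,
        "GPU": gpu_ms,
    }
    return max(timings, key=timings.get)


# Hypothetical readings, for illustration only.
print(likely_bottleneck(game_ms=8.0, draw_ms=48.0, gpu_ms=45.0))
```

If the rendering thread comes out on top, ‘stat dumpframe’ is the next step, since it breaks down where that thread spends its time.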

Wow showed up.

Sorry for my wrong assumptions.

However, I’m getting even more interesting results. My specs are a GTX 1080 and an i7-6700K @ 4.5 GHz, and I get ~31 fps (probably capped) from the prebuilt version from the launcher. However, when I download the project (4.15) and make a shipping build, I get only ~5 fps in an open view :frowning:

I set all the scalability settings to low (using a DefaultGameUserSettings.ini) and the fps was not improved at all…

shows up to help little old me out. I feel honored. :stuck_out_tongue:

Anyways, I did a stat unit. This is the result (screenshot on Imgur).

Also, I did a dumpframe as well but have no idea how to interpret the result. I copied the output into a txt file and am attaching it here.

There’s an issue that causes very low fps in recent project builds of the Kite demo: UE-42748 on the Unreal Engine issue tracker.

I’m going to download 4.13 and see what my performance is there, since it sounds like the issue may have been introduced in 4.14. I’ll update with results.

After comparing the different versions of the Kite demo, I found that the LOD settings of a lot of trees (not all) in versions higher than 4.15 are broken: LOD0’s screen size is 1.0, while all the others are 0.0, so the lower-detail LODs are never selected.
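To show why those values matter, here is a simplified model of screen-size LOD selection. It is only a sketch and not engine code: it assumes each LOD becomes eligible once the mesh’s screen size drops to or below that LOD’s threshold, and that the last eligible LOD is used. The `select_lod` function and the "healthy" threshold values are made up for illustration.

```python
def select_lod(screen_size, thresholds):
    """Return the index of the LOD to draw under the simplified model.

    thresholds[i] is the screen size at or below which LOD i becomes
    eligible; thresholds are assumed to be in descending order, and
    the last eligible LOD wins.
    """
    chosen = 0
    for index, threshold in enumerate(thresholds):
        if screen_size <= threshold:
            chosen = index
    return chosen


healthy = [1.0, 0.5, 0.25, 0.1]  # hypothetical working settings
broken = [1.0, 0.0, 0.0, 0.0]    # the values described above

# Compare the LOD picked at a few screen sizes (large to small).
for size in (0.8, 0.3, 0.05):
    print(size, select_lod(size, healthy), select_lod(size, broken))
```

With the broken thresholds, every positive screen size stays on LOD0, so distant trees are drawn at full detail, which fits the very high cost seen in the profile.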

That’ll do it. I’ve seen people report that LOD settings stop working correctly after updating projects to newer versions.

Yes, that LOD setting screwup upon migration has bitten me in the *** as well.