[Questions] Performance issues on a high end PC/UE3 and Unity Comparisons

I was wondering what the plans for performance are. I am having great difficulty getting acceptable performance on my PC, which is pretty much the most powerful PC you can reasonably expect gamers to have (Win7 64-bit, 32 GB RAM, GTX 780 Overclocked edition [almost on par with a Titan in performance], 4 GHz quad-core Intel, latest 3D card drivers).

I ran a comparison between UE3 and UE4. Same scene, same meshes, same lighting (a dominant directional light with no cascaded shadow maps in UE3 vs a Stationary directional light with no cascaded shadow maps in UE4). The FPS difference is huge: UE4 delivers nearly half of what UE3 does.

In Solus my FPS plummets. I think it is a combination of a few features rather than one specific slow feature, but the whole thing is still very slow. I tried building it in the most optimized way, and I have a powerful PC, but I can barely get over 30 FPS, if even that.

This scene renders at 30 FPS at best. It has only 2900 draw calls in view and no particles at all. The materials are only of medium complexity, the kind you’d expect to be able to use in next gen. The foliage was set to 200 instance clusters and casts no shadows (that actually saved a lot of FPS; it was much worse at first). Foliage cull distances are set to a quite close range. The sunlight is a movable directional light with 3 cascaded shadow maps. However, the fade-out range is only 6000 rather than the default 20,000. At 6000 it is quite noticeable, but at least it saves a few FPS. It is still very slow, however.

If you look at the primitive stats I don’t really have that much going on. I do not have LODs on meshes, which is something I intend to add later, but still. It should be able to manage that on a PC this powerful and a level this tiny.

The level is really small; what you see is all there is to it. What happens if I build serious UE3-size levels?

Replacing almost everything with a default material and removing all foliage still only results in 31 FPS.

At this point I have almost no materials, no foliage, no particles, only three translucent surfaces in the entire level, just one directional light and one stationary point light, no post process blendables, no reflection capture actors, Gaussian blur only, and a landscape that LODs heavily and was made at a quite small size. And yet I still only get 31 FPS.

We discussed this on the Rocket IRC channel the other day and someone ran Unity comparisons:

Rocket
[16:00:19] [Raven67854]: 343 lights all set to moveable FPS is at 38 FPS
[16:00:23] [Raven67854]: no shadows

Unity
[16:07:26] [Raven67854]: with 1024 spotlights in Unity3D looking at the entire scene
[16:07:32] [Raven67854]: no shadows casting I get 41 FPS
[16:12:40] [Raven67854]: okay so inside of Unity3D I set all postFX on. AO/AA/Bloom/Motion Blur/Sun Shafts(not used but maybe it will hit performance)/DOF and HDR
[16:12:47] [Raven67854]: Same FPS actually

I don’t know how accurate his test was, and I am sure there are more values and parameters involved, but the point remains that it just feels slow. I feel like I need to go to great lengths to get it anywhere near OK performance-wise. Things like lit particles are fancy, but at the moment I am trying to make everything as simple and optimized as possible, and I just cannot consider using many of the new features because of performance. Kind of defeats the point of having all of it…

What is the current status on optimization and what are the plans?

It should be noted that I have two 2560 monitors, and that also appears to have an impact. I run 1920 playable windows on them though.
I just set up the PC for my colleague, and when we played the game full screen (1920) on that one with one monitor attached we hit a constant 30 FPS. With two monitors it is 20 FPS. It seems two monitors have a major impact as well.

Not sure if a general hardware thing or Rocket specific.

Perhaps Epic could release a benchmark for both UDK and Rocket so we can all give it a run and pass the collected data on to Epic for proper analysis. This could include DX9 vs DX11 in UDK and DX11 in Rocket, to see if there’s an issue in the DX11 implementation that wasn’t there previously. Of course not everything is directly comparable, but it might narrow things down.

IDK if this helps, but I ran the GPU profiler on one of my scenes that tanked down to 9 FPS (!) in the latest build and narrowed the issue down to the BokehDOF post process (it took almost 60 ms). Maybe you could take a closer look at your post processing.

Changing it from Bokeh to Gaussian improved my framerate from 9 to about 30.
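
For comparing FPS numbers against the per-pass millisecond timings the GPU profiler reports, it helps to convert FPS to frame time (1000 / FPS). A minimal Python sketch of that arithmetic, using the figures from this post (the function name is just for illustration):

```python
def frame_time_ms(fps: float) -> float:
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps

before = frame_time_ms(9)    # frame time at 9 FPS
after = frame_time_ms(30)    # frame time at about 30 FPS
saved = before - after       # time removed from each frame

print(f"{before:.1f} ms -> {after:.1f} ms, saved {saved:.1f} ms")
# prints "111.1 ms -> 33.3 ms, saved 77.8 ms"
```

Going from 9 to 30 FPS removes roughly 78 ms from every frame, which is of the same order as the almost 60 ms the profiler attributed to BokehDOF.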

Comparing UE4 to UE3 directly isn’t entirely fair. By default in UE4 you are getting a number of additional features and greatly improved shading.

That said, you are GPU bound. We have a GPU profiler that will give you more information on where you are spending your time.

If you run the standalone game, press Ctrl+Shift+Comma or open the console (~) and run the profilegpu command. You’ll get a GUI with a hierarchical breakdown of where GPU time is going. That information is also dumped to the log. Hopefully that will give you some pointers on where you’re spending time. If you attach the log I can take a look at it and maybe provide some tips.

ProfileGPU is very nice. Works well. It turned out that the two heaviest things to render were my foliage, as expected (any plans there? I remember that a few months ago instanced meshes did not allow lightmapping; could that help FPS some?), and actually my cubemap capture. I have a quite small 256x256 cubemap capture actor in there for the ocean, and that seemed to kill the FPS.

At 64x64 my performance is much better.

Thanks.

I don’t see the profiling results in that log. Maybe it was another file?

A dynamic cube map component will cost a huge amount of performance. It means you are rendering the scene seven times per frame (once per face plus the final render). You might consider using a combination of Reflection Environment and screen-space reflections (if your ocean is rendered as opaque).
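
To put rough numbers on that, here is a small Python sketch. It assumes a cube capture renders six square faces at the capture resolution and ignores any per-pass overhead, so treat it as a back-of-the-envelope estimate rather than a model of the renderer. The function names are made up for illustration.

```python
CUBE_FACES = 6  # a cube map has six faces

def scene_renders_per_frame(dynamic_cube_captures: int) -> int:
    """Final render plus one scene render per cube face per capture."""
    return 1 + CUBE_FACES * dynamic_cube_captures

def cube_capture_pixels(face_size: int) -> int:
    """Pixels written per capture, assuming square faces."""
    return CUBE_FACES * face_size * face_size

print(scene_renders_per_frame(1))   # 7
print(cube_capture_pixels(256))     # 393216
print(cube_capture_pixels(64))      # 24576
print(cube_capture_pixels(256) // cube_capture_pixels(64))  # 16
```

Dropping the capture from 256x256 to 64x64 cuts the captured pixels by a factor of 16, but the number of scene renders per frame stays at seven.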

Another option is to do a single 2D scene capture for planar reflections.

As for foliage, it is something we hope to improve but we don’t have a timeline for when we’ll work on it right now.

Err yeah didn’t intend to attach it, since I had found the problem already myself.

I tried a plane capture but it didn’t look very good. I also couldn’t get reflections to work because the water is translucent. That is another major problem I hit: making nice water without any real specular or reflection to hook into is hard. Any plans there?

Thanks.

One other option you might consider: disable bCaptureEveryFrame on your cube map component. That will capture the scene once when the map is loaded. As long as you’re ok with the ocean reflecting the static scene you should get the same look but not pay the cost of updating it every frame.

Thanks, but I have a dynamic day and night cycle, so it would be obvious if it’s static.

One thing I was thinking though is that it is currently possible to set a max capture distance. Would it be possible to add a min capture distance to it? That way I would be able to make it forget about the level, and only capture the far away sky, which is the only thing I really need and which would prevent rendering the whole map again?

I’ll add it to our list of feature requests. Alternatively, maybe an actor filter so you could ask it to capture only the skybox and maybe some vista assets.
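
Purely as an illustration of the idea (this is not engine code, and every name and value here is hypothetical), a capture filter combining a min/max distance with an explicit "always capture" flag for vista assets might look something like this in Python:

```python
from dataclasses import dataclass

@dataclass
class Actor:
    # Hypothetical stand-in for a scene object; not an engine type.
    name: str
    distance: float          # distance from the capture point
    is_vista: bool = False   # explicitly marked to always be captured

def select_for_capture(actors, min_distance=0.0,
                       max_distance=float("inf"), include_vista=True):
    """Keep actors inside [min_distance, max_distance], plus vista actors."""
    selected = []
    for actor in actors:
        in_range = min_distance <= actor.distance <= max_distance
        if in_range or (include_vista and actor.is_vista):
            selected.append(actor)
    return selected

# Made-up scene: nearby level geometry plus far-away sky and vista.
scene = [
    Actor("rock", 500.0),
    Actor("tree", 3000.0),
    Actor("mountain_vista", 40000.0, is_vista=True),
    Actor("skybox", 100000.0),
]
captured = select_for_capture(scene, min_distance=50000.0)
print([a.name for a in captured])  # ['mountain_vista', 'skybox']
```

With a min distance set, the nearby level geometry is skipped and only the sky and the flagged vista asset would be rendered into the capture.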

Does this help a bit? → https://rocket.unrealengine.com/questions/7355/reflectionsspecularity-on-translucent-materials.html

I checked beta 5 and TLM_Surface is there now.

TLM_Surface is the intended way to render water, glass, etc. It still needs improvement but it will use reflection environment on a per-object basis to render reflections.