Can I move certain processes/features from the GPU to the CPU?

I am seeing a high draw time on my GPU (18-21ms) and a low draw time on my CPU (~1-4ms).

Are there any features or functions that are computed on the GPU by default but can optionally run on the CPU instead? (Sort of like the new GPU morph target option, but the other way around.) Or are there ways to use particular features that would take some of the stress off the GPU and move it to the CPU?

My Machine’s Specs:

  • Unreal Engine 4.13.0

  • Windows 10 Professional 64-bit

  • 28GB DDR3 1333MHz RAM

  • Intel I7 3770k - 3.5GHz

  • MSI 4GB Radeon R9 290 - Underclocked by 15%.

Any advice would be greatly appreciated.

What you’re asking for is basically mashing hardware and software rendering together.
Short answer: no.

I was trying to think of a smart answer for a while, but no one has ever done this and it’d be a huge performance hole to even attempt it.
GPUs are good at performing certain pre-made tasks repeatedly (vertex- and polygon-based 3D graphics as well as shading), while CPUs are not and would have to grind through a lot of operations that a GPU would do in a snap.
It’d be like asking a child to move some furniture while a bodybuilder was standing right next to them (funnily enough, the metaphor also works in that you wouldn’t ask them both to do it at the same time).

You should probably optimize your scene and/or get a better GPU.

For rasterization, that’s true.
For vertex processing (transformation) you might be able to hoist that to the CPU, but Unreal doesn’t really have support for that.

However, you should do profiling to figure out what it is that is taking most of the time.
Run in a 640x480 window without anti-aliasing; how much faster does it run?
Replace all textures with a 4x4 gray texture; how much faster does it run?
If you have lots of ground cover or other small meshes, don’t draw them; how much faster does it run?

If it’s purely fill rate, then a smaller frame buffer is the only solution (or a faster graphics card :slight_smile:).
If it’s something else, such as scene read-back stalls, then you may be able to find other fixes.
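A rough way to read the resolution test above, sketched in Python. This assumes GPU frame time is roughly linear in pixel count (a fixed part plus a per-pixel part), which is a simplification. The function name and the 6.0 ms low-resolution figure are hypothetical; only the ~20 ms full-resolution figure comes from the original post.

```python
def split_frame_cost(pixels_hi, ms_hi, pixels_lo, ms_lo):
    """Fit ms = fixed + per_pixel * pixels through two measurements.

    Assumption: GPU time scales linearly with pixel count. Real frames
    only approximate this, so treat the result as a rough estimate.
    """
    per_pixel = (ms_hi - ms_lo) / (pixels_hi - pixels_lo)
    fixed = ms_hi - per_pixel * pixels_hi
    return fixed, per_pixel


full = 1920 * 1080   # resolution from the thread
small = 640 * 480    # suggested test window

# 20.0 ms is within the 18-21 ms reported above; 6.0 ms is a made-up
# example of what the 640x480 run might measure.
fixed_ms, per_pixel_ms = split_frame_cost(full, 20.0, small, 6.0)
resolution_bound_ms = per_pixel_ms * full

print(f"resolution-independent cost: {fixed_ms:.2f} ms")
print(f"resolution-dependent cost at 1080p: {resolution_bound_ms:.2f} ms")
```

If the resolution-dependent part dominates, the frame is likely bound by per-pixel work (fill rate, post-processing) rather than geometry or draw calls.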

@BMAliens: I am fully aware that there are massive differences between CPU and GPU processing architectures, and that they therefore require entirely different approaches in many respects, but they aren’t so fundamentally different that it’s impossible to use one to do some of the work of the other. Of course it requires significant changes in the code and in how the algorithms make use of the hardware, but it can still be a viable solution. I understand why you made that point, though, and I appreciate it nonetheless.

@jwatte: I’ve done extensive profiling and can rest easy that it is neither the number of draw calls nor texture bandwidth. Many of my static meshes rely on a small assortment of seamless albedo, roughness/metal/AO, and detail normal maps, used through material instances that parameterize Luo’s WorldAlignedTexture/Normal material functions, plus a base normal map UV-mapped to each static mesh and baked from the high-poly version in SP. None of these textures exceed 2k. I also changed the G-buffer to use 8-bit precision for normals instead of 16-bit. And all of the static meshes either use HISM in a blueprint or have been merged together for draw call reduction while maintaining effective occlusion culling bounds.

My texture streaming pool never strays above 300MB in indoor levels and 1100MB on outdoor levels. So as far as I know I should be alright given that my GPU has 4GB of GDDR5. Unless I’ve missed another crucial detail, which is certainly possible. And nearly everything is statically lit as well. Aside from that I’m only using a small number of dbuffer decals and only two (at the moment) particle effects that take up minimal on-screen pixels. (Maybe 1/16th of the actual screen @ 1920x1080 at the absolute most).

It seems for the most part that it’s an issue with Screen-Space Reflections. It usually takes somewhere around 4-8ms at 1920x1080, which seems a bit absurd, especially since I have it below the default settings. I’m not at my computer atm, but if I remember correctly I have it set to 100 intensity, 30 quality, and 0.4 roughness. I can’t seem to narrow down the issue beyond that. I would much prefer using Planar Reflections, but while they work great in some areas, they absolutely destroy the performance of larger scenes, and since “support global clipping plane” is a project-wide setting, I can’t turn it off per level. So it’s just not a practical option. And I can’t imagine any benefit from somehow moving the apparently heavy SSR workload onto the CPU.

So I am a bit stuck, because dynamic real-time reflections are so important for my project’s visual style, but at the same time the performance toll is downright absurd.

I think the reason it’s “slow” is that it forces a pipeline stall. You can’t read the screen space image until the screen space image is done, so this forces the card to wait for all the previous work to complete before it can start working on reflections.

The only other thing I can think of is dropping to 720p resolution and seeing if that improves things (it should).
Or buy a faster graphics card :slight_smile:

Is that so? I wasn’t aware of that. Do Planar Reflections also suffer from this? And are there any workarounds/fixes for the pipeline stall? And it’s not like I’m getting horrendous performance with the card. I just see a massive performance drop (i.e. 120fps down to ~90fps; while it’s not recommended to use fps for testing, I use it when I’m making small adjustments because it’s easier for me to remember fps than ms). But I’m afraid of what those with older hardware will endure. Then again, since I’m going to make it an option in the graphics menu, I guess it doesn’t matter all that much, does it?
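For reference, the fps figures above convert to frame time with simple arithmetic. A minimal sketch (the helper name is made up for illustration):

```python
def fps_to_ms(fps):
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps


before_ms = fps_to_ms(120)       # about 8.33 ms per frame
after_ms = fps_to_ms(90)         # about 11.11 ms per frame
cost_ms = after_ms - before_ms   # about 2.78 ms added per frame

# The same 25% fps drop from a lower baseline costs more milliseconds:
# 60 fps -> 45 fps adds about 5.56 ms per frame.
cost_low_ms = fps_to_ms(45) - fps_to_ms(60)

print(f"120 -> 90 fps adds {cost_ms:.2f} ms per frame")
print(f"60 -> 45 fps adds {cost_low_ms:.2f} ms per frame")
```

This is why ms is the safer unit for comparisons: an equal fps drop means a different amount of added work depending on where you start.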

I don’t think so. Instead, the problem there is that you have to draw everything reflected-visible twice – once, normal, and once, mirrored through the plane. I could be wrong about this, though – I haven’t read the code in detail.

Again, I haven’t read the code in detail here, but I would think that if there was a fix, that would already be used in the engine.

Does it? Only you can decide what you want your game to feel and look like :slight_smile: