How is DX11 implemented in Unreal Engine?

Hey guys,

I’m working on a game in Unreal Engine 4.27, running on DX11. We see 20-30% CPU usage and about 40% GPU usage. However, when profiling with NVIDIA Nsight Systems I can see that the render thread is always maxed out and gets migrated between CPU cores from time to time. I’m not sure why Unreal is using just one thread to stage commands for the GPU. I thought Unreal used DX11 deferred contexts to record draw calls on multiple threads, which would spread the workload across cores and make efficient use of command lists. But my profiling stats clearly show that this is not happening.

So my questions are:

  1. Does Unreal 4.27 support only the immediate context on DX11, or does it implement deferred contexts as well?
  2. If it does implement deferred contexts, how can I enable them, or what do I need to do in my game so that the engine uses them automatically?
  3. If it doesn’t, what are my options for avoiding this single-thread bottleneck? We have around 7000 draw calls per frame (on average), the visibility commands (rhi cmdlist) entry under ‘stat gpu’ takes around 23 ms, and we are getting 14 FPS.
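
To put those numbers in perspective, here is a quick back-of-the-envelope sketch in Python. The function names are only mine for illustration, and it assumes the 23 ms is spread evenly over the 7000 draw calls, which is a simplification rather than something I have measured.

```python
def frame_time_ms(fps: float) -> float:
    """Total time per frame in milliseconds at a given frame rate."""
    return 1000.0 / fps


def per_draw_cost_us(cost_ms: float, draw_calls: int) -> float:
    """Average cost per draw call in microseconds.

    Assumes the cost is distributed evenly across draw calls,
    which is a simplification for rough estimates only.
    """
    return cost_ms * 1000.0 / draw_calls


def share_of_frame(cost_ms: float, fps: float) -> float:
    """Fraction of the frame taken by one entry (0.0 to 1.0)."""
    return cost_ms / frame_time_ms(fps)


if __name__ == "__main__":
    fps = 14.0           # observed frame rate
    rhi_ms = 23.0        # 'visibility commands (rhi cmdlist)' from stat gpu
    draw_calls = 7000    # average draw calls per frame

    print(f"frame time:      {frame_time_ms(fps):.1f} ms")
    print(f"per draw call:   {per_draw_cost_us(rhi_ms, draw_calls):.2f} us")
    print(f"share of frame:  {share_of_frame(rhi_ms, fps) * 100:.0f} %")
```

So the 23 ms entry accounts for roughly a third of a frame that takes about 71 ms in total, which means the rest of the frame time is going somewhere else as well.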

Right after this, we tested the game with DirectX 12, just by enabling it in the Project Settings, and the FPS dropped to less than 1! Watching the CPU cores closely, the bottleneck was even bigger than before.

I thought that Unreal Engine 4.27 was designed to make much better use of CPU threads on multicore CPUs under DX12, but based on my findings I am not seeing any of that. I’m not sure whether this is an architectural issue: perhaps Unreal 4 was designed with DX11 in mind and couldn’t scale because of the radical changes that came with DX12 and Vulkan.

So any help on this would be just great!