Hello!
I’ve been trying to implement neural network inference as part of the Render Graph, basically applying inference to each frame of the renderer. I have mostly been referencing [microsoft/OnnxRuntime-UnrealEngine](https://github.com/microsoft/OnnxRuntime-UnrealEngine) (Apply a Style Transfer Neural Network in real time with Unreal Engine 5 leveraging ONNX Runtime), which covers the basic implementation and registers a view extension that adds a style pass on the render thread.
The process itself is divided into the following steps:
- The Texture2D is fetched through RHI:
  `FRHITexture2D* Texture = SourceTexture->GetRHI()->GetTexture2D();`
- The texture is copied from GPU to CPU, resized, and converted to a float array
- That array is moved into an input tensor
- Inference is performed on the GPU; ONNX Runtime internally moves the input and output tensors to and from the GPU, respectively
- The output is resized back to the input image dimensions and converted to a byte array
- The output is copied from CPU back to GPU
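To make the CPU-side conversion steps concrete, here is a minimal, self-contained sketch of the byte-to-float round trip (the "converted to float array" and "converted to byte array" steps above). It assumes packed 8-bit pixel data normalized to [0, 1]; the function names are illustrative, not from the sample plugin, and a real model may expect a different layout (e.g. NCHW) or normalization:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Convert packed 8-bit pixel data to a flat float array in [0, 1].
// (Layout and normalization depend on the model's input spec.)
std::vector<float> PixelsToFloats(const std::vector<uint8_t>& Pixels)
{
    std::vector<float> Out(Pixels.size());
    for (size_t i = 0; i < Pixels.size(); ++i)
        Out[i] = Pixels[i] / 255.0f;
    return Out;
}

// Convert model output floats back to 8-bit pixels,
// rounding to nearest and clamping to [0, 255].
std::vector<uint8_t> FloatsToPixels(const std::vector<float>& In)
{
    std::vector<uint8_t> Out(In.size());
    for (size_t i = 0; i < In.size(); ++i)
    {
        float v = In[i] * 255.0f + 0.5f; // round to nearest
        if (v < 0.0f)   v = 0.0f;
        if (v > 255.0f) v = 255.0f;
        Out[i] = static_cast<uint8_t>(v);
    }
    return Out;
}
```

Even when these loops are fast, it's the GPU→CPU readback before them and the CPU→GPU upload after them that dominate the frame cost.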
The problem I am having is that this whole process takes about 50 ms per frame, tanking the FPS, while the inference itself takes less than 10 ms.
My question is: Is there a way to copy the data directly into a GPU tensor and use it as is, avoiding the preprocessing and the expensive CPU <-> GPU transfers? And is there a way to do the same for the output?
Thanks in advance!