Hello!
I’ve been trying to implement neural network inference as part of the Render Graph, basically applying inference to each frame of the renderer. I have mostly been referencing [microsoft/OnnxRuntime-UnrealEngine](https://github.com/microsoft/OnnxRuntime-UnrealEngine) (Apply a Style Transfer Neural Network in real time with Unreal Engine 5 leveraging ONNX Runtime), which covers the basic implementation and registers a view extension that adds a style pass on the render thread.
The process itself is divided into the following steps:
- The Texture2D is fetched through RHI:
  `FRHITexture2D* Texture = SourceTexture->GetRHI()->GetTexture2D();`
- The texture is copied from GPU to CPU, resized, and converted to a float array
- That array is moved into an input tensor
- Inference is performed on the GPU; ONNX Runtime internally moves the input and output tensors to and from the GPU, respectively
- The output is resized back to the input image dimensions and converted to a byte array
- The output is copied from CPU back to GPU
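To make the CPU-side conversion steps concrete, here is a minimal, self-contained sketch of the byte-to-float round trip (the "converted to float array" and "converted to byte array" steps above). It assumes packed 8-bit pixel data normalized to [0, 1]; the function names are illustrative, not from the sample plugin, and a real model may expect a different layout (e.g. NCHW) or normalization:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Convert packed 8-bit pixel data to a flat float array in [0, 1].
// (Layout and normalization depend on the model's input spec.)
std::vector<float> PixelsToFloats(const std::vector<uint8_t>& Pixels)
{
    std::vector<float> Out(Pixels.size());
    for (size_t i = 0; i < Pixels.size(); ++i)
        Out[i] = Pixels[i] / 255.0f;
    return Out;
}

// Convert model output floats back to 8-bit pixels,
// rounding to nearest and clamping to [0, 255].
std::vector<uint8_t> FloatsToPixels(const std::vector<float>& In)
{
    std::vector<uint8_t> Out(In.size());
    for (size_t i = 0; i < In.size(); ++i)
    {
        float v = In[i] * 255.0f + 0.5f; // round to nearest
        if (v < 0.0f)   v = 0.0f;
        if (v > 255.0f) v = 255.0f;
        Out[i] = static_cast<uint8_t>(v);
    }
    return Out;
}
```

Even when these loops are fast, it's the GPU→CPU readback before them and the CPU→GPU upload after them that dominate the frame cost.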
The problem I am having is that this whole process takes about 50 ms per frame, tanking the FPS, while the inference itself takes less than 10 ms.
My question is: Is there a way to copy the data directly into a GPU tensor and use it as is, avoiding the preprocessing and the expensive CPU <-> GPU transfers? And is there a way to do the same for the output?
Thanks in advance!