Course: Neural Network Engine (NNE)

In this course you will learn everything needed to empower your game with AI using Unreal Engine’s neural network engine NNE.


Would there be a way to utilize this to train a character’s animation based on audio? So if I gave it new audio, it would create something new in real time?


Hi tomhalpin8,

The NNE plugin does not support training. Typically, a network is trained in an external environment, e.g. Python with PyTorch, and then exported as .onnx to be imported as an asset and used inside NNE for inference.

But then, yes, turning audio into animation is certainly possible. Check out the ‘Hellblade actor demos MetaHuman Animator’ from GDC 2023 where audio is used to create tongue animation.

I love the idea to be able to use neural networks for game logic.

Question here: Could you add an example for async processing of the model?
Other question: I like that you include BP as an option, but I fail to stay in BP for some steps, e.g. passing a screenshot to the model. Any chance to include some helper functions there?

Hi Nico,

I used your example to try out a Yolo 2 model (models/vision/object_detection_segmentation/tiny-yolov2 at main · onnx/models · GitHub) but the function GetFromDDC fails with it - the runtime name is “NNERuntimeORTCpu”. Is that a problem of the model or am I doing something wrong?
Other question: It seems only CPU and sync execution are supported instead of GPU plus async - is that planned for the future to make it less impactful on performance?

Hey Drstreit,

In the tutorial, the returned unique pointer that you get from CreateModelCPU (indicating that the caller becomes the owner of the model) is converted to a shared pointer.
The shared pointer can be passed to any thread / task for execution, e.g. by using AsyncTask.
For instance, you can pass the model in the capture list of the lambda function, call RunSync inside it (which is then carried out on a separate thread), and push the results back to the game thread with another AsyncTask launched to ENamedThreads::GameThread.

There are two difficulties you need to take care of:

  1. The memory that is passed as input / output to the model must stay valid throughout execution. You can achieve this easily, e.g. by copying the data to the thread and back (at the cost of efficiency due to the memory copies).
  2. You must make sure the model has finished evaluation before you stop the game (if the game crashes when you quit, it usually means that some model was still running).
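To make the pattern concrete, here is a minimal sketch in standard C++ so it can stand alone; FakeModel and RunModelAsync are stand-ins for illustration only. In Unreal you would capture your shared NNE model pointer in an AsyncTask lambda, call RunSync inside it, and dispatch the results with another AsyncTask launched to ENamedThreads::GameThread.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <thread>
#include <vector>

// Stand-in for an NNE model: offers a blocking Run, like RunSync.
struct FakeModel {
    std::vector<float> Run(const std::vector<float>& Input) const {
        std::vector<float> Out(Input.size());
        for (std::size_t i = 0; i < Input.size(); ++i) {
            Out[i] = Input[i] * 2.0f; // dummy "inference"
        }
        return Out;
    }
};

// Mirrors the AsyncTask pattern: capture the shared model and a *copy* of
// the input in the lambda (point 1 above: the memory must stay valid), run
// on a worker thread, then hand the results to a "game thread" callback.
void RunModelAsync(std::shared_ptr<FakeModel> Model,
                   std::vector<float> Input, // copied in, so it stays valid
                   std::function<void(std::vector<float>)> OnGameThread) {
    std::thread([Model, Input = std::move(Input), OnGameThread] {
        std::vector<float> Output = Model->Run(Input); // blocking "RunSync"
        OnGameThread(std::move(Output)); // e.g. AsyncTask(GameThread, ...)
    }).join(); // joining also illustrates point 2: wait for the model
               // to finish before tearing anything down
}
```

In a real game you would of course not join immediately on the launching thread; the join here just keeps the sketch deterministic and shows that *someone* must fence on model completion before shutdown.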

I hope that helped?

I am not sure what you mean by ‘include BP as option’. The tutorial wraps the NNE models in Blueprint functions as an illustrative example, but of course you can call everything from any C++ code you want.
If you want to work on screenshots, you may want to look into FSceneViewExtensionBase and use an RDG runtime to directly process a render target inside the render pipeline.

Does this answer your question?


Hi BastianDev,

If getting the model from the DDC fails, the model will be recooked and stored locally a little later. If you are not connected to a DDC, this will happen every time you restart the Editor (the DDC caches previous cooks to speed up development).
So it should not be a problem, just an optimization. Does the model cook/load afterwards? If not, do you see any warning in the log (you can filter by NNE)? It could be something different.

Regarding the other questions:

We leave the async execution of models up to the game logic as it may vary a lot between games. Check out the answer I gave to drstreit; you can basically pass a shared pointer to a model to an AsyncTask (or any other thread).

Regarding GPU, there are two ways to run things on GPU:

One is INNERuntimeGPU: you provide CPU memory, the model is evaluated on GPU, and you receive the results back in CPU memory.
As you can imagine, due to CPU/GPU sync, this is not very performant and thus only available in Editor (e.g. if you want to run ML asset actions).

The other one is INNERuntimeRDG, where you can embed a neural network into the render graph and thus make it part of the render pipeline.
This is still work in progress and thus not yet well documented. But please feel free to start playing with it. I am happy to answer questions if you encounter any.


Thank you - long way to go for me until I really understand it all - e.g. it took me a bit until I got an in-game screenshot and converted it to an array of floats to feed into a tensor. The more examples, the merrier; it would be absolutely great to continue your BP/C++ tutorial series.
I foresee a future where we will not raycast everything to get a “detect whatever” reaction from an NPC, but will simply use models to understand and react to in-game situations. Prohibitively expensive today, maybe (for multiplayer, that is), but I guess not for much longer…


Yes, agreed, every example will help! We definitely want to continue with the tutorials, but need to find the right balance between making them simple yet correct (the difficulty for async) and not spending time on tutorials where things are about to change (the difficulty for GPU/RDG). But as the API becomes more stable, we will certainly make more, and hopefully more intuitive, tutorials.

Until then, I recommend looking into the NNE public API in source code (access required), which contains at least some hints. Also, you can get some inspiration from this tutorial; it is based on the old, deprecated plugin, but I am sure you can easily update it to use NNE and maybe even the GPU or RDG interface.

I fully agree on ML playing a bigger role in games in the future. Please also check out the Learning Agents, which also leverage AI.

Thank you for the tutorials. In the tutorials, “It supports both CPU and GPU inference on desktop machines and on some consoles.” was mentioned. I am wondering which consoles are currently supported and if there are any plans to support both “INNERuntimeCPU” and “INNERuntimeRDG” on PS5 or Xbox.

Hello, Thanks for the tutorial.
The only models that seem to work are the ones suggested in the NNI tutorial.
I tried to import Roberta and GPT2, but their shapes seem to have a dynamic dimension.

GetInputShape() returns an array with -1 values.
CreateTensor() with Shape[i] < 1 returns false (can’t create tensor).
I tried to replace them with 1 but it crashes the engine.

Does this mean that the plugin is not finished and/or functional?
How do you handle SymbolicTensorShape?


Thank you for this, it is very insightful.

I was able to get the course completed with a model loaded and syncing.

The model that I loaded is a large language model from HuggingFace.

As my understanding of C++ is lacking, I am not sure how to set up the NNE to receive text input.

I assume it would be by changing the input variables to receive a string instead of a shape.

It appears we need to have a tokenizer for the model set up within the project in order to convert the input string into tensor data, and the output from tensor data back to a string.
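To illustrate that pre/post-processing step, here is a toy sketch in standard C++: a naive whitespace tokenizer, nothing like the real subword (BPE) schemes that models such as GPT-2 use, but it shows the plumbing, which is the same in any case - text in, integer ids for the input tensor out, and output ids back to text.

```cpp
#include <cstdint>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Toy whitespace tokenizer for illustration only. Real LLM tokenizers use
// subword vocabularies, but the tensor plumbing looks the same.
struct ToyTokenizer {
    std::map<std::string, int64_t> Vocab; // token -> id
    std::vector<std::string> Inverse;     // id -> token

    int64_t IdFor(const std::string& Token) {
        auto It = Vocab.find(Token);
        if (It != Vocab.end()) return It->second;
        const int64_t Id = static_cast<int64_t>(Inverse.size());
        Vocab[Token] = Id;
        Inverse.push_back(Token);
        return Id;
    }

    // Encode a string into the int64 ids an input tensor would hold.
    std::vector<int64_t> Encode(const std::string& Text) {
        std::vector<int64_t> Ids;
        std::istringstream Stream(Text);
        std::string Token;
        while (Stream >> Token) Ids.push_back(IdFor(Token));
        return Ids;
    }

    // Decode output ids back into text.
    std::string Decode(const std::vector<int64_t>& Ids) const {
        std::string Out;
        for (std::size_t i = 0; i < Ids.size(); ++i) {
            if (i) Out += ' ';
            Out += Inverse[static_cast<std::size_t>(Ids[i])];
        }
        return Out;
    }
};
```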

I think you always need to have shapes; they define how each tensor is laid out. However, for each model you have to gather info from the tensor descriptors to get the shape and type of your tensors, so you may need to create many different tensor types where only the data type changes, according to the available NNE tensor types.
You have to change the tutorial to support different tensor types at the same time, e.g. an int64 input tensor and a float output.

For me, Roberta seems to work with dynamic input / output tensors, but not GPT-2: it gives me an ONNX error about the requested output tensor shape being different from the one I provide, but the shape is exactly the one from the tensor descriptor, and I’ve checked it after the input is set.
I think there is an error somewhere.

About the tokenizer: yes, we will need to do pre/post-processing. The bad news is that tokenizers are not available in C++; they are in Python. A few people on the web provide old and approximate ports with dependencies for GPT-2, but for other models it could be harder.

At this point I think the plugin is useless for LLMs. Also, ONNX, even in Python, has its issues.
I could be wrong; AI is new to me and I know nothing in this domain :sweat_smile:.
If you want access to an LLM, there are other ways:
-Runtime Python plugin + ONNX or PyTorch etc. (slow, maybe inefficient)
-Create a GGML plugin (really new, could change a lot, maybe difficult)
-WebRequest and ChatGPT (paid, for you and/or your users)


Hello, thanks for the wonderful tutorial!

I would like to know if there is a specific reason for not supporting mobile platforms, or whether it is planned for the future? I’m very excited to try a few things, but without mobile support it’s kind of hard to push forward, because we are developing a cross-platform game and would like to have parity in our core features.

Thank you for the plugin and the tutorials. I have a couple questions regarding NNE:

  1. I assume NNE is replacing NNI, is that correct? (if that’s the case, a note in the code base would be super useful).

  2. NNE seems to be built on top of the ONNX Runtime. From what I could tell looking at the code base, Xbox consoles are supported; any idea about the PS5? (There were no mentions in the PS5 dev-specific forum.) The post mentions support for consoles and I’d love to know which ones.


Hi everyone and sorry for the late replies!

@jackchen03281992 :
We already support some of the consoles (everything WIP). Please check out your Engine/Platforms folder to see on which platforms the plugin is enabled (you will need to be a dev for that platform).

@grabthar :
In NNE (caution, NNE != NNI; NNI will be removed in 5.3), dynamic shapes are already supported in some runtimes (e.g. ORT CPU).
When you have loaded a model, you can access the tensor descriptors, which return symbolic tensor shapes containing a -1 for dimensions that are dynamic.
Before the first inference (and every time your input shape changes), you must call SetInputTensorShapes with the actual size of your input tensor (specifically, with every -1 replaced by the actual size), so that the model can run shape inference and potentially allocate the right amount of memory for intermediate tensors.
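To make the -1 substitution concrete, here is a small sketch in plain C++ (engine types left out; ConcretizeShape is a hypothetical helper, not part of the NNE API): it replaces each dynamic dimension with a concrete size and returns the element count you would use to size your buffers before calling SetInputTensorShapes.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Substitute concrete sizes for dynamic (-1) dimensions of a symbolic shape,
// consuming one entry of ConcreteSizes per dynamic dimension, and return the
// total element count so the caller can allocate the tensor buffer.
int64_t ConcretizeShape(std::vector<int64_t>& Shape,
                        const std::vector<int64_t>& ConcreteSizes) {
    std::size_t NextConcrete = 0;
    int64_t Volume = 1;
    for (int64_t& Dim : Shape) {
        if (Dim < 0) {
            Dim = ConcreteSizes[NextConcrete++]; // replace -1 with actual size
        }
        Volume *= Dim;
    }
    return Volume;
}
```

For example, a symbolic shape {1, -1, 768} with a sequence length of 12 becomes {1, 12, 768}, i.e. 9216 elements to allocate.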

@FallenNinja :
As pointed out by grabthar, you always need the shape (describing the size of the actual input). Text input would probably be a 1D input tensor. Unfortunately, NNE currently only supports floats as input and output types, thus text is not yet supported.
But we are working on it, please stay tuned for updates on this!

@dchamploo :
Indeed, mobile is not supported yet, but we certainly want to extend device support in the future, please stay tuned!

@matt :

  1. Yes, NNI will be gone soon and NNE is the way to go.
  2. Yes, NNE currently makes heavy use of (but is not limited to) the ONNX Runtime. Please check out my answer to jackchen03281992 for platform support.

I hope that helped, let me know if you have further questions!


Thanks @ranierin I was able to run our model on ORT CPU.
Things were a bit tricky since the model has dynamic shapes; thankfully, since our code is in C++, once we learned the NNE abstraction layers we got there decently quickly.

In case others are trying the same thing, here is a quick snippet of what I ended up doing:

In our class:

TArray<FNeuralNetworkTensor> Inputs;
TArray<FNeuralNetworkTensor> Outputs;

// Manually set the input shape and data.
Inputs.SetNum(1); // We know that our model only has 1 input tensor
Inputs[0].Shape = { 1, static_cast<int32>(NumSamples), 1 };
Inputs[0].Data.SetNum(NumSamples); // allocate the buffer before copying
// Copy the data into the input tensor so the model can process it
// (SampleData stands in for whatever float buffer holds your samples).
FMemory::Memcpy(Inputs[0].Data.GetData(), SampleData, NumSamples * sizeof(float));

Finally, if you have dynamic output shapes, don’t forget to manually set them and allocate their data. Here is an example where we know that all our output shapes have a dynamic dimension, but at that point we know what the size will be (as in the input tensor shape example, where we know the number of samples at that point).

    for (int32 i = 0; i < Outputs.Num(); i++) {
        TArray<int32> Shape = Model->GetOutputShape(i);
        if (Shape.Num() == 0) {
            continue; // no shape info for this output
        }
        Outputs[i].Shape = Shape;

        int32 Volume = 1;
        for (int32 j = 0; j < Shape.Num(); j++) {
            if (Shape[j] < 1) {
                // -1 means dynamic size, we need to use our calculated size
                Outputs[i].Shape[j] = CalculatedOutShapeSize;
                Volume *= CalculatedOutShapeSize;
            } else {
                Volume *= Shape[j];
            }
        }
        if (Volume > 0) {
            Outputs[i].Data.SetNum(Volume); // allocate the output buffer
        }
    }

In this example, it’s important to note that we set both the concrete shape sizes (for the dynamic dimensions) AND the data buffer size, which is then used to receive the model output.

Finally, don’t forget to call SetInputs as per the tutorial before calling RunSync (or whatever you implemented to run the model).

Model->SetInputs(Inputs); // needed so the input bindings are properly set
// get your data out of Outputs, Outputs[0].Data will have the data of our first output tensor.

So while there is some work needed to convert existing code (or learn how to use models), the fact that NNE handles all the dependency and runtime management is a huge help, thanks @ranierin and team!


@ranierin would you mind showing a basic example/snippet of code setting up tensors for an RDG runtime?

The API is quite different from using CPU/GPU runtimes and I couldn’t figure out how to create a FTensorBindingRDG, probably because I never used FRDGBufferRef and I apparently need to set render buffers for the inputs and outputs in order for the runtime to run the model. I have my model running on CPU and GPU runtimes, but I got stuck trying to create a RDG model.
I’m also curious to know the recommended way to run EnqueueRDG from the render thread.



Hey @mattaimonetti,

The workflow to setup and initialize a RDG model should be pretty similar to the CPU and GPU case that you summarized above very nicely.

However, creating in- and outputs and EnqueueRDG is quite different, as the RDG API is for a completely different use case:
While CPU and GPU inference is done completely outside the render loop, the RDG API is meant to embed a neural network into the render flow, e.g. to work on a resource that is used to render a frame (mesh, texture, render target, …).
Or in other words, you can think of the RDG model as a set of compute shaders that are added to the render graph, before it is launched to render the next frame.

A “simple” use case would be post processing: You could create a class inheriting from FSceneViewExtensionBase and override the PrePostProcessPass_RenderThread function.
The engine will call your function from the render thread and give you the colored frame inside FPostProcessingInputs which you can then process and write back.
Instead of enqueueing here a classic compute shader to process the frame, you would just EnqueueRDG your neural network doing the processing for you.

A challenge is to convert the texture (which could have any pixel format) into a tensor of type float. So you need a compute shader to convert the texture into an FRDGBufferRef, and another to convert the result back to a texture again.
You can create the intermediate buffers (in- and output for the network) with the following code:

FRDGBufferDesc InputBufferDesc = FRDGBufferDesc::CreateBufferDesc(sizeof(float), NeuralNetworkInputSize.X * NeuralNetworkInputSize.Y * 3);
FRDGBufferRef InputBuffer = GraphBuilder.CreateBuffer(InputBufferDesc, *FString("NeuralPostProcessing::InputBuffer"));

To set it as an input to the compute shader, you may need to create a UAV resource from it with

FRDGBufferUAVRef InputBufferUAV = GraphBuilder.CreateUAV(FRDGBufferUAVDesc(InputBuffer, PF_R32_FLOAT));

It is possible to create your own GraphBuilder from an RHI command list, but only between two frames. Then you could register your external buffers (or upload from the CPU) and enqueue the neural network to be processed outside the frames.

It will probably also help to read the UE docs about how the RDG pipeline works and how the resources behave.

I hope I could help, let me know if you have further questions.
