@jvin1011 Yes, there are different runtimes with different platform support and also different format support. Unfortunately we do not have any runtime yet that supports the tflite file format. However, there are a number of model converters out there which are able to convert tflite to e.g. onnx (see here for an example), and then you could use the existing runtimes.
Of course it is always possible to use any other inference engine directly (also tflite) without going through NNE. The downside is that you have to integrate the libraries yourself, and if it does not run on all target platforms that you are interested in, you need to do the same for other runtimes, creating a huge fragmentation in your code. Or, e.g., if you want to use NPUs, you will need to add a runtime for each hardware vendor you plan to run on as well.
That is why we started with NNE: to provide you with a single API that lets you access all platforms the same way and is extensible to include future runtimes as well.
So long story short: I would try to export or convert your model to onnx and save yourself the pain of adding your own runtimes.
Hi Nico, I have exported my ONNX model in FP16 format. When I attempt model inference using the RDGHlsl backend, I encounter the following error.
[2025.05.20-06.53.23:601][ 0]LogNNERuntimeRDGHlsl: Warning: Input at index '0' (from template T0) is of type 'Half' which is not supported for that input.
[2025.05.20-06.53.23:601][ 0]LogNNERuntimeRDGHlsl: Warning: OperatorRegistry failed to validate operator: Slice
[2025.05.20-06.53.23:602][ 0]LogNNERuntimeRDGHlsl: Warning: Model validator 'RDGModel validator' detected an error.
[2025.05.20-06.53.23:602][ 0]LogNNERuntimeRDGHlsl: Warning: Model is not valid.
[2025.05.20-06.53.23:602][ 0]LogNNERuntimeRDGHlsl: Warning: Cannot create a model from the model data with id B67298DC456900C2B7797DA66DF4EA2F
[2025.05.20-06.53.24:736][ 0]LogTemp: Error: Could not create the RDG model
Is FP16 inference not supported in NNE, or is it specifically unsupported with the RDGHlsl backend? The same model, when exported in FP32, works fine with the RDGHlsl backend, but the performance is a little slow, so I'm exploring ways to optimize and improve it.
Yes, fp16 support with the HLSL runtime is still limited. Which engine version are you using?
If you work on a DirectX based system you can use the runtime NNERuntimeORTDml, where you have a high chance of getting the model running. Also, depending on the model, DirectML can access tensor cores, giving you an additional boost.
This may not be suited for your final product if you aim for multiple target platforms, but at least it will help you assess the performance of your model.
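Getting hold of that runtime follows the same pattern as the other NNE runtimes. A minimal sketch, assuming NNERuntimeORTDml is exposed through the GPU runtime interface and the plugin is enabled:

#include "NNE.h"
#include "NNERuntimeGPU.h"

// Ask NNE for the DirectML-backed runtime by name.
// This is only expected to succeed on Windows / DirectX based systems.
TWeakInterfacePtr<INNERuntimeGPU> DmlRuntime =
	UE::NNE::GetRuntime<INNERuntimeGPU>(TEXT("NNERuntimeORTDml"));

if (!DmlRuntime.IsValid())
{
	UE_LOG(LogTemp, Warning, TEXT("NNERuntimeORTDml is not available on this system."));
}

From there, model creation and inference work the same way as with the CPU runtime, just through the GPU interfaces.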
I am using version 5.5.1 of Unreal Engine. Since I am running on Windows, I initially tried using DirectX support, but it didn't work for me, which is why I'm working with the HLSL runtime. Overall, the HLSL runtime works fine, but I'm looking for ways to optimize it. Any suggestions would be appreciated.
@jvin1011 apologies for the late reply. I think your best chance is to try to reduce the model size, sorry. Alternatively, try to get DirectML running; it should work if you are on a DirectX based system.
@gabaly92 We spent a lot of time in this release on NNERuntimeIREE. However, it is still a work in progress and needs some expertise on how to adapt the model to get it running. But it shows great performance on CPU for small real-time models due to its low overhead compared to other runtimes.
Hi,
I'm trying to use a model with multiple outputs in Unreal 5.5 with NNERuntimeCPU, but I can't get it to work.
How do I set up the tensor shapes and output data properly for multiple outputs, and how do I get the data in the end? Do I then use CPUOutputBindings[0].Data, CPUOutputBindings[1].Data and CPUOutputBindings[2].Data?
The model is SuperPoint LightGlue with input (2B, 1, H, W) and outputs (2B, 1024, 2), (M, 3), (M,). I used it successfully in the regular ONNXRuntime, but I can't set it up in NNE. Can someone help me?
You can try the plugin example I released earlier, which includes the CPU deployment of a multi-output model. Although it was written for 5.3 and 5.4, I remember the 5.4 version running fine in 5.5. Even if changes are needed, they should be very easy to make, since I don't recall any modifications being necessary for 5.5. You can simply change the plugin version number and try running it. The source code clearly demonstrates how to deploy a multi-input and multi-output model.
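For orientation, here is a rough sketch of what such a setup looks like with the NNE CPU interface, following the quick start pattern. ModelData, the concrete shapes and the float output types are placeholders; query the real descriptors from your own model.

#include "NNE.h"
#include "NNEModelData.h"
#include "NNERuntimeCPU.h"

// ModelData is a TObjectPtr<UNNEModelData> pointing at your imported .onnx asset.
TWeakInterfacePtr<INNERuntimeCPU> Runtime = UE::NNE::GetRuntime<INNERuntimeCPU>(TEXT("NNERuntimeORTCpu"));
TSharedPtr<UE::NNE::IModelCPU> Model = Runtime->CreateModelCPU(ModelData);
TSharedPtr<UE::NNE::IModelInstanceCPU> Instance = Model->CreateModelInstanceCPU();
// (validity and status checks omitted for brevity)

// Fix the input shape, e.g. (2, 1, H, W) with concrete H and W.
TArray<uint32> InputDims = { 2, 1, 480, 640 };
TArray<UE::NNE::FTensorShape> InputShapes = { UE::NNE::FTensorShape::Make(InputDims) };
Instance->SetInputTensorShapes(InputShapes);

TArray<float> InputData;
InputData.SetNumZeroed(InputShapes[0].Volume());
TArray<UE::NNE::FTensorBindingCPU> InputBindings;
InputBindings.Add(UE::NNE::FTensorBindingCPU{ InputData.GetData(), InputData.Num() * sizeof(float) });

// One binding per output, in the order reported by the model.
// This assumes float outputs; check GetOutputTensorDescs() for the real element types.
TConstArrayView<UE::NNE::FTensorShape> OutputShapes = Instance->GetOutputTensorShapes();
TArray<TArray<float>> OutputData;
TArray<UE::NNE::FTensorBindingCPU> OutputBindings;
OutputData.SetNum(OutputShapes.Num());
for (int32 i = 0; i < OutputShapes.Num(); ++i)
{
	OutputData[i].SetNumZeroed(OutputShapes[i].Volume());
	OutputBindings.Add(UE::NNE::FTensorBindingCPU{ OutputData[i].GetData(), OutputData[i].Num() * sizeof(float) });
}

Instance->RunSync(InputBindings, OutputBindings);
// The results then sit in OutputData[0..2], i.e. behind OutputBindings[0].Data,
// OutputBindings[1].Data and OutputBindings[2].Data, as you suspected.

Note that this only works if the output shapes can be resolved after fixing the input shapes; if an output dimension depends on the data itself (like your M), see the discussion further down.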
Actually, I have one more problem. My output size is dynamic too; it depends on the number of matches. When I try to run inference a second time, the output size changes and I get this error: LogNNERuntimeORT: Error: Non-zero status code returned while running Reshape node. Name:'/matcher/Reshape_3' Status Message: D:\a\_work\1\s\onnxruntime\core\framework\execution_frame.cc:173 onnxruntime::IExecutionFrame::GetOrCreateNodeOutputMLValue shape && tensor.Shape() == *shape was false. OrtValue shape verification failed. Current shape:{743} Requested shape:{1009}
Is dynamic output size not supported or is it a skill issue?
As far as I know, when exporting an onnx model you can mark the dynamic axes of the inputs and outputs, such as the batch size. I'm not sure whether you have done that, but in any case this looks like an error from onnxruntime rather than from NNE itself.
However, even if you have an onnx file that supports dynamically sized outputs, NNE very likely does not support it.
Therefore, I suggest using a "mask" to mark the valid data region in the input/output and setting a fixed maximum size. This lets NNE bind fixed-size input/output buffers, which helps avoid strange problems. It is the simplest and most effective solution to your problem; see the sketch below.
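As a rough illustration of that idea (the MaxMatches bound and the extra "valid count" output are assumptions about how the model would be re-exported, not something the current export already has):

// Fixed upper bound baked into the re-exported model: outputs are padded to MaxMatches rows.
constexpr int32 MaxMatches = 2048;

TArray<float> Matches;            // padded (MaxMatches, 3) output
Matches.SetNumZeroed(MaxMatches * 3);
TArray<float> ValidCount;         // hypothetical 1-element output holding the number of real matches
ValidCount.SetNumZeroed(1);

TArray<UE::NNE::FTensorBindingCPU> OutputBindings;
OutputBindings.Add(UE::NNE::FTensorBindingCPU{ Matches.GetData(), Matches.Num() * sizeof(float) });
OutputBindings.Add(UE::NNE::FTensorBindingCPU{ ValidCount.GetData(), ValidCount.Num() * sizeof(float) });

// ... RunSync(...) as before, then only consume the valid prefix and ignore the padding:
const int32 NumValid = FMath::Min(MaxMatches, (int32)ValidCount[0]);
for (int32 i = 0; i < NumValid; ++i)
{
	// Matches[i * 3 + 0 .. i * 3 + 2] is a real match; rows beyond NumValid are padding.
}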
Dynamic shape support is limited neither by NNE nor by the ONNX file format, but by the runtime you are using. (Both NNE and ONNX just define dynamic shapes by symbols; it is up to the runtime to interpret/fill them.)
However, there are two kinds of 'dynamic shapes':
One is when the output shape depends on the shape of your input (e.g. changing the batch size of the input changes the batch size of the output), which is supported by the Onnxruntime that you are using.
The other is when the output shape depends on the content of your input, which seems to be the case in your model. This is typically more difficult for runtimes to handle, and thus @HelloJXY's suggestion is the most robust thing you can do.
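In code, the difference shows up roughly like this (a sketch reusing the model instance from the earlier example):

// Kind 1: the output shape depends only on the input shape. After fixing the input
// shapes, the runtime can report concrete output shapes, so buffers can be sized up front.
TArray<uint32> Dims = { 4, 1, 480, 640 };   // e.g. pick a batch size of 4
TArray<UE::NNE::FTensorShape> Shapes = { UE::NNE::FTensorShape::Make(Dims) };
Instance->SetInputTensorShapes(Shapes);
TConstArrayView<UE::NNE::FTensorShape> OutShapes = Instance->GetOutputTensorShapes();

// Kind 2: the output shape depends on the input content (here, the number of matches M).
// No runtime can know M before executing the model, hence the fixed-maximum-size workaround above.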