Failed to load a model into Learning-Agents

Hello everyone,

I’m trying to replace the internally generated Critic network in UE5’s Learning-Agents plugin with an external ONNX model exported from PyTorch. I’ve made some modifications but am running into a crash and would appreciate any guidance.

My Goal:
To pass a UNNEModelData asset (created from a PyTorch-exported ONNX file) into the ULearningAgentsCritic setup process, so it uses my custom model instead of building its own MLP.

My Approach & Modifications (Code Below):

  1. I created modified versions of MakeCritic and SetupCritic (MyMakeCritic, MySetupCritic) in ULearningAgentsCritic. I added a new parameter UNNEModelData* MyModelData to pass the external model asset.

  2. In MySetupCritic, I modified the critical section to use my model data instead of the built-in FModelBuilder.

  3. In ULearningNeuralNetworkData, I created MyInit and MyUpdateNetwork functions to handle the external UNNEModelData passed in from MySetupCritic.

My Code:

  • LearningAgentsCritic.cpp
ULearningAgentsCritic* ULearningAgentsCritic::MyMakeCritic(
  ULearningAgentsManager*& InManager,
  ULearningAgentsInteractor*& InInteractor,
  ULearningAgentsPolicy*& InPolicy,
  UNNEModelData* MyModelData,
  TSubclassOf<ULearningAgentsCritic> Class,
  const FName Name,
  ULearningAgentsNeuralNetwork* CriticNeuralNetworkAsset,
  const bool bReinitializeCriticNetwork,
  const FLearningAgentsCriticSettings& CriticSettings,
  const int32 Seed)
{
  if (!InManager)
  {
    UE_LOG(LogLearning, Error, TEXT("MakeCritic: InManager is nullptr."));
    return nullptr;
  }

  if (!Class)
  {
	UE_LOG(LogLearning, Error, TEXT("MakeCritic: Class is nullptr."));
	return nullptr;
  }

  const FName UniqueName = MakeUniqueObjectName(InManager, Class, Name, EUniqueObjectNameOptions::GloballyUnique);

  ULearningAgentsCritic* Critic = NewObject<ULearningAgentsCritic>(InManager, Class, UniqueName);
  if (!Critic) { return nullptr; }

  Critic->MySetupCritic(
	InManager,
	InInteractor,
	InPolicy,
	MyModelData,
	CriticNeuralNetworkAsset,
	bReinitializeCriticNetwork,
	CriticSettings,
	Seed);

  return Critic->IsSetup() ? Critic : nullptr;

}

void ULearningAgentsCritic::MySetupCritic(
	ULearningAgentsManager*& InManager, 
	ULearningAgentsInteractor*& InInteractor, 
	ULearningAgentsPolicy*& InPolicy, 
	UNNEModelData* MyModelData,
	ULearningAgentsNeuralNetwork* CriticNeuralNetworkAsset, 
	const bool bReinitializeCriticNetwork, 
	const FLearningAgentsCriticSettings& CriticSettings, 
	const int32 Seed)
{

    //...
    //...same as ULearningAgentsCritic::SetupCritic...//
    //...

	if (!CriticNetwork->NeuralNetworkData || bReinitializeCriticNetwork)
	{
		// Create New Neural Network Asset

		if (!CriticNetwork->NeuralNetworkData)
		{
			CriticNetwork->NeuralNetworkData = NewObject<ULearningNeuralNetworkData>(CriticNetwork);
		}

		//UE::NNE::RuntimeBasic::FModelBuilder Builder(UE::Learning::Random::Int(Seed ^ 0x2610fc8f));

		//TArray<uint8> FileData;
		//uint32 CriticInputSize, CriticOutputSize;
		//Builder.WriteFileDataAndReset(FileData, CriticInputSize, CriticOutputSize,
		//	Builder.MakeMLP(
		//		ObservationEncodedVectorSize + MemoryStateSize,
		//		1,
		//		CriticSettings.HiddenLayerSize,
		//		CriticSettings.HiddenLayerNum + 2, // Add 2 to account for input and output layers
		//		UE::Learning::Agents::Critic::Private::GetBuilderActivationFunction(CriticSettings.ActivationFunction)));

		//check(CriticInputSize == ObservationEncodedVectorSize + MemoryStateSize);
		//check(CriticOutputSize == 1);

		//CriticNetwork->NeuralNetworkData->Init(CriticInputSize, CriticOutputSize, CriticCompatibilityHash, FileData);
		CriticNetwork->NeuralNetworkData->MyInit(ObservationEncodedVectorSize + MemoryStateSize, 1, CriticCompatibilityHash, MyModelData);  //Modified
		CriticNetwork->ForceMarkDirty();
	}

	// Create Critic Object
	/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////  <---- ERROR
	CriticObject = MakeShared<UE::Learning::FNeuralNetworkCritic>(
		Manager->GetMaxAgentNum(),
		ObservationEncodedVectorSize,
		MemoryStateSize,
		CriticNetwork->NeuralNetworkData->GetNetwork());

	Returns.SetNumUninitialized({ Manager->GetMaxAgentNum() });
	UE::Learning::Array::Set<1, float>(Returns, FLT_MAX);

	ReturnsIteration.SetNumUninitialized({ Manager->GetMaxAgentNum() });
	UE::Learning::Array::Set<1, uint64>(ReturnsIteration, INDEX_NONE);

	bIsSetup = true;

	Manager->AddListener(this);
}
  • LearningNeuralNetworkData.cpp
void ULearningNeuralNetworkData::MyInit(const int32 InInputSize, const int32 InOutputSize, const int32 InCompatibilityHash, UNNEModelData* InMyModelData)
{
	InputSize = InInputSize;
	OutputSize = InOutputSize;
	CompatibilityHash = InCompatibilityHash;
	ModelData = *InMyModelData;
	MyUpdateNetwork();
}


void ULearningNeuralNetworkData::MyUpdateNetwork()
{
	if (FileData.Num() > 0)
	{
		if (ensureMsgf(ModelData, TEXT("Could not find requested ModelData"))) {
		}

		if (!Network)
		{
			Network = MakeShared<UE::Learning::FNeuralNetwork>();
		}

		  ensureMsgf(FModuleManager::Get().LoadModule(TEXT("NNERuntimeBasicCpu")), TEXT("Unable to load module for NNE runtime NNERuntimeBasicCpu."));

		TWeakInterfacePtr<INNERuntimeCPU> RuntimeCPU = UE::NNE::GetRuntime<INNERuntimeCPU>(TEXT("NNERuntimeBasicCpu"));

		TSharedPtr<UE::NNE::IModelCPU> UpdatedModel = nullptr;

		if (ensureMsgf(RuntimeCPU.IsValid(), TEXT("Could not find requested NNE Runtime")))
		{
			UpdatedModel = RuntimeCPU->CreateModelCPU(ModelData);
		}

		// Compute the content hash

		//ContentHash = CityHash32((const char*)(ModelData->GetFileData()).GetData(), (ModelData->GetFileData()).Num());

		// If we are not in the editor we can now clear the FileData and FileType since these will be
		// using additional memory and we are not going to save this asset and so don't require them.

#if !WITH_EDITOR
		ModelData->ClearFileDataAndFileType();
#endif

		Network->UpdateModel(UpdatedModel, InputSize, OutputSize);
	}
}
  • onnx_generator.py
import warnings
import torch
import torch.nn as nn

warnings.filterwarnings("ignore", message=".*legacy TorchScript-based ONNX export.*")


class NeuralNetwork(nn.Module):
    def __init__(self, input_dim=66, hidden_dim=128, output_dim=1):
        super(NeuralNetwork, self).__init__()
        self.model = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.ELU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ELU(),
            nn.Linear(hidden_dim, output_dim)
        )

    def forward(self, x):
        return self.model(x)

model = NeuralNetwork(input_dim=66, hidden_dim=128, output_dim=1)
model.eval()

dummy_input = torch.randn(1, 66) #inputsize=2  memsize=64

torch.onnx.export(
    model,
    dummy_input,
    "dynamic_batch_model.onnx",
    export_params=True,
    opset_version=11,
    input_names=['input'],
    output_names=['output'],
    dynamic_axes={
        'input': {0: 'batch_size'},
        'output': {0: 'batch_size'}
    }
)

The Problem / Crash:

The code crashes when MySetupCritic tries to create the FNeuralNetworkCritic:

CriticObject = MakeShared<UE::Learning::FNeuralNetworkCritic>(
    Manager->GetMaxAgentNum(),
    ObservationEncodedVectorSize,
    MemoryStateSize,
    CriticNetwork->NeuralNetworkData->GetNetwork() // **CRASH HERE**
);

The crash occurs inside the TSharedPtr’s operator*, with a check(IsValid()) failure. This indicates that the TSharedPtr<UE::Learning::FNeuralNetwork> returned by GetNetwork() is null or invalid.

/**
 * Dereference operator returns a reference to the object this shared pointer points to
 *
 * @return  Reference to the object
 */
template <
	typename DummyObjectType = ObjectType
	UE_REQUIRES(UE_REQUIRES_EXPR(*(DummyObjectType*)nullptr)) // this construct means that operator* is only considered for overload resolution if T is dereferenceable
>
[[nodiscard]] FORCEINLINE DummyObjectType& operator*() const
{
	check( IsValid() );
	return *Object;
}
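
One thing that may be worth checking against the listings above: MyInit stores ModelData but never fills FileData, while MyUpdateNetwork only creates Network inside the `FileData.Num() > 0` guard. The sketch below is a plain-Python stand-in for that control flow, not plugin code. The class and method names are hypothetical, and it assumes the asset starts with empty FileData, as a freshly created ULearningNeuralNetworkData would.

```python
from typing import Optional


class NeuralNetworkDataModel:
    """Hypothetical stand-in for ULearningNeuralNetworkData (not plugin code)."""

    def __init__(self) -> None:
        self.file_data: bytes = b""                # mirrors FileData (assumed empty)
        self.model_data: Optional[object] = None   # mirrors ModelData
        self.network: Optional[object] = None      # mirrors Network

    def my_init(self, model_data: object) -> None:
        # Mirrors MyInit: ModelData is stored, FileData is left untouched.
        self.model_data = model_data
        self.my_update_network()

    def my_update_network(self) -> None:
        # Mirrors the `if (FileData.Num() > 0)` guard in MyUpdateNetwork.
        if len(self.file_data) > 0:
            self.network = object()

    def get_network(self) -> Optional[object]:
        # Mirrors GetNetwork(): returns whatever Network currently holds.
        return self.network


data = NeuralNetworkDataModel()
data.my_init(model_data=object())
print(data.get_network() is None)  # True: the guard skipped network creation
```

If the C++ code behaves the same way, GetNetwork() would return a null pointer here regardless of which NNE runtime is selected, which would match the check(IsValid()) failure.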

What I’ve Checked/Tried:

  • The ONNX model has an identical architecture to the original critic network. These are the results from the original, unmodified plugin:
================Critic Network ID: 1================================
NeuralNetwork(
  (model): Sequential(
    (0): Linear()
    (1): ELU()
    (2): Linear()
    (3): ELU()
    (4): Linear()
  )
)
  • I’ve verified that the InputSize and OutputSize I’m passing match the expectations of the original code and my ONNX model.

Environment:

  • Unreal Engine 5.6 Release

I’m a student researcher focusing on Multi-Agent Reinforcement Learning (MARL). I’m using UE5 to create custom environments for my research, and the Learning-Agents plugin is central to this work.

I’m looking for a robust method to integrate externally trained ONNX models into UE5’s Learning-Agents framework. While I’d appreciate help with my current implementation issues, I’m equally interested in any alternative approaches that achieve this integration successfully.

Thanks for any insights!


Hello,

I think the issue is that you’re using the wrong runtime. You’re currently using NNERuntimeBasicCpu, but this only works with the Learning Agents custom format. You should be using NNERuntimeORTCpu, where ORT stands for ONNX Runtime.

Good luck with your project and let me know if you run into more issues.

Brendan

Thanks for your previous replies—they were incredibly helpful. I’ve studied your suggestions and now have everything up and running. However, I still have a few questions regarding my Pendulum environment training.

Here’s a brief overview of the issue:

I followed a racing tutorial and adapted it to train a Pendulum environment. When using the plugin’s auto-generated model (C++, with UseMemory disabled), the Pendulum stabilizes after around 800 iterations. However, when I use my own ONNX model, it takes about 1800 iterations to achieve full stabilization. (To build it, I exported the model generated by the LA plugin to ONNX, examined its structure in Netron, and then wrote a model based on that reference.) The change was also abrupt: all pendulums kept spinning until around the 1800th iteration, then suddenly learned to balance upright.

To give more context, here’s what I modified:

I reconstructed the model construction chain by mimicking the original functions:

  • MakePolicy / MakeCritic

  • SetupPolicy / SetupCritic

  • Init

  • UpdateNetwork

I also made sure to pass the externally received ONNX model through this chain.
As pointed out earlier, I noticed that the original plugin uses NNERuntimeBasicCpu, which requires a specific serialized format (.ubnne models) and does not support ONNX. So, I switched to NNERuntimeORTCpu.

In UpdateNetwork, I made the following changes:

ModelData->Init(TEXT("onnx"), FileData);
// FileData already contains the model values from init

if (!Network)
{
	Network = MakeShared<UE::Learning::FNeuralNetwork>();
}

ensureMsgf(FModuleManager::Get().LoadModule(TEXT("NNERuntimeORT")), TEXT("Unable to load module for NNE runtime NNERuntimeORT."));

TWeakInterfacePtr<INNERuntimeCPU> RuntimeCPU = UE::NNE::GetRuntime<INNERuntimeCPU>(TEXT("NNERuntimeORTCpu"));
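
The pairing between model format and runtime described in this thread can be written down as a small lookup. This is only an illustrative Python sketch: `RUNTIME_FOR_FILE_TYPE` and `pick_runtime` are hypothetical names, and the table contains just the two pairings mentioned here, not a full list of NNE runtimes.

```python
# File type -> NNE runtime name, limited to the two pairings named in this thread.
RUNTIME_FOR_FILE_TYPE = {
    "ubnne": "NNERuntimeBasicCpu",  # Learning Agents' own serialized format
    "onnx": "NNERuntimeORTCpu",     # ONNX Runtime on CPU
}


def pick_runtime(file_type: str) -> str:
    """Return the runtime name for a model file type (hypothetical helper)."""
    key = file_type.lower().lstrip(".")
    if key not in RUNTIME_FOR_FILE_TYPE:
        raise ValueError(f"no runtime known for file type {file_type!r}")
    return RUNTIME_FOR_FILE_TYPE[key]


print(pick_runtime("onnx"))    # NNERuntimeORTCpu
print(pick_runtime(".ubnne"))  # NNERuntimeBasicCpu
```

Failing loudly on an unknown file type keeps a mismatched format from reaching the runtime unnoticed.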

I also updated all functions involving LoadFromSnapshot, since they call UpdateNetwork. The socket parts remained unchanged as I’m only using shared memory.

On the Python side:

In train_common.py, I replaced:

network = build_network(device)

with a custom model defined in my_train_common.py.

network = MyGenerateNN(network_id)

Here’s my_train_common.py:

import torch
import torch.nn as nn
import torch.onnx
import torch.nn.functional as F
import numpy as np
import io


def MyGenerateNN(network_id):
    network_name = ""
    match network_id:
        case 0:
            network_name = "policy"
        case 1:
            network_name = "critic"
        case 2:
            network_name = "encoder"
        case 3:
            network_name = "decoder"
    if network_name == "":
        return None
    else:
        return torch.load(f'E:/Proj/UE_Release/Generator/ONNX/{network_name}_all.pth', weights_only=False)


def generate_filedata(model):
    torch.save(model, model.network_all_path)
    torch.save(model.state_dict(), model.network_pth_path)
    export_onnx(model)


def export_onnx(model, path=None):
    if path is None:
        model.eval()
        model.to('cpu')
        
        file = open(model.network_onnx_path, 'wb')
        torch.onnx.export(
            model,
            torch.randn(1, model.input_num),
            file,
            export_params=True,
            opset_version=11,
            input_names=['input'],
            output_names=['output'],
            dynamic_axes={'input': {0: 'batch_size'}, 'output': {0: 'batch_size'}}
        )
        file.close()
        model.to("cuda")
        model.train()
    else:
        model.model.eval()
        model.model.to('cpu')
        
        file = open(path, 'wb')
        torch.onnx.export(
            model.model,
            torch.randn(1, model.input_num),
            file,
            export_params=True,
            opset_version=11,
            input_names=['input'],
            output_names=['output'],
            dynamic_axes={'input': {0: 'batch_size'}, 'output': {0: 'batch_size'}}
        )
        file.close()
        model.model.to(model.device)
        model.model.train()


class MyNeuralNetwork(nn.Module):
    def __init__(self, network_id, input_dim, output_dim, continue_training):
        super(MyNeuralNetwork, self).__init__()
        self.continue_training = continue_training
        self.onnx_model_bytes_data = None
        self.network_name = None
        self.network_id = network_id
        self.network_filepath = None
        self.input_num = input_dim
        self.output_num = output_dim
        self.compatibility_hash = None
        self.device = 'cuda'

        match self.network_id:
            case 0:
                self.network_name = "policy"
            case 1:
                self.network_name = "critic"
            case 2:
                self.network_name = "encoder"
            case 3:
                self.network_name = "decoder"

        self.network_pth_path = f'E:/Proj/UE_Release/Generator/ONNX/{self.network_name}.pth'
        self.network_all_path = f'E:/Proj/UE_Release/Generator/ONNX/{self.network_name}_all.pth'
        self.network_onnx_path = f'E:/Proj/UE_Release/Generator/ONNX/{self.network_name}.onnx'

        self.to('cuda')

    def get_onnx(self):
        with torch.no_grad():
            self.eval()
            self.to('cpu')
            
            bytes_io = io.BytesIO()
    
            torch.onnx.export(
                self,
                torch.randn(1, self.input_num).cpu(),
                bytes_io,
                export_params=True,
                opset_version=11,
                input_names=['input'],
                output_names=['output'],
                dynamic_axes={'input': {0: 'batch_size'}, 'output': {0: 'batch_size'}}
            )
    
            # Get the byte data
            self.onnx_model_bytes_data = bytes_io.getvalue()
            bytes_io.close()
            
            self.to('cuda')
            self.train()

    def get_filedata_size(self):
        self.get_onnx()
        return len(self.onnx_model_bytes_data)

    def save_to_filedata(self, data):
        torch.save(self, self.network_all_path)
        torch.save(self.state_dict(), self.network_pth_path)
        data[0:len(self.onnx_model_bytes_data)] = np.frombuffer(self.onnx_model_bytes_data, np.uint8)

    def load_from_filedata(self, data):
        if self.continue_training:
            self.load_state_dict(torch.load(self.network_pth_path, weights_only=True))


class MyPolicyNetwork(MyNeuralNetwork):
    def __init__(self, input_dim, hidden_dim, output_dim, continue_training=False):
        super(MyPolicyNetwork, self).__init__(0, input_dim, output_dim, continue_training)
        self.hidden_dim = hidden_dim
        self.FC1 = nn.Linear(self.input_num, self.hidden_dim)
        self.FC2 = nn.Linear(self.hidden_dim, self.hidden_dim)
        self.FC3_sigmod = nn.Linear(self.hidden_dim, self.hidden_dim)
        self.FC3_tanh = nn.Linear(self.hidden_dim, self.hidden_dim)
        self.FC4 = nn.Linear(self.hidden_dim, self.hidden_dim)
        self.FC5 = nn.Linear(self.hidden_dim, self.output_num)
        self.multipier = nn.Parameter(torch.tensor([0.1, 0.1]))
        self.adder = nn.Parameter(torch.tensor([0.0, 0.0]))
        
        self.misc_adder = nn.Parameter(torch.zeros(64))

        self.to('cuda')

    def forward(self, x):
        m_x = self.FC1(x)
        m_x = F.relu(m_x)
        m_x = self.FC2(m_x)
        m_x = F.relu(m_x)

        m_x1 = self.FC3_sigmod(m_x)
        m_x1 = F.sigmoid(m_x1)
        m_x2 = self.FC3_tanh(m_x)
        m_x2 = F.tanh(m_x2)
        m_x3 = (1 - m_x1) * (m_x * 0 + torch.tanh(self.misc_adder))

        m_x = m_x1 * m_x2 + m_x3
        
        m_x = self.FC4(m_x)
        m_x = F.relu(m_x)
        m_x = self.FC5(m_x)
        
        m_x = m_x * self.multipier + self.adder

        return m_x


class MyCriticNetwork(MyNeuralNetwork):
    def __init__(self, input_dim, hidden_dim, output_dim, continue_training=False):
        super(MyCriticNetwork, self).__init__(1, input_dim, output_dim, continue_training)
        self.hidden_dim = hidden_dim
        self.FC1 = nn.Linear(self.input_num, self.hidden_dim)
        self.FC2 = nn.Linear(self.hidden_dim, self.hidden_dim)
        self.FC3 = nn.Linear(self.hidden_dim, self.output_num)

        self.to('cuda')

    def forward(self, x):
        m_x = self.FC1(x)
        m_x = F.relu(m_x)
        m_x = self.FC2(m_x)
        m_x = F.relu(m_x)
        m_x = self.FC3(m_x)
        return m_x


class MyEncoderNetwork(MyNeuralNetwork):
    def __init__(self, input_dim, hidden_dim, output_dim, continue_training=False):
        super(MyEncoderNetwork, self).__init__(2, input_dim, output_dim, continue_training)
        # self.multipier1 = nn.Parameter(torch.tensor(1.0))
        # self.multipier2 = nn.Parameter(torch.tensor(1.0))
        # self.adder1 = nn.Parameter(torch.tensor(0.0))
        # self.adder2 = nn.Parameter(torch.tensor(0.0))
        self.multipliers = nn.Parameter(torch.tensor([1.0, 1.0]))
        self.adders = nn.Parameter(torch.tensor([0.0, 0.0]))

        self.to('cuda')

    def forward(self, x):
        return x * self.multipliers + self.adders


class MyDecoderNetwork(MyNeuralNetwork):
    def __init__(self, input_dim, hidden_dim, output_dim, continue_training=False):
        super(MyDecoderNetwork, self).__init__(3, input_dim, output_dim, continue_training)
        self.multipier1 = nn.Parameter(torch.tensor([1.0, 1.0]))
        self.adder1 = nn.Parameter(torch.tensor([0.0, 0.0]))

        self.to('cuda')

    def forward(self, x):
        x = x * self.multipier1 + self.adder1
        return x

Training workflow:

  1. Call generate_filedata() from my_train_common.py to generate four networks:

    my_policy = MyPolicyNetwork(2, 64, 2)
    my_critic = MyCriticNetwork(2, 64, 1)
    my_encoder = MyEncoderNetwork(2, 2, 2)
    my_decoder = MyDecoderNetwork(2, 2, 2)

    generate_filedata(my_policy)
    generate_filedata(my_critic)
    generate_filedata(my_encoder)
    generate_filedata(my_decoder)

  2. Copy the generated ONNX files to UE.

  3. Connect them via MakePolicy / MakeCritic in Blueprints.

  4. Start training.


(Image: the plugin’s auto-generated policy model)

(Image: my policy model. Since I didn’t enable the memory option, I pruned some modules from the C++-generated model that I believed to be ineffective.)

Data flow:

  1. UE sends the first SendNetwork, but Python ignores it and uses the PyTorch model (saved via torch.save) generated by generate_filedata.

  2. UE sends experience data.

  3. Python trains and returns updated networks.

I’d really appreciate it if you could help me understand:

  • Why does the custom ONNX model train significantly more slowly?

  • What might be causing the sudden performance jump around 1700 iterations?

  • Whether my model construction and ONNX integration approach looks correct.
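
One difference that can be read directly from the listings above: the plugin’s critic printout shows ELU activations, while MyPolicyNetwork and MyCriticNetwork call F.relu. Whether this accounts for the slower training is not established here. The standard-library sketch below only shows how the two functions differ on negative inputs, assuming ELU with alpha = 1.0 (the PyTorch default).

```python
import math


def relu(x: float) -> float:
    """ReLU: negative inputs are clamped to zero."""
    return max(0.0, x)


def elu(x: float, alpha: float = 1.0) -> float:
    """ELU: negative inputs decay smoothly toward -alpha instead of zero."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


# Compare both activations on a few sample inputs.
for x in (-2.0, -0.5, 0.0, 1.5):
    print(f"x={x:+.1f}  relu={relu(x):+.4f}  elu={elu(x):+.4f}")
```

ReLU outputs zero, with zero gradient, for every negative input, whereas ELU keeps a nonzero output and gradient there. Swapping F.relu for F.elu in the custom networks would be one way to test whether the activation matters for this environment.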

Thank you in advance for your time and support!

Sorry, I might have accidentally replied to my own message. I actually wanted to ask about the previous response.

Hey,

I’m glad you were able to get your ONNX model working. Sadly, I don’t have bandwidth to think about your individual setup too deeply. Internally at Epic, we are using the Learning Agents model format and the BasicCPU runtime, so I don’t have a lot of experience with ONNX and issues that can arise.

Good luck with your issue. It’s impressive you made it as far as you have, given that editing the internals of Learning Agents to use ONNX is no small task. Hopefully you’ll figure it out.

Thank you once again for your help. I’ll continue working on resolving the issues I’m facing. Also, a big thank-you to both you and the LA plugin for helping me learn how to train my model in Unreal Engine. :handshake: