Tutorial: Learning Agents Introduction

I pulled the latest commits yesterday, and with a few tweaks, things are back up and running! The changes are great; things feel way more organized. Also, good job on the documentation: it seems like almost everything is commented by now, which makes life easier :fire:

1 Like

Just posted a new Medium article covering the new changes to the plugin, along with two experiments using it. Check it out!

7 Likes

Looks great, Jonathan. Thanks for the write-up.

FYI, this week we are working on some nice changes for the Data Recorder and the Imitation Learning components. Nothing fundamentally different, but we are cleaning up their APIs and allowing recordings to work as Data Assets and/or binary files. It should be possible to do interactive data recording in the editor, but also to have decent support for data recording in a more automated fashion (say, from a game replay).

Feel free to explore them as they are but know that some cleanup is coming :smiley:

Thanks!

1 Like

Hey! Is it at all possible to also share the changes you made to the blueprints from the first article? Thanks!

Here is the project from the articles as a public GitHub repo:

5 Likes

Ohh, this is great news! I am very excited to test it once it ships as a plugin in an official Editor release. :slight_smile:

Thank you for the info! This is exciting stuff.

Is this a very generalist plugin for ML, or are there specific scenarios this plugin is designed for? For my specific implementation I need to train AI drivers to take turns well in a racing game.

The plugin is designed for training agents using reinforcement learning and imitation learning. You should be able to use it to train pretty much any character behavior in a game; e.g., driving should work pretty well.

Right now, the algorithm we have only supports continuous actions, i.e. throttle from 0.0 to 1.0, brake from 0.0 to 1.0, and not really discrete actions like “jump” or “open door”. You can sort of hackily make these work, but we’re working on supporting those action types better in a future iteration.
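
For example, here is a minimal sketch of the kind of workaround we mean, assuming your pawn reads three continuous action values back from the Interactor each step; ApplyThrottle, ApplyBrake, and Jump are placeholders for your own pawn's code, not plugin functions:

    // Minimal sketch: faking a discrete action with a continuous one.
    // Throttle, Brake, and JumpValue are assumed to be continuous action
    // values read back from the Interactor; ApplyThrottle/ApplyBrake/Jump
    // are placeholders for your own pawn's code.
    void AMyDrivingPawn::ApplyLearnedActions(float Throttle, float Brake, float JumpValue)
    {
        // Continuous actions map directly onto input axes.
        ApplyThrottle(FMath::Clamp(Throttle, 0.0f, 1.0f));
        ApplyBrake(FMath::Clamp(Brake, 0.0f, 1.0f));

        // A discrete action like "jump" can be approximated by thresholding
        // a continuous output, at the cost of a noisier learning signal.
        if (JumpValue > 0.5f)
        {
            Jump();
        }
    }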

Hi everyone, has anyone tried to compile this plugin for 5.2? I am very close to getting it to compile.

I updated the following files to get the build going:

  1. GenericPlatformMath.h → added the tanh function (a rough sketch of this kind of addition is shown after this list)

  2. SplineComponent.h and SplineComponent.cpp → added the GetDistanceAlongSplineAtLocation function
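
Roughly, the kind of addition step 1 needs looks like this, assuming a thin wrapper over the standard library; the exact name and overloads the plugin code expects may differ:

    // Rough sketch of a Tanh helper added to GenericPlatformMath.h.
    // The exact name and overloads the Learning Agents code expects may differ.
    static FORCEINLINE float Tanh(float Value) { return tanhf(Value); }
    static FORCEINLINE double Tanh(double Value) { return tanh(Value); }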

These updates removed most of the errors. There is just one error that I cannot solve, which is in SplineComponent.h → in the FSplineInstanceData struct, I am getting the following error in reference to the GENERATED_BODY() line:

[15/28] Compile [x64] Module.LearningAgents.3_of_3.cpp
C:\Program Files\Epic Games\UE_5.2\Engine\Source\Runtime\Engine\Classes\Components\SplineComponent.h(879): error C4430: missing type specifier - int assumed. Note: C++ does not support default-int

How do I solve this? Are there any other files that I need to tweak?

I appreciate the help, thank you

If you’re not going to use the spline observations in Learning Agents, you could comment them out and it should compile after adding TanH(). I was able to get it to work by basically commenting out the entirety of USplineComponentHelper.

Over the last couple of weeks, we pushed out some changes that reduce room for bugs and simplify most of the setup code. We’re mainly in QA mode for the near future and looking for any feedback the community has on the plugin.

Summary of Latest Changes:

  • Removed components’ references to agent ids. Instead, all components use their associated manager and operate on all agents that have been added to that manager. You can get similar functionality to what existed before by using multiple managers. We feel this design is less error prone and also results in a much more elegant implementation in the codebase with fewer edge cases.
  • Order of operation checks: the lower-level training/inference functions (EncodeObservations, ProcessExperience, etc.) now detect and log messages any time the expected operations were not run in the correct order. For example, it’s expected to collect the observations, actions, and rewards prior to calling ProcessExperience (otherwise the data is incomplete). This prevents bugs where data could have been stale or defaulted. Many other edge cases are addressed with this addition. (A rough sketch of the expected order is shown after this list.)
  • Manager setup is now automatic, and components will automatically find the manager they are attached to.
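
As a rough illustration, a single training step should flow roughly like this. Only EncodeObservations and ProcessExperience are named in the notes above; the other function and object names here are placeholders for however your setup runs the policy, applies actions, and gathers rewards:

    // Rough sketch of the expected per-step order of operations.
    // Only EncodeObservations and ProcessExperience are named in the notes
    // above; the other function and object names are placeholders.
    void AMyTrainingManager::RunTrainingStep()
    {
        Interactor->EncodeObservations();   // 1. collect observations for all agents
        Policy->EvaluatePolicy();           // 2. run the policy to produce actions
        Interactor->DecodeActions();        // 3. apply those actions to the agents
        Trainer->GatherRewards();           // 4. compute rewards for the step
        Trainer->ProcessExperience();       // 5. only now record the completed step
    }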

Thanks!

1 Like

Thank you, this worked. Compilation succeeded after commenting out the USplineComponentHelper class in LearningAgentsHelpers.h and LearningAgentsHelpers.cpp.

I will test it out like this first and then figure out how to use spline observations later.

Hello! I am still fairly new to Unreal Engine, but I currently work in the quality assurance department and was wondering if anyone has used this for QA testing yet?

As far as I am aware, not yet; however, this is one area where we intend for the plugin to be useful. If you try it out, let us know what your feedback is!

Hello! I have tried to use Learning Agents for super simple environments. I have a few requests and questions on it.

  • When implementing Interactor and Trainer subclasses natively in C++, I get a link error in FLearningAgentsTrainerPathSettings. The error was fixed by adding the LEARNINGAGENTSTRAINING_API macro.

  • After saving networks to binary files using the Save(Policy/Critic)ToSnapshot functions, an error occurs on the Python side at the start of training when trying to load them in another process. How should I load the networks?

Thanks!

Hello!

For issue 1: will do. That’s something we overlooked! Thanks for catching this issue and reporting back :smiley:

For issue 2: I think I need more information to understand exactly what you are trying to do and why (i.e. why are you trying to load it in another process?). The normal way to load the networks would be to use ULearningAgentsNeuralNetwork::LoadNetworkFromSnapshot. If you are working purely inside UE, you probably want to look at SaveNetworkToAsset/LoadNetworkFromAsset instead of Snapshots.
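
In rough pseudocode, the intended snapshot round trip looks something like this; treat the exact argument types as placeholders and check the plugin headers for the real signatures:

    // Rough sketch of the snapshot round trip, based on the function names
    // mentioned in this thread; argument types are placeholders.

    // During or after training, write the policy weights to a binary file:
    Trainer->SavePolicyToSnapshot(SnapshotFile);

    // In a later session (e.g. to resume training or run inference), load the
    // weights back into the network before setting up the policy:
    PolicyNetwork->LoadNetworkFromSnapshot(SnapshotFile);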

Thanks,
Brendan

Thank you for your response.

If you are working purely inside UE, you probably want to look at SaveNetworkToAsset/LoadNetworkFromAsset instead of Snapshots.

I also wanted to save it as an asset. However, I could not create the “ULearningAgentsNeuralNetwork” BP class in the editor. I think the UCLASS macro for this class needs to be set to “Blueprintable”.

In the future, for performance reasons or for training on a PC without the project files, I thought it would be useful to be able to save them as plain binary files instead of assets, so I used the Snapshot version.

why are you trying to load it in another process?

The reason for loading in another process is, for example, to resume learning or to perform inference. (“Another process” simply means another process of the same UE project.)

By the way, is this the right place to report bugs or ask detailed questions? If there is another appropriate place, I will ask my question there.

ULearningAgentsNeuralNetwork - to create one of these, use the Data Asset workflow:


Then select LearningAgentsNeuralNetwork in the picker menu.

For the error: can you send me any details of the error and any extra steps needed to reproduce it? I can attempt to reproduce it on my end.

For reporting issues, I would encourage people to open new topics on the forums and/or post here. We have added the following tags to the forums:

  • Learning-Agents
  • Machine-Learning
  • Reinforcement-Learning
  • Imitation-Learning

I will receive notifications on any of these tags and will try to respond promptly!

How would one save a trained model for future use?

I have looked into the save/load policy/critic functions in the blueprint. It seems like this would be the way to go, but there is probably something that needs to be set up first. Simply using the SavePolicyToAsset function in the Learning Manager gives me an error:

LogLearning: Error: PolicyNetwork: Asset is invalid.

The SavePolicyToSnapshot function doesn’t have an exec pin, so I’m not sure how to use that.

Any suggestion on how to use a trained model for inference or further fine-tuning would be appreciated. :slight_smile:

2 Likes

Hello,

The proper way to save a network is to create one in the editor and then save into it using SavePolicyToAsset. So follow the steps in my previous post to make a LearningAgentsNeuralNetwork asset (Right click in Content Browser → Miscellaneous → Data Asset → Search for LearningAgentsNeuralNetwork). Give it an appropriate name and then pass it as an argument to SavePolicyToAsset :smiley:

Once it’s saved during the training game, you can exit PIE and see that the neural network asset will be marked as “dirty” in the editor. You can then save it to make that change permanent (IIRC, dirty assets only hold the change in memory, so actually save it to get the weights stored on disk), or you can discard the changes later on if you don’t want to lose the previous weights you had.

To load the weights, merely pass the asset as an argument during SetupPolicy:

	 * @param NeuralNetworkAsset Optional Network Asset to use. If provided must match the given PolicySettings. If not
	 * provided or asset is empty then a new neural network object will be created according to the given PolicySettings.

This allows you to either do inference with your existing policy, or continue to train the weights. For example, this enables imitation learning → reinforcement learning transfer learning.
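
Putting that together, a rough C++ sketch of the whole flow looks something like the following; Blueprint users would wire the equivalent nodes, and the variable names and exact argument lists here are placeholders, so check the headers for the real signatures:

    // Rough sketch of the save/load-via-asset workflow described above.
    // Variable names and exact argument lists are placeholders.

    // 1. In the editor: Right click in Content Browser -> Miscellaneous ->
    //    Data Asset -> LearningAgentsNeuralNetwork, and give it a name.

    // 2. During the training run, write the current policy weights into it:
    Trainer->SavePolicyToAsset(PolicyNetworkAsset);

    // 3. Exit PIE and save the now-dirty asset so the weights reach disk.

    // 4. On a later run, pass the asset when setting up the policy so the saved
    //    weights are used instead of a freshly initialized network:
    Policy->SetupPolicy(Interactor, PolicySettings, PolicyNetworkAsset);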

The SavePolicyToSnapshot function doesn’t have an exec pin, so I’m not sure how to use that.

That’s a bug :sweat_smile: Thanks for pointing it out - should be fixed now!

Thanks,
Brendan

3 Likes