Tutorial: Learning Agents Introduction

Thanks for the information. I’ll give it some more thought. Maybe it makes more sense to hold off on testing things like this and continue once discrete actions are added.

Another question I have: With a use case such as the Lyra Bots, would it make more sense to use imitation learning here? One goal would be to achieve bots that are not “bot-like” but rather play like humans.

What state is the imitation learning in? Can it be used for such a case right now?

Yes, imitation learning would be important for human-like aiming. We currently provide behavior cloning, which is mostly what you would need for this.

In lieu of a formal tutorial for imitation learning:

  1. Set up an interactor - implement SetupObservations and SetObservations for data collection. Create the actions you want to track during SetupActions and then make the Action object variables public.
  2. Create a blueprint from ULearningAgentsController and implement SetActions - this is like SetObservations but for actions. During SetActions, set the publicly exposed action objects via their SetXAction functions (don’t accidentally try to set the variable itself).
  3. Create a blueprint from ULearningAgentsRecorder - nothing is needed in the BP Event Graph.
  4. Create a Miscellaneous->DataAsset->LearningAgentsRecording and provide this during Recorder->SetupRecorder.
  5. Create a data collection manager from ULearningAgentsManager - during the manager’s Tick, call Interactor->EncodeObservations, Controller->EncodeActions, and Recorder->AddExperience (behind a branch checking Recorder->IsRecording == true). After spawning, or whenever an enemy is found, call Recorder->BeginRecording. Whenever the player gets a kill (or however you want to segment the data), call EndRecording. Every begin/end recording pair will create a Record in the Recording object.
  6. Create empty blueprints from ULearningAgentsImitationTrainer and ULearningAgentsPolicy.
  7. Implement GetActions on the original interactor if not already done.
  8. Create a separate manager for imitation learning with the Policy, Interactor, and ImitationTrainer components attached (it could be on the same manager with some controls added to it). During Tick, call ImitationTrainer->RunTraining and pass in the Recording asset that was populated with data above.

This looks like a lot of steps when written out, but structurally it shouldn’t take long to set up. Deciding on how to do the interactor is the hardest part.
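
To make step 2 a little more concrete, here is a rough C++ sketch of what the controller’s SetActions might look like. Everything here is illustrative: UMyBotController, AMyBot, SteeringAction, FireAction, and the bot getter functions are made-up names, and SetFloatAction / SetBoolAction stand in for whichever SetXAction functions your particular action objects expose - check the action classes you actually created in SetupActions for the real names and signatures.

```cpp
// Sketch only: a controller that records what the bot (or human player) actually
// did this frame, so the recorder can store it as demonstration data for
// behavior cloning. Assumes UMyBotController declares SteeringAction and
// FireAction as the public action objects created in the interactor's SetupActions.
void UMyBotController::SetActions_Implementation(const TArray<int32>& AgentIds)
{
    for (const int32 AgentId : AgentIds)
    {
        // GetAgent hands back the object that was registered with AddAgent on the manager.
        const AMyBot* Bot = Cast<AMyBot>(GetAgent(AgentId, AMyBot::StaticClass()));
        if (Bot == nullptr)
        {
            continue;
        }

        // Write the observed behavior into the publicly exposed action objects.
        SteeringAction->SetFloatAction(AgentId, Bot->GetCurrentSteering());
        FireAction->SetBoolAction(AgentId, Bot->IsFiring());
    }
}
```

The same idea applies in Blueprint: during SetActions, read whatever the human or existing AI did that frame and push it into the exposed action objects.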

The biggest challenge is that aiming in a human-like way takes place over many frames, so you need some kind of trajectory information, i.e. a memory. This can currently be achieved with a time-lagged MLP. For example, if you cared about the target’s position (seems necessary for aiming lol) and you wanted 1 second of history data at 30 FPS, you could feed the time dimension in as different columns: Pos_Time0, Pos_Time1, Pos_Time2, and so on. Managing the observations is a bit cumbersome right now.
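
As a very rough sketch of that idea (none of this is Learning Agents API; FPositionHistory and its members are made up for illustration), you could keep the last N target positions in a small ring buffer and flatten them into one block of observation values each tick:

```cpp
// Sketch: a ring buffer of the last N positions, flattened oldest-to-newest
// into a float array that can be fed in as Pos_Time0 ... Pos_TimeN-1 columns.
#include "CoreMinimal.h"

struct FPositionHistory
{
    explicit FPositionHistory(int32 InNumSteps)
        : NumSteps(InNumSteps)
    {
        Positions.Init(FVector::ZeroVector, NumSteps);
    }

    // Call once per agent tick with the target's current position.
    void Push(const FVector& Position)
    {
        Positions[Head] = Position;
        Head = (Head + 1) % NumSteps;
    }

    // Flatten oldest-to-newest into floats suitable for an observation vector.
    TArray<float> Flatten() const
    {
        TArray<float> Out;
        Out.Reserve(NumSteps * 3);
        for (int32 Index = 0; Index < NumSteps; ++Index)
        {
            const FVector& P = Positions[(Head + Index) % NumSteps];
            Out.Add(static_cast<float>(P.X));
            Out.Add(static_cast<float>(P.Y));
            Out.Add(static_cast<float>(P.Z));
        }
        return Out;
    }

    TArray<FVector> Positions;
    int32 NumSteps = 30; // e.g. 1 second of history at 30 FPS
    int32 Head = 0;
};
```

With 30 steps of 3 floats each, that’s 90 extra values on the observation vector, which gives the network the trajectory information it needs for human-like aiming over time.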

Anyway, hope this helps!

Erm, hello sir. It’s kind of a stupid question, but I’ve already finished the driving example and it’s working fine. Now I want to train a simple scene where an AI chases an object, but I’m stuck on the penalty setup. My reward is based on the distance from the AI to the target and whether it is facing the target or not, and the penalty is also based on the distance to the target. What I want to ask is: how can I set up the penalty? I can only find one penalty function in the plugin, but it’s for position. Thank you.

Hey, amazing work! I was looking forward to seeing machine learning in Unreal.
I followed your tutorial and I have a small doubt: I’m getting this error
“LogLearning: Display: Training Process: RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from Official Drivers | NVIDIA
So my doubt is: is this exclusive to NVIDIA GPUs? Thanks!

Same here. I tried following the “let’s learn to drive” tutorial: Learning Agents - Getting Started | Course

I get these errors. I’m on an AMD CPU and GPU.

LogLearning: Display: BP_RLTrainingManager_C_UAID_3C219C5E395D7FB501_1735601647: Adding Agent SportsCar_Pawn_C_0 with id 0.
PIE: Server logged in
PIE: Play in editor total start time 0.593 seconds.
LogLearning: Display: BP_DrivingRLTrainer: Sending / Receiving initial policy…
LogLearning: Display: Training Process: {
LogLearning: Display: Training Process: “TaskName”: “BP_DrivingRLTrainer”,
LogLearning: Display: Training Process: “TrainerMethod”: “PPO”,
LogLearning: Display: Training Process: “TrainerType”: “SharedMemory”,
LogLearning: Display: Training Process: “TimeStamp”: “2023-10-23_21-37-00”,
LogLearning: Display: Training Process: “SitePackagesPath”: “C:/Program Files/Epic Games/UE_5.3/Engine/Plugins/Experimental/PythonFoundationPackages/Content/Python/Lib/Win64/site-packages”,
LogLearning: Display: Training Process: “IntermediatePath”: “C:/Users/kavis/OneDrive/Documents/Unreal Projects/AiCarTest/Intermediate/LearningAgents”,
LogLearning: Display: Training Process: “PolicyGuid”: “{8640D3B8-41AF-6B36-34DB-51B2478F7EDE}”,
LogLearning: Display: Training Process: “ControlsGuid”: “{B9DCFDAF-44BB-660D-09B5-DDA99411FF7E}”,
LogLearning: Display: Training Process: “EpisodeStartsGuid”: “{63107E00-4D95-F9F3-63BE-7195A699B07A}”,
LogLearning: Display: Training Process: “EpisodeLengthsGuid”: “{62B6E09B-41C2-26DF-2D90-C1BF26A8E0B6}”,
LogLearning: Display: Training Process: “EpisodeCompletionModesGuid”: “{631EEE41-4923-093D-166E-12ABFB2FDB1D}”,
LogLearning: Display: Training Process: “EpisodeFinalObservationsGuid”: “{4E3CB7BA-40B0-D531-6947-32B6586392CB}”,
LogLearning: Display: Training Process: “ObservationsGuid”: “{F49869F2-40E9-2586-EBAA-58A11488C880}”,
LogLearning: Display: Training Process: “ActionsGuid”: “{D2A58163-47E2-9D51-0789-6B93FDF8C207}”,
LogLearning: Display: Training Process: “RewardsGuid”: “{E5369965-42F1-8790-D21D-7EACD8730C89}”,
LogLearning: Display: Training Process: “ObservationVectorDimensionNum”: 8,
LogLearning: Display: Training Process: “ActionVectorDimensionNum”: 2,
LogLearning: Display: Training Process: “MaxEpisodeNum”: 1000,
LogLearning: Display: Training Process: “MaxStepNum”: 10000,
LogLearning: Display: Training Process: “PolicyNetworkByteNum”: 72788,
LogLearning: Display: Training Process: “PolicyHiddenUnitNum”: 128,
LogLearning: Display: Training Process: “PolicyLayerNum”: 3,
LogLearning: Display: Training Process: “PolicyActivationFunction”: “ELU”,
LogLearning: Display: Training Process: “PolicyActionNoiseMin”: 0.25,
LogLearning: Display: Training Process: “PolicyActionNoiseMax”: 0.25,
LogLearning: Display: Training Process: “CriticNetworkByteNum”: 71240,
LogLearning: Display: Training Process: “CriticHiddenUnitNum”: 128,
LogLearning: Display: Training Process: “CriticLayerNum”: 3,
LogLearning: Display: Training Process: “CriticActivationFunction”: “ELU”,
LogLearning: Display: Training Process: “ProcessNum”: 1,
LogLearning: Display: Training Process: “IterationNum”: 1000000,
LogLearning: Display: Training Process: “LearningRatePolicy”: 9.999999747378752e-05,
LogLearning: Display: Training Process: “LearningRateCritic”: 0.0010000000474974513,
LogLearning: Display: Training Process: “LearningRateDecay”: 0.9900000095367432,
LogLearning: Display: Training Process: “WeightDecay”: 0.0010000000474974513,
LogLearning: Display: Training Process: “InitialActionScale”: 0.10000000149011612,
LogLearning: Display: Training Process: “BatchSize”: 128,
LogLearning: Display: Training Process: “EpsilonClip”: 0.20000000298023224,
LogLearning: Display: Training Process: “ActionRegularizationWeight”: 0.0010000000474974513,
LogLearning: Display: Training Process: “EntropyWeight”: 0.009999999776482582,
LogLearning: Display: Training Process: “GaeLambda”: 0.8999999761581421,
LogLearning: Display: Training Process: “ClipAdvantages”: true,
LogLearning: Display: Training Process: “AdvantageNormalization”: true,
LogLearning: Display: Training Process: “TrimEpisodeStartStepNum”: 0,
LogLearning: Display: Training Process: “TrimEpisodeEndStepNum”: 0,
LogLearning: Display: Training Process: “Seed”: 1234,
LogLearning: Display: Training Process: “DiscountFactor”: 0.9900000095367432,
LogLearning: Display: Training Process: “Device”: “GPU”,
LogLearning: Display: Training Process: “UseTensorBoard”: false,
LogLearning: Display: Training Process: “UseInitialPolicyNetwork”: true,
LogLearning: Display: Training Process: “UseInitialCriticNetwork”: false,
LogLearning: Display: Training Process: “SynchronizeCriticNetwork”: false,
LogLearning: Display: Training Process: “LoggingEnabled”: true
LogLearning: Display: Training Process: }
LogLearning: Display: Training Process: Creating Replay Buffer…
LogLearning: Display: Training Process: Creating Networks…
LogLearning: Display: Training Process: Traceback (most recent call last):
LogLearning: Display: Training Process: File “C:\Program Files\Epic Games\UE_5.3\Engine\Plugins\Experimental\LearningAgents\Content\Python\train_ppo.py”, line 361, in <module>
LogLearning: Display: Training Process: train_ppo(config, trainer)
LogLearning: Display: Training Process: File “C:\Program Files\Epic Games\UE_5.3\Engine\Plugins\Experimental\LearningAgents\Content\Python\train_ppo.py”, line 87, in train_ppo
LogLearning: Display: Training Process: actor_network = NeuralNetwork(
LogLearning: Display: Training Process: File “C:\Program Files/Epic Games/UE_5.3/Engine/Plugins/Experimental/PythonFoundationPackages/Content/Python/Lib/Win64/site-packages\torch\nn\modules\module.py”, line 852, in to
LogLearning: Display: Training Process: return self._apply(convert)
LogLearning: Display: Training Process: File “C:\Program Files/Epic Games/UE_5.3/Engine/Plugins/Experimental/PythonFoundationPackages/Content/Python/Lib/Win64/site-packages\torch\nn\modules\module.py”, line 530, in _apply
LogLearning: Display: Training Process: module._apply(fn)
LogLearning: Display: Training Process: File “C:\Program Files/Epic Games/UE_5.3/Engine/Plugins/Experimental/PythonFoundationPackages/Content/Python/Lib/Win64/site-packages\torch\nn\modules\module.py”, line 530, in _apply
LogLearning: Display: Training Process: module._apply(fn)
LogLearning: Display: Training Process: File “C:\Program Files/Epic Games/UE_5.3/Engine/Plugins/Experimental/PythonFoundationPackages/Content/Python/Lib/Win64/site-packages\torch\nn\modules\module.py”, line 552, in _apply
LogLearning: Display: Training Process: param_applied = fn(param)
LogLearning: Display: Training Process: File “C:\Program Files/Epic Games/UE_5.3/Engine/Plugins/Experimental/PythonFoundationPackages/Content/Python/Lib/Win64/site-packages\torch\nn\modules\module.py”, line 850, in convert
LogLearning: Display: Training Process: return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
LogLearning: Display: Training Process: File "C:\Program Files/Epic Games/UE_5.3/Engine/Plugins/Experimental/PythonFoundationPackages/Content/Python/Lib/Win64/site-packages\torch\cuda\__init__.py", line 172, in _lazy_init
LogLearning: Display: Training Process: torch._C._cuda_init()
LogLearning: Display: Training Process: RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from Official Drivers | NVIDIA
LogLearning: Warning: Training Process finished with warnings or errors

I think you can use a scalar reward with a negative scale to make a penalty.
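
For example, as a sketch only (this is not the exact Learning Agents reward API; the function name and scale values are placeholders), you could compute the chase reward yourself with the distance term scaled negatively so it acts as a penalty, then feed the result into a plain scalar/float reward:

```cpp
// Sketch: distance becomes a penalty via the negative scale, and facing the
// target adds a small positive reward. Tune the scales for your scene size.
#include "CoreMinimal.h"

float ComputeChaseReward(const FVector& AgentLocation, const FVector& TargetLocation,
                         const FVector& AgentForward)
{
    const FVector ToTarget = TargetLocation - AgentLocation;
    const float Distance = static_cast<float>(ToTarget.Size());

    // Negative scale turns the raw distance into a penalty: further away = worse.
    const float DistancePenalty = -0.01f * Distance;

    // Dot product of the forward vector and the direction to the target is ~1
    // when facing the target and ~-1 when facing directly away.
    const float Facing = static_cast<float>(
        FVector::DotProduct(AgentForward, ToTarget.GetSafeNormal()));
    const float FacingReward = 0.1f * Facing;

    return DistancePenalty + FacingReward;
}
```

Keep the distance scale small enough that the penalty doesn’t completely swamp the facing term, otherwise the agent has little incentive to aim at the target while closing in.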

Thank you for this first introduction to imitation learning, but do you have a target date for an official tutorial? 🙂

Great job with Learning Agents!

There’s a time dilation console command called “slomo” that can be used to run the simulation slower or faster. By accelerating the simulation with “slomo 5”, are we accelerating learning?

@Deathcalibur Where can we actively follow the development of Learning Agents? Maybe a separate section on the forums could be useful?

Still not working on AMD. Maybe this should be mentioned in the tutorial. @Deathcalibur

LogLearning: Display: Training Process: RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx

Hey @MetinCelik, I ran into the NVIDIA GPU problem too. Not sure if this will help you, but I just switched to CPU training. (I think GPU is the default.)
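
In case it helps, the switch is just the “Device” option in the training settings passed to the trainer. As a sketch (the struct and enum names below are assumptions from memory and may differ in your engine version; in Blueprint it’s simply the Device field on the trainer’s training settings):

```cpp
// Sketch only: force CPU training when no NVIDIA GPU / CUDA driver is available.
// Type names are assumed, not verified against the plugin source.
FLearningAgentsTrainerTrainingSettings TrainingSettings;
TrainingSettings.Device = ELearningAgentsTrainingDevice::CPU; // GPU is the default
```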

Thanks, now it’s working. But training will take longer without GPU acceleration. I just read that the Learning Agents plugin uses PyTorch, which supports AMD GPUs only on Linux, so we won’t get AMD support anytime soon. @FexotheFCO @fortniterickroll

For URotationAction, I need to use the SetXAction function, but I don’t see it in the code.

Dear Learning Agents Team,

Thanks for the great support and effort in developing such an awesome plugin for UE users. I am a UE beginner and an expert in RL and AI. It would be great if you could create a document about the philosophy behind the plugin and add more examples of how to build RL and IL workflows.

I’ve been getting an invalid object as the Return Value of the Get Agent function in some (but not all, or even most) of the Interactor and Trainer events.

The same thing happens in the Vehicle Template project I did first and in my own project, where I’ve implemented (or tried to implement) the same thing.

When the breakpoint after Is Not Valid is triggered, the Get Agent output looks like this.

And the Array Element in the loop node has no debug data; I have no idea what that means.
(Screenshot of the ForEach loop node)

Any ideas? Is this thread alive?

Did you make sure the objects you’re adding as agents can be cast to the “Agent Class”? If they can’t, then you will get a nullptr which will show up as “Unknown”.
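
For example, a defensive check before registering (sketch only; RegisterAgent is a made-up helper and the include path is assumed, but AddAgent on the manager is the same call described above):

```cpp
// Sketch: only register objects that are compatible with the Agent Class
// configured on the manager, so GetAgent never hands back an "Unknown" nullptr.
#include "LearningAgentsManager.h" // header name assumed

void RegisterAgent(ULearningAgentsManager* Manager, UObject* Candidate, UClass* ConfiguredAgentClass)
{
    if (Candidate != nullptr && Candidate->IsA(ConfiguredAgentClass))
    {
        Manager->AddAgent(Candidate);
    }
    else
    {
        UE_LOG(LogTemp, Warning, TEXT("%s cannot be cast to the configured Agent Class and was not added."),
            *GetNameSafe(Candidate));
    }
}
```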

The missing debug data is just a problem with the debugger. You have to stop somewhere after the data has been used; usually it’s one node further along than you think.

Brendan

32 objects made from the CollectoBot Blueprint class are placed on the map; they call Add Agent on the manager and set themselves as the agent.

When calling GetAgentNum from the Manager, I get 32.

But when going through the ForEach loop, some objects are valid and some are not.

I’m not sure how only some of them could be invalid.

Strange…

  1. Are you doing anything with networking?
  2. Are you somehow destroying the agents?

Huh, this one’s on me.
The agents save their location at BeginPlay and use it in their relocation function for the episode reset. I forgot that the episode reset runs first, so it saved (0,0,0) as the respawn location, bunched all of them there, and kept resetting the episode (not sure why though; is the episode reset if the agent cannot legally spawn because of a collision?).
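
In case anyone else hits the same ordering problem, here is a sketch of one way to guard against it (AMyAgent, RespawnLocation, bHasRespawnLocation, and the reset function name are placeholders for whatever your agent class or blueprint actually uses):

```cpp
// Sketch: cache the respawn location once, and never teleport to an uninitialized
// (0,0,0) location if the episode happens to reset before BeginPlay has run.
// Assumes the header declares: FVector RespawnLocation; bool bHasRespawnLocation = false;
void AMyAgent::BeginPlay()
{
    Super::BeginPlay();
    if (!bHasRespawnLocation)
    {
        RespawnLocation = GetActorLocation();
        bHasRespawnLocation = true;
    }
}

void AMyAgent::HandleEpisodeReset()
{
    if (!bHasRespawnLocation)
    {
        // The reset arrived before BeginPlay: fall back to the current location
        // instead of the default-constructed (0,0,0).
        RespawnLocation = GetActorLocation();
        bHasRespawnLocation = true;
    }
    SetActorLocation(RespawnLocation, /*bSweep=*/ false, nullptr, ETeleportType::TeleportPhysics);
}
```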

Hi there,

We currently can’t do RL/IL with “image data” (screenshots from cameras) using this plugin; is that correct?