Course: Learning Agents - Getting Started

Get familiar with Learning Agents: a machine learning plugin for AI bots. Learning Agents allows you to train your NPCs via reinforcement & imitation learning. It can be used to create game-playing agents, physics-based animations, automated QA bots, and much more!

https://dev.epicgames.com/community/learning/courses/M3D/unreal-engine-learning-agents-getting-started

Thank you so much for sharing this, AND using the vehicle template!!! It needs more love :stuck_out_tongue:
Really excited about this topic and curious to see how it evolves in the game ecosystem. Thank you Brendan!

What about this error?

LogDebuggerCommands: Repeating last play command: New Editor Window (PIE)
LogContentBundle: [VehicleExampleMap(Standalone)] Generating Streaming for 0 Content Bundles.
LogWorldPartition: Display: GenerateStreaming for 'VehicleExampleMap' started...
LogWorldPartition: Display: GenerateStreaming for 'VehicleExampleMap' took 1 ms
LogPlayLevel: PlayLevel: No blueprints needed recompiling
PIE: New page: PIE session: VehicleExampleMap (September 8, 2023, 4:49:56 AM)
LogOnline: OSS: Created online subsystem instance for: NULL
LogOnline: OSS: TryLoadSubsystemAndSetDefault: Loaded subsystem for type [NULL]
LogPlayLevel: Creating play world package: /Game/VehicleTemplate/Maps/UEDPIE_0_VehicleExampleMap
LogPlayLevel: PIE: StaticDuplicateObject took: (0.238914s)
LogPlayLevel: PIE: Created PIE world by copying editor world from /Game/VehicleTemplate/Maps/VehicleExampleMap.VehicleExampleMap to /Game/VehicleTemplate/Maps/UEDPIE_0_VehicleExampleMap.VehicleExampleMap (0.238947s)
LogUObjectHash: Compacting FUObjectHashTables data took   0.57ms
LogWorldSubsystemInput: UEnhancedInputDeveloperSettings::bEnableWorldSubsystem is false, the world subsystem will not be created!
LogChaos: FPhysicsSolverBase::AsyncDt:-1.000000
LogAIModule: Creating AISystem for world VehicleExampleMap
LogWorldPartition: ULevel::OnLevelLoaded(VehicleExampleMap)(bIsOwningWorldGameWorld=1, bIsOwningWorldPartitioned=1, InitializeForMainWorld=1, InitializeForEditor=0, InitializeForGame=1)
LogWorldPartition: Display: WorldPartition initialize started...
LogWorldPartition: UWorldPartition::Initialize : World = VehicleExampleMap, World Type = PIE, IsMainWorldPartition = 1, Location = V(0), Rotation = R(0), IsEditor = 0, IsGame = 0, IsPIEWorldTravel = 0, IsCooking = 0
LogWorldPartition: UWorldPartition::Initialize Context : World NetMode = Standalone, IsServer = 0, IsDedicatedServer = 0, IsServerStreamingEnabled = 0, IsServerStreamingOutEnabled = 0, IsUsingMakingVisibleTransaction = 0, IsUsingMakingInvisibleTransaction = 0
LogContentBundle: [VehicleExampleMap(Standalone)] Creating new contrainer.
LogWorldPartition: Display: WorldPartition initialize took 14 ms (total: 343 ms)
LogPlayLevel: PIE: World Init took: (0.015442s)
LogAudio: Display: Creating Audio Device:                 Id: 2, Scope: Unique, Realtime: True
LogAudioMixer: Display: Audio Mixer Platform Settings:
LogAudioMixer: Display:     Sample Rate:                          48000
LogAudioMixer: Display:     Callback Buffer Frame Size Requested: 1024
LogAudioMixer: Display:     Callback Buffer Frame Size To Use:    1024
LogAudioMixer: Display:     Number of buffers to queue:           1
LogAudioMixer: Display:     Max Channels (voices):                32
LogAudioMixer: Display:     Number of Async Source Workers:       4
LogAudio: Display: AudioDevice MaxSources: 32
LogAudio: Display: Audio Spatialization Plugin: None (built-in).
LogAudio: Display: Audio Reverb Plugin: None (built-in).
LogAudio: Display: Audio Occlusion Plugin: None (built-in).
LogAudioMixer: Display: Initializing audio mixer using platform API: 'XAudio2'
LogAudioMixer: Display: Using Audio Hardware Device Speakers (Realtek(R) Audio)
LogAudioMixer: Display: Initializing Sound Submixes...
LogAudioMixer: Display: Creating Master Submix 'MasterSubmixDefault'
LogAudioMixer: Display: Creating Master Submix 'MasterReverbSubmixDefault'
LogAudioMixer: FMixerPlatformXAudio2::StartAudioStream() called. InstanceID=2
LogAudioMixer: Display: Output buffers initialized: Frames=1024, Channels=2, Samples=2048, InstanceID=2
LogAudioMixer: Display: Starting AudioMixerPlatformInterface::RunInternal(), InstanceID=2
LogInit: FAudioDevice initialized with ID 2.
LogAudioMixer: Display: FMixerPlatformXAudio2::SubmitBuffer() called for the first time. InstanceID=2
LogAudio: Display: Audio Device (ID: 2) registered with world 'VehicleExampleMap'.
LogAudioMixer: Initializing Audio Bus Subsystem for audio device with ID 2
LogSlate: Updating window title bar state: overlay mode, drag disabled, window buttons hidden, title bar hidden
LogLoad: Game class is 'VehicleAdvGameMode_C'
LogWorld: Bringing World /Game/VehicleTemplate/Maps/UEDPIE_0_VehicleExampleMap.VehicleExampleMap up for play (max tick rate 0) at 2023.09.07-20.49.57
LogWorld: Bringing up level for play took: 0.013049
LogOnline: OSS: Created online subsystem instance for: :Context_2
LogSpawn: Warning: SpawnActor failed because no class was specified
r.MotionBlur.Amount = "0"
LogLearning: Display: BP_RLTrainingManager_C_UAID_1C697ABAE032EAA601_1806685639: Adding Agent SportsCar_Pawn_C_UAID_1C697ABAE03202A701_1103699866 with id 0.
LogLearning: Display: BP_RLTrainingManager_C_UAID_1C697ABAE032EAA601_1806685639: Adding Agent SportsCar_Pawn_C_UAID_1C697ABAE03202A701_1113401869 with id 1.
LogLearning: Display: BP_RLTrainingManager_C_UAID_1C697ABAE032EAA601_1806685639: Adding Agent SportsCar_Pawn_C_UAID_1C697ABAE03202A701_1116819871 with id 2.
LogLearning: Display: BP_RLTrainingManager_C_UAID_1C697ABAE032EAA601_1806685639: Adding Agent SportsCar_Pawn_C_UAID_1C697ABAE03202A701_1098920865 with id 3.
LogLearning: Display: BP_RLTrainingManager_C_UAID_1C697ABAE032EAA601_1806685639: Adding Agent SportsCar_Pawn_C_UAID_1C697ABAE03202A701_1111684868 with id 4.
LogLearning: Display: BP_RLTrainingManager_C_UAID_1C697ABAE032EAA601_1806685639: Adding Agent SportsCar_Pawn_C_UAID_1C697ABAE03202A701_1116824872 with id 5.
LogLearning: Display: BP_RLTrainingManager_C_UAID_1C697ABAE032EAA601_1806685639: Adding Agent SportsCar_Pawn_C_UAID_1C697ABAE03202A701_1111679867 with id 6.
LogLearning: Display: BP_RLTrainingManager_C_UAID_1C697ABAE032EAA601_1806685639: Adding Agent SportsCar_Pawn_C_UAID_1C697ABAE03202A701_1119350873 with id 7.
LogLearning: Display: BP_RLTrainingManager_C_UAID_1C697ABAE032EAA601_1806685639: Adding Agent SportsCar_Pawn_C_UAID_1C697ABAE03202A701_1121720875 with id 8.
LogLearning: Display: BP_RLTrainingManager_C_UAID_1C697ABAE032EAA601_1806685639: Adding Agent SportsCar_Pawn_C_UAID_1C697ABAE03202A701_1113406870 with id 9.
LogLearning: Display: BP_RLTrainingManager_C_UAID_1C697ABAE032EAA601_1806685639: Adding Agent SportsCar_Pawn_C_UAID_1C697ABAE03202A701_1119355874 with id 10.
LogLearning: Display: BP_RLTrainingManager_C_UAID_1C697ABAE032EAA601_1806685639: Adding Agent SportsCar_Pawn_C_UAID_1C697ABAE03202A701_1121725876 with id 11.
LogLearning: Display: BP_RLTrainingManager_C_UAID_1C697ABAE032EAA601_1806685639: Adding Agent SportsCar_Pawn_C_0 with id 12.
PIE: Server logged in
PIE: Play in editor total start time 0.379 seconds.
LogLearning: Display: BP_DrivingRLTrainer: Sending / Receiving initial policy...
LogLearning: Display: Training Process: {
LogLearning: Display: Training Process:     "TaskName": "BP_DrivingRLTrainer",
LogLearning: Display: Training Process:     "TrainerMethod": "PPO",
LogLearning: Display: Training Process:     "TrainerType": "SharedMemory",
LogLearning: Display: Training Process:     "TimeStamp": "2023-09-07_20-49-57",
LogLearning: Display: Training Process:     "SitePackagesPath": "H:/UE/engine/UE_5.3/Engine/Plugins/Experimental/PythonFoundationPackages/Content/Python/Lib/Win64/site-packages",
LogLearning: Display: Training Process:     "IntermediatePath": "H:/Learn/LearningAgents/LADrive/LearningAgentsDrive/Intermediate/LearningAgents",
LogLearning: Display: Training Process:     "PolicyGuid": "{C94A521B-4177-C2FD-ACF8-D68698A0422E}",
LogLearning: Display: Training Process:     "ControlsGuid": "{93D29EF3-4F49-C959-A8B2-D99487D5D88B}",
LogLearning: Display: Training Process:     "EpisodeStartsGuid": "{331723B2-4201-6FA5-374A-5196FFC221BB}",
LogLearning: Display: Training Process:     "EpisodeLengthsGuid": "{7B455B43-428C-ECEC-9819-6E89F3C9D1CA}",
LogLearning: Display: Training Process:     "EpisodeCompletionModesGuid": "{34CEF0D3-424C-4C23-A6C8-F8AB66C03520}",
LogLearning: Display: Training Process:     "EpisodeFinalObservationsGuid": "{78B33138-426C-0A57-FD9C-B7BDD8C0926F}",
LogLearning: Display: Training Process:     "ObservationsGuid": "{C3DB2476-4482-AE6D-7B57-F8AF3450CFC6}",
LogLearning: Display: Training Process:     "ActionsGuid": "{A8D8F54D-4D67-E8B9-880E-0DAECE2607C7}",
LogLearning: Display: Training Process:     "RewardsGuid": "{DF40CA7A-4DBB-5EA6-66F3-7E9245E8E29C}",
LogLearning: Display: Training Process:     "ObservationVectorDimensionNum": 8,
LogLearning: Display: Training Process:     "ActionVectorDimensionNum": 2,
LogLearning: Display: Training Process:     "MaxEpisodeNum": 1000,
LogLearning: Display: Training Process:     "MaxStepNum": 10000,
LogLearning: Display: Training Process:     "PolicyNetworkByteNum": 72788,
LogLearning: Display: Training Process:     "PolicyHiddenUnitNum": 128,
LogLearning: Display: Training Process:     "PolicyLayerNum": 3,
LogLearning: Display: Training Process:     "PolicyActivationFunction": "ELU",
LogLearning: Display: Training Process:     "PolicyActionNoiseMin": 0.25,
LogLearning: Display: Training Process:     "PolicyActionNoiseMax": 0.25,
LogLearning: Display: Training Process:     "CriticNetworkByteNum": 71240,
LogLearning: Display: Training Process:     "CriticHiddenUnitNum": 128,
LogLearning: Display: Training Process:     "CriticLayerNum": 3,
LogLearning: Display: Training Process:     "CriticActivationFunction": "ELU",
LogLearning: Display: Training Process:     "ProcessNum": 1,
LogLearning: Display: Training Process:     "IterationNum": 1000000,
LogLearning: Display: Training Process:     "LearningRatePolicy": 9.999999747378752e-05,
LogLearning: Display: Training Process:     "LearningRateCritic": 0.0010000000474974513,
LogLearning: Display: Training Process:     "LearningRateDecay": 0.9900000095367432,
LogLearning: Display: Training Process:     "WeightDecay": 0.0010000000474974513,
LogLearning: Display: Training Process:     "InitialActionScale": 0.10000000149011612,
LogLearning: Display: Training Process:     "BatchSize": 128,
LogLearning: Display: Training Process:     "EpsilonClip": 0.20000000298023224,
LogLearning: Display: Training Process:     "ActionRegularizationWeight": 0.0010000000474974513,
LogLearning: Display: Training Process:     "EntropyWeight": 0.009999999776482582,
LogLearning: Display: Training Process:     "GaeLambda": 0.8999999761581421,
LogLearning: Display: Training Process:     "ClipAdvantages": true,
LogLearning: Display: Training Process:     "AdvantageNormalization": true,
LogLearning: Display: Training Process:     "TrimEpisodeStartStepNum": 0,
LogLearning: Display: Training Process:     "TrimEpisodeEndStepNum": 0,
LogLearning: Display: Training Process:     "Seed": 1234,
LogLearning: Display: Training Process:     "DiscountFactor": 0.9900000095367432,
LogLearning: Display: Training Process:     "Device": "GPU",
LogLearning: Display: Training Process:     "UseTensorBoard": false,
LogLearning: Display: Training Process:     "UseInitialPolicyNetwork": true,
LogLearning: Display: Training Process:     "UseInitialCriticNetwork": false,
LogLearning: Display: Training Process:     "SynchronizeCriticNetwork": false,
LogLearning: Display: Training Process:     "LoggingEnabled": true
LogLearning: Display: Training Process: }
LogLearning: Display: BP_RLTrainingManager_C_UAID_1C697ABAE032EAA601_1806685639: Resetting Agents [0 1 2 3 4 5 6 7 8 9 10 11 12].
LogViewport: Scene viewport resized to 1280x720, mode Windowed.
LogAutomationController: Ignoring very large delta of 2.08 seconds in calls to FAutomationControllerManager::Tick() and not penalizing unresponsive tests
LogLearning: Display: BP_RLTrainingManager_C_UAID_1C697ABAE032EAA601_1806685639: Resetting Agents [12].
LogLearning: Display: BP_RLTrainingManager_C_UAID_1C697ABAE032EAA601_1806685639: Resetting Agents [0 1 2 3 4 5 6 7 8 9 10 11].
LogLearning: Display: Training Process: Creating Replay Buffer...
LogLearning: Display: Training Process: Creating Networks...
LogLearning: Display: Training Process: Receiving Policy...
LogLearning: Display: Training Process: Creating Optimizer...
LogLearning: Display: Training Process: Creating PPO Policy...
LogLearning: Display: Training Process: Opening TensorBoard...
LogLearning: Display: Training Process: Begin Training...
LogLearning: Display: Training Process: Profile| Pull Experience            19823ms
LogLearning: Display: Training Process: Traceback (most recent call last):
LogLearning: Display: Training Process:   File "H:\UE\engine\UE_5.3\Engine\Plugins\Experimental\LearningAgents\Content\Python\train_ppo.py", line 361, in <module>
LogLearning: Display: Training Process:     train_ppo(config, trainer)
LogLearning: Display: Training Process:   File "H:\UE\engine\UE_5.3\Engine\Plugins\Experimental\LearningAgents\Content\Python\train_ppo.py", line 199, in train_ppo
LogLearning: Display: Training Process:     assert response == UE_RESPONSE_SUCCESS
LogLearning: Display: Training Process: AssertionError
LogLearning: Warning: Training Process finished with warnings or errors
LogLearning: Display: BP_RLTrainingManager_C_UAID_1C697ABAE032EAA601_1806685639: Resetting Agents [9].
LogLearning: Display: BP_RLTrainingManager_C_UAID_1C697ABAE032EAA601_1806685639: Resetting Agents [12].
LogLearning: Display: BP_RLTrainingManager_C_UAID_1C697ABAE032EAA601_1806685639: Resetting Agents [0 1 2 3 4 5 6 7 8 10 11].
LogLearning: Display: BP_RLTrainingManager_C_UAID_1C697ABAE032EAA601_1806685639: Resetting Agents [11].
LogLearning: Display: BP_RLTrainingManager_C_UAID_1C697ABAE032EAA601_1806685639: Resetting Agents [2].
LogLearning: Display: BP_RLTrainingManager_C_UAID_1C697ABAE032EAA601_1806685639: Resetting Agents [12].
LogLearning: Display: BP_RLTrainingManager_C_UAID_1C697ABAE032EAA601_1806685639: Resetting Agents [9].
LogLearning: Error: BP_DrivingRLTrainer: Error waiting for policy from trainer. Check log for errors.
LogLearning: Display: BP_DrivingRLTrainer: Stopping training...
LogAutomationController: Ignoring very large delta of 39.51 seconds in calls to FAutomationControllerManager::Tick() and not penalizing unresponsive tests
LogLearning: Error: BP_DrivingRLTrainer: Training has failed. Check log for errors.

Looks like it’s the same issue as here: Learning Agents fails - #2 by Deathcalibur

It looks like you’re running into a timeout during the experience-gathering loop. You can verify this by printing out the response using the steps in that reply.
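As a rough illustration of "printing out the response": the traceback in the log fails on a bare `assert response == UE_RESPONSE_SUCCESS`, which does not say what `response` actually was. The sketch below is not the plugin's code. The numeric values and the `check_response` helper are assumptions made up for this example; only the name `UE_RESPONSE_SUCCESS` comes from the log.

```python
# Hypothetical value: the real constant is defined in the plugin's Python files.
UE_RESPONSE_SUCCESS = 0


def check_response(response, step):
    """Return the response if it signals success; otherwise raise an error
    that includes the value received, unlike a bare assert."""
    if response != UE_RESPONSE_SUCCESS:
        raise RuntimeError(
            f"{step} failed: response={response!r}, "
            f"expected {UE_RESPONSE_SUCCESS!r}"
        )
    return response


if __name__ == "__main__":
    # Simulate a non-success code (1 is an arbitrary stand-in).
    try:
        check_response(1, "Pull Experience")
    except RuntimeError as err:
        print(err)
```

With a message like this, the log would show which response code came back, which should make a timeout easier to tell apart from other failures.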

You should be able to resolve this by opening Program Files\Epic Games\UE_5.3\Engine\Plugins\Experimental\LearningAgents\Content\Python\train_common.py (or the same file under your own engine install path) and increasing the timeout on line 228, in shared_memory_recv_experience_multiprocess.
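To show why a larger timeout helps, here is a minimal sketch of a wait-with-timeout loop. It is not the contents of train_common.py; `wait_for_signal`, `is_ready`, and the default values are hypothetical names and numbers chosen for this example. The log above reports "Pull Experience 19823ms", so any wait shorter than roughly 20 seconds would give up before the experience arrived on that machine.

```python
import time


def wait_for_signal(is_ready, timeout_seconds=10.0, poll_interval=0.001):
    """Poll is_ready() until it returns True or timeout_seconds elapse.

    Returns True if the signal arrived in time and False on timeout.
    All names and defaults here are illustrative assumptions.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        if is_ready():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)


if __name__ == "__main__":
    start = time.monotonic()

    # Stand-in for "experience is ready": becomes true after 0.2 seconds.
    def slow_producer():
        return time.monotonic() - start >= 0.2

    print(wait_for_signal(slow_producer, timeout_seconds=0.05))  # too short
    print(wait_for_signal(slow_producer, timeout_seconds=1.0))   # long enough
```

The same producer fails with the short timeout and succeeds with the long one, which is the behavior the suggested edit relies on.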

We will work on making the error messages clearer for this type of issue and seeing what we can do to make this easier to avoid.

@Deathcalibur

Hi Brendan,
I’m using UE5 Main and was able to follow the tutorial to the end. However, I believe there is one change in the recent version.

In the trainer, the InCritic type has changed, so I can’t connect CriticSettings to it, and there is also a missing node on RunTraining. Any idea how this can be fixed? Thank you.

UE5 Main is a work-in-progress with a lot of in-flight changes so you may find some things are broken. We will try to keep it in a good state between every commit but can’t guarantee it will be stable.

Thanks for your reply. I will leave it for now, as I have no idea how to fix it.

@Deathcalibur When can we expect a Mac version of Learning Agents that doesn’t involve fiddling with paths in the engine source code?

I have access to a Mac testing machine now, so I will work on fixing it on UE5 Main, hopefully sometime this month or next.

Hey Brendan, how’s it going with the Mac support? I would love to be able to set up my MacBook as the training machine so I can keep using my desktop for other stuff :slight_smile:

Thanks for the tutorial! Is there any way to do multi-GPU training, or are there any plans to support this?

Not currently supported but something we can keep in mind. Thanks!