Tutorial: Learning Agents Introduction

Hello again,

Great question!

It’s possible to use replays, but there are some major challenges. I’ve tried it (not with Lyra, though), so this is coming from first-hand experience:

  1. The UE built-in replay system does not directly record ground-truth actions, i.e. user inputs. For example, rotations of the player pawns will be the final result synced over the network, so if you have multiple things manipulating the rotations, it’s challenging to untangle them: weapon recoil pitching the gun up versus the user moving the mouse to pitch up. Depending on the problem you’re trying to solve, this may not be an issue for you.
    • You could try to solve this by learning an inverse dynamics model or by pushing up more data into the replays (I’m not an expert on how to do this - yet)
  2. The replays are not a deterministic simulation of the game, so querying the game state can be challenging as well. For example, at game time you might call “IsFiring()” on a Pawn to see if it’s currently firing its weapon, but at replay time this function might always return false because the backing code doesn’t need to run (only the special effects like bullet tracers actually get played back at replay time).
    • You can get creative with how you query game state but it’s a major pain.

How could you do it though?
Assuming you still want to try, you spawn a Learning Agents manager in the replay level, have it query the state you want through the interactor, and save off the data with the LA Data Recorder object. I haven’t done this recently, so it’s possible there is some regression in the LA design that makes this no longer work (please let me know if you try and run into problems).
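To make the shape of that concrete, here is a rough sketch of the pattern (the class and property names below are placeholders, not the actual Learning Agents Recorder API): an actor dropped into the replay level that samples some replicated pawn state each tick and dumps it to a file when play ends. In a real setup you would route the same queries through your interactor’s observation code and store the data with the LA Data Recorder instead of writing a CSV by hand.

```cpp
// ReplayStateDumper.h -- illustrative only; names are placeholders, not the
// Learning Agents API. In a real setup, query state through your
// ULearningAgentsInteractor and store it with the LA Data Recorder instead.
#pragma once

#include "CoreMinimal.h"
#include "GameFramework/Actor.h"
#include "GameFramework/Pawn.h"
#include "EngineUtils.h"
#include "Misc/FileHelper.h"
#include "Misc/Paths.h"
#include "ReplayStateDumper.generated.h"

UCLASS()
class AReplayStateDumper : public AActor
{
    GENERATED_BODY()

public:
    AReplayStateDumper() { PrimaryActorTick.bCanEverTick = true; }

    virtual void Tick(float DeltaSeconds) override
    {
        Super::Tick(DeltaSeconds);

        // Sample whatever replicated state survives into the replay.
        // Anything driven by non-replayed gameplay code (e.g. IsFiring())
        // may not be trustworthy here -- see the caveats above.
        for (TActorIterator<APawn> It(GetWorld()); It; ++It)
        {
            const FVector Loc = It->GetActorLocation();
            const FRotator Rot = It->GetActorRotation();
            Rows.Add(FString::Printf(TEXT("%s,%f,%f,%f,%f,%f,%f"),
                *It->GetName(), Loc.X, Loc.Y, Loc.Z,
                Rot.Pitch, Rot.Yaw, Rot.Roll));
        }
    }

    virtual void EndPlay(const EEndPlayReason::Type Reason) override
    {
        // Stand-in for the LA Data Recorder: dump a CSV into the Saved folder.
        const FString OutPath = FPaths::ProjectSavedDir() / TEXT("ReplayDump.csv");
        FFileHelper::SaveStringToFile(FString::Join(Rows, TEXT("\n")), *OutPath);
        Super::EndPlay(Reason);
    }

private:
    TArray<FString> Rows;
};
```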

I’m working on internally “solving” this issue and hope to be able to port the solution into the LA/replay system, but I have absolutely no idea on the time frame or if it’ll ever come!

Hope that answers your question!
Brendan

I did want to let you know we’ve run into issues where the engine will go to sleep because it’s waiting for the subprocess to finish writing, which isn’t great. If it were possible to have this not hitch the editor, and instead just buffer (so it could do this async) or stream that information to write to the subprocess, then that would help a ton!
We were seeing this with SharedMemoryTraining::RecvNetwork btw.
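For what it’s worth, the buffering idea could look roughly like the sketch below (plain standard C++, not the actual SharedMemoryTraining code): the blocking exchange with the trainer subprocess moves onto a background task, and the game thread just polls for the result each frame instead of stalling.

```cpp
// Sketch of the "don't block the game thread" idea. Not Learning Agents code;
// BlockingRecvNetwork is a hypothetical stand-in for whatever call currently
// waits on the trainer subprocess.
#include <chrono>
#include <future>
#include <optional>
#include <vector>

// Placeholder for the blocking call that stalls the editor today.
std::vector<float> BlockingRecvNetwork()
{
    return {};
}

class AsyncNetworkReceiver
{
public:
    // Kick off the blocking receive on a background task.
    void Start()
    {
        Pending = std::async(std::launch::async, &BlockingRecvNetwork);
    }

    // Called each frame on the game thread: returns the network if it has
    // arrived, otherwise nothing, without ever blocking.
    std::optional<std::vector<float>> Poll()
    {
        if (Pending.valid() &&
            Pending.wait_for(std::chrono::seconds(0)) == std::future_status::ready)
        {
            return Pending.get();
        }
        return std::nullopt;
    }

private:
    std::future<std::vector<float>> Pending;
};
```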


Async is in the works! How big are your obs/training batches that it’s causing the “engine to sleep”? I’ve never seen that, so I’m curious exactly what is happening.

This is occurring on the Learning to Drive 4.2 tutorial with even 1 agent training (I noticed it with 32 agents and tested with 1, and it would still occur). The project is on an SSD (Sabrent Rocket 4.0 Plus), the GPU is an RTX 3080, and the CPU is an AMD Ryzen 9 5950X; all of the hardware is about three months old, in case there is any concern that it’s a hardware failure.

I was following the tutorial to the letter, and this was happening after both the initial implementation and adding on the improved learning (I’m still having issues with my cars staying on the track after letting it train all night). I ran the training on the GPU and then switched to the CPU to see if it was an issue there; it occurs on both.

Hopefully this helps with diagnosing the issue, let me know if there’s any other info I can provide!


Unrelated to the hitching, I am also having an issue with getting TensorBoard to work with my project. I enabled it within the training settings in the Blueprint defaults and I am able to get it running (with some work around numpy being the incorrect version and dealing with that).


Is there a way I can observe or specify an action where instead of a float it will give me an integer? As well as an integer within a min/max range? I found exclusive discrete actions, but I’m only getting zero… this is on 5.4.2.

What exactly are you trying to accomplish with the integer?

You can maybe use one of the existing observations, or you could create your own observation if you have a C++ project.

(Sorry for the delay - been on summer break)

Once the new asset is created, we should give it an appropriate name. This manager will be used for RL training, so let’s name it “BP_RLTrainingManager”. You can open the blueprint graph if you like as we will be returning to build it up as we continue this tutorial.

Finally, place an instance of the manager on the “VehicleExampleMap”. The manager’s location is not important for this tutorial.

I’m using version 5.4.3 and this tutorial is really difficult to follow. For example, I created the Learning Agents Manager blueprint class and the parent turns out to be Learning Agents Manager in the blueprint’s Class Settings, although in the dropdown it’s nested under Actor, so even if I change it to Actor Component I can’t do the next step of placing it on the Vehicle Map. Can you clarify the gaps in the instructions?

It was more for testing out different approaches to building out schemas; I was able to get around it using floats for now.
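For anyone who hits the same question later, the float workaround generally amounts to clamping and rounding the float back into the integer range on the game side. A minimal sketch, with made-up helper names rather than anything from the Learning Agents API:

```cpp
// Sketch of the float workaround: the policy outputs/observes a float, and the
// game snaps it back to an integer inside a [Min, Max] range. Helper names are
// illustrative, not part of the Learning Agents API.
#include "CoreMinimal.h"
#include "Math/UnrealMathUtility.h"

// Map a float action in [0, 1] onto an integer in [MinValue, MaxValue].
static int32 FloatToRangedInt(float ActionValue, int32 MinValue, int32 MaxValue)
{
    const float Clamped = FMath::Clamp(ActionValue, 0.0f, 1.0f);
    return MinValue + FMath::RoundToInt(Clamped * (MaxValue - MinValue));
}

// The inverse, for feeding an integer back in as a float observation.
static float RangedIntToFloat(int32 Value, int32 MinValue, int32 MaxValue)
{
    return (float)(Value - MinValue) / (float)FMath::Max(1, MaxValue - MinValue);
}
```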