Tutorial: Learning Agents Introduction

Yes, imitation learning would be important for human-like aiming. We currently provide behavior cloning, which is mostly what you would need for this.

In lieu of a formal tutorial for imitation learning:

  1. Set up an interactor - implement SetupObservations and SetObservations for data collection. Create the actions you want to track during SetupActions, and make the Action object variables public (see the interactor/controller sketch after this list).
  2. Create a blueprint from ULearningAgentsController and implement SetActions - this is like SetObservations but for actions. During SetActions, set the publicly exposed action objects via their SetXAction function (don’t accidentally try to set the variable itself)
  3. Create a blueprint from the ULearningAgentsRecorder - nothing needed in the BP Event Graph.
  4. Create a LearningAgentsRecording data asset (Miscellaneous->DataAsset->LearningAgentsRecording) and provide it during Recorder->SetupRecorder.
  5. Create a data collection manager from ULearningAgentsManager. During the manager’s Tick, call Interactor->EncodeObservations, Controller->EncodeActions, and Recorder->AddExperience (the last behind a branch on Recorder->IsRecording == true). After spawning, or whenever an enemy is found, call Recorder->BeginRecording; whenever the player gets a kill (or whatever end condition you choose), call Recorder->EndRecording. Every begin/end pair creates a Record in the Recording asset (see the data-collection manager sketch after this list).
  6. Create empty blueprints from ULearningAgentsImitationTrainer and ULearningAgentsPolicy.
  7. Implement GetActions on the original interactor if not already done.
  8. Create a separate manager for imitation learning with the Policy, Interactor, and ImitationTrainer components attached (it could be the same manager with some controls added). During its Tick, call ImitationTrainer->RunTraining and pass in the Recording asset that was populated above (see the training manager sketch after this list).
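
The steps above describe the Blueprint workflow, but the same structure works from C++ if you prefer. Below is a minimal sketch of steps 1, 2, and 7: an interactor with one observation and one action, plus a controller that records the human's input. Everything API-specific here is an assumption based on the 5.2-era Learning Agents headers - the UFloatObservation/UFloatAction helper classes, their Add/Set/Get functions, the _Implementation override signatures, and the include paths may all differ in your engine version, so verify against the plugin source before copying.

```cpp
// AimInteractor.h -- rough sketch only. The UFloatObservation / UFloatAction
// helpers, their Add/Set/Get functions, the _Implementation override
// signatures, and the include paths are assumptions; check your engine's
// LearningAgents headers.
#pragma once

#include "CoreMinimal.h"
#include "LearningAgentsInteractor.h"
#include "LearningAgentsController.h"
#include "LearningAgentsObservations.h"   // assumed header for observation helpers
#include "LearningAgentsActions.h"        // assumed header for action helpers
#include "AimInteractor.generated.h"

UCLASS(Blueprintable)
class UAimInteractor : public ULearningAgentsInteractor
{
    GENERATED_BODY()

public:
    // Step 1: declare the observations the network will see.
    virtual void SetupObservations_Implementation() override
    {
        // e.g. the target's yaw offset relative to the current aim direction
        TargetYawObs = UFloatObservation::AddFloatObservation(this, TEXT("TargetYaw"));
    }

    // Step 1: fill in the observation values each frame for each agent.
    virtual void SetObservations_Implementation(const TArray<int32>& AgentIds) override
    {
        for (const int32 AgentId : AgentIds)
        {
            const float YawToTarget = 0.0f; // compute from your pawn and its target
            TargetYawObs->SetFloatObservation(AgentId, YawToTarget);
        }
    }

    // Step 1: declare the actions you want to record and later learn.
    virtual void SetupActions_Implementation() override
    {
        YawInputAction = UFloatAction::AddFloatAction(this, TEXT("YawInput"));
    }

    // Step 7: read the actions back out when a trained policy drives the pawn.
    virtual void GetActions_Implementation(const TArray<int32>& AgentIds) override
    {
        for (const int32 AgentId : AgentIds)
        {
            const float YawInput = YawInputAction->GetFloatAction(AgentId);
            // apply YawInput to your pawn's aim here
        }
    }

    // Step 1: exposed publicly so the controller below can write into them.
    UPROPERTY(BlueprintReadOnly, Category = "Aim")
    TObjectPtr<UFloatObservation> TargetYawObs;

    UPROPERTY(BlueprintReadOnly, Category = "Aim")
    TObjectPtr<UFloatAction> YawInputAction;
};

// Step 2: the controller writes the *human's* input into the action objects
// so it can be recorded as a demonstration.
UCLASS(Blueprintable)
class UAimDemoController : public ULearningAgentsController
{
    GENERATED_BODY()

public:
    virtual void SetActions_Implementation(const TArray<int32>& AgentIds) override
    {
        for (const int32 AgentId : AgentIds)
        {
            const float HumanYawInput = 0.0f; // read from the player's input
            // Set the action via SetFloatAction -- don't assign the variable itself.
            Interactor->YawInputAction->SetFloatAction(AgentId, HumanYawInput);
        }
    }

    UPROPERTY(EditAnywhere, Category = "Aim")
    TObjectPtr<UAimInteractor> Interactor;
};
```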
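
Here is a similar sketch of the data-collection manager's Tick from step 5, written as an actor that holds references to the interactor, controller, and recorder. The function names (EncodeObservations, EncodeActions, IsRecording, AddExperience, Begin/EndRecording) are the ones from the steps above; the exact parameters they take (agent-id arrays, settings structs) are assumptions and may differ in your version.

```cpp
// DataCollectionManager.h -- sketch of the Tick logic from step 5. The call
// names come from the steps above; the parameters they take are assumptions.
#pragma once

#include "CoreMinimal.h"
#include "GameFramework/Actor.h"
#include "LearningAgentsInteractor.h"
#include "LearningAgentsController.h"
#include "LearningAgentsRecorder.h"
#include "DataCollectionManager.generated.h"

UCLASS()
class ADataCollectionManager : public AActor
{
    GENERATED_BODY()

public:
    virtual void Tick(float DeltaSeconds) override
    {
        Super::Tick(DeltaSeconds);

        // Encode what the agent saw and what the human actually did this frame.
        Interactor->EncodeObservations();
        Controller->EncodeActions();

        // Only store the frame while a recording is in progress.
        if (Recorder->IsRecording())
        {
            Recorder->AddExperience();
        }
    }

    // Hook these up to gameplay events; every Begin/End pair becomes one
    // Record inside the LearningAgentsRecording asset.
    void OnEnemyFound()    { Recorder->BeginRecording(); }
    void OnPlayerGotKill() { Recorder->EndRecording(); }

    UPROPERTY(EditAnywhere) TObjectPtr<ULearningAgentsInteractor> Interactor;
    UPROPERTY(EditAnywhere) TObjectPtr<ULearningAgentsController> Controller;
    UPROPERTY(EditAnywhere) TObjectPtr<ULearningAgentsRecorder>   Recorder;
};
```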
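
And the training manager from step 8 boils down to one call per Tick. RunTraining's full signature (training settings, path settings, etc.) varies by engine version, so only the Recording argument mentioned above is shown; treat everything else as something to fill in from your version's header.

```cpp
// ImitationTrainingManager.h -- sketch of step 8. Only the Recording
// argument from the post is shown; your version of RunTraining may take
// additional settings structs.
#pragma once

#include "CoreMinimal.h"
#include "GameFramework/Actor.h"
#include "LearningAgentsImitationTrainer.h"
#include "LearningAgentsRecorder.h"   // assumed location of ULearningAgentsRecording
#include "ImitationTrainingManager.generated.h"

UCLASS()
class AImitationTrainingManager : public AActor
{
    GENERATED_BODY()

public:
    virtual void Tick(float DeltaSeconds) override
    {
        Super::Tick(DeltaSeconds);

        // Feed the recorded demonstrations to the behavior cloning trainer.
        ImitationTrainer->RunTraining(Recording);
    }

    // The LearningAgentsRecording data asset filled during data collection.
    UPROPERTY(EditAnywhere) TObjectPtr<ULearningAgentsRecording> Recording;
    UPROPERTY(EditAnywhere) TObjectPtr<ULearningAgentsImitationTrainer> ImitationTrainer;
};
```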

This looks like a lot of steps written out, but structurally it shouldn’t take long to set up. Deciding how to design the interactor is the hardest part.

The biggest challenge is that aiming in a human-like way takes place over many frames, so you need some kind of trajectory information, i.e. a memory. Currently this can be achieved with a time-lagged MLP: feed a short window of past observations alongside the current one. For example, if you care about the target’s position (seems necessary for aiming lol) and you want 1 second of history at 30 FPS, you can feed the time dimension as separate columns: Pos_Time0, Pos_Time1, Pos_Time2, and so on. Managing that many observations is a bit cumbersome right now.
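
For concreteness, here is one way to maintain that history: keep a fixed-length buffer of the target's position per agent, push into it every frame, and copy each slot into its matching Pos_TimeN observation during SetObservations. The struct below is purely illustrative (plain C++, no Learning Agents API) and assumes 30 samples of history.

```cpp
// TargetHistory.h -- sketch of a per-agent, fixed-length position history
// for the "time-lagged MLP" observation layout described above.
#pragma once

#include "CoreMinimal.h"

// 1 second of history at 30 FPS -> 30 lagged position columns.
struct FTargetHistory
{
    static constexpr int32 NumSteps = 30;

    FTargetHistory()
    {
        Positions.Init(FVector::ZeroVector, NumSteps);
    }

    // Call once per frame with the target's current position.
    void Push(const FVector& TargetPosition)
    {
        // Shift everything back one slot: index 0 is the newest sample
        // (Pos_Time0), index NumSteps-1 the oldest (Pos_Time29).
        for (int32 i = NumSteps - 1; i > 0; --i)
        {
            Positions[i] = Positions[i - 1];
        }
        Positions[0] = TargetPosition;
    }

    // During SetObservations, copy each slot into its matching observation
    // object, e.g. PosObs[i]->Set...Observation(AgentId, Positions[i]).
    TArray<FVector> Positions;
};
```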

Anyway, hope this helps!
