Tutorial: Learning to Drive

My class defaults for BP_RLTrainingManager:

And no, I didn't modify Brake.

How long did your training take?
You have to wait 2–3 hours.
At first, the “weird behavior” is normal, as the agent is still more in exploration mode than exploitation.

After 2 hours they still have the same behavior. Thanks for your reply. I will try again from the start. Just one last question: are you using UE 5.3?

yes!

The lesson says that it is better to use termination, yet at the same time truncation is used.
I’m confused.

Good catch. I fixed the blueprint 🙂

Please also fix this part in the tutorial, since at the beginning of training Reinitialize Policy Network should be false.

This needs to be true the first time you run the network; otherwise I think your network will not have randomly initialized weights. I could be wrong, but I believe our brand-new networks are merely zero-initialized.
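For reference, this corresponds to the Reinitialize Policy Network pin on the Run Training node. A rough C++ sketch of the idea, with the parameter list reconstructed from the node’s pins as I remember them in 5.3 (check LearningAgentsTrainer.h in your engine version for the exact signature; the manager class and settings members here are placeholders):

```cpp
// Sketch only: parameter names/order are from memory and may differ per engine version.
// Pass true on the very first run so the policy gets fresh randomly initialized weights;
// pass false afterwards so training resumes from the existing network snapshot.
void AMyTrainingManager::Train(const bool bIsFirstRun)
{
    Trainer->RunTraining(
        TrainingSettings,                       // FLearningAgentsTrainerTrainingSettings
        GameSettings,                           // FLearningAgentsTrainerGameSettings
        PathSettings,                           // FLearningAgentsTrainerPathSettings
        CriticSettings,                         // FLearningAgentsCriticSettings
        /*bReinitializePolicyNetwork=*/ bIsFirstRun,
        /*bReinitializeCriticNetwork=*/ bIsFirstRun,
        /*bResetAgentsOnBegin=*/ true);
    // Trainer and the settings members are assumed to be set up elsewhere.
}
```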

@Deathcalibur Hey there! I’ve got a few questions but first wanted to say thank you for the awesome work behind this plugin! It really is incredibly well thought through and I’ve been having a blast working with it.

Since 5.3 launched I’ve been working on a DeepMimic-like system to train a 100% physics-based character to learn animations with the help of the Physics Controller plugin. It’s taken a couple weeks, but I’ve finally gotten the system to a good point and have begun training.

Problem: I’m working on a pretty high-end machine (i9 13900K, RTX 4090, 64 GB RAM, NVMe storage), and I’m only getting ~60 fps when training with a single agent. Unreal only seems to be consuming around 25% of both my GPU and CPU, which would indicate that Unreal itself is somehow the bottleneck here. For reference, with my Learning Agents systems disabled, I can run 48 of these Physics Control-driven (target orientation & strength set every frame) skeletal meshes before dipping below 60 fps, and my CPU is at 50% utilization.

I understand that the system I’ve built is quite intricate and requires significantly more resources than the provided driving tutorial: my character is driven by 23 physics controls, my interactor has 46 total actions (Target Rotation & Strength for each joint) and 69 observations (angular velocity, linear velocity, and position for each anim reference spline), plus three rewards and a couple of termination events, with my learning manager ticking at 30 Hz.

Edit 9/23/2023: The plot thickens with performance… When increasing the number of agents to 16, my framerate tanks to around 5 fps, but my CPU and GPU utilization decrease to roughly 10%, from 25% with 1 agent. I’m really stumped as to what’s going on here. Keep in mind this is with my Learning Manager ticking at 30 Hz. If I decrease this to 1 Hz, I get 100 fps, but 1 Hz obviously isn’t very constructive for learning in this case.

Question 1: Seeing as this is pretty computationally intensive, are there any underlying reasons why my hardware might not be more heavily utilized during training? I understand it may not be straightforward to diagnose without seeing everything, but with so many variables at play here, training with just a single agent will take an incredibly long time, and it seems like I’ve got headroom to spare for a few more agents.

Question 2: If you had to guess, how long do you think it would take to sufficiently train a system like this, and with what kind of hardware/scaling infrastructure? It’s been really difficult to find answers to this elsewhere in relation to DeepMimic.

Edit 9/23/2023: From DeepMimic’s GitHub repo, I found that they were able to train in a day using 16 agents with 60 million samples (iterations?) on a single GPU (I presume). This makes me believe this should definitely be doable with Learning Agents; we just need to figure out what’s currently causing this bottleneck.

Question 3: Are there any Policy/Trainer/Training settings that would stand out to you as being the most important to tune for something like this in order to prevent overfitting and optimize for performance?

@BLRRD

Since 5.3 launched I’ve been working on a DeepMimic-like system to train a 100% physics-based character to learn animations with the help of the Physics Controller plugin. It’s taken a couple weeks, but I’ve finally gotten the system to a good point and have begun training.

I would be so interested in seeing your results from this!

Interesting. Loading Unreal in headless mode might help; try using -nullrhi -nosound.
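For example, a standalone training run could be launched with something like `MyProject.exe MyTrainingMap -game -nullrhi -nosound -log` (project and map names here are placeholders): -nullrhi skips rendering entirely and -nosound drops the audio device, which frees up frame time for simulation and training.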

Hi Brendan, I am new to ML in Unreal and a little confused about the setup. I am getting this error at the section “Registering Agents with your manager”. I copied the Blueprint snippet but can’t seem to fix this error:

LogBlueprint: Error: [AssetLog] D:\UnrealProjects\LearningAgentDrive\Content\VehicleTemplate\Blueprints\SportsCar\SportsCar_Pawn.uasset: [Compiler] The property associated with Agent Id could not be found in '/Game/VehicleTemplate/Blueprints/SportsCar/SportsCar_Pawn.SportsCar_Pawn_C'

Same issue

You didn’t create the variable correctly. You’re supposed to right click the SET Agent Id node and choose Create variable ‘AgentId’, but it looks like you accidentally right clicked the input pin on the SET node instead and chose to promote it to a new separate variable. Doing that also broke the connection between the Add Agent and SET nodes.

Hi, what is the Rotation Array observation node for? How is it different from its non-array counterpart?

Hey, thanks for the kind words and for checking out Learning Agents.

Can you spin this topic out into its own thread?

I think we need to use Unreal Insights to try to find the bottleneck in your code. Unreal Insights in Unreal Engine | Unreal Engine 5.3 Documentation

We have a demo environment we have tested internally that is more intense than your setup, and it works well, so it’s likely there is some small thing that is not implemented well and is causing the hang-up. With the information from the Unreal Insights CPU profile, it should be possible to diagnose (but you’ll need to get familiar with how to use Unreal Insights if it’s new to you).
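If the built-in scopes aren’t granular enough, you can also wrap suspect functions in your own named trace scopes. This is a standard engine macro, nothing Learning Agents-specific; the function below is just a placeholder for whatever you want to measure:

```cpp
#include "ProfilingDebugging/CpuProfilerTrace.h"

void UMyInteractor::GatherObservations()
{
    // Emits a named event visible in the Unreal Insights timing view
    // when the game is run with -trace=default (or -trace=cpu,frame).
    TRACE_CPUPROFILER_EVENT_SCOPE(MyInteractor_GatherObservations);

    // ... expensive per-tick work to be measured ...
}
```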

Happy to help you further. Sounds interesting!

The array versions of observations make it easier to observe a lot of the given data. So the rotation observation lets you feed in one rotation, whereas the array version lets you have multiple.

You can achieve the same result as the array version by simply looping the non-array version in a for-loop, but the array versions will have slightly less overhead in terms of memory usage.
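Roughly, in C++ terms, a minimal sketch of the 5.3 object-based observation API as I remember it (the exact class names and Add/Set signatures may differ, and GetCarRotation/SampleTrackRotations are illustrative helpers, not engine functions):

```cpp
// Declare once during setup: one scalar rotation vs. an array of 5 rotations.
void UMyInteractor::SetupObservations_Implementation()
{
    CarRotation = URotationObservation::AddRotationObservation(this, TEXT("CarRotation"));
    TrackRotations = URotationArrayObservation::AddRotationArrayObservation(
        this, TEXT("TrackRotations"), /*RotationNum=*/ 5);
}

// Fill per agent per step: one value vs. a whole batch in a single call.
void UMyInteractor::SetObservations_Implementation(const TArray<int32>& AgentIds)
{
    for (const int32 AgentId : AgentIds)
    {
        CarRotation->SetRotationObservation(AgentId, GetCarRotation(AgentId));
        TrackRotations->SetRotationArrayObservation(AgentId, SampleTrackRotations(AgentId)); // TArray<FRotator>
    }
}
```

Either way the policy sees the same numbers; the array version just avoids a separate observation object per element.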

Awesome, thank you Brendan!

I’ve gone ahead and created a new thread and tagged you in it. I’ve also started doing some digging using Unreal Insights (thanks for the suggestion!), and it looks like the function GetPositionAtSplineDistance is definitely the culprit for the long frame times. I’ve added some more details in the new thread for some more context.
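In case it helps anyone hitting the same wall: spline sampling is surprisingly expensive per call, and a common mitigation is to pre-sample the spline once and read from a cache every tick. A minimal sketch using the stock USplineComponent API (the owning class and CachedPositions member are illustrative):

```cpp
// Pre-sample the spline once so per-tick lookups become cheap array reads.
// Assumes NumSamples >= 2.
void AMyAnimReference::CacheSplinePositions(const USplineComponent* Spline, const int32 NumSamples)
{
    const float Length = Spline->GetSplineLength();
    CachedPositions.SetNum(NumSamples); // TArray<FVector> member
    for (int32 Index = 0; Index < NumSamples; ++Index)
    {
        const float Distance = Length * Index / (NumSamples - 1);
        CachedPositions[Index] = Spline->GetLocationAtDistanceAlongSpline(
            Distance, ESplineCoordinateSpace::World);
    }
}
```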

It’s normal; you’re still at the beginning of your training. But your average reward is 0, and I think that’s not normal. Maybe you need to check your rewards?

@Deathcalibur Thanks for replying. After my car agents mature into Jason Statham, can I edit the observations and actions to have more features, so my Jasons are still Jasons but can learn the new tricks I added? Another thing: what is the get_mesh_bone_positions node for?

Is there any way to speed up the training by utilizing more system resources? My CPU is sitting at 7–9% and my GPU at 30–39%, with what looks like a consistent spike up to 64–90% (I’m guessing at each episode).

Unless I’ve got something wired up wrong, that is. I had to increase the tick interval to 0.0083 (roughly 120 Hz, since 1 / 0.0083 ≈ 120), which fixed the issue I was previously having with the cars going super slow in reverse or forward.

I’m running a 32-core Threadripper 3970X with an RTX 4080 and 128 GB RAM.

Also, to really understand what’s going on here, you need to understand how deep reinforcement learning works. I highly recommend the RL course at Hugging Face: Welcome to the 🤗 Deep Reinforcement Learning Course - Hugging Face Deep RL Course
