I’m creating this thread at @Deathcalibur’s request to discuss a performance problem I’ve been encountering with my Learning Agents system. Hopefully it can serve as a place to discuss these issues and share findings if we get to the bottom of it.
Since 5.3 launched, I’ve been working on a DeepMimic-like system that trains a 100% physics-based character to learn animations with the help of the Physics Control plugin. It’s taken a couple of weeks, but I’ve finally gotten the system to a good point and have begun training.
Problem: I’m working on a pretty high-end machine (i9-13900K, RTX 4090, 64 GB RAM, NVMe storage), yet I only get ~60 fps when training with a single agent. Unreal only seems to be consuming around 25% of both my GPU and CPU, which suggests the hardware isn’t saturated and the bottleneck is somewhere inside Unreal itself. For reference, with my Learning Agents systems disabled, I can run 48 of these Physics Control-driven skeletal meshes (target orientation and strength set every frame) before dipping below 60 fps, with my CPU at 50% utilization.
I understand that the system I’ve built is quite intricate and needs significantly more resources than the provided driving tutorial. My character is driven by 23 physics controls, and my interactor has 46 actions in total (target rotation and strength for each joint), 69 observations (angular velocity, linear velocity, and position for each anim reference spline), three rewards, and a couple of termination events. My learning manager ticks at 30 Hz.
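To put those counts in perspective, here is a rough back-of-the-envelope sketch of how many action and observation entries get processed per tick and per second. It assumes the 69 observations break down as 3 per physics control (which matches 23 × 3), and it counts entries rather than individual floats, so the real data volume is somewhat higher. The function name and structure are just for illustration.

```python
# Per-agent interface sizes, taken from the description above.
NUM_CONTROLS = 23
ACTIONS_PER_CONTROL = 2        # target rotation + strength
OBSERVATIONS_PER_CONTROL = 3   # assumption: angular vel, linear vel, position per control
TICK_HZ = 30                   # learning manager tick rate


def interface_load(num_agents: int) -> dict:
    """Count action/observation entries handled per tick and per second."""
    actions = NUM_CONTROLS * ACTIONS_PER_CONTROL            # 46
    observations = NUM_CONTROLS * OBSERVATIONS_PER_CONTROL  # 69
    per_tick = (actions + observations) * num_agents
    return {
        "actions": actions,
        "observations": observations,
        "per_tick": per_tick,
        "per_second": per_tick * TICK_HZ,
    }


if __name__ == "__main__":
    for agents in (1, 16):
        print(agents, interface_load(agents))
```

Even at 16 agents this comes out to a few tens of thousands of entries per second, which on its own shouldn’t be anywhere near enough to explain the slowdown.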
Even stranger, when I increase the number of agents to 16, my framerate tanks to around 5 fps, yet my CPU and GPU utilization drop to roughly 10% (from 25% with 1 agent). I’m really stumped as to what’s going on here. Again, this is with my learning manager ticking at 30 Hz. If I decrease this to 1 Hz, I get 100 fps, but 1 Hz obviously isn’t very useful for learning in this case.
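Converting those framerates into frame times makes the scaling easier to see. The sketch below is only a rough estimate: it assumes frame time grows linearly with agent count, and the fps figures are the approximate ones quoted in this post. The helper names are made up for this example.

```python
def frame_ms(fps: float) -> float:
    """Convert a framerate into milliseconds per frame."""
    return 1000.0 / fps


def marginal_cost_ms(fps_a: float, agents_a: int, fps_b: float, agents_b: int) -> float:
    """Extra frame time per additional agent, assuming linear scaling."""
    return (frame_ms(fps_b) - frame_ms(fps_a)) / (agents_b - agents_a)


if __name__ == "__main__":
    # Training: ~60 fps with 1 agent, ~5 fps with 16 agents.
    per_agent = marginal_cost_ms(60, 1, 5, 16)
    # Baseline without Learning Agents: 48 characters still hold 60 fps,
    # so each character costs at most this much frame time.
    per_character = frame_ms(60) / 48
    print(f"extra frame time per added agent:  {per_agent:.2f} ms")
    print(f"physics-only cost per character:   {per_character:.2f} ms (upper bound)")
```

Under those assumptions, each extra agent adds roughly 12 ms to the frame, while a physics-only character costs about 0.35 ms at most. So nearly all of the per-agent cost seems to come from the learning side rather than from the Physics Control simulation.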
Paths to begin troubleshooting: Deathcalibur mentioned that they’ve built even more intensive scenarios with Learning Agents without much trouble, so there’s likely an issue somewhere in my code that I can uncover with Unreal Insights. I’ll start looking into this, but since I haven’t used Unreal Insights before, I’d appreciate any suggestions.