To clarify the extension of the collision-detection completion logic: in the case where I am checking each bone of a skeletal mesh (within the agent Blueprint) for a collision, and any bone other than foot_l or foot_r contacts the actor tagged ‘ground’, do I store that in a variable (like the set you used), trigger the completion if a contact is within that set, and reset the episode?
Pretty much, but to be clear: you don’t need to call any function from the pawn to have Learning Agents check the completion. Rather, have the GatherAgentCompletion function inspect (or call functions on) the pawn, then do a MakeCompletion on Condition, making sure the condition you care about is true when you want the episode to complete.
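The bone check itself can live in a small helper on the pawn that GatherAgentCompletion calls. A minimal standalone sketch of that condition, using hypothetical names (the real version would read contact events from the skeletal mesh):

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: given the names of bones currently contacting the
// 'ground'-tagged actor, decide whether the episode should complete.
// Only the two feet are allowed to touch the ground.
bool ShouldCompleteEpisode(const std::vector<std::string>& ContactingBones)
{
    static const std::unordered_set<std::string> AllowedBones = {"foot_l", "foot_r"};
    for (const std::string& Bone : ContactingBones)
    {
        if (AllowedBones.find(Bone) == AllowedBones.end())
        {
            return true; // a non-foot bone touched the ground -> terminate
        }
    }
    return false;
}
```

GatherAgentCompletion would then feed this boolean straight into MakeCompletion on Condition.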
To clarify, for a velocity-similarity reward: do I want to set up the velocity-similarity logic in the agent itself if I wish to track the elapsed ticks or elapsed steps?
If I want the value of the reward to scale with the magnitude of the difference between the target and actual velocity, do I set that up in my reward logic (wherever it goes) and feed the resulting value into a MakeReward node?
Both of these should be set up within the GatherAgentReward function. The reward function has a reference to the agent, so you can inspect the agent’s velocity there. If the logic is complicated, consider making a helper function on the pawn.
The part to make clear is that you shouldn’t be trying to call MakeReward from inside the pawn; do this inside the Trainer object’s Gather function.
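One common way to make a reward shrink as the velocity difference grows is an exponential falloff on the distance between the two vectors. A hedged sketch (the function name, the falloff shape, and the stand-in vector type are all assumptions, not the plugin’s API):

```cpp
#include <cmath>

struct FVec3 { float X, Y, Z; }; // stand-in for Unreal's FVector

// Hypothetical velocity-similarity reward: returns 1.0 when the actual
// velocity matches the target exactly, and decays exponentially toward
// 0 as the difference grows. FalloffScale controls how quickly it decays.
float VelocitySimilarityReward(const FVec3& Target, const FVec3& Actual,
                               float FalloffScale = 1.0f)
{
    const float DX = Target.X - Actual.X;
    const float DY = Target.Y - Actual.Y;
    const float DZ = Target.Z - Actual.Z;
    const float Distance = std::sqrt(DX * DX + DY * DY + DZ * DZ);
    return std::exp(-Distance / FalloffScale);
}
```

The returned value is what you would wire into the MakeReward node inside GatherAgentReward.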
Now, is the Reward Scale value intended to normalize rewards, in the event that a dev wants the sum of various rewards to scale to 1, or does it have another purpose?
If you use the MakeReward, the internal function is doing:
const float Reward = RewardValue * RewardScale;
Usually you make the scale a fixed value and let the value itself vary, but yes, it’s essentially just there to remind you to normalize the rewards a bit. Obviously you can do the normalization yourself if you leave the scale at 1.0.
Thank you very much for the assistance, I will hopefully have this working by tomorrow.
By the way, once I really dove into the observations and actions, I couldn’t help but be very impressed with how powerful this API is turning out to be. I am super excited about it.
Good luck and thanks!