Learning Agents Project Doesn't Work as Expected

Hi, I’m working on a Learning Agents project in Unreal Engine 5.4 as my graduation project, but I have a problem and I don’t know where I’m making a mistake. I need your help, guys. I have to finish this project as soon as possible.

Case

There is an agent, and my goal is to make the agent follow the player until the distance between the agent and the player is 2 meters. The agent should also rotate to face the player.

Problem

Rotation works fine after 15 minutes of training, but movement doesn’t work well even after 2 hours of training.

Specify Agent Observation

1-Specify agent location and direction observation.

2-Specify player location and direction observation.

Gather Agent Observation

1-Gather agent location and direction observation

2-Gather player location and direction observation

The Player Pos and Player Direction values are updated in the agent’s tick function. Player Direction is the normalized vector (player location - agent location).
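For readers without the Blueprint screenshots, here is a minimal Python sketch of the Player Direction calculation described above. The function name `direction_to_player` and the zero-length guard are assumptions for illustration, not part of the Learning Agents API.

```python
import math

def direction_to_player(agent_pos, player_pos):
    """Return the unit vector pointing from the agent to the player.

    Positions are (x, y, z) tuples. Hypothetical helper, not a plugin call.
    """
    dx = player_pos[0] - agent_pos[0]
    dy = player_pos[1] - agent_pos[1]
    dz = player_pos[2] - agent_pos[2]
    length = math.sqrt(dx * dx + dy * dy + dz * dz)
    if length < 1e-8:
        # Agent and player overlap, so there is no meaningful direction.
        return (0.0, 0.0, 0.0)
    return (dx / length, dy / length, dz / length)
```

Note that this vector points from the agent to the player, so it already encodes where the agent should go. It does not encode how far away the player is.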

Specify Agent Action

These are the specified agent actions.

Perform Agent Action

This is how I apply the action values.

1-Direction

2-Movement

Gather Agent Reward

1-Direction reward

2-Movement reward

These bool variables are updated in the agent’s tick function to determine whether the agent is moving towards the player or away from the player.
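One common way to derive such flags is to compare the agent-to-player distance between two consecutive ticks. The sketch below assumes that approach; the post does not show how the bools are actually computed, and the name `movement_flags` and the `tolerance` parameter are hypothetical.

```python
import math

def movement_flags(prev_distance, agent_pos, player_pos, tolerance=1.0):
    """Classify the agent's movement relative to the player for one tick.

    prev_distance is the agent-to-player distance measured on the previous tick.
    Returns (moving_towards, moving_away, current_distance).
    """
    current_distance = math.dist(agent_pos, player_pos)
    # The tolerance keeps tiny jitters from flipping the flags every tick.
    moving_towards = current_distance < prev_distance - tolerance
    moving_away = current_distance > prev_distance + tolerance
    return moving_towards, moving_away, current_distance
```

The returned `current_distance` would be stored and passed back in as `prev_distance` on the next tick.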

Agent Tick Function

Besides these movement rewards, I’ve also tried the make reward on location difference above threshold and make reward on location similarity functions separately, but neither worked as expected. Even after 2 hours of training, when I ran inference the agents moved away from the player in a direction that made no sense.

In Perform Agent Action, instead of using the XInput and YInput float values to make a direction vector, I used them directly in the AddMovementInput function: XInput scales forward movement and YInput scales right movement.
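As a rough illustration of what that change means, the Python sketch below combines the two action values with the agent's forward and right vectors in the ground plane. It assumes Unreal's convention of X forward and Y right at zero yaw; `movement_input` is a hypothetical helper and does not call the engine.

```python
import math

def movement_input(yaw_degrees, x_input, y_input):
    """Combine forward/right action scales into a 2D world-space move vector.

    Assumes forward = (cos yaw, sin yaw) and right = (-sin yaw, cos yaw).
    """
    yaw = math.radians(yaw_degrees)
    forward = (math.cos(yaw), math.sin(yaw))
    right = (-math.sin(yaw), math.cos(yaw))
    # Equivalent to adding forward * x_input and right * y_input as two inputs.
    return (forward[0] * x_input + right[0] * y_input,
            forward[1] * x_input + right[1] * y_input)
```

With this scheme the actions are relative to the agent's current facing, so the same XInput moves the agent in a different world direction whenever its yaw changes.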

I also updated my reward functions. There is no need to calculate the movement direction inside the tick function. I assume the default location-based reward functions (like make reward from location similarity) are better than mine.
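To make the idea of a location-based reward concrete, here is a simple stand-in written in Python. It is not the formula used by the Learning Agents reward functions; the name `proximity_reward`, the linear falloff, and the `scale` value are all assumptions, and the 200-unit stop distance assumes Unreal's default of 1 unit = 1 cm.

```python
import math

STOP_DISTANCE = 200.0  # 2 meters, assuming 1 Unreal unit = 1 cm

def proximity_reward(agent_pos, player_pos, scale=1000.0):
    """Return a reward in [0, 1] that grows as the agent nears the player.

    The reward is 1.0 anywhere inside STOP_DISTANCE and falls off linearly
    to 0.0 once the agent is `scale` units beyond that radius.
    """
    distance = math.dist(agent_pos, player_pos)
    excess = max(0.0, distance - STOP_DISTANCE)
    return max(0.0, 1.0 - excess / scale)
```

A dense reward like this gives the policy a signal on every step, without any per-tick bookkeeping about whether the agent is moving towards or away from the player.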

This is the result of approximately 1 hour of training.