Now I am using the following observations and rewards:
Observations: https://blueprintue.com/blueprint/0jzdgbud/
Rewards: https://blueprintue.com/blueprint/olyxk-7z/
Observations:
- Goal position, distance and direction (position and distance both use a scale of 5000)
- Ray collision distance and location, plus a bool collision check
- Time and pawn velocity
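
To make the goal observation concrete, here is a rough Python sketch of how it could be scaled. This is not the Blueprint from the link above. The function name `goal_observation` is made up for illustration, and it assumes the goal position is taken relative to the pawn, which the list above does not state. Only the scale of 5000 comes from my setup.

```python
import math

SCALE = 5000.0  # scale used for position and distance, as listed above


def goal_observation(pawn_pos, goal_pos, scale=SCALE):
    """Hypothetical helper: scaled goal offset, scaled distance, unit direction."""
    # Offset from pawn to goal (assumption: the position is relative to the pawn).
    delta = [g - p for g, p in zip(goal_pos, pawn_pos)]
    distance = math.sqrt(sum(d * d for d in delta))
    # The direction is a unit vector; fall back to zeros when the pawn is on the goal.
    if distance > 0.0:
        direction = [d / distance for d in delta]
    else:
        direction = [0.0] * len(delta)
    return {
        "position": [d / scale for d in delta],
        "distance": distance / scale,
        "direction": direction,
    }
```

With this scaling, a goal 5000 units away gives a distance observation of about 1, so the network sees values of a similar size to the direction vector.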
Rewards:
- Time Penalty
- Goal Distance Reward
- Ray collision penalty (based on distance from walls)
- Collision penalty
- Direction penalty (if the angle between the last input and the forward vector is large, I apply a penalty, because I don’t want the pawn to steer too far left or right)
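
The direction penalty is the least obvious of these, so here is a minimal Python sketch of the idea. It is not the Blueprint from the link above. The function name `direction_penalty`, the 45 degree threshold and the -0.1 penalty value are all placeholders chosen for illustration, and both vectors are assumed to be 2D.

```python
import math


def direction_penalty(last_input, forward, max_angle_deg=45.0, penalty=-0.1):
    """Hypothetical helper: penalise a large angle between input and forward vector."""
    dot = last_input[0] * forward[0] + last_input[1] * forward[1]
    norm = math.hypot(*last_input) * math.hypot(*forward)
    if norm == 0.0:
        # No input (or no forward vector): the angle is undefined, so no penalty.
        return 0.0
    # Clamp to [-1, 1] so floating point error cannot push acos out of its domain.
    cos_angle = max(-1.0, min(1.0, dot / norm))
    angle_deg = math.degrees(math.acos(cos_angle))
    return penalty if angle_deg > max_angle_deg else 0.0
```

For example, an input pointing straight along the forward vector gives no penalty, while an input at 90 degrees to it gives the full penalty. A smoother variant could scale the penalty with the angle instead of using a hard threshold.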