Learning Agents 5.4 question: adding obstacles to the tutorial track

First of all, thanks to anyone who helps me with this topic. I copied this message from the Learning Agents 5.4 course comments because it makes more sense for it to live on the forum; sorry for the double post.

POST1

I have a question about the observations: I'm not sure if I'm understanding them correctly.

I tried to add obstacles to the track, and if a car collides with them, it should receive negative rewards to encourage avoiding the obstacles. I created an actor that produces overlaps and stores information about who collided with it.

Interactor:

  • In the “specify agent observation” section, I added a location observation and constructed an array observation to add to the map containing the track and car observations. Then, I added a pin there called “Obstacles.”
  • In the “gather agent observation” section, I added a pin to the sequence to gather the observations of the obstacles.

I didn’t change anything in the actions of the agent because I thought the actions should remain the same.
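To make the idea of the obstacle observation concrete, here is a minimal Python sketch of one common way to encode it (this is purely illustrative and not the actual Learning Agents API or Blueprint nodes; the function name, the 2D coordinates, and the `max_obstacles` padding are all assumptions). Encoding obstacle positions relative to the agent, with a fixed array size, tends to be easier for the policy to learn than absolute world coordinates:

```python
def gather_obstacle_observations(agent_location, obstacle_locations, max_obstacles=8):
    """Return a fixed-size, agent-relative list of obstacle positions.

    Positions are encoded relative to the agent so the policy does not have
    to learn absolute world coordinates. Missing slots are zero-padded so the
    array observation always has the same shape.
    """
    ax, ay = agent_location
    relative = []
    for ox, oy in obstacle_locations[:max_obstacles]:
        relative.append((ox - ax, oy - ay))
    # Zero-pad so the observation size is constant every step.
    while len(relative) < max_obstacles:
        relative.append((0.0, 0.0))
    return relative
```

If your observation uses absolute obstacle locations instead of relative ones, that alone could make training much harder, which might be part of why the score plateaus.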

Trainer:

  • I updated the reward function based on whether an obstacle was hit or not, giving -1 per obstacle hit by the agent.
  • I also updated the reset function to return the obstacles to a clean state.
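Conceptually, the reward change described above amounts to something like this (an illustrative Python sketch, not the actual Blueprint or Learning Agents reward nodes; `track_progress_reward` stands in for whatever the tutorial's existing reward produces, and `penalty_per_hit` is an assumed tuning knob):

```python
def gather_agent_reward(track_progress_reward, obstacles_hit, penalty_per_hit=-1.0):
    """Combine the tutorial's existing track reward with an obstacle penalty.

    obstacles_hit is the number of obstacles the agent overlapped since the
    last step; each one contributes penalty_per_hit to the total reward.
    """
    return track_progress_reward + penalty_per_hit * obstacles_hit
```

One thing worth checking is the relative magnitude: if the per-step track reward is small compared to a -1 penalty, a few collisions can dominate the return and cap the score well below the obstacle-free baseline.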

After these changes, I broke it, haha. Now the score doesn't go above 0.85, no matter how long I let it run.

Could it be that my reward function or my observations are wrong? It would be nice to have an example of how to gather observations of objects on the track that are not part of the spline. I tried following the example of the robot with the gun and so on, but I don’t know if I’m doing something wrong.

POST2

These are the obstacles. They are fixed in place, just to test whether the agents can avoid something that I put on the track:

The obstacle actor:

Now the interactor:

Specify Agent Observations function:

I added the obstacle observation part, plus the entry in the map for the obstacles.

Gather Agent Observations function:

I added this as a step of the sequence.

I added this to add the observations gathered in the sequence to the map.

Now the trainer:

Gather Agent Reward function:

I added this as part of a sequence to detect when the agent hit any of the obstacles; when that happens, it adds the negative reward.

Reset Agent Episode function:

I added the reset of the obstacles so that a hit from one episode doesn't carry over into the next.
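The per-episode reset can be sketched like this (illustrative Python only, not the actual Blueprint logic; the `Obstacle` class, `hit_by` set, and agent-id bookkeeping are all assumptions about how the hit state might be stored):

```python
class Obstacle:
    """Minimal stand-in for the obstacle actor's hit bookkeeping."""

    def __init__(self):
        self.hit_by = set()  # ids of agents that overlapped this obstacle

    def register_hit(self, agent_id):
        self.hit_by.add(agent_id)


def reset_agent_episode(agent_id, obstacles):
    """Clear this agent's hit records so its new episode starts clean,
    without disturbing hit records belonging to other agents."""
    for obstacle in obstacles:
        obstacle.hit_by.discard(agent_id)
```

Tracking hits per agent id matters when many agents train in parallel: resetting one agent's episode shouldn't wipe out collision records for the others.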

I also modified the Reset function of the pawn so it doesn't land on top of an obstacle when the pawns are moved at the beginning of the episode.

And those were my modifications. I was able to run it headless, use TensorBoard, etc… The only remaining step is restarting from a snapshot to continue training, but that is something I can try later.

Thanks to anyone that can give me some light to the topic! :slight_smile:


Thanks for this post, I was just wondering if you've considered using raycasts in the observations?

Hello,

Nope, I didn't. I will try different ways to collect the observations of the obstacles, but as I said, I don't know what I'm doing wrong with the obstacles to get such bad behavior.

This looks good as far as I can tell.

I added the reset of the obstacles so that a hit from one episode doesn't carry over into the next.

For the rewards, it probably makes more sense to use OnActorEndOverlap? I guess having the Reset is good too, to make sure it's cleared out.

I have a similar setup to yours internally which I haven't had time to write a tutorial about, and I found that it worked well with raycasting but not with the array observation. I'm not sure if this is due to a bug in the array observation (although I independently use that observation in a different demo and it worked well there, so I don't believe it's bugged). So I think it might be a representation issue in the observations, or a training issue! Not sure yet.
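For anyone curious what a raycast-style observation looks like in principle, here is a small self-contained Python sketch (again, not the Learning Agents API or Unreal's actual line traces; obstacles are modelled as circles in 2D, and all names and parameters are illustrative). Each ray reports a normalised distance to the nearest obstacle, which gives the policy a fixed-size, order-independent view of its surroundings:

```python
import math

def raycast_observations(agent_pos, agent_yaw, obstacles, num_rays=8, max_dist=1000.0):
    """Cast evenly spaced rays around the agent and return hit distances.

    obstacles is a list of (x, y, radius) circles. Each ray reports the
    distance to the nearest obstacle, normalised to [0, 1]; 1.0 means
    nothing was hit within max_dist.
    """
    distances = []
    for i in range(num_rays):
        angle = agent_yaw + 2.0 * math.pi * i / num_rays
        dx, dy = math.cos(angle), math.sin(angle)
        nearest = max_dist
        for ox, oy, radius in obstacles:
            # Obstacle centre relative to the ray origin.
            rx, ry = ox - agent_pos[0], oy - agent_pos[1]
            t = rx * dx + ry * dy  # projection of the centre onto the ray
            if t < 0.0:
                continue  # obstacle is behind the ray origin
            perp2 = (rx * rx + ry * ry) - t * t  # squared distance from ray
            if perp2 > radius * radius:
                continue  # ray passes outside the circle
            hit_t = t - math.sqrt(radius * radius - perp2)  # entry point
            if 0.0 <= hit_t < nearest:
                nearest = hit_t
        distances.append(nearest / max_dist)
    return distances
```

One reason this representation often trains better than a raw array of obstacle positions is that each ray slot always means the same thing ("distance ahead-left", "distance ahead", etc.), whereas array slots can shuffle as obstacles enter and leave range.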

Let me know what works for you though when you figure it out!

Brendan