Improving Learning Agents

First of all, thanks a lot for creating this plugin! It truly is a powerful addition and I am totally in love with it.

I have been working with it for quite a while now and I have run into some friction every now and then. This made me wonder whether there is any chance of getting some additions to the plugin?

My most wanted feature is an event delegate that we can hook up to, which is called whenever a new policy is received. Right now the only way to know for sure that (or when) a policy update has been received is to call EndTraining() manually, because that forces a policy update. But there is no way to find out that an updated policy was received when one of the agents' episode buffers filled up and ProcessExperience() triggered a training cycle.
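For illustration, here is a minimal sketch of what such a delegate could look like; FOnPolicyUpdated and OnPolicyUpdated are just names I made up, not existing plugin API:

// Hypothetical sketch -- the delegate type and member name are my own suggestions.
DECLARE_DYNAMIC_MULTICAST_DELEGATE(FOnPolicyUpdated);

// Member on ULearningAgentsTrainer, assignable from blueprints,
// broadcast whenever a new policy arrives from the Python trainer:
UPROPERTY(BlueprintAssignable, Category = "Learning Agents")
FOnPolicyUpdated OnPolicyUpdated;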

Thanks a billion! <3


I have come across another wish, which I cannot seem to fulfill with the latest version.

I have stated above that EndTraining() seems to actually run the training process and retrieve a new policy update. Well, it does not.

So right now there seems to be no way to force the training process to run, and there is also no way to be notified when a new policy is received.

Would it be possible to add a function that invokes the Python training process? I do not want to wait until all episode buffers are full; instead I would like to run the training myself, e.g. every 20 seconds. I suggest turning part of this function:

void ULearningAgentsTrainer::ProcessExperience(const bool bResetAgentsOnUpdate)
{
    ...

    if (bReplayBufferFull)
    {
        ... // << turn everything in here into its own function
    }

    ...
}

into its own function and exposing it to blueprints.
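A minimal sketch of what I have in mind, assuming a new function called RunTraining (a name I made up) and reusing the hypothetical OnPolicyUpdated delegate from my first post:

// In the header -- RunTraining is a hypothetical name for the
// extracted body of the if (bReplayBufferFull) branch:
UFUNCTION(BlueprintCallable, Category = "Learning Agents")
void RunTraining();

// In the source file:
void ULearningAgentsTrainer::RunTraining()
{
    // ... everything that currently lives inside the
    // if (bReplayBufferFull) branch: hand the gathered experience to
    // the Python process, run the training step, pull in the new policy.

    // This is also where the delegate from my first post could fire:
    OnPolicyUpdated.Broadcast();
}

void ULearningAgentsTrainer::ProcessExperience(const bool bResetAgentsOnUpdate)
{
    ...

    if (bReplayBufferFull)
    {
        RunTraining();
    }

    ...
}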

Would it also be possible to implement an event delegate that we can hook up to, which informs us when a new policy has been received? It would fit nicely in the function sketched above.

Much love <3


Update:

I ended up customizing the plugin code.

I have added an event delegate that can be bound from blueprints to find out when a new policy has been received. This lets me keep a “generation” counter that tracks how many times the model has been trained.
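As a rough usage sketch of what that looks like in my project (the manager class, the Trainer pointer and the Generation counter are all my own names):

// Hypothetical game-side usage of the delegate I added.
void AMyTrainingManager::BeginPlay()
{
    Super::BeginPlay();

    // Bind the handler (declared as a UFUNCTION() so AddDynamic works).
    Trainer->OnPolicyUpdated.AddDynamic(this, &AMyTrainingManager::HandlePolicyUpdated);
}

void AMyTrainingManager::HandlePolicyUpdated()
{
    // One broadcast per completed training step, so this counts
    // the model's "generations".
    ++Generation;
}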

I have also added the possibility to enable/disable agents, so agents can be excluded from the training process when needed, for example when they die. This way they do not lose their episode buffers and gathered experience, as they would if I temporarily removed them while dead. I do not want agents to participate in training while they are dead: no observation they make or action they take has any effect, so training a model on dead agents is not meaningful. I simply enable the actor's agent again once it respawns.
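A minimal sketch of that addition; the function name and behavior are my own, not existing plugin API:

// Hypothetical enable/disable function on ULearningAgentsTrainer.
UFUNCTION(BlueprintCallable, Category = "Learning Agents")
void SetAgentEnabled(const int32 AgentId, const bool bEnabled);

// A disabled agent is skipped when observations are gathered and
// experience is processed, but its episode buffer is left intact.
// On death:    Trainer->SetAgentEnabled(AgentId, false);
// On respawn:  Trainer->SetAgentEnabled(AgentId, true);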

If these additions would be useful for the main branch, I would be more than happy to contribute them to the GitHub repository of the plugin. Please let me know whether a contribution is wanted/helpful! 🙂