Is it possible to run inference on pre-trained policies (each performing a sub-task) while training the main policy? For example, the typical case of using a pre-trained "low-level" controller as part of an RL workflow.
We designed Learning Agents to support multiple policies, so the setup you describe should be possible.
That said, we haven't had the opportunity to do extensive testing with this exact setup yet, so unforeseen issues may come up.
It should be as simple as training the low-level policy first, then loading its saved weights and running it for inference while you train the new policy.
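To make the pattern concrete, here is a minimal, framework-agnostic sketch in PyTorch rather than the Learning Agents API itself; the class names, dimensions, and snapshot path are hypothetical. The idea is the same either way: load the saved low-level weights, freeze that network so it only runs inference, and update only the high-level policy.

```python
# Sketch only: a frozen, pre-trained low-level controller runs inference inside
# the loop while the high-level policy is the only thing that gets updated.
# All names here (HighLevelPolicy, LowLevelController, "low_level.pt") are
# hypothetical stand-ins, not Learning Agents types.

import torch
import torch.nn as nn

OBS_DIM, GOAL_DIM, ACT_DIM = 16, 4, 6

class HighLevelPolicy(nn.Module):
    """Trainable policy: maps an observation to a goal/command for the controller."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(OBS_DIM, 64), nn.ReLU(), nn.Linear(64, GOAL_DIM))
    def forward(self, obs):
        return self.net(obs)

class LowLevelController(nn.Module):
    """Pre-trained controller: maps (observation, goal) to a primitive action."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(OBS_DIM + GOAL_DIM, 64), nn.ReLU(), nn.Linear(64, ACT_DIM))
    def forward(self, obs, goal):
        return self.net(torch.cat([obs, goal], dim=-1))

high_level = HighLevelPolicy()
low_level = LowLevelController()

# Load the weights saved from the earlier low-level training run (hypothetical
# path), then freeze the controller so it is inference-only from here on.
# low_level.load_state_dict(torch.load("low_level.pt"))
low_level.eval()
for p in low_level.parameters():
    p.requires_grad_(False)

optimizer = torch.optim.Adam(high_level.parameters(), lr=3e-4)

for step in range(200):
    obs = torch.randn(32, OBS_DIM)              # stand-in for environment observations
    goal_mean = high_level(obs)
    dist = torch.distributions.Normal(goal_mean, 1.0)
    goal = dist.sample()                        # high-level "action" = goal for the controller

    with torch.no_grad():                       # frozen controller: pure inference
        action = low_level(obs, goal)

    # Stand-in reward; in practice it comes from stepping the environment with `action`.
    reward = -action.pow(2).mean(dim=-1)

    # Simple REINFORCE-style update that touches only the high-level policy.
    loss = -(dist.log_prob(goal).sum(dim=-1) * reward).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The important part is the freeze-and-no_grad handling of the low-level network: it behaves exactly like any other fixed piece of the environment from the high-level policy's point of view.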
Brendan