Over the last couple of weeks, we pushed out some changes that reduce room for bugs and simplify most of the setup code. We're mainly in QA mode for the near future and are looking for any feedback the community has on the plugin.
Summary of Latest Changes:
- Components no longer hold references to agent ids. Instead, each component uses its associated manager and operates on all agents that have been added to that manager. You can get functionality similar to what existed before by using multiple managers. We feel this design is less error-prone and results in a much more elegant implementation in the codebase with fewer edge cases.
- Order-of-operation checks: the lower-level training/inference functions (EncodeObservations, ProcessExperience, etc.) now detect and log a message any time the expected operations were not run in the correct order. For example, observations, actions, and rewards are expected to be collected before calling ProcessExperience (otherwise the data is incomplete). This prevents bugs where data could have been stale or left at default values. The same checks address many other edge cases.
- Manager setup is now automatic, and components automatically find the manager they are attached to.
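
To make the order-of-operation checks and the "manager owns all agents" design more concrete, here is a minimal Python sketch of the idea. This is not the plugin's code or API: `Manager`, `Step`, `add_agent`, `mark`, and `process_experience` are hypothetical names made up for illustration, and the real functions may track state quite differently.

```python
from enum import Enum, auto


class Step(Enum):
    """Per-step operations that must run before experience is processed."""
    OBSERVATIONS = auto()
    ACTIONS = auto()
    REWARDS = auto()


class Manager:
    """Hypothetical manager: owns every agent and tracks which steps have run."""

    def __init__(self):
        self.agents = []        # all agents added to this manager
        self.completed = set()  # steps finished since the last successful process
        self.log = []           # stand-in for the plugin's log output

    def add_agent(self, name):
        self.agents.append(name)

    def mark(self, step):
        # Called once a step has been run for all agents on this manager.
        self.completed.add(step)

    def process_experience(self):
        # Refuse to run (and log why) if any prerequisite step was skipped,
        # instead of silently consuming stale or default data.
        missing = [s.name for s in Step if s not in self.completed]
        if missing:
            self.log.append("process_experience skipped: missing " + ", ".join(missing))
            return False
        self.completed.clear()  # start fresh for the next step
        return True


manager = Manager()
manager.add_agent("agent_0")
manager.add_agent("agent_1")

manager.mark(Step.OBSERVATIONS)
ok_early = manager.process_experience()  # actions and rewards not collected yet

manager.mark(Step.ACTIONS)
manager.mark(Step.REWARDS)
ok_late = manager.process_experience()   # all prerequisites satisfied
```

In this sketch the first call is rejected and logged because actions and rewards were never collected, while the second call succeeds. Components in this model would hold only a reference to their manager and loop over `manager.agents`, rather than storing agent ids themselves.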
Thanks!