I am trying to implement a reward function that depends on the observations made during that step. Basically, when an observation is made, I need to retrieve it in order to calculate the reward for my agent.
Is there a way to do this?
The simplest approach: cache the values yourself. You’re the one computing the observations in GatherAgentObservation, so just store them in member variables on your interactor (or trainer) and read them back in GatherAgentReward.
There’s currently no built-in way to retrieve the observations.
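To illustrate the caching idea, here is a minimal sketch in plain C++. It is not the Learning Agents API: `CachedObservation`, `ObservationCache`, `Store`, `TryGet`, and `ComputeReward` are all hypothetical names, and the observation fields are made up for the example. In a real project the map would presumably be a member of your interactor or trainer, filled where you build the observation and read where you compute the reward.

```cpp
#include <unordered_map>

// Hypothetical stand-in for the values you compute when observing an agent.
// The fields are illustrative only; use whatever your reward actually needs.
struct CachedObservation
{
    float DistanceToTarget = 0.0f;
    float Speed = 0.0f;
};

// Hypothetical cache keyed by agent id. In practice this would be a member
// variable on your interactor (or trainer), not a standalone class.
class ObservationCache
{
public:
    // Call this where you build the observation,
    // e.g. inside your GatherAgentObservation override.
    void Store(int AgentId, const CachedObservation& Obs)
    {
        Cache[AgentId] = Obs;
    }

    // Call this where you compute the reward,
    // e.g. inside your GatherAgentReward override.
    // Returns false if nothing has been cached for this agent yet.
    bool TryGet(int AgentId, CachedObservation& OutObs) const
    {
        auto It = Cache.find(AgentId);
        if (It == Cache.end())
        {
            return false;
        }
        OutObs = It->second;
        return true;
    }

private:
    std::unordered_map<int, CachedObservation> Cache;
};

// Example reward: penalize distance to the target, using the cached values.
// Falls back to a neutral reward when no observation has been stored.
float ComputeReward(const ObservationCache& Cache, int AgentId)
{
    CachedObservation Obs;
    if (!Cache.TryGet(AgentId, Obs))
    {
        return 0.0f;
    }
    return -Obs.DistanceToTarget;
}
```

This only works if the observation for a step is stored before the reward for that step is computed, so it is worth checking the call order in your own setup. The `TryGet` fallback covers the case where the reward runs before any observation has been cached.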