Hi,
A related question: how do you handle invalid moves? (Although perhaps Connect 4 doesn’t have invalid moves.) The framework doesn’t seem to support masking out invalid moves, and I am having a hard time getting my agent to learn what an invalid move is.
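To make the masking idea concrete, here is a minimal, framework-independent sketch of what I mean. The names `masked_argmax`, `scores`, and `valid_actions` are hypothetical and not part of the framework's API. It assumes the policy produces one score per action and that the environment can report which action indices are currently legal.

```python
import math


def masked_argmax(scores, valid_actions):
    """Pick the highest-scoring action, considering only valid actions.

    scores: list of floats, one per action (hypothetical policy output).
    valid_actions: iterable of action indices that are currently legal.
    """
    valid = set(valid_actions)
    if not valid:
        raise ValueError("no valid actions available")
    # Replace the score of every invalid action with -inf so it can never win.
    masked = [s if i in valid else -math.inf for i, s in enumerate(scores)]
    return max(range(len(masked)), key=masked.__getitem__)


# Action 0 has the highest raw score but is not legal, so action 2 is chosen.
action = masked_argmax([0.9, 0.1, 0.5], valid_actions=[1, 2])
```

With something like this applied at inference time, the agent would never emit an invalid move, so the game would not get stuck waiting for a later step to catch it.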
Part of the issue is that the invalid move happens at Run Inference, which is the last step of the train chain (completion → reward → process experience → run inference). That leaves me in a situation where the agent has tried to make an invalid move, so the game cannot proceed. I would catch this in the completion step, but completion doesn’t run until the next time the agent tries to move (and before that the other player, which is just random, also needs to move).
Do you have suggestions on how to deal with this situation in a clean way?
Thank you,
Dan