Maybe don’t use player overlap to determine what to interact with, and use a line trace instead?
In first person, the target would be whatever is under the reticle. In a mode with a mouse pointer, it would be whatever is under the pointer. A trace gives you a single nearest hit, whereas the player can overlap several objects at once, so there is no ambiguity.
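
To show the idea without tying it to a particular engine, here is a rough, engine-agnostic sketch in Python. It assumes each interactable can be approximated by a bounding sphere; the `Interactable` class and the `line_trace` function are hypothetical names made up for illustration, not any engine's API. If this is Unreal, the built-in trace functions (for example `UWorld::LineTraceSingleByChannel`) would most likely do this job for you.

```python
import math
from dataclasses import dataclass


@dataclass
class Interactable:
    """Hypothetical interactable, approximated by a bounding sphere."""
    name: str
    center: tuple  # (x, y, z)
    radius: float


def line_trace(origin, direction, max_distance, objects):
    """Return the nearest Interactable hit by the ray, or None if nothing is hit."""
    length = math.sqrt(sum(c * c for c in direction))
    if length == 0:
        return None
    d = tuple(c / length for c in direction)  # normalized ray direction

    best, best_t = None, max_distance
    for obj in objects:
        # Vector from the ray origin to the sphere center.
        oc = tuple(c - o for c, o in zip(obj.center, origin))
        # Distance along the ray to the point closest to the center.
        t_mid = sum(a * b for a, b in zip(oc, d))
        # Squared distance from the center to the ray's infinite line.
        dist_sq = sum(a * a for a in oc) - t_mid * t_mid
        r_sq = obj.radius ** 2
        if dist_sq > r_sq:
            continue  # the ray passes outside this sphere
        # Entry point of the ray into the sphere.
        t_hit = t_mid - math.sqrt(r_sq - dist_sq)
        # Skip hits behind the origin (or when the origin is inside the
        # sphere) and hits farther than the best one found so far.
        if 0 <= t_hit <= best_t:
            best, best_t = obj, t_hit
    return best


# Two objects roughly in front of the camera; the lever is closer.
objects = [
    Interactable("door", (0.0, 0.0, 5.0), 1.0),
    Interactable("lever", (0.5, 0.0, 3.0), 0.6),
]
target = line_trace((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), 10.0, objects)
print(target.name if target else "nothing to interact with")
```

For first person, `origin` and `direction` would come from the camera position and forward vector. For a mouse-pointer mode, they would come from unprojecting the cursor position into the world. Either way only the nearest hit is returned, so there is exactly one interaction target.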