I suppose just using sprites would be the easiest way to control perspective: a 48-pixel-tall guy will always be 48 pixels tall.
You may be able to set the camera to orthographic instead of perspective. Though with the small amount they move toward and away from the camera, I doubt you'd see much perspective anyway. Maybe use a narrow FOV?
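To get a feel for how much the FOV matters, here is a rough sketch using plain pinhole-camera math, not any particular engine's API. All the names and numbers (a 2-unit-tall character, a 10-unit-tall framed view, a 720-pixel screen) are made up for illustration. The camera is pulled back so the framing stays the same at each FOV, then the character steps one unit toward and away from the camera.

```python
import math

def projected_height_px(world_height, distance, fov_deg, screen_height_px=720):
    """On-screen height of an object under a simple perspective camera."""
    # Height of the world slice visible at this distance (vertical FOV).
    visible_height = 2.0 * distance * math.tan(math.radians(fov_deg) / 2.0)
    return world_height / visible_height * screen_height_px

def distance_for_framing(visible_height, fov_deg):
    """Camera distance needed to show `visible_height` world units."""
    return visible_height / (2.0 * math.tan(math.radians(fov_deg) / 2.0))

def size_change_percent(fov_deg, depth_step=1.0, framed_height=10.0):
    """How much bigger the character looks one step closer vs. one step farther."""
    d = distance_for_framing(framed_height, fov_deg)
    near = projected_height_px(2.0, d - depth_step, fov_deg)
    far = projected_height_px(2.0, d + depth_step, fov_deg)
    return (near / far - 1.0) * 100.0

if __name__ == "__main__":
    for fov in (60, 30, 10):
        print(f"fov {fov:>2} deg: size change {size_change_percent(fov):.1f}%")
```

The narrower the FOV, the farther back the camera sits for the same framing, so the same step in depth changes the on-screen size less. An orthographic camera is the limit of that: no size change at all.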
To mimic the camera movement, they probably have it centered on a point between the players, so both players are always on screen. Maybe you can put an invisible wall, tied to the camera, at each end of the screen. Then the players would hit a wall if they got too far apart. Other than that, lerp between their locations to get the camera center. (The walls might be hard to do properly at different resolutions.)
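Something like this is what I have in mind, as an engine-agnostic sketch. `FightCamera`, `half_width_for_aspect`, and all the numbers are hypothetical, and it only handles the horizontal axis. For the resolution problem, one option (an assumption on my part, not tested in a real engine) is to compute the wall positions in world units from the aspect ratio instead of in pixels.

```python
from dataclasses import dataclass

def lerp(a, b, t):
    """Linear interpolation from a to b by fraction t."""
    return a + (b - a) * t

def half_width_for_aspect(half_view_height, screen_w, screen_h):
    """Half the visible width in world units for a given screen size."""
    return half_view_height * screen_w / screen_h

@dataclass
class FightCamera:
    x: float = 0.0                # current camera center (world units)
    half_view_width: float = 8.0  # half of the visible width (world units)
    smoothing: float = 0.2        # fraction of the gap closed per update

    def update(self, p1_x, p2_x):
        """Ease the camera toward the midpoint between the two players."""
        target = lerp(p1_x, p2_x, 0.5)
        self.x = lerp(self.x, target, self.smoothing)
        return self.x

    def clamp_player(self, player_x, margin=0.5):
        """Invisible walls: keep a player inside the visible area."""
        left = self.x - self.half_view_width + margin
        right = self.x + self.half_view_width - margin
        return max(left, min(right, player_x))

if __name__ == "__main__":
    cam = FightCamera(half_view_width=half_width_for_aspect(4.5, 1920, 1080))
    cam.update(-2.0, 6.0)           # camera drifts toward the midpoint at x = 2
    print(cam.x)
    print(cam.clamp_player(100.0))  # a runaway player is stopped at the right wall
```

Because the walls move with the camera and are sized from the visible width, the players can never be farther apart than the screen is wide, whatever the resolution.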