To simulate perspective in an orthographic view, you need to counter-animate the foreground and background elements separately. This is how we used to do it back in the days of 2D.
You need to translate foreground/background objects in a direction OPPOSITE to the camera movement.
Background elements should translate slower than the camera, and foreground elements should translate faster than the camera. You can also stack multiple layers: layers farther back should get progressively slower, and layers closer to the camera should get progressively faster.
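
As a rough sketch of the idea, here is how the per-layer offsets could be computed for a horizontal camera move. The `Layer` class, the `speed` factor, and the specific values are all made up for illustration; it also assumes the offsets are applied as on-screen motion with the view itself held still, so adapt it to however your software handles layers.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    # Hypothetical speed factor relative to the camera:
    # below 1.0 = background (slower), above 1.0 = foreground (faster).
    speed: float


def layer_offset(camera_dx: float, speed: float) -> float:
    """Return the horizontal offset for a layer, opposite to the camera move."""
    return -camera_dx * speed


# Stacked layers, ordered from farthest to closest (illustrative values).
layers = [
    Layer("far background", 0.25),
    Layer("near background", 0.5),
    Layer("subject plane", 1.0),
    Layer("foreground", 1.5),
]

camera_dx = 10.0  # the camera pans 10 units to the right

offsets = {layer.name: layer_offset(camera_dx, layer.speed) for layer in layers}
for name, dx in offsets.items():
    print(f"{name}: {dx:+.2f}")
```

With these numbers the far background slides 2.5 units to the left while the foreground slides 15 units to the left, so the closer a layer is, the faster it sweeps past. Tuning the speed factors per layer is what sells the depth.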