You can project the first-person mesh’s vertices onto the camera plane. That way, the mesh is always drawn in front of everything else while still being rendered in the same pass/scene.
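To make the idea concrete, here is a minimal sketch of the math in plain Python rather than shader code. It assumes the vertices are already in view space, with the camera at the origin looking down -Z and a focal length of 1. The names `project_to_camera_plane`, `screen_position`, and `plane_distance` are made up for illustration and don't come from any engine. Each vertex is moved along its ray toward the camera until it sits on a plane just in front of it, so its on-screen position stays the same and only its depth changes.

```python
def project_to_camera_plane(vertex, plane_distance=0.1):
    """Slide a view-space vertex along its ray to the camera until it lies
    on the plane z = -plane_distance (hypothetical helper, camera looks down -Z)."""
    x, y, z = vertex
    if z >= 0.0:
        raise ValueError("vertex must be in front of the camera (z < 0)")
    scale = plane_distance / -z  # uniform scale toward the camera origin
    return (x * scale, y * scale, -plane_distance)


def screen_position(vertex):
    """Perspective divide with focal length 1: where the vertex lands on screen."""
    x, y, z = vertex
    return (x / -z, y / -z)


# A vertex two units in front of the camera.
original = (0.5, -0.25, -2.0)
projected = project_to_camera_plane(original, plane_distance=0.1)

# The vertex is now much closer to the camera, but lands on the same screen spot.
before = screen_position(original)
after = screen_position(projected)
```

Because every vertex is scaled along a line through the camera, the perspective divide gives the same result before and after, which is why the mesh looks unchanged. In a real engine this would presumably live in the vertex shader, and you may want to squash the depth range instead of flattening it completely so the mesh still depth-sorts against itself.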
The advantage is that the mesh behaves like any other mesh in the scene: it can receive shadows, screen-space reflections, etc. It even works with occlusion culling.
The only problem is that the first-person mesh doesn’t cast self-shadows. Then again, if you look at a lot of first-person games (COD, for example), you’ll notice their first-person meshes don’t cast self-shadows either. That makes me think this is similar to how they do it (or at least that their approach suffers from the same problem).