In this case you will need a camera rig (something like this: https://www.capturingreality.com/FullBody-Scanning-with-Backface). Then you need to capture him while he is moving/singing. For a movie you will need 24 fps, and you have to create a separate model from each frame. Then you combine these models into one animated Alembic model, which I suppose can be used in a virtual environment.
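To get a feeling for the amount of work, here is a rough sketch of the arithmetic: one model per frame at 24 fps, and one image per camera per frame. The function name `capture_budget`, the 10-second clip length and the 100-camera rig are only assumptions for illustration; your rig and clip will differ.

```python
def capture_budget(duration_s: float, fps: int = 24, cameras: int = 100) -> dict:
    """Estimate how many per-frame models and source images a clip needs.

    `cameras` is a hypothetical rig size, not a recommendation.
    """
    frames = round(duration_s * fps)   # one reconstructed model per frame
    images = frames * cameras          # every camera shoots every frame
    return {"models": frames, "images": images}


# Example: a 10-second clip at 24 fps on an assumed 100-camera rig
budget = capture_budget(10, fps=24, cameras=100)
print(budget)  # {'models': 240, 'images': 24000}
```

So even a short clip means hundreds of separate reconstructions before anything can be merged into the Alembic sequence.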