Dear Steff,
is the problem only in the scene background? I would suggest adding a solid-color backdrop and lighting it properly so that the feature detector does not find too many features in the background.
To explain masking in RealityCapture: although we do not have a mask-editing tool built into the app, the app itself supports masks to some extent. For the alignment step, all you need to do is modify the image's color channels and paint the unwanted areas with a solid color. Natural features will then not be detected there and so will not influence camera alignment. For meshing, we support alpha-channel masks. Adding an alpha channel that keeps only the important parts actually speeds up the whole computation. Another benefit is that you would not need to use the reconstruction region to filter out parts that are not important.
You should be able to generate these masks easily in a batch process with third-party software.
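As a rough illustration of such a batch process, here is a minimal Python sketch. It assumes the photos were shot against a roughly uniform green backdrop and that the third-party Pillow library is installed. The backdrop color, the tolerance, the black fill color, the helper names and the output file names are all hypothetical choices for this example, not anything RealityCapture requires. It writes one copy with the background painted a solid color (for alignment) and one copy with an alpha channel (for meshing); how you then load these files into RealityCapture is not covered here.

```python
from pathlib import Path

# Assumptions for this sketch only: a green backdrop, a per-channel
# tolerance, and black as the solid fill color. Tune these per shoot.
BACKDROP_RGB = (0, 255, 0)
TOLERANCE = 60
FILL_RGB = (0, 0, 0)


def is_background(rgb, backdrop=BACKDROP_RGB, tolerance=TOLERANCE):
    """True if every channel of rgb is within tolerance of the backdrop color."""
    return all(abs(c - b) <= tolerance for c, b in zip(rgb, backdrop))


def mask_pixel(rgb):
    """Return (alignment_rgb, meshing_rgba) for a single pixel.

    Background pixels are painted with FILL_RGB in the alignment copy
    and made fully transparent in the meshing copy.
    """
    if is_background(rgb):
        return FILL_RGB, (*rgb, 0)
    return rgb, (*rgb, 255)


def process_folder(src_dir, out_dir):
    """Write <name>_align.png and <name>_mesh.png for every JPEG in src_dir."""
    from PIL import Image  # Pillow; imported here so the helpers above need no extras

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for path in sorted(Path(src_dir).glob("*.jpg")):
        source = Image.open(path).convert("RGB")
        aligned = source.copy()
        meshed = source.convert("RGBA")
        src_px, align_px, mesh_px = source.load(), aligned.load(), meshed.load()
        # Plain per-pixel loop: slow on large photos, but easy to follow.
        for y in range(source.height):
            for x in range(source.width):
                align_px[x, y], mesh_px[x, y] = mask_pixel(src_px[x, y])
        # PNG is used because JPEG cannot store an alpha channel.
        aligned.save(out / f"{path.stem}_align.png")
        meshed.save(out / f"{path.stem}_mesh.png")
```

Calling `process_folder("photos", "masked")` would process every `.jpg` in a `photos` folder. A simple color threshold like this only works with a clean, evenly lit backdrop; for large image sets a vectorized version or a dedicated image editor's batch mode would be much faster.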