Hi Paul,
The first thing that could break such a model is a wrong alignment. As I mentioned, the drone's GPS is not very precise, and close to a building it can be even less precise. Does this also happen when you process the images without the georeferencing information?
Another problem could be a large angle difference between the nadir, oblique and façade images. In general, the angle difference shouldn't be bigger than 15 degrees.
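To illustrate the 15 degree rule, here is a rough Python sketch. It assumes the rule applies to the camera pitch between consecutive sets of images, with -90 degrees for nadir and 0 degrees for a horizontal façade view. The function name `pitch_steps` and this angle convention are my own choices for the example, not anything from the software.

```python
import math

def pitch_steps(start_deg, end_deg, max_step_deg=15.0):
    """Return evenly spaced camera pitch angles from start_deg to end_deg,
    with no step between neighbours larger than max_step_deg.
    Assumed convention: -90 = nadir, 0 = horizontal (facade view)."""
    span = end_deg - start_deg
    # Number of intervals needed so that each one is at most max_step_deg.
    n = max(1, math.ceil(abs(span) / max_step_deg))
    return [start_deg + span * i / n for i in range(n + 1)]

# From nadir to a horizontal facade view in steps of at most 15 degrees.
print(pitch_steps(-90, 0))
```

So going from nadir to façade images would need several intermediate oblique sets rather than one jump.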
Regarding consistency: to get nearly the same model quality, you should take the images from the same distance and use the same image overlap. So it is not only about altitude, but about the distance to the captured object.
You can go as close as the camera can still focus. But then you will need more images to cover the whole object.
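As a rough sketch of why distance matters more than altitude, the Python below estimates the image footprint, the ground sampling distance (GSD) and the number of images per row from a simple pinhole camera model. The camera values (sensor size, focal length, image width), the 30 m façade and the function names are hypothetical, chosen only for illustration.

```python
import math

def footprint_m(distance_m, sensor_mm, focal_mm):
    """Size of the area covered by one image along one sensor axis (pinhole model)."""
    return distance_m * sensor_mm / focal_mm

def gsd_cm(distance_m, sensor_width_mm, focal_mm, image_width_px):
    """Size of one pixel on the object, in centimetres."""
    return footprint_m(distance_m, sensor_width_mm, focal_mm) / image_width_px * 100

def images_needed(extent_m, footprint, overlap_pct):
    """Images needed along one direction to cover extent_m at the given overlap."""
    if extent_m <= footprint:
        return 1
    step = footprint * (100 - overlap_pct) / 100  # new area added per image
    # Small tolerance so floating-point noise does not add an extra image.
    return 1 + math.ceil((extent_m - footprint) / step - 1e-9)

# Hypothetical camera: 13.2 mm sensor width, 8.8 mm focal length, 5472 px wide.
for distance in (10, 5):
    width = footprint_m(distance, 13.2, 8.8)
    print(distance, "m:",
          "GSD", round(gsd_cm(distance, 13.2, 8.8, 5472), 3), "cm,",
          images_needed(30, width, 80), "images per 30 m row at 80% overlap")
```

Halving the distance halves the GSD (finer detail), but each image covers half as much, so the number of images per row goes up. Keeping the distance and the overlap constant keeps both values constant.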