Matching RealityCapture Camera Calibration with OpenCV for Rendering a 3D Model onto an Image

CatWithKnight · May 26, 2026, 5:32pm

I have a pipeline that classifies and generates improved synthetic 3D building models from a photogrammetry scene, exporting each building as an .fbx.synthetic file.

I’m currently working on a texture-generation pipeline for these synthetic building models using LLMs and generative AI.

For each building, the workflow is:

Find the original drone images in which the building appears
Crop the building region from those images
Feed the cropped images into an LLM/VLM
Generate a material and architectural description of the building
Use that description to generate textures

To extract the building regions from the images, I’m trying to render each building into a mask with an OpenCV camera model whose parameters come from the exported RealityCapture camera calibration.

Current workflow:

Render the building model into a binary mask using OpenCV camera intrinsics
Apply inverse distortion to the resulting mask
Use the mask to crop the original drone image
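
The projection step can be sketched roughly as follows, using the standard OpenCV pinhole model with radial (k1, k2, k3) and tangential (p1, p2) terms. Whether RealityCapture's k1..k4 and t1, t2 map onto these coefficients in this order and with the same normalization is an assumption, not something established here. `project_point` and the sample values are hypothetical. Distorting the projected points directly is one way to get pixel coordinates in the original (distorted) image without warping the mask afterwards.

```python
def project_point(point_cam, fx, fy, cx, cy,
                  k1=0.0, k2=0.0, k3=0.0, p1=0.0, p2=0.0):
    """Project a 3D point given in camera coordinates (x right, y down,
    z forward) to pixel coordinates with OpenCV's pinhole + distortion model.

    Assumption: the coefficients follow OpenCV's convention; RealityCapture's
    exported values may need reordering or rescaling first.
    """
    X, Y, Z = point_cam
    if Z <= 0:
        raise ValueError("point is behind the camera")

    # Normalized image coordinates (before distortion).
    x, y = X / Z, Y / Z
    r2 = x * x + y * y

    # Radial and tangential distortion terms.
    radial = 1.0 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3
    x_d = x * radial + 2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x)
    y_d = y * radial + p1 * (r2 + 2.0 * y * y) + 2.0 * p2 * x * y

    # Apply the intrinsics to get pixel coordinates.
    return fx * x_d + cx, fy * y_d + cy


# Same point without and with a small radial term (hypothetical values).
u0, v0 = project_point((1.0, 0.0, 2.0), 1000.0, 1000.0, 500.0, 400.0)
u1, v1 = project_point((1.0, 0.0, 2.0), 1000.0, 1000.0, 500.0, 400.0, k1=0.1)
```

Comparing `u0` with `u1` for points near the image border gives a quick feel for how much a given coefficient shifts the projection there.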

From the RealityCapture calibration export, I’m using the following camera properties:

Camera position/orientation
Focal length
Principal point
k1, k2, k3, k4
t1, t2
cx, cy
focalLength35mmEq
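
For context, here is a minimal sketch of how a 35 mm-equivalent focal length can be converted into a pixel focal length and a field of view. It assumes the 35 mm-equivalent value is defined against a 36 mm frame width mapped to the longer image side, which is a common convention but not confirmed here for RealityCapture. The function names and the 4000×3000 image size are hypothetical.

```python
import math

FULL_FRAME_WIDTH_MM = 36.0  # width of a 35 mm film frame


def focal_35mm_to_pixels(focal_35mm, width_px, height_px):
    """Convert a 35 mm-equivalent focal length to a focal length in pixels.

    Assumption: the 36 mm frame width corresponds to the longer image side.
    """
    return focal_35mm * max(width_px, height_px) / FULL_FRAME_WIDTH_MM


def fov_degrees(focal_px, size_px):
    """Field of view in degrees along an image axis of size_px pixels."""
    return math.degrees(2.0 * math.atan(size_px / (2.0 * focal_px)))


# Hypothetical example: 24 mm equivalent lens, 4000x3000 image.
f_px = focal_35mm_to_pixels(24.0, 4000, 3000)
hfov = fov_degrees(f_px, 4000)
vfov = fov_degrees(f_px, 3000)
```

If the convention is different (for example, referenced to the image diagonal), the resulting focal length changes, and the projection error would grow towards the image edges.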

The problem is that even after applying distortion correction, the rendered mask does not align correctly with the building in the original image, especially near the image edges.

My guess is that I’m interpreting some part of the RealityCapture camera calibration incorrectly in OpenCV, possibly even the camera field of view, which I derived from focalLength35mmEq.

Could you share the exact math RealityCapture uses to project a 3D point into final image pixel coordinates, including distortion?

Also, is there a better or recommended way to accurately extract the visible building region from the original drone images?