Determining reconstructed 3D point (x,y,z) coordinates using input image (x,y) coordinates


Is there any way to obtain the corresponding 3D point (x,y,z) from the reconstructed model using input image (x,y) coordinates? I have tried several approaches but have not succeeded yet. Any guidelines on how to proceed would be appreciated.





Please, what exactly are You trying to achieve?

Image pixels alone are not explicitly bound to 3D points. But You can do these two things:

  • project sfm points from sfm cloud into cameras (input images) to obtain (x,y) coordinates in the associated images,
  • associate 3D points with some image pixels by mapping the pixels onto model mesh.

Thanks for your quick response!

I’m trying to estimate the scale of the mesh. I have a target object in the scene (visible in some images), and I would like to measure the size of the target object in 3D space and use that information to scale the output mesh.


Well, I see why You would like to project a pixel into the scene.

I am afraid that the only correct way to rescale the model is to mimic the process which is done in the GUI application.


For at least two physical points in the original scene You have to know:

  • the (relative) distance between them (in Your preferred units) // e.g. 4.2 meters
  • projection in at least two images // e.g. (102.2, 243.4) in image 02 and (554.2, 2003.5) in image 21

Create an object implementing IControlPoints from CapturingReality.Sfm.h. It has to include:

  • two control points representing the two points in the original scene with the measurements properly set,
  • one constraint whose end points are the two control points and whose distance is set.

Set this object to the IStructureFromMotion object with SetControlPoints() and call UpdateConstraints().

Thanks for the explanation!

It seems that I need to implement some interfaces to get this working… I’ll try this.

Just out of curiosity: you wrote that it is also possible to project the sfm cloud into camera pixels. How could this be done?


Assume we want to project a 3D point X onto a camera. It can, but does not have to, be the position of an sfm point.


We need to get:

  • ISfmCameraModel from ISfmReconstruction which is taken from IStructureFromMotion
  • SfmCamera structure for the selected camera, obtainable from ISfmReconstruction

Then simply call ISfmCameraModel->GetProjection(). Note that the sfm reconstruction uses a different indexing of cameras; the original sfm input index is stored in the SfmCamera structure.

Hi Milan,

I tried the approach which you suggested, but I have some problems with it.

At first I used the ISceneStructure->GetPoints() function to obtain the corresponding sfm points. I copied each point to a WorldPoint X. However, I don’t know what value I should put in the X.w component (currently I am using a fixed value of 1).

After copying, I use ISfmCameraModel->GetProjection to obtain the camera projection of the WorldPoint X. The projected (x,y) values are between -0.03 and 0.005, and z is around 60 to 80.

I tried to use the ISfmCameraModel->CameraToImage transform with input from GetProjection(), but the x,y coordinates are still between -0.015 and 0.015. What scaling and transformation should I use to convert these coordinates to original image pixels? The shape of the projected pixel values looks correct, but the position and scale are wrong.

What steps should I take if I want to project an image pixel onto the model surface (and obtain the corresponding 3D point)? There is a function ISfmCameraModel->ImageToCamera(), but I did not find any function for projection into world coordinates (model coordinates). This approach would be much better and more usable for us than converting 3D points to 2D.

Thanks in advance!

Jukka


Terve Jukka.

I am sorry, I forgot to mention that the projection of sfm points into cameras must be done in sfm space instead of model space. Thus You need to use the raw sfm point coordinates. To obtain them, call ISfmReconstruction->GetStructure(). Each SfmReconstructionPoint contains a WorldPoint X. SfmReconstructionPoint.w is a valid non-zero positive value. ISfmCameraModel->GetProjection() accepts the WorldPoint as it is, in its raw non-normalized form, but in case You would like to draw it on screen in sfm space, do not forget to normalize it (i.e. x = x / w, y = y / w, z = z / w, w = 1.0).

The obtained projection of a 3D point from sfm space into an sfm camera is in relative coordinates, i.e. the longer dimension is in the range [-0.5,+0.5] and the shorter dimension in the range [-0.5*ar,+0.5*ar], where ar is the aspect ratio (longer dimension / shorter dimension). The third value is depth, which You probably don’t need.

What RC does is guess a 3D model (structure) from 2D images (cameras). This can be represented, more abstractly, as a mapping from 2D image pixels to 3D points, which is what You are actually asking for. Simply put, this inverse process is not as easy and straightforward as mapping 3D points to 2D images. What You can try to do is get depth maps calculated from the reconstructed model by CreateModelVisibility() and use them to pair 2D input image pixels with reconstructed 3D model surface points.

Dobrý deň Milan,

Your comment fixed the projection issue from 3D to 2D space! This is useful, but in a case where the projected sfm points (sparse point cloud) are not close enough to my original 2D points, it cannot be used very accurately. Of course, it gives a good estimation of the locations.

I was wondering about a workaround for my problem. How could I store the final positions and focal points of the registered cameras? I assume that the function ISfmCameraModel->GetCameraCentre returns the position of the camera, but how could I obtain the focal point (where the camera is looking)? Then I could use external libraries to project an image/camera 2D point onto the 3D surface of the generated mesh.




The data You are asking for are readily available in the struct SfmCamera. Suppose we have struct SfmCamera C;, then:

  • C.K.K.focalLength
  • Vec3 cameraRightVector = Vec3( C.R[0], C.R[1], C.R[2] );
  • Vec3 cameraUpVector = Vec3( C.R[3], C.R[4], C.R[5] );
  • Vec3 cameraForwardVector = Vec3( C.R[6], C.R[7], C.R[8] ); // * please see the note below

* I suggest that You debug these vectors first to be sure about the orientation of the forward (look-at) vector; for obvious reasons, it may be oriented in the opposite direction.

Hi Milan,

Thanks for the useful info! I used the vector from GetCameraCenter() to place the camera and used the cameraUp and cameraForward vectors to set the direction of the camera. It seems that the camera is pointing near the origin of the mesh coordinates, which is not the center part of the input image (I assume it should point at the center of the image). How could I fix this issue?

My second question is related to sensor dimension approximation. My dataset does not include any EXIF data. How does RC approximate the sensor dimensions, and where could I obtain them? I’d like to use this info to calculate the correct FOV for the rendering process.