Its not often that someones asks you to re-invent the wheel. But this was one one these times and it was mostly necessary.

The task involved projecting already digitized city blocks outlines onto camera images.

A normal approach would be something among the lines of drawing the shapes in a 3d space (directX, openGL etc) and showing the image in the background. But when you want to accurately simulate a “real” existing camera then it seemed to me easier to build the whole construct from scratch.

In Order to do that we had to know both the interior and exterior orientation of the camera.

- the interior orientation is the position of the principal point, the focal length and the radial distortion. Some manufacturers provide those values (when it comes to photogrammetric cameras at least) but most do not. There are ways to calibrate yourself a camera but that’s completely a different topic.
- the exterior orientation of the camera is the position (x,y,z) and rotations (ω,φ,κ) of the camera at the time of the shooting

So what we need to do is to go from city block coordinates to image coordinates.

The main idea is described in the following steps

(1) A Point in Ground coordinate System

(2) A vector that connects Ground coordinated with image Coordinates

(3) Calculate the corresponding point in the Photo in meters

(4) calculate the correct position of each Point by offsetting the results based on the principal point

after calculating the x,y of the undistorted coordinates one can apply the radial distortion correction on those values

Now those results are in the same unit that our original data were (meters in my case). So in order to calculate the corresponding pixel, one has to know the physical pixel size of the sensor of the camera and transform so the begging of our system is not the center of digital image but the upperleft corner having the y positives values increasing downwards.

In my tests I used a MamiyaZD which has quite a big sensor (36mmX48mm) and a pixel size of 8.24μ

In most everyday commercial cameras getting the sensor size in order to calculate the pixel size is not always possible.

where y is the principal point in the Ground Coordinate system

where R is the Rotation Matrix

If we wanted to project the city block outlines into an aerial image our work would more or less end here. But when we want to project them into a typical ground level photograph (ω around 90 degrees) there are a lot more things to be considered.

The most important is to know at all times which objects lie in front of us and which objects lie behind us.

Objects that lie behind us are been projected above the horizon, high above the sky. A simple way to find their position, in our frame system but outside our frame is to inverse the signs in formula (3).

But even if we do that (I use a completely different approach which will become apparent later) there are some points that cannot be calculated in our normal or virtual frame because they tend to approach infinite values. Those points are the ones that lie between us and the first pixel of the photographic plane (anything between A and C in image [1]). In other words everything that lies out of the vertical field of view of the camera.

So what I ended up doing is that I used the NetTopology Suite, a UNION on the city block polygon geometries with both the virtual horizontal point of view lines (image[2] lines BD and BE) and the line that passes from point C in image[1] and image[2] and its perpendicular to the phi angle vector. What the nettopology suit union function does when applied to lines and polygons is that it returns the polygons with extra nodes at the point of intersections (like polygon FGHI). So in that way, if we treat city block polygons as lines we could find easily only what lies within our field of view and we can both minimize the load and get rid of extra possible projection errors.
Pingback: metric information from camera ray reconstruction ← Code and Tonic