Projections¶

Tip

Projections transform between the image and the camera coordinate system. The parameters are \(f_x, f_y, c_x, c_y\) and the image size in px.

This section describes the projections that are available for mapping objects from the camera coordinate system to image coordinates.

In the camera coordinate system, the camera is positioned at (0,0,0) and points in the \(z\) direction.

For each projection, the projection formula is provided, which transforms from the camera coordinate system (3D) to the image coordinate system (2D). As information is lost when going from 3D to 2D, the back transformation is not unique: all points that are projected onto one image pixel lie on a single line, or ray. The back transformation can therefore only provide a ray. To obtain a point, this ray has to be intersected with, for example, a plane in the world.
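
The intersection step mentioned above can be sketched as follows. This is a minimal illustration and not part of cameratransform: the helper name `intersect_ray_with_plane` and the plane used in the example are assumptions, and everything is expressed in camera coordinates for simplicity.

```python
def intersect_ray_with_plane(ray, plane_point, plane_normal, origin=(0.0, 0.0, 0.0)):
    """Return the point where a ray meets a plane, or None if it never does.

    Hypothetical helper: all arguments are 3-tuples in the same coordinate system.
    """
    def dot(a, b):
        return sum(ai * bi for ai, bi in zip(a, b))

    denom = dot(ray, plane_normal)
    if abs(denom) < 1e-12:
        return None  # the ray runs parallel to the plane
    offset = tuple(p - o for p, o in zip(plane_point, origin))
    t = dot(offset, plane_normal) / denom
    if t < 0:
        return None  # the plane lies behind the ray origin
    return tuple(o + t * r for o, r in zip(origin, ray))


# An example ray starting at the camera position, intersected with the plane z = -10.
point = intersect_ray_with_plane((0.09, -0.27, -1.0), (0.0, 0.0, -10.0), (0.0, 0.0, 1.0))
```

Here the ray reaches the plane after ten times its own length, so `point` is approximately (0.9, -2.7, -10).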

The coordinates \(x_\mathrm{im},y_\mathrm{im}\) denote a point in image pixel coordinates, and \(x,y,z\) the same point in the camera coordinate system. The center of the image is \(c_x,c_y\), and \(f_x\) and \(f_y\) are the focal lengths in pixels (the focal length in mm divided by the sensor width or height in mm, multiplied by the image width or height in pixels). Both focal lengths are equal if the pixels on the sensor are square.
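
As a worked example of the conversion in parentheses above, the following sketch computes the focal length in pixels from a focal length in mm. The function name `focal_length_px` is hypothetical; the numbers are the 14 mm lens and 17.3 mm x 9.731 mm sensor from the initialisation examples, combined with a 4608 x 2592 px image.

```python
def focal_length_px(focallength_mm, sensor_size_mm, image_size_px):
    """Focal length in pixels along one axis (hypothetical helper)."""
    return focallength_mm / sensor_size_mm * image_size_px


f_x = focal_length_px(14, 17.3, 4608)   # uses sensor width and image width
f_y = focal_length_px(14, 9.731, 2592)  # uses sensor height and image height
```

Both values come out close to 3729 px, as expected for (nearly) square pixels.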

Parameters¶

focallength_x_px, \(f_x\): the focal length of the camera relative to the width of a pixel on the sensor.
focallength_y_px, \(f_y\): the focal length of the camera relative to the height of a pixel on the sensor.
center_x_px, \(c_x\): the central point of the image in pixels. Typically about half of the image width in pixels.
center_y_px, \(c_y\): the central point of the image in pixels. Typically about half of the image height in pixels.
image_width_px, \(\mathrm{im}_\mathrm{width}\): the width of the image in pixels.
image_height_px, \(\mathrm{im}_\mathrm{height}\): the height of the image in pixels.

Indirect Parameters¶

focallength_mm, \(f_\mathrm{mm}\): the focal length of the camera in mm.
sensor_width_mm, \(\mathrm{sensor}_\mathrm{width}\): the width of the sensor in mm.
sensor_height_mm, \(\mathrm{sensor}_\mathrm{height}\): the height of the sensor in mm.
view_x_deg, \(\alpha_x\): the field of view in x direction (width) in degrees.
view_y_deg, \(\alpha_y\): the field of view in y direction (height) in degrees.

Functions¶

class cameratransform.CameraProjection(focallength_px=None, focallength_x_px=None, focallength_y_px=None, center_x_px=None, center_y_px=None, center=None, focallength_mm=None, image_width_px=None, image_height_px=None, sensor_width_mm=None, sensor_height_mm=None, image=None, sensor=None, view_x_deg=None, view_y_deg=None)[source]¶

Defines a camera projection. The necessary parameters are: focallength_x_px, focallength_y_px, center_x_px, center_y_px, image_width_px, image_height_px. Depending on the information available, different initialisation routines can be used.

Note

This is the base class for projections. It should not be instantiated directly. Available projections are RectilinearProjection, CylindricalProjection, or EquirectangularProjection.

Examples

This section provides some examples of how the projections can be initialized.

>>> import cameratransform as ct

Image Dimensions:

The image dimensions can be provided as two values:

>>> projection = ct.RectilinearProjection(focallength_px=3863.64, image_width_px=4608, image_height_px=3456)

or as a tuple:

>>> projection = ct.RectilinearProjection(focallength_px=3863.64, image=(4608, 3456))

or by providing a numpy array of an example image:

>>> import matplotlib.pyplot as plt
>>> im = plt.imread("test.jpg")
>>> projection = ct.RectilinearProjection(focallength_px=3863.64, image=im)

Focal Length:

The focal length can be provided in mm, when also a sensor size is provided:

>>> projection = ct.RectilinearProjection(focallength_mm=14, sensor=(17.3, 9.731), image=(4608, 3456))

or directly in pixels without the sensor size:

>>> projection = ct.RectilinearProjection(focallength_px=3863.64, image=(4608, 3456))

or as a tuple to give different focal lengths in x and y direction, if the pixels on the sensor are not square:

>>> projection = ct.RectilinearProjection(focallength_px=(3772, 3774), image=(4608, 3456))

or the focal length is given by providing a field of view angle:

>>> projection = ct.RectilinearProjection(view_x_deg=61.617, image=(4608, 3456))

>>> projection = ct.RectilinearProjection(view_y_deg=48.192, image=(4608, 3456))

Central Point:

If the position of the optical axis or center of the image is not provided, it is assumed to be in the middle of the image. It can also be specified explicitly, either as a tuple or as two values:

>>> projection = ct.RectilinearProjection(focallength_px=3863.64, center=(2304, 1728), image=(4608, 3456))

>>> projection = ct.RectilinearProjection(focallength_px=3863.64, center_x_px=2304, center_y_px=1728, image=(4608, 3456))

CameraProjection.imageFromCamera(points)[source]¶

Convert points (Nx3) from the camera coordinate system to the image coordinate system.

Parameters: points (ndarray) – the points in camera coordinates to transform, dimensions (3), (Nx3)
Returns: points – the points in the image coordinate system, dimensions (2), (Nx2)
Return type: ndarray

Examples

>>> import cameratransform as ct
>>> proj = ct.RectilinearProjection(focallength_px=3729, image=(4608, 2592))

transform a single point from the camera coordinates to the image:

>>> proj.imageFromCamera([-0.09, -0.27, -1.00])
[1968.39 2302.83]

or multiple points in one go:

>>> proj.imageFromCamera([[-0.09, -0.27, -1.00], [-0.18, -0.24, -1.00]])
[[1968.39 2302.83]
 [1632.78 2190.96]]

CameraProjection.getRay(points, normed=False)[source]¶

As the transformation from the image coordinate system to the camera coordinate system is not unique, image points can only be uniquely mapped to a ray in camera coordinates.

Parameters: points (ndarray) – the points in image coordinates for which to get the ray, dimensions (2), (Nx2)
Returns: rays – the rays in the camera coordinate system, dimensions (3), (Nx3)
Return type: ndarray

Examples

>>> import cameratransform as ct
>>> proj = ct.RectilinearProjection(focallength_px=3729, image=(4608, 2592))

get the ray of a point in the image:

>>> proj.getRay([1968, 2291])
[0.09 -0.27 -1.00]

or the rays of multiple points in the image:

>>> proj.getRay([[1968, 2291], [1650, 2189]])
[[0.09 -0.27 -1.00]
 [0.18 -0.24 -1.00]]

CameraProjection.getFieldOfView()[source]¶

The field of view of the projection in x (width, horizontal) and y (height, vertical) direction.

Returns:

view_x_deg (float) – the horizontal field of view in degrees.
view_y_deg (float) – the vertical field of view in degrees.
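
For orientation, the relation between focal length and field of view can be sketched for the pinhole (rectilinear) case. This is an assumption about the geometry, not a copy of the library code: the helper `field_of_view_deg` is hypothetical, and the other projections may use a different relation.

```python
import math


def field_of_view_deg(focallength_px, image_size_px):
    """Full opening angle along one axis of a pinhole camera (hypothetical helper)."""
    return 2 * math.degrees(math.atan(image_size_px / (2 * focallength_px)))


# Values from the initialisation examples: f = 3863.64 px, image 4608 x 3456 px.
view_x = field_of_view_deg(3863.64, 4608)
view_y = field_of_view_deg(3863.64, 3456)
```

With these inputs the angles come out close to the 61.617° and 48.192° used in the initialisation examples.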

CameraProjection.focallengthFromFOV(view_x=None, view_y=None)[source]¶

The focal length (in x or y direction) based on the given field of view.

Parameters:

view_x (float) – the field of view in x direction in degrees. If not given only view_y is processed.
view_y (float) – the field of view in y direction in degrees. If not given only view_x is processed.

Returns:

focallength_px – the focal length in pixels.

Return type:

float
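
The inverse direction, from a field of view to a focal length, can be sketched in the same way. Again this assumes the pinhole (rectilinear) relation and uses a hypothetical helper name; it is not the library's implementation.

```python
import math


def focal_length_from_fov(view_deg, image_size_px):
    """Focal length in pixels for a given opening angle (hypothetical helper)."""
    return image_size_px / (2 * math.tan(math.radians(view_deg) / 2))


# A horizontal field of view of 61.617 degrees on a 4608 px wide image.
focallength_px = focal_length_from_fov(61.617, 4608)
```

The result is close to the 3863.64 px used in the initialisation examples.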

CameraProjection.imageFromFOV(view_x=None, view_y=None)[source]¶

The image width or height in pixels based on the given field of view.

Parameters:

view_x (float) – the field of view in x direction in degrees. If not given only view_y is processed.
view_y (float) – the field of view in y direction in degrees. If not given only view_x is processed.

Returns:

width/height – the width or height in pixels.

Return type:

float

Projections¶

All projections share the same interface, as explained above, but implement different image projections.

Note

Some sources define the projections slightly differently. Cameratransform uses a minus sign for the x direction to undo the flipping of the image that the pinhole camera model introduces. This keeps the coordinate system aligned with the space coordinate system, so that x increases from left to right.

Rectilinear Projection¶

class cameratransform.RectilinearProjection(focallength_px=None, focallength_x_px=None, focallength_y_px=None, center_x_px=None, center_y_px=None, center=None, focallength_mm=None, image_width_px=None, image_height_px=None, sensor_width_mm=None, sensor_height_mm=None, image=None, sensor=None, view_x_deg=None, view_y_deg=None)[source]¶

This projection is the standard “pin-hole”, or frame camera model, which is the most common projection for single images. Directions at \(\pm 90°\) from the optical axis are projected to \(\pm \infty\). Therefore, the maximal possible field of view in this projection would be 180° for an infinitely large image.

Projection:

\[\begin{split}x_\mathrm{im} &= f_x \cdot \frac{-x}{z} + c_x\\ y_\mathrm{im} &= f_y \cdot \frac{y}{z} + c_y\end{split}\]

Rays:

\[\begin{split}\vec{r} = \begin{pmatrix} -(x_\mathrm{im} - c_x)/f_x\\ (y_\mathrm{im} - c_y)/f_y\\ 1\\ \end{pmatrix}\end{split}\]
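
The two formulas above can be written out directly. The following is a plain-Python sketch of the equations as printed, with hypothetical function names; it is not the library implementation. The focal length and image size are taken from the imageFromCamera example, and the image center is assumed to be the middle of the image.

```python
def rectilinear_project(point, f_x, f_y, c_x, c_y):
    """Camera coordinates (x, y, z) -> image coordinates (x_im, y_im)."""
    x, y, z = point
    return (f_x * (-x) / z + c_x, f_y * y / z + c_y)


def rectilinear_ray(pixel, f_x, f_y, c_x, c_y):
    """Image coordinates -> direction of the ray through that pixel."""
    x_im, y_im = pixel
    return (-(x_im - c_x) / f_x, (y_im - c_y) / f_y, 1.0)


f, c_x, c_y = 3729.0, 2304.0, 1296.0  # 4608 x 2592 px image, center in the middle
pixel = rectilinear_project((-0.09, -0.27, -1.0), f, f, c_x, c_y)
ray = rectilinear_ray(pixel, f, f, c_x, c_y)
back = rectilinear_project(ray, f, f, c_x, c_y)  # lands on the same pixel again
```

`pixel` is approximately (1968.39, 2302.83), and projecting the ray returns the same pixel, which illustrates that every point along the ray maps to one image position.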

Matrix:

The rectilinear projection can also be represented in matrix notation:

\[\begin{split}C_{\mathrm{intr.}} &= \begin{pmatrix} -f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \\ \end{pmatrix}\\\end{split}\]
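
To see that the matrix reproduces the projection formula, multiply it with the point and divide by the third component of the result. The sketch below does this in plain Python; the function name is hypothetical and the parameters are again those of the imageFromCamera example.

```python
def apply_intrinsic_matrix(point, f_x, f_y, c_x, c_y):
    """Multiply the intrinsic matrix with (x, y, z) and normalise by the last entry."""
    matrix = [
        [-f_x, 0.0, c_x],
        [0.0, f_y, c_y],
        [0.0, 0.0, 1.0],
    ]
    u, v, w = (sum(m * p for m, p in zip(row, point)) for row in matrix)
    return (u / w, v / w)


pixel_from_matrix = apply_intrinsic_matrix((-0.09, -0.27, -1.0), 3729.0, 3729.0, 2304.0, 1296.0)
```

The result is approximately (1968.39, 2302.83), the same as evaluating the projection formula term by term.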

Cylindrical Projection¶

class cameratransform.CylindricalProjection(focallength_px=None, focallength_x_px=None, focallength_y_px=None, center_x_px=None, center_y_px=None, center=None, focallength_mm=None, image_width_px=None, image_height_px=None, sensor_width_mm=None, sensor_height_mm=None, image=None, sensor=None, view_x_deg=None, view_y_deg=None)[source]¶

This projection is commonly used for wide panoramic images, as it can cover the full 360° range in the x-direction. The poles cannot be represented in this projection, as they would be projected to \(y = \pm\infty\).

Projection:

\[\begin{split}x_\mathrm{im} &= -f_x \cdot \arctan2{\left(-x, -z\right)} + c_x\\ y_\mathrm{im} &= -f_y \cdot \frac{y}{\sqrt{x^2+z^2}} + c_y\end{split}\]

Rays:

\[\begin{split}\vec{r} = \begin{pmatrix} -\sin\left(\frac{x_\mathrm{im} - c_x}{f_x}\right)\\ \frac{y_\mathrm{im} - c_y}{f_y}\\ \cos\left(\frac{x_\mathrm{im} - c_x}{f_x}\right) \end{pmatrix}\end{split}\]
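
A plain-Python sketch of the cylindrical formulas, with hypothetical function names and the two-argument arctangent written as `math.atan2(-x, -z)`, could look as follows. It is meant as an illustration of the equations, not as the library code. Note that a ray and its negation describe the same line through the camera position; with the formulas exactly as printed, it is the negated ray that projects back onto the starting pixel.

```python
import math


def cylindrical_project(point, f_x, f_y, c_x, c_y):
    """Camera coordinates (x, y, z) -> image coordinates (x_im, y_im)."""
    x, y, z = point
    x_im = -f_x * math.atan2(-x, -z) + c_x
    y_im = -f_y * y / math.hypot(x, z) + c_y
    return (x_im, y_im)


def cylindrical_ray(pixel, f_x, f_y, c_x, c_y):
    """Image coordinates -> direction of the line through that pixel."""
    x_im, y_im = pixel
    angle = (x_im - c_x) / f_x
    return (-math.sin(angle), (y_im - c_y) / f_y, math.cos(angle))


f, c_x, c_y = 3729.0, 2304.0, 1296.0  # assumed example parameters
pixel = (1968.0, 2291.0)
ray = cylindrical_ray(pixel, f, f, c_x, c_y)
flipped = tuple(-component for component in ray)  # same line, opposite direction
back = cylindrical_project(flipped, f, f, c_x, c_y)  # recovers the pixel
```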

Equirectangular Projection¶

class cameratransform.EquirectangularProjection(focallength_px=None, focallength_x_px=None, focallength_y_px=None, center_x_px=None, center_y_px=None, center=None, focallength_mm=None, image_width_px=None, image_height_px=None, sensor_width_mm=None, sensor_height_mm=None, image=None, sensor=None, view_x_deg=None, view_y_deg=None)[source]¶

This projection is a common projection used for panoramic images. The projection can cover the full range of angles in both x and y direction.

Projection:

\[\begin{split}x_\mathrm{im} &= -f_x \cdot \arctan2{\left(-x, -z\right)} + c_x\\ y_\mathrm{im} &= -f_y \cdot \arctan2{\left(y, \sqrt{x^2+z^2}\right)} + c_y\end{split}\]

Rays:

\[\begin{split}\vec{r} = \begin{pmatrix} -\sin\left(\frac{x_\mathrm{im} - c_x}{f_x}\right)\\ \tan\left(\frac{y_\mathrm{im} - c_y}{f_y}\right)\\ \cos\left(\frac{x_\mathrm{im} - c_x}{f_x}\right) \end{pmatrix}\end{split}\]
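
The equirectangular formulas can be sketched in the same style (hypothetical function names, plain Python, not the library code). The example also projects a point almost directly above the camera to show that the image coordinate stays finite, in contrast to the cylindrical projection. As the ray formula fixes the line only up to its direction, the negated ray is used for the round trip.

```python
import math


def equirectangular_project(point, f_x, f_y, c_x, c_y):
    """Camera coordinates (x, y, z) -> image coordinates (x_im, y_im)."""
    x, y, z = point
    x_im = -f_x * math.atan2(-x, -z) + c_x
    y_im = -f_y * math.atan2(y, math.hypot(x, z)) + c_y
    return (x_im, y_im)


def equirectangular_ray(pixel, f_x, f_y, c_x, c_y):
    """Image coordinates -> direction of the line through that pixel."""
    x_im, y_im = pixel
    angle_x = (x_im - c_x) / f_x
    angle_y = (y_im - c_y) / f_y
    return (-math.sin(angle_x), math.tan(angle_y), math.cos(angle_x))


f, c_x, c_y = 3729.0, 2304.0, 1296.0  # assumed example parameters
pixel = (1968.0, 2291.0)
ray = equirectangular_ray(pixel, f, f, c_x, c_y)
flipped = tuple(-component for component in ray)  # same line, opposite direction
back = equirectangular_project(flipped, f, f, c_x, c_y)  # recovers the pixel

# A point almost straight above the camera still maps to a finite y_im.
near_pole = equirectangular_project((0.0, 1.0, -1e-6), f, f, c_x, c_y)
```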