Camera Calibration: Intrinsic and Extrinsic Parameters

Camera calibration is a fundamental process in computer vision that estimates the parameters of a camera. This estimation is crucial for removing lens distortion and establishing a precise relationship between 3D real-world coordinates and 2D image coordinates. It is an essential step for applications like accurate 3D reconstruction, pose estimation, and augmented reality.

The calibration process determines two main types of parameters:

  • Intrinsic Parameters: These describe the internal characteristics of the camera, such as its focal length and optical center.
  • Extrinsic Parameters: These define the camera's position and orientation in the 3D world.

Why is Camera Calibration Important?

Camera calibration is vital for several reasons:

  • Removes Lens Distortion: Real-world lenses introduce distortions that can skew image geometry. Calibration allows for the correction of these distortions.
  • Enables Precise 3D Measurement: By accurately mapping 2D image points to their corresponding 3D world points, calibration facilitates precise measurements in the real world.
  • Maps 3D World to 2D Image Coordinates: It provides the mathematical model to project points from a 3D world onto a 2D image plane.
  • Essential for Advanced Applications: It's a prerequisite for many complex computer vision tasks, including stereo vision, Simultaneous Localization and Mapping (SLAM), and Augmented Reality (AR).

Intrinsic Parameters of a Camera

Intrinsic parameters define how a camera projects 3D points onto its image plane. They are independent of the camera's position and orientation in the world.

The key intrinsic parameters are:

  • Focal Length ($f_x, f_y$): The distance from the lens's optical center to the image plane, expressed in pixel units along the x and y axes. For square pixels, $f_x = f_y$.
  • Principal Point ($c_x, c_y$): The intersection of the principal axis of the lens with the image plane. This is often near the center of the image.
  • Skew Coefficient ($s$): Accounts for non-orthogonality between the camera's x and y pixel axes. For most modern cameras with well-manufactured sensors, this is usually 0.
  • Distortion Coefficients ($k_1, k_2, k_3, p_1, p_2$): These parameters model the non-linear distortions introduced by the camera lens.

Intrinsic Camera Matrix (K)

The intrinsic parameters are typically represented in a 3x3 Intrinsic Camera Matrix (K):

$$ K = \begin{bmatrix} f_x & s & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix} $$

Where:

  • $f_x, f_y$: Focal lengths in pixels.
  • $c_x, c_y$: Coordinates of the principal point.
  • $s$: Skew coefficient.
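
As a quick illustration, the sketch below builds $K$ in NumPy for a hypothetical camera (the focal lengths and principal point are made-up values, not results from a real calibration) and uses it to project a 3D point already expressed in the camera's coordinate frame:

import numpy as np

# Hypothetical intrinsics, for illustration only (not a real calibration result)
fx, fy = 800.0, 800.0   # focal lengths in pixels (square pixels assumed)
cx, cy = 320.0, 240.0   # principal point, roughly the center of a 640x480 image

K = np.array([[fx, 0.0, cx],
              [0.0, fy, cy],
              [0.0, 0.0, 1.0]])  # skew s = 0, typical for modern sensors

# A 3D point in the camera coordinate frame (2 units in front of the camera)
X_cam = np.array([0.1, -0.05, 2.0])

# Pinhole projection: multiply by K, then divide by depth (the third component)
p = K @ X_cam
u, v = p[0] / p[2], p[1] / p[2]
print(f"Projected pixel: ({u:.1f}, {v:.1f})")  # -> (360.0, 220.0)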

Extrinsic Parameters of a Camera

Extrinsic parameters describe the camera's pose (position and orientation) in the 3D world. They are used to transform points from the world coordinate system to the camera's coordinate system.

The extrinsic parameters consist of:

  • Rotation Matrix (R): A 3x3 matrix that defines the orientation of the camera relative to the world coordinate system.
  • Translation Vector (t): A 3x1 vector that specifies the translation between the two coordinate systems. In the transform $X_{cam} = R \, X_{world} + t$, it is the position of the world origin expressed in camera coordinates (the camera center in world coordinates is $-R^T t$).

These parameters are often combined into a 3x4 Extrinsic Matrix:

$$ [R | t] = \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \end{bmatrix} $$
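
In OpenCV, the rotation is frequently stored as a compact 3-element rotation vector (axis-angle) rather than a full 3x3 matrix; cv2.Rodrigues converts between the two forms. A minimal sketch, using an arbitrary example pose:

import cv2
import numpy as np

# Arbitrary example pose, for illustration only
rvec = np.array([0.0, 0.1, 0.0])        # small rotation about the y-axis (radians)
tvec = np.array([[0.5], [0.0], [2.0]])  # translation vector (e.g., meters)

R, _ = cv2.Rodrigues(rvec)  # 3x3 rotation matrix from the rotation vector
Rt = np.hstack([R, tvec])   # 3x4 extrinsic matrix [R | t]
print(Rt)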

Full Camera Projection Model

The complete model that maps a 3D world point, written in homogeneous coordinates as $X = [X, Y, Z, 1]^T$, to its corresponding 2D image coordinates $x = [u, v, 1]^T$ is given by the following equation:

$$ x = K \cdot [R | t] \cdot X $$

This equation combines the intrinsic and extrinsic transformations:

  1. The extrinsic matrix $[R | t]$ transforms the 3D world point into the camera's coordinate system.
  2. The intrinsic matrix $K$ then projects this 3D point onto the 2D image plane, accounting for focal length, principal point, and skew.
  3. The result is a homogeneous 3-vector; dividing by its third component (the perspective divide) yields the final pixel coordinates $(u, v)$.
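
Putting the pieces together, here is a minimal end-to-end sketch of the projection (all parameter values are hypothetical, chosen only to make the example run):

import cv2
import numpy as np

# Hypothetical intrinsics and pose, for illustration only
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0,   0.0,   1.0]])
R, _ = cv2.Rodrigues(np.array([0.0, 0.1, 0.0]))
t = np.array([[0.5], [0.0], [2.0]])
Rt = np.hstack([R, t])                    # extrinsic matrix [R | t]

X_world = np.array([0.0, 0.0, 1.0, 1.0])  # homogeneous world point [X, Y, Z, 1]^T
p = K @ Rt @ X_world                      # homogeneous image coordinates
u, v = p[0] / p[2], p[1] / p[2]           # perspective divide
print(f"Projected pixel: ({u:.1f}, {v:.1f})")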

Lens Distortion

Real-world lenses introduce geometric distortions that deviate from the ideal pinhole camera model. These are typically modeled as radial and tangential distortions.

Radial Distortion

Radial distortion causes straight lines in the real world to appear curved in the image. It's more pronounced towards the edges of the image.

The distorted coordinates $(x_{distorted}, y_{distorted})$ are related to the ideal, undistorted normalized image coordinates $(x, y)$ by:

$$ x_{distorted} = x(1 + k_1 r^2 + k_2 r^4 + k_3 r^6) $$
$$ y_{distorted} = y(1 + k_1 r^2 + k_2 r^4 + k_3 r^6) $$

Tangential Distortion

Tangential distortion occurs when the lens is not perfectly parallel to the image sensor plane.

$$ x_{distorted} = x + \left[ 2 p_1 x y + p_2 (r^2 + 2x^2) \right] $$
$$ y_{distorted} = y + \left[ p_1 (r^2 + 2y^2) + 2 p_2 x y \right] $$

Where:

  • $r^2 = x^2 + y^2$ is the squared radial distance from the principal point, with $(x, y)$ in normalized image coordinates.
  • $k_1, k_2, k_3$ are the radial distortion coefficients.
  • $p_1, p_2$ are the tangential distortion coefficients.
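
To make the model concrete, here is a small sketch that applies both distortion terms to a point in normalized image coordinates (the coefficient values are arbitrary, for illustration only):

import numpy as np

def distort(x, y, k1, k2, k3, p1, p2):
    # Apply radial and tangential distortion to normalized image coordinates
    r2 = x**2 + y**2
    radial = 1 + k1 * r2 + k2 * r2**2 + k3 * r2**3
    x_d = x * radial + 2 * p1 * x * y + p2 * (r2 + 2 * x**2)
    y_d = y * radial + p1 * (r2 + 2 * y**2) + 2 * p2 * x * y
    return x_d, y_d

# Arbitrary example coefficients, for illustration only
print(distort(0.3, 0.2, k1=-0.1, k2=0.01, k3=0.0, p1=0.001, p2=0.001))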

Camera Calibration Using OpenCV – Hands-On

OpenCV provides robust functions for camera calibration, commonly performed using a known calibration pattern like a chessboard.

Step-by-Step Example (Conceptual):

  1. Prepare Calibration Images: Capture multiple images of a chessboard from different viewpoints.
  2. Detect Chessboard Corners: For each image, use cv2.findChessboardCorners to detect the exact pixel coordinates of the internal corners of the chessboard.
  3. Define Object Points: Create a list of the 3D coordinates of the chessboard corners in a known object coordinate system. For a chessboard, these are typically simple grid points.
  4. Calibrate the Camera: Use cv2.calibrateCamera with the lists of object points and image points, as in the script below. This function returns the intrinsic matrix ($K$), distortion coefficients ($dist$), rotation vectors ($rvecs$), and translation vectors ($tvecs$).

import cv2
import numpy as np
import glob

# Define chessboard dimensions (number of inner corners)
chessboard_size = (9, 6)
# Size of each square in real-world units (e.g., cm)
square_size = 1.0

# Prepare object points (3D coordinates of chessboard corners in world space)
# We create a grid of points assuming the chessboard lies on the Z=0 plane
objp = np.zeros((np.prod(chessboard_size), 3), np.float32)
objp[:, :2] = np.mgrid[0:chessboard_size[0], 0:chessboard_size[1]].T.reshape(-1, 2)
objp *= square_size

objpoints = []  # List to store 3D points
imgpoints = []  # List to store 2D points (detected corners)

# Load calibration images
# Assuming calibration images are in a folder named 'calib_images'
images = glob.glob('calib_images/*.jpg')

for fname in images:
    img = cv2.imread(fname)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Find the chessboard corners
    ret, corners = cv2.findChessboardCorners(gray, chessboard_size, None)

    # If corners are found, refine and store them
    if ret:
        objpoints.append(objp)
        # Refine corner locations to sub-pixel accuracy
        criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)
        corners = cv2.cornerSubPix(gray, corners, (11, 11), (-1, -1), criteria)
        imgpoints.append(corners)
        # Optional: Draw and display corners for verification
        # cv2.drawChessboardCorners(img, chessboard_size, corners, ret)
        # cv2.imshow('img', img)
        # cv2.waitKey(500) # wait 0.5 sec

# cv2.destroyAllWindows()

# Calibrate the camera
# The image size (width, height) is needed to express the focal length in pixels;
# the two None arguments mean no initial guesses for K or the distortion coefficients
ret, K, dist, rvecs, tvecs = cv2.calibrateCamera(objpoints, imgpoints, gray.shape[::-1], None, None)

print("Camera Matrix (Intrinsic Parameters):\n", K)
print("\nDistortion Coefficients:\n", dist)

# ret: RMS re-projection error
# K: Camera Matrix
# dist: Distortion coefficients (k1, k2, p1, p2, k3)
# rvecs: Rotation vectors (one for each image)
# tvecs: Translation vectors (one for each image)
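
A useful sanity check is to re-project the object points with the estimated parameters and compare against the detected corners. The snippet below computes the mean re-projection error over all calibration images; values well below one pixel typically indicate a good calibration:

# Mean re-projection error across all calibration images
mean_error = 0
for i in range(len(objpoints)):
    projected, _ = cv2.projectPoints(objpoints[i], rvecs[i], tvecs[i], K, dist)
    error = cv2.norm(imgpoints[i], projected, cv2.NORM_L2) / len(projected)
    mean_error += error
print("Mean re-projection error:", mean_error / len(objpoints))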

Undistort Image

Once the camera is calibrated, you can use the obtained intrinsic matrix ($K$) and distortion coefficients ($dist$) to undistort any new image taken with that camera.

# Load a test image
img = cv2.imread('test_image.jpg')
h, w = img.shape[:2]

# Compute a refined camera matrix for the undistorted image (optional but recommended)
# alpha=1 retains all source pixels (black borders may appear); alpha=0 keeps only valid pixels
newcameramtx, roi = cv2.getOptimalNewCameraMatrix(K, dist, (w, h), 1, (w, h))

# Undistort the image
dst = cv2.undistort(img, K, dist, None, newcameramtx)

# Optional: Crop the image to remove black borders after undistortion
# x, y, w_roi, h_roi = roi
# dst = dst[y:y+h_roi, x:x+w_roi]

# Display the original and undistorted images
cv2.imshow("Original Image", img)
cv2.imshow("Undistorted Image", dst)
cv2.waitKey(0)
cv2.destroyAllWindows()
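
When undistorting many frames from the same camera (e.g., a video stream), it is usually cheaper to compute the undistortion maps once with cv2.initUndistortRectifyMap and reuse them per frame with cv2.remap; a brief sketch:

# Precompute the undistortion maps once, then reuse them for every frame
mapx, mapy = cv2.initUndistortRectifyMap(K, dist, None, newcameramtx, (w, h), cv2.CV_32FC1)
undistorted = cv2.remap(img, mapx, mapy, cv2.INTER_LINEAR)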

Summary Table: Intrinsic vs. Extrinsic Parameters

| Parameter Type | Description | Example Parameters | Role |
| --- | --- | --- | --- |
| Intrinsic | Internal camera characteristics that define how 3D points are projected onto the image plane. | $K$ ($f_x$, $f_y$, $c_x$, $c_y$) | Models the camera's internal geometry and projection properties. |
| Extrinsic | The camera's position and orientation (pose) in the real world. | $R$, $t$ | Relates the camera's coordinate system to the world coordinate system. |

Applications of Camera Calibration

Camera calibration is a cornerstone for numerous computer vision applications:

  • Augmented Reality (AR): Overlaying virtual objects onto the real world requires precise knowledge of the camera's pose to align virtual content correctly.
  • Robotics and SLAM: Robots use calibrated cameras to perceive their environment, navigate, and build maps (SLAM). Accurate pose estimation is critical for these tasks.
  • 3D Modeling and Reconstruction: Creating 3D models of objects or scenes from images relies on correctly mapping 2D points to their 3D world locations.
  • Industrial Measurement Systems: High-precision measurement of dimensions and distances in manufacturing and quality control.
  • Stereo Vision and Depth Estimation: Calculating depth information requires understanding the relative poses and intrinsic parameters of two or more cameras.

Conclusion

Camera calibration is an indispensable technique in computer vision. It provides the mathematical framework needed to overcome lens imperfections and accurately relate the 2D images a camera captures to the 3D world. By understanding and applying intrinsic and extrinsic parameters, developers can unlock precise measurement, 3D understanding, and sophisticated interactions in augmented reality and robotics. Libraries like OpenCV make this process accessible through straightforward tools and calibration patterns like the chessboard.
