📐 Understanding Epipolar Geometry

The mathematical foundation of stereo vision and 3D reconstruction

🎯 What is Epipolar Geometry?

Epipolar geometry describes the geometric relationship between two cameras viewing the same 3D scene. It's the fundamental constraint that makes stereo vision and 3D reconstruction possible.

The Key Insight: When a point P in 3D space is viewed by two cameras, the image of P in one camera constrains where P can appear in the other camera: not to a single point, but to a line called the epipolar line.

The Epipolar Constraint

Consider a point P in 3D space observed by two cameras. Let p and p' be its projections in the left and right images. The epipolar constraint states that:

The 3D point P, the two camera centers O and O', and the two image points p and p' all lie on a single plane called the epipolar plane.
The intersection of this epipolar plane with each image plane creates a line—the epipolar line.
Given a point p in one image, its corresponding point p' must lie on the corresponding epipolar line in the other image.
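
The coplanarity statement above can be checked numerically. The following is a minimal sketch, not the simulator's code: the camera positions, the scene point, and the helper names (`image_point`, `cross`, `dot`) are all assumptions chosen for illustration. Both cameras are assumed to look along +Z with image planes at depth f = 1.

```python
def sub(a, b):
    return [a[i] - b[i] for i in range(3)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

O = [0.0, 0.0, 0.0]         # left camera center (assumed)
O_prime = [1.0, 0.0, 0.0]   # right camera center, baseline along X (assumed)
P = [0.3, 0.2, 4.0]         # scene point (assumed)

def image_point(center, P, f=1.0):
    """3D position where the ray center -> P crosses the image plane z = center_z + f."""
    d = sub(P, center)
    s = f / d[2]
    return [center[i] + s * d[i] for i in range(3)]

p = image_point(O, P)
p_prime = image_point(O_prime, P)

# Normal of the epipolar plane spanned by the baseline and the ray O -> P.
normal = cross(sub(O_prime, O), sub(P, O))

# Both image points should have zero offset from that plane.
residual_p = dot(normal, sub(p, O))
residual_p_prime = dot(normal, sub(p_prime, O))
print(residual_p, residual_p_prime)
```

Both residuals come out at zero (up to floating-point rounding), which is the epipolar constraint in its purely geometric form.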

Key Concepts

Baseline: The line connecting the two camera centers
Epipole: The point where the baseline intersects each image plane (where one camera "sees" the other camera's center)
Epipolar Line: The line on which a corresponding point must lie
Epipolar Plane: The plane containing the baseline and a 3D point

🔄 Stereo Rectification

Rectification is a transformation that simplifies stereo matching by warping both images so that corresponding epipolar lines become horizontal and aligned.

Why Rectify? After rectification, corresponding points in the left and right images lie on the same row. This reduces the 2D search problem to a 1D search along horizontal lines, which simplifies and speeds up stereo matching algorithms.
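
To make the 1D search concrete, here is a small sketch of matching along a single row. It uses a sum of absolute differences (SAD) cost, which is one common choice and not necessarily what any particular stereo matcher uses. The function name `match_along_row`, the window size, and the toy pixel rows are assumptions, as is the convention that a feature at column x in the left image appears at column x − d in the right image.

```python
def match_along_row(left_row, right_row, x, half=1, max_disp=4):
    """Return the disparity d that minimizes the SAD cost between the
    left patch centered at x and the right patch centered at x - d."""
    best_d, best_cost = 0, float("inf")
    for d in range(max_disp + 1):
        xr = x - d
        if xr - half < 0:
            break  # patch would leave the image
        cost = sum(abs(left_row[x + k] - right_row[xr + k])
                   for k in range(-half, half + 1))
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d

# Toy rectified rows: the pattern 9, 5, 7 sits two pixels further left on the right.
left_row = [0, 0, 0, 0, 9, 5, 7, 0, 0, 0]
right_row = [0, 0, 9, 5, 7, 0, 0, 0, 0, 0]

d = match_along_row(left_row, right_row, x=5)
print(d)  # 2
```

Without rectification the same search would have to walk along a slanted epipolar line and resample pixels at every step.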
            

The Rectification Process

Compute the fundamental/essential matrix describing the epipolar geometry between cameras
Find homography matrices H_L and H_R that send both epipoles to infinity along the horizontal axis and bring corresponding epipolar lines onto the same rows
Apply the transformations to warp both images
Result: Epipolar lines become horizontal scan lines

Properties of Rectified Images

All epipolar lines are horizontal
Corresponding epipolar lines have the same vertical coordinate
The epipoles are at infinity (parallel epipolar lines)
Stereo matching becomes a 1D search problem
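
These properties can be read off the fundamental matrix of an ideally rectified pair, which (up to scale) has the simple form used below. The sketch assumes that ideal case and uses hypothetical pixel coordinates; the helper names are illustrative.

```python
# Fundamental matrix of an ideal rectified pair (up to scale).
F_rect = [[0, 0, 0],
          [0, 0, -1],
          [0, 1, 0]]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

p = [120.0, 45.0, 1.0]        # left-image point (x, y, 1), assumed
line = matvec(F_rect, p)      # epipolar line (a, b, c) in the right image
print(line)                   # [0.0, -1.0, 45.0], i.e. -y' + 45 = 0

same_row = [80.0, 45.0, 1.0]   # candidate on the same row
other_row = [80.0, 46.0, 1.0]  # candidate one row lower
print(dot(same_row, line))     # 0.0  -> satisfies p'^T F p = 0
print(dot(other_row, line))    # -1.0 -> violates the constraint
```

The line has no x term, so it is horizontal, and it sits at the same y as the original point, which is exactly the "same vertical coordinate" property.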

📊 Mathematical Foundation

The Fundamental Matrix

The fundamental matrix F encodes the epipolar geometry between two uncalibrated cameras. For corresponding points p and p':

p'^T F p = 0

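Here p and p' are written in homogeneous pixel coordinates. Every pair of projections of the same 3D point satisfies this equation. The converse does not hold: any p' on the epipolar line of p also satisfies it, so the constraint narrows the search to a line rather than identifying the match.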

The Essential Matrix

When the cameras are calibrated (intrinsic parameters known), we use the essential matrix E, which relates normalized image coordinates. The relationship is:

E = K'^T F K

where K and K' are the camera intrinsic matrices.
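
The relationship between E and F can be verified with a synthetic camera pair. This is a sketch under stated assumptions: both cameras share the same hypothetical intrinsics (K = K'), the relative pose follows the convention X_right = R·X_left + t, and E is built as [t]_x·R. All numeric values and helper names are made up for illustration.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

def transpose(M):
    return [[M[j][i] for j in range(3)] for i in range(3)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def skew(t):
    """Cross-product matrix [t]_x, so matvec(skew(t), v) equals t x v."""
    return [[0.0, -t[2], t[1]], [t[2], 0.0, -t[0]], [-t[1], t[0], 0.0]]

# Assumed shared intrinsics: focal length 800 px, principal point (320, 240).
f, cx, cy = 800.0, 320.0, 240.0
K = [[f, 0.0, cx], [0.0, f, cy], [0.0, 0.0, 1.0]]
K_inv = [[1 / f, 0.0, -cx / f], [0.0, 1 / f, -cy / f], [0.0, 0.0, 1.0]]

# Assumed relative pose: small rotation about the vertical axis plus a sideways shift.
a = 0.1
R = [[math.cos(a), 0.0, math.sin(a)], [0.0, 1.0, 0.0], [-math.sin(a), 0.0, math.cos(a)]]
t = [-1.0, 0.0, 0.0]

E = matmul(skew(t), R)                           # essential matrix
F = matmul(transpose(K_inv), matmul(E, K_inv))   # F = K'^-T E K^-1

def to_pixels(K, X):
    u = matvec(K, X)
    return [u[0] / u[2], u[1] / u[2], 1.0]

X_left = [0.3, 0.2, 4.0]                         # scene point in left-camera coordinates
X_right = [r + ti for r, ti in zip(matvec(R, X_left), t)]
p = to_pixels(K, X_left)
p_prime = to_pixels(K, X_right)

residual = dot(p_prime, matvec(F, p))            # p'^T F p, expected to be ~0
E_back = matmul(transpose(K), matmul(F, K))      # K'^T F K, expected to equal E
print(residual)
```

The residual is zero up to rounding, and `E_back` reproduces `E`, matching E = K'^T F K.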

Epipolar Lines from the Fundamental Matrix

Given a point p in the left image, the epipolar line in the right image is: l' = F p
Given a point p' in the right image, the epipolar line in the left image is: l = F^T p'
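
A line l = (a, b, c) here means the set of points with a·x + b·y + c = 0, so the distance from a candidate point to its epipolar line is a natural measure of how well a match fits. The sketch below uses a hypothetical F (a pure translation t = (1, 0, 1) with identity rotation and intrinsics) and a hand-computed pair of corresponding points; none of these values come from the simulator.

```python
import math

# Hypothetical F for translation t = (1, 0, 1), no rotation, K = K' = identity.
F = [[0.0, -1.0, 0.0],
     [1.0, 0.0, -1.0],
     [0.0, 1.0, 0.0]]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

def transpose(M):
    return [[M[j][i] for j in range(3)] for i in range(3)]

def point_line_distance(pt, line):
    """Distance from homogeneous point (x, y, 1) to line (a, b, c)."""
    a, b, c = line
    return abs(a * pt[0] + b * pt[1] + c) / math.hypot(a, b)

# Scene point X = (1, 0.5, 2) seen from the left; X + t = (2, 0.5, 3) from the right.
p = [0.5, 0.25, 1.0]
p_prime = [2.0 / 3.0, 0.5 / 3.0, 1.0]

l_prime = matvec(F, p)                 # epipolar line in the right image
l = matvec(transpose(F), p_prime)      # epipolar line in the left image

d_match = point_line_distance(p_prime, l_prime)
d_left = point_line_distance(p, l)
d_off = point_line_distance([p_prime[0], p_prime[1] + 0.1, 1.0], l_prime)
print(d_match, d_left, d_off)
```

The true correspondence lies on both lines (distance near zero), while a point nudged off the line does not.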

🔢 The Homography Matrix

A homography (or projective transformation) is a 3×3 matrix H that maps points from one plane to another. It is the most general transformation of the plane that maps straight lines to straight lines, and it is linear when points are written in homogeneous coordinates. This makes it fundamental for image rectification, panorama stitching, and perspective correction.

The Transformation: A point (x, y) is transformed to (x_out, y_out) using homogeneous coordinates:

[x', y', w']^T = H · [x, y, 1]^T

Then: x_out = x'/w', y_out = y'/w'
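
In code, the two steps (matrix product, then division by w') look like the sketch below. The function name `apply_homography` and the example matrix are assumptions chosen so that the arithmetic is easy to follow.

```python
def apply_homography(H, x, y):
    """Map (x, y) through the 3x3 matrix H and divide by w'."""
    xp = H[0][0] * x + H[0][1] * y + H[0][2]
    yp = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity (w' is zero)")
    return xp / w, yp / w

# Translation by (10, 5) plus a small horizontal perspective term (1/256).
H = [[1.0, 0.0, 10.0],
     [0.0, 1.0, 5.0],
     [0.00390625, 0.0, 1.0]]

result = apply_homography(H, 256, 64)
print(result)  # (133.0, 34.5): x' = 266, y' = 69, w' = 2
```

The guard on w' matters in practice: points where the divisor vanishes are sent to infinity and have no finite image.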

Matrix Structure

The 3×3 homography matrix can be divided into regions that control different aspects of the transformation:

H = [ h₁₁  h₁₂  h₁₃ ]
    [ h₂₁  h₂₂  h₂₃ ]
    [ h₃₁  h₃₂  h₃₃ ]

Basic Transform Parameters

Linear Transforms (h₁₁, h₁₂, h₂₁, h₂₂)

The 2×2 upper-left submatrix controls linear transformations that preserve the origin:

Scale X & Y: Diagonal elements h₁₁ and h₂₂ control horizontal and vertical scaling. Values >1 enlarge, <1 shrink.
Shear X: Element h₁₂ skews the image horizontally (slants vertical lines)
Shear Y: Element h₂₁ skews the image vertically (slants horizontal lines)

Translation (h₁₃, h₂₃)

The rightmost column (excluding h₃₃) controls translation:

Translate X: Element h₁₃ shifts the image horizontally
Translate Y: Element h₂₃ shifts the image vertically

Perspective (h₃₁, h₃₂)

The bottom row elements create perspective (projective) effects—transformations that make parallel lines converge:

Perspective X: Element h₃₁ creates horizontal vanishing point effects
Perspective Y: Element h₃₂ creates vertical vanishing point effects

These parameters cause the divisor w' = h₃₁·x + h₃₂·y + h₃₃ to vary across the image, creating the characteristic perspective "foreshortening" effect.

Global Scale (h₃₃)

Element h₃₃ acts as a normalizing factor. Since a homography is defined only up to scale (H and k·H describe the same mapping for any nonzero k), h₃₃ is typically fixed at 1. Changing h₃₃ alone has the same effect as dividing every other element by that value.
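
Both properties of the scale ambiguity are easy to confirm: multiplying every element by k leaves the mapping unchanged, and setting h₃₃ to some value s acts like dividing all other elements by s. The sketch below uses a hypothetical matrix and helper names.

```python
def apply_homography(H, x, y):
    xp = H[0][0] * x + H[0][1] * y + H[0][2]
    yp = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return xp / w, yp / w

H = [[1.0, 0.0, 10.0],
     [0.0, 1.0, 5.0],
     [0.00390625, 0.0, 1.0]]

# 1) Scaling the whole matrix changes nothing.
H_scaled = [[2.0 * v for v in row] for row in H]
same_a = apply_homography(H, 256, 64)
same_b = apply_homography(H_scaled, 256, 64)

# 2) Setting h33 = s is equivalent to dividing every other element by s.
s = 2.0
H_h33 = [row[:] for row in H]
H_h33[2][2] = s
H_others = [[v / s for v in row] for row in H]
H_others[2][2] = 1.0
alt_a = apply_homography(H_h33, 256, 64)
alt_b = apply_homography(H_others, 256, 64)

print(same_a, same_b)
print(alt_a, alt_b)
```

This is why a homography has 8 degrees of freedom even though the matrix has 9 entries.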

🔄 Composite Transformations

While individual parameters offer basic control, more useful transformations come from setting several parameters together, tied to one another by trigonometric relationships.

2D Rotation

Pure rotation by angle θ requires a specific relationship between the linear elements:

H = [ cos θ   −sin θ   0 ]
    [ sin θ    cos θ   0 ]
    [ 0        0       1 ]

The cosine and sine functions ensure the transformation preserves distances and angles (it's an isometry). The constraint h₁₁ = h₂₂ = cos θ and h₂₁ = −h₁₂ = sin θ creates a pure rotation around the origin.
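
The constraint h₁₁ = h₂₂ = cos θ and h₂₁ = −h₁₂ = sin θ translates directly into code. The sketch below builds such a matrix and checks the isometry claim on two arbitrary points; the function names and test points are illustrative.

```python
import math

def rotation_homography(theta):
    """Rotation about the origin by theta radians, as a 3x3 homography."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]

def apply_homography(H, x, y):
    xp = H[0][0] * x + H[0][1] * y + H[0][2]
    yp = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return xp / w, yp / w

# A quarter turn sends (1, 0) to (0, 1), up to rounding.
quarter = apply_homography(rotation_homography(math.radians(90)), 1.0, 0.0)

# Distances are preserved for any angle: these two points are 5 apart.
H = rotation_homography(math.radians(37))
a = apply_homography(H, 1.0, 2.0)
b = apply_homography(H, 4.0, 6.0)
distance_after = math.hypot(b[0] - a[0], b[1] - a[1])
print(quarter, distance_after)
```

Because the bottom row is (0, 0, 1), the divisor w' is always 1 and no perspective effect is introduced.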

3D Rotation (Perspective Rotation)

A powerful technique for simulating 3D rotation of a planar surface combines scaling and perspective in a coordinated way. This is useful for correcting images taken at an angle to the subject.

The Key Insight: When you rotate a plane in 3D, two things happen simultaneously:
The dimension perpendicular to the rotation axis compresses (foreshortening)
A perspective gradient appears (the far edge appears smaller than the near edge)

For Yaw (rotation around the vertical axis) and Pitch (rotation around the horizontal axis), the 3D rotation homography is:

H = [ cos(yaw)      0               0 ]
    [ 0             cos(pitch)      0 ]
    [ −sin(yaw)/f   −sin(pitch)/f   1 ]

Where f is the focal length (in pixels). This matrix elegantly combines:

cos(yaw) scaling on X: Compresses horizontally as the surface rotates away—this corrects for the distortion that pure perspective would introduce
cos(pitch) scaling on Y: Compresses vertically for pitch rotation
−sin(yaw)/f perspective: Adds the horizontal vanishing point effect
−sin(pitch)/f perspective: Adds the vertical vanishing point effect

Why the scaling matters: If you only applied perspective without the cosine scaling, the image would appear stretched on one side. The cos(angle) scaling precisely compensates for this, ensuring the transformation looks like a natural 3D rotation rather than a distorted stretch.
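
A sketch of such a matrix in code follows. The layout is an assumption inferred from the four terms listed above: the cosines on the diagonal, the −sin/f terms in the bottom row, zeros elsewhere, and h₃₃ = 1. The function name `yaw_pitch_homography` is hypothetical, and coordinates are assumed to be measured from the image center.

```python
import math

def yaw_pitch_homography(yaw, pitch, f):
    """Assumed layout: cosine scaling on the diagonal, -sin/f perspective
    terms in the bottom row, all other entries 0, h33 = 1."""
    return [[math.cos(yaw), 0.0, 0.0],
            [0.0, math.cos(pitch), 0.0],
            [-math.sin(yaw) / f, -math.sin(pitch) / f, 1.0]]

def apply_homography(H, x, y):
    xp = H[0][0] * x + H[0][1] * y + H[0][2]
    yp = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return xp / w, yp / w

f = 800.0  # focal length in pixels (assumed)

# With zero angles the matrix reduces to the identity.
H_identity = yaw_pitch_homography(0.0, 0.0, f)

# With 30 degrees of yaw, points 400 px left and right of center move asymmetrically.
H = yaw_pitch_homography(math.radians(30), 0.0, f)
right_edge = apply_homography(H, 400.0, 0.0)
left_edge = apply_homography(H, -400.0, 0.0)
print(right_edge, left_edge)
```

Under this assumed layout, the edge at +x ends up farther from the center than the edge at −x: the cosine term shrinks both equally, and the varying divisor w' then magnifies one side and reduces the other.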
            

Combining Transformations

Multiple homographies can be composed by matrix multiplication. To apply transformation H₁ followed by H₂:

H_combined = H₂ · H₁

This allows building complex transformations from simpler components. For example, rotating around an arbitrary point c is done by: translating c to the origin → rotating → translating back, which as a matrix product reads H = T(c) · R · T(−c).
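
The order of multiplication is the part that is easy to get wrong, so here is a short sketch of the rotate-about-a-point example. The helper names (`translation`, `rotation`, `rotate_about`) are illustrative. Note that the matrix applied first appears rightmost in the product.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def translation(tx, ty):
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]

def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def apply_homography(H, x, y):
    xp = H[0][0] * x + H[0][1] * y + H[0][2]
    yp = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return xp / w, yp / w

def rotate_about(theta, cx, cy):
    """Translate (cx, cy) to the origin, rotate, translate back."""
    return matmul(translation(cx, cy),
                  matmul(rotation(theta), translation(-cx, -cy)))

H = rotate_about(math.radians(90), 2.0, 3.0)
center = apply_homography(H, 2.0, 3.0)   # the pivot stays put: (2, 3)
moved = apply_homography(H, 3.0, 3.0)    # one unit right of the pivot -> one unit above
print(center, moved)
```

Swapping the order of the two translations would rotate about a different point, since matrix multiplication is not commutative.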

🚀 Project Motivation

This interactive simulator was created as an educational tool to help visualize and understand the concepts of epipolar geometry, which can be abstract and difficult to grasp from equations alone.

Learning by Doing: By manipulating virtual cameras in 3D space and seeing how epipolar lines change in real time, users can develop an intuitive understanding of these geometric relationships.

Features of the Simulator

Interactive 3D Scene: Orbit around the scene to see how two cameras view a 3D object
Real-time Epipolar Lines: See how epipolar lines change as cameras move
Camera Presets: Explore different stereo configurations (parallel, converging, rotated)
Manual Rectification: Apply homography transformations to understand the rectification process
Visual Epipolar Planes: See the 3D planes that connect camera centers and scene points

This project is part of a series exploring camera geometry concepts, from basic pinhole cameras to advanced stereo vision techniques.

🎥 Try the Simulator

👏 Credits & Acknowledgments

Dhruva Gowda Storz
Project Creator & Developer
Claude Opus 4.5 (Anthropic)
AI Programming Assistant
Juan (juane3d on Sketchfab)
3D Thai Mask Model

This work uses "Thai Mask" licensed under CC-BY-4.0

Technologies Used

Three.js - 3D graphics library
WebGL - Hardware-accelerated graphics rendering
HTML5 Canvas - 2D overlays for epipolar lines

📚 Further Reading

Hartley & Zisserman - "Multiple View Geometry in Computer Vision" (The definitive textbook)
Wikipedia: Epipolar Geometry
Wikipedia: Fundamental Matrix
Wikipedia: Image Rectification