Lecture 10 - Stereo Reconstruction Flashcards
(28 cards)
What is Depth ambiguity in an image?
All points on the ray 𝑂𝑃 have the same image coordinates 𝑝.
A single 2D image lacks depth information because:
* All 3D points along the line of sight project to the same 2D pixel.
* The depth Z cannot be recovered from one image alone.
How is Depth ambiguity mathematically represented?
x = fX/Z = f(kX)/(kZ)
y = fY/Z = f(kY)/(kZ)
for any k ≠ 0
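A quick numerical check of the equations above, as a minimal sketch. The function name `project`, the focal length and the sample point are illustrative assumptions, not taken from the lecture.

```python
import numpy as np

def project(point, f=1.0):
    """Pinhole projection: x = f*X/Z, y = f*Y/Z."""
    X, Y, Z = point
    return np.array([f * X / Z, f * Y / Z])

# One 3D point and two scaled copies that lie on the same ray through O.
P = np.array([2.0, 1.0, 4.0])
pixels = [project(k * P) for k in (0.5, 1.0, 3.0)]

for pixel in pixels:
    print(pixel)  # the same image coordinates for every k
```

Since every scaled copy lands on the same pixel, the scale k (and therefore the depth) cannot be read back from a single image.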
What is the understanding of stereo vision in human eyes?
The disparity between the left and right retina images gives us a perception of depth
The disparity in the images is inversely proportional to the object’s distance: nearer objects produce larger disparity
What is Random Dot Stereogram?
- Generates 3D perception from two flat images with random dots.
- A shifted region in one of the images creates a depth illusion.
What is the process of Random Dot Stereogram?
- Generate an image with random dots
- Duplicate the image
- Select a rectangular region inside one of the images and shift it horizontally
- Fill the hole with more random dots
- View with a stereoscope OR focus on a point behind the image
- The rectangular region will appear in front of or behind the rest of the surface, giving a 3D perception
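The steps above can be sketched in a few lines of NumPy. This is only an illustration: the function name, image size, region position and shift are assumptions chosen for the example.

```python
import numpy as np

def random_dot_stereogram(height=128, width=128, shift=4, seed=0):
    """Return a (left, right) pair of binary random-dot images."""
    rng = np.random.default_rng(seed)
    # Generate an image with random dots and duplicate it.
    left = rng.integers(0, 2, size=(height, width), dtype=np.uint8)
    right = left.copy()
    # Select a central rectangle and shift it horizontally in the copy.
    top, bottom = height // 4, 3 * height // 4
    lo, hi = width // 4, 3 * width // 4
    right[top:bottom, lo - shift:hi - shift] = left[top:bottom, lo:hi]
    # Fill the strip uncovered by the shift with fresh random dots.
    right[top:bottom, hi - shift:hi] = rng.integers(
        0, 2, size=(bottom - top, shift), dtype=np.uint8)
    return left, right

left, right = random_dot_stereogram()
```

Neither image shows a visible rectangle on its own; the region only becomes apparent when the pair is fused. The direction of the shift determines whether it appears in front of or behind the background.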
What is Camera-Based Stereo?
Two cameras observe the same scene from different positions.
Triangulation is used to estimate 3D point locations.
REFER TO QUESTION ANSWER FOR IMAGE
What is the process of Camera-Based Stereo?
Process:
1. Identify matching pixels in both images.
2. Use the disparity to compute 3D position.
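Step 2 can be sketched for the simplest setup. The relation Z = f·B/d used here is the standard one for two parallel (rectified) cameras and is an assumption of this example, not something stated on the card. The function name and the numbers are hypothetical. Image coordinates are measured from the principal point and the result is expressed in the left camera frame.

```python
def triangulate_rectified(x_left, x_right, y, focal_px, baseline):
    """Recover (X, Y, Z) from one match in a rectified stereo pair.

    Assumes parallel cameras separated by `baseline` along the x axis,
    a focal length in pixels, and disparity d = x_left - x_right.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a visible point")
    Z = focal_px * baseline / disparity   # depth from disparity
    X = x_left * Z / focal_px             # invert x = f*X/Z
    Y = y * Z / focal_px                  # invert y = f*Y/Z
    return X, Y, Z

X, Y, Z = triangulate_rectified(
    x_left=35.0, x_right=0.0, y=14.0, focal_px=700.0, baseline=0.1)
print(X, Y, Z)
```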
What is the correspondence problem (How can we resolve the triangulation)?
We want to find the world coordinates of 𝑃, the 3D point whose projection on the left image plane is the pixel 𝑝.
- To complete the triangle, we need to find the projection of 𝑃 on the right image plane
- This is called finding the corresponding point of p
What is the Epipolar Plane?
The plane defined by a 3D point P and the two optical centers 𝑂𝑙 and 𝑂𝑟
What are Epipolar Lines?
The intersection of the epipolar plane with the image planes
What are Epipoles?
Where all epipolar lines in an image plane intersect
REFER TO NOTES FOR VISUAL EXAMPLE
What is the Essential matrix?
A 3x3 matrix that encodes the epipolar geometry of the two views
Describes the epipolar geometry in camera coordinates.
Defined as: E = [t]× R, where [t]× is the skew-symmetric (cross-product) matrix of the translation t and R is the rotation
Has rank 2, and 5 degrees of freedom
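A small numerical sketch of the definition. It assumes the convention P = R P′ + t, where P′ is a point in the second camera's frame and P the same point in the first camera's frame, so that normalized image points satisfy x^T E x′ = 0. Other texts use other conventions. The helper names and the sample pose are illustrative.

```python
import numpy as np

def skew(t):
    """Return [t]x, the matrix with skew(t) @ v == np.cross(t, v)."""
    tx, ty, tz = t
    return np.array([[0.0, -tz, ty],
                     [tz, 0.0, -tx],
                     [-ty, tx, 0.0]])

def essential_matrix(R, t):
    """E = [t]x R."""
    return skew(t) @ R

# Hypothetical relative pose: 10 degree rotation about y plus a translation.
angle = np.deg2rad(10.0)
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
t = np.array([1.0, 0.0, 0.2])
E = essential_matrix(R, t)

# One 3D point expressed in both camera frames (assumed P = R P' + t).
P_second = np.array([0.3, -0.2, 5.0])
P_first = R @ P_second + t
x_first = P_first / P_first[2]      # normalized image coordinates
x_second = P_second / P_second[2]

residual = x_first @ E @ x_second   # epipolar constraint, close to zero
rank = np.linalg.matrix_rank(E, tol=1e-9)
print(residual, rank)
```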
What is the Fundamental matrix?
Similar to the essential matrix, but in pixel coordinates.
Depends on intrinsic and extrinsic camera parameters
Maps a point 𝑝 in one image to an epipolar line 𝑙′ in the other; corresponding points 𝑝 and 𝑝′ satisfy 𝑝^T F 𝑝′ = 0 (so 𝑙′ = F^T 𝑝 in this convention)
Has rank 2, and 7 degrees of freedom
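To illustrate how the intrinsic and extrinsic parameters combine, here is a sketch under stated assumptions: pixel points satisfy p^T F p′ = 0, the pose convention is P = R P′ + t, and F = K^{-T} [t]× R K′^{-1}, which follows from substituting x = K^{-1} p and x′ = K′^{-1} p′ into x^T E x′ = 0. The intrinsic matrices and the pose are made-up values.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]x."""
    tx, ty, tz = t
    return np.array([[0.0, -tz, ty], [tz, 0.0, -tx], [-ty, tx, 0.0]])

def fundamental_matrix(K, K_prime, R, t):
    """F = K^-T [t]x R K'^-1 (assumed convention p^T F p' = 0)."""
    E = skew(t) @ R
    return np.linalg.inv(K).T @ E @ np.linalg.inv(K_prime)

def epipolar_line(F, p):
    """Line l' = F^T p in the second image, as (a, b, c) with ax + by + c = 0."""
    return F.T @ p

def distance_to_line(line, p):
    """Distance in pixels from homogeneous point p to the line."""
    a, b, c = line
    x, y = p[0] / p[2], p[1] / p[2]
    return abs(a * x + b * y + c) / np.hypot(a, b)

# Hypothetical intrinsics (shared by both cameras) and relative pose.
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
angle = np.deg2rad(5.0)
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
t = np.array([0.5, 0.0, 0.05])
F = fundamental_matrix(K, K, R, t)

# Project one 3D point into both images (pixel coordinates).
P_second = np.array([0.4, -0.3, 4.0])
P_first = R @ P_second + t
p = K @ (P_first / P_first[2])
p_prime = K @ (P_second / P_second[2])

line = epipolar_line(F, p)
distance = distance_to_line(line, p_prime)
print(distance)  # close to zero: p' lies on the epipolar line of p
```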
What is the 8-point algorithm and what are the steps?
Used to estimate the fundamental matrix
Steps:
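- Construct the m x 9 matrix A from m ≥ 8 point correspondences (one row per correspondence)
- Find the Singular Value Decomposition (SVD) of A = 𝑈𝑆𝑉^T
- The last column of 𝑉 (the right singular vector for the smallest singular value) contains the entries of the 𝐹 matrix
- F must have rank 2, so enforce it:
- Find SVD of F
- Set the smallest singular value in 𝑆𝑓 to zero to create 𝑆𝑓′
- Recompute 𝐹 = 𝑈𝑓 𝑆𝑓′ 𝑉𝑓^T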
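The steps above can be sketched as follows. This is the plain, unnormalized variant listed on the card; practical implementations usually normalize the coordinates first, which is left out here. The convention p^T F p′ = 0 and row-major stacking of F are assumptions of the sketch. The synthetic, noise-free test data are made up, with the intrinsics set to the identity so that coordinates stay near 1.

```python
import numpy as np

def eight_point(p, p_prime):
    """Estimate F from m >= 8 matches given as (m, 3) homogeneous points.

    Assumes each pair satisfies p_i^T F p'_i = 0.
    """
    p = np.asarray(p, dtype=float)
    p_prime = np.asarray(p_prime, dtype=float)
    if p.shape[0] < 8:
        raise ValueError("need at least 8 correspondences")
    # One row per match: kron(p, p') dotted with F.flatten() equals p^T F p'.
    A = np.stack([np.kron(a, b) for a, b in zip(p, p_prime)])
    # Null vector of A: the last row of V^T, i.e. the last column of V.
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Enforce rank 2 by zeroing the smallest singular value.
    Uf, Sf, Vft = np.linalg.svd(F)
    Sf[-1] = 0.0
    return Uf @ np.diag(Sf) @ Vft

# Synthetic check: 12 random 3D points seen by two cameras.
rng = np.random.default_rng(0)
P_second = rng.uniform([-2.0, -2.0, 2.0], [2.0, 2.0, 6.0], size=(12, 3))
c, s = np.cos(0.1), np.sin(0.1)
R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
t = np.array([1.0, 0.1, 0.2])
P_first = P_second @ R.T + t                # assumed pose: P = R P' + t
pts = P_first / P_first[:, 2:3]
pts_prime = P_second / P_second[:, 2:3]

F = eight_point(pts, pts_prime)
residuals = np.abs(np.einsum("ij,jk,ik->i", pts, F, pts_prime))
print(residuals.max())
```

F is only defined up to scale, so the result is checked through the residuals p^T F p′ rather than by comparing entries directly.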
What is Stereo Image Rectification?
Transforms a stereo image pair so that:
* Epipolar lines become horizontal and aligned.
* The correspondence search simplifies to a 1D horizontal scan.
What are the general steps of Stereo Image Rectification?
Step 1: Compute Rectification Rotation Matrices
* Compute rectification matrix for the left camera: Rl
* Compute rectification matrix for the right camera as: Rr=R*Rl where R is the relative rotation between the two cameras.
Step 2: Transform Each Point
For each left camera point p = [x, y, f]^T
1. Apply Rectification Rotation:
Rl p = [x′, y′, z′]^T
2. Project the Rotated Point to Image Plane:
p′ = (f/z′) [x′, y′, z′]^T
Repeat the same steps for the right camera using Rr instead of Rl
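Step 2 can be sketched as below. The rotation matrices are hypothetical placeholders: how Rl itself is computed is not covered on this card, so small rotations about the y axis stand in for it and for the relative rotation R.

```python
import numpy as np

def rotation_y(angle_deg):
    """Rotation about the y axis, used only to make up example matrices."""
    a = np.deg2rad(angle_deg)
    return np.array([[np.cos(a), 0.0, np.sin(a)],
                     [0.0, 1.0, 0.0],
                     [-np.sin(a), 0.0, np.cos(a)]])

def rectify_point(R_rect, p, f):
    """Rotate p = [x, y, f]^T, then rescale so it lies on the plane z = f."""
    x_r, y_r, z_r = R_rect @ p
    return (f / z_r) * np.array([x_r, y_r, z_r])

f = 50.0
R_l = rotation_y(5.0)    # placeholder rectification rotation (left)
R = rotation_y(-3.0)     # placeholder relative rotation between cameras
R_r = R @ R_l            # right rectification rotation, as in Step 1

p_left_rect = rectify_point(R_l, np.array([10.0, -5.0, f]), f)
p_right_rect = rectify_point(R_r, np.array([4.0, -5.0, f]), f)
print(p_left_rect, p_right_rect)
```

After the projection step the third coordinate is f again, so the first two entries are the point's position in the rectified image.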
===
GENERAL NOTES:
Reproject the image planes onto a common plane parallel to the line between the optical centers
- Requires a rotation around the optical center i.e. the focal point
- A rotation around the optical center is simply a 2D homography in the image
Stereo Rectification – What are the Homography Matrix Properties?
Let H and H′ be the homographies applied to rectify the left and right images respectively:
1. Epipole Alignment:
Lines h2 and h2′ must pass through the epipoles e and e′ respectively.
2. Epipolar Line Correspondence:
○ h2 and h2′,
○ h3 and h3′
are corresponding epipolar lines.
3. Rectifying Plane Definition:
○ h3 and h3′ define the rectifying plane.
4. Non-uniqueness:
○ The pair (H,H′) is not unique.
AS WELL AS THE MATHEMATICAL PROPERTIES - REFER TO SLIDES
What is Polar Rectification?
Used when the epipole is inside or close to the image:
* Standard rectification fails due to heavy distortion.
* Polar rectification remaps image coordinates in circular form around the epipole.
How to achieve perfect correspondences?
Correspondence Problem
* Goal: Match each pixel in the left image with its true counterpart in the right image.
* Epipolar constraint reduces search to the epipolar line.
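For a rectified pair the epipolar line of a pixel is simply the same image row, so the search becomes a 1D scan. The sketch below uses a sum-of-squared-differences window cost and assumes the match sits at x_right = x_left − d. The function name, window size and synthetic images are illustrative.

```python
import numpy as np

def match_along_row(left, right, y, x, half=2, max_disp=16):
    """Return the disparity d in [0, max_disp] with the lowest SSD cost."""
    patch = left[y - half:y + half + 1, x - half:x + half + 1]
    best_disp, best_cost = None, np.inf
    for d in range(max_disp + 1):
        xr = x - d                      # candidate column in the right image
        if xr - half < 0:               # window would leave the image
            break
        candidate = right[y - half:y + half + 1, xr - half:xr + half + 1]
        cost = np.sum((patch - candidate) ** 2)
        if cost < best_cost:
            best_disp, best_cost = d, cost
    return best_disp

# Synthetic pair: the right image is the left one shifted by 3 pixels.
rng = np.random.default_rng(0)
left = rng.random((32, 48))
right = np.roll(left, -3, axis=1)
disparity = match_along_row(left, right, y=10, x=20)
print(disparity)
```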
Why is it challenging to find corresponding points in two (rectified) image pairs?
(Challenges to achieve perfect correspondence)
- Textureless regions
- Repetitive patterns
- Occlusions
- Window size selection
What is putting constraints on correspondence search?
- Epipolar: Corresponding points must lie on corresponding epipolar lines
- Uniqueness: A point in one image can have only one corresponding point in the other image
- Ordering: Points in the second image must appear in the same order as in the first image. Ordering of matches along corresponding epipolar lines must be consistent.
- Smoothness: Disparity values are expected to change slowly except at object boundaries
- Left-Right Constraint: Matching is performed a second time and inconsistent matches removed. Good for detecting invalid matches caused by occlusions.
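The left-right constraint can be sketched as a check between two disparity maps. The assumptions are labeled here and in the code: disparities are whole pixels, a left pixel at column x matches the right pixel at column x − d, and the tolerance of one pixel is an arbitrary choice.

```python
import numpy as np

def left_right_check(disp_left, disp_right, tol=1):
    """Boolean mask of left-image pixels whose match is confirmed both ways."""
    d_left = np.rint(disp_left).astype(int)
    h, w = d_left.shape
    xs = np.tile(np.arange(w), (h, 1))
    ys = np.tile(np.arange(h)[:, None], (1, w))
    x_right = xs - d_left                        # matched column in right image
    inside = (x_right >= 0) & (x_right < w)
    x_safe = np.clip(x_right, 0, w - 1)          # avoid out-of-range indexing
    back = np.asarray(disp_right)[ys, x_safe]    # disparity found right-to-left
    return inside & (np.abs(d_left - back) <= tol)

disp_left = np.full((3, 6), 2)
disp_right = np.full((3, 6), 2)
valid = left_right_check(disp_left, disp_right)

# Corrupt one right-image disparity, as an occlusion might.
disp_right_bad = disp_right.copy()
disp_right_bad[1, 1] = 5
valid_bad = left_right_check(disp_left, disp_right_bad)
print(valid_bad[1])
```

Pixels whose match falls outside the right image are marked invalid as well, which is why the first two columns fail in this toy example.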
What is Image Matching and what are some techniques?
Find corresponding points in two images
▪ Area Based
▪ Feature Based
▪ Hybrid
▪ Relaxation
▪ Dynamic Programming
What is the best matching technique?
It depends on whether you care about speed or invariance.
* Zero-mean: Fastest, very sensitive to local intensity.
* Sum of squared differences: Medium speed, sensitive to intensity offsets.
* Normalized cross-correlation: Slowest, invariant to contrast and brightness.
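A small sketch of the trade-off for two of the measures. The NCC here is the mean-subtracted form, which is an assumption about which variant is meant. The patch values and the brightness and contrast change are made up.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences: 0 for identical patches, larger is worse."""
    return float(np.sum((a - b) ** 2))

def ncc(a, b):
    """Mean-subtracted normalized cross-correlation in [-1, 1]."""
    a0 = a - a.mean()
    b0 = b - b.mean()
    denom = np.sqrt(np.sum(a0 ** 2) * np.sum(b0 ** 2))
    if denom == 0.0:
        return 0.0                      # flat patch: correlation undefined
    return float(np.sum(a0 * b0) / denom)

patch = np.arange(25, dtype=float).reshape(5, 5)
brighter = 2.0 * patch + 10.0           # same content, new contrast and offset

print(ssd(patch, patch), ssd(patch, brighter))
print(ncc(patch, brighter))
```

SSD reports a large cost for the rescaled patch even though the content is unchanged, while NCC still scores it as a perfect match, at the price of the extra mean and normalization work per window.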
What are the camera and projector systems?
- Laser stripe scanner
- Structured light system – Binary code
- Structured light – color coded
- Spatial Pattern