Stereo Basics and Epipolar Geometry - Week 7/8 Flashcards

1
Q

What is the goal of stereo / epipolar geometry?

A

Recovery of 3D structure

2
Q

What is the problem with single-view geometry for stereo imaging?

A

Recovery of structure from one image is inherently ambiguous

3
Q

What visual cues can be used to recover 3D geometry from a 2D image?

A

Shading
Texture
Focus
Perspective
Motion

4
Q

What is stereo vision?

A

Given several images of the same object or scene, compute a representation of its 3D shape

Narrower definition:
Given a calibrated binocular stereo pair, fuse it to produce a depth image

5
Q

What is triangulation?

A

Reconstructs a point in 3D space as the intersection of two rays

Requires:
- Camera pose (calibration)
- Point correspondence

6
Q

What is the focal length for a pinhole camera?

A

The distance between the image plane and the centre of projection

7
Q

What two equations project a scene point (X, Y, Z), measured from the centre of projection, to the point (x, y) on the image plane, given the focal length f?

A

x = f * X / Z

y = f * Y / Z
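
As a minimal sketch of the projection equations, assuming the scene point is already expressed in the camera frame with Z measured along the optical axis (the function name `project_pinhole` and the sample numbers are illustrative only):

```python
def project_pinhole(X, Y, Z, f):
    """Project a scene point (X, Y, Z) in the camera frame onto the image plane."""
    if Z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    # Similar triangles: image coordinates scale with f / Z.
    return f * X / Z, f * Y / Z


# Illustrative values: a point 4 units in front of a camera with f = 0.5.
x, y = project_pinhole(2.0, 1.0, 4.0, f=0.5)
print(x, y)
```

Doubling Z halves both image coordinates, which is why a single view cannot tell a small near object from a large far one.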

8
Q

What is the baseline in a stereo system?

A

The distance between the centres of projection of the two cameras

9
Q

What is the definition of disparity?

A

Displacement between conjugate (corresponding) points in left and right images

10
Q

What is the formula for the depth Z of a scene point seen in two stereo images, given the baseline b, the focal length f and the disparity (xl - xr) between the two corresponding image points?

A

Z = b*f/(xl - xr)
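
A small sketch of this formula, assuming a rectified pair with parallel optical axes, f and the image coordinates in the same units (e.g. pixels), and a positive disparity; the function name and the sample numbers are made up for illustration:

```python
def depth_from_disparity(x_left, x_right, baseline, focal_length):
    """Depth Z = b * f / (xl - xr) for a rectified stereo pair."""
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("expected a positive disparity (xl > xr)")
    return baseline * focal_length / disparity


# Illustrative values: b = 0.1 m, f = 800 px, disparity = 20 px.
z = depth_from_disparity(420.0, 400.0, baseline=0.1, focal_length=800.0)
print(z)  # depth in the units of the baseline (metres here)
```

Because Z is inversely proportional to disparity, a fixed one-pixel matching error causes a much larger depth error for distant points than for near ones.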

11
Q

What are the components of stereo analysis?

A

Find correspondences
- Conjugate pairs of points
- Potentially hard - lots of pairs

Reconstruction
- Calculate scene coordinates (X, Y, Z)
- Easy once you have done…

Calibration
- Calculate parameters of cameras (b, f, …)

12
Q

What is the epipolar constraint?

A

The match for a given (xl, yl) lies on the line yr = yl in the other image
(For the simple system given in the lectures)

13
Q

What makes edges good places to match for correspondences?

A
  • They correspond to significant structure
  • Small number of points to match (there usually aren’t too many of them - combinatorics)
  • Can use image features (polarity, direction) to verify matches
  • They can be located accurately (Canny - sub-pixel localisation)
  • Multi-scale location (coarse-to-fine search)
14
Q

What are the problems with matching edges for correspondence?

A

Image gradients at corresponding points may not be equally high
- Shadows, occlusions, illumination differences

Horizontal edges are difficult to match
- Match points are poorly localised along epipolar lines

  • Not all significant structure lies on edges
  • Edge-magnitude features may not be reliable for matching
  • Near-horizontal edges do not provide good localisation
15
Q

What are interest operators?

A

Locally distinct points
Edge matches could be obtained at neighbouring points along an edge

“Interest” operators seek isolated discrete points

Moravec operator
DoG, LoG
Harris corner detection

16
Q

What is the Moravec operator?

A

For a region (e.g. 5x5 pixels), calculate the sum of squared intensity differences in each of four directions:
- sum((I(i, j) - I(i+1, j))^2)
- sum((I(i, j) - I(i-1, j+1))^2)
- sum((I(i, j) - I(i, j+1))^2)
- sum((I(i, j) - I(i+1, j+1))^2)

Output the minimum of the 4 values above.

Suppress non-maxima of the filter output
- Isolate local maxima to get distinct points

Find points where intensity is varying quickly
- Taking minimum eliminates edges as candidates
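
The operator can be sketched in plain Python as below. This is an illustration under stated assumptions, not the lecture's implementation: it uses squared differences (the usual formulation), a 5x5 window, leaves border pixels at zero, and the names `moravec` and `local_maxima` are hypothetical.

```python
def moravec(image, half_window=2):
    """Moravec interest value per pixel of a 2D list; borders are left at 0."""
    rows, cols = len(image), len(image[0])
    shifts = [(1, 0), (-1, 1), (0, 1), (1, 1)]  # the four directions on the card
    response = [[0.0] * cols for _ in range(rows)]
    margin = half_window + 1  # keeps the shifted window inside the image
    for i in range(margin, rows - margin):
        for j in range(margin, cols - margin):
            sums = []
            for di, dj in shifts:
                total = 0.0
                for u in range(i - half_window, i + half_window + 1):
                    for v in range(j - half_window, j + half_window + 1):
                        diff = image[u][v] - image[u + di][v + dj]
                        total += diff * diff
                sums.append(total)
            # An edge has ~zero variation along its own direction,
            # so taking the minimum rejects it.
            response[i][j] = min(sums)
    return response


def local_maxima(response, threshold=0.0):
    """Non-maximum suppression: keep strict 3x3 local maxima above threshold."""
    points = []
    for i in range(1, len(response) - 1):
        for j in range(1, len(response[0]) - 1):
            value = response[i][j]
            neighbours = [response[i + a][j + b]
                          for a in (-1, 0, 1) for b in (-1, 0, 1)
                          if (a, b) != (0, 0)]
            if value > threshold and value > max(neighbours):
                points.append((i, j))
    return points


# Synthetic test images: a vertical step edge and the corner of a bright square.
edge = [[1.0 if j >= 6 else 0.0 for j in range(12)] for i in range(12)]
corner = [[1.0 if (i >= 6 and j >= 6) else 0.0 for j in range(12)] for i in range(12)]
edge_response = moravec(edge)
corner_response = moravec(corner)
```

On the step-edge image the response is zero everywhere, while the corner image gives a positive response at the corner, which is the behaviour the minimum is meant to produce.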

17
Q

Do the two cameras for stereo imaging need to have parallel optical axes?

A

No

18
Q

Why is the epipolar constraint useful?

A

It reduces finding a point’s correspondence to a 1D search along conjugate epipolar lines

19
Q

What is the baseline?

A

Line joining the camera centres.

20
Q

What is the epipole?

A

Point of intersection of the baseline with the image plane

21
Q

What is the epipolar plane?

A

Plane containing the baseline and the world point

22
Q

What is the epipolar line?

A

Intersection of the epipolar plane with the image plane

23
Q

What is the significance of the epipolar line of a point P?

A

Potential matches for a point p on an epipolar line in image 1 have to lie on the corresponding epipolar line in image 2

24
Q

How are coordinate systems related?

A

Rotation matrices Rl and Rr giving the orientations of each of the camera coordinate systems relative to the scene coordinate system

Translation vectors Tl and Tr between the camera origins and the scene origins

25
Q

Why do the back-projected rays in stereo reconstruction normally not intersect?

A

Because of measurement inaccuracies
Need to find the mid-point of the shortest segment joining the two rays (the segment between the closest points on each ray)
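
A minimal sketch of this mid-point method, assuming each ray is given by a camera centre and a direction in a common scene frame; the function names and the sample rays are illustrative only:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def midpoint_triangulate(p1, d1, p2, d2, eps=1e-12):
    """Mid-point of the shortest segment between rays p1 + s*d1 and p2 + t*d2."""
    w = [m - n for m, n in zip(p1, p2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < eps:
        raise ValueError("rays are (nearly) parallel")
    # Closest-point parameters from setting both derivatives of the
    # squared distance |w + s*d1 - t*d2|^2 to zero.
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    q1 = [p + s * k for p, k in zip(p1, d1)]
    q2 = [p + t * k for p, k in zip(p2, d2)]
    return [(u + v) / 2 for u, v in zip(q1, q2)]


# Two cameras 1 unit apart; small opposite y offsets stand in for measurement noise,
# so the rays pass close to each other without meeting.
point = midpoint_triangulate([0.0, 0.0, 0.0], [0.5, 0.02, 2.0],
                             [1.0, 0.0, 0.0], [-0.5, -0.02, 2.0])
print(point)
```

When the rays do intersect exactly, both closest points coincide and the mid-point is the intersection itself.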

26
Q

What does it mean for a camera rig to be calibrated?

A

It means we know how to translate and rotate camera reference frame 1 to get camera reference frame 2

27
Q

How is the rotation mathematically represented in stereo geometry?

A

A 3x3 matrix

28
Q

What does the cross product do?

A

Takes two vectors and returns a third vector that’s perpendicular to both inputs

29
Q

What is the essential matrix?

A

Relates corresponding image points between both cameras, given the rotation and translation.

If we observe a point in one image, its position in other image is constrained to lie on the line defined by the essential matrix
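
As a hedged illustration, one common convention (an assumption here, since the card does not fix one) writes a point in camera 2 coordinates as P2 = R P1 + t and defines E = [t]x R, where [t]x is the cross-product matrix of t. Corresponding normalised image points then satisfy x2 . (E x1) = 0, and E x1 gives the coefficients of the epipolar line in image 2. The helper names and pose values below are made up:

```python
import math


def skew(t):
    """Cross-product matrix [t]x, so that skew(t) applied to v equals t x v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty], [tz, 0.0, -tx], [-ty, tx, 0.0]]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def essential_matrix(R, t):
    return matmul(skew(t), R)


# Assumed pose: 10 degree rotation about the y axis, translation along x.
angle = math.radians(10.0)
R = [[math.cos(angle), 0.0, math.sin(angle)],
     [0.0, 1.0, 0.0],
     [-math.sin(angle), 0.0, math.cos(angle)]]
t = [-0.2, 0.0, 0.0]
E = essential_matrix(R, t)

P1 = [0.3, -0.1, 2.0]                              # scene point in camera 1 frame
P2 = [m + n for m, n in zip(matvec(R, P1), t)]     # same point in camera 2 frame
x1 = [P1[0] / P1[2], P1[1] / P1[2], 1.0]           # normalised image points
x2 = [P2[0] / P2[2], P2[1] / P2[2], 1.0]
line2 = matvec(E, x1)                              # epipolar line in image 2
residual = dot(x2, line2)                          # ~0 for a true correspondence
```

The residual is zero up to rounding for any scene point, which is what makes E usable as a test for candidate matches.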

30
Q

What is rectifying?

A

Transforming (warping) two images so that their image planes are parallel
- Epipoles should be at infinity
- Epipolar lines should be horizontal

31
Q

How is rectification achieved?

A

By re-projecting image planes onto a common plane parallel to the line between optical centres

32
Q

What are the 4 main steps to stereo reconstruction?

A
  1. Calibrate Cameras
  2. Rectify Images
  3. Compute Disparity
  4. Estimate Depth
33
Q

What are the hard constraints of epipolar geometry?

A

That the corresponding point to p in image 1 must lie on the corresponding epipolar line in image 2

34
Q

What are the soft constraints of epipolar geometry?

A

Properties that true matches are expected, but not guaranteed, to satisfy:
- Similarity
- Uniqueness
- Ordering

35
Q

What are the assumptions made to find matches in the image pair?

A

Most scene points visible from both views
Image regions for the matches are similar in appearance

36
Q

How do dense correspondence searches work?

A

For each pixel in the first (left) image:
- Find the corresponding epipolar line in the second (right) image
- Examine all pixels on the epipolar line and pick the best match (e.g. SSD, correlation)
- Triangulate the matches to get depth information

The search is simplest when epipolar lines are scanlines -> rectify the images first
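
The per-pixel search can be sketched as follows for a rectified pair, where the epipolar line is simply the same row. This is an illustration under assumptions: SSD as the match score, a 3x3 window, non-negative disparity d = xl - xr, and a synthetic textured image; the names `ssd` and `best_disparity` are hypothetical.

```python
def ssd(left, right, row, col_left, col_right, half_window):
    """Sum of squared differences between two windows on the same scanline."""
    total = 0.0
    for dr in range(-half_window, half_window + 1):
        for dc in range(-half_window, half_window + 1):
            diff = left[row + dr][col_left + dc] - right[row + dr][col_right + dc]
            total += diff * diff
    return total


def best_disparity(left, right, row, col, max_disparity, half_window=1):
    """Disparity with the lowest SSD for left pixel (row, col).

    The caller is assumed to keep the left window inside the image.
    """
    best_d, best_cost = 0, float("inf")
    for d in range(max_disparity + 1):
        col_right = col - d
        if col_right - half_window < 0:
            break  # window would leave the right image
        cost = ssd(left, right, row, col, col_right, half_window)
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


# Synthetic data: a textured right image, and a left image that is the
# right image shifted by 3 columns (a constant disparity of 3).
height, width, shift = 5, 16, 3
right = [[(7 * c * c) % 31 + r for c in range(width)] for r in range(height)]
left = [[right[r][c - shift] if c >= shift else 0 for c in range(width)]
        for r in range(height)]
d = best_disparity(left, right, row=2, col=10, max_disparity=5)
```

With this texture the lowest-cost candidate is the true shift; on a textureless image every candidate would score the same and the choice would be arbitrary.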

37
Q

What is the effect of the window size in a window search (example of dense correspondence search)?

A

Want a window large enough to have sufficient intensity variation, yet small enough to contain only pixels with about the same disparity

38
Q

What is sparse correspondence search?

A

Restrict the search to a sparse set of detected features
Rather than pixel values (or lists of pixel values), use a feature descriptor and an associated feature distance.

Can still narrow search further using epipolar geometry

39
Q

Dense vs Sparse correspondence search comparison?

A

Sparse
Pros:
- Efficient
- Can have more reliable feature matches, less sensitive to illumination than raw pixels
Cons:
- Have to know enough to pick good features
- Gives only sparse depth information

Dense
Pros:
- Simple process
- More depth estimates, can be useful for surface reconstruction
Cons:
- Breaks down in textureless regions
- Raw pixel distances can be brittle
- Not good with very different viewpoints
40
Q

What are the difficulties caused by the similarity constraint?

A

Un-textured surfaces
- Can’t distinguish one all-white patch from any other all-white patch

Occlusions
- Feature may not be visible from both images

41
Q

What is the ordering soft constraint?

A

Points on the same surface (opaque object) will be in same order in both views

42
Q

What are the possible sources of error for stereo?

A

Low-contrast / textureless image regions

Occlusions

Camera calibration errors

Violations of brightness constancy (e.g. specular reflections)

Large motions

43
Q

What are some applications of stereo imaging?

A

Depth for segmentation
(Combining edges found in the disparity map with image edges enhances the contours found)

View interpolation (From Brave Search: View interpolation is the process of creating a sequence of synthetic images that represent a smooth transition from one view of a scene to another)

Virtual viewpoint video (From Brave Search: virtual viewpoint video, also called free-viewpoint video (FVV), is a form of user-centered virtual reality that allows viewers to freely select the viewing position and angle)

44
Q

What are the parameters of a camera?

A

Extrinsic parameters:
- Rotation matrix R (3x3) (3 free parameters)
- Translation Vector (Tx, Ty, Tz)

Intrinsic parameters:
- Relate pixel coordinates to image coordinates
- Pixel size (sx, sy): pixels may not be square
- Origin offset (dx, dy): pixel origin may not be on optic axis
- Focal length, f
- Not totally independent (Need dx, dy, f, sx/sy)
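
One way to read the intrinsic parameters is sketched below. The convention is an assumption, not taken from the card: (dx, dy) is the pixel position of the point where the optic axis meets the image, and (sx, sy) are the pixel sizes, so image coordinates come out in the same units as f. Axis directions and signs vary between texts, and the function names and values are illustrative.

```python
def pixel_to_image(u, v, dx, dy, sx, sy):
    """Convert pixel coordinates (u, v) to image-plane coordinates (x, y)."""
    return (u - dx) * sx, (v - dy) * sy


def image_to_pixel(x, y, dx, dy, sx, sy):
    """Inverse mapping: image-plane coordinates back to pixel coordinates."""
    return x / sx + dx, y / sy + dy


# Illustrative values only: non-square pixels and an offset origin.
x, y = pixel_to_image(420, 200, dx=320, dy=240, sx=0.5, sy=0.25)
u, v = image_to_pixel(x, y, dx=320, dy=240, sx=0.5, sy=0.25)
```

Scaling f, sx and sy by the same factor leaves the pixel position of every projected point unchanged, which is one way to see why the parameters are not totally independent.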

45
Q

What is stereo calibration?

A

Need to know these camera parameters:
- R, T and f to perform triangulation
- dx, dy, f, sx/sy to calculate image coordinates from pixel coordinates

Can calculate these parameters if we know the scene coordinates of a sufficient number of image points

Calibration using target image:
- Accurately measured feature positions
- Reliable location on images

46
Q

What is a target image for calibration?

A

An image used to provide sufficient scene coordinates of image points to calculate camera parameters

47
Q

What are the compromises for calibration algorithms?

A
  • Accuracy of parameter estimation
  • Robustness of parameter estimation
  • Complexity of calculation
    • Least squares … non-linear optimisations
  • Engineering requirement of target
    • Points on a plane
    • Points throughout 3D volume