finalexam1 Flashcards

1
Q
  1. What is the objective of multiple view geometry?
A

To understand the 3D structure of a scene given multiple images taken from different perspectives.

2
Q
  1. What is the difference between the 3D reconstruction and the Structure from Motion in multiple view geometry?
A

3D Reconstruction (Stereo Vision): Assumes known intrinsic (K) and extrinsic (R, T) parameters and recovers the 3D scene using two cameras. Structure from Motion (SfM): Recovers the 3D scene structure and the camera poses simultaneously from multiple images/views (K might be given).

3
Q
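  1. In stereo vision with parallel cameras, what is the meaning of the following equation? Z = bf/(u1 − u2)
A

The depth Z of a scene point equals the baseline b times the focal length f divided by the disparity u1 − u2. Depth is therefore inversely proportional to disparity: nearby points have a large disparity and distant points have a small one.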
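
A minimal sketch of the equation Z = bf/(u1 − u2) in Python may help fix the roles of the symbols. The function name, the units (meters for b, pixels for f, u1, and u2), and the example rig values are assumptions chosen for illustration, not part of the card.

```python
def depth_from_disparity(baseline_m, focal_px, u1_px, u2_px):
    """Depth Z = b*f / (u1 - u2) for a rectified, parallel stereo pair."""
    disparity = u1_px - u2_px
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of both cameras")
    return baseline_m * focal_px / disparity


# Hypothetical rig: 0.12 m baseline, 700 px focal length.
near = depth_from_disparity(0.12, 700.0, 420.0, 378.0)  # disparity of 42 px
far = depth_from_disparity(0.12, 700.0, 420.0, 413.0)   # disparity of 7 px
print(near, far)
```

The second point has one sixth of the disparity of the first and so lies six times farther away.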

4
Q
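  1. In stereo vision, what is the potential issue of a small baseline?
A

A small baseline produces small disparities, so a small error in the measured disparity causes a large error in the estimated depth.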
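
The effect of the baseline can be sketched numerically. Differentiating Z = bf/d with respect to the disparity d gives a first-order depth error of |dZ| = Z^2/(bf) · |dd|. The function name and the rig values below are hypothetical, and the one-pixel disparity error is an assumed default.

```python
def depth_error(baseline_m, focal_px, depth_m, disparity_error_px=1.0):
    """First-order depth error |dZ| = Z^2 / (b*f) * |dd|, derived from Z = b*f/d."""
    return depth_m ** 2 / (baseline_m * focal_px) * disparity_error_px


# Same hypothetical camera (700 px focal length) and same 10 m depth,
# with two different baselines.
small_baseline_error = depth_error(0.05, 700.0, 10.0)
large_baseline_error = depth_error(0.50, 700.0, 10.0)
print(small_baseline_error, large_baseline_error)
```

Under these assumptions, a tenfold larger baseline reduces the depth error tenfold at the same depth.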

5
Q
  1. What is the goal of triangulation?
A

To estimate the 3D coordinates of a point given its 2D projections in multiple images and the camera poses (position and orientation).
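
One simple way to triangulate from two views is the midpoint method, sketched below. It is only one of several possible techniques and is not necessarily the one the course uses. The sketch assumes each 2D projection has already been converted into a viewing ray in a common world frame (camera center c, direction d); all names are hypothetical.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of the shortest segment between rays c1 + s*d1 and c2 + t*d2."""
    w = tuple(p - q for p, q in zip(c1, c2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; the point cannot be triangulated")
    # Closed-form minimizer of |c1 + s*d1 - (c2 + t*d2)|^2 over s and t.
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    on_ray1 = tuple(ci + s * di for ci, di in zip(c1, d1))
    on_ray2 = tuple(ci + t * di for ci, di in zip(c2, d2))
    return tuple((p + q) / 2.0 for p, q in zip(on_ray1, on_ray2))


# Two cameras 1 m apart, both observing the point (1, 2, 5) without noise.
point = triangulate_midpoint((0.0, 0.0, 0.0), (1.0, 2.0, 5.0),
                             (1.0, 0.0, 0.0), (0.0, 2.0, 5.0))
print(point)
```

With noisy measurements the two rays no longer intersect, and the midpoint serves as a compromise between them.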

6
Q
  1. What are the benefits of stereo rectification in feature matching?
A

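Stereo rectification warps both images so that their image planes are coplanar and the epipolar lines are horizontal and row-aligned. The match for a feature then lies on the same image row, so the correspondence search reduces from 2D to 1D.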

7
Q
  1. What is the epipolar constraint?
A

A point in one image must have its correspondence on the matching epipolar line in the other image. Equivalently, the three vectors p1, p2, and c1c2 (the baseline between the camera centers) are coplanar.
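
The coplanarity condition can be checked with a scalar triple product. The sketch below assumes the convention X2 = R·X1 + T between the two camera frames and normalized image coordinates p = X/Z; under that assumption the constraint reads p2 · (T × (R·p1)) = 0. The function names and the example geometry are hypothetical.

```python
def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def mat_vec(R, v):
    return tuple(dot(row, v) for row in R)


def epipolar_residual(p1, p2, R, T):
    """p2 . (T x (R p1)); zero when p1, p2 and the baseline are coplanar."""
    return dot(p2, cross(T, mat_vec(R, p1)))


R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # no rotation
T = (-0.5, 0.0, 0.0)                                     # pure sideways shift
X1 = (1.0, 2.0, 5.0)                                     # point in camera-1 frame
X2 = tuple(x + t for x, t in zip(mat_vec(R, X1), T))     # same point in camera-2 frame
p1 = tuple(x / X1[2] for x in X1)
p2 = tuple(x / X2[2] for x in X2)

good = epipolar_residual(p1, p2, R, T)                      # true correspondence
bad = epipolar_residual(p1, (p2[0], 0.5, 1.0), R, T)        # point moved off the epipolar line
print(good, bad)
```

A true correspondence gives a residual of zero up to rounding, while the displaced point does not.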

8
Q
  1. Why can’t the scale ambiguity be avoided in multiple view geometry with monocular vision?
A

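A single camera observes only the ratio of object size to object distance, so a scene that is scaled up and moved proportionally farther away produces the same images. Either the true size or the true distance must be known to determine the scale.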

9
Q
  1. What is the reprojection error in two-view geometry?
A

The distance between the observed 2D point and its reprojection. The reprojection is obtained by triangulating the 3D position using the estimated (R, T) and then projecting it back onto the image plane using (R, T) and the intrinsics (K1, K2).
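
The definition can be sketched for a single view as follows. The sketch assumes a pinhole camera without skew or distortion, with the intrinsics K passed as a tuple (fx, fy, cx, cy), and assumes the 3D point has already been triangulated. These names and the example values are illustrative only.

```python
import math


def project(K, R, T, X):
    """Pinhole projection: move X into the camera frame (R X + T), then apply K."""
    Xc = [sum(R[r][c] * X[c] for c in range(3)) + T[r] for r in range(3)]
    fx, fy, cx, cy = K
    return (fx * Xc[0] / Xc[2] + cx, fy * Xc[1] / Xc[2] + cy)


def reprojection_error(observed, K, R, T, X):
    """Euclidean pixel distance between the observed and the reprojected point."""
    u, v = project(K, R, T, X)
    return math.hypot(observed[0] - u, observed[1] - v)


K = (700.0, 700.0, 320.0, 240.0)                         # hypothetical intrinsics
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # identity rotation
T = (0.0, 0.0, 0.0)
X = (1.0, 2.0, 5.0)                                      # triangulated 3D point

error = reprojection_error((461.0, 519.0), K, R, T, X)
print(project(K, R, T, X), error)
```

In two-view geometry the same computation is repeated in the second image with its own (R, T) and K2, and the errors are combined.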

10
Q
  1. What are the possible causes for outliers in two-view geometry?
A

Changes in scale and perspective, variations in illumination, noise and blur, and occlusions.

11
Q
  1. What is the goal of RANSAC?
A

To estimate the unknown model parameters (for example, the camera pose) from measurements X that may contain outliers.

12
Q
  1. Explain the procedures of RANSAC when applied to line fitting.
A

Randomly select a minimal subset (two points for a line) and estimate the line parameters from it. Count the inliers and outliers, that is, the points within and beyond a distance threshold of the line. Repeat k times and choose the parameters with the smallest number of outliers, assuming enough inliers exist.
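
The procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function names, the distance threshold, the number of iterations k, and the synthetic data are all hypothetical.

```python
import math
import random


def fit_line(p, q):
    """Line through two points as (a, b, c) with a*x + b*y + c = 0 and a^2 + b^2 = 1."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = math.hypot(a, b)
    if norm == 0:
        return None  # degenerate sample: the two points coincide
    a, b = a / norm, b / norm
    return a, b, -(a * x1 + b * y1)


def ransac_line(points, k=200, threshold=0.1, seed=0):
    rng = random.Random(seed)
    best_line, best_inliers = None, []
    for _ in range(k):
        line = fit_line(*rng.sample(points, 2))  # minimal subset: two points
        if line is None:
            continue
        a, b, c = line
        # Point-to-line distance is |a*x + b*y + c| because (a, b) has unit length.
        inliers = [(x, y) for x, y in points if abs(a * x + b * y + c) <= threshold]
        if len(inliers) > len(best_inliers):  # most inliers = fewest outliers
            best_line, best_inliers = line, inliers
    return best_line, best_inliers


# Synthetic data: 20 points on y = 2x + 1 plus 5 gross outliers.
inlier_points = [(float(x), 2.0 * x + 1.0) for x in range(20)]
outlier_points = [(3.0, 40.0), (7.0, -15.0), (12.0, 90.0), (15.0, -30.0), (18.0, 5.0)]
points = inlier_points + outlier_points

line, inliers = ransac_line(points)
a, b, c = line
print(len(inliers), -a / b, -c / b)  # inlier count, slope, intercept
```

A common refinement, not shown here, is to refit the line to all inliers of the best model with least squares.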

13
Q
  1. What are the three main components needed to implement RANSAC?
A

The model, minimum number of points, and a method to determine the distance between the model and the data points.

14
Q
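  1. In RANSAC, why do we select the smallest number of data points required to determine the unknown parameter?
A

A smaller sample is more likely to contain only inliers, so fewer iterations are needed to draw at least one outlier-free sample.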
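
The effect of the sample size can be made concrete with the standard iteration-count formula k = log(1 − p) / log(1 − w^s), where w is the inlier ratio, s the sample size, and p the desired probability of drawing at least one all-inlier sample. The function name and the default p = 0.99 are assumptions for this sketch.

```python
import math


def ransac_iterations(inlier_ratio, sample_size, success_prob=0.99):
    """Iterations needed to draw at least one all-inlier sample with probability success_prob."""
    if not 0.0 < inlier_ratio < 1.0:
        raise ValueError("inlier_ratio must be strictly between 0 and 1")
    p_good_sample = inlier_ratio ** sample_size
    return math.ceil(math.log(1.0 - success_prob) / math.log(1.0 - p_good_sample))


# With 50% inliers: a 2-point sample (line) versus an 8-point sample.
k_two_points = ransac_iterations(0.5, 2)
k_eight_points = ransac_iterations(0.5, 8)
print(k_two_points, k_eight_points)
```

The required number of iterations grows roughly exponentially with the sample size, which is why the minimal sample is preferred.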

15
Q
  1. Describe the four steps required for the sequential structure from motion.
A

Feature detection, feature matching/tracking, motion estimation, and local optimization (bundle adjustment).

16
Q
  1. What is the difference between front-end and back-end of visual odometry?
A

Front-end: Handles feature detection, matching, and pose estimation between two frames. Back-end: Refines pose among multiple frames.

17
Q
  1. Describe the goal
A

the type of correspondence

18
Q
  1. Describe the goal
A

the type of correspondence

19
Q
  1. Why do we need the mapping step in visual odometry?
A

To extend the structure by selecting new keyframes and extracting new features.

20
Q
  1. Why do we need the bundle adjustment step in visual odometry?
A

To refine structure (Pi) and motion (R, T) by minimizing the reprojection error in SfM and visual odometry.
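
As a sketch, the bundle adjustment objective can be written as a sum of squared reprojection errors. The notation below is an assumption made for illustration: p_i^k denotes the observed image point of 3D point P_i in frame k, (R_k, T_k) the pose of frame k, K the intrinsics, and π the projection function.

```latex
\{P_i, R_k, T_k\}^{*}
  = \arg\min_{P_i,\, R_k,\, T_k}
    \sum_{k} \sum_{i}
    \left\| p_i^{k} - \pi\!\left(P_i, K, R_k, T_k\right) \right\|^{2}
```

The sum runs only over the pairs (i, k) for which point i is actually observed in frame k.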

21
Q
  1. How is visual SLAM different from visual odometry?
A

Visual SLAM adds loop detection and loop closure to obtain a globally consistent map and trajectory.

22
Q
  1. Describe the characteristics of the indirect method in visual SLAM.
A

Extracts and matches features, rejects outlier matches with RANSAC, and minimizes the reprojection error. Can handle large relative motion between frames but is slow due to feature extraction, matching, and RANSAC.

23
Q
  1. Describe the characteristics of the direct method in visual SLAM.
A

Minimizes the photometric error directly on pixel intensities, with no feature extraction and no RANSAC. Can use all the information in the image, which improves robustness and accuracy. Fast because no RANSAC is used, but sensitive to the initial guess and unable to handle large relative motion between frames.

24
Q
  1. What is the goal of tracking?
A

To locate a moving object in consecutive video frames.

25
Q
  1. What are the pros and the cons of feature detection and matching compared with tracking?
A

Feature Detection and Matching: Pros: Works even with large motion between two frames. Cons: Does not exploit the additional information available when the motion between frames is small. Tracking: Pros: Exploits the additional information available when the motion between frames is small. Cons: May not work well with large motion between frames.

26
Q
  1. Summary for Point Tracking
A

Block-based methods: Robust to large motions but computationally expensive. Differential methods: No search performed, applied to small motions, can be extended to large motions using multi-scale implementation, and computationally efficient for optical flow.
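
The block-based idea can be sketched in one dimension: take a block from the previous frame and search every candidate shift in the next frame for the lowest sum of squared differences (SSD). The function names, the 1D simplification, and the toy signal are assumptions made for brevity.

```python
def ssd(a, b):
    """Sum of squared differences between two equally long blocks."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def block_match_1d(prev_row, next_row, start, size, search_radius):
    """Return the shift within [-search_radius, search_radius] that minimizes the SSD."""
    template = prev_row[start:start + size]
    best_shift, best_cost = 0, float("inf")
    for shift in range(-search_radius, search_radius + 1):
        s = start + shift
        if s < 0 or s + size > len(next_row):
            continue  # candidate block would fall outside the image row
        cost = ssd(template, next_row[s:s + size])
        if cost < best_cost:
            best_shift, best_cost = shift, cost
    return best_shift


# Toy rows: the same intensity bump, moved 3 pixels to the right.
prev_row = [0, 0, 0, 1, 5, 9, 5, 1, 0, 0, 0, 0, 0, 0]
next_row = [0, 0, 0, 0, 0, 0, 1, 5, 9, 5, 1, 0, 0, 0]

shift = block_match_1d(prev_row, next_row, start=3, size=5, search_radius=4)
print(shift)
```

The exhaustive loop over shifts is what makes block matching robust to large motion and also what makes it expensive: the cost grows with the search radius, and in 2D with its square.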

27
Q
  1. Summary for KLT Template Tracking
A

Tracks a template by matching it against the warped image. The initial warp parameters are hard to guess, and the Taylor series approximation may not hold for large initial errors. KLT can be repeated at increasing resolutions (coarse to fine). The template may be updated with the most recent image to handle occlusion, illumination change, and object deformation.

28
Q
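  1. Describe the image features that are not preserved under homography, and the image feature that is preserved under homography.
A

Not preserved in general: parallelism, angles, lengths, and ratios of lengths. Preserved: straight lines (collinear points remain collinear).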
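
A small numeric check can illustrate this: under a homography, collinear points stay collinear, while parallel lines generally stop being parallel. The matrix H below is a hypothetical projective warp chosen for illustration, and the helper names are assumptions.

```python
def apply_homography(H, pt):
    """Map a 2D point through the 3x3 homography H using homogeneous coordinates."""
    x, y = pt
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)


def cross2(o, a, b):
    """z-component of (a - o) x (b - o); zero when o, a, b are collinear."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


H = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.5, 0.0, 1.0]]  # hypothetical warp with a nonzero projective term

# Two parallel horizontal lines, three points each.
line_a = [apply_homography(H, (x, 0.0)) for x in (0.0, 1.0, 2.0)]
line_b = [apply_homography(H, (x, 1.0)) for x in (0.0, 1.0, 2.0)]

collinear_a = cross2(*line_a)  # zero up to rounding: still a straight line
collinear_b = cross2(*line_b)  # zero up to rounding: still a straight line

# Direction vectors of the two mapped lines; a nonzero cross product means not parallel.
dir_a = (line_a[2][0] - line_a[0][0], line_a[2][1] - line_a[0][1])
dir_b = (line_b[2][0] - line_b[0][0], line_b[2][1] - line_b[0][1])
parallel_test = dir_a[0] * dir_b[1] - dir_a[1] * dir_b[0]
print(collinear_a, collinear_b, parallel_test)
```

The mapped points on the first line are also no longer evenly spaced, which shows that ratios of lengths are not preserved either.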

29
Q
  1. What are the pros and the cons of vision compared with IMU?
A

Vision Pros: Accurate in slow motion, more informative than IMU measurements. Vision Cons: Inaccurate for fast motion, not robust for low texture and high dynamic range, scale ambiguity in monocular vision, low output rate.

30
Q
  1. What are the pros and the cons of IMU compared with vision?
A

IMU Pros: More accurate for motion with large accelerations and angular velocity, higher output rate. IMU Cons: Inaccurate for motion with low acceleration, measurement drifts over time.

31
Q
  1. What are the differences between the loosely coupled approach and the tightly coupled approach in visual-inertial fusion?
A

Loosely coupled: VO and IMU estimate the pose independently, and the two estimates are then fused. Easier to implement but less accurate. Tightly coupled: IMU measurements are integrated with feature tracking and 3D reconstruction. Harder to implement but more accurate.

32
Q
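  1. In the tightly coupled visual-inertial fusion, compare the filtering and the optimization for speed and accuracy.
A

Filtering is faster but less accurate, because it keeps only the most recent state and linearizes each measurement once. Optimization is more accurate, because it re-linearizes over multiple states, but it is slower.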