03-05 - VSLAM, BA, SfM Flashcards

1
Q

What is the general bundle block adjustment problem & how are BA problems usually solved?

A

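Non-linear least-squares error-minimization problem
x + v = lambda * P * X   (v is the correction to the observation x)
We try to minimize the reprojection error over all cameras and points.

Numerically, after linearization, we end up with:
observations + corrections = A * unknowns
where the unknowns can be split into the 3D points and the 6D camera parameters, and A is split into C (observations x points) and B (observations x images). The linearized system is solved iteratively (e.g. Gauss-Newton or Levenberg-Marquardt).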
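
A minimal sketch of the quantity being minimized, assuming a simple pinhole camera. The names `project`, `reprojection_residuals` and the example values for K, R, t are illustrative assumptions, not from the lecture.

```python
import numpy as np

def project(K, R, t, X):
    """Pinhole projection of 3D points X (n x 3) to pixel coordinates (n x 2)."""
    Xc = X @ R.T + t                    # world frame -> camera frame
    uvw = Xc @ K.T                      # homogeneous image coordinates (lambda * x)
    return uvw[:, :2] / uvw[:, 2:3]     # divide by lambda

def reprojection_residuals(K, R, t, X, x_obs):
    """Stacked residuals (projected minus observed) for one camera."""
    return (project(K, R, t, X) - x_obs).ravel()

# Assumed example camera and points
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.zeros(3)
X = np.array([[0.0, 0.0, 5.0], [1.0, -1.0, 4.0]])

# Observations = exact projections plus a known offset in pixels
x_obs = project(K, R, t, X) + np.array([[0.5, -0.5], [0.0, 1.0]])

r = reprojection_residuals(K, R, t, X, x_obs)
cost = float(r @ r)   # sum of squared reprojection errors, the BA objective
```

BA would adjust R, t and X (for all cameras and points at once) to make `cost` as small as possible.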

2
Q

What is a gauge-freedom, where does it usually appear in BA problems and how can it be fixed?

A

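It means the solution is not unique: the reprojection error does not change if the whole reconstruction (cameras and points) is shifted, rotated or scaled (7 DoF similarity transform).
It usually appears if we do not have any known control points.
To fix it: add constraints and priors, control points, loop closure.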
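
A small numerical check of the gauge freedom, as a sketch under assumed values: if points and camera are moved by the same similarity transform (scale s, rotation Q, shift d, all made up here), the projected image points stay identical, so the reprojection error cannot tell the two solutions apart.

```python
import numpy as np

def project(K, R, t, X):
    Xc = X @ R.T + t
    uvw = Xc @ K.T
    return uvw[:, :2] / uvw[:, 2:3]

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = rot_z(0.1)
t = np.array([0.2, -0.1, 0.5])
X = np.array([[0.0, 0.0, 5.0], [1.0, -1.0, 4.0], [-1.0, 0.5, 6.0]])

# Arbitrary similarity transform of the world (assumed values)
s, Q, d = 2.5, rot_z(0.7), np.array([1.0, 2.0, -3.0])

X2 = s * X @ Q.T + d        # transformed points: s * Q * X + d
R2 = R @ Q.T                # camera rotation compensates Q
t2 = s * t - R2 @ d         # camera translation compensates s and d

x1 = project(K, R, t, X)
x2 = project(K, R2, t2, X2)
same_projection = np.allclose(x1, x2)
```

Fixing control points or adding priors removes exactly this freedom.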

3
Q

How do gross errors affect BA problems?

A

Since we minimize the sum of squares, a single outlier can ruin the whole result. It is very important to remove outliers before running BA.
Outliers can e.g. be caused by wrong feature matching.
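
A toy illustration (not BA itself, values made up): the least-squares estimate of a constant is the mean, and one gross error pulls it far away, while a robust estimate such as the median barely moves.

```python
from statistics import mean, median

measurements = [10.0, 10.1, 9.9, 10.0, 10.0]
with_outlier = measurements + [100.0]     # one gross error, e.g. a wrong match

ls_clean = mean(measurements)             # least-squares estimate without outlier
ls_outlier = mean(with_outlier)           # least-squares estimate with outlier
robust = median(with_outlier)             # robust estimate with outlier
```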

4
Q

When and why do we need a sparse solver for BA?

A

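When we have large sets of data, to save computation time and memory (so mostly when using global BA). Each observation involves only one camera and one 3D point, so most entries of A and of the normal equations are zero.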
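
A sketch of why the system is sparse, assuming 6 parameters per camera, 3 per point and 2 rows per observation. The function name `ba_jacobian_pattern` and the problem size are assumptions for illustration.

```python
import numpy as np

def ba_jacobian_pattern(n_cams, n_pts, observations):
    """Boolean nonzero pattern of the BA design matrix A = [B | C]."""
    n_cols = 6 * n_cams + 3 * n_pts
    pattern = np.zeros((2 * len(observations), n_cols), dtype=bool)
    for k, (cam, pt) in enumerate(observations):
        rows = slice(2 * k, 2 * k + 2)
        pattern[rows, 6 * cam:6 * cam + 6] = True      # B: camera block
        start = 6 * n_cams + 3 * pt
        pattern[rows, start:start + 3] = True          # C: point block
    return pattern

n_cams, n_pts = 10, 100
observations = [(c, p) for c in range(n_cams) for p in range(n_pts)]
J = ba_jacobian_pattern(n_cams, n_pts, observations)
fill = J.sum() / J.size                  # fraction of nonzero entries

# Pattern of the normal equations; the point-point part is block diagonal
N = (J.T.astype(int) @ J.astype(int)) > 0
point_block = N[6 * n_cams:, 6 * n_cams:]
```

Even when every camera sees every point, only 2.5 % of the entries are nonzero here, and the point-point part of the normal equations contains nothing but the 3 x 3 diagonal blocks.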

5
Q

How and why do we derive Jacobians for BA problems?

A

The Jacobian contains the partial derivatives of the reprojection error with respect to the parameters being optimized.

The Jacobian matrix provides information about how changes in the parameters affect the reprojection error, which is crucial for finding the optimal parameter values that minimize the error.

It can be derived analytically or approximated numerically (finite differences).
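
A minimal sketch comparing both ways for the simplest case: the derivative of the projection u = f*X/Z, v = f*Y/Z with respect to a 3D point given in the camera frame. The focal length and the point are assumed example values.

```python
import numpy as np

def project(X, f=500.0):
    """Project a camera-frame point (X, Y, Z) with focal length f."""
    return np.array([f * X[0] / X[2], f * X[1] / X[2]])

def analytic_jacobian(X, f=500.0):
    """d(u, v) / d(X, Y, Z), derived by hand."""
    x, y, z = X
    return np.array([[f / z, 0.0, -f * x / z**2],
                     [0.0, f / z, -f * y / z**2]])

def numeric_jacobian(func, X, eps=1e-6):
    """Central finite differences, one column per parameter."""
    J = np.zeros((2, X.size))
    for i in range(X.size):
        step = np.zeros_like(X)
        step[i] = eps
        J[:, i] = (func(X + step) - func(X - step)) / (2 * eps)
    return J

X = np.array([1.0, -0.5, 4.0])
J_a = analytic_jacobian(X)
J_n = numeric_jacobian(project, X)
max_diff = np.abs(J_a - J_n).max()
```

The numeric version is handy for checking a hand-derived Jacobian; the analytic one is usually faster and more accurate.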

6
Q

What defines visual odometry?

A

Motion estimation with the help of visual input.
It may use local optimization (e.g. local BA) but not global optimization.

7
Q

Why do VO solutions tend to drift?

A

Each frame-to-frame motion estimate contains a small error, and since the poses are concatenated, the error accumulates over time.
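
A toy dead-reckoning sketch in 2D. It assumes a constant small heading error per step instead of random noise, so the result is deterministic; the function names and numbers are made up.

```python
import math

def integrate(n_steps, step_len=1.0, heading_error=0.0):
    """Concatenate n relative motions; each step adds heading_error to the heading."""
    x = y = theta = 0.0
    for _ in range(n_steps):
        theta += heading_error
        x += step_len * math.cos(theta)
        y += step_len * math.sin(theta)
    return x, y

def drift(n_steps, heading_error=0.01):
    """Distance between the true and the estimated end position."""
    x_true, y_true = integrate(n_steps)
    x_est, y_est = integrate(n_steps, heading_error=heading_error)
    return math.hypot(x_est - x_true, y_est - y_true)

errors = [drift(n) for n in (10, 100, 200)]   # position error grows with path length
```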

8
Q

Which kinds of correspondences can be used in VO?

A

2D-2D - reprojection error
3D-2D - reprojection error
3D-3D - 3D point difference
3D-3D has the disadvantage that the triangulated 3D points are uncertain. On the other hand, stereo has the advantage over monocular that the scale factor is known, so there is no scale drift. Local BA should be used no matter which method is chosen.

9
Q

How is the essential matrix estimated from consecutive frames?

A

5-point method
8-point method (Longuet-Higgins): each point pair must satisfy p_2^T * E * p_1 = 0. E is vectorized and the 8 point pairs are stacked into a matrix A, one row per pair.
Now we have an Ax = 0 problem where we want to find x (which is the vectorized E).
We are not interested in the trivial solution where E is only zeros, therefore we do not use e.g. Gaussian elimination but SVD to solve it (x is the right singular vector belonging to the smallest singular value).
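
A minimal sketch of the 8-point idea on synthetic, noise-free data. It assumes normalized (calibrated) homogeneous image points; the test motion R_true, t_true and the random points are made-up values, and no coordinate normalization or outlier handling is included.

```python
import numpy as np

def skew(t):
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def eight_point(p1, p2):
    """Estimate E from n >= 8 pairs of normalized homogeneous points (n x 3)."""
    # One row per pair from p2^T E p1 = 0, with E vectorized row by row
    A = np.stack([np.kron(b, a) for a, b in zip(p1, p2)])
    _, _, Vt = np.linalg.svd(A)
    E = Vt[-1].reshape(3, 3)             # singular vector of smallest singular value
    # Enforce the essential-matrix structure: singular values (1, 1, 0)
    U, _, Vt = np.linalg.svd(E)
    return U @ np.diag([1.0, 1.0, 0.0]) @ Vt

# Synthetic scene (assumed values)
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, (12, 3)) + np.array([0.0, 0.0, 5.0])
c, s = np.cos(0.1), np.sin(0.1)
R_true = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
t_true = np.array([1.0, 0.0, 0.2])

p1 = X / X[:, 2:3]                       # points seen in frame 1
X2 = X @ R_true.T + t_true
p2 = X2 / X2[:, 2:3]                     # same points seen in frame 2

E = eight_point(p1, p2)
epipolar_residual = max(abs(b @ E @ a) for a, b in zip(p1, p2))
```

E is only defined up to scale and sign, so it should be compared to skew(t_true) @ R_true after normalization.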

10
Q

How is relative motion computed from the essential matrix?

A

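SVD: E = U * diag(1, 1, 0) * V^T. With W = [[0, -1, 0], [1, 0, 0], [0, 0, 1]] we get R = U * W * V^T or R = U * W^T * V^T, and t = +/- the last column of U (only known up to scale). This gives four candidate solutions.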
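
A sketch of the standard SVD-based decomposition into the four (R, t) candidates. The function name `decompose_essential` and the test motion are assumptions; selecting the physically valid candidate (points in front of both cameras) is not included.

```python
import numpy as np

def skew(t):
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def decompose_essential(E):
    """Return the four (R, t) candidates; t has unit length (scale is unknown)."""
    U, _, Vt = np.linalg.svd(E)
    # Flip signs so that the resulting R are proper rotations (det = +1)
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0]])
    R1 = U @ W @ Vt
    R2 = U @ W.T @ Vt
    t = U[:, 2]
    return [(R1, t), (R1, -t), (R2, t), (R2, -t)]

# Assumed ground-truth motion for the check
c, s = np.cos(0.3), np.sin(0.3)
R_true = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
t_true = np.array([1.0, 2.0, 2.0]) / 3.0       # unit length

candidates = decompose_essential(skew(t_true) @ R_true)
found = any(np.allclose(R, R_true, atol=1e-8) and np.allclose(t, t_true, atol=1e-8)
            for R, t in candidates)
```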

11
Q

When should robust estimation (e.g. RANSAC) be used in VO?

A

For outlier removal, whenever the feature matches may contain outliers, to prepare for motion estimation and BA.
Causes for outliers can be image noise, occlusion, blur and changes in viewpoint/illumination that the mathematical model of feature descriptors does not account for.
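
A minimal RANSAC sketch on a toy problem (fitting a line instead of a camera motion, to keep it short). The data, threshold and iteration count are made-up values; in VO the model would be e.g. the essential matrix and the sample size 5 or 8.

```python
import random

def fit_line(p, q):
    """Line y = a*x + b through two points with different x."""
    a = (q[1] - p[1]) / (q[0] - p[0])
    return a, p[1] - a * p[0]

def ransac_line(points, n_iters=200, threshold=0.1, seed=0):
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(n_iters):
        p, q = rng.sample(points, 2)          # minimal sample
        if p[0] == q[0]:
            continue                          # degenerate sample, skip
        a, b = fit_line(p, q)
        inliers = [pt for pt in points
                   if abs(pt[1] - (a * pt[0] + b)) < threshold]
        if len(inliers) > len(best_inliers):  # keep the model with most support
            best_model, best_inliers = (a, b), inliers
    return best_model, best_inliers

inlier_pts = [(float(x), 2.0 * x + 1.0) for x in range(20)]
outlier_pts = [(3.0, 40.0), (7.0, -25.0), (11.0, 90.0), (15.0, -60.0), (18.0, 5.0)]
model, inliers = ransac_line(inlier_pts + outlier_pts)
```

Only the points in `inliers` would then be passed on to the final estimation or BA.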

12
Q

What is loop detection and closure?

A

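Loop detection: recognizing after some time that we observe the same features (the same place) again. Loop closure: adding this constraint to the map so that the loop is actually closed and the accumulated drift is corrected.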

13
Q

Explain the Graph-SLAM approach.

A

The graph represents the problem: every node is a pose of the robot during mapping (the states). The edges correspond to spatial constraints between the poses (relative transforms, which are uncertain). An edge between two nodes exists if the robot either moves from one pose to the other (odometry measurement) or observes the same part of the environment from both poses.

Even though we see the same thing, the camera is not at exactly the same position, so we need to find that last transform to be able to close the loop:

$X_i^{-1}X_j$, where $X_i$ is the transformation from the origin to $x_i$ and $X_i^{-1}$ is its inverse.
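
A small 2D sketch of this relative transform and of the error an edge contributes, assuming poses are written as 3 x 3 homogeneous matrices. The helper names `se2`, `relative`, `edge_error`, the measurement Z_ij and all numbers are assumptions for illustration.

```python
import numpy as np

def se2(x, y, theta):
    """2D pose as a homogeneous transformation matrix."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, x], [s, c, y], [0.0, 0.0, 1.0]])

def relative(X_i, X_j):
    """Pose j expressed in the frame of pose i: X_i^{-1} X_j."""
    return np.linalg.inv(X_i) @ X_j

def edge_error(X_i, X_j, Z_ij):
    """Mismatch between measured (Z_ij) and predicted relative transform
    as (dx, dy, dtheta); all zeros if the poses agree with the measurement."""
    D = np.linalg.inv(Z_ij) @ relative(X_i, X_j)
    return np.array([D[0, 2], D[1, 2], np.arctan2(D[1, 0], D[0, 0])])

X_i = se2(1.0, 2.0, 0.5)
Z_ij = se2(1.0, 0.0, 0.1)                 # measured motion from i to j

X_j_consistent = X_i @ Z_ij               # pose that agrees with the measurement
X_j_drifted = X_i @ se2(1.2, 0.0, 0.1)    # pose that moved 0.2 too far

e_ok = edge_error(X_i, X_j_consistent, Z_ij)
e_bad = edge_error(X_i, X_j_drifted, Z_ij)
```

Graph optimization would move the node poses so that the sum of such (weighted, squared) edge errors becomes minimal.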

14
Q

Appearance based SLAM vs feature based SLAM

A

Appearance based:
- uses intensity information of all pixels
- computationally heavy, less accurate
- global

Feature based:
- uses only salient and repeatable features across images
- fast, accurate, requires the ability to match across frames
- local

15
Q

What is the purpose of front-end and back-end in SLAM?

A

Splitting the system makes it applicable in real time: VO runs in the front end, BA in the back end.

16
Q

How can we reconstruct 3d geometry from uncalibrated cameras?

A

Structure from motion

17
Q

SfM ambiguities?

A

Projective: preserves intersections & tangency
Affine: preserves parallelism and volume ratios
Similarity: preserves angles and length ratios
Euclidean: preserves lengths

18
Q

How can we use orthographic projection approximations in SfM?

A

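If the orthographic (affine) approximation is close enough, e.g. when the depth of the scene is small compared to its distance from the camera, computation is cheaper because we only have 12 DoF (affine) instead of 15 DoF (projective).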

19
Q

Affine SfM pipeline?
How is a zero-skew constraint introduced in SfM reconstruction?

A

We assume no vanishing points and orthographic projection.
Now given m images with n fixed 3D points, we need to use the mn correspondences x to estimate the m projection matrices A, m translation vectors b and n 3D points X.

1: Simplify by centering (removes b)
2: Construct the measurement matrix D (x = AX for each point, stacked into a 2m (cameras) x n (points) matrix)
measurement = motion x shape
3: Factorize D to get the motion and shape matrices: do an SVD and keep only the most important information, i.e. the 3 largest singular values (think principal component analysis)

Problem: the solution is not unique. We can eliminate the affine ambiguity by requiring the image axes to be perpendicular and of unit length.
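
A sketch of steps 1 to 3 on synthetic, noise-free data. The function name `affine_factorization` and the random cameras and points are assumptions; the last step (removing the affine ambiguity with the perpendicular, unit-length axes constraint) is not included, so the result is only an affine reconstruction.

```python
import numpy as np

def affine_factorization(D):
    """Factor a 2m x n measurement matrix into motion (2m x 3) and shape (3 x n)."""
    b = D.mean(axis=1, keepdims=True)       # step 1: centroid of each image row
    Dc = D - b                              # centered measurements (b removed)
    U, s, Vt = np.linalg.svd(Dc, full_matrices=False)
    sqrt_s = np.sqrt(s[:3])                 # step 3: keep the 3 largest singular values
    M = U[:, :3] * sqrt_s                   # motion matrix
    S = sqrt_s[:, None] * Vt[:3]            # shape matrix
    return M, S, b

# Synthetic affine cameras and points (assumed values)
rng = np.random.default_rng(0)
m, n = 4, 10
X = rng.normal(size=(3, n))                 # 3D points
A = rng.normal(size=(2 * m, 3))             # stacked 2 x 3 projection matrices
b_true = rng.normal(size=(2 * m, 1))        # stacked translations
D = A @ X + b_true                          # step 2: measurement matrix

M, S, b = affine_factorization(D)
reconstruction_ok = np.allclose(M @ S + b, D, atol=1e-9)
```

M and S differ from A and the centered X by an unknown invertible 3 x 3 matrix, which is exactly the ambiguity mentioned above.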

20
Q

How do we deal with missing data?

A

One approach:

  1. Find a dense subblock of the measurement matrix
  2. Do reconstruction
  3. Add data
21
Q

2D-2D pipeline

A
  1. Capture a frame, extract and match features
  2. Find the essential matrix, e.g. with the Longuet-Higgins 8-point algorithm
  3. Factorize E via SVD to get R and t
  4. Find the correct solution out of the four by checking for which one the z coordinate (depth) of the triangulated points is positive
  5. Compute the relative scale (use the absolute distance between the 3D points; there will always be scale drift)
22
Q

3d-3d

A
  • Needs stereo vision (to triangulate 3D points)
  • min. 3 non-collinear correspondences
  • Find the transformation that minimizes the sum of squared 3D distances, for which we use the Kabsch algorithm
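
A minimal sketch of the Kabsch algorithm on noise-free example points. The point coordinates and the test motion are made up.

```python
import numpy as np

def kabsch(P, Q):
    """Find R, t minimizing sum ||R p_i + t - q_i||^2 for n x 3 arrays P and Q."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)       # centroids
    H = (P - cp).T @ (Q - cq)                     # 3 x 3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))        # avoid a reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cq - R @ cp
    return R, t

# 3D points triangulated in frame k-1 (assumed values)
P = np.array([[0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 1.0, 2.0]])
c, s = np.cos(0.4), np.sin(0.4)
R_true = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
t_true = np.array([0.5, -0.2, 1.0])
Q = P @ R_true.T + t_true                         # same points in frame k

R, t = kabsch(P, Q)
```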
23
Q

3d-2d

A

Minimizes the reprojection error
Works for stereo and monocular cases
Depending on which PnP (perspective-n-point) algorithm is used, the requirements differ (i.e. how many point pairs are needed; 3 is typically the minimum)
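
A sketch of one simple way to get a pose from 3D-2D correspondences: the linear DLT, which needs at least 6 point pairs and is used here instead of a minimal 3-point solver because it is short. It assumes normalized (calibrated) image coordinates and noise-free data; the names and test values are made up.

```python
import numpy as np

def project(P, X):
    """Project n x 3 points with a 3 x 4 matrix P to n x 2 image coordinates."""
    Xh = np.hstack([X, np.ones((len(X), 1))])
    uvw = Xh @ P.T
    return uvw[:, :2] / uvw[:, 2:3]

def dlt_resection(X, x):
    """Estimate P (up to scale) from n >= 6 correspondences X (n x 3), x (n x 2)."""
    rows = []
    for Xw, (u, v) in zip(X, x):
        Xh = np.append(Xw, 1.0)
        rows.append(np.concatenate([Xh, np.zeros(4), -u * Xh]))   # row for u
        rows.append(np.concatenate([np.zeros(4), Xh, -v * Xh]))   # row for v
    _, _, Vt = np.linalg.svd(np.array(rows))
    return Vt[-1].reshape(3, 4)            # null vector of the stacked system

# Synthetic scene and camera pose (assumed values)
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, (8, 3)) + np.array([0.0, 0.0, 5.0])
c, s = np.cos(0.2), np.sin(0.2)
R_true = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
t_true = np.array([0.3, -0.1, 0.5])
P_true = np.hstack([R_true, t_true[:, None]])

x = project(P_true, X)                     # observed normalized image points
P_est = dlt_resection(X, x)
max_reprojection_error = np.abs(project(P_est, X) - x).max()
```

In practice the linear result would be refined by minimizing the reprojection error, ideally inside a RANSAC loop.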