Local Features - Week 5/6/7 Flashcards

1
Q

What are the two types of object recognition?

A

Model-based object recognition - find the razors in an image, given a 3D model of the razor

Image-based object recognition - find the razors in an image, given an example picture of the razor

2
Q

What is a feature in computer vision?

A

Local, meaningful, detectable parts of the image

Location of a sudden change

3
Q

Why are features useful in computer vision?

A

They have a high information content

Invariant to changes of viewpoint and illumination

Reduces computational burden

4
Q

What are some applications of image features?

A

Visual SLAM (Simultaneous Localisation and Mapping)

Image Matching (Are two images of the same thing?)

Image alignment
3D reconstruction
Motion Tracking
Indexing and database retrieval
Robot Navigation
Other…

5
Q

What is the procedure for image stitching?

A

Detect feature points in both images
Find corresponding pairs
Use these pairs to align the images
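
A minimal sketch of these three steps using OpenCV (the file names, the 0.75 ratio-test threshold and the output canvas size are illustrative choices, not from the lecture):

```python
import cv2
import numpy as np

# Load the two images (file names are placeholders)
img1 = cv2.imread("left.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("right.jpg", cv2.IMREAD_GRAYSCALE)

# 1. Detect feature points (and descriptors) in both images
sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# 2. Find corresponding pairs, keeping only unambiguous matches (ratio test)
matches = cv2.BFMatcher().knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]

# 3. Use these pairs to align the images: estimate a homography with RANSAC
src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

# Warp img1 into img2's frame and paste img2 over it
canvas = cv2.warpPerspective(img1, H, (img2.shape[1] * 2, img2.shape[0]))
canvas[:img2.shape[0], :img2.shape[1]] = img2
```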

6
Q

What is the general approach to finding and describing features in images?

A
  1. Find a set of distinctive key points (interest points)
  2. Define a region around each key point (interest point)
  3. Extract and normalise the region content
  4. Compute a local descriptor from the normalised region
  5. Match local descriptors
7
Q

What are the requirements for local features?

A

Region extraction needs to be repeatable and:
- Invariant to translation, rotation and scale changes
- Robust or covariant to out-of-plane (~affine) transformations
- Robust to lighting variations, noise, blur and quantisation

Locality: Features are local, therefore robust to occlusion and clutter

Quantity: Need a sufficient number of regions to cover the object

Distinctiveness: The regions should contain “interesting” structure

Efficiency: Close to real-time performance

8
Q

Why are corners good for features?

A

Edges only localise a point in one direction (position along the edge is ambiguous)

Corners provide repeatable points for matching, so are worth detecting.

9
Q

What is the idea behind Harris corner detection?

A

In the region around a corner, image gradient has two or more dominant directions.

So shifting a window around a corner in any direction should give a large change in intensity.
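
In the standard formulation (not spelled out on the card: w is the window function, I the image intensity), the intensity change for a window shift (u, v) is:

```latex
E(u,v) = \sum_{x,y} w(x,y)\,\bigl[\,I(x+u,\,y+v) - I(x,y)\,\bigr]^2
```

A corner is a point where E(u,v) is large for every direction of the shift (u, v).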

10
Q

What are the three region types distinguished when searching for distinctive interest points?

A

“Flat” region: no change in any direction

“Edge” region: no change along the edge direction

“Corner” region: significant change in all directions

11
Q

How is the 2x2 matrix M computed for a region that is being checked for a corner? What actually is M?

A

M is the second-moment (auto-correlation) matrix of the region: at each point in the window, form the outer product of the image gradient with itself, weight it by the window function at that point, and sum over the window. Its entries are the windowed sums of Ix², IxIy and Iy².
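
Written out, with Ix and Iy the image gradients and w the window function (this is the standard Harris second-moment matrix):

```latex
M = \sum_{x,y} w(x,y)
\begin{pmatrix}
I_x^2   & I_x I_y \\
I_x I_y & I_y^2
\end{pmatrix}
```

M summarises the local gradient distribution: for small shifts, E(u,v) ≈ (u v) M (u v)ᵀ.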

12
Q

What are the three types of covariance matrices?

A

Spherical, diagonal and full covariances

13
Q

Given the eigenvalues λ1 and λ2 of the matrix M at a point in the image, what identifies flat, edge and corner regions?

A

If both λ1 and λ2 are small, E is almost constant in all directions, so the region is flat

If one λ is much greater than the other, the region is an edge

If both λ1 and λ2 are large and λ1 ~ λ2, the region is a corner

14
Q

How is the R corner response in the Harris Corner Detector calculated?

A

R = det(M) - alpha * trace(M)^2, where alpha is an empirical constant (typically around 0.04-0.06)
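
Since det(M) = λ1·λ2 and trace(M) = λ1 + λ2, the response can be written directly in terms of the eigenvalues:

```latex
R = \det(M) - \alpha\,\operatorname{trace}(M)^2
  = \lambda_1 \lambda_2 - \alpha\,(\lambda_1 + \lambda_2)^2
```

This makes the behaviour on flat, edge and corner regions easy to read off from λ1 and λ2.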

15
Q

How does the value R in the Harris Corner Detector relate to the image regions?

A

For flat regions, R has a small absolute value

For corner regions, R has a large positive value

For edge regions, R is negative

16
Q

What is the Harris corner detector workflow?

A

Compute the corner responses R

Find the points with large corner responses through thresholding.

Take only the local maxima of R (non-maximum suppression)
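
A minimal sketch of this workflow in Python/SciPy (sigma, alpha, the relative threshold and the non-maximum-suppression neighbourhood are illustrative parameter choices):

```python
import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter, sobel

def harris_corners(img, sigma=1.0, alpha=0.05, rel_thresh=0.01, nms_size=5):
    """Minimal Harris corner detector sketch; parameter values are illustrative."""
    img = np.asarray(img, dtype=float)
    # Image gradients
    Ix = sobel(img, axis=1)
    Iy = sobel(img, axis=0)
    # Entries of the second-moment matrix M, summed with a Gaussian window
    # (the Gaussian blur is itself the weighted sum over the window)
    Sxx = gaussian_filter(Ix * Ix, sigma)
    Syy = gaussian_filter(Iy * Iy, sigma)
    Sxy = gaussian_filter(Ix * Iy, sigma)
    # Corner response R = det(M) - alpha * trace(M)^2, at every pixel
    R = (Sxx * Syy - Sxy**2) - alpha * (Sxx + Syy) ** 2
    # Threshold on large responses, then keep only the local maxima of R
    strong = R > rel_thresh * R.max()
    local_max = R == maximum_filter(R, size=nms_size)
    return np.argwhere(strong & local_max)  # (row, col) corner locations
```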

17
Q

Why do we need to sum over a window for the uniform window function but not for the gaussian window?

A

Because convolving with a Gaussian already computes the weighted sum: each pixel of the Gaussian-blurred gradient products is exactly the Gaussian-weighted sum over the window

18
Q

Do Harris detectors provide rotation invariance?

A

Yes; rotating the image rotates the eigenvectors of M but leaves the eigenvalues unchanged, so the corner response R is invariant to image rotation

19
Q

Do Harris detectors provide scale invariance?

A

No. When the image is scaled up, a window of fixed size sees only a small part of the structure, so a single corner can be classified as many edges

20
Q

What are the advantages of the interest points provided by the Harris detector?

A

Precise localisation
High repeatability

However, in order to compare these points we need to compute a descriptor over a region, which calls for scale-invariant interest regions.

21
Q

What is the naive approach to scale invariant region selection / description?

A

A multi-scale procedure: compare descriptors while varying the patch size.

Computationally inefficient, but still feasible for matching a pair of images.

Prohibitive for retrieval in large databases.

Prohibitive for recognition.

22
Q

What is the solution to scale invariant region selection?

A

To design a function on the region which is “scale invariant” (the same for corresponding regions, even if they are at different scales) to use as the descriptor

23
Q

What is the common approach to automatic scale selection?

A

Take a signature function f whose value is scale-invariant, i.e. the same for corresponding regions even at different image scales (a good one is the Laplacian of Gaussian - the second derivative of a 2D Gaussian).

Compute the signature function at different region sizes and find the region size at which f reaches its maximum; that region size is covariant with image scale, so it selects corresponding regions in differently scaled images.

The two f-against-region-size plots are generated separately for each image; their local maxima are then computed and compared.
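
A sketch of this procedure using SciPy's Laplacian-of-Gaussian filter (the sigma range is an illustrative choice; the sigma² factor is the usual scale normalisation):

```python
import numpy as np
from scipy.ndimage import gaussian_laplace

def characteristic_scale(img, row, col, sigmas=np.geomspace(1.0, 16.0, 20)):
    """Sketch of automatic scale selection at one interest point, using the
    scale-normalised Laplacian of Gaussian as the signature function."""
    img = np.asarray(img, dtype=float)
    responses = []
    for s in sigmas:
        # sigma^2 normalisation makes responses comparable across scales
        responses.append(abs((s**2) * gaussian_laplace(img, s)[row, col]))
    # The scale at which the response peaks is the characteristic scale
    return sigmas[int(np.argmax(responses))]
```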

24
Q

Is the Laplacian a scalar?

A

Yes, it can be found using a single mask

25
Q

Does the laplacian of an image retain orientation information?

A

No, orientation information is lost

26
Q

What effect does taking the Laplacian of an image have on noise?

A

The Laplacian is the sum of the second-order derivatives of the image.

Taking derivatives amplifies noise, so the Laplacian is very noise-sensitive.

27
Q

What is the laplacian always paired with?

A

It is always paired with a smoothing operation (as in the Laplacian of Gaussian), to counteract the fact that it amplifies noise.

28
Q

How is the characteristic scale defined?

A

The scale that produces the peak of the Laplacian response

29
Q

What can be used as an approximation of Laplacian of Gaussian?

A

Difference of Gaussians (DoG)
Take a Gaussian with scale sigma and one with the scaled value k*sigma, and compute their difference, producing a function very similar to the Laplacian of Gaussian
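
A small sketch of the approximation (sigma is illustrative; k = 1.6 is a common choice, e.g. in SIFT):

```python
import numpy as np
from scipy.ndimage import gaussian_filter, gaussian_laplace

sigma, k = 2.0, 1.6
img = np.random.rand(128, 128)  # stand-in for a real image

# Difference of Gaussians: blur at scale k*sigma minus blur at scale sigma
dog = gaussian_filter(img, k * sigma) - gaussian_filter(img, sigma)

# What it approximates: the Laplacian of Gaussian, up to a factor of
# roughly (k - 1) * sigma^2
log = (k - 1) * sigma**2 * gaussian_laplace(img, sigma)
```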

30
Q

What is the state of the art for feature matching?

A

SIFT (Scale-Invariant Feature Transform)

31
Q

How is feature invariance achieved?

A
  1. Make sure the feature detector is invariant to translation, rotation and scale
    - Know how to find interest points (locations and their corresponding characteristic scales)
    - Know how to remove the effects of differences in scale once we detect one of these interest points
  2. Design an invariant feature descriptor
32
Q

What is the disadvantage of using patches with pixel intensities as descriptors?

A

Small changes (scale, rotation, 3D viewpoint change) can affect matching score a lot

33
Q

What is a better method for descriptors than pixel intensities?

A

Histogram of gradient directions of the patch

34
Q

How are rotation invariant descriptors created?

A

Find the local orientation
- The dominant direction of the gradient in the image patch

Rotate the patch according to that angle
- This puts the patches into a canonical orientation

35
Q

How does SIFT describe features?

A

First detect interest points, e.g. with DoG (Difference of Gaussians - an approximation of the Laplacian), and normalise the region around each one (typically to a 16x16 patch).

Then compute the gradient orientation at every pixel of the 16x16 region around the interest point. Divide the region into a 4x4 grid of sub-patches (16 sub-patches of 4x4 pixels each) and build a histogram of the gradient orientations in each sub-patch, using 8 bins.

All of the histogram counts are then concatenated to give a 128-dimensional descriptor vector for the feature (16 x 8 = 128).
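
A simplified sketch of this histogram structure (real SIFT additionally applies Gaussian weighting, trilinear interpolation between bins and rotation to the dominant orientation, all omitted here):

```python
import numpy as np

def sift_like_descriptor(patch):
    """SIFT-style descriptor sketch for a normalised 16x16 patch."""
    assert patch.shape == (16, 16)
    # Gradient magnitude and orientation at every pixel
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % (2 * np.pi)
    desc = []
    # 4x4 grid of 4x4-pixel sub-patches, 8 orientation bins each
    for i in range(0, 16, 4):
        for j in range(0, 16, 4):
            bins = (ang[i:i+4, j:j+4] * 8 / (2 * np.pi)).astype(int) % 8
            hist = np.bincount(bins.ravel(),
                               weights=mag[i:i+4, j:j+4].ravel(),
                               minlength=8)
            desc.append(hist)
    desc = np.concatenate(desc)  # 16 sub-patches x 8 bins = 128 dimensions
    return desc / (np.linalg.norm(desc) + 1e-8)  # normalise for illumination
```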

36
Q

What do the SIFT descriptors of one image yield?

A

For each feature / patch:
- A 128-dimensional descriptor: a histogram of the gradient orientations within the patch
- A scale parameter specifying the size of the patch
- An orientation parameter specifying the angle of the patch
- A 2D point giving the position of the patch

37
Q

How is the best match for each feature determined?

A
  1. Define a distance function that compares two descriptors
  2. Test each feature in one image against all the features in the other, and take the one with minimum distance
38
Q

What common distance functions exist to compare features?

A

SSD - sum of squared differences of the descriptors; can give good scores to ambiguous matches (think fence tops)

Ratio distance = SSD(f1,f2) / SSD(f1,f2’)
where f2 is the best SSD match to f1 and f2’ is the second-best SSD match. Gives large values (~1) for ambiguous matches
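
A sketch of nearest-neighbour matching with SSD and the ratio distance (the 0.8 acceptance threshold is an illustrative choice):

```python
import numpy as np

def match_features(desc1, desc2, ratio_thresh=0.8):
    """Match (N, 128) descriptors in desc1 against (M, 128) descriptors in desc2."""
    matches = []
    for i, d in enumerate(desc1):
        # SSD between this descriptor and every descriptor in the other image
        ssd = np.sum((desc2 - d) ** 2, axis=1)
        best, second = np.argsort(ssd)[:2]
        # Ratio distance: near 1 for ambiguous matches, small for distinctive ones
        if ssd[best] / (ssd[second] + 1e-12) < ratio_thresh:
            matches.append((i, best))
    return matches
```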

39
Q

How are bad feature matches removed?

A

Thresholding on the feature distance: matches with a distance greater than some threshold (e.g. 100) are removed

40
Q

What are true positives for feature matches?

A

The number of detected matches which are correct

41
Q

What are false positives for feature matches?

A

The number of detected matches that are incorrect

42
Q

What curve is used to evaluate a feature matcher?

A

An ROC curve (“Receiver Operating Characteristic”)

43
Q

What is the true positive rate for feature matches?

A

The number of correct matches found by the matcher / The number of potentially correct matches

44
Q

What is the false positive rate for feature matches?

A

The number of matches found by the matcher that were incorrect / The number of features that don’t actually have a match
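
Written as formulas (the standard ROC definitions applied to matching):

```latex
\mathrm{TPR} = \frac{\#\text{ correct matches found}}{\#\text{ potentially correct matches}}
\qquad
\mathrm{FPR} = \frac{\#\text{ incorrect matches found}}{\#\text{ features without a true match}}
```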

45
Q

What are some of the applications of features?

A

Image alignment (e.g. mosaics)
3D reconstruction
Motion Tracking
Object recognition
Indexing and database retrieval
Robot navigation
Panorama stitching
Recognition of specific objects