SIFT Flashcards by ROWAN Gomanee

What is a video in visual computing?

A sequence of still images (frames) shown over time to create the illusion of motion.

What challenges does object recognition in video face?

Motion, lighting changes, scale and rotation, clutter, and background changes.

What is an interest point in an image?

A pixel-level structure that is repeatable and distinctive, such as a corner or blob.

What are desirable properties of an interest point?

Repeatable, distinctive, stable under transformations, subpixel accurate, and well-represented by a descriptor.

What is a blob in image analysis?

A region with distinct intensity or texture that can be localized and measured at a certain scale.

What is a scale space in image processing?

A set of images smoothed at increasing levels of Gaussian blur, used to detect features at different scales.

What is the formula for scale space?

S(x, y, σ) = G(x, y, σ) * I(x, y), where G is a Gaussian kernel with standard deviation σ, I is the image, and * denotes convolution.
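
As an illustration, the formula can be sketched in plain Python. This is a minimal sketch, not a reference implementation: the function names, the 3σ kernel cutoff, and the clamped border handling are assumptions made for the example.

```python
import math

def gaussian_kernel(sigma):
    """Sampled, normalised 1-D Gaussian. The 3*sigma radius is an assumed cutoff."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_1d(values, kernel):
    """Convolve a list with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(i + j - radius, 0), n - 1)
            acc += w * values[idx]
        out.append(acc)
    return out

def gaussian_blur(image, sigma):
    """G(x, y, sigma) * I(x, y), using the separability of the Gaussian."""
    kernel = gaussian_kernel(sigma)
    rows = [blur_1d(row, kernel) for row in image]              # blur along x
    cols = [blur_1d(list(col), kernel) for col in zip(*rows)]   # blur along y
    return [list(row) for row in zip(*cols)]                    # back to row-major

def scale_space(image, sigmas):
    """One blurred copy of the image per value of sigma."""
    return [gaussian_blur(image, s) for s in sigmas]

# A 9x9 image with a single bright pixel in the middle.
image = [[0.0] * 9 for _ in range(9)]
image[4][4] = 1.0
levels = scale_space(image, [1.0, 2.0, 4.0])
peaks = [level[4][4] for level in levels]   # centre value at each scale
```

The centre value drops as σ grows, because the same bright pixel is spread over a wider neighbourhood at coarser scales.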

How are blobs detected in scale space?

By finding local maxima or minima across both spatial and scale dimensions.
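
A rough sketch of this search, assuming the filter responses are already stored as a list of 2D levels (one per scale). The 26-neighbour comparison (8 in the same level, 9 in each adjacent level) is the usual choice; the function names and the toy data are made up for the example.

```python
def is_scale_space_extremum(stack, s, y, x):
    """True if stack[s][y][x] is strictly above or strictly below all 26 neighbours."""
    centre = stack[s][y][x]
    neighbours = [
        stack[s + ds][y + dy][x + dx]
        for ds in (-1, 0, 1)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (ds, dy, dx) != (0, 0, 0)
    ]
    return centre > max(neighbours) or centre < min(neighbours)

def find_extrema(stack):
    """Scan every interior position of every interior scale level."""
    found = []
    for s in range(1, len(stack) - 1):
        for y in range(1, len(stack[s]) - 1):
            for x in range(1, len(stack[s][y]) - 1):
                if is_scale_space_extremum(stack, s, y, x):
                    found.append((s, y, x))
    return found

# Toy response stack: three 5x5 levels, zero except one response
# that is strongest at the middle scale.
stack = [[[0.0] * 5 for _ in range(5)] for _ in range(3)]
stack[0][2][2] = 0.4
stack[1][2][2] = 1.0
stack[2][2][2] = 0.6
extrema = find_extrema(stack)
```

Only the middle-scale position is reported, since it beats its neighbours both in space and in scale.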

What is the Difference of Gaussians (DoG)?

An approximation to the Laplacian of Gaussian, computed as the difference between two Gaussian-blurred images at different scales.
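
To keep the idea small, the sketch below applies DoG to a 1D signal rather than an image. The scale ratio k = 1.6, the 3σ kernel cutoff, and all names are assumptions chosen for illustration.

```python
import math

def gaussian_kernel(sigma):
    """Sampled, normalised 1-D Gaussian. The 3*sigma radius is an assumed cutoff."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur(signal, sigma):
    """Gaussian-blur a 1-D signal, clamping indices at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    return [
        sum(w * signal[min(max(i + j - radius, 0), n - 1)]
            for j, w in enumerate(kernel))
        for i in range(n)
    ]

def difference_of_gaussians(signal, sigma, k=1.6):
    """Blur at k*sigma minus blur at sigma, sample by sample."""
    wide = blur(signal, k * sigma)
    narrow = blur(signal, sigma)
    return [a - b for a, b in zip(wide, narrow)]

# A flat signal with one bright bump (a 1-D "blob") centred on index 20.
signal = [0.0] * 41
for i in range(18, 23):
    signal[i] = 1.0

dog = difference_of_gaussians(signal, sigma=2.0)
strongest = min(range(len(dog)), key=lambda i: dog[i])   # most negative response
```

For a bright bump the response is negative and most extreme at the bump centre, which is the behaviour blob detection relies on.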

Why is DoG used in SIFT?

It is computationally efficient and effective at detecting blobs at multiple scales.

What is SIFT?

Scale-Invariant Feature Transform — a method to detect and describe distinctive features in an image.

What are the main steps of SIFT?

Blob detection, keypoint localization, orientation assignment, descriptor creation, and normalization.

Why is orientation assigned to a SIFT keypoint?

To make the descriptor invariant to image rotation.

How is the principal orientation determined in SIFT?

By computing a histogram of gradient directions around the keypoint and choosing the peak.
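
A simplified sketch of the histogram step. It assumes 36 bins of 10° each and weights each vote by gradient magnitude, both common choices but not stated on the card. It returns the centre of the peak bin and omits the Gaussian weighting and peak interpolation a full implementation would typically add.

```python
import math

def principal_orientation(gradients, num_bins=36):
    """gradients: iterable of (gx, gy) pairs sampled around the keypoint.

    Returns (peak angle in degrees, histogram)."""
    bin_width = 360.0 / num_bins
    histogram = [0.0] * num_bins
    for gx, gy in gradients:
        magnitude = math.hypot(gx, gy)
        angle = math.degrees(math.atan2(gy, gx)) % 360.0
        histogram[int(angle // bin_width) % num_bins] += magnitude
    peak_bin = max(range(num_bins), key=lambda b: histogram[b])
    return (peak_bin + 0.5) * bin_width, histogram

# Hypothetical samples: three gradients near 45 degrees, one outlier at 180.
gradients = [(1.0, 1.0), (1.0, 0.9), (0.9, 1.0), (-1.0, 0.0)]
angle, histogram = principal_orientation(gradients)
```

The three similar gradients fall into the 40° to 50° bin and outvote the outlier, so the reported orientation is that bin's centre.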

What is the formula for gradient orientation in SIFT?

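θ = arctan((∂I/∂y) / (∂I/∂x)), evaluated in practice with the two-argument arctangent atan2(∂I/∂y, ∂I/∂x) so that θ covers the full 360° range.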
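
A small sketch of this formula using central differences for the partial derivatives. The function name and the toy images are assumptions. It uses math.atan2 because a plain arctangent of the ratio cannot tell a gradient pointing right from one pointing left.

```python
import math

def gradient_orientation(image, y, x):
    """Orientation (radians) and magnitude at an interior pixel."""
    dx = (image[y][x + 1] - image[y][x - 1]) / 2.0   # approximates dI/dx
    dy = (image[y + 1][x] - image[y - 1][x]) / 2.0   # approximates dI/dy
    return math.atan2(dy, dx), math.hypot(dx, dy)

# Intensity increases to the right: the gradient points along +x.
ramp_right = [[0.0, 1.0, 2.0] for _ in range(3)]
# Intensity increases to the left: the gradient points along -x.
ramp_left = [[2.0, 1.0, 0.0] for _ in range(3)]

theta_right, mag_right = gradient_orientation(ramp_right, 1, 1)
theta_left, mag_left = gradient_orientation(ramp_left, 1, 1)

# The single-argument arctangent gives the same angle for both ramps.
ambiguous = math.atan(0.0 / -1.0) == math.atan(0.0 / 1.0)
```

Here theta_right is 0 and theta_left is π, while the single-argument form returns 0 for both.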

What does the SIFT descriptor represent?

A histogram of gradient orientations in spatial subregions around the keypoint.

How many dimensions does a typical SIFT descriptor have?

128 dimensions (from 4x4 grid of 8-bin histograms).
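
The arithmetic is 4 × 4 cells × 8 bins = 128 values. The sketch below shows that layout on a hypothetical 16x16 patch of gradients. It is deliberately simplified: no rotation to the keypoint orientation, no Gaussian weighting, and no interpolation between bins.

```python
import math

def simple_descriptor(gx, gy, grid=4, bins=8):
    """gx, gy: square patches (lists of lists) of x and y gradients, e.g. 16x16."""
    size = len(gx)
    cell = size // grid                      # pixels per cell side (4 for 16x16)
    descriptor = []
    for cy in range(grid):
        for cx in range(grid):
            histogram = [0.0] * bins
            for y in range(cy * cell, (cy + 1) * cell):
                for x in range(cx * cell, (cx + 1) * cell):
                    magnitude = math.hypot(gx[y][x], gy[y][x])
                    angle = math.atan2(gy[y][x], gx[y][x]) % (2 * math.pi)
                    b = int(angle / (2 * math.pi) * bins) % bins
                    histogram[b] += magnitude
            descriptor.extend(histogram)     # concatenate the cell histograms
    return descriptor

# Hypothetical patch whose gradient points the same way (about 63 degrees) everywhere.
gx = [[0.5] * 16 for _ in range(16)]
gy = [[1.0] * 16 for _ in range(16)]
descriptor = simple_descriptor(gx, gy)
```

Every cell puts all of its weight in the 45° to 90° bin, and the concatenated vector has 128 entries.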

Why is the SIFT descriptor normalized?

To make it robust to changes in illumination and contrast: scaling all gradient magnitudes by a constant leaves the normalized descriptor unchanged.
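
A minimal sketch of the normalization. The clamp at 0.2 followed by a second normalization follows the scheme usually attributed to Lowe's SIFT paper; it is not stated on the card, so treat the value as an assumption. The 8-value vector stands in for a full 128-value descriptor.

```python
import math

def normalize_descriptor(descriptor, clamp=0.2):
    """L2-normalise, clamp large entries, then L2-normalise again."""
    def unit(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v] if norm > 0 else list(v)

    clamped = [min(x, clamp) for x in unit(descriptor)]
    return unit(clamped)

raw = [4.0, 1.0, 0.0, 2.0, 3.0, 1.0, 0.5, 2.5]
higher_contrast = [2.0 * x for x in raw]   # same gradients, doubled contrast

a = normalize_descriptor(raw)
b = normalize_descriptor(higher_contrast)
```

Both inputs give the same unit-length vector, which is why a uniform contrast change does not disturb matching.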

What is the L2 distance between descriptors used for?

To measure similarity; a smaller L2 distance indicates a better match.

What is the formula for L2 distance between two histograms?

d(H₁, H₂) = sqrt( Σₖ [H₁(k) − H₂(k)]² )
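
A short sketch of the distance, with made-up 4-bin histograms standing in for real descriptors. The function and variable names are illustrative only.

```python
import math

def l2_distance(h1, h2):
    """Square root of the summed squared per-bin differences."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))

query = [0.5, 0.5, 0.0, 0.0]
candidates = {
    "near": [0.5, 0.4, 0.1, 0.0],
    "far": [0.0, 0.0, 0.5, 0.5],
}
# The best match is the candidate with the smallest distance to the query.
best = min(candidates, key=lambda name: l2_distance(query, candidates[name]))
```

Note that each difference is squared before summing; the square root is taken once, over the total.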

What is histogram intersection used for in feature matching?

To measure similarity by summing the minimum values of corresponding bins in two histograms.

What is the formula for histogram intersection?

Σₖ min(H₁(k), H₂(k)), summed over all bins k.
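
A short sketch with made-up histograms. Unlike L2 distance, a larger intersection means a better match; for histograms that each sum to 1, the score lies between 0 and 1.

```python
def histogram_intersection(h1, h2):
    """Sum of the bin-wise minima of two histograms."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(h1, h2))

h1 = [0.5, 0.25, 0.25, 0.0]
h2 = [0.25, 0.25, 0.0, 0.5]

score = histogram_intersection(h1, h2)        # 0.25 + 0.25 + 0.0 + 0.0
self_score = histogram_intersection(h1, h1)   # a histogram fully overlaps itself
```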

What is an application of SIFT in image alignment?

Matching keypoints between images to compute a homography for alignment or stitching.

How is SIFT used in video tracking?

By matching features frame-to-frame to track the motion of objects.
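
One simple way to sketch frame-to-frame matching is nearest-neighbour search on descriptor distance. The max_distance threshold, the function name, and the 3-value descriptors below are all hypothetical; practical trackers usually add further checks to reject ambiguous matches.

```python
import math

def match_features(prev_desc, next_desc, max_distance=0.5):
    """Return (i, j) pairs: descriptor i of the previous frame matched to
    its nearest descriptor j in the next frame, if it is close enough."""
    matches = []
    for i, d1 in enumerate(prev_desc):
        best_j, best_dist = None, float("inf")
        for j, d2 in enumerate(next_desc):
            dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(d1, d2)))
            if dist < best_dist:
                best_j, best_dist = j, dist
        if best_j is not None and best_dist <= max_distance:
            matches.append((i, best_j))
    return matches

# Toy descriptors: the two features of the previous frame reappear,
# slightly changed and in a different order, in the next frame.
prev_frame = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
next_frame = [[0.0, 0.9, 0.1], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0]]
matches = match_features(prev_frame, next_frame)
```

Pairing each match with the keypoint positions in the two frames gives the displacement of that feature between frames.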

How does SIFT support object recognition in video?

By detecting and matching distinctive features invariant to scale, rotation, and lighting.

What is the advantage of using descriptors like SIFT?

They are robust to transformations and help in reliable matching across images or video frames.
