The Final Iteration Flashcards
Keep adding here, and then study it all. And I mean all of it (40 cards)
What is the epipolar plane?
The epipolar plane is the plane that passes through the two cameras' optical centres and a 3D scene point. Given the two optical centres and a point in an image, you can compute the epipolar plane.
What is an epipolar line?
An epipolar line is defined by the epipolar plane and a camera's image plane: it is the line along which the two planes intersect.
How does an epipolar plane help with stereo vision?
It reduces the dimension for the correspondence problem from 2D to 1D, making it more efficient.
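A minimal NumPy sketch of the 2D-to-1D reduction. For a rectified stereo pair with a purely horizontal baseline, the fundamental matrix takes a simple known form (up to scale); all other values here are illustrative:

```python
import numpy as np

# Fundamental matrix for a rectified pair with horizontal baseline (up to scale)
F = np.array([[0.0, 0.0,  0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0,  0.0]])

def epipolar_line(F, point):
    """Line coefficients (a, b, c) with a*x + b*y + c = 0 in the other image."""
    x = np.array([point[0], point[1], 1.0])
    return F @ x

a, b, c = epipolar_line(F, (120.0, 45.0))
# The line is 0*x - 1*y + 45 = 0, i.e. y = 45: the match for a pixel in
# row 45 can only lie along row 45 of the other image -- a 1D search.
```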
What is the cost volume in regards to correspondence search?
The cost volume stores matching costs for each pixel over a range of disparities, which represents how well that pixel matches with a shifted pixel in the other image.
How is the cost volume used in regards to correspondence search?
The cost volume is used to compute the disparity map: for each pixel, the disparity with the lowest cost is selected (winner-takes-all).
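A hedged toy sketch of the two cards above: a SAD cost volume for a single scanline pair, followed by winner-takes-all disparity selection (real pipelines aggregate costs over windows; the arrays here are made up):

```python
import numpy as np

left  = np.array([1, 2, 3, 4, 5, 6], dtype=float)
right = np.array([3, 4, 5, 6, 7, 8], dtype=float)  # left shifted by d = 2
max_disp = 3

width = len(left)
cost_volume = np.full((width, max_disp + 1), np.inf)
for d in range(max_disp + 1):
    for x in range(d, width):
        # Matching cost: absolute difference between left pixel x and
        # the right pixel d positions to the left (since x_r = x_l - d).
        cost_volume[x, d] = abs(left[x] - right[x - d])

disparity = np.argmin(cost_volume, axis=1)  # lowest cost wins per pixel
# Pixels with a valid match (x >= 2) recover the true disparity of 2.
```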
What is the fine-grained version of how a U-net is trained on image data for segmentation?
- Feed it image-label pairs, where each label is a pixel-wise segmentation map
- Architecture uses an encoder to downsample features, and a decoder with skip connections to reconstruct spatial details
- During training, a loss function compares predictions with the ground-truth segmentation map
- Network updates via backpropagation
- Performance evaluated using a test set
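A hedged sketch of the loss step alone: pixel-wise cross-entropy between a predicted class-probability map and a ground-truth label map (a full U-Net forward/backward pass would need a deep-learning framework; the function name and toy arrays are illustrative):

```python
import numpy as np

def pixelwise_cross_entropy(probs, labels):
    """probs: (H, W, C) softmax outputs; labels: (H, W) integer class ids."""
    h, w = labels.shape
    # Pick each pixel's predicted probability for its true class,
    # then average the negative log-likelihood over all pixels.
    p_true = probs[np.arange(h)[:, None], np.arange(w)[None, :], labels]
    return -np.log(p_true + 1e-12).mean()

probs = np.full((2, 2, 3), 1.0 / 3.0)          # uniform prediction, 3 classes
labels = np.array([[0, 1], [2, 0]])            # ground-truth segmentation map
loss = pixelwise_cross_entropy(probs, labels)  # = ln(3) for a uniform guess
```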
How does Background Subtraction work in Object Tracking?
Background subtraction works by comparing the current frame to a background model using changes in pixel intensities. It often uses Gaussian Mixture Models, which allow it to handle dynamic backgrounds.
What are advantages of using SIFT descriptors over raw pixel intensities?
- Invariant to scale, rotation and minor illumination changes
- Uses local gradient orientation histograms, which are resistant to noise and misalignment
What are the benefits from using drop-out when training networks?
- Improves generalisation
- Reduces overfitting
- Forces the network to learn more robust representations
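A minimal sketch of "inverted" dropout on a layer's activations (all names and values are illustrative): each unit is zeroed with probability p at training time, and the survivors are scaled by 1/(1-p) so the expected activation is unchanged at test time.

```python
import numpy as np

def dropout(activations, p, rng):
    # Keep each unit with probability 1 - p, zero the rest,
    # and rescale so the expected value matches test time.
    mask = rng.random(activations.shape) >= p
    return activations * mask / (1.0 - p)

rng = np.random.default_rng(0)
a = np.ones((4, 5))
out = dropout(a, p=0.5, rng=rng)
# Roughly half the entries are zeroed; the kept entries become 2.0.
```

Forcing different random subsets of units to carry the signal each step is what yields the more robust, less co-adapted representations mentioned above.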
How does a Gaussian Mixture Model work?
- It models each pixel as a mixture of several Gaussian distributions, representing different background states.
- Each pixel is then compared to these different Gaussian models to determine whether it fits into the background or the foreground i.e. an object of interest.
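A hedged per-pixel sketch of that comparison (thresholds and component values are made up): a pixel's intensity is checked against each background Gaussian (mean, std, weight); if it lies within k standard deviations of a sufficiently weighted component it is classed as background.

```python
def is_background(intensity, means, stds, weights, k=2.5, w_min=0.1):
    for mu, sigma, w in zip(means, stds, weights):
        if w >= w_min and abs(intensity - mu) <= k * sigma:
            return True
    return False  # no component matched: foreground, i.e. an object of interest

# Two background states, e.g. a flickering light: dark (~30) and bright (~200)
means, stds, weights = [30.0, 200.0], [5.0, 10.0], [0.6, 0.4]
is_background(32.0, means, stds, weights)   # True: fits the dark mode
is_background(120.0, means, stds, weights)  # False: matches neither mode
```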
How does histogram equalisation work?
- Redistributes pixel intensity values so that they span the full range of possible values
- Computes the cumulative distribution function (CDF) of the image histogram, and maps the original intensities to new ones
- Spreads out frequent intensity values, improving contrast in low-contrast images
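The steps above can be sketched in NumPy for an 8-bit greyscale image (the toy image is made up; real images would come from disk):

```python
import numpy as np

def equalise(img):
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0].min()
    # Map each original intensity through the normalised CDF.
    lut = np.clip(np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255),
                  0, 255).astype(np.uint8)
    return lut[img]

# A low-contrast image (values squeezed into 100..103)...
img = np.array([[100, 100, 101, 101],
                [102, 102, 103, 103]], dtype=np.uint8)
out = equalise(img)  # ...now spans the full 0..255 range
```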
What is the equation used to calculate disparity between two images?
Disparity = Xl - Xr
Where:
Xl = Point observed in left image
Xr = Point observed in right image
What is the equation used to calculate depth, using disparity?
Z = (f × T) / D
Where:
Z = Depth
f = Focal Length
T = Real-world distance between two cameras
D = Disparity
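A quick worked example of the two equations above (all numbers are made up for illustration):

```python
f = 700.0                 # focal length, in pixels
T = 0.12                  # baseline: distance between the two cameras, in metres
x_left, x_right = 420.0, 378.0

D = x_left - x_right      # disparity = 42 pixels
Z = (f * T) / D           # depth = 700 * 0.12 / 42 = 2.0 metres
```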
How does a Particle Filter work?
- Generates particles, each of which represents a different hypothesis that ‘guesses’ the position of the object in the next time-frame
- Each particle is moved forward by a motion model, then assigned a weight computed by an observation model that compares the particle's predicted measurement with real sensor data
- Over time, particles with lower weights contribute less, leading to degeneracy
How does resampling alleviate the problem of degeneracy in Particle Filters?
It duplicates high-weight particles and discards low-weight particles when generating the next set of particles.
It thereby focuses computation on more likely hypotheses and maintains tracking accuracy
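A hedged sketch of multinomial resampling (one common scheme; particle values are illustrative): draw a new particle set with probability proportional to weight, then reset the weights to uniform.

```python
import numpy as np

def resample(particles, weights, rng):
    weights = weights / weights.sum()
    # Draw indices with probability proportional to weight: high-weight
    # particles tend to be duplicated, low-weight ones discarded.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))

rng = np.random.default_rng(0)
particles = np.array([0.0, 1.0, 2.0, 3.0])
weights = np.array([0.01, 0.01, 0.97, 0.01])  # one dominant hypothesis
new_particles, new_weights = resample(particles, weights, rng)
# Most surviving particles typically duplicate the high-weight hypothesis.
```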
What is semantic segmentation?
Assigns a class label to each pixel in an image, grouping pixels by category without differentiating between individual objects.
What is instance segmentation?
Assigns a class label to each pixel in an image, and also identifies and separates each instance of an object i.e. it identifies each object separately unlike semantic segmentation
How is training loss computed for a denoising diffusion model?
- Adds a known amount of noise to a training image at a randomly sampled timestep
- The network predicts the noise that was added, and the prediction is compared with the actual noise using the MSE loss function
- Repeats this process across timesteps and training images
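A hedged NumPy sketch of one loss evaluation at a single timestep, with a stand-in for the network's prediction (the image, schedule value, and variable names are all illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
x0 = rng.random((8, 8))              # toy "clean image"
alpha_bar_t = 0.5                    # cumulative noise-schedule term at step t
eps = rng.standard_normal((8, 8))    # the actual noise added

# Forward process: x_t = sqrt(alpha_bar_t)*x0 + sqrt(1 - alpha_bar_t)*eps
x_t = np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps

eps_pred = np.zeros_like(eps)        # stand-in for the network's output on x_t
loss = np.mean((eps_pred - eps) ** 2)  # MSE between predicted and actual noise
```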
What are some disadvantages of using VAEs for generating images?
- Blurry image generation, due to the pixel-wise reconstruction loss and the probabilistic decoder, which average over plausible outputs
- The KL-divergence latent-space regularisation term also contributes to blurry images
How is illumination invariance implemented in feature detection?
- HOG descriptors
- SIFT descriptors
Both achieve illumination invariance by focusing on edge orientations rather than absolute intensities, and by normalising local patches to reduce brightness variations
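A minimal sketch of the patch-normalisation idea (toy data; this is the normalisation step only, not a full descriptor): subtracting a patch's mean and dividing by its standard deviation cancels additive brightness shifts and multiplicative gain changes.

```python
import numpy as np

def normalise_patch(patch, eps=1e-8):
    # Zero-mean, unit-variance normalisation of a local patch.
    return (patch - patch.mean()) / (patch.std() + eps)

rng = np.random.default_rng(0)
patch = rng.random((8, 8))
brighter = 1.5 * patch + 40.0   # same content under different illumination

# After normalisation the two patches are (near-)identical.
diff = np.abs(normalise_patch(patch) - normalise_patch(brighter)).max()
```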
What is a formal definition of lower-level tasks?
Low-level tasks involve basic image processing such as edge detection or noise reduction
What is a formal definition of mid-level tasks?
Mid-level tasks involve interpreting groups of pixels, including segmentation and depth estimation
What is a formal definition of high-level tasks?
High-level tasks refer to semantic understanding such as object detection, recognition and pose estimation
What is the sensing stage in Computer Vision?
It involves acquiring raw image data through devices like cameras or depth sensors, and provides the initial input to the CV system.