Additional Exam Notes Flashcards

The deck that brings together other points I have yet to cover properly. (39 cards)

1
Q

Explain the term Epipolar Plane

A

The plane defined by an image feature and the two optical centres; it is also called the plane of interpretation.
The world feature that generated the known image feature, and the corresponding feature in the other view, must both lie in this plane.

2
Q

Why does the epipolar plane simplify the stereo correspondence problem?

A
  • The intersection of the plane of interpretation and the second image plane is a straight line (called an epipolar line)
  • The epipolar plane therefore determines the epipolar lines: given a feature extracted from one view, the corresponding feature must lie on the corresponding epipolar line in the other view
  • The search space is reduced from two dimensions to one
3
Q

How is the Integral Image calculated?

A
  • Computes a value at each pixel that is the sum of all pixel values above and to the left of (and including) that pixel
  • Cumulative row sum: s(x, y) = s(x-1, y) + i(x, y), with s(-1, y) = 0
  • Integral image: ii(x, y) = ii(x, y-1) + s(x, y), with ii(x, -1) = 0 (see the sketch below)
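A minimal NumPy sketch of these recurrences (the function name and array layout are illustrative, not from the notes):

```python
import numpy as np

def integral_image(i):
    """Integral image: ii[y, x] is the sum of i over all pixels above and
    to the left of (and including) position (x, y)."""
    i = np.asarray(i, dtype=np.float64)
    ii = np.zeros_like(i)
    for y in range(i.shape[0]):
        s = 0.0                          # cumulative row sum s(x, y)
        for x in range(i.shape[1]):
            s += i[y, x]                 # s(x, y) = s(x-1, y) + i(x, y)
            above = ii[y - 1, x] if y > 0 else 0.0
            ii[y, x] = above + s         # ii(x, y) = ii(x, y-1) + s(x, y)
    return ii

# The same result in two vectorised calls:
# ii = i.cumsum(axis=1).cumsum(axis=0)
```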
4
Q

What benefit does using Integral images bring?

A
  • Speed, i.e. more efficient computation: once the integral image is built, the sum of intensity values inside any rectangle can be obtained from only 4 integral-image values (one per rectangle corner), regardless of rectangle size (see the sketch below)
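A hedged sketch of the four-value lookup, reusing the integral_image helper sketched under the previous card (the inclusive-corner convention is an assumption of that sketch):

```python
def rect_sum(ii, x0, y0, x1, y1):
    """Sum of pixel values inside the rectangle with top-left (x0, y0) and
    bottom-right (x1, y1), both inclusive, from 4 integral-image values."""
    total = ii[y1, x1]
    if x0 > 0:
        total -= ii[y1, x0 - 1]          # subtract the strip to the left
    if y0 > 0:
        total -= ii[y0 - 1, x1]          # subtract the strip above
    if x0 > 0 and y0 > 0:
        total += ii[y0 - 1, x0 - 1]      # add back the doubly-subtracted corner
    return total
```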
5
Q

What are advantages of a pinhole camera?

A
  • Simple to understand
  • No lens distortion due to lack of lens
  • Infinite depth of field: everything in the scene is in focus, so no depth-of-field blur distorts the image
6
Q

What is a one line description for segmentation?

A

Assign all pixels to objects

7
Q

What is a one line description for Recognition?

A

Identify the main object in the image

8
Q

What is a one line description for Detection?

A

Find the location of all objects

9
Q

What is a one line description for Pose?

A

Find all of the object parts

10
Q

What are characteristics of a good feature?

A
  • Invariant to scale and rotation
  • Reflect useful object properties
  • Unique and repeatable
11
Q

What are example types of object recognition tasks?

A
  • Segmentation
  • Pose Estimation
  • Detection
12
Q

Why are Diffusion Models preferred over GANs?

A
  • Easier to train
  • Generate high-quality images
  • Avoid GAN issues such as mode collapse and the training instability caused by balancing the generator against the discriminator
13
Q

What are important concepts in Background Subtraction?

A
  • Gaussian Mixture Models
  • Thresholding
14
Q

What features are commonly used in Classical Computer Vision?

A
  • HOG descriptors
  • Colour histograms
  • Texture Features
15
Q

Describe the structure of a pinhole camera

A
  • The pinhole (optical centre) lies on the principal axis (Z axis)
  • The 3D scene is projected onto the image plane through the pinhole located at the optical centre
  • Increasing the focal length (the pinhole-to-image-plane distance) narrows the field of view; decreasing it widens the field of view (see the projection sketch below)
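A minimal sketch of the ideal pinhole projection equations implied by this card (the numbers are made up for illustration):

```python
def pinhole_project(X, Y, Z, f):
    """Ideal pinhole projection: a 3D point (X, Y, Z) in camera coordinates
    maps to image-plane coordinates (x, y) = (f*X/Z, f*Y/Z)."""
    assert Z > 0, "the point must be in front of the camera"
    return f * X / Z, f * Y / Z

# A longer focal length magnifies the projection, so a fixed-size image
# plane captures a narrower field of view.
print(pinhole_project(1.0, 0.5, 4.0, f=0.05))   # -> (0.0125, 0.00625)
```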
16
Q

Explain the forward process in a diffusion model

A
  • Noise is added in a Markov Chain
  • Each step’s noise is only dependent on the previous step
  • This is a fixed process, containing no learned parameters.
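A minimal sketch of one forward (noising) step, assuming the common Gaussian formulation q(x_t | x_{t-1}) = N(sqrt(1 - beta_t) * x_{t-1}, beta_t * I); the linear schedule and variable names are illustrative assumptions, not details from the card:

```python
import numpy as np

def forward_step(x_prev, beta_t, rng=np.random.default_rng()):
    """One Markov-chain noising step: depends only on the previous state."""
    noise = rng.standard_normal(x_prev.shape)
    return np.sqrt(1.0 - beta_t) * x_prev + np.sqrt(beta_t) * noise

# Fixed (non-learned) noise schedule applied over T steps.
T = 1000
betas = np.linspace(1e-4, 0.02, T)
x = np.zeros((28, 28))                 # stand-in "image"
for t in range(T):
    x = forward_step(x, betas[t])
```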
17
Q

Explain the reverse process in a Diffusion model

A
  • Train and use a U-net style model that learns the reverse diffusion process
  • This is trained by predicting the noise in the image given the current timestep
  • By removing predicted noise from the image, we get an approximation of the original image
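A hedged PyTorch sketch of one training step for the noise predictor; the model(x_t, t) signature and the closed-form noising expression are standard DDPM-style assumptions, not details from the card:

```python
import torch
import torch.nn.functional as F

def training_step(model, x0, alphas_bar, optimizer):
    """Sample a timestep, noise the clean image x0 in closed form, then
    train the U-Net-style model to predict the noise that was added."""
    t = torch.randint(0, len(alphas_bar), (x0.shape[0],))
    eps = torch.randn_like(x0)
    a_bar = alphas_bar[t].view(-1, 1, 1, 1)
    x_t = torch.sqrt(a_bar) * x0 + torch.sqrt(1.0 - a_bar) * eps
    loss = F.mse_loss(model(x_t, t), eps)    # predict the added noise
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```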
18
Q

Describe the process of object detection for classification

A
  • Features are extracted using feature extraction methods
  • They are then used to train a classifier model, which identifies regions in test images likely to contain objects and classifies them accordingly
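A rough sketch of such a pipeline using HOG features and a linear SVM (scikit-image and scikit-learn); the window size, stride, threshold and data variables are illustrative assumptions:

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

def extract(patch):
    # Feature extraction step: HOG descriptor of a fixed-size grayscale patch.
    return hog(patch, pixels_per_cell=(8, 8), cells_per_block=(2, 2))

def train_detector(patches, labels):
    # Train a classifier on features from positive and negative example patches.
    return LinearSVC().fit(np.stack([extract(p) for p in patches]), labels)

def detect(clf, image, win=64, stride=16, thresh=0.5):
    """Slide a window over the test image and keep regions the classifier
    scores as likely to contain the object."""
    hits = []
    for y in range(0, image.shape[0] - win, stride):
        for x in range(0, image.shape[1] - win, stride):
            score = clf.decision_function([extract(image[y:y + win, x:x + win])])[0]
            if score > thresh:
                hits.append((x, y, win, win, score))
    return hits
```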
19
Q

Explain how a particle filter updates object tracking

A
  • Each particle represents a hypothesis
  • Predictions are made based on a motion model
  • Measured positions are then used to update the particle weights based on how well they match the observation
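A minimal 1D sketch of one predict / weight / resample cycle (the motion and measurement noise values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter_step(particles, weights, measurement,
                         motion_std=1.0, meas_std=2.0):
    """One tracking update; each particle is a hypothesis of the position."""
    # Predict: propagate every hypothesis with the motion model
    # (a random walk stands in for the motion model here).
    particles = particles + rng.normal(0.0, motion_std, size=particles.shape)

    # Update: re-weight each hypothesis by how well it matches the measurement.
    weights = weights * np.exp(-0.5 * ((measurement - particles) / meas_std) ** 2)
    weights /= weights.sum()

    # Resample: concentrate particles on high-weight hypotheses.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))

particles = rng.uniform(0.0, 100.0, size=500)
weights = np.full(500, 1.0 / 500)
particles, weights = particle_filter_step(particles, weights, measurement=42.0)
```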
20
Q

Describe the architecture of a U-Net model

A
  • U-net uses an encoder to compress features
  • Followed by a decoder to reconstruct segmentation maps
  • Skip connections between matching levels of encoder and decoder to preserve spatial information
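A toy PyTorch sketch of the idea with a single encoder/decoder level; the channel counts and depth are illustrative, not the original U-Net configuration:

```python
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    """Minimal illustration of the U-Net idea: encoder, decoder, and a
    skip connection between matching resolutions."""
    def __init__(self, in_ch=3, n_classes=2):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU())
        self.down = nn.MaxPool2d(2)                         # compress features
        self.bottleneck = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)   # restore resolution
        self.dec = nn.Sequential(nn.Conv2d(32, 16, 3, padding=1), nn.ReLU())
        self.head = nn.Conv2d(16, n_classes, 1)             # per-pixel class scores

    def forward(self, x):
        e = self.enc(x)                    # encoder features (full resolution)
        b = self.bottleneck(self.down(e))  # compressed representation
        u = self.up(b)                     # decoder upsamples back
        u = torch.cat([u, e], dim=1)       # skip connection preserves spatial detail
        return self.head(self.dec(u))

logits = TinyUNet()(torch.randn(1, 3, 64, 64))   # -> shape (1, 2, 64, 64)
```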
21
Q

Explain both semantic and instance segmentation

A
  • Semantic labels every pixel by class
  • Instance distinguishes individual objects of the same class
22
Q

What are the challenges of using a single Gaussian Background Model?

A
  • Cannot model dynamic backgrounds
  • Sensitive to lighting changes
  • Cannot represent multiple background modes at a pixel
23
Q

What’s an advantage and disadvantage of using a Particle Filter over a Kalman Filter?

A

Advantage - Particle filters handle multi-modal distributions
Disadvantage - Requires more computation

24
Q

Why is SIFT useful?

A
  • Invariant to scale, rotation and illumination
  • It is robust for matching features across different views (see the matching sketch below)
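A short OpenCV sketch of matching SIFT features across two views; the image file names are placeholders:

```python
import cv2

# Placeholder file names; any two overlapping views of the same scene work.
img1 = cv2.imread("view1.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("view2.jpg", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Lowe's ratio test keeps only distinctive (unique, repeatable) matches.
matcher = cv2.BFMatcher()
good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
        if m.distance < 0.75 * n.distance]
print(f"{len(good)} reliable matches across the two views")
```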
25
Q

What is the importance of illumination modelling?

A

Helps to accurately interpret scene radiance, ensuring features and colours are not misinterpreted due to lighting changes

26
Q

What factors affect the quality of images captured in digital photography?

A
  • Shutter speed
  • Aperture
  • ISO setting
27
Q

What are valid applications of saliency prediction in Computer Vision?

A
  • Smart image cropping
  • Object detection
  • Content-aware compression
28
Q

In Stereo Vision, what helps estimate depth from two images?

A
  • Epipolar geometry: constrains the correspondence search
  • Disparity maps: allow depth calculation
29
Q

What are the components of a Kalman filter used in tracking?

A
  • Motion model
  • Measurement update
  • State prediction
30
Q

Explain how ISO, Shutter Speed and Aperture interact to control image brightness

A
  • ISO controls sensor sensitivity
  • Aperture determines how much light enters the lens
  • Shutter speed determines how long the sensor is exposed to light
  • Increasing one typically requires reducing another to maintain consistent exposure
31
Q

Describe how disparity maps are used in Stereo Vision to infer depth

A

Disparity maps represent pixel shifts between the left and right images. Depth is inversely proportional to disparity: the larger the disparity, the closer the object is to the cameras (see the sketch below).
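A minimal sketch of the rectified-stereo relation implied here, Z = f * B / d; the example numbers are illustrative:

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Rectified stereo: depth Z = f * B / d.
    Larger disparity -> smaller depth (the object is closer)."""
    assert disparity_px > 0, "zero disparity corresponds to a point at infinity"
    return focal_length_px * baseline_m / disparity_px

# e.g. f = 700 px, baseline = 0.12 m, disparity = 42 px  ->  2.0 m
print(depth_from_disparity(42.0, 700.0, 0.12))
```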
32
Q

How does a Kalman Filter handle uncertainty during object tracking?

A

Kalman filters maintain a mean and a covariance for the state estimate. Prediction steps add uncertainty and measurement updates reduce it (see the sketch below).
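A compact sketch of one predict/update cycle for a 1D constant-velocity model; the matrices and noise values are illustrative assumptions:

```python
import numpy as np

def kalman_step(x, P, z, F, H, Q, R):
    """x, P: state mean and covariance; z: new measurement.
    Prediction grows the covariance (adds uncertainty);
    the measurement update shrinks it."""
    # Predict
    x = F @ x
    P = F @ P @ F.T + Q
    # Update
    y = z - H @ x                        # innovation
    S = H @ P @ H.T + R                  # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)       # Kalman gain
    x = x + K @ y
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P

# Constant-velocity model in 1D: state = [position, velocity].
F = np.array([[1.0, 1.0], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])               # only position is measured
Q = 0.01 * np.eye(2)
R = np.array([[1.0]])
x, P = np.zeros(2), np.eye(2)
x, P = kalman_step(x, P, z=np.array([1.2]), F=F, H=H, Q=Q, R=R)
```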
33
Q

What is the formal way of explaining the role of Pooling layers?

A

Pooling layers downsample feature maps to reduce computation and enforce spatial hierarchy (see the sketch below).
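A small NumPy illustration of max pooling as a downsampling operation:

```python
import numpy as np

def max_pool2d(x, k=2):
    """k x k max pooling with stride k: downsamples the feature map by
    keeping only the strongest activation in each window."""
    h, w = x.shape
    x = x[:h - h % k, :w - w % k]                      # crop to a multiple of k
    return x.reshape(h // k, k, w // k, k).max(axis=(1, 3))

fmap = np.arange(16, dtype=float).reshape(4, 4)
print(max_pool2d(fmap))   # -> [[ 5.,  7.], [13., 15.]]
```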
34
Q

What is the formal way of explaining the role of Activation Functions in CNNs?

A

Activation functions introduce non-linearity, enabling complex pattern learning.
35
Q

What is the function of a Disparity Map in Stereo Vision?

A

Disparity maps capture the shift of corresponding pixels between stereo image pairs. This disparity allows depth to be calculated via triangulation.
36
Q

Compare background subtraction using a single Gaussian model against a Gaussian Mixture Model

A
  • Single Gaussian: assumes a static background
  • Gaussian Mixture Model: can represent dynamic backgrounds (e.g. moving leaves) by modelling multiple modes per pixel (see the sketch below)
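A hedged sketch using OpenCV's Gaussian-mixture background subtractor (MOG2); the video file name and parameter values are placeholders:

```python
import cv2

cap = cv2.VideoCapture("traffic.mp4")            # placeholder video path
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16,
                                                detectShadows=True)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # Each pixel is modelled by a mixture of Gaussians, so repetitive
    # motion (e.g. moving leaves) can be absorbed into the background.
    fg_mask = subtractor.apply(frame)
    cv2.imshow("foreground", fg_mask)
    if cv2.waitKey(30) & 0xFF == 27:             # Esc to quit
        break
cap.release()
cv2.destroyAllWindows()
```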
37
Q

How does Motion Difference capture movement?

A

Motion difference compares sequential frames to detect pixel intensity changes, which indicate movement (see the sketch below).
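A minimal OpenCV sketch of frame differencing; the threshold value is an illustrative assumption:

```python
import cv2

def motion_mask(prev_frame, frame, threshold=25):
    """Frame differencing: pixels whose grayscale intensity changed by more
    than `threshold` between consecutive frames are flagged as moving."""
    g0 = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)
    g1 = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    diff = cv2.absdiff(g0, g1)
    _, mask = cv2.threshold(diff, threshold, 255, cv2.THRESH_BINARY)
    return mask
```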
38
Q

What is a major challenge of evaluating saliency maps?

A

Human attention is subjective, making ground truths hard to define. Multiple metrics exist, but interpreting results remains complex.
39
Q

What is the main advantage of using U-Net in image segmentation tasks?

A

U-Net preserves spatial resolution through skip connections, so it performs well even on small-dataset segmentation tasks.