Autoencoders and Computer Vision Flashcards

1
Q

What is Dimensionality Reduction?

A

Shrinking your data without losing its meaning.

2
Q

What is the Curse of Dimensionality?

A

As the number of features grows, the data becomes increasingly sparse in the feature space, so far more samples are needed to cover it.
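
A rough way to see the sparsity: split every feature axis into a fixed number of bins and count the grid cells. The sketch below is illustrative only; the function name `max_occupied_fraction` and the choice of 10 bins per axis are assumptions, not part of the card.

```python
def max_occupied_fraction(n_samples: int, n_features: int, bins_per_axis: int = 10) -> float:
    """Upper bound on the fraction of grid cells that can hold at least one sample."""
    n_cells = bins_per_axis ** n_features  # cell count grows exponentially with features
    return min(1.0, n_samples / n_cells)


# With a fixed budget of 1,000 samples, possible coverage collapses as features are added.
for d in (1, 2, 3, 6):
    print(d, max_occupied_fraction(1000, d))
```

With six features, at most a tiny fraction of the cells can contain any sample at all, which is the sparsity the card refers to.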

3
Q

What is an Autoencoder?

A

A neural network trained to compress its input and then rebuild it.

4
Q

What is the key idea behind an Autoencoder?

A

Learn a smart encoding of the input, then use that encoding to reconstruct the original.

5
Q

What are the components of an Autoencoder?

A
  • Encoder: Compresses input into smaller vector
  • Latent Space (Code): The compressed form
  • Decoder: Reconstructs the original from the code
  • Loss: Measures how close output is to original (e.g., MSE)
6
Q

What is the goal of an Autoencoder?

A

Minimise the difference between input and output.

7
Q

Fill in the blank: The __________ is the compressed form in an Autoencoder.

A

Latent Space (Code)

8
Q

What is the architecture of an Autoencoder?

A

Example (fully connected): input size = 784, hidden size = 128, code size = 32.
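
A minimal NumPy sketch of a forward pass with these sizes may help fix the shapes in mind. It is not a trained model: the random weights, the ReLU and sigmoid activations, the batch of 16 fake inputs, and the decoder mirroring the encoder (32 → 128 → 784) are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
INPUT, HIDDEN, CODE = 784, 128, 32  # sizes from the card


def dense(n_in, n_out):
    """Small random weights and zero biases for one fully connected layer."""
    return rng.normal(0.0, 0.01, size=(n_in, n_out)), np.zeros(n_out)


def relu(x):
    return np.maximum(x, 0.0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


# Encoder: 784 -> 128 -> 32. Decoder (assumed to mirror it): 32 -> 128 -> 784.
(w1, b1), (w2, b2) = dense(INPUT, HIDDEN), dense(HIDDEN, CODE)
(w3, b3), (w4, b4) = dense(CODE, HIDDEN), dense(HIDDEN, INPUT)


def encode(x):
    return relu(relu(x @ w1 + b1) @ w2 + b2)


def decode(code):
    return sigmoid(relu(code @ w3 + b3) @ w4 + b4)


x = rng.random((16, INPUT))       # batch of 16 fake flattened inputs in [0, 1)
code = encode(x)                  # latent code, shape (16, 32)
x_hat = decode(code)              # reconstruction, shape (16, 784)
mse = np.mean((x - x_hat) ** 2)   # reconstruction loss; training would minimise this
print(code.shape, x_hat.shape)
```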

9
Q

What is a Denoising Autoencoder?

A

An autoencoder trained to remove noise from its input images.

10
Q

What do the input and target look like for a Denoising Autoencoder?

A
  • Input = Noisy image
  • Target = Clean image
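
The pairing can be sketched in plain Python as below. Gaussian noise, the `noise_std` value, and the helper name `make_denoising_pair` are assumptions chosen for illustration; other corruption schemes are equally possible.

```python
import random


def make_denoising_pair(clean, noise_std=0.2, seed=0):
    """Return (noisy_input, clean_target) for one image given as a flat list of pixels in [0, 1]."""
    rng = random.Random(seed)
    # Add Gaussian noise to every pixel, then clip back into the valid range.
    noisy = [min(1.0, max(0.0, p + rng.gauss(0.0, noise_std))) for p in clean]
    return noisy, list(clean)


clean_image = [0.0, 0.25, 0.5, 0.75, 1.0]
noisy_input, target = make_denoising_pair(clean_image)
# The network would be fed `noisy_input`, and its loss computed against `target`.
```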
11
Q

What is a Convolutional Autoencoder (CAE)?

A

An autoencoder for image data built from convolutional (Conv2D) layers.

12
Q

What is the role of the Encoder in a Convolutional Autoencoder?

A

It uses Conv2D layers and MaxPooling2D to compress the input.
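
To illustrate only the pooling half of this (not the convolution), here is a plain-Python sketch of 2x2 max pooling with stride 2. The function name and the example feature map are made up for illustration.

```python
def max_pool_2x2(image):
    """2x2 max pooling with stride 2 on a 2-D list; height and width must be even."""
    return [
        [max(image[r][c], image[r][c + 1], image[r + 1][c], image[r + 1][c + 1])
         for c in range(0, len(image[0]), 2)]
        for r in range(0, len(image), 2)
    ]


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
]
pooled = max_pool_2x2(feature_map)  # 4x4 -> 2x2: [[4, 2], [2, 8]]
```

Each pooling step halves the height and width, which is how the encoder shrinks the spatial size of the input.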

13
Q

What is the role of the Decoder in a Convolutional Autoencoder?

A

It reconstructs the image using Conv2D layers and upsampling methods.
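
As a sketch of one fixed (non-learnable) upsampling method, nearest-neighbour upsampling simply repeats each pixel. The function name is an assumption; learnable alternatives such as transposed convolution have trained weights and are not shown here.

```python
def upsample_nearest_2x(image):
    """Double height and width by repeating each pixel in a 2x2 block."""
    out = []
    for row in image:
        wide = [p for p in row for _ in range(2)]  # repeat each pixel horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat the widened row vertically
    return out


small = [[4, 2], [2, 8]]
restored = upsample_nearest_2x(small)  # 2x2 -> 4x4
```

The spatial size is restored, but the detail discarded during compression is not; the decoder's convolution layers are what learn to fill it back in.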

14
Q

True or False: Learnable upsampling in a Decoder leads to better performance than fixed upsampling.

A

True.

15
Q

What are the applications of Autoencoders?

A
  • Denoising
  • Compression
  • Image Colourisation
  • Anomaly Detection
  • Feature Extraction
16
Q

Fill in the blank: In anomaly detection, a large __________ error indicates a likely anomaly.

A

reconstruction
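
A minimal sketch of the idea, assuming the reconstructions have already been produced by some trained autoencoder: compute a per-sample error and flag anything above a threshold. The threshold value and function names are hypothetical; in practice the threshold would be tuned on known-normal data.

```python
def reconstruction_error(x, x_hat):
    """Mean squared error between an input and its reconstruction (flat lists)."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def flag_anomalies(inputs, reconstructions, threshold):
    """Return one boolean per sample: True where the error exceeds the threshold."""
    return [reconstruction_error(x, r) > threshold for x, r in zip(inputs, reconstructions)]


inputs = [[0.1, 0.2, 0.3], [0.9, 0.1, 0.8]]
reconstructions = [[0.1, 0.2, 0.3], [0.2, 0.6, 0.1]]  # second sample is rebuilt poorly
flags = flag_anomalies(inputs, reconstructions, threshold=0.05)  # [False, True]
```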

17
Q

What was the result of using a Convolutional Autoencoder on medical ultrasound images?

A

It successfully removed the added annotations and noise.

18
Q

What is the goal of a Computer Vision Pipeline?

A

Get machines to ‘see’ and understand images/videos.

19
Q

What are the three levels of tasks in Computer Vision?

A
  • Low-level
  • Mid-level
  • High-level
20
Q

What are examples of low-level tasks in Computer Vision?

A
  • Edge detection
  • Texture analysis
  • Colour analysis
21
Q

What are examples of mid-level tasks in Computer Vision?

A
  • Segmentation
  • Object tracking
22
Q

What are examples of high-level tasks in Computer Vision?

A
  • Object recognition
  • Scene understanding
23
Q

What is Image Segmentation?

A

Splitting an image into meaningful parts.

24
Q

What are the types of Image Segmentation?

A
  • Unsupervised Segmentation
  • Supervised Segmentation
  • Semantic Segmentation
  • Instance Segmentation
25
What is Unsupervised Segmentation?
No labels, cluster-based.
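One possible cluster-based approach, sketched here as an assumption rather than the method the card has in mind, is k-means on pixel intensities: no labels are used, and pixels are grouped purely by similarity. The function name and pixel values are illustrative.

```python
def kmeans_1d(values, k=2, iters=10):
    """Cluster scalar intensities with plain k-means (k >= 2); returns (labels, centres)."""
    lo, hi = min(values), max(values)
    centres = [lo + (hi - lo) * i / (k - 1) for i in range(k)]  # spread initial centres
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each pixel joins its nearest centre.
        labels = [min(range(k), key=lambda j: abs(v - centres[j])) for v in values]
        # Update step: each centre moves to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:
                centres[j] = sum(members) / len(members)
    return labels, centres


pixels = [10, 12, 11, 200, 198, 205, 9, 202]  # dark background, bright object
labels, centres = kmeans_1d(pixels)
```

Here the dark and bright pixels end up in separate clusters without any labelled examples.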
26
What is Supervised Segmentation?
Learn from labelled data.
27
What is Semantic Segmentation?
Label each pixel with a class (e.g., 'car').
28
What is Instance Segmentation?
Separates individual objects (e.g., car #1 vs car #2).
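To show how the two outputs differ, the sketch below turns a semantic mask (every 'car' pixel marked 1) into per-object ids using connected components. This is only an illustration of the output format: it works when objects of the same class do not touch, and it is not how learned instance segmentation models operate. The function name and mask are made up.

```python
from collections import deque


def label_instances(mask):
    """Give each 4-connected blob of 1s in a binary mask its own positive id."""
    h, w = len(mask), len(mask[0])
    ids = [[0] * w for _ in range(h)]
    next_id = 0
    for r in range(h):
        for c in range(w):
            if mask[r][c] and not ids[r][c]:
                next_id += 1                      # found a new, unlabelled object
                ids[r][c] = next_id
                queue = deque([(r, c)])
                while queue:                      # flood-fill the rest of the blob
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not ids[ny][nx]:
                            ids[ny][nx] = next_id
                            queue.append((ny, nx))
    return ids, next_id


car_mask = [  # semantic mask: 1 = 'car', 0 = background
    [1, 1, 0, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 1, 1],
]
instance_ids, n_cars = label_instances(car_mask)  # car #1 gets id 1, car #2 gets id 2
```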
29
What does R-CNN stand for?
Region-based CNN.
30
What is the primary function of R-CNN?
Detect objects in images using bounding boxes.
31
What is the pipeline of R-CNN?
  • Input image
  • Generate ~2000 region proposals (Selective Search)
  • Classify each region using a CNN
  • Refine bounding boxes
32
Why is R-CNN considered slow?
  • Runs the CNN on each of the ~2000 regions separately
  • Selective Search is not learnable
  • Trains 3 models: CNN + Classifier + Bounding Box Regressor
33
What are the key changes in Fast R-CNN?
  • Runs the CNN once on the whole image
  • Extracts a feature map
  • Regions of Interest (RoI) are pulled from that map
  • Everything is trained end-to-end in a single model
34
What are the pros of Fast R-CNN?
  • Much faster than R-CNN
  • More efficient learning
35
What are the cons of Fast R-CNN?
Still uses Selective Search, which is slow.
36
What major improvement does Faster R-CNN introduce?
Adds a Region Proposal Network (RPN).
37
How does the Region Proposal Network (RPN) work?
  • CNN → Feature Map
  • RPN slides across the map and places anchors (reference boxes) at each position
  • For each anchor, predicts whether it contains an object and how to adjust the box to fit it
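The anchor-creation step can be sketched as follows. The stride of 16, the three scales, and the three aspect ratios are illustrative assumptions, as is the function name; the prediction step (objectness score and box adjustment per anchor) is a learned network and is not shown.

```python
def generate_anchors(feat_h, feat_w, stride=16, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return anchors as (x1, y1, x2, y2) in image coordinates, one set per feature-map cell."""
    anchors = []
    for row in range(feat_h):
        for col in range(feat_w):
            cx, cy = (col + 0.5) * stride, (row + 0.5) * stride  # cell centre in the image
            for scale in scales:
                for ratio in ratios:  # ratio = height / width; area stays scale ** 2
                    w = scale / ratio ** 0.5
                    h = scale * ratio ** 0.5
                    anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


anchors = generate_anchors(feat_h=2, feat_w=3)  # 2 * 3 cells, 9 anchors each
```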
38
What is the purpose of RoI Pooling?
Converts different-sized regions into fixed-size feature maps.
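A simplified sketch of the idea: split each region into a fixed grid of bins and keep the maximum in each bin, so every region yields the same output size. The integer bin boundaries, the end-exclusive `(r0, c0, r1, c1)` region format, and the function name are assumptions made for this illustration.

```python
def roi_max_pool(feature_map, roi, out_size=2):
    """Max-pool the non-empty region roi = (r0, c0, r1, c1) into an out_size x out_size grid."""
    r0, c0, r1, c1 = roi
    h, w = r1 - r0, c1 - c0
    pooled = []
    for i in range(out_size):
        # Split the region into roughly equal bins; each bin keeps at least one cell.
        rs = r0 + (i * h) // out_size
        re = max(rs + 1, r0 + ((i + 1) * h) // out_size)
        row = []
        for j in range(out_size):
            cs = c0 + (j * w) // out_size
            ce = max(cs + 1, c0 + ((j + 1) * w) // out_size)
            row.append(max(feature_map[r][c] for r in range(rs, re) for c in range(cs, ce)))
        pooled.append(row)
    return pooled


fm = [
    [1, 2, 3, 4, 5, 6],
    [7, 8, 9, 1, 2, 3],
    [4, 5, 6, 7, 8, 9],
    [1, 3, 5, 7, 9, 2],
]
small = roi_max_pool(fm, (0, 0, 2, 2))  # 2x2 region -> 2x2 output
large = roi_max_pool(fm, (0, 1, 4, 6))  # 4x5 region -> 2x2 output
```

Both regions come out as 2x2 grids, so the layers that follow can expect a fixed input size.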
39
What is U-Net designed for?
Pixel-wise labelling (segmentation) of medical images.
40
What is the structure of U-Net?
  • Downsampling path
  • Upsampling path
  • Skip connections between matching levels
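The shape bookkeeping behind this structure can be traced with a short sketch. The depth, the channel counts, and the assumption that convolutions preserve spatial size (so skip features line up without cropping) are illustrative choices, not details from the card.

```python
def unet_shapes(height, width, depth=3, base_channels=64):
    """Trace (height, width, channels) through a U-Net-style encoder and decoder."""
    encoder, h, w, ch = [], height, width, base_channels
    for _ in range(depth):
        encoder.append((h, w, ch))            # features saved for the skip connection
        h, w, ch = h // 2, w // 2, ch * 2     # downsampling halves size, doubles channels
    decoder = []
    for skip_h, skip_w, skip_ch in reversed(encoder):
        h, w, ch = h * 2, w * 2, ch // 2      # upsampling doubles size, halves channels
        assert (h, w) == (skip_h, skip_w), "input size must be divisible by 2 ** depth"
        decoder.append((h, w, ch + skip_ch))  # concatenate skip features along channels
    return encoder, decoder


encoder, decoder = unet_shapes(128, 128)
```

Each decoder level has the same spatial size as its matching encoder level, which is what lets the skip connection concatenate the two.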
41
What are the pros of U-Net?
  • No Dense layers
  • Any input size allowed
  • Combines location + context
42
What is Mask R-CNN built upon?
Faster R-CNN.
43
What additional feature does Mask R-CNN provide?
A branch for pixel-wise binary masks.
44
What outputs does Mask R-CNN provide?
  • Class
  • Bounding box
  • Object shape
45
What is the main purpose of R-CNN?
Object Detection.
46
What is the main purpose of Fast R-CNN?
Faster Detection.
47
What is the main purpose of Faster R-CNN?
Fully learnable Detection.
48
What is the main purpose of U-Net?
Semantic Segmentation.
49
What is the main purpose of Mask R-CNN?
Instance Segmentation.
50
What is a key strength of R-CNN?
Accurate but slow.
51
What is a key strength of Fast R-CNN?
Shared CNN pass.
52
What is a key strength of Faster R-CNN?
Adds RPN.
53
What is a key strength of U-Net?
Flexible, great for medical applications.
54
What is a key strength of Mask R-CNN?
Adds mask branch to Faster R-CNN.