Autoencoders and Computer Vision Flashcards by Franklin Hole

What is Dimensionality Reduction?

Shrinking your data without losing its meaning.

What is the Curse of Dimensionality?

As the number of features grows, the data becomes sparse: a fixed number of samples covers less and less of the feature space.

What is an Autoencoder?

A special kind of neural network that learns to compress then rebuild data.

What is the key idea behind an Autoencoder?

Learn a smart encoding of the input, then use that encoding to reconstruct the original.

What are the components of an Autoencoder?

Encoder: Compresses input into smaller vector
Latent Space (Code): The compressed form
Decoder: Reconstructs the original from the code
Loss: Measures how close output is to original (e.g., MSE)

What is the goal of an Autoencoder?

Minimise the difference between input and output.
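
As a minimal sketch of that goal, the snippet below computes the mean squared error (the MSE loss named on the components card) between an input and two candidate reconstructions. The function name `mse` and the sample values are illustrative assumptions.

```python
import numpy as np

def mse(x, x_hat):
    """Mean squared error between an input and its reconstruction."""
    x = np.asarray(x, dtype=float)
    x_hat = np.asarray(x_hat, dtype=float)
    return float(np.mean((x - x_hat) ** 2))

x = np.array([0.0, 0.5, 1.0, 0.25])        # original input (made-up values)
perfect = x.copy()                         # ideal reconstruction
rough = np.array([0.1, 0.4, 0.8, 0.25])    # imperfect reconstruction

print(mse(x, perfect))  # exactly zero: nothing to minimise
print(mse(x, rough))    # roughly 0.015: training would push this down
```

The lower the value, the closer the output is to the input.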

Fill in the blank: The __________ is the compressed form in an Autoencoder.

Latent Space (Code)

What is the architecture of an Autoencoder?

Example: input size = 784 (e.g., a flattened 28×28 image), hidden size = 128, code size = 32.
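
To make those sizes concrete, here is a rough NumPy sketch of an untrained forward pass through a 784 → 128 → 32 → 128 → 784 network. The mirrored decoder, the ReLU and sigmoid activations, and the random weights are assumptions for illustration; the card only gives the three sizes.

```python
import numpy as np

rng = np.random.default_rng(0)
sizes = [784, 128, 32, 128, 784]  # input -> hidden -> code -> hidden -> output

# One (weights, bias) pair per layer; random values stand in for trained parameters.
layers = [(rng.normal(0.0, 0.01, (m, n)), np.zeros(n))
          for m, n in zip(sizes[:-1], sizes[1:])]

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def encode(x):
    """Compress a batch of 784-d inputs into 32-d codes."""
    h = relu(x @ layers[0][0] + layers[0][1])
    return relu(h @ layers[1][0] + layers[1][1])

def decode(code):
    """Expand 32-d codes back to 784-d outputs in (0, 1)."""
    h = relu(code @ layers[2][0] + layers[2][1])
    return sigmoid(h @ layers[3][0] + layers[3][1])

x = rng.random((5, 784))          # batch of 5 flattened inputs in [0, 1]
code = encode(x)
x_hat = decode(code)
n_params = sum(w.size + b.size for w, b in layers)

print(code.shape, x_hat.shape)    # (5, 32) (5, 784)
print(n_params)                   # 209968
```

Training would then adjust the weights so that `x_hat` moves closer to `x`.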

What is a Denoising Autoencoder?

An autoencoder trained to remove noise from input images.

What does the input and target look like for a Denoising Autoencoder?

Input = Noisy image
Target = Clean image
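
One way such a training pair could be built is sketched below. Gaussian noise, the `noise_std` value, and the helper name `make_denoising_pair` are assumptions; the card does not say which kind of noise is used.

```python
import numpy as np

def make_denoising_pair(clean, noise_std=0.2, seed=0):
    """Return (noisy input, clean target) for one image with values in [0, 1]."""
    rng = np.random.default_rng(seed)
    noisy = clean + rng.normal(0.0, noise_std, size=clean.shape)
    noisy = np.clip(noisy, 0.0, 1.0)   # keep pixel values in the valid range
    return noisy, clean                # the target is the untouched clean image

clean = np.full((28, 28), 0.5)         # placeholder grey image
noisy, target = make_denoising_pair(clean)
print(noisy.shape, target.shape)       # (28, 28) (28, 28)
```

The loss is then computed between the network's output for `noisy` and `target`, not between the output and `noisy`.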

What is a Convolutional Autoencoder (CAE)?

An autoencoder for image data built from Conv2D layers.

What is the role of the Encoder in a Convolutional Autoencoder?

It uses Conv2D layers and MaxPooling2D to compress the input.
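
The compression comes mostly from the pooling step. The sketch below shows only 2×2 max pooling in plain NumPy (the convolution itself is left out), to illustrate how each pooling layer halves the height and width. The function name is hypothetical, not a Keras API.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 on a single-channel (H, W) map."""
    h, w = feature_map.shape
    h2, w2 = h // 2, w // 2
    trimmed = feature_map[:h2 * 2, :w2 * 2]   # drop an odd last row/column
    # Group pixels into 2x2 blocks, then take the maximum of each block.
    return trimmed.reshape(h2, 2, w2, 2).max(axis=(1, 3))

fm = np.arange(16).reshape(4, 4)
pooled = max_pool_2x2(fm)
print(pooled)
# [[ 5  7]
#  [13 15]]
```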

What is the role of the Decoder in a Convolutional Autoencoder?

It reconstructs the image using Conv2D layers and upsampling methods.
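
As an illustration of the simplest upsampling method, the sketch below implements fixed nearest-neighbour upsampling in NumPy: every value is repeated into a 2×2 block. This is the non-learnable kind; a learnable alternative such as a transposed convolution would have trainable weights instead. The function name is an assumption.

```python
import numpy as np

def upsample_nearest(feature_map, factor=2):
    """Nearest-neighbour upsampling of an (H, W) map by an integer factor."""
    rows = np.repeat(feature_map, factor, axis=0)   # duplicate each row
    return np.repeat(rows, factor, axis=1)          # duplicate each column

code = np.array([[1, 2],
                 [3, 4]])
up = upsample_nearest(code)
print(up)
# [[1 1 2 2]
#  [1 1 2 2]
#  [3 3 4 4]
#  [3 3 4 4]]
```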

True or False: Learnable upsampling in a Decoder leads to better performance than fixed upsampling.

True.

What are the applications of Autoencoders?

Denoising
Compression
Image Colourisation
Anomaly Detection
Feature Extraction

Fill in the blank: In anomaly detection, a large __________ error indicates a likely anomaly.

reconstruction
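
A small sketch of that idea: compute the per-sample reconstruction error and flag samples above a threshold. The reconstructions here are made-up numbers standing in for an autoencoder's output, and the threshold of 0.05 is an arbitrary assumption; in practice it would be chosen from errors on normal data.

```python
import numpy as np

def flag_anomalies(inputs, reconstructions, threshold):
    """Return per-sample MSE and a boolean mask of likely anomalies."""
    errors = np.mean((inputs - reconstructions) ** 2, axis=1)
    return errors, errors > threshold

inputs = np.array([[0.2, 0.4, 0.6],    # reconstructed well
                   [0.1, 0.9, 0.5]])   # reconstructed poorly
recon = np.array([[0.2, 0.4, 0.5],
                  [0.8, 0.1, 0.5]])

errors, is_anomaly = flag_anomalies(inputs, recon, threshold=0.05)
print(is_anomaly)  # [False  True]
```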

What was the result of using a Convolutional Autoencoder on medical ultrasound images?

Successfully removed added annotations & noise.

What is the goal of a Computer Vision Pipeline?

Get machines to ‘see’ and understand images/videos.

What are the three levels of tasks in Computer Vision?

Low-level
Mid-level
High-level

What are examples of low-level tasks in Computer Vision?

Edge detection
Texture analysis
Color analysis
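
Edge detection can be sketched with a Sobel filter, one common choice that the card itself does not name. The snippet slides a 3×3 horizontal-gradient kernel over a tiny image (cross-correlation, as CNN layers do) and responds only where dark meets bright.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

def filter2d_valid(image, kernel):
    """Slide the kernel over the image without padding (cross-correlation)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.zeros((5, 6))
image[:, 3:] = 1.0                      # dark left half, bright right half
edges = np.abs(filter2d_valid(image, SOBEL_X))
print(edges)
# [[0. 4. 4. 0.]
#  [0. 4. 4. 0.]
#  [0. 4. 4. 0.]]
```

The non-zero columns mark the vertical edge between the two halves.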

What are examples of mid-level tasks in Computer Vision?

Segmentation
Object tracking

What are examples of high-level tasks in Computer Vision?

Object recognition
Scene understanding

What is Image Segmentation?

Segmenting = Splitting an image into meaningful parts.

What are the types of Image Segmentation?

Unsupervised Segmentation
Supervised Segmentation
Semantic Segmentation
Instance Segmentation

What is Unsupervised Segmentation?

No labels, cluster-based.
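
A toy sketch of cluster-based segmentation: k-means on raw pixel intensities, with no labels involved. K-means is one possible clustering method chosen here for illustration; the function name and the sample image are assumptions.

```python
import numpy as np

def kmeans_segment(image, k=2, n_iters=10):
    """Cluster pixel intensities into k groups; returns a label map and centres."""
    pixels = image.reshape(-1).astype(float)
    centers = np.linspace(pixels.min(), pixels.max(), k)   # simple initialisation
    for _ in range(n_iters):
        # Assign every pixel to its nearest centre.
        labels = np.argmin(np.abs(pixels[:, None] - centers[None, :]), axis=1)
        # Move each centre to the mean of its assigned pixels.
        for c in range(k):
            if np.any(labels == c):
                centers[c] = pixels[labels == c].mean()
    return labels.reshape(image.shape), centers

image = np.array([[0.10, 0.20, 0.90],
                  [0.15, 0.85, 0.95]])
labels, centers = kmeans_segment(image)
print(labels)
# [[0 0 1]
#  [0 1 1]]
```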

What is Supervised Segmentation?

Learn from labeled data.

What is Semantic Segmentation?

Label each pixel with a class (e.g., 'car').

What is Instance Segmentation?

Separates individual objects (e.g., car #1 vs car #2).
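
The difference in output can be illustrated on a toy mask. Below, a semantic mask marks every 'car' pixel with 1, and a flood fill gives each connected blob its own id. This is only an illustration of what the two label maps look like: real instance segmentation models do not work this way, and touching objects would wrongly merge under this approach.

```python
import numpy as np
from collections import deque

def label_instances(semantic_mask):
    """Give each 4-connected blob of foreground pixels its own integer id."""
    h, w = semantic_mask.shape
    instances = np.zeros((h, w), dtype=int)
    next_id = 0
    for r in range(h):
        for c in range(w):
            if semantic_mask[r, c] and instances[r, c] == 0:
                next_id += 1
                instances[r, c] = next_id
                queue = deque([(r, c)])
                while queue:                      # breadth-first flood fill
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and semantic_mask[ny, nx]
                                and instances[ny, nx] == 0):
                            instances[ny, nx] = next_id
                            queue.append((ny, nx))
    return instances, next_id

semantic = np.array([[1, 1, 0, 0, 0],    # 1 = 'car', 0 = background
                     [1, 1, 0, 1, 1],
                     [0, 0, 0, 1, 1]])
instances, count = label_instances(semantic)
print(instances)
# [[1 1 0 0 0]
#  [1 1 0 2 2]
#  [0 0 0 2 2]]
```

The semantic mask says only "car or not"; the instance map separates car #1 from car #2.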

What does R-CNN stand for?

Region-based CNN.

What is the primary function of R-CNN?

Detect objects in images using bounding boxes.

What is the pipeline of R-CNN?

Input image
Generate ~2000 region proposals (Selective Search)
Classify each region using a CNN
Refine bounding boxes

Why is R-CNN considered slow?

Classifies all ~2000 regions separately
Selective Search is not learnable
Trains 3 models: CNN + Classifier + Bounding Box Regressor

What are the key changes in Fast R-CNN?

Runs the CNN once on the whole image
Extracts a feature map
Regions of Interest (RoI) are pulled from that map
Everything is trained end-to-end in a single model

What are the pros of Fast R-CNN?

Way faster
More efficient learning

What are the cons of Fast R-CNN?

Still uses Selective Search, which is slow.

What major improvement does Faster R-CNN introduce?

Adds a Region Proposal Network (RPN).

How does the Region Proposal Network (RPN) work?

CNN → Feature Map
RPN slides across the map and creates anchors
Predicts which anchors contain an object and how well each one fits
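
The anchor-creation step can be sketched as follows: at every feature-map cell, place boxes of several sizes and shapes centred on that cell. The stride, scales, and aspect ratios below are assumed values for illustration, and the function name is hypothetical. The prediction step (objectness and box adjustment per anchor) is not shown.

```python
import numpy as np

def generate_anchors(fm_h, fm_w, stride=16, scales=(128, 256), ratios=(0.5, 1.0, 2.0)):
    """Return (N, 4) anchor boxes as [x_min, y_min, x_max, y_max] in image pixels."""
    anchors = []
    for i in range(fm_h):
        for j in range(fm_w):
            # Centre of this feature-map cell, mapped back to image coordinates.
            cy, cx = (i + 0.5) * stride, (j + 0.5) * stride
            for s in scales:
                for r in ratios:
                    w = s * np.sqrt(r)     # ratio r = width / height
                    h = s / np.sqrt(r)     # area stays at s * s
                    anchors.append([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2])
    return np.array(anchors)

anchors = generate_anchors(3, 4)
print(anchors.shape)  # (72, 4): 3 * 4 cells, 6 anchors per cell
```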

What is the purpose of RoI Pooling?

Converts different-sized regions into fixed-size feature maps.
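
A simplified single-channel sketch of the idea: split each region into a fixed grid of bins and keep the maximum of every bin, so regions of any size come out the same shape. The bin-edge rounding here is one simple choice and may differ from a given library's implementation; it assumes each region is at least as large as the output grid.

```python
import numpy as np

def roi_max_pool(feature_map, roi, output_size=(2, 2)):
    """Max-pool one region of an (H, W) map into a fixed output grid.

    roi = (x0, y0, x1, y1) in feature-map cells, with x1 and y1 exclusive.
    """
    x0, y0, x1, y1 = roi
    region = feature_map[y0:y1, x0:x1]
    out_h, out_w = output_size
    # Bin boundaries: split the region into out_h x out_w roughly equal cells.
    row_edges = np.linspace(0, region.shape[0], out_h + 1).astype(int)
    col_edges = np.linspace(0, region.shape[1], out_w + 1).astype(int)
    pooled = np.zeros(output_size)
    for i in range(out_h):
        for j in range(out_w):
            cell = region[row_edges[i]:row_edges[i + 1],
                          col_edges[j]:col_edges[j + 1]]
            pooled[i, j] = cell.max()
    return pooled

fm = np.arange(36).reshape(6, 6)
pooled_a = roi_max_pool(fm, (0, 0, 4, 4))   # 4x4 region
pooled_b = roi_max_pool(fm, (1, 2, 6, 5))   # 3x5 region
print(pooled_a.shape, pooled_b.shape)       # (2, 2) (2, 2)
```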

What is U-Net designed for?

Label every pixel in medical images.

What is the structure of U-Net?

Downsampling path
Upsampling path
Skip connections between matching levels
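
A shape-only sketch of one level of that structure: save the features on the way down, pool, upsample, then concatenate the saved features with the upsampled ones along the channel axis. Convolutions are left out, and max pooling plus nearest-neighbour upsampling are assumed stand-ins for whatever layers a real U-Net uses.

```python
import numpy as np

def downsample(x):
    """2x2 max pooling on an (H, W, C) array; H and W must be even."""
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).max(axis=(1, 3))

def upsample(x):
    """Nearest-neighbour upsampling by 2 on an (H, W, C) array."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

x = np.random.default_rng(0).random((8, 8, 4))
skip = x                       # features saved on the downsampling path
bottom = downsample(x)         # (4, 4, 4): coarser, more context
up = upsample(bottom)          # (8, 8, 4): back at the original resolution
merged = np.concatenate([skip, up], axis=-1)   # skip connection

print(bottom.shape, up.shape, merged.shape)    # (4, 4, 4) (8, 8, 4) (8, 8, 8)
```

The merged tensor carries both the fine location detail from `skip` and the coarser context from `up`.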

What are the pros of U-Net?

No Dense layers
Any input size allowed
Combines location + context

What is Mask R-CNN built upon?

Faster R-CNN.

What additional feature does Mask R-CNN provide?

A branch for pixel-wise binary masks.
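
As a small illustration of what "binary mask" means here, the sketch below turns per-pixel mask probabilities for one detected object into a 0/1 mask. The probabilities are made up, and the 0.5 threshold is an assumed, commonly used cut-off rather than something the card specifies.

```python
import numpy as np

def binarize_mask(mask_probs, threshold=0.5):
    """Convert per-pixel object probabilities into a binary (0/1) mask."""
    return (mask_probs >= threshold).astype(np.uint8)

probs = np.array([[0.10, 0.70],
                  [0.50, 0.49]])   # hypothetical mask-branch output for one RoI
mask = binarize_mask(probs)
print(mask)
# [[0 1]
#  [1 0]]
```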

What outputs does Mask R-CNN provide?

Class
Bounding box
Object shape

What is the main purpose of R-CNN?

Object Detection.

What is the main purpose of Fast R-CNN?

Faster Detection.

What is the main purpose of Faster R-CNN?

Fully learnable Detection.

What is the main purpose of U-Net?

Semantic Segmentation.

What is the main purpose of Mask R-CNN?

Instance Segmentation.

What is a key strength of R-CNN?

Accurate but slow.

What is a key strength of Fast R-CNN?

Shared CNN pass.

What is a key strength of Faster R-CNN?

Adds RPN.

What is a key strength of U-Net?

Flexible, great for medical applications.

What is a key strength of Mask R-CNN?

Adds mask branch to Faster R-CNN.