CVI FINAL EXAM DAY Flashcards by Amélie Avery

What is the key difference between RoI Pooling and RoI Align?

RoI Pooling rounds RoI and bin coordinates to integer feature-map cells, causing misalignment, while RoI Align keeps fractional coordinates and uses bilinear interpolation to preserve spatial accuracy
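A minimal sketch of the difference, assuming a toy 4×4 feature map stored as a list of rows. The function names and the RoI coordinate are hypothetical, and real RoI Pooling also pools over bins; this only shows the effect of rounding versus interpolating at one sample point.

```python
import math

def quantised_sample(feature, y, x):
    """RoI Pooling style: snap the coordinate to an integer cell first."""
    return feature[int(math.floor(y))][int(math.floor(x))]

def bilinear_sample(feature, y, x):
    """RoI Align style: bilinearly interpolate at the fractional (y, x)."""
    h, w = len(feature), len(feature[0])
    y0 = min(max(int(math.floor(y)), 0), h - 1)
    x0 = min(max(int(math.floor(x)), 0), w - 1)
    y1 = min(y0 + 1, h - 1)
    x1 = min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy

# A ramp feature map whose value equals its column index.
feature = [[float(c) for c in range(4)] for _ in range(4)]
roi_x = 1.6  # hypothetical RoI coordinate after dividing by the stride

pooled = quantised_sample(feature, 1.0, roi_x)   # reads column 1
aligned = bilinear_sample(feature, 1.0, roi_x)   # blends columns 1 and 2
```

On the ramp, the interpolated value tracks the true position, while the quantised value is off by the discarded fraction.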

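Why is SURF considered faster than SIFT?

SURF approximates Gaussian derivative filters with box filters evaluated on integral images, while SIFT builds a full Difference of Gaussians scale space and computes gradient orientation histograms, making it slower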
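A minimal sketch of why integral images make box filters cheap: once the table is built, the sum over any rectangle takes four lookups regardless of its size. The function names are hypothetical and this is not SURF itself, only the building block it relies on.

```python
def integral_image(img):
    """Build a (h+1) x (w+1) summed-area table with a zero first row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        for c in range(w):
            ii[r + 1][c + 1] = img[r][c] + ii[r][c + 1] + ii[r + 1][c] - ii[r][c]
    return ii

def box_sum(ii, top, left, bottom, right):
    """Sum of rows top..bottom-1 and columns left..right-1 using four lookups."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ii = integral_image(img)
lower_right = box_sum(ii, 1, 1, 3, 3)  # 5 + 6 + 8 + 9
```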

What kind of transformation does a Region Proposal Network (RPN) learn?

It learns to transform anchor boxes into tighter object proposals by predicting shifts of the box centre and scalings of its width and height, together with an objectness score
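A minimal sketch of the shift-and-scale step, assuming the common parameterisation in which centre offsets are relative to the anchor size and width and height are scaled through an exponential. The anchor and delta values are made up for illustration.

```python
import math

def decode_anchor(anchor, deltas):
    """Apply predicted (tx, ty, tw, th) to an anchor given as (cx, cy, w, h)."""
    cx, cy, w, h = anchor
    tx, ty, tw, th = deltas
    return (cx + tx * w,          # shift centre x by a fraction of anchor width
            cy + ty * h,          # shift centre y by a fraction of anchor height
            w * math.exp(tw),     # scale width
            h * math.exp(th))     # scale height

anchor = (50.0, 50.0, 32.0, 32.0)                 # hypothetical anchor box
deltas = (0.25, -0.125, math.log(2.0), 0.0)       # hypothetical network output
proposal = decode_anchor(anchor, deltas)
```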

Why is C3D not used for optical flow estimation?

Because C3D performs video classification, not pixelwise motion estimation, so it does not output motion vectors

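How does DeepLab use atrous convolution and ASPP for segmentation?

It uses atrous (dilated) convolution to enlarge the receptive field without adding parameters or reducing resolution, and ASPP (Atrous Spatial Pyramid Pooling) to apply parallel atrous convolutions at several rates and capture multiscale context features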
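A small worked example of how the dilation rate enlarges the receptive field of a single kernel. The rates below are illustrative ASPP-style values, not taken from the card.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span covered by a dilated kernel: k + (k - 1) * (rate - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)

rates = [1, 6, 12, 18]  # illustrative rates for parallel ASPP branches
sizes = [effective_kernel_size(3, r) for r in rates]
```

Each branch still has only 3 × 3 weights per channel pair, but sees a progressively wider context.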

Fast R-CNN

Deep learning for object detection that runs the CNN once per image and pools features for each region with RoI Pooling

Faster R-CNN

Deep learning for object detection with learned region proposals

R-CNN

Deep learning for object detection that applies a CNN separately to each Selective Search proposal

OmniMotion

Advanced motion estimation using video tracking across occlusions

Lucas Kanade

Traditional optical flow estimation

Horn Schunck

Traditional optical flow estimation

FlowNet

Deep learning for optical flow estimation

VoxelMorph

Deep learning for image registration

What does Otsu's method actually optimise during threshold selection?

It maximises the between-class (inter-class) variance of the foreground and background pixel distributions, which is equivalent to minimising the within-class variance
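A minimal sketch of the search, assuming integer grey levels in 0..255 and a flat list of pixel values. The function name is hypothetical; the score used is proportional to the between-class variance w0 · w1 · (μ0 − μ1)².

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximises between-class variance.

    Pixels with value <= t form one class, the rest form the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * hist[i] for i in range(levels))

    best_t, best_var = 0, -1.0
    count0, sum0 = 0, 0
    for t in range(levels):
        count0 += hist[t]
        sum0 += t * hist[t]
        count1 = total - count0
        if count0 == 0:
            continue          # no pixels in the lower class yet
        if count1 == 0:
            break             # every pixel is already in the lower class
        mean0 = sum0 / count0
        mean1 = (sum_all - sum0) / count1
        # Proportional to w0 * w1 * (mu0 - mu1)^2, the between-class variance.
        between = count0 * count1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t

pixels = [10, 12, 11, 200, 205, 198]   # made-up dark and bright clusters
threshold = otsu_threshold(pixels)
```

The returned threshold separates the dark cluster from the bright one.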

Name one deep learning model commonly used for image registration.

VoxelMorph

Prewitt

Traditional edge detection using gradient approximation

AlexNet

Deep learning for image classification

What is the difference between self-supervised and unsupervised learning?

Self-supervised learning trains on pseudo-labels derived from the data itself (a pretext task), while unsupervised learning finds structure in the data without any labels

Robinson

Traditional edge detection using compass kernels

C3D

Deep learning for video classification and action recognition

How do skip connections in U-Net help during segmentation?

They pass high-resolution features from the encoder to the decoder to preserve localisation and detail

HED

Deep learning for edge detection using holistically nested networks

DoG

Traditional edge detection using Difference of Gaussians

How does the contrastive loss function mathematically encourage representation learning?

It pulls similar (positive) pairs closer and pushes dissimilar (negative) pairs apart in embedding space: the loss is low when positive pairs have high similarity scores and negative pairs have low ones
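A minimal sketch of one common form, an InfoNCE-style loss with cosine similarity and a temperature. This is an assumed variant chosen for illustration (the card does not say which contrastive loss is meant), and the embeddings are made up.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """-log of the positive pair's share of the total exponentiated similarity."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))

anchor = [1.0, 0.0]
negatives = [[0.0, 1.0], [-1.0, 0.0]]
loss_close = info_nce(anchor, [0.9, 0.1], negatives)  # positive near the anchor
loss_far = info_nce(anchor, [0.1, 0.9], negatives)    # positive far from the anchor
```

The loss is smaller when the positive sits near the anchor, so gradient descent moves positives together and negatives away.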

How are eigenfaces constructed using PCA?

By computing the eigenvectors of the covariance matrix of a mean-centred face dataset; new faces are then represented as the mean face plus a weighted sum of these eigenfaces

When would you use mutual information instead of SSD for image registration?

When the two images come from different modalities or have different intensity scales, since SSD assumes that corresponding pixels have the same intensity
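A minimal sketch of the contrast, assuming two already aligned "images" given as flat lists of discrete intensities. The second is an intensity-inverted copy of the first, standing in for a different modality; the function names are hypothetical.

```python
import math
from collections import Counter

def ssd(a, b):
    """Sum of squared intensity differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mutual_information(a, b):
    """Mutual information (in nats) from the joint intensity histogram."""
    n = len(a)
    joint = Counter(zip(a, b))
    pa, pb = Counter(a), Counter(b)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log(p_xy / ((pa[x] / n) * (pb[y] / n)))
    return mi

fixed = [0, 0, 1, 1, 2, 2, 3, 3]
inverted = [3 - v for v in fixed]   # same structure, inverted intensities

ssd_score = ssd(fixed, inverted)                 # large despite perfect alignment
mi_score = mutual_information(fixed, inverted)   # as high as for identical images
```

SSD penalises the aligned pair because the intensities differ, while mutual information only needs the intensities to be predictable from one another.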

What is the visual output difference between semantic and panoptic segmentation?

Semantic segmentation labels each pixel with a class only, while panoptic segmentation also distinguishes between different object instances, so two objects of the same class get separate labels

U-Net

Deep learning for semantic segmentation

Mask R-CNN

Deep learning for instance segmentation

FlowNet2

Deep learning for optical flow estimation

YOLO

Deep learning for real time object detection

Sobel

Traditional edge detection using gradient approximation

Kirsch

Traditional edge detection using compass kernels

Canny

Traditional edge detection with Gaussian smoothing, gradient computation, non-maximum suppression and hysteresis thresholding

LoG

Traditional edge detection using Laplacian of Gaussian

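Richer Convolutional Features (RCF)

Deep learning based edge detector built on top of CNNs. Not to be confused with "Rich feature hierarchies", the title phrase of the R-CNN object detection paper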

TimeSformer

Transformer based model for video understanding

DyeNet

Deep learning for video object segmentation using appearance and motion cues

Two Stream CNN

Deep learning for action recognition using RGB and optical flow as parallel inputs

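How do you calculate the number of weights in a convolutional layer?

Kernel height × kernel width × number of input channels (colour channels for the first layer) × number of filters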

How do you calculate the number of biases in a convolutional layer?

One bias per filter, so it equals the number of filters
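A small worked example of the weight and bias counts from the two cards above, assuming a hypothetical first layer with 64 filters of size 3 × 3 on an RGB image.

```python
def conv_param_count(kernel_h, kernel_w, in_channels, num_filters):
    """Return (weights, biases) for one convolutional layer."""
    weights = kernel_h * kernel_w * in_channels * num_filters
    biases = num_filters  # one bias per filter
    return weights, biases

weights, biases = conv_param_count(3, 3, 3, 64)
total_params = weights + biases
```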

What are two morphological operators? Outline their purpose.

Dilation: expands the boundaries of foreground (white) regions. Used to fill in small holes or connect nearby segmented regions.
Erosion: shrinks foreground regions by eroding their boundaries. Used to remove small noise or separate objects that are touching.
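A minimal sketch of both operators on a binary image stored as a list of rows, assuming a 3 × 3 square structuring element. The border handling (background outside the image for dilation, foreground for erosion) is one possible convention, chosen here so the border itself does not change the result.

```python
def _apply(img, reducer, pad_value):
    """Slide a 3x3 window over img and reduce each window with min or max."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            window = [
                img[rr][cc] if 0 <= rr < h and 0 <= cc < w else pad_value
                for rr in (r - 1, r, r + 1)
                for cc in (c - 1, c, c + 1)
            ]
            out[r][c] = reducer(window)
    return out

def dilate(img):
    """A pixel becomes foreground if any neighbour is foreground."""
    return _apply(img, max, 0)

def erode(img):
    """A pixel stays foreground only if all neighbours are foreground."""
    return _apply(img, min, 1)

img = [[0] * 5 for _ in range(5)]
img[2][2] = 1                      # a single isolated foreground pixel

dilated = dilate(img)              # grows into a 3x3 block
eroded = erode(img)                # the isolated pixel is removed as noise
```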

What region proposal method does R-CNN use?

Selective Search

How does R-CNN extract features?

It applies a CNN separately to each proposed region

What is the output of R-CNN?

Class label and bounding box for each region

What region proposal method does Fast R-CNN use?

Selective Search

How does Fast R-CNN extract features?

It applies a CNN to the entire image once and uses RoI Pooling on the feature map

What is the output of Fast R-CNN?

Class label and bounding box for each region

What region proposal method does Faster R-CNN use?

Region Proposal Network (RPN)

How does Faster R-CNN extract features?

It uses a CNN on the entire image and applies RoI Pooling to shared feature maps

What is the output of Faster R-CNN?

Class label and bounding box for each region

What region proposal method does Mask R-CNN use?

Region Proposal Network (RPN)

How does Mask R-CNN extract features?

It uses a CNN on the entire image and applies RoI Align to shared feature maps

What is the output of Mask R-CNN?

Class label, bounding box and segmentation mask for each region

What is a feature map?

The output of a convolutional layer in a CNN. It is a 3D tensor [Height × Width × Channels]; each channel responds to a specific feature (edge, curve, texture) and shows how strongly that pattern is present in each area.

What is spatial resolution?

The size (in pixels) of the output feature map at any layer, i.e. how big the output grid is. It typically decreases as you go deeper in a CNN because of strided convolutions and pooling.
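A small worked example of how the output grid shrinks, using the standard output-size formula for a convolution or pooling layer. The layer settings below are hypothetical, chosen to resemble a typical network stem.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Output length along one axis: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1

after_conv = conv_output_size(224, kernel=7, stride=2, padding=3)
after_pool = conv_output_size(after_conv, kernel=3, stride=2, padding=1)
```

A 224-pixel input side is halved by each stride-2 layer, so the grid is a quarter of the input side after two of them.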

What is a feature hierarchy?

The idea that lower layers detect simple features (edges, corners), while higher layers detect complex features (faces, wheels, textures). This makes CNNs good at generalising across tasks.

(57 cards)