CVI - ExamQuestions I should know Flashcards

Based on 2024 and 2023 exam papers (50 cards)

1
Q

What is the first step of the Canny edge detection algorithm?

A

Perform Gaussian filtering to suppress noise.

2
Q

How does the choice of sigma (𝜎) in Gaussian filtering affect Canny edge detection results?

A

A small 𝜎 preserves fine detail, detecting fine features (but more noise); a large 𝜎 smooths more aggressively, detecting only large-scale edges.

3
Q

What is the second step of the Canny edge detection algorithm?

A

Calculate the gradient magnitude and direction.

4
Q

How is the gradient magnitude (M) at a pixel calculated in Canny edge detection?

A

Using the derivatives along X (Gx) and Y (Gy), the magnitude is calculated as M = sqrt(Gx^2 + Gy^2).
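The magnitude computation can be sketched with simple central differences — a minimal stand-in, since Canny implementations typically use Sobel-style derivative filters on the Gaussian-smoothed image:

```python
import numpy as np

def gradient_magnitude(img):
    """M = sqrt(Gx^2 + Gy^2) from central differences (illustrative only)."""
    img = img.astype(float)
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    gx[:, 1:-1] = (img[:, 2:] - img[:, :-2]) / 2.0   # d/dx on interior columns
    gy[1:-1, :] = (img[2:, :] - img[:-2, :]) / 2.0   # d/dy on interior rows
    return np.sqrt(gx ** 2 + gy ** 2)

step = np.zeros((5, 6))
step[:, 3:] = 10.0                # vertical step edge
M = gradient_magnitude(step)      # high response at the transition columns
```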

5
Q

What is the third step of the Canny edge detection algorithm?

A

Apply non-maximum suppression (NMS) to get a single response for each edge.

6
Q

How does Non-Maximum Suppression (NMS) in Canny edge detection check if an edge is valid?

A

NMS checks if the gradient magnitude at a pixel is a local maximum along the gradient direction, aiming to get a single pixel‑wide response for each edge.
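A simplified sketch of that check, quantising the gradient direction to four bins (real implementations often interpolate between the two neighbours instead of quantising):

```python
import numpy as np

def non_max_suppression(mag, gx, gy):
    """Keep a pixel only if its magnitude is a local maximum along the
    gradient direction (quantised to 0/45/90/135 degrees)."""
    H, W = mag.shape
    out = np.zeros_like(mag)
    angle = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    for i in range(1, H - 1):
        for j in range(1, W - 1):
            a = angle[i, j]
            if a < 22.5 or a >= 157.5:      # gradient ~horizontal
                n1, n2 = mag[i, j - 1], mag[i, j + 1]
            elif a < 67.5:                  # ~45 degrees
                n1, n2 = mag[i - 1, j + 1], mag[i + 1, j - 1]
            elif a < 112.5:                 # gradient ~vertical
                n1, n2 = mag[i - 1, j], mag[i + 1, j]
            else:                           # ~135 degrees
                n1, n2 = mag[i - 1, j - 1], mag[i + 1, j + 1]
            if mag[i, j] >= n1 and mag[i, j] >= n2:
                out[i, j] = mag[i, j]
    return out
```

On a two-pixel-wide response with a horizontal gradient, only the stronger column survives, giving the single-pixel-wide edge.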

7
Q

What is the fourth step of the Canny edge detection algorithm?

A

Perform hysteresis thresholding to select the final edges from the candidate edge pixels.

8
Q

Explain hysteresis thresholding in Canny edge detection.

A

It uses two thresholds, t_high and t_low. Pixels with magnitude > t_high are accepted as edges. Pixels < t_low are rejected. Pixels between t_low and t_high are accepted only if they are connected to an accepted edge (>t_high).
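The two-threshold linking can be sketched as a breadth-first search from the strong pixels — a minimal illustration, not a production implementation:

```python
import numpy as np
from collections import deque

def hysteresis_threshold(mag, t_low, t_high):
    """Pixels above t_high are definite edges (seeds); pixels in
    [t_low, t_high] survive only if 8-connected to a seed."""
    strong = mag > t_high
    weak = mag >= t_low                # includes the strong pixels
    H, W = mag.shape
    edges = strong.copy()
    queue = deque(zip(*np.nonzero(strong)))
    while queue:
        i, j = queue.popleft()
        for di in (-1, 0, 1):          # visit the 8-neighbourhood
            for dj in (-1, 0, 1):
                ni, nj = i + di, j + dj
                if 0 <= ni < H and 0 <= nj < W and weak[ni, nj] and not edges[ni, nj]:
                    edges[ni, nj] = True
                    queue.append((ni, nj))
    return edges
```

A weak pixel next to a strong one is kept; an isolated weak pixel is rejected.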

9
Q

What is a common relationship between the high and low thresholds in Canny hysteresis thresholding?

A

Typically, t_high = 2 * t_low.

10
Q

According to the sources, how are Canny and Sobel described in the context of edge detection?

A

Canny is described as a standard edge detector, while Sobel is listed as a typical operator for edge detection.

11
Q

What is an advantage of the Canny edge detector compared to the Sobel operator?

A

Canny generally produces thinner and more accurate edges and is less sensitive to noise due to its multi‑stage process.

12
Q

What is a disadvantage of the Canny edge detector compared to the Sobel operator?

A

Canny is computationally more complex and requires tuning multiple parameters (Gaussian sigma, high/low thresholds).

13
Q

What is an advantage of the Sobel operator compared to the Canny edge detector?

A

Sobel is simpler and faster to compute.

14
Q

What is a disadvantage of the Sobel operator compared to the Canny edge detector?

A

Sobel is more susceptible to noise and produces thicker edge responses.

15
Q

How can the Sobel operator contribute to detecting diagonal edges?

A

The Sobel operator calculates intensity gradients in the X and Y directions. The resulting gradient magnitude (sqrt(Gx^2 + Gy^2)) is high at pixels where intensity changes rapidly, regardless of orientation, thus highlighting diagonal edges.
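A small numpy sketch of this: correlating the standard Sobel kernels with a diagonal step edge gives a high magnitude along the diagonal and near-zero response in the flat regions (the 6x6 test image and 'valid' correlation are illustrative choices):

```python
import numpy as np

# Standard Sobel kernels for horizontal (Gx) and vertical (Gy) derivatives.
SOBEL_X = np.array([[-1., 0., 1.],
                    [-2., 0., 2.],
                    [-1., 0., 1.]])
SOBEL_Y = SOBEL_X.T

def correlate3x3(img, kernel):
    """'Valid' 3x3 correlation; enough for this small demonstration."""
    H, W = img.shape
    out = np.zeros((H - 2, W - 2))
    for i in range(H - 2):
        for j in range(W - 2):
            out[i, j] = np.sum(img[i:i + 3, j:j + 3] * kernel)
    return out

# Diagonal step edge: the lower triangle of the image is bright.
img = np.tril(np.full((6, 6), 10.0))
gx = correlate3x3(img, SOBEL_X)
gy = correlate3x3(img, SOBEL_Y)
mag = np.sqrt(gx ** 2 + gy ** 2)   # high along the diagonal, ~0 elsewhere
```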

16
Q

Do the sources mention other edge detection operators besides Sobel and Canny that can detect edges at different orientations?

A

Yes, typical operators include Prewitt, Sobel, Robinson, and Kirsch operators.

17
Q

Can a Convolutional Neural Network (CNN) be used for image classification?

A

Yes, Deep Learning models, including CNNs, are widely applied to tasks like image classification.

18
Q

What do CNNs need to perform final image classification?

A

A typical classification network requires fully connected layers and a classification layer (like Softmax) at the end.

19
Q

What is the formula provided for calculating the output dimension of a convolutional layer?

A

The formula is Sout = Floor((Sin + 2*padding - dilation*(kernel_size - 1) - 1) / stride) + 1.
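The formula translates directly into code (it matches, for example, PyTorch's documented `Conv2d` output-size formula); a few sanity checks with common layer settings:

```python
import math

def conv_out_size(s_in, kernel_size, stride=1, padding=0, dilation=1):
    """Sout = floor((Sin + 2*padding - dilation*(kernel_size - 1) - 1) / stride) + 1"""
    return math.floor((s_in + 2 * padding - dilation * (kernel_size - 1) - 1) / stride) + 1
```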

20
Q

How can the network Conv1 -> Conv2 -> Pool1 -> Conv3 be modified to perform image classification with 10 classes?

A

The output of Conv3 needs to be flattened, then fed into one or more fully connected layers, followed by a classification layer (like Softmax) with 10 output units, one for each class.
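A shape walkthrough using the output-size formula, with hypothetical layer parameters (3x3 'same' convolutions, 2x2 pooling, a 32x32 input, and 64 output channels — the exam network's actual values may differ):

```python
import math

def conv_out(s, k, stride=1, pad=0, dilation=1):
    return math.floor((s + 2 * pad - dilation * (k - 1) - 1) / stride) + 1

s = 32                             # assumed input resolution
s = conv_out(s, k=3, pad=1)        # Conv1: 32 -> 32
s = conv_out(s, k=3, pad=1)        # Conv2: 32 -> 32
s = conv_out(s, k=2, stride=2)     # Pool1: 32 -> 16 (same size formula applies)
s = conv_out(s, k=3, pad=1)        # Conv3: 16 -> 16
channels = 64                      # assumed number of Conv3 output filters
flatten_size = channels * s * s    # input width of the first FC layer
num_classes = 10                   # FC output units feeding Softmax
```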

21
Q

What is the primary source of information gathered for visual processing tasks?

A

Visual processing tasks use images or videos as input. Information gathering involves acquiring this visual data, typically via cameras.

22
Q

For a visual vehicle identification system aiming to identify vehicles by registration number, what kind of information needs to be gathered and processed?

A

Images or video of vehicles must be captured. The raw data are pixel intensities, and key information to extract is the license plate text for identification.

23
Q

What techniques are typically used for edge detection?

A

Edge detection is often performed using image convolution with operators like Sobel, Prewitt, Robinson, or Kirsch, or using the multi‑stage Canny edge detection algorithm.

24
Q

What is a common method for detecting straight lines or circles from edge points?

A

The Hough Transform uses edge points to find parametric representations of shapes such as lines and circles.
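A minimal sketch of the Hough transform for lines: each edge point votes for every (rho, theta) line passing through it, and collinear points pile their votes into one accumulator cell:

```python
import numpy as np

def hough_lines(edge_points, shape, n_theta=180):
    """Vote in (rho, theta) space: an edge point (x, y) lies on every line
    rho = x*cos(theta) + y*sin(theta), so it votes once per theta column."""
    H, W = shape
    max_rho = int(np.ceil(np.hypot(H, W)))   # largest possible |rho|
    thetas = np.deg2rad(np.arange(n_theta))
    acc = np.zeros((2 * max_rho + 1, n_theta), dtype=int)
    for x, y in edge_points:
        rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        acc[rhos + max_rho, np.arange(n_theta)] += 1   # shift rho to a valid row
    return acc, max_rho

# Ten collinear points on the vertical line x = 5 all vote for the
# accumulator cell (rho = 5, theta = 0), producing a clear peak.
points = [(5, y) for y in range(10)]
acc, max_rho = hough_lines(points, (10, 10))
```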

25
Q

What are some traditional methods for **feature extraction** besides edge detection?

A

Beyond simple edge maps, classical feature extraction uses descriptors such as SIFT and SURF (built around corners/blobs) and LBP (built around local texture patterns). These descriptors turn visual structures into numeric vectors that older ML algorithms can handle.
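Of these, LBP is simple enough to sketch directly: each pixel's 8 neighbours are thresholded against the centre and the results packed into one byte (the clockwise-from-top-left bit ordering here is one common convention):

```python
import numpy as np

def lbp_code(img, i, j):
    """Basic 8-neighbour Local Binary Pattern code for pixel (i, j)."""
    centre = img[i, j]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (di, dj) in enumerate(offsets):
        if img[i + di, j + dj] >= centre:   # neighbour at least as bright?
            code |= 1 << bit
    return code
```

A uniform patch yields code 255 (all bits set); a centre strictly brighter than all its neighbours yields 0.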
26
Q

How do Deep Learning image models perform **feature extraction**?

A

Deep learning models, such as **Convolutional Neural Networks (CNNs)**, learn a hierarchy of features directly from the input data.

27
Q

What are some Deep Learning architectures mentioned for **Image Classification**?

A

Architectures include **LeNet, AlexNet, Inception (GoogLeNet), VGGNet, and ResNet**.

28
Q

What are some CNN-based methods for **Object Detection**?

A

Methods include **R‑CNN, Fast R‑CNN, Faster R‑CNN, and YOLO (You Only Look Once)**.

29
Q

What is **Semantic Segmentation** and what are examples of DL architectures for it?

A

Semantic Segmentation assigns a class label to each pixel ('stuff'). DL architectures include **U‑Net** and the **DeepLab** family (v2, v3, v3+).

30
Q

What is **Instance Segmentation** and what is an example method?

A

Instance Segmentation detects and masks each individual object instance ('things'). **Mask R‑CNN** is a method for this task.

31
Q

What task combines Semantic and Instance Segmentation?

A

**Panoptic Segmentation** combines pixel‑based segmentation ('stuff') with instance identification ('things').

32
Q

What is the goal of **Visual Saliency Modelling**?

A

It aims to identify regions of an image or video most likely to attract human attention, measuring how much something stands out.

33
Q

What are some applications of **Optical Flow** (calculating pixel motion)?

A

Applications include motion segmentation, video compression, structure‑from‑motion, and augmented reality.

34
Q

What Deep Learning models are mentioned for estimating **Optical Flow**?

A

Models like **FlowNet** and **FlowNet2** are used.

35
Q

What Deep Learning approaches are used for **Video Understanding** tasks like classification or action recognition?

A

Deep learning uses **Deep Features** and architectures like **TwoStream, C3D, and SlowFast networks**.
36
Q

What dimensionality reduction technique is mentioned?

A

**Principal Component Analysis (PCA)** reduces data dimensionality by projecting it onto the directions of greatest variance.
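A minimal PCA sketch via eigendecomposition of the covariance matrix (library implementations such as scikit-learn typically use SVD and add extras like whitening):

```python
import numpy as np

def pca(X, n_components):
    """Centre the data, take the eigenvectors with the largest
    eigenvalues, and project onto them."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)            # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]  # top directions first
    components = eigvecs[:, order]
    return Xc @ components, components

# Points on the line y = 2x: all variance lies in the first component.
X = np.array([[1., 2.], [2., 4.], [3., 6.], [4., 8.]])
Z, comps = pca(X, 2)
```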
37
Q

What types of methods are used for **Image Registration**?

A

Methods include **intensity‑based methods** (MSD/SSD, Mutual Information) and **feature‑based methods** (using points like edges, corners).

38
Q

What are some types of **Generative Models**?

A

Types include **Generative Adversarial Networks (GANs)**, **Variational Autoencoders (VAEs)**, and diffusion models.

39
Q

What pre‑processing step is commonly used to **suppress noise** in images?

A

**Gaussian filtering** (smoothing) is commonly used to suppress noise.

40
Q

What are some low‑level vision tasks that can improve **low‑quality images**?

A

Tasks like **Denoising, Super‑resolution, and De‑blurring** can enhance image quality.

41
Q

What are some general challenges that make many computer‑vision tasks difficult?

A

Challenges include variations in **illumination, object pose, clutter, viewpoint, intra‑class appearance, and occlusions**.

42
Q

What is a drawback of edge detection and how is it mitigated?

A

Edge detection can be sensitive to **noise** (mitigated by smoothing such as Gaussian filtering) and can give fragmented edges (mitigated by NMS and hysteresis thresholding).

43
Q

What is a drawback of the Hough Transform for complex shapes?

A

Its **computational complexity can be high**, especially for shapes like circles that require a large parameter space.

44
Q

What is a major drawback for many supervised Deep Learning approaches?

A

They require **large datasets with expensive manual annotations**.

45
Q

How can the high data requirement for supervised Deep Learning be addressed?

A

Using **pre‑trained models, synthetic data, weak supervision, or semi‑supervised learning** can reduce the need for large annotated datasets.

46
Q

What issue arises in Semantic Segmentation due to downsampling layers?

A

Downsampling (e.g., pooling) can cause **reduced feature‑map resolution and localisation accuracy**.

47
Q

How are resolution and localisation issues addressed in Semantic Segmentation?

A

Techniques such as **atrous convolutions** and **skip connections** preserve or recover spatial detail.

48
Q

What problem occurs when applying models trained on one data type to another (e.g. images to video)?

A

A **domain shift** can lead to lower performance on the target domain.

49
Q

How can the problem of domain shift be addressed?

A

**Domain Adaptation** techniques improve performance when applying models across different data domains.

50
Q

What ethical concern is associated with Generative AI models?

A

They can be used to create **deepfakes and facilitate information manipulation**; responsible use is essential.