Difficulties in Image Registration and Generative Models Flashcards
(61 cards)
What does VAE stand for?
Variational Autoencoder
What is the difference between self-supervised and unsupervised learning?
Self-supervised learning generates labels from the data itself, while unsupervised learning doesn’t have labels and focuses on identifying patterns.
What does semantic segmentation focus on?
Semantic segmentation assigns a class label to every pixel, including background regions (stuff) such as sky or road.
What does instance segmentation focus on?
Instance segmentation detects individual objects and assigns unique identifiers to each object, distinguishing between objects of the same class.
Why is panoptic segmentation important?
Panoptic segmentation combines semantic and instance segmentation, providing both class labels and object identification for a complete scene understanding.
What is the limitation of instance segmentation?
Instance segmentation ignores background regions (stuff such as sky or road) and only separates individual countable objects, so it does not label every pixel in the scene.
What does bottom-up saliency refer to?
Bottom-up saliency is based on low-level features like contrast, colour, and edges that automatically draw attention.
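As a rough illustration of the contrast cue only, the sketch below scores each pixel by how far its intensity is from the image mean. This is a deliberately crude proxy, not a published saliency model; the function name `contrast_saliency` and the toy image are assumptions made for the example.

```python
def contrast_saliency(gray):
    """Toy bottom-up cue: absolute difference from the global mean, scaled to [0, 1]."""
    pixels = [p for row in gray for p in row]
    mean = sum(pixels) / len(pixels)
    raw = [[abs(p - mean) for p in row] for row in gray]
    peak = max(max(row) for row in raw)
    if peak == 0:
        # A perfectly uniform image has no contrast, so nothing stands out.
        return [[0.0 for _ in row] for row in gray]
    return [[v / peak for v in row] for row in raw]


# Hypothetical 3x3 grayscale image: one bright pixel on a dark background.
gray = [
    [10, 10, 10],
    [10, 200, 10],
    [10, 10, 10],
]
saliency = contrast_saliency(gray)
```

The bright centre pixel receives the highest score because it contrasts most with its surroundings, without any task information being supplied.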
What does top-down saliency refer to?
Top-down saliency uses high-level, task-specific guidance to decide where to focus attention (e.g., detecting pedestrians in self-driving cars).
What is subitizing in visual saliency modelling?
Subitizing refers to the ability to quickly and accurately judge the number of objects in a small set (usually 1-4 objects) without counting.
What is context-aware saliency detection?
Context-aware saliency detection incorporates higher-level guidance or task-specific information (like object importance or focus on certain areas) to generate saliency maps.
What is domain shift in saliency models for images and videos?
Domain shift is the mismatch between the data a model was trained on (e.g., images) and the data it is applied to (e.g., video); differences in data characteristics such as motion can degrade performance unless the model is adapted.
What is the core mechanism behind GANs?
GANs pair a generator that creates fake data with a discriminator that tries to tell real data from fake; the two are trained against each other, so the generator’s outputs become more realistic over time.
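The opposing objectives can be sketched with binary cross-entropy on the discriminator's output probabilities. This is a minimal sketch, not a training loop: the function names are assumptions for illustration, and the generator loss shown is the commonly used non-saturating variant.

```python
import math


def bce(p, target):
    """Binary cross-entropy for one probability p against a 0/1 target."""
    p = min(max(p, 1e-7), 1.0 - 1e-7)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def discriminator_loss(d_real, d_fake):
    """The discriminator wants D(real) near 1 and D(fake) near 0."""
    return bce(d_real, 1.0) + bce(d_fake, 0.0)


def generator_loss(d_fake):
    """The generator wants the discriminator to score its fakes near 1."""
    return bce(d_fake, 1.0)


# d_real / d_fake stand in for discriminator outputs on a real and a generated sample.
confident_d = discriminator_loss(d_real=0.9, d_fake=0.1)
confused_d = discriminator_loss(d_real=0.5, d_fake=0.5)
fooled = generator_loss(d_fake=0.9)
caught = generator_loss(d_fake=0.1)
```

A discriminator that separates real from fake gets a lower loss than a confused one, while the generator's loss drops only when its fakes are scored as real.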
What is the difference between Pix2Pix and CycleGAN?
Pix2Pix uses paired data for image transformation, while CycleGAN works with unpaired data, learning to transform images from one domain to another without matching pairs.
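One way CycleGAN copes without paired data is a cycle-consistency term: translating an image to the other domain and back should return the original. The sketch below uses toy list-based "images" and hypothetical mapping functions `G` (X to Y) and `F` (Y to X) in place of trained networks.

```python
def l1(a, b):
    """Mean absolute difference between two equally sized flat images."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(G, F, xs, ys):
    """L1 error of X -> Y -> X plus L1 error of Y -> X -> Y, averaged per set."""
    forward = sum(l1(F(G(x)), x) for x in xs) / len(xs)
    backward = sum(l1(G(F(y)), y) for y in ys) / len(ys)
    return forward + backward


# Toy stand-ins for the two generators (assumptions for illustration only).
G = lambda img: [p + 0.5 for p in img]       # X -> Y: brighten
F_good = lambda img: [p - 0.5 for p in img]  # Y -> X: exact inverse of G
F_bad = lambda img: list(img)                # Y -> X: does nothing

# The two sets are unpaired: different sizes, no correspondence between items.
xs = [[0.0, 0.25]]
ys = [[1.0, 0.75], [0.5, 0.5]]

good_loss = cycle_consistency_loss(G, F_good, xs, ys)
bad_loss = cycle_consistency_loss(G, F_bad, xs, ys)
```

Note that `xs` and `ys` are never matched item to item; the loss only compares each image with its own round trip, which is why no pairs are needed.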
What does a diffusion model do in generative AI?
Diffusion models start with random noise and progressively denoise it to generate a realistic image, learning to reverse the noise process during training.
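The noising process that the model learns to reverse can be sketched as follows, assuming a DDPM-style Gaussian formulation with a linear noise schedule (one common choice, not the only one). The helper names are hypothetical, and the true noise is used where a trained network's noise prediction would normally go.

```python
import math
import random


def alpha_bar(t, betas):
    """Fraction of signal variance remaining after t noising steps."""
    prod = 1.0
    for beta in betas[:t]:
        prod *= 1.0 - beta
    return prod


def add_noise(x0, t, betas, rng):
    """Forward process: jump straight from clean x0 to noisy x_t."""
    ab = alpha_bar(t, betas)
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(ab) * x + math.sqrt(1.0 - ab) * e for x, e in zip(x0, eps)]
    return xt, eps


def estimate_x0(xt, eps_pred, t, betas):
    """Invert the forward step given a prediction of the noise that was added."""
    ab = alpha_bar(t, betas)
    return [(x - math.sqrt(1.0 - ab) * e) / math.sqrt(ab) for x, e in zip(xt, eps_pred)]


# Assumed linear schedule with 100 steps.
betas = [0.0001 + (0.02 - 0.0001) * i / 99 for i in range(100)]
rng = random.Random(0)
x0 = [0.5, -0.25, 1.0]  # toy 3-pixel "image"

xt, eps = add_noise(x0, 50, betas, rng)
# A trained network would predict eps from xt; the true eps is used here instead.
x0_hat = estimate_x0(xt, eps, 50, betas)
```

With a perfect noise prediction the clean signal is recovered exactly, which is why training typically targets the noise: the better the prediction, the better each denoising step.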
What is the application of diffusion models in healthcare?
Diffusion models are used to generate synthetic medical images (like heart images), reducing the need for real patient data and providing more training examples.
What are the benefits of panoptic segmentation?
Panoptic segmentation combines semantic and instance segmentation, giving a complete output that labels both background (semantic) and objects (instance).
How is the chain rule used in backpropagation?
Backpropagation applies the chain rule at each layer: the upstream gradient is multiplied by the local gradient to obtain the gradients used to update the network parameters.
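A minimal worked sketch for a single sigmoid neuron with squared-error loss shows the upstream and local gradients explicitly. The neuron, loss, and input values are assumptions chosen for illustration; a finite-difference check is included to confirm the analytic gradient.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward_backward(w, b, x, y):
    """Return the loss and its gradients with respect to w and b."""
    # Forward pass
    z = w * x + b
    a = sigmoid(z)
    loss = 0.5 * (a - y) ** 2

    # Backward pass: each step is upstream gradient * local gradient
    dloss_da = a - y                # gradient arriving at the activation
    da_dz = a * (1.0 - a)           # local gradient of the sigmoid
    dloss_dz = dloss_da * da_dz     # chain rule through the sigmoid
    dloss_dw = dloss_dz * x         # local gradient dz/dw = x
    dloss_db = dloss_dz * 1.0       # local gradient dz/db = 1
    return loss, dloss_dw, dloss_db


def numeric_grad_w(w, b, x, y, eps=1e-6):
    """Central finite difference, used only to sanity-check the analytic result."""
    loss_plus, _, _ = forward_backward(w + eps, b, x, y)
    loss_minus, _, _ = forward_backward(w - eps, b, x, y)
    return (loss_plus - loss_minus) / (2.0 * eps)


loss, grad_w, grad_b = forward_backward(w=0.5, b=-0.1, x=2.0, y=1.0)
check_w = numeric_grad_w(w=0.5, b=-0.1, x=2.0, y=1.0)
```

In a deeper network the same pattern repeats: `dloss_dz` becomes the upstream gradient handed to the layer below.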
What are the four steps in the Canny Edge Detection algorithm?
1. Gaussian filtering to suppress noise, 2. Compute gradient magnitude and direction, 3. Apply non-maximum suppression (NMS), 4. Use hysteresis thresholding to detect edges.
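Step 4 is often the least intuitive, so here is a sketch of hysteresis thresholding alone, assuming steps 1 to 3 have already produced a thinned gradient-magnitude grid. The function name, thresholds, and 8-connected neighbourhood are assumptions for illustration.

```python
from collections import deque


def hysteresis(mag, low, high):
    """Keep strong pixels (>= high) and weak pixels (>= low) connected to them."""
    h, w = len(mag), len(mag[0])
    edges = [[False] * w for _ in range(h)]
    queue = deque()

    # Seed with strong pixels: these are always edges.
    for r in range(h):
        for c in range(w):
            if mag[r][c] >= high:
                edges[r][c] = True
                queue.append((r, c))

    # Grow outward through 8-connected neighbours that are at least weak.
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if 0 <= nr < h and 0 <= nc < w and not edges[nr][nc] and mag[nr][nc] >= low:
                    edges[nr][nc] = True
                    queue.append((nr, nc))
    return edges


# Toy magnitudes: one strong pixel (9), weak pixels (5), and an isolated weak pixel.
mag = [
    [0, 0, 0, 0, 0],
    [9, 5, 5, 0, 5],
    [0, 0, 0, 0, 0],
]
edges = hysteresis(mag, low=4, high=8)
```

The two weak pixels touching the strong one are kept, while the isolated weak pixel on the right is discarded as likely noise.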
What is the main difference between semantic and panoptic segmentation?
Semantic segmentation classifies pixels into categories, while panoptic segmentation provides both class labels for the background and instance identifiers for individual objects.
What is task-dependent saliency?
Task-dependent saliency (top-down saliency) adjusts the focus of attention based on the context or task at hand, such as detecting specific objects based on a goal or environment.
What is semantic segmentation?
Semantic segmentation classifies each pixel of an image into predefined categories (like sky, road, or tree) but doesn’t distinguish between individual objects of the same class.
What is instance segmentation?
Instance segmentation assigns unique labels to individual objects of the same class, allowing the model to detect multiple instances of the same object type (like multiple cars).
How does panoptic segmentation differ from semantic segmentation and instance segmentation?
Panoptic segmentation combines both semantic segmentation (for background) and instance segmentation (for object instances), giving a full pixel-wise segmentation that identifies both the object type and instance.
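The combination can be sketched by pairing a per-pixel class map with a per-pixel instance map. The class IDs, the `merge_panoptic` helper, and the convention that instance ID 0 means "no instance" are assumptions made for this toy example.

```python
def merge_panoptic(semantic, instance, thing_classes):
    """Return a grid of (class_id, instance_id); stuff pixels get instance_id 0."""
    panoptic = []
    for sem_row, inst_row in zip(semantic, instance):
        row = []
        for cls, inst in zip(sem_row, inst_row):
            if cls in thing_classes and inst > 0:
                row.append((cls, inst))  # countable object: keep its identity
            else:
                row.append((cls, 0))     # background stuff: class label only
        panoptic.append(row)
    return panoptic


SKY, ROAD, CAR = 0, 1, 2

# Semantic map: every pixel has a class, but the two cars are indistinguishable.
semantic = [
    [SKY, SKY, SKY, SKY],
    [CAR, CAR, ROAD, CAR],
    [ROAD, ROAD, ROAD, ROAD],
]
# Instance map: cars are separated, but background pixels carry no information.
instance = [
    [0, 0, 0, 0],
    [1, 1, 0, 2],
    [0, 0, 0, 0],
]

panoptic = merge_panoptic(semantic, instance, thing_classes={CAR})
```

Every pixel now carries a class, and the two cars are told apart by their instance IDs, which neither input map provides on its own.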
What is UNet’s primary architecture?
UNet’s architecture consists of an encoder (contracting path), a bottleneck (compressed features), and a decoder (expanding path with skip connections to recover spatial details), which helps with pixel-wise segmentation.
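To make the three parts concrete, the sketch below traces feature-map shapes through a UNet-like network without running any convolutions. It assumes size-preserving convolutions, 2x down/upsampling, and channel doubling per level; the original UNet uses unpadded convolutions with cropping, so its exact sizes differ. The function name and defaults are hypothetical.

```python
def unet_shapes(height, width, base_channels=64, depth=4):
    """Return (stage, channels, height, width) for each stage of a UNet-like net."""
    if height % (2 ** depth) or width % (2 ** depth):
        raise ValueError("height and width must be divisible by 2**depth")

    trace, skips = [], []
    h, w, c = height, width, base_channels

    # Encoder (contracting path): halve resolution, double channels.
    for level in range(depth):
        trace.append((f"enc{level}", c, h, w))
        skips.append(c)  # remember channels for the skip connection
        h, w, c = h // 2, w // 2, c * 2

    trace.append(("bottleneck", c, h, w))

    # Decoder (expanding path): upsample, concatenate the skip, then convolve.
    for level in reversed(range(depth)):
        h, w, c = h * 2, w * 2, c // 2
        trace.append((f"dec{level}_concat", c + skips[level], h, w))
        trace.append((f"dec{level}", c, h, w))
    return trace


trace = unet_shapes(256, 256)
for stage in trace:
    print(stage)
```

The decoder ends at the input resolution, which is what allows a per-pixel prediction, and each `_concat` stage shows the skip connection reintroducing encoder features at the matching scale.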