Lecture 5 Flashcards

1
Q

Segmentation

A
  • A ConvNet can classify patches and label the central pixel of each patch
    o Can slide over all patches and classify each of them -> but very slow
    ▪ We repeat a lot of convolutions on overlapping patches (see the sketch below)
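A minimal sketch of why the sliding-window approach is slow (the patch classifier, patch size, and image size are illustrative assumptions): every pixel triggers its own forward pass, and neighbouring patches recompute almost identical convolutions.

```python
import torch
import torch.nn as nn

# Hypothetical patch classifier: predicts the class of the central pixel
# of a P x P patch (architecture and sizes are assumptions).
P = 65
patch_classifier = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.LazyLinear(2),  # two classes: background / foreground
)

image = torch.randn(1, 1, 256, 256)
H, W = image.shape[-2:]

# Naive sliding window: one forward pass per pixel. Overlapping patches
# share almost all of their convolutions, so the same work is repeated
# tens of thousands of times.
labels = torch.zeros(H - P + 1, W - P + 1, dtype=torch.long)
with torch.no_grad():
    for y in range(H - P + 1):
        for x in range(W - P + 1):
            patch = image[:, :, y:y + P, x:x + P]
            labels[y, x] = patch_classifier(patch).argmax()
```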
2
Q

Fully Convolutional Network

A
  • Fully connected layers have fixed dimensions and throw away spatial coordinates
    o A fully connected layer can be seen as a convolution with a kernel that covers
    the entire input region (see the sketch after this list)
  • Train with patches of size PxP to predict one value
    o Pros:
    ▪ Smaller patches -> less memory -> faster training
    ▪ Can combine patches from different sources -> increases diversity & converges faster
    ▪ Patch sampling allows us to control class balancing
    ▪ Sampling more from difficult locations lets us control the difficulty of the
    training procedure
    o Cons:
    ▪ Borders between objects are not explicitly defined
    ▪ Only the label of the central pixel may be available
    ▪ Output is usually not at full resolution
  • Low resolution: consequence of pooling layers
    o But pooling is good: increases receptive field and reduces size of feature maps
  • Full resolution output by:
    o Combining multiple low-res results
    o Avoiding low resolution (no pooling)
    o Upscaling: interpolation or learning de-convolution filters
  • Convolutions at original image resolution are very expensive
    o Effective receptive field is also very small
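A minimal sketch of the fully-convolutional idea (layer sizes and shapes are illustrative assumptions): a fully connected head is rewritten as a convolution whose kernel covers the whole feature map, so the same weights then slide over larger inputs in a single pass.

```python
import torch
import torch.nn as nn

# Feature extractor producing 16-channel, 8x8 feature maps (sizes assumed).
features = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                         nn.MaxPool2d(4))

# Classifier head as a fully connected layer: accepts a fixed 32x32 input only.
fc_head = nn.Linear(16 * 8 * 8, 2)

# Same head as a convolution whose kernel covers the whole 8x8 map.
conv_head = nn.Conv2d(16, 2, kernel_size=8)
# The weights can be copied over: the two layers compute the same function.
conv_head.weight.data = fc_head.weight.data.view(2, 16, 8, 8)
conv_head.bias.data = fc_head.bias.data

x = torch.randn(1, 1, 32, 32)
out_fc = fc_head(features(x).flatten(1))      # shape (1, 2)
out_conv = conv_head(features(x))             # shape (1, 2, 1, 1)
assert torch.allclose(out_fc, out_conv.flatten(1), atol=1e-6)

# The convolutional head also accepts larger inputs, sliding the
# "fully connected" classifier over the image in one pass:
big = torch.randn(1, 1, 64, 64)
print(conv_head(features(big)).shape)         # (1, 2, 9, 9)
```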
3
Q

Receptive Field

A
  • Area of the input image that is seen by a single neuron in any layer of the network
    o Each neuron in activation map #1 sees a 3x3 area of the input
    o Each neuron in activation map #2 sees a 3x3 area of map #1
    ▪ i.e. it sees a 5x5 region of the original input (see the sketch below)
  • Factors that affect receptive field:
    o Number of layers
    o Filter size
    o Presence of pooling
    ▪ Increases the receptive field
    ▪ Decreases resolution
    ▪ Loses spatial information
  • Effective receptive field: follows a Gaussian distribution
    o Occupies a fraction of the theoretical receptive field
    o Depends on:
    ▪ Initialisation strategy
    ▪ Non-linearity
    ▪ Type of layers used in the network
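A minimal sketch of computing the theoretical receptive field from the number of layers, filter sizes, and strides (the layer configurations below are illustrative assumptions):

```python
# Theoretical receptive field of a stack of conv/pool layers.
# Each layer is (kernel_size, stride); dilation is assumed to be 1.
def receptive_field(layers):
    rf, jump = 1, 1  # jump = distance between adjacent outputs, in input pixels
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf

# Two 3x3 convolutions: 5x5 receptive field, as in the example above.
print(receptive_field([(3, 1), (3, 1)]))            # 5
# Adding a 2x2 pooling layer between them enlarges it.
print(receptive_field([(3, 1), (2, 2), (3, 1)]))    # 8
```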
4
Q

Dilated Convolutions

A
  • Aims to efficiently increase the receptive field
  • Exponentially expanding receptive field
    o Based on dilation rate
  • Can be plugged into existing architectures
  • No pooling, no subsampling
  • Allows dense prediction at full resolution
  • No new filters are introduced; the existing convolution is simply applied in a
    different way (see the sketch below)
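A minimal PyTorch sketch (channel counts are illustrative assumptions): stacking 3x3 convolutions whose dilation rates double per layer expands the receptive field exponentially, here to 15x15 after three layers, without pooling or subsampling, so dense prediction stays at full resolution.

```python
import torch
import torch.nn as nn

# 3x3 convolutions with dilation rates 1, 2, 4: the receptive field grows
# to 15x15 in three layers, with no pooling or subsampling.
# Padding equals the dilation rate, so the output keeps full resolution.
net = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, dilation=1, padding=1), nn.ReLU(),
    nn.Conv2d(8, 8, kernel_size=3, dilation=2, padding=2), nn.ReLU(),
    nn.Conv2d(8, 8, kernel_size=3, dilation=4, padding=4), nn.ReLU(),
)

x = torch.randn(1, 1, 128, 128)
print(net(x).shape)  # (1, 8, 128, 128): dense prediction at full resolution
```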
5
Q

Deconvolution Network

A
  • Learn filters to do up-sampling
  • Unpooling: remember which element was the max during pooling
    o Fill that element with the input value and all other elements with 0
  • To learn a filter to be used during upscaling:
    o Multiply filter coefficients by input value
    o Slide filter by stride value
    o Sum where outputs overlap
  • Can also do up-sampling with nearest neighbour + same convolutions with stride 1
  • Learned deconvolution often leaves checkerboard artifacts behind; the
    nearest-neighbour variant avoids them (all three operations are sketched below)
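A minimal PyTorch sketch of the three operations above (sizes are illustrative assumptions): max-unpooling restores values at the remembered positions, a transposed convolution learns the up-sampling filter, and nearest-neighbour up-sampling plus a stride-1 same convolution is the artifact-free alternative.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 1, 8, 8)

# Unpooling: pooling remembers which element was the max; unpooling puts
# the input value back at that position and fills the rest with zeros.
pool = nn.MaxPool2d(2, return_indices=True)
unpool = nn.MaxUnpool2d(2)
pooled, indices = pool(x)
restored = unpool(pooled, indices)       # (1, 1, 8, 8), zeros off-max

# Learned up-sampling: a transposed convolution multiplies the filter by
# each input value, slides it by the stride, and sums overlapping outputs.
deconv = nn.ConvTranspose2d(1, 1, kernel_size=4, stride=2, padding=1)
upscaled = deconv(pooled)                # (1, 1, 8, 8)

# Alternative that avoids checkerboard artifacts: nearest-neighbour
# up-sampling followed by a "same" convolution with stride 1.
alt = nn.Sequential(nn.Upsample(scale_factor=2, mode='nearest'),
                    nn.Conv2d(1, 1, kernel_size=3, stride=1, padding=1))
print(alt(pooled).shape)                 # (1, 1, 8, 8)
```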
6
Q

U-Net

A
  • Combines down-sampling (contracting) and up-sampling (expanding) paths
  • Each 2x2 up-convolution halves the number of feature channels
  • Skip-connections: add back details to context
    o Implemented as concatenation of encoder and decoder feature maps
  • Can be trained with little data
  • Often uses heavy data augmentation
  • Uses a pixel-level cross-entropy loss with per-pixel loss weights to enforce
    good segmentation at object borders
  • Problem: input size and feature map sizes need to be configured to correspond to
    each other, which is often difficult
    o Particularly across skip-connections, and especially difficult if cropping is
    not symmetric
    o Can use same convolutions instead, so feature maps have exactly the same size
    -> easy to concatenate (see the sketch below)
    ▪ But this introduces artifacts, because we have to do zero-padding
  • Dense layers in the bottom part maximise the field of view of the network
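A minimal sketch of one U-Net level with same convolutions (channel counts and sizes are illustrative assumptions): because padding keeps spatial sizes equal, the encoder map can be concatenated onto the up-sampled decoder map without any cropping.

```python
import torch
import torch.nn as nn

conv_down = nn.Conv2d(1, 16, 3, padding=1)     # encoder level
pool = nn.MaxPool2d(2)
conv_bottom = nn.Conv2d(16, 32, 3, padding=1)  # bottleneck
up = nn.ConvTranspose2d(32, 16, kernel_size=2, stride=2)  # 2x2 up-conv halves channels
conv_up = nn.Conv2d(32, 16, 3, padding=1)      # decoder level after concat

x = torch.randn(1, 1, 64, 64)
enc = conv_down(x)                   # (1, 16, 64, 64)
dec = up(conv_bottom(pool(enc)))     # (1, 16, 64, 64)

# Skip-connection: with "same" convolutions both maps have identical
# spatial size, so no cropping is needed before concatenation.
merged = torch.cat([enc, dec], dim=1)  # (1, 32, 64, 64)
out = conv_up(merged)
print(out.shape)                     # (1, 16, 64, 64)
```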
7
Q

nnU-Net

A
  • nnU-Net is a self-configuring framework for U-Nets that sets up the network and
    its parameters for you
    o Thus no expert knowledge required
  • Very fast
  • Provides a standardised baseline and an out-of-the-box segmentation method
8
Q

Image Registration

A
  • Aims to find a spatial correspondence between two or more images
    o E.g. given that one image shows the heart, where is the heart located in the
    other image?
  • Mainly used to find geometrical, anatomical and functional alignment, e.g. for
    o Disease monitoring, motion analysis and growth analysis
  • Intra-patient image registration: images of the same patient, e.g. to quantify change
  • Inter-patient image registration: images of different patients, e.g. to find
    structures present in both patients
  • Image fusion: combine images from multiple sources into one image
    o E.g. combining CT and PET scans into one image
  • Registration pipeline (aligns the images into a common coordinate frame; see the
    sketch below):
    o Center alignment
    o Translation
    o Affine registration
    o Deformable registration
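A minimal sketch of the first steps of such a pipeline using SimpleITK (file names, metric, and optimiser settings are illustrative assumptions, not the lecture's prescribed setup):

```python
import SimpleITK as sitk

fixed = sitk.ReadImage("fixed.nii.gz", sitk.sitkFloat32)    # hypothetical paths
moving = sitk.ReadImage("moving.nii.gz", sitk.sitkFloat32)

# Center alignment: initialise a rigid transform on the image centers.
initial = sitk.CenteredTransformInitializer(
    fixed, moving, sitk.Euler3DTransform(),
    sitk.CenteredTransformInitializerFilter.GEOMETRY)

# Rigid registration by iterative optimisation of a similarity metric.
reg = sitk.ImageRegistrationMethod()
reg.SetMetricAsMeanSquares()                 # SSD: a mono-modal assumption
reg.SetOptimizerAsGradientDescent(learningRate=1.0, numberOfIterations=100)
reg.SetInitialTransform(initial, inPlace=False)
reg.SetInterpolator(sitk.sitkLinear)
rigid = reg.Execute(fixed, moving)

# Resample the moving image into the fixed image's coordinate frame.
aligned = sitk.Resample(moving, fixed, rigid, sitk.sitkLinear, 0.0)
```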
9
Q

Medical Image Representations

A
  • Continuous: using functions
    o F, M: ℝ^d → ℝ
  • Discrete: using a d-dimensional matrix
  • Meta information (in the world matrix; see the sketch below):
    o Pixel/voxel spacing
    o Image origin
    o Image direction/orientation
  • Transformations can also be non-parametric
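A minimal numpy sketch of how this meta information maps voxel indices to world coordinates (all values below are made-up assumptions):

```python
import numpy as np

# Illustrative meta information for a 3D image (values are assumptions).
spacing = np.array([0.8, 0.8, 1.5])       # voxel size in mm along each axis
origin = np.array([-120.0, -95.0, 40.0])  # world position of voxel (0, 0, 0)
direction = np.eye(3)                     # axis orientation (here: identity)

def voxel_to_world(index):
    """Map a voxel index to world (physical) coordinates in mm."""
    return origin + direction @ (spacing * np.asarray(index))

print(voxel_to_world([10, 20, 5]))  # [-112.  -79.   47.5]
```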
10
Q

Objective Function

A
  • To measure whether a deformation is reasonable, we
    need an objective function
  • How do we compute similarity between images?
    o Mono-modal image similarity (SSD and NCC are sketched below):
    ▪ Sum of Squared Differences (SSD): how different each pixel is between the same
    locations in the two images. Assumes an identity relationship between intensities
    ▪ Normalised Cross Correlation (NCC): assumes a linear relationship between
    intensities
    o Multi-modal image similarity:
    ▪ Normalised Gradient Fields (NGF): assumes intensity changes (gradients) occur
    at the same locations
    ▪ Mutual Information (MI): describes how well one image is explained by the
    other image
    ▪ Modality Independent Neighbourhood Descriptor (MIND): exploits self-similarity
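A minimal numpy sketch of the two mono-modal measures (arrays are assumed to be spatially aligned and of equal shape):

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences: assumes identical intensities when aligned."""
    return np.sum((a - b) ** 2)

def ncc(a, b):
    """Normalised cross correlation: assumes a linear intensity relationship."""
    a0, b0 = a - a.mean(), b - b.mean()
    return np.sum(a0 * b0) / (np.linalg.norm(a0) * np.linalg.norm(b0))

a = np.random.rand(64, 64)
print(ssd(a, a), ncc(a, 2.0 * a + 5.0))  # 0.0 and ~1.0: a linear map is "perfect"
```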
11
Q

Evaluation of Image Registration

A
  • Very difficult task
    o Rarely is a point-wise correspondence from one image to the other available
  • Quantitative evaluation requires an exact definition of the correct transformation
    o Synthetic transformations can be applied to the images
    o Hard to define realistic deformations
    o Simplified problems
    o Often limited to phantom data
    o Auxiliary measures:
    ▪ Segmentation overlap
    ▪ Landmark error (error in identifying significant structures in an image)
    ▪ Quality of the deformation field, e.g. number of foldings and smoothness
    (counting foldings is sketched below)
    • Deformation field: a large matrix with one 3D translation for each voxel,
    mapping between source and target image
  • Evaluation must be independent of cost function or registration features
  • Highly depends on application
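A minimal numpy sketch of one deformation-quality measure (the displacement field below is a made-up example): a voxel is counted as folded where the Jacobian determinant of the deformation is non-positive.

```python
import numpy as np

def count_foldings(disp):
    """Count voxels where the deformation folds (Jacobian determinant <= 0).

    disp: displacement field of shape (3, D, H, W), one 3D translation per voxel.
    """
    # Deformation = identity + displacement; its Jacobian is I + grad(disp).
    grads = np.stack([np.stack(np.gradient(disp[c]), axis=0) for c in range(3)])
    jac = grads + np.eye(3).reshape(3, 3, 1, 1, 1)   # (3, 3, D, H, W)
    det = np.linalg.det(np.moveaxis(jac, (0, 1), (-2, -1)))
    return int(np.sum(det <= 0))

disp = 0.1 * np.random.randn(3, 16, 16, 16)  # hypothetical displacement field
print(count_foldings(disp))
```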
12
Q

Learning-Based Image Registration

A
  • Conventional: iterative optimisation for each image pair
    -> time consuming
  • Learning-based: train a neural network that predicts the transformation, so
    registering a new image pair is a single forward pass
  • But often no point-wise correspondence is available between the registration
    network's output and a ground truth
    o Medical experts cannot annotate a reference deformation field
  • How to supervise a registration network:
    o Supervised methods: use a ground-truth deformation field for training
    o Self-supervised/unsupervised methods: use the cost function of conventional image
    registration (similarity measure + regulariser) as the loss function (see the
    sketch below)
    o Weakly-supervised methods: are supervised with prior information (e.g.
    segmentation masks)
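A minimal PyTorch sketch of such a self-supervised loss (the warping convention and the regulariser weight alpha are illustrative assumptions): a squared-difference similarity term on the warped moving image plus a smoothness regulariser on the displacement field.

```python
import torch
import torch.nn.functional as F

def warp(moving, disp):
    """Warp an image (N, 1, H, W) with a displacement field (N, 2, H, W)."""
    n, _, h, w = moving.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    gx = (xs + disp[:, 0]) / (w - 1) * 2 - 1   # normalise to [-1, 1]
    gy = (ys + disp[:, 1]) / (h - 1) * 2 - 1
    return F.grid_sample(moving, torch.stack([gx, gy], dim=-1),
                         align_corners=True)

def unsupervised_loss(fixed, moving, disp, alpha=0.01):
    """Conventional registration cost as a training loss (weighting assumed)."""
    # Similarity: mean of squared differences (SSD up to scale).
    similarity = F.mse_loss(warp(moving, disp), fixed)
    # Regulariser: penalise spatial gradients of the displacement (smoothness).
    dy = disp[:, :, 1:, :] - disp[:, :, :-1, :]
    dx = disp[:, :, :, 1:] - disp[:, :, :, :-1]
    return similarity + alpha * (dy.pow(2).mean() + dx.pow(2).mean())

# No ground-truth deformation field is needed: only the image pair and the
# network's predicted displacement.
fixed, moving = torch.randn(2, 1, 1, 64, 64)
disp = torch.zeros(1, 2, 64, 64, requires_grad=True)
print(unsupervised_loss(fixed, moving, disp))
```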
13
Q

Multi-Level Registration Networks

A
  • Avoids local minima
  • Speeds up computations
  • Avoids foldings
    o Registration proceeds coarse-to-fine over multiple resolution levels (see the
    sketch below)
  • Loss function:
    o Add prior knowledge into training by designing specific loss functions
    o Time-consuming annotations are only required on the training data
    o Can also generate labels automatically
  • HyperMorph strategy: a way to learn registration hyper-parameters
    o Trains a single model instead of iteratively improving on previous optimal values
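A minimal PyTorch sketch of the coarse-to-fine idea (`register_at_level` is a hypothetical per-level network, and composing fields by simple addition is a simplification): each level starts from the up-sampled field of the coarser level and only predicts a residual refinement.

```python
import torch
import torch.nn.functional as F

def multi_level_register(fixed, moving, register_at_level, levels=3):
    """Coarse-to-fine registration over an image pyramid.

    register_at_level(fixed, moving): hypothetical network returning a
    residual displacement field of shape (N, 2, H, W) at that resolution.
    """
    disp = None
    for lvl in reversed(range(levels)):          # coarsest level first
        scale = 1 / 2 ** lvl
        f = F.interpolate(fixed, scale_factor=scale, mode="bilinear")
        m = F.interpolate(moving, scale_factor=scale, mode="bilinear")
        if disp is not None:
            # Up-sample the coarser field; displacement values scale with
            # resolution. (A full implementation would also warp m here.)
            disp = 2 * F.interpolate(disp, scale_factor=2, mode="bilinear")
            disp = disp + register_at_level(f, m)  # residual refinement
        else:
            disp = register_at_level(f, m)
    return disp

# Usage with a trivial stand-in "network" that predicts zero displacement:
zero_net = lambda f, m: torch.zeros(f.shape[0], 2, *f.shape[2:])
fixed, moving = torch.randn(2, 1, 1, 64, 64)
print(multi_level_register(fixed, moving, zero_net).shape)  # (1, 2, 64, 64)
```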