Lecture 5 Flashcards

Question 1

Q

Segmentation

Answer

A

A ConvNet can classify a patch and assign the label to its central pixel
o Can slide over all patches and classify each of them -> but very slow
▪ Overlapping patches repeat a lot of the same convolutions

Question 2

Q

Fully Convolutional Network

Answer

A

Fully connected layers have fixed dimensions and
throw away spatial coordinates
o Can be seen as a convolution with kernels
that cover the entire input region
Train with patches of size PxP to predict one value
o Pros:
▪ Smaller patches -> less memory -> faster training
▪ Can combine patches from different sources -> increase diversity & converge
faster
▪ Patch sampling allows us to control class balancing
▪ Sample more from difficult locations to control difficulty of training
procedure
o Cons:
▪ Borders between objects are not explicitly defined
▪ Only the label of the central pixel may be available
▪ Output is usually not full resolution
Low resolution: consequence of pooling layers
o But pooling is good: increases receptive field and reduces size of feature maps
Full resolution output by:
o Combining multiple low-res results
o Avoiding low resolution (no pooling)
o Upscaling: interpolation or learning de-convolution filters
Convolutions at original image resolution are very expensive
o Effective receptive field is also very small
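A minimal sketch of why a fully connected layer can be seen as a convolution whose kernel covers the entire input region. It assumes a single-channel PxP patch and one output unit; the names `fc_layer` and `conv_valid` and all numbers are illustrative, not from the lecture. Applied to a larger image, the same weights give a map of predictions instead of one value.

```python
def fc_layer(patch, weights):
    """Fully connected unit: flatten the patch and take a dot product."""
    flat_x = [v for row in patch for v in row]
    flat_w = [v for row in weights for v in row]
    return sum(x * w for x, w in zip(flat_x, flat_w))


def conv_valid(image, kernel):
    """Plain 'valid' 2D cross-correlation (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


weights = [[1, 0], [0, -1]]                 # hypothetical weights of one FC unit (P = 2)
patch = [[5, 2], [3, 4]]                    # one PxP training patch
image = [[5, 2, 7], [3, 4, 1], [0, 6, 8]]   # larger image at test time

fc_out = fc_layer(patch, weights)            # 1
conv_on_patch = conv_valid(patch, weights)   # [[1]], same value as the FC unit
conv_on_image = conv_valid(image, weights)   # [[1, 1], [-3, -4]], one prediction per location
```

The convolutional form reuses the trained weights at every location, which is what avoids repeating the computation patch by patch.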

Question 3

Q

Receptive Field

Answer

A

Area of the input image that is seen by a single neuron in a given
layer of the network
o Each neuron in activation map #1 sees an area of 3x3 in the input
o Each neuron in activation map #2 sees an area of 3x3 in map #1
▪ So it sees a 5x5 region of the original input
Factors that affect receptive field:
o Number of layers
o Filter size
o Presence of pooling
▪ Increases receptive field
▪ Decreases resolution
▪ Loses spatial information
Effective receptive field: influence of the input pixels follows a Gaussian distribution
o Occupies a fraction of the theoretical receptive field
o Depends on:
▪ Initialisation strategy
▪ Non-linearity
▪ Type of layers used in the network
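The theoretical receptive field can be computed layer by layer. The sketch below assumes a plain chain of convolution and pooling layers described only by kernel size and stride; `receptive_field` is a hypothetical helper, not part of any library. It reproduces the 3x3 -> 5x5 example above and shows how pooling enlarges the field.

```python
def receptive_field(layers):
    """Theoretical receptive field (side length in input pixels) of a layer chain.

    layers: list of (kernel_size, stride) tuples, first layer first.
    """
    rf = 1      # a single input pixel sees only itself
    jump = 1    # distance in input pixels between neighbouring units
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf


two_convs = receptive_field([(3, 1), (3, 1)])               # 5: two 3x3 convolutions
with_pooling = receptive_field([(3, 1), (2, 2), (3, 1)])    # 8: 2x2 pooling in between
```

This gives the theoretical size only; the effective receptive field is typically a fraction of it.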

Question 4

Q

Dilated Convolutions

Answer

A

Aims to efficiently increase the receptive field
Exponentially expanding receptive field
o Based on dilation rate
Can be plugged into existing architectures
No pooling, no subsampling
Allows dense prediction at full resolution
No new filter weights are added; the same convolution is only applied
with gaps between the filter taps
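A small 1D sketch of a dilated convolution, assuming "valid" borders and stride 1. The function names are illustrative. It shows that the same three weights cover a wider input span when the dilation rate grows, and that stacking layers with dilation rates 1, 2, 4 enlarges the receptive field much faster than stacking ordinary convolutions.

```python
def dilated_conv1d(signal, kernel, dilation):
    """'Valid' 1D cross-correlation with `dilation` samples between filter taps."""
    span = (len(kernel) - 1) * dilation + 1   # input samples covered by one output
    return [
        sum(signal[i + k * dilation] * w for k, w in enumerate(kernel))
        for i in range(len(signal) - span + 1)
    ]


def stacked_receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 convolutions with the given dilation rates."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


signal = list(range(10))
kernel = [1, 1, 1]

dense = dilated_conv1d(signal, kernel, dilation=1)     # [3, 6, 9, 12, 15, 18, 21, 24]
dilated = dilated_conv1d(signal, kernel, dilation=2)   # [6, 9, 12, 15, 18, 21]

plain_stack = stacked_receptive_field(3, [1, 1, 1])    # 7
dilated_stack = stacked_receptive_field(3, [1, 2, 4])  # 15
```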

Question 5

Q

Deconvolution Network

Answer

A

Learn filters to do up-sampling
Unpooling: remember which element was max
o Fill that element with input value, all other
elements with 0
To learn a filter to be used during upscaling:
o Multiply filter coefficients by input value
o Slide filter by stride value
o Sum where outputs overlap
Can also do up-sampling with nearest neighbour
+ same convolutions with stride 1
Learned deconvolution often leaves checkerboard artifacts behind
(outputs overlap unevenly)
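The unpooling and learned up-scaling steps above can be sketched in 1D. This is an illustration under simplifying assumptions (one channel, no padding, hand-picked filter values); the function names are hypothetical. The last example uses a constant input to show the uneven overlap that produces checkerboard-like patterns when the kernel size is not divisible by the stride.

```python
def max_pool_with_indices(values, size=2):
    """Non-overlapping max pooling that remembers where each maximum came from."""
    pooled, indices = [], []
    for start in range(0, len(values) - size + 1, size):
        window = values[start:start + size]
        best = max(range(size), key=lambda k: window[k])
        pooled.append(window[best])
        indices.append(start + best)
    return pooled, indices


def unpool(pooled, indices, length):
    """Put each value back at its remembered position, zeros everywhere else."""
    out = [0] * length
    for value, index in zip(pooled, indices):
        out[index] = value
    return out


def transposed_conv1d(values, kernel, stride):
    """Multiply the filter by each input value, slide by `stride`, sum overlaps."""
    out = [0] * ((len(values) - 1) * stride + len(kernel))
    for i, value in enumerate(values):
        for k, weight in enumerate(kernel):
            out[i * stride + k] += value * weight
    return out


pooled, indices = max_pool_with_indices([1, 3, 2, 0, 5, 4])   # [3, 2, 5], [1, 2, 4]
restored = unpool(pooled, indices, length=6)                   # [0, 3, 2, 0, 5, 0]

# Constant input, kernel size 3, stride 2: overlapping positions receive two
# contributions, the others only one, hence the alternating pattern.
upsampled = transposed_conv1d([1, 1, 1], kernel=[1, 1, 1], stride=2)
# [1, 1, 2, 1, 2, 1, 1]
```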

Question 6

Q

U-Net

Answer

A

Implements both down- and up-sampling
Each 2x2 up-convolution halves the
number of feature channels
Skip-connections: add back details to
context
o Sometimes called concatenation, because the
feature maps are concatenated
Can be trained with little data
Often uses heavy data augmentation
Uses cross-entropy loss at pixel level
and loss weight to enforce good
segmentation at object borders
Problem: input size and feature map sizes need to be configured to correspond to each other,
often difficult to do
o Particularly between skip-connections, especially difficult if cropping is not symmetric
o Can use same convolutions, feature maps now have exactly the same size -> easy to
concatenate
▪ But this introduces artifacts, because we have to do zero-padding
Dense layers in the bottom part maximise the field of view of the network
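The size bookkeeping problem can be made concrete with a little arithmetic. The sketch below assumes the layout of the original U-Net (two unpadded 3x3 convolutions per level, 2x2 max pooling with stride 2, 2x2 up-convolutions, four resolution levels below the input); `unet_output_size` is a hypothetical helper. It returns `None` when a feature map has an odd size before pooling, which is exactly the situation where cropping for the skip-connections becomes asymmetric.

```python
def unet_output_size(input_size, depth=4, convs_per_level=2, kernel=3):
    """Output side length of a U-Net with unpadded convolutions, or None if invalid."""
    shrink = convs_per_level * (kernel - 1)   # pixels lost per level by valid convolutions
    size = input_size
    for _ in range(depth):                    # contracting path
        size -= shrink
        if size <= 0 or size % 2:             # must be even to pool cleanly
            return None
        size //= 2
    size -= shrink                            # bottom level
    for _ in range(depth):                    # expanding path
        if size <= 0:
            return None
        size = size * 2 - shrink              # up-convolution, then valid convolutions
    return size if size > 0 else None


original = unet_output_size(572)   # 388, the tile sizes used in the original U-Net paper
invalid = unet_output_size(570)    # None: an odd-sized map appears before pooling
```

With "same" convolutions `shrink` would be zero and any input size divisible by 2^depth would work, at the cost of zero-padding artifacts.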

Question 7

Q

nnU-Net

Answer

A

nnU-Net is a framework for U-Nets
that automatically configures the network
and its training parameters for you
o Thus no expert knowledge
required
Very fast
Provides a standardised baseline and out-of-the-box segmentation method

Question 8

Q

Image Registration

Answer

A

Aims to establish a spatial correspondence between two or more images
o E.g. given that one image shows the heart, where is
the heart located in the other image?
Mainly used to find geometrical, anatomical and functional
alignment for e.g.
o Disease monitoring, motion analysis and growth
analysis
Intra-patient image registration: images of the same patient,
e.g. to quantify change
Inter-patient image registration: images of different patients, e.g. to find structures present
in both patients
Image fusion: combine images from multiple sources into one image
o E.g. combining CT and PET scans into one image
Registration pipeline:
o Center alignment
o Translation
o Affine registration
o Deformable registration
▪ Aligning images into a
common coordinate
frame

Question 9

Q

Medical Image Representations

Answer

A

Continuous: using functions
o 𝐹, 𝑀: ℝ^d → ℝ
Discrete: using d-dimensional
matrix
Meta information (in world matrix)
o Pixel/voxel spacing
o Image origin
o Image direction/orientation
Transformations can also be non-parametric
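The meta information is what links the discrete matrix to physical space. A minimal sketch, assuming the convention used by common toolkits such as ITK (world = origin + direction · (spacing × index)); the function name and the example values are made up for illustration.

```python
def index_to_world(index, spacing, origin, direction):
    """Map a voxel index to world coordinates (e.g. millimetres)."""
    scaled = [i * s for i, s in zip(index, spacing)]        # apply voxel spacing
    return [
        o + sum(d * v for d, v in zip(row, scaled))         # rotate, then shift by origin
        for o, row in zip(origin, direction)
    ]


spacing = [0.5, 0.5, 2.0]                      # hypothetical voxel size in mm
origin = [-100.0, -100.0, 50.0]                # world position of voxel (0, 0, 0)
direction = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # identity orientation

world = index_to_world([10, 20, 3], spacing, origin, direction)
# [-95.0, -90.0, 56.0]
```

Two images with different spacing or origin can show the same anatomy at different indices, so registration should compare them in world coordinates.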

Question 10

Q

Objective Function

Answer

A

To measure whether a deformation is reasonable, we
need an objective function
How do we compute similarity between images?
o Mono-modal image similarity:
▪ Sum of Squared Differences (SSD): how
different the intensities are at
identical locations in the two images.
Assumes identity relationship between
intensities
▪ Normalised Cross Correlation (NCC):
assumes linear relationship between
intensities
o Multi-modal image similarity:
▪ Normalised Gradient Fields (NGF): assumes intensity changes (edges) occur at the same locations
▪ Mutual Information (MI): describes how well one image is explained by the
other image
▪ Modality Independent Neighbourhood Descriptor (MIND): exploits self-similarity
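The different assumptions of SSD, NCC and MI can be checked on a toy example. The sketch below works on flattened images with a few integer intensities; the implementations are deliberately simple (no binning, no normalisation of MI) and the test signals are invented for illustration.

```python
import math
from collections import Counter


def ssd(a, b):
    """Sum of squared differences: zero only if intensities are identical."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ncc(a, b):
    """Normalised cross correlation: +1 or -1 for a perfect linear relationship."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def mutual_information(a, b):
    """Mutual information (in nats) from the joint histogram of discrete intensities."""
    n = len(a)
    joint = Counter(zip(a, b))
    count_a, count_b = Counter(a), Counter(b)
    return sum(
        (c / n) * math.log((c / n) / ((count_a[x] / n) * (count_b[y] / n)))
        for (x, y), c in joint.items()
    )


fixed = [0, 1, 2, 3, 2, 1]
brighter = [2 * v + 10 for v in fixed]                    # linear intensity change
remapped = [{0: 7, 1: 3, 2: 9, 3: 0}[v] for v in fixed]   # arbitrary one-to-one mapping

ssd_linear = ssd(fixed, brighter)          # 799: SSD is fooled by the brightness change
ncc_linear = ncc(fixed, brighter)          # approximately 1.0: NCC handles it
ncc_remapped = ncc(fixed, remapped)        # far from 1: relationship is not linear
mi_same = mutual_information(fixed, fixed)
mi_remapped = mutual_information(fixed, remapped)   # same as mi_same
```

MI stays maximal under the arbitrary remapping because one image still fully explains the other, which is why it suits multi-modal pairs.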

Question 11

Q

Evaluation of Image Registration

Answer

A

Very difficult task
o Rarely a point-wise correspondence from one image to another available
Quantitative evaluation: exact definition of correct transformation
o Synthetic transformation applied to the images
o Hard to define realistic deformation
o Simplified problems
o Likely to get phantoms
o Auxiliary measures:
▪ Segmentation overlap
▪ Landmark error (error in identifying significant structures in an image)
▪ Quality of deformation field, e.g. number of foldings and smoothness
(deformation field: a large matrix of 3D translations for each voxel in
source and target image)
Evaluation must be independent of cost function or registration features
Highly depends on application
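The auxiliary measures can be sketched in a few lines. This is a simplified illustration: masks are sets of voxel coordinates, landmarks are point lists, and foldings are counted in 1D as a stand-in for checking the sign of the Jacobian determinant of a 3D deformation. All names and numbers are hypothetical.

```python
import math


def dice(mask_a, mask_b):
    """Segmentation overlap between two sets of voxel coordinates (1.0 = identical)."""
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def mean_landmark_error(warped_points, target_points):
    """Mean Euclidean distance between corresponding landmarks."""
    distances = [math.dist(p, q) for p, q in zip(warped_points, target_points)]
    return sum(distances) / len(distances)


def count_foldings_1d(displacement):
    """1D deformation x -> x + u(x): a folding occurs where the order of points flips."""
    mapped = [x + u for x, u in enumerate(displacement)]
    return sum(1 for left, right in zip(mapped, mapped[1:]) if right <= left)


overlap = dice({(0, 0), (0, 1), (1, 1)}, {(0, 1), (1, 1), (2, 2)})         # 2 * 2 / 6
error = mean_landmark_error([(0, 0, 0), (1, 1, 1)], [(3, 4, 0), (1, 1, 1)])  # 2.5
folded = count_foldings_1d([0, 0, 2, 0, 0])        # 1: point 2 jumps past point 3
smooth = count_foldings_1d([0, 0.5, 0.5, 0])       # 0
```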

Question 12

Q

Learning-Based Image Registration

Answer

A

Conventional: iterative optimisation for each image pair
-> time consuming
Learning based: train a neural network once, so that registering a new
image pair only requires a forward pass
But often no point-wise correspondence available
between registration network output and ground truth
o Medical experts cannot annotate a reference
deformation field
How to supervise a registration network:
o Supervised methods: use ground-truth
deformation field for training
o Self-supervised/unsupervised methods: use the cost function of conventional image
registration (similarity measure + regulariser) as loss function
o Weakly-supervised methods: are supervised with prior information (e.g.
segmentation masks)
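For the self-supervised/unsupervised case, the loss has the same two terms as a conventional cost function. A 1D sketch, assuming mean squared difference as similarity measure and squared finite differences of the displacement as regulariser; `warp_1d`, `registration_loss` and the weight `alpha` are illustrative choices. In a real network the displacement would be the network output and the loss would be back-propagated; here it is only evaluated.

```python
def warp_1d(moving, displacement):
    """Resample `moving` at x + u(x) with linear interpolation, clamped at the borders."""
    last = len(moving) - 1
    warped = []
    for x, u in enumerate(displacement):
        pos = min(max(x + u, 0.0), float(last))
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        warped.append((1 - frac) * moving[lo] + frac * moving[hi])
    return warped


def registration_loss(fixed, moving, displacement, alpha=0.1):
    """Similarity measure plus weighted regulariser, usable without ground truth."""
    warped = warp_1d(moving, displacement)
    similarity = sum((f - w) ** 2 for f, w in zip(fixed, warped)) / len(fixed)
    smoothness = sum((b - a) ** 2 for a, b in zip(displacement, displacement[1:]))
    return similarity + alpha * smoothness


fixed = [0, 0, 1, 2, 1, 0, 0, 0]
moving = [0, 0, 0, 1, 2, 1, 0, 0]    # same profile, shifted right by one sample

no_motion = registration_loss(fixed, moving, [0.0] * 8)      # 0.5
shift_by_one = registration_loss(fixed, moving, [1.0] * 8)   # 0.0
```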

Question 13

Q

Multi-Level Registration Networks

Answer

A

Avoid local minima
Speed up computations
Avoid foldings
Loss function:
o Add prior knowledge into
training by designing
specific loss functions
o Time consuming annotations are only required on training data
o Can also generate labels automatically
HyperMorph strategy: a way to make hyper-parameter tuning cheaper
o Trains a single model that covers a range of hyper-parameter values, instead of retraining and iteratively improving on previous optimal values
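The multi-level idea itself (coarse first, then refine) can be illustrated without a network. The sketch below is a plain coarse-to-fine search for a 1D integer shift, not a registration network; all names and signals are hypothetical. Each level only searches a small neighbourhood, yet the pyramid recovers a shift that a single-level search with the same neighbourhood cannot reach.

```python
def downsample(signal):
    """Halve the resolution by averaging neighbouring pairs."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal) - 1, 2)]


def msd_for_shift(fixed, moving, shift):
    """Mean squared difference over the overlap when `moving` is sampled at x + shift."""
    pairs = [
        (fixed[x], moving[x + shift])
        for x in range(len(fixed))
        if 0 <= x + shift < len(moving)
    ]
    return sum((f - m) ** 2 for f, m in pairs) / len(pairs)


def best_shift(fixed, moving, candidates):
    return min(candidates, key=lambda s: msd_for_shift(fixed, moving, s))


def multilevel_shift(fixed, moving, levels=3, radius=1):
    """Estimate the shift on the coarsest level, then double and refine it per level."""
    pyramid = [(fixed, moving)]
    for _ in range(levels - 1):
        f, m = pyramid[-1]
        pyramid.append((downsample(f), downsample(m)))
    shift = 0
    for f, m in reversed(pyramid):            # coarsest level first
        shift *= 2                            # carry the estimate to the finer grid
        shift = best_shift(f, m, range(shift - radius, shift + radius + 1))
    return shift


bump = [1, 2, 3, 4, 3, 2, 1]
fixed = [0] * 8 + bump + [0] * 17     # 32 samples, bump starts at index 8
moving = [0] * 12 + bump + [0] * 13   # same bump, starts at index 12

multi = multilevel_shift(fixed, moving)            # 4
single = best_shift(fixed, moving, range(-1, 2))   # limited to -1, 0 or 1
```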


(13 cards)