Computer Vision Techniques Flashcards

(67 cards)

1
Q

Median Background Subtraction

A

Temporal Averaging: simple average across frames

Spatiotemporal: Weighted average across frames using kernel

Temporal Median: per-pixel median across frames (a rank-order operation, robust to transient foreground objects)

These give us the background of an image which can be subtracted from each frame
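As a minimal sketch (plain Python, nested lists standing in for greyscale frames; the helper names are illustrative, not from any library), the temporal-median approach looks like:

```python
from statistics import median

def temporal_median_background(frames):
    """Estimate the background as the per-pixel median across frames."""
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[median(f[r][c] for f in frames) for c in range(cols)]
            for r in range(rows)]

def subtract_background(frame, background, threshold=20):
    """Mark pixels as foreground (1) where they differ from the background."""
    return [[1 if abs(p - b) > threshold else 0
             for p, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]

# A tiny 5-frame "video": static background of 10s, with a bright
# object (value 200) passing through one pixel in a single frame.
frames = [[[10, 10], [10, 10]] for _ in range(5)]
frames[2][0][1] = 200

bg = temporal_median_background(frames)   # the median ignores the outlier
mask = subtract_background(frames[2], bg) # the object shows up as foreground
```

Because the median discards outliers, a briefly passing object never contaminates the background estimate, unlike a simple average.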

2
Q

Mixture Of Gaussians Background Subtraction

A

Image is modelled as histogram.

The histogram can be represented as a mixture of gaussians.

Each gaussian has a weight, depending on its amplitude.

Calculate the probability of each pixel's colour under the model; pixels with low probability are considered foreground.

3
Q

Background Subtraction Methods

A

Median

Mixture of Gaussians

Kernel Approach

4
Q

Image Convolution

A

Applies template across image, calculated weighted sum.

Convolution flips kernel horizontally and vertically, cross correlation does not.
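The flip is the only difference between the two operations, which a small sketch makes concrete (plain Python, "valid"-region output only; the helper names are illustrative):

```python
def cross_correlate(image, kernel):
    """Slide the kernel over the image and take the weighted sum (valid region)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(kernel[i][j] * image[r + i][c + j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def convolve(image, kernel):
    """Convolution = cross-correlation with the kernel flipped in both axes."""
    flipped = [row[::-1] for row in kernel[::-1]]
    return cross_correlate(image, flipped)

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[0, 1],
          [0, 0]]   # asymmetric, so the two operations give different results

corr = cross_correlate(image, kernel)   # picks each window's top-right pixel
conv = convolve(image, kernel)          # picks each window's bottom-left pixel
```

For symmetric kernels (e.g. a Gaussian) the flip changes nothing, which is why the two terms are often used interchangeably in practice.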

5
Q

Convolution Vs Fourier

A

Direct convolution time complexity (N x N image, M x M template):
O(N^2 x M^2)

Fourier (via the FFT):
O(N^2 log N) - convolution becomes pointwise multiplication in the frequency domain.

6
Q

Statistical Filters

A

Mean (Direct Average/Boxcar filter):
- Smooths and blurs image

Median (sort the neighbourhood, take the middle value):
- Removes impulse noise while keeping edges sharp
- Commonly used in medical imaging

7
Q

Wavelets

A

Allow for scale-space analysis and simultaneous localisation in space and frequency.

8
Q

Gabor Wavelets

A

Complex function enveloped by a gaussian function.

Has a real and imaginary part, giving space and freq information simultaneously.

Better than Fourier but computationally expensive.

9
Q

Gabor Wavelet Applications

A

Texture Modelling and analysis:
- Iris texture measurements
- Face Feature Extraction for automatic face recognition system

Image Coding

Image Restoration

10
Q

Intensity and Spatial Processing

A

A histogram shows the number of pixels for colour/greyscale in an image.

Can normalise the histogram to follow a probability distribution.

Can normalise the x axis to range from zero to unity.

11
Q

Histogram Stretching

A

Sometimes the image's histogram is too narrow, leading to poor contrast.

Simply stretch the domain of the histogram linearly.
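A minimal sketch of the linear stretch, assuming a flat list of greyscale values (the helper name is illustrative):

```python
def stretch(pixels, new_min=0, new_max=255):
    """Linearly map the pixel range [old_min, old_max] onto [new_min, new_max]."""
    old_min, old_max = min(pixels), max(pixels)
    scale = (new_max - new_min) / (old_max - old_min)
    return [round(new_min + (p - old_min) * scale) for p in pixels]

# A narrow, low-contrast image: values cluster in [100, 120].
narrow = [100, 105, 110, 115, 120]
stretched = stretch(narrow)   # now spans the full [0, 255] range
```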

12
Q

What is an Edge?

A

An edge is a sharp change in intensity.

Can have:
- ramp edges: /. Gradual sloping edge.
- step edges: _|^. Sudden change in intensity.
- roof edges: ^. Usually lines.
13
Q

Edge Detection

A

The best way to find an edge is to take the derivative.

Two approaches:
- take the local maxima of the first derivative
- take the zero crossings of the second derivative

However, derivatives are sensitive to noise and will not work well for noisy edges.

14
Q

Edge Detection: Image Gradient

A

The first derivative of an image (the image gradient) can be approximated as two partial derivatives, one for x and one for y.

These can be approximated by convolving with these masks:

X: [-1 1]

Y: [-1
     1]
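Applying these masks amounts to taking differences of neighbouring pixels; a sketch on a step edge (plain Python, illustrative helper):

```python
def gradient(image):
    """Approximate the x and y partial derivatives with [-1, 1] differences."""
    rows, cols = len(image), len(image[0])
    gx = [[image[r][c + 1] - image[r][c] for c in range(cols - 1)]
          for r in range(rows)]
    gy = [[image[r + 1][c] - image[r][c] for c in range(cols)]
          for r in range(rows - 1)]
    return gx, gy

# A vertical step edge: dark left half, bright right half.
step = [[0, 0, 9, 9],
        [0, 0, 9, 9]]
gx, gy = gradient(step)   # gx responds at the edge column; gy is all zero
```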

15
Q

Edge Detection Kernels

A

Simple partial derivative masks will not capture diagonal or oriented edges, leading to the Roberts 2x2 masks.

2x2 masks have no centre of symmetry, which may shift detected edges. This led to the Prewitt 3x3 masks (rows/columns of -1, 0, 1).

Sobel masks additionally weight the centre row/column by 2, smoothing the image before edge detection.

The sum of all mask coefficients is always 0, so the response over a constant region (zero gradient) is zero: flat areas are left unchanged.

16
Q

Feature Extraction

A

Evidence Gathering, Active Contours, Statistical Shapes, Skeletonization, etc.

Can be done through matching low level features (Template matching, Hough)

Can be done through evolution (snakes)

17
Q

Template Matching

A

Correlation and Convolution using template kernel. Result maximum highlights feature location.

Can implement with Fourier.

18
Q

Convolution vs Correlation with Fourier

A

Convolution:
- F(f * g) = F(f) F(g)

Correlation:
- F(f x g) = F(f) [F(g)]*

where [ ]* denotes the complex conjugate.

The conjugate is simply where you flip the sign of the imaginary part.
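Both theorems can be checked numerically with a naive DFT (a sketch, not an efficient FFT; the circular-correlation convention here is the one that matches F(f)·[F(g)]*):

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform of a 1D sequence."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def circular_correlation(f, g):
    """(f x g)[m] = sum_n f(n) * g((n - m) mod N)."""
    N = len(f)
    return [sum(f[n] * g[(n - m) % N] for n in range(N)) for m in range(N)]

f = [1.0, 2.0, 0.0, -1.0]
g = [0.0, 1.0, 3.0, 2.0]

# Left side: transform of the correlation computed in the signal domain.
lhs = dft(circular_correlation(f, g))
# Right side: pointwise product with the conjugated spectrum of g.
rhs = [Fk * Gk.conjugate() for Fk, Gk in zip(dft(f), dft(g))]
ok = all(abs(a - b) < 1e-9 for a, b in zip(lhs, rhs))
```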

19
Q

Hough Transform

A

Achieves equivalent performance to template matching but is faster.

Defines an accumulator space over the line parameters (m, c), using the equation of a line y = mx + c.

If points lie on the same line in the image space, their corresponding lines will intersect in the accumulator space, giving the (m, c) of the line in the image.

Can form accumulator map with max values corresponding to the most prominent lines.
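A sketch of the voting step, assuming a coarse integer discretisation of (m, c) (all names are illustrative):

```python
from collections import Counter

def hough_lines(points, m_values, c_values):
    """Vote in a discretised (m, c) accumulator: each point (x, y) votes for
    every line y = m*x + c that passes through it."""
    acc = Counter()
    for x, y in points:
        for m in m_values:
            c = y - m * x
            if c in c_values:          # keep only votes inside the accumulator
                acc[(m, c)] += 1
    return acc

# Three collinear points on y = 2x + 1, plus one outlier.
points = [(0, 1), (1, 3), (2, 5), (4, 0)]
m_values = range(-3, 4)
c_values = set(range(-10, 11))

acc = hough_lines(points, m_values, c_values)
best = acc.most_common(1)[0]   # the (m, c) cell with the most votes
```

The collinear points all vote for the cell (m=2, c=1), so it accumulates the maximum count even though an outlier is present.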

20
Q

Active Contours

A

For unknown arbitrary shapes - extract by evolution.

Alternative segmentation method to thresholding.

Start with a contour (set of points), forces applied to contour from image and contour itself.

Contour will move towards object, segmenting it.

21
Q

Active Contour ‘Basics’

A

Aiming to evolve to minimum energy solution.

Energy Functional: Total energy integrates the sum of Internal and Image energy as well as the external constraints.

22
Q

Internal and Image Energy

A

Internal Energy: controls the shape of the snake, penalises stretching (alpha) and bending (beta)

Image Energy: Pulls the snake towards useful parts in the image. Sums line, edge and term energy.

Line - attracts snake to bright/dark regions.
Edge - attracts snake to edges.
Term - attracts snake to endpoints, corners or curves.

23
Q

Geometric Active Contours

A

An imperfect algorithm which can make mistakes; it is driven purely by the image and is not knowledge-based like some alternatives.

24
Q

Segmentation vs Object Detection

A

Image Segmentation attempts to extract exact object boundaries.

Object Detection overlays bounding boxes around objects.

25

Q

UNet: Segmentation Network

A

Image segmentation CNN.

Activation functions: ReLU for all layers other than the final Softmax.

All layers are convolution or pooling.

The encoder is symmetric with respect to the decoder (U shape).

Cross-entropy loss, trained using backpropagation.

Transfer learning can be used to apply it to new datasets.

26

Q

UNet Process

A

Left side down-samples (encodes), right side up-samples (decodes).

Left side: convolution followed by max pooling, repeated 4 times.

Right side: up-convolution followed by up-sampling, repeated 3 times.

Each up-sampling layer is concatenated with the parallel encoding layer.

27

Q

UNet Up-sampling

A

Takes the smaller image from max pooling and resizes it by padding with zeros.

28

Q

ReLU vs Softmax

A

ReLU: f(x) = max(0, x). Output = input if positive, otherwise 0.

Softmax: extension of the sigmoid used in multiclass classification; outputs a probability distribution over the classes.
29

Q

Transfer Learning

A

UNet needs to be trained on thousands of images. Pre-trained UNets can be applied to a new dataset using transfer learning.

Involves freezing some layers and replacing the final ones, then fine-tuning on the new dataset.

30

Q

Object Description Techniques

A

Region based: histograms, geometric properties, moments.

Boundary based: discrete chain codes, curves.

31

Q

Object Description Approach

A

Extract features from an image. Describe the features. Classify, label, match, recognise, reason with descriptors.

32

Q

Object Description Applications

A

Classification/Recognition: match image descriptors to a descriptor database.

Target detection: locate specific 'targets' (like wings on a plane) based on their descriptors, matching them against descriptors extracted from an image.

33

Q

Object Description Desirable Properties

A

Complete: different objects must have different descriptors.

Congruent: similar objects must have similar descriptors.

Compact: efficient (i.e. small quantity of information).

Invariant: recognise objects independent of changes.
34

Q

Region Description

A

Geometric properties characterise a region independent of its chromatic attributes (colour):

Area, perimeter, compactness, dispersion, moments.

35

Q

Area

A

The area of a region is the count of all 'white' pixels (1s in a binary image) multiplied by the defined area of a pixel.

Invariant to translation and rotation. Not invariant to scale.

36

Q

Perimeter

A

The sum of distances between consecutive pixels forming the boundary of a region.

Invariant to translation and rotation. Not invariant to scale.
37

Q

Compactness

A

The efficiency with which a boundary encloses an area; a circle has a compactness of 1 (the maximum value).

The relationship between area and perimeter: 4 * pi * Area / Perimeter^2

Invariant to scale, translation and rotation.
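The formula can be checked directly: a circle scores exactly 1, a square scores pi/4 (a sketch, assuming area and perimeter are already measured):

```python
import math

def compactness(area, perimeter):
    """4*pi*Area / Perimeter^2 -- equals 1 for a perfect circle."""
    return 4 * math.pi * area / perimeter ** 2

r = 5.0
circle = compactness(math.pi * r ** 2, 2 * math.pi * r)   # exactly 1
square = compactness(4.0, 8.0)   # unit square of side 2: pi/4, about 0.785
```

Because both the numerator and denominator scale with the square of size, the ratio is scale invariant, matching the card above.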
38

Q

Dispersion

A

Measures occupancy as the ratio between the area of the circle enclosing the region and the area of the region: Area of enclosing circle / Region area.

Invariant to scale, translation and rotation.

Can also be taken as the ratio between the enclosing and inscribed circles. Shows region complexity.

39

Q

Moments

A

A moment describes an image by treating it as a mass distribution.

The moment M_pq is the sum of all pixel values, each multiplied by its x-position to the power p and its y-position to the power q.

p and q determine the type of moment.

40

Q

Moment Examples

A

M_00: zero order, total mass. Corresponds to area for binary regions.

M_10, M_01: first order, centre of mass. p and q select the x and y centre-of-mass coordinates (after dividing by M_00).
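Both examples follow directly from the definition; a sketch on a small binary region (illustrative helper, with y indexing rows and x indexing columns):

```python
def moment(image, p, q):
    """M_pq = sum over all pixels of x^p * y^q * intensity(x, y)."""
    return sum(x ** p * y ** q * v
               for y, row in enumerate(image)
               for x, v in enumerate(row))

# Binary region: a 2x2 block of 1s with its top-left corner at (x=1, y=1).
img = [[0, 0, 0, 0],
       [0, 1, 1, 0],
       [0, 1, 1, 0],
       [0, 0, 0, 0]]

area = moment(img, 0, 0)        # zero-order moment = area = 4
cx = moment(img, 1, 0) / area   # centre-of-mass x coordinate = 1.5
cy = moment(img, 0, 1) / area   # centre-of-mass y coordinate = 1.5
```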
41

Q

Moment Invariance

A

The zero-order moment is invariant to translation and rotation. All other M_pq are not invariant to scale, translation or rotation.

Centralised moments introduce translation invariance.

Normalised centralised moments introduce scale invariance.

Invariant moments are invariant to all three.

42

Q

Centralised Moments

A

Moments made invariant to translation: the shape is effectively translated to the origin by subtracting the centre of mass (M_10/M_00, M_01/M_00) from each x and y.

First-order centralised moments equal 0, as the centre of mass is the origin.

Second-order centralised moments can be written in terms of first- and second-order regular moments (as can all centralised moments). They describe the spread (variance) of the region.
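A sketch showing both properties (illustrative helper; the first-order central moments vanish, while the second-order ones measure spread about the centroid):

```python
def central_moment(image, p, q):
    """mu_pq: moments taken about the centre of mass (translation invariant)."""
    m00 = sum(v for row in image for v in row)
    cx = sum(x * v for y, row in enumerate(image)
             for x, v in enumerate(row)) / m00
    cy = sum(y * v for y, row in enumerate(image)
             for x, v in enumerate(row)) / m00
    return sum((x - cx) ** p * (y - cy) ** q * v
               for y, row in enumerate(image)
               for x, v in enumerate(row))

img = [[0, 1, 1],
       [0, 1, 1]]

mu10 = central_moment(img, 1, 0)   # first-order central moment: always 0
mu20 = central_moment(img, 2, 0)   # second-order: horizontal spread
```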
43

Q

Normalised Central Moments

A

Invariant to translation and scale.

The normalised central moment of order p+q is obtained by dividing the central moment of the same order by a normalisation factor:

gamma = (p+q)/2 + 1

n_pq = u_pq / u_00^gamma

44

Q

Invariant Moments

A

Introduce rotation invariance to complete total invariance. Includes the 7 Hu moments.

M1 is the sum of the two second-order normalised central moments, which can be written in terms of centralised moments:

M1 = (u_20 + u_02) / u_00^2

45

Q

Why are Invariant Moments rotation invariant?

A

The sum of the second-order central moments (u_20 + u_02) stays the same when a shape is rotated: the squared distance of each pixel from the centroid does not change under rotation, so the total weighted sum over the region remains constant. Therefore u_20 + u_02 is rotation invariant.

46

Q

Advantages and Disadvantages of Moments

A

Advantages:
- work on greyscale as well as binary objects
- can utilise pixel brightness
- access to detail

Disadvantages:
- assume only one object is present
- computationally expensive
- high-order moments have large values and are sensitive to noise
47

Q

Fourier Descriptors

A

Boundary based: calculate the Fourier expansion of image curves, then define descriptors and distances from it.

Two common types:
- Polar Fourier descriptors (angular)
- Elliptic Fourier descriptors (complex curve)

48

Q

Elliptic Fourier Descriptor

A

Obtain a complex function from a closed 2D shape, perform a Fourier expansion, define descriptors from the Fourier coefficients.

49

Q

Elliptic Fourier Trigonometric Expansion

A

Given the complex curve c(t) = x(t) + j*y(t), we can expand x and y as Fourier series; the expansion simplifies to a vector equation with coefficients a_xk, b_xk, a_yk, b_yk.

50

Q

Invariant Elliptic Fourier Descriptors

A

Translation: has no effect on the higher-frequency coefficients (k != 0); they remain the same.

Scale: a multiplier is applied to every coefficient, so ratios of coefficients are unchanged: a'_xk / a'_x1 = a_xk / a_x1.

Rotation: a rotation matrix is applied, so the coefficients are multiplied by the accompanying rotation terms.

51

Q

Orthonormal Property

A

Rotation mixes the x and y coefficients, but it does not change the overall energy (length) of each harmonic component. So to get rotation-invariant descriptors, use the magnitudes of the coefficient vectors, not their raw values.

52

Q

Multi-scale Elliptic Fourier Descriptors

A

In the defining equations, k runs to infinity. As the number of frequencies increases there is more detail; however, large k makes the process computationally expensive.
53

Q

Similarity

A

The distance between features plotted in a feature space is a measure of similarity.

For an N x N image used directly as a feature vector, the feature space has N^2 dimensions, so a distance metric for high-dimensional spaces is required.

54

Q

Distance Metrics

A

Must satisfy three conditions to be used:
- D(a,b) > 0 for a != b (positivity)
- D(a,b) = D(b,a) (symmetry)
- D(a,c) <= D(a,b) + D(b,c) (triangle inequality)

This includes:
- Manhattan distance
- Euclidean distance
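Both metrics in a few lines, with a spot-check of the triangle inequality on sample points (illustrative helpers):

```python
def manhattan(a, b):
    """L1 distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def euclidean(a, b):
    """L2 distance: straight-line distance in feature space."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

p, q, r = (0, 0), (3, 4), (6, 0)
d1 = manhattan(p, q)   # 3 + 4 = 7
d2 = euclidean(p, q)   # sqrt(9 + 16) = 5.0
# Triangle inequality: going via q is never shorter than going direct.
triangle_ok = euclidean(p, r) <= euclidean(p, q) + euclidean(q, r)
```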
55

Q

Feature Examples

A

Image intensity

Fourier-transformed image

Fourier-Mellin-transformed image

Image moments

Object properties

56

Q

Supervised Learning

A

Given labelled data, classify unseen data.

57

Q

K-NN

A

Classify by the K closest points in feature space.

Algorithm:
- Choose K
- Compute all distances
- Sort the distances
- Select the K smallest
- Vote among these K examples
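The algorithm above, sketched in plain Python with squared Euclidean distance (the names are illustrative):

```python
from collections import Counter

def knn_classify(train, query, k):
    """train: list of (feature_vector, label) pairs.
    Sort by distance to the query, then vote among the k nearest."""
    nearest = sorted(train,
                     key=lambda ex: sum((a - b) ** 2
                                        for a, b in zip(ex[0], query)))
    votes = Counter(label for _, label in nearest[:k])
    return votes.most_common(1)[0][0]

# Two well-separated clusters of labelled points.
train = [((1, 1), 'cat'), ((1, 2), 'cat'), ((2, 1), 'cat'),
         ((8, 8), 'dog'), ((9, 8), 'dog')]
label = knn_classify(train, (2, 2), k=3)   # the 3 nearest are all 'cat'
```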
58

Q

Unsupervised Learning

A

Given unlabelled data, group it into classes based on its features.

59

Q

K-Means Clustering

A

Represent unlabelled data by K centres.

Algorithm:
- Choose K
- Randomly assign examples to K sets
- Compute the mean value of each set
- Re-assign each point to the nearest mean
- Repeat until the mean values remain unchanged
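A minimal 1D sketch of the loop above (naive initialisation from the first K points rather than a random assignment; the names are illustrative):

```python
def kmeans_1d(points, k, iterations=20):
    """Minimal 1D k-means: alternate assignment and mean update."""
    centres = points[:k]   # naive initialisation from the first k points
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assignment step: send each point to its nearest centre.
            nearest = min(range(k), key=lambda i: abs(p - centres[i]))
            clusters[nearest].append(p)
        # Update step: move each centre to the mean of its cluster.
        centres = [sum(c) / len(c) if c else centres[i]
                   for i, c in enumerate(clusters)]
    return sorted(centres)

data = [1.0, 1.2, 0.8, 10.0, 10.2, 9.8]
centres = kmeans_1d(data, k=2)   # converges to the two cluster means
```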
60

Q

Classification vs Recognition

A

Classification: assign an input to a known category.

Recognition: identify who or what an input is.

61

Q

Identification vs Verification

A

Verification: 1-to-1. Is this person who they say they are? Compare one image to one known identity.

Identification: 1-to-many. Who is this person? Compare one image to many identities. A CMC curve is used.

62

Q

Cumulative Match Curve

A

A CMC curve shows how often the correct identity appears within the top N guesses made by a recognition system.

As the rank N increases, the cumulative identification rate increases.
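A sketch of how the curve is computed, assuming we already know the rank at which each probe's true identity appeared (the names are illustrative):

```python
def cmc(rank_of_correct, num_ranks):
    """rank_of_correct: for each probe, the 1-based rank at which the true
    identity appeared. Returns the cumulative identification rate per rank."""
    n = len(rank_of_correct)
    return [sum(1 for r in rank_of_correct if r <= k) / n
            for k in range(1, num_ranks + 1)]

# 5 probes: the correct identity was found at ranks 1, 1, 2, 3, 1.
curve = cmc([1, 1, 2, 3, 1], num_ranks=3)
# curve[0] is the rank-1 identification rate (= CCR); the curve is
# non-decreasing and reaches 1.0 once every probe's identity is covered.
```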
63

Q

Correct Recognition Rate (CCR)

A

Total number of subjects correctly identified at rank 1 / total number of subjects. Equivalent to rank-1 accuracy.

64

Q

Intra/Inter Class Distances

A

Intra-class distances: distances between images of the same class.

Inter-class distances: distances between images of different classes.

Good recognition capability requires small intra-class distances and large inter-class distances. The two distributions should not overlap much, meaning the separation between subjects is large.

65

Q

Choosing a Verification Threshold

A

Set a distance threshold based on the intra/inter class variation. If the distance is less than the threshold, accept the match for verification.

Raising the threshold accepts more matches, lowering the false reject rate but raising the false accept rate; lowering it does the opposite. A trade-off must be decided.

Often the threshold is chosen to give the equal error rate, where the false accept and false reject rates are equal.
66

Q

Receiver Operating Characteristic

A

The ROC curve helps evaluate binary classification performance: false positive rate plotted against true positive rate.

A perfect system hugs the top-left corner; a random system lies on the line y = x.

The area under the ROC curve summarises performance; 1 = perfect classifier.

67

Q

False Accept vs False Reject Rate

A

A low false accept rate gives security (impostors are kept out). A low false reject rate gives convenience in applications such as banking (genuine customers are not locked out).