Computer Vision Techniques Flashcards

(67 cards)

1
Q

Median Background Subtraction

A

Temporal Averaging: simple average across frames

Spatiotemporal: Weighted average across frames using kernel

Temporal Median: per-pixel median across frames (a rank-order operation, robust to transient foreground objects)

These give us the background of an image which can be subtracted from each frame
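As a minimal sketch (plain Python, nested lists standing in for greyscale frames; the helper names are illustrative, not from any library), the temporal-median approach looks like:

```python
from statistics import median

def temporal_median_background(frames):
    """Estimate the background as the per-pixel median across frames."""
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[median(f[r][c] for f in frames) for c in range(cols)]
            for r in range(rows)]

def subtract_background(frame, background, threshold=20):
    """Mark pixels as foreground (1) where they differ from the background."""
    return [[1 if abs(p - b) > threshold else 0
             for p, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]

# A tiny 5-frame "video": static background of 10s, with a bright
# object (value 200) passing through one pixel in a single frame.
frames = [[[10, 10], [10, 10]] for _ in range(5)]
frames[2][0][1] = 200

bg = temporal_median_background(frames)   # the median ignores the outlier
mask = subtract_background(frames[2], bg) # the object shows up as foreground
```

Because the median discards outliers, a briefly passing object never contaminates the background estimate, unlike a simple average.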

2
Q

Mixture Of Gaussians Background Subtraction

A

Image is modelled as histogram.

The histogram can be represented as a mixture of gaussians.

Each gaussian has a weight, depending on its amplitude.

Calculate the probability of each pixel's colour under the model; pixels with low probability are considered foreground.

3
Q

Background Subtraction Methods

A

Median

Mixture of Gaussians

Kernel Approach

4
Q

Image Convolution

A

Applies template across image, calculated weighted sum.

Convolution flips kernel horizontally and vertically, cross correlation does not.
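The flip is the only difference between the two operations, which a small sketch makes concrete (plain Python, "valid"-region output only; the helper names are illustrative):

```python
def cross_correlate(image, kernel):
    """Slide the kernel over the image and take the weighted sum (valid region)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(kernel[i][j] * image[r + i][c + j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def convolve(image, kernel):
    """Convolution = cross-correlation with the kernel flipped in both axes."""
    flipped = [row[::-1] for row in kernel[::-1]]
    return cross_correlate(image, flipped)

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[0, 1],
          [0, 0]]   # asymmetric, so the two operations give different results

corr = cross_correlate(image, kernel)   # picks each window's top-right pixel
conv = convolve(image, kernel)          # picks each window's bottom-left pixel
```

For symmetric kernels (e.g. a Gaussian) the flip changes nothing, which is why the two terms are often used interchangeably in practice.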

5
Q

Convolution Vs Fourier

A

Direct convolution time complexity (N x N image, M x M template):
O(N^2 x M^2)

Fourier (via the FFT):
O(N^2 log N) - convolution becomes pointwise multiplication in the frequency domain.

6
Q

Statistical Filters

A

Mean (Direct Average/Boxcar filter):
- Smooths and blurs image

Median (sort the neighbourhood, take the middle value):
- Removes impulse noise while keeping edges sharp
- Commonly used in medical imaging

7
Q

Wavelets

A

Allow for scale-space analysis and simultaneous localisation in space and frequency.

8
Q

Gabor Wavelets

A

Complex function enveloped by a gaussian function.

Has a real and imaginary part, giving space and freq information simultaneously.

Better than Fourier but computationally expensive.

9
Q

Gabor Wavelet Applications

A

Texture Modelling and analysis:
- Iris texture measurements
- Face Feature Extraction for automatic face recognition system

Image Coding

Image Restoration

10
Q

Intensity and Spatial Processing

A

A histogram shows the number of pixels for colour/greyscale in an image.

Can normalise the histogram to follow a probability distribution.

Can normalise the x axis to range from zero to unity.

11
Q

Histogram Stretching

A

Sometimes the image's histogram is too narrow, leading to poor contrast.

Simply stretch the domain of the histogram linearly.
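A minimal sketch of the linear stretch, assuming a flat list of greyscale values (the helper name is illustrative):

```python
def stretch(pixels, new_min=0, new_max=255):
    """Linearly map the pixel range [old_min, old_max] onto [new_min, new_max]."""
    old_min, old_max = min(pixels), max(pixels)
    scale = (new_max - new_min) / (old_max - old_min)
    return [round(new_min + (p - old_min) * scale) for p in pixels]

# A narrow, low-contrast image: values cluster in [100, 120].
narrow = [100, 105, 110, 115, 120]
stretched = stretch(narrow)   # now spans the full [0, 255] range
```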

12
Q

What is an Edge?

A

An edge is a sharp change in intensity.

Can have:
- ramp edges: /. Gradual sloping edge.
- step edges: _|^. Sudden change in intensity.
- roof edges: ^. Usually lines.
13
Q

Edge Detection

A

The best way to find an edge is to take the derivative.

Two approaches:
- take the local maxima of the first derivative
- take the zero crossings of the second derivative

However, derivatives are sensitive to noise and will not work well for noisy edges.

14
Q

Edge Detection: Image Gradient

A

The first derivative of an image (the image gradient) can be approximated as two partial derivatives, one for x and one for y.

These can be approximated by convolving with these masks:

X: [-1 1]

Y: [-1
     1]
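Applying these masks amounts to taking differences of neighbouring pixels; a sketch on a step edge (plain Python, illustrative helper):

```python
def gradient(image):
    """Approximate the x and y partial derivatives with [-1, 1] differences."""
    rows, cols = len(image), len(image[0])
    gx = [[image[r][c + 1] - image[r][c] for c in range(cols - 1)]
          for r in range(rows)]
    gy = [[image[r + 1][c] - image[r][c] for c in range(cols)]
          for r in range(rows - 1)]
    return gx, gy

# A vertical step edge: dark left half, bright right half.
step = [[0, 0, 9, 9],
        [0, 0, 9, 9]]
gx, gy = gradient(step)   # gx responds at the edge column; gy is all zero
```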

15
Q

Edge Detection Kernels

A

Simple partial derivative masks will not capture diagonal or oriented edges, leading to the Roberts 2x2 masks.

2x2 masks have no centre of symmetry, which may shift detected edges. This led to the Prewitt 3x3 masks (rows/columns of -1, 0, 1).

Sobel masks additionally weight the centre row/column by 2, smoothing the image before edge detection.

The sum of all mask coefficients is always 0, so the response over a constant region (zero gradient) is zero: flat areas are left unchanged.

16
Q

Feature Extraction

A

Evidence Gathering, Active Contours, Statistical Shapes, Skeletonization, etc.

Can be done through matching low level features (Template matching, Hough)

Can be done through evolution (snakes)

17
Q

Template Matching

A

Correlation and Convolution using template kernel. Result maximum highlights feature location.

Can implement with Fourier.

18
Q

Convolution vs Correlation with Fourier

A

Convolution:
- F(f * g) = F(f) F(g)

Correlation:
- F(f x g) = F(f) [F(g)]*

where [ ]* denotes the complex conjugate.

The conjugate is simply where you flip the sign of the imaginary part.
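Both theorems can be checked numerically with a naive DFT (a sketch, not an efficient FFT; the circular-correlation convention here is the one that matches F(f)·[F(g)]*):

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform of a 1D sequence."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def circular_correlation(f, g):
    """(f x g)[m] = sum_n f(n) * g((n - m) mod N)."""
    N = len(f)
    return [sum(f[n] * g[(n - m) % N] for n in range(N)) for m in range(N)]

f = [1.0, 2.0, 0.0, -1.0]
g = [0.0, 1.0, 3.0, 2.0]

# Left side: transform of the correlation computed in the signal domain.
lhs = dft(circular_correlation(f, g))
# Right side: pointwise product with the conjugated spectrum of g.
rhs = [Fk * Gk.conjugate() for Fk, Gk in zip(dft(f), dft(g))]
ok = all(abs(a - b) < 1e-9 for a, b in zip(lhs, rhs))
```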

19
Q

Hough Transform

A

Achieves equivalent performance to template matching but is faster.

Defines an accumulator space over the line parameters (m, c), using the equation of a line y = mx + c.

If points lie on the same line in the image space, their corresponding lines will intersect in the accumulator space, giving the (m, c) of the line in the image.

Can form accumulator map with max values corresponding to the most prominent lines.
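A sketch of the voting step, assuming a coarse integer discretisation of (m, c) (all names are illustrative):

```python
from collections import Counter

def hough_lines(points, m_values, c_values):
    """Vote in a discretised (m, c) accumulator: each point (x, y) votes for
    every line y = m*x + c that passes through it."""
    acc = Counter()
    for x, y in points:
        for m in m_values:
            c = y - m * x
            if c in c_values:          # keep only votes inside the accumulator
                acc[(m, c)] += 1
    return acc

# Three collinear points on y = 2x + 1, plus one outlier.
points = [(0, 1), (1, 3), (2, 5), (4, 0)]
m_values = range(-3, 4)
c_values = set(range(-10, 11))

acc = hough_lines(points, m_values, c_values)
best = acc.most_common(1)[0]   # the (m, c) cell with the most votes
```

The collinear points all vote for the cell (m=2, c=1), so it accumulates the maximum count even though an outlier is present.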

20
Q

Active Contours

A

For unknown arbitrary shapes - extract by evolution.

Alternative segmentation method to thresholding.

Start with a contour (set of points), forces applied to contour from image and contour itself.

Contour will move towards object, segmenting it.

21
Q

Active Contour ‘Basics’

A

Aiming to evolve to minimum energy solution.

Energy Functional: Total energy integrates the sum of Internal and Image energy as well as the external constraints.

22
Q

Internal and Image Energy

A

Internal Energy: controls the shape of the snake, penalises stretching (alpha) and bending (beta)

Image Energy: Pulls the snake towards useful parts in the image. Sums line, edge and term energy.

Line - attracts snake to bright/dark regions.
Edge - attracts snake to edges.
Term - attracts snake to endpoints, corners or curves.

23
Q

Geometric Active Contours

A

An imperfect algorithm which can make mistakes; it is driven purely by the image and is not knowledge-based like some alternatives.

24
Q

Segmentation vs Object Detection

A

Image Segmentation attempts to extract exact object boundaries.

Object Detection overlays bounding boxes around objects.

25

Q

UNet: Segmentation Network

A

Image segmentation CNN.

Activation functions: ReLU for all layers other than the final Softmax.

All layers are convolution or pooling.

The encoder is symmetric with respect to the decoder (U shape).

Cross-entropy loss, trained using backpropagation.

Transfer learning can be used to apply it to new datasets.

26

Q

UNet Process

A

Left side down-samples (encodes), right side up-samples (decodes).

Left side: convolution followed by max pooling, repeated 4 times.

Right side: up-convolution followed by up-sampling, repeated 3 times.

Each up-sampling layer is concatenated with the parallel encoding layer.

27

Q

UNet Up-sampling

A

Takes the smaller image from max pooling and resizes it by padding with zeros.

28

Q

ReLU vs Softmax

A

ReLU: f(x) = max(0, x). Output = input if positive, otherwise 0.

Softmax: extension of the sigmoid used in multiclass classification; outputs a probability distribution over the classes.
29

Q

Transfer Learning

A

UNet needs to be trained on thousands of images. Pre-trained UNets can be applied to a new dataset using transfer learning.

Involves freezing some layers and replacing the final ones, then fine-tuning on the new dataset.

30

Q

Object Description Techniques

A

Region based: histograms, geometric properties, moments.

Boundary based: discrete chain codes, curves.

31

Q

Object Description Approach

A

Extract features from an image. Describe the features. Classify, label, match, recognise, reason with descriptors.

32

Q

Object Description Applications

A

Classification/Recognition: match image descriptors to a descriptor database.

Target detection: locate specific 'targets' (like wings on a plane) based on their descriptors, matching them against descriptors extracted from an image.

33

Q

Object Description Desirable Properties

A

Complete: different objects must have different descriptors.

Congruent: similar objects must have similar descriptors.

Compact: efficient (i.e. small quantity of information).

Invariant: recognise objects independent of changes.
34

Q

Region Description

A

Geometric properties characterise a region independent of its chromatic attributes (colour):

Area, perimeter, compactness, dispersion, moments.

35

Q

Area

A

The area of a region is the count of all 'white' pixels (1s in a binary image) multiplied by the defined area of a pixel.

Invariant to translation and rotation. Not invariant to scale.

36

Q

Perimeter

A

The sum of distances between consecutive pixels forming the boundary of a region.

Invariant to translation and rotation. Not invariant to scale.
37

Q

Compactness

A

The efficiency with which a boundary encloses an area; a circle has a compactness of 1 (the maximum value).

The relationship between area and perimeter: 4 * pi * Area / Perimeter^2

Invariant to scale, translation and rotation.
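The formula can be checked directly: a circle scores exactly 1, a square scores pi/4 (a sketch, assuming area and perimeter are already measured):

```python
import math

def compactness(area, perimeter):
    """4*pi*Area / Perimeter^2 -- equals 1 for a perfect circle."""
    return 4 * math.pi * area / perimeter ** 2

r = 5.0
circle = compactness(math.pi * r ** 2, 2 * math.pi * r)   # exactly 1
square = compactness(4.0, 8.0)   # unit square of side 2: pi/4, about 0.785
```

Because both the numerator and denominator scale with the square of size, the ratio is scale invariant, matching the card above.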
38

Q

Dispersion

A

Measures occupancy as the ratio between the area of the circle enclosing the region and the area of the region: Area of enclosing circle / Region area.

Invariant to scale, translation and rotation.

Can also be taken as the ratio between the enclosing and inscribed circles. Shows region complexity.

39

Q

Moments

A

A moment describes an image by treating it as a mass distribution.

The moment M_pq is the sum of all pixel values, each multiplied by its x-position to the power p and its y-position to the power q.

p and q determine the type of moment.

40

Q

Moment Examples

A

M_00: zero order, total mass. Corresponds to area for binary regions.

M_10, M_01: first order, centre of mass. p and q select the x and y centre-of-mass coordinates (after dividing by M_00).
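Both examples follow directly from the definition; a sketch on a small binary region (illustrative helper, with y indexing rows and x indexing columns):

```python
def moment(image, p, q):
    """M_pq = sum over all pixels of x^p * y^q * intensity(x, y)."""
    return sum(x ** p * y ** q * v
               for y, row in enumerate(image)
               for x, v in enumerate(row))

# Binary region: a 2x2 block of 1s with its top-left corner at (x=1, y=1).
img = [[0, 0, 0, 0],
       [0, 1, 1, 0],
       [0, 1, 1, 0],
       [0, 0, 0, 0]]

area = moment(img, 0, 0)        # zero-order moment = area = 4
cx = moment(img, 1, 0) / area   # centre-of-mass x coordinate = 1.5
cy = moment(img, 0, 1) / area   # centre-of-mass y coordinate = 1.5
```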
41

Q

Moment Invariance

A

The zero-order moment is invariant to translation and rotation. All other M_pq are not invariant to scale, translation or rotation.

Centralised moments introduce translation invariance.

Normalised centralised moments introduce scale invariance.

Invariant moments are invariant to all three.

42

Q

Centralised Moments

A

Moments made invariant to translation: the shape is effectively translated to the origin by subtracting the centre of mass (M_10/M_00, M_01/M_00) from each x and y.

First-order centralised moments equal 0, as the centre of mass is the origin.

Second-order centralised moments can be written in terms of first- and second-order regular moments (as can all centralised moments). They describe the spread (variance) of the region.
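A sketch showing both properties (illustrative helper; the first-order central moments vanish, while the second-order ones measure spread about the centroid):

```python
def central_moment(image, p, q):
    """mu_pq: moments taken about the centre of mass (translation invariant)."""
    m00 = sum(v for row in image for v in row)
    cx = sum(x * v for y, row in enumerate(image)
             for x, v in enumerate(row)) / m00
    cy = sum(y * v for y, row in enumerate(image)
             for x, v in enumerate(row)) / m00
    return sum((x - cx) ** p * (y - cy) ** q * v
               for y, row in enumerate(image)
               for x, v in enumerate(row))

img = [[0, 1, 1],
       [0, 1, 1]]

mu10 = central_moment(img, 1, 0)   # first-order central moment: always 0
mu20 = central_moment(img, 2, 0)   # second-order: horizontal spread
```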
43

Q

Normalised Central Moments

A

Invariant to translation and scale.

The normalised central moment of order p+q is obtained by dividing the central moment of the same order by a normalisation factor:

gamma = (p+q)/2 + 1

n_pq = u_pq / u_00^gamma

44

Q

Invariant Moments

A

Introduce rotation invariance to complete total invariance. Includes the 7 Hu moments.

M1 is the sum of the two second-order normalised central moments, which can be written in terms of centralised moments:

M1 = (u_20 + u_02) / u_00^2

45

Q

Why are Invariant Moments rotation invariant?

A

The sum of the second-order central moments (u_20 + u_02) stays the same when a shape is rotated: the squared distance of each pixel from the centroid does not change under rotation, so the total weighted sum over the region remains constant. Therefore u_20 + u_02 is rotation invariant.

46

Q

Advantages and Disadvantages of Moments

A

Advantages:
- work on greyscale as well as binary objects
- can utilise pixel brightness
- access to detail

Disadvantages:
- assume only one object is present
- computationally expensive
- high-order moments have large values and are sensitive to noise
47

Q

Fourier Descriptors

A

Boundary based: calculate the Fourier expansion of image curves, then define descriptors and distances from it.

Two common types:
- Polar Fourier descriptors (angular)
- Elliptic Fourier descriptors (complex curve)

48

Q

Elliptic Fourier Descriptor

A

Obtain a complex function from a closed 2D shape, perform a Fourier expansion, define descriptors from the Fourier coefficients.

49

Q

Elliptic Fourier Trigonometric Expansion

A

Given the complex curve c(t) = x(t) + j*y(t), we can expand x and y as Fourier series; the expansion simplifies to a vector equation with coefficients a_xk, b_xk, a_yk, b_yk.

50

Q

Invariant Elliptic Fourier Descriptors

A

Translation: has no effect on the higher-frequency coefficients (k != 0); they remain the same.

Scale: a multiplier is applied to every coefficient, so ratios of coefficients are unchanged: a'_xk / a'_x1 = a_xk / a_x1.

Rotation: a rotation matrix is applied, so the coefficients are multiplied by the accompanying rotation terms.

51

Q

Orthonormal Property

A

Rotation mixes the x and y coefficients, but it does not change the overall energy (length) of each harmonic component. So to get rotation-invariant descriptors, use the magnitudes of the coefficient vectors, not their raw values.

52

Q

Multi-scale Elliptic Fourier Descriptors

A

In the defining equations, k runs to infinity. As the number of frequencies increases there is more detail; however, large k makes the process computationally expensive.
53

Q

Similarity

A

The distance between features plotted in a feature space is a measure of similarity.

For an N x N image used directly as a feature vector, the feature space has N^2 dimensions, so a distance metric for high-dimensional spaces is required.

54

Q

Distance Metrics

A

Must satisfy three conditions to be used:
- D(a,b) > 0 for a != b (positivity)
- D(a,b) = D(b,a) (symmetry)
- D(a,c) <= D(a,b) + D(b,c) (triangle inequality)

This includes:
- Manhattan distance
- Euclidean distance
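Both metrics in a few lines, with a spot-check of the triangle inequality on sample points (illustrative helpers):

```python
def manhattan(a, b):
    """L1 distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def euclidean(a, b):
    """L2 distance: straight-line distance in feature space."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

p, q, r = (0, 0), (3, 4), (6, 0)
d1 = manhattan(p, q)   # 3 + 4 = 7
d2 = euclidean(p, q)   # sqrt(9 + 16) = 5.0
# Triangle inequality: going via q is never shorter than going direct.
triangle_ok = euclidean(p, r) <= euclidean(p, q) + euclidean(q, r)
```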
55

Q

Feature Examples

A

Image intensity

Fourier-transformed image

Fourier-Mellin-transformed image

Image moments

Object properties

56

Q

Supervised Learning

A

Given labelled data, classify unseen data.

57

Q

K-NN

A

Classify by the K closest points in feature space.

Algorithm:
- Choose K
- Compute all distances
- Sort the distances
- Select the K smallest
- Vote among these K examples
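The algorithm above, sketched in plain Python with squared Euclidean distance (the names are illustrative):

```python
from collections import Counter

def knn_classify(train, query, k):
    """train: list of (feature_vector, label) pairs.
    Sort by distance to the query, then vote among the k nearest."""
    nearest = sorted(train,
                     key=lambda ex: sum((a - b) ** 2
                                        for a, b in zip(ex[0], query)))
    votes = Counter(label for _, label in nearest[:k])
    return votes.most_common(1)[0][0]

# Two well-separated clusters of labelled points.
train = [((1, 1), 'cat'), ((1, 2), 'cat'), ((2, 1), 'cat'),
         ((8, 8), 'dog'), ((9, 8), 'dog')]
label = knn_classify(train, (2, 2), k=3)   # the 3 nearest are all 'cat'
```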
58

Q

Unsupervised Learning

A

Given unlabelled data, group it into classes based on its features.

59

Q

K-Means Clustering

A

Represent unlabelled data by K centres.

Algorithm:
- Choose K
- Randomly assign examples to K sets
- Compute the mean value of each set
- Re-assign each point to the nearest mean
- Repeat until the mean values remain unchanged
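A minimal 1D sketch of the loop above (naive initialisation from the first K points rather than a random assignment; the names are illustrative):

```python
def kmeans_1d(points, k, iterations=20):
    """Minimal 1D k-means: alternate assignment and mean update."""
    centres = points[:k]   # naive initialisation from the first k points
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assignment step: send each point to its nearest centre.
            nearest = min(range(k), key=lambda i: abs(p - centres[i]))
            clusters[nearest].append(p)
        # Update step: move each centre to the mean of its cluster.
        centres = [sum(c) / len(c) if c else centres[i]
                   for i, c in enumerate(clusters)]
    return sorted(centres)

data = [1.0, 1.2, 0.8, 10.0, 10.2, 9.8]
centres = kmeans_1d(data, k=2)   # converges to the two cluster means
```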
60

Q

Classification vs Recognition

A

Classification: assign an input to a known category.

Recognition: identify who or what an input is.

61

Q

Identification vs Verification

A

Verification: 1-to-1. Is this person who they say they are? Compare one image to one known identity.

Identification: 1-to-many. Who is this person? Compare one image to many identities. A CMC curve is used.

62

Q

Cumulative Match Curve

A

A CMC curve shows how often the correct identity appears within the top N guesses made by a recognition system.

As the rank N increases, the cumulative identification rate increases.
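A sketch of how the curve is computed, assuming we already know the rank at which each probe's true identity appeared (the names are illustrative):

```python
def cmc(rank_of_correct, num_ranks):
    """rank_of_correct: for each probe, the 1-based rank at which the true
    identity appeared. Returns the cumulative identification rate per rank."""
    n = len(rank_of_correct)
    return [sum(1 for r in rank_of_correct if r <= k) / n
            for k in range(1, num_ranks + 1)]

# 5 probes: the correct identity was found at ranks 1, 1, 2, 3, 1.
curve = cmc([1, 1, 2, 3, 1], num_ranks=3)
# curve[0] is the rank-1 identification rate (= CCR); the curve is
# non-decreasing and reaches 1.0 once every probe's identity is covered.
```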
63

Q

Correct Recognition Rate (CCR)

A

Total number of subjects correctly identified at rank 1 / total number of subjects. Equivalent to rank-1 accuracy.

64

Q

Intra/Inter Class Distances

A

Intra-class distances: distances between images of the same class.

Inter-class distances: distances between images of different classes.

Good recognition capability requires small intra-class distances and large inter-class distances. The two distributions should not overlap much, meaning the separation between subjects is large.

65

Q

Choosing a Verification Threshold

A

Set a distance threshold based on the intra/inter class variation. If the distance is less than the threshold, accept the match for verification.

Raising the threshold accepts more matches, lowering the false reject rate but raising the false accept rate; lowering it does the opposite. A trade-off must be decided.

Often the threshold is chosen to give the equal error rate, where the false accept and false reject rates are equal.
66

Q

Receiver Operating Characteristic

A

The ROC curve helps evaluate binary classification performance: false positive rate plotted against true positive rate.

A perfect system hugs the top-left corner; a random system lies on the line y = x.

The area under the ROC curve summarises performance; 1 = perfect classifier.

67

Q

False Accept vs False Reject Rate

A

A low false accept rate gives security (impostors are kept out). A low false reject rate gives convenience in applications such as banking (genuine customers are not locked out).