Object Detection - Week 9 Flashcards

1
Q

What are the advantages of local features?

A

They provide distinctive and repeatable local regions, which are critical for multi-view matching

Complexity reduction via selection of distinctive points

Describe images, objects, parts without requiring segmentation; robustness to clutter & occlusion

Robustness: similar descriptors in spite of moderate view changes, noise, blur, etc.

2
Q

What does it mean when two feature descriptors are close in feature space?

A

The two features have similar local content
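
As a rough illustration, closeness can be measured with a distance between descriptor vectors. The sketch below assumes Euclidean (L2) distance and uses made-up 4-D toy descriptors; real SIFT descriptors are 128-D, and the function and variable names are hypothetical.

```python
import math

def euclidean_distance(a, b):
    """L2 distance between two equal-length descriptor vectors."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Toy 4-D descriptors (illustrative values only).
patch_a = [0.9, 0.1, 0.0, 0.3]
patch_b = [0.8, 0.2, 0.1, 0.3]  # similar local content to patch_a
patch_c = [0.0, 0.9, 0.7, 0.1]  # different local content

d_ab = euclidean_distance(patch_a, patch_b)
d_ac = euclidean_distance(patch_a, patch_c)
```

Here `d_ab` comes out smaller than `d_ac`, which is what "close in feature space" means for the first pair.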

3
Q

What is the idea behind visual words?

A

Extract local features from a number of images, e.g. SIFT descriptors, each of which can be represented as a point in feature space

Map high-dimensional descriptors to tokens/words by quantising the feature space. One way to quantise is clustering, letting the cluster centres be the prototype “words”

Determine which word to assign to each new image region by finding the closest cluster centre
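
The assignment step can be sketched as a nearest-centre lookup. This is a minimal sketch, assuming the cluster centres have already been computed; the 2-D centres and descriptors are toy values, and `assign_word` is a hypothetical helper name.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign_word(descriptor, centres):
    """Return the index of the closest cluster centre (the visual word id)."""
    return min(range(len(centres)),
               key=lambda k: squared_distance(descriptor, centres[k]))

# Toy vocabulary of three "words" in a 2-D feature space.
centres = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
descriptors = [[0.1, 0.2], [0.9, 0.8], [0.2, 0.9]]

words = [assign_word(d, centres) for d in descriptors]
```

Each descriptor is replaced by a single integer, so an image region becomes a token that can be counted and indexed like a word in text.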

4
Q

How do inverted file indexes work?

A

Detect the words in each image. An inverted index is a dictionary where the key is the word number and the value is the list of images that contain that word

A new query image is mapped to the indices of the database images that share a word with it. Database images are then ranked by how many words they have in common with the query image
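
A minimal sketch of this idea, assuming each image is already reduced to a list of word ids. The image names and word ids are made up, and ranking by the raw number of shared words is a simplification.

```python
from collections import Counter, defaultdict

def build_inverted_index(image_words):
    """Map each word id to the set of image ids that contain it."""
    index = defaultdict(set)
    for image_id, words in image_words.items():
        for w in words:
            index[w].add(image_id)
    return index

def rank_candidates(index, query_words):
    """Count, per database image, how many distinct query words it shares."""
    votes = Counter()
    for w in set(query_words):
        for image_id in index.get(w, ()):
            votes[image_id] += 1
    return votes.most_common()

# Hypothetical database: image id -> visual word ids found in that image.
database = {"img_a": [1, 5, 7], "img_b": [2, 5], "img_c": [3, 9]}
index = build_inverted_index(database)

ranking = rank_candidates(index, [5, 7, 8])
```

Only images that share at least one word with the query are ever touched, which is why the lookup stays fast on large databases.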

5
Q

What is spatial verification?

A

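Check that matched features agree on a consistent geometric arrangement, not just on appearance

Can use the generalised Hough transform:
- Let each matched feature cast a vote on the location, scale and orientation of the model object
- Verify parameters with enough votes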
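
The voting idea can be sketched in a reduced form. This sketch assumes matches vote only on a 2-D translation (no scale or orientation) and uses hard bins; the bin size, vote threshold, and coordinates are illustrative assumptions, not values from the course.

```python
from collections import Counter

def hough_translation_votes(matches, bin_size=10):
    """Each match ((xq, yq), (xd, yd)) votes for a quantised translation."""
    votes = Counter()
    for (xq, yq), (xd, yd) in matches:
        dx, dy = xd - xq, yd - yq
        votes[(round(dx / bin_size), round(dy / bin_size))] += 1
    return votes

def verify(matches, min_votes=3, bin_size=10):
    """Accept the match set if one translation bin collects enough votes."""
    votes = hough_translation_votes(matches, bin_size)
    if not votes:
        return False, None, 0
    best_bin, count = votes.most_common(1)[0]
    return count >= min_votes, best_bin, count

# Three matches agree on a shift of about (50, 20); two are outliers.
matches = [
    ((10, 10), (60, 30)),
    ((20, 15), (71, 34)),
    ((30, 40), (79, 61)),
    ((5, 5), (203, 12)),
    ((50, 50), (12, 90)),
]
accepted, best_bin, count = verify(matches)
```

The outliers scatter into separate bins, so only the geometrically consistent matches accumulate enough votes to pass.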

6
Q

What are the steps of the Video Google system?

A
  1. Collect all words within query region
  2. Use the inverted file index to find relevant frames
  3. Compare word counts
  4. Spatial verification
7
Q

What sampling strategies exist for visual vocabulary formation?

A

Sparse, at interest points
- Better for finding specific, textured objects

Dense, uniformly sampled
- Better for object categorisation

Randomly sampled
Multiple interest operators

8
Q

What is the typical clustering method for visual words?

A

K-means clustering

Also used: agglomerative clustering, mean-shift
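
For reference, k-means can be sketched as a plain Lloyd's iteration. This is a minimal sketch, assuming fixed initial centres and a fixed iteration count; the 2-D points stand in for descriptors, and in practice a library implementation would normally be used.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, initial_centres, iterations=10):
    """Lloyd's algorithm: alternate assignment and centre update."""
    centres = [list(c) for c in initial_centres]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centre.
        clusters = [[] for _ in centres]
        for p in points:
            k = min(range(len(centres)),
                    key=lambda i: squared_distance(p, centres[i]))
            clusters[k].append(p)
        # Update step: move each centre to the mean of its members.
        for k, members in enumerate(clusters):
            if members:  # keep the old centre if a cluster is empty
                centres[k] = [sum(dim) / len(members) for dim in zip(*members)]
    return centres

points = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
          [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
vocabulary = kmeans(points, initial_centres=[[0.0, 0.0], [10.0, 10.0]])
```

The returned centres play the role of the visual vocabulary: one prototype "word" per cluster.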

9
Q

How are words collected in a query region?

A

Pull out only the SIFT descriptors whose positions are within the polygon
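
One way to sketch this filter is a ray-casting point-in-polygon test. The keypoint format `((x, y), word_id)` and the example polygon are assumptions for illustration only.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: count edge crossings of a ray going right from (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def words_in_region(keypoints, polygon):
    """Keep the word ids of keypoints whose position lies inside the polygon."""
    return [word for (x, y), word in keypoints
            if point_in_polygon(x, y, polygon)]

# Hypothetical query region (a square) and keypoints as ((x, y), word_id).
region = [(0, 0), (10, 0), (10, 10), (0, 10)]
keypoints = [((5, 5), 12), ((15, 5), 7), ((2, 8), 12)]

query_words = words_in_region(keypoints, region)
```

Points exactly on the polygon boundary may fall on either side with this test, which is usually acceptable for selecting a query region.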

10
Q

What is object categorisation?

A

Identification: find this particular object
Categorisation: recognise any car, recognise any cow

Given a small number of training images of a category, recognise a priori unknown instances of that category and assign the correct category label

11
Q

What evidence is there for how humans categorise?

A

Evidence that humans (usually) start with basic-level categorisation before doing identification
- Easier and faster for humans to do basic-level categorisation than object identification
- This makes basic-level categories the most promising starting point for visual classification

12
Q

How many object categories are there?

A

~10,000 to 30,000

13
Q

What types of categories are there?

A

Functional categories
- Chairs = “something you can sit on”

Ad-hoc categories
- “Something you can find in an office environment”

14
Q

What are the challenges for object categorisation?

A

Robustness to:
- Illumination
- Object pose
- Clutter
- Occlusions
- Intra-class appearance variation
- Viewpoint

15
Q

What is the idea of bag of words?

A

Represent a whole image as a bag of its features, treated as “independent features”

Stricter definition:
- Independent features
- Histogram representation: x-axis is the features (words), y-axis is how many times each feature appears
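
The histogram representation can be sketched as a simple count over word ids. This sketch assumes the image has already been converted to a list of visual word ids; the vocabulary size and ids are toy values.

```python
from collections import Counter

def bag_of_words_histogram(word_ids, vocabulary_size):
    """Count how often each word id occurs; spatial order is discarded."""
    counts = Counter(word_ids)
    return [counts.get(w, 0) for w in range(vocabulary_size)]

# Word ids detected in one image, in arbitrary order.
image_words = [2, 0, 2, 3, 2]
histogram = bag_of_words_histogram(image_words, vocabulary_size=5)
```

Shuffling `image_words` gives the same histogram, which is the sense in which the features are treated as independent.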

16
Q

What are the steps for bag of words learning?

A
  1. Feature detection & representation
  2. Codewords dictionary
  3. Image representation