High level perception - THEORIES Flashcards

(37 cards)

1
Q

Theorists in structural description models of object recognition

A

Biederman (1987)

Marr (1982)

2
Q

Biederman’s theory of object recognition
- geons

A

He calls his theory of object recognition ‘recognition by components’ (RBC).

We recognise objects by combining simple 3D shapes called “geons.” These geons can be combined in different ways to create an abstract form of an object.

These geons are identified using a few basic edge features that are easy for our brains to detect, no matter how we view the object or how clear the image is. This makes it possible for us to recognize objects even from different angles or when the image is unclear.

3
Q

In Biederman’s model, what are 3D objects represented with?

A

3D objects are represented using basic volumetric parts (primitives) known as ‘geons’

4
Q

How many geons did Biederman say were enough to represent most objects?

A

36

5
Q

The role of long-term memory in Biederman’s theory

A

Geons are stored in our long-term memory (LTM), and we can combine them in different ways to form a very abstract representation of an object.

6
Q

Torch example: Biederman’s theory (1987)

A

To recognise something as a torch, we recognise its three-dimensional constituent parts: the shapes (geons) that make up the torch.

7
Q

Geons

A

Constituent three-dimensional parts

You can think of geons as basic shapes, such as cylinders, bricks and pyramids, that you could use to ‘build’ more complex objects.

8
Q

Marr (1982)

Cylinder

A

Structural description theory of high-level vision

It came before Biederman’s theory, but the two are similar and share the same basic concept.

The difference is that Biederman proposes a set of about 36 geons, whereas Marr argues for a single primitive, which he calls a generalised cylinder.

He argues we can build abstract objects from this generalised cylinder.

9
Q

Key points to know about these theories of object recognition

A
  1. Biederman’s and Marr’s theories are both structural description models of object recognition.
  2. They both argue that object recognition is accomplished by having an abstract representation in memory, whether that’s in the form of geons or in the form of a generalised cylinder.
10
Q

Similarities between Biederman’s and Marr and Nishihara’s theories of object recognition

A

Both theories aim to explain how humans recognise objects based on their 3D shapes.

Both propose a hierarchical approach to object recognition, starting with basic features and progressing to more complex representations.

Both suggest that the recognition process is unaffected by viewpoint.

11
Q

Difference between Biederman’s and Marr and Nishihara’s theories of object recognition - building blocks

A

Biederman’s RBC theory uses a limited set of about 36 “geons” (e.g., cylinders, cones) as fundamental building blocks for object recognition. Marr and Nishihara, on the other hand, propose a more general approach using a single primitive, the ‘generalised cylinder’.

12
Q

Limitations with the object recognition theories

A

Their focus on static object representations: they ignore the importance of motion in object recognition. For example, we recognise patterns of movement associated with people and objects, which aids their recognition (such as recognising somebody by their walk).

Specifically, both theories struggle to account for the influence of viewpoint and the dynamic nature of object recognition in real-world scenarios where movement and context play a crucial role.

These models only explain recognition of basic classes of objects; identifying and distinguishing between different faces, breeds of animal or types of pen requires a more complete explanation.

13
Q

Marr’s theory of object recognition

A

1982

He argued object recognition involves various processing stages and is much more complex than had previously been thought.

Explains how we understand what we see by breaking the process down into 3 main stages:

  1. Primal sketch
  2. 2.5D sketch
  3. 3-D model representation
14
Q
  1. Primal sketch
A

First, we detect basic features such as edges and light and dark areas; this is like an outline of the image.

15
Q
  2. 2.5-D sketch
A

Next, we build a rough idea of the shapes and how they’re positioned in space, based on lighting and depth cues such as binocular disparity, but only from our point of view. It resembles the primal sketch in being viewer-centred or viewpoint-dependent (i.e., it is influenced by the angle from which the observer sees objects or the environment).

16
Q
  3. 3-D model representation
A

This describes objects’ shapes and their relative positions three-dimensionally; it is independent of the observer’s viewpoint and so is viewpoint-invariant.

We create a full, detailed 3D representation that helps us recognize objects no matter the angle we see them from.

17
Q

Viewpoint invariant

A

Object recognition is independent of the observer’s viewpoint.

18
Q

Why was Marr’s theory so influential?

A

He successfully combined ideas from neurophysiology, anatomy and computer vision

He was among the first to recognise the enormous complexity of object recognition

His distinction between viewpoint-dependent and viewpoint-invariant representations triggered much subsequent research

19
Q

Limitations of Marr’s theory - bottom-up processing

A

He focused excessively on bottom-up processes, although he himself admitted that “Top-down processing is sometimes used and necessary.”

As a result, he de-emphasised the major role expectations and knowledge play in object recognition.

20
Q

Limitations of Marr’s theory - vision

A

Marr assumed that “Vision tells the truth about what is out there” - he assumed that our visual system always gives us an accurate and truthful picture of the world.

But there are numerous exceptions - Our perception can be distorted by things like distance or visual illusions.

For example, when you look down from a tall building, people look tiny, even though you know they’re not.
In the vertical-horizontal illusion, a vertical line can look longer than a horizontal line of exactly the same length.

These examples show that vision doesn’t always “tell the truth,” which goes against Marr’s idea. So, a key limitation is that his theory doesn’t fully account for the ways our perception can be misleading or influenced by context.

21
Q

Limitations of Marr’s theory - complexity

A

“The computations required to produce view-independent 3-D object models are now thought by many researchers to be too complex.”

Researchers now think that the brain likely doesn’t do all of these complicated calculations because it would take too much time and effort. Instead, we may rely on simpler, faster methods to recognize objects.

So, the limitation is that Marr’s theory might overestimate how much detailed processing the brain actually does in everyday vision. It may not be realistic to assume we always create full 3D models just to recognize things.

22
Q

Biederman’s stages of object recognition - geons

A
  1. The geons of an object are determined.
  2. When this information is available, it is matched with stored object representations or structural models consisting of information about the nature of the relevant geons, their orientations, sizes and so on.
  3. Whichever stored representation fits best with the geon-based information obtained from the visual object determines which object is identified by observers.
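
The matching step above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the geon names, the three stored objects and the overlap score are hypothetical, and the sketch ignores the orientations, sizes and spatial relations that the stored structural models are said to contain. It is not Biederman's actual matching procedure, only a toy version of "pick the stored representation that fits best".

```python
from collections import Counter

# Hypothetical stored structural descriptions (object name -> geon counts).
# Real structural models would also hold orientations, sizes and relations.
STORED_MODELS = {
    "torch": Counter({"cylinder": 2, "cone": 1}),
    "mug": Counter({"cylinder": 1, "curved_handle": 1}),
    "table": Counter({"brick": 1, "cylinder": 4}),
}


def match_score(detected, model):
    """Illustrative fit measure: shared geons divided by all geons in either description."""
    shared = sum((detected & model).values())
    total = sum((detected | model).values())
    return shared / total if total else 0.0


def recognise(detected_geons):
    """Return the name of the stored object whose geon description fits best."""
    detected = Counter(detected_geons)
    return max(STORED_MODELS, key=lambda name: match_score(detected, STORED_MODELS[name]))


if __name__ == "__main__":
    print(recognise(["cylinder", "cylinder", "cone"]))
    print(recognise(["cylinder", "curved_handle"]))
```

In this toy version, hiding a geon (dropping it from the detected list) lowers the fit score, which loosely mirrors the claim that recognition becomes harder when geons are hidden from view.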
23
Q

Biederman’s stages of object recognition - first stage

A
  1. The first step in recognizing an object is detecting its edges, based on things like brightness, texture, and color. This creates a basic outline, like a line drawing. Then, the brain figures out how to break the object into its basic parts, called geons.
24
Q

Biederman’s stages of object recognition - stage 2

A

Which edges should we focus on?

Biederman (1987) said that we pay most attention to non-accidental properties—features that stay the same no matter the viewing angle. Examples include whether a line is straight or curved, and whether a shape bends inward (concave) or outward (convex). Concave parts are especially important. Using these stable features, the brain builds the object’s geons.

25
What key prediction follows from Biederman's stage 2?
This part of the theory leads to the key prediction that object recognition is typically viewpoint-invariant (i.e., objects can be recognised equally easily from nearly all viewing angles). The argument is that object recognition depends crucially on the identification of geons, which can be identified from numerous viewpoints. Thus, object recognition is difficult only when one or more geons are hidden from view.
26
Why has Biederman's theory been influential?
It indicates how we can identify objects despite substantial differences among the members of most categories in shape, size and orientation. The assumption that non-accidental properties of stimuli and geons play a role in object recognition has received much support.
27
Non-accidental properties
Features of an object that stay the same no matter what angle you view the object from. They are called “non-accidental” because they are unlikely to appear in the image merely as an accident of viewpoint.
Examples include:
- Whether an edge is straight or curved
- Whether a shape is concave (bends inward) or convex (bulges outward)
- Whether lines intersect or are parallel
These features help us recognize objects because they are reliable clues: they don’t change much even when we look at the object from different sides. That’s why Biederman said they’re key to identifying geons and recognizing objects.
28
Limitations of Biederman's theory - bottom-up
It focuses predominantly on bottom-up processes triggered directly by the stimulus input. As a result, it de-emphasises the impact on object recognition of top-down processes based on expectation.
29
Limitations of Biederman's theory - perceptual discrimination
The theory accounts only for fairly basic perceptual discriminations. It doesn't help us understand how we make finer discriminations, such as figuring out what breed a dog or cat is.
30
Limitations of Biederman's theory - too rigid
The notion that objects consist of invariant geons is too inflexible. As Hayward and Tarr (2005, p. 67) pointed out, “You can take almost any object, put a working light-bulb on the top, and call it a lamp.” This is a limitation of Biederman’s theory because it assumes that objects are recognized based on fixed, unchanging parts called geons. But in real life, object categories can be much more flexible.
31
32
Viewer-centred or viewpoint-dependent
Object recognition is viewer-centred (viewpoint-dependent) if it is faster and easier when objects are seen from certain angles.
33
Gauthier and Tarr (2016)
“Depending on the experimental conditions and which parts of the brain we look at, one can obtain data supporting both the structural-description (i.e., the viewpoint- invariant) and the view-based [viewpoint-dependent] approaches.”
34
Limitation of structural models of object recognition - processing direction and neurons
There are as many backward projecting neurons (associated with top-down processing) as forward projecting ones throughout most of the visual system (Gilbert & Li, 2013).
35
Evidence for top-down processing
Up to 90% of the synapses from incoming neurons to the primary visual cortex (involved in early visual processing) originate in the cortex and thus reflect top-down processes.
36
When is top-down processing useful?
Top-down processes should have their greatest impact on object recognition when the information coming from our eyes (bottom-up) is unclear or limited, such as when we see blurry or very briefly presented images.
37