High level perception - THEORIES Flashcards by Rose Campbell

Theorists in structural description models of object recognition

Biederman (1987)

Marr (1982)

Biederman’s theory of object recognition
- geons

He calls his theory of object recognition ‘recognition-by-components’ (RBC)

We recognise objects by combining simple 3D shapes called “geons.” We can combine these geons in different ways to create an abstract form of an object

These geons are identified using a few basic edge features that are easy for our brains to detect, no matter how we view the object or how clear the image is. This makes it possible for us to recognize objects even from different angles or when the image is unclear.

In Biederman’s model, what are 3D objects represented using?

3D objects are represented using basic volumetric parts (primitives) known as ‘geons’

How many geons did Biederman say were enough to represent most objects?

About 36

The role of long-term memory in Biederman’s theory

Geons are stored in our long-term memory (LTM), and we can combine them in different ways to build a very abstract form of an object

Torch example: Biederman’s theory (1987)

In order for us to recognise something as a torch, we would recognise its three-dimensional constituent parts - the shapes that make up the torch (geons)

Geons

Constituent three-dimensional parts

You can think of geons as basic shapes like cylinders, bricks, pyramids etc that you could use to ‘build’ more complex objects

Marr (1982)

Cylinder

Structural description theory of high-level vision

Came before Biederman’s theory, but the two are similar and share the same core concept

The difference is that Biederman’s theory proposes a set of about 36 geons, whereas Marr’s theory argues that we need only one primitive, which he calls a generalised cylinder

He argues we can build abstract objects from this generalised cylinder

Key points to know about these theories of object recognition

Biederman’s and Marr’s theories are both structural description models of object recognition
They both argue that object recognition is accomplished by having an abstract representation in memory, whether that’s in the form of geons or in the form of a generalised cylinder

Similarities of Biederman’s and Marr + Nishihara’s theories of object recognition

Both theories aim to explain how humans recognise objects based on their 3D shapes

Both propose a hierarchical approach to object recognition, starting with basic features and progressing to more complex representations

Both suggest that the recognition process is unaffected by viewpoint.

Difference between Biederman’s and Marr + Nishihara’s theories of object recognition - building blocks

RBC uses a limited set of 36 “geons” (e.g., cylinders, cones) as fundamental building blocks for object recognition. Marr and Nishihara, on the other hand, propose a more general approach using a ‘generalised cylinder’

Limitations with the object recognition theories

Their focus on static object representations: they ignore the importance of motion in object recognition, e.g. we recognise patterns of movement associated with people and objects to aid in their recognition (such as recognising somebody by their walk)

Specifically, both theories struggle to account for the influence of viewpoint and the dynamic nature of object recognition in real-world scenarios where movement and context play a crucial role.

These models only explain recognition of basic classes of objects; identifying and distinguishing between different faces, breeds of animal or types of pen requires a more complete explanation.

Marr’s theory of object recognition

1982

He argued object recognition involves various processing stages and is much more complex than had previously been thought.

Explains how we understand what we see by breaking the process down into 3 main stages:

Primal sketch
2.5-D sketch
3-D model representation

Primal sketch

First, we detect basic features like edges, light, and dark areas—this is like an outline of the image
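
As a rough illustration of what "detecting edges from light and dark areas" could mean computationally, the sketch below marks a pixel as an edge when its brightness differs sharply from its right-hand or lower neighbour. This is a deliberately simplified toy, not Marr's actual procedure; the function name `detect_edges`, the threshold value and the tiny example image are all assumptions made for illustration.

```python
def detect_edges(image, threshold=50):
    """Return a grid of 0/1 flags marking sharp brightness changes.

    `image` is a list of rows of brightness values (hypothetical 0-255 scale).
    A pixel is flagged when it differs from its right or lower neighbour
    by at least `threshold`.
    """
    rows, cols = len(image), len(image[0])
    edges = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Brightness step towards the neighbour on the right (if any).
            right = abs(image[r][c] - image[r][c + 1]) if c + 1 < cols else 0
            # Brightness step towards the neighbour below (if any).
            down = abs(image[r][c] - image[r + 1][c]) if r + 1 < rows else 0
            if max(right, down) >= threshold:
                edges[r][c] = 1
    return edges


# A dark region (10) next to a bright region (200).
image = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]
edge_map = detect_edges(image)
for row in edge_map:
    print(row)  # each row prints [0, 1, 0, 0]
```

The flagged column traces the boundary between the dark and bright regions, which is the kind of outline-like description the primal sketch is said to provide.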

2.5-D sketch

Next, we build a rough idea of the shapes and how they’re positioned in space, based on lighting and depth cues such as binocular disparity, but only from our point of view. It resembles the primal sketch in being viewer-centred or viewpoint-dependent (i.e., it is influenced by the angle from which the observer sees objects or the environment).

3-D model representation

This describes objects’ shapes and their relative positions three-dimensionally; it is independent of the observer’s viewpoint and so is viewpoint-invariant.

We create a full, detailed 3D representation that helps us recognize objects no matter the angle we see them from.

Viewpoint invariant

Object recognition is independent of the observer’s viewpoint

Why was Marr’s theory so influential?

He successfully combined ideas from neurophysiology, anatomy and computer vision

He was among the first to recognise the enormous complexity of object recognition

His distinction between viewpoint-dependent and viewpoint-invariant representations triggered much subsequent research

Limitations of Marr’s theory - bottom-up processing

He focused excessively on bottom-up processes, admitting himself that “Top-down processing is sometimes used and necessary.”

However, he de-emphasised the major role expectations and knowledge play in object recognition

Limitations of Marr’s theory - vision

Marr assumed that “Vision tells the truth about what is out there” - he assumed that our visual system always gives us an accurate and truthful picture of the world.

But there are numerous exceptions - Our perception can be distorted by things like distance or visual illusions.

e.g. When you look down from a tall building, people look tiny—even though you know they’re not.
In the vertical-horizontal illusion, a vertical line can look longer than an identical horizontal one, even though they’re the same length.

These examples show that vision doesn’t always “tell the truth,” which goes against Marr’s idea. So, a key limitation is that his theory doesn’t fully account for the ways our perception can be misleading or influenced by context.

Limitations of Marr’s theory - complexity

“The computations required to produce view-independent 3-D object models are now thought by many researchers to be too complex.”

Researchers now think that the brain likely doesn’t do all of these complicated calculations because it would take too much time and effort. Instead, we may rely on simpler, faster methods to recognize objects.

So, the limitation is that Marr’s theory might overestimate how much detailed processing the brain actually does in everyday vision. It may not be realistic to assume we always create full 3D models just to recognize things.

Biederman’s stages of object recognition - geons

The geons of an object are determined.
When this information is available, it is matched with stored object representations or structural models consisting of information about the nature of the relevant geons, their orientations, sizes and so on.
Whichever stored representation fits best with the geon-based information obtained from the visual object determines which object is identified by observers.
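
The "best fit" matching step can be pictured with a small sketch. Everything in it is a hypothetical simplification for revision purposes: the stored objects, the `(geon, position)` labels and the overlap score are assumptions, not Biederman's actual model, which also takes orientations, sizes and the relations between geons into account.

```python
# Hypothetical stored structural descriptions in long-term memory:
# each object is a set of (geon, position) parts.
STORED_OBJECTS = {
    "torch": {("cylinder", "body"), ("cone", "head")},
    "mug": {("cylinder", "body"), ("curved_tube", "side")},
    "bucket": {("cylinder", "body"), ("curved_tube", "top")},
}


def match_score(observed, stored):
    """Overlap between two part sets: shared parts divided by all parts."""
    return len(observed & stored) / len(observed | stored)


def recognise(observed):
    """Return the stored object whose description fits the observed geons best."""
    return max(
        STORED_OBJECTS,
        key=lambda name: match_score(observed, STORED_OBJECTS[name]),
    )


# Geons extracted from the visual input (assumed to be already identified).
seen = {("cylinder", "body"), ("curved_tube", "side")}
result = recognise(seen)
print(result)  # mug
```

Note that the mug and the bucket use the same two geons here and differ only in where the curved part is attached, which is why the stored descriptions need to hold arrangement information as well as the geons themselves.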

Biederman’s stages of object recognition - first stage

The first step in recognizing an object is detecting its edges, based on things like brightness, texture, and color. This creates a basic outline, like a line drawing. Then, the brain figures out how to break the object into its basic parts, called geons.

Biederman’s stages of object recognition - stage 2

Which edges should we focus on?

Biederman (1987) said that we pay most attention to non-accidental properties—features that stay the same no matter the viewing angle. Examples include whether a line is straight or curved, and whether a shape bends inward (concave) or outward (convex). Concave parts are especially important. Using these stable features, the brain builds the object’s geons.

What did Biederman's stage 2 draw on?

This part of the theory leads to the key prediction that object recognition is typically viewpoint-invariant (i.e., objects can be recognised equally easily from nearly all viewing angles). The argument is that object recognition depends crucially on the identification of geons, which can be identified from numerous viewpoints. Thus, object recognition is difficult only when one or more geons are hidden from view.

Why has Biederman's theory been influential?

It indicates how we can identify objects despite substantial differences among the members of most categories in shape, size and orientation. The assumption that non-accidental properties of stimuli and geons play a role in object recognition has received much support.

Non-accidental properties

Features of an object that stay the same no matter what angle you view the object from. They are called “non-accidental” because they are unlikely to change just by chance due to the viewpoint.

Examples include:
- whether an edge is straight or curved
- whether a shape is concave (bends inward) or convex (bulges outward)
- whether lines intersect or are parallel

These features help us recognize objects because they are reliable clues: they don’t change much even when we look at the object from different sides. That’s why Biederman said they’re key to identifying geons and recognizing objects.
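
One way to see why straightness counts as a non-accidental property is to check it numerically. The sketch below is an illustration under stated assumptions, not part of Biederman's theory: it uses a simple orthographic projection, and the function names, example edges and viewing angles are all hypothetical. A straight 3D edge stays straight in the 2D image from every tested viewpoint, while a bent edge looks bent from all of them (it would only look straight from a rare "accidental" viewpoint lying in the plane of the bend).

```python
import math


def project(point, yaw, pitch):
    """Rotate a 3D point (yaw about the y-axis, then pitch about the x-axis)
    and drop the depth coordinate, i.e. an orthographic projection."""
    x, y, z = point
    x1 = x * math.cos(yaw) + z * math.sin(yaw)
    z1 = -x * math.sin(yaw) + z * math.cos(yaw)
    y2 = y * math.cos(pitch) - z1 * math.sin(pitch)
    return (x1, y2)


def is_straight(p, q, r, tol=1e-9):
    """Three 2D points lie on one line when this cross product is (near) zero."""
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return abs(cross) < tol


straight_edge = [(0, 0, 0), (1, 2, 3), (2, 4, 6)]  # three points on one 3D line
bent_edge = [(0, 0, 0), (1, 2, 3), (2, 1, 0)]      # an edge with a corner in it

# Hypothetical viewing angles in radians: (yaw, pitch).
viewpoints = [(0.0, 0.0), (0.5, -0.2), (1.2, -0.7), (2.5, 1.0)]

straight_results = [
    is_straight(*[project(p, yaw, pitch) for p in straight_edge])
    for yaw, pitch in viewpoints
]
bent_results = [
    is_straight(*[project(p, yaw, pitch) for p in bent_edge])
    for yaw, pitch in viewpoints
]

print(straight_results)  # [True, True, True, True]
print(bent_results)      # [False, False, False, False]
```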

Limitations of Biederman's theory - bottom-up

It focuses predominantly on bottom-up processes triggered directly by the stimulus input. As a result, it de-emphasises the impact on object recognition of top-down processes based on expectation.

Limitations of Biederman's theory - perceptual discrimination

The theory accounts only for fairly basic perceptual discriminations. It doesn't help us understand how we tell apart more detailed things—like figuring out what breed a dog or cat is.

Limitations of Biederman's theory - too rigid

The notion that objects consist of invariant geons is too inflexible. As Hayward and Tarr (2005, p. 67) pointed out, “You can take almost any object, put a working light-bulb on the top, and call it a lamp.” This is a limitation of Biederman’s theory because it assumes that objects are recognized based on fixed, unchanging parts called geons. But in real life, object categories can be much more flexible.

Viewer-centred or viewpoint-dependent

Object recognition is viewer-centred (viewpoint-dependent) if it is faster and easier when objects are seen from certain angles

Gauthier and Tarr (2016)

“Depending on the experimental conditions and which parts of the brain we look at, one can obtain data supporting both the structural-description (i.e., the viewpoint-invariant) and the view-based [viewpoint-dependent] approaches.”

Limitation of structural models of object recognition - processing direction and neurons

There are as many backward projecting neurons (associated with top-down processing) as forward projecting ones throughout most of the visual system (Gilbert & Li, 2013).

Evidence for top-down processing

Up to 90% of the synapses from incoming neurons to the primary visual cortex (involved in early visual processing) originate in the cortex and thus reflect top-down processes.

When is top-down processing useful?

Top-down processes should have their greatest impact on object recognition when the information coming from our eyes (bottom-up) is unclear or limited—like when we see blurry or very quick images.

(37 cards)