Perception 5: Multisensory perception Flashcards
(31 cards)
When we perceive something through multiple senses, vision tends to dominate.
Rock and Victor (1964) investigated this
Subjects completed one of the following conditions:
Subjects felt an object from underneath without seeing their own hand (haptics only, no vision)
Subjects' visual impression of the size and shape of the object was distorted by a lens (vision only, no haptics)
Subjects both saw the object through the lens and felt it (vision and haptics)
Subjects had to give their impression of the size of the square
What did they find?
Subjects’ impression was dominated by visual input (regardless of response type)
Only 1 in 5 subjects reported awareness of the conflict
Lens distortion dominated over correct haptic information
What is the ventriloquist illusion?
Vision dominates hearing: the speech seems to come from the puppet's moving mouth rather than from the ventriloquist
What does hearing allow us to do, unlike touch?
Localise an object in extrapersonal space
Jackson (1953) tested whether our estimate of an object’s location is dominated by vision or hearing.
Two types of stimulus combinations:
E1. Ecologically meaningless (bells and lights)
E2. Ecologically meaningful (visual impression of steam rising and sound of kettle whistling)
Asked where sound came from
What were results?
Strong visual capture of auditory localisation
Less capture with increasing spatial disparity between the sound and the visual stimulus (more on this later)
More capture with ecologically meaningful combinations
Visual dominance effect - only when there is a small difference between auditory and visual stimulus locations
E1 = bells and lights - less visual capture, as this was a less meaningful combination than steam and kettle whistle
What is the McGurk effect?
Effect of vision on heard speech - seeing a mouth move in different ways can make us perceive the same sound differently
Interesting because the resulting percept is a novel composite produced by fusion
e.g. hearing /b/ and seeing /g/ results in perceiving /d/
What does Posner et al. (1976) suggest sensory dominance is the result of?
An adaptive attentional mechanism
=> vision is less capable of exogenous orienting (involuntary attention to a new stimulus) than other modalities (it is much harder to startle someone with a visual stimulus than with an auditory one)
=> humans preferentially attend to vision to compensate - more reliable?
In what two possible scenarios might hearing dominate over vision?
- Resolving ambiguous visual events
- Resolving temporal differences
Vision is sluggish - chemical changes in photoreceptors are slower than the response to vibrations in the ear, and auditory signals also reach the brain and are processed more quickly
Sekuler et al. (1997) - bouncing balls illusion:
Objects either cross over each other or bounce off each other - adding click sound makes it seem like they bounce, don’t pass through (auditory perception dominates)
What happens in sound offset version? (Playing sound until visual coincidence period, then sound stops)
If the sound plays at or just before the moment the balls meet in the middle, observers judge them to be bouncing off each other
If the sound plays just after the balls meet in the middle, observers judge them to be passing through - no effect
If the sound plays up until the balls meet, stops while they meet, and then continues, there is no effect - shows the illusion is due to multisensory perception and not a change in attention
What is a criticism of tasks aimed to test visual dominance?
Have we just found visual dominance because tasks target vision better than audition?
How did Shams et al. (2000) provide the first evidence of auditory dominance over vision by considering temporal rather than spatial properties of a stimulus?
When a single flash is shown with two beeps, people perceive two flashes
However, when one beep or no beeps accompany one flash, people perceive this correctly as one flash
Number of flashes perceived can only be increased by auditory stimulus, not decreased
What is the discontinuity hypothesis? (Shimojo and Shams, 2001)
“the modality that carries a signal which is more discontinuous (and hence more salient) becomes the influential or modulating modality”
Would explain why two beeps influence perception of the single flash
This appears to have some effect but can’t explain all cases of visual dominance or other forms of multisensory integration
What is the modality appropriateness hypothesis? (Welch and Warren, 1980)
In spatial tasks, vision dominates.
In temporal tasks, hearing dominates.
Other similar theories considered modality precision
Stein studied neurons in the superior colliculus because it is small, easy to map out (receives inputs from retina), and is known to be involved in spatial orienting.
Studied cat’s response to bird sound vs sight of bird
What did they find?
Superficial layers were almost exclusively visual neurons
Deep layers had diverse preferences - multisensory
What are the three laws of multisensory neuron behaviour?
Law 1: Super-additivity
Law 2: Law of inverse effectiveness
Law 3: Spatiotemporal coincidence
What is the law of super-additivity?
Neural response to combined visual + auditory stimulation is greater than sum of unisensory responses
But super-additivity is dependent on some conditions
What is the law of inverse effectiveness?
The degree of the additive response is inversely related to the strength of the unisensory responses
Weak unisensory cues gain the most from being combined into a multisensory one
With optimal (strong) unisensory cues, multisensory responses can be sub-additive
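A minimal sketch of how super-additivity and inverse effectiveness could be checked from response counts. The function names and the spike counts are hypothetical (not data from any study); the percentage gain over the best unisensory response is one common way to express enhancement, assumed here rather than taken from these cards.

```python
def classify_additivity(visual, auditory, combined):
    """Compare the combined response with the sum of the unisensory responses."""
    total = visual + auditory
    if combined > total:
        return "super-additive"
    if combined < total:
        return "sub-additive"
    return "additive"


def enhancement_percent(visual, auditory, combined):
    """Percent gain of the combined response over the best unisensory response."""
    best = max(visual, auditory)
    return 100.0 * (combined - best) / best


# Hypothetical spike counts, for illustration only.
weak_cues = dict(visual=2, auditory=1, combined=8)       # near-threshold stimuli
strong_cues = dict(visual=20, auditory=15, combined=28)  # highly effective stimuli

for label, cues in (("weak", weak_cues), ("strong", strong_cues)):
    print(label, classify_additivity(**cues), enhancement_percent(**cues))
```

With these made-up numbers the weak cues give a large super-additive gain, while the strong cues give a smaller, sub-additive one, which is the pattern the law of inverse effectiveness describes.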
What is the law of spatio-temporal coincidence?
The additive response also depends on spatial and temporal congruity factors…
Response enhancement when visual and auditory stimuli are placed within their respective receptive fields
Response depression when the auditory stimulus is placed outside its receptive field
How does spatial coincidence affect multisensory perception?
Only have multisensory effects when events are perceived to be occurring in the same region of space
How does temporal coincidence affect multisensory perception?
Greatest multisensory response when the auditory stimulus comes slightly later (at short distances), as audition is processed faster; the opposite holds at long distances, as light travels faster than sound
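The distance trade-off can be sketched numerically. This is a rough illustration only: the two speeds are physical constants, but the 30 ms processing advantage for audition is a hypothetical value picked for the example, and the function names are made up here.

```python
SPEED_OF_SOUND_M_S = 343.0           # in air at roughly 20 degrees C
SPEED_OF_LIGHT_M_S = 299_792_458.0


def sound_arrival_lag_ms(distance_m):
    """How much later sound reaches the observer than light, in milliseconds."""
    return 1000.0 * (distance_m / SPEED_OF_SOUND_M_S - distance_m / SPEED_OF_LIGHT_M_S)


def net_auditory_lead_ms(distance_m, neural_advantage_ms):
    """Positive: the auditory signal is effectively ahead; negative: vision is ahead.

    neural_advantage_ms is an assumed head start for auditory processing.
    """
    return neural_advantage_ms - sound_arrival_lag_ms(distance_m)


ASSUMED_ADVANTAGE_MS = 30.0  # hypothetical value, for illustration only

for distance in (1, 10, 30):
    print(distance, round(net_auditory_lead_ms(distance, ASSUMED_ADVANTAGE_MS), 1))
```

Under this assumption audition leads for a source 1 m away, so the sound would need to be delayed to line up with vision, whereas at 30 m the travel time of sound outweighs the processing advantage and vision leads.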
Stein et al (1989)
Cat orients to V+A stimuli to get food reward
Unisensory stimuli are near threshold - weak so it is difficult to tell whether they are in a different place
They found that performance with both cues together was better than each cue alone - a multisensory effect - why was this?
(They followed up with another experiment)
Cat orients to V stimuli (ignoring A) to get food reward when V and A stimuli spatially disparate (incongruent)
Using V results in correct orienting
Using V and A results in less correct orienting
Suggests a spatial coincidence effect - superadditivity (better performance due to multisensory perception) occurs only when the stimuli fall within the same spatial receptive fields
Risberg & Lubker (1978) – combined visual and auditory speech perception shows large super-additivity.
How did their experiment show this?
1 - weak sound of speech
2 - weak visual stimulus (lip reading)
3 - combined = super-additive effect - most correctly perceived keywords
Calvert et al (2000)
Participants see and hear a person reading “1984” by George Orwell
Separate periods of V, A, and VA
fMRI BOLD response recorded
What did they find?
Calvert et al. (2000) found super-additive and sub-additive responses to visual/auditory speech perception in the superior temporal sulcus
Super-additive when there was audio-visual congruence
Sub-additive when they were incongruent
Ernst and Banks (2002) transformed the way we think of multisensory integration:
Maximum likelihood integration
Their findings imply that humans integrate visual and other cues in a statistically optimal way
-> Each cue is weighted based on its reliability (or precision)
-> Does not require a high-level “appropriateness” explanation
What was the task they used?
Used a two-interval forced-choice (2IFC) task:
Subjects see each stimulus one at a time and are then asked, for example, which one was bigger
One of the two is usually the standard stimulus - always the same size throughout the experiment
The comparison stimulus can be a range of different sizes
When asking someone which of the two is bigger, you are asking whether the comparison or the standard is bigger:
If someone is worse at the task, they might still judge the comparison to be bigger 30% of the time when the comparison size is 2 and the standard size is 6 - the cumulative distribution function flattens out
If someone is better at the task, they should judge the comparison to be bigger 0% of the time when the comparison size is 2 and the standard size is 6 - the cumulative distribution function becomes steeper
If someone is biased, their perception of the standard size is shifted, so the whole curve shifts sideways
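The three cases above can be sketched with a cumulative Gaussian psychometric function. This is a minimal sketch, not the authors' analysis: the function name, the `sigma` values and the `bias` parameter are assumptions chosen so that the numbers roughly match the example sizes of 2 and 6.

```python
import math


def p_comparison_bigger(comparison, standard, sigma, bias=0.0):
    """Probability of judging the comparison bigger than the standard.

    sigma: assumed discrimination noise (larger = worse at the task).
    bias:  assumed shift in the perceived size of the standard.
    """
    z = (comparison - (standard + bias)) / (sigma * math.sqrt(2))
    return 0.5 * (1.0 + math.erf(z))


poor = p_comparison_bigger(2, 6, sigma=8)  # flat curve: about 0.31
good = p_comparison_bigger(2, 6, sigma=1)  # steep curve: close to 0
# With a bias of 2, the 50% point moves from size 6 to size 8.
biased_midpoint = p_comparison_bigger(8, 6, sigma=1, bias=2)

print(round(poor, 2), round(good, 4), biased_midpoint)
```

A larger `sigma` flattens the curve, a smaller one steepens it, and a non-zero `bias` slides the whole curve along the size axis without changing its slope.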
Our sensory estimate of some physical property varies due to internal noise. What is this noise assumed to be?
Normally distributed
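Given normally distributed noise, maximum likelihood integration weights each cue by its reliability, taken as the inverse of its noise variance. A minimal sketch under that assumption follows; the function name and the example sizes and noise levels are hypothetical, not values from Ernst and Banks (2002).

```python
import math


def mle_combine(est_v, sigma_v, est_h, sigma_h):
    """Combine a visual and a haptic estimate, weighting each by its reliability.

    Reliability is 1 / variance; weights are normalised reliabilities.
    Returns (combined estimate, combined sigma, visual weight, haptic weight).
    """
    rel_v = 1.0 / sigma_v ** 2
    rel_h = 1.0 / sigma_h ** 2
    w_v = rel_v / (rel_v + rel_h)
    w_h = 1.0 - w_v
    combined = w_v * est_v + w_h * est_h
    sigma_c = math.sqrt(1.0 / (rel_v + rel_h))
    return combined, sigma_c, w_v, w_h


# Hypothetical example: vision says 55 mm (sigma 1), touch says 50 mm (sigma 2).
combined, sigma_c, w_v, w_h = mle_combine(55.0, 1.0, 50.0, 2.0)
print(round(combined, 2), round(sigma_c, 3), round(w_v, 2), round(w_h, 2))
```

In this example the less noisy visual cue gets most of the weight, and the combined noise is lower than that of either cue alone. If the visual noise were increased above the haptic noise, the weighting would reverse, with no need for a fixed rule about which modality dominates.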