Perception 4: High-level hearing Flashcards
Why do older people typically hear "Laurel" while younger people typically hear "Yanny"?
Yanny = high frequencies
Laurel = low frequencies
Older people are less sensitive to high frequencies, so they are more likely to hear Laurel
What three lobes does the lateral sulcus divide?
Temporal, frontal, parietal
What two lobes does the central sulcus divide?
Frontal and parietal
If the lower part of the superior temporal gyrus was peeled away, what would it reveal?
Would reveal primary auditory cortex (A1)
A1 = organised tonotopically
Limited spatial mapping - unlike visual cortices
What are the three subareas of the auditory cortex?
Core - contains A1 and projects to belt
Belt - contains secondary areas and projects to parabelt
Parabelt - contains tertiary areas
What subarea of the auditory cortex does low-level processing, and which one does more complex processing (e.g. voice-based speech)?
Core = low-level processing
Parabelt = more complex processing
(via the projections it receives from the belt)
Neurons in the belt area (especially the anterior belt) showed a preference for more complex sounds over pure tones
e.g. band-limited noise one-third of an octave wide, centred on a neuron's preferred (central) frequency, elicited the biggest response
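To make the "1/3 octave" idea concrete, here is a minimal Python sketch contrasting a pure tone with a band of noise one-third of an octave wide around the same centre frequency (the kind of comparison band-passed-noise studies rely on). The sample rate, duration, and centre frequency are illustrative assumptions, not values from the study.

```python
import numpy as np

fs = 44_100    # sample rate in Hz (assumed)
dur = 0.5      # stimulus duration in seconds (assumed)
fc = 1_000.0   # centre frequency in Hz (assumed)

t = np.arange(int(fs * dur)) / fs

# Pure tone at the centre frequency
pure_tone = np.sin(2 * np.pi * fc * t)

# 1/3-octave band noise: keep only spectral components between
# fc * 2^(-1/6) and fc * 2^(+1/6), a band one-third of an octave wide
f_lo, f_hi = fc * 2 ** (-1 / 6), fc * 2 ** (1 / 6)
spectrum = np.fft.rfft(np.random.randn(len(t)))
freqs = np.fft.rfftfreq(len(t), 1 / fs)
spectrum[(freqs < f_lo) | (freqs > f_hi)] = 0
band_noise = np.fft.irfft(spectrum, n=len(t))
band_noise /= np.max(np.abs(band_noise))  # normalise amplitude
```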
Is it easier to find neurons with specific frequency sensitivities on the aBelt or pBelt?
pBelt
Recanzone (2000) measured spatial tuning of neurons in A1 vs posterior belt in monkeys
Monkey indicated (lever press) when a sound changed direction
Assessed responses to changes in azimuth (horizontal) and elevation (vertical)
Was spatial tuning greater in the belt or the A1?
Much greater in the pBelt than in A1
This tuning was a good predictor of monkey’s perception
pBelt more sensitive to changes in location - increasing spatial preference of neurons here compared to aBelt and A1
Behaviour of lever pull was better predicted by pBelt than A1
Lomber and Malhotra (2008) supported the dorsal-parietal-where and ventral-temporal-what streams.
They used cortical cooling in cats to temporarily deactivate one of two specific areas
They measured performance on two tasks:
Spatial localisation of sound
Discrimination of sound pattern
i.e. this was a double-dissociation design performed within each animal - a strong empirical test of the theory
What did the double dissociation show?
Found that deactivating the posterior auditory area (part of the dorsal "where" stream towards parietal cortex) impaired spatial localisation of sound
Found that deactivating the anterior auditory area (part of the ventral "what" stream along the temporal lobe) impaired pattern discrimination of sound
These results support a functional division between processing “what” a sound is and “where” it is coming from
Posterior auditory cortex for spatial localising
Anterior auditory cortex for sound pattern identification
As in visual perception, however, auditory perception requires the integration of both spatial and pattern information in order to function holistically
How does auditory segmentation happen in the brain? i.e. how is a listener able to segment sound sources with the same spatial origin (or with very similar origins)?
Hint - tones played for shorter or longer duration
When tones played for longer duration, harder to distinguish between high and low streams - perceive one stream going higher or lower
When tones played for shorter duration, less time between each one in stream - easier to distinguish - perceive two separate streams - one high and one low
e.g. Bregman and Campbell
Asked listeners to write down order of notes
Listeners could only judge the order accurately for notes within the same auditory stream - it is difficult to make order judgments across streams
Therefore they reported each separate stream more accurately when the notes were played for shorter durations
What is auditory scene analysis? (Bregman, 1990)
The ability to group and segregate auditory data into discrete mental representations
Bregman argued that streams are segregated based on the concept of the perceptual distance between successive auditory components
Based on the weighting of a number of acoustic dimensions
What could these dimensions be?
Time
Frequency (pitch)
Also - loudness and spatial location
How does perceptual distance determine what sounds are grouped together?
If sounds are of similar duration, similar frequency and similar loudness/spatial location, they are more likely to be grouped together in the same stream
If there is a higher relative weighting of time to frequency, then tones that are played closer together in time are more likely to be associated as one stream than tones close together in frequency
If there is a higher relative weighting of frequency to time, then tones that are similar in frequency are more likely to be associated as one stream than tones that are closer together in time
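As a rough illustration of how such a weighting could drive grouping, here is a minimal Python sketch: each incoming tone is attached to the stream whose most recent tone is perceptually closest under a weighted time-plus-frequency distance, or starts a new stream if nothing is close enough. The distance function, weights, and threshold are illustrative assumptions, not Bregman's actual model.

```python
from dataclasses import dataclass
from math import log2

@dataclass
class Tone:
    onset: float  # onset time in seconds
    freq: float   # frequency in Hz

def perceptual_distance(a, b, w_time=1.0, w_freq=1.0):
    """Weighted distance between successive tones: a larger w_time makes
    temporal proximity dominate grouping, a larger w_freq makes frequency
    similarity dominate."""
    dt = abs(b.onset - a.onset)        # seconds apart
    df = abs(log2(b.freq / a.freq))    # octaves apart
    return w_time * dt + w_freq * df

def group_into_streams(tones, w_time=1.0, w_freq=1.0, threshold=1.0):
    """Greedy grouping: attach each tone to the stream whose last tone is
    perceptually closest, or start a new stream if every existing stream
    is further away than the threshold."""
    streams = []
    for tone in sorted(tones, key=lambda t: t.onset):
        dist = lambda s: perceptual_distance(s[-1], tone, w_time, w_freq)
        best = min(streams, key=dist, default=None)
        if best is not None and dist(best) < threshold:
            best.append(tone)
        else:
            streams.append([tone])
    return streams

# Alternating high/low tones: weighting frequency heavily yields two streams
# (grouping by pitch), weighting time heavily yields one stream.
seq = [Tone(onset=i * 0.1, freq=800 if i % 2 == 0 else 400) for i in range(8)]
print(len(group_into_streams(seq, w_time=0.5, w_freq=2.0)))  # -> 2
print(len(group_into_streams(seq, w_time=5.0, w_freq=0.1)))  # -> 1
```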
How is perceptual coherence measured?
Often measured in ABA-ABA sequences
A and B are separate tones
A is high frequency, B starts low frequency and gets higher over time
Listeners are asked to judge at what point they stop hearing two separate streams and instead hear a single continuous "galloping" rhythm (one integrated stream)
Subjective measure
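For concreteness, here is a minimal Python sketch of an ABA-ABA triplet sequence of the kind described above, with B starting below A and rising towards it over the course of the sequence. The tone duration, frequencies, and sample rate are illustrative assumptions, not the parameters of any particular study.

```python
import numpy as np

def tone(freq, dur, fs=44_100):
    """A sine tone of the given frequency (Hz) and duration (seconds)."""
    t = np.arange(int(fs * dur)) / fs
    return np.sin(2 * np.pi * freq * t)

def aba_sequence(freq_a=1000.0, start_gap_oct=1.0, n_triplets=30,
                 tone_dur=0.1, fs=44_100):
    """ABA_ABA_... triplets: A, B, A, then a silent gap one tone long.
    B starts start_gap_oct octaves below A and rises towards A, so the
    percept tends to change from two separate streams (large separation)
    to a single "galloping" stream (small separation)."""
    silence = np.zeros(int(fs * tone_dur))
    triplets = []
    for i in range(n_triplets):
        gap = start_gap_oct * (1 - i / max(n_triplets - 1, 1))  # shrink A-B gap
        freq_b = freq_a * 2 ** (-gap)
        triplets.append(np.concatenate([tone(freq_a, tone_dur, fs),
                                        tone(freq_b, tone_dur, fs),
                                        tone(freq_a, tone_dur, fs),
                                        silence]))
    return np.concatenate(triplets)
```

Writing the result to a WAV file (e.g. with scipy.io.wavfile.write) lets you listen for the point where the two separate streams fuse into a single galloping rhythm.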
What might determine relative weighting for time vs frequency?
Task constraints and other factors
What is an objective measure of perceptual coherence?
Can be used when either segregation or integration leads to better performance:
Integration leading to better performance -
The temporal displacement of tone B becomes harder to detect when frequency difference between A and B increases
i.e. detecting whether B tone is exactly in the middle of two A tones or slightly offset is much harder when the A and B tones have very different frequencies
Ability to detect temporal offset of a tone is easier when frequencies are more similar - as it is one stream
If ability to detect this is poor, it is likely you have segregated them into two streams based on frequency differences
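A minimal Python sketch of the stimulus timing for this objective task: each B tone is nominally midway between its two A tones, and a non-zero offset shifts it away from that midpoint for the listener to detect. The inter-onset interval and offset values are illustrative assumptions.

```python
def aba_onsets(ioi=0.2, offset=0.0, n_triplets=10):
    """Onset times (seconds) of A and B tones in ABA_ triplets.
    `ioi` is the nominal inter-onset interval; `offset` shifts every B tone
    away from the midpoint between its two A tones. The listener's task is
    to detect whether the offset is zero or not, which is easier when A and
    B are heard as one integrated stream."""
    a_onsets, b_onsets = [], []
    for k in range(n_triplets):
        t0 = k * 4 * ioi                 # each triplet occupies A B A _ (4 slots)
        a_onsets += [t0, t0 + 2 * ioi]   # the two A tones
        b_onsets += [t0 + ioi + offset]  # B, nominally midway between them
    return a_onsets, b_onsets
```

In an experiment, trials with zero and non-zero offsets would be interleaved and detection accuracy compared across A-B frequency separations; poorer detection at larger separations is taken as objective evidence that the tones have split into two streams.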
Is segmenting streams more likely to happen when sound plays for longer duration? (not time of tones, duration for which whole pattern is played)
Yes
The auditory system starts by assuming there is one stream - over time, as evidence accumulates, it recognises separate streams
Fission = separating streams
Suggests this is a bottom-up process
Thompson et al (2011) used the ABA triplet task presented to one ear
Listeners either
1. attended to the ABA task throughout
2. performed a noise task in the other ear for the first 10 seconds, then switched attention and performed the ABA task
Difference - attention switching
In what condition is accuracy better?
What does this mean for the role of attention?
Accuracy is better in the unattended (switched) condition
ABA performance (judging whether tone B falls exactly midway between two A tones) is better when the tones are integrated and perceived as one stream
More time with full attention available increases segregation, and segregation makes this temporal judgment harder
Since less attention led to better performance, the streams must have remained integrated without attention
Therefore, attention is required for segregation
What is schema-based segregation of sounds?
- Top-down
- Based on stored templates
- Distinct from “primitive” (bottom-up)
Bey and McAdams (2002)
Listeners compare target and comparison melody: “Same” vs “different”?
One melody is intermixed with distractors
In different conditions, the first melody or second melody has distractors.
How do people perform between the two conditions?
Performance is better when the unmixed melody is heard first
Hearing the target melody first tells you what to listen for in the mixed melody: it sets up a schema (a template of the melody) that lets you tune out the distractors. This shows that top-down processing - attention and schemas - contributes to auditory segmentation
Schema effects in segregation essentially show how the brain “knows what to listen for”
i.e. it parses information based on stored templates
This is very effective for recognising complex sounds in noise - like speech
What are the smallest units of speech?
Phonemes
Vowel sounds have formants at certain frequencies. Rapid changes in frequency at the start or end of a formant are known as formant transitions. What are these associated with?
Consonants
What is coarticulation?
Interaction between overlapping speech sounds
The acoustic signal for the same phoneme varies based on its context
i.e. what comes before/after it
e.g. the formant transition for /d/ varies depending on the following vowel sound
Even when the phoneme is the same, the surrounding sounds change its acoustic signal
One other difficulty lies in variation between speakers, speaking contexts, dialects, etc.
(More formal or informal speech)
What is the distinction between voiced and voiceless consonants?
Voiced consonants involve vibration of the vocal cords (b, d, w, z)
Voiceless consonants do not (f, ch, k)