Perception 4: High-level hearing Flashcards
Why do older people typically hear "Laurel" while younger people typically hear "Yanny"?
Yanny = high frequencies
Laurel = low frequencies
Older people are less sensitive to high frequencies, so they are more likely to hear Laurel
What three lobes does the lateral sulcus divide?
Temporal, frontal, parietal
What two lobes does the central sulcus divide?
Frontal and parietal
If the lower part of the superior temporal gyrus was peeled away, what would it reveal?
Would reveal primary auditory cortex (A1)
A1 = organised tonotopically
Limited spatial mapping - unlike visual cortices
What are the three subareas of the auditory cortex?
Core - contains A1 and projects to belt
Belt - contains secondary areas and projects to parabelt
Parabelt - contains tertiary areas
What subarea of the auditory cortex does low-level processing, and which one does more complex processing (e.g. voice-based speech)?
Core = low-level processing
Parabelt = more complex processing
(via the projections it receives from the belt)
Neurons in the belt area (especially the anterior belt) showed a preference for more complex sounds over pure tones
e.g. band-limited noise one-third of an octave wide, centred on a neuron's preferred (central) frequency, elicited the biggest response
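To make the "1/3 octave" idea concrete, here is a minimal Python sketch contrasting a pure tone with a band of noise one-third of an octave wide around the same centre frequency (the kind of comparison band-passed-noise studies rely on). The sample rate, duration, and centre frequency are illustrative assumptions, not values from the study.

```python
import numpy as np

fs = 44_100    # sample rate in Hz (assumed)
dur = 0.5      # stimulus duration in seconds (assumed)
fc = 1_000.0   # centre frequency in Hz (assumed)

t = np.arange(int(fs * dur)) / fs

# Pure tone at the centre frequency
pure_tone = np.sin(2 * np.pi * fc * t)

# 1/3-octave band noise: keep only spectral components between
# fc * 2^(-1/6) and fc * 2^(+1/6), a band one-third of an octave wide
f_lo, f_hi = fc * 2 ** (-1 / 6), fc * 2 ** (1 / 6)
spectrum = np.fft.rfft(np.random.randn(len(t)))
freqs = np.fft.rfftfreq(len(t), 1 / fs)
spectrum[(freqs < f_lo) | (freqs > f_hi)] = 0
band_noise = np.fft.irfft(spectrum, n=len(t))
band_noise /= np.max(np.abs(band_noise))  # normalise amplitude
```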
Is it easier to find neurons with specific frequency sensitivities on the aBelt or pBelt?
pBelt
Recanzone (2000) measured spatial tuning of neurons in A1 vs posterior belt in monkeys
Monkey indicated (lever press) when a sound changed direction
Assessed responses to changes in azimuth (horizontal) and elevation (vertical)
Was spatial tuning greater in the belt or the A1?
Much greater in the pBelt than in A1
This tuning was a good predictor of monkey’s perception
pBelt more sensitive to changes in location - increasing spatial preference of neurons here compared to aBelt and A1
Behaviour of lever pull was better predicted by pBelt than A1
Lomber and Malhotra (2008) supported the dorsal-parietal-where and ventral-temporal-what streams.
They used cortical cooling in cats to temporarily deactivate one of two specific areas
They measured performance on two tasks:
Spatial localisation of sound
Discrimination of sound pattern
i.e. this was a double-dissociation design performed within each animal - a strong empirical test of the theory
What did the double dissociation show?
Found that deactivating the posterior auditory area (part of the dorsal "where" stream towards parietal cortex) impaired spatial localisation of sound
Found that deactivating the anterior auditory area (part of the ventral "what" stream along the temporal lobe) impaired pattern discrimination of sound
These results support a functional division between processing “what” a sound is and “where” it is coming from
Posterior auditory cortex for spatial localising
Anterior auditory cortex for sound pattern identification
As in visual perception, however, auditory perception requires the integration of both spatial and pattern information in order to function holistically
How does auditory segmentation happen in the brain? i.e. how is a listener able to segment sound sources with the same spatial origin (or with very similar origins)?
Hint - tones played for shorter or longer duration
When tones played for longer duration, harder to distinguish between high and low streams - perceive one stream going higher or lower
When tones played for shorter duration, less time between each one in stream - easier to distinguish - perceive two separate streams - one high and one low
e.g. Bregman and Campbell
Asked listeners to write down order of notes
Listeners could only judge the order accurately for notes within the same auditory stream - it is difficult to make order judgments across streams
Therefore they reported each separate stream more accurately when the notes were played for shorter durations
What is auditory scene analysis? (Bregman, 1990)
The ability to group and segregate auditory data into discrete mental representations
Bregman argued that streams are segregated based on the concept of the perceptual distance between successive auditory components
Based on the weighting of a number of acoustic dimensions
What could these dimensions be?
Time
Frequency (pitch)
Also - loudness and spatial location
How does perceptual distance determine what sounds are grouped together?
If sounds are of similar duration, similar frequency and similar loudness/spatial location, they are more likely to be grouped together in the same stream
If there is a higher relative weighting of time to frequency, then tones that are played closer together in time are more likely to be associated as one stream than tones close together in frequency
If there is a higher relative weighting of frequency to time, then tones that are similar in frequency are more likely to be associated as one stream than tones that are closer together in time
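As a rough illustration of how such a weighting could drive grouping, here is a minimal Python sketch: each incoming tone is attached to the stream whose most recent tone is perceptually closest under a weighted time-plus-frequency distance, or starts a new stream if nothing is close enough. The distance function, weights, and threshold are illustrative assumptions, not Bregman's actual model.

```python
from dataclasses import dataclass
from math import log2

@dataclass
class Tone:
    onset: float  # onset time in seconds
    freq: float   # frequency in Hz

def perceptual_distance(a, b, w_time=1.0, w_freq=1.0):
    """Weighted distance between successive tones: a larger w_time makes
    temporal proximity dominate grouping, a larger w_freq makes frequency
    similarity dominate."""
    dt = abs(b.onset - a.onset)        # seconds apart
    df = abs(log2(b.freq / a.freq))    # octaves apart
    return w_time * dt + w_freq * df

def group_into_streams(tones, w_time=1.0, w_freq=1.0, threshold=1.0):
    """Greedy grouping: attach each tone to the stream whose last tone is
    perceptually closest, or start a new stream if every existing stream
    is further away than the threshold."""
    streams = []
    for tone in sorted(tones, key=lambda t: t.onset):
        dist = lambda s: perceptual_distance(s[-1], tone, w_time, w_freq)
        best = min(streams, key=dist, default=None)
        if best is not None and dist(best) < threshold:
            best.append(tone)
        else:
            streams.append([tone])
    return streams

# Alternating high/low tones: weighting frequency heavily yields two streams
# (grouping by pitch), weighting time heavily yields one stream.
seq = [Tone(onset=i * 0.1, freq=800 if i % 2 == 0 else 400) for i in range(8)]
print(len(group_into_streams(seq, w_time=0.5, w_freq=2.0)))  # -> 2
print(len(group_into_streams(seq, w_time=5.0, w_freq=0.1)))  # -> 1
```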
How is perceptual coherence measured?
Often measured in ABA-ABA sequences
A and B are separate tones
A is high frequency, B starts low frequency and gets higher over time
Listeners are asked to judge at what point they stop hearing two separate streams and instead hear a single continuous "galloping" rhythm (one integrated stream)
Subjective measure
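For concreteness, here is a minimal Python sketch of an ABA-ABA triplet sequence of the kind described above, with B starting below A and rising towards it over the course of the sequence. The tone duration, frequencies, and sample rate are illustrative assumptions, not the parameters of any particular study.

```python
import numpy as np

def tone(freq, dur, fs=44_100):
    """A sine tone of the given frequency (Hz) and duration (seconds)."""
    t = np.arange(int(fs * dur)) / fs
    return np.sin(2 * np.pi * freq * t)

def aba_sequence(freq_a=1000.0, start_gap_oct=1.0, n_triplets=30,
                 tone_dur=0.1, fs=44_100):
    """ABA_ABA_... triplets: A, B, A, then a silent gap one tone long.
    B starts start_gap_oct octaves below A and rises towards A, so the
    percept tends to change from two separate streams (large separation)
    to a single "galloping" stream (small separation)."""
    silence = np.zeros(int(fs * tone_dur))
    triplets = []
    for i in range(n_triplets):
        gap = start_gap_oct * (1 - i / max(n_triplets - 1, 1))  # shrink A-B gap
        freq_b = freq_a * 2 ** (-gap)
        triplets.append(np.concatenate([tone(freq_a, tone_dur, fs),
                                        tone(freq_b, tone_dur, fs),
                                        tone(freq_a, tone_dur, fs),
                                        silence]))
    return np.concatenate(triplets)
```

Writing the result to a WAV file (e.g. with scipy.io.wavfile.write) lets you listen for the point where the two separate streams fuse into a single galloping rhythm.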
What might determine relative weighting for time vs frequency?
Task constraints and other factors
What is an objective measure of perceptual coherence?
Can be used when either segregation or integration leads to better performance:
Integration leading to better performance -
The temporal displacement of tone B becomes harder to detect when frequency difference between A and B increases
i.e. detecting whether B tone is exactly in the middle of two A tones or slightly offset is much harder when the A and B tones have very different frequencies
Ability to detect temporal offset of a tone is easier when frequencies are more similar - as it is one stream
If ability to detect this is poor, it is likely you have segregated them into two streams based on frequency differences
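A minimal Python sketch of the stimulus timing for this objective task: each B tone is nominally midway between its two A tones, and a non-zero offset shifts it away from that midpoint for the listener to detect. The inter-onset interval and offset values are illustrative assumptions.

```python
def aba_onsets(ioi=0.2, offset=0.0, n_triplets=10):
    """Onset times (seconds) of A and B tones in ABA_ triplets.
    `ioi` is the nominal inter-onset interval; `offset` shifts every B tone
    away from the midpoint between its two A tones. The listener's task is
    to detect whether the offset is zero or not, which is easier when A and
    B are heard as one integrated stream."""
    a_onsets, b_onsets = [], []
    for k in range(n_triplets):
        t0 = k * 4 * ioi                 # each triplet occupies A B A _ (4 slots)
        a_onsets += [t0, t0 + 2 * ioi]   # the two A tones
        b_onsets += [t0 + ioi + offset]  # B, nominally midway between them
    return a_onsets, b_onsets
```

In an experiment, trials with zero and non-zero offsets would be interleaved and detection accuracy compared across A-B frequency separations; poorer detection at larger separations is taken as objective evidence that the tones have split into two streams.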
Is segmenting streams more likely to happen when sound plays for longer duration? (not time of tones, duration for which whole pattern is played)
Yes
The auditory system starts by assuming there is one stream - over time, as evidence accumulates, it recognises separate streams
Fission = separating streams
Suggests this is a bottom-up process
Thompson et al (2011) used the ABA triplet task presented to one ear
Listeners either
1. attended to the ABA task throughout
2. performed a noise task in the other ear for the first 10 seconds, then switched attention and performed the ABA task
Difference - attention switching
In what condition is accuracy better?
What does this mean for the role of attention?
Accuracy is better in the unattended (switched) condition
ABA performance (judging whether tone B falls exactly midway between two A tones) is better when the tones are integrated and perceived as one stream
More time with full attention available increases segregation, and segregation makes this temporal judgment harder
Since less attention led to better performance, the streams must have remained integrated without attention
Therefore, attention is required for segregation
What is schema-based segregation of sounds?
- Top-down
- Based on stored templates
- Distinct from “primitive” (bottom-up)
Bey and McAdams (2002)
Listeners compare target and comparison melody: “Same” vs “different”?
One melody is intermixed with distractors
In different conditions, the first melody or second melody has distractors.
How do people perform between the two conditions?
Performance is better when the unmixed melody is heard first
Hearing the target melody first tells you what to listen for in the mixed melody: it sets up a schema (a template of the melody) that lets you tune out the distractors. This shows that top-down processing - attention and schemas - contributes to auditory segmentation
Schema effects in segregation essentially show how the brain “knows what to listen for”
i.e. it parses information based on stored templates
This is very effective for recognising complex sounds in noise - like speech
What are the smallest units of speech?
Phonemes
Vowel sounds have formants at certain frequencies. Rapid changes in frequency at the start or end of a formant are known as formant transitions. What are these associated with?
Consonants
What is coarticulation?
Interaction between overlapping speech sounds
The acoustic signal for the same phoneme varies based on its context
i.e. what comes before/after it
e.g. the formant transition for /d/ varies depending on the following vowel sound
Even when the phoneme is the same, the surrounding sounds change its acoustic signal
One other difficulty lies in variation between speakers, speaking contexts, dialects, etc.
(More formal or informal speech)
What is the distinction between voiced and voiceless consonants?
Voiced consonants involve vibration of the vocal cords (b, d, w, z)
Voiceless consonants do not (f, ch, k)