Final exam ch6-11 Flashcards

(89 cards)

1
Q

What can babies tell us about speech perception?

A

Infants as young as 1 month old can perceive differences between speech sounds

By 4 months old, infants can discriminate basic contrasts in their native language
- This suggests that certain speech perception capacities represent innate mechanisms (so, are these capacities specific to speech in general or just the infant’s native language?)
- Infants can discriminate, with little training, between nonnative phonemic features, but continued exposure to a dominant language decreases that ability

2
Q

What does the motor theory of speech perception suggest?

A

That speech perception is based on invariant articulatory gestures; that is, speech perception uses the motor commands of the speech production system as the units of perception

The listener accesses their own knowledge of how sounds are produced and then uses that reference to process the perception of sounds produced by another individual

3
Q

What are the three key features that describe the motor theory of speech perception?

A

The biological specialization of perception for speech

Analysis by synthesis

The lack of invariant aspects of the acoustic signal

4
Q

What are bottom-up theories of speech perception?

A

Those in which the acoustic signal provides essential and sufficient information for perceptual recognition

5
Q

What do top-down theories of speech perception suggest?

A

The information from the acoustic signal is not sufficient for perceptual recognition… higher-level information from contextual, linguistic, and cognitive cues is necessary for accurate speech perception

6
Q

Tell me about the motor learning principles of speech production.

A

Principles of motor learning are the practice and feedback conditions that are used to enhance the learning and retention of new motor behaviors

The goal of the alterations in motor patterns for therapy is to help the patient achieve more effective communication that meets the patient’s daily needs

7
Q

Explain the DIVA model of speech production.

A

Speech production consists of motor commands (feedforward) and auditory and somatosensory maps (feedback)

Planning: The brain selects a speech sound and activates the corresponding region in the Speech Sound Map.

Motor Execution:
- Feedforward Path: Sends motor commands to articulators (lips, tongue, etc.).
- Feedback Paths: Monitor auditory and somatosensory input.

Error Correction: If the sound produced doesn’t match the target, the model uses feedback to adjust future motor commands.

8
Q

What three major prosodic features contribute to accentedness?

A

Identical words may receive unique syllabic stress in different dialects

The extent to which vowel reduction occurs varies among languages

The basic building blocks of prosody are not used the same in every dialect and language

9
Q

What is speech rhythm?

A

A language-dependent phenomenon that encompasses both the temporal and spectral patterned recurrence of strong and weak prosodic elements, including pitch, stress, loudness, and rate

10
Q

What are vocalic vs. intervocalic intervals?

A

Vocalic intervals: consist of vowels, as well as liquids and glides that do not have clear change in formant structure when examined spectrographically

Intervocalic/consonantal intervals: consonants, liquids, and glides that are clearly identifiable from vowels by change in formant structure, and segments in which vowel reduction in unstressed syllables leads to lack of clearly identifiable formant structure

11
Q

What is syllabic stress?

A

The use of f0, intensity, and/or duration to place emphasis on one or more syllables of a word

12
Q

What are some characteristics of syllabic stress?

A

f0, intensity, and duration are often used together as a stress cue

The stressed unit is often higher in pitch, louder, and of longer duration

Language-defined

13
Q

What is prominence?

A

The amount of emphasis placed upon a syllable or group of syllables to convey meaning

14
Q

How are prominence and syllabic stress defined?

A

Prominence is speaker-defined, and syllabic stress is language-defined

15
Q

What is prosody?

A

Suprasegmental level (prosody): the ensemble of phonetic properties that do not enter into the definition of individual speech sounds

Prosodic features are defined by their relative values to one another

16
Q

What does prosody include?

A

Prosody is a broad term that includes patterns of intonation, timing, and loudness

The acoustic correlates of these features are f0 contour, duration and juncture, and intensity contour

17
Q

Tone vs. intonation

A

Tone: pitch as a distinctive feature at the word level

Intonation: pitch contour at the utterance level

18
Q

What does the fundamental frequency contribute to?

A

Our perception of the emotional intent of the speaker

19
Q

What is f0 declination?

A

The tendency of f0 to decrease gradually over the course of an utterance

The most common explanation has been that lung pressure slowly decreases over the length of the utterance (although many say that the lung pressure shift is not large enough to account for the declination of f0)

Some evidence suggests that a decline in muscular activity affects f0

20
Q

What is duration used to signal?

A

Used to signal semantic boundaries (preboundary lengthening)

Increased duration of one or more syllables in utterance-final position can signal the end of words or utterances

21
Q

What is juncture?

A

The pause time or separation of syllables

22
Q

What is sonority?

A

The loudness level of a sound relative to other sounds of similar length, pitch, and stress

23
Q

How can the status of the velopharyngeal port be assessed?

A

Using endoscopy, conventional x-ray, CT, and MRI, as well as point-tracking instrumentation

24
Q

What are the three ways that hypernasality decreases overall intelligibility?

A

The radiated acoustic signal has less energy (nasal antiformants dampen the acoustic radiated energy) - less intensity (harder to hear)

Sometimes the emission of excessive airflow through the nose results in turbulence, which can be heard as excessive noise, which interferes with the acoustic information necessary for speech comprehension

The escape of air through the open velopharyngeal port into the nasal cavity results in decreased intraoral pressure

25
What can measurement of intraoral pressure provide information about?
The adequacy of breathing effort and of velopharyngeal port closure, and the temporal coordination of these phenomena with movement of the articulators
26
Organic vs. nonorganic causes of disorders
Organic disorders
- Hearing impairments, structural problems, facial abnormalities
- Neurological impairments (dysarthria, etc.)
- Cognitive deficits
- Childhood apraxia of speech
Nonorganic (functional) etiologies
- The actual etiology is unknown
- Articulation problems: motor-based misarticulations of one or more sounds with no clear etiology
- Phonological disorders: errors in phoneme production arising from incorrect linguistic rules
27
How is the glottal stop achieved?
By rapid cessation of voicing through closure of the vocal folds
28
What is coarticulation?
The phenomenon of simultaneously articulating more than one phoneme
29
Upstream vs. downstream in the VT
Constrictions closer to the mouth are downstream, while those closer to the vocal folds are upstream
30
What are egressive sounds?
Sounds produced on the outward flow of air from the lungs
31
What are the three sources of speech sounds?
- Vibration of the vocal folds: produces a nearly periodic complex wave; vowels are produced by this method
- Turbulence: airflow driven by lung pressure through the open glottis becomes turbulent, creating continuous aperiodic waves; many consonants are produced by this method (s, f, etc.)
- Transient noise: generation of pressures in the mouth, with rapid pressure changes in the supraglottal vocal tract (p, k)
32
What is cepstral analysis used for?
- Cepstral analysis can overcome the significant weakness of time-based measures
- Shows the extent to which the f0 and harmonic structure stand out from the background noise
- Does not rely upon the identification of individual vibratory cycles, so it’s valid even when the vocal signal is not periodic
- Can be used as an objective measure of vocal quality
33
What is a typical H/N?
11.9 dB for typical voices; those with voice pathologies were significantly below that level
34
What is the harmonics-to-noise (H/N) ratio?
A numerical evaluation of the ratio of the energy in the fundamental and harmonics to the energy in the aperiodic or noise component of the speech signal, averaged over several cycles
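As a rough illustration of the definition above, the ratio can be expressed in decibels once the harmonic and noise energies are known. This is a minimal sketch: the function name and the example energies are hypothetical, and separating the periodic from the aperiodic component (the hard part in practice) is assumed to have been done already.

```python
import math

def harmonics_to_noise_db(harmonic_energy, noise_energy):
    """Return the H/N ratio in dB from two positive energy values.

    Assumes the energy of the fundamental plus harmonics and the energy of
    the noise component were already estimated and averaged over several cycles.
    """
    if harmonic_energy <= 0 or noise_energy <= 0:
        raise ValueError("energies must be positive")
    # Energy (power-like) ratios convert to decibels with 10 * log10.
    return 10.0 * math.log10(harmonic_energy / noise_energy)

# Hypothetical energies: periodic part has 100 times the noise energy.
print(harmonics_to_noise_db(100.0, 1.0))
```

A higher value means the harmonic structure stands out more clearly from the noise.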
35
How are the formants related in front vowels?
F1 and F2 are further apart, and F2 and F3 are closer
36
How are the formants related in back vowels?
F1 and F2 are closer together, and F2 and F3 are farther apart
37
Narrowband vs. wideband spectrograms
- Narrowband filtering provides good resolution of the harmonic (frequency) structure of the source signal
- Wideband filtering provides good time resolution of the glottal pulses and formant structure of the vocal tract
38
Characteristics of wideband spectrograms
- Individual harmonics are not resolved clearly
- Broader bands of energy are evident (spectral peaks associated with the vocal tract formants)
- Has vertical striations, which represent the glottal pulses
39
Characteristics of narrowband spectrograms
- The gray scale represents intensity, so darker harmonics contain more energy
- Lower harmonics contain more energy
- Harmonic energy is attenuated when it does not coincide closely with a formant frequency
40
What is the center frequency of a filter?
The midpoint or peak of the filter; it represents the frequency that is allowed to pass with the greatest amplitude
41
What is the bandwidth of a filter?
The range of frequencies between the low and high-pass cutoff frequencies; it specifies the range of frequencies that are allowed to pass
42
How do acoustic filters work?
Acoustic filters selectively pass certain frequency components of a complex wave (they pass certain ones more effectively than others)
43
What do low-pass filters do?
They block the high-frequency components of the wave and allow the low-frequency components to pass
44
What do high-pass filters do?
They block the low-frequency components of the wave and allow the high-frequency components to pass
45
What is the frequency cutoff?
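The frequency beyond which components are damped (not allowed to pass): for a high-pass filter, frequencies below the cutoff are damped; for a low-pass filter, frequencies above it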
46
What is a bandpass filter?
A filter with both a low and a high-pass cutoff frequency
47
Vocal tract as an intensity regulator
- Formants are resonant characteristics of the vocal tract, independent of the presence or absence of a sound source
- They don’t add energy
- They just choose certain frequencies to attenuate and others to radiate out of the mouth
48
What are diphthong spectrogram characteristics?
- The defining feature of diphthongs is that the articulatory posture shifts smoothly from the first vowel to the second one (if this smooth glide isn’t present, it’s not a diphthong)
- Onglide: articulatory starting point
- Offglide: ending articulation
49
Smaller vs. larger resonating space
- A smaller resonating space within the vocal tract vibrates at a higher frequency than a larger resonating space
- The larger resonating spaces will produce lower formant frequencies than the smaller resonating spaces
50
How does lip rounding affect formants?
It tends to lower the frequency of all formants
51
How do the formants change as vowel quality moves forward?
F1 and F2 become more separated
52
How do the formants change when the resonance space of the posterior oral cavity widens?
F1 and F2 become more separated
53
How does F1 change with vowel height?
For both the front and back vowels, F1 varies inversely with vowel height; vowel height decreases and F1 increases as we move from the high (close) front vowel /i/ to the low (open) front vowel /ae/
54
Which are the most extreme high vowels?
i, u
55
Which are the lowest vowels?
ae, a
56
Which are the most front vowels?
i, ae
57
Which are the most back vowels?
u, a
58
Which are the five front vowels from high to low?
/i, I, e, E, ae/
59
Which are the five back vowels from high to low?
/u, ʊ, o, ɔ, ɑ/
60
F1 characteristics
- Influenced by oral cavity opening and by constriction in the lower pharynx, just above the glottis
- F1 is lowered by a constriction in the oral cavity near the point of maximum volume velocity
- F1 is raised by constriction in the pharynx
61
F2 characteristics
- Most influenced by the shape of the posterior portion of the tongue
- F2 is lowered by a constriction in the area of the lips or at the back of the oral cavity in the oropharynx (u)
- F2 is raised by a constriction in the anterior oral cavity behind the lips (i)
62
F3 characteristics
- Most influenced by the position of the tip of the tongue
- F3 is lowered by constriction at the lips and in the middle of the mouth (er)
- F3 is raised by a constriction in the oropharynx (a) and by constriction in the anterior mouth (j)
63
Because an antinode is always located at the opening of the lips...?
- All formants are lowered by labial constriction
- Formants tend to be raised by lowering of the mandible
64
What two rules govern the relationship between formant frequency and perturbation?
- A constriction at or near an antinode, at which point the volume velocity is at a maximum and the pressure is at a minimum, lowers the frequency of the formant
- A constriction at or near a node, at which point the volume velocity is at a minimum and the pressure is at a maximum, raises the frequency of the formant
65
What defines the formants of a tube?
The length and cross-section
66
How does increasing the length of a tube affect formant frequencies?
It lowers them
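The effect of length can be made concrete with the textbook quarter-wavelength approximation. This is only a sketch under assumptions that are not stated in the cards: the tube is uniform in cross-section, closed at one end (the glottis) and open at the other (the lips), and the speed of sound is taken as 35,000 cm/s. The function name is hypothetical.

```python
def tube_formants(length_cm, n_formants=3, speed_of_sound_cm_s=35000.0):
    """Resonances of a uniform tube closed at one end and open at the other.

    Uses the quarter-wavelength relation F_n = (2n - 1) * c / (4 * L).
    A real vocal tract is not uniform, so these are idealized values.
    """
    if length_cm <= 0:
        raise ValueError("length must be positive")
    return [
        (2 * n - 1) * speed_of_sound_cm_s / (4.0 * length_cm)
        for n in range(1, n_formants + 1)
    ]

# Compare a shorter and a longer idealized tube.
for length in (14.0, 17.5):
    print(length, [round(f) for f in tube_formants(length)])
```

With these assumptions a 17.5 cm tube resonates near 500, 1500, and 2500 Hz, while the 14 cm tube gives higher values, consistent with longer tubes having lower formants.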
67
What do vocal tract formants represent?
- Potential resonances of the vocal tract… the vocal tract can only filter the energy with which it is supplied
- The vocal tract does not add energy (it does not increase the amplitude of any specific harmonic); it only selectively allows a greater or lesser amount of the energy of each harmonic to be radiated out of the vocal tract
68
Acoustic characteristics of lip radiation
The characteristics of this boundary are such that when the sound pressure wave exits the oral cavity, the higher-frequency harmonics are radiated more effectively than the lower-frequency harmonics, because the radiation characteristics at the lips favor the high-frequency components (air particle displacement is greater at frequencies with greater intensity; lower frequencies and harmonics have greater intensity and therefore face more resistance from the atmosphere)
69
What is a formant?
A concentration of energy around a particular frequency in the acoustic wave
70
What defines the resonant characteristics of the vocal tract?
The length and cross-section
71
How does changing VT length affect formant frequencies?
- Shortening the vocal tract will raise the formant frequencies
- Elongating the vocal tract will lower the formant frequencies
72
Formants vs. harmonics
- Formants are resonating characteristics of the vocal tract and describe the acoustic filter
- Harmonics are multiples of the fundamental frequency and describe the sound source
73
What two important things does the acoustic theory of speech production tell us?
- The features of the vocal tract can be inferred from its acoustic output (so, specific articulatory postures produce specific sounds)
- The speech production system may be broken down into two major components: the sound source and the filter, or resonator
74
What is the spectral roll-off?
- Spectral roll-off is the decrease in amplitude of successively higher harmonics of the phonatory signal
- For a triangular wave, the roll-off is 12 dB per octave (each doubling of harmonic frequency)
- The spectral roll-off of an actual glottal waveform may be very different from that of the triangular wave, but the harmonic amplitudes will always decrease uniformly
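To see what 12 dB per octave means harmonic by harmonic, the level of each harmonic can be computed relative to the fundamental. This is a sketch of the idealized case only; the function name is hypothetical and the roll-off is assumed to be perfectly uniform.

```python
import math

def harmonic_level_db(harmonic_number, rolloff_db_per_octave=12.0):
    """Level of the n-th harmonic relative to the fundamental (0 dB).

    An octave is a doubling of frequency, so the n-th harmonic lies
    log2(n) octaves above the fundamental.
    """
    if harmonic_number < 1:
        raise ValueError("harmonic number starts at 1")
    return -rolloff_db_per_octave * math.log2(harmonic_number)

# Harmonics 1, 2, 4, and 8 are each one octave apart.
for n in (1, 2, 4, 8):
    print(n, harmonic_level_db(n))
```

Each doubling of harmonic number drops the level by a further 12 dB under this assumption.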
75
Tell me about falsetto
- Occupies the frequencies above modal register
- Frequency changes occur by contraction of the cricothyroid, unopposed by the thyroarytenoid; the vocal folds elongate and mass per unit length decreases, creating lots of tension and increased stiffness
- The high level of tension causes increased buildup of elastic recoil of the vocal folds, reduced amplitude of mucosal wave vibration, and shortened closed phase of the vibratory cycle (sometimes even a complete lack of closed phase)
76
Tell me about vocal fry
- The cricothyroid is relaxed so there is minimal tension on the vocal folds
- Vocal folds are shortened and thickened, with increased mass per unit length and a lax mucosal cover
Four major features:
- Fundamental frequency is quite low (35-50 Hz for both men and women)
- Prolonged duration of the closed phase
- The mean airflow and lung pressure produced in this register are considerably lower than in modal voice
- There’s a double closure pattern for each cycle (called dicrotic phonation)
77
What are the four factors of confusion in defining the different registers?
- Registration is a psychoacoustic phenomenon, so much of the identification of registers is based on perceptual judgments
- The physiology and acoustics underlying the different registers are not completely understood
- Register is sometimes used to refer to the voice quality change due to fundamental frequency alone and not the quality change associated with different modes of vocal fold vibration
- The registers identified in singing are different from those for speaking, but the terms overlap
78
What is the open quotient?
The ratio of the open phase of vocal fold vibration to the entire duration of the glottal cycle
79
What is the speed quotient?
The ratio of the duration of the opening phase of the vocal folds to the duration of the closing phase
80
What is the contact/closed quotient?
The ratio of the period during which the vocal folds are in contact to the entire glottal cycle
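The open, speed, and contact/closed quotients can all be computed from the durations of the phases of one glottal cycle. The sketch below is an illustration under stated assumptions: the cycle is split into opening, closing, and closed phases, the open phase is taken as opening plus closing, and the contact period is approximated by the closed phase. The function name and example durations are hypothetical.

```python
def glottal_quotients(opening_ms, closing_ms, closed_ms):
    """Return open, speed, and closed quotients for one glottal cycle.

    Assumes open phase = opening + closing, and that the period of vocal
    fold contact equals the closed phase.
    """
    if opening_ms <= 0 or closing_ms <= 0 or closed_ms < 0:
        raise ValueError("opening and closing must be positive, closed non-negative")
    open_ms = opening_ms + closing_ms
    period_ms = open_ms + closed_ms
    return {
        "open_quotient": open_ms / period_ms,       # open phase / whole cycle
        "speed_quotient": opening_ms / closing_ms,  # opening phase / closing phase
        "closed_quotient": closed_ms / period_ms,   # contact period / whole cycle
    }

# Hypothetical 10 ms cycle: 3 ms opening, 2 ms closing, 5 ms closed.
print(glottal_quotients(3.0, 2.0, 5.0))
```

Under these assumptions the open and closed quotients always sum to 1, so a shortened closed phase (as in falsetto) shows up as a larger open quotient.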
81
Tell me about MPT
- Measures the duration of a maximally sustained vowel
- Elicited by asking the individual to take a deep breath and sustain a vowel at a comfortable pitch and loudness for as long as possible
- Used to assess the integrity of phonatory glottal closure
82
Tell me about the S/Z ratio
- Used to assess the integrity of phonatory glottal closure
- A statistic of the relative durations of maximum phonation of the phonemes /s/ and /z/
- Not very valid/reliable
83
How is direct measurement of lung pressure achieved?
- Direct measurement of lung pressure is achieved by inserting a hypodermic needle through the cricothyroid membrane just below the vocal folds
- Air flows from the trachea through the needle to a pressure sensor
- Disadvantage: must be performed by a physician and can be uncomfortable for the research subject
- Advantage: obtains an accurate measure of lung pressure, and it can be measured during running speech
84
What is a voice range profile?
- A close relationship exists between f0 and intensity, and this relationship is shown graphically in a voice range profile
- The upper contour represents the maximum intensity at each frequency
- The lower contour represents the minimum intensity at each frequency
85
What are the notable features of a VRP?
- The wider the area, the more flexible the voice, in that both the dynamic (intensity) and pitch ranges are large
- The dynamic range is reduced at the upper and lower extremes, and widest around the mid-frequency range
- Both the upper and lower contours tilt upward in the higher frequencies
86
How can measurement of f0 and intensity be divided?
- Levels of habitual use of the voice (what is the performance of the vocal system of this individual under routine use?)
- Levels of maximum performance (what is the performance of the system under mechanical stress? What are the physiological limits of the system?)
- Degree of regularity (how stable is the voice production system?)
87
What is perturbation?
The variability or irregularity in a system
88
What is jitter?
Short-term f0 perturbation that represents the nonvolitional variability in the f0 as measured during sustained vowel phonation
89
What is shimmer?
The short-term variability in the amplitude of the acoustic waveform
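Jitter and shimmer are both cycle-to-cycle perturbation measures, so one helper can illustrate them. The sketch below uses one common formulation (mean absolute difference between consecutive cycles, as a percentage of the mean), which is an assumption and not necessarily the formula used in the course text. The function name and sample values are hypothetical; the cycle periods and peak amplitudes are assumed to have been extracted from a sustained vowel already.

```python
def local_perturbation_percent(values):
    """Mean absolute cycle-to-cycle difference as a percentage of the mean.

    Pass cycle periods to estimate jitter, or cycle peak amplitudes to
    estimate shimmer.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value

# Hypothetical measurements from four consecutive glottal cycles.
periods_ms = [10.0, 10.1, 9.9, 10.0]
amplitudes = [1.00, 0.98, 1.02, 1.00]
print("jitter %:", local_perturbation_percent(periods_ms))
print("shimmer %:", local_perturbation_percent(amplitudes))
```

A perfectly regular signal gives 0 for both measures; larger values indicate a less stable voice production system.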