Speech Science Exam 2 Flashcards

Question

What is quantization rate measured in?

Answer 1

bits (binary digits, 0 and 1)

Answer 2

1. time waveform; 2. spectrum/spectrogram/spectra

Answer 3

presents the amplitude of the signal as a function of time

Answer 4

the number of times an object (such as air molecules) vibrates through a complete cycle per second, measured in hertz (Hz)

Answer 5

the length of one cycle

Answer 6

1 / period (time of one cycle, in seconds)
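As a quick check of the reciprocal relationship between period and frequency, here is a minimal Python sketch. The function name `frequency_hz` is illustrative only and does not come from the course material.

```python
def frequency_hz(period_s: float) -> float:
    """Return frequency in Hz given the duration of one cycle in seconds."""
    if period_s <= 0:
        raise ValueError("period must be positive")
    return 1.0 / period_s


# A cycle that lasts 10 ms (0.010 s) repeats about 100 times per second;
# halving the period to 5 ms doubles the frequency.
print(frequency_hz(0.010))
print(frequency_hz(0.005))
```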

Answer 7

amplitude as a function of time

Answer 8

the amplitude of the signal as a function of frequency (i.e. the amplitude on the Y axis and the frequency on the X axis); creates a display of the frequency composition of a signal at a point in time

Answer 9

looking at the magnitude at various frequency components at a specific time

Answer 10

1. FFT (Fast Fourier Transform); 2. LPC (Linear Predictive Coding)

Answer 11

decomposes a signal into its freq components; an algorithm that greatly expedites the computations req for a more precise Discrete Fourier Transform; useful for looking at all (or most) frequency components; increasing the FFT points allows more accurate display
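To make "decomposes a signal into its frequency components" concrete, the sketch below computes a plain Discrete Fourier Transform in pure Python. It is deliberately the slow O(N^2) DFT, not an FFT; an FFT gives the same numbers faster. The sample rate, the number of points, and the two test frequencies are assumptions chosen so that each component falls exactly on an analysis bin.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Naive O(N^2) DFT; returns scaled magnitudes for bins 0..N/2."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        mags.append(abs(acc) / n)
    return mags


sample_rate = 8000  # Hz (assumed)
n_points = 64       # number of analysis points (assumed)

# Test signal: a 500 Hz component plus a weaker 1500 Hz component.
signal = [math.sin(2 * math.pi * 500 * i / sample_rate)
          + 0.5 * math.sin(2 * math.pi * 1500 * i / sample_rate)
          for i in range(n_points)]

mags = dft_magnitudes(signal)
bin_hz = sample_rate / n_points  # spacing between bins: 125 Hz here

# The two largest bins correspond to the two components in the signal.
peaks = sorted(range(len(mags)), key=lambda k: mags[k], reverse=True)[:2]
peak_freqs = sorted(k * bin_hz for k in peaks)
print(peak_freqs)
```

The bin spacing is `sample_rate / n_points`, so using more points gives finer frequency steps, which is one way to read the card's note that increasing the FFT points allows a more accurate display.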

Answer 12

amplitude is on the vertical axis and frequency is on the horizontal axis; line represents freq components and their amplitudes

Answer 13

linear predictive coding: a method that attempts to predict upcoming speech samples based on a weighted sum of previous samples; uses estimation based on a vocal tract model filter (not as precise as FFT); not all components, just peaks
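The core idea, predicting the next sample from a weighted sum of previous samples, can be illustrated with a toy case. For a pure sinusoid the two weights are known analytically, since x[n] = 2cos(w)·x[n-1] - x[n-2], so the prediction is essentially exact. This is only a sketch of the prediction step: a real LPC analysis estimates the weights from the speech signal itself, which is not shown here. All names and values are illustrative assumptions.

```python
import math


def predict_next(previous, weights):
    """Predict the next sample as a weighted sum of the most recent samples.

    weights[0] applies to the most recent sample, weights[1] to the one
    before it, and so on.
    """
    return sum(w * previous[-1 - i] for i, w in enumerate(weights))


sample_rate = 8000  # Hz (assumed)
freq = 200          # Hz (assumed)
omega = 2 * math.pi * freq / sample_rate
signal = [math.sin(omega * n) for n in range(50)]

# Analytic predictor weights for a pure sinusoid (not estimated from data).
weights = [2 * math.cos(omega), -1.0]

errors = [abs(signal[n] - predict_next(signal[:n], weights))
          for n in range(2, len(signal))]
max_error = max(errors)
print(max_error)  # tiny: only floating-point rounding remains
```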

Answer 14

amplitude is on the vertical axis and frequency is on the horizontal axis; line represents estimated spectral peaks and their amplitudes; helpful for seeing vowel formants; useful for looking at spectral peaks, but not detailed freq components (must be cautious when attempting to interpret LPC display)

Answer 15

changes, will look for more peaks

Answer 16

literally a series of spectra over time with a time-freq-intensity display; related to the spectrum; sounds are analyzed in a 3D pattern of time (horizontal), freq (vertical), and amplitude (coded by different colors or shades of gray); shows spectral peaks; the input spectrum is averaged by a filter and the formant (= resonance of the vocal tract) freqs appear as darkened bars on the spectrogram; has a time domain and allows one to view changes over time

Answer 17

narrow-band and wide-band spectrograms

Answer 18

1. "sample" waveform every several milliseconds 2. plot series of spectra over time with shades of gray or color (amplitude) 3. turn it sidewise and that is one slice
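The steps above can be sketched in pure Python: cut the waveform into short frames, take one spectrum per frame, and stack the spectra in time order. The sketch stops short of drawing anything; it only tracks the strongest frequency in each slice. The sample rate, frame length, hop size, and test signal are assumptions, and a naive DFT stands in for the FFT a real analysis program would use.

```python
import cmath
import math


def spectrum(frame):
    """Scaled DFT magnitudes of one frame for bins 0..N/2."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame))) / n
            for k in range(n // 2 + 1)]


def spectrogram(samples, frame_len, hop):
    """One spectrum per frame; frames start every `hop` samples."""
    return [spectrum(samples[start:start + frame_len])
            for start in range(0, len(samples) - frame_len + 1, hop)]


sample_rate = 8000  # Hz (assumed)
# Test signal: 500 Hz for the first 256 samples, then 1000 Hz.
signal = [math.sin(2 * math.pi * (500 if i < 256 else 1000) * i / sample_rate)
          for i in range(512)]

# 64-sample frames at 8000 Hz means one slice every 8 ms.
slices = spectrogram(signal, frame_len=64, hop=64)
bin_hz = sample_rate / 64

# Strongest frequency in each time slice, in time order.
peak_track = [max(range(len(s)), key=lambda k: s[k]) * bin_hz for s in slices]
print(peak_track)
```

Mapping each magnitude in `slices` to a shade of gray, with time on the horizontal axis and frequency on the vertical axis, would give the familiar display.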

Answer 19

have detailed frequency resolution (i.e. show frequencies more precisely than a wide-band spectrogram); good for looking at pitch changes, harmonic structure; but not as good for looking at resonances

Answer 20

analysis bandwidth (window) has to be narrower than the distance in frequency between the harmonics of the voicing source; generally, one would pick an analysis bandwidth that is less than the speaker's F0 (e.g. if the speaker's F0 is 100 Hz, the bandwidth should be less than that, such as 50 Hz); window has to be less than the F0

Answer 21

span over a wider range of freq than narrow-band spectrogram; more than the F0; vertical striations indicate glottal pulses; good for looking at resonances; not good for looking at pitch changes or harmonic structure

Answer 22

it has to be larger than the distance in frequency between the harmonics of the voicing source; generally one would pick an analysis bandwidth that is larger than the speaker's F0 (e.g. if the speaker's F0 is 100 Hz, the bandwidth should be larger than that, at least 150 Hz, but 200 Hz is preferable)
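The narrow-band and wide-band rules of thumb both compare the analysis bandwidth with the speaker's F0, because the harmonics are spaced F0 apart. A minimal sketch of that comparison follows; the function name and the "borderline" label for the equal case are assumptions added for illustration.

```python
def spectrogram_type(analysis_bandwidth_hz: float, f0_hz: float) -> str:
    """Classify an analysis bandwidth relative to the speaker's F0."""
    if analysis_bandwidth_hz < f0_hz:
        # Narrower than the harmonic spacing: individual harmonics resolve.
        return "narrow-band"
    if analysis_bandwidth_hz > f0_hz:
        # Wider than the harmonic spacing: harmonics smear, formants show.
        return "wide-band"
    return "borderline"


# The examples from the cards: a speaker with F0 = 100 Hz.
print(spectrogram_type(50, 100))
print(spectrogram_type(200, 100))
```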

Answer 23

resonances of the vocal tract

Answer 24

glottal pulses

Answer 25

easiest display to get; hover in one of the displays and it will display the pitch contour; don't click inside, it changes the values

Answer 26

the fundamental frequency

Answer 27

the lowest frequency component

Answer 28

12 dB/octave

Answer 29

further apart

Answer 30

vocal tract

Answer 31

interspeaker and intraspeaker

Answer 32

formant plot

Answer 33

0-4,000 Hz and 0-10,000 or 12,000 Hz

Answer 34

glottal buzz

Answer 35

F0 changes

Answer 36

252 Hz; 247 Hz

Answer 37

317-342 Hz; 442 Hz; 442 Hz; 442 Hz

Answer 38

65 Hz; 87 Hz

Answer 39

not made at the larynx (often esophagus)

Answer 40

resonators

Answer 41

natural resonant frequencies

Answer 42

1 open or closed ends of the tube 2 length of the tube 3 shape of the tube 4 size of the openings of the tube

Answer 43

vocal tract

Answer 44

F is a resonance, n= integer, c = 34400 cm/s, and l = length of the tube in cm

Answer 45

Fn = (2n-1)c/4l
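A worked example of this formula in Python, using c = 34400 cm/s from the deck and the tube lengths 17.5 cm and 14.5 cm that appear in another answer. Treating those lengths as inputs to this formula is an assumption, and the function name is illustrative.

```python
SPEED_OF_SOUND_CM_S = 34400  # cm/s, the value given in the deck


def tube_resonance_hz(n: int, length_cm: float,
                      c: float = SPEED_OF_SOUND_CM_S) -> float:
    """Resonance n (1, 2, 3, ...) of a tube closed at one end:
    Fn = (2n - 1) * c / (4 * l)."""
    return (2 * n - 1) * c / (4 * length_cm)


for length_cm in (17.5, 14.5):
    resonances = [round(tube_resonance_hz(n, length_cm)) for n in (1, 2, 3)]
    print(length_cm, resonances)
# 17.5 cm -> about 491, 1474, 2457 Hz
# 14.5 cm -> about 593, 1779, 2966 Hz
```

The shorter tube has higher resonances throughout, and in both cases the resonances fall at odd multiples (1, 3, 5) of the first one.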

Answer 46

shaping of the vocal tract

Answer 47

17.5 cm; 14.5 cm

Answer 48

the speed of sound at sea level

Answer 49

a peak of resonance in the vocal tract

Answer 50

the quarter wavelength formula

Answer 51

could be glottal source for vowels (typically voicing, but noise (hiss) excitation is also possible, such as whisper)

Answer 52

variable resonator

Answer 53

different resonant frequencies

Answer 54

glottal source

Answer 55

periodic; harmonics

Answer 56

6 dB/octave

Answer 57

the speed of sound at sea level: 34400 cm/s

Answer 58

periodic; aperiodic (noise, voiceless) OR BOTH (affricates)

Answer 59

shapes the sound

Answer 60

closer together

Answer 61

further apart

Answer 62

1 / time of the period in seconds

Answer 63

monophthongs

Answer 64

diphthongs

Answer 65

1 tongue height 2 tongue backness 3 lip rounding 4 tenseness

Answer 66

tense vowels are associated w/ more extreme tongue position than lax vowels, length (tense vowels are longer), can be in open or closed syllables

Answer 67

Formant transitions indicate articulatory changes

Answer 68

F1, F2, F3, and F4 are generally lowered by lip rounding.

Answer 69

Shape and size

Answer 70

270; 2290 Hz

Answer 71

310; 2790 Hz

Answer 72

730; 1090 Hz

Answer 73

850; 1220 Hz

Answer 74

300; 870 Hz

Answer 75

370; 950 Hz

Answer 76

what the sound will be like

Answer 77

the center frequency; whole band

Answer 78

formant plot (don't touch inside the graph!!)

Answer 79

pitch contour (only shows F0)

Answer 80

1 static properties v. dynamic properties 2 intrinsic properties v. extrinsic properties

Answer 81

such as steady-state formant frequencies and the fundamental phonetic environment) e.g. speaking rate

Answer 82

including inherent spectral change and consonantal context effects; relative vowel amplitude

Answer 83

(intra-segmental) relational properties, especially relations among the fundamental and formant frequencies within vowels

Answer 84

(transsegmental) relational properties, such as relative vowel duration and the relative formant frequencies of a vowel compared to those of other vowels of the same speaker

Answer 85

ae; relatively long and lax

Answer 86

no, it is relevant but non-phonemic

Answer 87

secondary, but it does matter

Answer 88

duration, intensity and F0/pitch

Answer 89

1 absolute formant frequencies determine the vowel identity 2 ranges of formant frequencies determine vowel identity 3 ratios of formant frequencies determine vowel identity

Answer 90

1 vowel identity is determined by "normalizing" by means of point (corner) vowels. 2 listeners "estimate" (infer) the speakers' vocal tracts and use that as normalizing info to perceive vowels. 3 vowel identity is aided by formant transitions of consonants.

Answer 91

1 variability involving formant values and even formant ranges exists on many levels (individual, speaking rate) 2 there are some overlapping areas in the formant ranges 3 ignores other factors that can be relevant cues (such as formant transitions, etc.)

Answer 92

Peterson & Barney

Answer 93

1 variability 2 male, female, and child vocal tracts are not scale models of each other 3 ignores other factors that can be important relevant cues (such as formant transitions, etc.)

Answer 94

Formant Ratios w/ F0 (speaker normalization is F0, where SR is "sensory reference")

Answer 95

there are 3 dimensions: x axis = log(SF3/SF2); y axis = log(SF1/SR); z axis = log(SF2/SF1)

Answer 96

"sensory reference" (SR) = 168 × (GMF0/168)^(1/3), where GMF0 is the grand mean F0

Answer 97

variable; ignores other factors that can be important relevant cues (such as formant transitions, etc.)

Answer 98

it is a tetrahedron (Miller's Tetrahedron; vowels vary along the three dimensions)
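The sensory reference and the three log-ratio dimensions from the preceding answers can be computed directly. This sketch makes two assumptions: it reads the garbled "1/3" in the SR formula as a cube-root exponent, and it uses base-10 logarithms. The formant values and the grand mean F0 in the example are illustrative inputs only, not figures taken from the deck.

```python
import math


def sensory_reference(gmf0_hz: float) -> float:
    """SR = 168 * (GMF0 / 168) ** (1/3), with GMF0 the grand mean F0."""
    return 168.0 * (gmf0_hz / 168.0) ** (1.0 / 3.0)


def formant_ratio_coordinates(sf1: float, sf2: float, sf3: float, sr: float):
    """Return (x, y, z) = (log(SF3/SF2), log(SF1/SR), log(SF2/SF1)).

    Base-10 logs are an assumption; the deck only writes "log".
    """
    x = math.log10(sf3 / sf2)
    y = math.log10(sf1 / sr)
    z = math.log10(sf2 / sf1)
    return x, y, z


# Illustrative inputs (assumed): a talker whose grand mean F0 is 132 Hz
# producing a vowel with formants at 270, 2290, and 3010 Hz.
sr = sensory_reference(132.0)
x, y, z = formant_ratio_coordinates(270.0, 2290.0, 3010.0, sr)
print(round(sr, 1), round(x, 3), round(y, 3), round(z, 3))
```

Because of the cube root, SR moves only a little when the talker's F0 changes a lot, and a talker whose grand mean F0 is exactly 168 Hz gets SR = 168.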

Answer 99

listeners "estimate" (infer) the speakers' vocal tracts and use that as normalizing info to perceive vowels

Answer 100

Vocal Tract Normalization

Answer 101

Vocal Tract Normalization

Answer 102

extrinsic factors

Answer 103

point vowel normalization

Answer 104

vocal tract normalization

Answer 105

1 Absolute Formant frequencies determine vowel ID 2 Ranges of Formant frequencies determine vowel ID (Peterson and Barney) 3 Ratios of Formant frequencies determine vowel ID (Peterson and Barney) 4 Formant Ratios adding an element of Speaker Normalization (Miller)

Answer 106

Talkers blocked

Answer 107

1 Vocal Tract Normalization (Liberman and Gerstman) 2 Point Vowel Normalization (Ladefoged & Broadbent) 3 Dynamic Specification Model (Strange)

Answer 108

1 static properties 2 dynamic properties 3 intrinsic (intrasegmental) 4 extrinsic (transsegmental)

Answer 109

"the main conclusion of this work is that although the relative imp of some of these FX is situation dependent, none of these factors can be safely ignored in a full acct of English vowel perception. It seems fruitless for us to concentrate on only one set of FX and assume that the others are lab curiosities."

Answer 110

gradual transitions from one vowel-like articulation to another; highly variable

Answer 111

the spectrum of the glottal buzz or laryngeal output

Answer 112

the greater the F0 the larger the distance will be between the harmonics
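Since the harmonics of a periodic voicing source sit at whole-number multiples of F0, the spacing between neighbouring harmonics equals F0 itself. A small sketch with two assumed F0 values shows the spacing doubling when F0 doubles; the function name is illustrative.

```python
def harmonics_hz(f0_hz: float, count: int):
    """First `count` harmonics: whole-number multiples of F0."""
    return [f0_hz * k for k in range(1, count + 1)]


for f0 in (100, 200):  # assumed example F0 values in Hz
    hs = harmonics_hz(f0, 5)
    spacing = hs[1] - hs[0]
    print(f0, hs, "spacing:", spacing)
```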

Answer 113

grand mean F0

(179 cards)