Week 5 Flashcards

1
Q

What’s the time complexity of the algorithm used for local decoding?

A

Polynomial time (the forward and backward recursions run in O(N^2 T) for N states and T observations)

2
Q

What are the three steps in the local decoding algorithm?

A

- Calculate the forward probabilities
- Calculate the backward probabilities
- Combine both to find the retrospective (posterior) distribution
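A minimal numpy sketch of these three steps for a toy HMM. The names are conventional rather than from the lectures: A is the transition matrix, B the emission matrix, pi the initial state distribution, and obs a sequence of observation indices.

```python
import numpy as np

def local_decode(A, B, pi, obs):
    """Local (posterior) decoding: forward pass, backward pass,
    then combine both into a per-position state distribution."""
    n_states, T = A.shape[0], len(obs)

    # Step 1: forward probabilities alpha[t, q]
    alpha = np.zeros((T, n_states))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]

    # Step 2: backward probabilities beta[t, q]
    beta = np.ones((T, n_states))
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])

    # Step 3: combine both into the retrospective (posterior) distribution
    posterior = alpha * beta
    posterior /= posterior.sum(axis=1, keepdims=True)
    return posterior.argmax(axis=1)   # most probable state at each position
```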

3
Q

How do we intuitively decode the state at time t?

A

Fuse information from the past observations and the future observations

4
Q

What does the forward probability calculate?

A

The sum of the probabilities of all state paths that could have generated the given sequence of observable symbols

5
Q

What is the difference between the forward algorithm (used in local decoding) and the Viterbi algorithm?

A

Viterbi takes the max over predecessor states where the forward algorithm takes the sum
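The contrast in one step of each recursion, as a hedged sketch (prev is the previous column of forward/Viterbi scores, b_obs the emission probabilities for the current observation; both names are mine):

```python
import numpy as np

def forward_step(prev, A, b_obs):
    # Forward algorithm: SUM over all predecessor states
    return (prev @ A) * b_obs

def viterbi_step(prev, A, b_obs):
    # Viterbi algorithm: MAX over all predecessor states
    return (prev[:, None] * A).max(axis=0) * b_obs
```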

6
Q

What is unsupervised learning?

A

Estimating the HMM's parameters without annotated tags

7
Q

What is replaced in the Baum-Welch algorithm?

A

The counts (which cannot be observed without annotated data) are replaced with their estimates, computed as expectations under the current model

8
Q

What are the three stages in the Baum-Welch algorithm?

A
- Initialisation
- E step
- M step
9
Q

Describe the initialisation step in the Baum-Welch algorithm

A

Randomly guess some starting values for A0 and B0 (the initial transition and emission matrices)

10
Q

Describe the E step in the Baum-Welch algorithm

A

- Calculate the forward and backward probabilities
- Calculate the retrospective (posterior) distribution
- Calculate the pseudo transition estimates for all i, q, q'

11
Q

Describe the M step in the Baum-Welch algorithm

A
- Re-estimate A for all q, q'
- Re-estimate b for all q, w
- Update the iteration counter
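A hedged single-sequence sketch tying the three stages together. The variable names (gamma for the retrospective distribution, xi for the pseudo transition estimates) follow the common textbook formulation, which may differ from the course's exact notation; this naive version will underflow on long sequences.

```python
import numpy as np

def baum_welch(obs, n_states, n_symbols, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)

    # Initialisation: random starting guesses A0 and B0 (rows normalised)
    A = rng.random((n_states, n_states)); A /= A.sum(1, keepdims=True)
    B = rng.random((n_states, n_symbols)); B /= B.sum(1, keepdims=True)
    pi = np.full(n_states, 1.0 / n_states)
    T = len(obs)

    for _ in range(n_iter):
        # E step: forward and backward probabilities ...
        alpha = np.zeros((T, n_states)); alpha[0] = pi * B[:, obs[0]]
        for t in range(1, T):
            alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        beta = np.ones((T, n_states))
        for t in range(T - 2, -1, -1):
            beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
        likelihood = alpha[-1].sum()
        # ... the retrospective distribution gamma, and pseudo transition
        # estimates xi[t, q, q'] for every position and state pair
        gamma = alpha * beta / likelihood
        xi = np.zeros((T - 1, n_states, n_states))
        for t in range(T - 1):
            xi[t] = alpha[t][:, None] * A * (B[:, obs[t + 1]] * beta[t + 1])[None, :]
            xi[t] /= likelihood

        # M step: re-estimate A for all q, q' and b for all q, w
        A = xi.sum(0) / gamma[:-1].sum(0)[:, None]
        B = np.zeros_like(B)
        for t in range(T):
            B[:, obs[t]] += gamma[t]
        B /= gamma.sum(0)[:, None]
        pi = gamma[0]

    return A, B, pi
```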
12
Q

How does the likelihood change with each re-estimation step in the Baum-Welch algorithm?

A

It increases with every iteration, until the likelihood converges (neither the E step nor the M step can decrease it)

13
Q

What are the four probabilities for each pair of characters in P(O|C)?

A
- rev(i,e)
- ins(p,h)
- del(k,n)
- sub(f,v)
14
Q

What does rev(i,e) indicate in misspelling probabilities?

A

The order of the two letters has been reversed: ie -> ei

15
Q

What does ins(p,h) indicate in misspelling probabilities?

A

The second letter (h) is inserted after the first: p -> ph

16
Q

What does del(k,n) indicate in misspelling probabilities?

A

The first letter, preceding the second, is deleted: kn -> n

17
Q

What does sub(f,v) indicate in misspelling probabilities?

A

The first letter is substituted for the second: f -> v

18
Q

What is the spelling correction equation?

A

C* = argmax_C P(O|C) P(C), maximised over candidate corrections C for the observed (misspelled) word O
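A minimal sketch of that decision rule. channel_p (an error model giving P(O|C)) and prior_p (a language model giving P(C)) are assumed to be supplied; both names are illustrative placeholders.

```python
import math

def correct(observed, candidates, channel_p, prior_p):
    # C* = argmax_C P(O|C) * P(C), computed in log space to avoid underflow
    return max(candidates,
               key=lambda c: math.log(channel_p(observed, c))
                             + math.log(prior_p(c)))
```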

19
Q

How can we get observable symbols from acoustic waveform?

A

- Slice up the signal (discretise it)
- Represent each slice as a feature vector
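A hedged sketch of both steps for a 1-D numpy signal, using typical ASR framing values (25 ms windows every 10 ms). Real front ends use richer features such as MFCCs; the energy-plus-peak-frequency vector here is purely illustrative.

```python
import numpy as np

def frames_to_features(signal, rate, width=0.025, step=0.010):
    w, s = int(rate * width), int(rate * step)
    feats = []
    for start in range(0, len(signal) - w + 1, s):
        sl = signal[start:start + w]
        energy = float(np.sum(sl ** 2))            # "volume" of the slice
        spectrum = np.abs(np.fft.rfft(sl))
        peak_hz = np.argmax(spectrum) * rate / w   # crude dominant frequency
        feats.append((energy, peak_hz))
    return np.array(feats)                         # one feature vector per slice
```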

20
Q

What does a feature vector for a slice of the acoustic signal consist of?

A

Floating point values representing energy (volume) or frequencies within that slice

21
Q

What is our goal for speech recognition?

A

Compute the most probable sentence W for the given acoustic observation

22
Q

What are the two main components for automatic speech recognition (ASR)?

A

- Language model P(W)
- Acoustic model P(O|W)
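The two components combine exactly as in the spelling-correction cards above: W* = argmax_W P(W) P(O|W). A hedged sketch, with lm_logp and acoustic_logp as assumed placeholder scoring functions:

```python
def recognise(O, candidate_sentences, acoustic_logp, lm_logp):
    # Noisy channel for ASR: combine language model P(W) and acoustic model P(O|W)
    return max(candidate_sentences,
               key=lambda W: lm_logp(W) + acoustic_logp(O, W))
```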

23
Q

What levels may an HMM have when working with speech instead of written language?

A

- Bigram model of words in a sentence
- Bigram model of phones within a word
- Bigram model of subphones within a phone

24
Q

What is a phone?

A

A language-independent individual speech unit

25
Q

What are phones represented by?

A

Symbols from the phonetic alphabet

26
Q

What does IPA try to do?

A

Offer sound symbols, together with transcription principles, for transcribing any spoken language

27
Q

What is a phoneme?

A

Abstract class of sounds that are perceived as one distinctive sound in a given language

28
Q

What is an allophone?

A

Different pronunciations of the same phoneme

29
Q

What are pronunciation dictionaries?

A

Tools used for speech recognition and speech synthesis that list words together with their pronunciations as phoneme sequences

30
Q

How many subphones does a phone model normally distinguish between in ASR?

A

Three

31
Q

What are the three subphones a phone model distinguishes between in ASR?

A

Beginning, middle, end

32
Q

How is the variable duration of a subphone represented in a HMM?

A

By a self-loop: a transition from the state representing the subphone back to itself

33
Q

What is a Bakis network?

A

A structure where transitions can only go forwards or loop on a single state, but never backwards
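An illustrative (not course-given) transition matrix over the three subphone states from the cards above, showing the Bakis constraint: self-loops on the diagonal model variable duration, and everything below the diagonal is zero.

```python
import numpy as np

A = np.array([
    [0.6, 0.4, 0.0],   # beginning: loop or advance to middle
    [0.0, 0.7, 0.3],   # middle:    loop or advance to end
    [0.0, 0.0, 1.0],   # end:       loop (exiting handled by a final state)
])
assert np.allclose(np.tril(A, k=-1), 0)   # never backwards
```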

34
Q

How are unresolved ambiguities in speech recognition returned?

A

Word lattice

35
Q

What is a word lattice?

A

A labelled, directed acyclic graph with states roughly corresponding to points in time
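A toy lattice for the classic "I scream" / "ice cream" ambiguity, written as an adjacency map (my own representation, chosen only to make the DAG structure concrete):

```python
# States 0-3 roughly correspond to points in time; each arc carries a word label.
lattice = {
    0: [("I", 1), ("ice", 2)],   # two competing hypotheses from the start
    1: [("scream", 3)],
    2: [("cream", 3)],
    3: [],                       # final state
}
# The two paths through the DAG read "I scream" and "ice cream".
```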

36
Q

What is the goal of automatic speech recognition?

A

Computationally build systems that map from an acoustic signal to a string of words

37
Q

What is the intuition of a noisy channel model?

A

Treat the acoustic waveform as a noisy version of the string of words

38
Q

Describe how the noisy channel model works

A

If we know how the channel distorts the sound, we can find the correct source sentence for a waveform by taking every possible sentence in the language, running each one through the channel, and seeing whether it matches the output

39
Q

What is an HMM characterised by?

A

- Set of states
- Transition probability matrix: the probability of moving from state i to state j
- Set of observations
- Emission probability matrix: the probability of an observation being generated from a given state
- Start and end state
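One way to write the five components down as code, as a hedged sketch (the field names are mine, not standard notation from the course):

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class HMM:
    states: list            # set of states
    A: np.ndarray           # transition matrix: A[i, j] = P(move from state i to j)
    observations: list      # set of observable symbols
    B: np.ndarray           # emission matrix: B[i, k] = P(symbol k | state i)
    start: int              # index of the start state
    end: int                # index of the end state
```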

40
Q

What are HMMs characterised by in speech recognition?

A

- Set of states corresponding to subphones
- Transition probability matrix with representations for the self-loop and for going to the next subphone
- Emission probabilities: the probability of a feature vector being generated for each subphone state

41
Q

What are lexicons?

A

Lists of words, with a pronunciation for each word expressed as a phone sequence

42
Q

What is the decoding question for speech recognition?

A

Given a string of acoustic observations, how should we choose the string of words that has the highest posterior probability?