Hidden Markov Models Flashcards

1
Q

What types of probability are involved in creating an HMM?

A
  1. Conditional probability
  2. Joint probability
  3. Marginal probability
2
Q

What do we need to calculate the probability of observing a sequence?

A
  1. The model
  2. Model parameters (transition and emission probabilities)
  3. The hidden state at each step (in the coin-toss example, the coin used for each toss)
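
When all three ingredients are known, the probability of the tosses together with the coins that produced them is a product of start, transition, and emission terms. A minimal sketch, assuming a hypothetical two-coin model (fair coin "F", biased coin "B"); the numbers are illustrative and not from the deck:

```python
# Hypothetical two-coin model; all probabilities below are made-up examples.
start = {"F": 0.5, "B": 0.5}
transition = {"F": {"F": 0.9, "B": 0.1}, "B": {"F": 0.1, "B": 0.9}}
emission = {"F": {"H": 0.5, "T": 0.5}, "B": {"H": 0.75, "T": 0.25}}


def joint_probability(observations, states):
    """P(observations, states): start term times transition * emission terms."""
    if len(observations) != len(states):
        raise ValueError("need exactly one hidden state per observation")
    if not observations:
        return 1.0
    # First toss: probability of starting with this coin and seeing this face.
    prob = start[states[0]] * emission[states[0]][observations[0]]
    # Later tosses: switch (or keep) the coin, then emit the observed face.
    for prev, cur, obs in zip(states, states[1:], observations[1:]):
        prob *= transition[prev][cur] * emission[cur][obs]
    return prob


p = joint_probability("HH", "FF")  # two heads, both from the fair coin
```

If the coin used for each toss is not known, this joint probability has to be summed over every possible coin sequence.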
3
Q

How probable are the observations under a specified model?

A

Forward algorithm (it sums the probability over all possible hidden state paths)
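
The forward recursion can be sketched as follows. This is a minimal, unscaled version for short sequences, assuming dictionary-based parameters and a hypothetical two-coin model (the names and numbers are illustrative, not from the deck):

```python
def forward(observations, states, start, transition, emission):
    """Return P(observations | model), summed over all hidden state paths."""
    if not observations:
        raise ValueError("observations must not be empty")
    # alpha[s] = P(observations so far, current hidden state is s)
    alpha = {s: start[s] * emission[s][observations[0]] for s in states}
    for obs in observations[1:]:
        alpha = {
            cur: emission[cur][obs]
            * sum(alpha[prev] * transition[prev][cur] for prev in states)
            for cur in states
        }
    return sum(alpha.values())


# Hypothetical two-coin model: fair coin "F", biased coin "B".
states = ("F", "B")
start = {"F": 0.5, "B": 0.5}
transition = {"F": {"F": 0.9, "B": 0.1}, "B": {"F": 0.1, "B": 0.9}}
emission = {"F": {"H": 0.5, "T": 0.5}, "B": {"H": 0.75, "T": 0.25}}

likelihood = forward("HH", states, start, transition, emission)
```

For long sequences the products underflow, so practical implementations usually rescale alpha at each step or work with log probabilities.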

4
Q

What are the most probable hidden states of a model for the observations?

A

Viterbi algorithm (it uses dynamic programming to find the single most probable path of hidden states, without enumerating every possible path)
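
A minimal sketch of the Viterbi recursion with backpointers, assuming dictionary-based parameters and a hypothetical two-coin model (names and numbers are illustrative, not from the deck):

```python
def viterbi(observations, states, start, transition, emission):
    """Return (most probable hidden state path, joint probability of that path)."""
    if not observations:
        raise ValueError("observations must not be empty")
    # best[s] = probability of the best path so far that ends in state s
    best = {s: start[s] * emission[s][observations[0]] for s in states}
    backpointers = []
    for obs in observations[1:]:
        step, new_best = {}, {}
        for cur in states:
            # Keep only the best predecessor (max), where forward would sum.
            prev = max(states, key=lambda p: best[p] * transition[p][cur])
            step[cur] = prev
            new_best[cur] = best[prev] * transition[prev][cur] * emission[cur][obs]
        backpointers.append(step)
        best = new_best
    # Trace back from the best final state.
    last = max(states, key=lambda s: best[s])
    path = [last]
    for step in reversed(backpointers):
        path.append(step[path[-1]])
    path.reverse()
    return path, best[last]


# Hypothetical two-coin model: fair coin "F", biased coin "B".
states = ("F", "B")
start = {"F": 0.5, "B": 0.5}
transition = {"F": {"F": 0.9, "B": 0.1}, "B": {"F": 0.1, "B": 0.9}}
emission = {"F": {"H": 0.5, "T": 0.5}, "B": {"H": 0.75, "T": 0.25}}

path, prob = viterbi("HHHH", states, start, transition, emission)
```

The structure mirrors the forward algorithm; the only change is taking a maximum over predecessors and remembering which one won.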

5
Q

How can we learn the HMM parameters given a set of sequences?

A
  1. Training with the forward-backward algorithm
  2. Baum-Welch expectation maximization
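
The two items fit together: Baum-Welch uses forward-backward in its expectation step and then re-estimates the parameters from the expected counts. A minimal sketch of one iteration on a single integer-coded sequence, assuming list-based parameters (`pi` start, `A` transition, `B` emission; these names and the toy numbers are assumptions) and no scaling, so it only suits short sequences:

```python
def forward_backward(obs, pi, A, B):
    """Return the forward table, the backward table, and P(obs)."""
    n, T = len(pi), len(obs)
    alpha = [[0.0] * n for _ in range(T)]
    beta = [[1.0] * n for _ in range(T)]
    for i in range(n):
        alpha[0][i] = pi[i] * B[i][obs[0]]
    for t in range(1, T):
        for j in range(n):
            alpha[t][j] = B[j][obs[t]] * sum(alpha[t - 1][i] * A[i][j] for i in range(n))
    for t in range(T - 2, -1, -1):
        for i in range(n):
            beta[t][i] = sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(n))
    return alpha, beta, sum(alpha[T - 1])


def baum_welch_step(obs, pi, A, B):
    """One EM iteration; returns the re-estimated (pi, A, B)."""
    n, m, T = len(pi), len(B[0]), len(obs)
    alpha, beta, likelihood = forward_backward(obs, pi, A, B)
    # E-step: gamma[t][i] = P(hidden state i at time t | obs)
    gamma = [[alpha[t][i] * beta[t][i] / likelihood for i in range(n)] for t in range(T)]
    # xi_sum[i][j] = expected number of i -> j transitions
    xi_sum = [[sum(alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                   for t in range(T - 1)) / likelihood
               for j in range(n)] for i in range(n)]
    # M-step: turn expected counts into probabilities
    new_pi = gamma[0][:]
    new_A = [[xi_sum[i][j] / sum(xi_sum[i]) for j in range(n)] for i in range(n)]
    new_B = [[sum(gamma[t][i] for t in range(T) if obs[t] == k)
              / sum(gamma[t][i] for t in range(T))
              for k in range(m)] for i in range(n)]
    return new_pi, new_A, new_B


# Toy data: 0 = heads, 1 = tails; the starting guesses are arbitrary.
obs = [0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1]
pi, A, B = [0.6, 0.4], [[0.7, 0.3], [0.4, 0.6]], [[0.6, 0.4], [0.3, 0.7]]
for _ in range(5):
    pi, A, B = baum_welch_step(obs, pi, A, B)
```

With a set of sequences, the expected counts are summed across all sequences before normalising. Each iteration cannot decrease the likelihood, but it may converge to a local optimum that depends on the starting guesses.
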
6
Q

Why are CpG dinucleotides underrepresented in the genome?

A

Because the cytosine in a CpG is often modified by methylation, and methylated C easily mutates into T (by deamination)
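
One common way to quantify this underrepresentation is the observed/expected CpG ratio: the number of CG dinucleotides divided by the number expected if C and G occurred independently. A minimal sketch; the function name is an assumption for illustration:

```python
def cpg_observed_expected(sequence):
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G)."""
    seq = sequence.upper()
    c_count, g_count = seq.count("C"), seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0  # the ratio is undefined without both bases; report 0
    return seq.count("CG") * len(seq) / (c_count * g_count)


ratio = cpg_observed_expected("GCGC")
```

A ratio well below 1 means CpG is depleted relative to the base composition, while regions where methylation is suppressed tend to score much closer to 1.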

7
Q

Where is methylation suppressed?

A

Around promoters and start regions of genes. There is a higher frequency of CpG islands in these regions.

8
Q

How do we build an HMM for sequence profiles?

A
  1. Use an MSA to find conserved regions associated with signalling, structure, or activity
  2. Use the MSA to train a sequence HMM profile
  3. Search for similar sequences that have a good fit to the HMM profile
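
The training and scoring steps can be sketched in a heavily simplified form. The example below keeps only the match-state emissions (one probability table per alignment column) and omits the insert and delete states of a real profile HMM; the function names, the pseudocount, and the uniform background are assumptions for illustration:

```python
import math


def train_profile(alignment, alphabet, pseudocount=1.0):
    """Per-column emission probabilities from equal-length aligned sequences."""
    profile = []
    for column in zip(*alignment):
        # Pseudocounts keep unseen residues from getting probability zero.
        counts = {a: pseudocount for a in alphabet}
        for residue in column:
            if residue in counts:  # skips gap characters such as "-"
                counts[residue] += 1
        total = sum(counts.values())
        profile.append({a: counts[a] / total for a in alphabet})
    return profile


def log_odds_score(sequence, profile, alphabet):
    """Log-odds of the sequence under the profile versus a uniform background."""
    if len(sequence) != len(profile):
        raise ValueError("this sketch only scores sequences of profile length")
    background = 1.0 / len(alphabet)
    return sum(math.log(col[res] / background) for col, res in zip(profile, sequence))


msa = ["ACGT", "ACGT", "ACGA"]  # toy alignment
profile = train_profile(msa, "ACGT")
score = log_odds_score("ACGT", profile, "ACGT")
```

A positive score means the sequence fits the conserved columns better than chance; a real profile HMM additionally models insertions and deletions relative to the alignment.
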
9
Q

How do we find similar sequences?

A
  1. Search proteome databases
  2. For an unknown sequence, compute the probability that it was generated by the sequence profile model, and use a threshold to decide whether it belongs to the family
  3. If likelihood > threshold, add to the protein family and update/train the sequence profile with the new sequence
  4. Iterate
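
The loop above can be sketched generically. Here `train` and `score` are passed in as functions, and the toy versions below (a consensus string and the fraction of matching positions) are loud stand-ins for real profile HMM training and likelihood scoring; all names are assumptions for illustration:

```python
from collections import Counter


def iterative_search(seed_family, database, train, score, threshold, max_rounds=10):
    """Grow a family by repeatedly retraining a profile and rescanning a database.

    Assumes database entries are unique.
    """
    family = list(seed_family)
    for _ in range(max_rounds):
        profile = train(family)
        hits = [s for s in database if s not in family and score(s, profile) > threshold]
        if not hits:
            break  # converged: no new sequence passes the threshold
        family.extend(hits)
    return family


def toy_train(family):
    """Stand-in for profile training: most common residue per column."""
    return [Counter(column).most_common(1)[0][0] for column in zip(*family)]


def toy_score(sequence, profile):
    """Stand-in for a likelihood: fraction of positions matching the consensus."""
    return sum(a == b for a, b in zip(sequence, profile)) / len(profile)


family = iterative_search(["ACGT", "ACGT"], ["ACGA", "TTTT"], toy_train, toy_score, 0.7)
```

The threshold controls the trade-off: set it too low and unrelated sequences enter the family and corrupt the profile in later rounds.
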