Bioinformatics 7: Sequence alignment and its significance Flashcards

1
Q

The 2 types of homolog and their differences?

A

ortholog - homologous genes in different species, separated via a speciation event

paralog - homologous genes separated via a duplication event

2
Q

What is meant by the orthology conjecture?

A

Orthologs are more likely to show functional conservation than paralogs

i.e. orthologous genes are usually related in function
paralogous genes arise by duplication and then tend to diverge (the 2 copies start with the same function, so one is free to change)

3
Q

What is meant by ‘chance similarity’?

A

Similarity between any two sequences that arises purely by chance

i.e. the sequences are not structurally or functionally related

4
Q

2 ways DNA sequences might differ?

A

Mismatches - created by substitutions
Gaps - created by indels (insertions/deletions)

5
Q

What is a dotplot in the context of alignment?

A

Matrix with one sequence along the rows and the other along the columns, marked wherever the row and column residues match

Used by alignment algorithms to find the most likely evolutionary pathway between the 2 sequences
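
A minimal sketch of how such a matrix could be built, assuming simple exact matching of single residues (the function names `dotplot` and `render` are made up for this illustration):

```python
def dotplot(seq_a: str, seq_b: str) -> list[list[bool]]:
    """Return a matrix that is True wherever seq_a[i] == seq_b[j]."""
    return [[a == b for b in seq_b] for a in seq_a]


def render(seq_a: str, seq_b: str) -> str:
    """Draw the dotplot as text: '*' marks a match, '.' a non-match."""
    matrix = dotplot(seq_a, seq_b)
    lines = ["  " + seq_b]  # column labels
    for residue, row in zip(seq_a, matrix):
        lines.append(residue + " " + "".join("*" if hit else "." for hit in row))
    return "\n".join(lines)


print(render("GATTACA", "GATCACA"))
```

Runs of matches along a diagonal correspond to regions where the two sequences align without gaps.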

6
Q

Which is more common - indels or substitutions? How does this affect alignment?

A

Substitutions are far more common than indels
-> alignment algorithms must account for this, so gaps are penalised more heavily than mismatches

thus the ‘quality’ of an alignment is assessed via a scoring matrix (e.g. matches +ve, mismatches 0, gaps -ve) -> algorithms maximise the score
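
As a rough illustration of this kind of scoring, the sketch below scores an already-aligned pair of sequences. The specific values (+1, 0, -1) and the name `score_alignment` are assumptions chosen for the example, not values given on the card:

```python
def score_alignment(row_a: str, row_b: str,
                    match: int = 1, mismatch: int = 0, gap: int = -1) -> int:
    """Score two aligned rows of equal length; '-' denotes a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    total = 0
    for a, b in zip(row_a, row_b):
        if a == "-" or b == "-":
            total += gap        # indel: negative score
        elif a == b:
            total += match      # identical residues: positive score
        else:
            total += mismatch   # substitution: neutral in this scheme
    return total


print(score_alignment("GATTACA", "GATCACA"))  # one substitution
print(score_alignment("GAT-ACA", "GATTACA"))  # one gap
```

With these example values the substitution-only alignment scores higher than the gapped one, which reflects substitutions being the more common (and so less penalised) event.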

7
Q

Types of gap penalty?

A

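Constant - fixed penalty per gap, regardless of its length
Proportional - penalty scales with gap length
Affine - gap-opening penalty plus a (usually smaller) gap-extension penalty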
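
The three schemes can be sketched as simple functions of gap length. The penalty values are arbitrary placeholders, and the affine form shown (opening cost covers the first position, each further position pays the extension cost) is one common convention among several:

```python
def constant_gap(length: int, penalty: int = 5) -> int:
    """Same cost for any gap, however long."""
    return penalty if length > 0 else 0


def proportional_gap(length: int, per_position: int = 2) -> int:
    """Cost grows linearly with the number of gap positions."""
    return per_position * max(length, 0)


def affine_gap(length: int, open_penalty: int = 5, extend_penalty: int = 1) -> int:
    """Opening a gap is expensive; extending it is cheaper."""
    if length <= 0:
        return 0
    return open_penalty + extend_penalty * (length - 1)


for n in (1, 2, 5):
    print(n, constant_gap(n), proportional_gap(n), affine_gap(n))
```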

8
Q

How would a penalty applied to an amino acid substitution vary in severity?

A

If the replacement amino acid has similar chemical properties (so function is likely preserved) -> low penalty

If it is completely different, the change is likely to be deleterious -> high penalty
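
A toy sketch of this idea, assuming a rough grouping of amino acids by chemical property. The grouping and the penalty values are illustrative only and are not taken from any real substitution matrix:

```python
# Rough, illustrative chemical groupings (one-letter amino acid codes).
CHEMICAL_GROUPS = {
    "hydrophobic": "AVLIMFW",
    "polar": "STNQYC",
    "positive": "KRH",
    "negative": "DE",
    "special": "GP",
}
RESIDUE_TO_GROUP = {
    residue: group
    for group, members in CHEMICAL_GROUPS.items()
    for residue in members
}


def substitution_penalty(a: str, b: str,
                         conservative: int = 1, radical: int = 4) -> int:
    """Return 0 for identity, a low penalty within a group, a high one across groups."""
    a, b = a.upper(), b.upper()
    if a == b:
        return 0
    if RESIDUE_TO_GROUP[a] == RESIDUE_TO_GROUP[b]:
        return conservative  # similar chemistry
    return radical           # different chemistry


print(substitution_penalty("L", "I"))  # both hydrophobic
print(substitution_penalty("L", "D"))  # hydrophobic vs negatively charged
```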

9
Q

How do heuristic algorithms work and why are they used over dynamic programming algorithms? Example of one?

A

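Heuristic methods assume high-scoring alignments contain short regions of exact matches

-> they break queries into short ‘words’ and look for matches above a threshold
-> initial hits are examined to see if they can be extended
-> the alignment is then scored to quantify similarity

Used because they are much faster than dynamic programming when searching large databases, though they are not guaranteed to find the optimal alignment

e.g. BLAST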
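
The word-matching step can be sketched as below. This is a simplification that only finds exact word matches (no scoring threshold), and the names `index_words` and `find_seeds` are made up for the example:

```python
from collections import defaultdict


def index_words(sequence: str, word_size: int) -> dict[str, list[int]]:
    """Map every word of length word_size to its start positions."""
    index: dict[str, list[int]] = defaultdict(list)
    for start in range(len(sequence) - word_size + 1):
        index[sequence[start:start + word_size]].append(start)
    return index


def find_seeds(query: str, subject: str, word_size: int = 3) -> list[tuple[int, int]]:
    """Return (query_pos, subject_pos) pairs where a query word occurs in the subject."""
    subject_index = index_words(subject, word_size)
    seeds = []
    for q_pos in range(len(query) - word_size + 1):
        word = query[q_pos:q_pos + word_size]
        for s_pos in subject_index.get(word, []):
            seeds.append((q_pos, s_pos))
    return seeds


print(find_seeds("ACGT", "TTACGTT"))
```

Looking words up in an index avoids comparing every query position against every subject position, which is where the speed advantage over dynamic programming comes from.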
10
Q

How does BLAST work?

A

BLAST = Basic Local Alignment Search Tool

word (W) size: 3 (proteins), 11 (DNA)
-> searches only for word matches scoring above a threshold, T
-> matches above T are extended (forming HSPs) until mismatches or gaps cause the alignment score to fall drastically
-> neighbouring HSPs are joined; HSPs in low-identity regions are not joined

High-scoring segment pairs (HSPs) along the query are reported, ordered by score
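
A simplified sketch of the extension step, assuming an ungapped extension that stops once the running score has dropped more than `max_drop` below the best score seen. The function names, the +1/-1 scores and the `max_drop` parameter are assumptions for illustration, not BLAST's actual settings:

```python
def _extend(query, subject, qi, si, step, match, mismatch, max_drop):
    """Walk in one direction; return (best score gain, length achieving it)."""
    running = best_gain = best_len = length = 0
    while 0 <= qi < len(query) and 0 <= si < len(subject):
        running += match if query[qi] == subject[si] else mismatch
        length += 1
        if running > best_gain:
            best_gain, best_len = running, length
        elif best_gain - running > max_drop:
            break  # score has fallen too far: stop extending
        qi += step
        si += step
    return best_gain, best_len


def extend_seed(query, subject, q_pos, s_pos, word_size,
                match=1, mismatch=-1, max_drop=2):
    """Extend an exact word hit both ways; return (q_start, s_start, length, score)."""
    right_gain, right_len = _extend(query, subject, q_pos + word_size,
                                    s_pos + word_size, 1, match, mismatch, max_drop)
    left_gain, left_len = _extend(query, subject, q_pos - 1, s_pos - 1,
                                  -1, match, mismatch, max_drop)
    score = word_size * match + left_gain + right_gain
    return (q_pos - left_len, s_pos - left_len,
            left_len + word_size + right_len, score)


# Seed "ACG" found at position 2 in both sequences.
print(extend_seed("TTACGTAA", "GGACGTAC", 2, 2, 3))
```

The returned segment is trimmed back to the point where the score peaked, so trailing mismatches are not included in the reported pair.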

11
Q

Types of BLAST searches and their uses?

A

blastn: nucleotide query vs nucleotide db (what gene is this?)
blastp: protein query vs protein db (what protein is this?)
blastx: translated nucleotide query vs protein db (does this DNA code for a known protein?)
tblastn: protein query vs translated nucleotide db (what DNA might encode this protein?)
tblastx: translated nucleotide query vs translated nucleotide db (does this DNA code for a novel protein?)
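
The five programs can be summarised as a small lookup table. This is only a study aid encoding the list above; the `BlastProgram` type and the `candidate_programs` helper are hypothetical and do not call any BLAST software:

```python
from typing import NamedTuple


class BlastProgram(NamedTuple):
    query: str                # type of sequence supplied as the query
    database: str             # type of sequence stored in the database
    translate_query: bool     # query translated to protein before comparison
    translate_database: bool  # database translated to protein before comparison


PROGRAMS = {
    "blastn": BlastProgram("nucleotide", "nucleotide", False, False),
    "blastp": BlastProgram("protein", "protein", False, False),
    "blastx": BlastProgram("nucleotide", "protein", True, False),
    "tblastn": BlastProgram("protein", "nucleotide", False, True),
    "tblastx": BlastProgram("nucleotide", "nucleotide", True, True),
}


def candidate_programs(query_type: str, database_type: str) -> list[str]:
    """List the programs that accept this combination of query and database."""
    return [name for name, p in PROGRAMS.items()
            if p.query == query_type and p.database == database_type]


print(candidate_programs("nucleotide", "protein"))
print(candidate_programs("nucleotide", "nucleotide"))
```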

12
Q

What is FASTA? How does it compare to BLAST?

A

Heuristic algorithm
AND sequence format (single description line starting with ‘>’, followed by sequence data)

-> more sensitive to distant relationships but slower than BLAST
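
For the file format, a minimal parsing sketch is shown below. It assumes each record starts with a ‘>’ description line and that sequence data may wrap over several lines; `parse_fasta` is a made-up helper, not a library function:

```python
def parse_fasta(text: str) -> dict[str, str]:
    """Return {description: sequence} for each record in FASTA-formatted text."""
    records: dict[str, list[str]] = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue                      # skip blank lines
        if line.startswith(">"):
            header = line[1:]             # description line, without '>'
            records[header] = []
        elif header is not None:
            records[header].append(line)  # sequence data, possibly wrapped
    return {name: "".join(parts) for name, parts in records.items()}


example = ">seq1 example\nACGT\nTTGA\n>seq2\nGGCC\n"
print(parse_fasta(example))
```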
13
Q

In the context of a search result, what is a P value and an E value?

A

P value = probability of observing an alignment score at least this high between 2 unrelated sequences of the same length + composition

E (Expect) value = number of matches with at least this score expected to occur by chance in a db of this size (lower E = more significant)
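
The two values are linked through the standard Karlin-Altschul statistics, E = K·m·n·e^(−λS) and P = 1 − e^(−E), where m and n are the query and database lengths and S is the raw score. The sketch below uses placeholder values for K and λ; in practice these depend on the scoring system, so treat the defaults here as assumptions:

```python
import math


def expect_value(score: float, query_len: int, db_len: int,
                 k: float = 0.134, lam: float = 0.318) -> float:
    """E = K * m * n * exp(-lambda * S); k and lam are placeholder parameters."""
    return k * query_len * db_len * math.exp(-lam * score)


def p_value(e_value: float) -> float:
    """P = 1 - exp(-E): chance of at least one hit scoring this high."""
    return -math.expm1(-e_value)  # numerically stable form of 1 - exp(-E)


e = expect_value(score=60, query_len=100, db_len=1_000_000)
print(e, p_value(e))
```

For small E the two values are nearly equal, and E grows in proportion to database size, which is why the same score is less significant against a larger db.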
