Lecture 2 Flashcards
(34 cards)
When comparing two nucleotide sequences, what do we need to keep in mind?
We have to keep in mind that they are the result of mutation during replication (genotypic level) and selection (phenotypic level).
Could you give an example of how evolution occurs at the genotypic and phenotypic levels?
At the sequence level, GAC changes to AAC (genotype). This mutation results in the antibody no longer binding as before, and therefore it binds to HIV instead (phenotype).
How many nucleotides are there for RNA and DNA?
Four each. DNA: T, C, A, G; RNA: U, C, A, G
How many amino acids are there?
20
What is a codon?
Three nucleotides that encode one amino acid
To change the phenotype, at least how many nucleotides should be changed?
1
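A minimal sketch of why one nucleotide can be enough, using the GAC to AAC example from the earlier card. The table below is only a three-codon excerpt of the standard genetic code, and the names CODON_TABLE and count_differences are made up for this illustration.

```python
# Excerpt of the standard genetic code (DNA codons); only three entries,
# chosen for this illustration.
CODON_TABLE = {
    "GAC": "Asp",  # aspartate
    "GAT": "Asp",  # aspartate (synonymous with GAC)
    "AAC": "Asn",  # asparagine
}

def count_differences(codon_a, codon_b):
    """Count the positions at which two codons of equal length differ."""
    return sum(1 for x, y in zip(codon_a, codon_b) if x != y)

def changes_amino_acid(codon_a, codon_b):
    """True if the two codons encode different amino acids."""
    return CODON_TABLE[codon_a] != CODON_TABLE[codon_b]

if __name__ == "__main__":
    # One changed nucleotide, different amino acid.
    print(count_differences("GAC", "AAC"), changes_amino_acid("GAC", "AAC"))
    # One changed nucleotide, same amino acid (a silent change).
    print(count_differences("GAC", "GAT"), changes_amino_acid("GAC", "GAT"))
```

Note that the reverse does not hold: a single changed nucleotide does not always change the amino acid, as the GAC versus GAT pair shows.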
What could happen to DNA/RNA when they are copied?
mutations, insertions, deletions, repeats, inversions, inverted repeats
To compare two sequences, what do we need to know about them initially?
which positions in the sequences correspond to each other.
A correct alignment represents ____ events such as ____ and ____ (____ and ____)
actual, substitutions, indels, insertions and deletions
Sequences with shared ancestry are referred to as what?
homologous
To be able to align sequences, what idea do we rely on?
That there is a common ancestor from which the genes evolved. The ancestor had a certain nucleotide or amino acid at a certain position, which could have changed during the evolutionary history.
Could we be sure of the alignment?
No, but we choose the model with the highest probability or score.
What are the types of alignments?
Pairwise: protein/protein, DNA/DNA, RNA/RNA, or DNA or RNA against a protein (which has to account for shifts within a codon)
Multiple sequence alignment
What are two strategies for alignments? Explain each.
Global: aligns one sequence to the other from start to end. Local: finds the subsequences with the highest similarity.
What are strategies for finding a pairwise sequence alignment?
(i.e. the different methods)
Qualitative method: dot-matrix method
Exact methods via dynamic programming: Needleman-Wunsch for global and Smith-Waterman for local alignment
Heuristic and fast methods: word methods like BLAST
Align CTG and CTAAG, CTAAGAAG and CTAAG, and ATC and CTAAG using the dot-matrix method, and say what each pattern shows.
slide 15
What are the pros and cons of the dot-matrix method?
Pros: it is visually easy to identify sequence features such as indels, repeats, inversions and inverted repeats. Cons: it is time-consuming and, being qualitative, does not give an optimal alignment.
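As a rough illustration of the dot-matrix idea (not a reproduction of the slide), the sketch below marks every cell where the two characters are identical. The function names dot_matrix and show, and the '*' and '.' symbols, are assumptions made for this example.

```python
def dot_matrix(seq_a, seq_b):
    """Return one string per character of seq_a (rows); seq_b runs along the
    columns. '*' marks identical characters, '.' marks different ones."""
    return ["".join("*" if x == y else "." for y in seq_b) for x in seq_a]

def show(seq_a, seq_b):
    """Print the dot matrix with seq_b as the header and seq_a down the side."""
    print("  " + seq_b)
    for ch, row in zip(seq_a, dot_matrix(seq_a, seq_b)):
        print(ch + " " + row)

if __name__ == "__main__":
    for a, b in [("CTG", "CTAAG"), ("CTAAGAAG", "CTAAG"), ("ATC", "CTAAG")]:
        show(a, b)
        print()
```

In the CTG versus CTAAG plot the diagonal is interrupted and resumes two columns further right, the kind of shift usually read as an indel. In the ATC versus CTAAG plot the dots run along an anti-diagonal, which is the usual hint of an inversion.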
In quantitative methods, how do we decide whether to accept, for instance, a mutation or a gap?
By assigning costs to different actions in the alignment process
What are 3 possibilities of characters being compared at one position?
match, mismatch, gap
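To make the idea of costs concrete, here is a small sketch that scores an already aligned pair column by column. The scoring values (match = +1, mismatch = -1, gap = -1) and the function name score_alignment are assumptions for illustration; the lecture may use a different scheme.

```python
def score_alignment(top, bottom, match=1, mismatch=-1, gap=-1):
    """Sum the column scores of two aligned strings of equal length.
    '-' denotes a gap. The default values are assumed, not from the lecture."""
    if len(top) != len(bottom):
        raise ValueError("aligned sequences must have the same length")
    total = 0
    for x, y in zip(top, bottom):
        if x == "-" or y == "-":
            total += gap        # one character faces a gap
        elif x == y:
            total += match      # identical characters
        else:
            total += mismatch   # different characters
    return total

if __name__ == "__main__":
    print(score_alignment("AATC", "AGAC"))    # no gaps: 2 matches, 2 mismatches
    print(score_alignment("A-ATC", "AGA-C"))  # 3 matches, 2 gaps
```

Under these assumed costs the gapped alignment scores higher (1 versus 0), so it would be preferred; with a more expensive gap penalty the ungapped one would win.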
What is the total number of alignments between a sequence of length m and a sequence of length n<m? Explain how you got to this.
slide 21
What is dynamic programming?
It means breaking a more complicated problem down into simpler subproblems, solving them in a recursive manner, and finding the optimal solution to each subproblem.
Explain how the Smith-Waterman algorithm works. How many steps does it require?
Given two sequences A and B, we calculate the optimal alignment up to each point only once. We build a matrix for the two sequences, with seqA as rows and seqB as columns. The field (i, j) holds the score of the optimal alignment with the nucleotides a_i and b_j as the end of the alignment. Finally, we find the best path through the complete matrix.
It requires m*n steps.
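The matrix filling and traceback can be sketched as follows. This is a minimal version with a linear gap penalty; the scoring values (match = +1, mismatch = -1, gap = -1) and the function name smith_waterman are assumptions, not taken from the lecture.

```python
def smith_waterman(a, b, match=1, mismatch=-1, gap=-1):
    """Local alignment sketch. Returns (best score, aligned a, aligned b).
    Scoring values are assumed defaults."""
    m, n = len(a), len(b)
    # H[i][j]: best score of a local alignment ending at a[i-1] and b[j-1].
    H = [[0] * (n + 1) for _ in range(m + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, m + 1):          # m*n cells, each filled exactly once
        for j in range(1, n + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,                      # start a new local alignment
                          H[i - 1][j - 1] + s,    # match or mismatch
                          H[i - 1][j] + gap,      # gap in b
                          H[i][j - 1] + gap)      # gap in a
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the highest-scoring cell until a zero is reached.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))

if __name__ == "__main__":
    print(smith_waterman("GGCTAAGTT", "CTAAG"))
```

The zero in the max is what makes the alignment local: a cell never carries a negative score forward, so the best region can start anywhere in either sequence.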
What are the pros and cons of the Smith-Waterman algorithm?
Pros: it is fast compared with brute force, and it finds the optimal alignment (or one of several alignments with the same highest score). Cons: it is only for local alignments, only pairwise alignment is possible, and it is still too slow for scanning against big libraries.
Align AATC and AGAC using the Needleman-Wunsch algorithm.
slide 31c
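For self-checking, a minimal Needleman-Wunsch sketch is given below. The scoring scheme (match = +1, mismatch = -1, gap = -1) and the function name needleman_wunsch are assumptions; if the slide uses different costs, the resulting score and alignment may differ from the ones produced here.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment sketch. Returns (score, aligned a, aligned b).
    Scoring values are assumed defaults, not taken from the slides."""
    m, n = len(a), len(b)
    F = [[0] * (n + 1) for _ in range(m + 1)]
    # First column and row: aligning a prefix against nothing costs only gaps.
    for i in range(1, m + 1):
        F[i][0] = i * gap
    for j in range(1, n + 1):
        F[0][j] = j * gap
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            F[i][j] = max(F[i - 1][j - 1] + s,   # match or mismatch
                          F[i - 1][j] + gap,     # gap in b
                          F[i][j - 1] + gap)     # gap in a
    # Trace back from the bottom-right corner to the top-left corner.
    i, j = m, n
    top, bottom = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            s = match if a[i - 1] == b[j - 1] else mismatch
            if F[i][j] == F[i - 1][j - 1] + s:
                top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
                continue
        if i > 0 and F[i][j] == F[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return F[m][n], "".join(reversed(top)), "".join(reversed(bottom))

if __name__ == "__main__":
    score, top, bottom = needleman_wunsch("AATC", "AGAC")
    print(score)
    print(top)
    print(bottom)
```

Unlike the local variant, the matrix here has no zero floor and the traceback always runs from the last cell to the first, so both sequences are aligned from start to end.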