Lecture 6 Flashcards

1
Q

What is Multiple Sequence Alignment?

A

Multiple Sequence Alignment (MSA) is a technique used to align three or more biological sequences, such as DNA, RNA, or proteins. It reveals evolutionary relationships, conserved regions, and functional similarities among the sequences. MSA helps identify similarities and differences between sequences, facilitating various biological analyses and understanding of sequence conservation and evolution.

MSA is used to compare homologous sequences:
* Homologs: genes related to each other by descent from a
common ancestral DNA sequence
-> Orthologs: genes in different species that evolved from a
common ancestral gene by speciation; they normally retain
the same function
-> Paralogs: genes related by duplication within a genome;
they may acquire new functions

2
Q

What is progressive alignment?

A

Progressive alignment is a step-by-step approach in multiple sequence alignment. It aligns pairs of sequences and incorporates additional sequences using information from previous alignments. It is computationally efficient but may not always produce the optimal alignment.

  • combines pairwise alignments, starting with the most similar sequences
  • builds an initial guide tree, then adds more sequences along it
  • not guaranteed to be globally optimal
  • errors made at the beginning may propagate to the end
  • examples: ClustalW, MAFFT (fast but might give more
    errors), T-Coffee (slow but very accurate)
  • state of the art: Clustal Omega
  • tradeoff between speed and accuracy …
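
The "most similar first" idea can be sketched in a few lines. This is only an illustration under simplifying assumptions: `percent_identity` and `join_order` are hypothetical helper names, similarity is a crude position-by-position identity on unaligned sequences, and real tools build a proper guide tree from pairwise alignment or k-mer distances.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Crude similarity: fraction of identical positions over the shorter length."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / n


def join_order(seqs: dict) -> list:
    """Greedy order in which sequences would be added to the alignment."""
    names = list(seqs)
    # Start with the most similar pair.
    first = max(
        combinations(names, 2),
        key=lambda p: percent_identity(seqs[p[0]], seqs[p[1]]),
    )
    order = list(first)
    remaining = [n for n in names if n not in order]
    # Then repeatedly add the sequence closest to anything already placed.
    while remaining:
        nxt = max(
            remaining,
            key=lambda n: max(percent_identity(seqs[n], seqs[m]) for m in order),
        )
        order.append(nxt)
        remaining.remove(nxt)
    return order


example = {"a": "ACGTACGT", "b": "ACGTACGA", "c": "TTTTACGT", "d": "GGGGGGGG"}
print(join_order(example))
```

Because the order is fixed greedily, a poor early choice is never revisited, which is why early errors can propagate.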
3
Q

What are iterative alignment methods?

A

Iterative alignment methods refine multiple sequence alignments by iteratively adjusting alignment positions based on alignment scores and information from previous alignments. They improve alignment accuracy by capturing finer sequence similarities and resolving ambiguities.

  • similar to progressive methods
  • but may realign the initial alignments and then restart from
    slightly earlier in the process
  • examples: MUSCLE, DIALIGN
  • improve accuracy at the cost of efficiency compared with
    progressive methods
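
An iterative method needs some score to decide whether a realignment is an improvement. One common choice is a sum-of-pairs score; the sketch below assumes simple illustrative values (match +1, mismatch -1, gap -2) rather than a real substitution matrix, and `sum_of_pairs` is a hypothetical helper name.

```python
from itertools import combinations


def sum_of_pairs(alignment, match=1, mismatch=-1, gap=-2):
    """Score an alignment by summing pairwise scores in every column."""
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all rows of an alignment must have the same length")
    score = 0
    for column in zip(*alignment):
        for x, y in combinations(column, 2):
            if x == "-" and y == "-":
                continue  # gap against gap is ignored
            if x == "-" or y == "-":
                score += gap
            elif x == y:
                score += match
            else:
                score += mismatch
    return score


before = ["ACGT", "A-GT", "AC-T"]
after = ["ACGT", "A-GT", "A-CT"]  # third row realigned
# A refinement step would keep the realignment only if the score improves.
print(sum_of_pairs(before), sum_of_pairs(after))
```

Scoring every candidate realignment is what makes these methods slower than a single progressive pass.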
4
Q

What are Consensus Alignment Methods?

A

Consensus alignment methods combine multiple alignments to generate a representative alignment. They calculate a consensus sequence or profile and construct an alignment that represents the agreement among the individual alignments. These methods help reduce biases and capture reliable regions in the alignment.

  • try out a set of different alignment methods
  • take the best one found
  • M-Coffee uses seven different alignment methods
  • MergeAlign by default combines alignments built with 91 different models of protein sequence evolution
  • if time and CPU are not an issue, try them out …
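
"Agreement among the individual alignments" can be made concrete by counting how many input alignments place the same two residues in the same column. The sketch below is a simplified illustration of that idea, not the actual M-Coffee or MergeAlign implementation; `aligned_pairs` and `pair_support` are hypothetical names.

```python
from collections import Counter
from itertools import combinations


def aligned_pairs(alignment):
    """Return pairs ((row, residue_index), (row, residue_index)) sharing a column."""
    seen = [0] * len(alignment)  # residues consumed so far in each row
    pairs = set()
    for column in zip(*alignment):
        present = []
        for row, char in enumerate(column):
            if char != "-":
                present.append((row, seen[row]))
                seen[row] += 1
        pairs.update(combinations(present, 2))
    return pairs


def pair_support(alignments):
    """Count in how many alignments each residue pair is aligned."""
    support = Counter()
    for alignment in alignments:
        support.update(aligned_pairs(alignment))
    return support


# Two alternative alignments of the same sequences ACT and ACGT.
support = pair_support([["AC-T", "ACGT"], ["A-CT", "ACGT"]])
for pair, count in sorted(support.items()):
    print(pair, count)
```

Residue pairs supported by every input alignment mark the reliable regions; pairs supported by only one method are the ambiguous ones.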
5
Q

Clustal Omega (State of the Art).

A

Clustal Omega is a widely used multiple sequence alignment (MSA) program and tool for aligning three or more biological sequences, such as DNA, RNA, or protein sequences. It is an advanced version and successor of the popular ClustalW program.

  • most often used tool for multiple alignments
  • provides cladograms (not evolutionary trees) or phylograms
  • An * (asterisk) indicates positions which have a single, fully conserved residue.
  • A : (colon) indicates conservation between groups of strongly similar properties, roughly equivalent to scoring > 0.5 in the Gonnet PAM 250 matrix (STA, NEQK, …)
  • A . (period) indicates conservation between groups of weakly similar properties, roughly equivalent to scoring <= 0.5 and > 0 in the Gonnet PAM 250 matrix (CSA, ATV, …)
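
The symbol rules above can be sketched as a small function. This is an illustration only: the group lists contain just the groups named on this card (the full lists are longer), the function names are hypothetical, and leaving columns with gaps unmarked is an assumption of the sketch.

```python
# Partial lists: only the groups named on this card.
STRONG_GROUPS = ["STA", "NEQK"]
WEAK_GROUPS = ["CSA", "ATV"]


def conservation_symbol(column):
    """Return '*', ':', '.' or ' ' for one alignment column."""
    residues = set(column)
    if "-" in residues:
        return " "  # assumption: columns with gaps get no symbol
    if len(residues) == 1:
        return "*"  # single, fully conserved residue
    if any(residues <= set(group) for group in STRONG_GROUPS):
        return ":"  # all residues fall within one strong group
    if any(residues <= set(group) for group in WEAK_GROUPS):
        return "."  # all residues fall within one weak group
    return " "


def conservation_line(alignment):
    """Build the symbol line printed under an alignment block."""
    return "".join(conservation_symbol(col) for col in zip(*alignment))


rows = ["ASCW", "ATSW", "AAAW"]
print("\n".join(rows))
print(conservation_line(rows))
```

Strong groups are checked before weak ones, so a column that fits both is marked with a colon.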
6
Q

What are sequence motifs?

A

Sequence motifs are short, conserved patterns or sequences of nucleotides (in DNA or RNA) or amino acids (in proteins) that are found repeatedly in biological sequences. These motifs are often associated with specific biological functions, structural elements, or binding sites.

  • infer structure from sequence, and function from structure
  • motifs can be used to identify protein families
  • databases: primary and secondary data
    – biophysical properties are used to predict, for instance, transmembrane proteins
    – PROSITE patterns: from sequence via biophysical properties to signatures and motifs
    SEQUENCE -> BIOPHYSICAL PROPERTIES -> STRUCTURE -> SIGNATURE
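
A PROSITE pattern can be searched for by translating it into a regular expression. The sketch below handles only the common subset of the syntax (`x`, `[...]`, `{...}`, repetition counts and the `<` / `>` anchors); `prosite_to_regex` is a hypothetical helper, the test sequence is made up, and the pattern used is the N-glycosylation site N-{P}-[ST]-{P}, which is not part of the lecture notes.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE pattern into a Python regular expression."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # {P} means "any residue except P"
        element = element.replace("{", "[^").replace("}", "]")
        # x(2,4) means "2 to 4 arbitrary residues"
        element = element.replace("(", "{").replace(")", "}")
        element = element.replace("x", ".")
        # < and > anchor the pattern to the N- or C-terminus
        element = element.replace("<", "^").replace(">", "$")
        parts.append(element)
    return "".join(parts)


regex = prosite_to_regex("N-{P}-[ST]-{P}.")
sequence = "MNGTAANPSA"  # made-up example sequence
for match in re.finditer(regex, sequence):
    print(match.start(), match.group())
```

The second N in the example sequence is followed by P, so only the first site matches.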

(Learn the single-letter code of the amino acids)

7
Q

What is a Cladogram?

A

A cladogram (from Greek clados "branch" and gramma "character") is a diagram used in cladistics to show relations among organisms. A cladogram is not, however, an evolutionary tree because it does not show how ancestors are related to descendants, nor does it show how much they have changed; many evolutionary trees can be inferred from a single cladogram.
