Sequence alignment (long ver p2) Flashcards

1
Q

Examples of Pairwise alignment software

A

EMBL-EBI Pairwise Sequence Alignment
BLAST

2
Q

What are the different applications of Pairwise Alignment?

A

measuring sequence similarity
studying the evolution of sequences
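As an illustration of measuring sequence similarity, here is a minimal sketch that computes percent identity for two sequences that have already been aligned. The function name `percent_identity` and the convention of skipping columns where both sequences have a gap are assumptions made for this example, not something the deck specifies.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of compared alignment columns in which both sequences agree.

    Assumes `a` and `b` are already aligned (equal length, '-' for gaps).
    Columns where both sequences have a gap are skipped.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            continue  # nothing to compare in this column
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0
```

Other definitions of percent identity exist (for example, dividing by the length of the shorter sequence), so the denominator used here is only one possible choice.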

3
Q

share a common evolutionary ancestor

A

Homologous sequences

4
Q

True or False: Homologous sequences do not share a significantly related 3D structure but do share the same evolutionary ancestor

A

False

Homologous sequences share a significantly related 3D structure as well as a common evolutionary ancestor.

5
Q
A
6
Q

usually share significant amino acid/nucleotide identity

A

homologous sequences

7
Q

sequence regions that are homologous are also called

A

conserved regions

8
Q

sequences that share a common evolutionary ancestry

A

homologs

9
Q

derived from a single ancestral gene in the last common ancestor

A

orthologs

10
Q

homologous genes with the same function in different organisms, separated only by speciation

A

orthologs

11
Q

two or more homologous genes found within a single species

A

paralogs

12
Q

separated by a gene duplication event

A

paralogs

13
Q

If a gene in an organism is duplicated and transposed so that the two copies occupy different positions in the same genome, then the two copies are _

A

paralogous

14
Q

create gene families

A

paralogs

15
Q

consists of two or more copies of paralogous genes within the genome of a single organism

A

gene families

16
Q

True or False: Biological sequences do not occur in families

A

False

Biological sequences often occur in families.

17
Q

related genes within an organism

A

paralogs

18
Q

sequences within a population

A

polymorphic variants

19
Q

genes in other species

A

orthologs

20
Q

True or false: Homologous sequences often retain similar structures and functions

A

True

21
Q

collection of three or more protein (or nucleic acid) sequences that are partially or completely aligned

A

multiple sequence alignment

22
Q

Homologous residues are aligned in _ across the length of the sequences

A

columns

23
Q

In multiple sequence alignment, the residues are presumed to be homologous in an:

A

evolutionary and structural sense

24
Q

residues are homologous as they are presumably derived from a common ancestor

A

evolutionary sense

25
Q

aligned residues tend to occupy corresponding positions in the three-dimensional structure of each aligned protein

A

structural sense

26
Q

What are the 5 main approaches to multiple sequence alignment?

A

exact methods
progressive alignment
iterative approaches
consistency-based methods
structure-based methods

27
Q

employ dynamic programming (similar to Needleman-Wunsch, but the matrix is multidimensional)

A

exact methods

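For reference, the two-sequence case that exact methods generalize can be sketched as follows. This is a minimal Needleman-Wunsch score computation with a linear gap penalty; the function name and the scoring values (`match=1`, `mismatch=-1`, `gap=-2`) are illustrative assumptions. An exact multiple alignment of N sequences would fill an N-dimensional matrix in the same spirit, which is why the cost grows so quickly.

```python
def needleman_wunsch_score(a: str, b: str,
                           match: int = 1,
                           mismatch: int = -1,
                           gap: int = -2) -> int:
    """Optimal global alignment score of `a` and `b` (linear gap penalty)."""
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap  # a[:i] aligned against gaps only
    for j in range(1, cols):
        score[0][j] = j * gap  # b[:j] aligned against gaps only
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    return score[-1][-1]
```

The sketch returns only the score; recovering the alignment itself would require a traceback through the matrix.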
28
Q

goal is to maximize the summed alignment score of each pair of sequences

A

exact methods

29
Q

generate optimal alignments but are not feasible in time or space for more than a few sequences

A

exact methods

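The "summed alignment score of each pair of sequences" is often called a sum-of-pairs score. A minimal sketch, assuming a simple match/mismatch/gap scheme (the values below are illustrative, not taken from any particular program) and treating gap-versus-gap columns as scoring zero:

```python
from itertools import combinations


def sum_of_pairs(alignment: list[str],
                 match: int = 1,
                 mismatch: int = -1,
                 gap: int = -2) -> int:
    """Sum the pairwise column scores over every pair of aligned sequences."""
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all aligned sequences must have equal length")
    total = 0
    for column in zip(*alignment):            # one alignment column at a time
        for x, y in combinations(column, 2):  # every pair of rows
            if x == "-" and y == "-":
                continue                      # gap vs gap: assumed to score 0
            if x == "-" or y == "-":
                total += gap
            elif x == y:
                total += match
            else:
                total += mismatch
    return total
```

An exact method searches for the alignment that maximizes a score of this kind, whereas this function only evaluates an alignment that is already given.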
30
Q

strategy entails calculating pairwise sequence alignment scores between all the proteins (or nucleic acid sequences) being aligned

A

progressive sequence alignment

31
Q

beginning the alignment with the 2 closest sequences and progressively adding more sequences to the alignment

A

progressive sequence alignment

32
Q

What is the pro of Progressive Sequence Alignment?

A

permits rapid alignment of hundreds or thousands of sequences

33
Q

What is the con of Progressive Sequence Alignment?

A

the final alignment depends on the order in which sequences are joined; it is not guaranteed to provide the most accurate alignment

34
Q

What is an example of Progressive Sequence Alignment?

A

ClustalW

35
Q

What are the 3 stages of the ClustalW algorithm?

A

STAGE 1: create pairwise alignments of all the proteins included in the MSA
STAGE 2: a guide tree is calculated from the distance (similarity) matrix
STAGE 3: the MSA is created based on the guide tree

36
Q

two ways to construct a guide tree

A

Unweighted Pair Group Method of Arithmetic Averages (UPGMA)
Neighbor-Joining Method

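To make the guide-tree stage concrete, here is a minimal UPGMA sketch. It repeatedly merges the two closest clusters and recomputes distances to the merged cluster as a size-weighted average. The function name, the nested-tuple tree representation, and the input format (a list of labels plus a symmetric distance matrix) are assumptions for this example; branch lengths are omitted, and this is not a description of how ClustalW itself is implemented.

```python
def upgma(names: list[str], dist: list[list[float]]):
    """Build a guide tree (nested tuples of labels) from a distance matrix."""
    # cluster id -> (subtree, number of leaves in that subtree)
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    # distances keyed by (smaller id, larger id)
    d = {(i, j): dist[i][j]
         for i in range(len(names)) for j in range(i + 1, len(names))}
    next_id = len(names)
    while len(clusters) > 1:
        i, j = min(d, key=d.get)  # closest pair of clusters
        tree_i, n_i = clusters.pop(i)
        tree_j, n_j = clusters.pop(j)
        # size-weighted average distance from the merged cluster to the rest
        merged = {}
        for k in clusters:
            d_ik = d[(min(i, k), max(i, k))]
            d_jk = d[(min(j, k), max(j, k))]
            merged[k] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        # drop every distance that involved the two merged clusters
        d = {pair: v for pair, v in d.items()
             if i not in pair and j not in pair}
        for k, v in merged.items():
            d[(k, next_id)] = v  # next_id is larger than any existing id
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1
    return next(iter(clusters.values()))[0]
```

For three sequences where A and B are closest, `upgma(["A", "B", "C"], [[0, 2, 6], [2, 0, 6], [6, 6, 0]])` joins A and B first and then attaches C, which is the order a progressive aligner would follow when adding sequences.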
37
38
Q

compute a suboptimal solution using a progressive alignment strategy, and then modify the alignment using dynamic programming or other methods until the solution converges

A

iterative approaches

39
Q

What is the advantage of the Iterative Approach over Progressive Sequence Alignment?

A

overcomes alignment errors by iterative refinement

40
Q

What is an example of the Iterative Approach?

A

MAFFT

41
Q

What does MAFFT mean?

A

Multiple Alignment using Fast Fourier Transform

42
Q

example of a multiple alignment package that is considered to be highly accurate based on recent benchmarking studies

A

MAFFT

43
Q

use information about the multiple sequence alignment as it is being generated to guide the pairwise alignments

A

consistency-based methods

44
Q

example of a consistency-based approach

A

T-Coffee

45
Q

What does T-Coffee mean?

A

Tree-based Consistency Objective Function for alignment Evaluation

46
Q

includes all possible pairwise global alignments of the input sequences and the 10 highest-scoring local alignments

A

T-Coffee

47
Q

True or False: in T-Coffee, every pair of aligned residues is assigned a weight

A

True

48
Q

based on the idea that tertiary structures evolve more slowly than primary sequences

A

structure-based approaches

49
Q

accuracy of the MSA is improved by including information about the 3-dimensional structure of one or more members of the group of proteins being aligned

A

structure-based approaches

50
Q

a compilation of both multiple sequence alignments and profile HMMs of protein families

A

Pfam

51
Q

What does Pfam mean?

A

Protein Family Database of Profile HMMs