Sequence Similarity Searching Flashcards
What is database structure determined by?
The requirements of designers/users
Complete this statement: databases can be local or…?
Remote
Complete this statement: querying can be manual or…?
Automated
What must providers such as NCBI/EBI balance across users?
Demand on computational resources
What does sequence similarity in DNA/proteins suggest?
Common ancestry
What might common ancestry imply?
Common function
What is the name given to homologs separated by a speciation event?
Orthologs
What is the name given to homologs separated by a duplication event?
Paralogs
Paralogs and orthologs are two types of homologous sequence, true or false?
True
What does the alignment or equivalencing of bases enable?
Maximisation of similarity
What could a database query look like?
Could simply be a sequence (DNA/protein)
Could be a logical structure, e.g. human + mitochondrial + HVS2
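A logical query such as human + mitochondrial + HVS2 can be read as a set of conditions joined by AND. The sketch below illustrates that idea on a few made-up records; the record layout, field names and the helper `matches_all` are assumptions for illustration only and do not reflect how any real sequence database stores or indexes its entries.

```python
# Toy, made-up records; the field names are hypothetical.
records = [
    {"id": "rec1", "organism": "human", "genome": "mitochondrial", "region": "HVS2"},
    {"id": "rec2", "organism": "human", "genome": "nuclear", "region": "other"},
    {"id": "rec3", "organism": "mouse", "genome": "mitochondrial", "region": "HVS2"},
]


def matches_all(record, **criteria):
    """Return True only when every field=value criterion holds (logical AND)."""
    return all(record.get(field) == value for field, value in criteria.items())


# human AND mitochondrial AND HVS2
hits = [
    record["id"]
    for record in records
    if matches_all(record, organism="human", genome="mitochondrial", region="HVS2")
]
print(hits)
```

Only the first record satisfies all three conditions, so it is the only hit.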
Why do sequence databases require specialised search tools?
Due to the size of the databases and the similarity between the sequences they hold
Is quantification of biological similarity easy or difficult?
Can be difficult
What can searching sequence databases for similar sequences predict about novel sequences?
Possible functions
What can alignments of sequences contain?
Mismatches and gaps
How are mismatches and gaps interpreted in sequence alignments?
As substitutions and indels respectively
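As a minimal sketch of this interpretation, the function below walks two already-aligned rows column by column and counts matches, substitutions (mismatched columns) and indels (runs of consecutive gap columns in the same row). The function name, the `-` gap symbol and the example rows are assumptions chosen for illustration.

```python
def summarise_alignment(row_a, row_b, gap="-"):
    """Count matches, substitutions and indels in two aligned rows of equal length."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")

    counts = {"matches": 0, "substitutions": 0, "indels": 0, "gap_columns": 0}
    previous_gap_row = None  # which row held a gap in the previous column

    for x, y in zip(row_a, row_b):
        if x == gap and y == gap:
            raise ValueError("a column cannot be a gap in both rows")
        gap_row = "a" if x == gap else "b" if y == gap else None

        if gap_row is not None:
            counts["gap_columns"] += 1
            # A new indel starts whenever a gap run begins or switches rows.
            if gap_row != previous_gap_row:
                counts["indels"] += 1
        elif x == y:
            counts["matches"] += 1
        else:
            counts["substitutions"] += 1
        previous_gap_row = gap_row

    return counts


print(summarise_alignment("GATTACA-G", "GA--TCATG"))
```

Counting a run of adjacent gap columns as one indel is a modelling choice: it treats the run as a single insertion or deletion event rather than several independent ones.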
What do alignment algorithms ideally try to identify about sequences?
The most likely evolutionary ‘path’ between sequences
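In practice, "most likely path" is usually approximated by the highest-scoring alignment under some scoring scheme. The sketch below computes a global alignment score by dynamic programming in the style of Needleman-Wunsch; the scoring values (match +1, mismatch -1, gap -2 per gap column) are arbitrary assumptions, and the per-column gap cost corresponds to a proportional gap penalty rather than an affine one.

```python
def global_alignment_score(seq_a, seq_b, match=1, mismatch=-1, gap=-2):
    """Best global alignment score of two sequences (Needleman-Wunsch style)."""
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against nothing costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            same = seq_a[i - 1] == seq_b[j - 1]
            diagonal = score[i - 1][j - 1] + (match if same else mismatch)
            up = score[i - 1][j] + gap      # gap in seq_b
            left = score[i][j - 1] + gap    # gap in seq_a
            score[i][j] = max(diagonal, up, left)

    return score[-1][-1]


print(global_alignment_score("ACGT", "AGT"))
```

Here the best alignment pairs three identical bases and opens one gap, giving 3 - 2 = 1. Recovering the alignment itself would need a traceback step, which is omitted to keep the sketch short.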
What are databases?
Searchable collections of information
What does the way we search databases depend on?
Database access, design and location
What does the quantification of sequence similarity require?
Alignment
What is the constant gap penalty?
A gap of any length attracts the same fixed penalty (a)
= -a
What is the proportional gap penalty?
Opening a gap attracts a penalty proportional to its length (L)
= -(aL)
What is the affine gap penalty?
Opening a gap attracts a constant penalty (a); extending it attracts a further penalty (b) for each position of the gap’s length (L)
= -(a + bL), where a ≫ b
What type of gap penalty is generally the most relevant biologically?
Affine
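The three gap penalty schemes in the cards above can be compared side by side with a short sketch. The function names and the numeric values of a and b are assumptions for illustration; the formulas follow the cards, with L as the gap length.

```python
def constant_gap_penalty(length, a):
    """-a for any gap, regardless of its length."""
    return -a if length > 0 else 0


def proportional_gap_penalty(length, a):
    """-(a * L): every gap position costs the same."""
    return -(a * length)


def affine_gap_penalty(length, a, b):
    """-(a + b * L): a fixed opening cost plus a smaller per-position cost."""
    return -(a + b * length) if length > 0 else 0


# Hypothetical parameter values, with a much larger than b for the affine case.
for gap_length in (1, 5, 10):
    print(
        gap_length,
        constant_gap_penalty(gap_length, a=10),
        proportional_gap_penalty(gap_length, a=2),
        affine_gap_penalty(gap_length, a=10, b=1),
    )
```

With these values, lengthening a gap from 1 to 10 positions leaves the constant penalty unchanged, multiplies the proportional penalty tenfold, and less than doubles the affine penalty. Some formulations write the affine penalty as -(a + b(L - 1)), counting the opening position only once; the version here keeps the form used on the card.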