4. Bioinformatics Flashcards
(15 cards)
bioinformatics
the development and application of computer science to the analysis of large amounts of biological information
key elements of bioinformatics
- algorithms and statistical analysis
- software tool implementation
- database infrastructure to store and access data
typical bioinformatics workflow
sequence DNA
-> assemble sequence
-> identify genes
-> assign genes to groups/clusters
-> find gene functions and protein sequences
-> identify variation and 3D structure
Protein domain identification
using InterProScan
- compares a protein sequence against databases of known domain signatures to assign domains and their functions
e.g. the LacI protein has a DNA-binding domain and a regulatory domain
3D structure prediction
AlphaFold
input sequence -> database search -> multiple sequence alignment -> covariation -> prediction
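The covariation step can be illustrated with a toy calculation. This is a minimal sketch, not AlphaFold's actual method: it uses mutual information between two alignment columns as a stand-in measure of covariation. The function name column_mutual_information and the example alignment are made up for illustration.

```python
import math
from collections import Counter

def column_mutual_information(msa, i, j):
    """Mutual information (in bits) between columns i and j of an alignment.

    msa is a list of equal-length aligned sequences. Columns that change
    together across sequences score high; independent columns score near 0.
    """
    n = len(msa)
    count_i = Counter(seq[i] for seq in msa)
    count_j = Counter(seq[j] for seq in msa)
    count_ij = Counter((seq[i], seq[j]) for seq in msa)
    mi = 0.0
    for (a, b), c in count_ij.items():
        p_ab = c / n
        p_a = count_i[a] / n
        p_b = count_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi

# Hypothetical alignment: columns 0 and 1 always change together,
# column 2 never changes.
toy_msa = ["AKL", "AKL", "GEL", "GEL"]
covarying = column_mutual_information(toy_msa, 0, 1)
conserved = column_mutual_information(toy_msa, 0, 2)
```

Pairs of positions that covary strongly across many sequences are often close together in the folded protein, which is why the alignment carries structural information.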
protein variation effect prediction
ClinVar: public database of known human genetic variants and their reported clinical significance
AlphaMissense: predicts whether missense (single amino acid change) variants are likely pathogenic or benign
BLAST query
the sequence you submit to search against the database
BLAST subject
the database sequence that the query is aligned against (shown as Sbjct)
BLAST CDS
coding sequence: the part of the nucleotide sequence that is translated into amino acids
BLAST CDS number
location in amino acid sequence
BLAST Score
measure of alignment quality: matching positions add to the score, mismatches and gaps subtract from it; a higher score means a better alignment
BLAST Expect
number of matches with a score this good or better expected to occur by chance in a database of this size; the closer to 0, the more significant the match
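How Score and Expect relate can be sketched with the standard Karlin-Altschul formulas: bit score S' = (lambda * S - ln K) / ln 2, and E = m * n * 2^(-S'), where m is the query length and n is the database length. The function names and the numbers in the example are illustrative assumptions. Real BLAST uses matrix-specific lambda and K values and adjusted "effective" lengths, so this will not reproduce its output exactly.

```python
import math

def bit_score(raw_score, lam, k):
    """Convert a raw alignment score S to a bit score S'."""
    return (lam * raw_score - math.log(k)) / math.log(2)

def expect_value(bits, query_len, db_len):
    """Expected number of chance hits scoring at least `bits`."""
    return query_len * db_len * 2.0 ** (-bits)

# Hypothetical numbers: the same bit score is less significant
# in a larger database.
small_db = expect_value(40.0, query_len=300, db_len=1_000_000)
large_db = expect_value(40.0, query_len=300, db_len=1_000_000_000)
```

Each extra bit of score halves the Expect value, and each doubling of the database size doubles it.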
BLAST identity
number (and percentage) of aligned positions where Sbjct and Query have the same residue; lower identity means more changes have occurred between them
Gaps
positions where one sequence has a residue and the other has none (shown as -); indicate insertion/deletion (InDel) mutations
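Identities and Gaps can both be counted directly from an aligned Query/Sbjct pair. A minimal sketch, assuming the two aligned strings have equal length and use - for gap positions; the function name alignment_stats and the example sequences are made up for illustration.

```python
def alignment_stats(query_aln, sbjct_aln):
    """Count identities and gaps in a pairwise alignment.

    Both strings must be the same non-zero length; '-' marks a gap.
    """
    if len(query_aln) != len(sbjct_aln) or not query_aln:
        raise ValueError("aligned sequences must be non-empty and equal length")
    pairs = list(zip(query_aln, sbjct_aln))
    # An identity is a column where both residues match and neither is a gap.
    identities = sum(1 for q, s in pairs if q == s and q != "-")
    # A gap column has '-' in either sequence.
    gaps = sum(1 for q, s in pairs if q == "-" or s == "-")
    length = len(pairs)
    return {
        "length": length,
        "identities": identities,
        "gaps": gaps,
        "percent_identity": 100.0 * identities / length,
    }

# Hypothetical alignment: one gap and one mismatch over 9 columns.
stats = alignment_stats("ACGT-ACGT", "ACGTTACCT")
```

BLAST reports both values as fractions of the alignment length, which is 7/9 identities and 1/9 gaps for this example.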
BLAST strand
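shows the orientation of the aligned Query and Sbjct strands (e.g. Plus/Plus or Plus/Minus)
Plus (+) = 5’ - 3’, the sequence as given
Minus (-) = 3’ - 5’, the reverse complement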
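A minus-strand hit means the match is to the reverse complement of the sequence as written. A minimal sketch of that conversion, assuming plain A/C/G/T DNA with no ambiguity codes; the function name reverse_complement is illustrative.

```python
# Translation table mapping each base to its complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]

minus_strand = reverse_complement("ATGC")
```

Applying the function twice returns the original sequence, which is a quick sanity check.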