2. Functional Genetic Information and Sequence Alignment Flashcards
(35 cards)
How can we determine gene expression?
ChIP-seq
Western Blot
Polysomes
Mass spectrometry
Northern Blot
Microarray
RT-qPCR
RNA sequencing
Transcriptomics (definition)
Measurement of gene expression by next-generation sequencing (NGS) of entire transcriptomes
How do we quantify expression in RNA-sequencing experiments?
RPKM: reads per kilobase of transcript per million mapped reads
FPKM: fragments per kilobase of transcript per million mapped fragments
TPM: transcripts per million
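A minimal sketch of how these units could be computed from raw counts. The gene counts and lengths are hypothetical toy values, and the function names are illustrative only. FPKM follows the same arithmetic as RPKM, with fragments counted instead of reads.

```python
def rpkm(counts, lengths_bp):
    """RPKM per gene: reads / (length in kb) / (total mapped reads in millions)."""
    total_millions = sum(counts) / 1e6
    return [c / (l / 1e3) / total_millions for c, l in zip(counts, lengths_bp)]


def tpm(counts, lengths_bp):
    """TPM per gene: length-normalised rates rescaled so they sum to one million."""
    rates = [c / (l / 1e3) for c, l in zip(counts, lengths_bp)]
    scale = sum(rates) / 1e6
    return [r / scale for r in rates]


# Hypothetical toy data: three genes with read counts and transcript lengths (bp).
counts = [100, 200, 700]
lengths = [1000, 2000, 7000]

rpkm_values = rpkm(counts, lengths)
tpm_values = tpm(counts, lengths)
print(rpkm_values)
print(tpm_values)
```

In this toy data every gene has the same number of reads per kilobase, so all three genes get the same RPKM and the same TPM even though their raw counts differ. TPM values always sum to one million within a sample, which is not true of RPKM.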
What is the difference between a read and a fragment?
A read is a single sequence obtained from one end of a cDNA fragment; in single-read sequencing each fragment yields one read
A fragment is the piece of cDNA being sequenced; in paired-read sequencing it yields a forward read and a reverse read, which are counted together as one fragment and give a more accurate result
What is baseline expression?
Baseline expression describes where (in which tissues or cell types) and at what level a gene is expressed under normal conditions
What tools can we use to determine baseline expression?
RefSeqGene
UniProt
Expression Atlas
The Human Protein Atlas
GTEx portal
What is studied in differential expression?
We compare gene expression across two or more states (e.g. healthy vs disease)
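One common way to summarise such a comparison for a single gene is the log2 fold change. The sketch below is an illustration only: the function name, the pseudocount of 1 and the expression values are assumptions, and a real analysis would also use replicates and a statistical test.

```python
import math


def log2_fold_change(disease, healthy, pseudocount=1.0):
    """log2 ratio of expression in two states; the pseudocount avoids division by zero."""
    return math.log2((disease + pseudocount) / (healthy + pseudocount))


# Hypothetical normalised expression values for one gene.
lfc = log2_fold_change(disease=79.0, healthy=19.0)
print(lfc)  # (79 + 1) / (19 + 1) = 4, and log2(4) = 2.0
```

A positive value means higher expression in the first state, a negative value means lower expression, and 0 means no change.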
What are the three aspects of the Gene Ontology (GO)?
Molecular function
Cellular component
Biological process
Ontology (definition)
A set of concepts and categories in a subject area or domain that shows their properties and the relations between them
What do we use ontologies for?
To find tendencies, pathways or cellular components common to multiple genes
Databases for gene ontology (GO) enrichment analysis:
g:Profiler
Gene Ontology (geneontology.org)
Enrichr
What is a KEGG pathway?
A map representing our knowledge of the molecular interaction, reaction and relation networks for metabolism, genetic information processing, environmental information processing, cellular processes, organismal systems, human diseases and drug development
What is homology in genes?
Two sequences are said to be homologous if they have evolved from a common ancestor. Homology has no degrees: two sequences either are homologous or are not
What are paralogous genes?
Two genes that arose from a gene duplication within the same organism and have since evolved in parallel inside that organism
What are orthologous genes?
Genes in different species that descend from one single gene in a common ancestor and have evolved differently in each species (e.g. mouse vs human hemoglobin)
What is identity between two sequences?
The number of identical positions divided by the total number of positions in the alignment
What is similarity between two sequences?
The number of identical positions plus the number of similar positions, divided by the total number of positions in the alignment (similar positions only exist in amino acid sequences, never in nucleotide sequences)
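A minimal sketch of both measures for two already aligned protein sequences. The set of "similar" amino acid pairs is a deliberately tiny, hypothetical example; real tools decide similarity from a substitution matrix.

```python
# Hypothetical, incomplete set of amino acid pairs treated as "similar".
SIMILAR_PAIRS = {
    frozenset("IL"), frozenset("IV"), frozenset("LV"),
    frozenset("DE"), frozenset("KR"), frozenset("ST"),
}


def identity_and_similarity(a, b):
    """Return (identity, similarity) as fractions of the alignment length."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    # Identical positions: same residue in both sequences (gap columns excluded).
    identical = sum(x == y and x != "-" for x, y in zip(a, b))
    # Similar positions: different residues that belong to a "similar" pair.
    similar = sum(frozenset((x, y)) in SIMILAR_PAIRS for x, y in zip(a, b))
    n = len(a)
    return identical / n, (identical + similar) / n


# Toy alignment: 1 identical position (M/M), 4 similar ones, 1 mismatch (A/W).
ident, simil = identity_and_similarity("MKLDSA", "MRIETW")
print(ident, simil)
```

Here identity is 1 of 6 positions and similarity is 5 of 6, which shows how two proteins can look unrelated by identity alone while still being chemically conserved.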
How to use dot plots for pairwise sequence alignments?
Place one sequence along one axis and the other sequence along the other axis. Plot a dot at every position where the two sequences are identical. If the sequences are 100% identical, the dots form a straight diagonal line with slope = 1 (y = x)
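The procedure can be sketched as a small text-based dot plot. This is an illustration under simple assumptions (exact matches only, no window or threshold filtering), and the function name is hypothetical.

```python
def dot_plot(seq_x, seq_y):
    """Return one text row per residue of seq_y and one column per residue of seq_x.

    A '*' marks an identity between the two residues, a '.' marks a difference.
    """
    return ["".join("*" if x == y else "." for x in seq_x) for y in seq_y]


# Comparing a sequence with itself gives an unbroken main diagonal.
rows = dot_plot("GATTACA", "GATTACA")
print("\n".join(rows))
```

Because text is printed from the top down, the diagonal runs from top left to bottom right here; on a plot with the origin at the bottom left it would appear as the line y = x. The extra stars away from the diagonal are chance identities, which is why short nucleotide dot plots look noisy.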
Downsides to the dot plot approach for pairwise alignment?
Relies on visual analysis
Cannot distinguish between gaps and mismatches
What is an S score?
A score that represents how good an alignment is; it is obtained through a dynamic programming approach
How is an S raw score calculated?
Sum of the scores of all identities + sum of the scores of all mismatches (usually negative) - total gap penalty
What is the total gap penalty?
Go is the gap opening penalty ( = 3), charged for the first nucleotide of a gap
Ge is the gap extension penalty ( = 1), charged for each nucleotide after the first in a gap
For one gap of length n the penalty is Go + Ge * (n - 1); Gt is the sum of this penalty over all gaps in the alignment
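A minimal sketch of scoring an existing alignment with these penalties, reading the definitions above as: Go for the first position of each gap and Ge for every later position of the same gap. The identity score of +1 and mismatch score of -1 are assumptions chosen for illustration.

```python
def raw_score(a, b, match=1, mismatch=-1, gap_open=3, gap_extend=1):
    """Score an aligned pair of sequences; '-' marks a gap position."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    score = 0
    in_gap = None  # which sequence the current run of gap positions is in
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            side = "a" if x == "-" else "b"
            # First position of a gap costs gap_open, later ones gap_extend.
            score -= gap_extend if in_gap == side else gap_open
            in_gap = side
        else:
            score += match if x == y else mismatch
            in_gap = None
    return score


# Five identities (+5) and one gap of length 2 (3 + 1 = 4) give a score of 1.
s_gap = raw_score("GATTACA", "GA--ACA")
# Six identities (+6) and one mismatch (-1) give a score of 5.
s_mismatch = raw_score("GATTACA", "GACTACA")
print(s_gap, s_mismatch)
```

With separate opening and extension penalties, one long gap is cheaper than several short ones of the same total length, which favours alignments with few insertion or deletion events.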
What is the Jukes-Cantor assumption?
We assume that all nucleotides occur with the same frequency and that every substitution is equally likely, and we assign scores to identities, mismatches and gaps according to that assumption (a high positive value for identities, a negative value for mismatches).
What is the difference between local and global alignments?
In global alignments, the sequences are aligned as a whole, from end to end, so the alignment contains every position of both sequences
In local alignments, we look for the highest-scoring pair of subsequences, so the alignment shows only the aligned fragments
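The difference can be sketched with one dynamic programming routine that computes either score. This is a simplified illustration: it assumes a linear gap penalty of 2 per gap position and scores of +1 and -1 for identities and mismatches, and it returns only the score, not the alignment. The global mode follows the Needleman-Wunsch scheme and the local mode the Smith-Waterman scheme.

```python
def align_score(a, b, local=False, match=1, mismatch=-1, gap=2):
    """Best alignment score of a and b by dynamic programming (linear gap penalty)."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    if not local:
        # Global: leading gaps are penalised, so the first row and column are negative.
        for i in range(1, rows):
            H[i][0] = -gap * i
        for j in range(1, cols):
            H[0][j] = -gap * j
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(diag, H[i - 1][j] - gap, H[i][j - 1] - gap)
            if local:
                # Local: a score never drops below 0, so an alignment can restart anywhere.
                cell = max(cell, 0)
                best = max(best, cell)
            H[i][j] = cell
    # Global score is the bottom-right cell; local score is the best cell anywhere.
    return best if local else H[-1][-1]


# Toy sequences that share the fragment GATTACA but differ at their ends.
seq_a = "AAAGATTACA"
seq_b = "GATTACACCC"
local_score = align_score(seq_a, seq_b, local=True)
global_score = align_score(seq_a, seq_b)
print(local_score, global_score)
```

The local score reflects only the shared fragment, while the global score is lower because the unrelated ends of both sequences must also be aligned and are penalised.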