4 RNA sequencing Flashcards by Zunaira Ali

What is the central dogma?

DNA to RNA to protein

Information flows in one direction, toward making proteins

What are transcriptomics and proteomics?

How is expression of RNA and protein regulated?

Transcriptomics: capturing all the RNA species from a population of cells at once, and measuring the abundance of each RNA species present in the cells.

Regulated by:
- transcription of the gene into mRNA
- mRNA stability (e.g. tags that cause degradation, or chemical modifications)

Proteomics: capturing all the protein species from a population of cells at once, and measuring the abundance of each protein species present in the cells.

Regulated by:
- translation of the mRNA into protein
- protein stability and post-translational modifications (PTMs)

What are the types of RNA, and why is this important?

mRNA
tRNA
siRNA
miRNA

Important because you need to purify RNA with a method specific to the type of RNA you want

What is gene expression?

What is the transcriptome?

What is the proteome?

Gene expression:
- the process by which information from a gene is used in the synthesis of a functional gene product

Transcriptome:
- the set of all RNA molecules in one cell or a population of cells (mRNA, tRNA, rRNA, ncRNA)

Proteome:
- the set of all protein molecules in one cell or a population of cells

Explain differential expression using microarrays and transcriptome sequencing

Both methods try to measure the RNA present in the cells

Have treatment and control conditions; extract RNA from the cells of both

Microarray:
- add a fluorophore to the ends of the extracted RNA
- it’s a comparative, ratiometric analysis: you compare the signal of one labelled sample against the other on each probe to tell whether something is up- or down-regulated in one cell type vs the other

Transcriptome sequencing:
- convert the extracted RNA fragments to cDNA
- genetically barcode the cDNA fragments and fix them to a flow cell
- then do next-generation DNA sequencing
- this is quantitative because it gives discrete counts of RNA transcripts
- it gives actual number values, whereas a microarray only gives a relative (ratiometric) signal

How do you prepare RNA for DNA sequencing?

Turn the RNA into cDNA, as is done for RT-PCR or RNA seq

How do you make cDNA for qRT-PCR?

Isolate the RNA
Random priming: short primers of 6 random nucleotides (hexamers) anneal at random positions along the RNA
First strand synthesis: murine leukemia virus reverse transcriptase extends from the hexamer primers (using dNTPs), reverse transcribing the RNA into the first strand of cDNA
Second strand synthesis: RNase (typically RNase H) nicks and degrades the initial RNA strand, and the remaining RNA pieces act as primers on the single-stranded cDNA. E. coli DNA pol I then elongates and fills in to make dsDNA. T4 DNA ligase seals the nicks
The result is a double-stranded cDNA library

Is RT-PCR useful for RNA seq?

No, because it has no way to remove rRNA from the sample that the cDNA is made from

What does purified bacterial total cellular RNA look like on a gel?

What does this mean?

A smear, but the most intense bands are 23S rRNA, 16S rRNA and 5S rRNA

This means the most abundant RNA in bacterial cells is rRNA

The same is true for eukaryotes

How does cDNA synthesis work for RNA seq?

Same as for qRT-PCR, but you deplete rRNA before hexamer priming

This is because if 95% of the sample is rRNA, most of the cDNA made comes from rRNA, which you don’t need, so you’re just wasting sequencing power on rRNA
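
The cost of skipping depletion can be put in numbers. This is a minimal sketch, assuming reads are sampled in proportion to how much of each RNA class is left in the sample; the function name and the 80% depletion efficiency are illustrative assumptions, not values from the lecture.

```python
def non_rrna_read_fraction(rrna_fraction, depletion_efficiency=0.0):
    """Fraction of reads expected to come from non-rRNA after depletion.

    Simplifying assumption: reads are sampled in proportion to the
    amount of each RNA class remaining in the sample.
    """
    remaining_rrna = rrna_fraction * (1.0 - depletion_efficiency)
    non_rrna = 1.0 - rrna_fraction
    return non_rrna / (non_rrna + remaining_rrna)


# 95% rRNA sample, with and without an (assumed) 80% efficient depletion.
no_depletion = non_rrna_read_fraction(0.95)
with_depletion = non_rrna_read_fraction(0.95, depletion_efficiency=0.80)
print(f"informative reads: {no_depletion:.1%} vs {with_depletion:.1%}")
```

Under these assumptions only about 5% of reads are informative without depletion, and removing 80% of the rRNA raises that to roughly 21%.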

How is RNA capture and enrichment done?

To separate rRNA and mRNA from cells (mainly to remove rRNA)

TEX (terminator exonuclease):
- degrades RNA with a 5’ monophosphate; this RNA is usually rRNA, so it’s a way to remove rRNA

RNA immunoprecipitation sequencing (RIP-seq):
- captures RNA that interacts with a specific protein
- a tag is used to pull down the protein, and the bound target RNA comes down with it

rRNA capture:
- magnetic beads carrying specific probe sequences bind the RNA sequences you want to capture
- the probes can bind rRNA, so you can pull the rRNA out with a magnet and separate it from mRNA

Selective polyadenylation of mRNA:
- the E. coli poly(A) polymerase enzyme selectively polyadenylates mRNA
- that mRNA can then be captured with oligo-dT probes or reverse transcribed using oligo-dT primers

What is not-so-random priming?

Use hexamers to prime the RNA, BUT

Predict which hexamers in the random mix actually bind rRNA, and remove them:
- the primer pool is designed so that the hexamers that bind rRNA are gone, so rRNA doesn’t get turned into cDNA
- depletes 50-80% of the rRNA in the sample

Highly biased
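
The filtering idea can be sketched in a few lines. This is only an illustration, assuming a hexamer "binds" when it is the exact reverse complement of a 6-nt site in the rRNA (real designs also have to consider mismatches); the function names and the short rRNA fragment are hypothetical.

```python
from itertools import product


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def not_so_random_hexamers(rrna_seqs):
    """All DNA hexamers except those that perfectly match an rRNA site."""
    all_hexamers = {"".join(p) for p in product("ACGT", repeat=6)}
    rrna_binders = set()
    for rrna in rrna_seqs:
        rrna = rrna.upper().replace("U", "T")
        for i in range(len(rrna) - 5):
            site = rrna[i:i + 6]
            # A primer anneals to a site when it is the site's reverse complement.
            rrna_binders.add(reverse_complement(site))
    return all_hexamers - rrna_binders


toy_rrna = "AGAGUUUGAUCCUGGCUCAG"  # hypothetical short rRNA fragment
kept = not_so_random_hexamers([toy_rrna])
print(len(kept), "of 4096 hexamers kept")
```

Dropping part of the pool is also why the priming ends up biased: some non-rRNA sites lose their matching primers too.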

How are RNA seq libraries made?

Same as DNA seq libraries, but using cDNA made from RNA instead of DNA

How do we interpret data from RNA seq

U mapping

Cluster analysis

What is the workflow of bioinformatics and statistical analysis of RNA seq?

Ex. getting sequence reads from an Illumina sequencer

Base calling: the computer processes the raw image data and makes base calls based on the fluorescence patterns
Base calling gives you short read sequences
These short reads are mapped back to a reference genome
An algorithm bins the reads into specific genomic intervals, giving integer counts of how many reads align to each region of the genome
This gives uniquely mapped reads, multi-mapped reads and unmapped reads (which you discard; they come from things like instrument calibration controls and contaminants)
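
The three read categories follow directly from how many places each read aligns. A minimal sketch, assuming you already have the number of alignments per read from a mapper; the read names and numbers are made up for illustration.

```python
from collections import Counter


def classify_read(n_alignments):
    """Label a read by how many genome positions it aligned to."""
    if n_alignments == 0:
        return "unmapped"
    if n_alignments == 1:
        return "uniquely mapped"
    return "multi-mapped"


# Hypothetical mapper output: read name -> number of alignments found.
alignments_per_read = {"read1": 1, "read2": 3, "read3": 0, "read4": 1}

summary = Counter(classify_read(n) for n in alignments_per_read.values())
print(dict(summary))
```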

What is the simple concept behind RNA seq analysis?

The algorithm maps reads to positions in the genome and then counts the number of reads that map to specific loci

This can be done manually for individual loci, but not for entire genomes (too much data)

Then you can estimate changes in transcript abundance mathematically

Computers run the algorithms that make this feasible at genome scale

Explain how to analyze the results from RNA seq analysis

A genome is divided into genomic intervals: open reading frames (ORFs) and intergenic regions

Each read is assigned to the plus strand or the minus strand (plus and minus reads are shown in different colours)

A read that overlaps an ORF counts as 1 count for that ORF on its strand: e.g. 4 plus-strand and 1 minus-strand reads for the ORF

If the 5’ end of a read overlaps a region, the read is assigned to that interval
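
The counting rules above can be sketched as follows. This is a simplified illustration that uses only the 5’ end rule to pick the interval; the interval names, coordinates and reads are hypothetical, and real counting tools handle many more edge cases.

```python
from collections import Counter
from typing import NamedTuple


class Interval(NamedTuple):
    name: str
    start: int  # inclusive
    end: int    # inclusive


class Read(NamedTuple):
    start: int
    end: int
    strand: str  # "+" or "-"


def five_prime(read):
    """5' end of the read: leftmost base on '+', rightmost base on '-'."""
    return read.start if read.strand == "+" else read.end


def count_reads(reads, intervals):
    """Count reads per (interval, strand), assigning each read by its 5' end."""
    counts = Counter()
    for read in reads:
        pos = five_prime(read)
        for iv in intervals:
            if iv.start <= pos <= iv.end:
                counts[(iv.name, read.strand)] += 1
                break
    return counts


intervals = [Interval("orfA", 100, 400), Interval("igAB", 401, 500)]
reads = [
    Read(120, 170, "+"), Read(150, 200, "+"), Read(300, 350, "+"),
    Read(380, 430, "+"),  # spans the boundary, but its 5' end is in orfA
    Read(250, 300, "-"),
    Read(390, 450, "-"),  # 5' end (position 450) falls in igAB
]
counts = count_reads(reads, intervals)
print(counts)
```

With these toy reads, orfA gets 4 plus-strand counts and 1 minus-strand count, matching the example on the card.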

Explain the differential expression analysis

Explain why changes would be seen

Ex. comparing genes between condition A and condition B:

orfB: cond A and cond B both have 3 plus-strand reads, 3/3 = 1, no change in expression

orfA: sense: 4/4 = 1, no change; antisense: 8/1 = 8, 8-fold change

igBC: cond A has 2 and cond B has 16, 16/2 = 8, 8-fold change in expression

Infer:
- locus 1: there are sense and antisense transcripts in one ORF; this could stop gene expression in this region because sense and antisense RNAs can pair by complementary base pairing
- locus 2, cond B: if the reads in the intergenic region don’t overlap the ORF, you can assume the intergenic region mainly makes non-coding RNA
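
The ratios in this example can be reproduced with a short script. The counts come from the card; which condition holds the 8 antisense reads is not stated, so placing them in condition B is an assumption, and the log2 column is an addition that is commonly reported alongside fold change.

```python
import math

# Read counts per interval as (condition A, condition B).
counts = {
    "orfB sense": (3, 3),
    "orfA sense": (4, 4),
    "orfA antisense": (1, 8),  # assumption: the 8 reads are in condition B
    "igBC": (2, 16),
}


def fold_change(count_a, count_b):
    """Fold change of condition B relative to A (count_a must be > 0)."""
    return count_b / count_a


fold_changes = {name: fold_change(a, b) for name, (a, b) in counts.items()}
for name, fc in fold_changes.items():
    print(f"{name}: fold change {fc:g}, log2 fold change {math.log2(fc):+.1f}")
```

Raw ratios like these ignore sequencing depth and replicate variance, which is why the later cards move on to normalization and statistical testing.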

What is normalization RPKM

Why is it useful

Reads per kilobase per million mapped reads

RPKM: quantifies gene expression by normalizing read counts for the length of the genic (or intergenic) interval and for sequencing depth, on a human-friendly scale

The key issue is that read counting is biased towards longer genes (longer genes are over-represented), so RPKM normalizes for length

It also scales the counts to numbers that are easier to use
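
As a worked example, RPKM is the read count divided by the interval length in kilobases and by the total mapped reads in millions. The sketch below applies that definition; the gene lengths, counts and library size are made-up numbers.

```python
def rpkm(read_count, length_bp, total_mapped_reads):
    """Reads per kilobase of interval per million mapped reads.

    RPKM = reads / (length_bp / 1e3) / (total_mapped_reads / 1e6)
         = reads * 1e9 / (length_bp * total_mapped_reads)
    """
    return read_count * 1e9 / (length_bp * total_mapped_reads)


total_mapped = 10_000_000  # hypothetical library size

# A 2 kb gene collects twice the reads of a 1 kb gene at equal expression.
long_gene = rpkm(read_count=400, length_bp=2000, total_mapped_reads=total_mapped)
short_gene = rpkm(read_count=200, length_bp=1000, total_mapped_reads=total_mapped)
print(long_gene, short_gene)
```

Both genes come out with the same RPKM even though the longer one has twice the raw count, which is the length bias the normalization removes.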

What can RPKM be used for and how

Can be used to assess the reproducibility of RNA sequencing

It does a goodness-of-fit measurement between repeats of the same sequencing library run in different flow cells

It then gives an R-squared value (coefficient of determination) to show how closely the replicates agree
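
A minimal sketch of that comparison, assuming the goodness of fit is a simple linear regression of one replicate's RPKM values on the other's, in which case R-squared equals the squared Pearson correlation. The RPKM values are invented for illustration.

```python
def r_squared(x, y):
    """Coefficient of determination for a simple linear fit of y on x
    (equal to the squared Pearson correlation)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


# Hypothetical RPKM values for the same five genes in two flow cells.
flow_cell_1 = [12.0, 55.0, 3.1, 240.0, 18.5]
flow_cell_2 = [11.4, 58.2, 2.9, 231.0, 19.9]

r2 = r_squared(flow_cell_1, flow_cell_2)
print(f"R^2 = {r2:.3f}")
```

A value close to 1 means the two runs of the library give nearly the same expression estimates.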

How do you carry out a differential expression analysis using rna seq data

What are the key issues of analysis of gene expression

Stats using R:
- the most common tools are edgeR and DESeq

Analysis of gene expression requires:

- an accurate statistical model for variance (read counts follow a negative binomial distribution); since it’s not a normal distribution, you can’t just run a bunch of t tests to compare genomic intervals and see what’s differentially expressed
- a way to correct for false discovery
- a calculation that’s aware of count depth and normalized between replicates
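
To make the false discovery point concrete, here is a sketch of the Benjamini-Hochberg procedure, one common way to adjust p-values when many intervals are tested at once. The card does not say which correction is meant, so the choice of method is an assumption, and the p-values are invented.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four genomic intervals.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [p <= 0.05 for p in adj]
print(adj, significant)
```

Intervals whose adjusted p-value is at or below the chosen false discovery rate (for example 0.05) are called differentially expressed.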

What are false discovery/false negative in analysis of gene expression

The most common error with RNA seq data is false negatives (type II error)

The output of the R program that finds changes in gene expression between regions of the genome gives false negatives

A type II error is when it says genes are not differentially regulated but they are (a type I error is a false positive)

Ex. filtering for a false discovery rate of 0.05:

For genes that are part of the same operon and regulated by the same transcription factor, it says the first three are differentially expressed in cond B but the last two aren’t

You know this is wrong because genes in the same operon should all behave the same way and all be differentially expressed

Also, degradation could lead to high variance, but the false negatives hide that variance and say the genes are regulated the same

What is special about DESeq data analysis?

It’s depth-aware and variance-aware

Can make volcano plots:
- below a certain threshold of mean normalized counts (i.e. when the mean of normalized counts is low), the algorithm won’t make a call about whether anything is differentially expressed

This is because of count depth: at those low values, random sampling alone makes it unlikely that what you’re seeing is real (depth-aware)

The higher the depth (how much sample you have), the more confidence in saying something is differentially expressed

But if depth is too high, this increases the false discovery rate (type I error, false positives)

What else can you capture from RNA seq other than the read counts

The strandedness of the libraries

What are stranded cDNA libraries for RNA sequencing?

After rRNA depletion, random priming and first strand synthesis:
Second strand synthesis incorporates dUTP, giving a second cDNA strand that’s labelled with uracil
The cDNA then goes through the steps of A-tailing and adapter ligation
Before the PCR amplification step, USER enzyme degrades the cDNA strand labelled with uracil
This preserves the strandedness (sense or antisense orientation) of the gene you sequenced
This lets you capture information about RNA that comes from one strand of the genome, and helps identify problems on either the positive or negative strand of the DNA in the genome (depending on the strand you’re analyzing)

What is strand specific RNA seq

RNA seq lets you improve genome annotation: sequencing RNA can help annotate regions of the genome that encode non-protein-coding RNA
This is a way to drive discovery

How does RNA seq drive discovery using 5’ UTR

The regulatory region of the 5’ UTR in RNA contains riboswitches, which bind metabolites and regulate translation
Sequencing the 5’ UTR can help you detect riboswitches in the mRNA, and you can then assemble them
The 5’ UTR also changes under different conditions, which gives an idea of how different promoters activate the gene under different circumstances
Can revise genome annotations: if the start codon wasn’t correctly identified and annotated, the mRNA can correct that because you’ll see the discrepancy in the transcript
Can also discern operon structures

Last slide didn’t save, relearn

Okay

4 RNA sequencing Flashcards

(29 cards)