PART III FROM GENOTYPE TO PHENOTYPE Flashcards

1
Q

Which areas of the genome are genes concentrated in?

A

-G/C rich areas

2
Q

What percentage of the genome encodes protein or non-coding RNA?

A

<2%

3
Q

What percentage of the genome is regulatory/introns?

A
  • 25%
4
Q

What rough % of the genome is junk DNA?

A
  • > 50%
5
Q

What is the genes and gene product mismatch problem?

A
  • That there are about 20 000 genes but more than 500 000 proteins
6
Q

What is a putative gene?

A
  • A sequence believed to be a gene because it contains an ORF, although its protein and function are not yet known
7
Q

What is the transcriptome?

A
  • COMPLETE collection of RNA produced from a genome BUT not every RNA is present in every cell and eukaryotic RNAs are spliced
8
Q

What does alternative splicing give rise to?

A
  • Different protein isoforms from the SAME gene (this partially explains the gene product mismatch problem)
9
Q

How can an RNA sequence be deduced?

A
  • By making and analysing cDNA
10
Q

What are cDNAs and ESTs (Expressed Sequence Tags) used to analyse?

A
  • Used to analyse gene structure, and presence + levels of specific RNA in cells
11
Q

What is transcriptomics?

A
  • The study of THOUSANDS of RNAs simultaneously
12
Q

Is the whole transcriptome produced in cells?

A

NO

- Because only a subset of genes is active in any cell

13
Q

What are the 3 major classes of RNAs that make up the eukaryotic transcriptome?

A
  1. Ribosomal RNAs transcribed by RNA pol I
  2. Protein encoding RNAs (mRNA) and microRNAs (miRNA) transcribed by RNA polymerase II
  3. Small RNAs (including tRNA) transcribed by RNA pol III
14
Q

Are genes organised into operons in eukaryotes?

A

-NO

15
Q

What is the splicing process?

A
  • Where eukaryotic mRNA is produced by excision of non-coding segments (introns) from precursor (pre-mRNA)
16
Q

Is splicing SEQUENCE specific and if so what can be found out from this?

A
  • YES!
  • Intron/exon boundaries can be predicted using bioinformatics genomic sequence analyses
  • But there is NO specific splice sequence that is cut out… more an overall general pattern
17
Q

What is the key to gene identification in eukaryotic genome analyses?

A
  • Accurately predicting splice junctions
18
Q

Via what process can related but DIFFERENT polypeptides be generated from the same primary transcript?

A
  • Alternative splicing
19
Q

What allows for different isoforms of a transcript specifically?

A
  • Different EXONS being incorporated OR omitted from the final mRNA
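As a rough illustration of how including or omitting exons multiplies the number of possible transcripts, the sketch below enumerates every combination for a hypothetical gene. The exon names, and the assumption that only the listed "optional" exons can be skipped, are illustrative and not taken from the cards.

```python
from itertools import combinations

def enumerate_isoforms(exons, optional):
    """Return every exon combination obtained by keeping or skipping optional exons."""
    skippable = [e for e in exons if e in optional]
    isoforms = []
    for k in range(len(skippable) + 1):
        for skipped in combinations(skippable, k):
            # Keep the original exon order, dropping only the skipped exons.
            isoforms.append([e for e in exons if e not in skipped])
    return isoforms

# Hypothetical gene: E2 and E3 may be included or omitted, E1 and E4 are always kept.
isoforms = enumerate_isoforms(["E1", "E2", "E3", "E4"], optional={"E2", "E3"})
for iso in isoforms:
    print("-".join(iso))
```

With n independently skippable exons this simple model gives 2^n isoforms, which is one way to picture how few genes can yield many proteins.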
20
Q

What process explains why relatively few genes in genome can give rise to vastly greater number of proteins?

A
  • Alternative splicing
21
Q

Can splicing errors cause disease via mutations?

A
  • YES!
  • Mutations can occur in splice donor or acceptor sequences OR generate NEW (cryptic) splice sequences
    e.g. Exons being omitted (skipped) deletes a section of the protein –> severely affects the structure
22
Q

How can the use of false (cryptic) acceptor or donor sites severely affect the protein structure?

A
  • By truncating (shortening) or lengthening exons
23
Q

What are the old definition and the 2 new definitions of the gene, respectively?

A

OLD: One gene encodes one protein
NEW 1: Single transcription unit (gene) encodes one set of protein isoforms
NEW 2 (newest): A single polypeptide is the product of a single gene

24
Q

What 3 things do we need to know from each gene in terms of RNA?

A
  1. Where and when it is transcribed into RNA
  2. How it is spliced, and how many spliceoforms there are
  3. Whether particular spliceoforms are restricted to particular cells or growth stage
25
Q

Can the following be directly deduced from genomic DNA sequence with CONFIDENCE?

  1. Where and when it is transcribed into RNA
  2. How it is spliced, and how many spliceoforms there are
  3. Whether particular spliceoforms are restricted to particular cells or growth stage

A
  • NO

- Rely on analysis of cDNA and ESTs derived from RNA

26
Q

What is a method to sequence RNA that is stable?

A
  • Make a DNA copy (cDNA), as DNA is stable, easy to amplify, and easy to sequence
27
Q

Why is RNA unstable?

A
  • Because it is HIGHLY susceptible to nucleases
28
Q

What is used to produce DNA from an RNA template (like in some viruses)?

A
  • Reverse transcriptase
29
Q

What 4 things does creating a complementary DNA (cDNA) rely on?

A
  1. RNA can base pair with DNA
  2. mRNA has a polyadenylated tail (so an oligo(dT) DNA primer, TTTTTT, can anneal to it)
  3. A retroviral enzyme –> Reverse transcriptase can produce DNA from RNA
  4. No pre-existing gene sequence info is required to generate a cDNA
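The four points above can be pictured with a minimal sketch, assuming a toy mRNA written 5'->3' and an oligo(dT) primer of six bases. The function name and the example sequence are hypothetical. The sketch models only base pairing, not the enzyme itself.

```python
COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}  # RNA base -> paired DNA base

def first_strand_cdna(mrna, primer_len=6):
    """Return first-strand cDNA (5'->3') copied from an mRNA (5'->3') primed with oligo(dT)."""
    if not mrna.endswith("A" * primer_len):
        raise ValueError("no poly(A) tail for the oligo(dT) primer to anneal to")
    # The template is copied from its 3' end, so the cDNA is the reverse complement.
    return "".join(COMPLEMENT[base] for base in reversed(mrna))

mrna = "AUGGCC" + "A" * 8  # toy transcript with a poly(A) tail
cdna = first_strand_cdna(mrna)
print(cdna)
```

Note that no knowledge of the gene sequence is needed: the primer depends only on the poly(A) tail.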
30
Q

What does producing a cDNA using PCR require?

A
  • Pre-existing sequence information to design primers
31
Q

What are ESTs? (Expressed Sequence Tag)

A
  • cDNAs made from mRNAs originating from a specific cell or tissue (DNA copies of mRNA or mRNA fragments)
  • represent a SNAPSHOT of the mRNA at that time and place
  • If there is a transcriptionally ACTIVE gene it will be evident in Expressed Sequence Tag databases
32
Q

What is the collection of colonies of ESTs known as?

A
  • The library –> EST from the colony is then sequenced and data lodged in database
33
Q

What are the 3 uses of EST and EST databases?

A
  1. Gene verification
  2. Gene structure
  3. Gene expression
34
Q

How can EST and EST databases apply to Gene verification?

A
  • if DNA sequence from genome matches EXACTLY to a specific EST it can be concluded that the genomic DNA is TRANSCRIBED and it represents a gene (or gene fragment)
35
Q

How can EST and EST databases apply to Gene Structure?

A
  • In identifying intron and exon boundaries

- ESTs will only match exons–> so segments that do not match with an EST derived from that gene are introns

36
Q

Do ESTs only match with introns or exons?

A
  • They only match with EXONS
37
Q

How can EST and EST databases apply to Gene Expression? (5 things…Identify:)

A
  • Identify specific cells or tissue in which the gene is active
  • Identify LEVEL of gene activity
  • Identify alterations in gene activity in disease
  • Identify transcription start and end points
  • Identify alternative splicing patterns
38
Q

What happens if you BLAST an EST sequence BACK onto a genomic sequence and why?

A
  • It will ONLY MATCH EXONS because ESTs are made from POST spliced mRNA
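A toy sketch of this idea, assuming exact matches and a made-up two-exon gene: the EST is split greedily into blocks that each occur in the genomic sequence, and the unmatched gap between blocks is the intron. Real BLAST alignments tolerate mismatches and use scoring, so this is only an illustration, and the function name is hypothetical.

```python
def map_est_to_genome(genome, est, min_block=4):
    """Greedily split an EST into blocks that each match the genome exactly, in order.

    Returns a list of (start, end) genome coordinates (0-based, end exclusive).
    """
    blocks, est_pos, genome_pos = [], 0, 0
    while est_pos < len(est):
        # Try the longest remaining piece of the EST first, then shorter pieces.
        for length in range(len(est) - est_pos, min_block - 1, -1):
            hit = genome.find(est[est_pos:est_pos + length], genome_pos)
            if hit != -1:
                blocks.append((hit, hit + length))
                est_pos += length
                genome_pos = hit + length
                break
        else:
            raise ValueError("EST does not match this genomic sequence")
    return blocks

# Made-up sequences for illustration only.
exon1, intron, exon2 = "ATGGCCATTG", "GTAAGTTTTTTTCAG", "CATCCTGA"
genome = exon1 + intron + exon2
est = exon1 + exon2  # spliced transcript: the intron is absent
blocks = map_est_to_genome(genome, est)
print(blocks)
```

The two blocks correspond to the exons, and the genomic segment between them has no EST match.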
39
Q

What is the number of clones containing the same EST in one library PROPORTIONAL to?

A

-Proportional to the transcriptional activity of the gene
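A minimal sketch of that proportionality, assuming each sequenced clone has already been assigned to a gene. The gene names and counts are invented for illustration.

```python
from collections import Counter

def relative_expression(clone_genes):
    """Estimate relative transcript abundance from the gene each sequenced clone maps to."""
    counts = Counter(clone_genes)
    total = sum(counts.values())
    return {gene: n / total for gene, n in counts.items()}

# Hypothetical library: one entry per sequenced clone.
library = ["geneA"] * 6 + ["geneB"] * 3 + ["geneC"]
expression = relative_expression(library)
print(expression)
```

Here geneA accounts for six of ten clones, so it would be read as the most actively transcribed gene in that cell or tissue.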

40
Q

Do ESTs have a 5’ end matching the transcriptional start point of its gene?

A

NO

41
Q

Do ESTs represent genes active in EVERY CELL?

A

NO

42
Q

What does the program UniGene do?

A
  • Matches ESTs from various sources and organises them into transcript families
43
Q

Is each Unigene entry a collection of ESTs derived from MULTIPLE GENES or a SINGLE GENE?

A
  • SINGLE GENE!
44
Q

What are microarrays used for (in general)?

A
  • to assess where, when, and how many genes are expressed in specific cells or tissues
45
Q

What does ‘deep sequencing’ rely on?

A
  • ESTs
46
Q

What does having no hits in one section of an encode read mean?

A
  • Alternative splicing has occurred (e.g. Exon 4 removed)
47
Q

What is the simplest and BEST way to determine if a gene is real?

A
  • Identification of a MATCHING RNA transcript (determine transcription start and end points AND to map intron/exon boundaries)
48
Q

What does transcriptomics via deep sequencing enable?

A
  • The simultaneous identification and study of THOUSANDS of transcripts produced by a specific cell or tissue
49
Q

What are the two methods that allows transcripts from MANY genes to be assessed simultaneously?

A
  1. Microarray analysis

2. RNA deep sequencing

50
Q

What are the two methods that allow for transcripts from a SINGLE gene to be assessed?

A
  1. In situ hybridisation

2. Reverse transcriptase (RT) PCR and real time quantitative (q)PCR

51
Q

What occurs in the single gene transcript method of in situ hybridisation?

A
  • Labelled DNA or RNA COMPLEMENTARY to target mRNA is soaked into the cell or tissue
  • Probe will SPECIFICALLY BIND to the target mRNA and identify where it is being produced
52
Q

What must be known for primer design in Reverse Transcriptase (RT) PCR (single gene analysis)?

A
  • The sequence of the target RNA must be known
53
Q

What does Deep Sequencing involve?

A
  • Analysing THOUSANDS OF GENES SIMULTANEOUSLY by preparing a cDNA library (purify the mRNA, bind the polyA fraction (mRNA), fragment the RNA, convert to cDNA by random priming (random hexamers and oligo(dT) primers), attach adaptors and sequence)
  • Then analysing millions of SHORT SEQUENCE READS (sequenced from cDNA fragments)
  • Match to genome reference DNA sequence
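The final matching step can be sketched as below, assuming exact matches on either strand of a short made-up reference. Real read mappers use indexed, mismatch-tolerant alignment and handle reads that span exon junctions, none of which is modelled here. The function names are hypothetical.

```python
def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq[::-1].translate(str.maketrans("ACGT", "TGCA"))

def map_reads(reference, reads):
    """Return {read: position} for reads that match the reference exactly on either strand."""
    hits = {}
    for read in reads:
        pos = reference.find(read)
        if pos == -1:
            # Try the opposite strand before giving up on this read.
            pos = reference.find(revcomp(read))
        if pos != -1:
            hits[read] = pos
    return hits

reference = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"  # made-up reference
reads = ["GCCATTG", "CTTTCAG", "TTTTTTT"]
hits = map_reads(reference, reads)
print(hits)
```

Counting how many reads land on each gene then gives a picture of which transcripts are present and at what level.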
54
Q

Can deep sequencing be carried out in parallel?

A
  • YES!
55
Q

What are 3 ways a protein can be detected in cells?

A
  1. Antibodies or other binding reagent (cell/tissue structure can be maintained)
  2. Enzyme activity (usually in cell or fluid extracts)
  3. Mass spectrometry/proteomics (usually in cell or fluid extracts)
56
Q

What is the difference between RNA and Protein analysis?

A
  • RNA analysis tells you where something is in general (e.g. it is in the neural system) BUT protein analysis tells you what SPECIFIC tissues it is in
57
Q

What can you use an antibody for in protein detection?

A
  • To PROBE for the location of the product IN or ON cells

- Pattern can suggest the structure or organelle that protein is associated with

58
Q

What is the process of using a reporter molecule and making a transgenic cell or animal to find out where and when the gene is expressed?

A
  • Identify and CLONE the genes PROMOTER
  • Join the PROMOTER to reporter protein coding sequence (e.g. GFP) to make a TRANSGENE
  • Introduce transgene into cell or animal and examine by microscopy
59
Q

What do homologous genes share?

A
  • A COMMON ancestor
60
Q

What is an orthologue?

A
  • A gene in a SEPARATE species that has the same biological properties and function (doing the same job)
61
Q

Where can orthologues be found?

A
  • Within conserved sequence segments (syntenic regions) when two genomes are compared
62
Q

What is a paralogue?

A
  • A related gene for the SAME species for which a function is known
63
Q

How are paralogues generated?

A
  • By GENE DUPLICATION
64
Q

What can knowing the function of a gene in one species suggest?

A
  • Can suggest the function of the CORRESPONDING GENE (orthologue) in another species
65
Q

Why can identification of orthologous genes be complicated?

A
  • They may be on DIFFERENT chromosomes in DIFFERENT species (during evolution of species, chromosome number and size change due to shuffling of large segments of DNA –> each segment contains multiple genes)
  • May be a number of similar genes (PARALOGUES) in the genome.
66
Q

What do inter-species comparisons of chromosomes reveal ?

A
  • They reveal segment boundaries and syntenic regions where orthologous genes are likely to be located.
67
Q

Can syntenic regions between two chromosomes be mapped?

A
  • YES
68
Q

What is the order of syntenic genes commonly conserved in?

A
  • Commonly conserved in syntenic blocks and paralogues may be found in the SAME region
69
Q

What is forward genetics?

A
  • Going from PHENOTYPE to GENOTYPE e.g. deafness
70
Q

What is reverse genetics?

A
  • Going from GENOTYPE to PHENOTYPE e.g. C.elegans deletion of Srp-6 –> Targeted mutations reveal function by altering the phenotype
71
Q

What are the 4 ways of approaching reverse genetics?

A
  1. Loss of function mutations (gene-knockouts and knock ins) –> Inactivate or silence the gene to destroy expression
  2. Change of function mutation (Replace the normal gene with an altered gene–> carrying a point mutation in cell or organism)
  3. Gain of function mutation (Express a gene at incorrect time, or incorrect tissue)
  4. Dominant negative mutation–> Specifically SUPPRESS protein function by making a COMPETING dysfunctional protein in cell
72
Q

What does RISC stand for and what does it do (also what process is it involved in)?

A
  • RNA-Induced Silencing Complex
  • Involved in RNA interference
  • Cleaves and inactivates the target mRNA (Expression reduced but NOT abolished)
73
Q

What are the two components of CRISPR-cas-9?

A
  1. targeting module (RNA)

2. Cas protein

74
Q

In which organisms can CRISPR/Cas editing be used?

A
  • In any organism where IVF technology exists
75
Q

What accounts for the majority of human sequence variation?

A
  • SNPs –> Single Nucleotide Polymorphisms (90%)
76
Q

What are 3 reasons for the 3% variation in two people?

A
  1. INDELs–> Large scale (kilobase) or small scale (several bp) INsertion or DELetion of nucleotides
  2. Differing numbers or positions of MOBILE GENETIC ELEMENTS e.g. L1
  3. Single Nucleotide Polymorphisms (SNPs)
77
Q

What is a polymorphism?

A
  • DNA variation present in >1% of people
78
Q

What is a mutation?

A

-A sequence present in <0.1% of people

79
Q

What is a haplotype?

A
  • unique combo of alleles that makes up an individual
80
Q

How many alleles do SNPs have?

A
  • 2
81
Q

What is the average number of SNPs per genome?

A
  • 4-5 million
82
Q

What can a SNP in the non coding region result in?

A
  • Possible gene regulation altering
83
Q

What can a SNP in the coding region result in that is synonymous (same aa)?

A
  • No effect
84
Q

What can a SNP in the coding region result in that is NOT synonymous?

A
  • NONSENSE (STOP) –> Prevents protein production

- MISSENSE (AA change) –> Alter protein structure
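The three outcomes for a coding SNP can be sketched with the standard genetic code. The function name is hypothetical. The example codons are illustrative: CGA (Arg) to CAA (Gln) is used as a stand-in for an Arg-to-Gln missense change, not as a statement of the exact codon in any particular gene.

```python
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated with T, C, A, G at each position ("*" = stop).
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def classify_coding_snp(ref_codon, alt_codon):
    """Classify a single-codon change as synonymous, nonsense or missense."""
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    # Any other amino acid change (a lost stop codon is also lumped in here for simplicity).
    return "missense"

print(classify_coding_snp("CGA", "CGG"))  # Arg -> Arg
print(classify_coding_snp("CGA", "TGA"))  # Arg -> stop
print(classify_coding_snp("CGA", "CAA"))  # Arg -> Gln
```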

85
Q

What is an example of a missense SNP?

A
  • Factor V needs to be degraded by the protease APC to STOP clotting
  • Autosomal DOMINANT missense SNP in FACTOR V gene changes Arg506 to Gln
  • APC can no longer degrade V
  • Deep vein thrombosis occurs
86
Q

What can SNPs be indicators for?

A
  • Disease risk such as Alzheimers (ApoE) (ApoE4 higher risk than ApoE2)
87
Q

What is linkage disequilibrium?

A
  • The “non-random association of alleles at different loci in a given population.”
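One common way to put a number on this non-random association is the coefficient D = p(AB) - p(A)p(B) and its normalised form r². The sketch below computes both from haplotype counts at two biallelic loci. The counts are invented, and the function assumes both loci are polymorphic (otherwise r² is undefined).

```python
def linkage_disequilibrium(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, r_squared) for two biallelic loci from the four haplotype counts."""
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total  # frequency of allele A at locus 1
    p_B = (n_AB + n_aB) / total  # frequency of allele B at locus 2
    d = p_AB - p_A * p_B
    r_squared = d ** 2 / (p_A * (1 - p_A) * p_B * (1 - p_B))
    return d, r_squared

# Alleles combine at random: no disequilibrium.
print(linkage_disequilibrium(25, 25, 25, 25))
# A always travels with B: complete disequilibrium.
print(linkage_disequilibrium(50, 0, 0, 50))
```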
88
Q

What is each recombined DNA segment known as?

A
  • Haplotype block (each carries unique string of SNPs)
89
Q

What can determining the SNP haplotype of an individual be useful to test?

A
  • Susceptibility to a specific disease
90
Q

Do SNPs have the ability to modify proteins and hence drug responses?

A
  • YES
91
Q

How can treatments in personalised medicine be customized for each individual?

A
  • By correlating medication, dosages and side effects specific to SNP profiles
92
Q

What is a route to personalised medicine via SNP analysis?

A
  • Haplotyping
93
Q

What two things does producing susceptibility/risk profiles for a BROAD RANGE of diseases or treatments for a particular individual require?

A
  1. A reference map of SNPs

2. Developing rapid and cheap screening methods to map at least 10 000 of these SNPs in a patient

94
Q

What does producing a reference map of SNPs involve?

A
  • Sequencing >100 individual human genomes to have a 95% confidence that all SNPs occurring at 1% or greater are MAPPED
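A back-of-the-envelope sketch of why a sample of this order is needed, under simplifying assumptions that are not from the card: each diploid individual contributes two independently sampled chromosomes, sequencing is error-free, and only the chance of seeing one given allele at least once is considered (not all SNPs at once).

```python
import math

def detection_probability(n_individuals, allele_freq):
    """Chance that an allele is seen at least once among 2n sampled chromosomes."""
    return 1 - (1 - allele_freq) ** (2 * n_individuals)

def min_individuals(allele_freq, confidence):
    """Smallest number of diploid individuals giving the requested detection probability."""
    chromosomes = math.log(1 - confidence) / math.log(1 - allele_freq)
    return math.ceil(chromosomes / 2)

print(detection_probability(100, 0.01))
print(min_individuals(0.01, 0.95))
```

Under this simple model, 100 individuals fall a little short of 95% for an allele at 1% frequency, which is consistent with the card's ">100".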
95
Q

What is HapMap short for?

A
  • Haplotype Map project
96
Q

What is the International Haplotype Map Project (HapMap)?

A
  • The first genome wide glimpse of genetic variation

- Describes common disease patterns of the human sequence

97
Q

What percentage of the genome is identical between two people?

A
  • 99.5%
98
Q

How many SNPs did the HapMap project characterise?

A
  • 600 000 SNPs! (1 SNP per 5 kb of genome)
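The spacing figure follows from simple division, assuming a haploid human genome of roughly 3 billion base pairs (an approximate value supplied here, not stated on the card).

```python
GENOME_SIZE_BP = 3_000_000_000  # assumed approximate haploid human genome size
HAPMAP_SNPS = 600_000           # figure quoted on the card

spacing_bp = GENOME_SIZE_BP / HAPMAP_SNPS
print(f"about 1 SNP per {spacing_bp / 1000:.0f} kb")
```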
99
Q

In the HapMap project, how many individuals were used for SNP identification?

A
  • 270 individuals from 4 ethnic groups
100
Q

Are SNP microarrays a thing?

A
  • YEAH!
101
Q

What is the principle of SNP microarrays?

A
  • Gene chip –> has thousands of spots, each with single-stranded (ss) 25-base reference DNA molecules (oligonucleotides)
  • Each reference DNA is COMPLEMENTARY to a SNP allele
  • Oligonucleotides are printed onto the chip and synthesized DIRECTLY onto it
  • Genomic DNA to be tested is FRAGMENTED, AMPLIFIED as single strand, LABELED, and put on the chip
  • Binding (hybridisation) conditions favour perfect matching between probe DNA and chip DNA
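The principle above can be sketched as follows, assuming perfect-match-only hybridisation and using 9-base sequences instead of 25 for readability. The allele names, sequences and function names are invented for illustration.

```python
def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq[::-1].translate(str.maketrans("ACGT", "TGCA"))

def call_genotype(fragments, allele_targets):
    """Report which allele probes find a perfectly complementary sample fragment.

    allele_targets maps an allele name to the genomic sequence around the SNP;
    the probe on the chip is the reverse complement of that sequence.
    """
    called = []
    for allele, target in allele_targets.items():
        probe = revcomp(target)
        # A fragment binds only if it contains the exact complement of the probe.
        if any(revcomp(probe) in fragment for fragment in fragments):
            called.append(allele)
    return "/".join(sorted(called)) or "no call"

targets = {"A": "GGATACCTA", "G": "GGATGCCTA"}  # differ only at the central SNP base
heterozygous = ["TTGGATACCTATT", "CCGGATGCCTAGG"]
homozygous = ["TTGGATACCTATT"]
print(call_genotype(heterozygous, targets))
print(call_genotype(homozygous, targets))
```

A single mismatched base is enough to prevent a call for that allele, which is why stringent hybridisation conditions matter.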
102
Q

What are two words to describe modern SNP arrays?

A
  • Complex

- Redundant

103
Q

Roughly how many SNPs can be interrogated on a SNP chip simultaneously?

A
  • > 90 000
104
Q

What reduces false positives in SNP arrays?

A
  • Each SNP position is represented by up to 40 different BUT overlapping (tiled) DNA sequences
105
Q

What does the Affymetrix Genome Wide Human SNP Array chip have a median inter-SNP distance of?

A
  • 0.7kb
106
Q

What can the Affymetrix Chip be used in?

A
  • GWAS (genome wide association studies)

- e.g. 7 diseases: 2000 patients: 3000 controls

107
Q

Roughly how many SNPs does the Affymetrix chip have?

A
  • 500,568
108
Q

How can we apply the study of SNPs to Pharmacogenetics?

A
  • By studying the relationship b/w genetic variation (haplotype) and response to medications
109
Q

What can the study of SNPs with Pharmacogenetics result in?

A
  • Individuals reacting or responding to drugs differently
    Thus any patient may require DIFFERING DOSES compared to others
  • May be more or less susceptible to side-effects
110
Q

What is an example of identifying a SNP with relation to medication dosage?

A
  • VKORC1 (Vitamin K Epoxide Reductase) is usually inhibited by warfarin to control blood clotting disorders
  • However, a SNP in the VKORC1 promoter region (e.g. in Chinese populations) is associated with a LOW WARFARIN dose requirement
  • Therefore a safe warfarin dose can be predicted by determining a patient's VKORC1 haplotype
111
Q

How is SNP analysis better than STR analysis for crime scenes and ancestry?

A
  • SNP analysis is CHEAPER and can identify lots of SNPs

- SNP analysis also has lower mutation rates than STR analysis (changes less over time) –> for identifying relatives

112
Q

What is a disadvantage of SNP analysis and from this, what will give a more complete picture of the genome?

A
  • Does not give all genetic information on individual
    e.g. no info on variations –> INDELs and mobile elements
  • Routine sequencing of genome will give more complete picture
113
Q

What does the genomics revolution encompass?

A
  • The fact that over time gene sequencing is getting cheaper
114
Q

Which company does ULTRA FAST GENOME SEQUENCING and how much does it cost per instrument?

A
  • Illumina
  • 10 million per instrument
  • 18 000 genomes per year
115
Q

What is whole Exome Sequencing? (WES)

A
  • For mendelian disorders linked to mutations in EXONS
  • Sample DNA is fragmented
  • Predetermined genomic fragments containing exons are isolated
  • Sequence compared to reference genomes
116
Q

What can Whole Exome Sequencing (WES) be used to detect?

A
  • RAPIDLY detect and diagnose RARE genetic disorders

- Especially those caused by a SINGLE GENE dysfunction

117
Q

What was the difference between the HapMap project and the 1000 genomes project?

A
  • 1000 genomes project was done AFTER the HapMap project and aimed to have a much more detailed catalogue of the human genome
118
Q

What type of sequencing did the 1000 genomes project use?

A
  • Whole Exome Sequencing
119
Q

What was found from the 1000 genomes project?

A
  • Each person carries around 250-300 loss of function variants in KNOWN GENES
  • 50-100 variants are implicated in INHERITED DISORDERS
  • Also the rate of de novo germline mutation is approximately 10^-8 per base, per generation.
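To get a feel for what that rate means per person, the arithmetic below multiplies it by an assumed diploid genome size of about 6 billion bases (two copies of a roughly 3 Gb genome; this size is an assumption supplied here, not a figure from the card).

```python
MUTATION_RATE = 1e-8                   # per base, per generation (from the card)
DIPLOID_GENOME_BP = 2 * 3_000_000_000  # assumed: two copies of a ~3 Gb genome

expected_new_mutations = MUTATION_RATE * DIPLOID_GENOME_BP
print(f"roughly {expected_new_mutations:.0f} new mutations per generation")
```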
120
Q

What is an example of an application of WES (Whole Exome Sequencing)?

A
  • In cancer treatment
  • Take the tumour and screen for differences then make personalised assays.
  • Can then be used to direct the treatment (e.g. inhibiting certain pathway)
121
Q

What is involved in stage 1 and stage 2 of the 1000 genomes project?

A

Stage 1: Whole Exome sequencing (genomes of 700 people from 25 populations)
Stage 2: Analyse 2500 genomes by whole GENOME sequencing

122
Q

What does Whole Genome Sequencing involve?

A
  • Sequencing EVERY base
  • DNA molecules are attached to primers on a slide and amplified so that local clusters are formed
  • 4 types (A,T,C,G) of reversible terminating nucleotides are added
  • Each nucleotide is fluorescently labelled with a different colour + attached to a BLOCKING group
  • 4 nucleotides COMPETE for binding sites on the template DNA to be sequenced (non-incorporated molecules wash away)
  • After each synthesis step, the laser removes the blocking group and probe, and the fluorescent colour (specific to one of the four bases) becomes visible
  • this allows for sequence identification and is repeated until ENTIRE DNA molecule is sequenced
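The cycle described above can be caricatured in a few lines, assuming one perfectly behaved cluster, one nucleotide incorporated per cycle, and arbitrary colour assignments chosen for illustration. The template is written in the order it is copied. The function name is hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
COLOUR = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}  # arbitrary, for illustration

def sequence_by_synthesis(template):
    """Simulate one cluster: each cycle incorporates one blocked, labelled nucleotide."""
    read, colours = [], []
    for template_base in template:
        incorporated = COMPLEMENT[template_base]  # only the complementary nucleotide pairs
        colours.append(COLOUR[incorporated])      # its colour identifies the base this cycle
        read.append(incorporated)                 # block removed, next cycle can begin
    return "".join(read), colours

read, colours = sequence_by_synthesis("TACG")
print(read, colours)
```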
123
Q

As of 2018, how many genomes had been sequenced?

A
  • 71 095 genomes