Genomics Flashcards

(328 cards)

1
Q

What is Tm?

A

Melting temperature: the temperature at which 50% of the strands have separated.

To read it off a melting curve, find half of the maximum on the y-axis, read across to the curve, then drop down to the x-axis.
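
The graphical read-off can also be sketched in code. This is a minimal illustration, assuming the curve is given as the fraction of strands separated (0 to 1) at each temperature and rises with temperature. The function name `estimate_tm` and the data points are hypothetical, not taken from the course.

```python
def estimate_tm(temps, fraction_separated):
    """Linearly interpolate the temperature at which 50% of strands have separated."""
    points = list(zip(temps, fraction_separated))
    # Walk along neighbouring points until the curve crosses 0.5.
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if f0 <= 0.5 <= f1 and f1 != f0:
            return t0 + (0.5 - f0) * (t1 - t0) / (f1 - f0)
    raise ValueError("curve never crosses 50% separation")


# Made-up melting curve: temperature in degrees Celsius, fraction separated.
temps = [60, 70, 80, 90]
fractions = [0.0, 0.2, 0.8, 1.0]
tm = estimate_tm(temps, fractions)
print(f"Estimated Tm: {tm:.1f} degrees Celsius")
```

Linear interpolation between the two points either side of the halfway mark is the same thing as measuring across and dropping down on a hand-drawn graph.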

2
Q

What is hyperchromicity

A

The effect whereby single-stranded DNA absorbs UV light to a greater extent than double-stranded DNA.

3
Q

What happens under high stringency

A

Only fully complementary sequences form stable duplexes. High stringency is achieved by a temperature near Tm or a low salt concentration.

4
Q

why is Genomics important, why study it?

A

We are now able to treat monogenic diseases such as sickle cell disease. We can identify the point mutation, the nucleotide it affects and the amino acid that changes. The disease can be treated by stem cell transplantation, but only in a small number of patients. Targeted genome editing can provide a permanent cure by correcting the mutation in stem cells that can then be transplanted.

5
Q

meaning of omics

A

Omics aims at the collective characterization and quantification of pools of biological molecules that translate into the structure, function, and dynamics of an organism or organisms.

6
Q

what is genomics

A

the study of the entire DNA sequence that contains the complete set of genes for an organism

7
Q

what is genetics

A

the study of how traits are passed down the generations and the role of genes in that process

8
Q

transcriptome

A

the total RNA content of a cell, produced by transcription

9
Q

proteome

A

the total protein content of a cell, produced by translation.

10
Q

meaning of transcriptomics

A

study of all RNA transcripts produced by a cell, tissue or organism

11
Q

benefits of using microarrays rather than next generation sequencing

A
  • cheaper than NGS as microarrays cost £10-100, whereas NGS costs £100 to £1000
  • GWAS is carried out using this technology (genome wide association study).
12
Q

mitochondrial genome

A

16kbp, many diseases associated with variants

13
Q

epigenome

A

changes in marks on the DNA strand or in histones, has some disease associations

14
Q

metagenome

A

genomes of all the organisms from a specific location. Has some disease associations.

15
Q

microbiome

A

all organisms in a specific location, eg microbiome of gut

16
Q

meaning of recombinant

A

containing a different combination of alleles; produced by combining genetic material from different places.

17
Q

difference between pyrimidine and purine

A

Pyrimidines have one nitrogen-containing ring; purines have two fused nitrogen-containing rings.

18
Q

Which Watson-Crick pairing is stronger, GC or AU?

A

GC, because a GC pair has 3 hydrogen bonds whereas an AU pair has 2.

19
Q

what bonds are within base stacking of DNA

A

Hydrophobic interactions: the bases are stacked above one another in the interior of the structure, which excludes water.

20
Q

BONDS IN DNA

A

Hydrogen bonds between base pairs; phosphodiester bonds linking the sugar-phosphate backbone; hydrophobic interactions, with the bases stacked above each other in the interior of the structure, excluding water; and Van der Waals forces, which are individually small but contribute to stability.

21
Q

denaturation of DNA

A

conversion of a double stranded molecule into a single stranded molecule. It is by disruption of hydrogen bonds within the double helix, occurs when DNA in solution is heated, can also be induced by strong alkali or urea. On denaturation it forms a randomly structured coil, moving and changing shape constantly.

22
Q

What factors does Tm depend on?

A
  • number of hydrogen bonds,
  • GC content (GC pairs have an extra hydrogen bond, hence the more GC pairs, the more hydrogen bonds within the structure)
  • length of the DNA molecule, although there is little further contribution beyond 300 bp (on a graph the curve begins to saturate)
  • salt concentration
  • pH (alkali is a denaturant)
  • mismatches (unmatched base pairs)
23
Q

What is the effect of increasing salt concentration on base pairing

A

high salt reduces the specificity of base pairing at a given temperature, so a duplex containing mismatches can form and be stable at a given temperature in the presence of high salt concentration, whilst the same duplex would be unstable and dissociate at the same temperature in low salt.

24
Q

examples of chemical denaturants that disrupt hydrogen bonds?

A

Alkali, formamide, urea

25
why does alkali disrupt hydrogen bonds
NaOH → Na+ + OH-. The OH- ion disrupts hydrogen-bond pairing between bases. Fewer hydrogen bonds mean lower stability of the structure, and so a lower Tm.
26
mismatch
base pair combination that is unable to form hydrogen bonds.
27
effect of mismatch on DNA stability
Reduces the number of hydrogen bonds, hence lowers stability and Tm. It distorts the structure and destabilises adjacent base pairing, which can lead to zipping and unzipping. It also makes the formation of a duplex less energetically favourable, reducing the change in free energy on duplex formation. Finally, it creates shorter contiguous stretches of double-stranded sequence, leading to a lower Tm.
28
what is reverse of denaturation called and what factors cause it
renaturation, caused by cooling or neutralisation.
29
how to prevent mismatches forming between 2 molecules
performing a hybridisation at the Tm of the duplex molecule.
30
Stringency
The concept of manipulating the conditions so that only perfectly matched duplexes are selected. The conditions are adjusted to limit hybridisation between imperfectly matched sequences, which lets us control specificity, e.g. by changing the temperature or the environment (adjusting the amount of denaturant).
31
low vs high stringency
Low stringency is high salt and low temperature, so duplexes containing mismatches are tolerated. High stringency is low salt or a temperature near Tm, so only perfectly matched duplexes form.
32
examples of nucleic acid based techniques
Northern blotting, Southern blotting, microarrays, dideoxy and next-generation sequencing, PCR, cloning.
33
nucleic acid hybridisation techniques
Identifies the presence of NA containing a specific sequence of bases. Allows the absolute or relative quantitation of these sequences in a mixture.
34
what is a probe
Probes are usually between 20-1000 bases in length, depending on the technique it is used for. A probe is a sequence that uniquely identifies specific sequences, which under high stringency conditions form a duplex.
35
Nucleic acid blotting techniques (northern/southern blotting), disadvantages
Analysis of mRNA or DNA, can be used to identify specific RNA. Limited technique, only detects one gene at a time and small numbers of samples. It is very time consuming and messy, hence is largely superseded by quantitative PCR or microarrays.
36
Process of northern or southern blotting
Uses RNA (northern) or DNA (Southern) that is extracted and separated by gel electrophoresis, then transferred by capillary flow to a nylon or nitrocellulose membrane, which captures the nucleic acid. The nucleic acid is covalently bonded to the membrane and then hybridised with a labelled probe that binds its target (e.g. an mRNA transcript) in the sample.
37
Microarrays
An ordered assembly of thousands of nucleic acid probes. The probes are fixed to a solid surface, then the sample of interest is hybridised to the probes. A microarray can simultaneously measure 50,000 different transcripts in a cell, tissue or organ.
38
what are microarrays used for
Gene expression profiling, e.g. comparison of drug-treated and untreated cells. RNA is extracted, labelled and hybridised to the array, and the amount and location of the label are measured. This tells us how much of each and every transcript in the human genome is being expressed. Microarrays can also assess the presence or absence of millions of individual SNPs simply through hybridisation of genomic DNA to an array, which is how they are used in genome-wide association studies.
39
what does GWAS stand for?
Genome Wide Association studies (GWAS).
40
what is the formula for GC percentage
GC% = (G + C) / (G + C + A + T) × 100
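
As a worked instance of the formula, here is a small sketch that counts bases in a sequence string. The function name `gc_percent` and the example sequence are illustrative assumptions.

```python
def gc_percent(sequence):
    """Return (G + C) / (G + C + A + T) x 100 for a DNA sequence string."""
    seq = sequence.upper()
    counts = {base: seq.count(base) for base in "GCAT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return (counts["G"] + counts["C"]) / total * 100


# Hypothetical 6-base sequence: 4 of its 6 bases are G or C.
print(round(gc_percent("ATGCGC"), 1))
```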
41
Nucleic acid hybridisation techniques
Identifies the presence of nucleic acid containing a specific sequence of bases. Allows the absolute or relative quantitation of these sequences in a mixture.
42
Disadvantage of nucleic acid blotting techniques
Analysis of mRNA or DNA. Limited technique: it only detects one gene at a time and handles small numbers of samples. The gel-based techniques are time consuming and messy. Largely superseded by quantitative PCR.
43
Antibiotics
Substances produced by fungi which are toxic to bacteria but not fungi are called antibiotics
44
What does an exponential amplification require
2 primers corresponding to ends of sequence
45
What is PCR
PCR is an enzyme-based method to specifically amplify segments of DNA using a thermostable DNA polymerase in a cyclical process.
46
what is a chain reaction
A chain reaction is a series of events, each one of which depends upon the preceding event to sustain itself: a series of reactions that leads to an exponential increase in the number of events occurring in a sequence.
47
role of DNA dependent DNA polymerase
It recognises a specific structure consisting of a partially double stranded DNA forming an initiation complex with it. It then extends a partially double stranded molecule from the 3' end of the non-template strand.
48
How do we ensure that annealing occurs rather than renaturation in PCR
provide huge excess of the primer, ensuring template is in low concentration at start.
49
Enzyme used in PCR and its properties
DNA dependent DNA polymerase. It synthesises a new nucleic acid strand by copying a DNA molecule. It cannot copy RNA nor make RNA. RNA must first be copied to DNA by reverse transcription before it can be amplified by PCR.
50
What factors does DNA dependent DNA polymerase require in PCR
A template strand with a primer (20-10 bases long) annealed to it, Deoxy nucleotide triphosphates (dATP, dGTP, dCTP, dTTP), Magnesium ions are required as a cofactor for the enzyme, a roughly neutral pH.
51
What are the 3 stages of PCR
1. Denaturation (the template becomes single stranded). 2. Annealing (formation of the initiation complex: a duplex between the primer and the template strand). 3. Extension (conditions, including temperature and pH, are optimal for enzyme activity and for extension of the initiation complex).
52
property of the polymerase used in PCR
Must be thermostable. Taq polymerase is derived from a thermophilic bacterium called Thermus aquaticus. Thermostability matters because, for PCR to work, the reaction must go through multiple rounds of extreme heating and cooling.
53
meaning of thermostability
able to retain activity, upon repeated heating to temperatures that would 'destroy' most enzymes.
54
Process of PCR
  1. Mix all the reactants for the PCR, including the enzyme and an excess of primers.
  2. Start the cycle with denaturation: heat the reaction to 95 degrees Celsius to denature the template and break the hydrogen bonds between bases.
  3. Cool the reaction to 55 degrees Celsius to allow annealing, so the primers can bind to the template strands. The two primers bind at opposite ends of the target DNA, each to the template strand it is complementary to.
  4. Raise the temperature to 72 degrees Celsius, which is optimal for the DNA polymerase. An initiation complex is formed and the polymerase elongates from the 3' end of the primer, creating a second strand.
  5. Repeat the cycle.
55
How to calculate product in each cycle
Every cycle results in a doubling of the product, so there is an exponential accumulation: after n cycles there are 2^n copies per starting molecule. After the 3rd cycle there are 8, after the 10th 1,024, and after the 30th about one billion.
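
The doubling arithmetic can be checked with a short sketch. It assumes ideal amplification (every molecule is copied in every cycle), which real reactions only approximate before they plateau. The function name `pcr_copies` is hypothetical.

```python
def pcr_copies(cycles, starting_copies=1):
    """Ideal PCR yield: the product doubles on every cycle."""
    return starting_copies * 2 ** cycles


for n in (3, 10, 30):
    print(n, pcr_copies(n))
```

After 30 cycles the ideal yield is 1,073,741,824 copies per starting molecule, which is the "one billion" quoted on the card.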
56
PCR applications
Diagnostics: a routine diagnostic tool used for identification, confirmation and quantification of specific DNA sequences. Presence/absence calling, e.g. detection of TB in sputum and determining treatment response or drug efficacy. Differentiating between closely related organisms, e.g. swine flu vs human influenza, both H1N1 subtypes. Quantifying how much is present to determine when treatment might be commenced, e.g. HIV viral load.
57
What are real time PCR or quantitative PCR?
Quantitative PCR detection methods used for diagnostics, in which the accumulation of product is detected as the reaction proceeds.
58
How to detect SNP using PCR
Adaptations of quantitative real-time PCR. These methods depend upon the differences in the melting temperature (Tm) conferred upon short sequences of DNA by their nucleotide composition. They rely upon differences in the Tm of a duplex containing a single nucleotide mismatch (single nucleotide polymorphism).
59
PCR in forensics and law enforcement
Amplification of genetic markers:
  • parentage or kinship: immigration and inheritance
  • identification: military casualties, missing persons or environmental disasters
  • matching 2 sources: crime scene
  • authentication of biological material: cell lines, purity of food
60
STRs
Short tandem repeats: sequences 2-5 bases in length, repeated many times at specific locations in the genome. Many different STRs are found scattered around the genome. The UK database identifies individuals using 10 STRs. STRs are highly polymorphic: the number of repeats varies between individuals, but they are inherited and so are similar between siblings and parents. They provide a pattern of uniquely sized products from each individual's genome, giving a molecular bar code, or DNA fingerprint.
61
How many STRs does the UK DNA database contain?
10 STRs. Each STR has two alleles that can differ in size, giving 20 numbers; together with a gender indicator they give a matching probability of around 1 in 1 billion.
62
why is PCR used prior to NGS
So that NGS can simultaneously sequence a large number of PCR products from candidate cancer genes.
63
Examples of use of PCR
  • NGS: simultaneously sequencing a large number of PCR products from candidate cancer genes.
  • Isolating individual segments of DNA prior to cloning or sequencing.
  • Manipulating and modifying DNA: introducing mutations into a sequence of DNA, or modifying the ends of a sequence so that they contain restriction sites compatible with cloning vectors.
  • Recombinant DNA technology, where PCR is one of the most commonly used and important tools, e.g. developing recombinant vaccines and pharmaceuticals (interferons, clotting factors, tissue plasminogen activator).
64
Define Creationism
The idea that species are made by a supernatural intelligent creator.
65
what 2 things does science have to assume
Natural phenomena have natural explanations, which can be studied by scientific experiments.
66
what does a scientific theory need?
Make testable predictions, stand or fall according to whether the predictions are confirmed or refuted. Popper: 'a scientific theory must be falsifiable'
67
what is relative fitness?
The average number of surviving progeny of a genotype (compared with competing genotypes) after one generation.
68
w<1 and w>1 for relative fitness
If w<1, the frequency of the allele will decrease with each generation until the allele disappears (negative selection). If w>1, the frequency of the allele will increase with each generation until the allele reaches fixation (positive selection).
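
The two cases can be illustrated with a small simulation. This is only a sketch under a stated assumption: a simple haploid selection model in which the allele has relative fitness w and the competing allele has fitness 1, so the frequency p becomes p·w / (p·w + (1 − p)) in the next generation. The model and the function names are illustrative choices, not taken from the course.

```python
def next_frequency(p, w):
    """One generation of selection in an assumed haploid model (competitor fitness = 1)."""
    return p * w / (p * w + (1 - p))


def simulate(p0, w, generations):
    """Return the allele frequency in each generation, starting from p0."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], w))
    return freqs


# w < 1: the allele declines; w > 1: the allele heads towards fixation.
print(round(simulate(0.5, 0.9, 50)[-1], 4))
print(round(simulate(0.5, 1.1, 50)[-1], 4))
```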
69
examples of how small mutations and large mutations can occur
small: base substitutions, small insertions and deletions. Large mutations are via large dna duplications, large deletions, insertion of transposable elements, viral insertions, chromosome rearrangements.
70
How does gene duplication drive evolution?
Gene duplication is a major driving force of evolution. Once a gene has been duplicated, one copy can continue to maintain the original function whilst the other can evolve new functions. There are likely to be changes both in the coding sequence (in amino acid sequence) and in control sequences.
71
How is it possible that γ-globin genes are expressed during foetal life and β-globin genes are expressed during postnatal life?
The promoter was duplicated along with the coding sequence. The promoter sequences have since evolved, so the β and γ promoters now bind different transcription factors and interact differently with gene enhancers. This gives differential control of the β and γ genes.
72
what is a pseudogene?
A gene that cannot make a functional protein. For example, a pseudogene can arise as a duplication of the β-globin gene: one copy maintains the original function whilst the other can lose all function.
73
Fanconi's anaemia
Recessive lethal genetic disorder, most affected patients die of bone marrow failure during childhood. Do not reproduce. Gene arises by random mutation, eliminated by natural selection, very low allele frequency.
74
what is modern synthesis
Modern synthesis refers to the combination of natural selection with Mendelian genetics. Evolution can be seen as a logical consequence of Mendelian inheritance and ecological competition.
75
what types of genes does gene duplication create?
A redundant copy of gene, which can evolve to gain new functions eg globin genes. But other duplicated genes may become pseudogenes.
76
What phrase explains why many genetic diseases are extremely rare?
Evolutionary theory explains why many genetic diseases are extremely rare, and how others are maintained at higher frequencies by positive selection, particularly by heterozygote advantage.
77
Why do the log base 2 and linear curves show a plateau of PCR products?
As the reaction progresses it acidifies, because hydrogen ions are released as nucleotides (e.g. dAMP) are added during elongation; pyrophosphate is also produced. Each cycle incorporates primers into the product, so the primers are depleted while the template concentration increases. As a consequence, the environment in which the polymerase works changes. Acidification and depletion of reactants alter the kinetics and reduce polymerase activity, so the reaction reaches a plateau (the green region of the curve) where product is no longer made. Eventually the reaction stops.
78
How would you use PCR to identify TB
Presence/absence calling for TB: detection in sputum, and determining treatment response or drug efficacy. If you take a sample and perform PCR with primers specific to TB, you can identify the presence or absence of DNA segments that correspond to TB, and so the presence or absence of that organism within the sample.
79
How would you use PCR to differentiate between organisms
Differentiating between closely related organisms, e.g. “swine flu” vs human influenza, both H1N1 subtypes. This allows us to understand the epidemiology and how to treat them.
80
How would you use PCR to determine when to treat HIV?
PCR measures how much virus is present (the HIV viral load). With HIV we do not treat until the viral load increases to a certain level; then we commence treatment. Viral load assays are also used to monitor for treatment failure, where resistance emerges as a consequence of mutations within the population of organisms present in an individual. HIV viral loads are done routinely in order to determine when and how we treat an individual.
81
How do we perform quantitative or real time PCR
We run a serial dilution of template of known quantity alongside the assay. By comparing the assay results with these standards, we can identify the amount of template that we started with. These techniques use fluorescent detection of the accumulation of product. The crossing point of the amplification curve is determined; it depends on the template concentration at the start (more starting template gives an earlier crossing point).
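
The standard-curve idea can be sketched as follows. This assumes the usual qPCR treatment in which the crossing point is a straight-line function of log10 of the starting quantity. The dilution series, crossing points and function names are made-up illustrations, not course data.

```python
import math


def fit_standard_curve(quantities, crossing_points):
    """Least-squares line: crossing_point = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(crossing_points) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, crossing_points))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_quantity(crossing_point, slope, intercept):
    """Invert the standard curve to get the starting quantity of an unknown."""
    return 10 ** ((crossing_point - intercept) / slope)


# Hypothetical 10-fold serial dilution (copies) and measured crossing points.
standards = [1e6, 1e5, 1e4, 1e3]
crossing = [15.0, 18.3, 21.6, 24.9]
slope, intercept = fit_standard_curve(standards, crossing)
unknown = estimate_quantity(20.0, slope, intercept)
print(f"slope={slope:.2f}, estimated starting copies={unknown:.0f}")
```

The negative slope reflects the point made on the card: the more template at the start, the earlier the crossing point.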
82
SNP detection from PCR
High resolution melting (HRM): at the end of PCR we perform a melt curve, heating the reaction up, slowly cooling it and measuring the annealing of the template. The Tm is affected by the particular sequence within the amplicon, so different variants give different curves (the right-hand diagram on the slide shows curves for a number of different variants of one amplicon). By comparing the curve of a sample amplicon against those of known variants, we can identify the particular SNP that is present. Allelic discrimination: specific binding of a probe to the amplified region containing the SNP is detected.
83
what is allelic discrimination, or probe based version of qPCR
Process where specific binding of the probe to the amplified region containing the SNP is detected.
84
what does HRM detect
Tm of the amplified product is used to determine which sequence is present.
85
what does it mean if STRs are polymorphic?
They are highly polymorphic meaning that they vary from one individual to another but are inherited and are similar between siblings and parents
86
how does the UK dna database identify individuals
should be on email.
87
what is synonymous substitution
A synonymous substitution is a base substitution that doesn't cause a change in amino acid sequence.
88
Non-synonymous substitution
Mutation that does cause a change in amino acid composition.
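
The difference between the two kinds of substitution can be shown with a tiny sketch. It uses only three codons from the standard genetic code (GAG and GAA both encode glutamic acid, GTG encodes valine). The partial table and the function name are illustrative.

```python
# Partial codon table: just the three codons needed for this example.
CODON_TABLE = {"GAG": "Glu", "GAA": "Glu", "GTG": "Val"}


def classify_substitution(original_codon, mutant_codon):
    """Compare the amino acids encoded before and after a base substitution."""
    before = CODON_TABLE[original_codon]
    after = CODON_TABLE[mutant_codon]
    return "synonymous" if before == after else "non-synonymous"


print(classify_substitution("GAG", "GAA"))  # third base changes, still Glu
print(classify_substitution("GAG", "GTG"))  # Glu replaced by Val
```

The GAG to GTG change is the glutamic acid to valine substitution described on the sickle cell card.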
89
Sickle cell Anaemia
Point mutation in the β globin gene. Single amino acid substitution: a hydrophilic amino acid (glutamic acid) is replaced by a hydrophobic amino acid (valine) at position 6. The mutant haemoglobin forms crystals that damage the red cell membrane, resulting in cell lysis (causing anaemia) and cell adhesion (causing blockage of small blood vessels, followed by tissue infarction).
90
what does relative fitness determine?
Relative fitness, w, will determine whether the frequency of an allele increases or decreases over generations.
91
Alpha 2 beta 2
Adult hb
92
Alpha 2 delta 2
Minor adult hb, less than 1% in us
93
Alpha 2 gamma 2
Foetal hb
94
What is NAH (nucleic acid hybridisation)?
Mixing DNA from two sources that has been denatured by heat or alkali to make it single stranded, then, under certain conditions, allowing complementary base pairing of homologous sequences.
95
In what way are single stranded DNA sequences listed?
5' to 3'
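
One practical consequence of the 5' to 3' convention is that the complementary strand has to be reversed before it is written down. A minimal sketch, with a hypothetical function name:

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, also written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


print(reverse_complement("ATGC"))
```

Here the complement of 5'-ATGC-3' pairs as 3'-TACG-5', which is written 5'-GCAT-3'.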
96
what are histones
Basic proteins that bind DNA. Eight histones form the core of the nucleosome. Histone H1 binds the linker DNA.
97
exome
The sum of all the exon sequences of the genes in the genome
98
cis linked
regions physically close to the exons on the DNA strand. Contrast with trans regulatory regions that can be on different chromosomes.
99
size of human genome
3 × 10^9 bp (3 Gbp)
100
where are pseudo genes found in DNA
Intergenic region
101
What are the three RNA polymerases and their roles in transcription?
  • RNA polymerase I: transcribes rRNA genes
  • RNA polymerase II: transcribes mRNA
  • RNA polymerase III: transcribes tRNA and other small RNAs
102
Introns in genes
Vary in number (0-311) and in size (30 bp to 1 Mbp). Some introns contain other genes.
103
enhancers
upregulate gene expression-they are short sequences that can be in the gene or many kilobases distant. They are targets for transcription factors (activators)
104
silencers
downregulate gene expression. They are also position-independent and are also targets for transcription factors (repressors)
105
Insulators
short sequences that act to prevent enhancers/silencers influencing other genes
106
3 stages of modification of eukaryotic mRNA
  1. Capped at the 5' end (methylated cap)
  2. Polyadenylated at the 3' end
  3. Intervening sequences (introns) removed
107
alternative splicing
Exons can be skipped or added so variations of a protein (called isoforms) can be produced from the same gene
108
process in which processed pseudogenes are copied from mRNA
Retrotransposition; because they are copied from processed mRNA, they have no promoter or introns
109
How does UK DNA database allow us to find individuals
Each allele of, for example, CSF1PO on chromosome 5 may contain between 6 and 15 repeats. The first STR therefore gives two numbers between 6 and 15: 10 possible values for the first allele and 10 for the second, i.e. 100 different combinations. VWA can have between 11 and 24 repeats, so similarly two sets of 14 numbers give 196 combinations. Combining these two STRs gives 19,600 possible combinations from just 2 STRs. If we use 10 STRs and a gender identifier we have more than 1 billion combinations. The UK DNA database holds more than 6 million individuals. PCR of the 10 STRs is done in the same way: primers flanking each STR are used, one of which is labelled. The PCR products are separated by capillary gel electrophoresis and the size of each product is determined, which in turn gives the number of repeats in each STR. These numbers are combined and compared with the database to try to find a match.
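
The counting argument can be reproduced in a few lines. This sketch counts the way the card does, treating the two alleles at a locus as an ordered pair (10 × 10 for CSF1PO, 14 × 14 = 196 for VWA); counting unordered genotypes instead would give smaller numbers. The function name is hypothetical.

```python
from math import prod


def allele_pair_combinations(min_repeats, max_repeats):
    """Ordered pairs of repeat counts for the two alleles at one STR."""
    possible_alleles = max_repeats - min_repeats + 1
    return possible_alleles * possible_alleles


# Repeat ranges quoted on the card.
str_ranges = {"CSF1PO": (6, 15), "VWA": (11, 24)}
per_locus = {name: allele_pair_combinations(lo, hi) for name, (lo, hi) in str_ranges.items()}
total = prod(per_locus.values())
print(per_locus, total)
```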
110
Why might the PCR product not be a multiple of 4 bases even if the repeated sequence consists of 4 bases?
The primers do not sit immediately next to the STR, so the amplicon includes flanking sequence. For example, TH01 is a repeat of 4 bases with a minimum of 4 repeats (16 bases), but the minimum amplicon size given is 163 bases. So there are an additional 147 bases between the ends of the STR and the ends of the primers.
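
The TH01 arithmetic can be written out as a sketch. It assumes whole repeats only and uses the 147 flanking bases worked out above. The function names are hypothetical.

```python
def amplicon_size(repeat_count, repeat_unit_length, flanking_bases):
    """Amplicon length = flanking sequence (including primers) + the repeats themselves."""
    return flanking_bases + repeat_count * repeat_unit_length


def repeats_from_amplicon(size, repeat_unit_length, flanking_bases):
    """Work backwards from a measured product size to the number of repeats."""
    repeat_bases = size - flanking_bases
    if repeat_bases < 0 or repeat_bases % repeat_unit_length:
        raise ValueError("size does not correspond to a whole number of repeats")
    return repeat_bases // repeat_unit_length


# TH01 figures from the card: 4-base repeat, minimum 4 repeats, 147 flanking bases.
print(amplicon_size(4, 4, 147))
print(repeats_from_amplicon(163, 4, 147))
```

This is the same step the database workflow relies on: product size from capillary electrophoresis is converted back into a repeat number.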
111
percentage of our genome that codes for protein
2%
112
what are major macro-level differences associated with?
Disease (aneuploidy, translocations, etc)
113
what are micro or molecular level pathogenic differences associated with and give example of these differences
Micro-molecular level pathogenic difference is sometimes associated with disease (point mutation and SCA, 3bp deletion in CFTR)
114
WHAT IS polymorphic
any position in the genome that varies between individuals is considered polymorphic=a variant
115
if we compare human genomes, how many SNPs will we find?
single base differences once every 300 bases
116
how are SNPs made
They are typically generated by faulty replication of DNA during mitosis. Although there are mismatch repair mechanisms, some of these mistakes do not get repaired.
117
what is polymerase slippage?
It is when the polymerase slips from the template strand during replication. It is this event that leads to codon expansions.
118
describe the polymerase slippage model
If the polymerase slips, it causes the new strand to unpair (release) from the template strand. If the slip occurs at the codon repeat region of the template, as in the Huntington gene, then when the new strand tries to reattach to the template strand it has many identical copies of the codon to choose from, so it can re-pair out of register and change the number of repeats.
119
SNV
Single nucleotide variant: a change in a nucleotide that is not corrected. SNVs may be in a gene, a promoter or a non-coding region. When pathogenic, they may be called point mutations. They are base substitutions, generally bi-allelic, arising from mutation that escapes mismatch repair. They may do nothing, may affect a trait, or may be associated with a disorder.
120
Why are some disease-causing mutations (e.g. the sickle cell mutation) common in African populations?
They are beneficial in places where malaria is rife (heterozygote advantage).
121
mutation
new allele arises, we now have a variant
122
gene flow
migration leading to introduction of that variant into another population
123
genetic drift
random change in variant allele frequency between generations
124
selection
non random change in variant allele frequency between generations because presence of one allele/genotype is pathogenic (negative selection) or beneficial (positive selection)
125
what does repeat in tandem mean?
one after the other.
126
copy number variants
intergenic, quite large.
127
variant effects
  • Can be beneficial
  • Can be pathogenic
  • Most are neutral
Are these of any use? Yes: they can be used as markers to help find disease-causing genes and mutations, via autozygosity mapping and linkage studies (microsatellites, SNPs) and association analysis (SNPs, CNVs).
128
allele
An allele is one version of a particular position or locus on the genome
129
what is a locus
unique position in genome. A single base to entire genomic region.
130
what is an allele
particular form of a specific locus. A single base to entire genomic region. An individual has 2 alleles for any autosomal locus.
131
2 possible alleles
biallelic
132
3 possible alleles
triallelic
133
greater than 3 alleles
multiallelic
134
what are pedigree drawings?
standardised set of symbols
135
representation of pedigree drawings
males are squares, females are circles. Partners have a line between them. Siblings have a line above them. there is a line down for children. Affected people are shaded. Carriers have dots in them
136
what does a double line between male and female represent?
a consanguineous couple
137
what does sb on pedigree diagram mean
stillborn baby of unknown sex
138
what is the risk of child having a disease from autosomal dominant parent
50% for each child.
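
The 50% figure can be checked with a Punnett-square style sketch. It assumes one heterozygous affected parent ("Aa", where "A" is the dominant disease allele) and one unaffected parent ("aa"). The genotype notation and function name are illustrative.

```python
from itertools import product


def offspring_risk(parent1, parent2, disease_allele="A"):
    """Fraction of equally likely allele combinations that carry the dominant allele."""
    outcomes = list(product(parent1, parent2))
    affected = sum(1 for genotype in outcomes if disease_allele in genotype)
    return affected / len(outcomes)


print(offspring_risk("Aa", "aa"))
```

Each child is an independent draw, which is why the risk is 50% for each child rather than "half of the children".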
139
meaning of penetrance
Percentage of individuals who carry the mutation and develop symptoms of the disorder. Many dominant disorders show age-dependent penetrance.
140
meaning of variable expressivity
variation in severity/symptoms of disorder between individuals with same mutation.
141
new mutation rate
The de novo mutation rate varies considerably between autosomal dominant (AD) conditions.
142
somatic mosaicism
new mutation arising at early stage in embryogenesis. It is present in only some tissues/cells.
143
germ line mosaicism
gonadal mosaicism. a new mutation arises during oogenesis or spermatogenesis. Mutation present in variable proportion of gametes; can be transmitted to offspring.
144
anticipation
worsening of disease severity in successive generations; characteristically occurs in triplet repeat disorders.
145
describe autosomal recessive inheritance
  • Manifests in homozygous or compound heterozygous form
  • Carriers (heterozygotes) are not affected
  • Both sexes affected
  • Male to female and female to male transmission
  • Usually one generation affected
  • There may be consanguinity, e.g. cousin marriages
146
compound heterozygote
2 mutations in the same gene. mutations are different
147
compound homozygote
2 mutations in same gene, identical mutations
148
X linked inheritance
women have 2 X chromosomes, so they have two copies of X-linked genes. Can be homozygous or heterozygous. Men have one X and a Y, therefore only a single copy of X linked genes, are hemizygous.
149
what is skewed X inactivation
Normally the majority of genes on one of a woman's X chromosomes are inactivated. Which X is inactivated is generally random, but ~10% of women have uneven, or skewed, X-inactivation.
150
what are manifesting carriers
some women have some symptoms in X-linked recessive conditions e.g. cardiomyopathy in DMD.
151
Y linked inheritance
always and only passed from fathers to sons
152
what is a pathogenic mutation
results in an alteration of the function of the gene product and can cause a disease phenotype
153
What are isoforms
Variations of a protein
154
Sense and antisense strand of dna
The sense (coding) strand has the same sequence as the RNA chain being made (with T in place of U), and the template strand is the antisense strand. Only one DNA strand of the double helix acts as the template strand.
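A minimal sketch of the relationship between the two strands and the RNA, assuming all sequences are written 5'->3'. The example sequence and function names are made up for illustration.

```python
# Sketch: relate the sense strand, the template (antisense) strand and the mRNA.
# All sequences are written 5'->3'; the example sequence is hypothetical.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(dna):
    """Return the complementary strand, also written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))

def transcribe_from_template(template):
    """Build the RNA complementary to the template strand (U replaces T)."""
    return reverse_complement(template).replace("T", "U")

sense = "ATGGCCTAA"
template = reverse_complement(sense)        # the antisense strand
mrna = transcribe_from_template(template)   # same letters as sense, with U for T

print(template)
print(mrna)
```

The RNA ends up matching the sense strand, which is why the sense strand is also called the coding strand.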
155
How do bacterial plasmids act as vectors ?
Using PCR to amplify DNA, restriction enzymes to cut it and DNA ligase to rejoin it, we can manipulate DNA and make and insert recombinant genes into plasmids. We can then transform bacteria with the plasmids, where they will replicate and be maintained. We can then isolate the transformed bacteria, which will express the recombinant gene.
156
what is recombinant protein
Recombinant proteins are proteins expressed from vectors so that they can be produced in large quantities for manufacturing use.
157
what are transgenic organisms
organisms that have altered genomes
158
what are nucleases
Enzymes that degrade nucleic acids by hydrolysing phosphodiester bonds. Ribonuclease (RNase): degrades RNA. Deoxyribonuclease (DNase): degrades DNA. Exonuclease: degrades from the end of a molecule. Endonuclease: cleaves within the nucleotide chain.
159
what are restriction endonucleases
"Restriction" refers to limiting the transfer of nucleic acids from infecting phages into bacteria. There are many different restriction enzymes from different bacteria. They do two things: they recognise a specific sequence, and they cut that sequence.
160
what do restriction enzymes do?
restriction enzymes recognise specific DNA sequences, and they catalyse the hydrolysis of phosphodiester bonds.
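As an illustration of "recognise, then cut", here is a rough sketch that digests a made-up linear sequence with EcoRI, which recognises GAATTC and cuts between the G and the first A. Only the top strand is modelled, and the function name is hypothetical.

```python
# Sketch: cut a linear DNA sequence at every EcoRI site (G^AATTC).
# Only the top strand is modelled; the example sequence is hypothetical.

SITE = "GAATTC"   # EcoRI recognition sequence
CUT_OFFSET = 1    # cut position within the site: between G and A

def digest(dna, site=SITE, offset=CUT_OFFSET):
    """Return the fragments produced by cutting at each occurrence of `site`."""
    fragments = []
    start = 0
    pos = dna.find(site)
    while pos != -1:
        cut = pos + offset
        fragments.append(dna[start:cut])
        start = cut
        pos = dna.find(site, pos + 1)
    fragments.append(dna[start:])
    return fragments

fragments = digest("AAGAATTCTTGAATTCGG")
print(fragments)
```

Joining the fragments back together gives the original sequence, which is in effect what DNA ligase does.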
161
what are restriction maps
map of restriction sites within a molecule. they are a useful way of describing plasmids
162
role of dna ligase
creates phosphodiester bonds
163
phosphatase enzyme
Hydrolyses a phosphate group off its substrate. Examples are calf intestinal alkaline phosphatase and shrimp alkaline phosphatase. It is used to prevent cut plasmids from resealing.
164
role of polynucleotide kinase
adds phosphate to 5' hydroxyl group of DNA or RNA.
165
Why use a polynucleotide kinase?
To phosphorylate chemically synthesised DNA so that it can be ligated to another fragment. To sensitively label DNA so that it can be traced, using radioactively labelled ATP or fluorescently labelled ATP.
166
reverse transcriptase
An RNA-dependent DNA polymerase, isolated from RNA-containing retroviruses. It synthesises a DNA molecule complementary to an mRNA template using dNTPs.
167
phages
bacterial viruses eg lambda
168
non-primate lentiviruses
vectors used to integrate dna in mammalian cells
169
Baculoviruses
vectors used in combination with recombinant expression in insect cells (a eukaryotic expression system)
170
vectors
Cut-down versions of naturally occurring plasmids, used as molecular tools to manipulate genes.
171
what are the important features of plasmid vectors
They can be linearised in non-essential regions of DNA. They can be re-circularised without loss of the ability to replicate. They replicate at high copy number. They contain selectable markers such as antibiotic resistance, e.g. ampicillin or tetracycline. They are relatively small, often between 4 and 5 kilobases.
172
Why do we produce recombinant proteins in bacteria
We use them to investigate their properties | To develop and produce therapeutics
173
How do plasmids allow us to add functionality, for example?
They allow us to: express a recombinant gene in an organism of our choice (prokaryote or eukaryote); modify its control elements, e.g. switch it on or off at will, or express it at high levels on demand; alter the properties of a gene product, e.g. to make it secreted extracellularly or into the periplasmic space; add a peptide tag or join it to another protein; make it useful as a therapeutic.
174
How much are recombinant proteins or peptides used in bio pharmaceuticals
30%
175
Examples of recombinant proteins in clinical use
Human insulin; interferons alpha and beta; erythropoietin, for kidney disease and anaemia; Factor XIII; tissue plasminogen activator, for embolism and stroke.
176
What is coding sequence in a gene
The coding sequence is the part of the gene coding for the protein, not including the UTRs nor any intronic or regulatory sequences such as promoters or enhancers.
177
What is the Shine-Dalgarno sequence
It is the ribosomal binding site found around 8 nucleotides before the start codon in the RNA in prokaryotes. Remember that the RNA of this group of organisms is not capped.
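A rough sketch of how one might look for such a site in an mRNA. The consensus core AGGAGG and the 5-10 nucleotide spacer window are illustrative assumptions rather than part of the card, and the example sequence is made up.

```python
# Sketch: look for a Shine-Dalgarno-like motif shortly upstream of a start codon.
# Assumptions: consensus core "AGGAGG" and a 5-10 nt spacer window (illustrative).

SD_MOTIF = "AGGAGG"

def find_sd_sites(mrna, motif=SD_MOTIF, min_gap=5, max_gap=10):
    """Return (motif_index, start_codon_index) pairs where the gap between
    the end of the motif and the AUG lies within [min_gap, max_gap]."""
    hits = []
    start = mrna.find("AUG")
    while start != -1:
        for gap in range(min_gap, max_gap + 1):
            motif_pos = start - gap - len(motif)
            if motif_pos >= 0 and mrna[motif_pos:motif_pos + len(motif)] == motif:
                hits.append((motif_pos, start))
        start = mrna.find("AUG", start + 1)
    return hits

mrna = "UUAGGAGGACAGCUACAUGGCU"   # hypothetical transcript, 8 nt spacer
hits = find_sd_sites(mrna)
print(hits)
```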
178
Two types of promoters
Constitutive: always on. Allows a culture of cells to express the foreign protein to a high level. Fine if the protein is not toxic to E. coli; a bad idea if it is. Inducible: a molecular switch. Allows large cultures to be grown without expressing the foreign protein; expression is induced in response to a defined signal.
179
Describe the use of inducible promoters as transcriptional depressors
Typically the lac operator is used, which is de-repressed by addition of the lactose mimic IPTG.
180
Why are some proteins best made in eukaryotes
Many pharmacologically useful proteins, e.g. interferons, are heavily modified, usually by glycosylation, and will not be appropriately processed in bacteria. Without this processing some proteins retain biological activity and some do not. Therefore they are expressed in a eukaryotic system.
181
What is reduced penetrance a characteristic of?
Characteristic of dominant inheritance
182
What is the meaning of pedigree
Family tree
183
How do You know a disease is not sex linked?
If there is an equal distribution of affected males and females
184
Role of restriction endonucleases
Recognise a specific sequence | Cut that sequence
185
autosomal dominant
Manifests in HETEROZYGOUS form. Multiple generations affected. Both sexes affected. Male-to-female and female-to-male transmission. Most will have an affected parent. 50% risk to offspring.
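The 50% risk to offspring can be checked with a small Punnett-square sketch. The allele labels ("A" for the disease allele, "a" for the normal allele) and the function name are illustrative choices.

```python
# Sketch: offspring genotype proportions for a single-gene cross.
# "A" = disease allele, "a" = normal allele (labels are illustrative).
from collections import Counter
from itertools import product

def cross(parent1, parent2):
    """Return the fraction of offspring with each genotype."""
    offspring = Counter("".join(sorted(pair)) for pair in product(parent1, parent2))
    total = sum(offspring.values())
    return {genotype: count / total for genotype, count in offspring.items()}

# Autosomal dominant: affected heterozygote x unaffected parent.
dominant = cross("Aa", "aa")
# Autosomal recessive (here "a" is the disease allele): two carriers.
recessive = cross("Aa", "Aa")

print(dominant)
print(recessive)
```

In the dominant cross half the offspring are heterozygous and affected. In the carrier cross a quarter are homozygous for the recessive allele.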
186
what is age dependent penetrance
Age-dependent penetrance: someone may be heterozygous for a disease allele and healthy, then develop the disease later in life.
187
what does mosaicism mean
A mixed population of cells: some carry the mutation and some do not.
188
what is pleiotropy
one gene influences 2 or more unrelated phenotypic traits.
189
dna polymerase slippage model
Sometimes the polymerase slips from the template strand during replication. It is this event, called polymerase slippage, that many researchers believe holds the key to codon expansions. According to the polymerase slippage model, if the polymerase slips, it causes the new strand to unpair (release) from the template strand. If the slip occurs at the template's codon repeat region of the Huntington gene, then when the new strand tries to reattach to the template strand, it will have many identical copies of the codon to choose from. With so many identical codon copies to reattach to, the new strand may reattach to the template at the wrong copy, usually one more distant than the copy that was adjacent to the polymerase before it slipped. As a result of this misplacement, the new strand forms a bubble of unpaired bases, which represents the expansion of the new strand. Once DNA replication is complete, an unknown mechanism allows the template strand to realign with the new strand and bring the bases from the bubble back into line with the template strand. The bases are then paired with their corresponding partner bases (cytosine (C) to guanine (G); adenine (A) to thymine (T)). In the end, the brand new double helix of DNA contains more CAGs in the repeat region of the Huntington gene than existed before. Polymerase slippage has caused expansion.
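A toy sketch of the outcome described above: counting CAG repeats before and after a slippage event. It only models the end result (extra repeat units in the new strand), not the mechanism, and the sequences and function names are made up.

```python
# Toy model: count CAG repeats and mimic the result of polymerase slippage.
import re

def longest_repeat(dna, unit="CAG"):
    """Number of units in the longest uninterrupted run of `unit`."""
    runs = re.findall(f"(?:{unit})+", dna)
    return max((len(run) // len(unit) for run in runs), default=0)

def slip(dna, unit="CAG", extra=1):
    """Insert `extra` additional units at the start of the first run of `unit`."""
    pos = dna.find(unit)
    if pos == -1:
        return dna
    return dna[:pos] + unit * extra + dna[pos:]

template = "GCC" + "CAG" * 5 + "CCG"    # hypothetical repeat region
daughter = slip(template, extra=2)      # new strand after a slippage event

print(longest_repeat(template), longest_repeat(daughter))
```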
190
what is copy number duplication
The simplest type of copy number variation is the presence or absence of a gene. An individual’s genome could therefore contain two, one, or zero copies.
191
describe non allelic homologous recombination
Driven by sequence similarity between different regions of chromosomes. When homologous chromosomes align they pair up by sequence similarity, but similar sequences at different positions can misalign, shifting one chromosome relative to the other. This is a problem because, when recombination then occurs, you can end up with a deletion on one chromosome and a duplication on the other.
192
what is allele
particular form of a specific locus | Single base to entire genomic region
193
what is locus
Locus = unique position in genome | single base to entire genomic region
194
Polymorphism
Any genetic variation. Different types of polymorphism are SNVs, microsatellites, CNVs etc. SNVs are generated when mismatch repair goes wrong.
195
What to call a genetic variant that is pathogenic
A mutation
196
Describe plasmids
Discrete circular dsDNA molecules found in many, but not all, bacteria. They are a means by which genetic information is maintained in bacteria. They are genetic elements (replicons) that exist and replicate independently of the bacterial chromosome and are therefore extra-chromosomal. They can normally be exchanged between bacteria within a restricted host range (e.g. plasmid-borne antibiotic resistance).
197
describe features of plasmid vectors
Can be linearised at one or more sites in non-essential stretches of DNA. Can have DNA inserted into them and can be re-circularised without loss of the ability to replicate. Are often modified to replicate at high multiplicity (copy number) within a host cell. Contain selectable markers. Are relatively small, 4-5 kb in size.
198
why is the linearisation of plasmids important?
Plasmids can be linearised at one or more sites. If you cut DNA and then re-circularise it, something can be inserted accidentally, which can disrupt that particular segment of DNA and cause a loss of function. We must therefore be able to linearise the plasmid by cutting with a single enzyme at a single site. This means DNA can be inserted and the plasmid can still be re-circularised, still function and, importantly, still replicate.
199
define vector
a plasmid, phage or cosmid into which foreign DNA can be inserted for cloning
200
why do we use plasmids as recombinant tools
To express a recombinant gene in a living organism of choice, for example to find out what the function of that particular protein might be. It can be expressed in either a prokaryote or a eukaryote. To add or modify the control elements that control expression of a particular protein in the plasmid or vector and, as a consequence, make it inducible (switch it on or off), express it to high levels on demand, or understand its regulation, e.g. its own promoter or its own elements. To alter the properties of the gene product: make it secreted extracellularly or into the periplasmic space, fuse it to a peptide tag or another protein, or join other parts together to give useful properties, e.g. to make it useful as a therapeutic. To make it into a fusion protein, i.e. a recombinant protein made from parts of different proteins. This can be done to make it easier to purify, or to understand where it is in a cell.
201
Synagis: respiratory syncytial virus. Herceptin: HER-2 positive breast cancer. Remicade (infliximab): rheumatoid arthritis. Humira (adalimumab): Crohn's, plaque psoriasis. Xolair (omalizumab): asthma.
remember that
202
What control elements are required for expression in bacteria?
The coding sequence is the part of the gene coding for the protein, not including the UTRs nor any intronic or regulatory sequences such as a promoter or enhancers. The Shine-Dalgarno sequence is the ribosomal binding site found around 8 nucleotides before the start codon in the RNA in prokaryotes. Remember that the RNA of this group of organisms is not capped. The promoter is the gene element that is involved in regulation and initiation of transcription. The transcriptional terminator is a sequence that terminates transcription and initiates the dissociation of the transcription complex.
203
what is shine dalgarno sequence
The Shine-Dalgarno sequence is the ribosomal binding site found around 8 nucleotides before the start codon in the RNA in prokaryotes. Remember that the RNA of this group of organisms is not capped.
204
constitutive promoter
Constitutive: always on. Allows a culture of cells to express the foreign protein to a high level. Fine if the protein is not toxic to E. coli. A bad idea if it is toxic to E. coli, as it will kill the bacteria.
205
inducible promoter
Inducible: a molecular switch. Allows large cultures to be grown without expressing the foreign protein; expression is induced in response to a defined signal. It can therefore be switched on or off, and the protein can be expressed at a high level before it kills the organism.
206
what are the 2 tags used in gene fusions
Glutathione S-transferase (GST) and the 6-histidine tag.
207
Differences with PCR and dideoxy chain termination
Compared with PCR, dideoxy chain termination: does not have temperature changes; only uses a single primer; results in linear amplification; does not regularly use a thermostable polymerase.
208
what are germ line and somatic mutations and de novo mutations
Germ line mutations are passed on to descendants. Somatic mutations are not transmitted to descendants. De novo mutations are not inherited from either parent.
209
what is gene flow
the movement of genes from one population to another (eg migration). It is an important source of genetic variation
210
What is genetic recombination?
It is the shuffling of chromosomal segments between partner (homologous) chromosomes of a pair.
211
What is difference between mutation and polymorphism?
A mutation is a rare change in the DNA sequence that is different to the normal (reference) sequence. A polymorphism is a DNA sequence variant that is common in the population: no single allele is regarded as the normal allele, and there are 2 or more acceptable alternatives.
212
What is MAF to be classed as polymorphism?
A minor allele frequency (MAF) equal to or greater than 1% in the population.
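A small sketch of how the MAF could be calculated from genotype counts at a site with two alleles and compared with the 1% cut-off. The counts and function names are hypothetical.

```python
# Sketch: minor allele frequency (MAF) at a biallelic site from genotype counts.

def minor_allele_frequency(n_ref_hom, n_het, n_alt_hom):
    """Each person carries two alleles; heterozygotes carry one of each."""
    total_alleles = 2 * (n_ref_hom + n_het + n_alt_hom)
    alt_freq = (2 * n_alt_hom + n_het) / total_alleles
    return min(alt_freq, 1 - alt_freq)

def classify(maf, threshold=0.01):
    """Apply the (arbitrary) 1% cut-off used on the card."""
    return "polymorphism" if maf >= threshold else "rare variant"

# Hypothetical sample of 1000 people.
maf = minor_allele_frequency(n_ref_hom=900, n_het=95, n_alt_hom=5)
print(maf, classify(maf))
```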
213
what is haplotype ?
a group of alleles that are inherited together from a single parent . The order of alleles along a chromosome.
214
Mendelian/Monogenic disease
Disease that is caused by a single gene, with little or no impact from the environment (PKD)
215
Non-Mendelian/Polygenic disease
diseases or traits caused by the impact of many different genes, each having only a small individual impact on the final condition (psoriasis)
216
What is Linkage analysis
A method used to map the location of a disease gene in the genome. The term linkage refers to the assumption of two things, being physically linked to each other.
217
what is physical proximity?
Using genetic markers to identify the location of a disease gene based on its physical proximity to those markers.
218
genetic maps
They look at information in blocks or regions (similar to zones on a tube map).
219
What is genetic linkage?
The tendency for alleles at neighbouring loci to segregate together at meiosis. Therefore to be linked, two loci must lie very close together.
220
What are the two types of genetic markers?
Microsatellite markers, and single nucleotide polymorphisms.
221
What is LOD score used for?
The probability of linkage can be assessed using a LOD (logarithm of the odds) score. It compares the probability of observing the data if the two loci are linked with the probability of observing the same data purely by chance, i.e. it calculates a likelihood ratio of observed vs expected (no linkage).
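A minimal sketch of the likelihood ratio, assuming simple phase-known meioses where each offspring can be scored as recombinant or non-recombinant. Here theta is the recombination fraction being tested (0 < theta <= 0.5), and the counts are made up.

```python
# Sketch: LOD score for a given recombination fraction theta.
# LOD = log10( likelihood if linked at theta / likelihood if unlinked (theta = 0.5) )
import math

def lod_score(recombinants, non_recombinants, theta):
    n = recombinants + non_recombinants
    likelihood_linked = (theta ** recombinants) * ((1 - theta) ** non_recombinants)
    likelihood_unlinked = 0.5 ** n
    return math.log10(likelihood_linked / likelihood_unlinked)

# Hypothetical family data: 1 recombinant out of 10 informative meioses.
lod = lod_score(recombinants=1, non_recombinants=9, theta=0.1)
print(round(lod, 2))
```

A positive score favours linkage at that theta, and a score of zero means the data are equally likely with or without linkage.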
222
What is the recombination fraction?
The proportion of recombinant births. The higher the LOD score, the higher the likelihood of linkage.
223
What is genetic association?
The presence of a variant allele at a higher frequency in unrelated subjects, with a particular disease (cases), compared to those that do not have the disease (controls), or for particular traits compared to those that do not have this trait.
224
what does GWAS do
Uses markers across the whole genome (SNP microarrays). Looks for association between disease and each marker using a chi-squared test. This has resulted in the detection of large numbers of disease-associated genes. GWAS data are presented as a single graph called a Manhattan plot: the X axis is the position of the SNP on the chromosome and the Y axis is -log10(p value) of the association.
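A rough sketch of the test behind one point on a Manhattan plot, assuming a simple 2x2 table of allele counts (cases vs controls) and a Pearson chi-squared test with 1 degree of freedom. The counts are invented, and real GWAS pipelines add quality control and multiple-testing correction.

```python
# Sketch: chi-squared association test for one SNP and its -log10(p) value.
import math

def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared statistic for the table [[a, b], [c, d]]
    (no continuity correction)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def p_value_1df(chi2):
    """Upper-tail p value of a chi-squared distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

# Hypothetical allele counts: rows = cases, controls; columns = risk allele, other allele.
chi2 = chi_squared_2x2(600, 400, 500, 500)
p = p_value_1df(chi2)
print(chi2, p, -math.log10(p))
```

The smaller the p value, the taller the point on the plot.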
225
WTCCC
Wellcome Trust Case Control Consortium
226
What are the problems with GWAS
GWAS has identified associations that are statistically strong and reproducible. However, their contribution to the genetic component of disease is estimated to be low. This may be because of many common SNPs of small effect, rare SNPs, copy number variation and epigenetic variation.
227
What are association studies
They are undertaken by comparing the frequency of a particular variant in affected patients with its frequency in a carefully matched control group. This is described as a case control study. If the frequency in the two groups differ significantly, this provides evidence for an association
228
uses of DNA sequencing
In research, for example: mammalian and pathogen gene sequencing; clone or PCR amplicon sequencing to confirm a cloned or site-directed mutagenesis product; "walking" a gene to identify a causative mutation in candidate gene studies; confirmation of causative variants associated with genetic disease following an association study. In health: today dideoxy sequencing is still the gold standard confirmatory test for specific genetic mutations in patients with suspected genetic diseases. It is used to confirm all types of mutation (silent, missense, nonsense, truncating, indel and mis-splicing), the one exception being low-frequency mosaicism. It is also used to identify HIV haplotypes resistant to anti-retrovirals (HAART): when patients are treated with antivirals, we can identify mutations within that population of molecules and whether the individual will need treatment. The problem is that Sanger sequencing reads the average of all the molecules in the sample, which means it will fail to detect low-frequency mosaicism.
229
mutation vs polymorphism
A mutation is a rare variant; polymorphism means common variation. We have a standard reference sequence, and a variant that differs from the reference sequence is called a mutation. Polymorphisms tend to be common variants; they also differ from the reference sequence, but you don't know which is the normal allele. The arbitrary cut-off point is a minor allele frequency of 1%. Homologous recombination is important to think about for linkage analysis.
230
what is genetic association?
Genetic Association is the presence of a variant allele at a higher frequency in unrelated subjects with a particular disease (cases), compared to those that do not have the disease (controls) For disease we could use the broader term “trait”, for example height is not a disease
231
what are the problems with GWAS
GWAS has identified associations that are statistically strong and reproducible. However, their contribution to the genetic component of disease is estimated to be low (<5%). Possible answers: many common SNPs of small effect; rare SNPs; copy number variation; epigenetic variation; heritability is overestimated.
232
what is NGS
NGS platforms use an in vitro cloning step to amplify individual DNA molecules by emulsion or bridge PCR. NGS is used for personalised medicine, genetic diseases and clinical diagnostics, and can sequence multiple individuals at the same time. In principle, the concepts behind Sanger vs. next-generation sequencing (NGS) technologies are similar. In both NGS and Sanger sequencing (also known as dideoxy or capillary electrophoresis sequencing), DNA polymerase adds fluorescent nucleotides one by one onto a growing DNA template strand. Each incorporated nucleotide is identified by its fluorescent tag. The critical difference between Sanger sequencing and NGS is sequencing volume. While the Sanger method only sequences a single DNA fragment at a time, NGS is massively parallel, sequencing millions of fragments simultaneously per run. This high-throughput process translates into sequencing hundreds to thousands of genes at one time. NGS also offers greater discovery power to detect novel or rare variants with deep sequencing.
233
What is a DNA library
A DNA library is a collection of random DNA fragments of a specific sample to be used for further study; in our case next generation sequencing
234
process of ngs | -check with original notes
Library preparation: libraries are created by random fragmentation of DNA, followed by ligation with custom linkers. Amplification: the library is amplified using clonal amplification methods and PCR. Sequencing: the DNA is sequenced using one of several different approaches.
235
two ways of clustering microarray results
- clustering in circles within graph | - dendrograms
236
what is qPCR and what is rt PCR
qPCR is real-time PCR, in which PCR is made quantitative. RT-PCR is reverse transcriptase PCR, in which RNA is made into copy DNA (cDNA), which is then amplified.
237
what's in a spot in a microarray
Each spot contains lots of copies of the same oligonucleotide probe. This is a single-stranded piece of DNA approximately 20-30 nucleotides long. Each probe is designed to hybridise with one SNP. Lecture example: the EVC (Ellis van Creveld) gene contains a SNP, denoted by a red Y on the slide. In the IUPAC list of DNA base symbols Y means cytosine or thymine, i.e. it is one or other of the two pYrimidine bases in a sequence. We take that DNA sequence and design a probe complementary to the region next to the SNP.
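To illustrate the IUPAC notation, this sketch expands a sequence containing ambiguity symbols into the concrete sequences it stands for. Only the two-base symbols are included, and the example sequence is made up (it is not the EVC sequence).

```python
# Sketch: expand IUPAC ambiguity symbols (e.g. Y = C or T) into explicit sequences.

IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",  # puRine
    "Y": "CT",  # pYrimidine
    "S": "CG", "W": "AT", "K": "GT", "M": "AC",
}

def expand(sequence):
    """Return every concrete sequence matching the IUPAC-coded input."""
    variants = [""]
    for symbol in sequence:
        variants = [prefix + base for prefix in variants for base in IUPAC[symbol]]
    return variants

alleles = expand("GAYTC")   # hypothetical sequence with one SNP
print(alleles)
```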
238
whats in a spot in a microarray?
Lots of copies of the same probe in a spot. Each spot gives the genotype for one SNP. Up to 5 million spots per sample! Genome-wide analysis is possible.
239
What is the epigenome
The sum of all the (heritable) changes in the genome that do not occur in the primary DNA sequence and that affect gene expression. An epigenetic change results in “a change in phenotype but not in genotype”.
240
What are the 4 epigenetic mechanisms
DNA methylation, histone modification, X-inactivation, genomic imprinting
241
Describe DNA methylation
DNA methylation in humans is the addition of a methyl group at the 5 position (carbon 5) of a cytosine. This is catalysed by the DNA methyltransferase enzymes DNMT1, DNMT3a and DNMT3b. It requires S-adenosyl methionine to provide the methyl group. In differentiated cells it occurs at CpG dinucleotides. In general, DNA methylation turns transcription off by preventing the binding of transcription factors. DNA methylation patterns change during development and are an important mechanism for controlling gene expression.
242
Describe histone modification
This is the addition of chemical groups to the proteins that make up the nucleosome. There are a large number of known histone modifications (>100) and many are of unknown function. Common modifications include acetylation and methylation. A large range of enzymes catalyse modification. Modifications are named based on the histone, the amino acid and the actual modification. For example, H3K4Me3 means that on histone 3, the lysine at position 4 is tri-methylated.
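The naming rule can be made concrete with a small parser. This is a sketch only: the lookup tables cover a few common residues and modifications, and the pattern is an assumption about how such names are usually written.

```python
# Sketch: split a histone mark name such as "H3K4Me3" into its parts.
import re

AMINO_ACIDS = {"K": "lysine", "R": "arginine", "S": "serine", "T": "threonine"}
MODIFICATIONS = {"me": "methylation", "ac": "acetylation", "ph": "phosphorylation"}

def parse_histone_mark(name):
    match = re.fullmatch(r"(H\d[AB]?)([A-Z])(\d+)([A-Za-z]+)(\d?)", name)
    if match is None:
        raise ValueError(f"unrecognised histone mark: {name}")
    histone, residue, position, modification, count = match.groups()
    return {
        "histone": histone,
        "residue": AMINO_ACIDS.get(residue, residue),
        "position": int(position),
        "modification": MODIFICATIONS.get(modification.lower(), modification),
        "count": int(count) if count else 1,   # e.g. 3 = tri-methylated
    }

mark = parse_histone_mark("H3K4Me3")
print(mark)
```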
243
What are the histone modifiers
Writers: histone acetyltransferase (HAT1), histone methyltransferase (EHMT1). Erasers: histone deacetylase (HDAC1), histone demethylase (KDM1). Readers: bromodomain and extra-terminal (BET) proteins (BRD2), chromodomain proteins (CBX1).
244
Role of histone modification
Histone acetylation at lysine residues relaxes the chromatin structure and makes it accessible to transcription factors. Histone methylation is more complex and can repress or activate transcription depending on where it occurs. Histone modifications can occur concurrently, so their effects can interact or modify each other.
245
Describe X inactivation
This is the inactivation of one of the two X chromosomes in every somatic cell in females. This is needed because the Y chromosome has virtually no genes, so there is only one copy of each X chromosome gene in males (hemizygosity). X-inactivation ensures that every somatic cell in all humans has the same number of active copies of every gene.
246
What is genomic imprinting
Imprinting is the selective expression of genes related to the parental origin of the gene copy. Every autosomal gene has one paternal and one maternal copy. Imprinted genes tend to be found in clusters. There are very few imprinted genes (~250).
247
How do we imprint genes?
Imprinting is mediated by imprinting control regions (ICRs). One copy is silenced by DNA methylation, catalysed by DNMT3a, and by histone methylation, leading to inactivation. LncRNAs are essential to the process. Imprinting patterns are reset during gamete formation.
248
What is WES
Whole exome sequencing, used to capture the sequence of the coding region of the genome
249
sexual determination and differentiation
Determination: a genetically controlled process dependent on the ‘switch’ on the Y chromosome. Differentiation: the process by which internal and external genitalia develop as male or female. The two processes are contiguous and consist of several stages.
250
What is the SRY
The sex-determining region Y (SRY) switches on briefly during embryo development (>week 7) to make the gonad into a testis. In its absence an ovary is formed. The testis develops cells that make 2 important hormones: Sertoli cells produce anti-Müllerian hormone (AMH) and Leydig cells make testosterone. Products of the testis influence further gonadal and phenotypic sexual development.
251
What are the three waves of cell that invade the genital ridge at gonadal development
3 waves of cells invade the genital ridge. Primordial germ cells: become sperm (male) or oocytes (female). Primitive sex cords: become Sertoli cells (male) or granulosa cells (female). Mesonephric cells: become blood vessels and Leydig cells (male) or theca cells (female).
252
What is primordial germ cell migration
An initially small cluster of cells in the epithelium of the yolk sac expands by mitosis at around 3 weeks. They then migrate to the connective tissue of the hind gut, to the region of the developing kidney and on to the genital ridge – completed by 6 weeks.
253
Mesonephric cells?
These originate in the mesonephric primordium, which lies just lateral to the genital ridges. In males they act under the influence of pre-Sertoli cells (which themselves express SRY) to form: vascular tissue; Leydig cells (synthesise testosterone, do not express SRY); basement membrane, contributing to formation of the seminiferous tubules and rete testis. In females, without the influence of SRY, they form: vascular tissue; theca cells (synthesise androstenedione, which is a substrate for estradiol production by the granulosa cells).
254
what allows DNA fragments to attach to the flow cell
The P5 and P7 adapter sequences on the fragments anchor the DNA to the flow cell
255
compare targeted 16S PCR to whole genome shotgun sequencing
Targeted 16S PCR amplification: assesses taxonomic diversity in a sample; biased, only bacteria. Whole genome shotgun sequencing: assesses taxonomic diversity in a sample; assesses composite gene functions in a sample; unbiased, all micro-organisms.
256
what is metagenomics
Metagenomics is the study of genetic material recovered directly from environmental or biological systems/compartments. It gives an unbiased view of taxonomic diversity in a sample, is not limited by the ability to culture organisms, and gives an overall view of gene content in a sample.
257
What is SCD
Sudden cardiac death | Death from definite or probable cardiac causes within one hour of onset of symptoms
258
Describe subunits of POLG and their roles
Mitochondrial DNA polymerase (polymerase gamma). A heterotrimeric protein: one catalytic subunit (POLγA) and two accessory subunits (POLγB), encoded by different genes in the nucleus. POLγA contains a 3’-5’ exonuclease domain to proofread newly synthesised DNA, correcting errors by cutting them out. POLγB enhances interactions with the DNA template and increases the activity and processivity of POLγA.
259
Metacentric
Centromere in the middle, so the two arms are about equal in length (50:50)
260
Submetacentric
Chromosomes with short arm at top and long arms at bottom
261
Acrocentric
Satellite at top; long arms at bottom; chromosome looks like ^
262
Micro level differences
Pathogenic differences sometimes associated with disease eg point mutation, SCA, 3bp deletion in CFTR
263
Macro level differences
Generally associated with disease, aneuploidy, translocations etc
264
Why are genetic variants most likely to be neutral
Depends on the type of variant (lots of variants in every gene-some pathogenic, some not; depends on the environment).
265
Are promoters found in coding or non coding sections
Non coding
266
Genetic variation
Differences in DNA sequence between individuals | Inherited or due to environmental factors
267
Syntenic
Genes close together on same chromosome
268
What is heteroplasmy
Mutation load, which can be quantified with NGS. Around 80% or more heteroplasmy (mutant DNA) is needed to develop a disease. However, inheritance of mutation load is random.
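A minimal sketch of how mutation load might be estimated from NGS read counts at one mtDNA position and compared with the 80% figure on the card. The read counts and function names are hypothetical, and real thresholds vary by mutation.

```python
# Sketch: heteroplasmy (mutation load) from read counts at a single mtDNA position.

def heteroplasmy(mutant_reads, wild_type_reads):
    """Fraction of reads carrying the mutant allele."""
    total = mutant_reads + wild_type_reads
    if total == 0:
        raise ValueError("no reads cover this position")
    return mutant_reads / total

def exceeds_threshold(level, threshold=0.80):
    """Compare the mutation load with the ~80% figure quoted on the card."""
    return level >= threshold

level = heteroplasmy(mutant_reads=850, wild_type_reads=150)
print(level, exceeds_threshold(level))
```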
269
Difference in linkage analysis and association analysis
Linkage analysis finds the map location of a disease gene in the genome, using the tendency for alleles at neighbouring loci to segregate together at meiosis. Association analysis looks for the presence of a single variant allele at a higher frequency in unrelated subjects with a particular disease than in control subjects without the disease.
270
GWAS Manhattan plot x axis vs y axis
X axis is position of SNP on chromosome | Y axis is -log10(p value) of the association
271
WTCCC diseases
Analysed 2000 samples from each of 7 diseases: type 1 diabetes, type 2 diabetes, coronary heart disease, hypertension, bipolar disorder, rheumatoid arthritis and Crohn’s disease. Controls come from the 1958 British birth cohort and the others are blood donors.
272
Proteins that bind to histone tails
Writers add histone modifications: histone acetyltransferases, histone methyltransferases. Erasers remove modifications: histone deacetylases, histone demethylases. Readers bind to the modifications and affect gene activity, chromatin condensation and accessibility: bromodomain and extra-terminal proteins, chromodomain proteins.
273
What is imprinting
The selective expression of genes related to the parental origin of gene copy
274
Meaning of imprinting genes
Selective expression of only the maternal pattern of genes or only the paternal pattern. In an egg, imprints are rewritten as maternal imprints; in sperm, imprints are erased and rewritten as paternal imprints, whichever parent the gene copy originally came from. Imprinted genes are found in clusters. Imprinting is mediated by imprinting control regions. One copy is silenced by DNA methylation (DNMT3a) and histone methylation, leading to inactivation. Imprinting patterns are reset during gamete formation.
275
Epigenetic targets
Epigenetic marks are important in gene expression and could therefore be good targets for drugs. In cancer, a lot of the genome becomes hyper- or hypomethylated; if we could affect that, we might be able to control the cancer. Drugs can inhibit methylases or demethylases. Global DNA methylation is known to be altered in tumour cells. Hypermethylation of tumour suppressor genes suppresses their expression (methylation suppresses gene expression), which results in tumours. Hypomethylation of tumour-activating genes can also result in cancer. Epigenetic enzymes are often mutated in tumour cells: histone acetyltransferases, methyltransferases, kinases, readers etc.
276
Pharmacoepigenetic drugs
DNA methyltransferase inhibitors are used as standard treatment for myelodysplastic syndrome: 5-azacytidine (myelodysplastic syndrome). Histone deacetylase inhibitors: romidepsin (Istodax), for cutaneous T-cell lymphoma.
277
Describe process of x inactivation
Picture of process found in phone
278
What is a dna library
Collection of random dna fragments of a specific sample to be used for further study; next generation sequencing
279
Describe NGS
Prepare the DNA sample. Cut it into fragments. Repair the sticky ends of the fragments with polymerase. Add adenine bases to create an A-tail overhang. Ligate adapters with thymine overhangs (the adapters contain primer binding sites to allow sequencing). The Illumina SBS sequencing machine then performs NGS. Hybridise the DNA library fragments to a flow cell; they attach to the surface of the flow cell as single molecules. Single molecules are too small to see, so perform PCR to make clusters big enough to be visualised. Once clusters have been made, the flow cell is ready to be loaded onto the sequencing platform. The polymerase incorporates a terminator base carrying a fluorescent dye (a different dye for each base). Wash the flow cell. Image. Cleave the terminator so that another base can be added. Repeat the process. There are billions of clusters, each originating from a single DNA library molecule. The machine tells you how confident it is that each base is correct. Each sequence gets an identification number. It is a parallel process. The short-read sequences are then reassembled. The consensus sequence can be compared against the human genome reference to look for genetic variants; dedicated software and bioinformatics tools achieve this.
280
NGS vs Sanger
```
Sanger is 800bp
NGS is 100-200bp
NGS produces a digital readout
Sanger produces an analogue readout
NGS produces a consensus sequence of many reads
Sanger is one sequence read
```
Can look for shared mutations, identify mutation
281
Third generation sequencing
Oxford Nanopore sequencing. Single molecule sequencing, no PCR. DNA passes through a nanopore and the base sequence is converted into an electrical current.
The principle of this technology is different: nanopores are cell membrane proteins, DNA is forced through a nanopore and this generates an electrical signal, which gives rise to the sequence.
Able to sequence larger sequences (10Mbp).
282
RNA sequencing
NGS is also used to study RNA, using total RNA or mRNA from a collection of cells or tissues. RNA is first converted to cDNA prior to library construction.
NGS of RNA samples determines which genes are actively expressed. A single experiment can capture the expression levels of thousands of genes. The amount of sequence obtained from each gene indicates how abundantly that gene is expressed, so differences in expression of all genes between experimental conditions can be calculated.
With appropriate analysis, RNA sequencing can be used to discover distinct forms of genes that are differentially regulated and expressed.
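The idea that read count reflects expression can be sketched as counts per million (CPM) followed by a log2 fold change between two conditions. This is only an illustration with hypothetical gene names and counts; real analyses use dedicated statistical tools and replicates:

```python
import math


def counts_per_million(counts):
    """Scale raw read counts by library size so samples are comparable."""
    total = sum(counts.values())
    return {gene: n * 1_000_000 / total for gene, n in counts.items()}


def log2_fold_change(control, treated, pseudocount=1.0):
    """log2(treated / control) per gene on CPM values; the pseudocount avoids log(0)."""
    c = counts_per_million(control)
    t = counts_per_million(treated)
    return {g: math.log2((t[g] + pseudocount) / (c[g] + pseudocount)) for g in c}


# Hypothetical read counts for three genes in two conditions
control = {"geneA": 100, "geneB": 800, "geneC": 100}
treated = {"geneA": 400, "geneB": 500, "geneC": 100}

for gene, lfc in log2_fold_change(control, treated).items():
    print(gene, round(lfc, 2))
```

A positive value means the gene is more abundant in the treated sample, a negative value less abundant.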
283
What are the assumptions we make in bioinformatics
Candidate gene filtering using WES:
Ignore structural variants and other forms of genetic variation, just target coding regions. Assumes the causal variant is coding, ignoring regulatory and other non-coding variants outside of exon definitions.
Remove synonymous variants. Assumes the causal variant alters the protein sequence, ignoring rare cases of functional synonymous changes.
Remove previously identified variants. Assumes the causal variant has complete penetrance.
Restrict to variants fitting a dominant/recessive model of inheritance. Assumes the causal variant has complete detectance.
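The four filters above can be sketched as one pass over a list of variants. The `Variant` fields and the rule "recessive means homozygous, dominant means heterozygous" are simplifying assumptions made for illustration, not part of the cards:

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str
    region: str    # "exonic" or "noncoding" (hypothetical labels)
    effect: str    # e.g. "missense", "nonsense", "synonymous"
    known: bool    # already present in a variant database
    genotype: str  # "het" or "hom" in the affected individual


def filter_candidates(variants, model="recessive"):
    """Keep only variants that satisfy all four filtering assumptions."""
    wanted = "hom" if model == "recessive" else "het"
    return [
        v for v in variants
        if v.region == "exonic"        # causal variant is assumed to be coding
        and v.effect != "synonymous"   # causal variant is assumed to alter the protein
        and not v.known                # complete penetrance: drop known variants
        and v.genotype == wanted       # must fit the inheritance model
    ]


# Hypothetical variants from one exome
variants = [
    Variant("GENE1", "exonic", "missense", False, "hom"),
    Variant("GENE2", "exonic", "synonymous", False, "hom"),
    Variant("GENE3", "noncoding", "missense", False, "hom"),
    Variant("GENE4", "exonic", "nonsense", True, "hom"),
    Variant("GENE5", "exonic", "missense", False, "het"),
]
print([v.gene for v in filter_candidates(variants)])
```

Each filter encodes an assumption, so a true causal variant that is regulatory, synonymous, already catalogued or incompletely penetrant would be discarded.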
284
In vitro vs in vivo
In vitro is in glass | In vivo -> in living body
285
In vitro cell culture techniques advantage
Pic on phone
286
Describe gene knockdown by rnai induced gene silencing
On phone i
287
Why is cell culture not enough
Cells behave differently in a dish compared to a whole organism. Culture does not simulate the actual conditions inside an organism, e.g. signals from other tissues. It gives no information about gene expression and function with regard to developmental phenotypes.
288
Benefits of using a mouse
On phone pic
289
Zebrafish advantage and making mutant zebra fish
On pic in phone
290
What are the different techniques to making a mutant
Forward genetics: ENU screening (phenotype based). Fish are treated with ENU, which causes mutations, then outcrossed with normal fish. This is the approach where you try to find the genetic cause of a phenotype.
Reverse genetics: find the phenotypic consequence of a genotype change.
RNA rescue experiments: proving pathogenesis. Use mutants and morphants (morpholino-injected embryos) to test your variants: inject the variant into the mutant or morphant and see if you can rescue the phenotype.
291
Transcriptomics
Transcriptomics is whole cell gene expression | Proteomics is whole cell protein content | Metabolomics is whole cell metabolite content
292
Microbiota vs metagenome
Microbiota is the different organisms in a community | Metagenome (microbiome) is the collective genomes of these organisms in a community
293
Prokaryotic ribosomes vs eukaryotic ribosomes sub units
Pic on phone
294
Describe variable regions
Variable regions are conserved within a species but have diverged between species. We can use variable regions to try to separate species based on their sequences.
295
Describe 16S targeted PCR amplification workflow to detect organism in a sample
Pic on phone.
It only allows identification up to genus level.
It will amplify any contamination. Minimise contamination by randomising samples, using negative controls and noting batch numbers of reagents.
Can only be used for bacteria, not fungi.
296
Whole genome shotgun sequencing
Same process, but instead of performing PCR we smash up the DNA and sequence all of it (process on pic on phone).
Can be used for host, viruses and yeast; there is no bias, unlike 16S PCR. It covers whole genomes, not just a single gene (16S PCR targets only one gene).
Once we have the WGS shotgun reads, we can re-assemble them. Putting them back together is done by algorithms and creates a sequence assembly; we can assemble bits of DNA for each of the species in the sample.
From the assembly we can look at taxonomic diversity and build trees, as with PCR. We can run contigs through gene prediction algorithms to identify which genes are present, then which metabolic pathways or processes are present. Then compare with patients.
Problems: host cells are in excess in the sample. There is no amplification step to enrich bacterial DNA. Samples can be contaminated: about 10% of reads from faecal samples, and up to 90% of reads from saliva, nasal and skin samples, can be human.
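Re-assembly of shotgun reads can be illustrated with a toy greedy overlap merge: repeatedly join the two reads with the longest suffix/prefix overlap. This is a sketch of the general idea only, with made-up error-free reads; real assemblers use more robust graph-based methods and handle sequencing errors:

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if below min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Merge the best-overlapping pair of reads until no overlaps remain."""
    reads = list(reads)
    while len(reads) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # remaining reads do not overlap; they stay as separate contigs
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Hypothetical short reads from one genome
print(greedy_assemble(["ATGGCGT", "GCGTACA", "ACAGGTT"]))
```

Reads that share no overlap are left as separate contigs, which is what happens when several species are present in one sample.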
297
How to enrich without amplification
Pre extraction Post extraction Pic on phone
298
Where does mt replication begin
Origin of heavy strand
299
Where does mt transcription begin
Starts at heavy strand promoter
300
What is mitochondrial dna replication machinery
```
Mt polymerase gamma (POLG)
mtDNA helicase (Twinkle, unwinds DNA)
```
On phone
301
Subunits of POLG used for mtDNA replication
1) Pol gamma A: contains a 3’-5’ exonuclease domain to proofread newly synthesised DNA; corrects errors by cutting them out.
2) Pol gamma B (x2): enhances interactions with the DNA template and increases the activity and processivity of Pol gamma A.
302
Describe the structure of Twinkle, the mitochondrial DNA helicase
Hexamer (6 Twinkle subunits) | Unwinds the mtDNA template to allow replication by Pol gamma
303
MtSSBP
```
Binds to ssDNA once it has been unwound
Prevents it from annealing again
Protects against nucleases
Prevents secondary structure formation
Enhances mtDNA synthesis by stimulating Twinkle helicase activity
```
304
Describe mt dna replication
Photo on phone
305
What are the classical signs of mt dna disease
Neurodegeneration, migraines, diabetes, visual impairment, hearing deficit, infertility
306
What do heteroplasmy levels do ?
People are becoming increasingly aware of disorders. | Heteroplasmy levels determine disease manifestation of mt diseases
307
How to identify mutations in mt
NGS. X axis is mtDNA nucleotide position in bp, Y axis is read count.
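From NGS read counts at a single mtDNA position, a heteroplasmy level can be estimated as the fraction of reads carrying the variant allele. A minimal sketch with hypothetical read counts; the function name is illustrative and the cards do not specify how heteroplasmy was calculated:

```python
def heteroplasmy(ref_reads, alt_reads):
    """Fraction of reads at one mtDNA position that carry the variant allele."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this position")
    return alt_reads / total


# Hypothetical position covered by 1000 reads, 300 of them carrying the variant
print(heteroplasmy(ref_reads=700, alt_reads=300))
```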
308
How do you get secondary mutations in mtDNA
Mutations in the mtDNA replication machinery cause secondary mutations. These arise somatically: they are not inherited but occur in post-mitotic tissues as a result of mutations in nuclear genes that encode the mtDNA replication machinery, e.g. POLG and Twinkle. If these proteins are not working properly, mtDNA cannot be replicated, which causes deletions or depletion of mtDNA. Common variants in mtDNA can also contribute to the development of complex diseases.
309
Describe what dominant mutations in twinkle can cause
Pic on phone
310
How to identify disease causing genes?
Use SNP markers that are known from studies and databases; we know where they are on the chromosome.
Perform linkage analysis to find a region that is inherited together with the disease, then examine that region further, on the assumption that the disease gene is in that region.
If a marker is linked to the disease locus, the same marker alleles will be inherited by two affected relatives. If unlinked, affected members of the family are less likely to inherit the same marker alleles.
Family-based design: need to look at similarities between family members.
Workflow: genotyping array (which markers are present), linkage analysis (parametric or non-parametric), identify the chromosomal gene locus, perform Sanger sequencing of each of the exons of the genes. 100 candidate genes later, the proband had a homozygous mutation for which both parents were heterozygous. Finally, prove that mutations in the gene are disease causing.
311
What is a proband
A person serving as the starting point for the genetic study of a family
312
Zygosity
Degree of similarity of alleles for a trait/mutation for an organism
313
Intermediate phenotypes
Quantitative biological trait that is reliable and reasonably heritable. Shows a greater prevalence in unaffected relatives of patients than in the general population.
314
What does heritability tell you?
Tells you how much of the normal variation that you see is because of genetics (a particular combination of variants) and how much is due to environment. The simplest heritability test is to look at twins and how different they are (e.g. on ECG). The number given tells you the proportion of variation that is due to genetics, e.g. 0.58 means 58% is due to genetics. The number is always between 0 and 1; a high number means strong resemblance.
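One common way to turn a twin comparison into a number is Falconer's formula, h² = 2 x (r_MZ - r_DZ), where r_MZ and r_DZ are the trait correlations in identical and non-identical twin pairs. The cards do not name the method used, so treat this as an assumed example with made-up correlations:

```python
def falconer_heritability(r_mz, r_dz):
    """Falconer's estimate h^2 = 2 * (r_MZ - r_DZ), clamped to the range 0 to 1."""
    h2 = 2 * (r_mz - r_dz)
    return min(max(h2, 0.0), 1.0)


# Hypothetical twin correlations for a trait
h2 = falconer_heritability(r_mz=0.79, r_dz=0.50)
print(round(h2, 2))
```

With these made-up correlations the estimate is about 0.58, i.e. roughly 58% of the variation attributed to genetics.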
315
Describe the genetic association time line
Pic on phone
316
method of linkage analysis/how to find disease causing gene
Pic on phone
1. Take a pedigree, see pattern in genotype.
2. Use some kind of tool to generate genotyping data for your pedigree (genotyping array, up to 1 million markers on 1 chip).
3. Get results from the machine, generate a graph to show the physical and genetic distribution of markers on the genotyping array chip. Chromosome number is on the x axis; y is the distribution of markers along the chromosome in cM. Markers are evenly distributed along chromosomes.
4. Run a linkage programme.
5. Choose to run the analysis in a non-parametric or a parametric way. NPL (non-parametric linkage) testing does not assume anything about the inheritance pattern.
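For the parametric route, the statistic a linkage programme reports is usually a LOD score: log10 of the likelihood of the data at a recombination fraction θ divided by the likelihood under no linkage (θ = 0.5). A minimal sketch for the simplest case of phase-known meioses, with hypothetical counts; real programmes handle full pedigrees and many markers:

```python
import math


def lod_score(recombinants, non_recombinants, theta):
    """LOD = log10(L(theta) / L(0.5)) for phase-known, fully informative meioses."""
    if not 0 < theta <= 0.5:
        raise ValueError("theta must be in the interval (0, 0.5]")
    n = recombinants + non_recombinants
    likelihood_linked = (theta ** recombinants) * ((1 - theta) ** non_recombinants)
    likelihood_unlinked = 0.5 ** n
    return math.log10(likelihood_linked / likelihood_unlinked)


# Hypothetical family: 1 recombinant out of 10 informative meioses
for theta in (0.05, 0.1, 0.2, 0.5):
    print(theta, round(lod_score(1, 9, theta), 3))
```

By convention a LOD score of 3 or more is taken as evidence for linkage, so a single small family like this one would need to be combined with others.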
317
How to measure association of genotype to heart rate
Perform SNP microarray: 3 colours, one for homozygous AA, one for heterozygous AG and another for homozygous GG. Then pic on phone.
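One simple way to test association between the three genotype groups and a quantitative trait is an additive model: code each genotype as the number of G alleles (0, 1, 2) and fit a straight line to the trait. The cards only point to a picture, so the model choice and all data below are assumptions for illustration:

```python
def additive_regression(dosages, trait):
    """Least-squares slope and intercept of a trait on allele dosage (0, 1 or 2)."""
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(trait) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, trait))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


GENOTYPE_TO_DOSAGE = {"AA": 0, "AG": 1, "GG": 2}  # number of G alleles

# Hypothetical genotypes and resting heart rates (beats per minute)
genotypes = ["AA", "AA", "AG", "AG", "GG", "GG"]
heart_rate = [60, 62, 65, 67, 70, 72]

dosages = [GENOTYPE_TO_DOSAGE[g] for g in genotypes]
slope, intercept = additive_regression(dosages, heart_rate)
print(slope, intercept)
```

The slope is the estimated change in heart rate per copy of the G allele; a real study would also report a p-value and correct for testing many SNPs.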
318
Linkage disequilibrium
LD between 2 SNPs decreases with physical distance. The extent of LD varies greatly depending on the region of the genome. If LD is strong, you need fewer SNPs to capture the variation in a region. LD extends across stretches of DNA: because they are in close proximity, different variants or regions are inherited together.
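The strength of LD between two SNPs is commonly summarised as D = p(AB) - p(A) x p(B) and r² = D² / (p(A)(1 - p(A)) p(B)(1 - p(B))). A minimal sketch computing both from haplotype counts; the counts are hypothetical and the function name is illustrative:

```python
def linkage_disequilibrium(haplotype_counts):
    """Return (D, r^2) for two biallelic SNPs from counts of haplotypes AB, Ab, aB, ab."""
    total = sum(haplotype_counts.values())
    p_ab = haplotype_counts["AB"] / total
    p_a = (haplotype_counts["AB"] + haplotype_counts["Ab"]) / total
    p_b = (haplotype_counts["AB"] + haplotype_counts["aB"]) / total
    d = p_ab - p_a * p_b
    r2 = d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r2


# Hypothetical samples: the two SNPs always travel together, or are independent
print(linkage_disequilibrium({"AB": 50, "Ab": 0, "aB": 0, "ab": 50}))
print(linkage_disequilibrium({"AB": 25, "Ab": 25, "aB": 25, "ab": 25}))
```

An r² near 1 means one SNP predicts the other, which is why fewer SNPs are needed to capture a region in strong LD.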
319
Give an example of a short read vs long read machine
Short read - Roche 454 | Long read - PacBio
320
What are the multiple source analysis pipelines available?
mothur, QIIME, DADA2
321
What is haplotype
Sequence of alleles along a single chromosome
322
what is the epigenome and describe processes associated with epigenome
The sum of all the (heritable) changes in the genome that do not occur in the primary DNA sequence and that affect gene expression. An epigenetic change results in “a change in phenotype but not in genotype”.
323
What are the 4 epigenetic mechanisms
DNA methylation, histone modification, X-inactivation, genomic imprinting
324
How to purify genes
Add protein tags: 6 histidines (His tag), glutathione S-transferase (GST)
325
What is linkage analysis
Linkage analysis is a method used to map the location of a disease gene in the genome
326
What machine allows dna sequencing
ABI 3730
327
difference between linkage and association
Linkage: two alleles on a chromosome are linked physically | Association: same sequence found in unrelated subjects, suggesting it is the cause or location of the mutation
328
Describe process of NGS
4 step process
1. DNA library construction
In the wet lab, first we need to prepare the DNA sample for sequencing.
Essentially the DNA is chopped into small 300bp fragments. This is shearing. It can be achieved chemically, enzymatically or physically (sonication).
We have to repair the ends of the sheared DNA fragments.
Adenine (A) nucleotide overhangs are added to the ends of the fragments.
Adapters with thymine (T) overhangs can then be ligated to the DNA fragments.
The end result is the DNA library: literally billions of small, stable random fragments representative of our original DNA sample.
Adapters contain the essential components to allow the library fragments to be sequenced: sequencing primer binding sites, and P5 and P7 anchors for attachment of library fragments to the flow cell.
2. Cluster generation
Hybridise DNA library fragments to the flow cell. This is a random process.
We can't see individual single molecules of our DNA library, they are too small, so we need to use PCR to amplify the fragments to a size that we can see.
Perform bridge PCR to generate clusters: many billions of clusters, each originating from a single DNA library molecule.
Clusters are now big enough to be visualised. The flow cell is now ready to be loaded onto the sequencing platform to perform the sequencing.
3. Sequencing by synthesis
Sequence each nucleotide 1 cycle at a time in a controlled manner.
The 4 bases (A, T, C, G) are modified with chain terminators and a different fluorescent colour dye each.
Single nucleotide incorporation (DNA polymerase).
Flow cell wash.
Image the 4 bases (digital photograph).
Cleave the terminator chemical group and dye with an enzyme.
The camera sequentially images all 4 bases on the surface of the flow cell each cycle. Each cycle image is converted to a nucleotide base call (A, C, G, T).
Cycle number is anywhere between 50 and 250 nucleotides.
4. Data analysis
Short read sequences from the machine need to be re-assembled like a jigsaw to generate a consensus sequence of our original DNA sample.
We can compare this consensus sequence against the human genome reference and look for the genetic variants.
Dedicated software and bioinformatics tools will achieve this.
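The data analysis step can be illustrated with a toy example: take a majority vote at each position across overlapping reads to build a consensus, then list positions where it differs from the reference. This sketch assumes the reads are already aligned (each given with a 0-based start position) and uses a made-up reference; real pipelines use dedicated aligners and variant callers:

```python
from collections import Counter


def consensus_from_reads(aligned_reads, length):
    """Majority-vote base at each position from (start, sequence) reads; 'N' if uncovered."""
    piles = [Counter() for _ in range(length)]
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            piles[start + offset][base] += 1
    return "".join(p.most_common(1)[0][0] if p else "N" for p in piles)


def call_variants(reference, consensus):
    """List (position, ref_base, alt_base) where the consensus differs from the reference."""
    return [
        (i, r, c)
        for i, (r, c) in enumerate(zip(reference, consensus))
        if c != "N" and r != c
    ]


# Hypothetical reference and four overlapping short reads
reference = "ACGTACGTAC"
reads = [(0, "ACGTT"), (2, "GTTCG"), (4, "TCGTA"), (5, "CGTAC")]

consensus = consensus_from_reads(reads, len(reference))
print(consensus)
print(call_variants(reference, consensus))
```

Because several reads cover the same position, a single miscalled base in one read is outvoted, which is the advantage of a consensus over one sequence read.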