manipulating genomes Flashcards
(14 cards)
SEQUENCING METHODS-
what is DNA sequencing?
what are the two techniques?
-DNA sequencing allows the nucleotide base sequence of an organism’s genetic material to be identified
•Sanger sequencing = the older, chain termination method
•high throughput sequencing = newer, more advanced methods
describe what happens in Sanger sequencing?
-The chain termination method of DNA sequencing uses modified nucleotides called dideoxynucleotides
-Dideoxynucleotides have a slightly different structure to the deoxynucleotides found in the DNA of organisms (they lack the 3' hydroxyl group that the next nucleotide would bond to)
-Dideoxynucleotides can pair with nucleotides on the template strand during DNA replication
-They will pair with nucleotides that have a complementary base
-When DNA polymerase adds a dideoxynucleotide to the developing strand, no further nucleotides can be added and replication stops = why this method of sequencing is referred to as the chain termination method
describe the chain termination method-
-Four test tubes are prepared. Each contains the DNA to be sequenced (in the form of a single-stranded template), DNA polymerase, DNA primers, free nucleotides A, C, T, and G, and one of the four types of dideoxynucleotide (A*, C*, T*, or G*)
-The test tubes are incubated at a temperature that allows the DNA polymerase to function
-The primer anneals to the start of the single stranded template, producing a short section of double stranded DNA at the start of the sequence
-DNA polymerase attaches to this double stranded section and begins DNA replication using the free nucleotides in the test tube
-Hydrogen bonds form between the complementary bases on the nucleotides
-At any time, DNA polymerase can insert one of the dideoxynucleotides by chance which results in the termination of DNA replication
-Because each of the test tubes only contains one type of dideoxynucleotide, it is possible to know what the terminal nucleotide of each fragment is (if the test tube contains A*, then the final nucleotide of every chain in that test tube is A)
-Because the point at which the dideoxynucleotide is inserted varies with every strand, complementary DNA chains of varying lengths are produced
-These chains can vary in length from one nucleotide to several hundred nucleotides
-Once the incubation period has ended the new, complementary, DNA chains (also referred to as the developing strands) are separated from the template DNA
-The resulting single-stranded DNA chains are then separated according to length using gel electrophoresis
-The gel will have four wells, one each for A, C, T, and G
-The shortest fragments travel furthest, so a fragment that consists of only one nucleotide will travel all the way to the bottom of the gel
-Every band above this on the gel represents the addition of one more base. E.g. if the band that travels furthest comes from the C* well, scientists can see that the first base in the sequence is C
-This allows the base sequence to be built up one base at a time
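Reading the gel can be sketched in a few lines of Python. This is only an illustration: the `lanes` data and the `read_gel` function are made up for this card, and a real gel has far more bands. Each lane lists the lengths of the fragments that ended in that tube's dideoxynucleotide.

```python
def read_gel(lanes):
    """Rebuild the base sequence of the new strand from fragment lengths.

    lanes maps each dideoxynucleotide (A, C, G, T) to the lengths of the
    terminated fragments found in that lane of the gel.
    """
    base_at_length = {}
    for base, lengths in lanes.items():
        for length in lengths:
            # A fragment of this length ended with this base.
            base_at_length[length] = base
    # The shortest fragment travels furthest, so read from length 1 upwards.
    return "".join(base_at_length[n] for n in sorted(base_at_length))


# Hypothetical gel result (not real data).
lanes = {
    "A": [3, 6],
    "C": [1, 5],
    "G": [2],
    "T": [4, 7],
}
print(read_gel(lanes))  # CGATCAT
```

The sequence read this way is that of the newly made strand, which is complementary to the template.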
what is high throughput sequencing?
-High throughput sequencing is a term that describes multiple DNA sequencing technologies, all of which allow simultaneous sequencing of multiple DNA strands
-High throughput methods are rapid and so produce large datasets very quickly
HOW GENE SEQUENCING HAS ALLOWED FOR GENOME-WIDE COMPARISONS BETWEEN INDIVIDUALS AND SPECIES-
Bioinformatics and computational biology, researching:
-genotype and phenotype relationships
-epidemiology
-searching for evolutionary relationships
-prediction of amino acid sequences
-development of synthetic biology
what is bioinformatics?
-Bioinformatics is a field of biology that involves the storage, retrieval, and analysis of data from biological studies
-These studies may generate data on DNA sequences, RNA sequences, and protein sequences, as well as on the relationship between genotype and phenotype
-High-power computers are required to create databases
-The large databases contain information about an organism’s gene sequences and amino acid/protein sequences
-Once a genome is sequenced, bioinformatics allows scientists to make comparisons with the genomes of other organisms using the many databases available
-This can help to find the degree of similarity between organisms which then gives an indication of how closely related the organisms are
-This can be useful for scientists looking for organisms that could be used in experiments as a model organism for humans
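The idea of a "degree of similarity" can be sketched as a percentage of matching bases. This is a minimal example that assumes the sequences are already aligned and the same length. The function name `percent_identity` and the three short sequences are invented for illustration and are not taken from any real database.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions at which two aligned sequences share a base."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100 * matches / len(seq_a)


# Made-up sequences for the same gene in three hypothetical organisms.
organism_1 = "ATGGTGCACCTGACT"
organism_2 = "ATGGTGCACCTGACG"
organism_3 = "ATGCTGAACCTTACG"

print(round(percent_identity(organism_1, organism_2), 1))  # 93.3
print(round(percent_identity(organism_1, organism_3), 1))  # 73.3
```

On this measure organism 2 is more similar to organism 1 than organism 3 is, which suggests a closer relationship.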
describe genetic variation and evolutionary relationships-
-The genetic variation within a species can be investigated
-Many individuals of the same species have their genomes sequenced and compared
-A species that has a high level of genetic variation will exhibit a large number of differences in base sequences between individuals
-The evolutionary relationships between species can be investigated by comparing the genomes of different species
-Species with a small number of differences between their genomes are likely to share a more recent common ancestor than species with a large number of differences
-The protein cytochrome c is involved in respiration, and so is found in a large number of species
-For this reason it is especially useful for making comparisons between different species
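Genetic variation within a species can be illustrated by counting the positions at which individuals differ. This is a rough sketch: `variable_sites` is a made-up helper, the sequences are invented, and it assumes every individual's sequence is aligned and the same length.

```python
def variable_sites(sequences):
    """Return the positions (0-based) where the aligned sequences differ."""
    return [
        position
        for position, column in enumerate(zip(*sequences))
        if len(set(column)) > 1  # more than one base seen at this position
    ]


# Invented data: three individuals from each of two hypothetical species.
species_a = ["ATGCCGTA", "ATGCCGTA", "ATGCCGTT"]
species_b = ["ATGCCGTA", "ACGCTGTA", "ATGACGTT"]

print(len(variable_sites(species_a)))  # 1
print(len(variable_sites(species_b)))  # 4
```

Species B has more variable positions, so in this toy example it shows the higher level of genetic variation.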
describe genotype-phenotype relationships-
-Genotype-phenotype relationships are explored by “knocking out” different genes (stopping their expression) and observing the effect it has on the phenotype of an organism
-When an organism’s genome sequence is known, scientists can target specific base sequences to knock out
describe epidemiology-
-Epidemiologists study the spread of infectious disease within populations
-The genomes of pathogens can be sequenced and analysed to aid research and disease control
-Highly infectious strains can be identified
E.g. the Delta variant of SARS-CoV-2 (a well-known coronavirus)
-The ability of a pathogen to infect multiple species can be investigated
E.g. Ebola virus can infect other primates as well as humans
-The most appropriate control measures can be implemented based on the data provided
-Potential antigens for use in vaccine production can be identified
genome comparison-
the human genome project?
-A genome project works by collecting DNA samples from many individuals of a species.
-These DNA samples are then sequenced and compared to create a reference genome
-More than one individual is used to create the reference genome as one organism may have anomalies/mutations in its DNA sequence that are atypical of the species
-The Human Genome Project (HGP) began in 1990 as an international, collaborative research programme
-It was publicly funded so that there would be no commercial interests or influence
-DNA samples were taken from multiple people around the world, sequenced, and used to create a reference genome
-Laboratories around the globe were responsible for sequencing different sections of specific chromosomes
-It was decided that the data created from the project would be made publicly available
-As a result, the data can be shared rapidly between researchers
-The information discovered could also be used by any researcher and so maximised for human benefit
-By 2003 the human genome had been sequenced to 99.9% accuracy
-The finished genome was over 3 billion base pairs long but contained only about 25,000 genes, a surprisingly low number
-Work is currently underway to sequence the human proteome and the human epigenome
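The reason for using more than one individual to build a reference genome can be shown with a simple majority vote at each position. This is a simplified sketch, not how the HGP actually assembled its data. The `consensus` function and the sequences are assumptions made for illustration, and the sequences are taken to be aligned already.

```python
from collections import Counter


def consensus(sequences):
    """Take the most common base at each position across aligned sequences."""
    return "".join(
        Counter(column).most_common(1)[0][0]
        for column in zip(*sequences)
    )


# Invented sequences from five hypothetical individuals.
individuals = [
    "ATGCCGTA",
    "ATGCCGTA",
    "ATGACGTA",  # atypical base at the 4th position
    "ATGCCGTA",
    "ATGCCTTA",  # atypical base at the 6th position
]
print(consensus(individuals))  # ATGCCGTA
```

The atypical bases carried by single individuals do not appear in the reference sequence.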
what is the application for the human genome project?
-Scientists have found correlations between changes in specific genes and the likelihood of developing certain inherited diseases
-This knowledge is used in research, with the end goal of finding cures for these diseases
how to estimate protein sequences
-The genetic code can be used to predict the amino acid sequence within a protein
-Once scientists know the amino acid sequence they can predict how the new protein will fold into its tertiary structure
-This information can be used for a range of applications, such as in synthetic biology
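Predicting an amino acid sequence from a base sequence can be sketched as below. The example assumes the standard genetic code, a coding-strand DNA sequence read from its first base, and one-letter amino acid codes. The `translate` function and the input sequence are made up for illustration.

```python
# Standard genetic code, listed in the order T, C, A, G for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(coding_dna):
    """Read the DNA three bases at a time until a stop codon ('*') is reached."""
    protein = []
    for start in range(0, len(coding_dna) - 2, 3):
        amino_acid = CODON_TABLE[coding_dna[start:start + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGCCAAATGGTAA"))  # MAKW
```

Here ATG codes for methionine (M), GCC for alanine (A), AAA for lysine (K), TGG for tryptophan (W), and TAA is a stop codon.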
what is synthetic biology?
-Synthetic biology is a recent area of research that aims to create new biological parts, devices, and systems, or to redesign systems that already exist in nature
-It goes beyond genetic engineering, as it involves large alterations to an organism’s genome. This new genome can cause a cell to operate in a novel way, not yet seen before
-The assembly of the new genome can be done using existing DNA sequences or using entirely new sequences
-These new sequences can be designed and written (using special computer programs) so that they produce specific proteins
what is PCR?
how?
-PCR stands for polymerase chain reaction
The three stages are:
1. Denaturation – the double-stranded DNA is heated to 95°C which breaks the hydrogen bonds that hold the two DNA strands together
2. Annealing – the temperature is decreased to between 50 - 60°C so that primers (forward and reverse ones) can anneal to the ends of the single strands of DNA
3. Elongation / Extension – the temperature is increased to 72°C for at least a minute, as this is the optimum temperature for Taq polymerase to build the complementary strands of DNA to produce the new identical double-stranded DNA molecules
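Because each cycle of the three stages copies every DNA molecule once, the number of molecules ideally doubles per cycle. The sketch below assumes perfect efficiency, which real reactions do not reach. The function name `copies_after` is made up for illustration.

```python
def copies_after(cycles, starting_copies=1):
    """Idealised PCR: every DNA molecule is copied once in each cycle."""
    return starting_copies * 2 ** cycles


for cycles in (1, 10, 30):
    print(cycles, copies_after(cycles))
# 1 2
# 10 1024
# 30 1073741824
```

Starting from a single molecule, 30 ideal cycles would give over a billion copies.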