Lecture 5 DA Flashcards by Rodwan Halimi

What are some reasons why we sequenced the human genome (3)?

Because it's there: a bioinformatics challenge.
Helps against inherited diseases, including those we don’t know about.
Helps us understand the consequences of mutation.

What is responsible for the phenotypic diversity among different individual humans?

Single nucleotide polymorphisms - SNPs.

What is more important, the nucleotide sequence or the protein sequence?

Protein sequence.

Which chromosome was sequenced first, and why? Which came after?

Chromosome 22, because it is one of the shortest (chromosome 21 is in fact slightly shorter). Chromosome 21 came after.

Describe the hierarchical approach to sequencing the human genome (5).

Different groups were each given a chromosome to sequence.
The groups generated bacterial artificial chromosome (BAC) clones.
The BACs were fragmented, and shotgun sequencing was performed on the fragments.
High-fidelity maps with identifiable motifs allowed the groups to detect overlapping regions and assemble the sequence.

Describe the shotgun approach to sequencing the human genome (6).

DNA is isolated and chopped into fragments.
Fragments are cloned into vectors, and sequenced.
Overlapping fragment sequences (reads) are combined to assemble the genome into contigs.
Scaffolds are generated from contigs.
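
The overlap step above can be illustrated with a toy example. This is only a minimal sketch, assuming error-free reads from one strand and a greedy "merge the best overlap first" strategy; the function names `overlap` and `greedy_assemble` and the `min_len` parameter are hypothetical and chosen for illustration. Real assemblers are considerably more sophisticated.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b (0 if below min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two reads with the largest overlap; what remains are contigs."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs  # no overlaps left to merge
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Three made-up reads that overlap end to end.
reads = ["ATGGCGT", "GCGTACC", "TACCTGA"]
print(greedy_assemble(reads))
```

Reads that share no overlap of at least `min_len` bases stay as separate contigs, which is where scaffolding and gap closing come in.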

What is Celera sequencing, and how does it compare to hierarchical sequencing?

Celera sequencing shotgun-sequences the whole genome at once. It finished faster than the hierarchical approach.

At how many locations do SNPs occur?

3 million.

How many genes total were found?

~51k.

How many coding genes were found?

~20k.

How many non-coding genes were found?

~20k

What are pseudogenes, and how many were found?

Sequences that resemble protein-coding genes but that mutation has rendered non-coding. 18k found.

How many genes with variants were found?

~20k.

How many mRNA genes were found? What does this mean?

98k. Roughly 5 mRNAs can be made from each gene, meaning we technically have ~100k genes.

What % of the genome is coding? What % is repeating junk DNA?

Coding -

What are some issues with being able to sequence the human genome?

Who owns the information.
Who can access it (police, insurers, employers etc)
Impact on a person
Foetal genetic testing - counselling/accuracy
Patenting the sequence - impact on medical discoveries

What % of the genome encodes small RNAs? What do they do?

8-20%. They are regulatory, and can inhibit mRNA translation.

Why is junk DNA believed to be so important?

It is believed to be like the operating system of the genome, running the coding genes.

What number of RNAs are believed to control how a given protein is switched on or off?

About 10 RNAs for every protein. The exact number depends on the cell type and developmental stage.

What is the output of classical sequencing methods such as Sanger sequencing?

500-1,000 base pairs.

What is the output of next generation sequencing methods (NGS)?

Billions of base pairs.

Describe how Sanger sequencing works (5).

The DNA sequence is amplified (PCR).
Primers are annealed.
dNTPs are used for extension.
Fluorescently labelled ddNTPs terminate extension, one base at a time.
Fragments are separated using gel/capillary electrophoresis.
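
The read-out logic of the steps above can be mimicked in a few lines. This is a toy sketch, not a model of the chemistry: it assumes a fragment terminates at every position, ignores strand orientation, and uses a hypothetical function name `sanger_read`.

```python
def sanger_read(template: str) -> str:
    """Toy chain-termination read-out: one labelled fragment per position, read shortest first."""
    complement = {"A": "T", "T": "A", "C": "G", "G": "C"}
    # The newly synthesised strand is complementary to the template.
    new_strand = "".join(complement[base] for base in template)
    # Extension stops wherever a labelled ddNTP is incorporated, so across the
    # whole reaction we assume one fragment ending at each position.
    fragments = [new_strand[:end] for end in range(1, len(new_strand) + 1)]
    # Electrophoresis separates the fragments by length.
    fragments.sort(key=len)
    # The fluorescent label on the last base of each fragment gives the base call.
    return "".join(fragment[-1] for fragment in fragments)


print(sanger_read("TACG"))  # prints "ATGC"
```

Reading the terminal label of each fragment in order of increasing length recovers the synthesised strand one base at a time.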

What are the benefits of NGS? What are the limitations?

Benefits
- Huge sequencing capacity compared with classical sequencing
- Rapid throughput/output
- No gel electrophoresis needed
Limitations
- Expensive; only economical when sequencing a large number of base pairs

How does NGS work (2)?

- The full genome is immobilised on a chip as fragments about 100 base pairs long.
- All fragments are sequenced at once, very quickly.

What is the sequence quality score in NGS?

A prediction of the probability of an error in base calling.
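
The card does not name a formula, but quality scores are commonly reported on the Phred scale, Q = -10 * log10(P), where P is the estimated probability that the base call is wrong. The sketch below assumes that scale and the common Phred+33 ASCII encoding used in FASTQ files; the function names are hypothetical.

```python
import math


def phred_quality(error_probability: float) -> float:
    """Convert an error probability P into a Phred score Q = -10 * log10(P)."""
    return -10 * math.log10(error_probability)


def error_probability(quality: float) -> float:
    """Convert a Phred score Q back into an error probability P = 10^(-Q/10)."""
    return 10 ** (-quality / 10)


def decode_fastq_quality(quality_string: str, offset: int = 33) -> list[int]:
    """Decode a FASTQ quality string, assuming Phred+33 ASCII encoding."""
    return [ord(char) - offset for char in quality_string]


print(phred_quality(0.001))          # a 1-in-1000 error chance is about Q30
print(error_probability(20))         # Q20 is about a 1-in-100 error chance
print(decode_fastq_quality("I5!"))   # [40, 20, 0]
```

A higher score means a more confident call: each step of 10 corresponds to a tenfold drop in the estimated error probability.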

What is the most common way of genome assembly?

De novo assembly.

How does de novo assembly work?

- Data from sequencing is partitioned.
- Overlaps are found between datasets, building the genome gradually.
- The overlapping data form segments called contigs.
- Contigs are used to form scaffolds.

Does de novo assembly require a reference sequence?

No.

Gaps can be found between contigs in de novo assembly. How can they be closed (2)?

Hope that a clone can be used to close the gap.
Otherwise, the gap can be closed using PCR with primers at the ends of the gap.

What is annotation in bioinformatics?

Determining what the gene does.

What data structure is de novo assembly based on?

De Bruijn graphs.
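
As a rough illustration of the idea, a De Bruijn graph can be built by cutting reads into k-mers and linking each k-mer's prefix to its suffix. This is a minimal sketch, assuming error-free reads and a graph with a single unbranched path; `de_bruijn`, `walk_unbranched`, and the choice of k = 3 are hypothetical names and values for illustration only.

```python
from collections import defaultdict


def de_bruijn(reads: list[str], k: int) -> dict[str, list[str]]:
    """Build a De Bruijn graph: each distinct k-mer is an edge from its (k-1)-prefix to its (k-1)-suffix."""
    graph = defaultdict(list)
    seen = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if kmer not in seen:  # count each distinct k-mer once
                seen.add(kmer)
                graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


def walk_unbranched(graph: dict[str, list[str]]) -> str:
    """Spell out the sequence along a single unbranched path (toy case only)."""
    targets = {t for successors in graph.values() for t in successors}
    node = next(n for n in graph if n not in targets)  # node with no incoming edge
    sequence, visited = node, {node}
    while node in graph and len(graph[node]) == 1:
        node = graph[node][0]
        if node in visited:  # stop rather than loop forever on a cycle
            break
        visited.add(node)
        sequence += node[-1]
    return sequence


graph = de_bruijn(["ATGGC", "GGCGT"], k=3)
print(graph)
print(walk_unbranched(graph))  # prints "ATGGCGT"
```

The two reads share the k-mer GGC, so they join into one path through the graph without any pairwise read comparison.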

(31 cards)