Genomics Flashcards

1
Q

Whole genome shotgun sequencing

A
  • Isolate genomic DNA
  • Fragment DNA, make genomic library of clones w known insert sizes
    • Plasmid library, ~2kb inserts
    • Plasmid library, ~10kb inserts
    • BAC library, ~200kb inserts
  • Sequence paired end reads from every clone
    • 1000 bp sequence reads from ends of each clone
  • Collapse overlapping contiguous sequences into contigs
  • Use paired end reads to connect contigs into scaffolds
2
Q

Contig

A

-Sequences that overlap can be collapsed into contigs
-Need ~20 bp of overlap to ensure it is significant and unique
-2 copies from complementary strands
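
A minimal sketch of collapsing two overlapping reads into a contig, assuming exact (error-free) overlaps. The helper name `merge_reads` and the example reads are made up for illustration; real assemblers tolerate sequencing errors and also compare reverse complements.

```python
def merge_reads(left, right, min_overlap=20):
    """Merge two reads if a suffix of `left` exactly matches a prefix of `right`.

    Returns the merged contig, or None if no exact overlap of at least
    `min_overlap` bases exists. Hypothetical helper, illustration only.
    """
    max_len = min(len(left), len(right))
    # Try the longest candidate overlap first, down to the minimum allowed.
    for k in range(max_len, min_overlap - 1, -1):
        if left[-k:] == right[:k]:
            return left + right[k:]
    return None


# Made-up reads that share a 20 bp overlap.
overlap = "CGATAGGCTAACGTTAGCTA"
read1 = "ATGCGTACGTTAGC" + overlap
read2 = overlap + "TTGACCATGGA"

contig = merge_reads(read1, read2)
print(contig)
```

Raising `min_overlap` makes chance matches less likely, at the cost of missing reads that overlap only slightly.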

3
Q

Paired-end read

A
  • Seq of both ends of a piece of DNA
  • Can connect two contigs into a scaffold
  • Computer keeps track of which reads came from the same clone
  • Paired-end reads from multiple inserts can be seq in parallel
  • Paired-end reads can be used to join two sequence contigs
  • You know the approximate size of the sequence in between them (the insert size)
4
Q

RsaI

A

Blunt ends

5
Q

Scaffold (supercontig)

A

Ordered, oriented contigs separated by gaps of approximately known length (linked by paired-end reads)

6
Q

KpnI

A

Sticky ends with 3’ overhang

7
Q

Sanger sequencing

A
  • Both ends of an insert can be sequenced (paired end reads)
  • Need oligonucleotide primer, ~20 nt
  • Can put primers in vector flanking the insert; the two primers cannot be the same sequence or else would not be able to distinguish between ends
  • Extend with fluorescently labeled dideoxynucleotides (ddNTPs), which terminate the rxn
  • Can then sequence -> dideoxy sequencing
  • Can generate about 1000 bp of sequence per read
  • End reads from multiple inserts can be sequenced in parallel
  • Can run DNA fragments on a gel, cut out a specific size range, and sequence that
  • Can make different sized libraries
9
Q

Annotation

A

Process of attaching biological info to genome sequences (e.g. determining which parts of the genome sequence are transcribed)

10
Q

cDNA

A
  • complementary DNA = DNA made from mRNA w reverse transcriptase
  • cDNA library
    - Collection of clones of cDNA, usually made from same mRNA source
11
Q

cDNA

A

-complementary DNA = DNA made from mRNA w reverse transcriptase
-cDNA library: collection of clones of cDNA, usually made from same mRNA source
-Converting RNA transcripts to cDNA:
  -Harvest mRNA
  -Reverse transcribe first cDNA strand
  -Synthesize 2nd cDNA strand
-cDNA can be cloned and sequenced just like genomic DNA

12
Q

Ortholog

A

Genes in 2 diff species that arose from the same gene in the species’ common ancestor

13
Q

Complementation test

A
  • Test for determining whether two mutations are in diff genes (they complement) or same gene (do not complement)
  • Complementation group=synonymous with a gene
  • Can’t do it w dominant mut, only recessive mut
14
Q

Saturation mutagenesis

A
  • Systematic search for finding all the genes in the genome that can mutate to affect a biological process
  • A genetic screen is saturated when you stop finding new loci (genes), but rather just more alleles (mutants) of the same loci
15
Q

Complementation group

A

A group of mutations that fail to complement each other (i.e. alleles of the same gene)

16
Q

Fwd genetic screen

A
  • identify genes (or set of genes) responsible for a particular phenotype of an organism
  • i.e. start with phenotype
17
Q

Complementation table

A
+ = complement (mutations in diff genes)
- = no complement (mutations in same gene)
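
A minimal sketch of turning the "-" entries of a complementation table into complementation groups, assuming all mutations are recessive. The function name and mutant labels are hypothetical.

```python
def complementation_groups(mutants, fails_to_complement):
    """Group mutants so that any pair scored "-" ends up in the same group.

    `fails_to_complement` is a list of (m1, m2) pairs that did NOT
    complement. Hypothetical helper; uses a small union-find structure.
    """
    parent = {m: m for m in mutants}

    def find(m):
        # Follow parent links to the group representative.
        while parent[m] != m:
            parent[m] = parent[parent[m]]  # path compression
            m = parent[m]
        return m

    for a, b in fails_to_complement:
        parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for m in mutants:
        groups.setdefault(find(m), []).append(m)
    return sorted(groups.values())


# Made-up table: only the "-" pairs are listed; every other pair is "+".
mutants = ["m1", "m2", "m3", "m4", "m5"]
minus_pairs = [("m1", "m3"), ("m3", "m5"), ("m2", "m4")]
groups = complementation_groups(mutants, minus_pairs)
print(groups)
```

Each resulting group corresponds to one gene, so the number of groups estimates how many genes the screen hit.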
18
Q

Enhancer

A

A region of DNA that positively reg gene expr (tx), often in a spatial and/or temporal specific manner

19
Q

Cis-acting regulatory element

A

A DNA sequence that regulates expr (tx) of a gene on the same DNA molecule, e.g. an enhancer (a region of DNA that positively reg gene expr)

20
Q

Visualizing gene

A

Fix (crosslink) the tissue first to preserve it: treat w formaldehyde, which crosslinks and prevents degradation

21
Q

In situ hybridization

A
  • Uses single stranded seq specific RNA probe to visualize mRNA of interest
  • Probe should be anti-sense
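
A minimal sketch of why the probe must be anti-sense: it has to base-pair with the mRNA, so it is the reverse complement of the target. The function name and the target sequence are made up for illustration.

```python
# RNA base-pairing rules (A pairs with U, G pairs with C).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_probe(mrna):
    """Return the RNA reverse complement of `mrna`, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(mrna.upper()))


target = "AUGGCCAUUGUA"  # made-up stretch of mRNA
probe = antisense_probe(target)
print(probe)
```

A sense probe would have the same sequence as the mRNA and could not hybridize to it, which is why it is commonly used as a negative control.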
22
Q

Immunolocalization

A

Uses Ab to visualize a prot of interest

23
Q

Reporter gene

A
  • A gene whose expr is easy to monitor
  • Used to study tissue-specific promoter and enhancer activities in transgenes

25
Q

Copy number variants

A
  • Massive structural variation
  • Huge duplications
  • Ex. Amylase
    • Copy number tends to be higher in human populations eating a starch rich diet
26
Q

Polymorphisms

A
  • A difference in DNA sequence among individuals in a population
  • Doesn’t need to change the phenotype of the organism to be able to track it

27
Q

Repetitive DNA

A
  • Transposable elements (transposons)
  • Direct mutagenesis from transposition
  • Indirect mutagenesis from transposable elements
28
Q

Molecular marker

A
  • Segregate according to mendelian rules
  • Can be used to search for linkage w human disease
  • Once linked to disease or trait, marker can be used as starting point to find linked relevant DNA sequence changes for disease/trait
29
Q

SNP

A
  • Single nucleotide polymorphism
  • Most are diallelic (two alleles)

30
Q

Microsatellite

A
  • Site w highly variable number of short sequence repeats, usually 2-4 nucleotides
  • SSR= Simple Sequence Repeat
  • SSLP = Simple Sequence Length Polymorphism
  • VNTRs = Variable Number Tandem Repeats
  • Especially variable in human populations
  • Can be detected w PCR using seq that flank repeats
  • Locus can show linkage to a disease gene
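
A minimal sketch of how two microsatellite alleles differ, assuming a dinucleotide "CA" repeat and made-up flanking sequences. The helper name is hypothetical; in the lab the difference shows up as PCR products of different length.

```python
import re


def longest_repeat_run(sequence, unit):
    """Return the largest number of back-to-back copies of `unit` in `sequence`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", sequence)
    return max((len(run) // len(unit) for run in runs), default=0)


# Two made-up alleles: same flanking sequence, different repeat number.
allele_a = "GGTC" + "CA" * 12 + "TTGA"
allele_b = "GGTC" + "CA" * 17 + "TTGA"

print(longest_repeat_run(allele_a, "CA"), longest_repeat_run(allele_b, "CA"))
# Primers in the flanking sequence give products that differ in length.
print(len(allele_b) - len(allele_a))
```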
31
Q

Transposable element transposition

A
  • Half of human genome is repetitive
  • Repeats are almost everywhere except where mut would be lethal
  • Regions w lowest density interspersed repeats = Hox gene clusters
  • Transposition can be directly mutagenic
  • Transposable elements can be indirectly mutagenic by mediating chromosome rearrangements
  • Legacy of old transposons is more important than activity of current transposons
32
Q

Deletions and transposable elements

A
  • Pairing by looping and crossing over b/w 2 transposable elements oriented in the same direction leads to deletion
  • aberrant recombination
  • polydactyly
33
Q

Inversions and transposable elements

A

Pairing by bending and crossing over b/w two transposable elements oriented in opposite directions leads to inversion

34
Q

Duplication and transposable elements

A

Misalignment and unequal exchange b/w transposable elements located on sister chromatids leads to one chromosome with a deletion and one with a duplication

36
Q

Haploid genotype

A
  • A combination of alleles at multiple loci on the same chromosomal homolog
  • If alleles are close together, will not behave independently
37
Q

Linkage b/w phenotype and genotype

A

QTL mapping and GWAS

38
Q

Selective sweep

A
  • Loss of all diversity in SLC24A5 in northern Europe
  • Skin pigmentation
  • Gene w least heterozygosity in human population
  • Locus under intense natural selection
40
Q

Disturbing forces

A
Mutation
Drift
Selection
Migration
Non-random mating
41
Q

Continuous traits

A

Vary continuously

-height

42
Q

Meristic traits

A

Measured in whole numbers

-litter size

43
Q

Threshold traits

A

Measured by presence or absence

-susceptibility to disease

44
Q

Quantitative Trait Loci

A
  • Chromosome regions containing a gene or genes that influence a quantitative trait
  • Chromosome regions can be molecularly genotyped so their segregation can be followed in crosses and pedigrees
  • For every genotyped region, F2s fall into discrete categories (AA, Aa, aA or aa)
  • Genotyped markers that are linked together are inherited together
  • The simpler the inheritance pattern, the closer the linkage
45
Q

GWAS

A

A test of the association b/w markers (SNPs) across the genome and disease or quantitative trait phenotype, usually involving hundreds of thousands of SNPs spread throughout the genome

46
Q

Linkage disequilibrium

A
  • Deviation in the freq of haplotypes in a pop from the freq expected if the alleles at diff loci are associated at random
  • Recombination hotspots disrupt linkage disequilibrium
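
A minimal sketch of measuring that deviation for two diallelic loci, using the common coefficient D = freq(AB) - freq(A) x freq(B) and its normalized form r^2. The function name and haplotype counts are made up for illustration.

```python
def linkage_disequilibrium(haplotype_counts):
    """Return (D, r^2) for two diallelic loci from haplotype counts.

    `haplotype_counts` maps "AB", "Ab", "aB", "ab" to observed counts.
    Assumes both loci are polymorphic (otherwise r^2 is undefined).
    """
    total = sum(haplotype_counts.values())
    p_ab_hap = haplotype_counts["AB"] / total
    p_a = (haplotype_counts["AB"] + haplotype_counts["Ab"]) / total
    p_b = (haplotype_counts["AB"] + haplotype_counts["aB"]) / total
    # Observed AB frequency minus the frequency expected under random association.
    d = p_ab_hap - p_a * p_b
    r2 = d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r2


linked = {"AB": 40, "Ab": 10, "aB": 10, "ab": 40}    # made-up counts
random_assoc = {"AB": 25, "Ab": 25, "aB": 25, "ab": 25}

print(linkage_disequilibrium(linked))
print(linkage_disequilibrium(random_assoc))
```

D near zero means alleles at the two loci are associated at random; recombination between the loci pushes D toward zero over generations.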
47
Q

tagSNPs

A

minimal set of SNPs that tag most common haplotypes

48
Q

Bonferroni correction

A
  • At p <0.05 level of significance, an association study using 1,000,000 SNPs would be expected to show 50,000 SNPs to be “associated” with disease purely due to chance and large number of markers tested
  • Divide the significance threshold (not the observed p value) by # of tests performed to get the corrected threshold, e.g. 0.05 / 1,000,000 = 5 × 10^-8
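
A minimal sketch of the arithmetic on this card. The function name and the example p values are made up; only the 0.05 level and the 1,000,000 SNPs come from the card.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


alpha = 0.05
n_snps = 1_000_000

# With no correction, this many SNPs are expected to pass by chance alone.
expected_false_positives = alpha * n_snps

threshold = bonferroni_threshold(alpha, n_snps)

# Made-up p values for three SNPs; only those below the threshold count.
p_values = {"snp1": 3e-9, "snp2": 1e-4, "snp3": 0.03}
significant = [snp for snp, p in p_values.items() if p < threshold]

print(expected_false_positives, threshold, significant)
```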
49
Q

Common disease - common variant hypothesis

A

many common diseases are caused by common alleles that individually have little effect but in concert confer a high risk

50
Q

Odds ratio

A

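Odds is the ratio of the probability that the event of interest occurs to the probability that it does not. The odds ratio compares two groups: the odds of the event in one group (e.g. carriers of an allele) divided by the odds in the other group (non-carriers)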
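
A minimal sketch of both quantities, assuming a simple 2x2 case-control table. The function names and the counts are made up for illustration.

```python
def odds(probability):
    """Odds = P(event) / P(no event)."""
    return probability / (1 - probability)


def odds_ratio(cases_with, cases_without, controls_with, controls_without):
    """Odds of carrying the allele among cases divided by the odds among controls."""
    return (cases_with / cases_without) / (controls_with / controls_without)


print(odds(0.75))

# Made-up counts: 60 of 100 cases and 30 of 100 controls carry the risk allele.
ratio = odds_ratio(cases_with=60, cases_without=40,
                   controls_with=30, controls_without=70)
print(ratio)
```

An odds ratio of 1 means the allele is equally common in cases and controls; values above 1 mean it is enriched in cases.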