LEC43: Intro to Genomics Flashcards by Allison Vise

what does every human cell contain?

complete genetic code

when are chromosomes visualizable?

during mitosis, when chromosomes condense

what was first genome screen technology?

karyotyping by G-banding

what was reference human genome, its use?

traditional sequencing method: took DNA from individual, arranged into pieces of chromosomes, chopped up, individually sequenced, stitched back together, compared to reference human genome

= Sanger sequencing

costly and slow, not method anymore

what did human genome project do?

sequence entirety of human genome

took ~13 yrs (1990-2003), ~$3 billion

info acquired allowed cataloging of complete set of human genes

what did human genome proj find?

complete set of human genes = similar in number to that observed in other model organisms like mouse, thale cress plant (Arabidopsis), roundworm

explanation: complexity in mammals due to alternative splicing, permitting increased number of potential proteins

how much of our DNA is protein coding genes?

1.5%

what is difference between number of genes, gene density, in humans vs plants?

gene number is similar

but genome size is much greater in humans

thus humans' avg gene density is much lower; only 1.5% of our DNA encodes proteins

greater complexity despite similar gene number is due to alternative splicing

what is result of alternative splicing?

from 1 single gene, exons can be joined in different arrangements, giving different resulting proteins!

only 1.5% of our DNA is protein coding, though
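
A minimal sketch of why alternative splicing multiplies the number of possible proteins, assuming a simplified exon-skipping model (first and last exons always kept, each internal exon independently kept or skipped). The 5-exon gene and the function name `enumerate_isoforms` are hypothetical, for illustration only:

```python
from itertools import combinations

def enumerate_isoforms(exons):
    """Return every exon combination under a simple exon-skipping model.

    Assumption: the first and last exons are always included, and each
    internal exon can be independently kept or skipped. Exon order is
    preserved (no shuffling).
    """
    first, *internal, last = exons
    isoforms = []
    for k in range(len(internal) + 1):
        for kept in combinations(internal, k):
            isoforms.append((first, *kept, last))
    return isoforms

# Hypothetical gene with 5 exons: 3 internal exons -> 2**3 = 8 isoforms
isoforms = enumerate_isoforms(["E1", "E2", "E3", "E4", "E5"])
print(len(isoforms))
```

Real splicing is more constrained than this (not every combination is made), so the count is an upper bound for this toy model only.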

how much of non-gene DNA is conserved?

2-5% of non-gene DNA is conserved through evolution

if a piece of DNA is conserved, what does that suggest?

that it’s important

basis for idea that there’s functionality among non-gene portions of our DNA that’ve been conserved through ages/across animals

suggests these regions have important regulatory role in genome function

HOXD gene cluster function?

basic body patterning control

example of conserved region of essential proteins that regulate genome function

how much of our genome is repeat elements?

what are they relics of?

50%

relics of retroviruses (HIV is a present-day example of a retrovirus) and ‘genomic parasites’ that invaded our DNA in evolutionary history; the so-called ‘junk DNA’

what causes finger webbing?

mutation in HOXD gene cluster, as HOXD genes control basic body patterning

segmental duplications?

blocks of DNA 1-500 kb in length that occur at multiple sites in the genome, share a high level of sequence identity

~5% of our DNA

can be intrachromosomal (same chromosome) duplications or interchromosomal (between chromosomes)

what role do segmental duplications play in genetic disease?

these large, highly identical repeats often flank certain regions of the genome, which are thus prone to misalignment during meiosis, leading to improper recombination

if the affected region is dosage sensitive, the resulting genomic deletions and/or duplications are associated w/ a particular genetic disease

examples of recurrent genomic disorders?

velocardiofacial syndrome

Angelman/Prader-Willi syndrome

Charcot-Marie-Tooth disease

X-linked hemophilia

all caused by mechanism of recombination between large high-identity repeats

recurrent deletion on chromosome 15 causes what/example of what?

causes intellectual disability, dysmorphisms, epilepsy

deletion = most common known genetic cause of epilepsy, present in ~1% of epilepsy patients

example of recurrent genomic disorder caused by aberrant recombination between large high-identity repeats

how many bases of difference exist between 2 individuals?

avg ~6 million bases, ~0.1% of genome

means 99.9% shared DNA among humans

what are the types of variation in the human genome? from smallest to largest

1) single base-pair changes - point mutations/SNVs/SNPs
2) small insertions/deletions & microsatellites
3) mobile elements - retroelement insertions (300 bp - 10 kb in size)
4) large-scale genomic variation (>1 kb) - deletions, duplications
5) chromosomal variation - translocations, inversions, trisomy

most common type of genetic variant?

SNVs = single nucleotide variants, a.k.a. single nucleotide polymorphisms (SNPs) or point mutations

occurs 1x every 1,000 bp = 3-5 million SNVs in individual genome

where do SNVs usually occur?

most in non-coding regions - may have regulatory effects, but not well understood

however, 10,000 per genome (0.3%) are in coding regions, & cause changes in protein sequence
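
The figures on this card and the previous one can be checked with quick arithmetic. A rough sketch, assuming a genome of ~3 billion bp (a round number not stated on the cards):

```python
GENOME_BP = 3_000_000_000   # assumed genome size (~3 billion bp)
SNV_SPACING_BP = 1_000      # from the cards: ~1 SNV every 1,000 bp
CODING_SNVS = 10_000        # from the cards: SNVs that fall in coding regions

total_snvs = GENOME_BP // SNV_SPACING_BP
coding_fraction = CODING_SNVS / total_snvs

print(f"SNVs per genome: ~{total_snvs:,}")
print(f"fraction in coding regions: ~{coding_fraction:.1%}")
```

Under that assumption the spacing gives ~3 million SNVs, the low end of the 3-5 million range, and 10,000 coding SNVs is ~0.3% of them.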

what do SNPs within coding regions cause?

sometimes, no change in amino acid, since the genetic code is redundant (several codons encode the same a.a.)

sometimes, changes amino acid, different protein results

what do SNPs outside of coding regions cause?

how much of SNPs are outside of genes?

can influence disease by altering gene regulation

e.g. if a ntd changes within a txn factor binding site, the txn factor may not recognize the site, may not bind to DNA, no activation occurs, gene may be OFF when should be ON

99.7% of SNPs are outside of genes

what does microarray on SNP chips show?

useable to genotype millions of SNPs in a single experiment

**can find identity of a base pair at an SNP**

fluorescently labeled DNA is hybridized to an array of probes immobilized on a glass slide that bind either to normal or variant DNA

how does array CGH work?

label control and patient DNA w/ different fluorescent dyes

cohybridize them together onto a slide that contains DNA corresponding to different parts of the genome

fluorescently labeled DNA hybridizes to the slide

scan it, get image

YELLOW (equal signal from both dyes) indicates no gain or loss at that position on the array

however, if the color of the patient sample's dye dominates at a position, know there is a duplication in the patient's DNA there
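
A minimal sketch of how the two-color signal might be turned into copy-number calls, assuming each probe yields one patient and one control intensity. The 0.3 log2-ratio cutoff, the probe names, and the intensities are illustrative assumptions, not values from the lecture:

```python
import math

def call_copy_number(patient, control, cutoff=0.3):
    """Classify one probe from its patient/control intensity ratio.

    log2(patient/control) near 0 means equal signal from both dyes
    (the 'yellow' spots); the cutoff is an assumed, illustrative value.
    """
    ratio = math.log2(patient / control)
    if ratio > cutoff:
        return "gain"
    if ratio < -cutoff:
        return "loss"
    return "no change"

# Hypothetical probe intensities: (patient, control)
probes = {
    "probe_A": (1000, 1000),
    "probe_B": (1500, 1000),
    "probe_C": (500, 1000),
}
calls = {name: call_copy_number(p, c) for name, (p, c) in probes.items()}
print(calls)
```

In this toy model, 3 copies vs. 2 gives log2(1.5) ≈ 0.58 and 1 copy vs. 2 gives log2(0.5) = -1, so both fall outside the assumed cutoff.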

what does array CGH enable?

detection of copy number changes that're too small to be seen by karyotyping

what do different chip/microarray tests give info about?

statistical testing for association between diseases of interest and SNPs at specific chromosomal locations

DNA copy number across genome

detection of sub-microscopic gains or losses of material, for rare conditions and common conditions alike

duplications of genomic regions that're associated w/ protection from disease

how can array-based technology be used to inform diagnosis and treatment of cancer?

take cancerous tumor DNA and match to control DNA from same person's blood or non-tumorous site

compare the 2 to see what happened in tumor

see where chromosomes have extra copies, see deletion of known tumor suppressors

e.g. see **massive amplification of EGFR gene region, a growth-promoting gene**, and see **deletion of tumor suppressor genes**

so can **develop drugs to inhibit the gene where amplification is seen**, b/c it is clearly a key event in tumorigenesis

what are tandem repeats?

serial repetition of a short motif, e.g. 2 bases (acacacacac...)

inherently unstable

highly repetitive sequences are a rich source of variation in genome b/c polymerase working on DNA at a repeat site will add or delete copies of the repeat

highly variable regions btwn individuals

what are triplet repeat expansions associated w/?

neurological diseases

what is cause of fragile x?

CGG motif repeat has 5-50 copies in healthy individuals; in intermediate (premutation) cases, 50-200 copies; in patient w/ fragile X, hundreds/thousands of repeats

this switches off nearby gene, causes disease

also causes breakage of chromosome, w/ DNA polymerase unable to replicate through the site

causes intellectual disability w/ distinct dysmorphic features, accompanied by a 'fragile site' on X chromosome (= origin of the name)
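
A minimal sketch of counting the longest uninterrupted CGG run in a sequence and binning it with the copy-number ranges on this card. The function names, bin labels, and toy sequence are hypothetical, and the sketch only illustrates the thresholds, not how repeats are sized in practice:

```python
import re

def longest_cgg_run(seq):
    """Return the number of copies in the longest uninterrupted CGG run."""
    runs = re.findall(r"(?:CGG)+", seq.upper())
    return max((len(run) // 3 for run in runs), default=0)

def classify_cgg(copies):
    """Bin a copy number using the ranges given on the card."""
    if copies <= 50:
        return "typical range"
    if copies <= 200:
        return "intermediate expansion"
    return "full expansion (fragile X range)"

toy_seq = "ATG" + "CGG" * 30 + "TTA"   # hypothetical sequence
copies = longest_cgg_run(toy_seq)
print(copies, classify_cgg(copies))
```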

what can a large tandem repeat contain?

entire genes

may be true for genes present in multiple copies, e.g. salivary amylase

are genomic duplication regions ever protective from disease?

yes

e.g. chemokine CCL3L1, an inflammatory signaling molecule

it's a binding partner of CCR5, the major receptor molecule for HIV cell entry

CCL3L1 gene copy number is inversely correlated w/ susceptibility to HIV infection (more copies, lower susceptibility)

is complete personal genome sequencing expensive?

no! quick and cheap now

what is focus of next generation sequencing?

whereas old Sanger sequencing focused on 1 gene at a time, next gen sequencing is **massively parallel: much more data sequenced simultaneously**

describe process of next generation genome sequencing

1) extract genomic DNA
2) shear DNA into small 200-500 ntd pieces
3) ligate adaptors to ends of fragments
4) enrich and amplify library by PCR
5) sequence on microscopic scale, starting from the adaptor, on the sequencing platform

wash through w/ bases that fluoresce differently; each cluster of DNA will fluoresce

measure that fluorescence or electrochemical energy, determine which base was added during each step of DNA synthesis rxn

describe whole genome shotgun sequencing

can stitch fragments of DNA back together by mapping onto reference human genome

due to random nature of sequences, depth of coverage at any 1 place in genome is variable

reads also contain errors (~1%)

therefore need **high redundancy** to generate high-quality gap-free sequence (~20x coverage)
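
Why high redundancy is needed can be sketched with the standard Poisson (Lander-Waterman) approximation, assuming reads land uniformly at random. The read counts, read length, and genome size below are illustrative assumptions:

```python
import math

def mean_coverage(n_reads, read_len, genome_len):
    """Average depth: total sequenced bases divided by genome length."""
    return n_reads * read_len / genome_len

def fraction_uncovered(coverage):
    """Poisson estimate of the fraction of bases hit by zero reads."""
    return math.exp(-coverage)

GENOME_LEN = 3_000_000_000   # assumed ~3 billion bp genome
READ_LEN = 100               # assumed read length in bases

for n_reads in (30_000_000, 600_000_000):
    c = mean_coverage(n_reads, READ_LEN, GENOME_LEN)
    print(f"{c:.0f}x coverage -> {fraction_uncovered(c):.2e} of bases unsequenced")
```

Under this model, 1x average coverage still leaves over a third of bases unread, whereas 20x leaves almost none, and the extra depth is also what lets errors be outvoted.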

what is imperfect about whole genome shotgun sequencing?

random errors in sequencing occur

thus, when a base is mismatched, cannot know for sure if it's a heterozygous SNV or a sequencing error

so SNV calling in genome sequencing is a probabilistic exercise
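
A minimal sketch of that probabilistic call, assuming independent reads, a binomial model, and the ~1% per-base error rate mentioned on the previous card. The read counts are hypothetical, priors are ignored, and real variant callers use more elaborate models:

```python
from math import comb

def genotype_likelihoods(depth, alt_reads, error=0.01):
    """Binomial likelihood of the observed alt-read count per genotype.

    Expected alt-read fraction: ~error for hom-ref, 0.5 for het,
    ~(1 - error) for hom-alt. Assumes independent reads.
    """
    expected_alt = {"hom_ref": error, "het": 0.5, "hom_alt": 1 - error}
    return {
        g: comb(depth, alt_reads) * p**alt_reads * (1 - p)**(depth - alt_reads)
        for g, p in expected_alt.items()
    }

# Hypothetical site: 20 reads, 9 show the variant base -> 'het' fits best
lik = genotype_likelihoods(depth=20, alt_reads=9)
best = max(lik, key=lik.get)
print(best)

# With only 3 reads and 1 variant read, the evidence is far weaker
low = genotype_likelihoods(depth=3, alt_reads=1)
print({g: round(v, 4) for g, v in low.items()})
```

At low depth the gap between the "het" and "sequencing error" explanations shrinks, which is one reason deep coverage matters for SNV calling.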

what are barriers to personalized genomics being the be-all/end-all of medical treatment today?

cost is falling rapidly ($1500-2k now)

BUT knowledge of how to interpret consequences of majority of genetic variation is limited

geneticists only know phenotypes caused by mutations in ~3200 of ~25,000 human genes (13%)

each human has ~3 million SNVs, 1200 CNVs - what are effects of these on individual disease risk?

even for the ~10,000 that change a.a. sequence of proteins, currently we can interpret effects of only a minority, + these are just 0.3% of each person's variation

(40 cards)