1000 Genome Slidedeck Flashcards by Tanner Tipton

Long haplotypes = what type of allele frequency?

low

The length of the haplotype that a mutation is present on is inversely related to

how old the mutation is
Recent = long haplotype = low frequency

Why was wide, shallow (low-coverage) sequencing done in the 1000 Genomes Project?

Wide = more people and more data
Having more people means more variation in the data and allows for the identification of common variants

Why was exon sequencing used in the 1000 Genomes Project?

Sequencing is expensive
Exons are the coding regions, so to find meaningful variants it makes sense to target the coding regions

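Average distance in nucleotides between variants

total sequence length divided by the number of variants (the inverse of variant density, which is the number of variants over total space)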
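
A minimal worked sketch of this card: average spacing is the region length divided by the variant count, which is the inverse of variant density. The function name and the numbers are hypothetical, chosen only for illustration.

```python
def average_variant_spacing(region_length_bp: int, n_variants: int) -> float:
    """Average number of base pairs between neighbouring variants."""
    if n_variants <= 0:
        raise ValueError("need at least one variant")
    return region_length_bp / n_variants


# Hypothetical numbers, not taken from the 1000 Genomes data:
# 1,000 variants spread over a 1,000,000 bp region.
spacing = average_variant_spacing(1_000_000, 1_000)
print(spacing)  # 1000.0 bp between variants on average
```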

Which would produce more accurate variant calls, low coverage WGS or high coverage exome?

SNP chips give more accurate variant calls

Pros of WGS

errors become big with little data

Pros of high coverage exomes

average of 80x

cons of high coverage exome

more expensive

Pros of SNP chips

high confidence and cheap

Why did the 1000 Genomes Project summarize variant sites with 0, 1, and 2?

Diploid = 2 copies of each chromosome, so each genotype is the number of copies of the alternate (B) allele
AA=0
AB=1
BB=2
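
The 0/1/2 coding above can be sketched as a count of alternate-allele copies in a diploid genotype. This is only an illustration: the function name is hypothetical, and "B" as the alternate allele follows the card's own notation.

```python
def alt_allele_count(genotype: str, alt: str = "B") -> int:
    """Count copies of the alternate allele in a two-letter diploid genotype."""
    if len(genotype) != 2:
        raise ValueError("a diploid genotype has exactly two alleles")
    return genotype.count(alt)


for gt in ("AA", "AB", "BB"):
    print(gt, alt_allele_count(gt))
```

Storing a single count per person per site keeps the data compact and makes allele frequencies easy to compute: sum the counts and divide by twice the number of people.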

Why is the evidence for a single genotype typically weak in low coverage regions?

With few reads, an observed difference can be a sequencing error
Not enough data to confirm the genotype
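
One way to see why few reads give weak evidence: a rough sketch of how often a true heterozygote (AB) would show only one allele in its reads. It assumes each read samples either allele with probability 1/2 and ignores sequencing errors and mapping bias; the function name is hypothetical.

```python
def prob_het_looks_homozygous(depth: int) -> float:
    """Chance that all reads at a heterozygous site carry the same allele.

    Assumes each read independently samples allele A or B with
    probability 1/2, with no sequencing errors (a simplification).
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    # All A (0.5**depth) or all B (0.5**depth) -> 2 * 0.5**depth
    return 0.5 ** (depth - 1)


for depth in (1, 2, 4, 30):
    print(depth, prob_het_looks_homozygous(depth))
```

Under these assumptions, at a depth of 4 reads one in eight heterozygous sites would look homozygous, while at a depth of 30 this almost never happens.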

How can we address the problem of evidence being weak in low coverage regions?

Genotyping using SNP chips

type of variation we didn’t talk about

structural

What is another name given to regions of low complexity?

repetitive regions

What technology is used to help with repetitive regions

longer reads that can map out more unique data

Accessible genome

the fraction of the reference genome in which short-read data can lead to reliable variant discovery

Accessible genome percentage went from 85% to

94% now

why would individual calls be more accurate at common variants than at low frequency variants

common variants have more data and are more likely to be true than to be a sequencing error

variation among samples in genotype accuracy is primarily driven by sequencing depth- why is this true

more data = fewer calls thrown off by sequencing errors
allows you to determine which sites are true variants

Moderate to high frequency variants tend to be

old

low frequency variants tend to be

new

New mutation equation (starting frequency of a new mutation)

1/(2N), where N is the number of diploid individuals
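
As a small worked example of this formula: a new mutation starts as one copy among the 2N chromosome copies carried by N diploid individuals. The function name and the population size below are hypothetical.

```python
def new_mutation_frequency(n_diploid: int) -> float:
    """Starting frequency of a new mutation: one copy among 2N chromosomes."""
    if n_diploid < 1:
        raise ValueError("population must contain at least one individual")
    return 1 / (2 * n_diploid)


# Hypothetical population of 1,000 diploid individuals (2,000 chromosome copies).
print(new_mutation_frequency(1_000))  # 0.0005
```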

Lower frequency variants are

population dependent- show up in one population and have not spread to others

Why would we expect many low frequency variants?

Different environments and more people are what produce new variants

What would you expect for a population that is contracting?

Fewer new variants, more variants at a higher frequency

Are all variants equally important?

Wobble

third base on codon is changed

Synonymous

same amino acid is coded for

nonsynonymous

different amino acid is coded for
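
The three cards above can be sketched as a small classifier over the standard genetic code. This is an illustrative sketch, not a variant-annotation tool: the function names are hypothetical, and it only handles single codons written in DNA letters (T, C, A, G).

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(ref: str, alt: str) -> str:
    """Label a codon change as synonymous, nonsynonymous, or nonsense."""
    ref_aa, alt_aa = CODON_TABLE[ref.upper()], CODON_TABLE[alt.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"  # codon now codes for a stop
    return "nonsynonymous"


def is_wobble_change(ref: str, alt: str) -> bool:
    """True when only the third base of the codon differs."""
    ref, alt = ref.upper(), alt.upper()
    return ref[:2] == alt[:2] and ref[2] != alt[2]


print(classify_codon_change("GAA", "GAG"), is_wobble_change("GAA", "GAG"))
print(classify_codon_change("GAG", "GTG"), is_wobble_change("GAG", "GTG"))
print(classify_codon_change("TGG", "TGA"), is_wobble_change("TGG", "TGA"))
```

Note that a change at the wobble position is often, but not always, synonymous: the last example changes only the third base yet turns a tryptophan codon into a stop.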

How do you know if an individual has more or fewer variants than expected?

Intron vs. exon placement of the variation; wobble, nonsense, synonymous, nonsynonymous

How is it that we can have an average of 150 broken genes but still be normal?

It depends on other genes or factors; environment also plays a role

Everyone carries

bad variants

Lots of variants in regulatory regions. Why

Why would regulatory sequence tolerate deleterious variations?

What is the primary reason to do imputation?

Fine-mapping existing association signals and detecting new associations; it can fill in missing genotypes and find variants

Rare variants need to be evaluated using

the correct null distribution

(37 cards)