Week 5 (Long Read and Element) Flashcards
ability to resolve a _________ structure is dependent on the length of the molecules in your library
repetitive
segmental duplications
also called “low copy repeats”: blocks of DNA that range from 1 to 400 kb in length, occur at more than one site within the genome, and typically share a high level (>90%) of sequence identity.
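The three criteria in the definition above can be written as a single check. This is only a minimal sketch: the function name and arguments are made up for illustration, and the thresholds are taken directly from the card (1 to 400 kb, more than one site, more than 90% identity).

```python
def is_segmental_duplication(length_bp, copies, identity):
    """Return True if a block fits the flashcard definition.

    Assumed thresholds (from the card): 1-400 kb long, present at more
    than one site in the genome, and more than 90% sequence identity.
    """
    return 1_000 <= length_bp <= 400_000 and copies > 1 and identity > 0.90


print(is_segmental_duplication(50_000, 2, 0.97))  # 50 kb, two sites, 97% identical
print(is_segmental_duplication(300, 40, 0.99))    # only 300 bp: too short to qualify
```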
segmental duplications make up about ____% of the human genome
5
long read technology
- Oxford Nanopore (ONT) (protein nanopores)
- Pacific Biosciences - PacBio (SMRT)
- proximity ligation (assembly)
ONT
Oxford Nanopore (protein nanopores)
SMRT
single molecule real time sequencing
__________ is a heptameric protein pore with an inner diameter of a few nanometers
α-hemolysin
the diameter of α-hemolysin is on the same scale as many single molecules, including DNA. Why?
so that DNA can be extruded from the membrane
ONT (protein nanopore) can be used in real time in the field. 10-20 Gb are read in less than _______
24 hours (standard is 72 hours)
where is α-hemolysin derived from?
it was discovered in Staphylococcus aureus (staph); the pathogenic organism uses this protein pore to penetrate cells in the body
how is DNA extruded through a protein nanopore?
the pore sits in the membrane; a tether holds the DNA on the pore, and a motor protein moves the DNA through the pore
how do we read the bases as they exit the protein pore?
as the DNA goes through the pore, each base has its own structure and disrupts the ion current in a base-specific way, so we can estimate which base is coming out from the change in current
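The idea of reading bases from base-specific changes in ion current can be shown with a toy example. This is a deliberately simplified sketch, not how a real basecaller works: the current levels in `REFERENCE_LEVELS` are invented for illustration, and it assumes one clean current measurement per base.

```python
# Hypothetical mean current levels (picoamps) for each base.
# These numbers are made up for illustration only.
REFERENCE_LEVELS = {"A": 50.0, "C": 60.0, "G": 70.0, "T": 80.0}


def call_base(current):
    """Pick the base whose reference level is closest to the measured current."""
    return min(REFERENCE_LEVELS, key=lambda base: abs(REFERENCE_LEVELS[base] - current))


def call_sequence(currents):
    """Call one base per current measurement."""
    return "".join(call_base(c) for c in currents)


print(call_sequence([51.2, 78.9, 69.5, 60.3]))  # prints ATGC
```

Each noisy measurement is matched to the nearest reference level, which is the "estimate what is coming out" step from the card.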
using protein pores, _____ bases are read per second
400
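As a rough worked calculation, the 400 bases per second figure can be combined with the "10-20 Gb in less than 24 hours" figure from an earlier card to estimate how many pores would have to run in parallel. It is a back-of-the-envelope sketch that assumes every pore reads nonstop at full speed for the whole day, which real runs will not achieve.

```python
BASES_PER_SECOND = 400          # per pore, from the card above
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

bases_per_pore_per_day = BASES_PER_SECOND * SECONDS_PER_DAY


def pores_needed(target_bases):
    """Pores that must run nonstop for 24 h to reach target_bases (rounded up)."""
    return -(-target_bases // bases_per_pore_per_day)  # ceiling division


print(bases_per_pore_per_day)        # 34560000 bases, about 34.6 Mb per pore
print(pores_needed(10_000_000_000))  # 290 pores for 10 Gb
print(pores_needed(20_000_000_000))  # 579 pores for 20 Gb
```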
long read sequencers are important for resolving ________ sequences
repeat
____ Mb is the longest read that has been sequenced so far (in principle, the longest possible read is the largest chromosome)
4.2
selective sequencing
the protein nanopore is able to choose only the sequences that we are interested in; it will reject and eject a molecule if it has seen it already and then restart with a new sequence
why is selective sequencing a really great tool?
it will save time and resources
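The reject-and-eject behavior described for selective sequencing can be sketched as a simple loop. This is a toy model under stated assumptions: molecules are plain strings, the first `prefix_len` bases stand in for the signal the pore sees before deciding, and the function name is hypothetical.

```python
def selective_sequencing(molecules, prefix_len=8):
    """Keep only molecules whose first prefix_len bases have not been seen before.

    Toy model: the pore reads a short prefix, then either ejects the
    molecule (already seen) or reads it to the end (new).
    """
    seen = set()
    kept = []
    for molecule in molecules:
        prefix = molecule[:prefix_len]
        if prefix in seen:
            continue           # reject and eject; restart with the next molecule
        seen.add(prefix)
        kept.append(molecule)  # new sequence: read it in full
    return kept


reads = ["ACGTACGTTTGA", "GGGTTTAAACCC", "ACGTACGTTTGA", "TTTTCCCCGGGG"]
print(selective_sequencing(reads))
```

The duplicate molecule costs only a short prefix read instead of a full read, which is where the saving in time and resources comes from.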
what can protein nanopores read in one read?
- bases sequenced
- bases inserted
- bases deleted
- SNVs
- CpG methylations
centromeres are found in the ________ of the chromosome
middle
telomeres are found at the _____ of the chromosome
ends
what is the difference between the average read lengths of Illumina sequencing and ONT (protein nanopore)?
- Illumina = 150 bp
- ONT = 33-35 kb
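To see why this difference in read length matters for repeats, here is a small sketch comparing the two averages. The 10 kb repeat and the `flank_bp` value are hypothetical numbers chosen for illustration; the read lengths come from the card above.

```python
ILLUMINA_READ_BP = 150  # average, from the card above
ONT_READ_BP = 33_000    # low end of the 33-35 kb average


def can_span(read_bp, repeat_bp, flank_bp=50):
    """True if one read can cover the repeat plus unique sequence on both sides.

    flank_bp is an assumed amount of unique sequence needed on each side
    to place the read unambiguously.
    """
    return read_bp >= repeat_bp + 2 * flank_bp


repeat_bp = 10_000  # a hypothetical 10 kb repeat
print(can_span(ILLUMINA_READ_BP, repeat_bp))  # False
print(can_span(ONT_READ_BP, repeat_bp))       # True
print(ONT_READ_BP // ILLUMINA_READ_BP)        # 220 times longer
```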
what is a major benefit of ONT (protein nanopore)?
read length
what is the average read length of ONT (protein nanopore)?
33-35 kb
in ONT, about _____ bp are read per second
400