Lecture 1 - Biology for Computational Genetics Flashcards
(39 cards)
What are the two forms of computational formulation?
- Numerically encoded input with a computable objective function
- Numerically encoded input with a computable statistical test for significance
What is an example of a computational form in which there is a numerical input and a computable objective function?
input: x, any real number
f(x) = x^2
output: x that minimizes f(x)
What is an example of a computational form in which there is a numerical input and a computable statistical test for significance?
input: the counts of each type of nucleotide in a genome
output: the probability that the nucleotides could be generated by a process that is independent and identically distributed
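One way to make this concrete is a chi-square goodness-of-fit test on the four base counts. The sketch below assumes the null model is i.i.d. draws with all four bases equally likely; that null, the function name, and the sample counts are assumptions of this example and are not specified by the card.

```python
import math

def chi_square_uniform(counts):
    """Chi-square statistic and p-value for A/C/G/T counts.

    Assumed null model (not from the card): i.i.d. bases, each with probability 1/4.
    """
    total = sum(counts[b] for b in "ACGT")
    expected = total / 4
    stat = sum((counts[b] - expected) ** 2 / expected for b in "ACGT")
    # Closed-form survival function of the chi-square distribution, 3 degrees of freedom.
    p_value = (math.erfc(math.sqrt(stat / 2))
               + math.sqrt(2 * stat / math.pi) * math.exp(-stat / 2))
    return stat, p_value

# Hypothetical counts, for illustration only.
balanced = chi_square_uniform({"A": 25, "C": 25, "G": 25, "T": 25})
skewed = chi_square_uniform({"A": 40, "C": 10, "G": 10, "T": 40})
print(balanced, skewed)
```

A small p-value means the observed counts would be unlikely under the assumed null. Counts alone cannot detect dependence between neighboring bases, so this only tests base composition.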
What is maximum parsimony?
if there are multiple solutions to a problem, the one involving the fewest steps is taken to be the biologically correct one
-meaning that when you work out how two sequences are related, you choose the path with the fewest steps
What are synteny blocks?
regions of homologous genes shared between two organisms
-direction matters for these
What are the three steps for the computational formulation of genome rearrangements?
- Transform the problem into a numeric input
a. a list of genes [1…N] and their order and orientation as a permutation in another genome
Example: A=[1,2,3,4,5,6,7], B=[1,-7,-6,-5,2,3,4]
- Define model
a. let the evolutionary model be a reversal: define Rev(G, i, j) to be the reversal of the elements in G from position i to position j; this reverses the order and orientation of each element
Example: Rev(B, 2, 4) = [1,5,6,7,2,3,4]
- Define problem
a. given genome A and permutation B, find the minimal number of reversals to transform B into A
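The three steps can be sketched in code. The block below is a minimal illustration, not the lecture's method: `rev` follows the reversal definition with 1-based inclusive positions, and `reversal_distance` is a hypothetical helper that finds the minimum by brute-force breadth-first search, which is only feasible for very small genomes.

```python
from collections import deque

def rev(genome, i, j):
    """Reverse positions i..j (1-based, inclusive) and flip each element's sign."""
    g = list(genome)
    g[i - 1:j] = [-x for x in reversed(g[i - 1:j])]
    return g

def reversal_distance(start, target):
    """Fewest reversals turning start into target, by exhaustive BFS (small inputs only)."""
    start, target = tuple(start), tuple(target)
    if start == target:
        return 0
    n = len(start)
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        perm, dist = queue.popleft()
        # Try every possible reversal of the current arrangement.
        for i in range(1, n + 1):
            for j in range(i, n + 1):
                nxt = tuple(rev(perm, i, j))
                if nxt == target:
                    return dist + 1
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, dist + 1))
    return None

A = [1, 2, 3, 4, 5, 6, 7]
B = [1, -7, -6, -5, 2, 3, 4]
print(rev(B, 2, 4))             # [1, 5, 6, 7, 2, 3, 4], matching the card's example
print(reversal_distance(B, A))  # 2, e.g. Rev(B, 2, 7) followed by a reversal of positions 2..4
```

The search space grows factorially with the number of genes, so this only demonstrates the problem statement on toy inputs.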
What is the central dogma of genetics?
DNA -> RNA -> protein
What is a plasmid?
a small circular genome that can encode a virus or some bacterial genes
What has a circular genome?
bacteria
What has a linear genome made up of chromosomes?
eukaryotes
How many pairs of autosomal chromosomes and sex chromosomes do humans have, and how many bases in total?
22 pairs of autosomal chromosomes
1 pair of sex chromosomes
3.2 billion bases in total
How long is one DNA base pair, in angstroms and in m?
3.4 angstroms (3.4 × 10^-10 m)
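As a quick worked example, the per-base-pair length can be combined with the 3.2 billion base total from the previous card to estimate how long that DNA would be if stretched end to end. The variable names are illustrative, and the calculation assumes every base pair contributes the same 3.4 angstroms.

```python
ANGSTROM_IN_M = 1e-10       # 1 angstrom = 10^-10 m
BASES_IN_GENOME = 3.2e9     # total bases, from the card above
RISE_PER_BP_ANGSTROM = 3.4  # length per base pair, from this card

# Total length = number of base pairs x length per base pair.
genome_length_m = BASES_IN_GENOME * RISE_PER_BP_ANGSTROM * ANGSTROM_IN_M
print(f"{genome_length_m:.2f} m")  # roughly one meter
```

The result, about 1.09 m, is several orders of magnitude longer than the 6 um nucleus that contains it.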
How many bases are in one full DNA twist, and how many angstroms long is it?
10.5 bases
34 angstroms (3.4 × 10^-9 m)
How many angstroms is one nucleosome?
340 angstroms (3.4 × 10^-8 m)
How many nm is a virus?
20-300 nm (2 × 10^-8 to 3 × 10^-7 m)
How many m is a bacterium?
10^-6m
How many um is a nucleus?
6 um, on the order of 10^-6 m
How many um is a cell?
10um or 10^-5m
What are the genome orders of magnitude?
What is the repeat structure in the human genome?
-the human genome is highly repetitive with repeats spanning many scales
-about 50% of the human genome is repeats when centromeres are excluded; including centromeres brings it to about 55%
-the repeat structure of genomes affects computational problems such as genome assembly and read alignment
What is a short tandem repeat or STR?
-a short monomer sequence of 2-7 bases, repeated in tandem up to several hundred times
-STRs make up about 4.3% of the genome, or 138 Mbp
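The 138 Mbp figure is consistent with 4.3% of the 3.2 billion base genome from an earlier card, and the STR definition maps naturally onto a regular expression. The sketch below is illustrative only: the `find_strs` name, the requirement of at least three tandem copies, and the toy sequence are assumptions of this example, and the pattern also reports homopolymer runs such as AAAAAA as a two-base unit.

```python
import re

# Sanity check of the card's figure: 4.3% of 3.2 billion bases, in Mbp.
str_total_mbp = 0.043 * 3.2e9 / 1e6
print(f"{str_total_mbp:.1f} Mbp")  # about 137.6, which rounds to 138

# A unit of 2-7 bases followed by at least two more copies of itself
# (three copies minimum is an assumed threshold, not from the card).
STR_PATTERN = re.compile(r"([ACGT]{2,7}?)\1{2,}")

def find_strs(seq):
    """Return (start index, repeat unit, copy count) for each tandem repeat found."""
    return [
        (m.start(), m.group(1), len(m.group(0)) // len(m.group(1)))
        for m in STR_PATTERN.finditer(seq)
    ]

toy_sequence = "TTG" + "AC" * 4 + "GG"  # hypothetical sequence
print(find_strs(toy_sequence))  # [(3, 'AC', 4)]
```

Real STR callers work on sequencing reads and tolerate imperfect repeats; this pattern only finds exact copies.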
What is the mutation rate of STRs compared to ordinary DNA?
up to 10,000 times greater than ordinary DNA
What is the societal importance of STRs?
because the repeats are highly mutable, the odds of two people having the same sequence at a locus are low, so 13 loci are enough to give a unique DNA “fingerprint” for forensics databases
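A rough way to see why 13 loci suffice is to multiply per-locus match probabilities. The sketch below assumes the loci are statistically independent and uses a made-up frequency of 0.1 at every locus; both the number and the function name are hypothetical and chosen only to show the arithmetic.

```python
import math

def random_match_probability(genotype_freqs):
    """Chance that an unrelated person matches at every locus, assuming independent loci."""
    return math.prod(genotype_freqs)

# Hypothetical: at each of 13 loci, 10% of people share the observed genotype.
rmp = random_match_probability([0.1] * 13)
print(f"{rmp:.1e}")  # on the order of 1e-13
```

Even with a fairly common genotype at each locus, the combined probability under these assumptions is far smaller than one over the world population.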
What is the biological importance of STRs?
STR expansions are linked to diseases such as ALS or Huntington’s disease