Lecture 1 - Biology for Computational Genetics Flashcards

1
Q

What are the two forms of computational formulation?

A
  1. Numerically encoded input with a computable objective function
  2. Numerically encoded input with a computable statistical test for significance
1
Q

What is an example of a computational form in which there is a numerical input and a computable objective function?

A

input: x, any real value
f(x) = x^2
output: the x that minimizes f(x) (x^2 has no maximum over the reals)

2
Q

What is an example of a computational form in which there is a numerical input and a computable statistical test for significance?

A

input: the counts of each type of nucleotide in a genome
output: the probability that the nucleotides could be generated by a process that is independent and identically distributed

3
Q

What is maximum parsimony?

A

if there are multiple solutions to a problem, the one involving the fewest steps is taken to be the biologically correct one
-meaning that if you want to figure out whether two sequences are related, you assume they are related through the path with the fewest steps

4
Q

What are synteny blocks?

A

show regions of homologous genes between two organisms
-direction matters for these

5
Q

What are the three steps for the computational formulation of genome rearrangements?

A
  1. Transform the problem into a numeric input
    a. a list of genes [1…N] in one genome, and their order and orientation as a signed permutation in another genome
    Example: A=[1,2,3,4,5,6,7], B=[1,-7,-6,-5,2,3,4]
  2. Define model
    a. let the evolutionary model be a reversal: define Rev(G, i, j) to be the reversal of the elements in G from position i to position j, which reverses both the order and the orientation of each element
    Example: Rev(B, 2, 4) = [1,5,6,7,2,3,4]
  3. Define problem:
    a. given genome A and permutation B, find the minimal number of reversals to transform B into A
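
The reversal model above can be sketched in Python. This is only an illustration: the function names `rev` and `min_reversals` are hypothetical, positions are 1-indexed and inclusive as in the card's example, and the brute-force breadth-first search is an assumption that is practical only for very small genomes, not the method taught in the lecture.

```python
from collections import deque

def rev(genome, i, j):
    """Reverse positions i..j (1-indexed, inclusive) and flip each sign."""
    segment = [-g for g in reversed(genome[i - 1:j])]
    return genome[:i - 1] + segment + genome[j:]

def min_reversals(start, target):
    """Breadth-first search for the fewest reversals turning start into target."""
    start, target = tuple(start), tuple(target)
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        genome, steps = queue.popleft()
        if genome == target:
            return steps
        n = len(genome)
        for i in range(1, n + 1):
            for j in range(i, n + 1):
                nxt = tuple(rev(list(genome), i, j))
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, steps + 1))
    return None  # unreachable for signed permutations of the same genes

A = [1, 2, 3, 4, 5, 6, 7]
B = [1, -7, -6, -5, 2, 3, 4]
print(rev(B, 2, 4))         # [1, 5, 6, 7, 2, 3, 4]
print(min_reversals(B, A))  # 2
```

For this pair, two reversals suffice: Rev(B, 2, 7) followed by Rev(·, 2, 4).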
6
Q

What is the central dogma of genetics?

A

DNA → RNA → protein

7
Q

What is a plasmid?

A

a small circular genome that can encode a virus or some bacterial genes

8
Q

What has a circular genome?

A

bacteria

9
Q

What has a linear genome made up of chromosomes?

A

eukaryotes

10
Q

How many pairs of autosomal chromosomes and sex chromosomes do humans have, and how many bases in total?

A

22 pairs of autosomal chromosomes
1 pair of sex chromosomes
3.2 billion bases in total

11
Q

How long is one DNA base pair, in angstroms and in meters?

A

3.4 angstroms (on the order of 10^-10 m)
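
As a rough check on these scales, the per-base-pair length can be combined with the 3.2 billion base genome size from the earlier card. The sketch below assumes the DNA is fully stretched out as one straight line, which is a simplification; the constant and function names are hypothetical.

```python
ANGSTROM_IN_METERS = 1e-10          # 1 angstrom = 10^-10 m
BASE_PAIR_LENGTH_ANGSTROMS = 3.4    # length of one base pair, from the card
HUMAN_GENOME_BASES = 3.2e9          # total bases, from the chromosome card

def stretched_length_m(n_bases):
    """Length in meters of n_bases base pairs laid end to end."""
    return n_bases * BASE_PAIR_LENGTH_ANGSTROMS * ANGSTROM_IN_METERS

print(stretched_length_m(HUMAN_GENOME_BASES))  # roughly 1.09 meters
```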

12
Q

How many bases is 1 full DNA twist and how many angstroms is it?

A

10.5 bases
34 angstroms (on the order of 10^-9 m)

13
Q

How many angstroms is one nucleosome?

A

340 angstroms (on the order of 10^-8 m)

14
Q

How many nm is a virus?

A

20-300 nm (10^-8 to 10^-7 m)

15
Q

How many m is a bacterium?

A

10^-6m

16
Q

How many um is a nucleus?

A

6 um, on the order of 10^-6 m

17
Q

How many um is a cell?

A

10um or 10^-5m

18
Q

What are the genome orders of magnitude?

A
19
Q

What is the repeat structure in the human genome?

A

-the human genome is highly repetitive, with repeats spanning many scales
-about 50% of the human genome is repeats when centromeres are excluded; including centromeres raises this to roughly 55%
-the repeat structure of genomes affects computational problems such as genome assembly and read alignment

20
Q

What is a short tandem repeat or STR?

A

-a repeated monomer sequence of 2-7 bases, repeated in tandem up to several hundred times
-STRs make up about 4.3% of the genome, or 138 Mbp
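
To make the definition concrete, here is a minimal Python sketch that scans a sequence for tandem repeats of a 2-7 base unit. The function name `find_strs` and the threshold of at least 3 copies are assumptions chosen for illustration, not values from the lecture, and real STR callers are considerably more careful.

```python
import re

def find_strs(seq, min_copies=3):
    """Return (start, unit, copies) for each tandem repeat of a 2-7 base unit.

    start is 0-indexed; min_copies is an illustrative threshold.
    """
    pattern = re.compile(rf"([ACGT]{{2,7}}?)\1{{{min_copies - 1},}}")
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits

print(find_strs("TTACACACACGG"))  # [(2, 'AC', 4)]
```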

21
Q

What is the mutation rate of STRs compared to ordinary DNA?

A

up to 10,000 times greater than ordinary DNA

22
Q

What is the societal importance of STRs?

A

since the repeats are highly mutable, the odds of two people having the same sequence at a locus are low, which means you can get a unique DNA “fingerprint” with only 13 loci for forensics databases

23
Q

What is the biological importance of STRs?

A

STR expansions are linked to diseases such as ALS or Huntington’s

24
Q

What is a mobile element?

A

mobile elements are DNA sequences that copy themselves or hijack the cell's reverse transcription system; they are pieces of DNA with information encoded in them so that the cell can copy them and insert the copies back into the genome

25
Q

What is an autonomous mobile element?

A

autonomous mobile elements are sequences that encode the proteins that copy that very sequence of DNA; this creates a positive feedback loop, and they are often hypermethylated to terminate that loop

26
Q

What is a non-autonomous mobile element?

A

sequences that get copied into RNA and then inserted back into the genome, but cannot do this themselves and rely on other cell machinery to do so

27
Q

What is a non-autonomous mobile element of the human genome?

A

Alu, a 280-350 base sequence with about 1 million copies in the human genome

28
Q

What is an autonomous mobile element of the human genome?

A

LINE - up to 7,000 bases long; LINEs make up 15% of the human genome

29
Q

What is a segmental duplication?

A

a sequence at least 1 kb in length, not a mobile element or tandem repeat, that is duplicated with at least 90% identity elsewhere in the human genome

30
Q

How are segmental duplications drivers of evolution and disease?

A

segmental duplications drive nonallelic homologous recombination, which results in the duplication or deletion of a region

e.g. 15% of chromosome 16 is novel with respect to the human-chimpanzee ancestor due to segmental duplication expansion
-1% of autism cases are linked to a deletion in a segmental duplication

31
Q

What is the binary representation of DNA?

A

A = 00
T = 11
G = 10
C = 01

represented in two bits
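
The two-bit encoding can be sketched in Python by packing a sequence into a single integer, two bits per base, using the code table above. The function names `pack` and `unpack` are hypothetical, and storing the whole sequence in one Python integer is an assumption made to keep the example short.

```python
# Code table from the card: A=00, C=01, G=10, T=11
ENCODE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
DECODE = {code: base for base, code in ENCODE.items()}

def pack(seq):
    """Pack a DNA string into an integer, 2 bits per base, first base highest."""
    value = 0
    for base in seq:
        value = (value << 2) | ENCODE[base]
    return value

def unpack(value, length):
    """Recover a DNA string of the given length from its packed integer."""
    bases = []
    for _ in range(length):
        bases.append(DECODE[value & 0b11])
        value >>= 2
    return "".join(reversed(bases))

packed = pack("GATTACA")
print(bin(packed))        # 0b10001111000100
print(unpack(packed, 7))  # GATTACA
```

The length has to be stored separately, because leading A bases (00) would otherwise be indistinguishable from padding.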

32
Q

How many bits make a byte?

A

8 bits

33
Q

How can the reverse complement be computed?

A

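apply the bitwise NOT operator to each base's two-bit code to get its complement (A=00 becomes T=11, G=10 becomes C=01, and vice versa), then reverse the order of the bases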
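
With the encoding A=00, C=01, G=10, T=11, applying NOT to each two-bit code yields the complementary base, and reading the result back to front gives the reverse complement. A minimal Python sketch, with a hypothetical function name and a per-base list used for clarity instead of a packed word:

```python
ENCODE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
DECODE = {code: base for base, code in ENCODE.items()}

def reverse_complement(seq):
    """Complement each base with bitwise NOT, then reverse the base order."""
    codes = [ENCODE[base] for base in seq]
    # ~code flips every bit; masking with 0b11 keeps only the two base bits.
    complemented = [~code & 0b11 for code in codes]
    return "".join(DECODE[code] for code in reversed(complemented))

print(reverse_complement("GATTACA"))  # TGTAATC
```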

34
Q

How to represent all combinations of nucleotides of length 8?

A

4^8 = 65,536 combinations; at 2 bits per base, each sequence of length 8 fits in 16 bits (2 bytes)

35
Q

How much memory is required to store a 4G or 4 billion base genome with binary encoding?

A

one byte = 8 bits
one base = 2 bits
so 4 bases = one byte

4 billion bases / 4 bases per byte = 1 billion bytes (about 1 GB)
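
The same arithmetic as a short Python sketch. The function name is hypothetical, and rounding up to a whole byte is an assumption for genome lengths that are not a multiple of 4.

```python
BITS_PER_BASE = 2
BITS_PER_BYTE = 8

def genome_bytes(n_bases):
    """Bytes needed to store n_bases at 2 bits per base, rounded up."""
    total_bits = n_bases * BITS_PER_BASE
    return (total_bits + BITS_PER_BYTE - 1) // BITS_PER_BYTE

print(genome_bytes(4_000_000_000))  # 1000000000
```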

36
Q

What is included in the anatomy of a gene?

A

a 5’ untranslated region (UTR), followed by an open reading frame (ORF), followed by a 3’ untranslated region

37
Q

What do enhancers and silencers do?

A

enhancers increase and silencers decrease the amount of RNA transcribed; they act through DNA-binding proteins (transcription factors) that bind to the DNA

38
Q
A