Lecture 2 Flashcards

(49 cards)

1
Q

What is protein bioinformatics

A

Analysis of protein sequences and structures to gain insight into the properties and function of a protein

2
Q

What do we use to compare sequences

A

BLAST (from the NCBI website)

It searches the database for other sequences that match the one you put in

3
Q

What can you put in the query box of a BLAST search

A

The accession number

The GI number (gi)

The bare sequence

Or the FASTA formatted sequence

4
Q

What is a GI number (gi)

A

It’s a simple series of digits assigned to each sequence processed by NCBI

5
Q

What is FASTA format

How long are the lines

A

The first line starts with > followed by a single-line description of the sequence

All sequence lines are shorter than 80 characters

No blank lines
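
The three rules on this card can be sketched in Python as follows. The function name to_fasta and the default width of 60 are illustrative choices, not something the lecture specifies; any width under 80 satisfies the rule.

```python
def to_fasta(description, sequence, width=60):
    """Build a FASTA record: a '>' description line, then wrapped sequence lines."""
    if "\n" in description:
        raise ValueError("the description must fit on a single line")
    if not 0 < width < 80:
        raise ValueError("width must keep sequence lines shorter than 80 characters")
    # Drop spaces and newlines so the record contains no blank lines.
    cleaned = "".join(sequence.split())
    # Cut the sequence into chunks of at most `width` characters.
    lines = [cleaned[i:i + width] for i in range(0, len(cleaned), width)]
    return "\n".join([">" + description] + lines)


record = to_fasta("example protein", "M" * 130)
print(record)
```

For a 130-residue sequence this gives one description line followed by sequence lines of 60, 60, and 10 characters.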

6
Q

What do B U X Z * - stand for in FASTA

A

B: aspartate or asparagine

U: selenocysteine

X: any amino acid residue

Z: glutamate or glutamine

*: translation stop

-: gap of any length, used to align the sequence better

7
Q

What is selenocysteine

A

An additional amino acid beyond the standard 20: bacteria (and other organisms) hijack 1 of the 3 stop codons and use it to insert selenocysteine or pyrrolysine instead of stopping translation

8
Q

What are metagenomic proteins

A

Extract RNA/DNA from a bulk sample (like ocean water)

Sequence it, then run blastp on the resulting protein sequences

9
Q

When does Quick BLASTP (accelerated protein-protein BLAST) work best

A

If the target is more than 50% identical to the query

10
Q

What are the other types of BLAST

A

PSI-BLAST (builds a position-specific scoring matrix from the results of the first run)

PHI-BLAST (alignments are limited to those that match a pattern in the query)

DELTA-BLAST (position-specific scoring using the results of a conserved domain database search)

11
Q

What is BLOSUM62

A

A substitution matrix that assigns a score to each aligned pair of residues

12
Q

Negatively charged amino acids

A

Aspartate, glutamate

13
Q

Positively charged amino acids

A

Lysine, histidine, arginine

14
Q

Polar uncharged amino acids

A

Serine, threonine, asparagine, glutamine

15
Q

Amino acids with hydrophobic side chains

A

Leucine, valine, isoleucine, alanine, methionine, phenylalanine, tyrosine, tryptophan

16
Q

Which amino acids are special cases

A

Cysteine, selenocysteine (U), glycine, proline (helix breaker)

17
Q

Why do the unique amino acids get a higher score in BLOSUM

A

Because they’re so unusual, they’re in that position for a reason, so keeping them aligned gets a higher score

18
Q

When scoring, what is the effect of putting gaps in the sequence to match the amino acids

A

That position gets a score of -1 (a gap penalty)

19
Q

In scoring, cysteine with any other amino acid gets what score

A

A negative score (in BLOSUM62 the one exception is alanine, which scores 0)

20
Q

Why does the algorithm really like to align tryptophans with tryptophans? (it gives them a very high score)

A

Because it’s such an unusual amino acid
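
The scoring cards above can be tied together with a small Python sketch. It sums a score for each aligned pair and charges -1 for each gap position, as on the gap card. The table holds only a handful of BLOSUM62 entries chosen for illustration (the full matrix covers every pair), and the names BLOSUM62_SUBSET, pair_score, and alignment_score are made up for this example. A flat -1 per gap is a simplification.

```python
# A few BLOSUM62 entries only; the matrix is symmetric, so each pair is stored once.
BLOSUM62_SUBSET = {
    ("W", "W"): 11,  # tryptophan: unusual, so a match scores very high
    ("C", "C"): 9,   # cysteine: another special case with a high match score
    ("A", "A"): 4,
    ("L", "L"): 4,
    ("L", "I"): 2,   # similar hydrophobic side chains
    ("F", "Y"): 3,   # similar aromatic side chains
    ("C", "W"): -2,  # cysteine against a different residue
}
GAP_SCORE = -1  # score for a position aligned to a gap, as on the gap card


def pair_score(a, b):
    """Score one aligned column; raises KeyError for pairs outside the subset."""
    if a == "-" or b == "-":
        return GAP_SCORE
    if (a, b) in BLOSUM62_SUBSET:
        return BLOSUM62_SUBSET[(a, b)]
    return BLOSUM62_SUBSET[(b, a)]


def alignment_score(seq1, seq2):
    """Add up the column scores of two already-aligned, equal-length sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have the same length")
    return sum(pair_score(a, b) for a, b in zip(seq1, seq2))


print(alignment_score("WLA", "WIA"))  # 11 + 2 + 4 = 17
print(alignment_score("W-C", "WAC"))  # 11 - 1 + 9 = 19
```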

21
Q

In the graphic summary of a blastp, the first line is

If one line starts later than the other lines, what does this mean

A

The top hit

The first part of the query doesn’t match that sequence, so it is shown as a gap

22
Q

What is the expect value (E value) of a blastp

A

The number of hits (matches) expected to occur by chance

It’s used to set a threshold of significance (how likely is it that the sequence got aligned by chance)

If it is low, the sequence is a significant match
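
As a rough illustration of why a higher-scoring hit has a lower E value, the sketch below assumes the standard relation E = m * n * 2^(-S'), where S' is the bit score of the alignment, m is the query length, and n is the total length of the database. The function name and the numbers are illustrative. BLAST itself uses effective (edge-corrected) lengths, so reported values will differ.

```python
def expect_value(bit_score, query_length, database_length):
    """Approximate number of chance hits scoring at least `bit_score`."""
    search_space = query_length * database_length
    return search_space * 2.0 ** (-bit_score)


# Same search space, two different bit scores: the better score is far less
# likely to have arisen by chance.
weak = expect_value(30, 300, 1_000_000)
strong = expect_value(60, 300, 1_000_000)
print(weak, strong)
```

Doubling the database size doubles E for the same score, which is why the same alignment can look less significant in a larger database.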

23
Q

What do the positives mean in a blastp sequence alignment

A

A + means it aligned a conservative substitution

Ex. F to Y: these are marked with a plus because they are different amino acids with similar properties

24
Q

If in a sequence there are AV

How many gaps is it

25
Q

What is a blastp clustal alignment

A

Shows the sequence alignments of all the matches together

26
Q

What is similarity between sequences quantified by

A

% identity

% similarity (similar amino acids, e.g. leucine and isoleucine)

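A minimal Python sketch of the two measures, assuming both are taken over the full alignment length (gap columns included) and that % similarity counts identical positions plus similar ones. The small SIMILAR_PAIRS set is a hypothetical stand-in for a real substitution matrix, and the function name is illustrative.

```python
# Hypothetical set of "similar" residue pairs; a real tool would use a
# substitution matrix such as BLOSUM62 to decide this.
SIMILAR_PAIRS = {frozenset("LI"), frozenset("FY"), frozenset("DE"), frozenset("KR")}


def identity_and_similarity(aligned1, aligned2):
    """Return (% identity, % similarity) for two aligned sequences of equal length."""
    if len(aligned1) != len(aligned2) or not aligned1:
        raise ValueError("need two non-empty aligned sequences of equal length")
    identical = 0
    similar = 0
    for a, b in zip(aligned1, aligned2):
        if a == "-" or b == "-":
            continue  # a gap column is neither identical nor similar
        if a == b:
            identical += 1
        elif frozenset((a, b)) in SIMILAR_PAIRS:
            similar += 1
    length = len(aligned1)
    return 100.0 * identical / length, 100.0 * (identical + similar) / length


print(identity_and_similarity("MLFK", "MIYA"))  # (25.0, 75.0)
```
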
27
Q

What does homologous mean in matching sequences

A

The products of 2 genes have a shared ancestry

Meaning it matches sequences that may have come from a common ancestor

28
Q

In a table of amino acids and their preference to adopt a specific secondary structure, what does a value greater than one mean

A

It shows that the amino acid has a tendency to adopt that secondary structure

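One common way such a table is built (an assumption here, since the card does not say how its values were computed) is to divide the fraction of a given amino acid found in the structure by the fraction of all residues found in that structure. The counts below are invented for illustration.

```python
def propensity(residue_in_structure, residue_total, all_in_structure, all_total):
    """Ratio of how often one residue type is in a structure to the overall average."""
    residue_fraction = residue_in_structure / residue_total
    overall_fraction = all_in_structure / all_total
    return residue_fraction / overall_fraction


# Invented counts: 50 of 100 copies of a residue sit in helices,
# while 250 of 1000 residues overall are helical.
print(propensity(50, 100, 250, 1000))  # 2.0, so this residue favors helices
```

A value of 1 means the residue shows up in that structure exactly as often as the average residue, and a value below 1 means it tends to avoid it.
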
29
Q

What are the helix breakers

A

Glycine and proline

30
Q

What are IDRs

A

Intrinsically disordered regions

31
Q

Why would a protein have intrinsically disordered regions

A

They expose short linear motifs that mediate protein-protein interactions

They allow regulation of the protein's function through PTMs at the IDR

They regulate the protein's half-life by engaging proteins that have been targeted for degradation by the proteasome (so ubiquitin is added to the IDR)

They adopt different conformations when binding to different interaction partners

32
Q

What are traits of intrinsically disordered proteins (IDPs)

A

They are fully disordered

They can be boiled and stay soluble (instead of precipitating)

33
Q

IDRs are ____ than loops and turns

A

Longer

34
Q

Example of a protein with an IDR

A

PP2B/calcineurin

35
Q

What are sequence signatures

A

A sequence that has certain key amino acids in specific positions, which are there only to perform a specific role (like folding in a specific way or having a specific property)

36
Q

[LMFY] {EF} x in a sequence signature means what

A

[LMFY]: any one of the amino acids in the brackets

{EF}: any amino acid except the ones in the braces

x: any amino acid

37
Q

What are motifs

How long are they

A

Short sequence patterns that have a specific function

Usually 3-8 aa, max is 20 aa

38
Q

Give examples of motifs

A

Transit peptides (N-terminal sequence that takes the protein to a specific area in the cell)

Binding sequences (the sequence makes the protein form a specific complex with another protein)

Motifs that are recognized for covalent modification

39
Q

What are domains

A

A region of the protein's polypeptide chain that folds independently and has a specific function

Like a parts list for proteins

Ex. SH2/SH3 domains

40
Q

What does the website PROSITE tell us

A

About the protein's signatures, domains, and motifs

41
Q

What do < and > mean in PROSITE

A

<: amino-terminal element

>: carboxy-terminal element

42
Q

What does x(2,4) mean in PROSITE

A

x-x

Or x-x-x

Or x-x-x-x

So any number from 2 to 4 of any amino acid

43
Q

What is the rule for x(2,4)

A

Ranges are allowed only for x, and not at the amino or carboxy end of the pattern unless it is anchored to the terminus

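The pattern symbols from the cards above ([ ], { }, x, x(2,4), < and >) map naturally onto regular expressions. The Python sketch below is one possible translation, assuming pattern elements are separated by hyphens as in PROSITE entries. The function name prosite_to_regex is made up for this example, and only the syntax covered on these cards is handled.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression."""
    pattern = pattern.rstrip(".")
    regex = ""
    if pattern.startswith("<"):  # anchored to the amino terminus
        regex += "^"
        pattern = pattern[1:]
    anchored_end = pattern.endswith(">")  # anchored to the carboxy terminus
    if anchored_end:
        pattern = pattern[:-1]
    for element in pattern.split("-"):
        # Separate the element from an optional repeat such as (3) or (2,4).
        match = re.fullmatch(r"([^()]+)(?:\((\d+)(?:,(\d+))?\))?", element)
        if match is None:
            raise ValueError("cannot parse element: " + element)
        core, low, high = match.groups()
        if high is not None and core != "x":
            raise ValueError("a range like (2,4) is only allowed for x")
        if core == "x":
            regex += "."  # any amino acid
        elif core.startswith("[") and core.endswith("]"):
            regex += core  # any one of the listed amino acids
        elif core.startswith("{") and core.endswith("}"):
            regex += "[^" + core[1:-1] + "]"  # anything except the listed ones
        elif len(core) == 1 and core.isupper():
            regex += core  # one specific amino acid
        else:
            raise ValueError("unsupported element: " + element)
        if low is not None:
            regex += "{" + low + ("," + high if high else "") + "}"
    if anchored_end:
        regex += "$"
    return regex


print(prosite_to_regex("[LMFY]-{EF}-x"))  # [LMFY][^EF].
print(prosite_to_regex("C-x(2,4)-C"))     # C.{2,4}C
print(prosite_to_regex("S-K-L>"))         # SKL$
```
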
44
Q

What website lets us see the predicted transmembrane regions of a protein

A

DeepTMHMM

45
Q

What are SLiMs

A

Short linear interaction motifs

They drive specific protein-protein interactions

46
Q

Give 2 examples of what a SLiM does

A

The motif RVxF on a protein docks PP1 (protein phosphatase 1) onto that protein

It’s a 5 residue motif

Peroxisome targeting: the signals are located at the C termini of proteins (ex. SKL-COO-)

This makes the protein go to the peroxisome

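Both examples can be scanned for with regular expressions. The Python sketch below reads the two motifs literally as written on the card (RVxF anywhere in the sequence, SKL at the very C terminus). The function name and the test sequence are invented, and real motif definitions are more permissive than these literal patterns.

```python
import re

RVXF = re.compile(r"RV.F")      # R, V, any residue, F: the PP1 docking motif
PTS_SKL = re.compile(r"SKL$")   # SKL as the last three residues (C terminus)


def find_slims(sequence):
    """Return the start positions of RVxF matches and whether the sequence ends in SKL."""
    rvxf_starts = [m.start() for m in RVXF.finditer(sequence)]
    ends_with_skl = PTS_SKL.search(sequence) is not None
    return rvxf_starts, ends_with_skl


# Invented sequence carrying both motifs.
print(find_slims("MAAKRVSFEESKL"))  # ([4], True)
```
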
47
Q

What is pY

A

Phosphotyrosine

48
Q

Once a transit peptide takes its protein to a certain area in the cell, what happens

A

A protease cleaves the transit peptide

49
Q

What are transit peptides used for

A

To send proteins to the chloroplast, to the mitochondria, or out of the cell (secretion)