Bioinformatics and AI Flashcards

1
Q

What the heck is Bioinformatics?

A
  • An emerging interdisciplinary field in research & applied sciences
  • Deals with the computational management and analysis of biological information: genes, genomes, proteins, cells, ecological systems, medical information, robots, artificial intelligence…
2
Q

What are the 2 cores of Bioinformatics?

A
  1. Coding
  2. Algorithm
3
Q

Bioinformatics is NOT simply using existing software to analyze biological data. You will need to be able to create your own ____ and ____.

A

Code and Algorithm

4
Q

What are 5 Bioinformatics Applications?

A
  1. Sequence analysis
    – Geneticists/ molecular biologists analyze genome sequence information to understand disease processes
  2. Molecular modeling
    – Crystallographers/ biochemists design drugs using computer-aided tools
  3. Phylogeny/evolution
    – Geneticists obtain information about the evolution of organisms by looking for similarities in gene sequences
  4. Ecology and population studies
    – Bioinformatics is used to handle large amounts of data obtained in population studies
  5. Medical informatics
    – Personalized medicine
5
Q

Explain KEGG

A
  • Protein pathway database (KEGG = Kyoto Encyclopedia of Genes and Genomes)
  • Search database for metabolic and regulatory pathways
  • Compute KEGG: Generate possible reaction pathways between two compounds
6
Q

Define Phylogeny Tree

A

A tree diagram of evolutionary relationships, determined by analyzing sequences

7
Q

Explain the Task of Sequence Alignment

A
  • The draft human genome is available
  • Automated gene finding is possible
  • Gene: AGTACGTATCGTATAGCGTAA
  • One approach: Is there a similar gene in another species?
    1. Align sequences with known genes
    2. Find the gene with the “best” match
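The two steps above can be sketched in a few lines. This is only an illustration, assuming a simple global alignment score (Needleman-Wunsch) with made-up scoring values (match +1, mismatch -1, gap -2); the entries in `known_genes` are hypothetical, not real database records.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score of a and b (score only, no traceback)."""
    # prev[j] holds the best score for aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def best_match(query, known_genes):
    """Step 1: align the query with every known gene. Step 2: keep the best score."""
    return max(known_genes, key=lambda name: alignment_score(query, known_genes[name]))


query = "AGTACGTATCGTATAGCGTAA"
known_genes = {  # hypothetical sequences, for illustration only
    "species_x_gene": "AGTACGTTTCGTATAGCGTAA",
    "species_y_gene": "CCCCGGGGTTTTAAAACCCCG",
}
print(best_match(query, known_genes))
```

Comparing the query against every known gene this way is slow for large databases, which is why heuristic tools such as BLAST exist.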
8
Q

Define Heuristic

A

The use of general knowledge gained from experience (a rule of thumb) to reach a good, though not guaranteed optimal, answer quickly

9
Q

What is BLAST?

A
  • Basic Local Alignment Search Tool
  • BLAST is by far the most frequently used database search program. The algorithm finds the longest significant matches between a query sequence and the sequences in a database.
  • Example of a Heuristic Method for database search
10
Q

BLAST Key Terminologies: Word

A

a substring of a sequence of a given length

11
Q

BLAST Key Terminologies: Segment

A

a substring of a sequence

12
Q

BLAST Key Terminologies: Segment Pair

A

an un-gapped alignment between 2 equal-length segments with an associated score
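A minimal sketch of scoring one segment pair, assuming a simple match/mismatch scheme (+1/-1). The function name and the scoring values are illustrative assumptions, not taken from the slides.

```python
def segment_pair_score(seg_a, seg_b, match=1, mismatch=-1):
    """Score an ungapped alignment of two equal-length segments."""
    if len(seg_a) != len(seg_b):
        raise ValueError("a segment pair needs two segments of equal length")
    # Compare position by position; no gaps are ever introduced.
    return sum(match if x == y else mismatch for x, y in zip(seg_a, seg_b))


print(segment_pair_score("ACGT", "ACGA"))  # 3 matches, 1 mismatch -> 2
```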

13
Q

BLAST Key Terminologies: MSP (maximum scoring pair)

A

the segment pair with the highest score in a given context
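To make the definition concrete, here is a brute-force sketch that finds the highest-scoring segment pair between two short sequences. It assumes +1/-1 match/mismatch scoring and checks every diagonal exhaustively, which BLAST itself avoids; the function name is hypothetical.

```python
def maximal_segment_pair(a, b, match=1, mismatch=-1):
    """Return (score, segment_of_a, segment_of_b) for the best ungapped pair."""
    best = (0, "", "")
    # Each offset fixes one diagonal: position i in a pairs with j = i - offset in b.
    for offset in range(-(len(b) - 1), len(a)):
        i, j = max(offset, 0), max(-offset, 0)
        run, start, k = 0, 0, 0
        while i + k < len(a) and j + k < len(b):
            s = match if a[i + k] == b[j + k] else mismatch
            if run <= 0:
                run, start = s, k   # a non-positive prefix never helps: restart here
            else:
                run += s
            if run > best[0]:
                best = (run, a[i + start:i + k + 1], b[j + start:j + k + 1])
            k += 1
    return best


print(maximal_segment_pair("GGACGTCC", "TTACGTAA"))  # (4, 'ACGT', 'ACGT')
```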

14
Q

How does BLAST work?

A

Preprocessing –> Comparison –> Extension

Key idea: the further an MSP can be extended in both directions, the less likely it is that the match occurred by chance

15
Q

Explain BLAST Preprocessing

A

Query sequence is broken down into a list of short, contiguous words with no repeats.
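A small sketch of this step, assuming a word length of 3 and a dictionary that maps each distinct word to the positions where it starts in the query. The function name and the choice of data structure are illustrative assumptions.

```python
def query_words(query, w=3):
    """Map each distinct word of length w to its start positions in the query."""
    index = {}
    for i in range(len(query) - w + 1):
        # Each word is stored once; repeated occurrences only add a position.
        index.setdefault(query[i:i + w], []).append(i)
    return index


print(query_words("ACGACG"))  # {'ACG': [0, 3], 'CGA': [1], 'GAC': [2]}
```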

16
Q

Explain BLAST Comparison

A
  • Words in the list are compared with a list of words from the database. Matches are recorded as scores.
  • Only the high scoring seeds are stored for later use
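The comparison step might look like the sketch below. It assumes +1/-1 match/mismatch scoring and a score cutoff named `threshold`, and it compares every query word with every database word directly; real implementations use precomputed lookup tables instead. All names are hypothetical.

```python
def find_seeds(query, subject, w=3, threshold=3, match=1, mismatch=-1):
    """Return (query_pos, subject_pos, score) for word pairs scoring >= threshold."""
    seeds = []
    for qi in range(len(query) - w + 1):
        q_word = query[qi:qi + w]
        for si in range(len(subject) - w + 1):
            s_word = subject[si:si + w]
            score = sum(match if x == y else mismatch for x, y in zip(q_word, s_word))
            if score >= threshold:   # only high-scoring seeds are kept
                seeds.append((qi, si, score))
    return seeds


print(find_seeds("ACGT", "TTACGA"))  # [(0, 2, 3)]
```

With `threshold` equal to `w`, only exact word matches survive; lowering it admits near matches as seeds.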
17
Q

Explain BLAST Extension

A

Extend the matches between the query and database sequences by linking MSPs in both directions. The process continues for significant matches.
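A sketch of ungapped extension from one seed. It assumes +1/-1 scoring and a stopping rule in which extension ends once the running score falls more than `x_drop` below the best score seen so far; the rule, its value, and the function name are assumptions for illustration, not details from the slides.

```python
def extend_seed(query, subject, qi, si, w, match=1, mismatch=-1, x_drop=2):
    """Extend a seed of length w; return (score, query_start, subject_start, length)."""
    def pair(q, s):
        return match if query[q] == subject[s] else mismatch

    best = run = sum(pair(qi + k, si + k) for k in range(w))
    right, k = 0, w
    while qi + k < len(query) and si + k < len(subject):   # extend to the right
        run += pair(qi + k, si + k)
        if run > best:
            best, right = run, k - w + 1
        elif best - run > x_drop:
            break
        k += 1

    run, left, k = best, 0, 1
    while qi - k >= 0 and si - k >= 0:                     # extend to the left
        run += pair(qi - k, si - k)
        if run > best:
            best, left = run, k
        elif best - run > x_drop:
            break
        k += 1
    return best, qi - left, si - left, w + left + right


# Seed "ACG" at position 2 in both sequences grows to "ACGT".
print(extend_seed("GGACGTCC", "TTACGTAA", qi=2, si=2, w=3))  # (4, 2, 2, 4)
```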

18
Q

What does ktup mean?

A
  • ktup factor is used to adjust the word size.
  • Larger word size increases speed.
  • ktup (Tuple) = n –> n letters are read as the basic “scan” unit
  • ktup ↑ = selectivity ↑ = sensitivity ↓
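The trade-off can be seen by counting exact word hits between two sequences for several word sizes. This is a toy sketch with made-up sequences and a hypothetical function name: larger `ktup` values give fewer hits to follow up (faster, more selective) but can miss weaker similarities (less sensitive).

```python
def count_word_hits(query, subject, ktup):
    """Count exact matches between ktup-letter words of query and subject."""
    subject_words = {}
    for j in range(len(subject) - ktup + 1):
        subject_words.setdefault(subject[j:j + ktup], []).append(j)
    return sum(
        len(subject_words.get(query[i:i + ktup], []))
        for i in range(len(query) - ktup + 1)
    )


for k in (2, 4, 5):
    print(k, count_word_hits("GGACGTCC", "TTACGTAA", k))
# 2 3
# 4 1
# 5 0
```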
19
Q

ADD SLIDES 28-32

20
Q

What are some challenges in Bioinformatics?

A
  1. Explosion of Information
  2. Lack of “Bioinformaticians”
21
Q

Explain Challenge 1 in Bioinformatics: Explosion of Information

A
  • Need for faster, automated analysis to process large amounts of data
  • Need for integration between different types of information (sequences, literature, annotations, protein levels, RNA levels etc…)
  • Need for “smarter” software to identify interesting relationships in very large data sets
22
Q

Explain Challenge 2 in Bioinformatics: Lack of “Bioinformaticians”

A
  • Software needs to be easier to access, use and understand
  • Biologists need to learn about the software, its limitations, and how to interpret its results