BIOINFORMATICS Flashcards

1
Q

Concerned with knowledge and the flow of knowledge in biological systems using computational methods in genetics and genomics

A

BIOINFORMATICS

2
Q

Study of genomes (an organism's complete set of genes)

A

Genomics

3
Q

Large-scale study of proteins (an organism's proteome)

A

Proteomics

4
Q

A collection of related information that is:
○ Structured
○ Searchable → index
○ Updated periodically
○ Cross-referenced → hyperlinks

A

DATABASES

5
Q

○ Programs that keep the database working behind the scenes
○ Computerized data-keeping system

A

Tier 1: Database management system

6
Q

○ Facilitates communications between applications or databases
○ Extracts information from either local or remote databases

A

Tier 2: Middleware layer

7
Q

○ Enables users to access the database from anywhere without the need for downloading or installing any code
○ The part the user sees: the graphical user interface.

A

Tier 3: Web interface
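
The three tiers above can be illustrated with a toy sketch. It assumes SQLite as the database management system and plain functions for the middleware and interface layers; the table layout, the accession numbers, and the function names are all hypothetical, and a real bioinformatics database would put a web server in place of `render`.

```python
import sqlite3

# Tier 1: the DBMS (here SQLite, in memory) stores and indexes the records.
def build_database():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sequences (accession TEXT PRIMARY KEY, organism TEXT, seq TEXT)"
    )
    conn.executemany(
        "INSERT INTO sequences VALUES (?, ?, ?)",
        [
            ("DEMO0001", "Homo sapiens", "ATGGCC"),  # made-up toy records
            ("DEMO0002", "Mus musculus", "ATGTTT"),
        ],
    )
    return conn

# Tier 2: middleware turns an application request into a database query.
def find_by_organism(conn, organism):
    rows = conn.execute(
        "SELECT accession, seq FROM sequences WHERE organism = ? ORDER BY accession",
        (organism,),
    )
    return [{"accession": acc, "seq": seq} for acc, seq in rows]

# Tier 3: the interface formats results for the user (a web page in practice).
def render(records):
    return "\n".join(f"{r['accession']}\t{r['seq']}" for r in records)

conn = build_database()
page = render(find_by_organism(conn, "Homo sapiens"))
print(page)
```

The user only ever touches `render`; the query language and storage details stay hidden in the lower tiers.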

8
Q

CLASSIFICATION OF DATABASES
1. Scope of data coverage
give me the 2

A

● Comprehensive
● Specialized

9
Q

CLASSIFICATION OF DATABASES
2. Methods of biocuration
give me the 2

A

● Expert-curated (RefSeq)
● Community-curated (Gene Wiki)

10
Q

CLASSIFICATION OF DATABASES
3. Level of biocuration
give me the 3

A

● Primary
● Secondary
● Composite

11
Q

CLASSIFICATION OF DATABASES
4. Type of data managed
give me the 3

A

● DNA/RNA/Protein
● Disease
● Nomenclature/Literature

12
Q

● Information on sequence or structure alone
● Experimentally derived data submitted directly
● Archival in nature

A

PRIMARY DATABASE

13
Q

● Combines a variety of primary databases, allowing an "all-in-one" search across multiple resources

A

COMPOSITE DATABASE

14
Q

● Derived from primary databases
● Based on analysis of the data from the primary database

A

SECONDARY DATABASE

15
Q

“Google” of bioinformatics

A

COMPOSITE DATABASE

16
Q

● The most widely used is PubMed
● Contains entries for >11 million abstracts of scientific publications

A

LITERATURE DATABASE

17
Q

● GenBank, EMBL-Bank, and DDBJ exchange data to ensure comprehensive worldwide coverage; accession numbers are managed consistently between the three centers

A

NUCLEIC ACID DATABASE

18
Q

● Contains publicly available DNA sequences from >100,000 organisms
● Also contains derived protein sequences, and annotations describing biological, structural, and other relevant features

A

GenBank

19
Q

● Contains nucleotide sequences from all public sources.
● Accessible through Sequence Retrieval System (SRS), which allows keyword searching.
● Sequence similarity search tools: BLAST, Blitz, FASTA

A

EMBL

20
Q

● Contains curated data on proteins, their motifs, and their interactions with other substances.

A

PROTEIN DATABASE

21
Q

● >18,000 macromolecular structures of proteins, peptides, viruses, protein/nucleic acid complexes, nucleic acids, and carbohydrates.
● Determined by X-ray diffraction and NMR.

A

PROTEIN DATA BANK

22
Q

○ Curated database focusing on high level of annotation (sequence, function, structure, post-translational modifications, variants) of proteins.
○ Non-redundant and reviewed.

A

● SWISS-PROT

23
Q

○ Computer-annotated supplement to SWISS-PROT.
○ Redundant and unreviewed.

A

TrEMBL

24
Q

● Secondary database of protein families, domains, and functional sites that contains manually curated information.
● Provides tools for analysis of protein sequences and motifs.

A

PROSITE

25
Q

● Protein family fingerprints (groups/motifs).
● Detects distant relatives of large and highly divergent protein superfamilies by looking at conserved regions in alignments.

A

PRINTS

26
Q

● Protein families and domains represented as multiple sequence alignments.

A

PFAM

27
Q

PFAM ___: automatically generated, lower-quality entries

A

Pfam-B

28
Q

PFAM ___: manually curated, high-quality entries

A

Pfam-A

29
Q

● Collection of ungapped multiple alignments of segments of related protein sequences (blocks)
● For: protein family classification, protein structure prediction

A

BLOCKS
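
To make the idea of an ungapped block concrete, here is a minimal sketch that finds the fully conserved columns in a set of equal-length segments. The three segments are made-up toy data, and `conserved_columns` is a hypothetical helper; it is not the method BLOCKS or PRINTS actually uses to build or score entries.

```python
def conserved_columns(block):
    """Return the indices of columns where every segment has the same residue."""
    if not block:
        raise ValueError("block must contain at least one segment")
    width = len(block[0])
    # An ungapped block has segments of identical length.
    if any(len(segment) != width for segment in block):
        raise ValueError("all segments in an ungapped block must be the same length")
    return [
        col
        for col in range(width)
        if len({segment[col] for segment in block}) == 1
    ]

# Toy, made-up segments standing in for one block of related sequences.
block = ["GDSGGP", "GDSGGS", "GDAGGP"]
print(conserved_columns(block))  # [0, 1, 3, 4]
```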

30
Q

● Contain data regarding structures of nucleic acids and proteins.

A

STRUCTURAL DATABASES

31
Q

Easy-to-use website for aligning sequences in FASTA format.

A

MULTALIN
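
For reference, a FASTA file is a series of records, each a header line starting with `>` followed by one or more lines of sequence. The sketch below reads that layout into a dictionary; `parse_fasta` and the two toy records are hypothetical and only cover the basic format, not every variant an alignment tool accepts.

```python
def parse_fasta(text):
    """Map each record's identifier (first word of the header) to its sequence."""
    chunks = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            words = line[1:].split()
            name = words[0] if words else ""
            chunks[name] = []
        elif name is not None:
            # Sequence lines may be wrapped, so collect and join them later.
            chunks[name].append(line)
    return {key: "".join(parts) for key, parts in chunks.items()}

example = """>seq1 toy record
ATGGCC
TTTTAA
>seq2
ATGAAA
"""
records = parse_fasta(example)
print(records)
```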

32
Q

Translates DNA or RNA sequences into their protein sequences.

A

EXPASY (Translate tool)
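
The translation step itself can be sketched in a few lines. This example assumes the standard genetic code and reads a single forward reading frame only; `translate` is a hypothetical helper, not the tool's own code, and it marks any codon it does not recognize as `X`.

```python
from itertools import product

# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(seq):
    """Translate DNA or RNA in frame 1; '*' marks a stop codon."""
    seq = seq.upper().replace("U", "T")  # treat RNA as DNA
    protein = []
    # Step through complete codons; trailing bases that do not fill a codon are ignored.
    for start in range(0, len(seq) - 2, 3):
        protein.append(CODON_TABLE.get(seq[start:start + 3], "X"))
    return "".join(protein)

print(translate("ATGGCCTAA"))  # MA*
print(translate("AUGGCCUAA"))  # MA*
```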

33
Q

Provides a prediction of the protein structure.

A

I-TASSER