Bioinformatic Methods Flashcards Preview

Flashcards in Bioinformatic Methods Deck (27):

Define bioinformatics:

The process of turning data into information
Bioinformatics is the application of computing, mathematics and statistics to the analysis of biological information
It has become essential for large-scale measurement technologies, such as microarrays, proteomics, metabolomics and genomics


Describe bioinformatics used historically:

Celera used bioinformatics to sequence the human genome - it originally set out to fish out disease-causing SNPs, but decided to help sequence the whole genome


Describe evolutionary biology:

Finding the ancestral ties between different organisms and using animal homologues of human proteins to gain an insight into disease (e.g. looking at pandemics, such as bird flu, AIDS)


Describe phylogenetics:

The field of biology that deals with identifying and understanding the relationships between different kinds of life on earth (cataloging the earth based on DNA, and determining what is descended from what)
E.g. looking at cancers to determine which parts of a tumour have a selective advantage and how one part of a tumour may differ from another part
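
As a minimal sketch of comparing organisms (or tumour regions) by DNA, the snippet below computes the proportion of differing sites (the p-distance) between sequences that are assumed to be already aligned. The sequences and names are invented for illustration; real phylogenetic analysis uses evolutionary models and tree-building methods rather than raw distances alone.

```python
from itertools import combinations


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return mismatches / len(seq_a)


def pairwise_distances(alignment):
    """Map each pair of names to the p-distance between their sequences."""
    return {
        (x, y): p_distance(alignment[x], alignment[y])
        for x, y in combinations(sorted(alignment), 2)
    }


# Hypothetical, pre-aligned toy sequences (not real data).
toy_alignment = {
    "species_A": "ACGTACGTAC",
    "species_B": "ACGTACGTTC",
    "species_C": "TCGAACGTTG",
}

distances = pairwise_distances(toy_alignment)
# The pair with the smallest distance is the most closely related in this toy set.
closest_pair = min(distances, key=distances.get)
print(closest_pair, distances[closest_pair])
```

The smaller the distance, the more recently the two sequences are assumed to have shared an ancestor.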


Describe molecular modelling:

Using experimentally determined protein structures (templates) to predict the structure of another protein that has a similar amino acid sequence (in silico models)
Can be used to study drug interactions


Describe metagenomics:

Using next generation sequencing of ribosomal RNA genes (e.g. 16S rRNA) to identify the mix of species in a population
E.g. how different species use different pathways - the human microbiome, with the proportions of different gut bacteria
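
A minimal sketch of the "mix of species" idea: once each sequencing read has been assigned to a taxon (that assignment step is assumed to have been done already and is not shown), the composition of the sample is the fraction of reads per taxon. The read labels below are invented for illustration.

```python
from collections import Counter


def relative_abundance(taxon_per_read):
    """Fraction of reads assigned to each taxon."""
    counts = Counter(taxon_per_read)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads supplied")
    return {taxon: n / total for taxon, n in counts.items()}


# Hypothetical read assignments for one gut sample (not real data).
toy_reads = (
    ["Bacteroides"] * 5
    + ["Faecalibacterium"] * 3
    + ["Escherichia"] * 2
)

profile = relative_abundance(toy_reads)
for taxon, fraction in sorted(profile.items(), key=lambda kv: -kv[1]):
    print(f"{taxon}: {fraction:.0%}")
```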


Describe genome sequencing, assembly and mapping:

Generating and using information about genomes - very technical


Describe genomic, proteomic, glycomic and metabolomic analysis:

Gaining an understanding of biology and pathology by measuring the abundance of thousands of molecules in cells or tissue


Describe integrative bioinformatics = systems biology:

Bringing data about different aspects of cells and tissues together to allow a more holistic understanding of normal function and pathology (this incorporates mathematical and statistical models)
E.g. in tumours, gene copy numbers, gene expression, microRNAs and epigenetics can be looked at, and systems biology can be used to identify links between these and apply this information clinically (determine best treatment option etc.)


Describe clinical bioinformatics:

Bringing clinical information and molecular information together to optimise treatment


Describe the role of bioinformatics:

Large part of genomic work - almost 50% of time in next generation sequencing is spent performing bioinformatic analysis
Bioinformaticians act as translators between clinicians, biologists, statisticians, computational biologists and biotech


Describe who can do bioinformatics:

Computer science background not needed - communication skills are more important


Describe how bioinformatics has evolved over time:

Previously, 1000 genes were examined using cDNA spots on a nylon filter, hybridised with radioactively labelled probes made from RNA, much like a Northern blot
Now 1,300,000x the volume of data can be used for tumour analysis via RNA sequencing
The cost of DNA sequencing has drastically decreased over time and the amount of information received for this cost has increased


Describe systems biology:

While many people would debate the biggest challenge of modern biology, a significant challenge is to understand how thousands of individual molecules of different types work together in organisms
Systems biology addresses this issue by applying computational approaches to large scale data
Systems biology can identify features of an experimental system that are hidden to other approaches


Describe how cell function and fate are determined:

Many studies treat individual molecular signals as operating in isolation; however, it is more likely that hundreds of molecular signals interact in complex ways
While much of what goes on inside cells is a mystery, many aspects of the cell can be studied simultaneously
The interactions in cells are thought to be very dynamic, and can change when the cell is treated with a drug or a mutation occurs
Ultimately, the purpose of the model is not to fit the data but to sharpen the question - systematic/holistic analysis of data to generate testable questions


Describe gene regulatory networks:

A GRN is a representation of relationships between RNAs
Proteins, signalling pathways and metabolites are the hidden engine that determines the gene network edges

Some genes can feed back on other genes
All pathways can have an effect which alters expression
The relationships between mRNAs can be used to estimate what occurs in gene pathways
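
One simple way to turn "relationships between mRNAs" into network edges is to connect genes whose expression levels are strongly correlated across samples. The sketch below assumes Pearson correlation and an arbitrary cut-off of 0.9; both are illustrative choices rather than the method used in the course, and the expression values are invented.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def coexpression_edges(expression, threshold=0.9):
    """Return (gene1, gene2, r) for every pair with |r| >= threshold."""
    edges = []
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, r))
    return edges


# Hypothetical mRNA levels for four genes across five samples (not real data).
toy_expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.8],  # rises with geneA
    "geneC": [5.0, 4.1, 2.9, 2.0, 1.1],  # falls as geneA rises
    "geneD": [3.0, 1.0, 4.0, 1.0, 3.0],  # no clear relationship
}

edges = coexpression_edges(toy_expression)
for g1, g2, r in edges:
    print(f"{g1} -- {g2}: r = {r:+.3f}")
```

A positive r suggests the two mRNAs rise together and a negative r that one falls as the other rises. Correlation alone does not show which gene regulates which.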


Describe expression network hubs:

Hubs have many downstream genes, which are sometimes linked together in a biologically or clinically meaningful way
Use biological databases to highlight the gene network hubs that have immediate biological relevance
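
As a rough sketch of finding hubs, the snippet below counts how many downstream genes each gene has in a list of directed edges and ranks genes by that count. The edge list and gene names are hypothetical, and it is an assumption here that edge directions (upstream to downstream) are already known.

```python
from collections import Counter


def rank_hubs(edges, top_n=3):
    """Rank genes by their number of downstream targets (out-degree)."""
    out_degree = Counter()
    for upstream, _downstream in edges:
        out_degree[upstream] += 1
    # Sort by out-degree (largest first), then by name for a stable order.
    ranked = sorted(out_degree.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Hypothetical (upstream, downstream) edges, not taken from a real network.
toy_edges = [
    ("TF1", "g1"),
    ("TF1", "g2"),
    ("TF1", "g3"),
    ("TF1", "g4"),
    ("TF2", "g3"),
    ("TF2", "g5"),
    ("g1", "g2"),
]

hubs = rank_hubs(toy_edges, top_n=2)
print(hubs)
```

The top-ranked genes are the candidates to look up in biological databases.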


Describe bioinformatics in medicine:

Genomics is making medicine an information science, and is helping with the treatment and diagnosis of disease
Medicine of the future will be a synergy between genomic technologies, pathology, and traditional clinical acumen
Most NZ cancer clinicians expect the use and influence of molecular and genomic tests to increase over the next 10 years


Describe molecular pathways in companion diagnostics (e.g. KRAS testing for cetuximab in metastatic colorectal cancer):

Colorectal cancer cells respond to growth factors by dividing quickly.
EGF binds to EGFR on cancer cells, triggering cytoplasmic signalling through intermediate pathways, which ultimately leads to enhanced proliferation and survival, and reduced apoptosis
Cetuximab is an antibody which blocks EGFR and has very few side effects, but a relatively high proportion of patients are resistant to this drug
These patients have mutant RAS, which is active without stimulation from EGFR, so blocking EGFR does not stop the signal
Bioinformatics is useful here: understanding the pathway allows patients to be tested for resistance (RAS mutations) before the drug is given
Similar approaches are being used for anti-infective drugs


Describe the 'nuts and bolts' in bioinformatics:

Experimental design (key for getting meaningful data and producing good information)
Quality control
Normalisation and pre-processing
Statistical evaluation
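
To illustrate the normalisation step, the sketch below converts raw read counts to counts per million (CPM) and then to a log2 scale, so that samples sequenced to different depths become comparable. CPM is only one common choice and is an assumption here, not necessarily the method taught in the course; the counts are invented.

```python
from math import log2


def counts_per_million(counts):
    """Scale raw counts so that each sample sums to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {gene: c * 1_000_000 / total for gene, c in counts.items()}


def log2_cpm(counts, pseudocount=1.0):
    """log2 of CPM; the pseudocount avoids taking log2(0)."""
    cpm = counts_per_million(counts)
    return {gene: log2(value + pseudocount) for gene, value in cpm.items()}


# Hypothetical raw counts: same mix of RNAs, sample_2 sequenced twice as deeply.
sample_1 = {"geneA": 100, "geneB": 300, "geneC": 600}
sample_2 = {"geneA": 200, "geneB": 600, "geneC": 1200}

cpm_1 = counts_per_million(sample_1)
cpm_2 = counts_per_million(sample_2)
print(cpm_1)
print(cpm_2)
print(log2_cpm(sample_1))
```

The raw counts differ two-fold between the samples, but after normalisation the values agree, so the difference is seen to come from sequencing depth rather than biology.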


Describe experimental design for gene expression experiments:

Microarray experiments are old but solid technology, easy to analyse, but only cover a pre-determined set of RNAs, not the whole transcriptome of a cell or tissue

RNAseq experiments are a newer technology, a little harder to analyse, but cover whatever mRNAs are expressed, and can be configured to measure expression of additional types of RNA such as miRNA

RNAseq and microarray experiments, despite their large scale, require good design including replication, multiple testing correction and careful visualisation


Describe the importance of replicates, multiple testing correction and careful visualisation in experimental design of gene expression:

Replicates are essential, regardless of cost
Multiple testing correction addresses the chance findings that arise when thousands of genes are tested at once: technical noise can, by chance, make all the data points for a gene from one cell culture higher than those from another, producing falsely significant changes
Visualisation of the data is important to see where mistakes were made


Describe the uncertainty of science - definitions of statistical concepts for biologists:

p ≤ 0.05 is the conventional threshold for calling a result significant
P value = the probability of obtaining a result at least as extreme as the one observed if chance alone were at work
Multiple Testing = when you are asking many questions at the same time (e.g. studying many RNAs at once using a microarray).
False Discovery Rate = the proportion of your positive findings that you will accept as being incorrect (this is a common way of dealing with the problems raised by multiple testing); it is a mathematical method of estimating how many chance results you will see and matching this to how many you will tolerate
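
One widely used way to control the false discovery rate is the Benjamini-Hochberg procedure, sketched below. It is named here as an illustrative choice, since the card does not say which method is meant, and the p values are invented. The procedure sorts the m p values, finds the largest rank k for which p(k) ≤ (k / m) × FDR, and calls the k smallest p values significant.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans, True where the test is called significant."""
    m = len(p_values)
    # Indices of the p values, from smallest to largest p.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Largest rank k whose p value is at or below its threshold (k / m) * fdr.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_rank = rank

    # Every test ranked at or below max_rank is called significant.
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            significant[idx] = True
    return significant


# Hypothetical p values from ten genes tested at once (not real data).
toy_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]

naive_hits = sum(p <= 0.05 for p in toy_p)  # no correction
bh_calls = benjamini_hochberg(toy_p, fdr=0.05)

print("significant without correction:", naive_hits)
print("significant after Benjamini-Hochberg:", sum(bh_calls))
```

In this toy set, five genes pass p ≤ 0.05 on their own but only the two smallest p values survive correction. With 20,000 genes and no true differences, a cut-off of 0.05 would be expected to give about 20,000 × 0.05 = 1,000 chance findings, which is why the correction matters.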


Describe endothelial cells and blood vessel apoptosis and bioinformatics:

ECs have signals which allow the blood vessels to regress and be removed
This occurs via apoptosis of ECs, whereby a coordinated gene expression program directs cell death
Reduction of HL-GAGs is important for this process
Measured carbohydrate levels correspond to the expression of the genes involved in making those carbohydrates


Describe the three important levels of deep analysis of tumours:

1. Mutation frequencies in genes
2. Mutation patterns in RNA (not all mutations are being expressed) - may be recurrent or tumour-specific
3. Structural changes in DNA/chromosomes - can be more important than mutations to individual genes - structural changes can affect both alleles
It is important to note that normal (non-cancerous) cells are also present in tumours, so some gene expression can be attributable to these cells


Describe the importance of bioinformatics in clinical medicine and drug use:

Used to determine if a patient is drug resistant
Used to determine whether multiple drugs could be used to overcome this resistance
Used to determine whether a drug should be kept in reserve in case resistance develops


Describe combining different types of genomic information with pathway and drug information for melanoma:

Exome sequencing of genes in the Ras-Raf signalling pathway could be used to detect the BRAF V600E mutation; the mutant protein is targeted by drugs and turned off
Bioinformatics could be used to identify the best candidates for the drug, i.e. patients who do not have resistance mutations in the Ras gene or other resistance-causing mutations
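
The selection logic can be sketched as a simple filter: keep patients whose tumour carries the target mutation and no mutation in a resistance gene. The patient records and the list of resistance genes below are hypothetical placeholders for illustration, not a clinical rule.

```python
# Target mutation described on the card: V600E in the BRAF gene.
TARGET_MUTATION = ("BRAF", "V600E")

# Illustrative placeholder list of Ras genes, not a validated clinical panel.
RESISTANCE_GENES = {"NRAS", "KRAS", "HRAS"}


def is_candidate(patient_mutations):
    """True if the target mutation is present and no resistance gene is mutated."""
    has_target = TARGET_MUTATION in patient_mutations
    has_resistance = any(gene in RESISTANCE_GENES for gene, _change in patient_mutations)
    return has_target and not has_resistance


# Hypothetical sequencing results: sets of (gene, amino acid change) pairs.
toy_patients = {
    "patient_1": {("BRAF", "V600E")},
    "patient_2": {("BRAF", "V600E"), ("NRAS", "Q61K")},
    "patient_3": {("TP53", "R175H")},
}

candidates = sorted(
    name for name, mutations in toy_patients.items() if is_candidate(mutations)
)
print(candidates)
```

Here only patient_1 is selected: patient_2 has the target but also a Ras mutation, and patient_3 lacks the target altogether.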