big data Flashcards

1
Q

what are the four kinds of omics data, and how is each acquired?

A

genomics - DNA

transcriptomics - RNA

epigenomics - chromatin state, e.g. where proteins bind to DNA

The three above are acquired through next-generation sequencing, e.g. on Illumina sequencing machines. RNA is usually converted to cDNA first

proteomics - proteins, acquired by mass spectrometry

2
Q

how is microscopy used for big data?

A

high-throughput imaging = a moving stage automatically images large numbers of samples

AI can be used to analyse the images

used with fluorescent tagging in live cells and staining of fixed cells, followed by automated image analysis
Can tell us about cell type, differentiation, pathological processes and migration

3
Q

how would one investigate factor X's effects on gene expression in sample Y in four steps?

A

Extract mRNA from both the control group and the group that received factor X, because mRNA levels reflect gene expression

Convert it to cDNA, prepare a sequencing library (the set of fragments to be analysed) and sequence it
The amount of each cDNA represents the amount of the corresponding mRNA

Run the data through a pipeline (the computational steps used to analyse it)

Make a plot, e.g. a volcano plot, to look at fold change

4
Q

when looking at gene expression and transcriptomics, what is meant by fold change?

A

the change in a gene's expression between the tested and control groups, expressed as a ratio (tested divided by control) and usually reported on a log2 scale
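Fold change is commonly computed as a ratio of mean expression and reported on a log2 scale, so that a doubling is +1 and a halving is -1. The sketch below is only illustrative: the function name, the expression values and the pseudocount of 1 (added to avoid dividing by zero) are assumptions, not part of the card.

```python
import math

def log2_fold_change(treated, control, pseudocount=1.0):
    """Return log2(mean treated / mean control) for one gene.

    The pseudocount is a common convention (assumed here) that keeps
    the ratio defined when a gene has zero counts in one group.
    """
    mean_treated = sum(treated) / len(treated)
    mean_control = sum(control) / len(control)
    return math.log2((mean_treated + pseudocount) / (mean_control + pseudocount))

# Hypothetical normalised counts for one gene in two replicates per group.
lfc_up = log2_fold_change(treated=[31.0, 31.0], control=[7.0, 7.0])    # 32 / 8 = 4-fold up
lfc_down = log2_fold_change(treated=[7.0, 7.0], control=[31.0, 31.0])  # 8 / 32 = 4-fold down
```

Positive values mean the gene is higher in the tested group, negative values mean it is lower.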

5
Q

what plot might be used to look at changes in gene expression (fold change)?

A

volcano plot, a type of scatter plot that is used to quickly identify changes in large data sets composed of replicate data. It plots significance (usually -log10 of the p-value) on the y axis against fold change (usually log2) on the x axis.
So a volcano plot shows down-regulated genes on the left and up-regulated genes on the right, with the most significant changes highest up
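As a rough sketch of how each gene is placed on a volcano plot, the snippet below converts a p-value to a y coordinate and labels the gene as up- or down-regulated. The gene names, values and cutoffs (|log2 fold change| of at least 1, p below 0.05) are hypothetical; real analyses typically apply the cutoff to p-values adjusted for multiple testing.

```python
import math

def volcano_point(log2_fc, p_value, fc_cutoff=1.0, p_cutoff=0.05):
    """Return (x, y, label) for one gene on a volcano plot."""
    y = -math.log10(p_value)  # smaller p-value -> higher on the plot
    if p_value < p_cutoff and log2_fc >= fc_cutoff:
        label = "up"
    elif p_value < p_cutoff and log2_fc <= -fc_cutoff:
        label = "down"
    else:
        label = "not significant"
    return log2_fc, y, label

# Hypothetical results: gene -> (log2 fold change, p-value).
genes = {"geneA": (2.5, 0.001), "geneB": (-1.8, 0.01), "geneC": (0.2, 0.5)}
points = {name: volcano_point(fc, p) for name, (fc, p) in genes.items()}
```

Plotting the x and y values of `points` as a scatter plot, coloured by label, gives the volcano shape.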

6
Q

single cell RNA sequencing - what 3 things is it often used to investigate?

A

Great for telling which genes are expressed by which cells

How a cell’s gene expression changes over time and during differentiation

Tissue composition changes - e.g. the proportion of immune cells in a healthy sample compared with a diseased one

7
Q

what plot is used to show results of single cell RNA sequencing?

A

UMAP - each individual dot is a cell
Each colour marks a ‘cluster’ of cells with similar transcriptional profiles
Can see which genes are expressed by particular cells and whether cell-type-specific gene expression changes
Can also see how cells change over time

Trajectories can show how cells have changed from a given starting population

8
Q

what is a GWAS (genome-wide association study)?

A

used for finding risk alleles - versions of genes that contribute to causing a disease. It is useful when you don’t know what to look for

It examines a panel of SNPs across the genome for an association with the disease of interest

It looks for differences in allele frequency between disease and control groups
Studies must be very large to be statistically sound
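One simple way to test a single SNP for a difference in allele frequency is a chi-square test on a 2x2 table of allele counts. The sketch below is an illustration under assumptions: the function name and counts are hypothetical, and real GWAS analyses usually use regression models that also adjust for covariates such as ancestry.

```python
import math

def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Chi-square test (1 degree of freedom) on allele counts for one SNP."""
    observed = [[case_alt, case_ref], [control_alt, control_ref]]
    row_totals = [sum(row) for row in observed]
    col_totals = [case_alt + control_alt, case_ref + control_ref]
    total = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected if allele frequency were equal in both groups.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical counts: risk allele at 60% in cases versus 50% in controls.
chi2, p_value = allelic_chi_square(600, 400, 500, 500)
```

With only 1,000 alleles per group, this 10-point frequency difference gives a p-value that is small but still well above the genome-wide threshold, which is one reason GWAS need very large cohorts.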

9
Q

how are the results of a GWAS represented?

A

Manhattan plots - the X axis is genomic position, and each dot is a SNP plotted where it lies in the genome. The Y axis is the strength of association (usually -log10 of the p-value), so the higher the dot, the stronger the evidence of association

Remember - the SNP itself may not be the variant causing the problem; it may just be linked to (close to) the gene involved

10
Q

what might you combine GWAS results with?

A

Can combine GWAS results with other data like RNA sequencing to identify the cell types in which genetic variants cause a functional difference

11
Q

what is the genome-wide significance threshold on a Manhattan plot?

A

a p-value less than 5 x 10^-8
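The 5 x 10^-8 figure is usually explained as a Bonferroni correction: a conventional 0.05 significance level divided by roughly one million independent tests across the genome. The short sketch below reproduces that arithmetic; the variable names are illustrative, and the one-million figure is the commonly quoted approximation rather than an exact count.

```python
import math

alpha = 0.05                   # conventional per-study significance level
independent_tests = 1_000_000  # commonly quoted approximation (assumption)

# Bonferroni correction: divide alpha by the number of tests.
threshold = alpha / independent_tests

# Height of the significance line on a Manhattan plot's -log10(p) axis.
line_height = -math.log10(threshold)
```

On a Manhattan plot this places the significance line at about 7.3 on the Y axis.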

12
Q

example of big data - what is the UK Biobank?

A

A cohort of about 500,000 adults
Many measurements and tests are taken, e.g. anatomical, physiological and biochemical
Participants are followed over time, during which some develop diseases
Baseline data are then compared with follow-up data to discover new disease associations

There is a social gradient in life expectancy: poor neighbourhoods have a greater burden of ill-health than wealthy ones
