Final Exam - Vocab Flashcards

(96 cards)

1
Q

sensitivity

A

trying to discover all the real variants

2
Q

specificity

A

trying to limit the false positives that creep in when filters get too lenient
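
As a rough sketch of how cards 1 and 2 are usually quantified, sensitivity and specificity can be computed from confusion-matrix counts. The counts below are hypothetical and only illustrate the arithmetic.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of real variants that were called: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of non-variant sites correctly left uncalled: TN / (TN + FP)."""
    return tn / (tn + fp)


# Hypothetical counts for one filtered call set (not from the course material).
tp, fn, tn, fp = 90, 10, 950, 50
print(sensitivity(tp, fn))  # 0.9
print(specificity(tn, fp))  # 0.95
```

Loosening a filter tends to raise sensitivity and lower specificity, which is the trade-off the tranche cards describe.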

3
Q

high tranche

A

if you want more variants and are willing to accept false positives

4
Q

middle tranche

A

if you want to remove most false positives but are also willing to remove some true variants

5
Q

low tranche

A

if you only want highly accurate true variants with few false positives and are willing to miss perhaps many true positives

6
Q

monogenic

A

mendelian = 1 “gene”

7
Q

multigenic

A

more than 1 “gene” (but not many)

8
Q

polygenic

A

many “genes”

9
Q

linkage disequilibrium

A

the non-random association of alleles at different loci in a given population
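
One common way to put a number on this association, for two biallelic loci, is the coefficient D and the squared correlation r². The sketch below assumes phased haplotype counts are available; the function name and the counts are made up for illustration.

```python
def linkage_disequilibrium(n_AB: int, n_Ab: int, n_aB: int, n_ab: int):
    """Return (D, r^2) from counts of the four two-locus haplotypes.

    Assumes both loci are polymorphic; a monomorphic locus would make
    the r^2 denominator zero.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total                 # frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / total         # frequency of allele A at locus 1
    p_B = (n_AB + n_aB) / total         # frequency of allele B at locus 2
    d = p_AB - p_A * p_B                # zero when the loci are independent
    r2 = d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))
    return d, r2


# Hypothetical counts: alleles A and B always travel together.
print(linkage_disequilibrium(50, 0, 0, 50))    # (0.25, 1.0)
# Hypothetical counts: all four haplotypes equally common.
print(linkage_disequilibrium(25, 25, 25, 25))  # (0.0, 0.0)
```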

10
Q

contig

A

set of sequence reads that overlap to form a contiguous stretch of DNA sequence

11
Q

N50

A

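length of the shortest contig in the smallest set of longest contigs that together contain at least 50% of the assembled bases; equivalently, 50% of the bases are in contigs of length N50 or longer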

12
Q

L50

A

smallest number of contigs whose lengths sum to at least 50% of the total assembly length
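
Both statistics come from the same pass over the contigs: sort longest first and accumulate lengths until half of the assembly is covered. A minimal sketch, with a made-up list of contig lengths:

```python
def n50_l50(contig_lengths):
    """Return (N50, L50) for a list of contig lengths."""
    lengths = sorted(contig_lengths, reverse=True)  # longest contigs first
    half = sum(lengths) / 2
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        if running >= half:
            # 'length' is the shortest contig needed to reach 50% (N50);
            # 'count' is how many contigs it took (L50).
            return length, count
    raise ValueError("no contigs given")


# Hypothetical assembly: total length 300, so the halfway point is 150.
print(n50_l50([80, 70, 50, 40, 30, 20, 10]))  # (70, 2)
```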

13
Q

De Bruijn graph

A

assembly method that uses smaller sub-sequences (k-mers) of sequence reads to find overlaps and build a graph
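
A toy version of the graph construction, assuming error-free reads and the common convention that nodes are (k-1)-mers and every k-mer contributes one edge. Real assemblers add error correction and graph simplification that this sketch leaves out.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Map each (k-1)-mer prefix to the (k-1)-mer suffixes that follow it."""
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Edge from the k-mer's prefix to its suffix; repeated edges
            # are kept, so list length reflects k-mer multiplicity.
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


# Two hypothetical overlapping reads; the shared k-mer CGT links them.
print(de_bruijn_graph(["ACGT", "CGTA"], k=3))
# {'AC': ['CG'], 'CG': ['GT', 'GT'], 'GT': ['TA']}
```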

14
Q

bubble

A

polymorphisms between haplotypes

15
Q

tangle

A

region with complicated haplotypes not able to be resolved

16
Q

cell type

A

a classification used to describe cells that share common structural, functional, and molecular characteristics

17
Q

morphology

A

cells often have characteristic shapes, sizes, and structural features

18
Q

gene expression profile

A

each cell type has a unique pattern of gene activity that defines its functional identity

19
Q

function

A

different cell types are specialized for specific biological tasks, such as muscle contraction, immune defense, or neurotransmission

20
Q

location

A

cell types are often found in specific tissues or organs

21
Q

nuclear blebbing

A

herniations of the nucleus that occur in diseased cells; blebbed nuclei can rupture, leading to cellular dysfunction

22
Q

apoptosis

A

the death of cells which occurs as a normal and controlled part of an organism’s growth or development

23
Q

reverse transcriptase

A

an enzyme that converts RNA into DNA

24
Q

DNA barcoding

A

a technique that uses unique nucleic acid sequences to label and track individual cells or cell populations, enabling researchers to study their behavior, lineage, and interactions

25
normalization
the practice of organizing data entries to ensure they appear similar across all fields and records
26
principal components
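for a collection of points in a real coordinate space, a sequence of unit vectors in which each vector is the direction of the line that best fits the data while being orthogonal to the preceding vectors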
27
principal components analysis
a process of computing the principal components and using them to perform a change of basis on the data
28
clustering
grouping a set of objects in such a way that objects in the same group are more similar to each other than to those in other groups
29
nearest neighbor graph
a directed graph defined for a set of points in a metric space, in which each point has an edge to its nearest neighbor
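
A brute-force sketch of the idea, assuming Euclidean distance and a small point set; the function name and coordinates are illustrative only. Single-cell tools typically build a k-nearest-neighbor version of this graph on PCA coordinates with faster approximate search.

```python
import math


def knn_graph(points, k=1):
    """Map each point index to the indices of its k nearest other points."""
    graph = {}
    for i, p in enumerate(points):
        # Distance from point i to every other point, closest first.
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        graph[i] = [j for _, j in dists[:k]]
    return graph


# Hypothetical points on a line. Point 2 points at point 1, but point 1
# points back at point 0, which is why the graph is directed.
print(knn_graph([(0, 0), (1, 0), (5, 0)]))  # {0: [1], 1: [0], 2: [1]}
```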
30
community
a group of densely connected nodes in a graph; here nodes refer to cells, and cell-cell pairwise distances are applied in the Leiden algorithm to find such groups
31
PCA
principal components analysis projects a set of possibly correlated variables onto a set of linearly uncorrelated, orthogonal variables
32
t-SNE
t-distributed stochastic neighbor embedding creates a probability distribution using the Gaussian distribution that defines the relationships between the points in high dimensional space
33
UMAP
uniform manifold approximation and projection; a dimension reduction method constructed from a theoretical framework based in Riemannian geometry and algebraic topology
34
deconvolution
a process of resolving something into its constituent elements or removing complication in order to clarify it
35
ontology
a set of concepts and categories in a subject area or domain that shows their properties and the relations between them
36
gene enrichment analysis
a computational method that identifies biological pathways that are overrepresented in a group of genes
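The card does not say which statistical test is used. One common choice, assumed here, is a one-sided hypergeometric test: given how many genes belong to a pathway, how surprising is the overlap with the selected gene list? Parameter names and numbers are hypothetical.

```python
from math import comb


def enrichment_p_value(total_genes, pathway_genes, selected_genes, overlap):
    """P(X >= overlap) when selected_genes are drawn at random from total_genes
    and pathway_genes of them belong to the pathway (hypergeometric upper tail).
    """
    upper = min(pathway_genes, selected_genes)
    favorable = sum(
        comb(pathway_genes, x)
        * comb(total_genes - pathway_genes, selected_genes - x)
        for x in range(overlap, upper + 1)
    )
    return favorable / comb(total_genes, selected_genes)


# Hypothetical example: 10 genes total, 5 in the pathway, 5 selected,
# and all 5 selected genes fall in the pathway.
print(enrichment_p_value(10, 5, 5, 5))  # 1/252, about 0.004
```

In practice many pathways are tested at once, so the resulting p-values are usually corrected for multiple testing.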
37
trajectory
the curve that a body describes in space; a path, progression, or line of development resembling a physical trajectory
38
cell trajectory analysis
allocation of cells to lineages and then ordering them based on pseudotime values within the lineages
39
pseudotime
the distance along the trajectory from a cell's position back to the beginning
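Read literally, this is a cumulative path length. The sketch below assumes the cells have already been ordered along one lineage and embedded as coordinates; both the ordering and the coordinates are hypothetical, and real tools fit the trajectory before measuring along it.

```python
import math


def pseudotime(path):
    """Cumulative distance from the first point of an ordered trajectory.

    Assumes 'path' holds at least one point and is already in order.
    """
    times = [0.0]
    for prev, cur in zip(path, path[1:]):
        times.append(times[-1] + math.dist(prev, cur))
    return times


# Three hypothetical cells along one lineage in a 2-D embedding.
print(pseudotime([(0, 0), (3, 4), (3, 10)]))  # [0.0, 5.0, 11.0]
```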
40
transcription factor
a protein that controls the rate of transcription of genetic information from DNA to messenger RNA, by binding to a specific DNA sequence
41
spatial transcriptomics
de novo gene expression measures in a defined tissue space
42
comparative genomics
a field of biological research in which researchers use a variety of methods to compare the genome sequences of different species or of populations within a species
43
species
group of organisms that can interbreed, produce fertile offspring and are reproductively isolated
44
subspecies
a geographically distinct population within a species that has evolved independently and shows limited introgression with other populations
45
phylogenetic tree
a branching diagram or "tree" that visually represents the evolutionary relationships between biological entities, such as species or genes, based on their genetic characteristics and descent from a common ancestor
46
pseudogene
a DNA sequence that resembles a gene but has been mutated into an inactive form over the course of evolution
47
tranche
The recalibrated variant quality score provides a continuous estimate of the probability that each variant is true, allowing one to partition the call sets into quality tranches
48
imputation
in genetics, it refers to the statistical inference of unobserved genotypes
49
trying to discover all the real variants
sensitivity
50
trying to limit the false positives that creep in when filters get too lenient
specificity
51
if you want more variants and are willing to accept false positives
high tranche
52
if you want to remove most false positives but are also willing to remove some true variants
middle tranche
53
if you only want highly accurate true variants with few false positives and are willing to miss perhaps many true positives
low tranche
54
mendelian = 1 "gene"
monogenic
55
more than 1 "gene" (but not many)
multigenic
56
many "genes"
polygenic
57
the non-random association of alleles at different loci in a given population
linkage disequilibrium
58
set of sequence reads that overlap to form a contiguous stretch of DNA sequence
contig
59
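length of the shortest contig in the smallest set of longest contigs that together contain at least 50% of the assembled bases; equivalently, 50% of the bases are in contigs of this length or longer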
N50
60
smallest number of contigs whose lengths sum to at least 50% of the total assembly length
L50
61
assembly method that uses smaller sub-sequences (k-mers) of sequence reads to find overlaps and build a graph
De Bruijn graph
62
polymorphisms between haplotypes
bubble
63
region with complicated haplotypes not able to be resolved
tangle
64
a classification used to describe cells that share common structural, functional, and molecular characteristics
cell type
65
cells often have characteristic shapes, sizes, and structural features
morphology
66
each cell type has a unique pattern of gene activity that defines its functional identity
gene expression profile
67
different cell types are specialized for specific biological tasks, such as muscle contraction, immune defense, or neurotransmission
function
68
cell types are often found in specific tissues or organs
location
69
herniations of the nucleus that occur in diseased cells; blebbed nuclei can rupture, leading to cellular dysfunction
nuclear blebbing
70
the death of cells which occurs as a normal and controlled part of an organism's growth or development
apoptosis
71
an enzyme that converts RNA into DNA
reverse transcriptase
72
a technique that uses unique nucleic acid sequences to label and track individual cells or cell populations, enabling researchers to study their behavior, lineage, and interactions
DNA barcoding
73
the practice of organizing data entries to ensure they appear similar across all fields and records
normalization
74
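for a collection of points in a real coordinate space, a sequence of unit vectors in which each vector is the direction of the line that best fits the data while being orthogonal to the preceding vectors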
principal components
75
a process of computing the principal components and using them to perform a change of basis on the data
principal components analysis
76
grouping a set of objects in such a way that objects in the same group are more similar to each other than to those in other groups
clustering
77
a directed graph defined for a set of points in a metric space, in which each point has an edge to its nearest neighbor
nearest neighbor graph
78
a group of densely connected nodes in a graph; here nodes refer to cells, and cell-cell pairwise distances are applied in the Leiden algorithm to find such groups
community
79
projects a set of possibly correlated variables onto a set of linearly uncorrelated, orthogonal variables
PCA
80
creates a probability distribution using the Gaussian distribution that defines the relationships between the points in high dimensional space
t-SNE | (t-distributed stochastic neighbor embedding)
81
is constructed from a theoretical framework based in Riemannian geometry and algebraic topology
UMAP | (uniform manifold approximation and projection)
82
a process of resolving something into its constituent elements or removing complication in order to clarify it
deconvolution
83
a set of concepts and categories in a subject area or domain that shows their properties and the relations between them
ontology
84
a computational method that identifies biological pathways that are overrepresented in a group of genes
gene enrichment analysis
85
the curve that a body describes in space; a path, progression, or line of development resembling a physical trajectory
trajectory
86
allocation of cells to lineages and then ordering them based on pseudotime values within the lineages
cell trajectory analysis
87
the distance along the trajectory from a cell's position back to the beginning
pseudotime
88
a protein that controls the rate of transcription of genetic information from DNA to messenger RNA, by binding to a specific DNA sequence
transcription factor
89
de novo gene expression measures in a defined tissue space
spatial transcriptomics
90
a field of biological research in which researchers use a variety of methods to compare the genome sequences of different species or of populations within a species
comparative genomics
91
group of organisms that can interbreed, produce fertile offspring and are reproductively isolated
species
92
a geographically distinct population within a species that has evolved independently and shows limited introgression with other populations
subspecies
93
a branching diagram or "tree" that visually represents the evolutionary relationships between biological entities, such as species or genes, based on their genetic characteristics and descent from a common ancestor
phylogenetic tree
94
a DNA sequence that resembles a gene but has been mutated into an inactive form over the course of evolution
pseudogene
95
The recalibrated variant quality score provides a continuous estimate of the probability that each variant is true, allowing one to partition the call sets into quality tranches
tranche
96
in genetics, it refers to the statistical inference of unobserved genotypes
imputation