Statistics Flashcards

(35 cards)

1
Q

population

A

the entire set of individuals or objects of interest

2
Q

sample

A

a portion, a selected part, of the population

3
Q

individuals

A

the smallest units that can be studied (the single members of the population)

4
Q

statistic

A

a quantity calculated from the sample data, used to estimate a population parameter

5
Q

simple random (probability) sampling

A

units are selected using random numbers from 1 to N, so every unit in the population has the same chance of being chosen
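A minimal sketch of the idea, assuming the population is simply the labels 1 to N with N = 100. The helper name `simple_random_sample` and the seed are illustrative choices, not part of the card.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct units; every subset of size n is equally likely."""
    rng = random.Random(seed)          # seeded only to make the example repeatable
    return rng.sample(population, n)   # sampling without replacement

units = list(range(1, 101))            # units labelled 1..N, here N = 100 (assumed)
sample = simple_random_sample(units, 10, seed=0)
print(sample)
```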

6
Q

cluster (probability) sampling

A

simple random sampling of clusters (whole groups of units) rather than of individual units

7
Q

stratified (probability) sampling

A
  • divide the population into strata (non-overlapping groups)
  • take a simple random sample within each stratum
  • the strata may have different sample sizes (or not)
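The steps above can be sketched as follows. The `stratified_sample` helper, the toy population of 30 people, and the per-stratum sample sizes are all hypothetical, chosen only to illustrate the procedure.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, sizes, seed=None):
    """Group units into strata, then draw a simple random sample in each one."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    # sizes maps each stratum name to its own sample size (they may differ)
    return {name: rng.sample(members, sizes[name])
            for name, members in strata.items()}

# Toy population (assumed): 20 "urban" and 10 "rural" people
people = [("p%d" % i, "urban" if i % 3 else "rural") for i in range(30)]
picked = stratified_sample(people, lambda p: p[1],
                           {"urban": 4, "rural": 2}, seed=1)
print(picked)
```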
8
Q

convenience (non-probability) sampling

A

when units are selected for inclusion in the sample because they are the easiest for the researcher to access

9
Q

snowball (non-probability) sampling

A

a recruitment technique in which research participants are asked to assist researchers in identifying other potential subjects.

10
Q

quota (non-probability) sampling

A

units are selected non-randomly until a predetermined number or proportion of units is reached

11
Q

frequency

A

the number of observations for each group

12
Q

absolute frequencies

A

the count of observations in each group

13
Q

relative frequencies

A

percentage (or fraction) of observations in each group
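As an illustration, absolute and relative frequencies can be computed like this. The blood-group observations are made-up example data.

```python
from collections import Counter

# Made-up observations: one group label per observation
observations = ["A", "B", "A", "O", "A", "B", "O", "O", "A", "AB"]

absolute = Counter(observations)       # absolute frequency: count per group
n = len(observations)
relative = {group: count / n for group, count in absolute.items()}

for group in absolute:
    print(group, absolute[group], relative[group])
```

The relative frequencies always sum to 1 (or 100%), because every observation falls in exactly one group.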

14
Q

measures of centrality

A

statistics that summarize the data by identifying their central position

15
Q

mean (or average)

A

the sum of the data divided by the number of observations

16
Q

median

A

the middle value when the values are ordered by size (with an even number of values, the average of the two middle ones)

17
Q

mode

A

most frequent observation
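A small worked example of the three measures of centrality, using Python's standard `statistics` module on made-up data:

```python
import statistics

data = [2, 3, 3, 5, 7, 10]          # made-up observations

mean = statistics.mean(data)        # (2+3+3+5+7+10) / 6 = 5
median = statistics.median(data)    # average of the two middle values 3 and 5 = 4.0
mode = statistics.mode(data)        # 3 appears twice, more than any other value

print(mean, median, mode)
```

Note how the large value 10 pulls the mean above the median.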

18
Q

dispersion

A

describes how spread out (variable) the data are

19
Q

variance

A

a measure of how far the numbers in a data set are from the mean: the average of the squared deviations from the mean

20
Q

standard deviation

A

a statistic that measures the dispersion of a dataset relative to its mean, calculated as the square root of the variance
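A short worked example on made-up data. It distinguishes the population formula (divide by n) from the sample formula (divide by n - 1); the cards do not say which one is meant, so both are shown.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]     # made-up observations, mean = 5
n = len(data)
mean = sum(data) / n

squared_dev = [(x - mean) ** 2 for x in data]   # sums to 32
pop_var = sum(squared_dev) / n                  # population variance = 4
sample_var = sum(squared_dev) / (n - 1)         # sample variance = 32/7
pop_sd = math.sqrt(pop_var)                     # population standard deviation = 2

# The standard library gives the same values
assert math.isclose(pop_var, statistics.pvariance(data))
assert math.isclose(sample_var, statistics.variance(data))
print(pop_var, sample_var, pop_sd)
```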

21
Q

boxplot

A

a univariate descriptive plot that shows the median, the quartiles, and possible outliers of one variable

22
Q

multivariate descriptive statistics

A

shows the relation between two or more variables, which can be of different types

23
Q

statistical inference

A

using data analysis on a sample to draw conclusions about the underlying probability distribution

24
Q

hypothesis testing

A

a procedure in which an analyst uses sample data to test an assumption regarding a population parameter

25
null hypothesis
a type of statistical hypothesis that proposes that there is no effect or no difference, so any pattern in a set of given observations is due to chance.
26
false positive
the investigator rejects a null hypothesis that is actually true in the population (type I error). It is usually considered more problematic than a false negative.
27
false negative
the investigator fails to reject a null hypothesis that is actually false in the population (type II error).
28
probability (alpha)
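the significance level: the probability of a false positive that the investigator is willing to accept; a result is called statistically significant when the p-value is below alpha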
29
p-value
probability, under the null hypothesis, of obtaining a test statistic at least as extreme as the one that was observed
30
low p-value
reject H0 and accept H1
31
high p-value
cannot reject H0 and cannot accept H1
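A minimal sketch of how a p-value is computed and compared with alpha, assuming a coin-flip experiment that is not in the cards: H0 says the coin is fair, and 9 heads are observed in 10 flips. The function name `binomial_p_value` and alpha = 0.05 are illustrative choices.

```python
from math import comb

def binomial_p_value(k, n, p0=0.5):
    """One-sided p-value: P(X >= k) for X ~ Binomial(n, p0) under H0."""
    return sum(comb(n, i) * p0 ** i * (1 - p0) ** (n - i)
               for i in range(k, n + 1))

alpha = 0.05                        # significance level (assumed)
p = binomial_p_value(9, 10)         # (10 + 1) / 1024, about 0.0107
reject_h0 = p < alpha               # low p-value: reject H0

print(p, reject_h0)
```

With 6 heads instead of 9, the same function gives a p-value of about 0.377, which is above alpha, so H0 could not be rejected.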
32
homogeneity of contingency tables
when the distribution of observations across the rows (or the columns) is the same in every column (or row), so that the differences could be explained by random sampling
33
Shapiro-Wilk
normality test, usually recommended for small samples (n < 50)
34
Kolmogorov-Smirnov
test that compares one sample against a reference distribution, or two samples against each other
35
95% confidence interval
an interval built with a procedure that, over repeated sampling, contains the true parameter value 95% of the time
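The repeated-sampling meaning of a confidence interval can be checked with a small simulation. This is only a sketch: the normal population (mean 10, standard deviation 2), the sample size of 50, and the normal-approximation interval with z = 1.96 are all assumptions made for the example.

```python
import random
import statistics

def mean_ci(sample, z=1.96):
    """Approximate 95% confidence interval for the mean (normal approximation)."""
    m = statistics.mean(sample)
    se = statistics.stdev(sample) / len(sample) ** 0.5   # standard error
    return m - z * se, m + z * se

rng = random.Random(42)             # seeded so the run is repeatable
true_mean = 10.0
trials = 2000
hits = 0
for _ in range(trials):
    sample = [rng.gauss(true_mean, 2.0) for _ in range(50)]
    low, high = mean_ci(sample)
    hits += low <= true_mean <= high   # did this interval cover the parameter?

coverage = hits / trials
print(coverage)                     # expected to be close to 0.95
```

Each individual interval either contains the true mean or it does not; the 95% describes how often the procedure succeeds across many samples.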