Flashcards in L4 Statistical techniques and sampling designs Deck (36)
1
Descriptive statistics
Methods of summarizing the data in an informative way
- central tendency: mean, median, mode
- dispersion: range, stdev, variance, interquartile range
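As a quick illustration, the measures above can be computed with Python's standard `statistics` module. The data values are made up for the example.

```python
import statistics

# Hypothetical sample data, chosen only for illustration.
data = [2, 4, 4, 5, 7, 9, 11]

# Central tendency
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# Dispersion
value_range = max(data) - min(data)
variance = statistics.variance(data)  # sample variance (n - 1 denominator)
stdev = statistics.stdev(data)        # square root of the sample variance
# Quartiles with the module's default "exclusive" method;
# other quartile conventions can give slightly different values.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

print(mean, median, mode, value_range, variance, stdev, iqr)
```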
2
Inferential statistics
Methods to draw conclusions (or to make inferences, test hypotheses)
• Mean difference test
• Chi-square test
• Analysis of variance (ANOVA)
• Regression analysis
• Logit analysis
3
Four types of scales
- Nominal (qualitative)
- Ordinal (qualitative)
- Interval (quantitative)
- Ratio (quantitative)
4
Nominal scale
allows classifying data into groups/categories
e.g. gender
5
Ordinal scale
Ranks categories in a meaningful order, but the distances between ranks are not defined
e.g. education level
6
Interval scale
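Meaningful differences between values, but no natural zero point --> zero is an arbitrary reference point that still denotes a value (e.g. 0 degrees Celsius is a temperature, not the absence of temperature)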
7
Ratio scale
Meaningful differences and ratios between values due to a natural zero point --> zero means none of the quantity (0 dollars is no money)
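A small sketch of why ratios only make sense on a ratio scale: the Celsius values are made up, and Kelvin is used here as the ratio-scale counterpart because it has a natural zero.

```python
def celsius_to_kelvin(celsius):
    """Convert an interval-scale temperature to the ratio-scale Kelvin."""
    return celsius + 273.15

# Differences are meaningful on both scales: 20 - 10 is a 10-degree gap.
difference = 20 - 10

# The Celsius ratio is 2.0, but 20 degrees Celsius is not "twice as hot"
# as 10 degrees, because the Celsius zero point is arbitrary.
ratio_celsius = 20 / 10

# On the Kelvin scale the ratio is physically meaningful and much smaller.
ratio_kelvin = celsius_to_kelvin(20) / celsius_to_kelvin(10)

print(difference, ratio_celsius, round(ratio_kelvin, 3))
```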
8
Choosing between inferential statistics:
IV=nominal/ordinal DV=nominal/ordinal
Chi-square test
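A minimal sketch of the Pearson chi-square statistic for a contingency table, written in plain Python. The function name `chi_square_statistic` and the counts are assumptions for illustration; in practice a statistics package would also supply the p-value.

```python
def chi_square_statistic(table):
    """Return (statistic, degrees of freedom) for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof

# Hypothetical counts: rows = gender (nominal IV), columns = preferred brand (nominal DV).
observed = [[30, 10], [20, 40]]
statistic, dof = chi_square_statistic(observed)
print(statistic, dof)
```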
9
Choosing between inferential statistics:
IV=nominal/ordinal DV=interval/ratio
t-test, ANOVA
10
Choosing between inferential statistics:
IV=interval/ratio DV=nominal/ordinal
logit analysis
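To make the idea concrete: a logit model maps a quantitative IV onto the probability of a binary DV through the logistic function. The sketch below only evaluates that function; the coefficients are hypothetical and are not estimated from data.

```python
import math

def logit_probability(x, intercept, slope):
    """Probability that the binary DV equals 1 under a logit model."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))

# Hypothetical coefficients: chance of a purchase as a function of income.
intercept, slope = -4.0, 0.1

p_low = logit_probability(20, intercept, slope)   # linear predictor = -2
p_mid = logit_probability(40, intercept, slope)   # linear predictor = 0
p_high = logit_probability(60, intercept, slope)  # linear predictor = 2

print(p_low, p_mid, p_high)
```

The predicted probability always stays between 0 and 1, which is why this form suits a nominal (binary) DV.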
11
Choosing between inferential statistics:
IV=interval/ratio DV=interval/ratio
regression analysis
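A minimal sketch of simple linear regression (one IV, one DV) using the ordinary least squares formulas. The function name `fit_line` and the data are assumptions for illustration.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = covariation of x and y divided by the variation of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: advertising spend (IV) and sales (DV), both ratio-scaled.
ad_spend = [1, 2, 3, 4, 5]
sales = [3, 5, 7, 9, 11]
intercept, slope = fit_line(ad_spend, sales)
print(intercept, slope)
```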
12
When to perform a t-test vs ANOVA
t-test --> compare two means (two levels of the IV)
ANOVA --> compare means across more than two levels of the IV
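The sketch below computes a pooled-variance two-sample t statistic and a one-way ANOVA F statistic by hand. The function names and data are hypothetical. With exactly two groups, F equals t squared, which is why ANOVA is only needed beyond two levels.

```python
import statistics

def pooled_t(a, b):
    """Two-sample t statistic assuming equal variances in both groups."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    standard_error = (pooled_var * (1 / na + 1 / nb)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / standard_error

def one_way_f(*groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical scores for three levels of a nominal IV.
group_a, group_b, group_c = [4, 5, 6], [7, 8, 9], [10, 11, 12]

t_two_groups = pooled_t(group_a, group_b)
f_two_groups = one_way_f(group_a, group_b)
f_three_groups = one_way_f(group_a, group_b, group_c)
print(t_two_groups, f_two_groups, f_three_groups)
```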
13
Rating scales
- Likert scale: strongly disagree to strongly agree
- Semantic differential: bipolar adjective pairs, e.g. cold vs. warm
TREATED AS INTERVAL/RATIO so that you can use regression
14
What is a population?
Entire group of people, firms, events, or things of interest for which you would like to make inferences
15
What is a sample?
A subset of the population of interest
16
What is a subject?
A single member of the sample
17
What is low representativeness?
= properties of the population are over- or underrepresented in the sample
= high sampling error
18
The sampling process
1. define population
2. determine sampling frame
3. determine sampling design
4. determine sample size
19
1. define population
e.g. TISEM students, Dutch organ donors
20
2. determine sampling frame
“Physical” representation of the target population
- the list or source through which you can reach population members, e.g. the Donorregister
21
coverage error
sampling frame ≠ population
• Under-coverage: true population members are excluded
• Mis-coverage: non-population members are included
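Both kinds of coverage error can be expressed as set differences between the population and the sampling frame. The names below are made up for the example.

```python
# Hypothetical population and sampling frame.
population = {"ann", "bob", "cem", "dee"}
sampling_frame = {"bob", "cem", "dee", "eve"}

# Under-coverage: true population members missing from the frame.
under_coverage = population - sampling_frame

# Mis-coverage: frame entries that do not belong to the population.
mis_coverage = sampling_frame - population

print(under_coverage, mis_coverage)
```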
22
solutions to coverage error
• If small, recognize but ignore
• If large, redefine the population in terms of the sampling frame
23
3. determine sampling design
probability vs non-probability sampling
24
Probability sampling
Each element of the population has a known, non-zero chance of being selected as a subject
--> Results generalizable to population
BUT more time and resource intensive
25
Nonprobability sampling
The elements of the population do not have a known chance of being selected as a subject
--> less time and resource intensive
BUT results not generalizable to population
26
Probability sampling techniques
- Simple random sampling (SRS)
- Systematic sampling
- Stratified sampling
- Cluster sampling
27
Simple random sampling (SRS)
Each population element has an equal chance of being chosen
e.g. out of a hat
--> Highest generalizability
BUT can be costly (needs a complete list of the population)
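A minimal sketch of SRS with the standard library: `random.sample` draws without replacement, so every element has the same chance of selection. The population list and the seed are assumptions for illustration.

```python
import random

# Fixed seed only so that the illustration is repeatable.
rng = random.Random(42)

# Hypothetical sampling frame of 100 students.
population = [f"student_{i}" for i in range(1, 101)]

# Draw 10 distinct subjects, each element being equally likely.
sample = rng.sample(population, 10)
print(sample)
```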
28
Systematic sampling
Select random starting point and then pick every nth element
--> simplicity
BUT low generalizability if the list has a recurring pattern that coincides with the interval n, so that every nth element differs systematically from the rest
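The procedure can be sketched as follows. The function name `systematic_sample`, the frame, and the interval are hypothetical.

```python
import random

def systematic_sample(frame, step, rng):
    """Pick a random start within the first interval, then every step-th element."""
    start = rng.randrange(step)
    return frame[start::step]

rng = random.Random(7)       # fixed seed only for repeatability
frame = list(range(1, 21))   # hypothetical ordered list of 20 elements
sample = systematic_sample(frame, 5, rng)
print(sample)
```

If the frame were ordered so that every 5th element shared some trait, this sample would inherit that pattern.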
29
Stratified sampling
Divide the population into meaningful (internally homogeneous) groups, then apply SRS within each group
e.g. level of income
--> All groups are adequately sampled, allowing for group comparisons
BUT more time consuming and requires homogeneous subgroups
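A minimal sketch, assuming an equal number of subjects is drawn per stratum (proportionate allocation would be an alternative). The function name, the data, and the income labels are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, n_per_stratum, rng):
    """Group units into strata, then draw an SRS of fixed size within each stratum."""
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

# Hypothetical population: 12 low-income and 8 high-income people.
people = [{"id": i, "income": "low" if i <= 12 else "high"} for i in range(1, 21)]

sample = stratified_sample(people, lambda p: p["income"], 3, random.Random(1))
print(sample)
```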
30
Cluster sampling
Divide the population into internally heterogeneous groups, randomly select a number of groups, and include every member of the selected groups
e.g. geographic clusters (areas)
--> Convenient and cheaper to collect when clusters are geographic
BUT naturally occurring clusters are typically more homogeneous than heterogeneous, which lowers generalizability
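In contrast to stratified sampling, the random draw here happens at the level of groups, not members. A minimal sketch with hypothetical area names and residents:

```python
import random

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters and keep every member of the chosen ones."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    members = [member for name in chosen for member in clusters[name]]
    return chosen, members

# Hypothetical geographic clusters (areas) with their residents.
clusters = {
    "north": ["n1", "n2", "n3"],
    "south": ["s1", "s2", "s3"],
    "east": ["e1", "e2", "e3"],
    "west": ["w1", "w2", "w3"],
}

chosen, members = cluster_sample(clusters, 2, random.Random(3))
print(chosen, members)
```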
31
Nonprobability sampling
- Convenience sampling
- Quota sampling
- Judgment sampling
- Snowball sampling
32
Convenience sampling
Select subjects who are conveniently available
e.g. passers-by on the street
--> Convenient (inexpensive and fast)
BUT lower generalizability
33
Quota sampling
Fix a quota for each subgroup (e.g. matching its percentage in the population)
--> When minority participation is critical
BUT lower generalizability
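A minimal sketch of filling quotas from whoever happens to be available: people are accepted in arrival order until each subgroup's quota is full, so no random selection is involved. The function name, the stream, and the quotas are hypothetical.

```python
def quota_sample(stream, group_of, quotas):
    """Accept people in arrival order until every subgroup quota is filled."""
    counts = {group: 0 for group in quotas}
    selected = []
    for person in stream:
        group = group_of(person)
        if group in counts and counts[group] < quotas[group]:
            selected.append(person)
            counts[group] += 1
        if counts == quotas:
            break  # all quotas met, stop recruiting
    return selected

# Hypothetical arrival order: (name, subgroup).
stream = [("p1", "majority"), ("p2", "majority"), ("p3", "majority"),
          ("p4", "minority"), ("p5", "majority"), ("p6", "minority"),
          ("p7", "minority")]

selected = quota_sample(stream, lambda p: p[1], {"majority": 2, "minority": 2})
print(selected)
```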
34
Judgment sampling
Select subjects based on their knowledge/professional judgment
e.g. experts
--> Convenient (inexpensive and fast) when only a limited number of people have the info you need
BUT Lower generalizability
35
Snowball sampling
“Do you know people who...”
e.g. people with rare disease
--> For rare characteristics (“experts”)
BUT first participants strongly influence the sample
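Snowball recruitment can be sketched as a breadth-first walk over a referral network. The referral data and the function name are hypothetical. Note that people outside the first participants' network are never reached.

```python
from collections import deque

def snowball_sample(referrals, seeds, max_size):
    """Start from seed participants and follow their referrals breadth-first."""
    sampled = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sampled) < max_size:
        person = queue.popleft()
        sampled.append(person)
        # Ask this participant: "Do you know people who...?"
        for contact in referrals.get(person, []):
            if contact not in seen:
                seen.add(contact)
                queue.append(contact)
    return sampled

# Hypothetical referral network; "x" and "y" are unknown to the seed's network.
referrals = {"a": ["b", "c"], "b": ["d"], "c": ["d", "e"], "x": ["y"]}

sample = snowball_sample(referrals, ["a"], 10)
print(sample)
```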
36