Chapter 4: Sampling Flashcards
(25 cards)
Sampling
We select individuals from the population and we hope to make conclusions that we can generalize to the population
Population and population-of-interest (N)
Census
Sampling frame
Sample (n)
Population and population-of-interest: the population is everyone; the population of interest is the specific group within it that we want to draw conclusions about
Census: sample is the entire population of interest
Sampling frame: the people we invite to participate in our study
Sample: the people who actually participate in the study
Probability sampling vs biased sampling
Probability: using probability for selecting people from the population
Biased: the sample is systematically skewed and therefore not representative of the population
Sampling error
The difference between your estimate and the estimate you would get if you had access to everyone in the population of interest
The error is larger when some members of the population are more likely to be selected than others (= sampling bias)
Two ways a sample may be biased
- Convenience sampling: sampling only those who are easy to contact
- Self-selection: sampling only those who volunteer
A biased sample is a problem when the characteristic that is making the sample biased is relevant to what you are measuring
Margin-of-error
Tells us how big the sampling error is
As the sample size (! not population size) increases, the margin of error decreases and the estimate is more accurate
Indicates the reliability of estimate based on sample data
Interpretation: if one obtains many unbiased samples of the same size from a defined population, the difference between the sample percent and the true population percent will be within the margin of error at least 95% of the time
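Worked example (a sketch, not from the cards): assuming a simple random sample and the usual normal approximation for a proportion, the 95% margin of error is roughly z * sqrt(p(1 - p) / n) with z = 1.96. The function name and the sample sizes below are illustrative assumptions.

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Approximate margin of error for a sample proportion.

    Assumes a simple random sample and the normal approximation;
    z = 1.96 corresponds to a 95% confidence level.
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

# The margin of error depends on the sample size n, not the population size N.
for n in (100, 400, 1600):
    print(n, margin_of_error(0.5, n))
```

In this formula, quadrupling the sample size halves the margin of error.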
Two categories of techniques for sampling
- Techniques that lead to representative samples: we give every member of the population an equal chance to be in our sample
- Techniques that lead to biased samples: certain individuals or groups have a higher/lower chance of being in our sample
Random selection vs random assignment
Random selection: randomly selecting people from population to sampling frame
Random assignment: randomly assigning people in an experiment to a specific condition/group
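Sketch of random assignment (an illustration under assumptions, not part of the cards): shuffle the participants, then deal them out over the conditions in turn. The participant IDs and condition names are hypothetical.

```python
import random

def random_assignment(participants, conditions, seed=None):
    """Randomly assign each participant to one condition.

    Shuffling first and then dealing participants out in turn keeps
    the group sizes as equal as possible.
    """
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    return {p: conditions[i % len(conditions)] for i, p in enumerate(shuffled)}

# Hypothetical participant IDs and condition names
participants = [f"P{i}" for i in range(1, 11)]
assignment = random_assignment(participants, ["treatment", "control"], seed=1)
print(assignment)
```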
Six types of probability sampling
- Simple random sample
- Cluster sample
- Stratified random sample
- Oversampling
- Systematic sampling
- Multistage sampling
Simple random sample
Get a list of every member of the population and assign each member a number, then randomly generate ‘n’ numbers to select the sample
Every member of the population has equal likelihood of being selected to participate in the study
However, you must know all the members of the population which is not always possible
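Sketch of a simple random sample (assuming the full population list is available; the population of 200 numbered members and the sample size of 10 are illustrative):

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n members without replacement; every member is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)

# Hypothetical population: members numbered 1..200
population = list(range(1, 201))
sample = simple_random_sample(population, 10, seed=42)
print(sample)
```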
Cluster sample
Divide the population into clusters/groups, then randomly select clusters and sample all members of the selected clusters
Every member of the population has an equal chance of being invited to the study, but now selection is based on the cluster instead of on the person
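Sketch of a cluster sample (an illustration under assumptions: the clusters are hypothetical school classes of equal size, stored as a dictionary):

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters and include every member of each one.

    clusters: dict mapping a cluster name to its list of members.
    Returns the chosen cluster names and the combined list of members.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    members = [m for name in chosen for m in clusters[name]]
    return chosen, members

# Hypothetical clusters: four classes of five students each
classes = {
    f"class_{c}": [f"{c}{i}" for i in range(1, 6)]
    for c in "ABCD"
}
chosen, members = cluster_sample(classes, 2, seed=3)
print(chosen, members)
```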
Stratified random sample
Divide the population into subgroups (strata) and randomly select members within each subgroup
Selecting participants in a way that your sample mirrors the real situation of the population
Size of subgroups in sample is proportional to the size of subgroups in population (e.g. population of 200 students that is 80% female and 20% male → sample of 10 students of which 8 are women and 2 are men)
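Sketch of the 200-student example as proportional stratified sampling (the student labels and the function name are illustrative assumptions):

```python
import random

def stratified_sample(strata, n, seed=None):
    """Randomly sample from each stratum in proportion to its size.

    strata: dict mapping a stratum name to its list of members.
    Note: rounding can make the total differ slightly from n when the
    proportions do not divide evenly.
    """
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    return {
        name: rng.sample(members, round(n * len(members) / total))
        for name, members in strata.items()
    }

# 200 students: 80% female (160), 20% male (40)
strata = {
    "female": [f"F{i}" for i in range(160)],
    "male": [f"M{i}" for i in range(40)],
}
sample = stratified_sample(strata, 10, seed=7)
print({name: len(members) for name, members in sample.items()})
```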
Oversampling
Same as stratified random sampling, but the sizes of the strata in the sample don’t have to be proportional to their sizes in the population
(e.g. population of 200 students that is 80% female and 20% male → sample of 10 students of which 5 are women and 5 are men → oversampling men)
Why: enough participants in each group for statistical techniques
Problem: sample is not representative of the population → weighting (e.g. raking): give every participant an individual weight that indicates how much their observation counts
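Sketch of the weighting step for the 5 women / 5 men example. This uses simple post-stratification weights (population share divided by sample share), which is an assumption for illustration and simpler than full raking. The scores are made up.

```python
def poststratification_weights(population_shares, sample_counts):
    """Weight per group = population share / sample share."""
    n = sum(sample_counts.values())
    return {
        group: population_shares[group] / (count / n)
        for group, count in sample_counts.items()
    }

def weighted_mean(values, groups, weights):
    """Mean in which each observation counts by its group's weight."""
    total_weight = sum(weights[g] for g in groups)
    return sum(v * weights[g] for v, g in zip(values, groups)) / total_weight

weights = poststratification_weights(
    {"female": 0.8, "male": 0.2},   # shares in the population
    {"female": 5, "male": 5},       # counts in the oversampled sample
)

# Hypothetical scores, for illustration only
groups = ["female"] * 5 + ["male"] * 5
scores = [10] * 5 + [0] * 5
print(weights)
print(sum(scores) / len(scores), weighted_mean(scores, groups, weights))
```

Women are weighted up and the oversampled men are weighted down, so the weighted mean reflects the 80/20 split of the population rather than the 50/50 split of the sample.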
Systematic sampling
Pick a random starting position between 1 and k in an unordered list of population members, then select every k-th case from the list
Example: population of 100 (N) and sample size of 20 (n) → k = N/n = 5 → we select every fifth case
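Sketch of the N = 100, n = 20 example (assuming N is a multiple of n; the numbered population is illustrative):

```python
import random

def systematic_sample(population, n, seed=None):
    """Pick a random start between 1 and k, then take every k-th case."""
    rng = random.Random(seed)
    k = len(population) // n          # sampling interval, k = N / n
    start = rng.randint(1, k)         # 1-based random start in 1..k
    return population[start - 1::k][:n]

# Hypothetical unordered list of 100 members
population = list(range(1, 101))
sample = systematic_sample(population, 20, seed=5)
print(sample)
```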
Multistage sampling
Combination of multiple sampling techniques
Benefits: efficiency, lower cost, less transportation, easier to collect data
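Sketch of one possible multistage design (an assumption for illustration; the cards do not prescribe a particular combination): first randomly select clusters, then draw a simple random sample within each selected cluster. The school names are hypothetical.

```python
import random

def two_stage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: randomly select clusters. Stage 2: random sample within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], n_per_cluster) for name in chosen}

# Hypothetical clusters: five schools of 30 students each
schools = {
    f"school_{s}": [f"{s}-{i}" for i in range(30)]
    for s in range(1, 6)
}
sample = two_stage_sample(schools, n_clusters=2, n_per_cluster=5, seed=11)
print(sample)
```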
Three types of biased sampling
- Convenience sampling
- Purposive sampling
- Snowball sampling
Convenience sampling
Selecting participants because they are easy to contact or to convince to participate
Problem: more often than not biased in certain ways
Purposive sampling
Selecting certain participants into your sample on purpose → selection is non-random
Snowball sampling
Select certain people into sample and ask every person in the sample to recruit additional participants
‘Snowball’ because the ball rolls down the hill and gets bigger and bigger
Can be useful for populations that are difficult to reach
Quota sampling
Divide the population into subgroups (strata) and non-randomly select members within each subgroup
Similar to stratified random sampling, but without random selection
Quota can be proportional or non-proportional to subgroup sizes in the population
External validity
Crucial for studies making frequency claims, less important for studies making causal or association claims
How to evaluate: look at sampling technique, look at the claims they make, replicability
WEIRD samples
Using Western, Educated, Industrialized, Rich, Democratic people as samples
They represent 80% of study participants in the literature, but only 12% of the world’s population
The assumption that psychological processes are universal is often false, because the samples studied are not representative
Online paid panels
Paying companies to look for a good sample for your research
Advantage: efficient
Dangers: bots that fill in the survey, participants are motivated by financial incentives → low quality data, invalid responses
→ People don’t participate because they care about the research, but because of the reward
Sample size
Larger samples are not automatically more representative
Larger samples have a smaller margin-of-error (if probability sampling is used!)
Large sample size is not a requirement for good external validity, but it is important for statistical validity
Sample size is determined by number of people that actually participate