Surveys - Sample Designs Flashcards
What is a controlled experiment?
- Comparative (2+ things, groups, ideas etc.)
- Manipulative (manipulate one or more variables, i.e. a treatment, to study relationships)
- cause and effect
- before and after
What is an observational study?
- Absolute (no baseline comparison)
- Mensurative (measure natural variation between variables, with no manipulation)
- survey
- monitoring
What is the difference between a survey and monitoring?
Survey - estimates a statistic (no temporal change during the survey period)
Monitoring - estimates a change in a statistic (temporal changes occur during the period of observations)
What are the 2 types of survey sampling?
- sampling with replacement (SIR)
- sampling without replacement (SI)
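As a minimal sketch of the two sampling types in R, assuming a hypothetical population of 10 numbered units (the `pop` vector and the seed are illustrative, not from the course material), the base `sample()` function switches between them with its `replace` argument:

```r
set.seed(1)                 # arbitrary seed, only to make the draw repeatable
pop = 1:10                  # hypothetical population of 10 numbered units

# sampling without replacement (SI): each unit can be picked at most once
si = sample(pop, size = 5, replace = FALSE)

# sampling with replacement (SIR): the same unit may be picked more than once
sir = sample(pop, size = 5, replace = TRUE)

anyDuplicated(si) == 0      # always TRUE: no unit repeats under SI
```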
Mean for random sampling
Sum of all values divided by the number of samples
Variance of mean for simple random sampling
Sample variance (SD squared) of all samples, divided by the number of samples
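A minimal R sketch of these two estimators, assuming a hypothetical sample `y` (the values are illustrative only) and ignoring any finite population correction:

```r
y = c(12, 15, 11, 14, 18)   # hypothetical sample of n = 5 values
n = length(y)

ybar = sum(y) / n           # sample mean, same as mean(y)
var_mean = sd(y)^2 / n      # variance of the mean: SD squared divided by n
se = sqrt(var_mean)         # standard error of the mean
```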
What is a confidence interval?
The interval within which we are x% confident that the true population mean μ lies.
What do confidence intervals consist of?
- an upper and a lower limit
- a degree of confidence
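Written out for a mean, and assuming the t-based interval that the worked examples at the end of these cards use, the two limits can be sketched as follows. The symbols are notation chosen here: $\bar{y}$ is the estimated mean, $\widehat{\mathrm{Var}}(\bar{y})$ its estimated variance, and $1-\alpha$ the degree of confidence (for 95%, $\alpha = 0.05$, which corresponds to `qt(.975, df)` in R).

```latex
\bar{y} \;\pm\; t_{1-\alpha/2,\;\mathrm{df}} \,\sqrt{\widehat{\mathrm{Var}}(\bar{y})}
```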
What is the solution to a bad random sampling pick?
- divide population into sub-groups (strata)
- strata must not overlap
- randomisation within strata
What are the 2 steps of stratified random sampling?
- Stratification - elements in the pop are divided into strata based on their variables
  (strata must be non-overlapping and together constitute the whole pop)
- Sampling within strata - samples are selected randomly and independently from each stratum
Why do we stratify?
- Precision - the more homogeneous the strata, the more precise the estimates.
- Captures individual strata characteristics - the characteristics of each stratum are weighted in proportion to its share of the entire pop, similar to a weighted average.
- Practical - we may already know that information differs between groups, or stratification is already occurring (e.g. suburbs)
How is the mean for stratified random sampling (StR) calculated?
First calculate the mean of each stratum, then multiply each mean by its weighting (usually a proportion)
Then add up weighted means
How do we calculate the variance of the mean for stratified random sampling?
First calculate the variance of the mean for each stratum, then multiply each variance value by the square of its weighting
Then add up the weighted variances
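The two stratified calculations can be written compactly as follows. This is a sketch in notation chosen here, not taken from the cards: $H$ is the number of strata, $W_h$ the weighting of stratum $h$, $\bar{y}_h$ its sample mean, $s_h^2$ its sample variance, and $n_h$ its number of samples.

```latex
\bar{y}_{st} = \sum_{h=1}^{H} W_h \,\bar{y}_h ,
\qquad
\widehat{\mathrm{Var}}(\bar{y}_{st}) = \sum_{h=1}^{H} W_h^{2}\,\frac{s_h^{2}}{n_h}
```

Here $s_h^2 / n_h$ is the variance of the mean within stratum $h$, which is then multiplied by the squared weighting before summing.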
Worked example: Stratified sampling
# definitions
A = c(90, 78, 86, 71)        # define stratum A (4 samples)
B = c(48, 56, 42)            # define stratum B (3 samples)
n = 7                        # total number of samples
tcrit = qt(.975, df = n-2)   # t critical value for 95% CI (df = n minus 2 strata)
wt = c(A = .62, B = .38)     # define weights
# calculations:
wmean = sum(mean(A) * wt[1], mean(B) * wt[2])   # weighted mean
# weighted^2 variance of mean:
wvar = sum(var(A)/4 * wt[1]^2, var(B)/3 * wt[2]^2)
se = sqrt(wvar)              # standard error of the weighted mean
L95t = wmean - tcrit * se    # lower 95% CI
U95t = wmean + tcrit * se    # upper 95% CI
c(lower95 = L95t, upper95 = U95t)
##  lower95  upper95
## 61.04864 76.68803
Worked example: Simple Random Sampling
# definitions
A = c(90, 78, 86, 71)        # define stratum A (4 samples)
B = c(48, 56, 42)            # define stratum B (3 samples)
n = 7                        # total number of samples
tcrit = qt(.975, df = n-1)   # t critical value for 95% CI
# calculations:
mean_ab = mean(c(A, B))      # mean of the pooled sample
var_ab = var(c(A, B))/n      # variance of the mean
L95s = mean_ab - tcrit * sqrt(var_ab)   # lower 95% CI
U95s = mean_ab + tcrit * sqrt(var_ab)   # upper 95% CI
c(lower95 = L95s, upper95 = U95s)
##  lower95  upper95
## 49.84627 84.72516
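To make the precision gain from stratifying concrete, the widths of the two 95% intervals reported in the worked examples can be compared. This is only a sketch that retypes the printed limits; the variable names are illustrative.

```r
# 95% CI limits as printed in the two worked examples
ci_str = c(lower95 = 61.04864, upper95 = 76.68803)   # stratified sampling
ci_srs = c(lower95 = 49.84627, upper95 = 84.72516)   # simple random sampling

width_str = unname(diff(ci_str))   # width of the stratified interval
width_srs = unname(diff(ci_srs))   # width of the simple random interval

width_str / width_srs              # ratio below 1: stratified CI is narrower
```

For the same seven observations, the stratified interval is less than half as wide as the simple random sampling interval.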