Sampling Error and Bias Flashcards
(44 cards)
Why is the sampling distribution important?
In practice we never draw a large number of samples. We estimate the population parameter from a single sample, or a small number of samples, so our point estimate is one draw from a theoretical sampling distribution. The variation of that distribution depends on sample size: larger samples give less variation.
What is a sampling distribution?
A sampling distribution is a probability distribution of a statistic obtained through a large number of samples drawn from a specific population.
What is the central limit theorem?
It states that the sampling distribution of a statistic such as the mean will approximate a normal distribution when the sample size is sufficiently large, provided the sample is representative and drawn at random.
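The idea can be checked with a small simulation. This is a minimal sketch, assuming a hypothetical right-skewed population (exponential, mean about 10) and a sample size of 50; the function name sample_means and all numbers are illustrative only.

```python
import random
import statistics

def sample_means(population, n, draws, seed=0):
    """Return the means of `draws` simple random samples of size n."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.sample(population, n)) for _ in range(draws)]

# Hypothetical, strongly right-skewed population (not normal).
rng = random.Random(1)
population = [rng.expovariate(1 / 10) for _ in range(20_000)]

# Empirical sampling distribution of the mean for n = 50.
means = sample_means(population, n=50, draws=2_000)

pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)

# The centre of the sampling distribution should sit near the population mean,
# and its spread should sit near pop_sd / sqrt(n), the standard error.
print("population mean:", round(pop_mean, 2))
print("mean of sample means:", round(statistics.fmean(means), 2))
print("theoretical SE:", round(pop_sd / 50 ** 0.5, 2))
print("SD of sample means:", round(statistics.stdev(means), 2))
```

Plotting a histogram of means should show a roughly bell-shaped curve even though the population itself is skewed.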
What is a confidence interval?
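Defines a range within which we estimate the true population value falls, accepting some error (commonly a 95% level of confidence). The interval is the point estimate plus or minus the margin of error (ME), so its width is 2 x ME.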
What does a 95% confidence level mean?
We accept a 5% likelihood that our confidence interval will not contain the true value.
What is margin of error?
The distance either side of our point estimate (e.g. the mean) used to construct the confidence interval. For a 95% confidence level, ME = 1.96 x SE.
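A worked sketch of the calculation, assuming a sample large enough for the z value of 1.96 to apply and a standard error estimated as the sample standard deviation divided by the square root of n. The function name mean_ci_95 and the data are hypothetical.

```python
import math
import statistics

def mean_ci_95(values):
    """Return (mean, margin_of_error, lower, upper) for a 95% CI using z = 1.96."""
    n = len(values)
    mean = statistics.fmean(values)
    # Standard error of the mean: sample SD divided by sqrt(n).
    se = statistics.stdev(values) / math.sqrt(n)
    me = 1.96 * se
    return mean, me, mean - me, mean + me

# Hypothetical measurements, for illustration only.
values = list(range(1, 101))
mean, me, lower, upper = mean_ci_95(values)
print(f"mean={mean:.2f}  ME={me:.2f}  95% CI=({lower:.2f}, {upper:.2f})")
```

The interval spans ME either side of the mean, so its total width is 2 x ME.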
Standard Error
A measure of how much our sample estimate is expected to differ from the true population value. It is the standard deviation of the sampling distribution.
How would you get a precise estimate, with a narrow confidence interval?
Increase sample size
When do we use t-scores?
When dealing with small samples (n < 40). t-scores and the t-distribution are used instead of z-scores and the normal distribution.
What do we have to do when calculating confidence interval for RR and OR?
We must log-transform the estimate, calculate the confidence interval on the log scale, and then antilog (exponentiate) the limits, because RR and OR do not follow a normal distribution.
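A minimal sketch for an odds ratio from a 2x2 table, assuming the commonly used standard error of the log odds ratio, sqrt(1/a + 1/b + 1/c + 1/d). The function name and the counts are hypothetical.

```python
import math

def odds_ratio_ci_95(a, b, c, d):
    """95% CI for an odds ratio.

    a, b = exposed cases, exposed non-cases
    c, d = unexposed cases, unexposed non-cases
    """
    odds_ratio = (a * d) / (b * c)
    log_or = math.log(odds_ratio)                      # log-transform the estimate
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE on the log scale
    lower = math.exp(log_or - 1.96 * se_log)           # antilog the limits
    upper = math.exp(log_or + 1.96 * se_log)
    return odds_ratio, lower, upper

# Hypothetical counts, for illustration only.
or_, lower, upper = odds_ratio_ci_95(20, 80, 10, 90)
print(f"OR={or_:.2f}  95% CI=({lower:.2f}, {upper:.2f})")
```

Because the interval is built on the log scale, it is not symmetric around the odds ratio on the original scale.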
Define sampling frame.
The actual list of units in the survey population from which the sample is drawn, once inclusion and exclusion criteria have been determined.
Define sampling fraction.
Ratio between sample size and population size.
What is systematic error?
The sample is not representative of the population because of inaccuracy in the sampling design or the measurement procedures. It is a form of bias. It is predictable and, once identified, can be avoided. Systematic errors will likely not form a normal distribution.
What is random error?
Not predictable. Caused by natural fluctuations in the sampling or measurement process. Plotted as a histogram, random errors are expected to form an approximately normal distribution.
Describe the process of simple random sampling.
Identify survey population, create sampling frame, list eligible units, number them, determine sample size needed, randomly draw units (random number generator).
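The final draw can be sketched with a random number generator. This is an illustrative example only: the frame of 500 numbered units and the function name simple_random_sample are assumptions.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n units without replacement; every unit has an equal chance."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical sampling frame: eligible units numbered 1 to 500.
frame = list(range(1, 501))
sample = simple_random_sample(frame, n=50, seed=42)

print(sorted(sample)[:10])
print("sampling fraction:", len(sample) / len(frame))
```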
What are the advantages of simple random sampling?
Simple; sampling error is easily measured; every unit in the frame has an equal probability of being selected.
What are limitations of simple random sampling?
Requires a list of all units, usually taken from records, which may not represent the population (e.g. a telephone directory excludes people without a telephone). It is a logistical challenge (time and cost), and important minority groups may be missed by chance.
Describe systematic sampling.
Identify the survey population, create the sampling frame, arrange the units in a sequence (e.g. alphabetically), determine the sample size, divide the size of the sampling population by the sample size to get the sampling interval, choose a random starting point, then draw units at regular intervals.
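A minimal sketch of these steps, assuming the frame is already ordered and the interval is rounded down to a whole number. The function name systematic_sample and the frame are hypothetical.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Draw n units at a fixed interval after a random start."""
    if not 0 < n <= len(frame):
        raise ValueError("sample size must be between 1 and the frame size")
    interval = len(frame) // n           # sampling interval
    rng = random.Random(seed)
    start = rng.randrange(interval)      # random start within the first interval
    return [frame[start + i * interval] for i in range(n)]

# Hypothetical ordered frame of 1000 units; a sample of 100 gives an interval of 10.
frame = list(range(1000))
sample = systematic_sample(frame, n=100, seed=3)
print(sample[:5])
```

If the ordering of the frame has a repeating pattern that matches the interval, the same kind of unit is picked every time, which is the limitation noted for this design.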
Advantages of systematic sampling.
Simple, easy to implement, sampling error easily determined, helps ensure representativeness.
Limitations of systematic sampling.
Needs a complete list that is representative of the target population; patterns in the ordering sequence increase the probability of some units being selected.
Describe cluster sampling.
1. List the potential clusters, e.g. all schools in a state. 2. List the number of units in each cluster. 3. Calculate the systematic sampling interval (cumulative population / number of desired clusters), e.g. say it is 738. 4. Choose a random start number between 1 and 738. 5. Select the remaining clusters by repeatedly adding the sampling interval.
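These steps can be sketched as a selection along the cumulative population, so that larger clusters are more likely to be chosen. This is a sketch under stated assumptions: the cluster names and sizes are hypothetical, the interval is rounded down to a whole number, and the remaining clusters are found by repeatedly adding the interval to the random start.

```python
import random
from bisect import bisect_left
from itertools import accumulate

def select_clusters(cluster_sizes, n_clusters, seed=None):
    """Pick clusters at a fixed interval along the cumulative population."""
    names = list(cluster_sizes)
    cumulative = list(accumulate(cluster_sizes[name] for name in names))
    interval = cumulative[-1] // n_clusters   # systematic sampling interval
    rng = random.Random(seed)
    start = rng.randint(1, interval)          # random start between 1 and the interval
    points = [start + i * interval for i in range(n_clusters)]
    # Each point falls in the first cluster whose cumulative total reaches it.
    return [names[bisect_left(cumulative, p)] for p in points]

# Hypothetical schools and enrolment counts, for illustration only.
schools = {"A": 300, "B": 1200, "C": 450, "D": 2000, "E": 750}
print(select_clusters(schools, n_clusters=3, seed=1))
```

A cluster larger than the interval can be hit more than once; how to handle that (for example, drawing a larger sample within it) is a design decision this sketch leaves open.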
Advantages of cluster sampling.
complete list of units not needed, less travel, within clusters all units have equal probability of being selected
Limitations of cluster sampling.
Units within a cluster tend to be similar (positive covariance), which can introduce bias and increases the sampling (standard) error.
Describe stratified sampling.
Stratify the sampling frame into homogeneous sub-populations (strata), then draw a random sample from each stratum.
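A minimal sketch, assuming proportional allocation (the same sampling fraction in every stratum), which is one common choice and not something the card specifies. The function name stratified_sample, the stratum labels and the counts are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, fraction, seed=None):
    """Group units into strata, then draw a simple random sample from each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = {}
    for name, members in strata.items():
        # Assumption: proportional allocation, with at least one unit per stratum.
        k = max(1, round(len(members) * fraction))
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical frame: 80 urban and 20 rural units.
units = [("urban", i) for i in range(80)] + [("rural", i) for i in range(20)]
sample = stratified_sample(units, stratum_of=lambda u: u[0], fraction=0.1, seed=0)
print({name: len(chosen) for name, chosen in sample.items()})
```

Sampling within each stratum guarantees that small groups appear in the sample, which simple random sampling can miss by chance.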