Introduction to Inferential Statistics Flashcards
(37 cards)
How can we infer something about the population based on what we find in the sample?
Normal distribution
Empirical rule
Central limit theorem
Are t-values and standard errors used to estimate population parameters?
Yes
What are inferential statistics used for?
To draw conclusions and make inferences about population parameters by analyzing data collected in a sample
To infer something about the population parameter using sample statistics
Will parameter estimates based on a sample exactly equal the population parameter?
No, because they vary from sample to sample
So, when reporting a statistic, we also typically report an interval (e.g., 95% confidence interval) which we believe includes the population parameter
What is a confidence interval?
A range of values within which a population parameter (e.g., mean or proportion) is likely to fall, based on a sample from that population
If you repeated a study of the same size many times, 95% of the resulting confidence intervals would cover (include) the true population parameter
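The "repeat the study many times" idea can be checked with a small simulation. This is only a sketch under stated assumptions: the population is taken to be normal with a made-up mean of 50 and SD of 10, each study has n = 30, and the interval is built as mean ± 1.96 × SEM (a normal approximation; the function name `ci_coverage` and all parameter values are hypothetical).

```python
import random
import statistics

def ci_coverage(pop_mean=50.0, pop_sd=10.0, n=30, trials=2000, seed=0):
    """Fraction of approximate 95% intervals that cover the true mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    covered = 0
    for _ in range(trials):
        # One "study": draw a sample of size n from the assumed population
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        mean = statistics.fmean(sample)
        sem = statistics.stdev(sample) / n ** 0.5
        # Assumed interval: mean +/- 1.96 standard errors
        if mean - 1.96 * sem <= pop_mean <= mean + 1.96 * sem:
            covered += 1
    return covered / trials

print(ci_coverage())
```

The printed fraction should land near 0.95. It tends to sit slightly below that because 1.96 comes from the normal distribution rather than the t-distribution.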
Do statistics describe the sample?
Yes
And parameters describe the population
Since samples are drawn to represent the population, do statistics estimate the parameters?
Yes
What notations are used for samples?
Number of people: n
Mean: X (with a bar on top)
Variance: s^2
Standard deviation: s or SD
What notations are used for the population?
Number of people: N
Mean: mu
Variance: sigma^2
Standard deviation: sigma
What are the primary characteristics of a normal distribution?
Symmetric and unimodal (one peak)
Why is a normal distribution the most important distribution in inferential statistics?
Its characteristics form the foundational assumptions underlying many inferential statistics
Can you apply the empirical rule to any normal distribution?
Yes
What is the empirical rule?
68% of the observations fall within 1 standard deviation of the mean
95.4% of the observations fall within 2 standard deviations of the mean
99.7% of the observations fall within 3 standard deviations of the mean
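The three percentages can be reproduced from the normal curve itself. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `share_within` is made up for illustration, and the second call uses arbitrary values for the mean and SD to show that the result does not depend on them.

```python
from statistics import NormalDist

def share_within(k, mu=0.0, sigma=1.0):
    """Share of a normal distribution lying within k SDs of its mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

for k in (1, 2, 3):
    # Same answer for the standard normal and for an arbitrary normal
    print(k, round(share_within(k) * 100, 1),
          round(share_within(k, mu=100, sigma=15) * 100, 1))
```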
If something has a normal distribution, can you use the central limit theorem?
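Yes, although the central limit theorem does not require it: with a large enough n, the sample means are approximately normal even when the population is not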
What are the three distributions?
Sample distribution (distribution of the sample)
Population distribution (distribution of the population)
Sampling distribution (distribution of a statistic over a set of theoretical samples; distribution of sample means; plotted means of various samples)
Is the sampling distribution of the sample means approximately normal?
Yes
The sampling distribution of the sample means becomes “more normal” as n (the size of each sample) increases
The mean of the sampling distribution of sample means will be the same as the mean of the population
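A quick simulation can illustrate both points. This is a sketch, not a proof: it assumes a deliberately skewed population (exponential with mean 1), draws many samples of size n, and reports the mean of the sample means along with a rough skewness measure (0 would be perfectly symmetric). The function names and the choice of population are assumptions made for the example.

```python
import random
import statistics

def sample_means(n, num_samples=5000, seed=1):
    """Means of num_samples samples of size n from an exponential population."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(num_samples)
    ]

def skewness(values):
    """Rough skewness: the average cubed z-score of the values."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - m) / sd) ** 3 for v in values])

for n in (2, 10, 50):
    means = sample_means(n)
    # Columns: sample size, mean of the sample means, skewness of the means
    print(n, round(statistics.fmean(means), 2), round(skewness(means), 2))
```

The mean of the sample means should stay close to the population mean of 1 for every n, while the skewness should shrink toward 0 as n grows.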
Since the distribution of the sample means is approximately a normal distribution, can the empirical rule be applied?
Yes
Do we know that about 95% of all sample means are within 2 standard errors (SDs of the sampling distribution) of the population mean?
Yes, based on the empirical rule
This theoretical assumption is the basis for inferential statistics
What is the standard error of the mean?
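The standard deviation of the sampling distribution of the sample means; analogous to the SD of the population data
SEM = SD of the sample / square root of the sample size, i.e., SEM = s / sqrt(n)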
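The SEM formula above translates directly into code. A minimal sketch, assuming a small made-up list of scores and using the sample SD from the standard library (`statistics.stdev`); the function name is hypothetical.

```python
import math
import statistics

def standard_error_of_mean(sample):
    """SEM = sample SD divided by the square root of the sample size."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

scores = [4, 8, 6, 5, 7]  # made-up data: mean 6, sample variance 2.5
print(round(standard_error_of_mean(scores), 4))  # 0.7071
```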
How is the standard deviation related to the standard error of the mean?
Just as the standard deviation is the unit for measuring the distance of a raw score from the population mean, the SEM is the unit for measuring the distance of a sample mean from the population mean
What will make the SEM change?
The SD of the sample
The size of the sample
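Both influences can be seen by plugging numbers into SEM = SD / sqrt(n). The values below are made up for illustration, and `sem_from` is a hypothetical helper name.

```python
import math

def sem_from(sd, n):
    """Standard error of the mean from a sample SD and a sample size."""
    return sd / math.sqrt(n)

print(sem_from(10, 25))   # 2.0
print(sem_from(10, 100))  # 1.0  (four times the sample size halves the SEM)
print(sem_from(20, 100))  # 2.0  (doubling the SD doubles the SEM)
```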
Why are the standard error (SE) increments of the distribution of the sample means so much narrower than the SD increments of the raw scores?
Because we are just plotting means, not raw scores
Outliers are not plotted individually; each one is averaged into its sample mean
What are z-scores?
Indicate how far a score is from the mean in a population context
Raw scores expressed in standard deviation units
A z-score indicates how far a raw score is from the population mean
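As a sketch of the calculation, a z-score is the raw score minus the population mean, divided by the population SD. The numbers below (a population mean of 100 and SD of 15) are assumed purely for illustration.

```python
def z_score(raw_score, pop_mean, pop_sd):
    """Distance of a raw score from the population mean, in SD units."""
    return (raw_score - pop_mean) / pop_sd

print(z_score(130, 100, 15))  # 2.0, i.e. two SDs above the mean
print(z_score(85, 100, 15))   # -1.0, i.e. one SD below the mean
```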
What are t-scores?
Indicate how far a score is from the mean in a sample context
Sample means expressed in standard error units
A t-score indicates how far a sample mean is from another mean
Sample means standardized with the SEM (computed from the sample SD) follow the t-distribution
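One common form of the t-score compares a sample mean with a reference mean: t = (sample mean - comparison mean) / SEM. The sketch below assumes that one-sample form, made-up scores, and a made-up comparison mean of 5; the function name is hypothetical.

```python
import math
import statistics

def t_score(sample, comparison_mean):
    """Distance of the sample mean from comparison_mean, in SEM units."""
    sem = statistics.stdev(sample) / math.sqrt(len(sample))
    return (statistics.fmean(sample) - comparison_mean) / sem

scores = [4, 8, 6, 5, 7]  # made-up data: mean 6, SEM about 0.7071
print(round(t_score(scores, 5), 4))  # 1.4142
```

With n = 5 scores, this t-score would be compared against a t-distribution with n - 1 = 4 degrees of freedom.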