Confidence interval for the sample mean: Flashcards
Why do sample estimates vary?
Sample estimates from the same population vary because different random samples produce different values.
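A minimal sketch of this idea, assuming a hypothetical normal population (mean 50, SD 10) and samples of size 20; none of these values come from the cards. Each random sample gives a slightly different sample mean.

```python
import random
import statistics

random.seed(42)  # fixed seed so the illustration is repeatable

# Hypothetical population: normal with mean 50 and SD 10 (assumed values).
MU, SIGMA, N = 50.0, 10.0, 20

# Draw five independent random samples and record each sample mean.
estimates = []
for _ in range(5):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    estimates.append(statistics.fmean(sample))

for i, est in enumerate(estimates, start=1):
    print(f"sample {i}: mean = {est:.2f}")
```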
How can we assess if a sample represents the population?
If estimates from different samples are similar → a particular estimate is likely close to the true parameter.
If estimates differ greatly → difficult to assume any estimate is close to the true parameter.
What does a sample estimate provide about the population parameter?
A sample estimate provides a point estimate of the true parameter.
How is uncertainty in a sample estimate represented?
Standard error (SE): SD of the sampling distribution
Confidence interval (CI): Range where the true parameter likely lies
What is a confidence interval?
An interval (defined by lower and upper limits) within which the true value of the population parameter is stated to lie with a specified level of confidence.
How can we transform X∼N(μ, σ^2) into a standard normal variable?
Z = (X - μ) / σ
This standardises 𝑋 to follow N(0,1)
What percentage of Z values lie between −1.96 and 1.96 in a standard normal distribution?
95% of randomly generated values lie in this range:
Pr(−1.96≤Z≤1.96)=0.95
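Both facts can be checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist`; the values of `mu`, `sigma` and `x` are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

# Standardise a value x from a hypothetical N(mu, sigma^2) population.
mu, sigma = 170.0, 8.0   # assumed values, not from the cards
x = 185.68
z = (x - mu) / sigma     # Z = (X - mu) / sigma
print(f"z = {z:.2f}")

# Probability that a standard normal value lies between -1.96 and 1.96.
Z = NormalDist(0.0, 1.0)
coverage = Z.cdf(1.96) - Z.cdf(-1.96)
print(f"Pr(-1.96 <= Z <= 1.96) = {coverage:.4f}")
```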
What happens to the sampling distribution as the sample size increases?
The distribution becomes more symmetric (normal) around 𝜇.
Its variability depends on the population variability (σ) and the sample size (n): it shrinks as n grows.
How is the standard error (SE) calculated?
SE(μ^)= σ / sqrt(n)
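As a rough check, the SD of many simulated sample means should be close to σ / sqrt(n). The sketch assumes a hypothetical normal population with μ = 50, σ = 10 and samples of size n = 25.

```python
import math
import random
import statistics

random.seed(0)  # repeatable illustration

# Hypothetical population parameters and sample size (assumed values).
mu, sigma, n = 50.0, 10.0, 25

# Theoretical standard error of the sample mean.
theoretical_se = sigma / math.sqrt(n)

# Empirical version: SD of the sample means over many repeated samples.
means = [
    statistics.fmean([random.gauss(mu, sigma) for _ in range(n)])
    for _ in range(2000)
]
empirical_se = statistics.stdev(means)

print(f"theoretical SE = {theoretical_se:.3f}")
print(f"empirical SE   = {empirical_se:.3f}")
```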
What is the formula for the distribution of the sample mean?
μ^∼N(μ, σ^2/n)
How do we transform 𝜇^ into a standard normal variable?
Z = (μ^ − μ) / (σ / sqrt(n))
What is the formula for a 95% confidence interval when 𝜎 is known?
μ^ ± 1.96 × σ / sqrt(n)
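A minimal sketch of this calculation. The function name `z_interval` and the example numbers (sample mean 100, σ = 15, n = 36) are hypothetical; the multiplier comes from the standard normal quantile rather than a hard-coded 1.96.

```python
import math
from statistics import NormalDist

def z_interval(sample_mean, sigma, n, level=0.95):
    """Confidence interval for the mean when sigma is known."""
    alpha = 1 - level
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for level = 0.95
    se = sigma / math.sqrt(n)                # standard error of the mean
    return sample_mean - z * se, sample_mean + z * se

# Hypothetical example values.
low, high = z_interval(sample_mean=100.0, sigma=15.0, n=36)
print(f"95% CI: ({low:.2f}, {high:.2f})")
```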
What does the Central Limit Theorem (CLT) state?
If the sample size is sufficiently large, the sample mean approximately follows a normal distribution, regardless of the population distribution (provided the population variance is finite).
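A small simulation can illustrate this. The sketch assumes a clearly non-normal population (exponential with rate 1, so mean 1 and SD 1) and samples of size n = 50, both chosen only for illustration. If the sample mean is roughly normal, about 95% of sample means should fall within 1.96 standard errors of the true mean.

```python
import math
import random
import statistics

random.seed(1)  # repeatable illustration

n, reps = 50, 5000        # assumed sample size and number of repeated samples
mu, sigma = 1.0, 1.0      # mean and SD of an exponential(rate=1) population
se = sigma / math.sqrt(n)

inside = 0
for _ in range(reps):
    m = statistics.fmean([random.expovariate(1.0) for _ in range(n)])
    if abs(m - mu) <= 1.96 * se:
        inside += 1

share = inside / reps
print(f"share of sample means within 1.96 SE of mu: {share:.3f}")
```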
What if the population standard deviation (𝜎) is unknown?
Use the sample standard deviation (s) and the t-distribution instead of the normal distribution.
How is the confidence interval calculated when σ is unknown?
μ^ ± t(1−α/2, df=n−1) × s / sqrt(n)
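A minimal sketch of this interval, assuming the t-multiplier is looked up separately (for example from a t-table) and passed in. The function name `t_interval` and the data are hypothetical; 2.262 is the tabulated 0.975 quantile of the t-distribution with 9 degrees of freedom.

```python
import math
import statistics

def t_interval(data, t_mult):
    """CI for the mean with sigma unknown; t_mult is supplied by the caller."""
    n = len(data)
    mean = statistics.fmean(data)
    s = statistics.stdev(data)   # sample SD (n - 1 in the denominator)
    se = s / math.sqrt(n)        # estimated standard error
    return mean - t_mult * se, mean + t_mult * se

# Hypothetical sample of n = 10 measurements, so df = 9.
data = [4.8, 5.1, 5.6, 4.9, 5.3, 5.0, 5.4, 4.7, 5.2, 5.0]
low, high = t_interval(data, t_mult=2.262)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```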
What are the key properties of the t-distribution?
Similar in shape to the normal distribution
Centered at zero.
Shape depends on degrees of freedom (df=n−1)
Shape changes with sample size, e.g. it has heavier tails than the normal distribution for small samples.
How does the similarity between the t-distribution and the standard normal distribution depend on sample size?
The 𝑡-distribution is more spread out for small 𝑛.
As 𝑛 increases, the 𝑡-distribution approaches the normal distribution.
For n=1000, about 95% of the t-distribution falls within ±1.96, almost exactly as for the normal distribution.
What are the key differences between the normal and 𝑡-distributions?
The normal distribution has a fixed shape.
The 𝑡-distribution has heavier tails, especially for small sample sizes.
The 𝑡-distribution approaches the normal as 𝑛 increases.
How does the 95% coverage change between normal and 𝑡-distributions?
Normal: 95% falls within ±1.96 standard deviations.
𝑡-distribution with df=10: Only 92% falls within ±1.96, requiring ±2.23 for 95% coverage.
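These coverage figures can be reproduced approximately without a statistics library. The sketch below integrates the t density numerically with Simpson's rule; the helper names `t_pdf` and `central_coverage` are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def central_coverage(c, df, steps=2000):
    """Pr(-c <= T <= c): Simpson's rule on [0, c], doubled by symmetry."""
    h = c / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(c, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 2 * total * h / 3

cov_196 = central_coverage(1.96, df=10)
cov_223 = central_coverage(2.23, df=10)
print(f"df=10, within +/-1.96: {cov_196:.3f}")
print(f"df=10, within +/-2.23: {cov_223:.3f}")
```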
What is the formula for a confidence interval using the 𝑡-distribution?
μ^±(t(1−α/2,df=n−1)×SE(μ^))
What does each term in the confidence interval formula represent?
𝜇^ : Sample mean (estimate of the true mean)
t(1−α/2,n−1): 𝑡-value for the desired confidence level
SE(μ^): Standard error of the estimate
What two factors are needed to calculate a confidence interval?
Confidence level (choice of 𝛼)
Standard error (SE(μ^), which measures uncertainty)
How do we find the t-multiplier for a 95% confidence interval?
Choose α=0.05.
Compute 1−α/2=0.975.
Find t such that P(T≤t)=0.975 for T∼t(df=n−1).
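The steps above can be sketched in code. In practice the multiplier is usually read from a t-table or statistical software. This stdlib-only version is an illustrative assumption: it integrates the t density numerically and inverts the CDF by bisection. The helper names are hypothetical, and it assumes 0.5 ≤ p < 1 with the quantile lying below 100.

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0: 0.5 plus Simpson's rule on [0, x]."""
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_quantile(p, df, hi=100.0, tol=1e-8):
    """Find t with P(T <= t) = p by bisection on [0, hi]."""
    lo = 0.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

alpha = 0.05
p = 1 - alpha / 2                  # 0.975
t10 = t_quantile(p, df=10)         # roughly 2.23
t1000 = t_quantile(p, df=1000)     # close to the normal value 1.96
print(f"df=10:   t = {t10:.3f}")
print(f"df=1000: t = {t1000:.3f}")
```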
How does the choice of confidence level affect the confidence interval?
Higher confidence levels (e.g., 99%) → wider intervals → more likely to contain the true parameter.
Lower confidence levels (e.g., 90%) → narrower intervals → less likely to contain the true parameter.
Common choices: 90%, 95%, and 99%.
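To see the effect of the confidence level on width, the sketch below compares the three common levels. For simplicity it uses the normal multiplier (the σ-known case) and a hypothetical standard error of 2.5. With the t-distribution the multipliers would be somewhat larger, but the ordering is the same.

```python
from statistics import NormalDist

se = 2.5  # hypothetical standard error, assumed for illustration

multipliers = {}
for level in (0.90, 0.95, 0.99):
    alpha = 1 - level
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided multiplier
    multipliers[level] = z
    width = 2 * z * se                       # full width of the interval
    print(f"{level:.0%} CI: multiplier = {z:.3f}, width = {width:.2f}")
```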