proportions Flashcards
(36 cards)
What are the four conditions of a Binomial random variable?
Fixed number of trials (𝑛)
Independent trials
Two possible outcomes per trial
Same probability of success for each trial
How is the sample proportion (𝑝̂) estimated?
𝑝̂ = (Number of successes) / (Total number of trials)
What is the formula for the standard error of the sample proportion?
𝑠𝑒(𝑝̂) = sqrt[𝑝̂(1−𝑝̂)/𝑛]
What is the formula for a 95% confidence interval for 𝑝?
𝑝̂ ± (1.96 × 𝑠𝑒(𝑝̂))
Why do we assume a Normal approximation for the sample proportion?
Because the sample sizes are large, allowing the Binomial distribution to be approximated by a Normal distribution.
What R function is used to compute a confidence interval for proportions?
binconf(x, n, alpha=0.05, method=”asymptotic”) from the Hmisc package.
What happens when the sample size is too small for a Normal approximation in confidence intervals?
The confidence interval may extend below zero, which is not possible for probabilities. A different method, such as Wilson’s interval, should be used.
What is Wilson’s confidence interval formula used for?
It is used for small sample sizes to ensure the confidence interval does not go below zero or above one.
What is the formula for the standard error of the difference in two proportions?
se(p^1− p^2)= sqrt[((p^1(1 - p^1) / n1) + (p^2 (1 - p^2) / n2)]
What are the null and alternative hypotheses for comparing two proportions?
Null Hypothesis (𝐻0): No difference between proportions (𝑝𝐴 - 𝑝𝐶 = 0)
Alternative Hypothesis (𝐻1): There is a difference (𝑝𝐴 - 𝑝𝐶 ≠ 0)
Why is a hypothesis test used to compare two proportions?
It determines whether the observed difference between two proportions is due to chance or represents a true difference.
Why is Wilson’s interval preferred for small sample sizes?
It avoids impossible probability values (e.g., negative probabilities) and provides more accurate confidence intervals when 𝑝̂ is close to 0 or 1.
What does the test statistic measure in a two-proportion z-test?
It measures the observed difference between sample proportions as a ratio of the standard error, helping determine statistical significance
How is the test statistic for comparing two proportions calculated?
dataestimate−hypothesizedvalue / standard error
What does a test statistic of 4.82 indicate in a z-test?
It means the observed difference is 4.82 standard errors away from the null hypothesis (zero difference), suggesting strong evidence against 𝐻0
What is the p-value for a test statistic of 4.82 in a z-test?
The probability of obtaining such an extreme value (or more) under 𝐻0 is very small, around10−6, leading to rejection of 𝐻0
How is the confidence interval for the difference in two proportions calculated?
(p^1−p^2)±(z×se(p^1−p^2))
What are the three sampling situations for comparing proportions?
Situation A: Independent samples (e.g., comparing two countries).
Situation B: One sample, mutually exclusive categories (e.g., voting choices).
Situation C: One sample, multiple response options (e.g., survey with multiple answers).
How is the standard error calculated for Situation A (independent samples)?
sqrt [((P^1(1-P^1))/ n1)+ ((p^2(1-p^2) / n2)]
When comparing survey responses from two countries, which sampling situation applies?
Situation A (independent samples), since each person belongs to only one country’s sample.
How does the choice of standard error formula impact results?
If the wrong formula is used, confidence intervals and hypothesis tests may be incorrect, leading to misleading conclusions.
What is the formula for the standard error of the difference between two proportions?
se(p^1− p^2)= sqrt [(P^1 + P^2 - ( P^1-P^2)^2) /n]
When should Situation B be used in sampling?
Situation B is used when one sample is asked a single question with mutually exclusive response options, such as “agree,” “disagree,” or “don’t know.”
How are statistical odds calculated?
Odds= p(success) / p(failure) = p / 1−p