proportions Flashcards by Anita Ferrari

What are the four conditions of a Binomial random variable?

Fixed number of trials (𝑛)

Independent trials

Two possible outcomes per trial

Same probability of success for each trial

How is the sample proportion (𝑝̂) estimated?

𝑝̂ = (Number of successes) / (Total number of trials)

What is the formula for the standard error of the sample proportion?

𝑠𝑒(𝑝̂) = sqrt[𝑝̂(1−𝑝̂)/𝑛]

What is the formula for a 95% confidence interval for 𝑝?

𝑝̂ ± (1.96 × 𝑠𝑒(𝑝̂))
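
The three formulas above (sample proportion, standard error, 95% interval) can be sketched in Python as follows. This is a minimal illustration, not part of the original deck: the function name `wald_ci` and the counts (40 successes in 100 trials) are hypothetical.

```python
import math

def wald_ci(successes, n, z=1.96):
    """Normal-approximation interval for a proportion (hypothetical helper)."""
    p_hat = successes / n                    # sample proportion
    se = math.sqrt(p_hat * (1 - p_hat) / n)  # standard error of p_hat
    return p_hat, se, (p_hat - z * se, p_hat + z * se)

# Hypothetical data: 40 successes in 100 trials
p_hat, se, (lower, upper) = wald_ci(40, 100)
print(f"p_hat={p_hat:.3f}, se={se:.4f}, 95% CI=({lower:.3f}, {upper:.3f})")
```

The multiplier 1.96 is passed as a parameter so that other confidence levels can be tried without changing the function.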

Why do we assume a Normal approximation for the sample proportion?

Because when the sample size is large, the Binomial distribution is well approximated by a Normal distribution, so 𝑝̂ is approximately Normally distributed.

What R function is used to compute a confidence interval for proportions?

binconf(x, n, alpha=0.05, method="asymptotic") from the Hmisc package.

What happens when the sample size is too small for a Normal approximation in confidence intervals?

The confidence interval may extend below zero or above one, which is not possible for probabilities. A different method, such as Wilson’s interval, should be used.

What is Wilson’s confidence interval formula used for?

It is used for small sample sizes to ensure the confidence interval does not go below zero or above one.
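
To see the difference on a small sample, the sketch below compares the Normal-approximation interval with the standard Wilson score interval. The deck does not spell out Wilson's formula, so the textbook form is assumed here; both function names and the data (1 success in 10 trials) are hypothetical.

```python
import math

def normal_approx_interval(successes, n, z=1.96):
    """p_hat +/- z * se(p_hat); can fall outside [0, 1] for small n."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval (standard textbook form, assumed here)."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

# Hypothetical data: 1 success in 10 trials
wald = normal_approx_interval(1, 10)
wilson = wilson_interval(1, 10)
print("Normal approximation:", wald)  # lower limit is negative
print("Wilson:", wilson)              # both limits stay inside [0, 1]
```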

What is the formula for the standard error of the difference in two proportions?

se(𝑝̂1 − 𝑝̂2) = sqrt[𝑝̂1(1 − 𝑝̂1)/𝑛1 + 𝑝̂2(1 − 𝑝̂2)/𝑛2]

What are the null and alternative hypotheses for comparing two proportions?

Null Hypothesis (𝐻0): No difference between proportions (𝑝𝐴 - 𝑝𝐶 = 0)

Alternative Hypothesis (𝐻1): There is a difference (𝑝𝐴 - 𝑝𝐶 ≠ 0)

Why is a hypothesis test used to compare two proportions?

It determines whether the observed difference between two proportions is due to chance or represents a true difference.

Why is Wilson’s interval preferred for small sample sizes?

It avoids impossible probability values (e.g., negative probabilities) and provides more accurate confidence intervals when 𝑝̂ is close to 0 or 1.

What does the test statistic measure in a two-proportion z-test?

It measures the observed difference between the sample proportions in units of its standard error, which helps determine statistical significance.

How is the test statistic for comparing two proportions calculated?

(data estimate − hypothesized value) / standard error

What does a test statistic of 4.82 indicate in a z-test?

It means the observed difference is 4.82 standard errors away from the null hypothesis (zero difference), suggesting strong evidence against 𝐻0.

What is the p-value for a test statistic of 4.82 in a z-test?

The probability of obtaining a value this extreme (or more extreme) under 𝐻0 is very small, around 10⁻⁶, leading to rejection of 𝐻0.
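
The order of magnitude can be checked with a short Python sketch. It assumes a two-sided test and a standard Normal reference distribution; the helper name `two_sided_p_value` is hypothetical.

```python
import math

def two_sided_p_value(z):
    """P(|Z| >= |z|) for a standard Normal Z, via the complementary error function."""
    return math.erfc(abs(z) / math.sqrt(2))

p = two_sided_p_value(4.82)
print(f"p-value for z = 4.82: {p:.2e}")  # on the order of 10^-6
```

`math.erfc` is used rather than `1 - cdf` because it keeps precision for very small tail probabilities.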

How is the confidence interval for the difference in two proportions calculated?

(𝑝̂1 − 𝑝̂2) ± 𝑧 × se(𝑝̂1 − 𝑝̂2)
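
A minimal Python sketch of this interval for two independent samples follows. The function name `diff_ci` and the counts (60 of 200 versus 40 of 200) are hypothetical, and 𝑧 = 1.96 is assumed for a 95% interval.

```python
import math

def diff_ci(x1, n1, x2, n2, z=1.96):
    """Interval for p1 - p2 from two independent samples (hypothetical helper)."""
    p1, p2 = x1 / n1, x2 / n2
    # Standard error of the difference for independent samples
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff, se, (diff - z * se, diff + z * se)

# Hypothetical counts: 60 successes of 200 vs 40 successes of 200
diff, se, (lower, upper) = diff_ci(60, 200, 40, 200)
print(f"difference={diff:.3f}, se={se:.4f}, 95% CI=({lower:.4f}, {upper:.4f})")
```

With these made-up counts the whole interval lies above zero, which is the case where the data suggest the first proportion is larger.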

What are the three sampling situations for comparing proportions?

Situation A: Independent samples (e.g., comparing two countries).

Situation B: One sample, mutually exclusive categories (e.g., voting choices).

Situation C: One sample, multiple response options (e.g., survey with multiple answers).

How is the standard error calculated for Situation A (independent samples)?

sqrt[𝑝̂1(1 − 𝑝̂1)/𝑛1 + 𝑝̂2(1 − 𝑝̂2)/𝑛2]

When comparing survey responses from two countries, which sampling situation applies?

Situation A (independent samples), since each person belongs to only one country’s sample.

How does the choice of standard error formula impact results?

If the wrong formula is used, confidence intervals and hypothesis tests may be incorrect, leading to misleading conclusions.

What is the formula for the standard error of the difference between two proportions?

For Situation B (one sample of size 𝑛, mutually exclusive categories): se(𝑝̂1 − 𝑝̂2) = sqrt[(𝑝̂1 + 𝑝̂2 − (𝑝̂1 − 𝑝̂2)²)/𝑛]
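
The sketch below evaluates this single-sample formula in Python and, for contrast, the independent-samples formula on the same numbers. Function names and figures (a sample of 1000 with 45% "agree" and 35% "disagree") are hypothetical.

```python
import math

def se_diff_one_sample(p1, p2, n):
    """se(p1 - p2) when both proportions come from one sample with
    mutually exclusive categories."""
    return math.sqrt((p1 + p2 - (p1 - p2) ** 2) / n)

def se_diff_independent(p1, n1, p2, n2):
    """se(p1 - p2) for two independent samples, shown only for comparison."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Hypothetical survey: n = 1000, 45% agree, 35% disagree
se_b = se_diff_one_sample(0.45, 0.35, 1000)
se_a = se_diff_independent(0.45, 1000, 0.35, 1000)
print(f"one sample: {se_b:.4f}, independent samples: {se_a:.4f}")
```

Here the single-sample standard error comes out larger, because the two categories compete for the same respondents; using the independent-samples formula would make the interval too narrow.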

When should Situation B be used in sampling?

Situation B is used when one sample is asked a single question with mutually exclusive response options, such as “agree,” “disagree,” or “don’t know.”

How are statistical odds calculated?

Odds = P(success) / P(failure) = 𝑝 / (1 − 𝑝)

What does an odds ratio (OR) greater than 1 indicate?

It indicates that the intervention group has higher odds of success compared to the control group.

How is an odds ratio (OR) calculated?

θ = (odds in group 1) / (odds in group 2)

What does an odds ratio (OR) less than 1 indicate?

It indicates that the control group has higher odds of success compared to the intervention group.

Why do we use the log of the odds ratio to construct confidence intervals?

Because the distribution of the odds ratio is highly skewed, and taking the log makes it approximately normal.

What is the formula for the standard error of the log odds ratio?

seOR = sqrt[1/𝑛11 + 1/𝑛12 + 1/𝑛21 + 1/𝑛22], where 𝑛11, 𝑛12, 𝑛21 and 𝑛22 are the four cell counts of the 2×2 table.

How do you obtain a confidence interval for an odds ratio?

Compute log(θ̂), form the interval log(θ̂) ± z_(1−α/2) × seOR, then exponentiate the lower and upper limits to return to the odds ratio scale.
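
These steps can be sketched in Python as below. The table layout is an assumption not stated in the deck (rows are groups, columns are success and failure, so 𝑛11 is successes in group 1); the function name and counts are hypothetical.

```python
import math

def odds_ratio_ci(n11, n12, n21, n22, z=1.96):
    """Odds ratio and its interval via the log scale.
    Assumed layout: row 1 = group 1 (successes, failures),
                    row 2 = group 2 (successes, failures)."""
    theta = (n11 / n12) / (n21 / n22)                      # odds ratio
    se_log = math.sqrt(1/n11 + 1/n12 + 1/n21 + 1/n22)      # se of log odds ratio
    log_theta = math.log(theta)
    lower = math.exp(log_theta - z * se_log)               # back-transform limits
    upper = math.exp(log_theta + z * se_log)
    return theta, (lower, upper)

# Hypothetical table: group 1 has 30 successes and 70 failures,
# group 2 has 20 successes and 80 failures
theta, (lower, upper) = odds_ratio_ci(30, 70, 20, 80)
print(f"OR={theta:.3f}, 95% CI=({lower:.3f}, {upper:.3f})")
```

The resulting interval is not symmetric around the estimate on the odds ratio scale, which is expected: it is symmetric only on the log scale.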

What does an odds ratio of exactly 1 indicate?

It indicates no difference between the two groups.

What is the formula for the pooled sample proportion when testing the difference between two proportions?

𝑝̂ = (𝑥1 + 𝑥2) / (𝑛1 + 𝑛2), where 𝑥1 and 𝑥2 are the numbers of successes in each sample and 𝑛1 and 𝑛2 are the sample sizes.
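
A short Python sketch of how the pooled proportion might enter a test statistic is given below. The pooled standard error sqrt[𝑝̂(1 − 𝑝̂)(1/𝑛1 + 1/𝑛2)] is the usual textbook form and is an assumption here, since the deck does not state it; the function name and counts are hypothetical.

```python
import math

def pooled_z(x1, n1, x2, n2):
    """Pooled proportion and z statistic for H0: p1 - p2 = 0."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)  # pooled sample proportion
    # Assumed textbook form of the standard error under H0
    se0 = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return p_pool, (p1 - p2) / se0

# Hypothetical counts: 60 successes of 200 vs 40 successes of 200
p_pool, z = pooled_z(60, 200, 40, 200)
print(f"pooled proportion={p_pool:.3f}, z={z:.3f}")
```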

How do you interpret a confidence interval for the difference between two proportions?

If 0 is in the interval, there is no statistically significant difference. If the interval is entirely positive, 𝑝1 > 𝑝2. If the interval is entirely negative, 𝑝1 < 𝑝2.

Why are odds used instead of probabilities in logistic regression?

Because odds have mathematical properties that allow for a linear relationship with predictor variables on the log scale.

What does a log odds ratio of 0 mean?

It means the odds ratio is 1, indicating no difference between the two groups.

What transformation is used to make the odds ratio’s distribution approximately normal?

The natural logarithm (log transformation) is applied to the odds ratio.

(36 cards)