What is the primary goal of inferential statistics?
To draw conclusions about a population from sample data by testing hypotheses and assessing how likely the observed patterns would be if they had arisen by chance alone.
List the main steps involved in performing a formal statistical test.
1) Formulate null (H₀) and alternative (H₁) hypotheses; 2) Choose an appropriate test based on data type and study design; 3) Compute the test statistic and associated P-value; 4) Evaluate statistical significance; 5) Reject or fail to reject H₀.
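A minimal sketch of this workflow, assuming Python with SciPy and invented measurements (group names and values here are purely illustrative):

    # Hypothetical example: compare mean response between a control and a treated group.
    from scipy import stats

    control = [4.1, 3.8, 4.5, 4.0, 3.9, 4.2, 4.4, 3.7]   # illustrative data
    treated = [4.9, 5.1, 4.6, 5.3, 4.8, 5.0, 4.7, 5.2]   # illustrative data

    # Step 1: H0 = the population means are equal; H1 = they differ (two-tailed).
    # Step 2: two independent groups of continuous data -> unpaired t-test.
    # Step 3: compute the test statistic and its P-value.
    t_stat, p_value = stats.ttest_ind(control, treated)

    # Steps 4-5: evaluate significance against alpha and decide about H0.
    alpha = 0.05
    if p_value < alpha:
        print(f"t = {t_stat:.2f}, P = {p_value:.4f}: reject H0")
    else:
        print(f"t = {t_stat:.2f}, P = {p_value:.4f}: fail to reject H0")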
How is a null hypothesis (H₀) defined, and what role does it play in hypothesis testing?
The null hypothesis is a statement assumed true unless data provide sufficient evidence to reject it; it serves as the baseline for comparison, such as “no effect” or “no difference” in the population.
What distinguishes a one-tailed test from a two-tailed test?
A one-tailed test examines an effect in a specified direction (e.g., increase only), while a two-tailed test considers deviations in either direction (increase or decrease). The choice depends on prior knowledge or logical constraints indicating which direction is plausible.
Why might a researcher choose a one-tailed test instead of a two-tailed test?
When previous literature, biological constraints, or physical logic indicate that any true effect can only go in one direction, making a one-tailed test more powerful for detecting that specific directional change.
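As a sketch, recent SciPy versions expose the choice of direction through an alternative argument (the data below are hypothetical):

    from scipy import stats

    baseline = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3]   # illustrative data
    enhanced = [12.9, 13.1, 12.7, 13.0, 12.8, 13.2]   # illustrative data

    # Two-tailed: H1 = the means differ in either direction.
    _, p_two = stats.ttest_ind(baseline, enhanced, alternative="two-sided")

    # One-tailed: H1 = 'enhanced' has a larger mean (direction fixed before seeing the data).
    _, p_one = stats.ttest_ind(enhanced, baseline, alternative="greater")

    print(p_two, p_one)   # here the effect is in the hypothesized direction, so p_one = p_two / 2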
What does the P-value represent in hypothesis testing?
The P-value is the probability of obtaining the observed result, or one more extreme, assuming that the null hypothesis is true; it quantifies how surprising the data are under H₀.
How is statistical significance conventionally determined using the P-value?
If P < 0.05, the result is typically deemed statistically significant; P < 0.01 or P < 0.001 are considered very or highly significant, respectively. Values between 0.05 and 0.10 are often called “borderline significant,” suggesting that more data might clarify the effect.
What is a Type I error, and how is it related to the significance level (α)?
A Type I error occurs when one incorrectly rejects a true null hypothesis (false positive). The significance level α (e.g., 0.05) is the maximum tolerated probability of making that error.
What is a Type II error, and what concept quantifies our ability to avoid it?
A Type II error happens when one fails to reject a false null hypothesis (false negative). Statistical power (1 − β) measures the probability of correctly detecting a true effect, thus avoiding Type II errors.
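A small simulation can make both error types concrete; this sketch (assumed sample size, effect size, and number of simulations; NumPy/SciPy) estimates the Type I error rate under a true H₀ and the power under a real effect:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    alpha, n, n_sim = 0.05, 20, 2000

    def rejection_rate(true_shift):
        """Fraction of simulated experiments in which H0 is rejected."""
        rejections = 0
        for _ in range(n_sim):
            a = rng.normal(0.0, 1.0, n)
            b = rng.normal(true_shift, 1.0, n)
            _, p = stats.ttest_ind(a, b)
            if p < alpha:
                rejections += 1
        return rejections / n_sim

    print("Type I error rate (H0 true):", rejection_rate(0.0))   # approx. alpha
    print("Power (true shift = 1 SD):  ", rejection_rate(1.0))   # 1 - beta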
Why is it incorrect to say a large P-value proves there is no effect?
A large P-value only indicates insufficient evidence to reject H₀; it does not prove that no effect exists, as lack of evidence could stem from small sample size or high variability rather than a true null condition.
When testing for a linear relationship between two continuous variables, which correlation coefficient is appropriate?
Pearson’s correlation coefficient is used when both variables are continuous, normally distributed, and expected to have a linear relationship.
Under what circumstances should Spearman’s rank correlation be used instead of Pearson’s?
When data are ordinal, skewed, not normally distributed, or the relationship is monotonic but not strictly linear; Spearman’s assesses correlation based on ranks and is robust to outliers.
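A sketch contrasting the two coefficients on monotonic data containing an extreme outlier (values invented for illustration; SciPy assumed):

    from scipy import stats

    x = [1, 2, 3, 4, 5, 6, 7, 8]
    y = [2, 3, 4, 5, 6, 7, 8, 60]   # monotonic, but the last point is an extreme outlier

    r, p_r = stats.pearsonr(x, y)       # pulled down by the outlier
    rho, p_rho = stats.spearmanr(x, y)  # works on ranks, so the outlier has little influence

    print(f"Pearson r = {r:.2f}, Spearman rho = {rho:.2f}")   # roughly 0.66 vs 1.00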
What does a Pearson correlation coefficient (r) of zero indicate, and why does it not necessarily imply no association?
r = 0 indicates no linear relationship between variables, but a non-linear or curvilinear association may still exist; thus, zero linear correlation doesn’t guarantee complete independence.
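A sketch of a perfectly curvilinear association for which r is exactly zero (illustrative data; SciPy assumed):

    from scipy import stats

    x = [-3, -2, -1, 0, 1, 2, 3]
    y = [9, 4, 1, 0, 1, 4, 9]        # y = x**2: a perfect curvilinear relationship

    r, _ = stats.pearsonr(x, y)
    print(f"Pearson r = {r:.2f}")     # 0.00, yet y is fully determined by x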
What is a chi-square (χ²) test, and when is it applied?
The χ² test evaluates whether observed frequencies in categorical data deviate from expected frequencies under the assumption of no association; it’s used for contingency tables to test independence or goodness of fit.
How are expected frequencies calculated for a simple 2×2 contingency table in the χ² test?
For each cell, multiply the row total by the column total, then divide by the overall sample size; this gives the expected count if there is no association between row and column variables.
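A worked sketch of that calculation for a hypothetical 2×2 table, checked against the expected counts SciPy reports:

    import numpy as np
    from scipy import stats

    # Hypothetical 2x2 contingency table (rows: exposed/unexposed; columns: outcome yes/no).
    observed = np.array([[30, 10],
                         [20, 40]])

    row_totals = observed.sum(axis=1, keepdims=True)   # [[40], [60]]
    col_totals = observed.sum(axis=0, keepdims=True)   # [[50, 50]]
    n = observed.sum()                                 # 100

    expected = row_totals * col_totals / n             # e.g. top-left cell: 40 * 50 / 100 = 20
    print(expected)

    # chi2_contingency returns the same expected counts as its fourth value.
    _, _, _, expected_scipy = stats.chi2_contingency(observed)
    print(expected_scipy)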
Why might Fisher’s exact test be used instead of a χ² test?
Fisher’s exact test provides an exact P-value for small sample sizes (when one or more expected counts are < 5), whereas the χ² test is an approximation that can be inaccurate under those conditions.
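A sketch comparing the two tests on a small hypothetical table in which several expected counts fall below 5:

    import numpy as np
    from scipy import stats

    # Small hypothetical 2x2 table.
    table = np.array([[8, 2],
                      [1, 5]])

    chi2, p_chi2, dof, expected = stats.chi2_contingency(table)   # approximate
    odds_ratio, p_fisher = stats.fisher_exact(table)              # exact

    print("expected counts:\n", expected)                         # some are below 5
    print(f"chi-square P = {p_chi2:.3f}, Fisher exact P = {p_fisher:.3f}")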
What assumptions underlie Student’s t-test for comparing two means?
Data in each group should be approximately normally distributed, groups should have equal variances (for the classic t-test), and observations must be independent. If variances are unequal, Welch’s t-test is preferred.
When should an unpaired t-test be used over a paired t-test?
Use an unpaired t-test for independent samples (e.g., two different groups of subjects). A paired t-test is for dependent samples or repeated measures (e.g., the same subjects measured before and after treatment).
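A sketch of the unpaired, Welch, and paired variants in SciPy (all values below are invented for illustration):

    from scipy import stats

    group_a = [5.1, 4.9, 5.4, 5.0, 5.2, 4.8]       # independent group 1 (illustrative)
    group_b = [5.6, 5.8, 5.5, 5.9, 5.7, 6.0]       # independent group 2 (illustrative)

    before  = [7.2, 6.9, 7.5, 7.1, 7.3, 7.0]       # same subjects, before treatment
    after   = [6.8, 6.5, 7.0, 6.7, 6.9, 6.6]       # same subjects, after treatment

    _, p_student = stats.ttest_ind(group_a, group_b)                   # classic t-test (equal variances)
    _, p_welch   = stats.ttest_ind(group_a, group_b, equal_var=False)  # Welch's t-test (unequal variances)
    _, p_paired  = stats.ttest_rel(before, after)                      # paired t-test (dependent samples)

    print(p_student, p_welch, p_paired)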
How can one verify whether data meet the normality assumption before applying a parametric test?
By visual inspection of histograms or Q-Q plots, or via formal tests like the D’Agostino–Pearson normality test; if data deviate substantially from normality, nonparametric alternatives should be considered.
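A sketch of both checks, assuming SciPy and Matplotlib (scipy.stats.normaltest implements the D’Agostino–Pearson test and needs a reasonably large sample):

    import numpy as np
    from scipy import stats
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(1)
    sample = rng.lognormal(mean=0.0, sigma=0.8, size=60)   # illustrative, right-skewed data

    # Formal test: a small P suggests a departure from normality.
    stat, p = stats.normaltest(sample)
    print(f"normaltest P = {p:.4f}")

    # Visual check: Q-Q plot against a normal distribution.
    stats.probplot(sample, dist="norm", plot=plt)
    plt.show()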
What is the principle behind the Mann–Whitney U-test?
The Mann–Whitney U-test (Wilcoxon rank-sum) compares the ranks of two independent samples to test whether one group tends to have larger values than the other; it does not assume normality.
Under what conditions is the Mann–Whitney U-test appropriate?
When comparing two independent groups with continuous or ordinal data that are not normally distributed, provided the distributions have similar shapes and sample sizes are adequate.
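A sketch with two small right-skewed samples (values invented; SciPy assumed):

    from scipy import stats

    # Two independent, right-skewed samples (illustrative values).
    group_x = [1.2, 1.5, 1.1, 2.0, 1.8, 9.5, 1.4, 1.6]
    group_y = [2.4, 2.9, 3.1, 2.7, 11.0, 3.3, 2.8, 3.0]

    u_stat, p = stats.mannwhitneyu(group_x, group_y, alternative="two-sided")
    print(f"U = {u_stat}, P = {p:.4f}")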
What does ANOVA (Analysis of Variance) test, and why is it used instead of multiple t-tests?
One-way ANOVA tests whether there are any statistically significant differences among the means of three or more independent groups; it controls the family-wise error rate better than performing multiple pairwise t-tests.
What assumption does ANOVA make about the variances across groups, and what alternative exists if this assumption is violated?
ANOVA assumes homogeneity of variances (equal variances across groups). If variances are unequal, alternatives include Welch’s ANOVA or, for non-normal data, the Kruskal–Wallis test.
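A sketch of one-way ANOVA alongside its rank-based alternative (illustrative data; SciPy assumed):

    from scipy import stats

    # Three independent groups (illustrative measurements).
    g1 = [23, 25, 21, 24, 26, 22]
    g2 = [28, 30, 27, 29, 31, 28]
    g3 = [24, 26, 25, 23, 27, 25]

    f_stat, p_anova = stats.f_oneway(g1, g2, g3)      # assumes normality and equal variances
    h_stat, p_kw    = stats.kruskal(g1, g2, g3)       # rank-based, no normality assumption

    print(f"ANOVA P = {p_anova:.4f}, Kruskal-Wallis P = {p_kw:.4f}")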
Explain the difference between one-way and two-way ANOVA.
One-way ANOVA examines the effect of a single categorical predictor on a continuous outcome. Two-way ANOVA evaluates the effects of two categorical predictors (and their interaction) on a continuous outcome.
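A sketch of a two-way ANOVA with an interaction term, assuming pandas and statsmodels (the factor names, column names, and values here are hypothetical):

    import pandas as pd
    import statsmodels.api as sm
    from statsmodels.formula.api import ols

    # Hypothetical data: yield measured under two factors, fertilizer and watering.
    df = pd.DataFrame({
        "fertilizer": ["A", "A", "A", "A", "B", "B", "B", "B"] * 2,
        "watering":   ["low", "low", "high", "high"] * 4,
        "yield_":     [4.1, 4.3, 5.0, 5.2, 4.8, 4.6, 6.1, 6.3,
                       4.2, 4.0, 5.1, 4.9, 4.7, 4.9, 6.0, 6.2],
    })

    # Main effects of each factor plus their interaction.
    model = ols("yield_ ~ C(fertilizer) * C(watering)", data=df).fit()
    print(sm.stats.anova_lm(model, typ=2))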