flashcard 6

(50 cards)

1
Q

What is the primary goal of inferential statistics?

A

To draw conclusions about a population from sample data by testing hypotheses and assessing how likely the observed patterns would be if chance alone were at work.

2
Q

List the main steps involved in performing a formal statistical test.

A

1) Formulate null (H₀) and alternative (H₁) hypotheses; 2) Choose an appropriate test based on data type and study design; 3) Compute the test statistic and associated P-value; 4) Evaluate statistical significance; 5) Reject or fail to reject H₀.

3
Q

How is a null hypothesis (H₀) defined, and what role does it play in hypothesis testing?

A

The null hypothesis is a statement assumed true unless data provide sufficient evidence to reject it; it serves as the baseline for comparison, such as “no effect” or “no difference” in the population.

4
Q

What distinguishes a one-tailed test from a two-tailed test?

A

A one-tailed test examines an effect in a specified direction (e.g., increase only), while a two-tailed test considers deviations in either direction (increase or decrease). The choice depends on prior knowledge or logical constraints indicating which direction is plausible.

5
Q

Why might a researcher choose a one-tailed test instead of a two-tailed test?

A

When previous literature, biological constraints, or physical logic indicate that any true effect can only go in one direction, making a one-tailed test more powerful for detecting that specific directional change.

6
Q

What does the P-value represent in hypothesis testing?

A

The P-value is the probability of obtaining the observed result, or one more extreme, assuming that the null hypothesis is true; it quantifies how surprising the data are under H₀.
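As a rough illustration of that definition, the sketch below computes an exact two-sided P-value for a coin-toss count. The data (16 heads in 20 tosses) and the function name `binomial_two_sided_p` are made up for this example, and "at least as extreme" is taken to mean "no more probable under H₀ than the observed count", which is one common convention rather than the only one.

```python
from math import comb

def binomial_two_sided_p(successes, trials, p_null=0.5):
    """Exact two-sided P-value for a binomial count under H0: p = p_null.

    Sums the probability of every outcome that is no more likely than
    the observed one (one common reading of "at least as extreme").
    """
    def pmf(k):
        return comb(trials, k) * p_null**k * (1 - p_null)**(trials - k)

    observed = pmf(successes)
    # The tiny relative tolerance keeps floating-point ties on the right side.
    return sum(pmf(k) for k in range(trials + 1)
               if pmf(k) <= observed * (1 + 1e-9))

# Hypothetical data: 16 heads in 20 tosses of a coin assumed fair under H0.
p_value = binomial_two_sided_p(16, 20)
print(f"P = {p_value:.4f}")  # about 0.0118
```

A result this small says the data would be surprising if the coin were fair; it does not give the probability that the coin is fair.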

7
Q

How is statistical significance conventionally determined using the P-value?

A

If P < 0.05, the result is typically deemed statistically significant; P < 0.01 or P < 0.001 are considered very or highly significant, respectively. Values between 0.05 and 0.10 are often called “borderline significant,” suggesting that more data might clarify the effect.

8
Q

What is a Type I error, and how is it related to the significance level (α)?

A

A Type I error occurs when one incorrectly rejects a true null hypothesis (false positive). The significance level α (e.g., 0.05) is the maximum tolerated probability of making that error.

9
Q

What is a Type II error, and what concept quantifies our ability to avoid it?

A

A Type II error happens when one fails to reject a false null hypothesis (false negative). Statistical power (1 − β) measures the probability of correctly detecting a true effect, thus avoiding Type II errors.

10
Q

Why is it incorrect to say a large P-value proves there is no effect?

A

A large P-value only indicates insufficient evidence to reject H₀; it does not prove that no effect exists, as lack of evidence could stem from small sample size or high variability rather than a true null condition.

11
Q

When testing for a linear relationship between two continuous variables, which correlation coefficient is appropriate?

A

Pearson’s correlation coefficient is used when both variables are continuous, normally distributed, and expected to have a linear relationship.

12
Q

Under what circumstances should Spearman’s rank correlation be used instead of Pearson’s?

A

When data are ordinal, skewed, not normally distributed, or the relationship is monotonic but not strictly linear; Spearman’s assesses correlation based on ranks and is robust to outliers.

13
Q

What does a Pearson correlation coefficient (r) of zero indicate, and why does it not necessarily imply no association?

A

r = 0 indicates no linear relationship between variables, but a non-linear or curvilinear association may still exist; thus, zero linear correlation doesn’t guarantee complete independence.
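A small sketch can make this concrete. The data below are invented: y is completely determined by x (y = x²), yet the Pearson coefficient comes out as zero because the relationship is symmetric rather than linear. `pearson_r` is a hand-written helper for this example.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient computed from its definition."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: a perfect but curvilinear (U-shaped) dependence.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [v ** 2 for v in x]

r = pearson_r(x, y)
print(f"r = {r:.3f}")  # 0.000, although y depends entirely on x
```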

14
Q

What is a chi-square (χ²) test, and when is it applied?

A

The χ² test evaluates whether observed frequencies in categorical data deviate from expected frequencies under the assumption of no association; it’s used for contingency tables to test independence or goodness of fit.

15
Q

How are expected frequencies calculated for a simple 2×2 contingency table in the χ² test?

A

For each cell, multiply the row total by the column total, then divide by the overall sample size; this gives the expected count if there is no association between row and column variables.
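The rule above can be sketched in a few lines. The 2×2 counts are hypothetical and the function names are made up for this example; no continuity (Yates) correction is applied. The P-value line relies on the fact that a χ² variable with 1 degree of freedom is a squared standard normal, so it applies to 2×2 tables only.

```python
from math import erfc, sqrt

def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square_statistic(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_counts(table)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))

# Hypothetical 2x2 table: rows = treatment/control, columns = improved/not.
observed = [[30, 10],
            [20, 40]]

expected = expected_counts(observed)
chi2 = chi_square_statistic(observed)
p_value = erfc(sqrt(chi2 / 2))  # valid for 1 degree of freedom only

print(expected)          # [[20.0, 20.0], [30.0, 30.0]]
print(f"{chi2:.3f}")     # 16.667
print(f"{p_value:.2g}")
```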

16
Q

Why might Fisher’s exact test be used instead of a χ² test?

A

Fisher’s exact test provides an exact P-value for small sample sizes (when one or more expected counts are < 5), whereas the χ² test is an approximation that can be inaccurate under those conditions.
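To show what "exact" means here, the sketch below enumerates every 2×2 table with the same margins and sums the hypergeometric probabilities of those no more likely than the observed table. The counts are hypothetical, `fisher_exact_two_sided` is a name chosen for this example, and this two-sided rule is one common convention.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))

# Hypothetical small-sample table; every expected count is below 5.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"P = {p_value:.4f}")  # 0.4857
```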

17
Q

What assumptions underlie Student’s t-test for comparing two means?

A

Data in each group should be approximately normally distributed, groups should have equal variances (for the classic t-test), and observations must be independent. If variances are unequal, Welch’s t-test is preferred.

18
Q

When should an unpaired t-test be used over a paired t-test?

A

Use an unpaired t-test for independent samples (e.g., two different groups of subjects). A paired t-test is for dependent samples or repeated measures (e.g., the same subjects measured before and after treatment).
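For the paired case, a minimal sketch: the test reduces to a one-sample t statistic on the within-subject differences. The before/after measurements are invented and `paired_t` is a helper written for this example; converting t to a P-value would need a t-distribution table or a statistics library, which is left out here.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """Paired t statistic and degrees of freedom from per-subject differences."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1

# Hypothetical repeated measures on the same five subjects.
before = [120, 132, 128, 140, 125]
after = [115, 130, 121, 135, 124]

t, df = paired_t(before, after)
print(f"t = {t:.3f}, df = {df}")  # t = -3.651, df = 4
```

Because each subject serves as their own control, between-subject variability drops out of the differences, which is why pairing matters for the choice of test.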

19
Q

How can one verify whether data meet the normality assumption before applying a parametric test?

A

By visual inspection of histograms or Q-Q plots, or via formal tests like the D’Agostino–Pearson normality test; if data deviate substantially from normality, nonparametric alternatives should be considered.

20
Q

What is the principle behind the Mann–Whitney U-test?

A

The Mann–Whitney U-test (Wilcoxon rank-sum) compares the ranks of two independent samples to test whether one group tends to have larger values than the other; it does not assume normality.
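The rank-based idea can be sketched as follows. Only the U statistics are computed, not a P-value; the data and function names are made up for this example, and tied values receive the average of the ranks they span.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their ranks."""
    ranks = []
    for v in values:
        below = sum(1 for w in values if w < v)
        tied = sum(1 for w in values if w == v)
        ranks.append(below + (tied + 1) / 2)
    return ranks

def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b); each counts how often one group beats the other."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    return u_a, n_a * n_b - u_a

# Hypothetical samples: group B tends to have larger values than group A.
print(mann_whitney_u([1, 2, 3], [4, 5, 6]))  # (0.0, 9.0)
print(mann_whitney_u([1, 2], [2, 3]))        # (0.5, 3.5), one tie counts as half
```

A U of 0 for one group means none of its values exceeds any value in the other group, the most extreme separation possible.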

21
Q

Under what conditions is the Mann–Whitney U-test appropriate?

A

When comparing two independent groups with continuous or ordinal data that are not normally distributed, provided the distributions have similar shapes and sample sizes are adequate.

22
Q

What does ANOVA (Analysis of Variance) test, and why is it used instead of multiple t-tests?

A

One-way ANOVA tests whether there are any statistically significant differences among the means of three or more independent groups; it controls the family-wise error rate better than performing multiple pairwise t-tests.
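As a sketch of what the test computes, the F statistic is the ratio of between-group to within-group variability. The three groups below are hypothetical and `one_way_anova_f` is a helper written for this example; turning F into a P-value requires the F distribution, which is not included here.

```python
from statistics import mean

def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)

    # Variability of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)

    df_between, df_within = k - 1, n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical data for three independent groups.
f, df_b, df_w = one_way_anova_f([1, 2, 3], [2, 3, 4], [5, 6, 7])
print(f"F({df_b}, {df_w}) = {f:.2f}")  # F(2, 6) = 13.00
```

A single F test covers all groups at once, so the chance of a false positive is not multiplied across many pairwise tests.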

23
Q

What assumption does ANOVA make about the variances across groups, and what alternative exists if this assumption is violated?

A

ANOVA assumes homogeneity of variances (equal variances across groups). If variances are unequal, alternatives include Welch’s ANOVA or, for non-normal data, the Kruskal–Wallis test.

24
Q

Explain the difference between one-way and two-way ANOVA.

A

One-way ANOVA examines the effect of a single categorical predictor on a continuous outcome. Two-way ANOVA evaluates the effects of two categorical predictors (and their interaction) on a continuous outcome.

25
Why are post-hoc tests necessary following a significant ANOVA result?
ANOVA only indicates that at least two group means differ, but doesn’t specify which ones; post-hoc tests (e.g., Tukey’s HSD) perform pairwise comparisons with adjustments to control the overall Type I error rate.
26
What is the Tukey Honest Significant Difference (HSD) method used for?
To conduct all pairwise comparisons among group means after a significant one-way ANOVA, providing adjusted P-values that control the family-wise error rate under the assumption of equal variances.
27
When might the Games–Howell post-hoc test be preferred over Tukey’s HSD?
When group variances are unequal, which violates the equal-variance assumption of Tukey’s HSD; Games–Howell does not require homogeneity of variances to perform pairwise comparisons.
28
What is the Kruskal–Wallis test, and when should it be used?
The Kruskal–Wallis test is a nonparametric alternative to one-way ANOVA that compares the ranks of multiple independent groups; it’s used when data are not normally distributed or when variances differ substantially.
29
How does the Kruskal–Wallis test handle ties and unequal sample sizes?
It ranks all observations jointly, accounting for ties by assigning average ranks to tied values; unequal sample sizes are handled naturally since the test compares overall rank sums relative to group sizes.
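A minimal sketch of the H statistic, with joint ranking, average ranks for ties, and the usual tie correction. Group sizes may differ because each squared rank sum is divided by its own group size. The data and function names are hypothetical, and the P-value (from a χ² approximation) is omitted.

```python
from collections import Counter

def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their ranks."""
    ranks = []
    for v in values:
        below = sum(1 for w in values if w < v)
        tied = sum(1 for w in values if w == v)
        ranks.append(below + (tied + 1) / 2)
    return ranks

def kruskal_wallis_h(*groups):
    """Tie-corrected Kruskal-Wallis H statistic (not all values may be equal)."""
    pooled = [v for g in groups for v in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)

    weighted = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        weighted += rank_sum ** 2 / len(g)   # scales by each group's size
        start += len(g)
    h = 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)

    # Tie correction: each set of t tied values contributes t^3 - t.
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    return h / (1 - tie_term / (n_total ** 3 - n_total))

print(round(kruskal_wallis_h([1, 2, 3], [4, 5, 6], [7, 8, 9]), 3))  # 7.2
print(round(kruskal_wallis_h([1, 2], [2, 3]), 3))                   # 1.5
```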
30
Define statistical power and explain two ways to increase it.
Statistical power is the probability of correctly rejecting a false null hypothesis (1 − β). It can be increased by enlarging the sample size or reducing data variability (e.g., using more precise measurements).
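Both levers can be seen in a simplified setting. The sketch below assumes a two-sided one-sample z-test with known standard deviation, which is an idealization chosen so that power has a closed form; `z_test_power` and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a two-sided one-sample z-test (sigma assumed known)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect * sqrt(n) / sigma  # true effect in standard-error units
    # Probability that the test statistic lands in either rejection region.
    return std_normal.cdf(-z_crit + shift) + std_normal.cdf(-z_crit - shift)

baseline = z_test_power(effect=0.5, sigma=1.0, n=10)
more_data = z_test_power(effect=0.5, sigma=1.0, n=40)    # larger sample
less_noise = z_test_power(effect=0.5, sigma=0.5, n=10)   # lower variability

print(f"{baseline:.2f} {more_data:.2f} {less_noise:.2f}")
```

Quadrupling the sample size and halving the standard deviation give the same power here, since both halve the standard error.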
31
What is the difference between statistical significance and practical (or clinical) significance?
Statistical significance means the observed effect would be unlikely to arise by chance alone if the null hypothesis were true (P below the chosen α). Practical significance considers the real-world importance or magnitude of that effect, that is, whether it is large enough to matter.
32
What is a confidence interval, and how should it be interpreted?
A confidence interval provides a range of plausible values for a population parameter (e.g., a mean) at a specified confidence level (e.g., 95%). The 95% describes the procedure: across repeated samples, about 95% of intervals constructed this way would contain the true parameter; it is not a 95% probability that the parameter lies in any one computed interval.
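The repeated-sampling reading of a confidence level can be checked by simulation. This is a sketch under simplifying assumptions: normally distributed data, a known standard deviation (so a z-interval with 1.96 applies), and arbitrary made-up parameter values.

```python
import random
from math import sqrt
from statistics import mean

def coverage(true_mean=50.0, sigma=10.0, n=25, trials=2000, seed=0):
    """Fraction of simulated 95% z-intervals that contain the true mean."""
    rng = random.Random(seed)
    half_width = 1.96 * sigma / sqrt(n)  # sigma is assumed known
    hits = 0
    for _ in range(trials):
        sample_mean = mean(rng.gauss(true_mean, sigma) for _ in range(n))
        if sample_mean - half_width <= true_mean <= sample_mean + half_width:
            hits += 1
    return hits / trials

print(f"coverage = {coverage():.3f}")  # close to 0.95
```

Each simulated interval either contains the true mean or it does not; the 95% describes how often the procedure succeeds over many samples.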
33
Why does a large sample size sometimes lead to “statistically significant” results that might be misleading?
With very large samples, even trivial differences can produce very small P-values, making them statistically significant despite having negligible practical impact.
34
Explain the concept of “degrees of freedom” in the context of the χ² test.
Degrees of freedom for a contingency table are calculated as (number of rows − 1) × (number of columns − 1). They reflect how many cell counts can vary independently when computing expected frequencies.
35
How do researchers assess if two categorical variables are independent?
By constructing a contingency table and using a χ² test of independence (or Fisher’s exact test if cell counts are small) to see if observed frequencies deviate significantly from those expected under independence.
36
In hypothesis testing, what does “rejecting the null hypothesis” imply?
That the data would be sufficiently unlikely if the null hypothesis (e.g., “no effect” or “no difference”) were true, so the null is rejected in favor of the alternative hypothesis; this supports the alternative but does not prove it.
37
Describe one scenario in which a researcher would use a two-sided alternative hypothesis.
When there is no prior indication of whether the effect could be positive or negative—only that it differs from zero—so the test must detect deviations in either direction.
38
Why is formulating the alternative hypothesis crucial before analyzing data?
Because the specific wording of H₁ determines whether a one-tailed or two-tailed test is appropriate and ensures that the analysis aligns with the research question and prior knowledge.
39
What is the purpose of testing for normality in quantitative data?
To determine whether parametric tests (which assume normal distribution) are valid or if nonparametric alternatives should be used; normality tests and plots help identify distributional deviations.
40
Name two formal tests for assessing normality and briefly describe how they differ.
1) D’Agostino–Pearson test, which combines skewness and kurtosis measures into a single P-value; 2) Shapiro–Wilk test, which assesses how well data fit a normal distribution based on correlation with theoretical normal scores.
41
When is Welch’s t-test preferred over the classic Student’s t-test?
When comparing two independent samples that do not have equal variances, as Welch’s adjusts the degrees of freedom to account for unequal variability.
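A sketch of the adjustment: Welch's statistic uses each group's own variance, and the degrees of freedom come from the Welch–Satterthwaite approximation. The samples are invented and `welch_t` is a helper written for this example; looking up the P-value for the resulting (usually fractional) degrees of freedom is left to a statistics library.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each group separately.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df

# Hypothetical samples with clearly different spreads.
tight = [10, 12, 11, 13, 14]
wide = [8, 15, 5, 18, 9, 2]

t, df = welch_t(tight, wide)
print(f"t = {t:.3f}, df = {df:.2f}")
```

When the two groups have equal sizes and equal variances, the formula returns n₁ + n₂ − 2, the same degrees of freedom as Student's t-test; otherwise it returns a smaller value.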
42
How does the concept of ranking underlie many nonparametric tests?
Nonparametric tests often convert raw data into ranks, then compare rank sums or average ranks across groups, minimizing the influence of outliers and not requiring normality.
43
What does it mean for two variables to have a monotonic relationship?
As one variable increases, the other either consistently increases or consistently decreases, but not necessarily at a constant rate; Spearman’s correlation captures monotonic associations.
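The contrast with linear correlation can be sketched on invented data where y = x³: the relationship is perfectly monotonic but not linear. The helpers are written for this example, and `simple_ranks` assumes there are no tied values.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient computed from its definition."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def simple_ranks(values):
    """1-based ranks; assumes all values are distinct (no ties)."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson_r(simple_ranks(x), simple_ranks(y))

# Hypothetical data: always increasing, but at an accelerating rate.
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]

print(f"Spearman = {spearman_rho(x, y):.3f}")  # 1.000
print(f"Pearson  = {pearson_r(x, y):.3f}")     # roughly 0.94
```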
44
Explain how sample size influences both the standard error of the mean and the P-value.
Larger sample sizes reduce the standard error of the mean, leading to narrower confidence intervals and increasing the likelihood of detecting statistically significant differences (smaller P-values), all else equal.
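A short sketch of that dependence, holding the observed difference and the standard deviation fixed while the sample size grows. The numbers are hypothetical, and the P-value uses a normal (z) approximation with a known standard deviation to keep the example dependency-free.

```python
from math import erfc, sqrt

def standard_error(sd, n):
    """Standard error of the mean: sd / sqrt(n)."""
    return sd / sqrt(n)

def two_sided_p_from_z(z):
    """Two-sided P-value for a standard normal test statistic."""
    return erfc(abs(z) / sqrt(2))

sd, difference = 10.0, 0.5  # hypothetical spread and (small) mean difference
for n in (25, 400, 10_000):
    se = standard_error(sd, n)
    z = difference / se
    print(f"n={n:>6}  SE={se:.2f}  z={z:.2f}  P={two_sided_p_from_z(z):.3g}")
```

Quadrupling n halves the standard error. The same 0.5-unit difference goes from unremarkable at n = 25 to highly significant at n = 10,000, even though its magnitude has not changed.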
45
What is the rationale for using a two-way ANOVA with interaction?
To test not only the main effects of two factors on an outcome but also whether the effect of one factor depends on the level of the other factor, revealing synergistic or antagonistic interactions.
46
Under what condition would you choose the Kruskal–Wallis test over one-way ANOVA?
When comparing more than two independent groups whose data violate normality assumptions or have heterogeneity of variances, making ANOVA inappropriate.
47
Why is it important to ensure sample independence when selecting a statistical test?
Because many tests (e.g., unpaired t-test, χ² test, one-way ANOVA) assume that observations are independent; violating this can inflate Type I error rates and invalidate results.
48
Summarize the conceptual difference between descriptive and inferential statistics.
Descriptive statistics summarize and visualize data from a specific sample (e.g., means, medians, histograms), while inferential statistics use sample data to make generalizations, test hypotheses, and estimate population parameters.
49
What key information must be reported alongside a P-value to provide proper context?
The test statistic, sample size, direction of the effect (if one-tailed), confidence intervals for estimates, and an indication of practical or clinical significance to ensure transparent interpretation.
50
How does family-wise error rate relate to multiple comparisons, and how do post-hoc tests address it?
Family-wise error rate is the probability of making at least one Type I error across all pairwise comparisons. Post-hoc methods (e.g., Tukey’s HSD) adjust P-values or significance thresholds to control this overall error rate when comparing multiple groups.
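To see how fast the family-wise error rate grows, the sketch below uses the formula 1 − (1 − α)^m, which assumes the m comparisons are independent. Pairwise comparisons that share groups are not strictly independent, so treat the numbers as illustrative. The Bonferroni correction shown is a simpler stand-in for Tukey's HSD, not the same method.

```python
from math import comb

def familywise_error_rate(alpha, m):
    """P(at least one Type I error) across m independent tests at level alpha."""
    return 1 - (1 - alpha) ** m

def bonferroni_alpha(alpha, m):
    """Per-comparison threshold that keeps the family-wise rate at or below alpha."""
    return alpha / m

groups = 4                    # hypothetical number of groups
m = comb(groups, 2)           # 6 pairwise comparisons
uncorrected = familywise_error_rate(0.05, m)
corrected = familywise_error_rate(bonferroni_alpha(0.05, m), m)

print(f"comparisons = {m}")
print(f"uncorrected FWER = {uncorrected:.3f}")  # 0.265
print(f"Bonferroni FWER  = {corrected:.3f}")    # just under 0.05
```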