5. Stats and IR Flashcards
(29 cards)
what is the purpose of the Chi-Square Test
To examine relationships between categorical variables.
ex: whether different political systems or ideologies are associated with different behaviors or outcomes (like war, repression, or immigration attitudes).
what are the 2 types of Chi-Square Test?
Goodness of Fit Test
-Tests whether observed counts differ from expected counts in one variable: tests whether the distribution of one categorical variable matches what we expect.
-Example: If we expect democracies and autocracies to go to war at the same rate, does the observed frequency support this?
Test of Association (Chi-square test of independence)
-Tests whether two categorical variables are related.
-Not a correlation or regression test: it only tells you whether the two categorical variables are related.
-Example: Is regime type (democracy vs. autocracy) associated with war behavior (yes/no)?
what are expected proportions in the chi-square test (goodness of fit test)
the proportions you would expect under the null hypothesis, which may differ (≠) from the observed proportions
ex: If regime type doesn’t matter for war initiation (the null hypothesis), we’d expect all three regime types to be responsible for equal numbers of wars:
Total wars = 381
381 ÷ 3 regime types = 127 wars each (expected count)
So your observed vs. expected is:
-> Regime type (observed / expected):
- Democracies: 107 / 127
- Hybrid regimes: 124 / 127
- Autocracies: 150 / 127
You would then plug these values into the chi-square formula:
χ² = ∑ (O − E)² / E
Where:
O = observed value
E = expected value
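A minimal sketch in Python (scipy assumed available) of this goodness-of-fit calculation, using the counts from the example above:

```python
from scipy.stats import chisquare

# Observed war counts from the example card vs. equal expected counts under H0
observed = [107, 124, 150]   # democracies, hybrid regimes, autocracies
expected = [127, 127, 127]   # 381 wars / 3 regime types

# chisquare computes chi2 = sum((O - E)^2 / E) and its p-value (k - 1 = 2 degrees of freedom)
chi2, p_value = chisquare(f_obs=observed, f_exp=expected)
print(chi2, p_value)
```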
what is the expected count formula
Expected count (row, column) = (Row total × Column total) / Grand total
-> You’re using this formula when you have a contingency table and want to calculate expected counts under the null hypothesis:
means that if we assume no relationship between the two variables, we’d expect …
Then, you compute the chi-square statistic
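A minimal sketch (hypothetical counts) of how the expected counts can be computed from a contingency table:

```python
import numpy as np

# Hypothetical 2x2 contingency table: rows = regime type, columns = war? (yes, no)
table = np.array([[30, 70],    # democracies
                  [50, 50]])   # autocracies

row_totals = table.sum(axis=1)    # [100, 100]
col_totals = table.sum(axis=0)    # [80, 120]
grand_total = table.sum()         # 200

# Expected count (row, column) = (row total * column total) / grand total
expected = np.outer(row_totals, col_totals) / grand_total
print(expected)   # e.g. expected democracies at war = 100 * 80 / 200 = 40
```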
the 2 types of chi-square test at work (with the example of war and regime)
*Goodness of Fit Test
You’re comparing observed vs. expected counts for one categorical variable (regime type).
Example question: Are wars equally distributed across regime types?
- Chi-Square Test of Association (Independence)
You test whether two categorical variables (e.g., regime type and aggression) are associated.
-Think of a contingency table with counts of war vs. peace for each regime type.
-Hypotheses:
.Null hypothesis (H₀): State aggression is not associated with regime type.
.Alternative hypothesis (H₁): State aggression is associated with regime type.
.If you reject H₀, you’re saying: regime type does make a difference in aggression.
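A minimal sketch of the test of association on a hypothetical regime-type × war contingency table (the counts are made up for illustration):

```python
from scipy.stats import chi2_contingency

# Hypothetical contingency table: rows = democracy, hybrid, autocracy; columns = war, no war
table = [[107, 293],
         [124, 276],
         [150, 250]]

chi2, p_value, dof, expected = chi2_contingency(table)
if p_value < 0.05:
    print("Reject H0: aggression appears to be associated with regime type")
else:
    print("Fail to reject H0: no evidence of an association")
```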
how to interpret the chi-square statistic
The bigger χ² is (i.e., the more the observed counts differ from the expected ones), the stronger the evidence against the null hypothesis and the more confidently you can reject it.
what is ANOVA (Analysis of Variance)
*used when you’re dealing with continuous variables and want to compare means across multiple groups.
*While chi-square compares categories, ANOVA compares means — it’s used when your dependent variable is continuous, and your independent variable is categorical with 3 or more groups.
*ex: What type of diet leads to more happiness?
Happiness (e.g., score 1–10) = dependent variable (continuous)
Diet type (e.g., vegan, vegetarian, omnivore) = independent variable (categorical)
Use ANOVA to test if mean happiness scores differ across diet types.
ANOVA is based on comparing:
Variability between groups (how far apart the group means are; the further apart the means are, the more significant the test, because this reflects the potential effect of the factor you’re testing)
Variability within groups (how spread out the values are inside each group; the smaller the variation within groups, the more significant the test, because within-group variation reflects random noise or error, not the effect of the factor)
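A minimal sketch of a one-way ANOVA on hypothetical happiness scores for the three diet groups (values are made up):

```python
from scipy.stats import f_oneway

# Hypothetical happiness scores (1-10) for three diet groups
vegan      = [7, 8, 6, 9, 7, 8]
vegetarian = [6, 7, 7, 5, 6, 8]
omnivore   = [5, 6, 4, 7, 5, 6]

f_stat, p_value = f_oneway(vegan, vegetarian, omnivore)
print(f_stat, p_value)   # p < 0.05 would suggest at least one group mean differs
```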
ANOVA formula
Total variability = Between-group variability + Within-group variability
-> ∑(x − x̄)² = ∑ nᵢ(x̄ᵢ − x̄)² + ∑(x − x̄ᵢ)²
with:
x̄ (x bar): overall (grand) mean
x: individual value
x̄ᵢ: mean of group i
nᵢ: number of observations in group i
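A minimal sketch (same hypothetical happiness data) checking numerically that the decomposition holds, i.e., SSTotal = SSG + SSE:

```python
import numpy as np

# Hypothetical happiness scores for three diet groups (made-up data)
groups = [np.array([7, 8, 6, 9, 7, 8]),    # vegan
          np.array([6, 7, 7, 5, 6, 8]),    # vegetarian
          np.array([5, 6, 4, 7, 5, 6])]    # omnivore

all_values = np.concatenate(groups)
grand_mean = all_values.mean()                                         # grand mean x-bar

ss_total = ((all_values - grand_mean) ** 2).sum()                      # sum of (x - x_bar)^2
ss_group = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)  # sum of n_i (x_bar_i - x_bar)^2
ss_error = sum(((g - g.mean()) ** 2).sum() for g in groups)            # sum of (x - x_bar_i)^2

print(np.isclose(ss_total, ss_group + ss_error))   # True: SSTotal = SSG + SSE
```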
how to interpret the F-statistic
*F ≈ 1: Group means are likely not significantly different (between-group ≈ within-group variability); the groups don’t look meaningfully different from each other, so you cannot reject H₀, which says that all means are equal
*F > 1 (and significant): Group means are likely different — there’s more variability between groups than expected by chance
*F is very large: Strong evidence against the null hypothesis (H₀: no difference in means)
in ANOVA, what do SSTotal, SSG, and SSE mean
SSTotal (Total Sum of Squares) = Total variation in the data
SSG (Sum of Squares due to Groups) = Variation between group means
SSE (Sum of Squares due to Error) = Variation within each group
what is the core ratio of ANOVA?
F-statistic (tells us whether the differences between group means are statistically significant): a ratio of the average variability between groups to the average variability within groups
F = Mean Square Between Groups (MSB) / Mean Square Within Groups (MSE)
Where:
MSB = SSG / (k − 1)
MSE = SSE / (N − k)
-> The larger this ratio, the more likely it is that the group means are truly different and the test is statistically significant (the null hypothesis is that all the means are equal)
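A minimal sketch (same hypothetical data) computing MSB, MSE, and F by hand, with scipy as a cross-check:

```python
import numpy as np
from scipy.stats import f_oneway

# Same hypothetical happiness data as in the earlier sketches
groups = [np.array([7, 8, 6, 9, 7, 8]),
          np.array([6, 7, 7, 5, 6, 8]),
          np.array([5, 6, 4, 7, 5, 6])]
all_values = np.concatenate(groups)
grand_mean = all_values.mean()

ssg = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)   # between-group sum of squares
sse = sum(((g - g.mean()) ** 2).sum() for g in groups)             # within-group sum of squares

k, N = len(groups), len(all_values)
msb = ssg / (k - 1)                 # MSB = SSG / (k - 1)
mse = sse / (N - k)                 # MSE = SSE / (N - k)
print(msb / mse)                    # F computed by hand
print(f_oneway(*groups).statistic)  # same F from scipy, as a cross-check
```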
how can we determine if the F-statistic is significant?
To determine if the F-statistic is significant, we:
1. Compare the calculated F-value to a critical value from the F-distribution, which depends on:
*Number of groups (df₁ = k – 1)
*Total sample size (df₂ = N – k)
2. Or look at the p-value:
*If p < 0.05 → Reject the null hypothesis
*Conclude that at least one group mean is different
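A minimal sketch of both checks, assuming an F-value of 4.2 from k = 3 groups and N = 18 observations (numbers chosen for illustration):

```python
from scipy.stats import f

# Assumed example: F = 4.2 from an ANOVA with k = 3 groups and N = 18 observations
f_stat, k, N = 4.2, 3, 18
df1, df2 = k - 1, N - k                  # df1 = 2, df2 = 15

critical_value = f.ppf(0.95, df1, df2)   # critical F at the 5% level (~3.68)
p_value = f.sf(f_stat, df1, df2)         # P(F >= 4.2) under H0

print(f_stat > critical_value)           # True -> reject H0
print(p_value < 0.05)                    # True -> reject H0
```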
what are the different ANOVA hypotheses
H₀ (null): All group means are equal: μ₁ = μ₂ = μ₃ = ⋯ = μₖ
H₁ (alternative): At least one group mean is different
what are the assumptions needed to use ANOVA
- Sample Size / Normality
Each group should have n ≥ 30 (so the Central Limit Theorem applies),
OR the data in each group should be approximately normally distributed.
- Equal Variance (Homogeneity of Variance)
The variability within groups (measured by standard deviation) should be roughly the same across all groups.
-> variability is not similar if the standard deviation of one group is more than double the standard deviation of another group
- DV: continuous; IV: categorical; you want to compare means across at least 3 categories
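A minimal sketch (hypothetical data) of the double-the-standard-deviation rule of thumb for the equal-variance assumption:

```python
import numpy as np

# Hypothetical group data: check that no group's standard deviation
# is more than double another group's standard deviation
groups = [np.array([7, 8, 6, 9, 7, 8]),
          np.array([6, 7, 7, 5, 6, 8]),
          np.array([5, 6, 4, 7, 5, 6])]

sds = [g.std(ddof=1) for g in groups]    # sample standard deviations
print(max(sds) / min(sds) <= 2)          # True -> equal-variance assumption looks reasonable
```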
what is the logic of hypothesis testing
We assume H₀ is true until evidence strongly suggests otherwise.
criticisms of null hypothesis testing
Can lead to binary thinking (significant vs. not significant).
Doesn’t provide substantive insight—only indicates whether something is likely due to chance.
Measures of association and effect size offer deeper understanding than significance tests alone.
what is a p-value
tells you how likely it is that the pattern you found in your sample could have occurred by chance if the null hypothesis (H₀) were true.
It’s the probability of getting a test statistic as extreme or more extreme than the one you observed, assuming H₀ is correct.
if p < 0.05 (e.g., 0.03) → reject the null hypothesis. There’s only a 3% chance you’d see this result if there were no real effect or link.
what are the one-sample and two-sample significance tests
One-sample tests compare a sample statistic (e.g., mean or proportion) to a known or hypothesized population value.
Two-sample tests compare two groups (e.g., men vs women, democracies vs autocracies) to see if the difference in means or proportions is statistically significant.
what are the limitations of chi-square
Sensitive to sample size — large samples can produce statistically significant results even if the actual association is weak.
Doesn’t measure direction or strength of the relationship — just whether it exists.
statistically vs. marginally significant p-values
*Values between 0.05 and 0.1 are considered marginally significant
*p<0.05
This means there’s less than a 5% probability that the observed results happened by chance.
You reject the null hypothesis → there’s a statistically significant relationship between the variables
what if you want to do an ANOVA but you only have 2 groups
you run a t-test
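A minimal sketch of the two-group comparison with an independent-samples t-test (hypothetical scores):

```python
from scipy.stats import ttest_ind

# Hypothetical happiness scores for just two diet groups
vegan    = [7, 8, 6, 9, 7, 8]
omnivore = [5, 6, 4, 7, 5, 6]

t_stat, p_value = ttest_ind(vegan, omnivore)
print(t_stat, p_value)   # p < 0.05 would suggest the two group means differ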
what are the One-sample significance tests?
Used when you want to compare a sample statistic (mean or proportion) against a known or hypothesized population value.
Example:
Is people’s support for foreign aid different from 50%?
This involves calculating:
The standard error,
A test statistic (like z or t),
Then comparing it to a critical value or using the p-value.
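A minimal sketch of a one-sample proportion test for the foreign-aid example (survey numbers are hypothetical):

```python
import numpy as np
from scipy.stats import norm

# Hypothetical survey: 280 of 500 respondents support foreign aid; H0: p = 0.50
n, supporters, p0 = 500, 280, 0.50
p_hat = supporters / n                        # sample proportion = 0.56

standard_error = np.sqrt(p0 * (1 - p0) / n)   # standard error under H0
z = (p_hat - p0) / standard_error             # test statistic
p_value = 2 * norm.sf(abs(z))                 # two-sided p-value

print(z, p_value)   # compare the p-value to 0.05 (or z to the critical value 1.96)
```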
what are the Two-sample significance tests?
Difference of Means Test: Compares the means of a dependent variable across two independent groups (e.g., men vs. women on abortion approval).
Difference of Proportions Test: Compares proportions (e.g., proportion of men and women supporting foreign aid).
In both cases, the logic is:
Compute the observed difference.
Determine if it’s likely due to chance (via standard error and test statistic).
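A minimal sketch of a difference-of-proportions test (hypothetical counts for men vs. women supporting foreign aid):

```python
import numpy as np
from scipy.stats import norm

# Hypothetical counts: support for foreign aid among men and women
n_men, support_men = 400, 180        # 45% of men
n_women, support_women = 400, 220    # 55% of women

p1, p2 = support_men / n_men, support_women / n_women
p_pooled = (support_men + support_women) / (n_men + n_women)   # pooled proportion under H0

standard_error = np.sqrt(p_pooled * (1 - p_pooled) * (1 / n_men + 1 / n_women))
z = (p1 - p2) / standard_error       # observed difference relative to its standard error
p_value = 2 * norm.sf(abs(z))        # two-sided p-value

print(z, p_value)   # a small p-value means the difference is unlikely to be due to chance
```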