5. stats and IR Flashcards
(21 cards)
what is the purpose of the Chi-Square Test
examine relationships between categorical variables.
ex: whether different political systems or ideologies are associated with different behaviors or outcomes (like war, repression, or immigration attitudes).
what are the 2 types of Chi-Square Test?
Goodness of Fit Test
-Tests whether observed counts differ from expected counts in one variable.
-Example: If we expect democracies and autocracies to go to war at the same rate, does the observed frequency support this?
Test of Association (Chi-square test of independence)
-Tests whether two categorical variables are related.
-Example: Is regime type (democracy vs. autocracy) associated with war behavior (yes/no)?
what are expected proportions in the chi-square test
the counts you would expect if the null hypothesis were true — compared against (≠) the observed proportions actually found in the data
ex: If regime type doesn’t matter for war initiation (the null hypothesis), we’d expect all three regime types to be responsible for equal numbers of wars:
Total wars = 381
381 ÷ 3 regime types = 127 wars each (expected count)
So your observed vs. expected is:
-> Regime Type Observed Expected
Democracies 107 127
Hybrid regimes 124 127
Autocracies 150 127
You would then plug these values into the chi-square formula:
χ² = ∑ (O − E)² / E
Where:
O = observed value
E = expected value
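A quick sketch of this goodness-of-fit calculation in Python using scipy (the war counts come from the card above; the library choice is an illustration, not part of the card):

```python
from scipy import stats

# Observed war counts per regime type (from the card above)
observed = [107, 124, 150]   # democracies, hybrid regimes, autocracies
# Expected counts under the null: 381 wars / 3 regime types = 127 each
expected = [127, 127, 127]

chi2, p = stats.chisquare(f_obs=observed, f_exp=expected)
print(round(chi2, 2))  # chi-square statistic ≈ 7.39
print(p < 0.05)        # True → reject the null of equal war rates
```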
what is the expected count formula
Expected Count (row, column) = (Row Total × Column Total) / Grand Total
-> You’re using this formula when you have a contingency table and want to calculate expected counts under the null hypothesis:
means that if we assume no relationship between the two variables, we’d expect …
Then, you compute the chi-square statistic
the 2 types of chi-square test at work (with the example of war and regime)
*Goodness of Fit Test
You’re comparing observed vs. expected counts for one categorical variable (regime type).
Example question: Are wars equally distributed across regime types?
- Chi-Square Test of Association (Independence)
You test whether two categorical variables (e.g., regime type and aggression) are associated.
-Think of a contingency table with counts of war vs. peace for each regime type.
-Hypotheses:
.Null hypothesis (H₀): State aggression is not associated with regime type.
.Alternative hypothesis (H₁): State aggression is associated with regime type.
.If you reject H₀, you’re saying: regime type does make a difference in aggression.
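A minimal sketch of the test of association with scipy; the contingency table below is hypothetical:

```python
import numpy as np
from scipy import stats

# Hypothetical contingency table: rows = regime type, columns = (war, no war)
table = np.array([
    [30, 70],   # democracies
    [45, 55],   # hybrid regimes
    [60, 40],   # autocracies
])

chi2, p, dof, expected = stats.chi2_contingency(table)
# Expected counts follow (Row Total x Column Total) / Grand Total:
print(expected[0, 0])  # 100 * 135 / 300 = 45.0
print(dof)             # (rows - 1) * (columns - 1) = 2
print(p < 0.05)        # True here → regime type and war behavior are associated
```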
what is ANOVA (Analysis of Variance)
*used when you’re dealing with continuous variables and want to compare means across multiple groups.
*While chi-square compares categories, ANOVA compares means — it’s used when your dependent variable is continuous, and your independent variable is categorical with 3 or more groups.
*ex: What type of diet leads to more happiness?
Happiness (e.g., score 1–10) = dependent variable (continuous)
Diet type (e.g., vegan, vegetarian, omnivore) = independent variable (categorical)
Use ANOVA to test if mean happiness scores differ across diet types.
ANOVA is based on comparing:
Variability between groups (how far apart the group means are)
Variability within groups (how spread out values are inside each group)
ANOVA formula
Total Variability = Between-Group Variability + Within-Group Variability
-> ∑(x − x̄)² = ∑ nᵢ(x̄ᵢ − x̄)² + ∑(x − x̄ᵢ)²
with:
x̄ (x bar): overall (grand) mean
x: individual value
x̄ᵢ: mean of group i
nᵢ: number of observations in group i
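The identity above can be checked numerically; a small sketch with made-up happiness scores (all numbers hypothetical):

```python
import numpy as np

# Hypothetical happiness scores (1-10) for three diet groups
groups = [
    np.array([6, 7, 8, 7]),   # vegan
    np.array([5, 6, 5, 6]),   # vegetarian
    np.array([7, 8, 9, 8]),   # omnivore
]

all_values = np.concatenate(groups)
grand_mean = all_values.mean()

# Total variability: squared deviations from the grand mean
ss_total = ((all_values - grand_mean) ** 2).sum()
# Between-group variability: group size times squared deviation of each group mean
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# Within-group variability: squared deviations from each group's own mean
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

print(abs(ss_total - (ss_between + ss_within)) < 1e-9)  # True: identity holds
```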
interpretation of F-statistic results
F-value: Interpretation
*F ≈ 1: Group means are likely not significantly different — between-group ≈ within-group variability
*F > 1 (and significant): Group means are likely different — there’s more variability between groups than expected by chance
*F is very large: Strong evidence against the null hypothesis (H₀: no difference in means)
in ANOVA, what does SSTotal; SSG and SSE mean
SSTotal (Total Sum of Squares) = Total variation in the data
SSG (Sum of Squares due to Groups) = Variation between group means
SSE (Sum of Squares due to Error) = Variation within each group
what is the core ratio?
F-statistic (the key number ANOVA calculates to decide whether the differences between group means are statistically significant; it's a ratio)
F = Mean Square Between Groups (MSB) / Mean Square Within Groups (MSE)
Where:
MSB = SSG / (k − 1)
MSE = SSE / (N − k)
-> The larger this ratio, the more likely the group means are truly different (i.e., not just due to random variation).
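A sketch computing F by hand from SSG and SSE and checking it against scipy's f_oneway (the group scores are hypothetical):

```python
import numpy as np
from scipy import stats

# Hypothetical scores for k = 3 groups
groups = [
    np.array([6, 7, 8, 7]),
    np.array([5, 6, 5, 6]),
    np.array([7, 8, 9, 8]),
]
k = len(groups)
n_total = sum(len(g) for g in groups)
grand_mean = np.concatenate(groups).mean()

ssg = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)  # between groups
sse = sum(((g - g.mean()) ** 2).sum() for g in groups)            # within groups

msb = ssg / (k - 1)          # mean square between, df1 = k - 1
mse = sse / (n_total - k)    # mean square within, df2 = N - k
f_manual = msb / mse

f_scipy, p = stats.f_oneway(*groups)
print(round(f_manual, 2), round(f_scipy, 2))  # both 11.4
```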
how can we determine if the F-statistic is significant?
To determine if the F-statistic is significant, we:
1. Compare the calculated F-value to a critical value from the F-distribution
— depends on:
*Number of groups (df₁ = k – 1)
*Total sample size (df₂ = N – k)
2. Or look at the p-value:
*If p < 0.05 → Reject the null hypothesis
*Conclude that at least one group mean is different
what are the different ANOVA hypotheses
H₀ (null): All group means are equal: μ₁ = μ₂ = μ₃ = ⋯ = μₖ
H₁ (alternative): At least one group mean is different
what are the assumptions needed to use ANOVA
- Sample Size / Normality
Each group should have n ≥ 30 (so the Central Limit Theorem applies),
OR the data in each group should be approximately normally distributed.
- Equal Variance (Homogeneity of Variance)
The variability (measured by standard deviation) should be roughly the same across all groups.
A rule of thumb: If the standard deviation of one group is more than twice that of another, this assumption may be violated.
what is the logic of hypothesis testing
We assume H₀ is true until evidence strongly suggests otherwise.
criticisms of null hypothesis testing
Can lead to binary thinking (significant vs. not significant).
Doesn’t provide substantive insight—only indicates whether something is likely due to chance.
Measures of association and effect size offer deeper understanding than significance tests alone.
what is a p-value
tells you how likely it is that the pattern you found in your sample could have occurred by chance if the null hypothesis (H₀) were true.
It’s the probability of getting a test statistic as extreme or more extreme than the one you observed, assuming H₀ is correct.
if p < 0.05 (e.g., 0.03) → reject the null hypothesis. There's only a 3% chance you'd see this result if there were no real effect.
what are one-sample and two-sample significance tests
One-sample tests compare a sample statistic (e.g., mean or proportion) to a known or hypothesized population value.
Two-sample tests compare two groups (e.g., men vs women, democracies vs autocracies) to see if the difference in means or proportions is statistically significant.
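A minimal two-sample sketch using scipy's independent-samples t-test (both samples are invented for illustration):

```python
from scipy import stats

# Hypothetical scores for two groups (e.g., two regime types)
group_a = [4.1, 5.0, 4.8, 5.5, 4.9, 5.2]
group_b = [3.2, 3.8, 4.0, 3.5, 3.9, 3.4]

t_stat, p = stats.ttest_ind(group_a, group_b)
print(p < 0.05)  # True → the difference in means is statistically significant
```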
what are the limitations of chi-square
Sensitive to sample size — large samples can produce statistically significant results even if the actual association is weak.
Doesn’t measure direction or strength of the relationship — just whether it exists.
what are measures of association
These tell us how strong a relationship is — beyond just whether it’s statistically significant.
Most are based on Proportional Reduction in Error (PRE):
How much better we can predict the dependent variable if we know the independent variable.
what are the 3 common measures of association
- Lambda (λ): Used for nominal-level variables (e.g., gender, party ID).
-Answers: “By how much does knowledge of X improve our guess of Y?”
-Value ranges from 0 (no improvement) to 1 (perfect prediction).
-Asymmetric: λyx (predict Y from X) may differ from λxy.
-Example: Knowing someone's party ID improves your guess about their stance on immigration by 20% → λ = 0.20
- Somers' dyx: For ordinal variables (e.g., education level, income group).
-Captures both strength and direction:
.Positive d: higher values of X → higher values of Y
.Negative d: higher X → lower Y
-Based on concordant and discordant pairs (are values of X and Y ranked the same way?)
-Example: dyx = 0.50 means a moderately strong, positive association between education and support for international cooperation.
- Cramér's V: Also used for nominal variables, especially larger cross-tabs (more than 2x2).
-Ranges from 0 to 1.
-Unlike Lambda, it doesn’t assume you’re predicting one variable from another — it’s symmetric.
-Based on chi-square, but adjusts for table size.
-Example: V = 0.60 indicates a fairly strong association between religion and vote choice.
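Cramér's V can be computed directly from the chi-square statistic; a sketch of the standard formula V = sqrt(χ² / (n · (min(r, c) − 1))), applied to a hypothetical table:

```python
import numpy as np
from scipy import stats

def cramers_v(table):
    """Cramér's V for an r x c contingency table."""
    table = np.asarray(table)
    chi2, _, _, _ = stats.chi2_contingency(table)
    n = table.sum()
    min_dim = min(table.shape) - 1
    return np.sqrt(chi2 / (n * min_dim))

# Hypothetical 3x2 cross-tab (e.g., religion x vote choice)
v = cramers_v([[30, 70], [45, 55], [60, 40]])
print(round(v, 2))  # a value between 0 (no association) and 1 (perfect)
```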